# prefect-community
Hi all, I’m a new Prefect user setting up a serverless orchestration stack on AWS. The ETLs my company runs sometimes download and process terabytes of data, so we’re setting up our ephemeral containers to use Amazon’s Elastic File System (EFS) to scale storage according to need. The actual flow storage is a GitHub storage block. The problem is we don’t really understand how Prefect sets up the GitHub storage block, and hence where an ETL flow’s downloaded data ends up. We need to know the latter so we can mount EFS directly at that directory. The ETLs are set up to store data in a directory under the parent directory that encloses the flows directory, i.e. in a sibling of the directory the flows live in. But it’s not clear whether that parent directory itself is copied over by Prefect, or where it would live in absolute terms if so. Any advice or wisdom from the community?
I have some flows that download files to the root of my repository and I use git storage as well. your current working directory when the flow runs should be the root of the repo, so it should be safe to reference files relative to that from within your code. sorry if that doesn't answer your question effectively, your use case seems more complex than mine
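a rough sketch of what "relative to the working directory" could look like in flow code, assuming the working directory really is the repo root at run time. the helper name `resolve_data_dir` and the `data` directory name are made up for illustration, they're not from this thread or from Prefect:

```python
from pathlib import Path
from typing import Optional


def resolve_data_dir(dirname: str = "data", base: Optional[Path] = None) -> Path:
    """Return an absolute data directory under `base`, creating it if needed.

    Assumption: when `base` is not given, the current working directory is
    the root of the cloned repo, as described in the message above.
    `dirname` is a hypothetical name chosen for this example.
    """
    root = (base or Path.cwd()).resolve()
    data_dir = root / dirname
    # Create the directory so downstream download code can write to it.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir
```

calling `resolve_data_dir()` at the top of a flow gives an absolute path, which is also the path you'd want to compare against the EFS mount point.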
OK, that makes sense. How is the repo itself stored on a container Prefect spins up?
the default behavior of the github storage block is to clone the repo into the present working directory
it's probably /opt/prefect/, since that's the working directory for the image, but I'm not entirely sure
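since the exact location is a guess, one way to confirm it is to report the working directory from inside a test run. this is only a sketch using the standard library, the function name `describe_working_directory` is hypothetical, and you'd call it at the start of a flow function and log the result:

```python
import os
from pathlib import Path


def describe_working_directory() -> dict:
    """Report the absolute working directory and its top-level entries.

    Run inside the container, this shows where the process starts and
    whether the cloned repo contents are visible there.
    """
    cwd = Path.cwd().resolve()
    return {
        "cwd": str(cwd),
        # Sorted for stable, readable log output.
        "entries": sorted(os.listdir(cwd)),
    }


if __name__ == "__main__":
    info = describe_working_directory()
    print(f"working directory: {info['cwd']}")
    print(f"contents: {info['entries']}")
```

if the reported `cwd` matches the guess and the repo files show up under `entries`, that's the directory to plan the EFS mount around.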
Based on your feedback and this issue https://github.com/PrefectHQ/prefect/issues/2861 I’m thinking the same
so then
Thank you! this is very helpful
I’ll report back once I get it working
awesome! np