# best-practices
b
Hi, I would like to reiterate this question šŸ™‚ https://prefect-community.slack.com/archives/C03D12VV4NN/p1671798107554319
m
If your parquet file is small, you could store it directly with your code in storage (S3, GitHub, etc.). If the parquet file is too big for that, you could grab it from a bucket at run time.
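A minimal sketch of the second approach, assuming pandas and s3fs are installed; the bucket and key names are placeholders:
```python
# Fetch a parquet file from a bucket at run time instead of shipping it with the code.
import pandas as pd
from prefect import flow, task

@task
def load_reference_data() -> pd.DataFrame:
    # pandas hands s3:// paths off to fsspec/s3fs
    return pd.read_parquet("s3://my-bucket/reference/data.parquet")

@flow
def my_flow():
    df = load_reference_data()
    print(f"loaded {len(df)} rows")
```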
As for environment variables, that depends on your execution infrastructure. We use ECSTask blocks to execute all of our flows, so the environment variables are configured within the block (which gets translated into an ECS task definition in AWS).
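A minimal sketch of setting environment variables on an ECSTask infrastructure block, assuming prefect-aws is installed; the image, variable names, and values are placeholders:
```python
from prefect_aws.ecs import ECSTask

ecs_block = ECSTask(
    image="123456789.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    env={"DATA_BUCKET": "my-bucket", "LOG_LEVEL": "INFO"},
)
# Save the block so deployments can reference it as infrastructure.
ecs_block.save("prod-ecs", overwrite=True)
```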
Also, blocks are worth looking at as an alternative to environment variables. You can store configuration, secrets, or whatever you need as a block and read it in at run time.
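A minimal sketch of reading configuration from blocks inside a flow; the block names "app-config" and "third-party-api-key" are placeholders for blocks created beforehand in the UI or via `.save()`:
```python
from prefect import flow
from prefect.blocks.system import JSON, Secret

@flow
def configured_flow():
    settings = JSON.load("app-config").value             # arbitrary JSON config
    api_key = Secret.load("third-party-api-key").get()   # stored encrypted
    print("loaded config keys:", list(settings))
    assert api_key  # use the secret for an API client, etc.
```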
b
it seems in my case, since I use DVC for all the things and Minio as an S3-like backend, I'll have to use DVC's Python API to get the files that I need.
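A minimal sketch of that, using the DVC Python API; the repo URL, path, and rev are placeholders, and the Minio remote is whatever the repo's `.dvc/config` points at:
```python
import dvc.api
import pandas as pd

# Stream a DVC-tracked parquet file from the remote at run time.
with dvc.api.open(
    "data/features.parquet",
    repo="https://gitlab.example.com/my-org/my-repo.git",
    rev="main",
    mode="rb",
) as f:
    df = pd.read_parquet(f)
```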
What happens when a Prefect flow depends on other modules? @Mike Grabbe
It's a Python library that I wrote.
m
I've dealt with these situations in two different ways so far:
• for specialized modules, you can include them as a child directory alongside the flow script and import your code directly (see the sketch below)
• for generalized modules, bundle your code into a Python package and install it 1) locally, and 2) on all infrastructure running Prefect flows
šŸ‘ 1
b
What if I just change the PREFECT_API_URL in Python and run it anywhere? It seems possible to have GitLab CI schedule these flows as a job. I'm asking because that seems like the most straightforward way to run updated code; I'm not really seeing how the distribution of the Python package would happen.
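A minimal sketch of that idea, pointing a script at a specific Prefect API before running a flow; the URL is a placeholder, and in CI you would more likely export PREFECT_API_URL as a job variable:
```python
import os

# Must be set before Prefect reads its settings.
os.environ.setdefault("PREFECT_API_URL", "https://prefect.example.com/api")

from prefect import flow

@flow
def ci_flow():
    print("running against", os.environ["PREFECT_API_URL"])

if __name__ == "__main__":
    ci_flow()
```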