Hi :)
What is the best solution to deploy a custom flow with some Python dependencies?
I know I need remote storage (mostly git* or docker).
I didn't get the key differences between DockerRun (run_config) and Docker (storage).
For simple Python dependencies: should I use the official Prefect image and set the python_dependencies via the storage object? So that I don't even need a Dockerfile.
Is there an example anywhere? I think it's an absolutely basic task...
We use a single machine setup at the moment.
Best regards
Michael
Michael Hadorn
02/04/2021, 1:49 PM
And how do I have to deploy the created image?
Zanie
02/04/2021, 5:15 PM
• DockerStorage puts your flow in a docker image. DockerRun can use Docker storage but can also use other storage, so if your flow is using GitHub storage it'll pull from there and then execute it in the docker image you've specified.
• For dependencies, DockerStorage with `python_dependencies` is a simple solution. The downside is that every time your flow changes you have to rebuild a docker image. If you use a DockerRun referring to an image you've built (use the Prefect image as a base and install your own dependencies on top of it), then you can use GitHub storage for your flow (which is much faster)
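The GitHub-storage-plus-prebuilt-image combination described above might be sketched like this (a minimal example with the Prefect 0.14-era API; the repo, path, and image tag are hypothetical placeholders):

```python
from prefect import Flow, task
from prefect.storage import GitHub
from prefect.run_config import DockerRun

@task
def say_hello():
    print("hello")

with Flow("github-docker-example") as flow:
    say_hello()

# Flow code is pulled from GitHub at run time, so no image rebuild
# is needed when only the flow code changes.
flow.storage = GitHub(
    repo="my-org/my-repo",                    # hypothetical repository
    path="flows/github_docker_example.py",    # path to this file in the repo
)

# The image only needs to contain the dependencies: build it once on top
# of the official Prefect base image and push it to your registry.
flow.run_config = DockerRun(image="my-registry.example.com/flows:latest")
```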
Zanie
02/04/2021, 5:15 PM
If you are using a single machine, you can just use a LocalAgent and install the dependencies on that machine manually as well.
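The single-machine option above would look roughly like this (again a sketch against the Prefect 0.14-era API):

```python
from prefect import Flow, task
from prefect.storage import Local
from prefect.run_config import LocalRun

@task
def say_hello():
    print("hello")

with Flow("local-example") as flow:
    say_hello()

# The flow is stored on the machine where it is registered...
flow.storage = Local()
# ...and executed by a LocalAgent on that same machine, using whatever
# Python environment (and manually installed dependencies) the agent
# was started in.
flow.run_config = LocalRun()
```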
Zanie
02/04/2021, 5:16 PM
If you're using `DockerRun` you would deploy your image using typical docker build/push commands, then refer to the tag in your code. If you're using `DockerStorage` you would deploy your image by setting the `registry_url`
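For the `DockerStorage` route, a minimal sketch (the registry URL and dependency list are hypothetical):

```python
from prefect import Flow, task
from prefect.storage import Docker

@task
def say_hello():
    print("hello")

with Flow("docker-storage-example") as flow:
    say_hello()

# On flow.register(), Prefect builds an image on top of its official base
# image, pip-installs the listed dependencies, bakes the flow in, and
# pushes the result to the given registry -- no Dockerfile required.
flow.storage = Docker(
    registry_url="my-registry.example.com/my-team",  # hypothetical registry
    python_dependencies=["pandas", "requests"],      # example dependencies
)
```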
Michael Hadorn
02/04/2021, 5:55 PM
Thank you very much for your answer!
Ok, if I use different flows with different dependencies, I guess the best solution is my own docker image, which will load the code from GitHub storage.
Are there any example configurations?
Michael Hadorn
02/04/2021, 5:59 PM
And: how do I provide all my files (using some classes etc.)?