# prefect-community
а
Hey Folks! We would like to use Prefect as our main workflow manager. However, it is not completely clear how library versioning is managed. Let's say I would like to use an old version of pandas in one flow and a new one in another. Is there any way to attach a different venv to each flow? How do you solve this problem?
a
If you’re using Docker storage, you can list dependencies and their versions. Storage is attached to a Flow, so for each flow you can set a specific pandas version as a dependency.
n
Hi @Алексей Филимонов - this is a great question and is a primary value prop of Prefect. Prefect allows you to attach both storage and environments to your flow; storage provides a way to define where and how your flow is stored and environments specify how and where it should be run. Together these give you complete control over your flow's runtime environment, including the versioning of any of your dependencies. So, to expand on @ale’s answer, you could use Docker storage to define your dependencies out of the box like this:
from prefect.environments.storage import Docker

storage = Docker(python_dependencies=["pyodbc"])
or you could define a completely custom base image and use that instead:
storage = Docker(dockerfile="/path/to/Dockerfile")
... giving you complete control over your flow's dependencies and letting you re-use (or not!) images between flows. Here are some resources to help with this:
• Configuring Docker storage
• Execution overview
• Storage options
• Deployment considerations (includes a section on dependencies)
• Multi-flow storage
Hopefully these can help!
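To make the per-flow pinning concrete, here is a minimal sketch of how the dependency lists for two flows might be built. The flow names, the version numbers, and the `FLOW_PINS` / `python_dependencies` helpers are hypothetical and only for illustration; the one piece taken from this thread is that Docker storage accepts a `python_dependencies` list, and the sketch assumes pip-style `package==version` strings are what you want in it.

```python
# Hypothetical per-flow pins: flow names and versions are illustrative only.
FLOW_PINS = {
    "legacy-report": {"pandas": "0.25.3"},
    "new-report": {"pandas": "1.0.3"},
}


def python_dependencies(flow_name):
    """Return pip-style pinned requirement strings for one flow."""
    pins = FLOW_PINS[flow_name]
    return [f"{pkg}=={version}" for pkg, version in sorted(pins.items())]


# With Prefect installed, each list could be handed to its own Docker
# storage object (one per flow), along the lines of the snippet above:
#   from prefect.environments.storage import Docker
#   storage = Docker(python_dependencies=python_dependencies("legacy-report"))
if __name__ == "__main__":
    for name in FLOW_PINS:
        print(name, python_dependencies(name))
```

Because each flow gets its own storage object, each image is built with its own pins, so no shared venv is needed.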
а
@ale @nicholas Thank you for your answers! Sounds great! But how can I use the Prefect cache? Suppose I would like to cache the result of executing a function (task) using cache validators. Will this work when using Docker?
a
I have not used the caching mechanism yet, so I’ll leave it to the Prefect team to handle your question 🙂
n
Yes! Results can be cached/stored external to your container for use in other flows or in retry scenarios. For more information on that, check out the Caching and Persisting Data docs 🙂
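As a rough illustration of why this works with containers, here is a plain-Python sketch of the idea, not Prefect's implementation: a result is written to a location outside the process (a temporary directory stands in for a mounted volume or remote store), and it is reused only while it is fresh and the inputs match. The `cached_call` and `add` names and the JSON file layout are assumptions made for this example.

```python
import json
import tempfile
import time
from pathlib import Path


def cached_call(func, inputs, cache_dir, cache_for_seconds):
    """Return (result, was_cached), reusing a stored result when valid."""
    cache_file = Path(cache_dir) / f"{func.__name__}.json"
    if cache_file.exists():
        entry = json.loads(cache_file.read_text())
        fresh = time.time() - entry["stored_at"] < cache_for_seconds
        # Validator step: reuse only if the entry is fresh and inputs match.
        if fresh and entry["inputs"] == inputs:
            return entry["result"], True
    result = func(**inputs)
    cache_file.write_text(
        json.dumps({"inputs": inputs, "result": result, "stored_at": time.time()})
    )
    return result, False


def add(x, y):
    return x + y


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared_dir:
        print(cached_call(add, {"x": 1, "y": 2}, shared_dir, 60))  # computed
        print(cached_call(add, {"x": 1, "y": 2}, shared_dir, 60))  # reused
        print(cached_call(add, {"x": 5, "y": 2}, shared_dir, 60))  # inputs changed
```

The key point is that the cache lives outside the container, so a fresh container started for a retry or another flow run can still find it.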
@Marvin archive "Flow environment management"