    Алексей Филимонов

    1 year ago
    Hey folks! We would like to use Prefect as our main workflow manager. However, it is not completely clear how library versioning is managed. Let's say I would like to use an old version of pandas in one flow and a new one in another. Is there any way to use a different venv for each flow? How do you solve this problem?
    ale

    1 year ago
    If you’re using Docker storage, you can list dependencies and their versions. Storage is attached to a flow, so for each flow you can set a specific pandas version as a dependency.
    nicholas

    1 year ago
    Hi @Алексей Филимонов - this is a great question and is a primary value prop of Prefect. Prefect allows you to attach both storage and environments to your flow; storage provides a way to define where and how your flow is stored and environments specify how and where it should be run. Together these give you complete control over your flow's runtime environment, including the versioning of any of your dependencies. So, to expand on @ale’s answer, you could use Docker storage to define your dependencies out of the box like this:
    from prefect.environments.storage import Docker
    
    storage = Docker(python_dependencies=["pyodbc"])
    or you could define a completely custom base image and use that instead:
    storage = Docker(dockerfile="/path/to/Dockerfile")
    ... giving you complete control over your flow's dependencies and letting you re-use (or not!) images between flows. Here are some resources to help with this:
    • Configuring Docker storage
    • Execution overview
    • Storage options
    • Deployment considerations (includes a section on dependencies)
    • Multi-flow storage
    Hopefully these can help!
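To make the "different pandas per flow" case concrete, here is a minimal sketch of keeping one pinned dependency list per flow and handing it to `Docker(python_dependencies=...)`. The flow names, the version pins, and the `pinned_dependencies` helper are hypothetical and only for illustration; they are not part of Prefect.

```python
from typing import Dict, List

# Hypothetical flow names and version pins, chosen only for illustration.
FLOW_REQUIREMENTS: Dict[str, List[str]] = {
    "legacy_report": ["pandas==0.25.3"],
    "new_report": ["pandas==1.1.5"],
}


def pinned_dependencies(flow_name: str) -> List[str]:
    """Return the requirement list for a flow, rejecting unpinned entries."""
    requirements = FLOW_REQUIREMENTS[flow_name]
    for requirement in requirements:
        # An exact pin keeps the image for this flow reproducible.
        if "==" not in requirement:
            raise ValueError(f"{requirement!r} is not pinned to an exact version")
    return list(requirements)


# With the import path quoted in the thread, each flow would get its own storage:
#   from prefect.environments.storage import Docker
#   storage = Docker(python_dependencies=pinned_dependencies("legacy_report"))
```

Because each flow gets its own storage object, the two flows would end up in separate images with separate pandas versions, which plays the role of a per-flow venv.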
    Алексей Филимонов

    1 year ago
    @ale @nicholas Thank you for your answers! Sounds great! But how can I use the Prefect cache? Suppose I would like to cache the result of executing a function (task) using cache validators. Will this work when using Docker?
    ale

    1 year ago
    I haven't used the caching mechanism yet, so I’ll leave your question to the Prefect team 🙂
    nicholas

    1 year ago
    Yes! Results can be cached/stored external to your container for use in other flows or in retry scenarios. For more information on that, check out the Caching and Persisting Data docs 🙂
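As a rough illustration of the idea (not Prefect's actual implementation), the sketch below stores a result in a file outside the process and reuses it only while it is fresh and the inputs match, which is roughly the kind of check a cache validator performs. The `cached_call` helper, the JSON file layout, and the `max_age_seconds` parameter are assumptions made for this example; in a container, the cache file would need to live on a mounted volume or other external location to survive between runs.

```python
import json
import time
from pathlib import Path
from typing import Any, Callable, Dict


def cached_call(
    func: Callable[..., Any],
    inputs: Dict[str, Any],
    cache_file: Path,
    max_age_seconds: float,
) -> Any:
    """Reuse a stored result if it is fresh and was computed from the same inputs.

    Inputs and results must be JSON-serializable in this simplified sketch.
    """
    if cache_file.exists():
        entry = json.loads(cache_file.read_text())
        is_fresh = time.time() - entry["stored_at"] <= max_age_seconds
        same_inputs = entry["inputs"] == inputs
        if is_fresh and same_inputs:
            # Cache is valid: skip the computation entirely.
            return entry["result"]

    # Cache is missing, expired, or was built from different inputs: recompute.
    result = func(**inputs)
    cache_file.write_text(
        json.dumps({"stored_at": time.time(), "inputs": inputs, "result": result})
    )
    return result
```

Calling `cached_call` twice with the same inputs inside the freshness window runs the function once; changing an input invalidates the stored entry and triggers a recompute.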
    @Marvin archive "Flow environment management"
    Marvin

    1 year ago