Jason Raede

    9 months ago
    Hi y'all. We're working through deploying our first flow to production. We have several flows in a repository, plus some shared modules imported into each flow that contain tasks and other utility code. From the documentation, it seems like the only way to ensure the shared modules are available at flow execution time is to package them up in a Docker image and use a Docker/ECR/K8s agent. That feels a little heavy. Is there any way to package up dependencies like that during pickling? The folder structure is below; the flows need access to the code in src/tasks, and the tasks need access to the code in src/utils.
    src
    ├── flows
    │   ├── my_flow.py
    │   └── my_other_flow.py
    ├── tasks
    │   ├── shared_task_1.py
    │   └── shared_task_2.py
    └── utils
        └── shared_lib_1.py
    Thanks!
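[Editor's note: a minimal sketch of how this layout can work without Docker, assuming each directory gains an `__init__.py` so `src` is a regular package and the repo root is on `sys.path` at execution time. The module contents and `helper`/`task_1` names are hypothetical placeholders; the script recreates the layout in a temp directory just to show the imports resolve.]

```python
import os
import sys
import tempfile
import textwrap

root = tempfile.mkdtemp()

# Recreate the layout from the thread: src/tasks and src/utils,
# with hypothetical placeholder contents.
layout = {
    "src/__init__.py": "",
    "src/tasks/__init__.py": "",
    "src/tasks/shared_task_1.py": textwrap.dedent("""
        from src.utils.shared_lib_1 import helper

        def task_1():
            return helper() + 1
    """),
    "src/utils/__init__.py": "",
    "src/utils/shared_lib_1.py": "def helper():\n    return 41\n",
}
for path, body in layout.items():
    full = os.path.join(root, path)
    os.makedirs(os.path.dirname(full), exist_ok=True)
    with open(full, "w") as f:
        f.write(body)

# The repo root must be importable at flow *execution* time -- which is
# exactly what pickle-based storage alone does not guarantee on the agent.
sys.path.insert(0, root)
from src.tasks.shared_task_1 import task_1

print(task_1())  # prints 42
```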
Kevin Kho

    9 months ago
    Hi @Jason Raede, there isn't one, because cloudpickle only very recently added support for deep copying of modules. I think we're unsure whether it works at the moment, but it might be possible eventually.
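[Editor's note: a small stdlib illustration of why the shared modules must exist on the agent. The standard pickler (and, for module-level functions, cloudpickle behaves the same way) serializes a function by *reference* -- module name plus qualified name -- not by value, so unpickling requires the same module to be importable at execution time. `json.loads` stands in for a shared module function like one from `src/utils/shared_lib_1.py`.]

```python
import json  # stand-in for a shared module such as src.utils.shared_lib_1
import pickle

payload = pickle.dumps(json.loads)

# The payload contains only the names "json" and "loads", not the code.
assert b"json" in payload and b"loads" in payload

# Unpickling works here only because the `json` module is importable;
# on an agent without the shared module, this step would fail.
fn = pickle.loads(payload)
print(fn('{"a": 1}'))  # prints {'a': 1}
```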
Jason Raede

    9 months ago
    OK, so the recommendation for now is DockerStorage + one of the Docker agents? Or can Docker storage work with a local agent?
Kevin Kho

    9 months ago
    Docker storage needs one of the Docker agents, I think. But I want to mention that you can also use DockerRun + GitHub Storage/S3 Storage: if your container has all the dependencies and they don't really change, you can just specify your image and run your flow on top of it.
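[Editor's note: a minimal sketch of the DockerRun + GitHub Storage combination Kevin describes, using the Prefect 1.x API. The repo, path, and image names are placeholders; this is a configuration fragment, not a complete flow.]

```python
from prefect import Flow
from prefect.run_configs import DockerRun
from prefect.storage import GitHub

with Flow("my_flow") as flow:
    ...  # tasks imported from src/tasks would be added here

# Flow code is pulled from the repository at run time...
flow.storage = GitHub(repo="my-org/my-repo", path="src/flows/my_flow.py")

# ...and executed inside an image that already has the shared modules
# and other dependencies baked in, so it rarely needs rebuilding.
flow.run_config = DockerRun(image="my-registry/my-image:latest")
```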
Jason Raede

    9 months ago
    Got it. Ok, this is helpful, thank you!
Kirk Quinbar

    8 months ago
    @Jason Raede I have pretty much this same setup and am trying to figure out the best way to deal with the dependent python files. What did you end up doing to solve your issue?