# prefect-community
Another quick question, but I've decided to split my prefect tasks into multiple python files to make it a lot more organized and modularized. When I submit the flow to the Dask executor, I get a
. Do I need to dockerize the flow and specify python paths for this to work? Thanks in advance! My folder tree looks something like this
├── alternative_data_pipelines
│   ├── thinknum
│   │   ├──
│   │   └──
│   └── utils
│       ├──
│       ├──
│       ├──
│       └──
├── requirements.txt
Hi @Riley Hun, Welcome! Your Dask workers need access to the same dependencies as your flow in order to run correctly. Dockerizing your flow and installing the
package as part of your image build process is probably the best way to achieve this 👍
Got it! Thanks @Dylan. Apologies - kind of a dumb question - but do I need to update the docker image for the dask workers? Or is adding all dependencies in the docker image of the flow sufficient?
No such thing as a dumb question!
By default, Prefect configures Dask to use the same image in your Flow’s Docker storage for all workers 👍
Nice - thanks @Dylan! I'll keep tinkering around with things. One last question, if I may: is it acceptable to set environment variables inside the Docker image to later be used as secrets? I'm not using Prefect Cloud - I'll be using Prefect Server deployed on a GCP VM. Or do I need to SSH into the VM, pull up the TOML config file, and set the passwords there?
You can definitely use Environment Variables as secrets, we even have a secret class for that!
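For reference, a minimal sketch of the environment-variable approach described above, using only the standard library. The helper name `get_secret` and the variable name `MY_SERVICE_PASSWORD` are hypothetical and not part of Prefect; the secret class mentioned above is not named in this thread, so this only illustrates the general idea of reading a secret at run time.

```python
import os


def get_secret(name: str) -> str:
    """Return the value of environment variable `name`, failing loudly if unset."""
    value = os.environ.get(name)
    if value is None:
        # Fail early with a clear message instead of passing None downstream.
        raise KeyError(f"secret {name!r} is not set in the environment")
    return value


# Hypothetical usage inside a task body:
# password = get_secret("MY_SERVICE_PASSWORD")
```

The variable has to be set wherever the task actually runs (for example on every Dask worker), and values baked into an image are readable by anyone who can pull that image.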
I have the same issue, but with a local agent. The flow depends on my own classes, and after successfully registering and running it I got the error " not module named". It was solved when I placed the directory with the dependency in my environment (C:\ProgramData\Anaconda3). In Airflow, for example, there is a plugins directory for this case. Does Prefect have the same option?
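One way to see why moving the directory helped: the process that runs the flow can only import modules found on `sys.path`. The sketch below is a generic Python workaround, not a Prefect feature, and the helper name `ensure_importable` is hypothetical. It checks whether a top-level module can be located and, if not, adds a directory to the search path.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def ensure_importable(module_name: str, search_dir: str) -> bool:
    """Make top-level `module_name` importable by adding `search_dir` to sys.path.

    Returns True if the module can be located afterwards, False otherwise.
    """
    if importlib.util.find_spec(module_name) is None:
        path = str(Path(search_dir).resolve())
        if path not in sys.path:
            sys.path.insert(0, path)
        # Drop cached directory listings so the new entry is actually searched.
        importlib.invalidate_caches()
    return importlib.util.find_spec(module_name) is not None
```

Setting the `PYTHONPATH` environment variable for the agent process, or installing the code as a package, has the same effect without touching `sys.path` in the flow file.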
@Dylan, about your suggestion above of installing the package: how can one do this? I’m quite new to Python packaging and I also have multiple files I would need to copy/install
Essentially, you’ll make a package in a directory next to your flow and then install that package during the build step for your container
Thanks for the link Dylan. I’m already using packages and modules in my code, but I’m not sure how to do the “install package” part…
Do I need to create a wheel and all that?
Hey @Adam you don’t need to worry about creating a wheel; you just need to create a directory structure of python files with a and call pip install . in that directory
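A rough sketch of what such a directory could look like. `pip install .` needs a build description in the directory, such as a `setup.py` (or a `pyproject.toml`); the one below is a hypothetical minimal example, with the package name taken from the folder tree earlier in the thread and a placeholder version. The helper `write_package_skeleton` is only for illustration and just creates the files.

```python
from pathlib import Path

# Hypothetical minimal setup.py; the version is a placeholder.
SETUP_TEMPLATE = '''\
from setuptools import find_packages, setup

setup(
    name="{name}",
    version="0.1.0",
    packages=find_packages(),
)
'''


def write_package_skeleton(root: Path, package: str = "alternative_data_pipelines") -> None:
    """Create `root/setup.py` and `root/<package>/__init__.py` if they are missing."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True, exist_ok=True)
    init_file = pkg_dir / "__init__.py"
    if not init_file.exists():
        # An empty __init__.py marks the directory as a package for find_packages().
        init_file.write_text("")
    setup_file = root / "setup.py"
    if not setup_file.exists():
        setup_file.write_text(SETUP_TEMPLATE.format(name=package))
```

With that layout in place, running `pip install .` from the directory containing `setup.py` (for instance as a build step after copying the whole directory into the image) installs every module under the package at once, so the files don't have to be copied individually.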
Thanks @Dylan. Do you have a full example of something like this? That would be super helpful! Especially where copying the files into the Docker storage is concerned (currently I copy them individually, but I’m sure there are more elegant ways to copy the entire package, etc.)
Hey @Adam I’m actually going to write a blog post about just this topic 👍
I’ll post in the #announcements channel when it’s out!
I am also stuck with the "ModuleNotFound" error in dask. Waiting for the blog post!