# prefect-community
r
Another quick question: I've decided to split my Prefect tasks into multiple Python files to make things more organized and modular. When I submit the flow to the Dask executor, I get a `ModuleNotFoundError`. Do I need to dockerize the flow and specify Python paths for this to work? Thanks in advance! My folder tree looks something like this:
├── alternative_data_pipelines
│   ├── thinknum
│   │   ├── __init__.py
│   │   └── thinknum.py
│   └── utils
│       ├── __init__.py
│       ├── logging.py
│       ├── snowflake.py
│       └── utils.py
├── requirements.txt
├── setup.py
└── thinknum_flow.py
d
Hi @Riley Hun, welcome! Your Dask workers need access to the same dependencies as your flow in order to run correctly. Dockerizing your flow and installing the `alternative_data_pipelines` package as part of your image build process is probably the best way to achieve this 👍
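The error can be reproduced without Dask. The sketch below (not from the thread) builds a small stand-in for the package layout in a temporary directory and shows that the import only resolves once the package's parent directory is importable, which is roughly what installing the package into the worker image achieves. The temporary directory, the `can_import` helper, and the added top-level `__init__.py` are assumptions made for illustration.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def can_import(module_name: str) -> bool:
    """Return True if the current interpreter can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name cannot be found.
        return False


# Stand-in for the layout in the question, created in a throwaway directory.
# (A top-level __init__.py is added so the stand-in is a regular package.)
project_root = Path(tempfile.mkdtemp())
package_dir = project_root / "alternative_data_pipelines"
utils_dir = package_dir / "utils"
utils_dir.mkdir(parents=True)
(package_dir / "__init__.py").write_text("")
(utils_dir / "__init__.py").write_text("")
(utils_dir / "utils.py").write_text("def hello():\n    return 'hello from utils'\n")

module = "alternative_data_pipelines.utils.utils"

# A process that has never been given the package behaves like this.
before = can_import(module)

# Putting the parent directory on sys.path mimics, for this process only,
# what installing the package into the worker environment does.
sys.path.insert(0, str(project_root))
importlib.invalidate_caches()
after = can_import(module)

print(before, after)
```

A Dask worker is a separate Python process, so it needs the package available in its own environment; changing `sys.path` in the process that submits the flow does not help the workers.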
r
Got it! Thanks @Dylan. Apologies - kind of a dumb question - but do I need to update the docker image for the dask workers? Or is adding all dependencies in the docker image of the flow sufficient?
d
No such thing as a dumb question!
By default, Prefect configures Dask to use the same image in your Flow’s Docker storage for all workers 👍
r
Nice - thanks @Dylan! I'll keep tinkering around with things. One last question, if I may: is it acceptable to set environment variables inside the Docker image to later be used as secrets? I don't have Prefect Cloud installed - I'll be using Prefect Server deployed on a GCP VM. Or do I need to SSH into the VM, pull up the TOML config file and set the passwords there?
d
You can definitely use Environment Variables as secrets, we even have a secret class for that! https://docs.prefect.io/core/concepts/secrets.html#environment-variables
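As a rough illustration of the pattern (plain Python, not Prefect's own secrets API, which the linked page describes), a task can read a credential from an environment variable and fail loudly when it is missing. The variable name `SNOWFLAKE_PASSWORD` and the `read_secret` helper are made up for this sketch.

```python
import os


def read_secret(name: str) -> str:
    """Fetch a secret from the environment, failing loudly if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name!r} is not set")
    return value


# Hypothetical variable. In practice it would be set in the image or when the
# container starts, not in code; it is set here only so the sketch runs on its own.
os.environ.setdefault("SNOWFLAKE_PASSWORD", "example-only")

password = read_secret("SNOWFLAKE_PASSWORD")
```

Values baked into an image are readable by anyone who can pull that image, so supplying them when the container starts is usually the safer variant.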
b
I have the same issue, but with a local agent. My flow depends on my own classes, and after successfully registering and running it I got a "No module named" error. It was solved when I placed the directory with the dependency in my environment (C:\ProgramData\Anaconda3). In Airflow, for example, there is a plugin directory for this case. Does Prefect have a similar option?
a
@Dylan, regarding your suggestion above of installing the package: how can one do this? I’m quite new to Python packaging, and I also have multiple files I would need to copy/install.
d
Essentially, you’ll make a package in a directory next to your flow and then install that package during the build step for your container
a
Thanks for the link Dylan. I’m already using packages and modules in my code, but I’m not sure how to do the “install package” part…
Do I need to create a wheel and all that?
d
Hey @Adam, you don’t need to worry about creating a wheel; you just need to create a directory structure of Python files with a `setup.py` and call `pip install .` in that directory
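Spelled out, that build step amounts to a few Dockerfile instructions: copy the project directory (the one containing `setup.py`) into the image, then run `pip install .` there. The sketch below only renders those lines as text so the order is explicit. It is a generic Docker illustration, not what Prefect's Docker storage emits; the base image tag, the `/opt/project` path, and the `render_dockerfile` helper are assumptions.

```python
def render_dockerfile(base_image: str, project_dir_in_image: str) -> str:
    """Render a minimal Dockerfile that installs a local package with pip."""
    lines = [
        f"FROM {base_image}",
        # Copy the build context (setup.py plus the package directories) into the image.
        f"COPY . {project_dir_in_image}",
        f"WORKDIR {project_dir_in_image}",
        # Third-party dependencies first, then the local package itself.
        "RUN pip install -r requirements.txt",
        "RUN pip install .",
    ]
    return "\n".join(lines) + "\n"


# Assumed values: any Python base image and any target directory would do.
dockerfile_text = render_dockerfile("python:3.8-slim", "/opt/project")
print(dockerfile_text)
```

Copying the whole directory in one step avoids listing files individually, and because the package is installed rather than merely copied, it is importable from any working directory inside the container.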
a
Thanks @Dylan. Do you have a full example of something like this? That would be super helpful! Especially where copying the files into the Docker storage is concerned (currently I copy them individually, but I’m sure there are more elegant ways to copy the entire package, etc.)
d
Hey @Adam I’m actually going to write a blog post about just this topic 👍
I’ll post in the #announcements channel when it’s out!
a
Amazing!
r
I am also stuck with the "ModuleNotFound" error in dask. Waiting for the blog post!