John

03/29/2023, 11:27 PM
I'm wondering what the best practice is for Docker + Prefect. I have a custom Docker image for running my code. Prefect is installed separately using conda, and the Prefect agent is daemonized with systemd. However, since the conda environment doesn't contain anything other than Prefect, I get this error when building a deployment:
Script at '/path/to/file.py' encountered an exception: ModuleNotFoundError("No module named 'numpy'")

Ryan Peden

03/29/2023, 11:36 PM
If you install conda when you build the image, that might be a good time to install numpy as well so it's always there when you need it

John

03/29/2023, 11:54 PM
Thank you. So it sounds like I should have all the Python dependencies in both the Docker image and the Python environment (conda in my case) where Prefect is installed? Is there a better option that avoids duplicate packages/environments and saves disk space?

Ryan Peden

03/30/2023, 1:04 AM
I don't think you need them in both places. A couple of options that might work:
• When you build the Docker image and install conda, also pre-create a conda environment with the dependencies you'll need when running Prefect. Then, when you run a container using the image, everything will be ready for you.
• Or, you could add a step to the Dockerfile that sets up a shared package cache for Anaconda and then installs numpy and any other heavy dependencies in conda's base environment. Then, when you 'install' numpy at runtime so Prefect can use it, it will just be pulled from the local cache instead of being re-downloaded.
🙏 1

John

03/30/2023, 3:04 AM
Some clarification questions for option 1: do you mean creating a conda environment (containing Prefect) inside the Docker image or outside of Docker?
• If the conda environment is inside the Docker image, that means I'll have to run Prefect terminal commands from a bash terminal inside the container, and use Docker to daemonize the Prefect agent.
• If it's outside Docker, are you suggesting that the conda environment can somehow be shared with a Docker image? I didn't know this was possible.

Ryan Peden

03/30/2023, 3:28 AM
Ahh, my apologies, I think I misunderstood what you're trying to do. Your first message was correct: you'll need your dependencies in both places. You need the dependencies installed wherever you run deployment creation because, as part of creating a deployment, Prefect imports the deployment's Python flow file so it can gather information about your flow function's parameters. When Prefect imports the file, Python sees import numpy and tries to import numpy as well, then complains when it is missing.
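A rough illustration of that import behavior in plain Python, with no Prefect involved. Everything named here is hypothetical: load_script, try_load, flow_file.py, and package_not_installed_here (a stand-in for numpy in an environment that lacks it). The sketch only assumes that importing a file executes its top-level statements, which is standard Python behavior; it is not Prefect's actual loading code.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_script(path):
    """Import a Python file by path, which executes its top-level statements."""
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Hypothetical stand-in for a dependency missing from the current environment.
MISSING = "package_not_installed_here"

# Flow file with the import at module level: it runs as soon as the file is imported.
TOP_LEVEL = f"import {MISSING}\n\ndef my_flow(n: int = 3):\n    return n\n"

# Flow file with the import inside the function: it runs only when my_flow is called.
DEFERRED = f"def my_flow(n: int = 3):\n    import {MISSING}\n    return n\n"


def try_load(source):
    """Write the source to a temporary file, import it, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "flow_file.py"
        path.write_text(source)
        try:
            load_script(path)
        except ModuleNotFoundError as exc:
            return f"failed: {exc}"
        return "loaded"


top_level_result = try_load(TOP_LEVEL)
deferred_result = try_load(DEFERRED)
print(top_level_result)
print(deferred_result)
```

The first file fails on import with the same kind of ModuleNotFoundError as above, while the second loads. Moving an import inside the function only postpones the failure until the function is called, so the package would still need to be installed wherever the flow actually runs.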
👍 1