# prefect-community
r
Hey all, we are wondering how to add a custom Python package or module as a dependency to our Prefect flow. We have tried different options, but all of them fail. Any suggestions? 🙂
f
Hi, I think you should make a Python package in order to add dependencies.
b
Or you can put your custom module on your Python environment's path (the site-packages directory).
s
@fabian wolfmann This was our first thought, but it doesn't really work when the Python package is developed alongside the Prefect flow (as in, in the same repository). That is, unless you want to push your changes to the repo every time you want to test whether they work in the flow. But in that case I think the development speed would suffer considerably 😕
@bral Would you mind explaining this a bit more? From my understanding, during the Docker build, Prefect will try to pull the required packages from e.g. PyPI, so the local Python environment is not copied over into the Docker image.
f
I'm creating a Python package: you can create a "setup.py" with Prefect as a dependency and run "pip install -e ." in the package directory. The -e flag marks the package as editable, so your changes are picked up every time you import the package. I'm working like this and have no issues.
s
@fabian wolfmann Thanks a lot for the feedback. But I guess I must still be missing something obvious, since our package is already installed in editable mode. When the Docker image is built, it gives a "module not found" error during the health check. In comparison, simply running the flow in a local executor works without an issue.
f
Yes, because Docker will pull the Python packages from PyPI, not from your local environment. You should look into how to install local packages in a Docker image.
b
@fabian wolfmann So when you are writing a flow, you can store all your custom plugins/modules in the project directory. After the flow is deployed (registered on the server), you need to add your package as a dependency to the environment where the agent runs and, if you use a Dask cluster, to the environment of every worker too (for a multi-server cluster). In your case, you can add your local PyPI server to pip.conf inside the Docker image.
r
Thanks already for the discussion! 🙏 Good point that we have to install the package in both places. It's still not clear to me:
1. How do we install the custom package in Docker using "pip install -e ."?
2. How do we install the custom package in the Prefect environment (a dask-kubernetes cluster on AWS EKS in our case)?
@bral, could you elaborate on this: "In your case you can add your local pypi server in pip.conf inside docker image"?
Concerning 1: Do you mean something like the following discussion when referring to "pip install -e ."? https://stackoverflow.com/questions/44708481/how-to-install-local-packages-using-pip-as-part-of-a-docker-build
b
1. First, install Python and pip in your Dockerfile.
2. Copy your pip.conf into the image using the COPY directive. https://hackernoon.com/custom-python-pypi-repository-409f14975374
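For anyone following along, here is a rough sketch of what such a pip.conf could contain, generated with Python's standard library so it is easy to check. The index URL is a made-up placeholder, and using "extra-index-url" (rather than replacing the default index) is just one possible choice, not necessarily what was meant above.

```python
import configparser
import io


def render_pip_conf(index_url: str) -> str:
    """Return pip.conf text that adds one extra package index."""
    config = configparser.ConfigParser()
    # "extra-index-url" makes pip search this index in addition to PyPI.
    config["global"] = {"extra-index-url": index_url}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical URL of a private package index (an assumption for illustration).
PIP_CONF_TEXT = render_pip_conf("https://pypi.internal.example/simple")
print(PIP_CONF_TEXT)
```

The resulting text could be saved as pip.conf next to the Dockerfile and copied to a location pip reads inside the image, for example /etc/pip.conf on Linux.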
r
Thanks a lot! We will try it out tomorrow, greetings from Europe 🙂
@Severin Ryberg [sevberg] implemented an elegant solution that solved the issue for us: https://github.com/PrefectHQ/prefect/pull/3299
s
That's only half of the solution, actually 😜. The other half depends on this, though, so I'll submit it once this change is accepted. I'll make a note here when it's all in.
Hello again everyone. In case you are interested, two new PRs have been merged into the latest Prefect version. The first (PR 3299, mentioned above by Robin) allows entire directories to be copied into the Docker image's build environment. The second (PR 3342) allows injecting arbitrary Docker commands into the build via the new "extra_dockerfile_commands" argument. Therefore, to add a local Python module to your Docker image, you now only need to add the module's directory to the "files" dictionary and then add the following argument:
extra_dockerfile_commands=["RUN pip install -e /path/to/module/in/image", ]
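To make the two pieces concrete, here is a minimal sketch that builds both arguments together. The helper name and both paths are made up for illustration. It assumes that "files" maps a host path to a path inside the image and that the extra commands run after the files have been copied, as the message above implies.

```python
from typing import Any, Dict


def local_module_storage_kwargs(host_dir: str, image_dir: str) -> Dict[str, Any]:
    """Build the two Docker storage arguments described above.

    Hypothetical helper: it only assembles plain Python data and does not
    import Prefect itself.
    """
    return {
        # Assumed meaning: host path -> destination path inside the image.
        "files": {host_dir: image_dir},
        # Install the copied module in editable mode during the image build.
        "extra_dockerfile_commands": [f"RUN pip install -e {image_dir}"],
    }


# Hypothetical paths; replace them with your own module location.
kwargs = local_module_storage_kwargs("/home/me/repo/my_module", "/opt/my_module")
print(kwargs)
```

With Prefect installed, these keyword arguments would be passed on to the Docker storage class, e.g. Docker(**kwargs) together with your usual registry and image settings. The import path of that class depends on the Prefect version you are on.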
r
PS: Actually, I guess that "-e" probably does not make much sense and might break some Python packages (e.g. snowflake-connector-python) in some cases.
s
I suggested the "-e" flag to avoid having a second copy of the Python module in the Docker image. If you don't want to use it, you should additionally use a "RUN" command to remove the first copy. Anyway, regarding the issue of installing the Snowflake connector, this is something you would normally install via a plain pip install command anyway (without copying any files from the host beforehand), so I don't think this is a big concern.
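A small sketch of the two variants discussed here, with and without "-e". The helper name and the path are made up for illustration. In the non-editable case the copied source directory is removed in the same RUN command, so the removal only happens if the install succeeds.

```python
from typing import List


def install_commands(image_dir: str, editable: bool = True) -> List[str]:
    """Return Dockerfile commands that install a module copied to image_dir."""
    if editable:
        # Editable install: the package is used directly from image_dir,
        # so the directory has to stay in the image.
        return [f"RUN pip install -e {image_dir}"]
    # Regular install: pip copies the package into site-packages, so the
    # original directory can be removed afterwards.
    return [f"RUN pip install {image_dir} && rm -rf {image_dir}"]


# Hypothetical image path, for illustration only.
print(install_commands("/opt/my_module"))
print(install_commands("/opt/my_module", editable=False))
```

Note that Docker layers are additive, so removing the directory in a later command tidies the final filesystem but does not shrink the layer that copied the files in.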