I'm running into a bit of a catch 22. I am trying ...
# ask-community
m
I'm running into a bit of a catch 22. I am trying to set up a CI process that registers the flows. Here is an example
import pandas as pd
from prefect import Flow
from prefect.storage.docker import Docker

# ... task definitions

with Flow(
    "test_flow",
    storage=Docker(
        registry_url=ecr_registry_url,
        image_name=a_repo_name,
        python_dependencies=["python==1.2"],
    ),
) as flow:

    # ... all the steps
I then try to execute
prefect build -p path/to/file.py
and it throws an error that pandas is not installed (which it isn't)
ModuleNotFoundError: No module named 'pandas'
Is there a way to register a flow, without having to install the flow's dependencies first?
k
Hi @Marc Lipoff! Can I have a bit more detail? Wouldn’t it be a matter of installing the dependencies in the Docker container?
Or you specifically want to register without the dependencies?
m
Correct. I made my example a little more clear
The "Docker" storage does define that I want python==1.2 as a dependency.
👀 2
k
I’ll try something and get back to you
👍 1
No unfortunately, I don’t think this can be done because there is some serialization that happens so those libraries are needed and the flow is also checked.
m
That's unfortunate. Appreciate you checking
k
@Marc Lipoff I was wrong. What you need to try is deferring the import inside a task. This will defer the execution.
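A minimal sketch of what deferring the import could look like, assuming the task body is a plain function. The module name not_installed_dependency is a hypothetical stand-in for pandas, and the Prefect task decorator is left out so the snippet runs on its own:

```python
def load_table(rows):
    # Deferred import: Python resolves it only when the function body runs,
    # so merely defining (or importing) this function does not need the module.
    import not_installed_dependency as nid  # hypothetical stand-in for pandas

    return nid.DataFrame(rows)


# Defining load_table succeeds even though the module is missing.
# The ModuleNotFoundError would only surface when load_table(...) is called,
# which in this thread's setup would happen inside the Docker image.
```

The file can then be executed for registration without the dependency, as long as nothing at module level touches it.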
m
ah ok...
k
Second, for a more accurate explanation, Prefect runs the flow script to serialize the flow metadata for registration.
z
I know some users also
try/except ImportErrors
at the top so they can get their flow to register without having the modules available. This is kind of a limitation of Python -- we need to be able to import the flow from that file which requires running the whole file.
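A rough sketch of that try/except pattern, assuming pandas is the optional dependency. The require_pandas helper is hypothetical and only there to give a clearer error at run time:

```python
try:
    import pandas as pd
except ImportError:
    # Registration environment without pandas: keep the name defined so the
    # rest of the file can still be executed top to bottom.
    pd = None


def require_pandas():
    """Return the pandas module, or fail with a clear message at run time."""
    if pd is None:
        raise RuntimeError("pandas is required to run this task")
    return pd
```

Anything that uses pd at module level (for example a type annotation such as pd.DataFrame) would still fail with this approach, since pd is None when the import is skipped.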
k
Last thing is that you may need to set
stored_as_script=True
in Docker storage so that the flow doesn’t get serialized.
m
I'll give it a try. So far, I'm having a few (circumventable) problems, with typing. For example, one function takes df: pd.DataFrame as an input
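One possible way around the typing problem, sketched under the assumption that nothing inspects the annotation at registration time (not verified against Prefect here): import pandas only for type checkers and write the annotation as a string. The row_count function is a hypothetical example:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers, never at run time,
    # so the file still imports without pandas installed.
    import pandas as pd


def row_count(df: "pd.DataFrame") -> int:
    # The string annotation is stored as-is and not resolved when the
    # function is defined.
    return len(df)
```

Any tool that calls typing.get_type_hints on the function would still need pandas to resolve the name.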
I think that approach is going to be a bigger headache than it's worth. For example, without the imports, I can't use GreatExpectations tasks
k
Oh ok yeah that’s gonna be a bit too hard to circumvent