Hello Team,
Has anyone had a chance to test/investigate packaging a Flow with multiple .py files (or in a hierarchical package structure) to be executed in Docker Storage?
I’m trying to avoid adding extra parameters to the Docker Storage when writing a flow; I’d rather have my Flow packaged up as a Python package and copied to the Docker image, which will later be used by Prefect to execute the Flow.
from prefect import Flow
from prefect.storage import Docker

with Flow(
    "Example-Flow",
    storage=Docker(
        image_name="example_flow_import",
        image_tag="dev",
        dockerfile="<path/to/dockerfile>",
        python_dependencies=["numpy", "pandas"],
        files={},  # Trying to avoid this
        env_vars={},  # Trying to avoid this
    ),
) as flow:
    ...  # tasks would be added here
Kevin Kho
07/27/2021, 5:24 PM
Hey @Mehdi Nazari, I have a demo of a Python package added to the Docker file here. I think this might help.
Mehdi Nazari
07/27/2021, 5:36 PM
Thanks @Kevin Kho. Looking at the code, it appears it is possible to have a Local Agent run a Docker image as an isolated execution environment for the flow?
In other words, I don’t necessarily have to bring up a Docker Agent for a docker image to be able to execute?
Kevin Kho
07/27/2021, 5:38 PM
The Storage is Local but the RunConfig is DockerRun, which makes the file paths relative to the container. This would still use the Docker Agent as opposed to the Local Agent to spin up the container.
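A minimal sketch of the Local storage plus DockerRun combination described above, assuming the Prefect 1.x (0.15-era) API. The task, the script path /app/workflow/flow.py, and the reuse of the image name from the question are all hypothetical placeholders, not taken from the linked demo:

```python
from prefect import Flow, task
from prefect.run_configs import DockerRun
from prefect.storage import Local


@task
def say_hello():
    # Placeholder task so the flow has something to run.
    print("hello from inside the container")


with Flow(
    "Example-Flow",
    # Local storage pointing at a script; with DockerRun this path is looked up
    # inside the container, not on the host. The path itself is an assumption.
    storage=Local(path="/app/workflow/flow.py", stored_as_script=True),
    # The image is assumed to already contain the installed package and flow file.
    run_config=DockerRun(image="example_flow_import:dev"),
) as flow:
    say_hello()
```

With this shape, the package and the flow file are baked into the image by the Dockerfile, so nothing needs to be passed through the files or env_vars arguments of Docker storage.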
Mehdi Nazari
07/27/2021, 6:17 PM
I’m getting an error testing your code; Do I need to install the python package on my local env in order for it to be registered?
Kevin Kho
07/27/2021, 6:18 PM
What is your error? I guess that would help if you are running into import issues.
Mehdi Nazari
07/27/2021, 6:22 PM
Yeah, import issues, I believe; this was after running
prefect register
Kevin Kho
07/27/2021, 6:23 PM
I think this will go away if you register from the project root.
python workflow/flow.py
Mehdi Nazari
07/27/2021, 6:28 PM
Still there…
Kevin Kho
07/27/2021, 9:32 PM
Oh sorry, missed this. You could install the package with
pip install -e .
What IDE are you using?
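For context on why installing the package can clear up the import error: Python only finds a package when the directory containing it is on sys.path. The sketch below uses only the standard library and a throwaway package with the hypothetical name demo_flow_pkg. It edits sys.path by hand as a stand-in for an install; that is an illustration, not a description of what pip does internally.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PACKAGE = "demo_flow_pkg"  # hypothetical package name, not from the thread


def can_import(name: str) -> bool:
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


with tempfile.TemporaryDirectory() as project_root:
    # Lay out <project_root>/demo_flow_pkg/{__init__.py, tasks.py}.
    pkg_dir = Path(project_root) / PACKAGE
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "tasks.py").write_text("def add(x, y):\n    return x + y\n")

    # The project root is not on sys.path yet, so the import fails.
    before = can_import(PACKAGE)

    # Once the project root is on sys.path, the package becomes importable.
    sys.path.insert(0, project_root)
    after = can_import(PACKAGE)
    result = importlib.import_module(f"{PACKAGE}.tasks").add(2, 3)

    # Undo the changes so the demo leaves the interpreter as it found it.
    sys.path.remove(project_root)
    for mod in [m for m in sys.modules if m == PACKAGE or m.startswith(PACKAGE + ".")]:
        del sys.modules[mod]

print(before, after, result)
```

The same check (printing sys.path in the process that raises the error) is a quick way to confirm whether the package directory is visible where prefect register runs.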