Christopher Chong Tau Teng

11/29/2021, 8:09 AM
Hi there. I am trying to set up `DockerRun` but am having trouble importing my flow's task from another module: when I run the flow registration script below, it complains that the `src` module is not found. I think I am doing something wrong because I am confused about how `DockerRun` works. My Python tasks, such as `task_test_flow`, are defined in a separate Python module. I built a Docker image of that module, `gcr.io/some-bucket/prefect-flows:v1`, which I pass to `DockerRun(image='')` here. Then I created the following registration script, where I believe I need to import the task (as done in https://docs.prefect.io/orchestration/flow_config/docker.html#dependency-requirements). But here's the problem: `task_test_flow` is defined in the Docker image `gcr.io/some-bucket/prefect-flows:v1`, and it's not located in the same directory as the registration script below. I guess my question is: since I want to define my tasks separately from my flow registration script, how can I import a task (that is already built into a Docker image) into the registration script? Do I even need to do that in the first place?
from datetime import timedelta
from prefect import Flow
from prefect.schedules import IntervalSchedule
from prefect.storage import GCS
from prefect.run_configs import DockerRun
from src.test_flow import task_test_flow


schedule = IntervalSchedule(interval=timedelta(minutes=1))

with Flow("test-flow", schedule) as flow:
    task_test_flow()

flow.storage = GCS(bucket="some-bucket")
flow.run_config = DockerRun(image="gcr.io/some-bucket/prefect-flows:v1")
flow.register()
I figured from https://github.com/kvnkho/demos/tree/main/prefect/docker_with_local_storage that my task scripts need to be in the same parent directory as the flow registration script so that the import works. Is that right?
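(Editor's note: the `from src.test_flow import task_test_flow` line runs in the local interpreter at registration time, so the module has to be importable there, not only inside the Docker image. The snippet below is a minimal sketch of a pre-registration check; the helper name `is_importable` is hypothetical and not part of Prefect.)

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the current interpreter can locate `module_name`."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package is missing,
        # e.g. `src` when looking up `src.test_flow`.
        return False


if __name__ == "__main__":
    # `src.test_flow` is the module imported by the registration script above.
    for name in ("prefect", "src.test_flow"):
        status = "found" if is_importable(name) else "NOT found"
        print(f"{name}: {status}")
```

Running this in the environment used for registration shows whether the "module not found" error comes from the local environment rather than from the image.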

Anna Geller

11/29/2021, 2:30 PM
@Christopher Chong Tau Teng sorry for the late reply, I answered your last message and didn't notice this one in the morning. The registration environment needs the same Prefect version and other dependencies as the execution environment, which here is a Docker container. The easiest way to solve this would be to create an installable module with setup.py and install it in both places:
1. In the virtual environment from which you register your flows.
2. In the Docker image, by adding the install step to the Dockerfile so that it runs on image build.
As long as your custom package is installed in both environments, you don't need to worry about manually modifying PYTHONPATH. To give you more resources:
• This blog post explains how to build such a custom package and Docker image: https://medium.com/the-prefect-blog/the-simple-guide-to-productionizing-data-workflows-with-docker-31a5aae67c0a
• This repository provides a similar example; you can use its Dockerfile as a template: https://github.com/anna-geller/packaging-prefect-flows/
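(Editor's note: a minimal sketch of the kind of setup.py described above, not the contents of the linked repository. It assumes the tasks live in a `src/` directory that contains an `__init__.py` and sits next to this file; the distribution name and version are hypothetical placeholders.)

```python
# setup.py, placed next to the `src/` package directory (layout assumed)
from setuptools import find_packages, setup

setup(
    name="prefect-flows",  # hypothetical distribution name
    version="0.1.0",  # hypothetical version
    # Pick up `src` and its subpackages so `from src.test_flow import ...` works.
    packages=find_packages(include=["src", "src.*"]),
    # Ideally pinned to the same Prefect version in both environments.
    install_requires=["prefect"],
)
```

With a file like this in place, running `pip install .` from the project root installs the package into the registration virtual environment, and the same command in a Dockerfile `RUN` step installs it into the image.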

Christopher Chong Tau Teng

11/30/2021, 2:38 AM
Thanks for your reply!
🙌 1