# prefect-community
m
Hello all, I am in the process of migrating over code to work with Prefect. All of our existing code is in docker containers. My assumption is that I would need a docker-agent to run the code once I rebuild the containers to have prefect inside of it. Is this correct? We are looking for the ability to restart different parts of the ETL task upon failure like prefect provides. Am I correct? Thank you all in advance!
z
Hi @Matthew Blau, there are a few ways to do this. You can use `Docker` storage for your flows and package them with your existing code, then use a Docker agent to run them. Alternatively, you can use file-based storage (e.g. S3 storage) for your flows and use your existing code as a base Docker image. In that case, you'd set up a `DockerRun` run config for your flow that runs it in your base image (or another run config that uses a Docker image, e.g. `ECSRun`).
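For reference, the second option (file-based storage plus a `DockerRun` run config) might look roughly like the sketch below. It assumes Prefect 1.x (0.14 or later) import paths. The helper name `use_s3_with_docker_run` and its arguments are hypothetical, and the imports are deferred so the function can be defined without Prefect installed.

```python
def use_s3_with_docker_run(flow, bucket, image):
    """Store the flow in S3 and run it inside an existing Docker image."""
    # Deferred imports; these paths assume Prefect 1.x (0.14 or later).
    from prefect.run_configs import DockerRun
    from prefect.storage import S3

    # The flow is uploaded to the bucket when it is registered.
    flow.storage = S3(bucket=bucket)
    # A Docker agent starts this image and loads the flow from S3 into it,
    # so the image needs Prefect plus the existing ETL code installed.
    flow.run_config = DockerRun(image=image)
    return flow
```

After calling the helper with a real bucket name and image tag, `flow.register(project_name=...)` would register the flow as usual.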
m
Hi @Zanie, all of our Docker images are local to the server that Prefect is installed on. I was looking into using the Prefect base image with our existing code installed inside of it. With `@task` on the relevant parts of the code, that would expose the tasks in the Prefect UI, yes?
z
You'll have to chain the tasks with `Flows`, then when you register the flow use `Docker` storage with a reference to your `Dockerfile` (or add your code using `extra_dockerfile_commands`).
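As a rough illustration of chaining tasks in a flow, the sketch below keeps the ETL steps as plain functions and wraps them with `task(...)` inside a `Flow` block. It assumes Prefect 1.x. The function names, the sample records, the `items` table, and the flow name are all made up for the example, and SQLite stands in for the real database.

```python
import csv
import io
import sqlite3


def fetch_rows():
    """Stand-in for the API call; returns hypothetical records."""
    return [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]


def rows_to_csv(rows):
    """Serialize the records to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def load_to_db(csv_text, db_path=":memory:"):
    """Load the CSV text into SQLite and return the table's row count."""
    rows = [(int(r["id"]), r["name"]) for r in csv.DictReader(io.StringIO(csv_text))]
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT)")
        conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()


def build_flow():
    """Wrap the plain functions as Prefect tasks and chain them."""
    # Deferred import; assumes Prefect 1.x is installed where this runs.
    from prefect import Flow, task

    with Flow("etl-example") as flow:
        rows = task(fetch_rows)()
        csv_text = task(rows_to_csv)(rows)
        task(load_to_db)(csv_text)
    return flow
```

Passing one task's result into the next is what defines the dependency order, so each step shows up as its own task and can be retried or restarted separately.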
m
My current understanding (which could definitely be flawed) is that I should take the existing dockerized code, modify the Dockerfile to use Prefect as a base image, e.g. from prefect-python-3.6 or similar, add `@task` decorators where relevant, and add a `flow.register()` call at the end of the file, as our software is fairly straightforward: API call -> create CSV -> load data into database.
z
So when you call `flow.register()`, the default storage method is to take the flow object and pickle it on the local file system. When you then try to run your flow, it will not know that it should be executed in a Docker container. This is why you need to specify `flow.storage = Docker(…)` before registration. You can pass the path to your `Dockerfile` in.
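A minimal sketch of that ordering, assuming Prefect 1.x (0.14 or later): storage is attached first, then the flow is registered. The helper name `register_in_docker` and the image name and tag are hypothetical. The image built from the Dockerfile is assumed to have Prefect installed alongside the existing code.

```python
def register_in_docker(flow, dockerfile_path, project_name):
    """Attach Docker storage to the flow, then register it."""
    # Deferred import; this path assumes Prefect 1.x (0.14 or later).
    from prefect.storage import Docker

    flow.storage = Docker(
        dockerfile=dockerfile_path,  # path to the existing Dockerfile
        image_name="etl-example",    # hypothetical image name
        image_tag="latest",          # hypothetical tag
    )
    # Registration builds the storage by default, so the image is built here.
    return flow.register(project_name=project_name)
```

With no registry URL set, the image stays on the machine that built it, which fits a setup where the Docker agent runs on the same server.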
m
@Zanie Ah okay, so I can pass the path in, it will then build the container if it needs building, and from there I am able to run the flow from the UI?
z
Yep!
m
@Zanie this is the relevant documentation I should be looking at for this, correct? https://docs.prefect.io/orchestration/agents/docker.html#requirements