# prefect-community
f
Hi Prefect team! Can someone pls confirm or clarify my understanding of how Prefect works with Docker:
• I can run an app inside a Docker container; any dependencies it needs, e.g. the pandas library, are installed inside that container. Our repo already has a Dockerfile and a docker-compose.yml which build an image and run the app inside a container successfully. The commands to do that are `docker-compose build app`, then `docker-compose run app`.
• A Prefect Agent runs flows. The Local Agent runs flows with the local setup. This is not feasible for production, since we’d need to manually make sure that whichever machine hosts the flow has all dependencies installed. Better: run flows within a Docker container. This way we don’t have to worry about dependencies, since every machine will simply be running the container. To run flows in containers we need to use the Docker Agent.
• The Docker Agent DOES NOT REQUIRE DOCKER STORAGE, but can work with Docker Storage if desired. The Docker Agent can run with a locally available image (image="example/image-name:with-tag"). If Docker Storage is used, you can provide a URL (`registry_url`) pointing to where your image is hosted, e.g. Docker Hub, and that URL is used to get the image.
• I should be able to run the Docker Agent with the image I previously created with docker-compose. (It doesn’t seem to be able to find the image, even though I specify the absolute path.)
z
• You can run flows in Docker containers using the ECSAgent or KubernetesAgent as well as the DockerAgent.
• The DockerAgent can use DockerStorage, or the DockerRun config together with a file-based storage.
• I do not believe you can pass a registry-url to the Docker Agent. Where do you see that?
• What image did you create with docker-compose that can’t be used?
f
@Zanie thank you for replying! The registry-url is a param for Docker Storage, not Agent. Sorry if that was confusing! Is it right that the Docker Agent can run flows with a local image of the app? I created an image of the whole app, folder structure looks like this:
App
-->app_logic
-->prefect_flows
Dockerfile
docker-compose.yml
I think I’m not fully understanding if the image refers to the image of the app? But what else could it refer to?
z
Yes, the Docker agent should be able to do that. Can you give the `DockerRun` config you’re setting on your flow, the log showing the image name in your local Docker host, and the log of the agent failing to find the image?
f
@Zanie Ok that’s good to know!! Sure! Let me share the screenshots and the code here:
flow.run_config = DockerRun(
    image="/var/lib/docker/concierge_iro_reporting_app:latest"
)
flow.executor = LocalDaskExecutor(scheduler="threads")
flow.register(project_name=project_name)
The registration works, but when I try to run it it’s just stuck in pending mode. When I check the agent it just seems to be waiting for flow runs. What am I missing?
z
For the image you should be able to just do `concierge_iro_reporting_app:latest`. However, the agent failing to pick it up is a different issue.
Ah, your flow storage is just local, so it’s putting your hostname as a label on the flow (it’s only available on your machine). When the Docker run image tries to load the flow it won’t be able to access it. You’ll need to set up flow storage somewhere the Docker container can access it (e.g. S3, GitHub, etc.)
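Editor's sketch of the image-name point above, assuming the Prefect 1.x API (`prefect.run_configs.DockerRun`). The helpers `image_reference` and `attach_docker_run` are hypothetical names introduced only for illustration. The idea is that the image is looked up by name and tag in the Docker daemon, not by a filesystem path.

```python
def image_reference(name: str, tag: str = "latest") -> str:
    """Build a Docker image reference of the form name:tag.

    The Docker daemon resolves images by name and tag, so a filesystem
    path such as /var/lib/docker/... is rejected here.
    """
    if name.startswith("/"):
        raise ValueError("expected an image name, not a filesystem path")
    return f"{name}:{tag}"


def attach_docker_run(flow, image: str):
    """Set a DockerRun config on a flow (assumes Prefect 1.x)."""
    # Imported lazily so image_reference works without Prefect installed.
    from prefect.run_configs import DockerRun

    flow.run_config = DockerRun(image=image)
    return flow
```

With this, `attach_docker_run(flow, image_reference("concierge_iro_reporting_app"))` would set the run config to the name-and-tag form suggested above. It does not address the storage issue, which is separate.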
f
@Zanie Oh ok! To confirm: to use the Docker Agent I must have Prefect Storage set to something other than local? Does Prefect Storage host the image of my app too or do I need to host that somewhere else, or can I use my local image?
z
Prefect Storage just tells us where you want to store the flow in your own infrastructure.
Local stores it on your local file system, which cannot be accessed by a running Docker container (unless you set up mount points or copy the flow into the Docker image).
If your Python module is installed in your Docker image, then you can use the path to the flow in your module, e.g. `Local(path="my_module.my_flow")`, or you could set the path for your local storage to somewhere known in your file system and copy it into your image (for this you need to pass `stored_as_script=True`).
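Editor's sketch of the script-based option described above, assuming the Prefect 1.x API (`prefect.storage.Local` with `path`, `stored_as_script`, and `add_default_labels`). The working directory `/app` and the helper names are assumptions for illustration; substitute whatever directory your Dockerfile actually copies the code into.

```python
import posixpath


def flow_path_in_image(workdir: str, relative_path: str) -> str:
    """Return the absolute path of the flow file inside the image.

    Containers use POSIX paths, so posixpath is used regardless of the
    operating system this runs on.
    """
    return posixpath.join(workdir, relative_path)


def use_local_script_storage(flow, path_in_image: str):
    """Point Local storage at a script baked into the image (Prefect 1.x)."""
    # Imported lazily so flow_path_in_image works without Prefect installed.
    from prefect.storage import Local

    flow.storage = Local(
        path=path_in_image,
        stored_as_script=True,
        # Skip the default hostname label so an agent on another
        # host or in a container can pick up the flow.
        add_default_labels=False,
    )
    return flow
```

Using an absolute path avoids depending on the container's working directory at load time. The same path has to be valid where the flow is registered and inside the image.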
f
@Zanie Oh ok! I thought Prefect Storage is only needed for the cloud orchestration, I didn’t realize it’s part of core as well! I’m trying to set storage up using the instructions here https://docs.prefect.io/core/idioms/file-based.html#file-based-docker-storage , the image I’m using already has my flow files added into it, so then this is my updated code:
flow.run_config = DockerRun(
    image="concierge_iro_reporting_app:latest",
    labels=["docker"]
)

flow.storage = Docker(
    path="prefect_flows/flows/generate_report_lt_target.py",
    stored_as_script=True
)
flow.executor = LocalDaskExecutor(scheduler="threads")
flow.register(project_name=project_name)
It doesn’t seem to be able to locate the file with the flow, see the error message in the screenshot. When I run my image and check whether the file exists, it returns True though:
>>> os.path.isfile('prefect_flows/flows/generate_report_lt_target.py')
True
What am I missing?
z
Use `Local` storage instead of `Docker` storage there, and you’ll have to make sure that path is consistent both locally and within your Docker image.
And note that you are using orchestration. Just `core` is when you’re using `flow.run()`, but you’re registering your flow to a backend.
Honestly, your life would be a lot easier here if you just used one of the remote storage backends (e.g. S3, GitHub) or used `Docker` storage with your `Dockerfile` as a base file instead of using `docker-compose`.
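Editor's sketch of the "Docker storage with your Dockerfile as a base" suggestion, assuming the Prefect 1.x API (`prefect.storage.Docker` with `dockerfile`, `image_name`, and `image_tag`). The helper names and the default values are assumptions for illustration, not taken from the thread's final working code.

```python
def docker_storage_kwargs(
    dockerfile: str = "Dockerfile",
    image_name: str = "concierge_iro_reporting_app",
    image_tag: str = "latest",
) -> dict:
    """Collect the keyword arguments passed to Docker storage."""
    return {
        "dockerfile": dockerfile,  # existing Dockerfile used as the base
        "image_name": image_name,
        "image_tag": image_tag,
    }


def use_docker_storage(flow, **overrides):
    """Attach Docker storage built from the repo's Dockerfile (Prefect 1.x)."""
    # Imported lazily so docker_storage_kwargs works without Prefect installed.
    from prefect.storage import Docker

    flow.storage = Docker(**docker_storage_kwargs(**overrides))
    return flow
```

With Docker storage, Prefect builds the image when the flow is registered, so the flow ends up inside the image and no separate path bookkeeping is needed.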
f
I didn’t realize using the Dockerfile as a base file would make things easier!! I’ve made sure to label the agent and flow accordingly and everything works now as it should! (I just need to figure out how to set environment variables, but I think that was somewhere in the docs.) One of the next steps will be to use remote storage, so there might be Qs coming up in the future 😅 Thank you for all of your help @Zanie!!
z
Wonderful! To use remote storage just set the registry url in the docker storage and you should be good!
Docker storage just takes an `env_vars` dict for that part.
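Editor's sketch of the two closing hints (registry URL and `env_vars`), assuming the Prefect 1.x API (`prefect.storage.Docker` with `registry_url` and `env_vars`). The helpers `collect_env_vars` and `remote_docker_storage` are hypothetical names; the variable names you pass in are your own.

```python
import os
from typing import Iterable, Mapping, Optional


def collect_env_vars(
    names: Iterable[str], source: Optional[Mapping[str, str]] = None
) -> dict:
    """Pick the named variables from source (default: os.environ).

    Raises KeyError listing every missing name, so a misconfigured
    environment fails before the image is built.
    """
    source = os.environ if source is None else source
    names = list(names)
    missing = [name for name in names if name not in source]
    if missing:
        raise KeyError(f"missing environment variables: {missing}")
    return {name: source[name] for name in names}


def remote_docker_storage(registry_url: str, env_vars: Mapping[str, str]):
    """Create Docker storage that targets a registry (Prefect 1.x)."""
    # Imported lazily so collect_env_vars works without Prefect installed.
    from prefect.storage import Docker

    return Docker(registry_url=registry_url, env_vars=dict(env_vars))
```

Values passed through `env_vars` are written into the built image, so secrets are probably better supplied another way.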