# prefect-community
a
Can I reuse a single Docker storage object for multiple registered flows?
What I want to accomplish is to have a single image with the dependencies I need, and use that image to execute multiple flows.
What I want to avoid is having to create an image for every flow, because it takes a while to build and push each image to the Docker registry during our DevOps process.
a
Yes, you can, provided that you use `stored_as_script=True`; see examples here: https://github.com/anna-geller/packaging-prefect-flows/tree/master/flows_no_build
k
Yes, and if you pre-build, just make sure to set `build=False`.
a
This part is important though:
```python
if __name__ == "__main__":
    docker_storage.add_flow(flow)
    flow.register(project_name="community", build=False)
```
a
Thank you, this looks good for our purposes. A couple of questions:
1. When executed, the flow will run from inside the container, so the custom modules will be available, assuming they were added to the image and the Python path was adjusted. Am I interpreting these lines correctly? https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows_no_build/docker_script_kubernetes_run_custom_ecr_image.py#L17-L18. It seems to me that the custom modules must also be available at registration time, because the flow's Python module must be executed to register the flow.
2. This pattern assumes that the image already exists and is in a repository. When does this image get built? I've only ever done `flow.register()`, which builds the image in what seems like a Prefect context, because there are several Docker build steps related to Prefect that we didn't define in our Dockerfile. `flow.register()` also couples registration and image building, and this demo app shows how to do registration, but how is the image built?
Is my thinking accurate on #1?
k
Yes on 1, because the flow file is evaluated, so the imports are done. If `build=False`, you handle image building and uploading. If `build=True`, then we upload to the registry for you.
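The "flow file is evaluated" point can be seen with plain Python (a stand-in demo using the standard library, not Prefect's code): executing a flow script runs all of its top-level statements, including imports, which is why dependencies must be importable on the machine doing the registering, not just inside the container.

```python
import runpy
import tempfile
import textwrap
from pathlib import Path

# Write a tiny stand-in "flow file" whose top-level import must resolve,
# mirroring what happens when Prefect evaluates a flow script at registration.
flow_file = Path(tempfile.mkdtemp()) / "my_flow.py"
flow_file.write_text(textwrap.dedent("""
    import json  # stands in for your custom module / pip dependency
    FLOW_NAME = "example"
"""))

# Registration-time behaviour: the whole module is executed, so every
# top-level import must be installed wherever registration runs.
namespace = runpy.run_path(str(flow_file))
print(namespace["FLOW_NAME"])  # -> example
```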
a
@Adam Roderick correct, I usually do:
```shell
pip install .
```
from the root project directory before I register the flow. But you can also move the import into the task that needs it, if you don't want to do that. I kind of assumed you have the same dependencies in your local development environment, so that you can first run your flow locally before building a Docker image and deploying your flow to production/Prefect Cloud.
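The same `pip install .` typically gets baked into the pre-built image so the custom modules are importable at flow run time too. A hypothetical Dockerfile sketch (the base image tag and the project/flow paths are assumptions, not from this thread):

```dockerfile
# Sketch only: base image tag and paths are placeholders, adapt to your project.
FROM prefecthq/prefect:latest-python3.9

# Install the project into the image (same `pip install .` as local dev),
# so custom modules resolve inside the container when the flow runs.
COPY . /opt/my-project
RUN pip install /opt/my-project

# Put flow script files where Docker storage looks for them by default.
COPY flows/ /opt/prefect/flows/
```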
On #2: in order to use the same Docker image for multiple flows, you really don't use the "typical" Docker storage registration, which pickles your flow at registration, builds the image, and pushes it to the registry. Instead, you build the image yourself. I explained that in the docstring here:
```
To use flow script files in Docker, we need the following arguments to be set properly:
- `stored_as_script` = True
- `path` must point to the flow's path in the image
- if this path is different from /opt/prefect/flows/YOUR_FLOW.py,
  then you also need to change the default `prefect_directory` to your custom directory
- you need to add the flow to the storage explicitly before registering, e.g.: docker_storage.add_flow(flow)
- finally, registering the flow must happen from Python, since the `prefect register` CLI doesn't have the option
  to pass build=False, and this is critical to prevent pickling the flow and rebuilding the image on registration
- the parameter `ignore_healthchecks` on the Docker storage is optional and doesn't affect this process at all
```
Example build commands are here: https://github.com/anna-geller/packaging-prefect-flows/blob/master/aws_docker_build_commands.bash. LMK if you have any more questions about it.
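Putting that checklist together, a registration script for a pre-built image could look roughly like this. This is a sketch against the Prefect 1.x API, not a tested script: the registry URL, image name, project name, and in-image path are placeholders, and actually running it requires a Prefect backend plus your own pre-built, pushed image.

```python
from prefect import Flow, task
from prefect.storage import Docker

# Pre-built image: stored_as_script=True plus `path`, no build at registration.
# registry_url / image_name / image_tag / path below are all placeholders.
docker_storage = Docker(
    registry_url="123456789.dkr.ecr.us-east-1.amazonaws.com",
    image_name="community",
    image_tag="latest",
    stored_as_script=True,
    path="/opt/prefect/flows/my_flow.py",  # where this file lives in the image
)

@task
def say_hello():
    print("hello")

with Flow("my-flow", storage=docker_storage) as flow:
    say_hello()

if __name__ == "__main__":
    # Add the flow to the storage explicitly, then register without building,
    # so the flow is not pickled and the image is not rebuilt.
    docker_storage.add_flow(flow)
    flow.register(project_name="community", build=False)
```

The same script file can then be copied into the image at the `path` you configured, and other flows can point their own Docker storage at the same image.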
a
Thanks for the info. This is enough to get us going