Andreas Tsangarides
10/06/2021, 12:04 PM
There is a `stored_as_script` argument in `Docker()` along with `path`, but I cannot see how I can avoid building and pushing an image for each flow that way...

Anna Geller
If you don't specify a `registry_url`, then the image will be built, but it won't be pushed. The image will only live on the local machine where you ran the registration step.
You can also explicitly specify `push=False`:
built_storage = flow.storage.build(push=False)
# this gives you a dictionary of flows and paths within the image
built_storage.flows
# ex. {"your-flow": "/root/.prefect/your-flow.prefect"}
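For illustration, a minimal sketch (Prefect 1.x) of a Docker storage with no `registry_url`; the image name and tag here are placeholders, not values from this thread:

from prefect.storage import Docker

# No registry_url: at registration time the image is built on this machine,
# and there is no registry to push it to.
storage = Docker(
    image_name="my-local-flows",  # placeholder
    image_tag="dev",              # placeholder
)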
Andreas Tsangarides
10/06/2021, 12:17 PM

Anna Geller
You can set `image_name` and `image_tag` explicitly. Since the image layers are cached, the build process is fast. To test, you can create several flows with a similar configuration to this:
from prefect import Flow, task
from prefect.storage import Docker
from prefect.run_configs import DockerRun


@task
def hello_world():
    return "hello from Docker Flow #1"


with Flow(
    "docker-flow-1",
    storage=Docker(image_name="andreas-image", image_tag="latest"),
    run_config=DockerRun()
) as flow:
    hello_world()


if __name__ == '__main__':
    flow.register("Docker_Flows")
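As a sketch of what "several flows with a similar configuration" could look like (the flow name, task, and message below are made up), a second flow can point at the same image name and tag so its build reuses the cached layers:

from prefect import Flow, task
from prefect.storage import Docker
from prefect.run_configs import DockerRun


@task
def hello_world_2():
    return "hello from Docker Flow #2"


with Flow(
    "docker-flow-2",
    # same image_name/image_tag as flow #1, so the Docker layer cache is reused
    storage=Docker(image_name="andreas-image", image_tag="latest"),
    run_config=DockerRun()
) as flow:
    hello_world_2()


if __name__ == '__main__':
    flow.register("Docker_Flows")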
Let me know if this works for you.

Andreas Tsangarides
10/06/2021, 1:41 PM
Each of my flows lives in its own `flow.py` file.
So, if I were to follow your suggestion:
1. I either build that image using `docker run...`, or I just register the 1st flow.
2. When I register the second, will it use the cached image from (1) and add the second flow to the image?
Instead of the storage you defined for each flow, can I use something like this in each one of them?
storage = Docker(
    # registry_url='455197153980.dkr.ecr.eu-west-2.amazonaws.com/',
    image_name="uk-prefect-flows-dev",
    image_tag="latest",
    dockerfile='Dockerfile',
    path="src/flows/elexon_detsysprices/flow.py",
    stored_as_script=True
)
Kevin Kho
Andreas Tsangarides
10/06/2021, 2:31 PM
• The image is built once: either manually with `docker run....`, or you can define a Prefect `Docker` storage and then call `.build()` on it.
• Each flow is registered using `S3` storage (the Prefect Storage attached to the flow).
• The flows are registered as scripts, so everything is executed at run time, not at registration time!!!! That was critical for me; otherwise, the Prefect tutorial for registering multiple flows using Docker storage works fine.
So.. in each flow:
# src/flows/elexon_detsysprices/flow.py
import os

from prefect import Flow
from prefect.storage import S3

ENV = os.getenv("ENV", "local")

storage = S3(
    bucket="<bucket-name>",
    key=f"{ENV}/elexon-detsysprices",
    stored_as_script=True,
    local_script_path=os.path.abspath(__file__)
)

with Flow("flow-name", storage=storage) as flow:
    # .......
    pass

# do attach your run_config before registering! I use DockerRun here:
# DockerRun(image="uk-prefect-flows-dev:latest", env={"ENV": "local"}, labels=["local"])
# register your flow (I do it with click elsewhere, using the Prefect client)
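As a rough illustration of that last comment, a hypothetical click-based registration command using the Prefect 1.x client; the module path, project name, and option names are assumptions for this sketch, not the author's actual CLI:

# hypothetical register.py, kept separate from flow.py
import os

import click
from prefect.run_configs import DockerRun

from src.flows.elexon_detsysprices.flow import flow  # assumed import path


@click.command()
@click.option("--project", default="my-project", help="Prefect project to register into")
@click.option("--env", default=os.getenv("ENV", "local"), help="deployment environment")
def register(project, env):
    # attach the run_config before registering, pointing at the pre-built image
    flow.run_config = DockerRun(
        image="uk-prefect-flows-dev:latest",
        env={"ENV": env},
        labels=[env],
    )
    # registering uploads the script to S3 (stored_as_script=True) and creates a new flow version
    flow.register(project_name=project)


if __name__ == "__main__":
    register()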
To build your image once...
# I do this in cli.py
from prefect.storage import Docker

storage = Docker(
    # registry_url=registry_url,  # keep as None for local dev, otherwise point to your ECR/Docker Hub repo
    image_name=image_name,
    image_tag=image_tag,
    dockerfile='Dockerfile'
)
storage.build(push=push)  # push=False keeps the image on the local machine
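For completeness, a hedged sketch of the non-local variant: the same build with `registry_url` pointed at the ECR repo mentioned earlier in the thread and push enabled. The values are illustrative, and you need to have logged in to the ECR registry with Docker beforehand:

from prefect.storage import Docker

# deployment-time variant (illustrative values); requires a prior `docker login` to the ECR registry
storage = Docker(
    registry_url="455197153980.dkr.ecr.eu-west-2.amazonaws.com/",
    image_name="uk-prefect-flows-dev",
    image_tag="latest",
    dockerfile="Dockerfile",
)
storage.build(push=True)  # builds locally, then pushes the tagged image to the registry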