Peter Roelants
01/11/2021, 7:20 PM
With flow.register, the creation and registration need to happen in the same call. Is there an example somewhere on how to decouple these steps?
For example, how to create and store a Docker build artefact that encapsulates a flow, and then run/register the flow stored in that artefact at a later time without access to the original flow file?
Billy McMonagle
01/11/2021, 7:28 PM
if __name__ == "__main__":
    flow.register(project_name=PROJECT_NAME, idempotency_key=flow.serialized_hash())
I've added a register_flows script to my build process that does this:
#!/usr/bin/env bash
# Run every flow file; each one registers itself in its __main__ block.
for flow in $(find flows -name "*.py"); do
    echo "registering $flow"
    python3 "$flow"
done
This builds the Docker image and pushes it to the container registry. I am not sure how well this will scale, of course. I find myself wanting some kind of central "app" object that could handle all of the registration calls.
Spencer
01/11/2021, 7:53 PM
prefect.Flow module variables using importlib (inspired by Airflow's DagBag mechanism). It annotates all the flows with the shared environment (configured in CI; decoupled from flow definition), storage and state handlers. Then the storage is built and each flow is registered with flow.register(..., build=False).
Billy McMonagle
01/11/2021, 8:11 PM
Spencer
01/11/2021, 8:48 PM
* instantiate storage
* get all flows from all the files
* for flow in flows: storage.add_flow(flow)
* set each flow's storage attribute (and any others, like environment)
* storage.build()
* for flow in flows: flow.register(..., build=False) # there are other fields here just omitted
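The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Spencer's actual script: the helper names load_module, collect_flows and build_and_register are hypothetical, and the storage and flow objects are duck-typed, so only the calls named in the list (add_flow, build, register with build=False) are assumed to exist on them. Because each file is imported under its own module name rather than as "__main__", a registration guard at the bottom of a flow file would not fire during collection.

```python
import importlib.util
from pathlib import Path


def load_module(path):
    """Import a single Python file as a module (it need not be on sys.path)."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def collect_flows(flow_dir, flow_cls):
    """Return every module-level instance of flow_cls found in flow_dir/*.py.

    Hypothetical helper: flow_cls would be prefect.Flow in a real project.
    """
    flows = []
    for path in sorted(Path(flow_dir).glob("*.py")):
        module = load_module(path)
        flows.extend(v for v in vars(module).values() if isinstance(v, flow_cls))
    return flows


def build_and_register(storage, flows, project_name):
    """Build one shared storage for all flows, then register without rebuilding.

    Assumption: storage exposes add_flow() and build(), and each flow exposes
    a storage attribute and register(..., build=False), as in the list above.
    """
    for flow in flows:
        storage.add_flow(flow)
        flow.storage = storage
    storage.build()  # the only build step; registration below does not rebuild
    for flow in flows:
        flow.register(project_name=project_name, build=False)
```

With Prefect installed, this would presumably be called with a storage instance (for example a Docker storage object) and prefect.Flow as flow_cls; that combination is untested here, and any extra fields passed to register are omitted, as in the original list.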
Peter Roelants
01/12/2021, 7:30 AM
idempotency_key to prevent registering the flow when building the artefact?
@Spencer You are using Storage.build() combined with add_flow() to build the flow without registering? I'll look into that. Can you then register the flow with only a reference to the build artefact (and no reference to the original flow defined in Python)?
In general, it sounds like Prefect is currently not designed to cleanly decouple storage and registration.
Chris Ottinger
01/12/2021, 12:19 PM
build_flow.sh builds an image with the flow. deploy_flow.sh registers the flow with Prefect Cloud/Server. In the CI/CD pipeline, we have a step between the build and deploy steps that pushes to our image repos.
The approach is slightly different from the one that @Spencer has described, in that we package a single flow (or a small number of flows) in a single repo that maps to a single flow image. Each repo has its own set of build and deploy scripts, with the flow names hard-coded into them.
Billy McMonagle
01/12/2021, 2:31 PM