Nicolas Bigaouette
11/12/2020, 6:48 PM
[...] replica setting. In addition, the app is behind guvicorn. All this means that we have multiple instances of our backend running concurrently. As such, if we register a flow in each backend, prefect will receive multiple register requests for the same flow, which is obviously wrong...
How should I handle this use case? How can I have multiple instances of my application that use the same flow? We thought of performing a search for the flow and creating it if not present. But then the flow's name (or anything that is used to perform the search) would become the unique key to identify a flow. From what I'm reading about prefect, a flow name is not the unique key to identify a flow.
Any idea?
Thanks!!
Dylan
Nicolas Bigaouette
11/12/2020, 8:34 PM
[...] flow.register() or should all four instances call it? If they all call register() (with the exact same code), will prefect think that multiple versions of the flow exist or will the four instances be able to run the exact same flow?
Joseph Haaga
11/12/2020, 8:39 PM
[...] idempotency_key when you trigger a Flow run, and any other runs triggered over the next 24 hours with that same key will be disregarded
Nicolas Bigaouette
11/12/2020, 8:44 PM
[...] prefect package API provides the same options?
In any case, that is for running a flow. If I understand correctly, I could ask prefect to run a specific flow from all the instances of my backend (passing the idempotency flag) and prefect will run the flow only once.
Nicolas Bigaouette
11/12/2020, 8:45 PM
Nicolas Bigaouette
11/12/2020, 9:12 PM
[...] idempotency_key flag can also be used for the flow.register() call, including passing the code hash. So this becomes possible:
flow.register(
    project_name="Hello, World!",
    idempotency_key=flow.serialized_hash(),
)
Which seems to be doing what I need.
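To make the effect of the key concrete, here is a toy, in-memory model of the behaviour described above. It is not Prefect's implementation: ToyRegistry, its register() method and definition_hash() are made up for illustration. The one assumption it encodes is that a registration whose key matches the previous registration of the same flow returns the existing ID instead of creating a new version.

```python
import hashlib
import json
import uuid


class ToyRegistry:
    """In-memory stand-in for a flow registry (hypothetical, not Prefect)."""

    def __init__(self):
        # (project, flow name) -> (last idempotency key, flow id, version)
        self._latest = {}

    def register(self, project, name, idempotency_key=None):
        previous = self._latest.get((project, name))
        if previous and idempotency_key is not None and previous[0] == idempotency_key:
            # Same key as the last registration: reuse the existing version.
            return previous[1], previous[2]
        version = previous[2] + 1 if previous else 1
        flow_id = str(uuid.uuid4())
        self._latest[(project, name)] = (idempotency_key, flow_id, version)
        return flow_id, version


def definition_hash(definition):
    """Stable hash of a JSON-serializable flow description."""
    payload = json.dumps(definition, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Made-up flow description, only used to derive a shared key.
definition = {"name": "hello-flow", "tasks": ["extract", "load"]}

# Four replicas registering the same definition without a key.
no_key = ToyRegistry()
versions_without_key = [no_key.register("demo", "hello-flow")[1] for _ in range(4)]

# The same four replicas sharing a key derived from the definition.
with_key = ToyRegistry()
key = definition_hash(definition)
versions_with_key = [with_key.register("demo", "hello-flow", key)[1] for _ in range(4)]

print(versions_without_key)  # one new version per replica
print(versions_with_key)     # every replica sees the same version
```

In this model the number of versions no longer depends on the number of replicas, because every replica computes the same key from the same definition.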
But, note that flow.serialized_hash() was introduced in prefect 0.13.14 which was released... a week ago 😄 And it seems it's buggy too: https://github.com/PrefectHQ/prefect/issues/3653 (simple fix I think that should land soon)
Dylan
[...] flow.register? Can you give me a 10,000 foot view of your business use case or the problem you’re trying to solve? I don’t think I have enough information to give you a good answer just yet.
Nicolas Bigaouette
11/13/2020, 2:57 PM
[...] flow.register() or maybe something else.
If I simply let each instance register() the same flow, for example at application startup, prefect interprets this as four different versions of the same flow, and only the last one registered is active (if I understand correctly). This complicates things, as our prefect setup (i.e. the active flows) would then depend on the number of replicas.
Right now what we are trying to do is to have a python module (say flowmodule) with the flow defined in it (using a context manager with Flow() as flow). Importing this module will define the flow (that might be wrong, maybe we should have some kind of setup function that builds the flows and returns them?). Then somewhere else in the code (for example in a REST route) we need to execute the flow based on data. To achieve this I created a prefect.Client() pointing to the prefect server, then call flow_id = flowmodule.flow.register(idempotency_key=flow_hash). Using a flow_hash that is shared by all instances prevents multiple versions being registered to prefect. Then to run the flow we call client.create_flow_run(flow_id=flow_id). This way, any backend instance can run the same flow when required.
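The register-then-run sequence described above can be sketched as two small helpers. This is a minimal sketch, not a definitive layout: register_once() and run_flow() are hypothetical names, and the code deliberately does not import prefect. It only assumes that the flow object offers serialized_hash() and register(project_name=..., idempotency_key=...), and that the client offers create_flow_run(flow_id=...), as used earlier in this thread.

```python
def register_once(flow, project_name):
    """Register `flow`, using its serialized hash as the idempotency key.

    `flow` is expected to behave like a prefect Flow. Because every replica
    derives the same key from the same code, repeated calls should resolve
    to the same flow ID (assumption based on the discussion above).
    """
    return flow.register(
        project_name=project_name,
        idempotency_key=flow.serialized_hash(),
    )


def run_flow(client, flow_id):
    """Ask the server to start a run; `client` behaves like prefect.Client."""
    return client.create_flow_run(flow_id=flow_id)
```

With real objects this would be called roughly as flow_id = register_once(flowmodule.flow, "my-project") at application startup ("my-project" being a placeholder), and run_flow(prefect.Client(), flow_id) inside the REST route.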
I hope the description makes my use case a bit clearer... Do you think the way I define, register and run the flows makes sense?
Thanks a lot for your input!
Joseph Haaga
11/13/2020, 7:00 PM
[...] Deployment that registers the Flow, so the actual application backend replicas need not worry about registration at all
Dylan
[...] client.create_flow_run(version_group_id="my_id"). This way, your web app will always call the latest version of the Prefect Flow and you don’t need to change the code in your web app at all to change the flow. Check out this blog post for a similar example where a Lambda kicks off a parameterized flow run: https://medium.com/the-prefect-blog/event-driven-workflows-with-aws-lambda-2ef9d8cc8f1a
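A minimal sketch of what the web-app side of that suggestion could look like. RecordingClient is a stand-in so the snippet runs without a Prefect server, and run_latest() and the user_id parameter are hypothetical names. It assumes create_flow_run accepts version_group_id and parameters as keyword arguments, which is worth checking against the installed Prefect version.

```python
class RecordingClient:
    """Stand-in for prefect.Client that only records calls (hypothetical)."""

    def __init__(self):
        self.calls = []

    def create_flow_run(self, **kwargs):
        self.calls.append(kwargs)
        return "flow-run-%d" % len(self.calls)


def run_latest(client, version_group_id, parameters=None):
    """Start a run of whichever flow version is current in the group."""
    return client.create_flow_run(
        version_group_id=version_group_id,
        parameters=parameters,
    )


client = RecordingClient()
run_id = run_latest(client, "my_id", parameters={"user_id": 42})
print(run_id, client.calls)
```

The web app only needs to know the version group ID, so registering a new version of the flow from a separate job would not require redeploying the replicas.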