Hello! Does anyone know the priority that is placed upon flow labels? For example, if we have 2 servers, each with an agent (say agent A and agent B for simplicity), and we specify that a flow can run on either A or B, how is an agent selected to execute the flow run?
Kevin Kho
04/19/2022, 5:47 PM
It’s whichever agent picks it up first. They poll every 10 seconds, so the run goes to whichever one hits it first. There is no concept of priority.
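
To illustrate the "first agent to poll wins" behavior described above, here is a minimal simulation sketch. It does not use the Prefect API; the function `first_agent_to_poll`, the agent names, and the polling offsets are hypothetical, and the only detail taken from the thread is the 10 second polling interval.

```python
# Toy model, not Prefect code: each agent polls on a fixed interval,
# and the flow run goes to whichever agent polls first after submission.
POLL_INTERVAL = 10  # seconds, as stated in the thread


def first_agent_to_poll(submitted_at, agent_offsets, poll_interval=POLL_INTERVAL):
    """Return (agent_name, pickup_time) for the agent whose next poll comes first.

    agent_offsets maps an agent name to the phase of its polling loop in
    seconds: the agent polls at offset, offset + interval, and so on.
    """
    def next_poll(offset):
        if submitted_at <= offset:
            return offset
        elapsed = submitted_at - offset
        intervals = -(-elapsed // poll_interval)  # ceiling division
        return offset + intervals * poll_interval

    pickups = {name: next_poll(offset) for name, offset in agent_offsets.items()}
    # On an exact tie, min() returns the first agent in dict order.
    winner = min(pickups, key=pickups.get)
    return winner, pickups[winner]


if __name__ == "__main__":
    agents = {"A": 0, "B": 4}  # hypothetical polling phases
    for t in (1, 5, 10):
        print(t, first_agent_to_poll(t, agents))
```

Which agent wins depends only on when the run is submitted relative to each polling loop, which is why neither agent can be preferred.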
Kevin Kho
04/19/2022, 5:47 PM
You don’t mean 2 Prefect servers, right?
Anthony Harris
04/19/2022, 5:47 PM
Correct, just 2 local agents on distinct servers
Kevin Kho
04/19/2022, 5:48 PM
Yeah, that’s right. There is no concept of load balancing.
Anthony Harris
04/19/2022, 5:49 PM
Hmm. Alright. Thanks for the quick clarification! I was kinda hoping for something like it runs on A if A is available and then falls back on B only when necessary - is there some way to accomplish that with the current state of 1.0?
Anthony Harris
04/19/2022, 5:50 PM
Or even 2.0, because I'm hoping to move to it in the near future 🤞
Kevin Kho
04/19/2022, 5:55 PM
If you send it to A and then A can’t launch it because of a lack of resources, there are services that will requeue it to launch again, but then it won’t explicitly be directed to A.
There is nothing in 2.0 that explicitly supports this because this type of concept is more aligned with Task Queues like Celery where you know the resources of the workers and then allocate the work accordingly. They put work in a queue and then don’t release it until there is space.
With orchestrators, the concern is the scheduling of work. Dask, though, can function as a queue to a cluster, so that concept is something you would execute at the task level rather than the Flow level. On the Flow level, there is no resource management concept.
Not ruling out the potential of it happening, but it’s not a high-priority roadmap item. Does that make sense?
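
For contrast, the task-queue idea mentioned above (know each worker's capacity, hand out work accordingly, and hold the rest until there is space) can be sketched roughly as follows. This is a toy model under stated assumptions, not Celery, Dask, or Prefect code; `Worker`, `dispatch`, and the slot counts are all hypothetical.

```python
from collections import deque


class Worker:
    """Hypothetical worker with a fixed number of free execution slots."""

    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots


def dispatch(queue, workers):
    """Assign queued jobs to workers in preference order and hold the rest.

    `workers` is ordered by preference, so the first worker is used while it
    has a free slot and later workers act as fallbacks. Jobs stay in `queue`
    when no worker has space. Returns a list of (job, worker_name) pairs.
    """
    assignments = []
    while queue:
        target = next((w for w in workers if w.free_slots > 0), None)
        if target is None:
            break  # no capacity anywhere: keep remaining jobs queued
        job = queue.popleft()
        target.free_slots -= 1
        assignments.append((job, target.name))
    return assignments


if __name__ == "__main__":
    jobs = deque(["j1", "j2", "j3", "j4"])
    workers = [Worker("A", 2), Worker("B", 1)]  # prefer A, fall back to B
    print(dispatch(jobs, workers))
    print(list(jobs))
```

The key difference from agent polling is that the dispatcher tracks worker capacity itself, which is the piece that has no equivalent on the Flow level.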