John Muehlhausen
12/14/2021, 12:54 AM
prefect agent local start {...}
Kevin Kho
Tom Klein
12/14/2021, 9:09 AM
Anna Geller
prefect agent local start --label xyz
Prefect Cloud has concurrency limits that let you limit the number of flow runs that will be created with a given label, effectively limiting the number of flow runs that will be executed on that agent. So there is a way of controlling the number of flow runs created, or of specifying resource requests for a Kubernetes job.
Tom Klein
12/14/2021, 10:04 AM
Anna Geller
Tom Klein
12/14/2021, 10:35 AM
Anna Geller
Tom Klein
12/14/2021, 10:54 AM
John Muehlhausen
12/14/2021, 1:48 PM
Anna Geller
John Muehlhausen
12/14/2021, 1:56 PM
Anna Geller
Anna Geller
John Muehlhausen
12/14/2021, 5:32 PM
Anna Geller
Anna Geller
John Muehlhausen
12/14/2021, 5:45 PM
John Muehlhausen
12/14/2021, 5:52 PM
Anna Geller
John Muehlhausen
12/14/2021, 6:14 PM
John Muehlhausen
12/14/2021, 6:14 PM
In the worst case, a flow run could be executed multiple times because more than one agent picked it up.
John Muehlhausen
12/14/2021, 6:15 PM
Anna Geller
John Muehlhausen
12/14/2021, 6:26 PM
John Muehlhausen
12/14/2021, 6:32 PM
Anna Geller
Kevin Kho
Anna Geller
John Muehlhausen
12/14/2021, 6:58 PM
John Muehlhausen
12/14/2021, 7:01 PM
from prefect import Flow
from prefect.executors import LocalDaskExecutor

# `schedule` is defined elsewhere (not shown in the original message)
with Flow("...", schedule, executor=LocalDaskExecutor(scheduler="processes", num_workers=6)) as flow:
    # tasks...

res = flow.register(
    project_name="...",
    set_schedule_active=True,
    idempotency_key='v8',  # increment on each code change
    labels=['a_label'],
    add_default_labels=False,
    no_url=True,
)
Anna Geller
Kevin Kho
John Muehlhausen
12/14/2021, 7:03 PM
John Muehlhausen
12/14/2021, 7:04 PM
John Muehlhausen
12/14/2021, 7:06 PM
Anna Geller
Kevin Kho
John Muehlhausen
12/14/2021, 7:37 PM
Kevin Kho
John Muehlhausen
12/14/2021, 8:00 PM
John Muehlhausen
12/14/2021, 10:06 PM
John Muehlhausen
12/14/2021, 10:08 PM
prctl(PR_SET_PDEATHSIG,...
so that they don't outlive their agent.
John Muehlhausen
12/14/2021, 10:09 PM
Zanie
John Muehlhausen
12/14/2021, 11:25 PM
Zanie
John Muehlhausen
12/14/2021, 11:47 PM
prctl(PR_SET_PDEATHSIG
such that a child process count is a reliable indication of whether to accept more work.
Zanie
I.e. instead of the agent needing to ask the server whether it should accept more work, the server can just not present a workload potential to that agent in the first place, if it knows the agent has started something that is still unterminated.

This is basically the same as the closed implementation in Server for agent-level concurrency. It just moves the concurrency check from a dedicated route to the route that retrieves ready flow runs. The issue here is actually not that the agent is polling the server to investigate concurrency; it is that aggregate count queries are not performant in Hasura. Once a certain scale is reached, the server will struggle to calculate the number of concurrent runs that a given agent has.
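To make the "move the check into the route that retrieves ready flow runs" idea concrete, here is a toy in-memory sketch. It is not Prefect Server code: `ToyServer`, `get_ready_runs`, `mark_finished` and `limit_per_agent` are hypothetical names invented for illustration, and the per-agent count is a cheap `len()` here, so it does not reproduce the aggregate-query cost described above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for a server that hands out flow runs."""
    limit_per_agent: int
    queued: list = field(default_factory=list)   # run ids waiting to start
    running: dict = field(default_factory=dict)  # agent id -> set of run ids

    def get_ready_runs(self, agent_id):
        # The concurrency check lives in the same call that returns work:
        # an agent at its limit is simply shown nothing.
        active = self.running.setdefault(agent_id, set())
        free = self.limit_per_agent - len(active)
        if free <= 0:
            return []
        handed, self.queued = self.queued[:free], self.queued[free:]
        active.update(handed)
        return handed

    def mark_finished(self, agent_id, run_id):
        # Frees one slot for the agent once a run has terminated.
        self.running.get(agent_id, set()).discard(run_id)


server = ToyServer(limit_per_agent=2, queued=["r1", "r2", "r3"])
first = server.get_ready_runs("agent-a")    # agent takes two runs
second = server.get_ready_runs("agent-a")   # at the limit, nothing offered
server.mark_finished("agent-a", "r1")
third = server.get_ready_runs("agent-a")    # one slot free again
```

In a real deployment the `len(active)` step is the part that would become an aggregate count query, which is where the scaling concern in the message above applies.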
Zanie
Zanie
John Muehlhausen
Thanks for your work on these issues. For now we will put our flows in the iron grip of the LocalAgent’s lifespan via LD_PRELOAD (for the agent’s children, grandchildren, etc.) of
prctl(PR_SET_PDEATHSIG
such that a child process count is a reliable indication of whether to accept more work.
Zanie
You’re welcome! Sorry I don’t have a great solution for you yet 🙂 We’re definitely thinking about this a lot, we just want to deliver a feature that works consistently and is well integrated with the rest of our offering. I like your workaround quite a bit haha
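For readers who want to try the workaround without writing an LD_PRELOAD shim, a rough Python approximation follows. It is a sketch under assumptions, not the setup described in the thread: it is Linux-only, it calls `prctl` through `ctypes` in a `preexec_fn`, and it therefore covers only direct children (the death signal setting is not inherited by grandchildren, which is presumably why LD_PRELOAD was used). `ChildTracker`, `die_with_parent` and `max_children` are hypothetical names.

```python
import ctypes
import signal
import subprocess
import sys

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>
_libc = ctypes.CDLL(None, use_errno=True)  # loaded before fork


def die_with_parent():
    """Runs in the child before exec: ask the kernel to send SIGTERM
    to this process when its parent dies (Linux only)."""
    if _libc.prctl(PR_SET_PDEATHSIG, int(signal.SIGTERM), 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")


class ChildTracker:
    """Hypothetical helper: accept work only below a child-process limit."""

    def __init__(self, max_children):
        self.max_children = max_children
        self.children = []

    def running(self):
        # Drop children that have already exited, then count the rest.
        self.children = [p for p in self.children if p.poll() is None]
        return len(self.children)

    def can_accept(self):
        return self.running() < self.max_children

    def launch(self, argv):
        if not self.can_accept():
            raise RuntimeError("at capacity, not accepting more work")
        proc = subprocess.Popen(argv, preexec_fn=die_with_parent)
        self.children.append(proc)
        return proc


tracker = ChildTracker(max_children=1)
child = tracker.launch([sys.executable, "-c", "import time; time.sleep(30)"])
busy = tracker.can_accept()        # False while the child is alive
child.terminate()
child.wait()
free_again = tracker.can_accept()  # True once the child has exited
```

Because children cannot outlive the parent, the parent's own child count is a usable signal for whether to take more work, which is the point made in the message above.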