Nikita Samoylov

12/17/2021, 11:19 AM
Hi guys, we used Prefect Cloud connected to 4 agents. When we fired a flow, one agent was chosen to execute it, and that worked fine. Yesterday we switched to a self-hosted backend server. Now when we fire a flow, the GUI shows which agent was chosen to execute it, so at this point everything looks OK, but in reality the flow is executed on all 4 workers simultaneously, so we waste a lot of resources. Can someone point me to how to debug this, or am I missing something?

Anna Geller

12/17/2021, 11:32 AM
@Nikita Samoylov Prefect doesn’t manage resources and doesn’t load balance flow runs across local agents. I have an open PR to the docs that will explain it. Here is the explanation: We recommend assigning a unique label to each agent, particularly when running multiple local agents. While Prefect Cloud has a mechanism to ensure that each flow run gets executed only once, race conditions may occur if you have multiple agents with the same label. The example below shows that instead of starting all agents with label “prod”, we can add a number to make it unique.
$ prefect agent local start --no-hostname-label --label prod1
$ prefect agent local start --no-hostname-label --label prod2
Currently, Prefect has no notion of a task queue that would allow load-balancing flow runs across multiple agents based on available resources on each agent. If you need such functionality, you would need to assign unique labels to your agents, and then assign corresponding labels to your flows to spread the load across multiple agents. For instance: Flow 1:
from prefect import Flow
from prefect.run_configs import UniversalRun
...
with Flow(name="example1", run_config=UniversalRun(labels=["prod1"])) as flow:
    ...  # define the flow's tasks here
Flow 2:
from prefect import Flow
from prefect.run_configs import UniversalRun
...
with Flow(name="example2", run_config=UniversalRun(labels=["prod2"])) as flow:
    ...  # define the flow's tasks here
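If you have more flows than agents, the label-per-flow idea above can be sketched as a simple round-robin assignment. This is only an illustration in plain Python: `assign_labels` and the flow name "example3" are hypothetical and not part of Prefect, and the result is a static split, not load balancing based on available resources.

```python
from itertools import cycle


def assign_labels(flow_names, agent_labels):
    """Pair each flow name with one agent label, cycling through the labels."""
    if not agent_labels:
        raise ValueError("at least one agent label is required")
    return dict(zip(flow_names, cycle(agent_labels)))


# "prod1"/"prod2" are the agent labels from the commands above;
# "example3" is a made-up third flow to show the wrap-around.
assignments = assign_labels(["example1", "example2", "example3"], ["prod1", "prod2"])
for name, label in assignments.items():
    print(f"{name} -> {label}")
```

Each resulting label would then be passed to that flow's run config, e.g. UniversalRun(labels=[label]).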
To help with scale, we recommend using one of the other agents, such as:
• KubernetesAgent
• ECSAgent
• VertexAgent
btw, maybe it’s a sign that you should stay with Prefect Cloud 😄 but seriously: Cloud Standard plan has 20,000 free task runs each month which makes it really easy to get started.

Nikita Samoylov

12/17/2021, 11:49 AM
@Anna Geller, thank you for the reply.
1. If we use your approach and the prod1 agent shuts down for some reason, the flows scheduled for prod1 will not be executed at all. How do you handle that?
2. We have mapped tasks in our flows. If a mapped task is split into 100 tasks, it is counted as 100 separate successful task runs, so we use up the 20,000 limit pretty fast.

Anna Geller

12/17/2021, 11:59 AM
1. If agent prod1 dies, you would need to change the flow's label to point to a different agent; you can do that directly from the UI. Or (recommended) find a way to ensure your agent stays healthy: you can run it in a supervisor process, and in Cloud we have Automations that can track the agent’s health and let you trigger an action if an agent becomes unhealthy.
2. We have automatic volume discounts, and additionally if you contact sales@prefect.io we can find a plan that suits your needs based on your workload, number of users, etc.
In general, this line is relevant: “To help with scale, we recommend using one of the other agents”. Local agents are not designed to be used as a queue or load balancer; rather, you should use one when a specific process needs to run only on a specific machine (e.g. because you have custom files and configuration there that would be difficult to reproduce in, say, a container).
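For the "run it in a supervisor process" suggestion, here is a rough sketch of the idea in plain Python: restart a command whenever it exits. The `supervise` helper and its parameters are hypothetical, and a real setup would more likely rely on a process manager such as supervisord or systemd. The demo uses a stand-in command that exits immediately, so nothing here actually starts an agent.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay_seconds=1.0):
    """Run cmd, restarting it each time it exits, up to max_restarts restarts.

    Returns the list of exit codes observed (one per run).
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        exit_codes.append(subprocess.call(cmd))
        if attempt < max_restarts:
            time.sleep(delay_seconds)  # brief pause before restarting
    return exit_codes


# The agent command from earlier in the thread; pass it to supervise()
# on a machine where the prefect CLI is installed.
AGENT_CMD = ["prefect", "agent", "local", "start", "--no-hostname-label", "--label", "prod1"]

# Demo with a stand-in process that fails right away (assumption: any exit
# of a long-running agent is worth a restart, whatever the exit code).
codes = supervise([sys.executable, "-c", "raise SystemExit(1)"], max_restarts=2, delay_seconds=0)
print(codes)
```

A bounded restart count keeps a permanently broken agent from looping forever; raising it or alerting after the last attempt is a design choice for your environment.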

Kevin Kho

12/17/2021, 2:53 PM
This is because Cloud has some state locking that is not implemented in Server, so in Server a flow run can be picked up by multiple agents. Stability is one of the things Cloud provides.
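To illustrate in general terms what state locking buys you, here is a toy sketch where four "agents" race to claim the same run and only one wins. This is not how Cloud implements it (the thread only says "some state locking"); `FlowRunStore`, `try_claim`, and the state names are made up for the example.

```python
import threading


class FlowRunStore:
    """Toy in-memory store; a stand-in for a backend, not Prefect's API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def schedule(self, run_id):
        with self._lock:
            self._states[run_id] = "Scheduled"

    def try_claim(self, run_id):
        """Atomically move Scheduled -> Submitted; True only for the first caller."""
        with self._lock:
            if self._states.get(run_id) != "Scheduled":
                return False
            self._states[run_id] = "Submitted"
            return True


store = FlowRunStore()
store.schedule("run-1")
winners = []


def agent(name):
    # Each agent executes the run only if its claim succeeds.
    if store.try_claim("run-1"):
        winners.append(name)


threads = [threading.Thread(target=agent, args=(f"agent-{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(winners))  # one agent claimed the run
```

Without the check-and-set inside a single lock, all four agents could see "Scheduled" and each start the run, which matches the symptom described at the top of the thread.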