# prefect-server

Alfie

09/28/2021, 3:40 PM
Hi Team, I want to set up a low-latency environment in which a cron-scheduled flow is executed right after its scheduled time. The load may be high at times, such as tens of flows triggered within a second. Do you think Prefect can meet this requirement? Any suggestions about deployment and settings? I want to use only local agents to execute the flows; of course there could be more than one agent instance. Thanks

Kevin Kho

09/28/2021, 3:44 PM
So I believe that yes it can, but for server specifically, you need to configure Prefect Server to handle more than 10 runs at the same time. See this
Then of course it also depends on the resources of your API/DB, but I think they should be able to handle the load. You may also need to horizontally scale the API if it can’t handle the load (too many tasks hitting it)

Alfie

09/28/2021, 3:49 PM
Thanks @Kevin Kho. This pull request mentions “when a single flow has more than 10 runs scheduled at exactly the same time”. In my case, I will have different flows, and tens of different flows could be triggered at the same time. Does this pull request still apply?

Kevin Kho

09/28/2021, 3:51 PM
I don’t expect it to, but if you do run into issues where some flows are being lost, this is the place to start for sure

Alfie

09/28/2021, 3:55 PM
OK. Another thing: the scheduler seems to add some latency, since it sleeps for an interval between runs:
{"severity": "DEBUG", "name": "prefect-server.Scheduler", "message": "Sleeping for 60.0 seconds..."}
{"severity": "DEBUG", "name": "prefect-server.ZombieKiller", "message": "Sleeping for 120.0 seconds..."}
Could that be the bottleneck for fast response?

Kevin Kho

09/28/2021, 3:56 PM
Will look a bit
My take is that once the runs are already scheduled, the agent can query the database and fetch them even if the server is asleep. What this sleep will do is possibly delay the creation of new scheduled runs so it would depend if your runs are scheduled at a rate faster than that. Does that make sense?

Alfie

09/28/2021, 4:03 PM
OK, then that should be fine. But the interval should be configurable, right?

Kevin Kho

09/28/2021, 4:10 PM
So it’s not immediately configurable. You can find it here. That “Sleeping for 60 seconds” I believe is the gap between the end of the current loop and the next loop. (It will run again in 60 seconds). So if the default is 150 seconds, I guess it took 90 seconds to run.
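A minimal sketch of the arithmetic in that message, assuming the service sleeps for whatever is left of its loop interval after a pass finishes. This is an illustration only, not Prefect Server's actual code; the function name `sleep_seconds` and its parameters are hypothetical.

```python
def sleep_seconds(loop_seconds: float, elapsed_seconds: float) -> float:
    """Time left to sleep after one pass of the loop, never negative."""
    return max(0.0, loop_seconds - elapsed_seconds)


# Numbers from the message above: a 150 s interval and a pass that took 90 s.
print(f"Sleeping for {sleep_seconds(150.0, 90.0)} seconds...")
# prints "Sleeping for 60.0 seconds..."
```

Under this assumption, a pass that takes longer than the interval would lead to no sleep at all before the next pass.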
I think if you really are trying to optimize latency, a task queue might serve the use case better where you have a queue of a bunch of things and you have workers picking them up upon availability (like Celery). The scheduling based on time and agent polling to pick up might be a bottleneck at some point. And I say this for Prefect Cloud too because there is some latency in general with Prefect
I think in general, 1 minute intervals are the current limit for Prefect to handle with low latency
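For comparison, the task-queue pattern mentioned above can be sketched with only the Python standard library. This is not Celery and not anything Prefect provides; `run_workers` and the doubling step are hypothetical stand-ins. The point is that idle workers pick up jobs as soon as they are queued, instead of waiting for a schedule or a polling cycle.

```python
import queue
import threading


def run_workers(jobs, num_workers=4):
    """Process every job with a pool of worker threads and return the results."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()           # blocks until a job is available
            if job is None:            # sentinel: no more work for this worker
                work.task_done()
                break
            with lock:
                results.append(job * 2)  # stand-in for real processing
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)                 # one sentinel per worker
    work.join()
    for t in threads:
        t.join()
    return sorted(results)             # sorted because completion order varies


print(run_workers(range(5)))  # prints [0, 2, 4, 6, 8]
```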

Alfie

09/28/2021, 4:18 PM
If it’s just the latency for a newly created flow to be found by the scheduler, that should be fine.
About the task queue, how does it work with Prefect server?

Kevin Kho

09/28/2021, 4:22 PM
Oh it wouldn’t work with Prefect. Prefect is primarily a batch oriented tool….so you could interact with a task queue from Prefect using Python. For example, AWS has SQS which is just a big queue of stuff to process that feeds into Lambda. You can add to the queue and stuff with Prefect.

Alfie

09/28/2021, 4:29 PM
I see. The latency on the agent side seems fine: 0.25 seconds (https://github.com/PrefectHQ/server/blob/938df638465cc920756cbd769f821f02233fd468/src/prefect_server/services/agents/local_agent.py). So what I need to do is have enough agent threads to execute the tasks.
I will test what we discussed. Thanks @Kevin Kho!

Kevin Kho

09/28/2021, 4:31 PM
Sure! The tricky thing might be the lack of load balancing for the agents, but we’ll see

Alfie

09/28/2021, 4:37 PM
You mean one agent could grab all the flow runs even if I run multiple agent instances?

Kevin Kho

09/28/2021, 4:41 PM
Yes because they just poll on a cycle so it’s just a matter of which one picks it up first.
But you can use labels to divide out the work maybe?
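A rough sketch of how labels could divide the work, assuming an agent only picks up a flow run whose labels are all present on the agent. The matching rule is simplified and the helper names, agent names, and labels are hypothetical; this does not call Prefect itself.

```python
def agent_can_run(agent_labels, flow_labels):
    """True if every label on the flow is also present on the agent."""
    return set(flow_labels) <= set(agent_labels)


def eligible_agents(agents, flow_labels):
    """Names of the agents allowed to pick up a flow with these labels."""
    return [name for name, labels in agents.items()
            if agent_can_run(labels, flow_labels)]


# Two agent instances, each started with a different (made-up) label.
agents = {"agent-1": {"group-a"}, "agent-2": {"group-b"}}

print(eligible_agents(agents, {"group-a"}))  # prints ['agent-1']
print(eligible_agents(agents, {"group-b"}))  # prints ['agent-2']
```

Splitting flows across labels this way pins each flow to a subset of agents, so it spreads load only as evenly as the labels are assigned.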

Alfie

09/28/2021, 4:45 PM
I can try that. Thanks for the reminder!