# ask-community
a
Hi all. I'm using run_deployment on existing deployments (Prefect 2.7.1), but I notice a 5 to 15 second delay in the start of the flow run. I don't use scheduled runs, so it should start right away. Should I look for parameters or config to remove the delay, or is it something environment-related?
forgot to mention that the flow runs appear in the UI as 'scheduled' before they are executed.
j
Hi Anco! The default is for the agent to poll the work queue every ten seconds, but that can be overridden if you want a shorter polling interval. See the docs section on settings. Here's the one you want:
PREFECT_AGENT_QUERY_INTERVAL='10.0'
If you are spinning up infrastructure it could take some time. If you are communicating with anything external, there could be some latency from your network. If you want to schedule your flow runs you can now prefetch to get things started ahead of time.
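For example (just a sketch; the value and queue name below are placeholders, not from your setup), you could lower the interval when starting the agent:
PREFECT_AGENT_QUERY_INTERVAL='2.0' prefect agent start -q default
or persist it with prefect config set PREFECT_AGENT_QUERY_INTERVAL=2.0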
a
Thanks Jeff. I did play around with
PREFECT_AGENT_QUERY_INTERVAL='10.0'
a bit and set it to '0.1', but the delay is still there. I'm using the DaskTaskRunner for execution, but it's connecting to an already running cluster. The logging output of distributed.core starts, at the earliest, 7 seconds after I trigger run_deployment. Any ideas how to shorten this time?
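For reference, this is roughly how the flow is pointed at the existing cluster (a minimal sketch, not my actual code; the scheduler address and flow name are placeholders):

from prefect import flow
from prefect_dask import DaskTaskRunner

@flow(task_runner=DaskTaskRunner(address="tcp://dask-scheduler:8786"))
def my_flow():
    ...  # tasks submitted from here run on the already-running cluster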
j
I’m not sure. If you can prefetch with a schedule, that might get things moving sooner, but I get that might not work for your use case.
a
You're right about the prefetch. It also makes no difference. I did see an improvement if I invoke run_deployment once or even 3 times in a loop. In that case, the first flow run started with a 3.5-second delay.
When I increase the number to 10, the delay also increases.
@Jeff Hale I have attached the agent logging. This is the output of a loop of 3 run_deployment calls. The first 9 lines appeared instantly, without the delay. After that I see some warnings. Could these warnings be the cause of the delay?
j
Hi Anco. Maybe. That looks like a new warning with 2.7.1. There’s now an open issue where you can add any information and follow along.
a
I downgraded to Prefect 2.6.9 and the warning was gone. The delay was not 😟
j
Alright. Nice process of elimination! EDIT - sorry, I was confusing this with another issue.
a
My setup was all local, no Kubernetes or Docker involved.
r
Hi Anco! It sounds like this might just be a result of the time needed to prepare everything to run a flow. When you call run_deployment, quite a few things happen:
• A call gets sent to the Prefect API to create a new flow run.
• The flow run gets picked up by an agent.
• The agent opens a new subprocess and runs the flow's command (usually python -m prefect.engine).
• The new process has to start Python, which loads, parses, and initializes prefect and all of its dependencies.
• If using remote storage, Prefect downloads all the flow code from storage. Timing can vary depending on where the code is stored and how many files get downloaded. If the flow code is on the local filesystem, it gets copied (along with any other code in its directory or subdirectories) into a temporary subdirectory.
• The flow starts and has to connect to the already-running Dask cluster; that's usually quick, but can add a bit of delay.
5 to 15 seconds is within the normal range of what I'd expect in this scenario. If you want the flows to run in-process immediately, calling the flow functions directly instead of using run_deployment will accomplish that (see the sketch below), though it won't help when you need to run separate deployments with different settings or task runners.
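Here's a rough sketch of the difference I mean (the flow and deployment names are just placeholders):

from prefect import flow
from prefect.deployments import run_deployment

@flow
def my_flow(x: int = 1):
    ...

# goes through the API and an agent, so it pays the polling/startup cost described above
run_deployment(name="my-flow/my-deployment", parameters={"x": 42})

# runs immediately in the current process, but with this process's settings and task runner
my_flow(x=42)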
🙌 1
a
Thanks Ryan. We already have about 3 years of experience with Prefect and are currently running Prefect 1 with dask.distributed (and we are quite happy). On our production environment we run 200K flow executions daily across 15 flows. The reason I'm interested in deployments and run_deployment() is that it allows me to efficiently 're-run' flows just by changing the parameter values (roughly as sketched below). I had great hope that this would minimize data transfers and reduce the ever-growing data storage of the Dask workers (we restart all workers every 4 hours because of this). The concurrency setting of the work queue also seems a very interesting feature for controlling large peak loads.
I will try a setup on some bigger machines, because it seems there is no deliberate delay or waiting built in, so perhaps there is another second to gain. But if I'm not able to reduce the start time of the flow run to less than 1 second, it seems we are stuck with the old setup for a bit longer.
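To illustrate the re-run pattern I mean (a sketch only; the deployment name and parameters are placeholders, and I'm assuming run_deployment's timeout argument can be set to 0 so the call returns without waiting for the run to finish):

from prefect.deployments import run_deployment

for customer_id in ("a", "b", "c"):
    run_deployment(
        name="my-flow/my-deployment",
        parameters={"customer_id": customer_id},
        timeout=0,  # fire and forget; don't block until the flow run completes
    )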
r
I'll check with a few of my colleagues to see if we can suggest some deployment/run patterns that will help you meet all your objectives. 🙂
🙌 1
a
That would be great. Thanks!
a
Hey! Any news on this? Is it possible to have deployments without the 10-15 second delay (from downloading the code)? @Ryan Peden