
Hui Zheng

10/09/2020, 12:01 AM
Hello Prefect, we have a scheduled flow that runs on a k8s agent. Sometimes a run starts 10 minutes later than its scheduled time. It seems related to some unresponsiveness of the tasks, and the prefect-server has attempted a reschedule: "Rescheduled by a Lazarus process."
For example, the run in the screenshot was scheduled for 11:10 but didn't actually start until 11:20. Could anyone help me understand why this happens and how to prevent it? We are building a new flow that needs to run every 10 minutes with a very strict SLA, and a 10-minute delay would be fatal to it. Thank you.

Thomas Hoeck

10/09/2020, 10:20 AM
Are you running it on AKS, @Hui Zheng? The reason I'm asking is that there is a bug on AKS with in-cluster comms (https://github.com/Azure/AKS/issues/1052) that yields this error.

nicholas

10/09/2020, 12:33 PM
Hi @Hui Zheng - this sounds infrastructure-related; could you give some more information about your flow's environment, executor, your k8s cluster, etc.?

Hui Zheng

10/13/2020, 7:19 PM
@Thomas Hoeck @nicholas We run it on Google GKE.