Prefect Community #ask-community


Jonathan Samples

02/14/2025, 12:07 AM

Hey all! I'm seeing some strange behavior. I've deployed a self-hosted Prefect server to ECS, and I'm using an ECS worker. Here is the strangeness: I have a parent flow that spawns many subflows. If a subflow fails, the failure propagates up and the parent flow fails, BUT the sibling flows don't fail, even though the infrastructure that was running them has shut down. They then remain in a Running state forever (or until I manually delete them). Can someone shed light on what I'm seeing?

Nate

02/14/2025, 12:24 AM

hi @Jonathan Samples are those subflows on the same infra as the parent? or are they subflows but triggered via run_deployment? most commonly, zombie flow runs happen because the infra OOM'd and can no longer report state, so the Prefect API never gets an update

Jonathan Samples

02/14/2025, 12:25 AM

Thanks for the reply @Nate! Yes, those subflows are on the same infra... But it's not an OOM error. Actually, I think the flow infra couldn't communicate with the Prefect server (503)

Jonathan Samples

02/14/2025, 12:25 AM

But maybe a 503 to prefect server acts like an OOM in this case?

Nate

02/14/2025, 12:28 AM

hmm. I'd think the client would try again on a 503, or at least by default it ought to
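
For readers following along, the retry behavior being discussed can be sketched roughly as below. This is a generic illustration of retrying on a 503 with exponential backoff, not Prefect's actual client code: `call_with_retries`, `RETRYABLE_STATUS`, the `send` callable, and the default values are all hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, Tuple

# Assumption: an illustrative set of "try again" status codes,
# not read from Prefect's configuration.
RETRYABLE_STATUS = {429, 502, 503}


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or retries run out.

    send is a hypothetical stand-in for one HTTP request; it returns
    (status_code, body). The last response is returned either way.
    """
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt >= max_retries:
            return status, body
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * (2 ** attempt))
        attempt += 1
```

If the server keeps answering 503 for longer than the whole retry window, the final state update is never delivered. That would be one plausible way for a run to stay in Running after its infrastructure has exited, though the thread does not confirm that this is what happened here.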

Jonathan Samples

02/14/2025, 12:28 AM

Is there any way of accounting for these zombie flows?

Jonathan Samples

02/14/2025, 12:28 AM

Does prefect server implement any kind of timeout for hearing back from flows?

Nate

02/14/2025, 12:29 AM

yep there's an approach using automations https://docs.prefect.io/v3/automate/events/automations-triggers#detect-and-respond-to-zombie-flows
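
The idea behind that kind of automation, as far as this thread describes it, is a timeout on hearing back from a flow run. A minimal conceptual sketch of that check follows. It does not use Prefect's automation API: `find_zombies`, the `last_heartbeat` mapping, and the 90-second default are hypothetical, chosen only to show the logic.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def find_zombies(
    last_heartbeat: Dict[str, datetime],
    now: datetime,
    timeout: timedelta = timedelta(seconds=90),  # assumption: arbitrary window
) -> List[str]:
    """Return the IDs of runs whose last heartbeat is older than timeout.

    last_heartbeat maps a run ID to the time the server last heard from it.
    A run that has gone silent for longer than timeout is treated as a
    zombie and could then be moved to a terminal state such as Crashed.
    """
    return sorted(
        run_id
        for run_id, seen in last_heartbeat.items()
        if now - seen > timeout
    )
```

In the scenario above, the sibling subflows whose ECS tasks have shut down would stop reporting in, so a check like this would flag them once the timeout has passed instead of leaving them in Running indefinitely.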
