
Prefect Community

Hi, when a worker is killed or crashes while executing a flow run, the flow run gets stuck in the `running` state. This is confusing and inconsistent, so I'd like to resolve it and get consistent states on our self-hosted Prefect server.
I was hoping that a migration from agents to workers would fix it, but despite the worker's heartbeat functionality and workers being recognised as unhealthy/dead by the server, the flow run still remains in `running`.

I can work around this by updating flows manually from external processes (e.g. k8s lifecycle hooks on pods), but I would like to avoid having to manage flow state from outside of Prefect itself.
I'd be curious to learn how others are approaching this issue. Maybe I'm still missing something in Prefect; my expectation was that it would mark flow runs from dead workers as `failed`.
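
For reference, the external workaround I mean looks roughly like the sketch below: a small script that a k8s `preStop` hook could call to tell the server the run did not finish. This is a minimal sketch under assumptions, not a tested setup: it assumes the pod exposes `PREFECT_API_URL` and the flow run ID as `PREFECT__FLOW_RUN_ID`, that the server accepts `POST /flow_runs/{id}/set_state`, and the helper names `build_set_state_request` and `mark_crashed` are made up for illustration.

```python
import json
import os
import urllib.request


def build_set_state_request(api_url, flow_run_id, message):
    """Return the (url, body) pair for a set-state call that proposes a crashed state."""
    url = f"{api_url.rstrip('/')}/flow_runs/{flow_run_id}/set_state"
    payload = {
        "state": {"type": "CRASHED", "name": "Crashed", "message": message},
        # force=False leaves the decision to the server's orchestration rules,
        # so the server may still reject the proposed transition.
        "force": False,
    }
    return url, json.dumps(payload).encode("utf-8")


def mark_crashed(api_url, flow_run_id, message="Pod terminated while the flow run was active"):
    """Send the set-state request and return the server's decoded JSON response."""
    url, body = build_set_state_request(api_url, flow_run_id, message)
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed environment variables; adjust to whatever the pod actually provides.
    mark_crashed(os.environ["PREFECT_API_URL"], os.environ["PREFECT__FLOW_RUN_ID"])
```

It works, but it means the pod spec has to know about Prefect's API, which is exactly the coupling I'd like to avoid.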

Does anyone have thoughts on how to handle flow runs getting stuck in an inconsistent `running` state after worker restarts, crashes, etc.?
It seems like something many people would run into.