<@ULVA73B9P> why do some flows get stuck in runnin...
# ask-marvin
a
@Marvin why do some flows get stuck in running for days?
n
sorry marvin is down right now
this can happen for many reasons. sometimes the infra disappears (OOM, provider outage, etc.) and never updates the prefect API, so as far as the API knows, the flow run is still `Running`
a
ah ok. i've noticed that even when i set a timeout_seconds on the flow, it doesn't seem to time out long-running flows like this. what is the best solution here?
I've also tried using automations
some flows keep running for days and the only way to stop them is to manually cancel them
n
well, a `timeout_seconds` on the flow only works if the infra where the flow is happening still exists 🙂 do you know whether this is the case? what infra do your flows happen on? does it still exist for the flow run that appears as `Running` in the Prefect API?
a
right...
these are kubernetes nodes in azure. i believe we use spot nodes... maybe the nodes are getting evicted? we have also run into OOM issues
n
i would `k get pods` or similar and see if the flow run pods are still alive / existing. maybe you can `describe` them and see something interesting. but if they're totally gone, the prefect API is probably unaware and you may need to `set_flow_run_state` to `Cancelled` or something
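for reference, a minimal sketch of what that cleanup could look like with the Prefect Python client. this is an illustration, not something posted in the thread: `is_stale` and `cancel_stale_runs` are hypothetical helper names, the one-day cutoff is an arbitrary assumption, and the import paths assume a recent Prefect 2.x/3.x release (older versions keep the filter classes elsewhere). it defaults to a dry run, since it cannot tell a dead pod from a flow that is legitimately slow.

```python
from datetime import datetime, timedelta, timezone


def is_stale(start_time, now, max_age):
    """True if a run that started at start_time has been going longer than max_age."""
    return start_time is not None and now - start_time > max_age


async def cancel_stale_runs(max_age=timedelta(days=1), dry_run=True):
    """Find flow runs stuck in Running past max_age and optionally mark them Cancelled.

    Hypothetical helper; returns the ids of the runs it matched.
    """
    # Imported here so is_stale stays usable without Prefect installed.
    # Assumption: these paths match a recent Prefect 2.x/3.x release.
    from prefect import get_client
    from prefect.client.schemas.filters import (
        FlowRunFilter,
        FlowRunFilterState,
        FlowRunFilterStateType,
    )
    from prefect.client.schemas.objects import StateType
    from prefect.states import Cancelled

    running_only = FlowRunFilter(
        state=FlowRunFilterState(
            type=FlowRunFilterStateType(any_=[StateType.RUNNING])
        )
    )
    now = datetime.now(timezone.utc)
    matched = []

    async with get_client() as client:
        # Reads a single page of results; paginate with limit/offset if you have many runs.
        for run in await client.read_flow_runs(flow_run_filter=running_only):
            if not is_stale(run.start_time, now, max_age):
                continue
            matched.append(run.id)
            print(f"stale run: {run.name} ({run.id}), started {run.start_time}")
            if not dry_run:
                # force=True skips orchestration rules; it only updates the API record
                # and does not touch any infrastructure that might still exist.
                await client.set_flow_run_state(
                    run.id,
                    state=Cancelled(message="infra no longer exists"),
                    force=True,
                )
    return matched


# Usage (needs a configured PREFECT_API_URL):
#   import asyncio
#   asyncio.run(cancel_stale_runs(dry_run=False))
```

checking the pods first (as suggested above) is still worth doing before passing `dry_run=False`, because this only rewrites state in the API.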