# ask-community
a
Hello fine people, I'm seeing an odd state transition fairly regularly (at least once a day): we get a false positive "crashed" flow run, where the state transitions from:
Scheduled → Pending → Crashed → Running → Completed
For now, we've adjusted our alerting to only trigger if a flow "stays in {state} for 5 minutes", but it would be nice to be able to alert when a flow "enters {state}". lmk if there is a better forum for this bug report.
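The "stays in {state} for 5 minutes" workaround can be sketched as a small debounce check. This is only an illustration of the idea, not Prefect's alerting API: `should_alert`, the `(timestamp, state)` history format, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical history format: (timestamp, state name) pairs for one flow run.
Transition = Tuple[datetime, str]


def should_alert(
    history: List[Transition], state: str, hold: timedelta, now: datetime
) -> bool:
    """Return True only if the run's latest state is `state` and it has
    been in that state for at least `hold`."""
    if not history:
        return False
    # The most recent transition determines the current state.
    entered_at, current = max(history, key=lambda t: t[0])
    return current == state and now - entered_at >= hold


t0 = datetime(2023, 1, 1, 12, 0, 0)
# A run that briefly shows Crashed, then recovers to Running.
recovered = [
    (t0, "Scheduled"),
    (t0 + timedelta(minutes=1), "Pending"),
    (t0 + timedelta(minutes=3), "Crashed"),
    (t0 + timedelta(minutes=3, seconds=30), "Running"),
]
# A run that enters Crashed and never leaves it.
stuck = recovered[:3]

print(should_alert(recovered, "Crashed", timedelta(minutes=5), t0 + timedelta(minutes=10)))
print(should_alert(stuck, "Crashed", timedelta(minutes=5), t0 + timedelta(minutes=9)))
```

The trade-off is the one described above: a transient Crashed state is ignored, but a real crash is only reported after the hold period has passed.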
w
Hey Alex, that false positive crash state is interesting. Can you tell me a bit about your infra and job setup?
a
We're using prefect-aws. These are ECSTasks running on Fargate.
We're using Prefect Cloud.
Could this be a timeout issue? https://prefecthq.github.io/prefect-aws/ecs/#prefect_aws.ecs.ECSTask.task_start_timeout_seconds The default is 120 seconds, and the task transitions to "running" more than 120 seconds after it entered the "pending" state.
I might be answering my own question... I'm guessing that is possible if the flow run reports "RUNNING" after the agent reports the run as "CRASHED". It might be helpful if the Event log in the UI indicated:
• Who reported the state change (agent or task)
• Why the state changed (likely due to timeout in this case)
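To make the guess concrete, here is a toy model of the suspected race. It is not how Prefect or prefect-aws is implemented: `simulate_states` and its parameters are hypothetical, and it assumes the watcher reports Crashed when the start timeout elapses without stopping the task, so the task can still report Running afterwards.

```python
from typing import List, Tuple


def simulate_states(
    startup_seconds: float,
    start_timeout_seconds: float,
    run_seconds: float = 60.0,
) -> List[str]:
    """Return the ordered states a run would show under the assumed race.

    startup_seconds: time from Pending until the task itself reports Running.
    start_timeout_seconds: how long the watcher waits before reporting Crashed.
    """
    events: List[Tuple[float, str]] = [(0.0, "Pending")]
    if startup_seconds > start_timeout_seconds:
        # Assumption: the watcher gives up first and marks the run Crashed.
        events.append((start_timeout_seconds, "Crashed"))
    # The task still starts and reports its own state afterwards.
    events.append((startup_seconds, "Running"))
    events.append((startup_seconds + run_seconds, "Completed"))
    return [name for _, name in sorted(events)]


# Startup slower than the 120 second default reproduces the odd sequence.
print(simulate_states(startup_seconds=150, start_timeout_seconds=120))
# A timeout above the observed startup time avoids it in this model.
print(simulate_states(startup_seconds=150, start_timeout_seconds=300))
```

If the model matches what is happening, raising `task_start_timeout_seconds` above the slowest observed Fargate startup would remove the spurious Crashed state.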