Preston Sharpe
10/16/2025, 1:26 PM
{
  "type": "event",
  "match": {
    "prefect.resource.id": [
      "prefect.flow-run.*"
    ]
  },
  "match_related": {},
  "after": [
    "prefect.flow-run.heartbeat"
  ],
  "expect": [
    "prefect.flow-run.Failed",
    "prefect.flow-run.Crashed",
    "prefect.flow-run.heartbeat",
    "prefect.flow-run.Cancelled",
    "prefect.flow-run.Completed"
  ],
  "for_each": [
    "prefect.resource.id"
  ],
  "posture": "Proactive",
  "threshold": 1,
  "within": 90
}
I have confirmed that PREFECT_RUNNER_HEARTBEAT_FREQUENCY is set to 30.
The false trigger seems to happen when a flow starts, emits one heartbeat, and then succeeds or fails before the next heartbeat. This still triggers the automation.
Marvin
10/16/2025, 1:26 PM
Marvin
10/16/2025, 1:32 PM
With after: ["prefect.flow-run.heartbeat"], the automation opens a new “expectation window” every time a heartbeat event is observed. In very fast runs, a window can open at an unlucky moment, for example on a trailing or late heartbeat near completion. No further heartbeat or terminal event arrives after that heartbeat, so the window times out and the automation fires even though the run completed successfully.
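To make the window behavior concrete, here is a minimal sketch of a toy model of a proactive trigger. It is not Prefect's evaluator: the function `unmet_windows`, its matching rule, and the timestamps are assumptions chosen only to illustrate how a trailing heartbeat can open a window that nothing closes.

```python
HEARTBEAT = "prefect.flow-run.heartbeat"
RUNNING = "prefect.flow-run.Running"
COMPLETED = "prefect.flow-run.Completed"
TERMINAL = {
    COMPLETED,
    "prefect.flow-run.Failed",
    "prefect.flow-run.Cancelled",
    "prefect.flow-run.Crashed",
}


def unmet_windows(events, after, expect, within):
    """Return the open times of windows that no expected event closed in time.

    events: list of (seconds, event_name) tuples in arrival order.
    Toy rule (an assumption, not Prefect's implementation): each `after`
    event opens a window, and the window is met if any later event in
    `expect` arrives no more than `within` seconds after it opened.
    """
    unmet = []
    for i, (opened_at, name) in enumerate(events):
        if name not in after:
            continue
        met = any(
            later_name in expect and later_at - opened_at <= within
            for later_at, later_name in events[i + 1:]
        )
        if not met:
            unmet.append(opened_at)
    return unmet


expect = TERMINAL | {HEARTBEAT}

# Illustrative fast run with a trailing heartbeat after Completed.
fast_run = [(0, RUNNING), (1, HEARTBEAT), (12, COMPLETED), (13, HEARTBEAT)]

heartbeat_anchored = unmet_windows(fast_run, after={HEARTBEAT}, expect=expect, within=90)
running_anchored = unmet_windows(fast_run, after={RUNNING}, expect=expect, within=60)

print(heartbeat_anchored)  # [13]: the trailing heartbeat opens a window nothing closes
print(running_anchored)    # []: the single Running window is met by the first heartbeat
```

In this model the heartbeat at second 1 is satisfied by Completed at second 12, so only a heartbeat that arrives after the terminal event produces a false positive.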
Two ways to avoid these false positives:
1) Anchor on Running instead of heartbeat
This pattern expects either a heartbeat or a terminal state shortly after the run enters Running. It’s robust for fast runs and avoids spurious windows started by late/lone heartbeats.
{
  "type": "event",
  "match": {
    "prefect.resource.id": [
      "prefect.flow-run.*"
    ]
  },
  "match_related": {},
  "after": [
    "prefect.flow-run.Running"
  ],
  "expect": [
    "prefect.flow-run.heartbeat",
    "prefect.flow-run.Completed",
    "prefect.flow-run.Failed",
    "prefect.flow-run.Cancelled",
    "prefect.flow-run.Crashed"
  ],
  "for_each": [
    "prefect.resource.id"
  ],
  "posture": "Proactive",
  "threshold": 1,
  "within": 60
}
- With PREFECT_RUNNER_HEARTBEAT_FREQUENCY=30, set within to something like 45–60s. This catches true zombies (no heartbeat and no terminal state after Running) but won’t trip for quick completions.
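If the heartbeat frequency changes later, the window should change with it. The sketch below builds the Running-anchored trigger as a plain Python dict and derives within from the heartbeat interval. The helper name `running_anchored_trigger` is hypothetical, and the 2x multiplier is an assumption that simply reproduces the top of the 45–60s range for a 30s heartbeat.

```python
import json


def running_anchored_trigger(heartbeat_seconds: int) -> dict:
    """Build the Running-anchored trigger shown above for a given heartbeat interval."""
    return {
        "type": "event",
        "match": {"prefect.resource.id": ["prefect.flow-run.*"]},
        "match_related": {},
        "after": ["prefect.flow-run.Running"],
        "expect": [
            "prefect.flow-run.heartbeat",
            "prefect.flow-run.Completed",
            "prefect.flow-run.Failed",
            "prefect.flow-run.Cancelled",
            "prefect.flow-run.Crashed",
        ],
        "for_each": ["prefect.resource.id"],
        "posture": "Proactive",
        "threshold": 1,
        # Assumption: allow two heartbeat intervals before treating the run as a zombie.
        "within": heartbeat_seconds * 2,
    }


trigger = running_anchored_trigger(30)
print(json.dumps(trigger, indent=2))  # JSON with the same fields as the trigger above
```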
2) If you really want to monitor mid-run heartbeats continuously
Use the same “after Running” window above to confirm that the first heartbeat or terminal event shows up, then add a second expectation: after a heartbeat, expect another heartbeat or a terminal event. Only do this if you’re confident that no late heartbeats occur after terminal states in your environment. If you see any false positives with the second expectation, prefer the single Running-anchored expectation above.
A couple of checks to confirm the root cause:
- In one of the false positives, open the Event feed/timeline for that flow run and confirm the order and timestamps of:
  - Running
  - heartbeat
  - Completed/Failed/Cancelled/Crashed
- If the window was opened by a heartbeat that arrived after the terminal event, no later event could satisfy the expectation, which explains why it fired.
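Once you have copied the events out of the timeline, the ordering check can be done mechanically. This is a minimal sketch that works on plain (timestamp, event name) pairs and does not call any Prefect API. The function name `late_heartbeats` is hypothetical and the timestamps are made up for illustration, not taken from a real run.

```python
from datetime import datetime

HEARTBEAT = "prefect.flow-run.heartbeat"
TERMINAL_EVENTS = {
    "prefect.flow-run.Completed",
    "prefect.flow-run.Failed",
    "prefect.flow-run.Cancelled",
    "prefect.flow-run.Crashed",
}


def late_heartbeats(timeline):
    """Return timestamps of heartbeats that occur after the first terminal event."""
    ordered = sorted(timeline, key=lambda event: event[0])
    terminal_at = next((ts for ts, name in ordered if name in TERMINAL_EVENTS), None)
    if terminal_at is None:
        return []  # no terminal event yet, so no heartbeat can be "late"
    return [ts for ts, name in ordered if name == HEARTBEAT and ts > terminal_at]


# Illustrative timeline (made-up timestamps).
timeline = [
    (datetime(2025, 10, 16, 13, 0, 0), "prefect.flow-run.Running"),
    (datetime(2025, 10, 16, 13, 0, 1), HEARTBEAT),
    (datetime(2025, 10, 16, 13, 0, 12), "prefect.flow-run.Completed"),
    (datetime(2025, 10, 16, 13, 0, 13), HEARTBEAT),
]

late = late_heartbeats(timeline)
print(late)  # one entry: the heartbeat at 13:00:13, after Completed at 13:00:12
```

A non-empty result would be consistent with the late-heartbeat explanation. An empty result suggests looking for a different cause.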
Notes and references:
- Prefect event automation semantics (after/expect/within/threshold/posture/for_each): Event triggers and expectations
- Zombie flow patterns and guidance: Detect zombie flows
If you can share one example run’s event timeline (the 3–4 events and their timestamps), I can validate the exact ordering and suggest a minimal tweak.
Preston Sharpe
10/16/2025, 1:40 PM