Jean Morris
09/17/2025, 1:28 PM
Marvin
09/17/2025, 1:28 PM
Marvin
09/17/2025, 1:39 PM
- Event: Flow run failed (event name prefect.flow-run.Failed)
- Scope: choose what to watch (e.g., a specific deployment, a flow, or tags).
- For each: set this to deployment if you want one “recovery window” per deployment, so that failures on different deployments don’t interfere with each other.
- Step 2:
- Event: Flow run completed (event name prefect.flow-run.Completed)
- Posture: Proactive (meaning: “alert if this event does not happen”)
- Within: 3 hours
- Correlation: same deployment as Step 1, so any successful run of the same deployment counts as a recovery. To require that the exact same flow run later completes, correlate by flow run instead; most users prefer “same deployment” so that a new run can qualify as recovery.
3) Action:
- Add Send notification
- Select your notification block (Slack/Email/etc.), and customize the message.
4) Save and enable the automation.
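To make the intended behavior concrete, here is a minimal pure-Python sketch of the "failed, then no completion within 3 hours, per deployment" logic described in the steps above. It is a simplified model for reasoning about the rule, not Prefect's trigger engine and not Prefect API code. The function name unrecovered_failures, the tuple layout, and the short "failed"/"completed" labels are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Simplified state labels (hypothetical, not Prefect's real event names).
FAILED = "failed"
COMPLETED = "completed"


def unrecovered_failures(events, now, window=timedelta(hours=3)):
    """Return (deployment, failed_at) pairs that should trigger an alert.

    `events` is an iterable of (timestamp, deployment, state) tuples.
    A failure is flagged only when its window has fully elapsed at `now`
    and no completion for the same deployment arrived inside that window.
    """
    events = sorted(events)
    alerts = []
    for failed_at, deployment, state in events:
        if state != FAILED:
            continue
        deadline = failed_at + window
        recovered = any(
            other_state == COMPLETED
            and other_deployment == deployment
            and failed_at < other_at <= deadline
            for other_at, other_deployment, other_state in events
        )
        if not recovered and now >= deadline:
            alerts.append((deployment, failed_at))
    return alerts


t0 = datetime(2025, 9, 17, 10, 0)
events = [
    (t0, "etl", FAILED),
    (t0 + timedelta(hours=1), "etl", COMPLETED),      # recovered in time
    (t0, "reports", FAILED),
    (t0 + timedelta(hours=4), "reports", COMPLETED),  # too late
]
# Only the "reports" failure is flagged once its 3-hour window has passed.
alerts = unrecovered_failures(events, now=t0 + timedelta(hours=5))
```

Tracking each deployment separately is what keeps a success on one deployment from hiding a failure on another.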
Notes and clarifications
- Correlation choice matters:
- Same deployment: silences the alert if any run of that deployment completes within 3 hours after a failure (common “recovered” pattern).
- Same flow run: would only silence if the exact run later completes (less common).
- CLI: Prefect 3’s CLI does not support creating automations (you can list/inspect/enable/disable). Use the UI for creation.
- Docs:
- Event/automation concepts and trigger composition: https://docs-3.prefect.io/v3/concepts/event-triggers
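To illustrate why the correlation choice in the notes above matters, here is a small pure-Python sketch that evaluates the same events twice, once keyed by deployment and once keyed by flow run. It is an illustrative model only, not Prefect code. RunEvent, needs_alert, and the "failed"/"completed" labels are hypothetical names, and it assumes every 3-hour window has already elapsed.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class RunEvent(NamedTuple):
    at: datetime
    deployment: str
    flow_run: str
    state: str  # "failed" or "completed" (simplified labels)


def needs_alert(events, correlate_by, window=timedelta(hours=3)):
    """Return the correlation keys whose failure saw no completion in `window`.

    `correlate_by` is "deployment" or "flow_run" and selects which field
    must match between the failure and the later completion.
    """
    if correlate_by not in ("deployment", "flow_run"):
        raise ValueError("correlate_by must be 'deployment' or 'flow_run'")
    flagged = set()
    for failure in events:
        if failure.state != "failed":
            continue
        key = getattr(failure, correlate_by)
        recovered = any(
            e.state == "completed"
            and getattr(e, correlate_by) == key
            and failure.at < e.at <= failure.at + window
            for e in events
        )
        if not recovered:
            flagged.add(key)
    return flagged


t0 = datetime(2025, 9, 17, 10, 0)
events = [
    RunEvent(t0, "etl", "run-1", "failed"),
    # A new run of the same deployment succeeds one hour later.
    RunEvent(t0 + timedelta(hours=1), "etl", "run-2", "completed"),
]
by_deployment = needs_alert(events, "deployment")  # run-2 counts as recovery
by_flow_run = needs_alert(events, "flow_run")      # run-1 itself never completed
```

With "same deployment" the retry by a new run silences the alert; with "same flow run" the original failed run is still flagged.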
Questions for you (so I can tailor this precisely)
- Should “recovery” mean the exact same flow run eventually completes, or any subsequent run of the same deployment completes?
- Do you want this automation scoped to a specific deployment/flow, or across all deployments?