Josh Paulin
03/02/2023, 7:41 PM
Will Raphaelson
03/02/2023, 7:45 PM
Josh Paulin
03/02/2023, 7:49 PM
Will Raphaelson
03/02/2023, 7:58 PM
{
  "trigger": {
    "match": {
      "prefect.resource.id": "prefect.flow-run.*"
    },
    "match_related": {
      "prefect.resource.id": [
        "prefect.tag.prod"
      ],
      "prefect.resource.role": "tag"
    },
    "after": [],
    "expect": [
      "prefect.flow-run.Failed"
    ],
    "for_each": [
      "prefect.resource.id"
    ],
    "posture": "Reactive",
    "threshold": 1,
    "within": 10
  }
}
but you'd set "threshold" to 4 and "within" to the number of seconds over which you want to monitor.
Josh Paulin
03/02/2023, 8:24 PM
alert, is it the match_related that would be updated to be prefect.tag.alert?
Also, I guess it's not possible to get quite the setup I'm looking for with 4 consecutive runs? Since my deployments have differing run times and schedules, I think there could be false negatives unless I create separate automations tailored to each deployment's characteristics.
To paint an extreme scenario, assume I have Flow A that runs very fast (< 1 min) and Flow B that runs very slow (> 1 hour). If I tried to create a single automation to cover both, I'd end up with either A alerting too often because the within time is so large that it overcounts, or B missing alerts altogether because, even though it's constantly failing, the failures aren't close enough together in time.
n runs, no matter how close or far apart they were
Will Raphaelson
03/02/2023, 9:07 PM
Josh Paulin
03/06/2023, 5:59 PM
{
  "trigger": {
    "match": {
      "prefect.resource.id": "prefect.flow-run.*"
    },
    "match_related": {
      "prefect.resource.id": [
        "prefect.tag.alert"
      ],
      "prefect.resource.role": "tag"
    },
    "after": [],
    "expect": [
      "prefect.flow-run.Failed"
    ],
    "for_each": [
      "prefect.resource.id"
    ],
    "posture": "Reactive",
    "threshold": 4,
    "within": 10800
  }
}
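To make the Flow A / Flow B scenario above concrete, here is a minimal Python sketch of a sliding-window threshold check. It is a simplified model, not Prefect's actual evaluation logic: the function name `fires` and the failure timestamps are hypothetical, and only the values 4 and 10800 come from the config above.

```python
def fires(failure_times, threshold, within):
    """Return True if any `threshold` failures fall inside a `within`-second window.

    Assumption: a simplified model of a threshold/within trigger,
    not Prefect's real implementation.
    """
    times = sorted(failure_times)
    # Compare each failure with the one `threshold - 1` positions after it.
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= within:
            return True
    return False


# Hypothetical failure timestamps, in seconds.
flow_a = [i * 60 for i in range(4)]    # fast flow, fails every minute
flow_b = [i * 7200 for i in range(4)]  # slow flow, fails every two hours

print(fires(flow_a, threshold=4, within=10800))  # True
print(fires(flow_b, threshold=4, within=10800))  # False
```

In this model, Flow B's four failures span 21600 seconds, so a 10800-second window never holds all four of them, even though every run failed.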
Josh Paulin
03/06/2023, 6:13 PM
Will Raphaelson
03/06/2023, 6:13 PM