Schuyler Manchester

03/13/2023, 11:43 PM
Hey all, I'm wondering if there is a known issue with Automations, specifically sending notifications for flows entering failed state? I'm fairly certain it's broken right now.
I couldn't get our failed state automations to work, and while debugging I created an automation that sends a notification when a flow enters any state. I get one when it enters Scheduled and one when it enters Running, but when it fails, no notification is sent (this flow is specifically set up to fail to test automations). FWIW, failure notifications were working for us at least earlier this morning.
Prefect flow run notification
Flow run run_bid_evaluation_workflow/shiny-sidewinder observed in state Scheduled at 2023-03-13T23:39:17.441375+00:00.
Flow ID: e06dc4f4-dec8-4ba3-951b-494a25ac6a8b
Flow run ID: ee7fa292-73b9-408b-9e13-6af92485cefe
Flow run URL: <https://app.prefect.cloud/account/6109cab0-55a0-4188-a8cb-70d794fb03cd/workspace/0f784f89-6d28-4e49-a013-abe9e2474aa7/flow-runs/flow-run/ee7fa292-73b9-408b-9e13-6af92485cefe>
State message: Run from the Prefect UI with defaults
Prefect Notifications | Today at 4:39 PM
4:41
Prefect flow run notification
Flow run run_bid_evaluation_workflow/shiny-sidewinder observed in state Running at 2023-03-13T23:41:44.952160+00:00.
Flow ID: e06dc4f4-dec8-4ba3-951b-494a25ac6a8b
Flow run ID: ee7fa292-73b9-408b-9e13-6af92485cefe
Flow run URL: <https://app.prefect.cloud/account/6109cab0-55a0-4188-a8cb-70d794fb03cd/workspace/0f784f89-6d28-4e49-a013-abe9e2474aa7/flow-runs/flow-run/ee7fa292-73b9-408b-9e13-6af92485cefe>
State message: None
Prefect Notifications | Today at 4:41 PM

Will Raphaelson

03/14/2023, 1:49 AM
Hi @Schuyler Manchester , thanks for raising the issue, can you give me the URL of the automation in question and I’ll look into it?

Will Raphaelson

03/14/2023, 2:17 AM
looking into this now, can you click into the failed state automation that isn't firing and give me that URL, and also ideally the URL of a flow run that failed that didn't trigger the automation? thanks again
Hey @Will Raphaelson I think I figured out the issue. It appears that if there is an extremely large error message, the automation fails. Once I truncated the error message it worked. So it looks like there is a bug to fix within Automations.
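The workaround described above (shortening the error message before the flow run fails) could look roughly like the sketch below. It is plain Python with no Prefect calls. The names `MAX_MESSAGE_CHARS`, `truncate_message`, `run_step`, and `run_step_with_short_error` are hypothetical, and the 2000-character limit is an assumption, since the thread never states the size at which notifications stop working.

```python
# Hypothetical limit: the thread does not say how large "extremely large" is.
MAX_MESSAGE_CHARS = 2000


def truncate_message(message: str, limit: int = MAX_MESSAGE_CHARS) -> str:
    """Return the message unchanged if it fits, else cut it and mark the cut."""
    if len(message) <= limit:
        return message
    # The marker is appended after the cut, so the result is slightly longer than `limit`.
    return message[:limit] + f"... [truncated {len(message) - limit} chars]"


def run_step() -> None:
    # Stand-in for flow code that fails with a very large error payload.
    raise ValueError("x" * 50_000)


def run_step_with_short_error() -> None:
    try:
        run_step()
    except Exception as exc:
        # Re-raise with a bounded message; the original exception stays
        # reachable through exception chaining (`__cause__`).
        raise RuntimeError(truncate_message(str(exc))) from exc
```

Calling `run_step_with_short_error()` from inside a flow would make the run fail with the shortened message, while the chained exception still carries the full text.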

Will Raphaelson

03/14/2023, 5:09 PM
oh nice, i was just digging into this as well. did you see an automation failure event when you click into the automation and tab over to events? I think there might be something else at play here as well that i’ll write up shortly, checking with my team on something.

Schuyler Manchester

03/14/2023, 5:16 PM
I do not see the `prefect-cloud.automation.triggered` and `prefect-cloud.automation.action.executed` events when an extremely long error message is present for a failed flow run.

Will Raphaelson

03/14/2023, 5:17 PM
hmm okay, i would expect to surface an automation failed event, I’ll have someone look into this. thanks.
👍 1
So the other thing at play is that the default for triggers made via the UI is that they can essentially only fire every ten seconds to avoid a runaway loop. I notice that the flow run goes from running to terminal state in about one second, so it's expected that the running state would fire, but not the terminal one. There are ways around this if you want to use a trigger policy like that in production, not just to debug.
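The cooldown behavior described above can be pictured with a small toy model. This is only an illustration of a "fire at most once per ten seconds" rule, not Prefect's implementation. The class `CooldownTrigger` and its fields are hypothetical, and the timestamps are rounded offsets in seconds taken from the notifications earlier in the thread (Scheduled at 23:39:17, Running at 23:41:44, failure about one second later).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownTrigger:
    """Toy model of a trigger that may fire at most once per `cooldown` seconds."""

    cooldown: float = 10.0  # the ten-second figure comes from the thread
    last_fired: Optional[float] = None

    def observe(self, timestamp: float) -> bool:
        """Return True if an event at `timestamp` (in seconds) fires the trigger."""
        if self.last_fired is not None and timestamp - self.last_fired < self.cooldown:
            return False  # still cooling down, so this event is dropped
        self.last_fired = timestamp
        return True


trigger = CooldownTrigger()
# Offsets in seconds from the Scheduled notification (rounded).
events = [("Scheduled", 0.0), ("Running", 147.0), ("Failed", 148.0)]
fired = [state for state, ts in events if trigger.observe(ts)]
print(fired)  # ['Scheduled', 'Running']
```

Under this model the Failed event lands inside the cooldown window opened by Running, which matches the debugging automation sending notifications for Scheduled and Running only.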

Schuyler Manchester

03/14/2023, 9:04 PM
Sure, but I have automations that are not working where I want to know if a flow run enters a failed state (and it won't notify me if the error message is too large). The other one monitoring all state changes was just for debugging purposes.

Will Raphaelson

03/14/2023, 9:53 PM
yeah i gotcha, i’ll look into this tomorrow!
👍 1

Schuyler Manchester

03/20/2023, 3:00 PM
Hey @Will Raphaelson, wanted to follow up on this issue? Any chance you created a bug/issue for tracking this?

Will Raphaelson

03/20/2023, 3:22 PM
Hi Schuyler, sorry for not closing the loop on this. The decision we’ve made is to truncate very large error messages to keep the event -> notification flow running. You will always see your full error message in logs. This is in production now - any concerns? Happy to revisit this.

Schuyler Manchester

03/20/2023, 4:44 PM
Nope, that sounds great! Thanks!