Prefect Community #prefect-cloud

Chris Whatley

02/17/2023, 9:27 PM

I just went into Cloud for the first time in a day or two, and it looks like my notification rules were migrated. They didn’t pick up a failed run from this AM. Additionally, the notification method (Slack) is now sealed up in an “anonymous block” that I can’t see or edit. Do I need to recreate the automations?

Will Raphaelson

02/17/2023, 9:35 PM

Hi Chris, thanks for flagging. The old style of notification was migrated to automations about a month ago, and yes, you are correct that we chose to keep using the old anonymous blocks we were using under the hood so as to (ideally) not disrupt service. If you’d like to change the Slack block, then yes, recreating the automation will give you that flexibility moving forward. This shouldn’t have caused any missed alerting though - if you have reproduction steps, would you file an issue in our GitHub repository?

Chris Whatley

02/17/2023, 9:48 PM

OK. I’ll do so. I’m also seeing a 15-minute delay in notifications coming through, which has made evaluating this a little confusing.

Chris Whatley

02/17/2023, 9:53 PM

Actually, I’m not really sure what I can put in a bug report. All I know is that the notification did not fire this morning on job failure like it always has, and that when I retried the job after modifying the automation to fire on a completed state, it did send the notification - only it was 15 minutes after the run finished instead of close to immediate. All the runs I’ve done since are also working, except that there’s a 15-minute delay.

Will Raphaelson

02/17/2023, 10:02 PM

Got it, thanks and sorry for the snag there, let me have someone on the team look into it.

Chris Guidry

02/21/2023, 2:19 PM

Thanks for that report, Chris, and I'm very sorry about that. During the timeframe when you were experiencing that on Friday afternoon, I had released a change to improve the reliability of the automation triggering system, but it ended up creating a bottleneck that delayed message processing. It took until around 5:30-6pm ET to get it back to a steady state. Have you observed any problems since then?

Chris Whatley

02/21/2023, 2:36 PM

Thanks for checking up. I believe everything has been going through well since then.

Chris Guidry

02/21/2023, 2:39 PM

That's good to know! For more context, we've been putting in some protections about event ordering (like making sure that Running events are processed before their subsequent Completed events) and it's proving a little tricky with the volume of events we're seeing. Thanks for your patience and let us know if you see any more trouble.
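
To illustrate the kind of ordering protection described above, here is a minimal sketch in Python. It is not Prefect's implementation. The class name OrderedEventProcessor, the PREDECESSOR table, and the run ID "run-1" are all hypothetical, and the only rule assumed is the one mentioned in the thread: a Completed event is held until the Running event for the same run has been processed.

```python
from collections import defaultdict

# Assumed ordering rule (hypothetical): each state lists the state that
# must be processed first for the same run, or None if it has no prerequisite.
PREDECESSOR = {"Running": None, "Completed": "Running"}


class OrderedEventProcessor:
    """Holds back events that arrive before their prerequisite event."""

    def __init__(self):
        self.seen = defaultdict(set)    # run_id -> states already processed
        self.held = defaultdict(list)   # run_id -> states waiting on a prerequisite
        self.processed = []             # (run_id, state) in processing order

    def receive(self, run_id, state):
        required = PREDECESSOR.get(state)
        if required is not None and required not in self.seen[run_id]:
            # Prerequisite not processed yet: buffer instead of processing.
            self.held[run_id].append(state)
            return
        self._process(run_id, state)
        # Release any buffered events whose prerequisite is now satisfied.
        progressed = True
        while progressed:
            progressed = False
            for waiting in list(self.held[run_id]):
                if PREDECESSOR.get(waiting) in self.seen[run_id]:
                    self.held[run_id].remove(waiting)
                    self._process(run_id, waiting)
                    progressed = True

    def _process(self, run_id, state):
        self.seen[run_id].add(state)
        self.processed.append((run_id, state))


# Events for the same run arrive out of order.
processor = OrderedEventProcessor()
processor.receive("run-1", "Completed")
processor.receive("run-1", "Running")
print(processor.processed)
```

In this sketch the Completed event is only processed once Running has been, so any time spent waiting in the buffer shows up as added latency, which is one way an ordering safeguard can turn into a delay under load.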
