# prefect-community
a
A short update on the above: all systems are back to normal since 11:31 AM UTC - the status page is updated and we keep monitoring the services. Thanks to everyone reporting the issue!
f
We are still seeing some fallout: flows getting canceled or killed by Lazarus.
a
Thanks Florian, could you post more info on what you see? This might be unrelated to the incident. I'm afk but can check in the evening if you post more info, incl. the flow run ID
f
Sure.
cb076c5-f718-4225-b9e9-bb15e0d19665
This is probably the one. For some reason we were notified about it twice. I just realized the other errors were from before 11:31 UTC.
👍 1
a
weird, I can't find this run in the system
are you sure you sent the right flow run ID?
so it might be related to the brief API issue we had today - LMK if not and we can investigate more
f
I will check again tomorrow, but I copied the id from a notification link.
👍 1
Somehow the first character went missing. Sorry about that:
1cb076c5-f718-4225-b9e9-bb15e0d19665
a
I checked the logs, and it doesn't seem related to the outage but to task runs losing heartbeats in general. I was trying to find out more about the task runs that lost heartbeat, e.g. task run ID 40497634-4fe7-4005-93c8-f0e6d4a06a2b, but it doesn't show any log messages in our system. Honestly, I don't know how to help here: the logs show very little. It looks like your tasks lost communication with the API, so we don't have any state updates or logs in the system, and Lazarus was triggered to reschedule the flow run. Lazarus was trying to be helpful here by rescheduling the work that couldn't be finished.
👍 1