# prefect-cloud
j
We have an automation set up to alert us when a queue becomes unhealthy. I've checked the logs of the Prefect agent that polls the problematic queue, and it has reported no errors at all 😕
Automation details
Trigger Type: Work queue health
Work Queues: <queue 1>, <queue 2>
Work Queue: Stays in Unhealthy
For: 2 Hours
The first prefect-cloud.automation.action.executed event's details look like this:
{
  "id": "ffcf0a09-c7cd-48f0-862a-a964474d0755",
  "account": "c1397d5f-b9f3-49e8-abb6-bce7d7b1412e",
  "event": "prefect-cloud.automation.action.executed",
  "occurred": "2023-03-20T19:19:49.450Z",
  "payload": {},
  "received": "2023-03-20T19:19:49.451Z",
  "related": [
    {
      "prefect.resource.id": "prefect-cloud.automation.82a15699-e855-4973-8c84-795b3b54ecb4",
      "prefect.resource.role": "automation",
      "prefect-cloud.name": "Production work queue unhealthy > 2h -> Slack",
      "prefect-cloud.posture": "Proactive"
    }
  ],
  "resource": {
    "prefect.resource.id": "prefect-cloud.automation.82a15699-e855-4973-8c84-795b3b54ecb4.action.0",
    "prefect-cloud.action.type": "send-notification"
  },
  "workspace": "32dfe242-315b-4405-b06d-8b6308d6b631"
}
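When comparing several of these events, it may help to reduce each one to the few fields that matter. The following is a minimal sketch using only the Python standard library; `summarize_event` and `parse_ts` are hypothetical helper names, not part of Prefect, and the event is a trimmed copy of the one above.

```python
import json
from datetime import datetime

# Trimmed copy of the event posted above; only the fields used below are kept.
RAW_EVENT = """
{
  "event": "prefect-cloud.automation.action.executed",
  "occurred": "2023-03-20T19:19:49.450Z",
  "received": "2023-03-20T19:19:49.451Z",
  "related": [
    {
      "prefect.resource.role": "automation",
      "prefect-cloud.name": "Production work queue unhealthy > 2h -> Slack",
      "prefect-cloud.posture": "Proactive"
    }
  ],
  "resource": {
    "prefect-cloud.action.type": "send-notification"
  }
}
"""


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_event(raw: str) -> dict:
    """Hypothetical helper: extract the automation name, action type, and timing."""
    event = json.loads(raw)
    # The automation that fired is listed among the related resources.
    automation = next(
        (r for r in event.get("related", [])
         if r.get("prefect.resource.role") == "automation"),
        {},
    )
    delay = parse_ts(event["received"]) - parse_ts(event["occurred"])
    return {
        "event": event["event"],
        "automation_name": automation.get("prefect-cloud.name"),
        "posture": automation.get("prefect-cloud.posture"),
        "action_type": event.get("resource", {}).get("prefect-cloud.action.type"),
        "occurred": parse_ts(event["occurred"]),
        "delivery_delay_seconds": delay.total_seconds(),
    }


summary = summarize_event(RAW_EVENT)
print(summary)
```

Collecting the `occurred` values from successive alerts this way gives a timeline that can be lined up against the agent's logs.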
What are the next steps to troubleshoot this? P.S. I'm running the Prefect agent via systemd on an EC2 instance (Ubuntu 22.04.2 LTS).
The queue reported as unhealthy has hourly jobs scheduled in it, and no flow run was skipped! So how can the queue have been unhealthy (not polled) for over 2 hours, repeatedly, when hourly jobs were run from it? 🤷
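The reasoning above can be checked mechanically: if the longest gap between consecutive flow-run start times is under the 2-hour threshold, the agent must have picked up work from the queue within that window. This is a rough sketch only; the timestamps are made up for illustration (they are not taken from the real workspace), and `longest_gap` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the automation settings: "Stays in Unhealthy For: 2 Hours".
UNHEALTHY_THRESHOLD = timedelta(hours=2)

# Illustrative (made-up) start times of hourly flow runs leading up to the alert.
run_starts = [
    datetime(2023, 3, 20, hour, 0, tzinfo=timezone.utc) for hour in range(15, 20)
]


def longest_gap(timestamps: list[datetime]) -> timedelta:
    """Return the largest interval between consecutive timestamps."""
    ordered = sorted(timestamps)
    return max(
        (later - earlier for earlier, later in zip(ordered, ordered[1:])),
        default=timedelta(0),
    )


gap = longest_gap(run_starts)
could_be_unpolled = gap >= UNHEALTHY_THRESHOLD
print(f"longest gap between runs: {gap}, consistent with alert: {could_be_unpolled}")
```

With real start times exported from the UI or API, a result of `False` here would support the suspicion that the alert does not match what the agent actually did.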
If someone has ideas on how to troubleshoot this further (if that is possible at all for a Cloud user), I can open a GitHub issue with my findings if need be.
Just got another one, and when I immediately checked the queue in the UI, it said it was "healthy". Is this some change in Prefect Cloud in the last 1-2 days?
w
Hi @jpuris, yes, if you could please open an issue with this detail plus any other reproduction steps, I'm happy to dig in.
j
@Will Raphaelson here is the GitHub issue: https://github.com/PrefectHQ/prefect/issues/8871 I wish I had more info to add 🤷
w
that’ll work, thanks