Esdras Lopes Nani

11/23/2022, 4:30 PM
Hey everyone! I'm having some problems with my agents (2 agents): they are not submitting any flow runs, which has caused a backlog of late runs in my deployments (currently 2513 Late runs). They are deployed on an EC2 instance with systemctl. I restarted them and installed the latest version of Prefect (2.6.9), and the logs only show this (no submissions and no errors):
Agent started! Looking for work from queue(s): xxxxx...
Don't know what else I can do 😅 Thanks!

Ryan Peden

11/23/2022, 5:14 PM
Hi Esdras! Does the agent show the right API URL when it starts up? The output should look something like:
Starting v2.6.9 agent connected to <http://your-api-url>...

Esdras Lopes Nani

11/23/2022, 5:17 PM
Yes, it does, with the right account and workspace IDs. I set PREFECT_LOGGING_LEVEL to DEBUG and the logs are just:
| DEBUG   | prefect.agent - Checking for flow runs...
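
For reference, one way to turn on debug logging for a single agent process, without changing the saved profile, is to set the variable in the child process environment. This is a minimal sketch, assuming the agent is started with `prefect agent start -q <queue>`; the helper name `agent_invocation` and the queue name `my-queue` are hypothetical.

```python
import os
from typing import Dict, List, Tuple


def agent_invocation(
    queues: List[str], level: str = "DEBUG"
) -> Tuple[List[str], Dict[str, str]]:
    """Build the command line and environment for an agent with a given log level."""
    cmd = ["prefect", "agent", "start"]
    for queue in queues:
        cmd += ["-q", queue]  # one -q flag per work queue
    env = dict(os.environ)  # copy, so the parent environment is left untouched
    env["PREFECT_LOGGING_LEVEL"] = level
    return cmd, env


# "my-queue" is a placeholder queue name, not one from this thread.
cmd, env = agent_invocation(["my-queue"])
# To actually start the agent: subprocess.run(cmd, env=env, check=True)
```

When the agent runs under systemd, as it does here, the same variable would normally go in an `Environment=` line of the unit file instead.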

Ryan Peden

11/23/2022, 5:21 PM
Thanks for checking that! Do the queue names the agents are connecting to look correct - no extra spaces or typos? (I know you've likely checked that already, but I want to start with the basics just in case 🙂)
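
One way to rule out invisible differences between the queue name on the agent and the one on the deployment is to compare the two strings programmatically rather than by eye. This is a minimal sketch, assuming both names are pasted in by hand; `diagnose_queue_names` is a hypothetical helper and not part of Prefect.

```python
import unicodedata
from typing import List


def diagnose_queue_names(agent_queue: str, deployment_queue: str) -> List[str]:
    """Return human-readable reasons why two queue names might not match."""
    problems: List[str] = []
    if agent_queue == deployment_queue:
        return problems  # exact match, nothing to report
    if agent_queue.strip() == deployment_queue.strip():
        problems.append("names differ only by leading/trailing whitespace")
    if agent_queue.casefold() == deployment_queue.casefold():
        problems.append("names differ only by letter case")
    if unicodedata.normalize("NFKC", agent_queue) == unicodedata.normalize(
        "NFKC", deployment_queue
    ):
        problems.append("names differ only by Unicode compatibility forms")
    if not problems:
        # repr() makes stray spaces and odd characters visible
        problems.append(f"names differ: {agent_queue!r} vs {deployment_queue!r}")
    return problems
```

An empty list means the names are identical; anything else points at the kind of mismatch to look for.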

Esdras Lopes Nani

11/23/2022, 5:22 PM
No problem! I have, and the names are the same 😅

Ryan Peden

11/23/2022, 5:30 PM
Ok, excellent. If you load these work queues in the cloud UI, do your scheduled flow runs appear under 'Upcoming Runs'? And do you see anything under the 'Last Polled' heading at the right side of the screen?

Esdras Lopes Nani

11/23/2022, 5:37 PM
The flows don't appear in Upcoming Runs and the Last Polled time is the current time

Ryan Peden

11/23/2022, 5:42 PM
That is helpful! If you open the UI page for the deployments your scheduled flow(s) are part of, what do you see under the 'Work Queue' heading? For example, here's one of my deployments that uses a work queue named 'example_work_queue'. And if I click on the work queue name, it takes me to that queue and I see the scheduled runs for the queue.

Esdras Lopes Nani

11/23/2022, 5:52 PM
It takes me to the same Work Queue that has the problem. I can see the scheduled and the late flow runs in the 'Runs' tab of the Work Queue UI.
Could the number of late flow runs (almost 3000 now) be confusing the processing? And is there some way to delete all the late flow runs at once?
I deployed a new flow to a new work queue, and even though the schedule is on for the deployment, its runs don't show up as scheduled in the flow runs or under the work queue's Upcoming Runs.
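
On the bulk-cleanup question, here is a hedged sketch of selecting only the Late runs and deleting them in one pass. It assumes the runs have already been fetched as plain dicts with hypothetical `id` and `state_name` keys, and that `delete_run` is a hypothetical callable wrapping whatever API or CLI call actually performs the deletion. None of these names come from Prefect itself.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def delete_late_runs(
    runs: Iterable[Dict[str, str]],
    delete_run: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Delete every run whose state name is "Late".

    Returns (deleted_ids, failed_ids) so a partial failure stays visible.
    """
    deleted: List[str] = []
    failed: List[str] = []
    for run in runs:
        if run.get("state_name") != "Late":
            continue  # leave scheduled, running, and finished runs alone
        try:
            delete_run(run["id"])
            deleted.append(run["id"])
        except Exception:
            failed.append(run["id"])  # keep going; report failures at the end
    return deleted, failed
```

Collecting failures instead of stopping at the first one matters with thousands of runs, since a single bad request would otherwise leave most of the backlog in place.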

Ryan Peden

11/23/2022, 7:24 PM
Thanks for the update! I'm not sure what's happening; I'll check in with some of my colleagues to see if they have any ideas.

Esdras Lopes Nani

11/23/2022, 7:26 PM
Thanks! In my other account, I turned off all the deployments that were scheduled and turned them back on, and now they don't appear as scheduled either. Maybe it's a general bug.

Ryan Peden

11/23/2022, 7:38 PM
Do you have a concurrency limit set on your work queues?

Esdras Lopes Nani

11/23/2022, 7:44 PM
Yes, I do

Anna Geller

11/23/2022, 7:54 PM
Can you provide more details, Esdras? What's your storage and infra block, and how did you create that deployment?
I'd recommend taking it more step by step: remove the existing deployment and create a new one without a schedule, then create an ad-hoc run to confirm that it works. After that you can add the schedule and concurrency limits.
Removing this deployment should remove all those stuck runs as well.

Esdras Lopes Nani

11/23/2022, 8:10 PM
@Anna Geller I'm using an S3 block for storage and Process for infra (on AWS EC2). They were running fine until yesterday; the last run was at 08:00:06 PM (Brazil/Sao Paulo time). We made no changes or updates to either the code or the infra. I'll try deleting the deployment and restarting from scratch, and I'll come back with updates soon. Thanks!
@Anna Geller Deleting and recreating the deployment and creating a new Work Queue seems to do the job, but editing the deployment (from the UI, or by updating the YAML without deleting it) stops the deployment from scheduling future runs, so I have to redo the whole deployment again. That's not normal behavior. Is something going on? I'm not OK with deleting all my production deployments and redeploying them, because I would lose all the flow run history, on top of the work of redoing all the deployments.

Anna Geller

11/23/2022, 9:49 PM
We have a paid support channel via cs@prefect.io if you need more help troubleshooting this in a call. What we can do here is point you to the right place. I think something must be incorrectly configured on the deployment or execution layer (agent and infra blocks), so I'd encourage you to troubleshoot this at the infrastructure level, step by step, and try to find the root cause. Concurrency limits and the allocation of queue names on the agent and deployments are also worth looking at.