# prefect-community
x
Our Local Agents are up and running with the correct labels, yet they are not picking up jobs. The agent screen shows they are querying, and indicates many late runs for the agent. I have tried to restart the agent, but to no avail. There is no other hint as to what the issue might be. When I try a quick run of a flow, that does get executed. Any idea what is going on here?
👀 1
b
Hello Xavier, are the flows stuck in a certain state? (ex: submitted, scheduled, etc.)
In the meantime, here are some resources for troubleshooting what could be happening here: https://discourse.prefect.io/t/why-is-my-flow-stuck-in-a-submitted-state/201
x
Hi @Bianca Hoch, thanks for your reply. The jobs are in a scheduled state. There is no label mismatch between the agent and the jobs. When you look at the agent screen, it is clear that the agent is polling, and it also shows late jobs for the agent.
When I try to do a Quick Run on one of these flows, it does run as expected. For some reason the scheduled flows are not executing, and so we are building a large backlog of late jobs.
I’ve tried clearing out all the late jobs, then restarting the agent, but still, nothing gets executed
We have not changed anything about our setup, this just started to happen over the weekend
b
Hey Xavier, would you mind sharing the id of the flow that is producing the late runs?
It should be in the url of the flow page
Also, is this only occurring for one flow, or are there multiple?
It’s impacting all flows it seems
Nothing that is scheduled is running
b
Hello Xavier, I'm seeing a value of 'null' for the labels on the flow (MVIS Benchmarks) which is associated to the run ID you sent here.
x
I see the label ord-prod1 in the UI for that run?
b
That is odd. Have you tried re-registering the flow?
x
I have not…I can give that a go
🤞 1
Hi @Bianca Hoch I hope you are well…our agents have stopped running flows again after 4 days. I have restarted the agents and they are querying for flows, but flows just go into a late state. I tried re-registering, then deleting and registering flows again, but they simply do not get picked up. The labels are correct. The last time this happened, registering the flows again seemed to fix it, but not this time. This is impacting our production environment. If we can’t get to the bottom of this, we’re going to have to go back to Airflow…
That is an example flow that I deleted and registered again…the flow label is ord-prod1
And the agent with that label looks to be alive and well… https://cloud.prefect.io/d3x/agent
b
Hello Xavier, sorry to hear that this is happening. If you deleted this flow (d9e2b05b-63ce-45d7-8d0e-68927ccf1645), I won't be able to see it in our system.
Do you have an example of another flow I can take a look at? Also, just out of curiosity, what version of prefect are you running?
We are using prefect cloud…
How do I check what version of the agent is running?
The dashboard shows the agents are polling, and have the expected labels
But submitting a flow with a given label never gets picked up
@Bianca Hoch I have also tried to stop and remove the agent, and then restart it
@Bianca Hoch any pointers would be much appreciated. Our production environment is currently down because of this, and we are dead in the water until we can resolve this.
b
Hello Xavier. We recently had a user with a similar problem here: https://prefect-community.slack.com/archives/CL09KU1K7/p1660051379327909 . Maybe trying the troubleshooting process outlined in this thread could help?
Also, in regards to the Prefect version you are running, you can check that by running 'prefect --version' in your environment
As for the version of the agent, you can see that when you click the 'More' option when viewing agents in the UI.
x
Version 1.2.3
b
I'd recommend upgrading to the latest version of Prefect 1.0 in your environment (1.3.0), and restarting a fresh set of agents afterwards
x
Will try now
b
Unfortunately I can't access that hyperlink, for security purposes.
x
[ec2-user@ip-10-0-141-84 prefect]$ ./bin/prefect --version
Usage: prefect [OPTIONS] COMMAND [ARGS]...
Try 'prefect -h' for help.

Error: No such option: --version
b
It may be prefect -v for you then, I use pipenv for my package management.
You can check this by typing 'prefect -h' like the error specifies
x
[ec2-user@ip-10-0-141-84 ~]$ ./prefect/bin/prefect version
1.3.0
Upgraded agent to 1.3.0
And restarted
But still not pulling any flows…I have also tried to register the flows using 1.3.0
b
Hmmm. Are you applying the label and schedule at registration time? Doing so will ensure that the label persists to the flow run at the scheduled time.
a
@Xavier Witdouck catching up - could you open a GitHub issue with a brief description of the problem you're facing and a minimal reproducible example? seeing a long thread with 50 messages indicates it's a candidate for GitHub issue
Usually, if your flow runs are stuck in a Scheduled state, it's a label mismatch, and if they're stuck in a Submitted state, it's an issue with your agent. In both cases, this Discourse section will help you understand it better: https://discourse.prefect.io/tag/stuck
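To make the "label mismatch" case concrete, here is a minimal plain-Python sketch. It assumes the Prefect 1.x rule that an agent only picks up a flow run when every label on the run is also present on the agent. It does not call the Prefect API. The function names and the `gpu` label are hypothetical, and only `ord-prod1` comes from this thread.

```python
from typing import Iterable, Set


def agent_can_pick_up(run_labels: Iterable[str], agent_labels: Iterable[str]) -> bool:
    """Assumed rule: every label on the flow run must also be on the agent."""
    return set(run_labels) <= set(agent_labels)


def missing_labels(run_labels: Iterable[str], agent_labels: Iterable[str]) -> Set[str]:
    """Labels the flow run requires that the agent does not carry."""
    return set(run_labels) - set(agent_labels)


if __name__ == "__main__":
    agent = ["ord-prod1"]

    # Same single label on both sides: the run is eligible for this agent.
    print(agent_can_pick_up(["ord-prod1"], agent))

    # An extra (hypothetical) label on the run that the agent lacks: not eligible.
    print(agent_can_pick_up(["ord-prod1", "gpu"], agent))
    print(missing_labels(["ord-prod1", "gpu"], agent))
```

Comparing the labels stored on the scheduled flow run (rather than only those shown for the flow) against the agent's labels in this way is one way to check whether a run stuck in Scheduled is actually eligible for the agent.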
x
Hi @Anna Geller this issue is affecting our production environment, we have had to manually run these jobs this morning. I’m afraid we have to switch back to Airflow as that has been 100% reliable for us. Thanks for your help
b
Hi Xavier, we're sorry to hear that. If you are interested in professional service time, we could schedule some time with you and some of our engineers. They would be able to help with any infra issues that may be occurring here. Should you decide to do so, feel free to reach out at cs@prefect.io.