Charles Hunt
10/12/2023, 3:55 PM
prefect.agent - Aborted submission of flow run 'fcf22119-8e49-44cc-8f80-66978a26c273'. Server sent an abort signal: This run is in a PENDING state and cannot transition to a PENDING state.
The previous flow run completes normally and its process is reported as "exited cleanly".
A bit of background:
The work pool has a concurrency limit of 1
I have tried changing the polling frequency to 15 seconds, but it made no visible difference to how often this occurs.
Our flows are written in Python and are stored in a GitHub repo.
We have Python 3.10 installed on the VM.
The VM that the agent runs on does no other work; it is dedicated to running the agent and executing the jobs. All tasks run locally on the VM, and every job completes before the next one starts.
We are running the latest version of the prefect agent
I see a few others reporting the same issue, but I haven't found a case that matches ours. Any help or pointers would be greatly appreciated.
Jake Kaplan
10/12/2023, 3:59 PM
Charles Hunt
10/12/2023, 3:59 PM
Jake Kaplan
10/12/2023, 4:00 PM
Charles Hunt
10/12/2023, 4:02 PM
Jake Kaplan
10/12/2023, 4:02 PM
Charles Hunt
10/12/2023, 4:02 PM
Charles Hunt
10/12/2023, 4:03 PM
01:30:26.406 | INFO | prefect.agent - Submitting flow run 'c1dba183-110e-409b-9d1c-9d11b1a40b77'
01:30:27.088 | INFO | prefect.infrastructure.process - Opening process 'cheerful-auk'...
01:30:27.245 | INFO | prefect.agent - Completed submission of flow run 'c1dba183-110e-409b-9d1c-9d11b1a40b77'
/usr/lib/python3.10/runpy.py:126: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
warn(RuntimeWarning(msg))
01:30:29.855 | INFO | Flow run 'cheerful-auk' - Downloading flow code from storage at ''
01:30:32.190 | INFO | Flow run 'cheerful-auk' - Creating ingest listener request...
01:30:32.591 | INFO | Flow run 'cheerful-auk' - Flow running on host s-uks-sv-vm-p-2
01:30:32.592 | INFO | Flow run 'cheerful-auk' - Creating ingest listener request...
01:30:32.593 | INFO | Flow run 'cheerful-auk' - {'JobName':'sap_orderitem_prod', 'DataJobId':310,'JobPathId':'Smythson', 'DataJobType':'OrderItem', 'RemoveSourceFile':'true'}
01:30:39.537 | INFO | Flow run 'cheerful-auk' - {'date': '2023-10-12T01:30:39.5334325+00:00', 'jobId': 1, 'status': 'completedSuccess', 'sendCount': 117, 'receiveCount': 117, 'jobStatusExceptionMessage': '', 'jobStatusExceptionStackTrace': ''}
01:30:39.705 | INFO | Flow run 'cheerful-auk' - Finished in state Completed()
01:30:42.175 | INFO | prefect.infrastructure.process - Process 'cheerful-auk' exited cleanly.
01:30:53.010 | INFO | prefect.agent - Submitting flow run 'fcf22119-8e49-44cc-8f80-66978a26c273'
01:30:55.267 | INFO | prefect.agent - Aborted submission of flow run 'fcf22119-8e49-44cc-8f80-66978a26c273'. Server sent an abort signal: This run is in a PENDING state and cannot transition to a PENDING state.
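One way to see at a glance what the agent did with each run in an excerpt like the one above is to group its submission events by flow run ID. The following is a minimal, stdlib-only sketch. It assumes the log line layout shown in the excerpt (`time | LEVEL | logger - message`); the names `EVENT_RE`, `agent_events`, and `SAMPLE` are hypothetical helpers for illustration and are not part of Prefect.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt in this thread:
#   01:30:53.010 | INFO | prefect.agent - Submitting flow run '<uuid>'
EVENT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s*\|\s*\w+\s*\|\s*prefect\.agent - "
    r"(?P<event>Submitting|Completed submission of|Aborted submission of) "
    r"flow run '(?P<run_id>[0-9a-f-]{36})'"
)


def agent_events(log_text):
    """Return {run_id: [(time, event), ...]} for prefect.agent submission lines."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events[match["run_id"]].append((match["time"], match["event"]))
    return dict(events)


# Agent lines copied from the excerpt (the abort line is shortened here).
SAMPLE = """\
01:30:26.406 | INFO | prefect.agent - Submitting flow run 'c1dba183-110e-409b-9d1c-9d11b1a40b77'
01:30:27.245 | INFO | prefect.agent - Completed submission of flow run 'c1dba183-110e-409b-9d1c-9d11b1a40b77'
01:30:53.010 | INFO | prefect.agent - Submitting flow run 'fcf22119-8e49-44cc-8f80-66978a26c273'
01:30:55.267 | INFO | prefect.agent - Aborted submission of flow run 'fcf22119-8e49-44cc-8f80-66978a26c273'. Server sent an abort signal
"""

if __name__ == "__main__":
    for run_id, run_events in agent_events(SAMPLE).items():
        print(run_id, run_events)
```

Running the same function over the full agent log (rather than a short excerpt) would show whether a run ID was submitted more than once on this agent.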
Charles Hunt
10/12/2023, 4:04 PM
Charles Hunt
10/12/2023, 4:06 PM
Jake Kaplan
10/12/2023, 4:14 PM
fcf22119-8e49-44cc-8f80-66978a26c273?
Jake Kaplan
10/12/2023, 4:14 PM
Jake Kaplan
10/12/2023, 4:15 PM
PENDING for > some time
Deceivious
10/12/2023, 4:59 PM
Charles Hunt
10/12/2023, 8:47 PM
fcf22119-8e49-44cc-8f80-66978a26c273
Charles Hunt
10/12/2023, 8:47 PM
Stéphan Taljaard
10/13/2023, 5:17 AM
default queue does most of the work and has a concurrency limit of 25. The other queue has a limit of 1.
- I had an agent running on an on-prem server polling the pool (so it picked up work for all the pool's queues).
- I wanted to run the work of the other queue on different infra, so I created a new GCP VM and started an agent that polls only the other queue of the default pool. I also changed my on-prem agent to poll only the default queue of the default pool.
-> Every now and then, I get "stuck in pending" flow runs on my GCP VM (on the other queue), and no issues on my on-prem VM.
(I have not compared agent logs to see whether my on-prem agent for some reason picks up from the wrong queue.)
Jake Kaplan
10/13/2023, 1:40 PM
"This run is in a PENDING state and cannot transition to a PENDING state." can only appear if something else has already put the run in a PENDING state: either another agent/worker, or the same agent/worker after it experienced some sort of failure (that's why I asked about other logs with that ID).
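The point about a second PENDING proposal being rejected can be illustrated with a toy model. This is only a sketch and is not Prefect's server code: the class `ToyRun`, its `propose_pending` method, the proposer names, and the assumption that a run starts out as `SCHEDULED` are all made up for illustration. It shows that when two submitters (or one submitter retrying after a failure) each propose PENDING for the same run, the first proposal is accepted and the second is rejected with a message like the one in the logs.

```python
class ToyRun:
    """Toy stand-in for a flow run record held by a server (hypothetical)."""

    def __init__(self, run_id):
        self.run_id = run_id
        self.state = "SCHEDULED"  # assumed starting state for this sketch
        self.history = []  # (proposer, accepted) tuples, in arrival order

    def propose_pending(self, proposer):
        """Accept the first PENDING proposal; reject any repeat."""
        if self.state == "PENDING":
            self.history.append((proposer, False))
            return (
                "ABORT",
                "This run is in a PENDING state and cannot "
                "transition to a PENDING state.",
            )
        self.state = "PENDING"
        self.history.append((proposer, True))
        return ("ACCEPT", None)


# Two hypothetical submitters race for the same run.
run = ToyRun("fcf22119-8e49-44cc-8f80-66978a26c273")
first = run.propose_pending("agent-a")
second = run.propose_pending("agent-b")
```

In this model the abort message tells the second proposer nothing about who made the first proposal, which is why searching every agent's logs for the run ID is the natural next step.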