Zach Chin
11/07/2025, 3:09 AM
Worker 'ProcessWorker 06e966df-0a35-4265-a440-8ad861f6caab' submitting flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'
07:05:05 PM
prefect.flow_runs.worker
Worker 'ProcessWorker 56a971a0-d860-4469-b5a5-eb94813beec8' submitting flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'
07:05:06 PM
prefect.flow_runs.worker
Worker 'ProcessWorker 4a37f28d-eaa8-41e4-8d0d-f015a66fe98b' submitting flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'
07:05:08 PM
prefect.flow_runs.worker
Aborted submission of flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'. Server sent an abort signal: This run is in a PENDING state and cannot transition to a PENDING state.
07:05:16 PM
prefect.flow_runs.worker
Aborted submission of flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'. Server sent an abort signal: This run is in a PENDING state and cannot transition to a PENDING state.
07:05:19 PM
prefect.flow_runs.worker
Opening process...
07:05:30 PM
prefect.flow_runs.runner
Completed submission of flow run 'e33f8d75-d3f5-41ba-8dce-3bab1d0de71f'
07:05:39 PM
prefect.flow_runs.worker
> Running set_working_directory step...
I noticed this much more when I tried scaling workers up to handle a large backlog, but after scaling down I notice it still happens, just less frequently.
Marvin
11/07/2025, 3:17 AM
PREFECT_API_REQUEST_TIMEOUT=60 prefect worker start -p your-pool
If you see httpx timeout messages in worker logs, this typically helps.
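If the longer timeout helps, you can persist it in the active Prefect profile instead of exporting it on every worker invocation. A minimal sketch, assuming the standard prefect config CLI and the same 60-second value used above:
# Persist the longer API request timeout in the current profile
prefect config set PREFECT_API_REQUEST_TIMEOUT=60
# Confirm the effective value and where it comes from
prefect config view --show-sources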
- Partition the workload to reduce contention:
- Use separate work queues and point different workers at different queues (see the queue setup sketch after this list):
prefect worker start -p your-pool -q queue-a -n worker-a
prefect worker start -p your-pool -q queue-b -n worker-b
- If you’re pushing through a large backlog, this spreads runs and reduces simultaneous claims on the exact same run.
- Throttle total concurrency at the pool to smooth bursts:
prefect work-pool set-concurrency-limit "your-pool" 10
This won’t eliminate races entirely, but it can reduce the blast radius during spikes (see the pool commands after this list).
- If you use per-worker limits (e.g., --limit 1) and you notice “Flow run limit reached” after an aborted submission, that’s the bug in #15952. Workaround: restart the worker process to clear the stuck local counter.
- Collect more signal when it happens:
- Run workers with debug logs to catch timeouts/retries:
PREFECT_LOGGING_LEVEL=DEBUG prefect worker start -p your-pool
- Note any httpx.ReadTimeout, SSLWantReadError, or retry logs around the moment runs get stuck in PENDING.
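For the work-queue partitioning above, a minimal sketch of the setup, assuming the placeholder names queue-a, queue-b, and your-pool from the examples and the standard prefect work-queue CLI:
# Create two queues inside the existing pool
prefect work-queue create "queue-a" --pool "your-pool"
prefect work-queue create "queue-b" --pool "your-pool"
# Point each deployment's work_queue_name at one of the queues,
# then start one worker per queue with the worker start commands shown earlier.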
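Similarly, for the pool-level limit, two commands that can help while you tune it, assuming the standard prefect work-pool CLI:
# Check the pool's current concurrency limit and status
prefect work-pool inspect "your-pool"
# Remove the limit once the backlog has drained
prefect work-pool clear-concurrency-limit "your-pool"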
A few clarifying questions so I can be more specific:
- Which Prefect version are you on?
- Cloud or OSS Server?
- What worker type are you using (Process, Docker, Kubernetes)?
- Are you setting --limit on workers, or using pool/queue-level concurrency limits?
- When you say “stuck in PENDING”, for how long do they stay there, and do you see any network timeout messages in worker logs at that time?
Related references:
- Duplicate PENDING after client timeout: #10263
- Retries/worker replacement loop: #16126
- Local concurrency stuck after abort (Prefect 3.x): #15952
- PREFECT_API_REQUEST_TIMEOUT: Prefect settings