Demid
09/02/2025, 5:40 PM
Marvin
09/02/2025, 5:40 PM
Marvin
09/02/2025, 5:47 PM
Possible causes
- The concurrency limit was reset or deleted while the run was active (e.g., via prefect concurrency-limit reset/delete <tag>), invalidating existing leases.
- The server/database was reset/upgraded and cleared active leases around that time.
- Two processes tried to manage the same lease_id and one released it.
How to investigate
- Identify the limit/tag in use and whether the lease still shows as active:
prefect concurrency-limit inspect <tag>
This shows active_slots with the run IDs currently holding slots. (A Python sketch for the same check follows this list.)
- Check for nearby logs like “acquired concurrency lease … lease_id=<uuid>” and the exact HTTP 404 response around the renewal call.
- Ensure your runner/flow task is not blocking the event loop (for long CPU work, use threads/processes) so renewals can run on time.
- Confirm no one reset/deleted the concurrency limit while this run was executing.
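If it's easier to grab this programmatically, here's a minimal Python sketch of the same check. It assumes your Prefect 3.x async client exposes read_concurrency_limit_by_tag (the call the inspect CLI relies on) and that the returned object carries tag, concurrency_limit, and active_slots; "my-tag" is a placeholder for your tag.

import asyncio

from prefect import get_client


async def show_limit(tag: str) -> None:
    # Assumption: the async client exposes read_concurrency_limit_by_tag in
    # your Prefect 3.x version; double-check if this raises AttributeError.
    async with get_client() as client:
        limit = await client.read_concurrency_limit_by_tag(tag)
        print(f"tag={limit.tag} limit={limit.concurrency_limit}")
        # active_slots lists the task run IDs currently holding slots.
        print(f"active slots: {limit.active_slots}")


if __name__ == "__main__":
    asyncio.run(show_limit("my-tag"))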
Mitigations / best practices
- Don’t reset or delete concurrency limits while runs are active.
- Avoid long blocking operations in the same loop where renewals happen; offload CPU-bound work to a thread/process pool (see the sketch after this list).
- Keep worker pods/nodes from going to sleep or being hibernated; ensure stable network to the API.
- If this was a one-off due to a restart or a transient pause, just re-run. If recurring, capture DEBUG logs during acquisition/renewal.
- You can increase verbosity by setting PREFECT_LOGGING_LEVEL=DEBUG on the worker.
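To make the event-loop point concrete, here's a minimal sketch of offloading blocking work so lease renewals aren't starved — process_data and cpu_heavy are hypothetical stand-ins for your own flow and workload, not anything from your code:

import asyncio
from concurrent.futures import ProcessPoolExecutor

from prefect import flow


def cpu_heavy(n: int) -> int:
    # Hypothetical stand-in for the blocking, CPU-bound part of your work.
    return sum(i * i for i in range(n))


@flow
async def process_data() -> int:
    loop = asyncio.get_running_loop()
    # Run the heavy work in a separate process so this event loop stays free
    # and background work (e.g. concurrency lease renewal) can fire on time.
    # asyncio.to_thread(...) also works, but pure-Python CPU work in a thread
    # still holds the GIL, so a process pool is the safer choice here.
    with ProcessPoolExecutor() as pool:
        return await loop.run_in_executor(pool, cpu_heavy, 50_000_000)


if __name__ == "__main__":
    asyncio.run(process_data())

The same idea applies inside tasks: keep the loop that owns the renewal background task responsive and renewals can go out on schedule.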
If you can share a bit more, I can be more specific:
- Prefect version (assuming 3.x unless you say otherwise)
- Where you’re running (Cloud vs OSS Server), and any server restarts/upgrades at that time
- The tag/limit name and output of:
prefect concurrency-limit inspect <tag>
- The surrounding log lines (acquisition succeeded, lease_id value, and the 404 during renewal)
Demid
09/02/2025, 6:09 PM
Marvin
09/02/2025, 6:09 PM
Marvin
09/02/2025, 6:10 PM
Prefect OSS Server has two concurrency lease storage backends:
- In-memory: prefect.server.concurrency.lease_storage.memory (default; per-process)
- Filesystem: prefect.server.concurrency.lease_storage.filesystem (shared across processes/replicas if they share a volume)
If you’re running multiple API workers or replicas, using memory storage can cause exactly the 404-on-renew issue you’re seeing because renewals may hit a different process that doesn’t have the in-memory lease. Switching to the filesystem backend (and ensuring all API workers share the same filesystem path) will fix that.
Update your server config to:
"concurrency": {
"lease_storage": "prefect.server.concurrency.lease_storage.filesystem"
}
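If you manage settings with the CLI or environment variables rather than by editing the config file, the equivalent should look like the line below. The settings path server.concurrency.lease_storage is my assumption based on the JSON structure above — confirm the exact key against the settings reference for your Prefect version:
prefect config set server.concurrency.lease_storage=prefect.server.concurrency.lease_storage.filesystem
Prefect derives an environment variable from each settings path, so the same value could likely be supplied as PREFECT_SERVER_CONCURRENCY_LEASE_STORAGE in your container or Helm environment.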
Notes:
- Make sure all API processes/pods mount the same shared volume so they see the same lease files.
- Alternatives: run a single API worker or enable sticky sessions so the same process handles acquire and renew.
- Prefect Cloud manages this for you; no change needed there.
If you’d like, tell me how you’re deploying OSS Server (Docker Compose, Helm/Kubernetes, bare uvicorn), and I can share exact config snippets for that setup.
Demid
09/02/2025, 6:31 PM