<@ULVA73B9P> Concurrency lease renewal failed - sl...
# ask-marvin
d
@Marvin Concurrency lease renewal failed - slots are no longer reserved. Terminating execution to prevent over-allocation. What's the problem? When it tried to renew, it got a 404 in the logs.
m
thought for 389.5 seconds
Short answer: your run had a reserved concurrency slot, but when the runner tried to renew the lease for that slot, the API returned 404 (lease not found). Prefect then aborted the run to avoid using a slot it no longer owns.

**What the 404 means**
- Renewals call the concurrency-leases API. A 404 from renew means the server no longer recognizes that `lease_id`.
- Docs for the renew endpoint: REST: `renew_concurrency_lease`
- Related server API functions (acquire/release with leases): Python API: `concurrency_limits_v2`

**Common causes**
- The lease expired before the renewal fired (runner was paused, event loop blocked, long GC/network stall, pod suspended, etc.).
- The lease was already released (e.g., the run or a context manager exited and released slots) but the renewal task still fired once more.
- The concurrency limit was reset or deleted while the run was active (`prefect concurrency-limit reset/delete <tag>`), invalidating existing leases.
- The server/database was reset/upgraded and cleared active leases around that time.
- Two processes tried to manage the same `lease_id` and one released it.

**How to investigate**
- Identify the limit/tag in use and whether the lease still shows as active:
```bash
prefect concurrency-limit inspect <tag>
```
This shows `active_slots` with the run IDs currently holding slots.
- Check for nearby logs like "acquired concurrency lease … lease_id=<uuid>" and the exact HTTP 404 response around the renewal call.
- Ensure your runner/flow task is not blocking the event loop (for long CPU work, use threads/processes) so renewals can run on time.
- Confirm no one reset/deleted the concurrency limit while this run was executing.

**Mitigations / best practices**
- Don't reset or delete concurrency limits while runs are active.
- Avoid long blocking operations in the same loop where renewals happen; offload CPU-bound work to a thread/process pool (see the sketch at the end of this answer).
- Keep worker pods/nodes from going to sleep or being hibernated; ensure a stable network connection to the API.
- If this was a one-off due to a restart or a transient pause, just re-run. If it recurs, capture DEBUG logs during acquisition/renewal.
- You can increase verbosity by setting `PREFECT_LOGGING_LEVEL=DEBUG` on the worker.

If you can share a bit more, I can be more specific:
- Prefect version (assuming 3.x unless you say otherwise)
- Where you're running (Cloud vs OSS Server), and any server restarts/upgrades at that time
- The tag/limit name and the output of:
```bash
prefect concurrency-limit inspect <tag>
```
- The surrounding log lines (acquisition succeeded, lease_id value, and the 404 during renewal)
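For reference, here is a minimal sketch of the pattern described above, assuming Prefect 3.x and a global concurrency limit named `my-limit` (a hypothetical name): the slot is occupied through the `concurrency` context manager, and the CPU-bound work is pushed to a thread so the event loop stays free for the background lease renewal that is failing with 404 in your logs.
```python
import asyncio
from prefect import flow
from prefect.concurrency.asyncio import concurrency


def crunch_numbers() -> int:
    # Stand-in for long CPU-bound work that would otherwise block the event loop.
    return sum(i * i for i in range(10_000_000))


@flow
async def guarded_flow() -> None:
    # "my-limit" is a hypothetical concurrency limit name; while the slot is
    # occupied, the lease behind it is renewed in the background.
    async with concurrency("my-limit", occupy=1):
        # Offloading to a thread keeps the event loop responsive, so the
        # periodic lease renewal can fire on time instead of stalling.
        result = await asyncio.to_thread(crunch_numbers)
        print(result)


if __name__ == "__main__":
    asyncio.run(guarded_flow())
```
The same pattern applies to synchronous flows via `prefect.concurrency.sync.concurrency`.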
d
@Marvin do I have an option to change the storage in `"concurrency": { "lease_storage": "prefect.server.concurrency.lease_storage.memory" }` to a database?
m
thought for 58.2 seconds
Short answer: there isn't a database-backed lease storage today. The supported backends are:
- In-memory: `prefect.server.concurrency.lease_storage.memory` (default; per-process)
- Filesystem: `prefect.server.concurrency.lease_storage.filesystem` (shared across processes/replicas if they share a volume)

If you're running multiple API workers or replicas, using memory storage can cause exactly the 404-on-renew issue you're seeing, because renewals may hit a different process that doesn't have the in-memory lease (a toy illustration follows at the end of this answer). Switching to the filesystem backend (and ensuring all API workers share the same filesystem path) will fix that. Update your server config to:
```json
"concurrency": {
  "lease_storage": "prefect.server.concurrency.lease_storage.filesystem"
}
```
Notes:
- Make sure all API processes/pods mount the same shared volume so they see the same lease files.
- Alternatives: run a single API worker, or enable sticky sessions so the same process handles acquire and renew.
- Prefect Cloud manages this for you; no change is needed there.

If you'd like, tell me how you're deploying OSS Server (Docker Compose, Helm/Kubernetes, bare uvicorn), and I can share exact config snippets for that setup.
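To make the failure mode concrete, here is a small, self-contained toy model (not Prefect's actual implementation) of why per-process, in-memory lease storage breaks renewals once more than one API replica sits behind a load balancer:
```python
import uuid


class InMemoryLeaseStore:
    """Toy stand-in for a per-process lease store: each API replica has its own dict."""

    def __init__(self) -> None:
        self._leases: dict[str, str] = {}

    def acquire(self) -> str:
        lease_id = str(uuid.uuid4())
        self._leases[lease_id] = "active"
        return lease_id

    def renew(self, lease_id: str) -> bool:
        # False here corresponds to the HTTP 404 ("lease not found") on renew.
        return lease_id in self._leases


replica_a = InMemoryLeaseStore()
replica_b = InMemoryLeaseStore()

lease_id = replica_a.acquire()     # acquisition handled by replica A
print(replica_a.renew(lease_id))   # True  - the same replica still knows the lease
print(replica_b.renew(lease_id))   # False - replica B has never seen it: a "404" on renew

# With a backend shared by every replica (e.g. the filesystem backend on a
# common volume), acquire and renew see the same lease no matter which
# replica handles the request.
shared_store = InMemoryLeaseStore()  # stands in for the shared backend
lease_id = shared_store.acquire()
print(shared_store.renew(lease_id))  # True
```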
d
you forgot about redis