Demid
09/02/2025, 5:40 PM
Marvin
09/02/2025, 5:40 PM
Marvin
09/02/2025, 5:47 PM
Possible causes
- The concurrency limit was reset or deleted while the run was active (e.g., via prefect concurrency-limit reset/delete <tag>), invalidating existing leases.
- The server/database was reset/upgraded and cleared active leases around that time.
- Two processes tried to manage the same lease_id and one released it.
How to investigate
- Identify the limit/tag in use and whether the lease still shows as active:
prefect concurrency-limit inspect <tag>
This shows active_slots with the run IDs currently holding slots. (A Python sketch for the same check follows this list.)
- Check for nearby logs like “acquired concurrency lease … lease_id=<uuid>” and the exact HTTP 404 response around the renewal call.
- Ensure your runner/flow task is not blocking the event loop (for long CPU work, use threads/processes) so renewals can run on time.
- Confirm no one reset/deleted the concurrency limit while this run was executing.
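If it's easier to grab this programmatically, here's a minimal Python sketch of the same check. It assumes your Prefect 3.x async client exposes read_concurrency_limit_by_tag (the call the inspect CLI relies on) and that the returned object carries tag, concurrency_limit, and active_slots; "my-tag" is a placeholder for your tag.

import asyncio

from prefect import get_client


async def show_limit(tag: str) -> None:
    # Assumption: the async client exposes read_concurrency_limit_by_tag in
    # your Prefect 3.x version; double-check if this raises AttributeError.
    async with get_client() as client:
        limit = await client.read_concurrency_limit_by_tag(tag)
        print(f"tag={limit.tag} limit={limit.concurrency_limit}")
        # active_slots lists the task run IDs currently holding slots.
        print(f"active slots: {limit.active_slots}")


if __name__ == "__main__":
    asyncio.run(show_limit("my-tag"))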
Mitigations / best practices
- Don’t reset or delete concurrency limits while runs are active.
- Avoid long blocking operations in the same loop where renewals happen; offload CPU-bound work to a thread/process pool (see the sketch after this list).
- Keep worker pods/nodes from going to sleep or being hibernated; ensure stable network to the API.
- If this was a one-off due to a restart or a transient pause, just re-run. If recurring, capture DEBUG logs during acquisition/renewal.
- You can increase verbosity by setting PREFECT_LOGGING_LEVEL=DEBUG on the worker.
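To make the event-loop point concrete, here's a minimal sketch of offloading blocking work so lease renewals aren't starved — process_data and cpu_heavy are hypothetical stand-ins for your own flow and workload, not anything from your code:

import asyncio
from concurrent.futures import ProcessPoolExecutor

from prefect import flow


def cpu_heavy(n: int) -> int:
    # Hypothetical stand-in for the blocking, CPU-bound part of your work.
    return sum(i * i for i in range(n))


@flow
async def process_data() -> int:
    loop = asyncio.get_running_loop()
    # Run the heavy work in a separate process so this event loop stays free
    # and background work (e.g. concurrency lease renewal) can fire on time.
    # asyncio.to_thread(...) also works, but pure-Python CPU work in a thread
    # still holds the GIL, so a process pool is the safer choice here.
    with ProcessPoolExecutor() as pool:
        return await loop.run_in_executor(pool, cpu_heavy, 50_000_000)


if __name__ == "__main__":
    asyncio.run(process_data())

The same idea applies inside tasks: keep the loop that owns the renewal background task responsive and renewals can go out on schedule.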
If you can share a bit more, I can be more specific:
- Prefect version (assuming 3.x unless you say otherwise)
- Where you’re running (Cloud vs OSS Server), and any server restarts/upgrades at that time
- The tag/limit name and output of:
prefect concurrency-limit inspect <tag>
- The surrounding log lines (acquisition succeeded, lease_id value, and the 404 during renewal)
Demid
09/02/2025, 6:09 PM
Marvin
09/02/2025, 6:09 PM
Marvin
09/02/2025, 6:10 PM
Prefect OSS Server has two concurrency lease storage backends:
- In-memory: prefect.server.concurrency.lease_storage.memory (default; per-process)
- Filesystem: prefect.server.concurrency.lease_storage.filesystem (shared across processes/replicas if they share a volume)
If you’re running multiple API workers or replicas, using memory storage can cause exactly the 404-on-renew issue you’re seeing because renewals may hit a different process that doesn’t have the in-memory lease. Switching to the filesystem backend (and ensuring all API workers share the same filesystem path) will fix that.
Update your server config to:
"concurrency": {
"lease_storage": "prefect.server.concurrency.lease_storage.filesystem"
}
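If you manage settings with the CLI or environment variables rather than by editing the config file, the equivalent should look like the line below. The settings path server.concurrency.lease_storage is my assumption based on the JSON structure above — confirm the exact key against the settings reference for your Prefect version:
prefect config set server.concurrency.lease_storage=prefect.server.concurrency.lease_storage.filesystem
Prefect derives an environment variable from each settings path, so the same value could likely be supplied as PREFECT_SERVER_CONCURRENCY_LEASE_STORAGE in your container or Helm environment.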
Notes:
- Make sure all API processes/pods mount the same shared volume so they see the same lease files.
- Alternatives: run a single API worker or enable sticky sessions so the same process handles acquire and renew.
- Prefect Cloud manages this for you; no change needed there.
If you’d like, tell me how you’re deploying OSS Server (Docker Compose, Helm/Kubernetes, bare uvicorn), and I can share exact config snippets for that setup.
Demid
09/02/2025, 6:31 PM