< Marvin> I defined task run concurrency limits based on tag Prefect Community #ask-marvin

Syméon del Marmol

09/18/2025, 2:52 PM

@Marvin I defined task run concurrency limits based on tags. However, if a task enters the "crashed" state, it stays in the bucket forever and blocks the slots. How can I solve that?

Marvin

09/18/2025, 2:55 PM

Thanks for flagging this. You’re right that a task that “crashes” can leave its tag-based concurrency slot occupied if the lease isn’t released, which will block new runs.

Quick fixes (recover stuck slots now)

- Inspect which runs are holding the slots:

    prefect concurrency-limit inspect <your-tag>

  This shows `active_slots` (the TaskRun IDs still occupying the limit).

- Clear them:

    prefect concurrency-limit reset <your-tag>

Docs: CLI: concurrency-limit

Make it resilient going forward

- Use decaying/lease-based limits so slots auto-release if a worker crashes. In Prefect 3.x, “v2” concurrency limits support a decay rate (`slot_decay_per_second`) so that if a task stops heartbeating, its slot naturally expires instead of sticking around forever. See:
  - How-to: Global concurrency limits
  - API: create concurrency limit v2
  - API: bulk decrement with lease

Notes:

- The CLI’s `prefect concurrency-limit create` sets a basic limit; it doesn’t expose the decay parameter. To enable decay, create or update the limit via the UI (Cloud) or the REST API and set `slot_decay_per_second`.
- Pick a decay that matches your tolerance for auto-release (e.g., enough time for normal runs to refresh their lease, but short enough to recover from a crash).
- Optional automation:
  - If you’re on Cloud, create an Automation that triggers on “Task run enters Crashed” and runs a small remediation flow that calls `prefect concurrency-limit reset <tag>` (or uses the REST API to forget the stuck lease).
  - Alternatively, run a periodic “janitor” flow that:
    1) Lists concurrency limits,
    2) Compares `active_slots` to current task states,
    3) Resets limits where all holders are terminal/crashed.

A couple of questions to tailor the fix

- Are you on Prefect Cloud or self-hosted Server? If self-hosted, which version?
- How did you create the concurrency limits (CLI vs UI/API)?
- Do you want slots to auto-release after a specific time (so we can suggest a sensible `slot_decay_per_second`)?

Useful references

- Tag-based concurrency limits
- CLI: concurrency-limit commands
- REST API: concurrency limits
- Global (v2/lease-based) concurrency limits

If you share your environment/version, I can provide a concrete step-by-step (UI or API) to convert your limit to a decaying/lease-based one so crashed runs can’t hold slots indefinitely.

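The "janitor" idea (compare `active_slots` with current task states, then clear stale holders) could be sketched as follows. This is a variation on the blanket reset: instead of clearing every slot, it keeps holders that are not in a terminal state. It assumes the Prefect 3.x Python client methods `read_concurrency_limit_by_tag`, `read_task_runs`, and `reset_concurrency_limit_by_tag` (with its `slot_override` argument). The helper names `live_holders` and `release_stale_slots` are hypothetical.

```python
from typing import Iterable, Mapping

# Prefect state types after which a task run is no longer doing work.
TERMINAL_STATE_TYPES = frozenset({"COMPLETED", "FAILED", "CANCELLED", "CRASHED"})


def live_holders(active_slots: Iterable[str], state_by_run: Mapping[str, str]) -> list[str]:
    """Return the slot holders that should keep their slot.

    A holder is dropped only when its state type is known and terminal.
    Holders missing from `state_by_run` are kept (a conservative assumption).
    """
    return [
        run_id
        for run_id in active_slots
        if state_by_run.get(run_id) not in TERMINAL_STATE_TYPES
    ]


async def release_stale_slots(tag: str) -> list[str]:
    """Free slots on `tag` held by terminal task runs; return the freed run IDs."""
    # Imported lazily so live_holders() stays usable without Prefect installed.
    from prefect import get_client
    from prefect.client.schemas.filters import TaskRunFilter, TaskRunFilterId

    async with get_client() as client:
        limit = await client.read_concurrency_limit_by_tag(tag)
        holder_ids = list(limit.active_slots)
        if not holder_ids:
            return []

        # Look up the current state of every task run holding a slot.
        runs = await client.read_task_runs(
            task_run_filter=TaskRunFilter(id=TaskRunFilterId(any_=holder_ids))
        )
        states = {str(run.id): run.state.type.value for run in runs if run.state}

        holders = [str(run_id) for run_id in holder_ids]
        keep = live_holders(holders, states)
        if len(keep) < len(holders):
            # slot_override replaces the active slots with the runs still alive.
            await client.reset_concurrency_limit_by_tag(tag, slot_override=keep)

        kept = set(keep)
        return [run_id for run_id in holders if run_id not in kept]
```

It could be run on a schedule with `asyncio.run(release_stale_slots("<your-tag>"))`. A run that acquires a slot between the read and the reset would be evicted by the override, so this is best treated as a starting point rather than a race-free implementation.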