<@ULVA73B9P> I defined task run concurrency limits...
# ask-marvin
s
@Marvin I defined task run concurrency limits based on tags. However, if a task run enters the "Crashed" state, it stays in the bucket forever and blocks the slots. How can I solve that?
m
Thanks for flagging this. You’re right that a task that “crashes” can leave its tag-based concurrency slot occupied if the lease isn’t released, which will block new runs.

Quick fixes (recover stuck slots now)
- Inspect which runs are holding the slots:
prefect concurrency-limit inspect <your-tag>
This shows `active_slots` (the TaskRun IDs still occupying the limit).
- Clear them:
prefect concurrency-limit reset <your-tag>
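The inspect-then-reset sequence can also be done from Python, which makes it possible to keep the slots of runs that are still active instead of clearing everything. The following is a minimal sketch, not a definitive implementation. It assumes the Prefect 3.x client methods `read_concurrency_limit_by_tag`, `read_task_run`, and `reset_concurrency_limit_by_tag` (with its `slot_override` argument) are available in your version. The names `slots_to_keep` and `release_stale_slots` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, List

# State types that mean a run is finished and should no longer hold a slot.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "CRASHED"}


def slots_to_keep(holder_states: Dict[str, str]) -> List[str]:
    """Return the task run IDs whose state is not terminal."""
    return [
        run_id
        for run_id, state in holder_states.items()
        if state.upper() not in TERMINAL_STATES
    ]


async def release_stale_slots(tag: str) -> List[str]:
    """Reset the limit for `tag`, keeping only slots held by non-terminal runs."""
    # Imported lazily so the pure helper above works without Prefect installed.
    from prefect.client.orchestration import get_client

    async with get_client() as client:
        limit = await client.read_concurrency_limit_by_tag(tag)
        holder_states: Dict[str, str] = {}
        for run_id in limit.active_slots:
            task_run = await client.read_task_run(run_id)
            # Assumption: a run with no recorded state is treated as still active.
            state = task_run.state.type.value if task_run.state else "UNKNOWN"
            holder_states[str(run_id)] = state
        keep = slots_to_keep(holder_states)
        # Overwrite the slot list with the runs that should keep their slot.
        await client.reset_concurrency_limit_by_tag(tag, slot_override=keep)
        return keep
```

It could be run with `asyncio.run(release_stale_slots("<your-tag>"))`. Note that a run acquiring a slot between the read and the reset would lose it, so this is best run when few tasks are starting.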
Docs: CLI: concurrency-limit

Make it resilient going forward
- Use decaying/lease-based limits so slots auto-release if a worker crashes. In Prefect 3.x, “v2” concurrency limits support a decay rate (`slot_decay_per_second`), which frees occupied slots gradually over time, so a slot held by a run that has stopped does not stick around forever. See:
  - How-to: Global concurrency limits
  - API: create concurrency limit v2
  - API: bulk decrement with lease

Notes:
- The CLI’s `prefect concurrency-limit create` sets a basic limit; it doesn’t expose the decay parameter. To enable decay, create or update the limit via the UI (Cloud) or the REST API and set `slot_decay_per_second`.
- Pick a decay that matches your tolerance for auto-release (e.g., slow enough for normal runs to finish or refresh their lease, but fast enough to recover from a crash).
- Optional automation:
  - If you’re on Cloud, create an Automation that triggers on “Task run enters Crashed” and runs a small remediation flow that calls `prefect concurrency-limit reset <tag>` (or uses the REST API to forget the stuck lease).
  - Alternatively, run a periodic “janitor” flow that:
    1) Lists concurrency limits,
    2) Compares `active_slots` to current task states,
    3) Resets limits where all holders are terminal/crashed.

A couple of questions to tailor the fix
- Are you on Prefect Cloud or self-hosted Server? If self-hosted, which version?
- How did you create the concurrency limits (CLI vs UI/API)?
- Do you want slots to auto-release after a specific time (so we can suggest a sensible `slot_decay_per_second`)?

Useful references
- Tag-based concurrency limits
- CLI: concurrency-limit commands
- REST API: concurrency limits
- Global (v2/lease-based) concurrency limits

If you share your environment/version, I can provide a concrete step-by-step (UI or API) to convert your limit to a decaying/lease-based one so crashed runs can’t hold slots indefinitely.
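As a rough guide for choosing `slot_decay_per_second`, the arithmetic can be sketched as below. This assumes decay is linear, i.e. a rate of `r` frees about `r` slots per second (one slot every `1 / r` seconds); check that against the documentation for your Prefect version. Both function names are hypothetical helpers, not part of Prefect.

```python
def decay_for_release_time(seconds_per_slot: float) -> float:
    """Decay rate that frees roughly one slot every `seconds_per_slot` seconds."""
    if seconds_per_slot <= 0:
        raise ValueError("seconds_per_slot must be positive")
    return 1.0 / seconds_per_slot


def seconds_to_free(slots: int, slot_decay_per_second: float) -> float:
    """Approximate time for `slots` occupied slots to decay away."""
    if slot_decay_per_second <= 0:
        raise ValueError("slot_decay_per_second must be positive")
    return slots / slot_decay_per_second


# Example: release a stuck slot after about 10 minutes.
rate = decay_for_release_time(600)
print(f"slot_decay_per_second={rate:.5f}")
```

Under the same assumption, decay frees slots whether or not the holder is still running, so a rate that is too high can let more tasks run at once than the limit intends.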