# ask-marvin
j
@Marvin Is it possible that retries don't respect concurrency limits from a deployment?
m
thought for 790.9 seconds
j
@Marvin do you have an answer?
m
thought for 10.6 seconds
I don't have the original question in this thread. Could you share a quick summary of what you need an answer to? If it's Prefect-related, a few details will help me get you an actionable fix fast:
- Prefect version (I'll assume 3.x unless you say otherwise)
- How you're running/deploying (e.g., `flow.from_source(...).deploy(...)`, work pool name/type)
- Worker environment (sounds like Vertex AI in your setup) and any worker/flow logs
- Error messages or a flow run ID
- Any recent changes (e.g., Docker image/arch updates; you mentioned resolving an architecture mismatch before)

If this is specifically about your Vertex AI work pool or the Docker architecture issue you were troubleshooting, let me know where it's stuck now and I'll dive right in.
j
@Marvin is it possible that if a flow failed, and we copy the flow to a new run, this doesn't respect the concurrency limits?
m
thought for 137.8 seconds
Short answer: no. By design, copied flow runs still respect concurrency limits in Prefect 3.x. A "Copy to new run" is just another normal flow run; orchestration enforces limits regardless of how the run was created. That said, a few gotchas can make it look like a copied run bypassed limits:
- Work pool concurrency limit
  - Enforced at dispatch time. The copied run must target the same deployment/work pool to be limited.
  - If the copied run was started in a different pool (or as an ad-hoc run without the original deployment), it won't see the original pool's limit.
  - Verify with:
    ```
    prefect work-pool inspect <pool_name>
    prefect work-pool set-concurrency-limit <pool_name> <int>
    ```
- Global concurrency limits (by key/tag)
  - Enforced when your code acquires a slot (e.g., `with concurrency("my-key")`) or when tasks carry a tag that maps to a server concurrency limit.
  - If the copied run is missing the expected key/tag (e.g., tags added dynamically from parameters, or only set on the original), it won't acquire the limit.
  - Verify with:
    ```
    prefect concurrency-limit ls
    prefect concurrency-limit inspect <key_or_tag>
    ```
    Check that the copied run's tasks appear in `active_slots`.
- Ordering vs. enforcement
  - Limits control how many runs execute concurrently, not which one goes first. A copied run may start before another queued run once a slot frees up, but it should never exceed the configured capacity.

If you share:
- Which kind of limit you're using (work pool limit vs. global concurrency key/tag)
- The original and copied flow run IDs
- The work pool name (I know you're using Vertex AI pools)

…I can confirm whether the copied run acquired the correct slot and whether the pool limit was applied.

Docs for reference:
- Global concurrency limits
- Work pools and pool-level concurrency
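The key/tag gotcha above can be sketched with a toy model in plain Python (this is an analogy, not Prefect internals): a limit only constrains runs that actually acquire its key, so a copied run that lost the expected key never contends for a slot at all. The names `run_flow`, `LIMITS`, and `my-key` are illustrative.

```python
import threading
import time

# Toy model of key-based concurrency limits (NOT Prefect internals).
# One semaphore per key: a run is only constrained by a limit whose
# key it actually acquires; a "copied" run missing the key skips it.
LIMITS = {"my-key": threading.BoundedSemaphore(2)}  # 2 slots for "my-key"

keyed_active = 0
keyed_peak = 0
lock = threading.Lock()

def run_flow(key=None):
    global keyed_active, keyed_peak
    sem = LIMITS.get(key)
    if sem:
        sem.acquire()  # block until a slot under `key` frees up
        with lock:
            keyed_active += 1
            keyed_peak = max(keyed_peak, keyed_active)
    time.sleep(0.02)  # stand-in for the flow body
    if sem:
        with lock:
            keyed_active -= 1
        sem.release()

# Four keyed runs contend for 2 slots; a "copied" run with no key does not.
threads = [threading.Thread(target=run_flow, kwargs={"key": "my-key"})
           for _ in range(4)]
threads.append(threading.Thread(target=run_flow))  # copied run, key missing
for t in threads:
    t.start()
for t in threads:
    t.join()
print(keyed_peak)  # semaphore guarantees this never exceeds 2
```

The semaphore caps the keyed runs at two regardless of how they were created; the unkeyed run simply never touches it, which is exactly how a copied run missing its tag can appear to "bypass" a limit.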
Short answer: retries should respect concurrency limits. In Prefect 3.x:
- Work pool and work queue concurrency limits are enforced by the server when a flow run transitions to Running. If a flow fails and goes to AwaitingRetry, it releases its slot; when it retries, it must reacquire a slot. This applies equally to Vertex AI work pools.
- If you're using Prefect Cloud's "Max concurrent runs" on a deployment, retries for that deployment also count toward and are constrained by that limit.
- For global/tag-based concurrency limits (the concurrency-limit service), a retried run that attempts to acquire the same concurrency names will wait in AwaitingConcurrencySlot until a slot is available.

We're not aware of a current bug where retries bypass these limits. If you're seeing otherwise, I'd love details to reproduce. A few quick checks:
- Confirm which limit you're relying on:
  - Work pool limit
  - Work queue limit
  - Deployment-level "Max concurrent runs" (Cloud)
  - Global concurrency-limit names
- Inspect current limits and usage:
  - Work pool (docs: CLI: work-pool):
    ```
    prefect work-pool inspect "<your-pool>"
    ```
  - Work queue (docs: CLI: work-queue):
    ```
    prefect work-queue inspect -p "<your-pool>" "<your-queue>"
    ```
  - Global concurrency limits (guide: Global concurrency limits):
    ```
    prefect concurrency-limit ls
    ```

If you can share:
- Which exact limit you're using (pool/queue/deployment/global)
- Prefect version
- An example where you see more concurrent runs than the limit (counts/time window)
- A flow run ID where a retry seemed to "skip" the limit

…I can dig in and help pinpoint what's happening.
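The release-then-reacquire lifecycle described above can be made concrete with a minimal single-threaded sketch (plain Python, with a semaphore standing in for the server's slot accounting; the run names are illustrative). With a pool limit of 1, the failed run frees its slot while AwaitingRetry, another queued run can use it, and the retry must reacquire a slot like any other run.

```python
import threading

# Toy model (NOT Prefect internals): a pool limit of 1 slot. A failed
# run releases its slot while it waits to retry, so another queued run
# can take the slot, and the retry must re-acquire one before running.
slot = threading.BoundedSemaphore(1)
timeline = []

# Attempt 1 of run-A: acquire a slot, fail, release the slot.
assert slot.acquire(blocking=False)
timeline.append("run-A attempt 1 (failed)")
slot.release()  # a run in AwaitingRetry does not hold a slot

# While run-A waits out its retry delay, run-B can take the slot.
assert slot.acquire(blocking=False)
timeline.append("run-B running")
slot.release()
timeline.append("run-B completed")

# The retry of run-A re-acquires a slot like any other run.
assert slot.acquire(blocking=False)
timeline.append("run-A attempt 2 (succeeded)")
slot.release()

print(timeline)
```

At no point do two runs hold the single slot at once, which is the property the server enforces: retries compete for capacity on equal footing with fresh runs.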
j
@Marvin the concurrency is set at the deployment level, with collision strategy ENQUEUE
m
thought for 191.7 seconds
Thanks, that helps. In Prefect 3.x, deployment-level "Max concurrent runs" with collision strategy ENQUEUE does apply to retries.

What to expect:
- When a run fails and schedules a retry, it releases the deployment concurrency slot. On the retry attempt, it must re-acquire a slot. If the deployment is at its max, the retry is queued (ENQUEUE) until a slot frees up. This prevents retries from exceeding the deployment's limit.

Common reasons it can look like retries are ignoring the limit:
- The other concurrent runs are from a different deployment. Deployment concurrency is scoped per deployment, not per flow name.
- A "copied" or ad-hoc run was started without the original deployment selected; deployment-level concurrency won't apply to those.
- The limit was recently changed and you're looking at runs started before the change took effect.

Quick checks:
- Confirm the run is tied to the intended deployment in the UI (Run > Details > Deployment).
- Inspect the deployment to verify its settings:
  ```
  prefect deployment inspect "<flow-name>/<deployment-name>"
  ```
- List current running runs for that deployment to compare against your limit:
  ```
  prefect flow-run ls --flow-name "<flow-name>" --state RUNNING --limit 50
  ```

If you have an example where a retry started while the deployment was already at its max, share:
- The deployment name
- The "Max concurrent runs" value
- The flow run ID(s) for both the active runs and the retry that started
- Your Prefect version and pool (sounds like Vertex AI)

I'll trace the states and confirm whether the retry re-acquired a slot as expected.

Related docs:
- Global concurrency limits (separate from deployment-level, but useful context)
- Work pools (for pool-level concurrency considerations)
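The ENQUEUE behavior can be sketched as a toy scheduler in plain Python (an analogy for intuition, not Prefect's implementation; `submit` and `finish` are made-up helpers): runs beyond the max, retries included, wait in a FIFO queue instead of being rejected, and the next queued run starts as soon as a slot frees.

```python
from collections import deque

# Toy model (NOT Prefect internals) of a deployment-level limit with
# collision strategy ENQUEUE: at most MAX_CONCURRENT runs are Running;
# anything beyond that, new runs and retries alike, waits in a FIFO
# queue instead of being cancelled.
MAX_CONCURRENT = 2
running = set()
pending = deque()
order = []

def submit(run_id):
    if len(running) < MAX_CONCURRENT:
        running.add(run_id)
        order.append(f"start {run_id}")
    else:
        pending.append(run_id)  # ENQUEUE: wait for a slot, don't reject
        order.append(f"enqueue {run_id}")

def finish(run_id):
    running.remove(run_id)
    order.append(f"finish {run_id}")
    if pending:  # a freed slot goes to the head of the queue
        nxt = pending.popleft()
        running.add(nxt)
        order.append(f"start {nxt}")

# Two runs fill the deployment; a retry arrives while it is at max.
for rid in ["run-1", "run-2", "retry-of-run-0"]:
    submit(rid)
finish("run-1")  # slot frees -> the queued retry starts
print(order)
```

The retry is enqueued while the deployment is at capacity and only starts once `run-1` finishes, so the Running count never exceeds the limit.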