Harshith Gogineni
10/17/2025, 6:42 AM
I'm triggering flow runs from my FastAPI app via create_flow_run_from_deployment().
What's happening:
• Deployment YAML has concurrency_limit: 10 configured
• ECS keeps spawning Fargate tasks without respecting limits
I have tried adding a concurrency limit on the work pool instead of the deployment, but that causes the runs to never even reach the Pending/Running state; they get stuck in Late.
So ideally I need a solution where the concurrency limits are respected.
Marvin
10/17/2025, 6:42 AM
Marvin
10/17/2025, 6:54 AM
- Deployment concurrency limits are enforced per deployment, and they apply to runs however they are created (including via create_flow_run_from_deployment()).
- Work pool concurrency limits cap the total number of active runs launched from that pool (across deployments). When at capacity, additional runs will sit in SCHEDULED and will often show as Late if their scheduled time has already passed — that’s expected; they should kick off when a slot opens.
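To make the first point concrete, here's roughly what a trigger endpoint calling create_flow_run_from_deployment() looks like; a minimal sketch, where the route and names are placeholders rather than your actual code:
```
from uuid import UUID

from fastapi import FastAPI
from prefect import get_client

app = FastAPI()

@app.post("/trigger/{deployment_id}")
async def trigger(deployment_id: str):
    # Runs created this way start as SCHEDULED; the server checks the deployment's
    # concurrency limit before letting them proceed to Pending/Running.
    async with get_client() as client:
        run = await client.create_flow_run_from_deployment(deployment_id=UUID(deployment_id))
    return {"flow_run_id": str(run.id)}
```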
Docs for reference:
- States overview (incl. AwaitingConcurrencySlot and Late): States
- Work pools: Work pools
- API: create flow run from deployment: API: create_flow_run_from_deployment
- ECS push pool guide: ECS guide
What likely explains 20–30 runs when the deployment limit is 10
1) The limit isn’t actually set on the deployment you’re triggering (e.g., a different deployment is being used, or the YAML didn’t apply).
2) You may have multiple deployments being triggered — the concurrency limit is per deployment, not global. Two deployments each with 10 can yield ~20 running.
3) Older server versions have a gap here: 3.4.11 predates fixes to the concurrency lease system. We’ve shipped race-condition fixes and related concurrency improvements since then:
- “Fix race conditions in concurrency lease renewal” and “Fix double-release of concurrency limiter token” in 3.4.24. See release notes: Prefect 3.4.24 release
What explains “stuck in Late” when you set a work pool concurrency limit
- That is the normal state while they’re waiting for a pool slot after their scheduled time has passed. They should transition to Pending/Running automatically when a slot frees up.
- If they remain Late indefinitely, it usually means either:
- The pool limit is set lower than the number of already-running runs and never frees up, or
- The server’s background services aren’t running properly (these services are responsible for progressing SCHEDULED runs). If you’re running Server in Docker, make sure you’re starting it with services enabled; the default prefect server start manages that.
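Either way, it helps to read back what the pool actually has configured; a minimal sketch (the pool name is a placeholder):
```
import asyncio

from prefect import get_client

async def main():
    async with get_client() as client:
        pool = await client.read_work_pool("my-ecs-pool")  # placeholder name
        # A concurrency_limit of None means the pool is uncapped
        print(pool.name, pool.concurrency_limit, pool.is_paused)

asyncio.run(main())
```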
Quick checks to run now
1) Verify the deployment you’re triggering truly has the limit set:
- Run:
prefect deployment inspect <flow-name>/<deployment-name>
Confirm there’s a Concurrency limit and that the collision strategy is ENQUEUE (queue when at the limit). If you don’t see it, your YAML didn’t apply or you’re calling a different deployment. (A Python cross-check is sketched after this list.)
2) Confirm how many deployments you’re triggering:
- If more than one, note that the limit is per deployment.
3) Inspect a couple of the runs that started beyond the limit:
- Run:
prefect flow-run inspect <flow-run-id>
If these are going straight to Running when the limit is already full, that points to a lease issue in the server version.
4) Validate your server and client versions:
- Server is 3.4.11; are your FastAPI client and any workers also 3.4.11? Mismatches can be risky for new features like deployment concurrency.
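To back up check 1 from Python rather than the CLI, a small sketch (the flow/deployment names are placeholders):
```
import asyncio

from prefect import get_client

async def main():
    async with get_client() as client:
        dep = await client.read_deployment_by_name("my-flow/prod")  # placeholder names
        # The applied concurrency limit (if any) appears in the printed response;
        # if it's absent, the YAML setting never reached this deployment.
        print(dep)

asyncio.run(main())
```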
Recommended next steps
- Upgrade Prefect Server to at least 3.4.24+ to pick up concurrency lease fixes:
- Release notes show relevant fixes: 3.4.24
- Recreate or redeploy the deployment with an explicit concurrency config to be sure it’s applied. In code (Flow.deploy):
```
from prefect import flow
from prefect.client.schemas.objects import ConcurrencyLimitConfig, ConcurrencyLimitStrategy

@flow
def my_flow():
    ...

my_flow.deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    # ENQUEUE queues runs beyond the limit instead of cancelling them
    concurrency_limit=ConcurrencyLimitConfig(limit=10, collision_strategy=ConcurrencyLimitStrategy.ENQUEUE),
)
```
Marvin
10/17/2025, 6:54 AM
- Or, in prefect.yaml, make sure concurrency_limit is present for the specific deployment. Then run prefect deploy and recheck with prefect deployment inspect.
- If you need a global cap across everything in the pool (not just one deployment), use the work pool concurrency limit set to 10. Expect queued runs to show Late until a slot opens; they should start automatically. If they don’t, check server service logs.
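If you do go the pool-level route, the cap can also be set programmatically; a sketch assuming a recent 3.x client (the pool name is a placeholder, and the CLI/UI can do the same thing):
```
import asyncio

from prefect import get_client
from prefect.client.schemas.actions import WorkPoolUpdate

async def main():
    async with get_client() as client:
        # Cap the whole pool at 10 active runs; queued runs may show as Late
        # until a slot opens, which is expected behavior.
        await client.update_work_pool(
            work_pool_name="my-ecs-pool",  # placeholder
            work_pool=WorkPoolUpdate(concurrency_limit=10),
        )

asyncio.run(main())
```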
A few clarifying questions to pinpoint the issue
- Can you share:
- The output of prefect deployment inspect <flow>/<deployment> for the affected deployment?
- A small snippet of how your FastAPI app calls create_flow_run_from_deployment()?
- Whether there are multiple deployments being triggered concurrently (names)?
- Whether you’ve set any work queue concurrency limits in addition to the pool limit?
- How you’re starting Prefect Server in Docker (command/compose) so we can confirm services are running?
If you want, I can help you test with a tiny repro: set a flow to sleep for 60s, deploy with concurrency_limit=2, then trigger 10 runs quickly and verify that only 2 run while the rest enter AwaitingConcurrencySlot — that’ll confirm the server is enforcing limits.
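Roughly, the trigger side of that repro could look like the sketch below; it assumes a sleepy/concurrency-test deployment already exists with concurrency_limit=2 and ENQUEUE (names are placeholders):
```
import asyncio

from prefect import get_client

async def main():
    async with get_client() as client:
        dep = await client.read_deployment_by_name("sleepy/concurrency-test")  # placeholder
        for _ in range(10):
            run = await client.create_flow_run_from_deployment(deployment_id=dep.id)
            print("created", run.id)

asyncio.run(main())

# Then watch the UI (or prefect flow-run ls): only 2 runs should be Running at a
# time, with the other 8 waiting in AwaitingConcurrencySlot until a slot frees up.
```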
Harshith Gogineni
10/21/2025, 6:47 AM
Marvin
10/21/2025, 6:49 AM
Marvin
10/21/2025, 6:50 AM
prefect deployment inspect <flow-name>/<deployment-name>
You should see the concurrency limit and the strategy.
3) Sanity test with a repro:
- Create a trivial flow that sleeps 60s
- Deploy with concurrency_limit=2 (ENQUEUE)
- Trigger 10 runs quickly (via your FastAPI or CLI)
- Expect 2 Running, 8 in AwaitingConcurrencySlot (SCHEDULED). ECS should only spin up 2 tasks.
4) If you also use a work pool concurrency limit and see “Late” runs: that’s expected while they wait for a pool slot. They should move to Pending/Running as slots free up. If they remain Late indefinitely, double-check the server is running its background services properly in your Docker setup.
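If you want to see how many runs are sitting in Late at any given moment, a small query sketch using the client's flow run filters:
```
import asyncio

from prefect import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateName,
)

async def main():
    async with get_client() as client:
        late_runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(name=FlowRunFilterStateName(any_=["Late"]))
            )
        )
        print(len(late_runs), "runs currently Late")

asyncio.run(main())
```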
Handy references
- States (AwaitingConcurrencySlot, Late): States
- Work pools: Work pools
- API for creating runs from a deployment (respects deployment concurrency): create_flow_run_from_deployment
- ECS guide: ECS guide
If you can share:
- The output of prefect deployment inspect <flow>/<deployment>
- How you’re starting Prefect Server in Docker (compose/command)
- A snippet of your create_flow_run_from_deployment() call
- Whether multiple deployments are being triggered
…I can help pinpoint if there’s anything else at play. My bet is the server upgrade will fix the enforcement, and version-aligning everything will make behavior predictable.
Harshith Gogineni
10/21/2025, 6:56 AM
Marvin
10/21/2025, 6:56 AM
Marvin
10/21/2025, 6:57 AM
Use the official Prefect image as your runtime image, e.g.
prefecthq/prefect:3.4.24-python3.12
or the latest 3.4.x patch with your preferred Python version. Tags are published here: Docker Hub: prefecthq/prefect.
Notes:
- In Prefect 3 ECS “push” work pools there isn’t a separate long‑running worker container; this image is the runtime image for your flow runs (set on the deployment or via job variables).
- It’s best to align your server, your FastAPI client package, and your runtime image to the same 3.4.x patch for predictability. I recommend 3.4.24+ because it includes concurrency lease fixes.
- If your flows use AWS SDKs or Prefect AWS integrations, bake them into the image:
```
FROM prefecthq/prefect:3.4.24-python3.12
RUN pip install -U prefect-aws boto3
```
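If you bake a custom image like the one above, a one-line check run in each environment (server container, FastAPI app, flow image) confirms the versions actually match:
```
# Run in each environment; all three should report the same version, e.g. 3.4.24
import prefect
print(prefect.__version__)
```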
How to set the image:
- In code:
```
my_flow.deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    image="prefecthq/prefect:3.4.24-python3.12",
)
```
- Or in prefect.yaml:
```
deployments:
  - name: prod
    work_pool: { name: my-ecs-pool }
    image: prefecthq/prefect:3.4.24-python3.12
```
If you want me to sanity‑check the exact tag you’re considering, share it and I’ll confirm it’s published.