<@ULVA73B9P> I have created a main flow/deployme...
# ask-community
k
@Marvin I have created a main flow/deployment, and from this main deployment I am calling another deployment, effectively creating some 748 subflow runs of that deployment. I have provisioned around 10 workers, but I still see some runs getting stuck in the Pending state for long periods, around 30 minutes for some runs. Why is that, even though I have provisioned a good number of workers?
m
thought for 368.2 seconds
Short answer: “Pending” means no worker has claimed the run yet. With 748 subflow runs, this usually comes down to one of these:
- A work-pool/queue mismatch (workers aren’t polling the queue the sub-deployment uses)
- A concurrency limit (either on the work pool or a global/tag-based concurrency limit)
- Not enough available infra capacity even though workers are up (e.g., only a few jobs can be in-flight due to provider quotas)
- The pool is paused or otherwise not accepting runs

A few targeted checks to find the culprit:

1) Confirm the pool/queue wiring
- Make sure your sub-deployment’s work pool (and work queue, if set) matches what your 10 workers are polling.
- CLI:
Copy code
prefect work-pool inspect "<your-pool-name>"
Look for:
- paused = False
- concurrency_limit (see if it’s set)
- queues configured for that pool
- If your deployment is set to a specific work_queue_name, ensure workers start with --work-queue "<that-queue>" or are polling it.

2) Check for global concurrency limits by tag
- If your deployment/flows use tags, a tag-based limit may be throttling runs in a Pending “AwaitingConcurrencySlot” state.
- CLI:
Copy code
prefect concurrency-limit ls
If you see limits that match your tags, either increase them:
Copy code
prefect concurrency-limit create <tag> <new_limit>
or inspect/reset if slots are stuck:
Copy code
prefect concurrency-limit inspect <tag>
prefect concurrency-limit reset <tag>
- Docs: Global concurrency limits

3) Check the work pool’s concurrency limit
- If a limit is set on the pool, only that many runs will be allowed concurrently, regardless of worker count.
- Increase or clear it if needed:
Copy code
prefect work-pool set-concurrency-limit "<your-pool-name>" 200
or
Copy code
prefect work-pool clear-concurrency-limit "<your-pool-name>"
- Docs: Work pools

4) Look at a “Pending” run’s state details
- In the UI, click a Pending run and open “State details”. If you see:
  - “AwaitingConcurrencySlot …” → tag/global concurrency limit
  - “Awaiting work pool concurrency” → work pool concurrency limit
  - No worker found for queue/pool → pool/queue mismatch

5) Infra capacity/quotas
- If you’re using Docker/Kubernetes/ECS, workers may be up but your cluster/cloud may only admit a subset of jobs at a time. That typically shows up as some runs quickly starting while others sit Pending until capacity frees up.
- Check your infra for resource quotas or pending jobs.

6) Optional: subflow vs independent runs
- If you’re calling prefect.deployments.run_deployment, the default is as_subflow=True. That’s fine, but you can set as_subflow=False to decouple from the parent flow’s state tracking if you prefer. Signature for reference:
Copy code
from prefect.deployments import run_deployment
  run_deployment(
      name="path/to/deployment",
      parameters={...},
      as_subflow=True,  # set to False if you want
      work_queue_name="...",  # ensure this matches your workers if you set it
      tags=[...]
  )
- API ref: run_deployment

If you can share a bit more, I can pinpoint it quickly:
- The work pool name and whether it shows a concurrency limit
- Whether your sub-deployment sets a work_queue_name
- What the “State details” say for one of the Pending runs
- Whether you’re using tags on the deployment/flows
- Which infrastructure type your workers use (Process, Docker, Kubernetes, ECS, etc.)

Links for reference:
- Work pools: docs
- Global concurrency limits: docs
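As a quick programmatic complement to check 4, here is a minimal sketch (assuming Prefect 3.x and a working API connection; the filter classes shown are from prefect.client.schemas.filters) that lists Pending runs and prints their state messages, so a message like “AwaitingConcurrencySlot …” stands out without clicking through the UI:

import asyncio

from prefect import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateName,
)

async def show_pending_runs(limit: int = 20):
    async with get_client() as client:
        runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(name=FlowRunFilterStateName(any_=["Pending"]))
            ),
            limit=limit,
        )
        for run in runs:
            # the state message usually explains why the run has not started yet
            state = run.state
            print(run.name, state.name if state else None, state.message if state else None)

asyncio.run(show_pending_runs())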
k
@Marvin I am not specifying any work queue for my deployments in my run_deployment calls, and I am currently using only one 'default' work queue. Can I create some work queues and start one worker for each of them to reduce the delay in runs getting picked up? In that case, if I am not specifying any work queue while calling run_deployment, do all the runs go to the default work queue, or are they distributed among all the available work queues?
m
thought for 323.0 seconds
Short answer: runs are not auto-distributed. If you don’t specify a work queue when calling run_deployment, the flow run will use the deployment’s configured work queue; if the deployment doesn’t have one set, it goes to the work pool’s Default queue. Creating additional queues won’t change routing unless you point deployments (or each run_deployment call) at those queues.

Details and suggestions:
- In Prefect 3.x, run_deployment has an optional work_queue_name argument. If you omit it, the server uses the queue configured on the deployment; if none, it uses the pool’s Default queue. So all your runs will continue to land on “Default” unless you change that.
- Starting more workers will reduce pickup latency:
  - Start more workers against the same pool/queue to increase capacity.
  - If you start a worker with only --pool (no --work-queue), it will poll all queues in that pool. If you want strict capacity per queue, start each worker with --work-queue <name>.
- To actually use multiple queues, either:
  - Set the deployment to target a specific queue, or
  - Pass work_queue_name when you call run_deployment.

Examples:
- Create queues and start workers
Copy code
prefect work-queue create high-priority -p my-pool
prefect work-queue create low-priority -p my-pool

# Worker pinned to a specific queue
prefect worker start -p my-pool --work-queue high-priority

# Worker that polls all queues in the pool
prefect worker start -p my-pool
- Route a run to a queue at call time
Copy code
from prefect.deployments.flow_runs import run_deployment

# run_deployment can be called directly from synchronous code;
# await it instead if you are calling from an async flow or function
run_deployment(
    name="my-flow/my-deployment",
    parameters={"x": 1},
    work_queue_name="high-priority"
)
Helpful docs:
- Work pools and queues
- Workers
- CLI: work-queue

If you share how you want to partition runs (e.g., by priority, team, or environment), I can suggest a concrete queue/deployment layout.
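If you do decide to spread the burst across queues explicitly, here is a minimal sketch of routing at call time with work_queue_name. The deployment name, parameter, and queue names are hypothetical, and the queues are assumed to already exist in the same pool; timeout=0 makes run_deployment return as soon as the run is created instead of waiting for it to finish:

from prefect.deployments import run_deployment

queues = ["q1", "q2", "q3"]  # hypothetical queues created in the same work pool

for i in range(748):
    run_deployment(
        name="child-flow/child-deployment",       # hypothetical "<flow>/<deployment>" name
        parameters={"index": i},                  # hypothetical parameter
        work_queue_name=queues[i % len(queues)],  # simple round-robin routing
        timeout=0,                                # don't block waiting for the run to finish
    )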
k
@Marvin Will the pickup of runs be faster if I distribute my 748 runs to different work queues and assign a worker to each of those work queues, instead of using a single 'default' work queue for all the runs?
m
thought for 46.2 seconds
Short answer: Not necessarily. Pickup speed is driven by available workers and any concurrency limits, not by how many work queues you have. You’ll see the same or better improvement by starting more workers on your existing pool/queue as you would by splitting into many queues with one worker each.

Why:
- When you don’t specify work_queue_name in run_deployment, runs go to the deployment’s configured queue; if none, they land on the pool’s Default queue.
- A worker started with just --pool will poll all queues in that pool; a worker started with --work-queue <name> will poll only that queue. Either way, more workers = faster pickup.
- Multiple queues help for isolation, different job templates, prioritization, or separate concurrency limits. They don’t inherently speed up pickup unless you were previously bottlenecked by a queue-level concurrency limit.

What to do to speed up pickup for 748 runs:
- Start more workers for the same pool. You can point them all at the Default queue or let them poll the whole pool.
- Make sure you’re not throttled by pool or queue concurrency limits.
- Use an infrastructure-backed worker type (e.g., Kubernetes, ECS) if you need to fan out quickly.

Examples:
- Add workers against the same pool/queue
Copy code
# Start 5 workers that poll the entire pool
prefect worker start -p my-pool
prefect worker start -p my-pool
prefect worker start -p my-pool
prefect worker start -p my-pool
prefect worker start -p my-pool

# Or pin them to the Default queue explicitly
prefect worker start -p my-pool --work-queue Default
- If you do want separate queues (e.g., for isolation), it’s fine to run one worker per queue:
Copy code
prefect work-queue create q1 -p my-pool
prefect work-queue create q2 -p my-pool

prefect worker start -p my-pool --work-queue q1
prefect worker start -p my-pool --work-queue q2
Docs:
- Work pools and queues
- Workers
- CLI: work-queue

If you share your worker type (local, Docker, K8s, etc.), current pool/queue concurrency limits, and how you’re launching the 748 runs, I can suggest a concrete tuning plan.
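On that last point, how the 748 runs are launched matters too. If they are fired from a parent flow, a batched async launch like this sketch (deployment name, parameter, and batch size are hypothetical) creates the child runs quickly without blocking on each one:

import asyncio

from prefect import flow
from prefect.deployments import run_deployment

@flow
async def parent(total: int = 748, batch_size: int = 50):
    async def launch(i: int):
        return await run_deployment(
            name="child-flow/child-deployment",  # hypothetical "<flow>/<deployment>" name
            parameters={"index": i},             # hypothetical parameter
            timeout=0,                           # return once the run is created
        )

    for start in range(0, total, batch_size):
        # submit in batches so the API isn't hit with all 748 calls at once
        await asyncio.gather(*(launch(i) for i in range(start, min(start + batch_size, total))))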
k
- Use an infrastructure-backed worker type (e.g., Kubernetes, ECS) if you need to fan out quickly. @Marvin explain this point in detail
m
thought for 200.4 seconds
Great question. Here’s what “infrastructure-backed workers” (Kubernetes, ECS, etc.) mean in Prefect 3.x and why they help you fan out quickly.

What they are
- A worker that submits each flow run to external compute as a separate job (pod/task/container) instead of running the flow inline on the worker machine.
- Examples:
  - Kubernetes worker: creates a Kubernetes Job/Pod per flow run.
  - ECS worker: starts an ECS task (Fargate or EC2) per flow run.
  - Docker worker: starts a new local/remote Docker container per flow run.

Why they fan out faster
- True parallelism: Each flow run gets its own container/pod/task. Your concurrency is limited by your cluster/service capacity instead of a single machine’s CPU/RAM.
- Elastic capacity: K8s clusters with a cluster autoscaler or ECS with Fargate can scale up quickly to run many jobs at once. With proper quotas, 100s of runs can start in parallel.
- Isolation and stability: Failures in one run do not impact others; each run lives in its own disposable environment.

How it works in Prefect 3.x
- You create a Work Pool of a specific type (kubernetes, ecs, docker, etc.). The pool stores a “base job template” that defines how to launch one run (image, resources, env vars, IAM/service account, networking).
- You start a worker pointed at that pool. The worker watches the pool’s queues and, when runs are scheduled, submits jobs to your infra using the template (with per-run overrides).
- Your deployments target that work pool. When a run is created (schedule, trigger, or run_deployment), the worker spins up infra for that run, then tears it down when finished.

Setup at a glance
- Create a work pool (and customize the job template as needed):
Copy code
prefect work-pool create my-k8s-pool --type kubernetes
prefect work-pool get-default-base-job-template --type kubernetes --file k8s-job.json
# Edit k8s-job.json (image, namespace, resources, env, serviceAccount, volumes, etc.)
prefect work-pool create my-k8s-pool --base-job-template k8s-job.json --overwrite
- Start a worker with the required extras installed:
Copy code
# Kubernetes
uv run --with 'prefect[kubernetes]' prefect worker start -p my-k8s-pool

# ECS
uv run --with 'prefect[aws]' prefect worker start -p my-ecs-pool
- Point your deployment to the pool (either in your project’s prefect.yaml or in code), then trigger runs. The worker will create pods/tasks per run.

Kubernetes specifics (fast fan-out tips)
- Image startup time: Use a small image and pre-pull it on nodes if possible; set imagePullPolicy: IfNotPresent when appropriate. Private registries should be close to the cluster.
- Node scaling: Enable/verify the cluster autoscaler and ensure resource requests/limits are realistic so the scheduler can bin-pack efficiently.
- Permissions: Assign a serviceAccount with minimal RBAC rights to create Jobs/Pods in the target namespace.
- Resources: Tune CPU/memory requests to balance density vs throttling; too-large requests slow scheduling.
- Cleanup: Use TTLAfterFinished for jobs or let Prefect manage job cleanup; keep logs flowing to your logging stack.

ECS specifics (fast fan-out tips)
- Launch type: Fargate is serverless (fast to scale, no cluster management) but watch costs; EC2-backed ECS can be faster/cheaper if you manage capacity.
- Task definition: Right-size CPU/memory; specify subnets/security groups; attach an execution role and task role with least-privilege.
- Networking and images: Keep ECR images in-region; ensure NAT/bandwidth is not a bottleneck for image pulls.
- Logs: Configure CloudWatch logs for visibility and quick troubleshooting.

When you’ll see the biggest gains
- You have large bursts (like your 748 runs) and need many to start concurrently.
- Your current bottleneck is one machine’s capacity (local/process worker) rather than orchestration.
- You need isolation between runs, or different images per deployment.

What still matters for pickup speed
- Worker availability: At least one healthy worker for the pool must be running; usually 1–3 workers are enough since they only submit jobs. You can run more for resilience or higher submission throughput.
- Concurrency limits: Ensure your pool/queue/global concurrency limits aren’t throttling the burst.
- Provider quotas: K8s cluster quotas, ECS account service limits, and image registry rate limits can cap fan-out. Raise limits or pre-warm where needed.

Examples of deployment routing
- In code, you can deploy to a pool directly:
Copy code
from prefect import flow

# Load the flow from source and deploy it to the Kubernetes work pool;
# the worker will launch one pod per flow run using this image
flow.from_source(
    source="https://github.com/acme/my-repo",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="prod",
    work_pool_name="my-k8s-pool",
    image="ghcr.io/acme/my-image:latest",
)
Docs to dive deeper
- Work pools and queues: Work pools and queues
- Workers overview: Workers
- Managing work pools and job templates: Manage Work Pools
- AWS ECS guide: Prefect AWS ECS guide
- Serverless infra patterns: Serverless infrastructure guide

If you share your target platform (K8s vs ECS), image strategy, and any concurrency limits, I can propose a concrete job template and scaling plan for your 748-run bursts.
k
@Marvin Does specifying the type of worker when starting the worker help with faster pickup? And if there are no concurrency limits and no quota limits stopping the runs' execution, and even then I see delays while using 10 workers for 748 runs, what could be the issue?
m
thought for 115.7 seconds
Short answer
- You don’t specify a “worker type” when starting a worker in Prefect 3.x. The work pool’s type (kubernetes, ecs, docker, process) determines how runs are launched. So “specifying the type when starting” won’t change pickup speed.
- If you see delays with 10 workers for 748 runs and you’re sure there are no concurrency or provider quota limits, the delay is usually due to how workers poll/claim runs, queue/pool configuration, or infrastructure startup overhead.

What actually affects pickup speed
- Worker-to-queue wiring:
  - Workers started with only --pool poll all queues in the pool.
  - Workers started with --work-queue poll only that queue. If your runs all land on “Default” but most workers are pinned to other queues, they won’t pick anything up.
- Pool/queue state and limits: Pool or queue paused, or hidden concurrency limits on the pool/queue/deployment can throttle pickup.
- API rate limits/backoff: If workers hit Prefect Cloud API rate limits, they’ll back off and poll less often, delaying pickup.
- Queue filters/priority: Misconfigured queue filters (e.g., by tags) or priorities can make runs ineligible for some workers.
- Worker submission throughput vs infra startup: Workers claim runs fast, but each submission might involve image pulls and provider API calls (K8s/ECS). If containers/pods/tasks are slow to start, it can look like “pickup” delay.

How to diagnose quickly
- Verify workers are polling the right place and are healthy
  - Make sure all workers are started against the correct pool and, if pinned, the correct queue(s).
  - In the UI, check Worker status and heartbeats; in logs, look for “claimed N runs” vs “no eligible runs”.
- Inspect queues and pools
Copy code
prefect work-queue ls -p <your-pool> --verbose
prefect work-pool inspect <your-pool>
- Confirm: not paused, no concurrency limits, queue priorities/filters look correct, and the runs are actually in the queue your workers poll.
- Preview and read runs
Copy code
prefect work-queue preview -p <your-pool>
prefect work-queue read-runs -p <your-pool>
- Preview shows which runs are eligible and why/why not.
- Check for API rate limiting in worker logs
  - Look for 429 responses/backoff messages. If present, consider staggering worker starts or reducing poll aggressiveness.
- Check infra startup time
  - For K8s/ECS: are images large or not cached? Is autoscaling slow? Are there node capacity issues, slow image registry pulls, or IAM/permissions hiccups?

Concrete fixes to try
- Ensure all workers can see the runs: if you’re using multiple queues, either route runs to those queues or start workers with just --pool so they poll all queues.
- Add more workers, or keep a few “submitter” workers but scale infra: typically a few workers suffice; fan-out comes from infra-backed jobs (K8s/ECS). If you’re already infra-backed, focus on infra startup time.
- Reduce infra cold-start: use smaller images, pre-pull or warm images, colocate the registry with your cluster/region, enable/verify the K8s cluster autoscaler or ECS service quotas, and right-size CPU/memory requests.
- Re-check hidden limits: deployment-level concurrency limits, global concurrency limits, or queue/pool limits can be easy to miss.
- Avoid over-pinning workers to empty queues. If in doubt, start workers like:
Copy code
prefect worker start -p <your-pool>    # polls all queues in the pool
If you can share:
- Your work pool type (process/docker/kubernetes/ecs),
- How the 748 runs are created (schedule vs burst run_deployment),
- Whether the delay is from Scheduled -> Pending or Pending -> Running,
- And a sample of your worker logs around the delay,
I’ll pinpoint the likely bottleneck and suggest exact changes.

Docs for reference:
- Work pools and queues: Work pools and queues
- Workers: Workers
- Manage work pools (job templates, tuning): Manage Work Pools
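To see where the time actually goes (Scheduled -> Pending vs Pending -> Running), here is a small sketch, assuming Prefect 3.x; replace the placeholder with a real flow run ID. It prints the run’s state history with timestamps:

import asyncio

from prefect import get_client

async def show_state_history(flow_run_id: str):
    async with get_client() as client:
        states = await client.read_flow_run_states(flow_run_id)
        for state in states:
            # the gap between consecutive timestamps shows how long the run sat in each state
            print(state.timestamp, state.type.value, state.name, state.message or "")

asyncio.run(show_state_history("<your-flow-run-id>"))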
k
@Marvin How can I write my worker logs to a file?
m
thought for 931.7 seconds
You’ve got a few good options, depending on where your worker runs. The simplest is shell redirection; the most robust is a logging config that adds a FileHandler.

Quickest: redirect stdout/stderr to a file (local/process workers)
- Start your worker and append logs to a file:
Copy code
mkdir -p /var/log/prefect
nohup prefect worker start -p my-pool >> /var/log/prefect/worker.log 2>&1 &
- Rotate with logrotate (recommended for long-running services).

Prefect-native: add a file handler via PREFECT_LOGGING_SETTINGS
- Prefect honors a standard Python logging dictConfig via the PREFECT_LOGGING_SETTINGS env var. Example (rotating file + console):
Copy code
export PREFECT_LOGGING_SETTINGS='{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "standard": {
      "format": "%(asctime)s | %(levelname)s | %(name)s - %(message)s"
    }
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "formatter": "standard",
      "stream": "<ext://sys.stdout>"
    },
    "file": {
      "class": "logging.handlers.RotatingFileHandler",
      "formatter": "standard",
      "filename": "/var/log/prefect/worker.log",
      "maxBytes": 10485760,
      "backupCount": 5
    }
  },
  "loggers": {
    "prefect": { "handlers": ["console", "file"], "level": "INFO", "propagate": false },
    "":        { "handlers": ["console", "file"], "level": "INFO" }
  }
}'
export PREFECT_LOGGING_LEVEL=INFO
prefect worker start -p my-pool
- Make sure the directory exists and is writable by the worker process.
- You can also store that JSON in a file and set the env var to its contents:
Copy code
export PREFECT_LOGGING_SETTINGS="$(cat ./logging.json)"
Docker
- If the worker runs in Docker, either:
  - Use Docker logs (recommended) and a logging driver (json-file, fluentd, etc.), or
  - Bind mount a host directory and point your FileHandler to it:
Copy code
docker run --name prefect-worker \
  -e PREFECT_API_URL=... -e PREFECT_API_KEY=... \
  -e PREFECT_LOGGING_SETTINGS="$(cat logging.json)" \
  -v /var/log/prefect:/var/log/prefect \
  prefecthq/prefect:3-latest prefect worker start -p my-pool
Kubernetes
- Best practice: write to stdout and let your cluster logging (e.g., EFK/CloudWatch) collect logs. If you must write to a file, mount a volume and point your FileHandler there:
  - In your work pool’s base job template, add a volume/volumeMount (emptyDir or PVC) at /logs
  - Set the PREFECT_LOGGING_SETTINGS env var to use "filename": "/logs/worker.log"
- Example env var snippet in the job template:
Copy code
"env": [
  { "name": "PREFECT_LOGGING_SETTINGS",
    "value": "{\"version\":1,\"handlers\":{\"file\":{\"class\":\"logging.handlers.RotatingFileHandler\",\"filename\":\"/logs/worker.log\",\"maxBytes\":10485760,\"backupCount\":5},\"console\":{\"class\":\"logging.StreamHandler\"}},\"loggers\":{\"prefect\":{\"handlers\":[\"console\",\"file\"],\"level\":\"INFO\",\"propagate\":false}}}"
  }
]
ECS
- Recommended: use the task’s awslogs/FireLens config to ship stdout/stderr to CloudWatch or another sink.
- If you need a file, mount an EFS volume (or ephemeral storage) at /logs and configure PREFECT_LOGGING_SETTINGS to write to /logs/worker.log.

Notes
- Worker logs are about the worker process (polling, claiming, submission). Flow run logs come from the run’s pod/task/container and show in the Prefect UI; they’re typically handled by your platform’s logging.
- You can raise/lower verbosity via PREFECT_LOGGING_LEVEL (e.g., DEBUG/INFO/WARN).

If you tell me where your worker is running (local, Docker, K8s, ECS), I can give you a drop-in config snippet for that environment.