# ask-community
b
Hello community! All of my deployment executions are running very late and I'm not sure why...
I am running self-hosted Prefect with Docker Compose, with a worker service listening for deployment executions from a GCP Cloud Run work pool.
My deployments can be late by upwards of 4 hours in some cases.
My worker service is started with this command in my docker compose file:
```
entrypoint:
  - /opt/prefect/entrypoint.sh
  - prefect
  - worker
  - start
  - '--pool=my-work-pool'
  - '--with-healthcheck'
  - '--name=${DEFAULT_WORKER_NAME}'
  - '--limit=${DEFAULT_POOL_LIMIT}'
```
n
hi @Ben this is often something to do with how you're running the worker, or concurrency limits. not sure there's enough info to know for sure from this, but
• why are you using `entrypoint.sh` here? what's in there?
• you may want to ask @Marvin
m
thought for 400.3 seconds
Thanks for the details. Four hours "late" usually points to a worker/pool configuration issue, a backlog caused by concurrency limits, or the worker not actually polling the pool consistently. A few targeted checks to run first:

Verify the pool type, status, and limits:
- A paused pool or a concurrency limit set too low (or to zero) will cause long backlogs.
- Run:
```
prefect work-pool inspect my-work-pool
prefect work-queue ls
```
- Confirm that the type is `cloud-run`, the pool is not paused, and neither the work pool nor any work queue has an unexpectedly low or zero concurrency limit.
- Docs: Work pools, Global concurrency limits, Concurrency limits API

Check the worker flags and logs:
- Your command looks generally fine; the worker will infer the correct type from the pool:
```
prefect worker start --pool my-work-pool --name ${DEFAULT_WORKER_NAME} --limit ${DEFAULT_POOL_LIMIT} --with-healthcheck
```
- A very low `--limit` combined with long-running flows will create a backlog that shows up as "late." What is `${DEFAULT_POOL_LIMIT}`, and how long do typical runs take?
- Temporarily increase logging to debug level to see poll/submission behavior:
```
PREFECT_LOGGING_LEVEL=DEBUG prefect worker start --pool my-work-pool ...
```
- Docs: Workers

Validate the work-queue targeting:
- If your deployments specify a work queue but the worker is not polling it (or vice versa), nothing will get picked up until another worker does.
- If you are using queues, start the worker with the matching `--work-queue`, or remove the queue scoping.

Confirm Cloud Run credentials and quotas:
- The worker (running in Docker Compose) must have permission to create and run Cloud Run jobs in the target project/region. Permission or quota errors will appear in the worker logs and can cause repeated backoffs.
- If you are near the quota for concurrent jobs per region, submissions may be throttled and runs will appear late.
- Guide: GCP worker guide

Healthcheck usage in Docker Compose:
- `--with-healthcheck` starts the Prefect health endpoint; for Docker to act on it, add a Compose healthcheck that hits the endpoint and expose the port. Example:
```
healthcheck:
  test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 3
ports:
  - "8080:8080"
```
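Putting those pieces together, a minimal Compose service for the worker might look like the sketch below. The service name, image tag, limit value, and server URL are assumptions for illustration, not details from this thread; adjust them to your setup:

```
services:
  worker:
    image: prefecthq/prefect:3-latest
    command: prefect worker start --pool my-work-pool --with-healthcheck --limit 4
    environment:
      PREFECT_API_URL: http://prefect-server:4200/api
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3
```

Using `command:` directly like this sidesteps the custom entrypoint question entirely, since the image's default entrypoint will exec the worker process.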
- Docs: Worker healthchecks

Double-check time sync and API config:
- Significant clock skew between your Prefect server, database, and worker host can make runs show as "late."
- Ensure the worker's `PREFECT_API_URL` points at your self-hosted API and that it is reachable from the worker container.

About your entrypoint.sh:
- What's inside `/opt/prefect/entrypoint.sh`? If it doesn't `exec "$@"` as the last line, the Prefect process may not be PID 1, signal handling can break, and restarts/healthchecks may not behave, causing downtime gaps where runs go late.
- A safe minimal entrypoint:
```
#!/usr/bin/env sh
# -e: exit on error; -u: error on unset variables (pipefail is bash-only, so omitted under sh)
set -eu
# hand off to the passed command so prefect runs as PID 1 and receives signals
exec "$@"
```
Helpful next details to share so we can pinpoint the cause:
- Prefect version (exact) and confirmation you're on Prefect 3.x
- Output of:
```
prefect work-pool inspect my-work-pool
prefect work-queue ls
```
- Your worker logs around the time a run should start (any backoff, permission, or quota messages)
- The value of `${DEFAULT_POOL_LIMIT}` and typical flow run duration
- Whether deployments target a specific work queue
- Contents of `/opt/prefect/entrypoint.sh`

If everything above looks good and you still see hours of lateness, the next most common culprits are a low worker `--limit` relative to the backlog, queue-scoping mismatches, or Cloud Run submission failures/quotas. Happy to triage the logs with you.
b
Hey thanks @Nate, while we wait for Marvin... I am using a Docker Compose template from Coolify, and this entrypoint.sh seems to be available within the official prefect 3 docker image. I did notice that `DEFAULT_POOL_LIMIT` is set to `1` - does that affect the concurrency?
n
yes

> this entrypoint.sh seems to be available within the official prefect 3 docker image

im just curious why you're using it here to run a worker

`DEFAULT_POOL_LIMIT` is not a prefect setting; `--limit` is a worker-specific (client-side) limit on the concurrent flow runs to start (it's not a limit on the work pool). so it would appear Coolify, or whoever made the template, is confused about the `--limit` flag

besides the client-side `--limit` flag, you can set (server-side) concurrency on the work pool, a queue within that pool, or the deployment itself
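To make the two knobs concrete, here is what setting each looks like. This is a sketch: the pool name and numbers are illustrative, and the subcommand names follow the Prefect 3 CLI (verify against `prefect work-pool --help` on your version):

```
# client-side: this one worker process submits at most 4 flow runs at a time
prefect worker start --pool my-work-pool --limit 4

# server-side: cap concurrent runs across the whole pool, however many workers poll it
prefect work-pool set-concurrency-limit my-work-pool 10
```

The client-side flag only throttles a single worker process; the server-side limit is enforced by the API for every worker on the pool.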
b
My work pool had concurrency set to `unlimited`, but this worker start command is using `--limit=1`.
If I understand correctly, this client-side limit is causing a bottleneck if I have long running jobs?
n
could be!
yea, `--limit 1` means that the `prefect worker start` process will only be able to submit one scheduled flow run for execution at a time
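A quick back-of-the-envelope shows how a client-side limit of 1 turns into hours of lateness. The run count and duration below are made up for illustration, not taken from the thread:

```shell
# With --limit=L, queued runs start in batches of L, so the k-th queued run
# begins roughly (k / L) * run_duration after the first one starts.
LIMIT=1        # the worker's --limit
RUN_MINUTES=30 # assumed duration of one flow run
K=8            # position of a run in the backlog
echo "$(( (K / LIMIT) * RUN_MINUTES )) minutes late"  # prints "240 minutes late" (~4 hours)
```

Raising the limit (or adding workers) shrinks the batch delay proportionally, which matches the fix that worked below.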
b
Interesting, well I have increased it... let's see what happens
Hey @Nate - based on a couple of tests, it seems adjusting the `--limit` flag has done the trick. Thanks for the help!
n
catjam