Short answer: with defaults, a Prefect flow run typically starts within a few seconds to low tens of seconds after you trigger it. The biggest factors are your worker’s poll interval and your infrastructure’s cold-start time.
What determines startup latency
- Prefect orchestration: near-instant (milliseconds to sub-second to schedule/assign).
- Worker polling: by default workers poll for new work every ~10 seconds. That means average wait ~5s, worst-case ~10s before the worker grabs the run.
- Infrastructure spin-up:
  - Process worker (same machine): usually sub-second to a couple of seconds if the environment is warm.
  - Docker worker: 2–5s if the image is already on the host; if the image must be pulled, startup can jump to tens of seconds or minutes (depends on image size and registry).
  - Kubernetes worker: 5–20s on a warm cluster with cached images; 30–120s+ if the image needs pulling or if the cluster/nodes must scale up. Very large images or autoscaling can push this further.
- Code and environment setup: cloning source, importing heavy libraries, and any runtime initialization can add seconds.
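As a rough illustration of how these factors add up, the sketch below combines the poll-interval wait with the warm-infrastructure ranges quoted above. The helper name `estimate_startup_seconds` and the `INFRA_WARM_RANGE` table are illustrative assumptions, not part of Prefect, and the 0.5s lower bound for the process worker is a stand-in for "sub-second". It ignores orchestration overhead and code setup.

```python
# Ballpark warm-infrastructure spin-up ranges (seconds) from the list above.
# These are rough figures, not measurements; 0.5 stands in for "sub-second".
INFRA_WARM_RANGE = {
    "process": (0.5, 2.0),
    "docker": (2.0, 5.0),
    "kubernetes": (5.0, 20.0),
}


def estimate_startup_seconds(worker_type: str, poll_interval: float = 10.0) -> dict:
    """Estimate pickup plus spin-up latency for a run triggered at a random moment.

    Hypothetical helper: assumes the trigger lands uniformly within a poll
    cycle, so the expected wait is half the interval and the worst case is
    the full interval.
    """
    low, high = INFRA_WARM_RANGE[worker_type]
    return {
        "typical": poll_interval / 2 + (low + high) / 2,
        "worst_case": poll_interval + high,
    }


if __name__ == "__main__":
    for kind in INFRA_WARM_RANGE:
        print(kind, estimate_startup_seconds(kind))
```

Passing a smaller `poll_interval` shows how much of the total is pickup delay versus infrastructure time, which helps decide which of the two is worth tuning first.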
Relevant defaults in Prefect 3.x
- Worker poll interval: 10 seconds (controls how often workers query for new runs).
- Prefetch window: 10 seconds (workers will pick up scheduled runs a bit early to help them start on time).
You can see these in the Worker settings:
- Settings reference
- Health/worker behavior: Worker healthchecks
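To make the prefetch window concrete, here is a simplified model of the pickup rule: a scheduled run becomes eligible once its start time falls within the prefetch window ahead of the current time. The function `is_eligible_for_pickup` is a hypothetical illustration of that idea, not Prefect's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def is_eligible_for_pickup(
    scheduled_start: datetime,
    now: datetime,
    prefetch_seconds: float = 10.0,
) -> bool:
    """Simplified model of the prefetch window (an assumption, not Prefect code).

    A run counts as eligible when its scheduled start is no later than
    `now + prefetch_seconds`.
    """
    return scheduled_start <= now + timedelta(seconds=prefetch_seconds)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    soon = now + timedelta(seconds=8)
    later = now + timedelta(seconds=25)
    print(is_eligible_for_pickup(soon, now))         # inside the 10s window
    print(is_eligible_for_pickup(later, now))        # outside the 10s window
    print(is_eligible_for_pickup(later, now, 30.0))  # inside a 30s window
```

Under this model, a larger prefetch value gives the infrastructure more lead time to start before the scheduled moment.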
How to reduce cold start time
- Lower the poll interval:
  - Set PREFECT_WORKER_QUERY_SECONDS=1 or 2 to cut pick-up delay to ~1–2s.
- Increase prefetch for scheduled runs:
  - Set PREFECT_WORKER_PREFETCH_SECONDS=30 so workers grab runs early and start right on time.
- Keep infra warm:
  - Process worker for fastest starts when running on the same machine.
  - Docker/Kubernetes: pre-pull images, keep nodes warm, avoid scale-to-zero if you need fast response.
  - Use small images and local/nearby registries; pin to digests for caching.
  - In Kubernetes, use the IfNotPresent image pull policy and consider a DaemonSet to pre-pull.
- Trim runtime setup:
  - Avoid heavy Git clones or do shallow clones; minimize expensive imports at module import time; cache credentials and connections where appropriate.
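The runtime-setup advice can be sketched in plain Python as follows. This is a minimal illustration, assuming `sqlite3` as a stand-in for a heavy library and hypothetical names `get_connection` and `run_query`; in a real flow the same pattern would sit inside your flow or task code. Note that the cache only lives for the lifetime of one process.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_connection():
    """Create the (hypothetical) expensive client once and reuse it afterwards."""
    # Deferred import: sqlite3 stands in for a heavy library, so the import
    # cost is paid on first use rather than when the module is loaded.
    import sqlite3

    return sqlite3.connect(":memory:")


def run_query(sql: str):
    """Plain function standing in for a flow or task body."""
    conn = get_connection()  # cached after the first call in this process
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(run_query("SELECT 1 + 1"))
    print(get_connection() is get_connection())
```

Deferring the import keeps module load fast, and the cached factory avoids reconnecting on every call within the same run.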
Example: start a worker tuned for fast pickup
PREFECT_WORKER_QUERY_SECONDS=1 PREFECT_WORKER_PREFETCH_SECONDS=30 prefect worker start -p my-pool
If you share which worker type you’re using (process, Docker, or Kubernetes) and what you’re seeing now, I can suggest a tighter plan to get startup as close to “instant” as your environment allows.