Jesse
09/25/2025, 4:15 PM
Marvin
09/25/2025, 4:15 PM
Marvin
09/25/2025, 4:20 PM
export PREFECT_RUNNER_HEARTBEAT_FREQUENCY=60 # must be >= 30 seconds
Or persist it in your profile:
prefect config set PREFECT_RUNNER_HEARTBEAT_FREQUENCY=60
- Then, create an Automation in Cloud to mark runs as Crashed if they stop heartbeating (e.g., “no heartbeat for 5 minutes”).
- Docs: Detect zombie flows (heartbeats + automations)
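For intuition, the "no heartbeat for 5 minutes" condition boils down to a staleness check like the one below. This is only a rough sketch, not how Prefect Cloud implements automations; `is_zombie` and `max_silence` are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_zombie(
    last_heartbeat: Optional[datetime],
    now: datetime,
    max_silence: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True if a running flow has been silent for longer than max_silence."""
    if last_heartbeat is None:
        # No heartbeat was ever seen. This sketch cannot distinguish a zombie
        # from a run that simply has heartbeats disabled, so it does not flag it.
        return False
    return now - last_heartbeat > max_silence


now = datetime(2025, 9, 25, 16, 30, tzinfo=timezone.utc)
print(is_zombie(now - timedelta(minutes=2), now))  # False: heartbeat is recent
print(is_zombie(now - timedelta(minutes=6), now))  # True: silent for over 5 minutes
```

The `None` branch is the reason the env var matters: a run that never heartbeats gives the automation nothing to measure against.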
Worker heartbeats (worker liveness)
- What it is: Workers periodically report their status to Cloud and poll for work.
- Default: On by default. You can tune intervals with:
- PREFECT_WORKER_HEARTBEAT_SECONDS (default 30)
- PREFECT_WORKER_QUERY_SECONDS (default 10)
- Example:
export PREFECT_WORKER_HEARTBEAT_SECONDS=30
export PREFECT_WORKER_QUERY_SECONDS=10
prefect worker start -p <your-pool>
- For Kubernetes/Docker liveness/readiness probes, you can run the worker’s healthcheck webserver; see docs for setup details:
- Docs: Worker healthchecks
Troubleshooting tips
- Verify settings in the runtime environment:
prefect config view
- Ensure the process running the flow actually has PREFECT_RUNNER_HEARTBEAT_FREQUENCY set.
- Check Cloud: open a flow run → Timeline panel to see last heartbeat.
- Network: make sure the runtime can reach PREFECT_API_URL and has a valid PREFECT_API_KEY.
- If you’re using containers, confirm the env vars are present inside the container/pod.
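If it helps, here is a small sketch you could run inside the container or pod to confirm those variables are present. It only inspects environment variables, so values stored in a Prefect profile will not show up here; prefect config view remains the authoritative check. `missing_settings` and `REQUIRED` are hypothetical names, not part of Prefect.

```python
import os
from typing import List, Mapping

# Settings discussed above that the flow run process needs
REQUIRED = (
    "PREFECT_API_URL",
    "PREFECT_API_KEY",
    "PREFECT_RUNNER_HEARTBEAT_FREQUENCY",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_settings()
print("missing:", ", ".join(missing) if missing else "none")
```

Passing a plain dict instead of the real environment makes the check easy to try locally before baking it into an image.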
If you can share whether you’re targeting flow runs or workers (and where they run: Docker, K8s, VM), I can give you exact steps and example manifests.
Jesse
09/25/2025, 4:22 PM
Marvin
09/25/2025, 4:22 PM
Marvin
09/25/2025, 4:29 PM
Set PREFECT_RUNNER_HEARTBEAT_FREQUENCY in the environment where each flow run actually executes. In ECS, that means the container started for the flow run.
- Recommended: 60 seconds (min is 30).
- How to set for ECS work pools (ECSWorker):
- In Prefect Cloud UI → Work Pools → your ECS pool → Edit base job template → add:
"env": {
"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"
}
This ensures every flow run container sends heartbeats.
- Alternatively, set per-deployment when deploying (job variables):
from prefect import flow

flow.from_source("https://github.com/org/repo", entrypoint="flows/my_flow.py:flow").deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    # env is injected into the flow run container, not the worker
    job_variables={"env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"}},
)
- Make sure this env var is on the flow run container, not just the long‑lived worker poller.
- Create an Automation in Cloud to mark runs as Crashed if heartbeats stop (e.g., “no heartbeat for 5 minutes”).
- Docs: Detect zombie flows (heartbeats + automations)
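Since values below 30 are rejected, you may want to validate the interval before it reaches a deployment. A minimal sketch, assuming you build job variables in your own deploy script; `heartbeat_job_variables` and `MIN_HEARTBEAT_SECONDS` are hypothetical helpers, not Prefect APIs.

```python
from typing import Dict

MIN_HEARTBEAT_SECONDS = 30  # minimum stated above for PREFECT_RUNNER_HEARTBEAT_FREQUENCY


def heartbeat_job_variables(seconds: int = 60) -> Dict[str, Dict[str, str]]:
    """Build a job_variables dict that enables flow run heartbeats."""
    if seconds < MIN_HEARTBEAT_SECONDS:
        raise ValueError(
            f"heartbeat frequency must be >= {MIN_HEARTBEAT_SECONDS} seconds, got {seconds}"
        )
    # Environment variable values are strings, so the number is converted here
    return {"env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": str(seconds)}}


print(heartbeat_job_variables(60))
# {'env': {'PREFECT_RUNNER_HEARTBEAT_FREQUENCY': '60'}}
```

The returned dict has the same shape as the job_variables argument shown in the deploy example, so it can be passed straight through.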
Worker heartbeats (ECS worker poller)
- Your ECS task that runs prefect worker start -p <pool> already sends heartbeats by default.
- You can tune intervals on the worker poller container:
# In the ECS Task Definition for the worker container
PREFECT_WORKER_HEARTBEAT_SECONDS=30 # default 30
PREFECT_WORKER_QUERY_SECONDS=10 # default 10
- Start the worker as usual:
prefect worker start -p <your-ecs-pool>
- Optional: Worker healthcheck webserver (for ECS/ALB health checks)
- When started with the --with-healthcheck flag, the worker serves a small webserver you can probe for liveness/readiness.
- Expose the port (default 8080) on your ECS task/container and point the ECS health check at the worker’s webserver.
- You can change host/port via env vars if needed:
PREFECT_WORKER_WEBSERVER_HOST=0.0.0.0
PREFECT_WORKER_WEBSERVER_PORT=8080
- Docs: Worker healthchecks
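If your image has Python but not curl, a probe along these lines could back the container health check. Treat it as a sketch: the /health path is my assumption about the endpoint (confirm it in the healthcheck docs), and `health_url` and `worker_is_healthy` are hypothetical helper names.

```python
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 8080, path: str = "/health") -> str:
    """Build the probe URL; the /health path is an assumption, not confirmed here."""
    return f"http://{host}:{port}{path}"


def worker_is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection refused, timeouts, and non-2xx responses,
        # which urllib raises as HTTPError (a subclass of OSError)
        return False


print(health_url())  # http://127.0.0.1:8080/health
```

A health check command would then exit non-zero when `worker_is_healthy(health_url())` returns False, which is what ECS looks for.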
Verification and troubleshooting
- Verify active settings in the running container:
prefect config view
- In Cloud UI:
- Flow run details → Timeline shows last heartbeat.
- Work Pools → your pool → Workers tab shows worker last-seen/healthy status.
- Networking: Ensure flow run containers and the worker poller can reach PREFECT_API_URL and have PREFECT_API_KEY. Tasks in private subnets without a NAT gateway (or another outbound route) cannot reach the API, so their heartbeats never arrive.
- Remember: The poller’s env does not automatically apply to the flow run container. Set PREFECT_RUNNER_HEARTBEAT_FREQUENCY in the ECS job template (or deployment job variables) so the flow run process sends heartbeats.
If you want, share your ECS setup details (how your ECS pool template is structured and how you deploy flows), and I can give an exact base job template snippet and an ECS task definition example.
Jesse
09/25/2025, 4:54 PM
PREFECT_RUNNER_HEARTBEAT_FREQUENCY isn’t set by default on the tasks that ECS spawns?
Marvin
09/25/2025, 4:54 PM
heartbeat_frequency defaults to None, so no heartbeats are emitted unless you set PREFECT_RUNNER_HEARTBEAT_FREQUENCY (must be >= 30).
- Worker heartbeats are enabled by default, but those only apply to the long‑lived worker process, not the per‑flow ECS task.
How to enable for ECS flow runs:
- Add the env var to your ECS work pool’s base job template so every flow container sends heartbeats:
"env": {
"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"
}
- Or set it per deployment:
from prefect import flow

flow.from_source("https://github.com/org/repo", entrypoint="flows/my_flow.py:flow").deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    job_variables={"env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"}},
)
Tip: Setting it in the pool template applies immediately to new runs; setting it in deployment job_variables requires a redeploy.
Docs: Detect zombie flows (heartbeats + automations)