<@ULVA73B9P>: How do I enable heartbeats on Prefec...
# marvin-ai
j
@Marvin: How do I enable heartbeats on Prefect Cloud?
m
thought for 296.9 seconds
Great question: “heartbeats” can refer to two things in Prefect Cloud. Do you want to enable heartbeats for:
- Flow runs (to detect/auto-crash “zombie” runs), or
- Workers (to monitor worker liveness)?

Here’s how to enable both in Prefect 3.x.

Flow run heartbeats (zombie detection)
- What it is: The Runner inside your flow process periodically checks in with Cloud so automations can detect stalled runs.
- Enable it by setting this where your flows actually execute (e.g., in your worker container/pod, VM, etc.):
export PREFECT_RUNNER_HEARTBEAT_FREQUENCY=60  # must be >= 30 seconds
Or persist it in your profile:
prefect config set PREFECT_RUNNER_HEARTBEAT_FREQUENCY=60
- Then, create an Automation in Cloud to mark runs as Crashed if they stop heartbeating (e.g., “no heartbeat for 5 minutes”).
- Docs: Detect zombie flows (heartbeats + automations)

Worker heartbeats (worker liveness)
- What it is: Workers periodically report their status to Cloud and poll for work.
- Default: On by default. You can tune intervals with:
  - `PREFECT_WORKER_HEARTBEAT_SECONDS` (default 30)
  - `PREFECT_WORKER_QUERY_SECONDS` (default 10)
- Example:
export PREFECT_WORKER_HEARTBEAT_SECONDS=30
export PREFECT_WORKER_QUERY_SECONDS=10
prefect worker start -p <your-pool>
- For Kubernetes/Docker liveness/readiness probes, you can run the worker’s healthcheck webserver; see docs for setup details:
  - Docs: Worker healthchecks

Troubleshooting tips
- Verify settings in the runtime environment:
prefect config view
- Ensure the process with the flow actually has `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` set.
- Check Cloud: open a flow run → Timeline panel to see last heartbeat.
- Network: make sure the runtime can reach `PREFECT_API_URL` and has a valid `PREFECT_API_KEY`.
- If you’re using containers, confirm the env vars are present inside the container/pod.

If you can share whether you’re targeting flow runs or workers (and where they run: Docker, K8s, VM), I can give you exact steps and example manifests.
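As a rough aid for the troubleshooting list above, here is a minimal Python sketch that can be run inside the container or VM where the flow executes. It only inspects environment variables, so values stored in a Prefect profile will not show up (use `prefect config view` for those). The helper name `missing_settings` and the `REQUIRED` tuple are illustrative assumptions, not part of Prefect.

```python
import os

# Settings the thread says must be visible to the process that runs the flow.
# This tuple is an assumption for illustration, not a Prefect constant.
REQUIRED = (
    "PREFECT_API_URL",
    "PREFECT_API_KEY",
    "PREFECT_RUNNER_HEARTBEAT_FREQUENCY",
)


def missing_settings(env=None):
    """Return the names in REQUIRED that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing in this environment:", ", ".join(missing))
    else:
        print("All heartbeat-related settings are present.")
```

Running it in both the worker container and a flow run container should make it obvious when a variable was set on one but not the other.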
j
@Marvin: For both zombie flow runs and workers. For workers, we use ECS with a task that pulls from Prefect, to start workers.
m
thought for 409.6 seconds
Perfect. Here’s how to enable and verify heartbeats for both zombie flow runs and your ECS-based workers in Prefect 3.x.

Flow run heartbeats (zombie detection)
- What to do: Set `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` in the environment where each flow run actually executes. In ECS, that means the container started for the flow run.
- Recommended: 60 seconds (min is 30).
- How to set for ECS work pools (ECSWorker):
  - In Prefect Cloud UI → Work Pools → your ECS pool → Edit base job template → add:
"env": {
  "PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"
}
This ensures every flow run container sends heartbeats.
- Alternatively, set per-deployment when deploying (job variables):
from prefect import flow

# job_variables set here apply to this deployment only
flow.from_source(
    "https://github.com/org/repo", entrypoint="flows/my_flow.py:flow"
).deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    job_variables={"env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"}},
)
- Make sure this env var is on the flow run container, not just the long‑lived worker poller.
- Create an Automation in Cloud to mark runs as Crashed if heartbeats stop (e.g., “no heartbeat for 5 minutes”).
- Docs: Detect zombie flows (heartbeats + automations)

Worker heartbeats (ECS worker poller)
- Your ECS task that runs `prefect worker start -p <pool>` already sends heartbeats by default.
- You can tune intervals on the worker poller container:
# In the ECS Task Definition for the worker container
  PREFECT_WORKER_HEARTBEAT_SECONDS=30   # default 30
  PREFECT_WORKER_QUERY_SECONDS=10       # default 10
- Start the worker as usual:
prefect worker start -p <your-ecs-pool>
- Optional: Worker healthcheck webserver (for ECS/ALB health checks)
  - The worker serves a small webserver you can probe for liveness/readiness.
  - Expose the port (default 8080) on your ECS task/container and point the ECS health check at the worker’s webserver.
  - You can change host/port via env vars if needed:
PREFECT_WORKER_WEBSERVER_HOST=0.0.0.0
PREFECT_WORKER_WEBSERVER_PORT=8080
- Docs: Worker healthchecks

Verification and troubleshooting
- Verify active settings in the running container:
prefect config view
- In Cloud UI:
  - Flow run details → Timeline shows last heartbeat.
  - Work Pools → your pool → Workers tab shows worker last-seen/healthy status.
- Networking: Ensure flow run containers and the worker poller can reach `PREFECT_API_URL` and have `PREFECT_API_KEY`. Private subnets without NAT (or another outbound route to the internet) will block heartbeats.
- Remember: The poller’s env does not automatically apply to the flow run container. Set `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` in the ECS job template (or deployment job variables) so the flow run process sends heartbeats.

If you want, share your ECS setup details (how your ECS pool template is structured and how you deploy flows), and I can give an exact base job template snippet and an ECS task definition example.
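The worker settings above are written as shell-style assignments, while an ECS task definition lists container environment variables as name/value pairs under `environment`. A minimal sketch of that conversion, in case it helps when editing the worker's task definition; `to_ecs_environment` is a hypothetical helper, and the values are simply the defaults quoted in this thread.

```python
def to_ecs_environment(settings):
    """Convert a {name: value} mapping into the ECS `environment` list shape.

    ECS container definitions expect a list of {"name": ..., "value": ...}
    objects with string values. Sorting keeps the output stable across runs.
    """
    return [{"name": k, "value": str(v)} for k, v in sorted(settings.items())]


# Worker poller settings; 30 and 10 are the defaults quoted in the thread.
worker_env = to_ecs_environment(
    {
        "PREFECT_WORKER_HEARTBEAT_SECONDS": 30,
        "PREFECT_WORKER_QUERY_SECONDS": 10,
    }
)
print(worker_env)
```

The resulting list would go in the `environment` field of the worker container definition; the flow run containers still get their variables from the work pool's job template or deployment job variables.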
j
@Marvin: So `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` isn’t set by default on the tasks that ECS spawns?
m
thought for 29.6 seconds
Short answer: correct. It’s not set by default on the ECS tasks that run your flows.

Details:
- In Prefect 3.x, flow-run heartbeats are opt-in. The runner setting `heartbeat_frequency` defaults to None, so no heartbeats are emitted unless you set `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` (must be >= 30).
- Worker heartbeats are enabled by default, but those only apply to the long‑lived worker process, not the per‑flow ECS task.

How to enable for ECS flow runs:
- Add the env var to your ECS work pool’s base job template so every flow container sends heartbeats:
"env": {
  "PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"
}
- Or set it per deployment:
from prefect import flow

flow.from_source(
    "https://github.com/org/repo", entrypoint="flows/my_flow.py:flow"
).deploy(
    name="prod",
    work_pool_name="my-ecs-pool",
    job_variables={"env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"}},
)
Tip: Setting it at the pool template applies immediately to new runs; setting it in deployment `job_variables` requires redeploy.

Docs: Detect zombie flows (heartbeats + automations)
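To make the opt-in rule concrete, here is a small sketch of how a raw `PREFECT_RUNNER_HEARTBEAT_FREQUENCY` value could be interpreted, based only on what is stated in this thread (unset means no heartbeats, and values must be at least 30). This is not Prefect's own validation code; `heartbeat_status` and `MIN_HEARTBEAT_SECONDS` are hypothetical names.

```python
from typing import Optional

MIN_HEARTBEAT_SECONDS = 30  # minimum stated in the thread


def heartbeat_status(raw: Optional[str]) -> str:
    """Classify a raw PREFECT_RUNNER_HEARTBEAT_FREQUENCY value."""
    if raw is None or raw.strip() == "":
        return "disabled"  # unset: no flow-run heartbeats are emitted
    try:
        seconds = int(raw)
    except ValueError:
        return "invalid"  # not a whole number of seconds
    if seconds < MIN_HEARTBEAT_SECONDS:
        return "invalid"  # below the 30-second minimum
    return f"enabled every {seconds}s"
```

For example, passing `os.environ.get("PREFECT_RUNNER_HEARTBEAT_FREQUENCY")` from inside a flow run container would report whether that container is configured to heartbeat at all.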