<@ULVA73B9P> how do i add a heart beat for the the...
# ask-marvin
r
@Marvin how do i add a heartbeat for the flow runs that run in ecs
m
Great question. In Prefect 3.x, flow-run heartbeats are opt-in. To add a heartbeat for flows that run on ECS, set the runner heartbeat frequency as an environment variable in the ECS task container that executes your flow.
What to set
- Set PREFECT_RUNNER_HEARTBEAT_FREQUENCY to an integer ≥ 30 (seconds). Default is disabled (None).
- This emits flow-run heartbeat events that you can use for crash (zombie) detection via Automations.
Ways to set it for ECS
1) Per-deployment (job variables)
- In your prefect.yaml:
deployments:
  - name: my-ecs-deployment
    entrypoint: path/to/flow.py:my_flow
    work_pool:
      name: my-ecs-pool
      job_variables:
        env:
          PREFECT_RUNNER_HEARTBEAT_FREQUENCY: 60
- Or in Python when deploying:
from prefect import flow

@flow
def my_flow():
    ...

(
    my_flow
    .from_source("https://github.com/org/repo", entrypoint="flows/my_flow.py:my_flow")
    .deploy(
        name="ecs-deploy",
        work_pool_name="my-ecs-pool",
        job_variables={
            "env": {
                "PREFECT_RUNNER_HEARTBEAT_FREQUENCY": 60
            }
        },
    )
)
2) For all ECS runs in a work pool
- Edit the ECS work pool’s base job template (in the UI) and add:
"env": {
  "PREFECT_RUNNER_HEARTBEAT_FREQUENCY": 60
}
This applies to every deployment that uses that pool.
Notes and related settings
- Flow-run heartbeat is different from the worker heartbeat. Workers have their own heartbeat interval (PREFECT_WORKER_HEARTBEAT_SECONDS, default 30). You don’t need to change that to enable flow-run heartbeats.
- To automatically mark stuck/“zombie” flows as Crashed when heartbeats are missed, create an Automation that reacts to missed heartbeats (see docs below).
Docs for deeper reference
- Detect zombie flows and heartbeats: Detect zombie flows
- Settings reference (includes runner and worker heartbeat settings): Settings reference
- Customize job variables for deployments: Customize job variables
- ECS worker/guide: ECS guide
- Worker health checks (separate from flow heartbeats): Worker healthchecks
Would you like this applied per-deployment or at the work pool level for all ECS runs? And do you also want help wiring an Automation to mark runs as Crashed if heartbeats are missed?
r
@Marvin i get below error when i specify in yaml Response: {'detail': "Error creating deployment: ValidationError: '60 is not valid under any of the given schemas'"}
m
That error usually happens when the env var value is not a string or it’s placed in the wrong spot in the YAML. A couple of quick checks:
- Make sure you put it under work_pool -> job_variables -> env
- Make sure the value is a string (env vars must be strings), e.g. "60" not 60
Example prefect.yaml snippet that works:
deployments:
  - name: my-ecs-deployment
    entrypoint: path/to/flow.py:my_flow
    work_pool:
      name: my-ecs-pool
      job_variables:
        env:
          PREFECT_RUNNER_HEARTBEAT_FREQUENCY: "60"
Equivalent in Python:
(
    my_flow
    .from_source("https://github.com/org/repo", entrypoint="flows/my_flow.py:my_flow")
    .deploy(
        name="ecs-deploy",
        work_pool_name="my-ecs-pool",
        job_variables={
            "env": {"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"}
        },
    )
)
If you prefer to set it for all runs in the ECS work pool, edit the pool’s base job template in the UI and add:
"env": {
  "PREFECT_RUNNER_HEARTBEAT_FREQUENCY": "60"
}
If you’re still seeing the schema error, can you share the exact YAML snippet you’re using? I’ll spot-check the structure.
Refs:
- Customize job variables
- Detect zombie flows (heartbeats)
- Settings reference
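If you build job_variables in Python, one way to avoid the same schema error is to convert every env value to a string before passing it to deploy. This is a minimal sketch; stringify_env is a hypothetical helper, not a Prefect function, and it assumes every value has a sensible str() form.

```python
from typing import Any, Dict, Mapping


def stringify_env(env: Mapping[str, Any]) -> Dict[str, str]:
    """Hypothetical helper: return a copy of env with every value converted to str."""
    return {key: str(value) for key, value in env.items()}


# The integer 60 becomes the string "60", which is what the env schema expects.
job_variables = {"env": stringify_env({"PREFECT_RUNNER_HEARTBEAT_FREQUENCY": 60})}
print(job_variables)  # {'env': {'PREFECT_RUNNER_HEARTBEAT_FREQUENCY': '60'}}
```

The resulting dict can be passed as the job_variables argument shown in the deploy example above. Note that str() turns True into "True", so check booleans and None values yourself before relying on this.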
r
@Marvin how do i know if heartbeat config is working? how do i verify if task is sending heartbeat event?
m
Here’s a simple way to confirm your ECS flow is actually sending heartbeats and that your config is applied.
1) Verify the setting is applied inside the ECS task
- Heartbeat is off by default; it only runs if PREFECT_RUNNER_HEARTBEAT_FREQUENCY is set (≥ 30).
- Easiest check: log the resolved setting from within your flow at startup.
from prefect import flow
from prefect.settings import get_current_settings

@flow
def my_flow():
    s = get_current_settings()
    print(f"Runner heartbeat_frequency = {s.runner.heartbeat_frequency!r}")  # e.g., 60
    ...
- Alternatively, if you can exec or add a startup command in your container, run:
prefect config view --show-defaults --show-sources | grep PREFECT_RUNNER_HEARTBEAT_FREQUENCY
You should see it set from env and its value.
2) Look for heartbeat events in the UI
- Go to your flow run details → Events tab.
- You should see periodic heartbeat events arriving roughly every N seconds (your configured frequency).
- You can also check the global Events page and filter by the flow run ID; search for “heartbeat” in the event feed.
3) Quick end-to-end test (optional)
- Start a long-running flow with heartbeats enabled.
- Kill the ECS task/container mid-run. If heartbeats were flowing, Prefect will stop receiving them and your Automations (see below) can detect this and mark the run as Crashed or notify you.
4) Add an Automation to confirm behavior on missed heartbeats
- Create an Automation that reacts when flow-run heartbeats are missed longer than your interval (e.g., > 2x your frequency) and send a notification or mark the run as Crashed. This both verifies heartbeats and gives you protection against “zombie” runs.
- Docs: Detect zombie flows
Troubleshooting if you don’t see heartbeats
- Ensure the env var is a string in YAML (e.g., "60", not 60) and placed under work_pool -> job_variables -> env.
- Confirm the ECS work pool’s base job template or the deployment’s job_variables actually include the env var that’s reaching the flow container.
- Make sure you’re looking at the right flow run’s Events tab and the time window is correct.
References
- Heartbeats (zombie detection): Detect zombie flows
- Settings reference: Settings reference
- Customize job variables (env in deployments): Customize job variables
If you want, share a link/screenshot of your flow run’s Events tab or your prefect.yaml snippet and I’ll double-check that everything’s wired correctly.
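To make the "missed longer than 2x your frequency" rule of thumb concrete, here is a small standard-library sketch. It only illustrates the timing arithmetic; it is not how Prefect Automations evaluate triggers, and the name heartbeat_is_stale and the tolerance parameter are assumptions introduced for this example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def heartbeat_is_stale(
    last_heartbeat: datetime,
    frequency_seconds: int,
    now: Optional[datetime] = None,
    tolerance: float = 2.0,
) -> bool:
    """Hypothetical check: True if no heartbeat arrived within tolerance * frequency."""
    now = now or datetime.now(timezone.utc)
    allowed_gap = timedelta(seconds=frequency_seconds * tolerance)
    return now - last_heartbeat > allowed_gap


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    # With a 60 second frequency, a 90 second gap is still within 2x the interval.
    print(heartbeat_is_stale(last, 60, now=last + timedelta(seconds=90)))
    # A 121 second gap exceeds 2x the interval, so the run looks like a zombie.
    print(heartbeat_is_stale(last, 60, now=last + timedelta(seconds=121)))
```

You could compare the timestamps you see in the Events tab against this rule by hand to decide what window to use in your Automation.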