jack
12/03/2021, 6:20 PM
No heartbeat detected from the remote task; marking the run as failed.
For more than 20 minutes after that log message, fetching the flow run state from Prefect Cloud still shows <Running: "Running flow.">
Ideally, as soon as the flow run is marked as failed, the state from Prefect Cloud would say Failed. Suggestions?
Kevin Kho
12/03/2021, 6:26 PM
jack
12/03/2021, 6:42 PM
Kevin Kho
12/03/2021, 6:50 PM
jack
12/03/2021, 7:34 PM
Kevin Kho
12/03/2021, 7:38 PM
jack
12/03/2021, 7:47 PM
Kevin Kho
12/03/2021, 7:56 PM
towel, the container with the Zombie Killer service?
jack
12/05/2021, 5:13 AM
Kevin Kho
12/05/2021, 6:36 PM
prefect server start. You would look for the container logs
jack
12/05/2021, 7:20 PM
Kevin Kho
12/05/2021, 7:54 PM
Anna Geller
12/06/2021, 9:50 AM
"If we pass in different arguments to the flow (such that it does not consume so much memory), the flow succeeds."
It looks like your flow runs are running out of memory, which causes the flow's heartbeats to be lost. You noticed correctly that this doesn't happen when you assign more memory. Especially when using ECS, I would definitely try bumping up the memory on your flow's ECS task definition or run configuration.
To explain why this behavior happens: flow heartbeats signal to Prefect Cloud that your flow is alive. If Prefect didn't have heartbeats, flows that lose communication and die would permanently be shown as Running in the UI (this is what you experienced). Since your ECS container dies due to memory issues, the flow heartbeats die with it, and Prefect has no way of telling whether this flow run ultimately failed, succeeded, or was manually cancelled.
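To make the heartbeat idea above concrete, here is a minimal sketch of heartbeat-based zombie detection. It is a toy model, not Prefect's actual implementation: the `HeartbeatMonitor` class, its method names, and the 120-second threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Toy heartbeat tracker (hypothetical; not Prefect's real code)."""

    timeout_seconds: float  # assumed threshold, chosen only for this example
    last_seen: Dict[str, float] = field(default_factory=dict)
    states: Dict[str, str] = field(default_factory=dict)

    def start(self, run_id: str, now: float) -> None:
        # A run begins in Running and counts as having just sent a heartbeat.
        self.states[run_id] = "Running"
        self.last_seen[run_id] = now

    def heartbeat(self, run_id: str, now: float) -> None:
        # Only running flows refresh their "last seen" timestamp.
        if self.states.get(run_id) == "Running":
            self.last_seen[run_id] = now

    def reap(self, now: float) -> List[str]:
        # Mark every Running flow whose heartbeat is too old as Failed.
        failed = []
        for run_id, state in self.states.items():
            stale = now - self.last_seen[run_id] > self.timeout_seconds
            if state == "Running" and stale:
                self.states[run_id] = "Failed"
                failed.append(run_id)
        return failed


monitor = HeartbeatMonitor(timeout_seconds=120)
monitor.start("healthy-run", now=0)
monitor.start("oom-run", now=0)

# The healthy run keeps reporting; the OOM-killed container sends nothing.
monitor.heartbeat("healthy-run", now=100)

failed_runs = monitor.reap(now=150)
```

In this sketch a run only leaves Running when the reaping step executes. Without that step, the run whose container died would stay in Running indefinitely, because nothing else ever reports on it.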
jack
12/06/2021, 4:12 PM
No heartbeat detected from the remote task; marking the run as failed
that when we query Prefect for the flow state, it would also say the flow has failed. So far that is not happening.
Anna Geller
12/06/2021, 4:22 PM
jack
12/06/2021, 4:26 PM
No heartbeat detected from the remote task; marking the run as failed
But the run is not actually marked as failed. Prefect still shows the flow run with state Running.
Kevin Kho
12/06/2021, 4:30 PM
jack
12/06/2021, 4:31 PM
state_code and posix_timestamp shown in the above screenshot are Parameters. actual_work is the only task
Kevin Kho
12/06/2021, 4:59 PM
jack
12/06/2021, 5:05 PM