Joshua Greenhalgh

05/29/2022, 5:46 PM
Hi, I am starting to see this at random on flow runs:
12:09:39 ERROR prefect-server.ZombieKiller.TaskRun No heartbeat detected from the remote task; marking the run as failed.
12:22:53 INFO prefect-server.Lazarus.FlowRun Rescheduled by a Lazarus process. This is attempt 1.
I am running on GKE Autopilot. Any ideas on how to debug the cause of these events and mitigate their occurrence?
It happens rarely, but once or twice every few days.
I see this has been written up at https://discourse.prefect.io/t/flow-is-failing-with-an-error-message-no-heartbeat-detected-from-the-remote-task/79 - from that description, I should try setting the heartbeat mode to thread, but that may not solve it?
Kevin Kho

05/29/2022, 7:34 PM
Yes, that’s exactly right. Thread mode will be more stable, but you may be running into cases where the task is dying due to memory issues.
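For reference, a minimal sketch of how the heartbeat mode could be switched to thread, assuming a Prefect 1.x flow that uses a `KubernetesRun` run config. The helper `heartbeat_env` is hypothetical (it is not part of Prefect), and the memory values in the usage comment are placeholders, not recommendations from this thread.

```python
# Hypothetical helper, not part of Prefect: builds the environment override
# that selects the heartbeat mode for a Prefect 1.x flow run.
VALID_HEARTBEAT_MODES = ("process", "thread", "off")


def heartbeat_env(mode: str = "thread") -> dict:
    """Return the env var mapping that sets the heartbeat mode."""
    if mode not in VALID_HEARTBEAT_MODES:
        raise ValueError(
            f"unknown heartbeat mode {mode!r}; expected one of {VALID_HEARTBEAT_MODES}"
        )
    return {"PREFECT__CLOUD__HEARTBEAT_MODE": mode}


env = heartbeat_env("thread")

# Assumed usage with Prefect 1.x installed (memory values are placeholders):
#
#   from prefect.run_configs import KubernetesRun
#   flow.run_config = KubernetesRun(
#       env=env,
#       memory_request="2Gi",
#       memory_limit="2Gi",
#   )
```

Setting explicit memory requests and limits alongside the heartbeat change may help tell the two causes apart: if the pod is being killed for exceeding memory, the failures would likely persist in thread mode too.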