Saurabh Indoria

12/07/2021, 11:38 AM
One of the flow runs encountered this error. Usually the runs succeed and this was an anomaly. Can someone share some details about this error? How can we prevent this from happening and why didn't it retry?
Pod prefect-job-94453cb1-2sw9q failed.
	Container 'flow' state: terminated
		Exit Code:: 139
		Reason: Error

Anna Geller

12/07/2021, 11:45 AM
@Saurabh Indoria looks like your pod crashed for some reason and flow’s heartbeat got lost. It could be that your pod ran out of memory. To prevent this from happening and mitigate the issue you can: 1. Track and allocate enough CPU and memory resources to your pods 2. You can establish some extra logging on your Kubernetes cluster to be able to track why this pod crashed. 3. You can set the flow’s heartbeat mode to thread if you have some long-running flows:
from prefect.run_configs import KubernetesRun
flow.run_config = KubernetesRun(env={"PREFECT__CLOUD__HEARTBEAT_MODE": "thread"})
🙌 1

Saurabh Indoria

12/07/2021, 11:47 AM
Thank you for the quick response @Anna Geller 🙌
👍 1