Hi, another Prefect + K8S question here :slightly_...
# ask-community
t
Hi, another Prefect + K8S question here 🙂 Like the previous post, we're frequently seeing the message
No heartbeat detected...
. Usually it happens in 2 situations:
• the pod that runs the tasks gets evicted / OOM killed
• the pod was running on a preemptible node that gets removed and replaced.
Is there something on the k8s agent, the k8s job specification, or something else to configure that would allow k8s to reschedule the job and let Prefect know about it, so that the flow would continue?
k
I am not sure
restartPolicy
will help here. Maybe use a DaskExecutor cuz Dask will restart the workers and retry the work?
t
yes
restartPolicy
is not helping 😕 Never used the DaskExecutor yet, I'll try it, thanks!