# ask-community
t
Hello community, I am chasing an intermittent bug, possibly related more to Kubernetes than to Prefect. Currently, a lot of our Prefect flows map out to several k8s jobs, do some moderately heavy computation for ~1 hour, and then shut down. Our load is very bursty, so we might scale from 1-2 nodes up to ~50 and back down to 1-2 nodes within the span of an hour. We are using Azure Kubernetes Service, and I have tried being quite aggressive with the k8s autoscaler, setting scale-down-delay-after-add to 1m and similarly scale-down-unneeded-time to 1m. The issue we are facing is that sometimes the task pods fail, seemingly without reason, and the logs are quite unhelpful. My hunch is that it has something to do with scaling of the cluster, potentially destroying pods or losing networking between the flow and task pod in the process. We are already setting cluster-autoscaler.kubernetes.io/safe-to-evict: false on all pods, so eviction shouldn't be the issue. Has anyone else had any experience with k8s autoscaler settings leading to weird, intermittent task failures?
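For reference, that annotation goes on the pod metadata (not the node). A minimal sketch of how it might look in a pod template, assuming a plain Kubernetes pod spec rather than whatever template Prefect actually generates (names and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: prefect-task-pod            # hypothetical name, for illustration only
  annotations:
    # Tell the cluster-autoscaler not to evict this pod when scaling down
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: task
      image: my-registry/prefect-task:latest   # placeholder image
```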
k
I think you are right that this is autoscaling-related, but what does the task execution look like? Does it fail midway through mapped tasks?
t
The screenshot is actually a little misleading since the failure there is of a flow with a single task running, so no mapping in that particular instance. I'm not sure there is a clear pattern for when they fail in the mapping process, to be honest. Do you guys have any "golden standard" autoscaler settings that you suggest/use internally?
k
I will ask around
m
Hello Thomas!
"<http://cluster-autoscaler.kubernetes.io/safe-to-evict|cluster-autoscaler.kubernetes.io/safe-to-evict>": "false"
should resolve the issue and prevent pods from being evicted during a scale-down event, so it feels like the issue might be something else. The most common issue I've seen is pods without CPU and memory requests being scheduled on nodes without enough resources. Eventually those pods get killed for memory violations, or they run with continuous CPU throttling. If pods were killed, you can usually see it in the Kubernetes events for that namespace. Did you try increasing the scale-down delay to >1m? Also, scale-down-utilization-threshold could be tweaked as well (the default is 0.5).
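As a sketch of the requests/limits point above, assuming a plain container spec (image and values are placeholders): explicit requests let the scheduler place the pod only on nodes with enough headroom, while limits bound what it can consume.

```yaml
spec:
  containers:
    - name: task
      image: my-registry/prefect-task:latest   # placeholder image
      resources:
        requests:
          cpu: "1"            # scheduler only places the pod where this much CPU is free
          memory: 4Gi
        limits:
          cpu: "2"            # exceeding the CPU limit causes throttling
          memory: 4Gi         # exceeding the memory limit gets the container OOMKilled
```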
t
Hey Mariia! All of the pods we are running have both requests and limits set, and I'm not seeing any OOMKilled pods 🤔 I've set the scale-down delays, both scale-down-unneeded-time and scale-down-delay-after-add, back to 10m (the AKS default), but I'm still seeing these weird drops. I might have a look at scale-down-utilization-threshold; in my head that should be lowered to something like ~0.2-0.3, would you agree?
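For context, these settings correspond to cluster-autoscaler flags. A sketch of how they would appear on a self-managed cluster-autoscaler deployment; on AKS the managed autoscaler exposes the equivalent keys through the cluster autoscaler profile instead, so treat the exact mechanism here as an assumption:

```yaml
# Fragment of a self-managed cluster-autoscaler Deployment (pod spec level)
spec:
  containers:
    - name: cluster-autoscaler
      image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.3   # placeholder tag
      command:
        - ./cluster-autoscaler
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        - --scale-down-utilization-threshold=0.3   # lowered from the 0.5 default, as discussed
```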