Hey all, we're running Prefect server v3.0 on AWS EKS and we're having issues with how Karpenter reshuffles nodes: the Prefect server and/or the pods running flows for deployment runs get restarted in new pods.
Related to this, we were wondering whether Prefect server is able to shut down gracefully?
Any tips on how to keep Prefect server running stably and avoid the following error would be highly appreciated:
Job reached backoff limit
Alexander Azzam
01/29/2025, 9:40 PM
@Leon Kozlowski is a fellow Karpenter guru
Leon Kozlowski
01/29/2025, 9:51 PM
I would suggest adding the do-not-disrupt annotation
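For anyone wondering what that looks like in practice, here is a rough sketch in Python. It assumes the annotation meant is `karpenter.sh/do-not-disrupt: "true"` (the key used by recent Karpenter releases; older ones used a different key) and that it needs to end up on the pod template of the Kubernetes Job that runs the flow. The helper name `add_do_not_disrupt` and the sample manifest are made up for illustration and are not part of Prefect or Karpenter.

```python
import copy

# Assumed annotation key; check it against the Karpenter version you run.
DO_NOT_DISRUPT = "karpenter.sh/do-not-disrupt"


def add_do_not_disrupt(job_manifest: dict) -> dict:
    """Return a copy of a Job manifest whose pod template carries the annotation.

    Hypothetical helper: it only edits a plain dict and does not call any
    Prefect or Kubernetes API.
    """
    patched = copy.deepcopy(job_manifest)
    pod_template = patched.setdefault("spec", {}).setdefault("template", {})
    metadata = pod_template.setdefault("metadata", {})
    # The annotation value must be the string "true", not a boolean.
    metadata.setdefault("annotations", {})[DO_NOT_DISRUPT] = "true"
    return patched


# Minimal, made-up Job manifest for demonstration purposes only.
example_manifest = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {"backoffLimit": 0, "template": {"spec": {"restartPolicy": "Never"}}},
}

patched_manifest = add_do_not_disrupt(example_manifest)
print(patched_manifest["spec"]["template"]["metadata"]["annotations"])
```

The same nested structure could be written directly into the job manifest of a Kubernetes work pool or a Helm values file; the dict form is used here only to show where the annotation sits. Note that this should keep Karpenter from voluntarily disrupting the node while the pod runs, but it would not protect against involuntary interruptions such as spot reclamation.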