Prefect Community #ask-community

Peter Peter

11/07/2024, 12:59 PM

Hello, we are using self-hosted Prefect (2.x) on AWS EKS and want to use spot instances to reduce costs. We are trying to figure out how to handle spot instance interruptions. Ideally we would like to resubmit/rerun the flow. How are others handling interrupts? I have tried the following without success:
1. on_crashed and on_failure hooks.
2. Listening for SIGTERM.
Neither of the two actually triggers the target function.

Alessio Civitillo

11/10/2024, 1:05 PM

Have you considered AKS with autoscaling?

Peter Peter

11/11/2024, 4:06 PM

Thanks for the response. We are an AWS shop, and I am sure spot instances get interrupted in Azure as well. We are using autoscaling with EKS and have no issues with that. The issue is that when a spot instance gets interrupted, the flow runs get terminated. Originally, we thought Prefect would rerun them when spot instances are interrupted, but this is not the case. This led me to look into SIGTERM/SIGINT to try to handle the interrupts. Not sure how others are handling this case.
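
For reference, a SIGTERM listener of the kind mentioned here might look roughly like the sketch below. It uses only the Python standard library; the names `shutdown_requested`, `_handle_sigterm`, and `install_sigterm_handler` are hypothetical and not part of Prefect. Whether such a handler actually fires inside a flow run depends on how the process is launched and whether the signal reaches it, which this thread does not settle.

```python
import signal
import threading

# Hypothetical flag that long-running code could poll to learn
# that a shutdown was requested.
shutdown_requested = threading.Event()


def _handle_sigterm(signum, frame):
    """Record that SIGTERM arrived; keep the handler itself minimal."""
    shutdown_requested.set()


def install_sigterm_handler():
    """Register the handler and return the previous one.

    signal.signal() can only be called from the main thread
    of the main interpreter.
    """
    return signal.signal(signal.SIGTERM, _handle_sigterm)
```

Code would call `install_sigterm_handler()` once at startup and check `shutdown_requested.is_set()` at safe points before deciding how to wind down.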

Alessio Civitillo

11/11/2024, 4:57 PM

My mistake, somehow I translated spot instances to AWS EC2 instances. We also use AWS, and I agree that the problem would also come up in Azure. We normally had interruptions on big events like a VM being down or a DNS issue; in those cases we did the reruns manually, but your case is different.
