Shelley Kim
01/08/2025, 4:45 AM
Marvin
01/08/2025, 4:45 AM
CRASHED state that can be used to handle infrastructure interruptions. You might need to implement custom logic to transition the flow to a CRASHED state when such events are detected.
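2. Pod Resilience: Use Kubernetes features such as PodDisruptionBudgets to limit voluntary evictions (for example, during node drains), and node affinity to schedule flow pods onto more stable nodes. These reduce the likelihood of pods being evicted, but they cannot prevent an unplanned node reboot.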
3. Flow Recovery: Implement a recovery mechanism around your flow. For example, periodically check the state of each flow run and re-trigger any run that has stayed in the "Running" state for too long without making progress.
4. State Transition Logic: Review and strengthen the state transition logic in your flow so that an unexpected interruption always ends in a terminal state rather than leaving the run stuck in "Running".
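To make items 3 and 4 more concrete, here is a minimal sketch of the "stuck in Running" check. It is not Prefect's API: `RunRecord`, `last_heartbeat`, `find_stale_runs`, `mark_crashed`, and the 30-minute threshold are all hypothetical names and values chosen for illustration. In a real deployment you would read the runs from the Prefect API and apply the state change through it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class RunRecord:
    """Hypothetical stand-in for a flow run record (not a Prefect class)."""
    run_id: str
    state: str                # e.g. "RUNNING", "COMPLETED", "CRASHED"
    last_heartbeat: datetime  # assumed field: last time the run reported progress


def find_stale_runs(runs, now, max_silence=timedelta(minutes=30)):
    """Return runs still marked RUNNING that have been silent longer than max_silence."""
    return [
        r for r in runs
        if r.state == "RUNNING" and now - r.last_heartbeat > max_silence
    ]


def mark_crashed(runs, now, max_silence=timedelta(minutes=30)):
    """Move stale runs to CRASHED and return their ids so they can be re-triggered."""
    stale = find_stale_runs(runs, now, max_silence)
    for r in stale:
        r.state = "CRASHED"  # terminal state, so the run no longer looks active
    return [r.run_id for r in stale]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    runs = [
        RunRecord("evicted-pod-run", "RUNNING", now - timedelta(hours=2)),
        RunRecord("healthy-run", "RUNNING", now - timedelta(minutes=5)),
    ]
    # Only the run that has been silent for two hours is flagged.
    print(mark_crashed(runs, now))
```

You could run a check like this on a schedule (for example, from a small monitoring flow) and re-submit the returned run ids. The threshold should be longer than the longest quiet period a healthy run can have, otherwise slow but healthy runs would be marked as crashed.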
For more detailed guidance on running flows on Kubernetes, you can refer to the Prefect documentation on Kubernetes.
Additionally, you can explore Prefect's state management and how it handles various terminal states in the state management documentation.