We have some Prefect flows that sometimes need to run for many hours. For particularly long-running flows, once a flow has been running for more than 12 hours, we often see it fail before it completes -- the Prefect UI shows its last state message as “`Unexpected error: CancelledError()`”. This isn’t caused by the code we’ve written to launch or monitor flows. It appears to result from an action that Prefect (or Dask?) takes to automatically cancel long-running flows. However, I don’t see anything in the Prefect or Dask docs indicating that this is expected behavior, or how it could be controlled (e.g., disabled, or the allowable duration increased).
Can anybody provide any guidance on how to deal with flows failing with this `CancelledError`? Any clues on how we can configure Prefect or Dask to allow flows to run past the 12-hour mark?