We've been having intermittent communication issues between Azure Kubernetes Service (AKS) and Prefect Cloud. Each time, we see one of these 3 things happening:
* Prefect Cloud responding with a 500 error
    * How do we get more details to help solve this?
* AKS Worker stops picking up Prefect Cloud scheduled flows.
    * The only solution to this that has worked is to manually restart the worker.
* Prefect Cloud never recognizes that a pod/flow has completed, and the run continues to show as Running until it is manually cancelled.
    * It seems like there is a `timeout_seconds` variable for flows that I could try in this case.
Any help would be appreciated.