Max Eggers
12/12/2023, 11:47 PM
Nate
12/13/2023, 12:26 AM
add an automation that responds to Crashed flows
Nate
12/13/2023, 12:27 AM
Max Eggers
12/13/2023, 12:29 AM
Nate
12/13/2023, 12:29 AM
Max Eggers
12/13/2023, 12:29 AM
Nate
12/13/2023, 12:34 AM
Max Eggers
12/13/2023, 12:53 AM
Max Eggers
12/13/2023, 4:32 PM
Nate
12/13/2023, 5:06 PM
foo, you could have some wrapping dispatcher flow (on a long-lived infra perhaps? otherwise this one might have the same problem 🙂) that runs whenever foo was supposed to, and all it does is call run_deployment, check if it got submitted, and implement logic to handle it when it doesn't
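a minimal sketch of that dispatcher idea, assuming Prefect 2.x. The helper `submit_with_retries`, the deployment name "foo/foo", the 120 second timeout, and the rule "a Crashed state means submission failed" are all illustrative assumptions, not something confirmed in this thread. The retry loop is kept free of Prefect imports so it can be exercised on its own.

```python
import time
from typing import Any, Callable


def submit_with_retries(
    submit: Callable[[], Any],
    submitted: Callable[[Any], bool],
    attempts: int = 3,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call submit() until submitted(run) is true or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        run = submit()
        if submitted(run):
            return run
        # Wait before resubmitting, but not after the final attempt.
        if attempt < attempts:
            sleep(delay_seconds)
    raise RuntimeError(f"run was not submitted after {attempts} attempts")


def build_dispatcher(deployment_name: str = "foo/foo"):
    """Wire the helper to Prefect 2.x (imported lazily, only when called)."""
    from prefect import flow
    from prefect.deployments import run_deployment

    def submit():
        # timeout bounds how long we wait before inspecting the run's state.
        return run_deployment(name=deployment_name, timeout=120)

    def submitted(flow_run) -> bool:
        state = flow_run.state
        # Assumption: a failed job submission surfaces as a Crashed state.
        return state is not None and not state.is_crashed()

    @flow
    def dispatcher():
        return submit_with_retries(submit, submitted)

    return dispatcher
```

`build_dispatcher()` returns a flow that could be deployed on the same schedule foo would have had. A run that crashes for reasons other than submission would also be resubmitted under this rule, so the `submitted` check is the part to adapt.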
depending on the scale of your submission problem, this might be overkill, but it would be an explicit way to have full control over submission to infrastructure / retries of that
Max Eggers
12/13/2023, 5:46 PM
Nate
12/13/2023, 7:31 PM
"Would Prefect be open to a PR from me adding retries in the k8s worker code?"
we love to see contributions! feel free to open a PR and implementation details can be discussed there 👍
Max Eggers
12/14/2023, 10:11 PM
Nate
12/14/2023, 10:13 PM
Run a deployment
Max Eggers
12/14/2023, 10:14 PM
Max Eggers
12/14/2023, 10:15 PM
Max Eggers
12/14/2023, 10:15 PM
Max Eggers
12/14/2023, 10:16 PM
Nate
12/14/2023, 10:17 PM
if job submission fails, there's nothing to retry, since flow retries are just talking about the flow's process basically, which never started if job submission failed
Max Eggers
12/14/2023, 10:18 PM
Max Eggers
12/14/2023, 10:18 PM
Max Eggers
12/14/2023, 10:18 PM
Nate
12/14/2023, 10:18 PM