We’re getting into operations using Prefect and wondering if there is a way to re-run a failed (or perhaps not failed) subtree of the graph while the rest of the DAG is still running. We have a DAG that might run for longer than 24 hours, and we’ve encountered a common situation where one node in the graph fails for exogenous reasons. We’d like to fix the exogenous problem and then re-run that node so the dependent nodes can move on, without interrupting the nodes that are already running.
04/08/2022, 4:22 PM
I think there are two issues here, right?
1. Re-running failed tasks
2. If there is a subdag A->B->C, and A and B succeed but C fails, can you run A and B again?
If you set a retry delay above 10 minutes for a task, it will be queued again. Do retries not help there?
For the sub-DAG, you can’t restart those. It’s a limitation of Prefect 1 that is addressed in Orion. You’d need to combine the tasks into one for the retry to apply to all of them (but then they aren’t separate tasks anymore).
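To illustrate folding several tasks into one so that a retry covers all of them, here is a minimal plain-Python sketch. It does not use the Prefect API: `retry`, `step_a`, `step_b`, `step_c`, and `combined` are hypothetical names, and the decorator is only a stand-in for a task-level retry setting.

```python
import functools
import time

calls = []  # records the order in which steps execute


def retry(max_retries, delay_seconds=0.0):
    """Re-run the wrapped function up to max_retries extra times on failure."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_retries:
                        raise  # out of retries, surface the error
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


def step_a():
    calls.append("A")
    return 1


def step_b(x):
    calls.append("B")
    return x + 1


def step_c(x):
    calls.append("C")
    # Fail on the first attempt only, to mimic a transient external problem.
    if calls.count("C") == 1:
        raise RuntimeError("external dependency unavailable")
    return x * 10


@retry(max_retries=2)
def combined():
    # A, B and C run as one unit, so a failure in C re-runs A and B too.
    return step_c(step_b(step_a()))


result = combined()
```

The trade-off is the one mentioned above: A and B are re-run along with C, but they no longer show up as separate tasks with their own states.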
04/08/2022, 4:32 PM
thanks @Kevin Kho
Yeah, a simple retry won’t work here because someone needs to take action outside the flow before the task can be retried.
I got the impression that this subtree re-run was one of the things addressed in Orion. I’m going to have a look.
If I’m reading the documentation correctly (which I just started), it looks like my use case would involve inter-flow dependencies. Is that doable? I guess each subtree would be a flow, and then the flows would need to depend on each other.
Feel free to point me to docs or an illustrative example too.
04/08/2022, 5:05 PM
I’ll need to look into whether it’s already possible, but this exact use case was one of the motivations. I’m thinking about what the options are for Prefect 1, though.
What are your thoughts on a later flow you can trigger to fetch the failed ones and re-submit them?
Because that would let you run the sub-DAG from scratch for the failed items, as opposed to relying on the retry mechanisms.
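As a rough sketch of the "fetch the failed ones" step, the selection logic could look like the plain Python below. This is not Prefect API: `tasks_to_rerun`, the `downstream` mapping, and the state strings are hypothetical, and how you would actually read task states and re-submit work depends on your setup.

```python
from collections import deque


def tasks_to_rerun(downstream, states):
    """Return the failed tasks plus everything downstream of them, sorted by name."""
    queue = deque(name for name, state in states.items() if state == "Failed")
    selected = set(queue)
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return sorted(selected)


# Hypothetical graph: A -> B -> C and A -> D -> E
downstream = {"A": ["B", "D"], "B": ["C"], "D": ["E"]}
states = {
    "A": "Success",
    "B": "Failed",
    "C": "Pending",
    "D": "Running",
    "E": "Pending",
}

rerun = tasks_to_rerun(downstream, states)
print(rerun)
```

In this example only B and its dependent C are selected, so the D -> E branch that is still running is left alone.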
04/08/2022, 8:43 PM
sorry, to be clear, I’m wondering if inter-flow dependencies are supported in Orion
04/08/2022, 8:52 PM
Inter-flow meaning task 1 in Flow B has an upstream task 10 in Flow A, for example? I don’t think this is supported.