Anders Segerberg04/07/2022, 5:12 PM
in batches. As you said, though, the items are independent of each other, so I would consider splitting the 10k into batches of 1000, for example, and using task looping to run one batch at a time. The loop can keep track of the total failures. The problem is that looping makes the run sequential.
3. So I think what we are left with is some combination of subflows. Could the main flow fire off the batches sequentially as subflows, with the DaskExecutor on the subflows and the LocalExecutor on the main flow? Then you can use task looping in the main flow to submit the batches and keep track of the overall run there. I think something like this might work.
4. But all of this loses caching if we use looping. With mapping, each task can be cached individually, but with looping it can't. One of the current flaws of Prefect 1 is that if you have a chain of tasks A -> B -> C where A and B succeed but C fails, you can't retry A and B. This is one of the reasons Orion (Prefect 2.0) has no DAG. So you may need to combine tasks into one so that they share the same cache. When you do that, you can use Prefect caching or targets to avoid re-running the same code.
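To make the batching idea concrete, here is a rough sketch in plain Python (no Prefect calls) of splitting 10k items into batches of 1000 and keeping a running failure count across batches. The names `make_batches`, `run_batch`, `run_in_batches`, and `flaky` are made up for illustration, and the `max_failures` cutoff is an assumption, not something from the thread. In Prefect 1 the outer `for` loop would roughly correspond to task looping in the main flow, and `run_batch` to whatever a subflow does with one batch.

```python
from typing import Callable, Iterator, Sequence, Tuple


def make_batches(items: Sequence[int], batch_size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_batch(batch: Sequence[int], work: Callable[[int], None]) -> int:
    """Run `work` on every item in the batch and return the number of failures.

    Items are independent, so one failure does not stop the rest of the batch.
    (Hypothetical stand-in for a subflow; here it simply runs serially.)
    """
    failures = 0
    for item in batch:
        try:
            work(item)
        except Exception:
            failures += 1
    return failures


def run_in_batches(
    items: Sequence[int],
    work: Callable[[int], None],
    batch_size: int = 1000,
    max_failures: int = 50,  # assumed cutoff, not from the thread
) -> Tuple[int, int]:
    """Run batches one after another, tracking total failures across batches.

    Returns (batches_run, total_failures). Stops early once the running
    failure count exceeds `max_failures`.
    """
    total_failures = 0
    batches_run = 0
    for batch in make_batches(items, batch_size):
        total_failures += run_batch(batch, work)
        batches_run += 1
        if total_failures > max_failures:
            break
    return batches_run, total_failures


def flaky(item: int) -> None:
    """Hypothetical unit of work that fails on every 100th item."""
    if item % 100 == 0:
        raise ValueError(f"item {item} failed")


if __name__ == "__main__":
    print(run_in_batches(list(range(10_000)), flaky))
```

The batches run sequentially here, which is the same trade-off mentioned in the looping suggestion: the failure count is easy to track, but parallelism only exists inside a batch if `run_batch` itself is parallelized.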
Anders Segerberg04/07/2022, 5:39 PM
Anders Segerberg04/07/2022, 5:42 PM
Anders Segerberg04/07/2022, 5:44 PM
Anders Segerberg04/07/2022, 5:47 PM