Hi, is there a way to iteratively run a flow for batch processing over a large dataset?
Here’s the flow I’m trying to achieve:
1. Start flow with 3 parameters: start timestamp (ts_start), batch size (n), end timestamp (ts_end)
2. Task extracts batch of n documents starting from ts_start
3. Cache nth document’s timestamp (ts_last)
4. Go back to (1) passing in ts_last as ts_start if ts_last < ts_end
In other words, I want to iteratively process batches until we reach a document whose timestamp is at or after ts_end.
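The steps above can be sketched in plain Python, independent of any orchestrator. This is only an illustration of the intended control flow: `fetch_batch`, `process_in_batches`, and the `(timestamp, payload)` document shape are hypothetical names, and the sketch assumes timestamps are unique and that each batch starts strictly after ts_start, so the nth document is not extracted twice.

```python
from typing import List, Tuple

# Hypothetical document shape: (timestamp, payload).
Doc = Tuple[int, str]


def fetch_batch(docs: List[Doc], ts_start: int, n: int) -> List[Doc]:
    """Return up to n documents with timestamp strictly after ts_start.

    Assumption: an exclusive lower bound, so passing ts_last back in
    as ts_start does not return the same document again.
    """
    later = sorted(d for d in docs if d[0] > ts_start)
    return later[:n]


def process_in_batches(
    docs: List[Doc], ts_start: int, n: int, ts_end: int
) -> List[List[Doc]]:
    """Extract batches until the last timestamp seen reaches ts_end."""
    batches: List[List[Doc]] = []
    while ts_start < ts_end:
        batch = fetch_batch(docs, ts_start, n)
        if not batch:
            # No more documents: stop even though ts_end was not reached.
            break
        batches.append(batch)
        # Step 3: remember the nth document's timestamp (ts_last)
        # and use it as the next ts_start (step 4).
        ts_start = batch[-1][0]
    return batches


if __name__ == "__main__":
    sample = [(t, f"doc{t}") for t in range(1, 11)]
    for batch in process_in_batches(sample, ts_start=0, n=3, ts_end=7):
        print([ts for ts, _ in batch])
```

Note that, as in the steps above, the stop condition is only checked after a full batch, so the final batch may contain documents with timestamps past ts_end.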
Anna Geller
12/20/2021, 9:51 PM
@Vince perhaps task looping is what you are looking for?