Hello Everyone,
I wanted to ask if there is a mechanism for batch processing in Prefect. I have found that child/parent flows can simulate that behavior, but I'm interested in whether there are other approaches.
Kevin Kho
09/24/2021, 6:41 PM
Hey @Irakli Gugushvili, could you elaborate on what you mean by batch processing? Prefect is definitely a batch orchestrator, compared to other platforms that work with data streams.
Parent and child flows are one way to do batch processing: the parent gets a list of batches to process and then kicks off a child flow for each one.
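A minimal sketch of that parent/child pattern, in plain Python rather than Prefect code. The names `make_batches`, `child_flow`, and `parent_flow` are hypothetical stand-ins for illustration; in a real deployment the call to `child_flow` would be replaced by whatever mechanism starts a child flow run, which is not shown here.

```python
from typing import Callable, Iterator, List, Sequence


def make_batches(items: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def child_flow(batch: List[int]) -> int:
    """Hypothetical stand-in for one child flow run; here it just sums the batch."""
    return sum(batch)


def parent_flow(
    items: Sequence[int],
    batch_size: int,
    child: Callable[[List[int]], int] = child_flow,
) -> List[int]:
    """Split the work into batches and run the child once per batch, in order."""
    results = []
    for batch in make_batches(items, batch_size):
        # Each iteration finishes before the next batch starts.
        results.append(child(batch))
    return results


if __name__ == "__main__":
    print(parent_flow(list(range(10)), batch_size=4))  # [6, 22, 17]
```

The loop runs the batches sequentially, which matches the "finish one batch, then start the next" case; running child flows in parallel would be a different design choice.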
If you are talking about incremental pulls from a database and then processing only the deltas, the KV Store is our mechanism for that: you can persist metadata such as the last processed timestamp and then process only the new data.
Irakli Gugushvili
09/24/2021, 6:45 PM
Thank you for the reply.
I mean I have one flow that I want to run, but I don't want to run it on the full data. I want to run it on the first batch and, when it's finished, rerun the flow on the next batch.
I will check the KV Store.
Kevin Kho
09/24/2021, 6:46 PM
See the KV Store docs. You can use it to persist a watermark of what data has been processed.
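A rough sketch of the watermark idea, assuming integer timestamps and an in-memory dict standing in for the KV Store. The helpers `get_watermark`, `set_watermark`, and `run_next_batch`, and the key name `last_processed_ts`, are hypothetical and chosen for illustration; in Prefect the two helpers would wrap the KV Store's get and set calls instead of a dict.

```python
from typing import Dict, List, Tuple

# In-memory stand-in for the KV Store (assumption: real code would persist this remotely).
kv_store: Dict[str, int] = {}


def get_watermark(key: str, default: int = 0) -> int:
    """Return the last processed timestamp, or `default` if nothing is stored yet."""
    return kv_store.get(key, default)


def set_watermark(key: str, value: int) -> None:
    """Persist the new watermark."""
    kv_store[key] = value


def run_next_batch(
    rows: List[Tuple[int, str]],
    batch_size: int,
    key: str = "last_processed_ts",
) -> List[str]:
    """Process the next `batch_size` rows newer than the watermark, then advance it."""
    watermark = get_watermark(key)
    pending = sorted(row for row in rows if row[0] > watermark)
    batch = pending[:batch_size]
    if not batch:
        return []  # nothing new to process
    processed = [payload.upper() for _, payload in batch]  # placeholder "work"
    # Advance the watermark only after the batch has been processed.
    set_watermark(key, batch[-1][0])
    return processed


if __name__ == "__main__":
    data = [(1, "a"), (2, "b"), (3, "c")]
    print(run_next_batch(data, 2))  # ['A', 'B']
    print(run_next_batch(data, 2))  # ['C']
    print(run_next_batch(data, 2))  # []
```

Each call behaves like one flow run: it reads the watermark, handles one batch, and stores the new watermark, so rerunning the same flow picks up where the previous run stopped.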