Hi all~ I made a flow in Prefect 1.0 that reads more than 1 million rows from the database and processes them.
When I was testing whether the flow ran correctly, I didn't test it with such a large amount of data, so it worked fine. But when I run the flow against the full dataset, a broken pipeline error occurs.
This may be due to the max_allowed_packet setting of the RDS instance we're using. I also registered the flow to Prefect Cloud and tried to run it there, but again there was too much data, so I couldn't get past the first task.
Anyway, what I'd like to ask for help with is how to make the flow process this large amount of data. Is it possible to set the number of rows a task processes at a time, like setting the batch size in deep learning? Or should I fix the amount of data processed in one flow run, end the flow once that amount is done, and restart the same flow over and over until all the data is processed? For example, if the total is 1 million rows and I fix each run at 10,000 rows, the flow would run 100 times.
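To make the batching idea concrete, here is a minimal sketch in plain Python (not Prefect-specific). `fetch_batch` and `process_rows` are hypothetical stand-ins for the actual DB query and processing logic; the assumption is that the query supports LIMIT/OFFSET-style paging:

```python
def batch_offsets(total_rows, batch_size):
    """Yield (offset, limit) pairs covering all rows in fixed-size batches."""
    for offset in range(0, total_rows, batch_size):
        yield offset, min(batch_size, total_rows - offset)

def run_in_batches(total_rows, batch_size, fetch_batch, process_rows):
    """Process the table one batch at a time, so no single query or task
    ever pulls more than batch_size rows at once."""
    processed = 0
    for offset, limit in batch_offsets(total_rows, batch_size):
        # fetch_batch is hypothetical, e.g. "SELECT ... LIMIT %s OFFSET %s"
        rows = fetch_batch(offset, limit)
        process_rows(rows)
        processed += len(rows)
    return processed
```

In Prefect 1.0 the same idea could map a processing task over the list of `(offset, limit)` pairs so each mapped task run only touches one batch, rather than restarting the whole flow by hand.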
Any ideas for handling such heavy data are completely welcome with open arms.🤗