Hi all,
I'm struggling to set up a use case with Serverless Push Workers in Azure. My goal is to process data in batches: extracting, transforming, and loading it to its destination. For this, I have a scheduler flow and a runner flow. I trigger the scheduler flow manually; it analyses the data, splits it into batches, and uses the Prefect API to start a runner flow run for each batch.
The challenge I face is limiting the number of runner flow runs deployed to Azure Container Instances (ACI) at the same time. Previously I used VMs to run Prefect workers, and there it was easy: set up a worker on each of X VMs and X flows would execute in parallel. But without a concurrency limit on Serverless Push Workers, all flow runs start at once, which overwhelms the data source/destination and exceeds my ACI quota on Azure. I looked into task concurrency limits and global concurrency limits, but I don't want to set the limit in the code, because the containers would then be started anyway and sit idle while waiting for an active slot.
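To make the setup concrete, here is a simplified sketch of what the scheduler does today. The names `split_into_batches`, `schedule`, and `trigger_runner` are placeholders I made up for this post, not my real code and not Prefect functions; in the real flow, `trigger_runner` would be the Prefect API call that starts one runner flow run.

```python
from typing import Callable, List, Sequence


def split_into_batches(items: Sequence[int], batch_size: int) -> List[Sequence[int]]:
    """Split the analysed data into fixed-size batches (the last one may be shorter)."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def schedule(
    items: Sequence[int],
    batch_size: int,
    trigger_runner: Callable[[Sequence[int]], None],
) -> int:
    """Trigger one runner flow run per batch without waiting for any of them.

    `trigger_runner` is a hypothetical stand-in for the Prefect API call.
    Returns the number of runs that were triggered.
    """
    batches = split_into_batches(items, batch_size)
    for batch in batches:
        # Fire-and-forget: nothing here limits how many runs are active at once.
        trigger_runner(batch)
    return len(batches)


if __name__ == "__main__":
    triggered: List[Sequence[int]] = []
    count = schedule(list(range(10)), 3, triggered.append)
    print(count, triggered)
```

Because the loop never waits, hundreds of batches mean hundreds of containers requested almost simultaneously, which is exactly the behaviour I am trying to avoid.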
Does anyone have tips on how to address this? Is there a design flaw in my scheduler/runner pipeline that I'm missing? Is there an alternative if I need to process hundreds of these batches with only a couple running in parallel?
Is there any plan to implement concurrency control for Serverless Push Workers in the near future?