Nimesh Kumar

01/15/2023, 5:23 PM
Hi, can anyone enlighten me about scaling Prefect? The scenario: I have set up an inferencing pipeline where the user uploads an image, it goes through the pipeline, and the results are returned. The problem is that 1000-2000 inferencing requests can come in at once, so my questions are:
1. Can Prefect handle it?
2. If yes, then how? (My Prefect is set up on Cloud.) Do I have to set up some kind of auto-scaling?
Please help me with this if you can.

Nate

01/15/2023, 8:48 PM
Hi @Nimesh Kumar - I'd guess there are a lot of different ways of going about this, but here are my thoughts:
1. Can Prefect handle it?
I think this is largely a matter of how you choose to set up your execution infrastructure to run your pipeline. For example, if you set up a runtime environment that can run many instances of your pipeline in parallel (Google Cloud Run, GKE, EKS, Fargate, etc.), then I'm sure Prefect could handle the orchestration / observability of these pipelines. But the real heavy work would be performed by your runtime infrastructure, so you'd have to make sure that you allocate resources appropriately.
2. If yes, then how? (My Prefect is set up on Cloud.)
It depends on how you decide to set up your runtime infrastructure! For example, if you used Google Cloud Run as alluded to above, you could define 2 Prefect flows (and then create deployments from them):
• orchestrator
• worker
A worker could be defined to run a batch of inferencing requests through the pipeline as a single Google Cloud Run job, so you'd just have to create a deployment for the worker flow with a Google Cloud Run infrastructure block. The orchestrator could be triggered when the user uploads an image (by making an API call to run this deployment), and would be responsible for distributing work across as many worker flows as necessary to handle all the inferencing requests.
So, tl;dr in terms of Prefect Cloud:
• choose / set up a runtime environment (I recommend Google Cloud Run)
• create 2 deployments in your Prefect Cloud: orchestrator, worker
• trigger the orchestrator deployment on the event of a user submitting something
• run as many worker deployments as you need to handle all the requests
Hope this is helpful - if you are interested in dedicated infrastructure / Prefect support for this project, you could reach out to cs@prefect.io
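As a rough illustration of the "distributing work" step, the orchestrator first has to decide how many worker runs to start. The sketch below only shows that batching arithmetic in plain Python; the function name `make_batches`, the `img-...` request IDs, and the batch size of 50 are all hypothetical and not something from the thread or from Prefect itself.

```python
from typing import List, Sequence


def make_batches(request_ids: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split pending inference requests into fixed-size batches.

    Each returned batch is meant to be handed to one worker flow run.
    The last batch may be smaller than batch_size.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        list(request_ids[start:start + batch_size])
        for start in range(0, len(request_ids), batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical burst of 2000 uploads, identified by made-up IDs.
    pending = [f"img-{n}" for n in range(2000)]
    batches = make_batches(pending, batch_size=50)
    print(f"{len(batches)} worker runs needed")
```

The batch size is the knob that trades the number of worker runs against the time each run takes, so it would need tuning against whatever the runtime infrastructure can actually run in parallel.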

Nimesh Kumar

01/16/2023, 7:22 AM
Hi Nate, thanks a lot for the detailed response. I just had one doubt: how can we run as many worker deployments as needed to handle all the requests? Also, let's say I have 2 flows. Do I then have to create two deployments (orchestrator, worker) for each flow? Any resource or link you can provide would help.

Nate

01/16/2023, 7:41 PM
Hi @Nimesh Kumar - I'm not sure of any resources that exactly address your use case, but there is a recipe I created a while ago that may be useful based on what you've shared (here). Worth noting that there is now a run_deployment utility (from prefect.deployments import run_deployment) that would allow you to kick off another deployment programmatically (instead of using the client directly like I did).
I'll reiterate that you can receive more in-depth support by reaching out to cs@prefect.io, as it's not feasible for us to address all complex use cases for everyone in the community.
Hope that's helpful! Feel free to reach out with specific questions if you come up with them.
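For readers wondering what kicking off worker deployments from an orchestrator might look like, here is a minimal sketch, not the recipe referenced above. It assumes Prefect 2.x, where `run_deployment` accepts a `"flow-name/deployment-name"` string, a `parameters` dict, and (to my knowledge) `timeout=0` to return without waiting for the run to finish. The deployment name `worker/cloud-run`, the `image_uris` parameter, and the `dispatch_batches` helper are hypothetical. The `submit` argument exists only so the fan-out logic can be tried without a Prefect installation.

```python
from typing import Any, Callable, Iterable, List, Optional, Sequence


def dispatch_batches(
    batches: Iterable[Sequence[str]],
    deployment_name: str = "worker/cloud-run",  # hypothetical deployment name
    submit: Optional[Callable[[Sequence[str]], Any]] = None,
) -> List[Any]:
    """Start one worker run per batch and return whatever submit() returns."""
    if submit is None:
        # Imported lazily so the helper can be exercised with a stub.
        from prefect.deployments import run_deployment  # requires Prefect 2.x

        def submit(batch: Sequence[str]) -> Any:
            # "image_uris" is an assumed parameter of the worker flow.
            # timeout=0 is assumed to return immediately instead of polling.
            return run_deployment(
                name=deployment_name,
                parameters={"image_uris": list(batch)},
                timeout=0,
            )

    return [submit(batch) for batch in batches]


if __name__ == "__main__":
    # Stub in place of Prefect: report how many images each run would get.
    sizes = dispatch_batches([["a.png", "b.png"], ["c.png"]], submit=len)
    print(sizes)
```

In a real setup, `dispatch_batches` would be called from inside the orchestrator flow with `submit` left as `None`, so each batch becomes a separate run of the worker deployment.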