# ask-community
l
Hi everyone, I just started working with Prefect. I'm using it together with MLflow to improve our machine learning processes. My use case is that I have several models which I want to deploy and stop on demand because of limited infrastructure. I love the Prefect work pools, and I think I can implement this pattern by packaging my models into containers and having long-running flows: each flow just starts an HTTP server of sorts and shuts it down when no longer needed. Is Prefect capable of doing that? Is there a better way to do this?
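Stripped of Prefect, the "start an HTTP server, interrupt it when no longer needed" pattern described above boils down to something like this stdlib-only sketch (`ModelHandler` is a hypothetical stand-in for a real model-serving endpoint; inside Prefect, `start_server`/`stop_server` would be called from a flow):

```python
# Sketch of the start-on-demand / stop-on-demand serving pattern,
# using only the Python standard library.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ModelHandler(BaseHTTPRequestHandler):
    """Placeholder handler standing in for a real model inference endpoint."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"model ready")

    def log_message(self, *args):
        pass  # silence per-request logging


def start_server(port: int = 0) -> HTTPServer:
    """Start serving in a background thread; port=0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def stop_server(server: HTTPServer) -> None:
    """Interrupt the long-running server when demand goes away."""
    server.shutdown()
    server.server_close()
```

A flow would call `start_server()`, block until a stop signal arrives, then call `stop_server()`; the orchestrator's job is just deciding when those two calls happen.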
h
Hey Leonardo - we use Prefect extensively for computer vision tasks and also want to make sure we're scaling up and down with demand. However, I think Prefect works best as a simple worker: start the worker when a request comes in, do the job and store the result, then have the worker be destroyed by Prefect. This does mean a delay of roughly 30s from Prefect startup time. I wouldn't suggest having the workers run their own HTTP server; instead, consider going outside Prefect and using auto-scaling Amazon ECS instances, which do exactly what you describe. Not on Amazon? GCP and Azure have similar concepts too
l
Thanks. Unfortunately I must do this on bare-metal servers with GPUs, so I think I'll end up orchestrating TorchServe servers instead