Hello all!  I am currently investigating whether ...
# ask-community
a
Hello all!  I am currently investigating whether Prefect would fit our use case. We have many machine learning tasks that are chained together in various flows. The flows are triggered when a user uploads a file on the web client and run in the background. We are considering using Prefect to schedule and manage these flows. Currently, we have many workers for each of the machine learning tasks. There is a service that schedules the jobs and processes the outputs of the jobs via a message queue. However, there is no definition of a flow and the scheduling logic is scattered throughout the code for this service. This is where we think Prefect would be useful for us. From what I gathered, Prefect is mainly used for ETL workflows. In our case, we want to use Prefect to serve results from our machine learning models. Would Prefect be suitable for our use case? I am quite new to the MLOps field so any help or suggestions would be welcome!
e
From your high-level description, I believe Prefect can be a good fit:
1. Job Scheduling - Prefect has a GraphQL API from which you can start flow runs (your jobs). I believe this would make it easy for your custom job scheduler to integrate.
2. Jobs chained together - The StartFlowRun task is a good fit. You could trigger job B from job A and pass parameters to job B. Since this starts a new flow run, these jobs can easily spread out across your worker cluster.
3. Scaling Out - Prefect can execute flows on Dask, which is designed to support distributed computing. If you choose to do so, each individual flow can run on a cluster of workers. Prefect has a good number of options here, including running on an existing cluster or starting a temporary cluster on a cloud provider just for that flow.
4. Serve - That is not the responsibility of Prefect; it isn't designed to run continuously and serve results in response to requests. However, setting up a separate serving layer and feeding it on a schedule or trigger is very much viable.
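To make the GraphQL trigger more concrete, here is a minimal sketch of what a request that starts a flow run could look like, using only the Python standard library. It assumes a Prefect 1.x Server reachable at a local GraphQL endpoint; the URL, the flow id, and the file_path parameter are placeholders I made up for illustration, and Prefect Cloud would additionally need an API key header.

```python
import json
import urllib.request

# Assumption: a local Prefect Server GraphQL endpoint. Adjust for your deployment.
GRAPHQL_URL = "http://localhost:4200/graphql"

# Mutation that asks the API to create a run of an already registered flow.
CREATE_FLOW_RUN = """
mutation($input: create_flow_run_input!) {
  create_flow_run(input: $input) {
    id
  }
}
"""


def build_flow_run_request(flow_id, parameters, url=GRAPHQL_URL):
    """Build (but do not send) the HTTP request that starts a flow run."""
    payload = {
        "query": CREATE_FLOW_RUN,
        "variables": {"input": {"flow_id": flow_id, "parameters": parameters}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical flow id and parameter name, for illustration only.
request = build_flow_run_request("your-flow-id", {"file_path": "uploads/example.csv"})

# To actually trigger the run against a live server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Your upload handler could call something like this once per uploaded file, so the web client only needs to know the flow id and the parameters, not the scheduling logic.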
Triggering via GraphQL: https://medium.com/the-prefect-blog/event-driven-workflows-with-aws-lambda-2ef9d8cc8f1a
StartFlowRun: https://docs.prefect.io/api/latest/tasks/prefect.html#startflowrun
Scaling Out with DaskExecutor: https://docs.prefect.io/api/latest/executors.html#daskexecutor
The Dask executor accepts multiple implementations of cluster classes. For instance, Prefect could create a cluster of Fargate containers for your flow to run on when you create a flow run.
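As a rough sketch of the Fargate idea, the executor configuration could look something like the following. This assumes Prefect 1.x with the dask-cloudprovider package installed; the cluster class path, image name, and worker count are placeholders, so check them against the versions you actually use.

```python
# Assumption: Prefect 1.x plus dask-cloudprovider. Image and worker count are placeholders.
EXECUTOR_KWARGS = {
    # Import path of the Dask cluster class to start for each flow run.
    "cluster_class": "dask_cloudprovider.aws.FargateCluster",
    # Keyword arguments forwarded to the cluster class when it is created.
    "cluster_kwargs": {"image": "your-registry/your-image:latest", "n_workers": 4},
}


def make_executor():
    """Create the executor lazily so this module imports without Prefect installed."""
    from prefect.executors import DaskExecutor

    return DaskExecutor(**EXECUTOR_KWARGS)


# Usage (requires Prefect 1.x and an existing flow object):
# flow.executor = make_executor()
```

With a setup along these lines, the temporary cluster is created when the flow run starts and torn down when it finishes, so you would only pay for workers while a job is running.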