# ask-community
o
How do you handle model registry / deployment for Prefect deployments? TL;DR we want revisions and the ability to roll out / roll back an ML model for a daily-running deployment. We have a Kedro data science inference pipeline that runs for 3 weeks (RRule). During the second week we would like to update the reference to the training model (the path to the pickle / MLflow run id) that is associated with the Prefect deployment. Is there a best practice for doing that? Kedro uses MLflow as the experiment manager, but I’m not sure how to propagate the pickle / MLflow run id to the deployed Prefect deployment. I could reinvent the wheel and add an S3 pointer to the “latest” pickle, but I want an Amazon SageMaker / MLflow-like capability to review and deploy models.
j
deleting my previous answer because I realize I misread your question
o
I asked ChatGPT this prompt:
In MLflow, how do you roll out / roll back different revisions of a model?
Obviously I could sneak behind Prefect’s back and call `mlflow.update_registered_model()`, but I am concerned about chaos and the inability to review, track, and manage the different model deployments associated with a Prefect daily inference pipeline.
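For reference, the registry side of this is roughly what MLflow stage transitions give you. A minimal sketch, assuming the model is registered in MLflow's model registry and that stages are used for promotion (the registered-model name and version numbers are illustrative):

```python
import mlflow.pyfunc
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Roll out: promote a new registered-model version to Production.
# archive_existing_versions keeps a record of what was live before.
client.transition_model_version_stage(
    name="daily-inference-model",  # illustrative registered-model name
    version="4",
    stage="Production",
    archive_existing_versions=True,
)

# Roll back: move the previous version back into Production.
client.transition_model_version_stage(
    name="daily-inference-model",
    version="3",
    stage="Production",
    archive_existing_versions=True,
)

# The inference pipeline can then load "whatever is currently in Production"
# instead of hard-coding a run id or pickle path.
model = mlflow.pyfunc.load_model("models:/daily-inference-model/Production")
```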
j
I'm not super familiar with MLflow, but you have a couple of options for updating parameters that get used inside of a Prefect flow
From my understanding of what you want to do, the simplest is probably a parameter on the deployment? https://docs.prefect.io/2.10.11/concepts/deployments/?h=deploymen#parameters-in-deployments
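Roughly, a minimal sketch of that approach, assuming Prefect 2.x and a flow that takes the model reference as a parameter (the `model_uri` parameter and the flow/deployment names here are illustrative, not from your pipeline):

```python
from prefect import flow, get_run_logger
from prefect.deployments import Deployment


@flow
def daily_inference(model_uri: str = "models:/daily-inference-model/Production"):
    logger = get_run_logger()
    logger.info("Running inference with model: %s", model_uri)
    # load the model referenced by model_uri (e.g. via mlflow.pyfunc.load_model)
    # and run the Kedro inference pipeline here


if __name__ == "__main__":
    # The deployment carries the default parameters; each flow run records the
    # parameters it actually ran with, which gives you a per-run history.
    Deployment.build_from_flow(
        flow=daily_inference,
        name="daily-inference",
        parameters={"model_uri": "models:/daily-inference-model/Production"},
    ).apply()
```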
o
Thanks @Jake Kaplan! Does Prefect version-control it / show a history of parameter changes?
j
You can only see the active parameters for a deployment (not the change history). But each individual flow run from that deployment should show the parameters that were used.
If changing the parameters on one deployment is too lightweight for the versioning you need, you could have multiple deployments for the same flow code. It sounds like you'd only want one schedule to be active at a time, but then you'd have a history, be able to run ad hoc against each, etc.
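A minimal sketch of that multiple-deployments idea, under the same assumptions as the previous snippet (one deployment per model version; the import path and names are illustrative):

```python
from prefect.deployments import Deployment

# hypothetical module containing the flow from the previous sketch
from daily_inference_flow import daily_inference

# One deployment per model version. Keep only one schedule active at a time;
# older deployments stay around for ad hoc runs and quick rollback.
for label, model_uri in [
    ("v3", "models:/daily-inference-model/3"),
    ("v4", "models:/daily-inference-model/4"),
]:
    Deployment.build_from_flow(
        flow=daily_inference,
        name=f"daily-inference-{label}",
        parameters={"model_uri": model_uri},
    ).apply()
```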
o
Thanks Jake, appreciate the answer. This is an MLOps / ML engineering question to be more precise.
My main concern is monitoring and tracking changes to deployed production models. Our use case is that the data science team frequently optimizes the models, and we would like to deploy these models to already-running daily Prefect deployments while keeping a history of what changed.
The equivalent in Apache Airflow is DAG versioning, and in DevOps it's GitOps, where you can track changes made to deployments (to the states / deployments of the Kubernetes cluster).
This provides auditing, visibility, and a means to easily roll back to a previously stable version, or roll out a new version without losing track of what's happening.
While I fully appreciate Prefect's capabilities and fit for data pipelines, I am still lacking the expertise to fully utilize it for a data science product. I need to find the Prefect parallels of Kubeflow / SageMaker / MLflow, or get access to real-world case studies.
n
Forgive a potentially naive question, since MLOps isn't my domain of expertise, but I'm curious what specific things you'd be looking for Prefect to offer that you cannot find in the version control offered by GitHub (assuming that your model versions are defined in code). I see this:
where you can track changes made to deployments (to the states / deployments of the Kubernetes cluster).
and I feel like you could create artifacts that represent infra events or model executions that you could track over time, but I'm wondering if I'm missing part of your motivation here.
I’m not sure how to propagate the pickle / mlflow run id to the deployed Prefect deployment.
Could you put the reference in the description of the deployment? Then it'd be rendered as markdown in the UI on the deployment's page.
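On the artifact idea above, a minimal sketch assuming Prefect 2.10+ (which exposes `create_markdown_artifact`); the artifact key, markdown content, and flow name are illustrative:

```python
from prefect import flow
from prefect.artifacts import create_markdown_artifact


@flow
def daily_inference(model_uri: str = "models:/daily-inference-model/Production"):
    # Record which model this run used; artifacts with the same key build up
    # a timestamped history that is browsable in the Prefect UI.
    create_markdown_artifact(
        key="active-model",
        markdown=f"**Model used for this run:** `{model_uri}`",
        description="Model reference used by the daily inference pipeline",
    )
    # ... load the model and run inference ...
```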
o
Thanks @Nate. I'm wondering how Uber, for example, rolls out and rolls back data science models in production.
If all of a sudden you have a performance regression because of a newly deployed data science model, you would want to be able to: 1. Be aware of it 2. Quickly roll back to the previous version
Now if Kubeflow or Amazon SageMaker is your go-to tool for orchestration and model deployments, it's an already-solved problem. You have a UI / portal where you have access and visibility to deployed models and you can select which one to currently use, i.e. the active model. When Prefect is in the picture, I'm a bit puzzled how this should work. Should Prefect be agnostic to underlying infra-related changes? If so, how do you keep track of all of the moving parts?
s
@Ofir One idea is using variables. You can store the model version in a variable and update it to the new one, or set it back to the previous one to roll back. This variable could also drive conditional logic in the code if it changes across models.
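A minimal sketch of the variables approach, assuming Prefect 2.10+ (where `prefect.variables` is available); the variable name and default value are illustrative:

```python
from prefect import flow, variables


@flow
def daily_inference():
    # Read the current model reference at run time. Updating the variable in
    # the UI (or via the API) changes what the next scheduled run uses, and
    # setting it back to the old value is the rollback.
    model_uri = variables.get(
        "model_uri", default="models:/daily-inference-model/Production"
    )
    # ... load the model from model_uri and run inference ...
```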