Explaining MLOps using MLflow Tool

Mobarak Inuwa Last Updated : 04 Dec, 2022


This article was published as a part of the Data Science Blogathon.

Introduction

In this article, we will look at MLOps through one of the powerful tools that make it easier to implement. Such tools help improve the deployment process for robust machine learning projects. We will start with a brief overview of MLOps before diving into how MLflow supports it.

The concept of MLOps can be complex for novices. A good way to decipher it is to work with an implementation tool like MLflow. The premise of this article is that getting to know an MLOps tool makes the general MLOps concepts easier to understand.

What is MLOps?

The term “MLOps” combines “machine learning” and “DevOps.” MLOps can be seen as a set of guidelines that machine learning (ML) practitioners follow to speed up the deployment of ML models in real projects and to improve the integration of the various operations in a project pipeline.

It can be viewed as an extension of DevOps practice to cover data science and machine learning. The spread of AI in software production creates a need for agreed-upon best practices for testing, deploying, and monitoring these new systems.

MLOps brings together design and operations so that the development of ML applications happens on a robust platform. It requires all the data, or artifacts, needed for model deployment to be contained in a group of files created by a training project. Once these model artifacts are grouped, developers need a way to keep track of the code used to create them, the data used to train and test them, and the connections between them. This makes it possible to automate the steps of building and delivering an application, which in turn supports CI/CD so that ML apps can be continually integrated, delivered, and deployed.

Benefits of Using MLOps

MLOps brings three key things to the table: automation, continuous deployment, and monitoring.

Automation

Automation removes manual steps from the process. It allows ML models to be built routinely without manual intervention. For instance, automated testing or debugging can reduce human error and save correction time, because a problem is fixed or reported right away, before it gets out of hand.

Monitoring

Monitoring is another form of automation, but it involves sending signals when certain conditions are met. These signals can concern either data or models. For data, a signal may fire when an anomaly such as drift is detected; for models, it may fire when a condition on a metric or hyperparameter is triggered. Monitoring continues after deployment, so a model in production that keeps receiving new data can be retrained automatically.
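As a rough illustration of a data signal, the sketch below flags drift when the mean of a live batch moves too far from the mean seen at training time. The function name, the threshold of three standard errors, and the sample values are all assumptions made for this example; real monitoring setups usually rely on more careful statistical tests.

```python
from statistics import mean, stdev


def drift_detected(reference, live, threshold=3.0):
    """Flag drift when the live mean is far from the reference mean.

    The distance is measured in standard errors of the live sample mean,
    using the reference sample's standard deviation.
    """
    ref_mean = mean(reference)
    ref_std = stdev(reference)
    live_mean = mean(live)
    if ref_std == 0:
        # A constant reference feature drifts as soon as the mean moves at all.
        return live_mean != ref_mean
    standard_error = ref_std / len(live) ** 0.5
    return abs(live_mean - ref_mean) / standard_error > threshold


# Hypothetical feature values: one sample from training, two batches from production.
training_values = [10, 11, 9, 10, 10, 11, 9, 10]
stable_batch = [10, 10, 11, 9]
shifted_batch = [14, 15, 13, 14]

print(drift_detected(training_values, stable_batch))
print(drift_detected(training_values, shifted_batch))
```

A positive result from a check like this is the kind of signal that could raise an alert or start a retraining job.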

Continuous X

This is another key benefit of MLOps, but what does “X” stand for? It stands for any activity that runs as a continuous, automated loop in production: Continuous Integration (CI), Continuous Delivery (CD), Continuous Training (CT), Continuous Monitoring, and so on. You can add to the list too! This kind of automation keeps working during and after deployment, continually supplying the system with whatever it needs, such as new code, new data, or a retrained model.

What are MLOps Tools?

Note that these tools do not implement MLOps by themselves; they offer features that help lift an ML process toward MLOps. MLOps tools help organizations apply DevOps practices to creating and using AI and machine learning (ML). They were developed to help close the gap between developing ML models and reaping the benefits of those models in the commercial world.

The type of tool to employ depends on the nature of the project. These tools can be seen as simply platforms for effectively implementing MLOps.

What is MLflow?

MLflow is an open-source platform for managing the development of machine learning models. It is organized around four primary functions: Tracking, Projects, Models, and the Model Registry. As said earlier, the tool does not do MLOps by itself; it offers functionality that is useful for MLOps, which is what we want to look at. This also means you can use MLflow for a regular ML workflow without implementing MLOps at all.

MLflow Components for MLOps

MLflow provides four components to help manage the ML workflow. We will look at each one and how it supports MLOps:

MLflow Tracking: an API and UI for logging and querying experiments, available through Python, REST, R, and Java APIs. It is designed for logging parameters, code versions, metrics, and artifacts when running machine learning code, so that the results can be visualized later. This feature supports the MLOps guideline of recording enough detail about each process to allow later tracing.

An example is code and data versioning. MLflow Tracking works in any environment, including a notebook, and can be used to build robust systems that meet MLOps requirements.
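The logging pattern described above can be sketched as follows. The helper `log_training_run` and the `InMemoryTracker` class are hypothetical names invented for this example; the tracker only mimics three calls that the `mlflow` module exposes (`start_run`, `log_param`, and `log_metric`), so the sketch runs even where MLflow is not installed. The parameter and metric values are made up.

```python
from contextlib import contextmanager


def log_training_run(tracker, params, metrics):
    """Record one run's parameters and metrics through an MLflow-style tracker."""
    with tracker.start_run():
        for name, value in params.items():
            tracker.log_param(name, value)
        for name, value in metrics.items():
            tracker.log_metric(name, value)


class InMemoryTracker:
    """Hypothetical stand-in that keeps logged values in plain dictionaries."""

    def __init__(self):
        self.params = {}
        self.metrics = {}
        self.active = False

    @contextmanager
    def start_run(self):
        # Mark the run as active only while the with-block is executing.
        self.active = True
        try:
            yield self
        finally:
            self.active = False

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        self.metrics[key] = value


tracker = InMemoryTracker()
log_training_run(tracker, {"alpha": 0.5}, {"rmse": 0.78})
print(tracker.params, tracker.metrics)
```

With MLflow installed, passing the imported `mlflow` module itself as `tracker` should record the same values in the tracking store instead of in memory, since the three calls used here are part of its logging API.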

MLflow Projects: project management is very important for MLOps. In MLflow, a Project is a standard format for packaging data science code so that it is reusable and reproducible. The component also includes an API and a command-line tool for running projects, which makes it possible to chain projects together into workflows.

Each project is simply a directory of files or a Git repository. This disciplined code management eases teamwork, which is highly important in MLOps. Tracking a project run from a Git repository is easy: when the MLflow Tracking API is used inside a Project, MLflow automatically remembers the project version and any saved parameters.
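To make the command-line side concrete, here is a small sketch that assembles an `mlflow run` invocation as an argument list. The helper name, the repository URL, and the parameter names are hypothetical and chosen only for illustration; the `-e` (entry point) and `-P name=value` (parameter) options belong to the `mlflow run` command.

```python
def build_mlflow_run_command(project_uri, parameters, entry_point="main"):
    """Build the argument list for the `mlflow run` command-line tool."""
    command = ["mlflow", "run", project_uri, "-e", entry_point]
    for name, value in parameters.items():
        # Each project parameter is passed as -P name=value.
        command += ["-P", f"{name}={value}"]
    return command


# Hypothetical project location and parameters, for illustration only.
command = build_mlflow_run_command(
    "https://github.com/example-org/example-project",
    {"alpha": 0.5, "epochs": 10},
)
print(" ".join(command))
```

On a machine with MLflow installed, the list could be handed to `subprocess.run(command, check=True)`. Building it as a list rather than a single string avoids shell quoting problems.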

MLflow Models: an MLflow Model is a standard format for packaging machine learning models so that they can be used in a variety of downstream tools. The format defines a convention for saving a model in different so-called “flavors” that different downstream tools can understand. Each model is saved as a directory containing arbitrary files, together with a descriptor file that lists the flavors in which the model can be used.

MLflow provides tools to deploy many common model types to diverse platforms. When models are output through the Tracking API, MLflow automatically remembers which Project and run they came from, so their origin is always clear. With all these controls, implementing good MLOps becomes a breeze!
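The idea of flavors can be pictured with a short sketch. The descriptor below is written as a Python dictionary purely for illustration: the flavor names follow MLflow's naming, but the fields inside each flavor and the `pick_flavor` helper are assumptions made for this example, not MLflow's own loading logic.

```python
# Illustrative descriptor contents; the per-flavor fields are assumptions.
descriptor = {
    "flavors": {
        "sklearn": {"pickled_model": "model.pkl"},
        "python_function": {"loader_module": "mlflow.sklearn"},
    }
}


def pick_flavor(descriptor, supported_flavors):
    """Return the first flavor, in the tool's order of preference, that the model offers."""
    for flavor in supported_flavors:
        if flavor in descriptor["flavors"]:
            return flavor
    return None


# A tool that prefers TensorFlow but also accepts generic Python functions.
print(pick_flavor(descriptor, ["tensorflow", "python_function"]))
# A tool that only understands a flavor this model was not saved with.
print(pick_flavor(descriptor, ["onnx"]))
```

The point of the sketch is that one saved model can serve several downstream tools, each of which picks the flavor it understands.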

MLflow Registry: a central model store, a set of APIs, and a UI for collaboratively managing the whole lifecycle of an MLflow Model. It offers model lineage (which MLflow experiment and run produced the model), model versioning, stage transitions (for example, from staging to production or archiving), and annotations.

Each registered model has a unique name and one or more versions. When a new model is added to the Model Registry, it is added as version 1, and each new model registered under the same name increments the version number. Along with its versions, a registered model carries the associated transitional stages, model lineage, and other metadata.
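The versioning behaviour described above can be mimicked with a toy, in-memory registry. This is only a sketch of the concept, not MLflow's implementation: the class name, its methods, and the model and run names are invented for this example, while the stage names are the ones the Model Registry uses.

```python
class ToyModelRegistry:
    """In-memory illustration of named models, versions, and stages."""

    STAGES = ("None", "Staging", "Production", "Archived")

    def __init__(self):
        self._models = {}  # model name -> list of version records

    def register(self, name, run_id):
        """Add a new version under `name` and return its version number."""
        versions = self._models.setdefault(name, [])
        record = {"version": len(versions) + 1, "run_id": run_id, "stage": "None"}
        versions.append(record)
        return record["version"]

    def transition(self, name, version, stage):
        """Move one version of a model to another stage."""
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self._models[name][version - 1]["stage"] = stage

    def stage_of(self, name, version):
        return self._models[name][version - 1]["stage"]


registry = ToyModelRegistry()
first = registry.register("churn-model", run_id="run-a")
second = registry.register("churn-model", run_id="run-b")
registry.transition("churn-model", second, "Production")
print(first, second, registry.stage_of("churn-model", second))
```

Keeping the run identifier next to each version is what gives the lineage: any version can be traced back to the run that produced it.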

In the MLflow UI, clicking the Register Model button lets you fill in the model’s name. The interface is easy to use: you can navigate to the Registered Models page and view the model properties, as shown below.

screenshot showing the names and versions of registered models in MLflow

Versioning of this kind is essential for MLOps. We have seen some of the key features of MLflow and how they can be used. I feel these are the ones most relevant to the MLOps discussion. Generally, the strength of MLflow lies in keeping track of assets such as models and data. This is very handy for building robust systems, where robustness shows up as being scalable and easy to upgrade.

Conclusion

Since managing the ML lifecycle with MLOps can be challenging, every tool that eases the pain is useful. MLOps becomes achievable with the features of tools such as MLflow. With cutting-edge features for model and data management, and a wide range of ways to develop models that meet MLOps standards, MLflow is a tool to look out for. Its biggest strength is data and model management.

Key Takeaways

A good way to learn a concept is through tools. Tools give a hands-on understanding of concepts, as we saw with MLOps.
MLOps can be seen as a set of guidelines that machine learning (ML) experts follow to hasten the deployment of ML models in real projects and enhance the overall integration of various project pipeline operations.
MLOps brings three key things to the table: automation, continuous deployment, and monitoring.
MLOps tools help organizations apply DevOps practices to creating and using AI and machine learning (ML).

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.





