How is MLOps Different from DevOps?

Gitesh Dhore Last Updated : 14 Sep, 2022

This article was published as a part of the Data Science Blogathon.

Introduction

DevOps practices center on continuous integration and continuous delivery (CI/CD). MLOps covers CI/CD and adds continuous training, which is why DevOps practices alone are not enough to run machine learning applications in production. In this article, I explain the important features of MLOps and its key differences from traditional DevOps practices.

DevOps – Development and Operations

Today’s competitive world is about how quickly you make your features available to the end user. DevOps helps the project team quickly integrate new features and make them available to end users using an automated DevOps pipeline.

DevOps uses two key components throughout its lifecycle:

1. Continuous Integration: Merging code changes into a central repository (for example, a Git repository hosted on Bitbucket), automating the build with a tool such as Jenkins, and running automated test cases.

2. Continuous Delivery: Once new features have been developed, tested, and integrated in the continuous integration phase, they must be deployed automatically to make them available to end users. This automated release and deployment happens in the continuous delivery phase.

When a project is deployed, and users start using it, it’s important to track various metrics. Under DevOps monitoring, an engineer takes care of several things like application monitoring, usage monitoring, visualization of key metrics, etc.

Machine Learning Vs. Traditional Software Development

According to the paper “Hidden Technical Debt in Machine Learning Systems,” only a small fraction of a real ML system consists of ML code. Alongside the ML code, we need to consider data cleaning, data versioning, model versioning, and continuous training of models on new data. Testing a machine learning system also differs from traditional software testing: it is more than just unit testing. We must also consider data checks, data drift, model drift, and performance evaluation of the model deployed to production.

• Machine learning systems are highly experimental. You can’t guarantee in advance that an algorithm will work without running some experiments first. Therefore, there is a need to track the various experiments, feature engineering steps, model parameters, metrics, etc., so that you know later which experiment produced the best results.

• The deployment of machine learning models is specific to the problem they are trying to solve. Most of the machine learning process revolves around data. Therefore, the machine learning pipeline has several steps, including data processing, feature engineering, model training, model registry, and model deployment.

• Model output should be consistent over time. Therefore, we need to track data distribution and other statistical measurements related to data over a period. The live data should be similar to the data used to train the model.

• People who develop machine learning models do not focus on software practices because they often do not come from a software background.
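The experiment tracking described in the list above can be sketched in a few lines. This is a minimal illustration only, not how any particular tracking tool works: `Run`, `ExperimentLog`, the run names, and the accuracy values are all hypothetical, and the log lives in memory rather than on a tracking server.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Run:
    """One experiment: the parameters tried and the metrics observed."""
    name: str
    params: dict
    metrics: dict = field(default_factory=dict)


class ExperimentLog:
    """Hypothetical in-memory tracker; real tools persist runs to storage."""

    def __init__(self):
        self.runs = []

    def log(self, name, params, metrics):
        """Record one experiment so it can be compared with others later."""
        run = Run(name=name, params=dict(params), metrics=dict(metrics))
        self.runs.append(run)
        return run

    def best(self, metric, higher_is_better=True):
        """Return the run with the best recorded value of `metric`."""
        candidates = [r for r in self.runs if metric in r.metrics]
        if not candidates:
            raise ValueError(f"no run recorded metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])

    def to_json(self):
        """Serialize all runs so they can be saved and reloaded weeks later."""
        return json.dumps([asdict(r) for r in self.runs], indent=2)


# Illustrative values only; they are not measurements from a real model.
log = ExperimentLog()
log.log("baseline", {"max_depth": 3}, {"accuracy": 0.81})
log.log("deeper", {"max_depth": 8}, {"accuracy": 0.86})
log.log("too_deep", {"max_depth": 20}, {"accuracy": 0.79})

best_run = log.best("accuracy")
print(best_run.name, best_run.params)  # deeper {'max_depth': 8}
```

Because every run keeps its parameters next to its metrics, the winning configuration can be looked up later instead of being rediscovered by rerunning everything.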

MLOps – Operations in Machine Learning

MLOps, or ML Ops, is a set of practices that aim to deploy and maintain machine learning models in production reliably and efficiently. The word combines “machine learning” with the “Ops” of DevOps, reflecting the continuous development practices of software engineering.

MLOps combines DevOps, machine learning, and data engineering. Building on the existing DevOps approach, MLOps solutions aim to increase reusability; facilitate automation, data drift management, model versioning, experiment tracking, and continuous training; and deliver richer and more consistent insights in a machine learning project.

Andrew Ng recently talked about how the machine learning community can use MLOps to build high-quality datasets and AI systems that are repeatable and systematic. He called for a shift in focus from model-centric machine learning to data-centric development. Andrew also said that going forward, MLOps can play an important role in ensuring high-quality and consistent data flow at all project stages.

An MLOps setup includes the following components:

• Source control

• Test and build services

• Deployment services

• Model registry

• Feature store

• ML metadata repository

• ML pipeline orchestrator

[Figure: a more detailed MLOps architecture, including an automated pipeline for continuous training.]

Key Benefits of Using MLOps

• Continuous training: With MLOps, we can set up continuous training of models. Continuous training is essential because data changes over time, and those changes affect the model’s output. For the model output to stay consistent, the model must be retrained regularly on new incoming data.

• Experiment tracking: When we develop a machine learning model, we run many experiments, such as hyperparameter tuning, different samplings of the training data, and comparisons of model outputs under different parameters. After many experiments, we arrive at the best-performing model. But if we did not save those experiments, we no longer know which one gave the optimal result, and when we return a few weeks later, we have to run everything again. Experiment tracking helps by automatically recording the configuration and results of every experiment, however small.

• Data drift: When an ML model is first deployed in production, data scientists are primarily concerned with how well it will perform over time. The main question they ask is: does the model still capture the patterns in new incoming data as effectively as it did during the design phase? If the data changes over time, the model’s performance will degrade, because it was trained on data that is statistically different from the new incoming data. This change in the data is known as data drift. It directly affects the model’s performance and therefore needs to be monitored. Several statistical techniques can detect data drift, such as the Kolmogorov-Smirnov test, and the MLOps ecosystem offers ready-made tools for this purpose, for example Hydrosphere and Fiddler.

• Model registry: With a model registry, you can ensure that all key artefacts (including data, configurations, environment variables, model code, versions, and documentation) are in one place that everyone responsible can access. It helps with model versioning and faster deployment. Tools that support a model registry out of the box include MLflow, Azure Machine Learning Studio, and Neptune AI.

• Visualization: Plotted data is much easier to understand than the same numbers presented in a table. This is where visualization of machine learning metrics, performance scores, and experiments becomes essential. You can build all (or most) of these plots yourself, but there are tools that can help speed up your machine learning development.

• Monitoring: You collect statistics on model performance based on live data. The output of this phase is a trigger to execute a pipeline or start a new experiment cycle, which is what kicks off the continuous training pipeline. In addition, there are many more things to monitor, such as usage statistics, application performance, and application- and system-level logs. Various monitoring tools are available, such as Prometheus and OpenTelemetry.
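To make the data drift and monitoring points above concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov check that triggers retraining. It rests on several assumptions: the function names, the `retrain` callback, and the toy data are hypothetical; the decision rule uses the large-sample critical value for a 5% significance level; and in practice you would more likely call a library routine such as `scipy.stats.ks_2samp` than hand-roll the statistic.

```python
import math


def ks_statistic(reference, live):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(live)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < n and ref[i] == x:
            i += 1
        while j < m and cur[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def drift_detected(reference, live, c_alpha=1.358):
    """Large-sample KS decision rule; c_alpha=1.358 corresponds to alpha = 0.05."""
    n, m = len(reference), len(live)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, live) > critical


def monitoring_step(reference, live, retrain):
    """Call the (hypothetical) retrain callback when drift is detected."""
    if drift_detected(reference, live):
        retrain()
        return True
    return False


# Toy feature values, chosen to be deterministic rather than realistic.
reference = list(range(100))            # values seen at training time
stable = [x + 0.5 for x in reference]   # same range, slightly jittered
shifted = [x + 50 for x in reference]   # distribution moved upward

retrain_calls = []
for name, batch in [("stable", stable), ("shifted", shifted)]:
    monitoring_step(reference, batch, retrain=lambda n=name: retrain_calls.append(n))
print(retrain_calls)  # ['shifted']
```

Only the shifted batch crosses the critical value, so only that batch starts the retraining pipeline. This sketch checks a single numeric feature; a real monitor would repeat the check per feature and also track model performance.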

Similarities of MLOps and DevOps

1. The two main components of DevOps, Continuous Integration and Continuous Delivery, are also needed in MLOps.

2. ML code is tested the same way as in DevOps: it is typically Python code, so standard DevOps testing methodologies apply. (Model testing and data validation testing, however, are new in MLOps.)
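As a rough illustration of how a data validation check can sit alongside ordinary unit tests, the sketch below validates a batch of rows against a schema and tests the validator with Python's built-in `unittest`. The schema, column names, and ranges are hypothetical and chosen only for the example.

```python
import unittest

# Hypothetical schema: column name -> (expected type, minimum, maximum).
SCHEMA = {"age": (int, 0, 120), "income": (float, 0.0, 1e7)}


def validate_rows(rows, schema=SCHEMA):
    """Return a list of problems found; an empty list means the batch is valid."""
    problems = []
    for idx, row in enumerate(rows):
        for column, (kind, low, high) in schema.items():
            value = row.get(column)
            if value is None:
                problems.append(f"row {idx}: missing {column}")
            elif not isinstance(value, kind):
                problems.append(f"row {idx}: {column} is not {kind.__name__}")
            elif not low <= value <= high:
                problems.append(f"row {idx}: {column}={value} out of range")
    return problems


class DataValidationTest(unittest.TestCase):
    """Runs in CI like any other unit test, but checks data instead of logic."""

    def test_clean_batch_passes(self):
        rows = [{"age": 34, "income": 52000.0}]
        self.assertEqual(validate_rows(rows), [])

    def test_bad_batch_is_reported(self):
        rows = [{"age": -5, "income": 52000.0}, {"age": 40}]
        self.assertEqual(
            validate_rows(rows),
            ["row 0: age=-5 out of range", "row 1: missing income"],
        )
```

Saved in a test module, these cases can be run with `python -m unittest` in the same CI stage as the rest of the test suite.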

What is different in MLOps compared to DevOps?

1. Data quality and drift: In MLOps, in addition to code testing, you also need to ensure that data quality is maintained throughout the lifecycle of a machine learning project. Monitor whether the data changes over time; if it does, the model needs to be retrained.

2. More than traditional deployment: In MLOps, you don’t necessarily deploy only the model artefact. You may need to deploy a complete machine-learning pipeline that includes data extraction, data processing, feature engineering, model training, model registry, and model deployment.

3. Continuous Training: There is a third concept in MLOps that does not exist in DevOps: continuous training (CT). We have to check for data drift and concept drift constantly; whenever there is a change, it will affect the model’s performance. So if the model’s performance decreases over time, we need to start the training pipeline automatically.

4. Model testing: Only a fraction of a real ML system consists of ML code. The surrounding elements are extensive and complex. We need to consider data checks and data drift, model drift, testing, and performance validation of the model deployed to production.

5. Versioning of data and models: In DevOps, we consider versioning of code, but in machine learning, we deal with different samples of data and create different versions while training the model. We also generate different versions of the model for different hyperparameters. So, in MLOps, you must version the data, the model, and the code.
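One way to picture versioning data, model, and code together is to derive a model version from fingerprints of all three. The sketch below is an assumption-laden illustration, not the mechanism used by any specific tool: `fingerprint`, `version_record`, the sample rows, and the commit string `"abc123"` are hypothetical, and the data is assumed to be small and JSON-serializable.

```python
import hashlib
import json


def fingerprint(obj):
    """Stable SHA-256 hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def version_record(data_rows, hyperparams, code_version):
    """Tie one trained model to the exact data, settings, and code behind it."""
    record = {
        "data": fingerprint(data_rows),
        "hyperparams": fingerprint(hyperparams),
        "code": code_version,  # for example, a Git commit hash
    }
    # The model version is derived from the three inputs above.
    record["model_version"] = fingerprint(record)[:12]
    return record


rows_v1 = [{"x": 1.0, "y": 0}, {"x": 2.0, "y": 1}]
rows_v2 = rows_v1 + [{"x": 3.0, "y": 1}]  # new data arrived

v1 = version_record(rows_v1, {"lr": 0.1, "depth": 3}, "abc123")
v1_again = version_record(rows_v1, {"depth": 3, "lr": 0.1}, "abc123")
v2 = version_record(rows_v2, {"lr": 0.1, "depth": 3}, "abc123")

print(v1["model_version"] == v1_again["model_version"])  # True
print(v1["model_version"] == v2["model_version"])        # False
```

Changing any one of the three inputs yields a new model version, while identical inputs always map to the same one, which is the property that makes a training run reproducible.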

Conclusion

In recent years, the speed of data generation has changed drastically. Almost 90 per cent of the data available today comes from just the last few years. If you are reading this post, it is probably clear to you that while big data helps develop actionable insights, it also presents several challenges. These include acquiring and cleaning big data, tracking and versioning models, deploying monitoring pipelines for production, and scaling machine learning operations. That is where MLOps can help the machine learning community tackle the problems that cannot be solved with DevOps alone.

To summarize:

• Testing a machine learning system is different from traditional software testing. It is more than just unit testing: we must also consider data checks, data drift, model drift, and performance evaluation of the model deployed to production.

• In MLOps, in addition to code testing, you also need to ensure that data quality is maintained throughout the lifecycle of a machine learning project. Monitor whether the data changes over time; if it does, the model must be retrained.

• When an ML model is first deployed in production, data scientists are primarily concerned with how well it will perform over time. The main question they ask is: does the model still capture the patterns in new incoming data as effectively as it did during the design phase?

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Gitesh Dhore

I am a machine learning enthusiast. I have done some industry-level projects in data science and machine learning, and I hold certifications in Python and ML from trusted sources such as DataCamp and Skills vertex. My goal in life is to pursue a career in the data industry.
