Large Language Models (LLMs) are now widely used in a variety of applications, such as machine translation, chatbots, text summarization, and sentiment analysis, driving advances in the field of natural language processing (NLP). However, deploying and managing these LLMs in real-world settings is difficult, which is where LLMOps comes in. LLMOps refers to the set of practices, tools, and processes used to develop, deploy, and manage LLMs in production environments.
MLflow is an open-source platform that provides a set of tools for tracking experiments, packaging code, and deploying models to production. MLflow's centralized model registry simplifies the management of model versions and allows for easy sharing and collaboration with team members, making it a popular choice for data scientists and machine learning engineers to streamline their workflows and improve productivity.
This article was published as a part of the Data Science Blogathon.
The following factors make managing and deploying LLMs in a production setting difficult: resource management, model performance, model versioning, and the supporting infrastructure.
MLflow is an open-source platform for managing the machine learning lifecycle. It provides a set of tools and APIs for managing experiments, packaging code, and deploying models. MLflow can be used to deploy and manage LLMs in production environments by following the steps below.
Hugging Face Transformers is a popular open-source library for building natural language processing models. MLflow has built-in support for these models, which makes them simple to deploy and manage in a production setting. To use Hugging Face Transformers with MLflow, follow these steps:
!pip install transformers
!pip install mlflow
import transformers
import mlflow
chat_pipeline = transformers.pipeline(model="microsoft/DialoGPT-medium")
with mlflow.start_run():
    model_info = mlflow.transformers.log_model(
        transformers_model=chat_pipeline,
        artifact_path="chatbot",
        input_example="Hi there!"
    )
# Load as an interactive pyfunc model
chatbot = mlflow.pyfunc.load_model(model_info.model_uri)
# Make predictions
chatbot.predict("What is the best way to get to Antarctica?")
>>> 'I think you can get there by boat'
chatbot.predict("What kind of boat should I use?")
>>> 'A boat that can go to Antarctica.'
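Once logged, the chatbot can also be served as a REST endpoint with the MLflow CLI; a sketch, assuming the run ID from the `log_model` call above (the port and the payload shape are illustrative):

```shell
# Serve the logged pipeline as a local REST endpoint
# (substitute the actual run ID from your tracking store)
mlflow models serve -m "runs:/<run_id>/chatbot" --port 5000

# In another terminal, send a prompt to the /invocations endpoint
curl -X POST http://localhost:5000/invocations \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["Hi there!"]}'
```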
OpenAI is another popular platform for building LLMs. MLflow provides support for OpenAI models, making it easy to deploy and manage them in a production environment. The following are the steps to use OpenAI models with MLflow:
!pip install openai
!pip install mlflow
from typing import List
import openai
import mlflow
# Define a functional model with type annotations
def chat_completion(inputs: List[str]) -> List[str]:
    # The model signature is automatically constructed from
    # the type annotations. The signature for this model
    # would look like this:
    # ----------
    # signature:
    #   inputs: [{"type": "string"}]
    #   outputs: [{"type": "string"}]
    # ----------
    outputs = []
    for input in inputs:
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": input}]
        )
        outputs.append(completion.choices[0].message.content)
    return outputs

# Log the model
mlflow.pyfunc.log_model(
    artifact_path="model",
    python_model=chat_completion,
    pip_requirements=["openai"],
)
LangChain is a platform for building LLM applications using a modular approach. MLflow provides support for LangChain models, making it easy to deploy and manage them in a production environment. To use LangChain models with MLflow, you can follow these steps:
!pip install langchain
!pip install mlflow
import mlflow
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

template = """Translate everything you see after this into French:
{input}"""
prompt = PromptTemplate(template=template, input_variables=["input"])

llm_chain = LLMChain(
    prompt=prompt,
    llm=HuggingFaceHub(
        repo_id="google/flan-t5-small",
        model_kwargs={"temperature": 0, "max_length": 64}
    ),
)

mlflow.langchain.log_model(
    lc_model=llm_chain,
    artifact_path="model",
    registered_model_name="english-to-french-chain-gpt-3.5-turbo-1"
)
# Load the LangChain model as a Spark UDF
import mlflow.pyfunc

english_to_french_udf = mlflow.pyfunc.spark_udf(
    spark=spark,
    model_uri="models:/english-to-french-chain-gpt-3.5-turbo-1/1",
    result_type="string"
)
english_df = spark.createDataFrame([("What is MLflow?",)], ["english_text"])

french_translated_df = english_df.withColumn(
    "french_text",
    english_to_french_udf("english_text")
)
Deploying and managing LLMs in a production environment can be challenging due to resource management, model performance, model versioning, and infrastructure issues. With MLflow's tools and APIs for managing the model lifecycle, LLMs become straightforward to deploy and administer in a production setting. In this blog, we discussed how to use MLflow to deploy and manage LLMs in a production environment, including its support for Hugging Face Transformers, OpenAI, and LangChain models. Using MLflow can also improve collaboration between data scientists, engineers, and other stakeholders in the machine learning lifecycle.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.