Welcome to the world of Large Language Models (LLMs). For a long time, transfer learning was a concept used mostly in other areas of deep learning. In 2018, however, the “Universal Language Model Fine-tuning for Text Classification” (ULMFiT) paper changed the entire landscape of Natural Language Processing (NLP) by showing how a pre-trained language model could be fine-tuned for downstream tasks.
LLAMA2 is one of the most capable LLMs available for text generation. In this guide, we will automate the process of fine-tuning the LLAMA2 model on personal data, all powered by Gradient AI. Gradient AI is a cloud platform that provides a Python SDK, allowing us to create, test, and easily manage models.
This process takes a while, so let’s get started!
This article was published as a part of the Data Science Blogathon.
LLAMA2 (from LLaMA, the Large Language Model Meta AI) belongs to the category of Large Language Models (LLMs). Developed by Meta (formerly Facebook), the model is designed to cover a wide range of natural language processing (NLP) applications. The original LLaMA model was the starting point of the series; LLAMA2 improves on it with updated training data and methods.
As I mentioned in the intro, the pivotal moment came in 2018 with the paper ‘Universal Language Model Fine-tuning for Text Classification.’ This paper revolutionized NLP by combining deep learning with pre-training, greatly improving performance across different NLP applications.
Comparison with Other Models:
| Model | Key Characteristics | Strengths |
|---|---|---|
| LLAMA2 | Versatility across applications; adaptable with transfer learning; context-aware responses | Strong contextual understanding; effective across different NLP tasks |
| BERT (Bidirectional Encoder Representations from Transformers) | Bidirectional context understanding; pre-trained on a large corpus | Excellent for tasks requiring deep contextual understanding; effective in question answering and more |
| GPT (Generative Pre-trained Transformer) | Focuses on generating coherent, contextually relevant text; autoregressive training methods | Ideal for creative text generation and language understanding; strong performance in language modeling tasks |
| XLNet | Permutation language modeling objective; considered a hybrid model | Achieves bidirectional context understanding; strong across different NLP benchmarks |
Gradient AI is a cloud platform that offers versatile tools for users to easily build, test, and update models. Many industries rely on such cloud infrastructure for model creation and testing. The platform streamlines building, training, and deploying models and provides test environments, offering a convenient solution for individual users, researchers, and enterprises.
Comparison with Other Platforms:
| Platform | Key Characteristics | Strengths |
|---|---|---|
| Gradient AI Cloud | Comprehensive features and resources; user-friendly interfaces; support for various frameworks | Scalability for machine learning tasks; simplified deployment of machine learning models; collaboration features for teamwork |
| Google Colab | Free access to GPU resources for Jupyter notebooks; limited features compared to paid cloud platforms | Quick access for experimenting with machine learning code; suitable for educational and personal projects |
| AWS SageMaker | Broad machine learning capabilities; extensive suite of tools for end-to-end ML workflows | Integration with other AWS services for seamless workflows; scalability and flexibility with AWS infrastructure |
| Azure Machine Learning | Azure’s cloud-based machine learning platform; support for different ML frameworks | Integration with Azure services for comprehensive solutions; seamless collaboration |
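To make the SDK workflow concrete before we dive into the project, here is a minimal sketch of creating a Gradient client, assuming the `gradientai` package and the two credentials described in the next section. `make_gradient_client` is my own helper name, not part of the SDK; it returns `None` when the credentials are absent so the sketch stays runnable without an account.

```python
import os


def make_gradient_client():
    """Return a Gradient client when credentials are configured, else None.

    Hypothetical helper; the Gradient(...) constructor arguments match the
    ones used in fine_tune.py later in this guide.
    """
    workspace_id = os.getenv("GRADIENT_WORKSPACE_ID")
    access_token = os.getenv("GRADIENT_ACCESS_TOKEN")
    if not (workspace_id and access_token):
        return None
    # Imported lazily so the sketch runs even without the SDK installed
    from gradientai import Gradient
    return Gradient(workspace_id=workspace_id, access_token=access_token)
```

With credentials exported, the returned client gives access to base models and model adapters, exactly as used in the `FineTuner` class later on.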
Creating GRADIENT_WORKSPACE_ID and GRADIENT_ACCESS_TOKEN involves obtaining the necessary credentials from the Gradient AI Cloud platform. Below are the steps to create these variables:
1. Workspace ID (GRADIENT_WORKSPACE_ID):
Fig: UI Of Gradient AI (Workspace)
2. Access Token (GRADIENT_ACCESS_TOKEN):
Fig: UI Of Gradient AI (Authentication KEY)
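Once both values are in the environment (via the `.env` file we create later), a small validation helper can fail fast when a key is missing. This helper is illustrative, not part of the project code:

```python
import os


def load_gradient_credentials():
    """Return (workspace_id, access_token) from the environment,
    raising a clear error when either variable is missing."""
    names = ("GRADIENT_WORKSPACE_ID", "GRADIENT_ACCESS_TOKEN")
    values = {name: os.getenv(name) for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return values["GRADIENT_WORKSPACE_ID"], values["GRADIENT_ACCESS_TOKEN"]
```

Failing early with a named variable beats a cryptic authentication error from the cloud API later in the run.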
Building an automated fine-tuning app involves multiple steps, so we establish a structured workflow. A core element of modular coding is the creation of logger and exception scripts, responsible for capturing logs and errors. Below is a high-level overview of the coding structure. Finally, we integrate a Streamlit application for a user-friendly UI, presenting the components visually so anyone can test the application.
project_root
│
├── configs
│
├── research
│ └── trials.ipynb
├── logs
│
├── src
│ └── lama2FineTune
│ ├── components
│ │ └── fine_tune.py
│ ├── constants
│ │ └── env_variable.py
│ ├── exception
│ ├── logger
│ ├── utils
│ │ └── main_utils.py
│ └── __init__.py
│
├── venv
├── .env
├── .gitignore
├── init_setup.sh
├── params.yaml
├── app.py
├── README.md
├── requirements.txt
├── setup.py
Fig: UI Of Coding Structure
Fig: project architecture
This architecture allows for the efficient fine-tuning of the LLAMA2 model and seamless integration with the Gradient AI platform.
Fig: Diagram of fine-tuning LLAMA2
The diagram integrates a Streamlit app for user interaction, a FineTuner class for LLAMA2 fine-tuning, the Gradient SDK for cloud communication, and modular coding components, ensuring a streamlined process for customizing and deploying the LLAMA2 model on Gradient AI.
git clone https://github.com/SuyodhanJ6/Fine-Tune-LLAMA2.git
ls
o/p : Fine-Tune-LLAMA2
cd Fine-Tune-LLAMA2
conda create -p ./venv python=3.9 -y
conda activate ./venv
pip install -r requirements.txt
touch .env
GRADIENT_WORKSPACE_ID=<paste-your-workspace-id>
GRADIENT_ACCESS_TOKEN=<paste-your-access-token>
lama2FineTune
│ ├── exception
│ │ └── __init__.py
│ ├── logger
│ │ └── __init__.py
Logger File:
The logger file records and stores code-level information (function, class, script name), serving several crucial functions:
import logging
import os
from datetime import datetime

LOG_FILE = f"{datetime.now().strftime('%m_%d_%Y_%H_%M_%S')}.log"
# Create the logs directory (not a directory named after the log file)
logs_dir = os.path.join(os.getcwd(), "logs")
os.makedirs(logs_dir, exist_ok=True)
LOG_FILE_PATH = os.path.join(logs_dir, LOG_FILE)

logging.basicConfig(
    filename=LOG_FILE_PATH,
    format="[ %(asctime)s ] %(lineno)d %(name)s - %(levelname)s - %(message)s",
    level=logging.INFO,
)
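To sanity-check the logger configuration end to end, the following self-contained sketch writes into a temporary directory instead of the project root. `setup_run_logger` is a hypothetical helper mirroring the script above, not part of the project:

```python
import logging
import os
import tempfile
from datetime import datetime


def setup_run_logger(base_dir):
    """Create logs/<timestamp>.log under base_dir and point the root logger at it."""
    logs_dir = os.path.join(base_dir, "logs")
    os.makedirs(logs_dir, exist_ok=True)
    log_path = os.path.join(logs_dir, f"{datetime.now():%m_%d_%Y_%H_%M_%S}.log")
    logging.basicConfig(
        filename=log_path,
        format="[ %(asctime)s ] %(lineno)d %(name)s - %(levelname)s - %(message)s",
        level=logging.INFO,
        force=True,  # replace any previously installed handlers (Python 3.8+)
    )
    return log_path


log_path = setup_run_logger(tempfile.mkdtemp())
logging.info("fine-tuning run started")
logging.shutdown()  # flush the file handler so the message is on disk
```

After this runs, the timestamped log file contains the formatted `INFO` record, confirming the handler, path, and format string all work together.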
Exception File:
The exception file manages unexpected events or errors during a program run. Its key responsibilities:
import sys


def error_message_detail(error, error_detail: sys):
    """
    Method Name : error_message_detail
    Description : Format and return an error message with traceback details.
    Return : str
    Args :
        error (Exception): The error object or message.
        error_detail (sys): The traceback information from the error.
    """
    _, _, exc_tb = error_detail.exc_info()
    file_name = exc_tb.tb_frame.f_code.co_filename
    error_message = (
        "Error occurred in python script name [{0}] at line number [{1}]. "
        "Error message: {2}".format(file_name, exc_tb.tb_lineno, str(error))
    )
    return error_message
class Llama2Exception(Exception):
    """
    Custom exception class for handling LLAMA2 fine-tuning errors.
    """

    def __init__(self, error_message, error_detail: sys):
        """
        Method Name : __init__
        Description : Initialize the Llama2Exception exception.
        Return : None
        Args :
            error_message (str): The main error message.
            error_detail (sys): Additional details about the error.
        """
        super().__init__(error_message)
        self.error_message = error_message_detail(error_message, error_detail)

    def __str__(self):
        """
        Method Name : __str__
        Description : Return a string representation of the Llama2Exception.
        Return : str
        Args : None
        """
        return self.error_message
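A quick way to see what this pattern produces: the sketch below mirrors the traceback-formatting logic in a standalone function and formats a real exception. `format_error` is my own name for the illustration, not project code:

```python
import sys


def format_error(error, error_detail):
    """Mirror of the error_message_detail pattern: pull the file name and
    line number out of the active traceback and build one message string."""
    _, _, exc_tb = error_detail.exc_info()
    file_name = exc_tb.tb_frame.f_code.co_filename
    return (
        "Error occurred in python script name [{0}] at line number [{1}]. "
        "Error message: {2}".format(file_name, exc_tb.tb_lineno, str(error))
    )


try:
    1 / 0  # deliberately trigger an exception
except ZeroDivisionError as e:
    message = format_error(e, sys)
```

The resulting message pinpoints the script and line that raised, which is exactly the detail plain `str(error)` loses.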
1. SAMPLES
lama2FineTune
│ ├── config
│ │ └── __init__.py
Note: Add all the samples you want, following the format shown below.
SAMPLES = [
    {
        "inputs": "### Instruction: Who is Prashant Malge? \n\n### Response: "
                  "Prashant Malge is a dedicated fourth-year Computer Science student "
                  "at DPCOE - Dhole Patil College Of Engineering Pune, with a strong "
                  "passion for data science. He has a solid foundation in programming, "
                  "databases, and machine learning."
    },
    {
        "inputs": "### Instruction: Tell me about Prashant Malge's academic journey. "
                  "\n\n### Response: Prashant Malge's academic journey has equipped "
                  "him with a solid foundation in programming, databases, and machine "
                  "learning. He is currently a fourth-year student at DPCOE - Dhole "
                  "Patil College Of Engineering Pune."
    },
    {
        "inputs": "### Instruction: What skills does Prashant Malge possess? "
                  "\n\n### Response: Prashant Malge possesses skills in statistical "
                  "analysis, machine learning, and data visualization. As a Data "
                  "Scientist, he leverages his expertise to derive insights and make "
                  "informed decisions."
    },
    {
        "inputs": "### Instruction: Where has Prashant Malge gained hands-on "
                  "experience in data science? \n\n### Response: Prashant Malge "
                  "gained hands-on experience in data science through two internships "
                  "at inuron.ai. During these internships, he addressed complex data "
                  "problems and collaborated effectively with stakeholders."
    },
    {
        "inputs": "### Instruction: How does Prashant Malge approach data problems? "
                  "\n\n### Response: Prashant Malge excels in addressing complex data "
                  "problems and is known for his collaborative approach. He works "
                  "effectively with stakeholders to deliver scalable and secure data "
                  "solutions."
    },
    {
        "inputs": "### Instruction: What are Prashant Malge's interests outside of "
                  "data science? \n\n### Response: Beyond his technical pursuits, "
                  "Prashant Malge has a deep love for tea and a passion for sports. "
                  "He has played softball for years, representing Kolhapur in "
                  "state-level competitions three times."
    },
    {
        "inputs": "### Instruction: Can you share Prashant Malge's personal website? "
                  "\n\n### Response: Prashant Malge's personal website is available "
                  "at https://suyodhanj6.github.io/"
    }
]
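Each sample packs the instruction and the response into a single `inputs` string. The small helper below (illustrative, not part of the project) shows how to split that format back apart, which is handy when inspecting or validating the training data:

```python
def split_sample(sample):
    """Split one '### Instruction: ... ### Response: ...' string into
    its (instruction, response) parts."""
    text = sample["inputs"]
    instruction_part, _, response = text.partition("### Response:")
    instruction = instruction_part.replace("### Instruction:", "").strip()
    return instruction, response.strip()


example = {"inputs": "### Instruction: Who is Prashant Malge? \n\n### Response: A student."}
question, answer = split_sample(example)
# question -> "Who is Prashant Malge?", answer -> "A student."
```

Running every sample through a check like this before fine-tuning catches malformed prompts early, when they are cheap to fix.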
lama2FineTune
│ ├── constants
│ │ └── env_variable.py
└── __init__.py
# Loading Gradient AI credentials from the environment
from dotenv import load_dotenv
import os
# take environment variables from .env.
load_dotenv()
""" Workspace Constants """
GRADIENT_WORKSPACE_ID=os.getenv("GRADIENT_WORKSPACE_ID")
""" Access token of gradients"""
GRADIENT_ACCESS_TOKEN=os.getenv("GRADIENT_ACCESS_TOKEN")
# Other constants from params.yaml
MODEL_ADAPTER_NAME = "PrashantModelAdapter"
NUM_EPOCHS = 3
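The comment above mentions params.yaml; a possible layout for that file, matching the constants defined here (my assumption of its contents, not taken from the repository), would be:

```yaml
# params.yaml -- hypothetical layout matching the constants above
model_adapter_name: PrashantModelAdapter
num_epochs: 3
base_model_slug: nous-hermes2
```

Keeping these values in a YAML file rather than hard-coding them makes experiments reproducible without touching the source.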
lama2FineTune
│ ├── components
│ │ └── fine_tune.py
import sys

from gradientai import Gradient

from lama2FineTune.config import SAMPLES
from lama2FineTune.constants.env_variable import GRADIENT_WORKSPACE_ID, GRADIENT_ACCESS_TOKEN
from lama2FineTune.logger import logging
from lama2FineTune.exception import Llama2Exception
class FineTuner:
    def __init__(self, model_name, num_epochs):
        self.model_name = model_name
        self.num_epochs = num_epochs
        self.gradient = None
        self.model_adapter = None

    def initialize_gradient(self):
        # Initialize the Gradient AI Cloud client with credentials
        self.gradient = Gradient(
            workspace_id=GRADIENT_WORKSPACE_ID,
            access_token=GRADIENT_ACCESS_TOKEN,
        )

    def create_model_adapter(self):
        # Create a model adapter with the specified name on top of the base model
        base_model = self.gradient.get_base_model(base_model_slug="nous-hermes2")
        model_adapter = base_model.create_model_adapter(name=self.model_name)
        return model_adapter

    def fine_tune_model(self, samples):
        # Fine-tune the adapter on the sample list for the requested number of
        # epochs (the SDK accepts the list of {"inputs": ...} samples directly)
        for epoch in range(self.num_epochs):
            logging.info(f"Fine-tuning epoch {epoch + 1}/{self.num_epochs}")
            self.model_adapter.fine_tune(samples=samples)

    def fine_tune(self):
        try:
            # Initialize Gradient AI Cloud
            self.initialize_gradient()
            # Create model adapter
            self.model_adapter = self.create_model_adapter()
            logging.info(f"Created model adapter with id {self.model_adapter.id}")
            # Fine-tune the model
            self.fine_tune_model(SAMPLES)
        except Exception as e:
            # Clean up partially created resources on failure, then re-raise.
            # On success the adapter is kept so the app can query it;
            # delete it yourself once you are done with it.
            if self.model_adapter:
                self.model_adapter.delete()
            if self.gradient:
                self.gradient.close()
            raise Llama2Exception(e, sys)
# if __name__ == "__main__":
# # Example usage
# fine_tuner = FineTuner(model_name=MODEL_ADAPTER_NAME, num_epochs=NUM_EPOCHS)
# fine_tuner.fine_tune()
# app.py
import streamlit as st
from lama2FineTune.components.fine_tune import FineTuner
from lama2FineTune.constants import MODEL_ADAPTER_NAME, NUM_EPOCHS
def main():
    st.title("LLAMA2 Fine-Tuning App")

    # Get user input for model name and number of epochs
    model_name = st.text_input("Enter Model Name", value=MODEL_ADAPTER_NAME)
    num_epochs = st.number_input("Enter Number of Epochs", min_value=1, value=NUM_EPOCHS)

    # Display fine-tuning button
    if st.button("Fine-Tune Model"):
        fine_tuner = FineTuner(model_name=model_name, num_epochs=num_epochs)

        # Perform fine-tuning
        st.info(f"Fine-tuning model {model_name} for {num_epochs} epochs. "
                "This may take some time...")
        fine_tuner.fine_tune()
        st.success("Fine-tuning completed successfully!")

        # Display generated output after fine-tuning
        sample_query = "### Instruction: Who is Prashant Malge? \n\n ### Response:"
        completion = fine_tuner.model_adapter.complete(
            query=sample_query, max_generated_token_count=100
        ).generated_output
        st.subheader("Generated Output (after fine-tuning):")
        st.text(completion)


if __name__ == "__main__":
    main()
streamlit run app.py
Fig: UI Of Fine-Tune app
Fig: UI Of Gradient AI (Model Creation)
In this notebook, I provide code that you can run in Google Colab or a local Jupyter Notebook.
In conclusion, we explored the project structure, developed a personalized model through transfer learning, and constructed a modular coding approach. This project employs a structured and organized process for refining LLAMA2 language models with personalized data. Key elements include a Streamlit application (app.py), a fine-tuning pipeline (fine_tune.py), and further modules for constants, exceptions, logging, and utilities. The design prioritizes clarity, ease of maintenance, and an improved user experience.
Q1. Why is modular coding important in this project?
A: Modular coding is crucial for code clarity, maintainability, and scalability, achieved by separating functionality into dedicated modules.
Q2. What does the Streamlit app provide?
A: The Streamlit app delivers a UI for interacting with the LLAMA2 fine-tuning process. Users can input parameters and initiate automated fine-tuning through the interface.
Q3. What is LLAMA2?
A: LLAMA2 is a large language model designed for natural language processing tasks. It supports transfer learning by allowing fine-tuning on specific domains or tasks using personal datasets.
Q4. How does transfer learning work with LLAMA2?
A: Transfer learning with LLAMA2 involves initializing the model with pre-trained weights and fine-tuning it on domain-specific or task-specific data, adapting its knowledge to the target application.
Q5. How does the project handle logging and errors?
A: The project emphasizes logging for improved runtime visibility and employs custom exceptions to improve error reporting, contributing to a more robust system.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.