Have you ever thought about how to make communication easier for people who use a mix of Hindi and English, commonly known as Hinglish? With the growing use of Hinglish in everyday conversations, social media, and advertising, there’s a need for tools that can accurately translate between English and Hinglish. This is where advanced language models like Gemma 2 9B come into play. By fine-tuning this model, we can create solutions that understand the unique blend of Hindi and English, making communication more effective for a wider audience.
Gemma 2 models represent a significant advancement in artificial intelligence, offering powerful language processing capabilities with a focus on efficiency and accessibility. These models are designed to excel in tasks such as text generation, code writing, and problem-solving. With their compact size and robust performance, Gemma 2 models provide a versatile tool for developers and users alike. They are particularly noted for their competitive performance relative to larger models.
Fine-tuning the multilingual Gemma 2 9B model can be highly beneficial for Hindi translations due to its robust multilingual capabilities and adaptability.
Unsloth AI, founded in 2023 and based in San Francisco, is an innovative startup revolutionizing the fine-tuning and training of large language models (LLMs). With a focus on speed and efficiency, Unsloth’s platform enables model training up to 30 times faster while using 90% less memory compared to traditional methods. This is achieved through advanced software optimizations, such as handwritten GPU kernels, rather than relying on hardware upgrades. The company embraces an open-source approach, boasting over 8 million monthly downloads and 29,000 GitHub stars. By making AI training more accessible and cost-effective, Unsloth AI caters to developers and enterprises alike, fostering a collaborative and inclusive AI ecosystem.
Unsloth speeds up LLM training through several techniques. It manually derives backpropagation steps (a kind of hand-written autograd) for faster gradient calculations, optimizes chained matrix multiplications, and builds custom, more efficient GPU kernels written in the Triton language. It also uses Flash Attention to focus computation on the critical parts of the input. Along with other memory-efficient strategies, these optimizations improve both training speed and memory usage.
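To give a concrete, if toy, sense of what "manually derived backpropagation" means, the sketch below hand-writes the backward pass for a single matrix multiplication with PyTorch's autograd.Function. This is purely illustrative and says nothing about Unsloth's actual kernels, which are far more involved.

import torch

class ManualLinear(torch.autograd.Function):
    # Toy hand-derived backward for y = x @ W (illustration only)
    @staticmethod
    def forward(ctx, x, W):
        ctx.save_for_backward(x, W)
        return x @ W

    @staticmethod
    def backward(ctx, grad_out):
        x, W = ctx.saved_tensors
        grad_x = grad_out @ W.t()   # dL/dx, derived by hand
        grad_W = x.t() @ grad_out   # dL/dW, derived by hand
        return grad_x, grad_W

x = torch.randn(4, 8, requires_grad=True)
W = torch.randn(8, 3, requires_grad=True)
loss = ManualLinear.apply(x, W).sum()
loss.backward()   # runs the hand-written backward above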
In the following tutorial, we fine-tune the multilingual Gemma 2 9B model on a Hinglish dataset, leveraging the Unsloth AI library on Google Colab with a T4 GPU. We save the fine-tuned model to Hugging Face and then query it with different inputs through Ollama. Finally, we explore how the fine-tuned model produces more accurate English-to-Hinglish translations.
We will first install the necessary libraries:
!pip install unsloth
The code below loads the pre-trained Gemma 2 9B language model using the unsloth library. It sets configuration options like a maximum sequence length of 2048 tokens and enables 4-bit quantization to reduce memory usage. The data type (dtype) is auto-detected, and the model and tokenizer are loaded for use in further language processing tasks. This setup optimizes memory efficiency while working with large language models.
from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # Choose any! We auto support RoPE Scaling internally!
dtype = None  # None for auto detection. Float16 for Tesla T4, V100; Bfloat16 for Ampere+
load_in_4bit = True  # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-9b",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
To add LoRA adapters, we only need to update 1 to 10% of all parameters. The code below uses the FastLanguageModel.get_peft_model function to adapt the model with LoRA (Low-Rank Adaptation). It specifies parameters such as the rank (r = 16), the target modules for adaptation, and optimization settings like lora_alpha and bias. The code also enables Unsloth's gradient checkpointing for efficient memory usage and sets a random state for reproducibility.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # Choose any number > 0! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,  # Supports any, but = 0 is optimized
    bias = "none",     # Supports any, but = "none" is optimized
    # "unsloth" uses 30% less VRAM and fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth",  # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None,  # And LoftQ
)
The code below defines a prompt formatting function for preparing training data in a structured format. It starts by creating a template (alpaca_prompt) that includes placeholders for the instruction, input, and output. The formatting_prompts_func function takes in a batch of examples, extracts the English (en) and Hinglish (hi_ng) text, and formats them into the defined template. It adds an EOS_TOKEN (End-of-Sequence token) at the end of each formatted prompt to prevent the model from generating responses indefinitely. The final output is a dictionary with the formatted text for each example, ready for model training or fine-tuning.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""
EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN
def formatting_prompts_func(examples):
    inputs = examples["en"]
    outputs = examples["hi_ng"]
    # Repeat the instruction once per example so zip() covers the whole batch
    instructions = ["Translate English to Hinglish"] * len(inputs)
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}
The code below prepares the dataset in the correct format, with each entry consisting of a properly structured instruction-input-output prompt for Hinglish translation tasks.
from datasets import load_dataset, Dataset, DatasetDict

dataset = load_dataset("nateraw/english-to-hinglish", split = "train")
dataset = dataset.remove_columns(["source"])
df_pandas = dataset.to_pandas()

def apply_format(col1, col2):
    instruction = "Translate English to Hinglish"
    text = alpaca_prompt.format(instruction, col1, col2) + EOS_TOKEN
    return text

df_pandas['text'] = df_pandas.apply(lambda e: apply_format(e['en'], e['hi_ng']), axis = 1)
df_pandas.drop(['en', 'hi_ng'], axis = 1, inplace = True)
dataset = Dataset.from_pandas(df_pandas)
The code below initializes an SFTTrainer for fine-tuning a model using the trl library. It sets up training parameters such as batch size, gradient accumulation steps, and learning rate within TrainingArguments. The trainer also configures logging and optimization settings, including the use of mixed precision (fp16 or bf16) based on hardware support. The training process is optimized with an AdamW optimizer and a linear learning rate scheduler.
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,  # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        # Logging arguments
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none",  # Use this for WandB etc.
    ),
)
trainer_stats = trainer.train()
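trainer.train() returns a TrainOutput object; its metrics dictionary (a standard transformers field) can be inspected to see how long the 60 steps took and the training loss:

print(trainer_stats.metrics.get("train_runtime"), "seconds")
print(trainer_stats.metrics.get("train_loss"))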
The code below sets up inference for the fine-tuned model using FastLanguageModel. It first prepares a prompt (alpaca_prompt) for translation from English to Hinglish by formatting it with an example input. The prompt is tokenized and transferred to a GPU (cuda) for efficient computation. The model then generates a response with a maximum of 64 new tokens, and the output is decoded back into text. Finally, it extracts the part of the output after the “### Response:” section, which contains the generated Hinglish translation.
# alpaca_prompt = Copied from above
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Translate English to Hinglish",  # instruction
            "remind me to get eggs today",    # input
            "",  # output - leave this blank for generation!
        )
    ],
    return_tensors = "pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
output = tokenizer.batch_decode(outputs)
output[0].split("### Response:\n")[1]
Output
'mujhe aaj eggs lene ke liye yaad dilaayen<eos>'
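As an optional variation on the inference cell above, a transformers TextStreamer can be passed to generate so tokens are printed as they are produced rather than decoded at the end:

from transformers import TextStreamer

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64, use_cache = True)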
The following code saves the trained model locally and pushes it to the Hugging Face Hub. You will need to provide a Hugging Face token with write access to the Hub.
model.save_pretrained("lora_model") # Local saving
tokenizer.save_pretrained("lora_model")
model.push_to_hub("mimidutta007/english_to_hinglish_FTgemma2", token = "") # Online saving
tokenizer.push_to_hub("mimidutta007/english_to_hinglish_FTgemma2", token = "") # Online saving
You can find the model on the Hugging Face Hub. I have also converted it to GGUF format so that we can query the model through Ollama as well.
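A sketch of that GGUF export, assuming Unsloth's push_to_hub_gguf helper; the quantization method shown is just one common choice:

model.push_to_hub_gguf(
    "mimidutta007/english_to_hinglish_FTgemma2",
    tokenizer,
    quantization_method = "q4_k_m",  # illustrative quantization choice
    token = "",  # your HF write token
)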
Learn how to interact with the fine-tuned Gemma 2 9B model using Ollama, enabling seamless English-to-Hinglish translations through efficient API queries.
This code installs the Ollama software and the langchain-ollama library, which allows interaction with language models via Ollama. It then starts Ollama as a background subprocess (subprocess.Popen) to run in a non-blocking manner. After waiting for 3 seconds (time.sleep(3)), the code pulls a fine-tuned model (english_to_hinglish_FTgemma2) from Ollama using the ollama pull command. This setup enables the model to be used for English-to-Hinglish translation tasks.
# Installing Ollama and the langchain-ollama library
!curl -fsSL https://ollama.com/install.sh | sh
!pip install langchain-ollama

# Starting a subprocess so that Ollama runs in a non-blocking manner
import subprocess
import time

subprocess.Popen(["ollama", "serve"])
time.sleep(3)

# Pulling the fine-tuned model
!ollama pull hf.co/mimidutta007/english_to_hinglish_FTgemma2
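Optionally, you can confirm the model was pulled before querying it:

!ollama list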
This code sets up a prompt template using langchain for the English-to-Hinglish translation task. It defines a template with placeholders for the instruction and input, creates a ChatPromptTemplate from it, and instantiates the fine-tuned Hinglish translation model via OllamaLLM. The prompt and model are combined into a chain, the input data is passed to the chain to generate a translation, and the result is displayed in Markdown format.
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown

# Define the template
template = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{Instruction}
### Input:
{Input}
### Response:
"""

# Create a prompt template
prompt = ChatPromptTemplate.from_template(template)

# Instantiate the model
model = OllamaLLM(model="hf.co/mimidutta007/english_to_hinglish_FTgemma2")

# Chain the prompt and model
chain = prompt | model

input_data = {
    "Instruction": "Translate from English to Hinglish",
    "Input": "are there any roads closed in the area due to construction"
}

# Invoke the chain with the input data and display the response in Markdown format
response = chain.invoke(input_data)
Markdown(response)
Output: 'kya area ke kisi road par construction ki wajah se band hai'

Input: "please text Joanne Brennan that I will be five minutes late."
Output: 'Joanne Brenan ko message karo ke main 5 minutes late hoon'

Input: "remind me to get eggs today"
Output: 'mujhe aaj eggs lene ke liye yaad dilaayen<eos>'

Input: "Rotate the Image 90 degrees to the right"
Output: 'Image ko 90 degree tak apne Right me rotate kare'
As we can see from the outputs above, the sentences are translated well into Hinglish: some words are kept in English while others are converted to Hindi. This suggests that the fine-tuned model has picked up Hinglish's mixed grammar and can generate accurate outputs.
Let's also compare what the original Gemma 2 9B model outputs with the fine-tuned model for a few inputs:
| Input | Output From Original Gemma 2 9B | Output From Fine-tuned Gemma 2 9B |
|---|---|---|
| play some motivating music | Karo Yaar Kuch Motivational Gaane | kuch motivate karne wala music bajao |
| play some motivating music | Chal yaar, kucch pumping gaane bajao | kuch motivate karne wali music play kare |
| remind me to get eggs today | Yaar, mujhe aaj ande lene ka yaad dila de! | mujhe aaj eggs lene ke liye yaad dilaayen |
| please text Joanne Brennan that I will be five minutes late | Bhai Joanne Brennan ko msg kar de ki main 5 minute late ho jaunga. | Joanne Brennan ko message karo ki main 5 minutes late ho |
| Request you to please give feedback on comment section | Kya comment section mein kya chal raha hai? Bata de feedback do! | Mujhe comment section par apne feedback dene ki request hai |
The development of LLM models for Hinglish translation is crucial for bridging the gap between formal languages and the hybrid dialect commonly used in India’s everyday communication. Fine-tuning the multilingual Gemma 2 9B model offers significant advantages, especially with its efficiency, multilingual strengths, and adaptability to Hinglish’s unique nuances. This approach not only enhances translation accuracy but also facilitates better communication in personal and professional contexts. With the support of Unsloth AI’s innovative fine-tuning capabilities, this model can revolutionize Hinglish translation and improve engagement across diverse audiences.
Q1. Why is Hinglish translation important?
A. Hinglish, a blend of Hindi and English, is widely used in informal communication in India, especially on social media, in advertising, and in daily conversations. Developing LLM models for Hinglish translation helps businesses and individuals effectively communicate with a broader audience, improving engagement and bridging the gap between formal and colloquial language.
Q2. What is the Gemma 2 9B model?
A. The Gemma 2 9B model is a powerful language processing tool with 9 billion parameters, offering robust performance across multilingual tasks. Its compact size, high efficiency, and adaptability make it an ideal candidate for fine-tuning on Hinglish datasets, improving translation accuracy and capturing Hinglish’s unique syntax and cultural nuances.
Q3. How does fine-tuning improve English-to-Hinglish translation?
A. Fine-tuning the Gemma 2 9B model using curated Hinglish datasets allows the model to adapt to the language’s distinct syntax, grammar, and vocabulary. This customization ensures more accurate and culturally relevant translations from English to Hinglish, improving communication in both personal and professional contexts.
Q4. What does Unsloth AI bring to the fine-tuning process?
A. Unsloth AI offers significant advantages by enabling faster training (up to 30 times faster) while using 90% less memory than traditional methods. This platform makes the fine-tuning process more efficient, cost-effective, and accessible, helping developers create highly specialized language models with fewer resources.