How to Build a Multilingual Chatbot using Large Language Models?

shalutyagi 30 Jun, 2024

Introduction

This article walks through building a multilingual chatbot for linguistically diverse regions such as India using large language models (LLMs). The system uses an LLM to translate questions from a local language into English, answers them, and translates the answers back, which widens customer reach and supports personalization. We cover the architecture, implementation details, advantages, and the steps required to build it, and close with possible future enhancements and wider deployment of the solution.

Learning Objectives

  • Understand the role and functionalities of Large Language Models (LLMs) in enhancing customer experience and personalization.
  • Learn how to develop a multilingual chatbot using LLMs for translation and query handling.
  • Explore the architecture and implementation details of a multilingual chatbot using tools like Gradio, Databricks, Langchain, and MLflow.
  • Gain knowledge on embedding techniques and creating a vector database for retrieval-augmented generation (RAG) with personalized data.
  • Identify potential advancements and future enhancements in scaling and fine-tuning LLM-based multilingual chatbots for broader applications.

This article was published as a part of the Data Science Blogathon.

Rise of Technology and ChatGPT

With advances in technology and the launch of ChatGPT, the world has shifted its focus to putting large language models to practical use. Organizations are rapidly adopting them to drive business value: to enhance customer experience, add personalization, and improve customer reach.

Role of Large Language Models

An LLM is a computer program that has been fed enough examples to be able to recognize and interpret human language or other types of complex data. Many organizations train LLMs on data gathered from the Internet — comprising thousands or millions of gigabytes’ worth of text. But the quality of the samples impacts how well LLMs will learn natural language, so an LLM’s programmers may use a more curated data set.

Understanding Large Language Models (LLMs)

LLMs use a type of machine learning called deep learning in order to understand how characters, words, and sentences function together. Deep learning involves the probabilistic analysis of unstructured data, which eventually enables the deep learning model to recognise distinctions between pieces of content without human intervention.

Programmers further adapt LLMs through fine-tuning or prompt-tuning so that they perform specific tasks such as interpreting questions, generating responses, or translating text from one language to another.

Motivation for a Multilingual Chatbot

Many regions of the world have several spoken languages. India is one such multilingual country, and only 10% of its population is literate in English. In such regions, a single common language is often adopted for communication across communities. But this can make one language dominant over the others and put speakers of other languages at a disadvantage.

It can also lead to the disappearance of a language, along with its distinctive culture and way of thinking. For national and international companies operating here, producing business and marketing content in multiple languages is expensive, so most of them stick to one language of commerce, English. That can mean losing the opportunity to connect with local audiences, and with it potential customers. While using English is not inherently wrong, it excludes those who are not conversant in the language from participating in mainstream commerce.

Proposed Solution

The proposed solution allows people to ask queries in their local language, use LLMs to understand and retrieve information in English, and translate it back into the local language. This solution leverages the power of Large Language Models for translation and query handling.

Key Features and Functionalities

  • Translation from local language to English and vice-versa.
  • Finding the most similar query in the database.
  • Answering queries via the base LLM if no similar query is found.
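
The fallback behaviour in the last two points can be sketched as a small routing function. This is a minimal illustration, not the article's implementation: the `search` and `base_llm` callables and the `min_score` threshold are hypothetical stand-ins for the vector search and the LLM call.

```python
from typing import Callable, Optional, Tuple


def answer_query(
    question_en: str,
    search: Callable[[str], Optional[Tuple[str, float]]],
    base_llm: Callable[[str], str],
    min_score: float = 0.75,  # hypothetical similarity threshold
) -> str:
    """Return a stored answer when a close match exists, else ask the base LLM."""
    hit = search(question_en)  # (stored_answer, similarity) or None
    if hit is not None:
        stored_answer, score = hit
        if score >= min_score:
            return stored_answer
    # No sufficiently similar query was found: fall back to the base LLM
    return base_llm(question_en)
```

A suitable threshold depends on the embedding model and the data, so it would need to be tuned on real queries.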

This solution helps businesses, especially banks, to reach a wider population and allows banking services to benefit common people, improving their financial prospects.

Building a Multilingual Chatbot using Large Language Models

Advantages of a Multilingual Chatbot

  • Increased Customer Reach: By supporting multiple languages, a chatbot can reach a wider audience and assist users who do not share a common language. It makes information and services, especially essential services like banking, more accessible, which benefits both the people and the company.
  • Improved Personalization: Multi-lingual chatbots can provide personalized recommendations and tailored experiences to users based on their language and cultural preferences.
  • Enhanced Customer Service: Chatbots can provide better customer service and help resolve issues more efficiently, thus leading to increased customer satisfaction.

Architecture of the Multilingual Chatbot

  • The user opens the Gradio app and types a question in the local language.
  • Translation: The LLM (Llama-2-70b-chat), called through an MLflow route, translates the prompt from the local language to English.
  • The system converts the translated prompt to embeddings using Instructor-XL and searches for it in the vector database (Chroma) built from the personalized local-language data.
  • The system passes the prompt, together with the context (the most similar entries from the semantic search), to the LLM to generate the result.
  • Translation: The LLM translates the result back into the local language.
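
The flow above can be summarised as a chain of four calls. The sketch below is only an outline under stated assumptions: the function name `run_pipeline` and its four callable parameters are hypothetical, and in the real system each one would wrap an LLM route or the vector database.

```python
from typing import Callable


def run_pipeline(
    user_text: str,
    to_english: Callable[[str], str],
    retrieve_context: Callable[[str], str],
    answer: Callable[[str, str], str],
    to_local: Callable[[str], str],
) -> str:
    """Local-language question in, local-language answer out."""
    question_en = to_english(user_text)        # translate the prompt to English
    context = retrieve_context(question_en)    # semantic search in the vector DB
    answer_en = answer(context, question_en)   # LLM answers using the context
    return to_local(answer_en)                 # translate the answer back
```

Keeping each stage behind a plain callable makes it possible to test the flow with stubs before any model is connected.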

Implementation Details

  • Gradio is used to build the front end of the app.
  • Databricks is used for coding; the entire framework is built in Databricks.
  • Llama-2 70b chat is the chosen large language model. MosaicML Inference is used to get the chat completion output for a prompt.
  • The application creates embeddings using the Instructor-XL model.
  • The application stores the embeddings in the Chroma vector database.
  • The framework and pipeline of the app use LangChain and MLflow.

Code Implementation

Let us now implement the multilingual chatbot using a large language model.

Step 1: Installing Necessary Packages

The packages can be installed with pip; in a Databricks notebook, use the %pip magic command.

%pip install mlflow
%pip install --upgrade langchain
%pip install faiss-cpu
%pip install pydantic==1.10.9
%pip install chromadb
%pip install InstructorEmbedding
%pip install gradio

Loading the CSV for RAG implementation and converting it into text chunks

RAG is an AI framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information and to give users insight into LLMs’ generative process.

Researchers and developers use retrieval-augmented generation (RAG) to improve the quality of LLM-generated responses by grounding the model on external sources of knowledge, supplementing the LLM’s internal representation of information.

The data consisted of question-response pairs in Hindi. One can generate a similar set of question-response pairs for any language and use it as input for the RAG implementation.
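
As a rough illustration of what such an input file could look like, the sketch below writes question-response rows as CSV. The column names `question` and `response` and the placeholder answer text are assumptions; the article does not show the actual file layout.

```python
import csv
import io


def write_qa_csv(rows, handle):
    """Write question-response pairs as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=["question", "response"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample row; the response text is a placeholder, not real data
rows = [
    {"question": "गोल्ड लोन क्या है?", "response": "<answer text in Hindi>"},
]

# An in-memory buffer keeps the example self-contained; for a real file use
# open(path, "w", encoding="utf-8", newline="") instead
buffer = io.StringIO()
write_qa_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Writing the file as UTF-8 matches the `encoding="utf-8"` argument passed to the loader in the next step.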

Step 2: Loading and Preparing Data for RAG

from langchain.document_loaders.csv_loader import CSVLoader
loader = CSVLoader(file_path="/Workspace/DataforRAG_final1.csv",
encoding="utf-8", csv_args={'delimiter': ','})
data = loader.load()


from langchain.text_splitter import RecursiveCharacterTextSplitter
#from langchain.text_splitter import CharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
text_chunks = text_splitter.split_documents(data)
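
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified fixed-size splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter first tries to break on separators such as paragraphs and newlines); the function `split_fixed` is a hypothetical helper that only shows how size and overlap interact.

```python
from typing import List


def split_fixed(text: str, chunk_size: int = 500, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With `chunk_overlap=0`, as in the code above, chunks do not share any characters; a small overlap can help when an answer straddles a chunk boundary.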

Loading the Instructor-Xl embeddings

We downloaded the Instructor-XL embeddings from the Hugging Face site.

Step 3: Creating and Storing Embeddings

from langchain.embeddings import HuggingFaceInstructEmbeddings
instructor_embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-xl", 
                                                    model_kwargs={"device": "cuda"})

We used the Chroma vector database to store the embeddings created for the RAG data. We created the instructor-xl embeddings for the personalized dataset.

# Embed and store the texts
# Supplying a persist_directory will store the embeddings on disk
from langchain.vectorstores import Chroma
persist_directory = 'db'

# Use the Instructor-XL embeddings created above
embedding = instructor_embeddings

vectordb = Chroma.from_documents(documents=text_chunks, 
                                 embedding=embedding,
                                 persist_directory=persist_directory)


# Persist the database to disk
vectordb.persist()
vectordb = None

# Now we can load the persisted database from disk, and use it as normal. 
vectordb = Chroma(persist_directory=persist_directory, 
                  embedding_function=embedding)
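
Conceptually, the semantic search compares the query embedding with the stored embeddings and returns the closest entries. The toy sketch below uses cosine similarity over plain Python lists; it is an illustration only, since Chroma uses its own index and the distance metric depends on how it is configured. The names `cosine` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    store: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored texts most similar to the query, best first."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Real embeddings have hundreds of dimensions, but the ranking logic is the same.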

Step 4: Defining the Prompt Template

from langchain.llms import MlflowAIGateway
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
import gradio as gr
from mlflow.gateway import set_gateway_uri, create_route, query,delete_route

set_gateway_uri("databricks")

mosaic_completion_route = MlflowAIGateway(
  gateway_uri="databricks",
  route="completion"
)

# Wrap the prompt and Gateway Route into a chain

template = """[INST] <<SYS>>
You are Banking Question Answering Machine. Answer Accordingly
<</SYS>>

{context}

{question} [/INST]

"""
prompt = PromptTemplate(input_variables=['context', 'question'], 
template=template)

retrieval_qa_chain = RetrievalQA.from_chain_type(llm=mosaic_completion_route, chain_type="stuff", 
retriever=vectordb.as_retriever(), chain_type_kwargs={"prompt": prompt})


def generating_text(prompt, route_param, token):
    # Create a route for text completions with the MosaicML Inference API
    create_route(
        name=route_param,
        route_type="llm/v1/completions",
        model={
            "name": "llama2-70b-chat",
            "provider": "mosaicml",
            "mosaicml_config": {
                # Placeholder value; replace with your own MosaicML API key
                "mosaicml_api_key": "3abc"
            }
        }
    )

    # Query the route; a low temperature keeps the output close to deterministic
    response1 = query(
        route=route_param,
        data={"prompt": prompt, "temperature": 0.1, "max_tokens": token}
    )

    return response1
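
The Llama-2 chat models expect the system instruction to be wrapped in `<<SYS>>` markers inside an `[INST]` block. A small helper can keep that formatting in one place; `build_llama2_prompt` is a hypothetical name used for illustration and is not part of the article's code.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in the Llama-2 chat style."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


# Example: a translation request built from a system and a user message
example_prompt = build_llama2_prompt(
    "Translate the given text to English and return only the translation.",
    "गोल्ड लोन क्या है?",
)
```

Using one helper for every call avoids small inconsistencies in the markers between the retrieval prompt and the translation prompts.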

Step 5: Gradio App Development

We developed the front-end using the Gradio package. It features a fixed template that can be customized according to one’s needs.

def greet(Input, chat_history):
    # Route names for the two translation calls
    RouteName1 = "text1"
    RouteName2 = "text2"

    # System prompt: English -> Hindi (used for the final answer)
    system = """you are a translator which converts english to hindi. 
    Please translate the given text to hindi language and 
    only return the content translated. no explanation"""

    # System prompt: Hindi -> English (used for the incoming question)
    system1 = """you are a translator which converts hindi to english. 
    Please translate the given text to english language from hindi language and 
    only return the content translated. 
    no explanation"""

    # 1. Translate the user's Hindi question to English
    prompt = f"[INST] <<SYS>> {system1} <</SYS>> {Input} [/INST]"
    delete_route(RouteName1)  # remove the route left over from a previous call
    result = generating_text(prompt, RouteName1, 400)
    res = result['candidates'][0]['text']

    # 2. Answer the English question with the retrieval QA chain
    t = retrieval_qa_chain.run(res)

    # 3. Translate the English answer back to Hindi
    prompt2 = f"[INST] <<SYS>> {system} <</SYS>> {t} [/INST]"
    delete_route(RouteName2)
    token = 800
    result1 = generating_text(prompt2, RouteName2, token)

    # Append the turn to the history and return "" to clear the textbox
    chat_history.append((Input, result1['candidates'][0]['text']))
    return "", chat_history


with gr.Blocks(theme=gr.themes.Soft(primary_hue=gr.themes.colors.blue, 
    secondary_hue=gr.themes.colors.red)) as demo:
    gr.Markdown("## सखा- भाषा अब कोई बाधा नहीं है")
    chatbot = gr.Chatbot(height=400) #just to fit the notebook
    msg = gr.Textbox(label="Prompt",placeholder="अपना प्रश्न हिंदी में यहां दर्ज करें",max_lines=2)
    with gr.Row():
        btn = gr.Button("Submit")
        clear = gr.ClearButton(components=[msg, chatbot], value="Clear console")
    # Both the button and the Enter key trigger the same handler
    btn.click(greet, inputs=[msg, chatbot], outputs=[msg, chatbot])
    msg.submit(greet, inputs=[msg, chatbot], outputs=[msg, chatbot])
    # Clickable example questions that fill the prompt textbox
    gr.Examples([["एचडीएफसी बैंक का कस्टमर केयर नंबर क्या है?"],
                 ["गोल्ड लोन क्या है?"], ["गोल्ड लोन के लिए आवश्यक दस्तावेज।"]],
                 inputs=[msg])

gr.close_all()
demo.launch(share=True,debug=True)
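
The handler wired to `btn.click` and `msg.submit` returns two values: an empty string that clears the textbox and the updated chat history. The sketch below isolates that contract so it can be tested without Gradio or an LLM; `make_handler` and `respond` are hypothetical names, and the tuple-style history mirrors the code above.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def make_handler(respond: Callable[[str], str]):
    """Build a chat handler that returns (cleared textbox value, new history)."""
    def handle(message: str, history: History):
        if not message.strip():
            # Ignore empty submissions and leave the history unchanged
            return "", history
        reply = respond(message)
        # Build a new list instead of mutating the caller's history
        return "", history + [(message, reply)]
    return handle
```

In the app, `respond` would be the translate, retrieve, and translate-back sequence performed inside `greet`.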


Further Advancements

Let us now explore further advancements for the multilingual chatbot.

Scaling to Different Regional Languages

Currently, for demo purposes, we have built the solution for the Hindi language. The same can be scaled for different regional languages.
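
One simple way to scale to another language is to parameterise the two translation system prompts by language name instead of hard-coding Hindi. This is a sketch under assumptions: `translation_prompts` is a hypothetical helper, and translation quality for a given language would still need to be checked against the chosen LLM.

```python
def translation_prompts(language: str) -> dict:
    """Build the two translation system prompts for a given local language."""
    to_english = (
        f"You are a translator which converts {language} to English. "
        f"Translate the given text from {language} to English and "
        "return only the translated content, with no explanation."
    )
    to_local = (
        f"You are a translator which converts English to {language}. "
        f"Translate the given text to {language} and "
        "return only the translated content, with no explanation."
    )
    return {"to_english": to_english, "to_local": to_local}
```

The RAG data would also need to be prepared for each additional language, as was done here for Hindi.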

Fine-Tuning LLAMA-2 70b Chat Model

  • Fine-tuning the model with custom data in Hindi.
  • Extending fine-tuning to other local native languages.

Potential Enhancements and Future Work

  • Incorporating additional features and functionalities.
  • Improving the accuracy and efficiency of translations and responses.
  • Exploring the integration of more advanced LLMs and embedding techniques.

Conclusion

Large language models (LLMs) could be used to create a multilingual chatbot that will transform accessibility and communication in linguistically varied areas like India. This technology improves customer engagement by addressing linguistic hurdles. Future developments in LLM capabilities and scaling to additional languages will improve user experience even more and increase the global reach of multilingual chatbots.

Key Takeaways

  • Multilingual chatbots leveraging LLMs bridge language gaps, enhancing accessibility and user engagement.
  • Integration of Gradio, Databricks, Langchain, and MLflow streamlines multilingual chatbot development.
  • Use of retrieval-augmented generation (RAG) improves response quality by leveraging external knowledge sources.
  • Personalized experiences and expanded customer reach are facilitated through language-specific embeddings and vector databases.
  • Future advancements aim to scale and fine-tune LLMs for broader linguistic diversity and enhanced efficiency.

Frequently Asked Questions

Q1. What is a multilingual chatbot?

A. A multilingual chatbot is an AI-powered tool capable of understanding and responding in multiple languages, facilitating communication across diverse linguistic backgrounds.

Q2. How do large language models (LLMs) enhance multilingual chatbots?

A. LLMs enable multilingual chatbots to translate queries, understand context, and generate responses in different languages with high accuracy and naturalness.

Q3. What are the advantages of using LLMs in multilingual chatbots?

A. LLMs improve customer reach by catering to diverse language preferences, enhance personalization through tailored interactions, and boost efficiency in handling multilingual queries.

Q4. How can businesses benefit from implementing multilingual chatbots?

A. Businesses can expand their customer base by providing services in customers’ preferred languages, improve customer satisfaction with personalized interactions, and streamline operations across global markets.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
