In the era of AI, conversational agents, or chatbots, have emerged as pivotal tools for engaging users, assisting them, and enhancing the user experience across digital platforms. Chatbots powered by advanced AI techniques enable automated, interactive conversations that resemble human dialogue. With the launch of ChatGPT, the ability to answer user queries has reached new heights, and building ChatGPT-like chatbots on custom data can give businesses better user feedback and experience. In this article, we will use LangChain to build a ChatGPT-like chatbot that answers questions over multiple custom websites, using the Retrieval Augmented Generation (RAG) technique. To begin the project, we will first understand a few critical components needed to build such an application.
Here is what you will learn from this project: large language chat models, the LangChain framework, Retrieval Augmented Generation (RAG), embeddings, and vector databases.
This article was published as a part of the Data Science Blogathon.
To build a chatbot like ChatGPT, a framework like LangChain comes into the picture. LangChain is an open-source framework designed to drive the development of applications powered by large language models (LLMs). At its core, LangChain facilitates the creation of applications that possess a crucial attribute: context awareness. These applications connect LLMs to custom data sources, including prompt instructions, few-shot examples, and contextual content. Through this vital integration, the language model can ground its responses in the provided context, resulting in a more nuanced and informed interaction with the user.
LangChain provides a high-level API that makes it easy to connect language models to other data sources and build complex applications. With it, you can build applications such as search engines, advanced recommendation systems, eBook PDF summarization, question-answering agents, code-assistant chatbots, and many more.
Large language models are great at generating responses as conversational AI. They can handle various tasks such as code generation, email writing, drafting blog articles, and so on. One huge disadvantage, however, is domain-specific knowledge: LLMs tend to hallucinate when answering domain-specific questions. One approach to reducing hallucinations and teaching a pre-trained LLM a domain-specific dataset is Fine Tuning. Fine Tuning reduces hallucinations and is an effective way to make a model learn domain knowledge, but it comes at a higher cost: it requires training time and computational resources that are expensive.
This is where RAG comes to the rescue. Retrieval Augmented Generation (RAG) feeds domain-specific data content to the LLM so that it can produce contextually relevant and factual responses. RAG not only injects this knowledge but also requires no re-training of the LLM. This approach reduces the computational requirements and helps organizations operate on a limited training infrastructure. RAG utilizes vector databases, which also help in scaling the application.
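Conceptually, RAG is a retrieve-then-generate loop. The following minimal sketch (with hypothetical retriever and llm objects, not the exact code we build later) illustrates the pattern that frameworks like LangChain automate for us:
def answer_with_rag(query, retriever, llm):
    # 1. retrieve the chunks most similar to the query from the vector database
    context_docs = retriever.get_relevant_documents(query)
    context = "\n\n".join(doc.page_content for doc in context_docs)
    # 2. ask the LLM to answer grounded in that retrieved context
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm.predict(prompt)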
The figure demonstrates the workflow of the Chat with Multiple Websites project.
Let’s dive into the code to understand the components used in the workflow.
You can install LangChain using the pip command. We also install openai to work with the OpenAI API, along with chromadb (the vector database) and tiktoken (the tokenizer).
pip install langchain
pip install openai
pip install chromadb tiktoken
Let’s set up the OpenAI API key.
In this project, we will use ChatOpenAI with the gpt-3.5-turbo-16k model and OpenAI embeddings. Both of these components require an OpenAI API key. To get your API key, log in to platform.openai.com.
1. After you log into your account, click on your profile and choose “View API keys“.
2. Press “Create new secret key” and copy your API key.
Create an environment variable using the os library as shown below, and paste in your API key.
import os
os.environ['OPENAI_API_KEY'] = "sk-......zqBp" #replace the key
To build a chatbot application like ChatGPT, the fundamental requirement is custom data. Since we want to chat with multiple websites in this project, we need to define the website URLs and load these data sources via WebBaseLoader. A LangChain loader such as WebBaseLoader scrapes the data content from the respective URLs.
from langchain.document_loaders import WebBaseLoader
URLS = [
'https://medium.com/@jaintarun7/getting-started-with-camicroscope-4e343429825d',
'https://medium.com/@jaintarun7/multichannel-image-support-week2-92c17a918cd6',
'https://medium.com/@jaintarun7/multi-channel-support-week3-2d220b27b22a'
]
loader = WebBaseLoader(URLS)
data = loader.load()
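As a quick, optional sanity check (the printed values below are illustrative), you can confirm that each URL was loaded as a separate LangChain Document carrying the scraped text and its source metadata:
print(len(data))           # 3 documents, one per URL
print(data[0].metadata)    # e.g. {'source': 'https://medium.com/@jaintarun7/...', ...}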
In classical NLP, chunking refers to identifying and segmenting contiguous, non-overlapping groups of words (or tokens) in a sentence that serve a common grammatical function. In our context, the idea is simpler: chunking breaks the large text down into smaller segments that fit within the model’s context window. LangChain provides text-splitter utilities such as CharacterTextSplitter, which splits text on a separator character into chunks of a maximum size.
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
websites_data = text_splitter.split_documents(data)
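Again optionally, you can inspect how many chunks the splitter produced; the exact count depends on the length of the scraped pages:
print(len(websites_data))                    # number of ~1000-character chunks
print(websites_data[0].page_content[:200])   # preview of the first chunk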
For a deep learning model dealing with text, the text must pass through an embedding layer. In the same way, to make our model learn the context, the chunked data needs to be converted into embeddings. Embeddings are a way to convert words, tokens, or whole text passages into numerical vectors. This transformation is crucial because it represents textual data, which is inherently discrete and symbolic, in a continuous vector space where similar texts end up close together. Each chunk is represented by its own vector.
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
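To see what an embedding looks like, you can embed a sample string directly (the query text here is arbitrary); with OpenAI’s default text-embedding-ada-002 model, each text maps to a 1536-dimensional vector:
sample_vector = embeddings.embed_query("What is caMicroscope?")   # any string works
print(len(sample_vector))   # 1536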
The actual website data is extracted and converted into embeddings, which are in vector form. Vector databases are a specialized way to store these embeddings; Chroma is one such database. The vector database is a newer type of database that is becoming popular in the world of ML and AI. Its key advantage lies in its search techniques, in particular similarity search: given a user query, the result of similarity search and retrieval is a ranked list of vectors with the highest similarity scores to the query vector. Using this mechanism, the application returns responses grounded in the stored content.
A few of the commonly used and popular open-source vector stores are Chroma, Elasticsearch, Milvus, Qdrant, Weaviate, and FAISS.
from langchain.vectorstores import Chroma
websearch = Chroma.from_documents(websites_data, embeddings)
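Before wiring up the full chain, you can optionally query the vector store directly to see what similarity search returns; the query string here is just an example:
docs = websearch.similarity_search("multi-channel image support", k=2)   # top-2 most similar chunks
for doc in docs:
    print(doc.metadata.get("source"), "->", doc.page_content[:100])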
In this step, we define the large language model that is used to create the response. Make sure you use gpt-3.5-turbo-16k as the model when working with multiple data sources: the retrieved context can occupy a large number of tokens, and the 16k context window helps you avoid an InvalidRequestError.
from langchain.chat_models import ChatOpenAI
model = ChatOpenAI(model='gpt-3.5-turbo-16k',temperature=0.7)
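Note that the chat model can also be called on its own, without any retrieved context; this is a handy way to check that your API key and model name are set up correctly (the prompt is arbitrary):
print(model.predict("In one sentence, what is Retrieval Augmented Generation?"))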
We have reached the final part of the project, where we take the input prompt and use the vector database as a retriever to fetch the relevant context for the entered prompt. RetrievalQA chains the large language model and the vector database together, which helps produce better responses.
from langchain.chains import RetrievalQA
rag = RetrievalQA.from_chain_type(llm=model, chain_type="stuff", retriever=websearch.as_retriever())
prompt = "Write code implementation for Multiple Tif image conversion into RGB"
response = rag.run(prompt)
print(response)
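If you want to verify which chunks the answer was grounded in, RetrievalQA also accepts a return_source_documents flag; a variant of the chain above (same components, different call style) looks like this:
rag_with_sources = RetrievalQA.from_chain_type(
    llm=model,
    chain_type="stuff",
    retriever=websearch.as_retriever(),
    return_source_documents=True
)
result = rag_with_sources({"query": prompt})
print(result["result"])                                                     # the generated answer
print([doc.metadata.get("source") for doc in result["source_documents"]])  # source URLs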
Putting everything together, here is the complete code for the project:
#installation
!pip install langchain openai tiktoken chromadb
#import required libraries
import os
from getpass import getpass
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
#set up OpenAI API Key
api_key = getpass()
os.environ['OPENAI_API_KEY'] = api_key
#ETL=> load the data
URLS = [
'https://medium.com/@jaintarun7/getting-started-with-camicroscope-4e343429825d',
'https://medium.com/@jaintarun7/multichannel-image-support-week2-92c17a918cd6',
'https://medium.com/@jaintarun7/multi-channel-support-week3-2d220b27b22a'
]
loader = WebBaseLoader(URLS)
data = loader.load()
#Chunking => Text Splitter into smaller tokens
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
websites_data = text_splitter.split_documents(data)
#create embeddings
embeddings = OpenAIEmbeddings()
#store embeddings and the data inside Chroma - vector database
websearch = Chroma.from_documents(websites_data, embeddings)
#define chat large language model-> 16K token size
model = ChatOpenAI(model='gpt-3.5-turbo-16k',temperature=0.7)
#retrieval chain
rag = RetrievalQA.from_chain_type(llm=model, chain_type="stuff", retriever=websearch.as_retriever())
#retrieve relevant output
prompt = "Write code implementation for Multiple Tif image conversion into RGB"
#run your query
response = rag.run(prompt)
print(response)
To conclude the article: we have successfully built a chatbot for multiple websites using LangChain. This is not just a simple chatbot; rather, it is a chatbot that answers like ChatGPT, but on your data. The key takeaway from this article is that LangChain and RAG let you ground an LLM’s responses in your own content without re-training the model. We hope this project use case has inspired you to explore the potential of LangChain and RAG.
Q1. What is the difference between a large language model and LangChain?
A. A large language model (LLM) is a transformer-based model that generates text based on the user prompt, whereas LangChain is a framework that provides the LLM as one component alongside various other components such as memory, vector databases, embeddings, and so on.
Q2. What is LangChain used for?
A. LangChain is a robust open-source framework used to build chatbots like ChatGPT on your own data. With this framework, you can build various applications such as search applications, question-answering bots, code-generation assistants, and more.
Q3. Is LangChain a library or a framework?
A. LangChain is an open-source framework designed to drive the development of applications powered by large language models (LLMs). At its core, LangChain facilitates the creation of applications that possess a crucial attribute: context awareness.
Q4. What does RAG do?
A. Retrieval Augmented Generation (RAG) ensures that domain-specific data content is fed to the LLM so it can produce contextually relevant and factual responses.
Q5. What is the difference between RAG and Fine Tuning?
A. RAG is a technique that combines an LLM with an information knowledge store to generate a response. The core idea behind RAG is knowledge transfer that requires no training of the model, whereas Fine Tuning is a technique where we expose the LLM to the data and re-train the model to incorporate external knowledge.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.