A New Era of Text Generation: RAG, LangChain, and Vector Databases

Babina Banjara Last Updated : 17 Nov, 2023

13 min read

Introduction

Innovative techniques continually reshape how machines understand and generate human language in the rapidly evolving landscape of natural language processing. One such groundbreaking approach is Retrieval Augmented Generation (RAG), which combines the power of generative models like GPT (Generative Pretrained Transformer) with the efficiency of vector databases and langchain.

RAG represents a paradigm shift in the way machines process language, bridging the gap between generative models and retrieval models to achieve a level of contextual understanding and responsiveness that was previously unparalleled. In this blog post, we will explore RAG’s core concepts, integration with GPT models, the role of vector databases, and its real-world applications.

Learning Objectives

Understand the Fundamentals of Retrieval Augmented Generation (RAG)
Gain insights into Vector Database and its innovative data storage and retrieval approach using vectors.
Understand how RAG, LangChain, and Vector Database work in tandem to interpret user queries, retrieve relevant information, and generate coherent responses.
Develop practical skills in applying integrated technologies for specific applications.

This article was published as a part of the Data Science Blogathon.

What is RAG?
Understanding the Essence of RAG
What is Langchain?
Vector Databases in RAG
How do Vector Databases Store and Retrieve Data?
Benefits of Using Vector Databases in RAG
Putting Everything Together
Implementing Question Answering in LangChain
load_qa_chain
Retrieval QA Chain
Conversation Retrieval Chain
Reducing Hallucinations in GenAI Applications with RAG
Evaluating RAG Applications
Ethical Considerations in the Age of Advanced Information Technologies
Frequently Asked Questions

What is RAG?

Retrieval Augmented Generation (RAG) represents a fusion of generative and retrieval models. It seamlessly combines the creative capabilities of generative models with the precision of retrieval systems, allowing for diverse and contextually relevant content generation.

In some traditional language generation tasks, such as text completion or question answering, generative models like GPT (Generative Pre-trained Transformer) have shown excellent capabilities in generating contextually relevant text. However, when the input context is ambiguous or the data is scarce, they may produce incorrect or inconsistent responses.

Retrival-based models, on the other hand, can efficiently extract relevant responses from a vast database of pre-written responses or documents. To deliver accurate responses, chatbots and information retrieval systems frequently utilize them. However, because they are limited to the responses in their database, they lack the originality and adaptability of generative models.

Understanding the Essence of RAG

At its core, RAG is a sophisticated approach that harmoniously blends two fundamental aspects of natural language processing.

Generative Models

Generative models, like OpenAI’s GPT (Generative Pretrained Transformer), have gained widespread recognition for generating coherent and contextually relevant text based on large-scale training datasets. These models employ intricate neural architectures to predict and generate sequences of words, making them highly creative in generating human-like language.

Retrieval Models

On the other hand, retrieval models are designed to efficiently search and retrieve specific pieces of information from vast datasets. These models, often utilizing advanced techniques like vector embeddings, can quickly identify relevant documents, paragraphs, or sentences based on the input query.

What is Langchain?

Let’s first capture the essence of LangChain before we decipher its subtleties. A reliable library called LangChain was created to make interactions with different large language model( LLM) providers, such as OpenAI, Cohere, Bloom, Huggingface, and others, more leisurely. The ability to create Chains and logical connections that aid in bridging one or more LLMs distinguishes LangChain from other platforms.

Why LangChain?

Countless opportunities are available on LangChain; your imagination is the only constraint.

Imagine chatbots that offer users wit and charm in addition to information.
Picture e-commerce platforms that accurately recommend products and compel customers to buy.
Envision healthcare apps provide individualized medical insights, enabling people to choose wisely for their well-being.

You can have extraordinary experiences with LangChain. You have the power to make these concepts a reality right in front of you.

Vector Databases in RAG

A vector database is a sophisticated system that stores and processes data in vector format. Unlike traditional databases that rely on structured tables, vector databases organize information using vector mathematical representations of data points in a multi-dimensional space. This unique approach enables efficient storage, retrieval, and manipulation of vast datasets, making them ideal for handling complex and unstructured data.

How do Vector Databases Store and Retrieve Data?

Storing Data as Vectors

In vector databases, data is stored as vectors to preserve the relationships and similarities between different data points. For example, in a recommendation system, user preferences for movies can be represented as vectors. If two users have similar movie preferences, their respective vectors would be close together in the vector space.

Vector databases utilize specialized data structures and algorithms to store these vectors efficiently. One common technique is the use of embeddings, where raw data is transformed into fixed-size vectors using techniques like word embeddings for natural language processing or image embeddings for computer vision tasks.

Indexing and Retrieving Data

Traditional databases rely heavily on indexing techniques to speed up data retrieval. Vector databases, too, employ indexing strategies, but they focus on the geometric properties of vectors. One widely used method is the creation of spatial indexes, such as tree structures (like R-trees) or approximate nearest neighbor search algorithms (like locality-sensitive hashing). These methods enable the database to identify vectors close to a given query vector quickly.

When a query is made to find similar images to a reference image, the vector representation of the reference image is compared to the vectors of all stored images using proximity measures like cosine similarity or Euclidean distance. This comparison, facilitated by specialized algorithms, allows the database to retrieve similar images efficiently.

Benefits of Using Vector Databases in RAG

RAG combines traditional language generation techniques with retrieval methods, allowing AI systems to pull in specific information from a dataset before generating a response. When integrated with vector databases, RAG becomes a potent tool, offering several distinct advantages:

Contextual Relevance

Vector databases enable the storage of contextual information in a way that preserves relationships and similarities between different data points. When applied to RAG, generated content can be more contextually relevant. By retrieving and analyzing vectors similar to the query context, the AI system can craft responses that align closely with the conversation’s context, enhancing the user experience significantly.

Efficient Information Retrieval

Traditional databases often struggle with large datasets and complex queries, leading to slower response times. Vector databases, optimized for similarity searches, excel at efficient information retrieval. When integrated with RAG models, they enable quick access to relevant data points. This speed is crucial in applications like chatbots, customer support systems, and content recommendation engines where timely responses are paramount.

Accurate Semantics

Vector representations capture semantic similarities between words and phrases. When generating responses using RAG, vector databases assist in understanding the keywords and semantic nuances of the query. This ensures that the generated content is not just a keyword match but a semantically accurate and meaningful response, enhancing the overall quality of interactions.

Enhanced Decision-Making

RAG systems can make more informed decisions by leveraging the wealth of data stored in vector databases. Whether generating product recommendations, answering complex queries, or assisting in data analysis, the comprehensive understanding derived from vector representations aids AI systems in providing valuable insights and recommendations.

Incorporating vector databases into Retrieval Augmented Generation enhances the accuracy and relevance of generated content and opens doors to new possibilities in personalization, contextual understanding, and real-time responsiveness.

Putting Everything Together

We can develop a retrieval-augmented AI system that is both generative and retrieval-based by combining the strength of OpenAI’s GPT models, LangChain, and Pinecone Vector Databases. This system can retrieve and use particular pieces of information in addition to producing creative and human-like text.

Loading: We must first load our data. Different loaders in LangChain can be used.
Splitting: Text splitters divide documents into predetermined-size splits.
Storage: Splits, such as a vector store, will be kept and frequently embedded in storage.
Retrieval: The app retrieves splits from storage, frequently with similar input question embeddings.
Generation: An LLM uses a prompt containing the query and the retrieved data to generate an answer.

Implementing Question Answering in LangChain

There are different ways to do question answering in LangChain. Some of them are discussed below:

1. load_qa_chain

It performs Question Answering every time it is called on all documents passed. It is straightforward and thorough, but because it might not concentrate on the most crucial components of the question, it may be slower and less effective than RetrievalQA.

2. RetrievalQA

RetrievalQA performs QA on the portion of the text that is most pertinent. It is quicker and more effective than the load_qa_chain and uses it under the hood on each chunk. Since the documents may not always provide the best text chunks for the question, these performance gains come at the risk of losing some context or information.

3. VectorstoreIndexCreator

RetrievalQA can run with fewer lines of code thanks to VectorstoreIndexCreator, a higher-level wrapper.

4. ConversationalRetrievalChain

A chat history component is offered by ConversationalRetrievalChain, which is an advancement of RetriervalQAChat.

Let us discuss these in detail.

Import Necessary Libraries

from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.document_loaders import TextLoader
from langchain.document_loaders import PyPDFLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

import os
os.environ["OPENAI_API_KEY"] = ""

llm = OpenAI()

load_qa_chain

This Loads a chain that you can use to answer questions over a set of documents. This uses all the documents, so there might be a problem when using huge files.

from langchain.chains.question_answering import load_qa_chain

# load document
loader = PyPDFLoader("/content/example.pdf")
documents = loader.load()


chain = load_qa_chain(llm=OpenAI(), chain_type="map_reduce")

query = "How many AI Publications in 2021?"
chain.run(input_documents=documents, question=query)

We are using the chain type of map_reduce. If stuff is used, the number of tokens exceeds the limit as it takes a list of documents, inserts them all into a prompt, and passes that prompt to an LLM.

For Multiple Documents

For multiple documents
loaders = [....]
documents = []
for loader in loaders:
     documents.extend(loader.load())
     
chain = load_qa_chain(llm=OpenAI(), chain_type="map_reduce")

query = "How many AI Publications in 2021?"
chain.run(input_documents=documents, question=query)

Retrieval QA Chain

RetrievalQA chain uses load_qa_chain under the hood. We retrieve the most relevant chunk of text and feed those to the large language model of our choice.

Loading the Document

The first step is to load the data we need for answering questions. We can use one of Langchain’s many built-in data loaders. Here, we use PyPDFLoader, which loads the present pdf file data from the p.

loader = PyPDFLoader("/content/example.pdf")
documents = loader.load()

Splitting the Documents

After you complete the scraping and loading, we move on to chunking and cleaning the data. It is crucial to chunk the data to embed a substantial amount of context within our vector index. Embed only a few words because they don’t contain enough information to match relevant vectors, and embedding an entire page would take up too much space in the prompt’s context window.

There are many text splitters that Langchain supports. Here, we are using split by character as that would make it easier to manage the context length directly, choosing a chunk size of about 1000 tokens for this use case. We also set a small overlap length to preserve text continuity between our chunks.

# split the documents into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
texts = text_splitter.split_documents(documents)

Using Embedding and Vector Database

After gathering our data sources, it’s time to create our knowledge base index. You can quickly locate vectors based on specific criteria, such as similarity or distance, by creating an index, essentially a map or reference.

Use OpenAI for the Embeddings and ChromaDB as the vector database. The document vectors can be added to the index once created.

# select which embeddings we want to use
embeddings = OpenAIEmbeddings()

# create the vectorestore to use as the index
db = Chroma.from_documents(texts, embeddings)

Using Retrieval

It’s time to define our retriever once our vector store has been indexed. The retrieval module uses its search algorithm to determine how the pertinent documents are retrieved from the vector database.

In NLP, retrieval is crucial, especially in applications that use search engines, question-answering systems, and information retrievals. A retriever’s objective is to effectively search through a sizable corpus of documents to find the most likely to contain user-specific information.

# expose this index in a retriever interface
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k":2})

Search Types

Different types of searches can be used in the retrieval.

Semantic Similarity: The semantic search fetches all similar documents but does not enforce diversity.
MMR (Maximal Marginal Relevance): Not only looks at similarity but also looks at diversity. The first chunk is a little different from the second one

Creating the Chain

We can construct the Conversational Retrieval Chain now that we have all the necessary parts. In order to facilitate conversational interactions, this chain expands on RetrievalQAChain by including a chat history element. We are using the chain type of stuff.

# create a chain to answer questions
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(), chain_type="stuff", retriever=retriever, return_source_documents=True)

query = "what is the total number of AI publications?"
result = qa({"query": query})

Conversation Retrieval Chain

The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.

It first combines the chat history and the question into a single question.
After that, it looks up relevant documents from the retriever.
And then passes those documents and the question to a question-answering chain to return a response.

from langchain.chains import ConversationalRetrievalChain

# load document
loader = PyPDFLoader("/content/example.pdf")
documents = loader.load()

# split the documents into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

# select which embeddings we want to use
embeddings = OpenAIEmbeddings()

# create the vectorestore to use as the index
db = Chroma.from_documents(texts, embeddings)

# expose this index in a retriever interface
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k":2})

# create a chain to answer questions
qa = ConversationalRetrievalChain.from_llm(OpenAI(), retriever)

First query

We first create an empty chat history and append all the user queries and chatbot responses.

chat_history = []
query = "what is the total number of AI publications?"
result = qa({"question": query, "chat_history": chat_history})

Second Query

The chat history now contains the previous user query and chatbot response.

chat_history = [(query, result["answer"])]
query = "What is this number divided by 2?"
result = qa({"question": query, "chat_history": chat_history})

Reducing Hallucinations in GenAI Applications with RAG

In the context of Generative AI, particularly with models like Retrieval-Augmented Generation (RAG), mitigating hallucinations is pivotal. RAG combines information retrieval and language generation techniques, allowing AI systems to generate content based on retrieved, relevant information. However, hallucinations can still occur.

Causes of Hallucinations

Data Biases: The model might generate biased or hallucinated content if the training data contains biased information.
Over-Optimization: Models that are overly complex or trained for too long can memorize rare patterns in the training data, leading to hallucinations.
Ambiguity in Training Data: If the training data lacks context or contains ambiguous information, the model might fill in the gaps with hallucinated content.

Here’s how you can employ RAG effectively to reduce these hallucinations:

Use of Targeted Queries

Craft precise queries for information retrieval. By formulating focused queries, irrelevant or misleading information is less likely to be retrieved, reducing the chances of hallucinations.

# Example of a targeted query using RAG
query = "Relevant context for generating response."
retrieved_information = retrieve_information(query)

Incorporate Rich Context

RAG allows for the inclusion of context when generating responses. Including relevant context from retrieved information makes the generated content more grounded and less prone to hallucinations.

# Incorporating retrieved information as context for generation
context = preprocess_retrieved_info(retrieved_information)
generated_response = generate_response_with_context(user_query, context)

Confidence Scores

Assign confidence scores to retrieved information. Higher confidence scores indicate reliable information. Filter responses based on these confidence scores to ensure that only trustworthy information contributes to the generated content.

# Filtering retrieved information based on confidence scores
confident_retrieved_info = filter_high_confidence_info(retrieved_information)
generated_response = generate_response(user_query, confident_retrieved_info)

Human Validation Loop

Introduce a human validation step. Human moderators can review generated responses. If a response contains potentially hallucinated content, the system can flag it for human review before presenting it to users.

# Human validation loop for generated responses
generated_response = generate_response(user_query, retrieved_information)
if requires_human_validation(generated_response):
    reviewed_response = human_review(generated_response)
    send_response_to_user(reviewed_response)
else:
    send_response_to_user(generated_response)

By integrating these strategies, RAG can significantly reduce hallucinations in Generative AI applications.

Evaluating RAG Applications

For the various components of our RAG application, we have so far selected typical/ arbitrary values. However, if we changed something, like our LLM, embedding model, or chunking logic, how can we be sure that our configuration is better than it was? We must create trustworthy methods to quantitatively assess a generative task like this because it is very challenging. We must conduct both unit or component and end-to-end evaluations because our application contains numerous moving parts.

Component-wise Evaluation

It can necessitate evaluating the following:

Retrieval in isolation: This is the best source in our retrieved chunks.

LLM’s response: Can the LLM produce a high-quality answer given that source?

Overall, the System’s Quality Evaluation

Additionally, we can evaluate the overall system’s quality, which accesses the data sources, and the quality of the response for end-to-end evaluation.

Ethical Considerations in the Age of Advanced Information Technologies

Ethical considerations are pivotal in shaping the innovation trajectory in advanced information technologies, where Retrieval Augmented Generation (RAG), LangChain, and Vector Databases seamlessly intertwine. As these cutting-edge technologies redefine how we process information and interact with data, examining the ethical implications in this transformative landscape is imperative.

Privacy and Data Security

The amalgamation of RAG, LangChain, and Vector Databases processes, retrieves, and generates vast amounts of data. Ethical concerns arise regarding the privacy and security of this data. Ensuring robust encryption, strict access controls, and transparent data usage policies are essential to protect user information from unauthorized access and misuse.

Bias and Fairness

These technologies train algorithms on extensive datasets, potentially inheriting biases present in the data. Addressing algorithmic bias is crucial to preventing discriminatory outcomes, especially in language processing and content generation applications.

Accountability and Transparency

Developers, organizations, and policymakers must be accountable for their creations as these technologies evolve. Transparent communication about how these systems operate, the sources of data, and the algorithms employed fosters trust among users.

Conclusion

In conclusion, the fusion of cutting-edge technologies like Retrieval Augmented Generation (RAG), LangChain, and Vector Database marks a significant milestone in natural language processing and information retrieval. The synergy between these advanced tools amplifies the efficiency of data retrieval and elevates the quality of generated content to unprecedented levels.

With Retrieval Augmented Generation, we have bridged the gap between traditional search methods and natural language understanding. By integrating LangChain’s linguistic prowess and Vector Database’s robust data structuring, we have created a seamless ecosystem where machines swiftly comprehend human language intricacies, access relevant information, and produce contextually rich, coherent, and accurate content. As we move forward, the collaboration between RAG, LangChain, and Vector Database promises a future where human-machine interaction is both efficient and deeply meaningful. The journey doesn’t end here; it merely evolves, and with each step, we get closer to a world where technology truly augments our capabilities, making the complex simple and the unimaginable attainable.

Key Takeaways

Businesses and individuals can efficiently access pertinent information by harnessing RAG’s natural language understanding alongside LangChain’s linguistic analysis and Vector Database’s structured data retrieval capabilities.
The collaboration between RAG, LangChain, and Vector Database elevates content generation to unprecedented levels.
More brilliant virtual assistants, intuitive chatbots, and insightful data analysis tools become possible, enhancing user experiences across various applications. This transformative shift ensures that technology serves as an actual augmentation of human capabilities.

Frequently Asked Questions

Q1. What is Retrieval Augmented Generation (RAG), and how does it work with LangChain and Vector Database?

A1. Retrieval Augmented Generation (RAG) is an advanced technology that integrates natural language understanding and generation with information retrieval. RAG comprehends user queries, retrieves relevant information from large datasets using the Vector Database, and generates human-like responses. Combined with LangChain’s linguistic analysis, it ensures accurate interpretation of queries, allowing for the generation of contextually rich and coherent responses.

Q2. How does LangChain enhance the capabilities of RAG in understanding human language?

A2. LangChain is a linguistic analysis tool that dissects the intricacies of language, including grammar, syntax, and semantics. By integrating LangChain with RAG, the system gains a deep understanding of the nuances of human language. LangChain’s analysis ensures that RAG comprehends queries with precision, accurately interpreting the meaning behind the words. This synergy results in more accurate and contextually relevant responses.

Q3. What role does Vector Database play in this integrated system, and how does it improve information retrieval?

A3. Vector Database is a revolutionary data storage system that organizes data points using vectors, capturing semantic relationships between them. The Vector Database efficiently stores and retrieves structured information in this integrated system. By understanding the semantic context of data, it ensures that the retrieved information is not only relevant but also contextually accurate. This innovative approach significantly improves the speed and accuracy of information retrieval.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Babina Banjara

Technology can impact lives at a level that has never been realized in mankind's history. The idea that something I create can impact someone worldwide now or in the future drives my passion for Technology.

A dedicated ML Engineer and Tech enthusiast, proficient in training ML models. My current interests are advancing machine learning techniques, particularly in natural language processing, LLMs, and multimodal AI.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Introduction to Generative AI

Introduction to Generative AI applications

No-code Generative AI app development

Code-focused Generative AI App Development

Introduction to Responsible AI

LLMS

Prompt Engineering

Finetuning LLMs

Training LLMs from Scratch

Langchain

RAG

LlamaIndex

Stable Diffusion

A New Era of Text Generation: RAG, LangChain, and Vector Databases

Introduction

Learning Objectives

Table of contents

What is RAG?

Understanding the Essence of RAG

Generative Models

Retrieval Models

What is Langchain?

Why LangChain?

Vector Databases in RAG

How do Vector Databases Store and Retrieve Data?

Storing Data as Vectors

Indexing and Retrieving Data

Benefits of Using Vector Databases in RAG

Contextual Relevance

Efficient Information Retrieval

Accurate Semantics

Enhanced Decision-Making

Putting Everything Together

Implementing Question Answering in LangChain

1. load_qa_chain

2. RetrievalQA

3. VectorstoreIndexCreator

4. ConversationalRetrievalChain

Import Necessary Libraries

load_qa_chain

For Multiple Documents

Retrieval QA Chain

Loading the Document

Splitting the Documents

Using Embedding and Vector Database

Using Retrieval

Search Types

Creating the Chain

Conversation Retrieval Chain

First query

Second Query

Reducing Hallucinations in GenAI Applications with RAG

Causes of Hallucinations

Use of Targeted Queries

Incorporate Rich Context

Confidence Scores

Human Validation Loop

Evaluating RAG Applications

Component-wise Evaluation

Overall, the System’s Quality Evaluation

Ethical Considerations in the Age of Advanced Information Technologies

Privacy and Data Security

Bias and Fairness

Accountability and Transparency

Conclusion

Key Takeaways

Frequently Asked Questions

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid