The rapid evolution of generative AI models, exemplified by OpenAI's ChatGPT, has significantly advanced natural language processing and understanding. At the heart of these advances lies the careful training of machine learning models on diverse and extensive data, which allows them to handle varied queries and generate coherent, contextually relevant responses. However, domain-specific fine-tuning on targeted datasets remains essential for peak performance in specialised fields. Incorporating vector databases and techniques like Retrieval-Augmented Generation (RAG) further enhances this capability, enabling efficient retrieval and organization of vast amounts of information.
Open-source tools and resources are pivotal in this ecosystem, fostering innovation and accessibility in generative AI. This article delves into these critical aspects, exploring how they collectively elevate the efficacy and applicability of modern machine learning systems.
Retrieval-Augmented Generation, or RAG, represents a cutting-edge approach to artificial intelligence (AI) and natural language processing (NLP). At its core, RAG is an innovative framework that combines the strengths of retrieval-based and generative models, revolutionizing how AI systems understand and generate human-like text.
The development of RAG is a direct response to the limitations of large language models (LLMs) like GPT. While LLMs have shown impressive text-generation capabilities, they often struggle to provide contextually relevant responses, hindering their utility in practical applications. RAG aims to bridge this gap by offering a solution that understands user intent and delivers meaningful, context-aware replies.
RAG is fundamentally a hybrid model that seamlessly integrates two critical components. Retrieval-based methods involve accessing and extracting information from external knowledge sources such as databases, articles, or websites.
On the other hand, generative models excel at producing coherent and contextually relevant text. What distinguishes RAG is its ability to harmonize these two components, creating a symbiotic relationship that allows it to comprehend user queries deeply and produce responses that are not just accurate but also contextually rich.
To grasp the essence of RAG, it's essential to deconstruct its operational mechanics. RAG operates through a series of well-defined steps: the user's query is embedded and used to retrieve the most relevant chunks from an external knowledge source; the retrieved context is combined with the query in a prompt; and the language model then generates a response grounded in that context.
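For intuition, here is a minimal, framework-agnostic sketch of that loop in Python. The helpers embed, vector_search, and llm_generate are hypothetical stand-ins for an embedding model, a vector-store lookup, and an LLM call; they are not part of any specific library.

# A minimal sketch of the RAG loop, with hypothetical helper functions.
def rag_answer(question: str, embed, vector_search, llm_generate, k: int = 4) -> str:
    # 1. Retrieve: embed the question and fetch the k most similar chunks
    query_vector = embed(question)
    retrieved_chunks = vector_search(query_vector, k=k)

    # 2. Augment: stitch the retrieved chunks into the prompt as context
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

    # 3. Generate: let the LLM produce a grounded answer
    return llm_generate(prompt)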
Central to understanding RAG is appreciating the role of Large Language Models (LLMs) in AI systems. LLMs like GPT are the backbone of many NLP applications, including chatbots and virtual assistants. They excel in processing user input and generating text, but their accuracy and contextual awareness are paramount for successful interactions. RAG strives to enhance these essential aspects by integrating retrieval and generation.
RAG’s distinguishing feature is its ability to integrate external knowledge sources seamlessly. By drawing from vast information repositories, RAG augments its understanding, enabling it to provide well-informed and contextually nuanced responses. Incorporating external knowledge elevates the quality of interactions and ensures that users receive relevant and accurate information.
Ultimately, RAG's hallmark is its ability to generate contextual responses: it considers the broader context of a user's query, leverages external knowledge, and produces answers that demonstrate a genuine understanding of the user's needs. These context-aware responses are a significant advancement, enabling more natural, human-like interactions and making RAG-powered AI systems highly effective across domains.
Retrieval-Augmented Generation (RAG) is a transformative concept in AI and NLP. By harmonizing retrieval and generation components, RAG addresses the limitations of existing language models and paves the way for more intelligent and context-aware AI interactions. Its ability to seamlessly integrate external knowledge sources and generate responses aligned with user intent positions RAG as a game-changer in developing AI systems that can truly understand and communicate with users in a human-like manner.
In this section, we delve into the pivotal role of external data sources within the Retrieval Augmented Generation (RAG) framework. We explore the diverse range of data sources that can be harnessed to empower RAG-driven models.
APIs (Application Programming Interfaces) and real-time databases are dynamic sources that provide up-to-the-minute information to RAG-driven models. They also allow models to access the latest data as it becomes available.
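As a hedged illustration, the snippet below shows one way to wrap records from a JSON API into LangChain Document objects that can later be indexed. The endpoint URL and the body/title field names are hypothetical placeholders, not a real API.

import requests
from langchain_core.documents import Document

# Hypothetical endpoint; substitute any JSON API your application relies on.
API_URL = "https://api.example.com/v1/articles"

def fetch_api_documents() -> list[Document]:
    """Pull fresh records from an API and wrap them as Documents for indexing."""
    records = requests.get(API_URL, timeout=10).json()
    return [
        Document(
            page_content=record["body"],
            metadata={"source": API_URL, "title": record.get("title", "")},
        )
        for record in records
    ]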
Document repositories serve as valuable knowledge stores, offering structured and unstructured information. Additionally, they are fundamental in expanding the knowledge base that RAG models can draw upon.
Web scraping is a method for extracting information from web pages. It enables RAG models to access dynamic web content, making it a crucial source for real-time data retrieval (the WebBaseLoader used later in this article is one such loader).
Databases provide structured data that can be queried and extracted. Additionally, RAG models can utilize databases to retrieve specific information, enhancing their responses’ accuracy.
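As a small sketch (assuming a hypothetical SQLite database with an articles table containing id, title, and body columns), rows can be converted into Documents for the RAG index like this:

import sqlite3
from langchain_core.documents import Document

def load_db_documents(db_path: str = "knowledge.db") -> list[Document]:
    """Query a (hypothetical) articles table and convert each row to a Document."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute("SELECT id, title, body FROM articles").fetchall()
    finally:
        connection.close()
    return [
        Document(page_content=body, metadata={"id": row_id, "title": title})
        for row_id, title, body in rows
    ]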
Let us now talk about the benefits of Retrieval-Augmented Generation (RAG).
RAG addresses the information-capacity limitation of traditional language models. A traditional LLM stores knowledge only in its weights, a limited "parametric memory." RAG introduces a "non-parametric memory" by tapping into external knowledge sources, significantly expanding the knowledge base available to the model and enabling more comprehensive and accurate responses.
RAG enhances the contextual understanding of LLMs by retrieving and integrating relevant contextual documents. This empowers the model to generate responses that align seamlessly with the specific context of the user’s input, resulting in accurate and contextually appropriate outputs.
A standout advantage of RAG is its ability to accommodate real-time updates and fresh sources without extensive model retraining. Moreover, this keeps the external knowledge base current and ensures that LLM-generated responses are always based on the latest and most relevant information.
RAG-equipped models can cite the sources behind their responses, enhancing transparency and credibility. Users can inspect the material that informed an answer, which builds trust in AI-generated content.
Studies have shown that RAG models exhibit fewer hallucinations and higher response accuracy. They are also less likely to leak sensitive information. Reduced hallucinations and increased accuracy make RAG models more reliable in generating content.
These benefits collectively make Retrieval Augmented Generation (RAG) a transformative framework in Natural Language Processing. Consequently, it overcomes the limitations of traditional language models and enhances the capabilities of AI-powered applications.
RAG offers a spectrum of approaches for the retrieval mechanism, catering to various needs and scenarios: sparse keyword-based retrieval (such as BM25), dense vector retrieval over embeddings, and hybrid schemes that combine the two, often followed by a re-ranking step to surface the most relevant passages.
These diverse approaches empower RAG to adapt to various use cases and retrieval scenarios, allowing for tailored solutions that maximize the relevance, accuracy, and efficiency of AI-generated responses.
RAG introduces ethical considerations that demand careful attention: the provenance and licensing of retrieved content, the risk of amplifying bias or misinformation present in external sources, user privacy when queries touch sensitive data, and honest attribution of cited material.
RAG finds versatile applications across various domains, enhancing AI capabilities in different contexts: question answering over proprietary documents, customer-support chatbots, enterprise search, research and legal assistance, and content summarization, among others.
These applications highlight how RAG’s integration of external knowledge sources empowers AI systems to excel in various domains, providing context-aware, accurate, and valuable insights and responses.
The evolution of Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs) is poised for exciting developments, such as faster and more accurate retrievers, multimodal knowledge sources, and tighter end-to-end integration between retrieval and generation.
The following diagrams illustrate the LangChain workflow for RAG.
These images depict the architecture of a Retrieval-Augmented Generation (RAG) system built with LangChain. The main components are a document loader, a text splitter, an embedding model, a vector store, a retriever, and an LLM that generates the final answer from the retrieved context.
The following commands install LangChain (with the langchain-community and langchain-openai integration packages), OpenAI, FAISS, tiktoken, and python-dotenv. LangChain handles document loading, splitting, and embedding caching, while OpenAI provides access to state-of-the-art Large Language Models (LLMs). This installation step sets up the tools required for RAG.
!pip install -q -U langchain langchain-community langchain-openai openai
!pip install -q -U faiss-cpu tiktoken python-dotenv
import os
It is best practice to store API keys (for example, OPENAI_API_KEY) in a .env file and load them with python-dotenv, as shown below:
from dotenv import load_dotenv

# Read OPENAI_API_KEY and any other secrets from the .env file in the working directory
load_dotenv('.env')
from langchain_community.document_loaders import WebBaseLoader

# Load three blog posts as LangChain Documents
yolo_nas_loader = WebBaseLoader("https://deci.ai/blog/yolo-nas-object-detection-foundation-model/").load()
decicoder_loader = WebBaseLoader("https://deci.ai/blog/decicoder-efficient-and-accurate-code-generation-llm/#:~:text=DeciCoder's%20unmatched%20throughput%20and%20low,re%20obsessed%20with%20AI%20efficiency.").load()
yolo_newsletter_loader = WebBaseLoader("https://deeplearningdaily.substack.com/p/unleashing-the-power-of-yolo-nas").load()
We will use a text splitter to break the documents into smaller chunks. Here we use CharacterTextSplitter.
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=500,
    chunk_overlap=0,
    is_separator_regex=False,
)
Now apply the text_splitter to the loaded documents to split them into chunks:
yolo_nas_chunks = text_splitter.split_documents(yolo_nas_loader)
decicoder_chunks = text_splitter.split_documents(decicoder_loader)
yolo_newsletter_chunks = text_splitter.split_documents(yolo_newsletter_loader)
from langchain_openai import OpenAIEmbeddings
from langchain.embeddings.cache import CacheBackedEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.storage import LocalFileStore
store = LocalFileStore("./cache/")
# create an embedder
core_embeddings_model = OpenAIEmbeddings()
embedder = CacheBackedEmbeddings.from_bytes_store(
    core_embeddings_model,
    store,
    namespace=core_embeddings_model.model,
)
# store embeddings in vector store
vectorstore = FAISS.from_documents(yolo_nas_chunks, embedder)
vectorstore.add_documents(decicoder_chunks)
vectorstore.add_documents(yolo_newsletter_chunks)
# instantiate a retriever
retriever = vectorstore.as_retriever()
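As an optional aside (not part of the original walkthrough), the retriever's behaviour can be tuned via as_retriever, and you can query it directly to inspect which chunks it returns before wiring it into the chain. The parameter values below are illustrative, not recommendations.

# Alternative retriever configurations on the same FAISS index (illustrative values)
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",                       # Maximal Marginal Relevance: relevance plus diversity
    search_kwargs={"k": 4, "fetch_k": 20},
)
top_k_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})  # plain top-4 similarity

# Sanity check: query the retriever directly; it returns a list of Document objects
docs = retriever.invoke("What is YOLO-NAS?")
for doc in docs:
    print(doc.metadata.get("source"), "->", doc.page_content[:100])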
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
# this formats the docs returned by the retriever
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# prompt to send to the LLM
prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
prompt_template = ChatPromptTemplate.from_template(prompt)
llm = ChatOpenAI(model='gpt-4o-mini', streaming=True)
# This code defines a chain where input documents are first formatted, then passed through a prompt template, and finally processed by an LLM.
rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt_template
    | llm
)
# This code creates a parallel process: one retrieves the context (using a retriever), and the other passes the question through unchanged. The results are then combined and assigned to the variable `answer` using the `rag_chain_from_docs` processing chain.
rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs)
The code above assembles the RAG chain using the LangChain Expression Language (LCEL): the retriever supplies context documents, the prompt template combines them with the user's question, and the ChatOpenAI model generates the final answer while the retrieved sources are kept alongside it.
Next, we send several user queries to the RAG system, prompting it to retrieve contextually relevant information.
After processing the queries, the RAG system generates and returns contextually rich and accurate responses. The responses are printed on the console.
# Generate responses with the RAG system
response = rag_chain_with_source.invoke(
    "What does Neural Architecture Search have to do with how Deci creates its models?"
)
print(response['answer'].content)
print(response['context'])

response = rag_chain_with_source.invoke("What is DeciCoder")
print(response['answer'].content)
print(response['context'])

response = rag_chain_with_source.invoke(
    "Write a blog about Deci and how it used NAS to generate YOLO-NAS and DeciCoder"
)
print(response['answer'].content)
print(response['context'])
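Because the ChatOpenAI model was created with streaming=True, you can also stream the answer token by token instead of waiting for the full response. A minimal sketch using the chain's standard .stream() interface might look like this:

# Stream the answer incrementally; each chunk is a partial dict emitted by the chain
for chunk in rag_chain_with_source.stream("What is DeciCoder"):
    if "answer" in chunk:
        print(chunk["answer"].content, end="", flush=True)
print()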
This code exemplifies how RAG and LangChain can enhance information retrieval and generation in AI applications.
Retrieval-augmented generation (RAG) represents a transformative leap in artificial intelligence. It seamlessly integrates large language models (LLMs) with external knowledge sources, addressing the limitations of LLMs’ parametric memory.
RAG’s ability to access real-time data, coupled with improved contextualization, enhances the relevance and accuracy of AI-generated responses. Its updatable memory ensures responses are current without extensive model retraining. RAG also offers source citations, bolstering transparency and reducing data leakage. In summary, RAG empowers AI to provide more accurate, context-aware, and reliable information, promising a brighter future for AI applications across industries.
Q. What is Retrieval-Augmented Generation (RAG)?
A. Retrieval-augmented generation (RAG) combines generation and retrieval models in AI. It enhances text generation by retrieving relevant information from a large dataset before generating responses.
Q. What is the RAG approach in generative AI?
A. The RAG approach in generative AI integrates retrieval-based methods with generative models. It leverages pre-existing knowledge for more accurate and contextually relevant text generation tasks.
Q. How does a RAG system work?
A. The RAG system in AI uses a dual-model architecture. A retrieval model fetches relevant information, guiding a generative model to produce coherent, informed responses or outputs.
Q. What is the difference between RAG and an LLM?
A. RAG (retrieval-augmented generation) and LLM (large language model) represent advancements in AI. RAG combines retrieval and generation, while LLM refers to models like GPT that process and generate text.