Building Multi-Document Agentic RAG using LLamaIndex

Ketan Kumar Last Updated : 05 Sep, 2024

12 min read

Introduction

In the rapidly evolving field of artificial intelligence, the ability to process and understand vast amounts of information is becoming increasingly crucial. Enter Multi-Document Agentic RAG – a powerful approach that combines Retrieval-Augmented Generation (RAG) with agent-based systems to create AI that can reason across multiple documents. This guide will walk you through the concept, implementation, and potential of this exciting technology.

Learning Objectives

Understand the fundamentals of Multi-Document Agentic RAG systems and their architecture.
Learn how embeddings and agent-based reasoning enhance AI’s ability to generate contextually accurate responses.
Explore advanced retrieval mechanisms that improve information extraction in knowledge-intensive applications.
Gain insights into the applications of Multi-Document Agentic RAG in complex fields like research and legal analysis.
Develop the ability to evaluate the effectiveness of RAG systems in AI-driven content generation and analysis.

This article was published as a part of the Data Science Blogathon.

Understanding RAG and Multi-Document Agents
Why Multi-Document Agentic RAG is a Game-Changer?
Key Strengths of Multi-Document Agentic RAG Systems
Building Blocks of Multi-Document Agentic RAG
Implementing a Basic Multi-Document Agentic RAG
Challenges and Considerations
Frequently Asked Questions

Understanding RAG and Multi-Document Agents

Retrieval-Augmented Generation (RAG) is a technique that enhances language models by allowing them to access and use external knowledge. Instead of relying solely on their trained parameters, RAG models can retrieve relevant information from a knowledge base to generate more accurate and informed responses.

Multi-Document Agentic RAG takes this concept further by enabling an AI agent to work with multiple documents simultaneously. This approach is particularly valuable for tasks that require synthesizing information from various sources, such as academic research, market analysis, or legal document review.

Why Multi-Document Agentic RAG is a Game-Changer?

Let us understand why multi-document agentic RAG is a game-changer.

Smarter Understanding of Context: Imagine having a super-smart assistant that doesn’t just read one book, but an entire library to answer your question. That’s what enhanced contextual understanding means. By analyzing multiple documents, the AI can piece together a more complete picture, giving you answers that truly capture the big picture.
Boost in Accuracy for Tricky Tasks: We’ve all played “connect the dots” as kids. Multi-Document Agentic RAG does something similar, but with information. By connecting facts from various sources, it can tackle complex problems with greater precision. This means more reliable answers, especially when dealing with intricate topics.
Handling Information Overload Like a Pro: In today’s world, we’re drowning in data. Multi-Document Agentic RAG is like a supercharged filter, sifting through massive amounts of information to find what’s truly relevant. It’s like having a team of experts working around the clock to digest and summarize vast libraries of knowledge.
Adaptable and Growable Knowledge Base: Think of this as a digital brain that can easily learn and expand. As new information becomes available, Multi-Document Agentic RAG can seamlessly incorporate it. This means your AI assistant is always up-to-date, ready to tackle the latest questions with the freshest information.

Key Strengths of Multi-Document Agentic RAG Systems

We will now look into the key strengths of multi-document agentic RAG systems.

Supercharging Academic Research: Researchers often spend weeks or months synthesizing information from hundreds of papers. Multi-Document Agentic RAG can dramatically speed up this process, helping scholars quickly identify key trends, gaps in knowledge, and potential breakthroughs across vast bodies of literature.
Revolutionizing Legal Document Analysis: Lawyers deal with mountains of case files, contracts, and legal precedents. This technology can swiftly analyze thousands of documents, spotting critical details, inconsistencies, and relevant case law that might take a human team days or weeks to uncover.
Turbocharging Market Intelligence: Businesses need to stay ahead of trends and competition. Multi-Document Agentic RAG can continuously scan news articles, social media, and industry reports, providing real-time insights and helping companies make data-driven decisions faster than ever before.
Navigating Technical Documentation with Ease: For engineers and IT professionals, finding the right information in sprawling technical documentation can be like searching for a needle in a haystack. This AI-powered approach can quickly pinpoint relevant sections across multiple manuals, troubleshooting guides, and code repositories, saving countless hours of frustration.

Building Blocks of Multi-Document Agentic RAG

Imagine you’re building a super-smart digital library assistant. This assistant can read thousands of books, understand complex questions, and give you detailed answers using information from multiple sources. That’s essentially what a Multi-Document Agentic RAG system does. Let’s break down the key components that make this possible:

Document Processing

Converts all types of documents (PDFs, web pages, Word files, etc.) into a format that our AI can understand.

Creating Embeddings

Transforms the processed text into numerical vectors (sequences of numbers) that represent the meaning and context of the information.

In simple terms, imagine creating a super-condensed summary of each paragraph in your library, but instead of words, you use a unique code. This code captures the essence of the information in a way that computers can quickly compare and analyze.

Indexing

It creates an efficient structure to store and retrieve these embeddings. This is like creating the world’s most efficient card catalog for our digital library. It allows our AI to quickly locate relevant information without having to scan every single document in detail.

Retrieval

It uses the query (your question) to find the most relevant pieces of information from the indexed embeddings. When you ask a question, this component races through our digital library, using that super-efficient card catalog to pull out all the potentially relevant pieces of information.

Agent-based Reasoning

An AI agent interprets the retrieved information in the context of your query, deciding how to use it to formulate an answer. This is like having a genius AI agent who not only finds the right documents but also understands the deeper meaning of your question. They can connect dots across different sources and figure out the best way to answer you.

Generation

It produces a human-readable answer based on the agent’s reasoning and the retrieved information. This is where our genius agent explains their findings to you in clear, concise language. They take all the complex information they’ve gathered and analyzed, and present it in a way that directly answers your question.

This powerful combination allows Multi-Document Agentic RAG systems to provide insights and answers that draw from a vast pool of knowledge, making them incredibly useful for complex research, analysis, and problem-solving tasks across many fields.

Implementing a Basic Multi-Document Agentic RAG

Let’s start by building a simple agentic RAG that can work with three academic papers. We’ll use the llama_index library, which provides powerful tools for building RAG systems.

Step1: Installation of Required Libraries

To get started with building your AI agent, you need to install the necessary libraries. Here are the steps to set up your environment:

Install Python: Ensure you have Python installed on your system. You can download it from the official Python website: Download Python
Set Up a Virtual Environment: It’s good practice to create a virtual environment for your project to manage dependencies. Run the following commands to set up a virtual environment:

python -m venv ai_agent_env
source ai_agent_env/bin/activate  # On Windows, use `ai_agent_env\Scripts\activate`

Install OpenAI API and LlamaIndex:

pip install openai llama-index==0.10.27 llama-index-llms-openai==0.1.15
pip install llama-index-embeddings-openai==0.1.7

Step2: Setting Up API Keys and Environment Variables

To use the OpenAI API, you need an API key. Follow these steps to set up your API key:

Obtain an API Key: Sign up for an account on the OpenAI website and obtain your API key from the API section.
Set Up Environment Variables: Store your API key in an environment variable to keep it secure. Add the following line to your .bashrc or .zshrc file (or use the appropriate method for your operating system)

export OPENAI_API_KEY='your_openai_api_key_here'

Access the API Key in Your Code: In your Python code, import necessary libraries, and access the API key using the os module

import os
import openai
import nest_asyncio
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import MetadataFilters, FilterCondition
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner
from typing import List, Optional
import subprocess
openai.api_key = os.getenv('OPENAI_API_KEY')

#optionally, you could simply add openai key directly. (not a good practice)
#openai.api_key = 'your_openai_api_key_here'

nest_asyncio.apply()

Step3: Downloading Documents

As stated earlier, I am only using three papers to make this agentic rag, we would later scale this agentic rag to more papers in some other blog. You could use your own documents (optionally).

# List of URLs to download
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

# Corresponding filenames to save the files as
papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

# Loop over both lists and download each file with its respective name
for url, paper in zip(urls, papers):
    subprocess.run(["wget", url, "-O", paper])

Step4: Creating Vector and Summary Tool

The below function, get_doc_tools, is designed to create two tools: a vector query tool and a summary query tool. These tools help in querying and summarizing a document using an agent-based retrieval-augmented generation (RAG) approach. Below are the steps and their explanations.

def get_doc_tools(
    file_path: str,
    name: str,
) -> str:
    """Get vector query and summary query tools from a document."""

Loading Documents and Preparing for Vector Indexing

The function starts by loading the document using SimpleDirectoryReader, which takes the provided file_path and reads the document’s contents. Once the document is loaded, it’s processed by SentenceSplitter, which breaks the document into smaller chunks, or nodes, each containing up to 1024 characters. These nodes are then indexed using VectorStoreIndex, a tool that allows for efficient vector-based queries. This index will later be used to perform searches over the document content based on vector similarity, making it easier to retrieve relevant information.

# Load documents from the specified file path
documents = SimpleDirectoryReader(input_files=[file_path]).load_data()

# Split the loaded document into smaller chunks (nodes) of 1024 characters
splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

# Create a vector index from the nodes for efficient vector-based queries
vector_index = VectorStoreIndex(nodes)

Defining the Vector Query Function

Here, the function defines vector_query, which is responsible for answering specific questions about the document. The function accepts a query string and an optional list of page numbers. If no page numbers are provided, the entire document is queried. The function first checks if page_numbers is provided; if not, it defaults to an empty list.

Then, it creates metadata filters that correspond to the specified page numbers. These filters help narrow down the search to specific parts of the document. The query_engine is created using the vector index and is configured to use these filters, along with a similarity threshold, to find the most relevant results. Finally, the function executes the query using this engine and returns the response.

  # vector query function
    def vector_query(
        query: str, 
        page_numbers: Optional[List[str]] = None
    ) -> str:
        """Use to answer questions over a given paper.
    
        Useful if you have specific questions over the paper.
        Always leave page_numbers as None UNLESS there is a specific page you want to search for.
    
        Args:
            query (str): the string query to be embedded.
            page_numbers (Optional[List[str]]): Filter by set of pages. Leave as NONE 
                if we want to perform a vector search
                over all pages. Otherwise, filter by the set of specified pages.
        
        """
    
        page_numbers = page_numbers or []
        metadata_dicts = [
            {"key": "page_label", "value": p} for p in page_numbers
        ]
        
        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(
                metadata_dicts,
                condition=FilterCondition.OR
            )
        )
        response = query_engine.query(query)
        return response

Creating the Vector Query Tool

This part of the function creates the vector_query_tool, a tool that links the previously defined vector_query function to a dynamically generated name based on the name parameter provided when calling get_doc_tools.

The tool is created using FunctionTool.from_defaults, which automatically configures it with the necessary defaults. This tool can now be used to perform vector-based queries over the document using the function defined earlier.

       
    # Creating the Vector Query Tool
    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

Creating the Summary Query Tool

In this final section, the function creates a tool for summarizing the document. First, it creates a SummaryIndex from the nodes that were previously split and indexed. This index is designed specifically for summarization tasks. The summary_query_engine is then created with a response mode of "tree_summarize", which allows the tool to generate concise summaries of the document content.

The summary_tool is finally created using QueryEngineTool.from_defaults, which links the query engine to a dynamically generated name based on the name parameter. The tool is also given a description indicating its purpose for summarization-related queries. This summary tool can now be used to generate summaries of the document based on user queries.

# Summary Query Tool
    summary_index = SummaryIndex(nodes)
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=(
            f"Useful for summarization questions related to {name}"
        ),
    )

    return vector_query_tool, summary_tool

Calling Function to Build Tools for Each Paper

paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]
len(initial_tools)

This code processes each paper and creates two tools for each: a vector tool for semantic search and a summary tool for generating concise summaries, in this case 6 tools.

Step5: Creating the Agent

Earlier we created tools for agent to use, now we will create our agent using then FunctionCallingAgentWorker class. We would be using “gpt-3.5-turbo” as our llm.

llm = OpenAI(model="gpt-3.5-turbo")

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools, 
    llm=llm, 
    verbose=True
)
agent = AgentRunner(agent_worker)

This agent can now answer questions about the three papers we’ve processed.

Step6: Analyzing Responses from the Agent

We asked our agent different questions from the three papers, and here is its response. Here are examples and explanation of how it works within.

Explanation of the Agent’s Interaction with LongLoRA Papers

In this example, we queried our agent to extract specific information from three research papers, particularly about the evaluation dataset and results used in the LongLoRA study. The agent interacts with the documents using the vector query tool, and here’s how it processes the information step-by-step:

User Input: The user asked two sequential questions regarding the evaluation aspect of LongLoRA: first about the evaluation dataset and then about the results.
Agent’s Query Execution: The agent identifies that it needs to search the LongLoRA document specifically for information about the evaluation dataset. It uses the vector_tool_longlora function, which is the vector query tool set up specifically for LongLoRA.

=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation dataset"}

Function Output for Evaluation Dataset: The agent retrieves the relevant section from the document, identifying that the evaluation dataset used in LongLoRA is the “PG19 test split,” which is a dataset commonly used for language model evaluation due to its long-form text nature.
Agent’s Second Query Execution: Following the first response, the agent then processes the second part of the user’s question, querying the document about the evaluation results of LongLoRA.

=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation results"}

Function Output for Evaluation Results: The agent returns detailed results showing how the models perform better in terms of perplexity with larger context sizes. It highlights key findings, such as improvements with larger context windows and specific context lengths (100k, 65536, and 32768). It also notes a trade-off, as extended models experience some perplexity degradation on smaller context sizes due to Position Interpolation—a common limitation in such models.
Final LLM Response: The agent synthesizes the results into a concise response that answers the initial question about the dataset. Further explanation of the evaluation results would follow, summarizing the performance findings and their implications.

Few More Examples for Other Papers

Explanation of the Agent’s Behavior: Summarizing Self-RAG and LongLoRA

In this instance, the agent was tasked with providing summaries of both Self-RAG and LongLoRA. The behavior observed in this case differs from the previous example:

Summary Tool Usage

=== Calling Function ===
Calling function: summary_tool_selfrag with args: {"input": "Self-RAG"}

Unlike the earlier example, which involved querying specific details (like evaluation datasets and results), here the agent directly utilized the summary_tool functions designed for Self-RAG and LongLoRA. This shows the agent’s ability to adaptively switch between query tools based on the nature of the question—opting for summarization when a broader overview is required.

Distinct Calls to Separate Summarization Tools

=== Calling Function ===
Calling function: summary_tool_longlora with args: {"input": "LongLoRA"}

The agent separately called summary_tool_selfrag and summary_tool_longlora to obtain the summaries, demonstrating its capacity to handle multi-part queries efficiently. It identifies the need to engage distinct summarization tools tailored to each paper rather than executing a single combined retrieval.

Conciseness and Directness of Responses

The responses provided by the agent were concise and directly addressed the prompt. This indicates that the agent can extract high-level insights effectively, contrasting with the previous example where it provided more granular data points based on specific vector queries.

This interaction highlights the agent’s capability to deliver high-level overviews versus detailed, context-specific responses observed previously. This shift in behavior underscores the versatility of the agentic RAG system in adjusting its query strategy based on the nature of the user’s question—whether it’s a need for in-depth detail or a broad summary.

Challenges and Considerations

While Multi-Document Agentic RAG is powerful, there are some challenges to keep in mind:

Scalability: As the number of documents grows, efficient indexing and retrieval become crucial.
Coherence: Ensuring that the agent produces coherent responses when integrating information from multiple sources.
Bias and Accuracy: The system’s output is only as good as its input documents and retrieval mechanism.
Computational Resources: Processing and embedding large numbers of documents can be resource-intensive.

Conclusion

Multi-Document Agentic RAG represents a significant advancement in the field of AI, enabling more accurate and context-aware responses by synthesizing information from multiple sources. This approach is particularly valuable in complex domains like research, legal analysis, and technical documentation, where precise information retrieval and reasoning are crucial. By leveraging embeddings, agent-based reasoning, and robust retrieval mechanisms, this system not only enhances the depth and reliability of AI-generated content but also paves the way for more sophisticated applications in knowledge-intensive industries. As technology continues to evolve, Multi-Document Agentic RAG is poised to become an essential tool for extracting meaningful insights from vast amounts of data.

Key Takeaways

Multi-Document Agentic RAG improves AI response accuracy by integrating information from multiple sources.
Embeddings and agent-based reasoning enhance the system’s ability to generate context-aware and reliable content.
The system is particularly valuable in complex fields like research, legal analysis, and technical documentation.
Advanced retrieval mechanisms ensure precise information extraction, supporting knowledge-intensive industries.
Multi-Document Agentic RAG represents a significant step forward in AI-driven content generation and data analysis.

Frequently Asked Questions

Q1. What is Multi-Document Agentic RAG?

A. Multi-Document Agentic RAG combines Retrieval-Augmented Generation (RAG) with agent-based systems to enable AI to reason across multiple documents.

Q2. How does Multi-Document Agentic RAG improve accuracy?

A. It enhances accuracy by synthesizing information from various sources, allowing AI to connect facts and provide more precise answers.

Q3. In which fields is Multi-Document Agentic RAG most beneficial?

A. It’s particularly valuable in academic research, legal document analysis, market intelligence, and technical documentation.

Q4. What are the key components of a Multi-Document Agentic RAG system?

A. The key components include document processing, creating embeddings, indexing, retrieval, agent-based reasoning, and generation.

Q5. What is the role of embeddings in this system?

A. Embeddings convert text into numerical vectors, capturing the meaning and context of information for efficient comparison and analysis.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Ketan Kumar

I'm a Data Scientist at Syngene International Limited. I have completed my Master's in Data Science from VIT AP and I have a burning passion for Generative AI. My expertise lies in building robust machine learning and NLP models for innovative projects. Currently, I'm putting this knowledge to work in drug discovery research at Syngene, exploring the potential of LLMs. Always eager to learn and delve deeper into the ever-evolving world of data science and AI!

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Introduction to Generative AI

Introduction to Generative AI applications

No-code Generative AI app development

Code-focused Generative AI App Development

Introduction to Responsible AI

LLMS

Prompt Engineering

Finetuning LLMs

Training LLMs from Scratch

Langchain

RAG

LlamaIndex

Stable Diffusion

Building Multi-Document Agentic RAG using LLamaIndex

Introduction

Learning Objectives

Table of contents

Understanding RAG and Multi-Document Agents

Why Multi-Document Agentic RAG is a Game-Changer?

Key Strengths of Multi-Document Agentic RAG Systems

Building Blocks of Multi-Document Agentic RAG

Document Processing

Creating Embeddings

Indexing

Retrieval

Agent-based Reasoning

Generation

Implementing a Basic Multi-Document Agentic RAG

Step1: Installation of Required Libraries

Step2: Setting Up API Keys and Environment Variables

Step3: Downloading Documents

Step4: Creating Vector and Summary Tool

Loading Documents and Preparing for Vector Indexing

Defining the Vector Query Function

Creating the Vector Query Tool

Creating the Summary Query Tool

Calling Function to Build Tools for Each Paper

Step5: Creating the Agent

Step6: Analyzing Responses from the Agent

Explanation of the Agent’s Interaction with LongLoRA Papers

Explanation of the Agent’s Behavior: Summarizing Self-RAG and LongLoRA

Summary Tool Usage

Distinct Calls to Separate Summarization Tools

Conciseness and Directness of Responses

Challenges and Considerations

Conclusion

Key Takeaways

Frequently Asked Questions

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC