When building applications using Large Language Models (LLMs), the quality of responses heavily depends on effective planning and reasoning capabilities for a given user task. While traditional RAG techniques are powerful, incorporating Agentic workflows can significantly enhance the system’s ability to process and respond to queries.
In this article, you will build an Agentic RAG system with memory components using the Phidata open-source Agentic framework, demonstrating how to combine a vector database (Qdrant), embedding models, and intelligent agents for improved results.
This article was published as a part of the Data Science Blogathon.
Agents in the context of AI are components designed to emulate human-like thinking and planning capabilities. An agent's components consist of:
RAG (Retrieval-Augmented Generation) combines knowledge retrieval with LLM capabilities. When we integrate agents into RAG systems, we create a powerful workflow that can:
The key difference between traditional RAG and Agentic RAG lies in the decision-making layer that determines how to process each query and interact with tools to get real-time information.
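To make the decision-making layer concrete, here is a minimal sketch of how a query router might choose between the knowledge base and a live search tool. The function names and the keyword-overlap heuristic are purely illustrative assumptions, not Phidata's API; real agents let the LLM make this choice.

```python
# Toy sketch of the decision-making layer in an Agentic RAG system.
# Function names and the routing heuristic are illustrative, not Phidata's API.

def search_knowledge_base(query: str) -> str:
    # Placeholder for a vector-database lookup (e.g., Qdrant).
    return f"[KB] documents matching: {query}"

def search_web(query: str) -> str:
    # Placeholder for a live web-search tool (e.g., DuckDuckGo).
    return f"[WEB] live results for: {query}"

def route_query(query: str, kb_topics: set[str]) -> str:
    """Route to the knowledge base if the query mentions an indexed topic,
    otherwise fall back to real-time web search."""
    words = {w.lower().strip("?.,") for w in query.split()}
    if words & kb_topics:
        return search_knowledge_base(query)
    return search_web(query)

kb_topics = {"qdrant", "indexing", "vectors"}
print(route_query("What indexing techniques does Qdrant use?", kb_topics))  # routed to KB
print(route_query("Who won the match today?", kb_topics))                   # routed to web
```

In a real agent this routing is handled by the LLM reasoning over tool descriptions, but the control flow it produces looks much like the branch above.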
Now that we know Agentic RAG exists, how do we build it? Let's break it down.
Phidata is an open-source framework designed to build, monitor, and deploy Agentic workflows. It supports multimodal AI agents equipped with memory, knowledge, tools, and reasoning capabilities. Its model-agnostic architecture ensures compatibility with various large language models (LLMs), enabling developers to transform any LLM into a functional AI agent. Additionally, Phidata allows you to deploy your Agent workflows using a bring your own cloud (BYOC) approach, offering both flexibility and control over your AI systems.
Key features of Phidata include the ability to build teams of agents that collaborate to solve complex problems, a user-friendly Agent UI for seamless interaction (Phidata playground), and built-in support for agentic retrieval-augmented generation (RAG) and structured outputs. The framework also emphasizes monitoring and debugging, providing tools to ensure robust and reliable AI applications.
Explore the transformative power of Agent-based systems in real-world applications, leveraging Phidata to enhance decision-making and task automation.
By integrating tools like YFinance, Phidata allows the creation of agents that can fetch real-time stock prices, analyze financial data, and summarize analyst recommendations. Such agents assist investors and analysts in making informed decisions by providing up-to-date market insights.
Phidata also helps develop agents capable of retrieving real-time information from the web using search tools like DuckDuckGo, SerpAPI, or Serper. These agents can answer user queries by sourcing the latest data, making them valuable for research and information-gathering tasks.
Phidata also supports multimodal capabilities, enabling the creation of agents that analyze images, videos, and audio. These multimodal agents can handle tasks such as image recognition, text-to-image generation, audio transcription, and video analysis, offering versatile solutions across various domains. For text-to-image or text-to-video tasks, tools like DALL-E and Replicate can be integrated, while for image-to-text and video-to-text tasks, multimodal LLMs such as GPT-4, Gemini 2.0, Claude AI, and others can be utilized.
Imagine you have documentation for your startup and want to create a chat assistant that can answer user questions based on that documentation. To make your chatbot more intelligent, it also needs to handle real-time data. Typically, answering real-time data queries requires either rebuilding the knowledge base or retraining the model.
This is where Agents come into play. By combining the knowledge base with Agents, you can create an Agentic RAG (Retrieval-Augmented Generation) solution that not only improves the chatbot’s ability to retrieve accurate answers but also enhances its overall performance.
We have three main components that come together to form our knowledge base. First, we have Data sources, like documentation pages, PDFs, or any websites we want to use. Then we have Qdrant, which is our vector database – it’s like a smart storage system that helps us find similar information quickly. And finally, we have the embedding model that converts our text into a format that computers can understand better. These three components feed into our knowledge base, which is like the brain of our system.
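The flow from text to "smart storage" can be sketched with a toy example. Real systems use a learned embedding model (e.g., OpenAI's) and Qdrant; here a bag-of-words vector and cosine similarity stand in for both, purely to illustrate the store-then-search loop.

```python
# Toy illustration of what the embedding model and vector database do together.
# A bag-of-words Counter stands in for a real embedding; a list stands in for Qdrant.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: word counts as a sparse vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Knowledge base": documents stored alongside their vectors.
docs = [
    "Qdrant is a vector database for similarity search",
    "Phidata builds agents with memory and tools",
]
store = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str) -> str:
    # Embed the query, then return the most similar stored document.
    qv = embed(query)
    return max(store, key=lambda pair: cosine(qv, pair[1]))[0]

print(retrieve("what is a vector database?"))  # returns the Qdrant sentence
```

Qdrant does the same thing at scale: it stores high-dimensional vectors with payloads and answers nearest-neighbor queries efficiently.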
Now we define the Agent object from Phidata.
The agent is connected to three components:
Note: Here the knowledge base and DuckDuckGo both act as tools, and based on the user query the Agent decides which tool to use to generate the response. The embedding model defaults to OpenAI, so we will use OpenAI's GPT-4o as the reasoning model.
Let’s build this code.
It's time to build a Document Analyzer Assistant Agent that answers from personal information (a website) in the knowledge base, and falls back to DuckDuckGo when the knowledge base lacks the needed context.
To build the Agentic RAG workflow we need to install a few libraries that include:
pip install phidata openai duckduckgo-search qdrant-client sqlalchemy beautifulsoup4
In this step, we will set up the environment variables and gather the required API credentials to run this use case. You can get your OpenAI API key from https://platform.openai.com/: create an account and generate a new key.
from phi.knowledge.website import WebsiteKnowledgeBase
from phi.vectordb.qdrant import Qdrant
from phi.agent import Agent
from phi.storage.agent.sqlite import SqlAgentStorage
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo
import os
os.environ['OPENAI_API_KEY'] = "<replace>"
Next, initialize the Qdrant client by providing the collection name, URL, and API key for your vector database. The Qdrant database stores and indexes the knowledge from the website, allowing the agent to retrieve relevant information based on user queries. This step sets up the data layer for your agent:
COLLECTION_NAME = "agentic-rag"
QDRANT_URL = "<replace>"
QDRANT_API_KEY = "<replace>"
vector_db = Qdrant(
collection=COLLECTION_NAME,
url=QDRANT_URL,
api_key=QDRANT_API_KEY,
)
Here, you’ll define the sources from which the agent will pull its knowledge. In this example, we are building a Document analyzer agent that can make our job easy to answer questions from the website. We will use the Qdrant document website URL for indexing.
The WebsiteKnowledgeBase object interacts with the Qdrant vector database to store the indexed knowledge from the provided URL. It’s then loaded into the knowledge base for retrieval by the agent.
Note: Remember that we use the load function to index the data source into the knowledge base. This needs to run only once per collection name; run it again only if you change the collection name and want to add new data.
URL = "https://qdrant.tech/documentation/overview/"
knowledge_base = WebsiteKnowledgeBase(
urls = [URL],
max_links = 10,
vector_db = vector_db,
)
knowledge_base.load() # only run once, after the collection is created, comment this
The Agent configures an LLM (GPT-4o) for response generation, a knowledge base for information retrieval, and an SQLite storage system that tracks interactions and responses as memory. It also sets up a DuckDuckGo search tool for additional web searches when needed. This setup forms the core AI agent capable of answering queries.
We will set show_tool_calls to True to observe the backend runtime execution and track whether the query is routed to the knowledge base or the DuckDuckGo search tool. When you run this cell, it will create a database file where all messages are saved, by enabling memory storage and setting add_history_to_messages to True.
agent = Agent(
model=OpenAIChat(id="gpt-4o"),
knowledge=knowledge_base,
tools=[DuckDuckGo()],
show_tool_calls=True,
markdown=True,
storage=SqlAgentStorage(table_name="agentic_rag", db_file="agents_rag.db"),
add_history_to_messages=True,
)
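To build intuition for what the SQLite-backed memory does, here is a toy sketch of session storage: each message is persisted, and prior turns are replayed when history is requested, which is what add_history_to_messages enables. The table and column names below are illustrative assumptions, not phidata's actual schema.

```python
# Toy sketch of session memory backed by SQLite, mimicking what
# SqlAgentStorage provides. Table/column names here are hypothetical,
# not phidata's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")  # phidata would write agents_rag.db on disk
conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")

def save(session_id: str, role: str, content: str) -> None:
    # Persist one turn of the conversation.
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (session_id, role, content))

def history(session_id: str) -> list:
    # The prior turns that get prepended to the prompt so the agent
    # remembers the conversation across calls.
    cur = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ?", (session_id,)
    )
    return cur.fetchall()

save("s1", "user", "what are the indexing techniques mentioned in the document?")
save("s1", "assistant", "The document describes vector indexing in Qdrant.")
print(history("s1"))  # both turns are available for the next prompt
```

With add_history_to_messages=True, phidata performs this replay automatically, so follow-up questions can refer back to earlier answers.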
Finally, the agent is ready to process user queries. By calling the print_response() function, you pass in a user query, and the agent responds by retrieving relevant information from the knowledge base and processing it. If the query cannot be answered from the knowledge base, the agent uses the search tool instead. Let's observe the difference.
agent.print_response(
"what are the indexing techniques mentioned in the document?",
stream=True
)
agent.print_response(
"who is Virat Kohli?",
stream=True
)
Discover the key advantages of Agentic RAG, where intelligent agents and retrieval-augmented generation combine to optimize data retrieval and decision-making.
Implementing Agentic RAG with memory components provides a reliable solution for building intelligent knowledge retrieval systems and search engines. In this article, we explored what Agents and RAG are, and how to combine them. In an Agentic RAG setup, query routing improves thanks to the decision-making capabilities of the agents.
A. Yes, Phidata is built to support multimodal AI agents capable of handling tasks involving images, videos, and audio. It integrates tools like DALL-E and Replicate for text-to-image or text-to-video generation, and utilizes multimodal LLMs such as GPT-4, Gemini 2.0, and Claude AI for image-to-text and video-to-text tasks.
A. Developing Agentic Retrieval-Augmented Generation (RAG) systems involves utilizing various tools and frameworks that facilitate the integration of autonomous agents with retrieval and generation capabilities. Here are some tools and frameworks available for this purpose: Langchain, LlamaIndex, Phidata, CrewAI, and AutoGen.
A. Yes, Phidata allows the integration of various tools and knowledge bases. For instance, it can connect with financial data tools like YFinance for real-time stock analysis or web search tools like DuckDuckGo for retrieving up-to-date information. This flexibility enables the creation of specialized agents tailored to specific use cases.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.