When building applications using Large Language Models (LLMs), the quality of responses heavily depends on effective planning and reasoning capabilities for a given user task. While traditional RAG techniques are powerful, incorporating Agentic workflows can significantly enhance the system’s ability to process and respond to queries.
In this article, you will build an Agentic RAG system with memory components using the Phidata open-source Agentic framework, demonstrating how to combine a vector database (Qdrant), embedding models, and intelligent agents for improved results.
This article was published as a part of the Data Science Blogathon.
Agents in the context of AI are components designed to emulate human-like thinking and planning capabilities. An agent's components consist of:
RAG (Retrieval-Augmented Generation) combines knowledge retrieval with LLM capabilities. When we integrate agents into RAG systems, we create a powerful workflow that can:
The key difference between traditional RAG and Agentic RAG lies in the decision-making layer that determines how to process each query and interact with tools to get real-time information.
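To make the decision-making layer concrete, here is a minimal sketch of how a query router might choose between the knowledge base and a live search tool. The function names and the keyword-overlap heuristic are purely illustrative assumptions, not Phidata's API; real agents let the LLM make this choice.

```python
# Toy sketch of the decision-making layer in an Agentic RAG system.
# Function names and the routing heuristic are illustrative, not Phidata's API.

def search_knowledge_base(query: str) -> str:
    # Placeholder for a vector-database lookup (e.g., Qdrant).
    return f"[KB] documents matching: {query}"

def search_web(query: str) -> str:
    # Placeholder for a live web-search tool (e.g., DuckDuckGo).
    return f"[WEB] live results for: {query}"

def route_query(query: str, kb_topics: set[str]) -> str:
    """Route to the knowledge base if the query mentions an indexed topic,
    otherwise fall back to real-time web search."""
    words = {w.lower().strip("?.,") for w in query.split()}
    if words & kb_topics:
        return search_knowledge_base(query)
    return search_web(query)

kb_topics = {"qdrant", "indexing", "vectors"}
print(route_query("What indexing techniques does Qdrant use?", kb_topics))  # routed to KB
print(route_query("Who won the match today?", kb_topics))                   # routed to web
```

In a real agent this routing is handled by the LLM reasoning over tool descriptions, but the control flow it produces looks much like the branch above.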
Now that we know Agentic RAG exists, how do we build it? Let's break it down.
Phidata is an open-source framework designed to build, monitor, and deploy Agentic workflows. It supports multimodal AI agents equipped with memory, knowledge, tools, and reasoning capabilities. Its model-agnostic architecture ensures compatibility with various large language models (LLMs), enabling developers to transform any LLM into a functional AI agent. Additionally, Phidata allows you to deploy your Agent workflows using a bring your own cloud (BYOC) approach, offering both flexibility and control over your AI systems.
Key features of Phidata include the ability to build teams of agents that collaborate to solve complex problems, a user-friendly Agent UI for seamless interaction (Phidata playground), and built-in support for agentic retrieval-augmented generation (RAG) and structured outputs. The framework also emphasizes monitoring and debugging, providing tools to ensure robust and reliable AI applications.
Explore the transformative power of Agent-based systems in real-world applications, leveraging Phidata to enhance decision-making and task automation.
By integrating tools like YFinance, Phidata allows the creation of agents that can fetch real-time stock prices, analyze financial data, and summarize analyst recommendations. Such agents assist investors and analysts in making informed decisions by providing up-to-date market insights.
Phidata also helps develop agents capable of retrieving real-time information from the web using search tools like DuckDuckGo, SerpAPI, or Serper. These agents can answer user queries by sourcing the latest data, making them valuable for research and information-gathering tasks.
Phidata also supports multimodal capabilities, enabling the creation of agents that analyze images, videos, and audio. These multimodal agents can handle tasks such as image recognition, text-to-image generation, audio transcription, and video analysis, offering versatile solutions across various domains. For text-to-image or text-to-video tasks, tools like DALL-E and Replicate can be integrated, while for image-to-text and video-to-text tasks, multimodal LLMs such as GPT-4, Gemini 2.0, Claude AI, and others can be utilized.
Imagine you have documentation for your startup and want to create a chat assistant that can answer user questions based on that documentation. To make your chatbot more intelligent, it also needs to handle real-time data. Typically, answering real-time data queries requires either rebuilding the knowledge base or retraining the model.
This is where Agents come into play. By combining the knowledge base with Agents, you can create an Agentic RAG (Retrieval-Augmented Generation) solution that not only improves the chatbot’s ability to retrieve accurate answers but also enhances its overall performance.
We have three main components that come together to form our knowledge base. First, we have Data sources, like documentation pages, PDFs, or any websites we want to use. Then we have Qdrant, which is our vector database – it’s like a smart storage system that helps us find similar information quickly. And finally, we have the embedding model that converts our text into a format that computers can understand better. These three components feed into our knowledge base, which is like the brain of our system.
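The flow from text to "smart storage" can be sketched with a toy example. Real systems use a learned embedding model (e.g., OpenAI's) and Qdrant; here a bag-of-words vector and cosine similarity stand in for both, purely to illustrate the store-then-search loop.

```python
# Toy illustration of what the embedding model and vector database do together.
# A bag-of-words Counter stands in for a real embedding; a list stands in for Qdrant.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: word counts as a sparse vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Knowledge base": documents stored alongside their vectors.
docs = [
    "Qdrant is a vector database for similarity search",
    "Phidata builds agents with memory and tools",
]
store = [(doc, embed(doc)) for doc in docs]

def retrieve(query: str) -> str:
    # Embed the query, then return the most similar stored document.
    qv = embed(query)
    return max(store, key=lambda pair: cosine(qv, pair[1]))[0]

print(retrieve("what is a vector database?"))  # returns the Qdrant sentence
```

Qdrant does the same thing at scale: it stores high-dimensional vectors with payloads and answers nearest-neighbor queries efficiently.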
Now we define the Agent object from Phidata.
The agent is connected to three components:
Note: Here the knowledge base and DuckDuckGo both act as tools, and based on the user query the Agent decides which tool to use to generate the response. The embedding model defaults to OpenAI, so we will use OpenAI's GPT-4o as the reasoning model.
Let’s build this code.
It's time to build a Document Analyzer Assistant Agent that answers from personal information (a website) in the knowledge base, and falls back to DuckDuckGo when the knowledge base lacks the needed context.
To build the Agentic RAG workflow we need to install a few libraries that include:
pip install phidata openai duckduckgo-search qdrant-client sqlalchemy beautifulsoup4
In this step, we will set up the environment variables and gather the required API credentials to run this use case. You can get your OpenAI API key from https://platform.openai.com/: create an account and generate a new key.
from phi.knowledge.website import WebsiteKnowledgeBase
from phi.vectordb.qdrant import Qdrant
from phi.agent import Agent
from phi.storage.agent.sqlite import SqlAgentStorage
from phi.model.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo
import os
os.environ['OPENAI_API_KEY'] = "<replace>"
Next, initialize the Qdrant client by providing the collection name, URL, and API key for your vector database. The Qdrant database stores and indexes the knowledge from the website, allowing the agent to retrieve relevant information based on user queries. This step sets up the data layer for your agent:
COLLECTION_NAME = "agentic-rag"
QDRANT_URL = "<replace>"
QDRANT_API_KEY = "<replace>"
vector_db = Qdrant(
collection=COLLECTION_NAME,
url=QDRANT_URL,
api_key=QDRANT_API_KEY,
)
Here, you’ll define the sources from which the agent will pull its knowledge. In this example, we are building a Document analyzer agent that can make our job easy to answer questions from the website. We will use the Qdrant document website URL for indexing.
The WebsiteKnowledgeBase object interacts with the Qdrant vector database to store the indexed knowledge from the provided URL. It’s then loaded into the knowledge base for retrieval by the agent.
Note: Remember that we use the load function to index the data source into the knowledge base. This needs to run only once per collection name; run it again only if you change the collection name and want to add new data.
URL = "https://qdrant.tech/documentation/overview/"
knowledge_base = WebsiteKnowledgeBase(
urls = [URL],
max_links = 10,
vector_db = vector_db,
)
knowledge_base.load() # only run once, after the collection is created, comment this
The Agent configures an LLM (GPT-4o) for response generation, a knowledge base for information retrieval, and an SQLite storage system that tracks interactions and responses as memory. It also sets up a DuckDuckGo search tool for additional web searches when needed. This setup forms the core AI agent capable of answering queries.
We will set show_tool_calls to True to observe the backend runtime execution and track whether the query is routed to the knowledge base or the DuckDuckGo search tool. When you run this cell, it will create a database file where all messages are saved, by enabling memory storage and setting add_history_to_messages to True.
agent = Agent(
model=OpenAIChat(id="gpt-4o"),
knowledge=knowledge_base,
tools=[DuckDuckGo()],
show_tool_calls=True,
markdown=True,
storage=SqlAgentStorage(table_name="agentic_rag", db_file="agents_rag.db"),
add_history_to_messages=True,
)
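To build intuition for what the SQLite-backed memory does, here is a toy sketch of session storage: each message is persisted, and prior turns are replayed when history is requested, which is what add_history_to_messages enables. The table and column names below are illustrative assumptions, not phidata's actual schema.

```python
# Toy sketch of session memory backed by SQLite, mimicking what
# SqlAgentStorage provides. Table/column names here are hypothetical,
# not phidata's real schema.
import sqlite3

conn = sqlite3.connect(":memory:")  # phidata would write agents_rag.db on disk
conn.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")

def save(session_id: str, role: str, content: str) -> None:
    # Persist one turn of the conversation.
    conn.execute("INSERT INTO messages VALUES (?, ?, ?)", (session_id, role, content))

def history(session_id: str) -> list:
    # The prior turns that get prepended to the prompt so the agent
    # remembers the conversation across calls.
    cur = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ?", (session_id,)
    )
    return cur.fetchall()

save("s1", "user", "what are the indexing techniques mentioned in the document?")
save("s1", "assistant", "The document describes vector indexing in Qdrant.")
print(history("s1"))  # both turns are available for the next prompt
```

With add_history_to_messages=True, phidata performs this replay automatically, so follow-up questions can refer back to earlier answers.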
Finally, the agent is ready to process user queries. By calling the print_response() function, you pass in a user query, and the agent responds by retrieving relevant information from the knowledge base and processing it. If the query cannot be answered from the knowledge base, the agent uses the search tool instead. Let's observe the difference.
agent.print_response(
"what are the indexing techniques mentioned in the document?",
stream=True
)
agent.print_response(
"who is Virat Kohli?",
stream=True
)
Discover the key advantages of Agentic RAG, where intelligent agents and retrieval-augmented generation combine to optimize data retrieval and decision-making.
Implementing Agentic RAG with memory components provides a reliable solution for building intelligent knowledge retrieval systems and search engines. In this article, we explored what Agents and RAG are, and how to combine them. In an Agentic RAG setup, query routing improves thanks to the decision-making capabilities of the agents.
A. Yes, Phidata is built to support multimodal AI agents capable of handling tasks involving images, videos, and audio. It integrates tools like DALL-E and Replicate for text-to-image or text-to-video generation, and utilizes multimodal LLMs such as GPT-4, Gemini 2.0, and Claude AI for image-to-text and video-to-text tasks.
A. Developing Agentic Retrieval-Augmented Generation (RAG) systems involves utilizing various tools and frameworks that facilitate the integration of autonomous agents with retrieval and generation capabilities. Here are some tools and frameworks available for this purpose: Langchain, LlamaIndex, Phidata, CrewAI, and AutoGen.
A. Yes, Phidata allows the integration of various tools and knowledge bases. For instance, it can connect with financial data tools like YFinance for real-time stock analysis or web search tools like DuckDuckGo for retrieving up-to-date information. This flexibility enables the creation of specialized agents tailored to specific use cases.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.