How to Build Agentic RAG with SmolAgents?

Pankaj Singh Last Updated : 15 Jan, 2025
10 min read

Building an Agentic Retrieval-Augmented Generation (RAG) system with SmolAgents enables the development of AI agents capable of autonomous decision-making and task execution. SmolAgents, a minimalist library by Hugging Face, facilitates the creation of such agents in a concise and efficient manner. In this article, we will go step by step to build the Agentic RAG with SmolAgents.

Read on!

What is SmolAgents?

SmolAgents is a cutting-edge library developed by Hugging Face to simplify the creation of intelligent agents capable of performing complex tasks. Designed with a minimalist philosophy, the library encapsulates agent logic in approximately 1,000 lines of code, making it both lightweight and powerful. Its intuitive design ensures ease of use without compromising on advanced functionalities.

Core Features of SmolAgents

Core Features of SmolAgents
Source: Hugging Face
  1. Code Agents:
    • SmolAgents specializes in “Code Agents” that can autonomously generate and execute code to perform user-defined tasks.
    • These agents can be securely executed in sandboxed environments like E2B, ensuring safe operation without unintended consequences.
  2. ToolCallingAgents:
    • Traditional agents are capable of utilizing JSON or text-based actions to interact with the tool.  But the SmolAgent follows: “Thought: I should call tool ‘get_weather’. Action: get_weather(Paris).”) format.
    • These are ideal for scenarios requiring structured outputs and integration with various APIs or functions.
  3. Extensive Integrations:
    • Supports various large language models (LLMs), including Hugging Face’s inference API, OpenAI, Anthropic, and more through LiteLLM.
    • Provides access to a shared tool repository on Hugging Face Hub, enhancing flexibility and adaptability.
  4. Streamlined Architecture:
    • Provides robust building blocks for complex agent behaviours like tool calling and multi-step task execution.
    • Tailored for developers seeking a balance between simplicity and functionality.

Components of SmolAgents

To create advanced agents, several elements work together:

  1. LLM Core: Powers decision-making and actions of the agent.
  2. Tool Repository: A predefined list of accessible tools the agent can use for task execution.
  3. Parser: Processes the LLM’s outputs to extract actionable information.
  4. System Prompt: Provides clear instructions and aligns with the parser to ensure consistent outputs.
  5. Memory: Maintains context across iterations, crucial for multi-step agents.
  6. Error Logging and Retry Mechanisms: Enhances system resilience and efficiency.

SmolAgents integrates these elements seamlessly, saving developers from the complexity of building such systems from scratch.

It is a powerful tool for developers looking to harness the potential of agentic AI while maintaining simplicity and scalability. Whether for creating autonomous code executors or dynamic multi-step agents, SmolAgents offers the foundation for building cutting-edge applications.

Understanding Agentic RAG

Agentic RAG combines traditional Retrieval-Augmented Generation with agentic capabilities, allowing AI systems to not only retrieve and generate information but also to reason, plan, and interact with external tools dynamically. This integration enhances the system’s ability to handle complex tasks by decomposing queries, retrieving relevant information, and iteratively refining responses.

Also read: 7 Agentic RAG System Architectures to Build AI Agents

Key Benefits of Combining SmolAgents and Agentic RAG:

  1. Intelligence: SmolAgents adds reasoning, planning, and tool-calling capabilities to the RAG pipeline.
  2. Adaptability: The agent can dynamically adjust its actions based on the retrieved data.
  3. Efficiency: Reduces manual intervention by automating iterative processes.
  4. Security: Ensures safe execution of external code and queries.
  5. Scalability: Enables complex workflows that can be easily scaled or modified for different domains.

SmolAgents and Agentic RAG complement each other by bringing together robust retrieval-augmented generation with dynamic, agentic reasoning and interaction capabilities. This synergy empowers developers to create intelligent, autonomous systems capable of handling sophisticated tasks across various domains.

Also read: RAG vs Agentic RAG: A Comprehensive Guide

Agentic RAG Hands-on with SmolAgents

Agentic RAG with SmolAgents
Source: Author

Here’s the process of building the Agentic RAG with SmolAgents: First, we load and process data from a PDF document, splitting it into manageable chunks and generating embeddings to enable semantic search. These embeddings are stored in a vector database, allowing the system to retrieve the most relevant information in response to user queries. For external queries or additional context, a search agent is employed to fetch and integrate data from external sources. This combination of document retrieval and external search capabilities ensures that the system can provide comprehensive and accurate answers to a wide range of questions.

Required Python Packages

%pip install pypdf -q
%pip install faiss-cpu -q
!pip install -U langchain-community

Explanation:

  • pypdf: A library for working with PDF files.
  • faiss-cpu: A library for efficient similarity search and clustering of dense vectors.
  • langchain-community: A library for building applications with language models.

Importing Required Libraries

from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS
# The other imports are likely correct and can remain as they are:
from langchain_openai import OpenAIEmbeddings
from langchain_openai.llms import OpenAI
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

Explanation:

These imports bring in the necessary modules for:

  • Loading PDF documents (PyPDFLoader).
  • Storing and searching vectors (FAISS).
  • Generating embeddings using OpenAI (OpenAIEmbeddings).
  • Splitting text into smaller chunks (RecursiveCharacterTextSplitter).

Loading and Splitting the PDF Document

loader = PyPDFLoader("/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf")
pages = loader.load()

Explantion:

The PyPDFLoader loads the PDF file and extracts its pages into a list of Document objects.

for page in pages:
   print(page.page_content)

Output Showing PDF Content:

Output showing PDF content
splitter = RecursiveCharacterTextSplitter(
   chunk_size=1000,
   chunk_overlap=200,
)
splitted_docs = splitter.split_documents(pages) # split_document accepts a list of documents

Explantion:

The RecursiveCharacterTextSplitter splits the document into smaller chunks:

  • chunk_size=1000: Each chunk contains up to 1000 characters.
  • chunk_overlap=200: Adjacent chunks overlap by 200 characters to ensure context is preserved.
print(len(splitted_docs))

Output:

21

First Chunk

print(splitted_docs[0])

This prints the number of chunks and the content of the first three chunks.

Output:

page_content='THE CENTRE FOR HUMANITARIAN DATA  

 1

DECEMBER 2020

1     Because there are well-established and accepted standards and
mechanisms for sharing financial information with donors, including a role
for external audits, requests for financial data are not included in this
guidance note. This guidance note deals with sensitive personal and non-
personal data.

2    Roepstorff, K., Faltas, C. and Hövelmann, S., 2020. Counterterrorism
Measures and Sanction Regimes: Shrinking Space for Humanitarian Aid
Organisations.   

THE CENTRE FOR HUMANITARIAN DATA

GUIDANCE NOTE SERIES 

DATA RESPONSIBILITY IN HUMANITARIAN ACTION

RESPONSIBLE DATA SHARING WITH DONORS

KEY TAKEAWAYS:

• Sharing sensitive personal and non-personal data without adequate
safeguards can exacerbate risks for crisis-affected people, humanitarian
organizations and donors.

• Donors regularly request data from the organizations they fund in order to
fulfil their obligations' metadata={'source': '/content/RESPONSIBLE DATA
SHARING WITH DONORS.pdf', 'page': 0}

Explantion:

  • Contains the first 1,000 characters of the document.
  • Includes the overlap with the next chunk (the last 200 characters are duplicated in Chunk 2).

Second Chunk

print(splitted_docs[1])

Output:

page_content='risks for crisis-affected people, humanitarian organizations
and donors.

• Donors regularly request data from the organizations they fund in order to
fulfil their obligations and objectives. Some of these requests relate to
sensitive information and data which needs to be protected in order to
mitigate risk.

• Common objectives for data sharing with donors include: (i) situational
awareness and programme design; (ii) accountability and transparency; and
(iii) legal, regulatory, and policy requirements.

• Common constraints related to sharing data with donors include: (i) lack of
regulatory framework for responsibly managing sensitive non-personal data;
(ii) capacity gaps; and (iii) purpose limitation.

• Donors and humanitarian organizations can take the following steps to
minimize risks while maximizing benefits when sharing sensitive data: (i)
reviewing and clarifying the formal or' metadata={'source':
'/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf', 'page': 0}

Explanation:

  • Starts at the 800th character of the document (due to 200-character overlap) and continues for another 1,000 characters.
print(splitted_docs[2])

Output:

page_content='limitation.

• Donors and humanitarian organizations can take the following steps to
minimize risks while maximizing benefits when sharing sensitive data: (i)
reviewing and clarifying the formal or informal frameworks that govern the
collection and sharing of disaggregated data; (ii) formalizing and
standardising requests for sensitive data; (iii) investing in data
management capacities of staff and organisations; and (iv) adopting common
principles for donor data management.

INTRODUCTION

Donors have an important role in the humanitarian data ecosystem, both as
drivers of increased data collection and analysis, and as direct users of
data. This is not a new phenomenon; the need for accountability and
transparency in the use of donor funding is broadly understood and
respected. However, in recent years, donors have begun requesting data that
can be sensitive. This includes personal data about' metadata={'source':
'/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf', 'page': 0}

Third Chunk

  • Starts at the 1,600th character and continues similarly, maintaining the overlap with the previous chunk.

Generating Embeddings

from google.colab import userdata
openai_api_key = userdata.get('OPENAI_API_KEY')
# Set the API key for OpenAIEmbeddings
embed_model = OpenAIEmbeddings(openai_api_key=openai_api_key)
# Generate embeddings for the documents
embeddings = embed_model.embed_documents([chunk.page_content for chunk in splitted_docs])
print(f"Embeddings shape: {len(embeddings), len(embeddings[0])}")

Explanation:

  • Initializes the OpenAIEmbeddings model and generates embeddings for each chunk of text.
  • The embeddings are numerical representations of the text that capture its semantic meaning.

Output:

Generating Embeddings
vector_db = FAISS.from_documents(
   documents = splitted_docs,
   embedding = embed_model)

Explanation:

  • Creates a FAISS vector database from the document chunks and their embeddings.
  • FAISS allows for efficient similarity search over the embeddings.
similar_docs = vector_db.similarity_search("what is the objective for data sharing with donors?", k=5)
print(similar_docs[0].page_content)

Explanation:

  • Performs a similarity search to find the top 5 document chunks most relevant to the query.
  • Prints the content of the most relevant chunk.

Output:

in capacity to fulfil donor requirements might also deter smaller and/or
local NGOs from seeking funding, undermining localization efforts.13

OBJECTIVES FOR DATA SHARING WITH DONORS

The most commonly identified objectives for donors requesting sensitive data
from partners are situational awareness and programme design; accountability
and transparency; and legal, regulatory, and policy requirements. 

Situational awareness and programme design

Donors seek information and data from humanitarian organizations in order to
understand and react to changes in humanitarian contexts. This allows donors
to improve their own programme design and evaluation, prevent duplication of
assistance, identify information gaps, and ensure appropriate targeting of
assistance.

Accountability and transparency

Donors and humanitarian organizations have an obligation to account for their
activities. Data can enable donors to explain and defend funding on foreign
aid to taxpayers.

SmolAgents

! pip -q install smolagents
! pip -q install litellm

Explanation:

  • Installs additional libraries:
    • smolagents: A library for building agents with tools.
    • litellm: A library for interacting with language models.

Defining a Retriever Tool

from smolagents import Tool
class RetrieverTool(Tool):
   name = "retriever"
   description = "Uses semantic search to retrieve the parts of the documentation that could be most relevant to answer your query."
   inputs = {
       "query": {
           "type": "string",
           "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
       }
   }
   output_type = "string"
   def __init__(self, vector_db, **kwargs):  # Add vector_db as an argument
       super().__init__(**kwargs)
       self.vector_db = vector_db  # Store the vector database
   def forward(self, query: str) -> str:
       assert isinstance(query, str), "Your search query must be a string"
       docs = self.vector_db.similarity_search(query, k=4)  # Perform search here
       return "\nRetrieved documents:\n" + "".join(
           [
               f"\n\n===== Document {str(i)} =====\n" + doc.page_content
               for i, doc in enumerate(docs)
           ]
       )
retriever_tool = RetrieverTool(vector_db=vector_db)  # Pass vector_db during instantiation

Explanation:

  • Defines a custom RetrieverTool that uses the FAISS vector database to perform the semantic search.
  • The tool takes a query as input and returns the most relevant document chunks.

Setting Up the Agent

from smolagents import LiteLLMModel, DuckDuckGoSearchTool
model = LiteLLMModel(model_id="gpt-4o", api_key = "your_api_key")
search_tool = DuckDuckGoSearchTool()
from smolagents import HfApiModel, CodeAgent
agent = CodeAgent(
   tools=[retriever_tool,search_tool], model=model, max_steps=6, verbose=True
)

Explanation:

  • Initializes a language model (LiteLLMModel) and a web search tool (DuckDuckGoSearchTool).
  • Creates a CodeAgent that can use the retriever tool and search tool to answer queries.
agent.run("Tell me about Analytics Vidhya")

Output:

Output
Analytics Vidhya is a leading platform for professionals in Artificial
Intelligence, Data Science, and Data Engineering. It offers various
educational resources including courses, blogs, guides, and hackathons to
facilitate learning and networking. The platform is well-known for providing
mentorship, industry-relevant content, and tools for both beginners and
experienced practitioners in the AI domain.
agent_output = agent.run("what are the constraints for data sharing with donors?")
print("Final output:")
print(agent_output)
output

Final Output:

Constraints for data sharing with donors include:
1. Lack of regulatory framework for responsibly managing sensitive non-
personal data.
2. Capacity gaps within organizations.
3. Purpose limitation, where data should be collected only for specified,
explicit, and legitimate purposes and not processed further in a way incompatible with those purposes.
4. Need for formalization and standardization of data requests from donors.
5. Requirement for common principles and guidelines for donor data
management.
agent_output1 = agent.run("WHAT ARE THE OBJECTIVES FOR DATA SHARING WITH DONORS?")
Output

Final Output:

The objectives for data sharing with donors include: 1. Situational awareness
and programme design - to improve understanding of contexts, enhance
programme design, prevent duplication, and ensure appropriate targeting of
assistance. 2. Accountability and transparency - to account for activities
and explain foreign aid funding to taxpayers. 3. Legal, regulatory, and 
policy requirements - to ensure compliance with national and international
laws, including counter-terrorism, migration, and other legal standards.

Here we did the following:

  1. Loads and processes a PDF document.
  2. Splits the document into smaller chunks.
  3. Generates embeddings for the chunks.
  4. Creates a vector database for semantic search.
  5. Defines a custom tool for retrieving relevant document chunks.
  6. Sets up an agent (using SmolAgents) to answer queries using the retriever tool and a language model.

To build Agentic Rag explore this also: A Comprehensive Guide to Building Agentic RAG Systems with LangGraph

What are the Advantages of Using SmolAgents for Agentic RAG?

Advantages of Using SmolAgents for Agentic RAG:

  • Simplicity: SmolAgents allows the creation of powerful agents in minimal lines of code, streamlining the development process.
  • Flexibility: Supports integration with various large language models and tools, enabling customization for specific tasks.
  • Security: Facilitates execution within sandboxed environments, ensuring safe and controlled operations.

By leveraging SmolAgents, you can easily build Agentic RAG systems that are capable of complex reasoning and dynamic interaction with external data sources, enhancing the overall performance and applicability of AI solutions.

Also read: Top 4 Agentic AI Design Patterns for Architecting AI Systems

Conclusion

The combination of SmolAgents and Agentic RAG represents a significant advancement in building intelligent, autonomous systems. SmolAgents’ minimalist yet powerful framework, paired with the dynamic retrieval and reasoning capabilities of Agentic RAG, enables the creation of AI agents that can handle complex tasks efficiently. This synergy enhances adaptability, security, and scalability, making it ideal for applications in research, decision-making, and automation. Together, they pave the way for next-generation AI systems that are both intelligent and autonomous.

Explore the Agentic AI Pioneer Program to deepen your understanding of Agent AI and unlock its full potential. Join us on this journey to discover innovative insights and applications!

Hi, I am Pankaj Singh Negi - Senior Content Editor | Passionate about storytelling and crafting compelling narratives that transform ideas into impactful content. I love reading about technology revolutionizing our lifestyle.

Responses From Readers

Clear

We use cookies essential for this site to function well. Please click to help us improve its usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy & Cookies Policy.

Show details