How to Build Agentic RAG with SmolAgents?

Pankaj Singh Last Updated : 15 Jan, 2025

10 min read

Building an Agentic Retrieval-Augmented Generation (RAG) system with SmolAgents enables the development of AI agents capable of autonomous decision-making and task execution. SmolAgents, a minimalist library by Hugging Face, facilitates the creation of such agents in a concise and efficient manner. In this article, we will build an Agentic RAG system with SmolAgents step by step.

Read on!

What is SmolAgents?
Core Features of SmolAgents
Components of SmolAgents
Understanding Agentic RAG
Agentic RAG Hands-on with SmolAgents
What are the Advantages of Using SmolAgents for Agentic RAG?
Conclusion

What is SmolAgents?

SmolAgents is a cutting-edge library developed by Hugging Face to simplify the creation of intelligent agents capable of performing complex tasks. Designed with a minimalist philosophy, the library encapsulates agent logic in approximately 1,000 lines of code, making it both lightweight and powerful. Its intuitive design ensures ease of use without compromising on advanced functionalities.

Core Features of SmolAgents

Code Agents:
- SmolAgents specializes in “Code Agents” that can autonomously generate and execute code to perform user-defined tasks.
- These agents can run in sandboxed environments such as E2B, which isolates the generated code from the host system and limits the impact of unintended actions.
ToolCallingAgents:
- Like traditional tool-calling agents, these express their actions as JSON or text rather than as executable code, following a format such as: “Thought: I should call tool ‘get_weather’. Action: get_weather(Paris).”
- These are ideal for scenarios requiring structured outputs and integration with various APIs or functions.
Extensive Integrations:
- Supports various large language models (LLMs), including Hugging Face’s inference API, OpenAI, Anthropic, and more through LiteLLM.
- Provides access to a shared tool repository on Hugging Face Hub, enhancing flexibility and adaptability.
Streamlined Architecture:
- Provides robust building blocks for complex agent behaviours like tool calling and multi-step task execution.
- Tailored for developers seeking a balance between simplicity and functionality.
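
To make the text-based action format above concrete, here is a minimal sketch of how such an output could be parsed and dispatched to a tool. It is not SmolAgents' own parser: the `get_weather` function, the `TOOLS` registry, and the regular expression are all assumptions made for illustration.

```python
import re

# Hypothetical tool and registry; the names are illustrative only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Matches "Action: tool_name(argument)" anywhere in the model output.
ACTION_PATTERN = re.compile(r"Action:\s*(\w+)\((.*?)\)")

def run_text_action(llm_output: str) -> str:
    """Extract `Action: tool(arg)` from the model output and call the tool."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        raise ValueError("No action found in model output")
    tool_name = match.group(1)
    raw_arg = match.group(2).strip().strip("'\"")  # drop optional quotes
    if tool_name not in TOOLS:
        raise KeyError(f"Unknown tool: {tool_name}")
    return TOOLS[tool_name](raw_arg)

output = "Thought: I should call tool 'get_weather'. Action: get_weather(Paris)."
print(run_text_action(output))
```

A Code Agent, by contrast, would have the model write the call itself as Python code, which is then executed.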

Components of SmolAgents

To create advanced agents, several elements work together:

LLM Core: Powers decision-making and actions of the agent.
Tool Repository: A predefined list of accessible tools the agent can use for task execution.
Parser: Processes the LLM’s outputs to extract actionable information.
System Prompt: Provides clear instructions and aligns with the parser to ensure consistent outputs.
Memory: Maintains context across iterations, crucial for multi-step agents.
Error Logging and Retry Mechanisms: Enhances system resilience and efficiency.

SmolAgents integrates these elements seamlessly, saving developers from the complexity of building such systems from scratch.
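
As a rough illustration of how these components interact, the sketch below wires a stubbed LLM core, a one-entry tool repository, a parser, a memory list, and a simple error-and-retry step into a loop. Everything here (`fake_llm`, `parse`, `run_agent`, the `add` tool) is hypothetical and far simpler than what SmolAgents actually does.

```python
from typing import Callable

def fake_llm(memory: list[str]) -> str:
    """Stand-in for the LLM core: picks the next step from the memory so far."""
    if not any(line.startswith("Observation:") for line in memory):
        return "Action: add(2, 3)"
    return "Final: the sum is " + memory[-1].removeprefix("Observation: ")

# Tool repository: the tools the agent is allowed to call.
TOOLS: dict[str, Callable[..., object]] = {"add": lambda a, b: a + b}

def parse(step: str) -> tuple[str, list[int]]:
    """Parser: turn 'Action: add(2, 3)' into ('add', [2, 3])."""
    name, _, args = step.removeprefix("Action: ").partition("(")
    return name, [int(a) for a in args.rstrip(")").split(",")]

def run_agent(task: str, max_steps: int = 4) -> str:
    memory = [f"Task: {task}"]  # memory keeps context across iterations
    for _ in range(max_steps):
        step = fake_llm(memory)
        if step.startswith("Final:"):
            return step.removeprefix("Final: ")
        try:
            name, args = parse(step)
            memory.append(f"Observation: {TOOLS[name](*args)}")
        except (KeyError, ValueError) as err:
            # Error logging and retry: record the failure for the next step.
            memory.append(f"Error: {err}")
    return "Stopped: step limit reached"

print(run_agent("What is 2 + 3?"))
```

The sketch needs Python 3.9 or later because it uses `str.removeprefix` and built-in generic types.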

It is a powerful tool for developers looking to harness the potential of agentic AI while maintaining simplicity and scalability. Whether for creating autonomous code executors or dynamic multi-step agents, SmolAgents offers the foundation for building cutting-edge applications.

Understanding Agentic RAG

Agentic RAG combines traditional Retrieval-Augmented Generation with agentic capabilities, allowing AI systems to not only retrieve and generate information but also to reason, plan, and interact with external tools dynamically. This integration enhances the system’s ability to handle complex tasks by decomposing queries, retrieving relevant information, and iteratively refining responses.
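
The decompose, retrieve, and refine cycle described above can be sketched with a toy example. The three-sentence corpus, the word-overlap retriever, and the "split on and" planner below are deliberately naive assumptions; in a real system an LLM plans the sub-queries and a vector store performs the retrieval.

```python
DOCS = [
    "FAISS stores dense vectors for similarity search.",
    "SmolAgents provides CodeAgent and ToolCallingAgent.",
    "Chunk overlap preserves context between adjacent chunks.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    words = set(query.lower().split())

    def overlap(doc: str) -> int:
        return len(words & set(doc.lower().rstrip(".").split()))

    return sorted(DOCS, key=overlap, reverse=True)[:k]

def decompose(question: str) -> list[str]:
    """Toy planner: split a compound question on ' and '."""
    return [part.strip(" ?") for part in question.split(" and ")]

def agentic_rag(question: str) -> list[str]:
    evidence: list[str] = []
    for sub_query in decompose(question):   # plan: one sub-query at a time
        for doc in retrieve(sub_query):     # retrieve for that sub-query
            if doc not in evidence:         # refine: keep only new evidence
                evidence.append(doc)
    return evidence

for doc in agentic_rag("what does faiss store and what is chunk overlap?"):
    print(doc)
```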

Also read: 7 Agentic RAG System Architectures to Build AI Agents

Key Benefits of Combining SmolAgents and Agentic RAG:

Intelligence: SmolAgents adds reasoning, planning, and tool-calling capabilities to the RAG pipeline.
Adaptability: The agent can dynamically adjust its actions based on the retrieved data.
Efficiency: Reduces manual intervention by automating iterative processes.
Security: Agent-generated code can be run in a sandboxed environment, which reduces the risk of executing it.
Scalability: Enables complex workflows that can be easily scaled or modified for different domains.

SmolAgents and Agentic RAG complement each other by bringing together robust retrieval-augmented generation with dynamic, agentic reasoning and interaction capabilities. This synergy empowers developers to create intelligent, autonomous systems capable of handling sophisticated tasks across various domains.

Also read: RAG vs Agentic RAG: A Comprehensive Guide

Agentic RAG Hands-on with SmolAgents

Here is how we build the Agentic RAG system with SmolAgents. First, we load a PDF document, split it into manageable chunks, and generate embeddings for those chunks to enable semantic search. The embeddings are stored in a vector database, so the system can retrieve the most relevant chunks in response to a user query. For questions that the document cannot answer, the agent also gets a web search tool that fetches information from external sources. Combining document retrieval with web search lets the system answer a wider range of questions than either approach alone.

Required Python Packages

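%pip install -q pypdf
%pip install -q faiss-cpu
%pip install -q -U langchain-community langchain-openai langchain-text-splitters

Explanation:

pypdf: A library for reading and working with PDF files.
faiss-cpu: A library for efficient similarity search and clustering of dense vectors.
langchain-community: Community-maintained LangChain integrations, including the PDF loader and the FAISS vector store used below.
langchain-openai: LangChain's OpenAI integration, which provides OpenAIEmbeddings.
langchain-text-splitters: Text-splitting utilities, including RecursiveCharacterTextSplitter.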

Importing Required Libraries

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

Explanation:

These imports bring in the necessary modules for:

Loading PDF documents (PyPDFLoader).
Storing and searching vectors (FAISS).
Generating embeddings using OpenAI (OpenAIEmbeddings).
Splitting text into smaller chunks (RecursiveCharacterTextSplitter).

Loading and Splitting the PDF Document

loader = PyPDFLoader("/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf")
pages = loader.load()

Explanation:

The PyPDFLoader loads the PDF file and extracts its pages into a list of Document objects.

for page in pages:
   print(page.page_content)

Output Showing PDF Content:

splitter = RecursiveCharacterTextSplitter(
   chunk_size=1000,
   chunk_overlap=200,
)
splitted_docs = splitter.split_documents(pages)  # split_documents accepts a list of documents

Explanation:

The RecursiveCharacterTextSplitter splits the document into smaller chunks:

chunk_size=1000: Each chunk contains up to 1000 characters.
chunk_overlap=200: Adjacent chunks overlap by 200 characters to ensure context is preserved.
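
To see what chunk_size and chunk_overlap mean in practice, here is a simplified fixed-window chunker applied to a short string. It is only an approximation for illustration: RecursiveCharacterTextSplitter prefers to break at separators such as paragraphs and sentences, so its chunks are often shorter than chunk_size. The `sliding_chunks` function is our own helper, not part of LangChain.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Naive chunking: each window starts chunk_size - chunk_overlap after the last."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

chunks = sliding_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each printed chunk begins with the last four characters of the previous one, which is the same idea as the 200-character overlap used above.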

print(len(splitted_docs))

Output:

First Chunk

print(splitted_docs[0])

The earlier print(len(splitted_docs)) call reports the number of chunks; this call and the next two print the first three chunks, one at a time.

Output:

page_content='THE CENTRE FOR HUMANITARIAN DATA  

 1

DECEMBER 2020

1     Because there are well-established and accepted standards and
 mechanisms for sharing financial information with donors, including a role
 for external audits, requests for financial data are not included in this
 guidance note. This guidance note deals with sensitive personal and non-
personal data.

2    Roepstorff, K., Faltas, C. and Hövelmann, S., 2020. Counterterrorism
 Measures and Sanction Regimes: Shrinking Space for Humanitarian Aid
 Organisations.   

THE CENTRE FOR HUMANITARIAN DATA

GUIDANCE NOTE SERIES 

DATA RESPONSIBILITY IN HUMANITARIAN ACTION

RESPONSIBLE DATA SHARING WITH DONORS

KEY TAKEAWAYS:

• Sharing sensitive personal and non-personal data without adequate
 safeguards can exacerbate risks for crisis-affected people, humanitarian
 organizations and donors.

• Donors regularly request data from the organizations they fund in order to
 fulfil their obligations' metadata={'source': '/content/RESPONSIBLE DATA
 SHARING WITH DONORS.pdf', 'page': 0}

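Explanation:

Contains roughly the first 1,000 characters of the first page. The splitter breaks at natural separators, so a chunk can be shorter than chunk_size.
Its tail overlaps with the start of the next chunk (up to 200 characters are repeated in Chunk 2).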

Second Chunk

print(splitted_docs[1])

Output:

page_content='risks for crisis-affected people, humanitarian organizations
 and donors.

• Donors regularly request data from the organizations they fund in order to
 fulfil their obligations and objectives. Some of these requests relate to
 sensitive information and data which needs to be protected in order to
 mitigate risk.

• Common objectives for data sharing with donors include: (i) situational
 awareness and programme design; (ii) accountability and transparency; and
 (iii) legal, regulatory, and policy requirements.

• Common constraints related to sharing data with donors include: (i) lack of
 regulatory framework for responsibly managing sensitive non-personal data;
 (ii) capacity gaps; and (iii) purpose limitation.

• Donors and humanitarian organizations can take the following steps to
 minimize risks while maximizing benefits when sharing sensitive data: (i)
 reviewing and clarifying the formal or' metadata={'source':
 '/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf', 'page': 0}

Explanation:

Starts roughly 800 characters into the page (because of the overlap of up to 200 characters) and runs for up to 1,000 characters.

Third Chunk

print(splitted_docs[2])

Output:

page_content='limitation.

• Donors and humanitarian organizations can take the following steps to
 minimize risks while maximizing benefits when sharing sensitive data: (i)
 reviewing and clarifying the formal or informal frameworks that govern the
 collection and sharing of disaggregated data; (ii) formalizing and
 standardising requests for sensitive data; (iii) investing in data
 management capacities of staff and organisations; and (iv) adopting common
 principles for donor data management.

INTRODUCTION

Donors have an important role in the humanitarian data ecosystem, both as
 drivers of increased data collection and analysis, and as direct users of
 data. This is not a new phenomenon; the need for accountability and
 transparency in the use of donor funding is broadly understood and
 respected. However, in recent years, donors have begun requesting data that
 can be sensitive. This includes personal data about' metadata={'source':
 '/content/RESPONSIBLE DATA SHARING WITH DONORS.pdf', 'page': 0}

Explanation:

Starts roughly 1,600 characters into the page and continues in the same way, overlapping with the end of the second chunk.

Generating Embeddings

from google.colab import userdata
openai_api_key = userdata.get('OPENAI_API_KEY')
# Set the API key for OpenAIEmbeddings
embed_model = OpenAIEmbeddings(openai_api_key=openai_api_key)
# Generate embeddings for the documents
embeddings = embed_model.embed_documents([chunk.page_content for chunk in splitted_docs])
print(f"Embeddings shape: {len(embeddings), len(embeddings[0])}")

Explanation:

Initializes the OpenAIEmbeddings model and generates embeddings for each chunk of text.
The embeddings are numerical representations of the text that capture its semantic meaning.
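
One way to build intuition for "semantic meaning as numbers" is to compare vectors with cosine similarity: texts with similar meaning should get vectors that point in similar directions. The three-dimensional vectors below are made up for illustration and are not real OpenAI embeddings, which have far more dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy "embeddings" for three short texts.
donor_data = [0.9, 0.1, 0.0]      # "data sharing with donors"
donor_info = [0.8, 0.2, 0.1]      # "information requested by donors"
weather = [0.0, 0.1, 0.9]         # "weather forecast for Paris"

print(cosine_similarity(donor_data, donor_info))  # high: related topics
print(cosine_similarity(donor_data, weather))     # low: unrelated topics
```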

Output:

vector_db = FAISS.from_documents(
   documents = splitted_docs,
   embedding = embed_model)

Explanation:

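Creates a FAISS vector database from the document chunks. FAISS.from_documents embeds the chunks itself using embed_model, so the embeddings list computed above serves only to inspect the embedding shape.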
FAISS allows for efficient similarity search over the embeddings.
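
Conceptually, a similarity search returns the stored vectors that lie closest to the query vector. The brute-force sketch below shows that idea on three toy two-dimensional vectors. It assumes Euclidean (L2) distance, a common choice for flat FAISS indexes; the `top_k` helper and the `index` dictionary are hypothetical, and FAISS itself uses optimized index structures rather than a Python loop.

```python
import math

def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[str]:
    """Return the k keys whose vectors are closest to the query."""
    return sorted(index, key=lambda name: l2_distance(query, index[name]))[:k]

# Toy index mapping chunk names to made-up vectors.
index = {
    "chunk_a": [0.0, 0.0],
    "chunk_b": [1.0, 1.0],
    "chunk_c": [5.0, 5.0],
}

print(top_k([0.9, 0.8], index, k=2))
```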

similar_docs = vector_db.similarity_search("what is the objective for data sharing with donors?", k=5)
print(similar_docs[0].page_content)

Explanation:

Performs a similarity search to find the top 5 document chunks most relevant to the query.
Prints the content of the most relevant chunk.

Output:

in capacity to fulfil donor requirements might also deter smaller and/or
 local NGOs from seeking funding, undermining localization efforts.13

OBJECTIVES FOR DATA SHARING WITH DONORS

The most commonly identified objectives for donors requesting sensitive data
 from partners are situational awareness and programme design; accountability
 and transparency; and legal, regulatory, and policy requirements. 

Situational awareness and programme design

Donors seek information and data from humanitarian organizations in order to
 understand and react to changes in humanitarian contexts. This allows donors
 to improve their own programme design and evaluation, prevent duplication of
 assistance, identify information gaps, and ensure appropriate targeting of
 assistance.

Accountability and transparency

Donors and humanitarian organizations have an obligation to account for their
 activities. Data can enable donors to explain and defend funding on foreign 
aid to taxpayers.

SmolAgents

! pip -q install smolagents
! pip -q install litellm

Explanation:

Installs additional libraries:
- smolagents: A library for building agents with tools.
- litellm: A library for interacting with language models.

Defining a Retriever Tool

from smolagents import Tool
class RetrieverTool(Tool):
   name = "retriever"
   description = "Uses semantic search to retrieve the parts of the documentation that could be most relevant to answer your query."
   inputs = {
       "query": {
           "type": "string",
           "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
       }
   }
   output_type = "string"
   def __init__(self, vector_db, **kwargs):  # Add vector_db as an argument
       super().__init__(**kwargs)
       self.vector_db = vector_db  # Store the vector database
   def forward(self, query: str) -> str:
       assert isinstance(query, str), "Your search query must be a string"
       docs = self.vector_db.similarity_search(query, k=4)  # Perform search here
       return "\nRetrieved documents:\n" + "".join(
           [
               f"\n\n===== Document {str(i)} =====\n" + doc.page_content
               for i, doc in enumerate(docs)
           ]
       )
retriever_tool = RetrieverTool(vector_db=vector_db)  # Pass vector_db during instantiation

Explanation:

Defines a custom RetrieverTool that uses the FAISS vector database to perform the semantic search.
The tool takes a query as input and returns the most relevant document chunks.
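
To see what the agent actually receives from the tool, the sketch below reproduces the string formatting from forward on two stand-in documents. `FakeDoc` and `format_retrieved` are hypothetical helpers used only so that the example runs without a vector database.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    """Stand-in for a LangChain Document; only page_content is needed here."""
    page_content: str

def format_retrieved(docs: list[FakeDoc]) -> str:
    """Mirror the string layout built in RetrieverTool.forward."""
    return "\nRetrieved documents:\n" + "".join(
        f"\n\n===== Document {i} =====\n" + doc.page_content
        for i, doc in enumerate(docs)
    )

result = format_retrieved([FakeDoc("first chunk"), FakeDoc("second chunk")])
print(result)
```

Numbering each chunk under its own header gives the language model clear boundaries between the retrieved passages.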

Setting Up the Agent

from smolagents import CodeAgent, DuckDuckGoSearchTool, LiteLLMModel
# Reuse the OpenAI key loaded earlier instead of hard-coding it
model = LiteLLMModel(model_id="gpt-4o", api_key=openai_api_key)
search_tool = DuckDuckGoSearchTool()
# The agent can call both the PDF retriever and the web search tool
agent = CodeAgent(
   tools=[retriever_tool, search_tool], model=model, max_steps=6, verbose=True
)

Explanation:

Initializes a language model (LiteLLMModel) and a web search tool (DuckDuckGoSearchTool).
Creates a CodeAgent that can use the retriever tool and search tool to answer queries.
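
The model needs an API key, and outside Google Colab the userdata helper used earlier is not available. A minimal sketch of an alternative, assuming the key is stored in an environment variable, is shown below; `load_api_key` is our own helper, not part of SmolAgents or LiteLLM.

```python
import os

def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment and fail early if it is missing."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before creating the model")
    return key
```

The returned value can then be passed as the api_key argument when creating the model.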

agent.run("Tell me about Analytics Vidhya")

Output:

Analytics Vidhya is a leading platform for professionals in Artificial
 Intelligence, Data Science, and Data Engineering. It offers various
 educational resources including courses, blogs, guides, and hackathons to
 facilitate learning and networking. The platform is well-known for providing
 mentorship, industry-relevant content, and tools for both beginners and
 experienced practitioners in the AI domain.

agent_output = agent.run("what are the constraints for data sharing with donors?")
print("Final output:")
print(agent_output)

Final Output:

Constraints for data sharing with donors include:
1. Lack of regulatory framework for responsibly managing sensitive non-
personal data.
2. Capacity gaps within organizations.
3. Purpose limitation, where data should be collected only for specified,
 explicit, and legitimate purposes and not processed further in a way incompatible with those purposes.
4. Need for formalization and standardization of data requests from donors.
5. Requirement for common principles and guidelines for donor data
 management.

agent_output1 = agent.run("WHAT ARE THE OBJECTIVES FOR DATA SHARING WITH DONORS?")

Final Output:

The objectives for data sharing with donors include: 1. Situational awareness
 and programme design - to improve understanding of contexts, enhance
 programme design, prevent duplication, and ensure appropriate targeting of
 assistance. 2. Accountability and transparency - to account for activities
 and explain foreign aid funding to taxpayers. 3. Legal, regulatory, and 
policy requirements - to ensure compliance with national and international
 laws, including counter-terrorism, migration, and other legal standards.

To recap, the code above:

Loads and processes a PDF document.
Splits the document into smaller chunks.
Generates embeddings for the chunks.
Creates a vector database for semantic search.
Defines a custom tool for retrieving relevant document chunks.
Sets up an agent (using SmolAgents) to answer queries using the retriever tool and a language model.

For another way to build Agentic RAG, see: A Comprehensive Guide to Building Agentic RAG Systems with LangGraph

What are the Advantages of Using SmolAgents for Agentic RAG?

The main advantages are:

Simplicity: SmolAgents allows the creation of powerful agents in minimal lines of code, streamlining the development process.
Flexibility: Supports integration with various large language models and tools, enabling customization for specific tasks.
Security: Facilitates execution within sandboxed environments, ensuring safe and controlled operations.

By leveraging SmolAgents, you can easily build Agentic RAG systems that are capable of complex reasoning and dynamic interaction with external data sources, enhancing the overall performance and applicability of AI solutions.

Also read: Top 4 Agentic AI Design Patterns for Architecting AI Systems

Conclusion

The combination of SmolAgents and Agentic RAG represents a significant advancement in building intelligent, autonomous systems. SmolAgents’ minimalist yet powerful framework, paired with the dynamic retrieval and reasoning capabilities of Agentic RAG, enables the creation of AI agents that can handle complex tasks efficiently. This synergy enhances adaptability, security, and scalability, making it ideal for applications in research, decision-making, and automation. Together, they pave the way for next-generation AI systems that are both intelligent and autonomous.

Explore the Agentic AI Pioneer Program to deepen your understanding of Agent AI and unlock its full potential. Join us on this journey to discover innovative insights and applications!

Pankaj Singh

Hi, I am Pankaj Singh Negi - Senior Content Editor | Passionate about storytelling and crafting compelling narratives that transform ideas into impactful content. I love reading about technology revolutionizing our lifestyle.




