The use of LLMs has exploded across domains. They are no longer limited to chatbots hosted on the web; they are being integrated into enterprises, government agencies, and beyond. A key innovation in this landscape is building custom tools for AI agents with smolagents, which lets these systems extend their capabilities: agents can leverage tools, take actions in defined environments, and even call other agents.
This workflow enables LLM-powered AI systems to operate with greater autonomy, making them more reliable at completing tasks end to end.
This article is meant for intermediate-level developers and data professionals who are well versed in using basic LLMs. Working knowledge of Python and of calling LLM models is the bare minimum expected of you; beyond that, any background with web APIs and the Hugging Face ecosystem will help you benefit fully from this tutorial.
You are probably familiar with ChatGPT. You can ask it questions, and it answers them. It can also write code for you, tell you a joke, and so on.
Because it can code and answer your questions, you might want to use it to complete tasks for you, too: you demand something from it, and it carries out a full task on your behalf.
If this sounds vague right now, don't worry; let me give you an example. You know LLMs can search the web, and they can reason using information as input. So, you can combine these capabilities and ask an LLM to create a full travel itinerary for you. Right?
Yes. You will ask something like, “Hey AI, I am planning a vacation from 1st April to 7th April. I would like to visit the state of Himachal Pradesh. I really like snow, skiing, rope-ways, and lush green landscape. Can you plan an itinerary for me? Also find the lowest flight costs for me from the Kolkata airport.”
Taking in this information, an agent should be able to find and compare flight costs for those dates (including the return journey), work out which places you should visit given your criteria, and list hotels and costs for each place.
Here, the AI model uses your criteria to interact with the real world, searching for flights, hotels, buses, and so on, and also suggests places for you to visit.
This is what we call the agentic approach in AI. Let's learn more about it.
An agent is based on an LLM, and an LLM can interact with the external world using only text. Text in, text out.
So, when we ask an agent to do something, it takes that input as text, reasons using text/language, and can only output text.
It is in the middle part, or the last part, where the use of tools comes in. The tools return some desired values, and using those values, the agent returns its response in text. It can also do something very different, like making a transaction on the stock market or generating an image.
The workflow of an AI agent should be understood like this:
Understand –> Reason –> Interact
This is one step of an agentic workflow, and when multiple steps are involved, like in most use cases, it should be seen as:
Thought –> Action –> Observation
Given a command, the agent thinks about the task at hand and analyzes what needs to be done (Thought); it then acts towards the completion of the task (Action); finally, it observes whether any further actions need to be performed, or how complete the whole task is (Observation).
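To make this loop concrete, here is a toy sketch in Python. Everything in it (the stub functions, the completion check) is hypothetical and only illustrates the control flow; frameworks like smolagents implement this loop for you.

def llm_think(context: str) -> tuple[str, str]:
    # Stand-in for the LLM deciding the next step (Thought).
    return "I should fetch the current time.", "get_time"

def execute(action: str) -> str:
    # Stand-in for running a tool (Action).
    return "2025-04-01 10:00:00" if action == "get_time" else "unknown action"

def run_agent(task: str, max_steps: int = 6) -> str:
    context = task
    for _ in range(max_steps):
        thought, action = llm_think(context)       # Thought
        observation = execute(action)              # Action
        context += f"\n{thought}\n{observation}"   # Observation, fed back to the model
        if "unknown" not in observation:           # crude "task is done" check
            return observation
    return "No final answer within max_steps."

print(run_agent("What time is it?"))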
In this tutorial, we will code up a chat agent that greets the user according to the user's time zone. When a user says, "I am in Kolkata, greet me!", the agent will think about the request and parse it carefully. Then it will fetch the current time for that timezone; this is the action. It will then observe whether any further task remains, such as the user having requested an image. If not, it will go ahead and greet the user; otherwise, it will take a further action, invoking the image generation model.
So far, we have been talking in conceptual terms and workflows. Now let's dive into the concrete components of an AI agent.
You can say that an AI agent has two parts:
The brain of the agent is a traditional LLM such as Llama 3, Phi-4, or GPT-4. Using this, the agent thinks and reasons.
The tools are externally coded functions that the agent can invoke. A tool can call an API for a stock price or the current temperature of a place; it can even be another agent, or something as simple as a calculator.
Using the `smolagents` framework, you can turn any Python function into a tool and pair it with any AI model that has been tuned for function calling.
In our example, we will have tools to tell the user a fun fact about dogs, fetch the current time in a timezone, and generate an image. The model will be a Qwen LLM; more on the model later.
LLMs are now not merely used as text-completion tools or for answering questions in Q&A formats. They are used as small but crucial cogs in much larger systems, where many elements of those systems are not based on generative AI.
Below is an abstract concept image:
In this abstract system graph, we see that GenAI components often have to take important inputs from non-Generative AI traditional system components.
We need tools to interact with these components, rather than relying only on the answers present in an LLM's knowledge base.
As we have seen, LLMs serve as the "brain" of the agent, so the agent inherits all the faults of LLMs as well. Some of them are: hallucinations (confidently stated but incorrect answers), a fixed knowledge cutoff (no awareness of anything after training), and unreliable arithmetic and data lookups.
The above are only some of the reasons to use deterministic tools.
`smolagents` is a library that serves as a framework for using agents in your LLM application. It is developed by Hugging Face, and it is open source.
There are other frameworks such as LlamaIndex, LangGraph, etc. that you can use for the same purpose. But, for this tutorial, we will focus on smolagents alone.
Some libraries create agents that output JSON, and others create agents that output Python code directly. Research has shown the code-first approach to be much more practical and efficient, and smolagents is a library that creates agents that output Python code directly.
All the code is available in the GitHub repository for the project. I will not go through all of it, but I will highlight the most important pieces of the codebase.
The prompts.yaml file contains many example tasks and response formats that we expect the model to see, and it uses Jinja templating. Its contents get added to the prompt that we ultimately send to the model. We will later see that the prompts are passed to the `CodeAgent` class.
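The exact contents of that file come from the project template, but to give you a feel for it, a prompts.yaml entry has roughly this shape (an illustrative sketch, not the template's actual text; note the Jinja placeholders that get filled in with the tool list):

system_prompt: |-
  You are an expert assistant who solves tasks by writing code.
  You have access to the following tools:
  {%- for tool in tools.values() %}
  - {{ tool.name }}: {{ tool.description }}
  {%- endfor %}
  Always end with a call to the final_answer tool.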
Tool-calling agents can work in two ways: they can either return a JSON blob describing the call, or they can directly write code.
In practice, it turns out that a tool-calling agent that writes code directly works much better. It also saves you the overhead of a system that has to parse the JSON in the middle.
The `smolagents` library falls in the second category of LLM agents, i.e., it writes code directly.
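To make the contrast concrete, here is a rough illustration of the two styles; the tool name `get_time` and the JSON shape are made up for the example:

import json

# Style 1: the model emits a JSON blob; your system must parse it
# and dispatch the call to the right function yourself.
model_output = '{"tool": "get_time", "arguments": {"timezone": "Asia/Kolkata"}}'
call = json.loads(model_output)
print(call["tool"], call["arguments"])  # dispatch logic still needed

# Style 2 (smolagents): the model writes Python such as
#     result = get_time(timezone="Asia/Kolkata")
# which is executed directly, with no parsing layer in between.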
The app.py file
This is the file where we create the agent object, and this is where we define our own tools.
These are the imports:
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel, load_tool, tool
import datetime
import requests
import pytz
import yaml
from tools.final_answer import FinalAnswerTool
We are importing the `CodeAgent` class from the `smolagents` library, along with the `load_tool` helper and the `tool` decorator. We will use these in time.
We want to call an API that serves cool facts about dogs. It is hosted at https://dogapi.dog. You can visit the website and read the docs on using the API. It is completely free.
To make a Python function usable by the AI agent, you have to: decorate it with `@tool`, add type hints for its inputs and output, and write a docstring describing what the tool does and what its arguments are.
@tool
def get_amazing_dog_fact() -> str:
    """A tool that tells you an amazing fact about dogs using a public API.
    Args: None
    """
    # URL for the public API
    url = "https://dogapi.dog/api/v2/facts?limit=1"
    try:
        response = requests.get(url)
        if response.status_code == 200:  # expected, OK status code
            # parse the JSON body and extract the fact text
            cool_dog_fact = response.json()['data'][0]['attributes']['body']
            return cool_dog_fact
        else:
            # in case of an unfavorable status code
            return "A dog fact could not be fetched."
    except requests.exceptions.RequestException:
        # in case the request itself failed (network error, timeout, etc.)
        return "A dog fact could not be fetched."
Note that we are returning a properly parsed string from the tool, not the raw JSON response.
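Because the `@tool` decorator wraps the function into a callable tool object, you can sanity-check it on its own before handing it to the agent (this needs internet access, and the returned fact will vary per call):

# Quick manual test of the tool, outside the agent
print(get_amazing_dog_fact())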
Below is a tool to get the current time in a timezone of your choice:
@tool
def get_current_time_in_timezone(timezone: str) -> str:
    """A tool that fetches the current local time in a specified timezone.
    Args:
        timezone: A string representing a valid timezone (e.g., 'America/New_York').
    """
    try:
        # Create timezone object
        tz = pytz.timezone(timezone)
        # Get current time in that timezone
        local_time = datetime.datetime.now(tz).strftime("%Y-%m-%d %H:%M:%S")
        return f"The current local time in {timezone} is: {local_time}"
    except Exception as e:
        return f"Error fetching time for timezone '{timezone}': {str(e)}"
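This tool can also be tested directly. Note that `pytz` expects IANA timezone names such as 'Asia/Kolkata' or 'America/New_York'; anything else falls into the error branch:

print(get_current_time_in_timezone("Asia/Kolkata"))
# e.g. "The current local time in Asia/Kolkata is: 2025-04-01 10:00:00"
print(get_current_time_in_timezone("Not/AZone"))
# -> "Error fetching time for timezone 'Not/AZone': ..."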
You can also use tools that are themselves other AI models, like this:
image_generation_tool = load_tool("agents-course/text-to-image", trust_remote_code=True)
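`load_tool` downloads a tool implementation from the Hugging Face Hub, in this case a text-to-image tool from the agents course (hence `trust_remote_code=True`). Once loaded, it can be called like any other tool; treat the call below as an illustration, since the exact argument and return type are defined by the tool itself:

# Illustrative call: the loaded tool turns a text prompt into an image.
image = image_generation_tool("A snowy village in Himachal Pradesh")
image.save("generated.png")  # assuming a PIL-style image object is returned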
Now, these are the tools at the agent's disposal. What about the model? We are going to use the Qwen2.5-Coder-32B-Instruct model. You have to apply for access to be able to use this model, and they are pretty open about granting it.
This is how you create the model object:
model = HfApiModel(
    max_tokens=2096,
    temperature=0.5,
    model_id='Qwen/Qwen2.5-Coder-32B-Instruct',  # this model may sometimes be overloaded
    custom_role_conversions=None,
)
We now have to add the prompts that we talked about earlier:
with open("prompts.yaml", 'r') as stream:
    prompt_templates = yaml.safe_load(stream)
Now, our final task is to create the agent object. Note that we first instantiate the `final_answer` tool from the `FinalAnswerTool` class we imported earlier:

final_answer = FinalAnswerTool()

agent = CodeAgent(
    model=model,
    tools=[final_answer, get_current_time_in_timezone, get_amazing_dog_fact,
           image_generation_tool],  # add your tools here (don't remove final_answer)
    max_steps=6,
    verbosity_level=1,
    grammar=None,
    planning_interval=None,
    name=None,
    description=None,
    prompt_templates=prompt_templates
)
Note the very important `tools` argument. Here we add all the tools we created or defined to a list. This is how the agent knows which tools are at its disposal.
The other arguments are hyperparameters that we will not discuss or change in this tutorial. You can refer to the documentation for more information.
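With the agent assembled, you can try it out. The project template wires this agent into a Gradio chat interface, but you can also run a single task programmatically; a minimal sketch:

# Run one task end to end and print the agent's final answer
# (the exact output depends on the model and tools).
result = agent.run("I am in Kolkata, greet me!")
print(result)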
For the full code, go ahead and visit the repository; the code above comes from its app.py file.
I have explained all the core concepts and all the necessary code. HuggingFace provided the template of the project here.
You can go ahead right now, and use the chat interface where you can use the tools that I have mentioned.
Here is my HuggingFace space, called greetings_gen. You should clone the project, set a suitable name, and change the visibility to public if you want to make the agent available to friends and the public.
Then make changes to the `app.py` file: add your new tools, remove mine, whatever you wish.
Here are some examples where you can see the inputs and outputs of the agent:
Agents can reliably perform tasks using multiple tools, which gives them more autonomy and enables them to complete more complex tasks with deterministic inputs and outputs, while making things easier for the user.
You learned the basics of agentic AI and of the smolagents library, and you also learned to create tools of your own that an AI agent can use, along with hosting a chat model in HuggingFace Spaces where you can interact with an agent that uses the tools you created!
Feel free to follow me on the Fediverse, X/Twitter, and LinkedIn. And be sure to visit my website.
Q1. What is an AI agent?
A. An AI agent is an LLM-powered system that can interact with custom tools to perform specific tasks beyond text generation.
Q2. Why do AI agents need custom tools?
A. Custom tools help AI agents fetch real-time data, execute commands, and perform actions they can't handle on their own.
Q3. What is the smolagents library?
A. smolagents is a lightweight framework by Hugging Face that helps developers create AI agents capable of using custom tools. It simplifies AI agent creation by providing an easy-to-use framework.
Q4. How do you create custom tools for an AI agent?
A. You can define functions as custom tools and integrate them into your AI agent to extend its capabilities.
Q5. Where can you deploy AI agents?
A. You can deploy AI agents on platforms like Hugging Face Spaces for easy access and interaction.