Imagine having a personal assistant that not only understands your requests but also knows how to execute them, whether that means performing a quick calculation or fetching the latest stock market news. In this article, we explore AI agents and show how you can build your own using the LlamaIndex framework. We’ll guide you step by step through creating these agents, highlight the function-calling capabilities of LLMs, and demonstrate how agents decide which task to carry out. Whether you’re new to AI or an experienced developer, this guide shows how to get a working AI agent running in just a few lines of code.
This article was published as a part of the Data Science Blogathon.
AI agents are like digital assistants on steroids. They don’t just respond to your commands—they understand, analyze, and make decisions on the best way to execute those commands. Whether it’s answering questions, performing calculations, or fetching the latest news, AI agents are designed to handle complex tasks with minimal human intervention. These agents can process natural language queries, identify the key details, and use their abilities to provide the most accurate and helpful responses.
The rise of AI agents is transforming how we interact with technology. They can automate repetitive tasks, enhance decision-making, and provide personalized experiences, making them invaluable in various industries. Whether you’re in finance, healthcare, or e-commerce, AI agents can streamline operations, improve customer service, and provide deep insights by handling tasks that would otherwise require significant manual effort.
LlamaIndex is a cutting-edge framework designed to simplify the process of building AI agents using Large Language Models (LLMs). It leverages the power of LLMs like OpenAI’s models, enabling developers to create intelligent agents with minimal coding. With LlamaIndex, you can plug in custom Python functions, and the framework will automatically integrate these with the LLM, allowing your AI agent to perform a wide range of tasks.
Let us now look at the steps for implementing AI agents with LlamaIndex.
Here we use OpenAI’s GPT-4o as the LLM and Bing Search to query the web. LlamaIndex already provides a Bing Search tool integration, which can be installed with the following command.
!pip install llama-index-tools-bing-search
First, you need a Bing Search API key, which you can obtain by creating a Bing resource from the link below. For experimentation, Bing also provides a free tier limited to 3 calls per second and 1,000 calls per month.
Install the necessary Python libraries using the following commands:
%%capture
!pip install llama_index llama-index-core llama-index-llms-openai
!pip install llama-index-tools-bing-search
Next, set your API keys as environment variables so that LlamaIndex can access them during execution.
import os
os.environ["OPENAI_API_KEY"] = "sk-proj-<openai_api_key>"
os.environ['BING_API_KEY'] = "<bing_api_key>"
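Before moving on, it can help to confirm that both keys are actually set, since a missing key usually surfaces later as a less obvious authentication error. The following is a minimal sketch using only the standard library; the helper name `missing_keys` is our own and is not part of LlamaIndex.

```python
import os

# Environment variable names taken from the setup step above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "BING_API_KEY")


def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


absent = missing_keys()
if absent:
    print("Missing API keys:", ", ".join(absent))
else:
    print("All API keys are set.")
```

Running this in the notebook right after setting the variables gives a quick sanity check before any API call is made.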
Initialize the LLM (in this case, GPT-4o from OpenAI) and run a simple test to confirm it’s working.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o")
llm.complete("1+1=")
Create two functions that your AI agent will use. The first function performs a simple addition, while the second retrieves the latest stock market news using Bing Search.
from llama_index.tools.bing_search import BingSearchToolSpec

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response
For a more precise function definition, we can also use pydantic models. For the sake of simplicity, though, here we rely on the LLM’s ability to extract arguments from the user query.
from llama_index.core.tools import FunctionTool
add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)
A function tool allows users to easily convert any user-defined function into a tool object.
Here, the function name becomes the tool name and the docstring is treated as the description, but both can be overridden as shown below.
tool = FunctionTool.from_defaults(addition_tool, name="...", description="...")
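To make the naming rule concrete, here is a rough standard-library sketch of how a tool's metadata could be derived from a plain Python function. It only illustrates the idea described above; `describe_tool` is a hypothetical helper of our own and is not how LlamaIndex implements `FunctionTool` internally.

```python
import inspect


def describe_tool(fn, name=None, description=None):
    """Build a small metadata dict for `fn` (hypothetical helper).

    Falls back to the function name and docstring unless overrides are given.
    """
    signature = inspect.signature(fn)
    return {
        "name": name or fn.__name__,
        "description": description or inspect.getdoc(fn) or "",
        # Map each parameter to the name of its type annotation.
        "parameters": {
            param_name: getattr(param.annotation, "__name__", str(param.annotation))
            for param_name, param in signature.parameters.items()
        },
    }


def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b


# Defaults come from the function itself.
default_meta = describe_tool(addition_tool)
# Explicit values override the function name and docstring.
custom_meta = describe_tool(addition_tool, name="add", description="Add two integers")

print(default_meta)
print(custom_meta)
```

The practical takeaway is that a clear function name, docstring, and type hints are what the LLM sees when it decides whether a tool fits a query, so they are worth writing carefully.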
query = "what is the current market price of apple"
response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
Here we call the LLM’s predict_and_call method with the user’s query and the tools defined above. The tools argument accepts a list, so more than one function can be passed. The method reads the user’s query and decides which tool in the list is best suited to the task.
=== Calling Function ===
Calling function: web_search_tool with args: {"query": "current market price of Apple stock"}
=== Function Output ===
[['Warren Buffett Just Sold a Huge Chunk of Apple Stock. Should You Do the Same?', ..........
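The output above shows the model choosing web_search_tool and supplying a query argument. The step that follows, running the chosen function with those arguments, can be sketched in plain Python. This is a simplified illustration under our own assumptions: the tool calls are hand-written dictionaries rather than real model output, and `fake_search_tool` is a stand-in so the example runs without network access or API keys.

```python
def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b


def fake_search_tool(query: str) -> str:
    """Stand-in for web_search_tool so the sketch runs offline."""
    return f"(search results for: {query})"


# Registry keyed by tool name, mirroring the list passed as tools=[...].
TOOLS = {
    "addition_tool": addition_tool,
    "web_search_tool": fake_search_tool,
}


def dispatch(tool_call: dict):
    """Run the tool named in `tool_call` with the arguments it carries."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**tool_call.get("args", {}))


# Hand-written stand-ins for what the model would emit; no LLM is called here.
search_call = {
    "name": "web_search_tool",
    "args": {"query": "current market price of Apple stock"},
}
add_call = {"name": "addition_tool", "args": {"a": 2, "b": 3}}

print(dispatch(search_call))
print(dispatch(add_call))
```

In the real flow, the LLM produces the tool name and arguments and the framework performs the dispatch, so none of this needs to be written by hand.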
import os

from llama_index.llms.openai import OpenAI
from llama_index.tools.bing_search import BingSearchToolSpec
from llama_index.core.tools import FunctionTool

llm = OpenAI(model="gpt-4o")

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response

# Wrap the plain functions as tools the LLM can choose from
add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)

query = "what is the current market price of apple"
response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
For those looking to push the boundaries of what AI agents can do, advanced customization offers the tools and techniques to refine and expand their capabilities, allowing your agent to handle more complex tasks and deliver even more precise results.
To improve how the AI agent interprets and uses functions, you can incorporate pydantic models. This adds type checking and validation, ensuring that your agent processes inputs correctly.
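As a minimal sketch of this idea, the example below defines a pydantic schema for the addition tool's arguments and validates raw input before calling the function. The names `AdditionArgs` and `safe_add` are our own. As far as we know, `FunctionTool.from_defaults` accepts such a model through its `fn_schema` argument, but that part is not shown here and should be checked against the LlamaIndex documentation for your installed version.

```python
from pydantic import BaseModel, Field, ValidationError


class AdditionArgs(BaseModel):
    """Arguments for addition_tool (hypothetical schema name)."""

    a: int = Field(..., description="First integer to add")
    b: int = Field(..., description="Second integer to add")


def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b


def safe_add(raw_args: dict):
    """Validate raw arguments, then call the tool; return None if invalid."""
    try:
        args = AdditionArgs(**raw_args)
    except ValidationError:
        # Missing fields or values that cannot be read as integers end up here.
        return None
    return addition_tool(args.a, args.b)


print(safe_add({"a": 2, "b": 3}))
print(safe_add({"a": "two", "b": 3}))
```

The field descriptions serve a second purpose beyond validation: they give the LLM extra context about what each argument means.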
For more complex user queries, consider creating additional tools or refining existing ones to handle multiple tasks or more intricate requests. This might involve adding error handling, logging, or even custom logic to manage how the agent responds to different scenarios.
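One way to add error handling and logging without touching each tool's body is a small decorator. The sketch below uses only the standard library; `with_error_handling` and `divide_tool` are hypothetical names chosen for illustration. It uses `functools.wraps` so the wrapped function keeps its name and docstring, which matter because they serve as the tool name and description.

```python
import functools
import logging

logger = logging.getLogger("agent_tools")


def with_error_handling(fn):
    """Wrap a tool so failures are logged and returned as a short message."""

    @functools.wraps(fn)  # preserves __name__ and __doc__ of the tool
    def wrapper(*args, **kwargs):
        logger.info("Calling %s with args=%s kwargs=%s", fn.__name__, args, kwargs)
        try:
            return fn(*args, **kwargs)
        except Exception as exc:  # broad on purpose: report instead of crashing
            logger.exception("Tool %s failed", fn.__name__)
            return f"Tool {fn.__name__} failed: {exc}"

    return wrapper


@with_error_handling
def divide_tool(a: float, b: float) -> float:
    """Returns a divided by b (hypothetical tool for illustration)"""
    return a / b


print(divide_tool(6, 3))
print(divide_tool(1, 0))
```

Returning the error as text, rather than raising, is a design choice: it lets the agent see what went wrong and respond to the user instead of aborting the whole request.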
AI agents can process user inputs, reason about the best approach, access relevant knowledge, and execute actions to provide accurate and helpful responses. They can extract the parameters specified in the user’s query and pass them to the relevant function to carry out the task. With LLM frameworks such as LlamaIndex and LangChain, you can implement agents in a few lines of code and customize details such as function definitions using pydantic models.
A. An AI agent is a digital assistant that processes user queries, determines the best approach, and executes tasks to provide accurate responses.
A. LlamaIndex is a popular framework that allows easy implementation of AI agents using LLMs, like OpenAI’s models.
A. Function calling enables the AI agent to select the most appropriate function based on the user’s query, making the process more efficient.
A. You can integrate web search by using tools like BingSearchToolSpec, which retrieves real-time data based on queries.
A. Yes, AI agents can evaluate multiple functions and choose the best one to execute based on the user’s request.