This article introduces the ReAct pattern for improved capabilities and demonstrates how to create AI agents from scratch. It covers testing, debugging, and optimizing AI agents in addition to tools, libraries, environment setup, and implementation. This tutorial gives users the skills they need to create effective AI agents, regardless of whether they are developers or enthusiasts.
This article was published as a part of the Data Science Blogathon.
AI agents are autonomous systems that perceive their environment (through sensors or data inputs), process information, and act to accomplish predefined goals. They range from basic bots to sophisticated systems that adapt and learn over time. Typical examples include recommendation engines such as those used by Netflix and Amazon, voice assistants like Siri and Alexa, and self-driving cars from Tesla and Waymo.
These agents are also essential in a number of sectors. UiPath and Blue Prism are examples of robotic process automation (RPA) tools that automate repetitive processes. DeepMind and IBM Watson Health are examples of healthcare systems that help diagnose diseases and recommend treatments. In their domains, AI agents greatly improve productivity, precision, and personalization.
These agents play a critical role in improving our daily lives and accomplishing particular objectives.
AI agents are significant because they can:
In essence, AI agents are pivotal in driving the next wave of technological advancements, making systems smarter and more responsive to user needs.
AI agents have a wide range of applications across various industries. Here are some notable use cases:
The ReAct pattern operates in a loop of Thought, Action, Pause, Observation, Answer.
This loop allows the AI agent to reason about the input, act on it by leveraging external resources, and then integrate the results back into its reasoning process. By doing so, the AI agent can provide more accurate and contextually relevant responses, significantly expanding its utility.
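To make the loop concrete, here is a minimal offline sketch. The `scripted_model` function and the `lookup` action are hypothetical stand-ins, not part of this article's agent. They replace the LLM and the external tools so the control flow can be followed without an API key.

```python
import re

# Matches lines such as "Action: lookup: France" (hypothetical action name).
ACTION_RE = re.compile(r"^Action: (\w+): (.*)$")


def lookup(term):
    """Hypothetical tool: a tiny in-memory knowledge base."""
    facts = {"France": "France is a country. The capital is Paris."}
    return facts.get(term, "No information found.")


def scripted_model(prompt):
    """Stand-in for an LLM: requests a tool first, then answers."""
    if prompt.startswith("Observation:"):
        return "Answer: The capital of France is Paris"
    return "Thought: I should look up France\nAction: lookup: France\nPAUSE"


def react_loop(question, model, actions, max_turns=5):
    """Run Thought -> Action -> PAUSE -> Observation until an Answer appears."""
    next_prompt = question
    for _ in range(max_turns):
        reply = model(next_prompt)
        matches = [m for m in map(ACTION_RE.match, reply.splitlines()) if m]
        if not matches:
            return reply  # no action requested, so this is the final answer
        name, argument = matches[0].groups()
        observation = actions[name](argument)
        next_prompt = f"Observation: {observation}"
    return None  # no answer within max_turns


print(react_loop("What is the capital of France?", scripted_model, {"lookup": lookup}))
```

Swapping `scripted_model` for a real model call and `lookup` for real tools gives the same structure as the agent built later in this article.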
The ReAct pattern is a potent design pattern that combines reasoning and action-taking skills to improve the capabilities of AI agents. LLMs such as GPT-3 or GPT-4 benefit greatly from this technique because it allows them to interface with other tools and APIs to carry out activities beyond their original programming.
The ReAct pattern operates in a cyclic loop consisting of the following steps:
The ReAct pattern is important for several reasons:
Python is a versatile and powerful programming language that is widely used in AI and machine learning due to its simplicity and extensive library support. For building AI agents, several Python libraries are essential:
The OpenAI API is a robust platform that provides access to advanced language models developed by OpenAI. These models can understand and generate human-like text, making them ideal for building AI agents. With the OpenAI API, you can:
The httpx library is an HTTP client for Python that supports both synchronous and asynchronous requests. It is designed to be easy to use while providing powerful features for making web requests. With httpx, you can:
Together, the OpenAI API and httpx library provide the foundational tools needed to build and enhance AI agents, enabling them to interact with external resources and perform a wide range of actions.
Let us now set up the environment by following certain steps:
To get started with building your AI agent, you need to install the necessary libraries. Here are the steps to set up your environment:
python -m venv ai_agent_env
source ai_agent_env/bin/activate # On Windows, use `ai_agent_env\Scripts\activate`
pip install "openai<1.0" httpx # The code in this article uses the pre-1.0 openai.ChatCompletion interface
To use the OpenAI API, you need an API key. Follow these steps to set up your API key:
export OPENAI_API_KEY='your_openai_api_key_here'
import os
import openai

# Read the key from the environment variable set above
openai.api_key = os.getenv('OPENAI_API_KEY')
With the environment set up, you are now ready to start building your AI agent.
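Before moving on, it can help to fail fast when the key is missing, since `os.getenv` silently returns `None` in that case. The helper below is a small sketch. The name `load_api_key` and its parameters are assumptions made for illustration, not part of the OpenAI library.

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent")
    return key


# Example with an explicit mapping, so no real key is read or printed.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the agent itself, the result of `load_api_key()` could be assigned to `openai.api_key` in place of the bare `os.getenv` call.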
Let us now build the AI agent.
To build the AI agent, we will create a class that handles interactions with the OpenAI API and manages the reasoning and actions. Here’s a basic structure to get started:
import openai
import re
import httpx
class ChatBot:
    def __init__(self, system=""):
        self.system = system
        self.messages = []
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=self.messages)
        return completion.choices[0].message.content
This class initializes the AI agent with an optional system message and handles user interactions. The __call__ method takes user messages and generates responses using the OpenAI API.
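The way the message history grows with each call is easier to see offline. The sketch below mirrors the shape of the class described above but replaces the API call with a pluggable function. `OfflineChatBot` and its `complete` parameter are hypothetical names introduced only for this illustration.

```python
class OfflineChatBot:
    """Same shape as the ChatBot class, with a pluggable completion function."""

    def __init__(self, system="", complete=None):
        self.messages = []
        if system:
            self.messages.append({"role": "system", "content": system})
        # Hypothetical default: echo the last user message in upper case.
        self.complete = complete or (lambda messages: messages[-1]["content"].upper())

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": result})
        return result


bot = OfflineChatBot(system="You are a helpful assistant.")
bot("hello")
bot("how are you?")
for entry in bot.messages:
    print(entry["role"], "->", entry["content"])
```

Every call appends one user message and one assistant message, so the full conversation is sent to the model each time. This is what lets the agent remember earlier observations.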
To implement the ReAct pattern, we need to define the loop of Thought, Action, Pause, Observation, and Answer. Here’s how we can incorporate this into our AI agent:
prompt = """
You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer.
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.
Your available actions are:
calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point
syntax if necessary
wikipedia:
e.g. wikipedia: Django
Returns a summary from searching Wikipedia
simon_blog_search:
e.g. simon_blog_search: Django
Search Simon's blog for that term
Example session:
Question: What is the capital of France?
Thought: I should look up France on Wikipedia
Action: wikipedia: France
PAUSE
You will be called again with this:
Observation: France is a country. The capital is Paris.
You then output:
Answer: The capital of France is Paris
""".strip()
action_re = re.compile(r'^Action: (\w+): (.*)$')
The query function, defined in full in the integration step later in this article, runs the ReAct loop by sending the question to the AI agent, parsing the actions, executing them, and feeding the observations back into the loop.
Let us now look at implementing the actions.
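As a quick illustration of the parsing step, the sketch below applies an action pattern to a sample model reply. The pattern is assumed to have the form `Action: <name>: <input>` on its own line, and `first_action` is a hypothetical helper name.

```python
import re

# Assumed form of the action pattern: "Action: <name>: <input>" on its own line.
action_re = re.compile(r"^Action: (\w+): (.*)$")

reply = "Thought: I should do the arithmetic\nAction: calculate: 15 * 25\nPAUSE"


def first_action(text):
    """Return (name, input) for the first Action line, or None if there is none."""
    for line in text.split("\n"):
        match = action_re.match(line)
        if match:
            return match.groups()
    return None


print(first_action(reply))          # ('calculate', '15 * 25')
print(first_action("Answer: 375"))  # None
```

A reply with no Action line is treated as the final answer, which is how the loop knows when to stop.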
The Wikipedia search action allows the AI agent to search for information on Wikipedia. Here’s how to implement it:
def wikipedia(q):
    response = httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    })
    return response.json()["query"]["search"][0]["snippet"]
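The search snippets returned by the MediaWiki API contain HTML highlighting markup (for example `<span class="searchmatch">`), and a query with no hits would make the `[0]` lookup fail. The sketch below shows one way to post-process the decoded JSON. `clean_snippet` is a hypothetical helper, and the sample payload is made up for illustration.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(search_json):
    """Return the first search snippet as plain text, or a fallback message."""
    results = search_json.get("query", {}).get("search", [])
    if not results:
        return "No Wikipedia results found."
    text = TAG_RE.sub("", results[0]["snippet"])  # drop highlighting tags
    return html.unescape(text)                    # decode entities such as &amp;


sample = {"query": {"search": [
    {"snippet": 'The capital of <span class="searchmatch">France</span> is Paris &amp; more'}
]}}
print(clean_snippet(sample))
```

Passing `response.json()` to a helper like this keeps the observation short and readable for the model.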
The blog search action allows the AI agent to search for information on a specific blog. Here’s how to implement it:
def simon_blog_search(q):
    response = httpx.get("https://datasette.simonwillison.net/simonwillisonblog.json", params={
        "sql": """
        select
          blog_entry.title || ': ' || substr(html_strip_tags(blog_entry.body), 0, 1000) as text,
          blog_entry.created
        from
          blog_entry join blog_entry_fts on blog_entry.rowid = blog_entry_fts.rowid
        where
          blog_entry_fts match escape_fts(:q)
        order by
          blog_entry_fts.rank
        limit
          1
        """.strip(),
        "_shape": "array",
        "q": q,
    })
    return response.json()[0]["text"]
The calculation action allows the AI agent to perform mathematical calculations. Here’s how to implement it:
def calculate(what):
    return eval(what)
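Because `eval` executes arbitrary Python, text produced by the model could run unintended code. One possible alternative, sketched below, walks the parsed expression and allows only numeric literals and basic arithmetic operators. `safe_calculate` is a hypothetical replacement, not the article's implementation, and it does not guard against very expensive inputs such as huge exponents.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression):
    """Evaluate an arithmetic expression without calling eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression.strip(), mode="eval").body)


print(safe_calculate("15 * 25"))    # 375
print(safe_calculate("4 * 7 / 3"))
```

Function calls, names, and attribute access all raise `ValueError`, so an input like `__import__('os')` is rejected rather than executed.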
Next, we need to register these actions in a dictionary so the AI agent can use them:
known_actions = {
    "wikipedia": wikipedia,
    "calculate": calculate,
    "simon_blog_search": simon_blog_search
}
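As the number of actions grows, a small decorator can keep the dictionary in sync with the function definitions. This is an optional sketch. `register_action` and the `word_count` action are hypothetical names used only for illustration.

```python
known_actions = {}


def register_action(func):
    """Add a function to known_actions under its own name and return it unchanged."""
    known_actions[func.__name__] = func
    return func


@register_action
def word_count(text):
    """Hypothetical extra action: count the words in the input."""
    return len(text.split())


print(sorted(known_actions))                               # ['word_count']
print(known_actions["word_count"]("the quick brown fox"))  # 4
```

Any action registered this way still needs to be described in the prompt, since the model only requests actions it has been told about.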
To integrate the actions with the AI agent, we need to ensure that the query function can handle the different actions and feed the observations back into the reasoning loop. Here’s how to complete the integration:
def query(question, max_turns=5):
    i = 0
    bot = ChatBot(prompt)
    next_prompt = question
    while i < max_turns:
        i += 1
        result = bot(next_prompt)
        print(result)
        actions = [action_re.match(a) for a in result.split('\n') if action_re.match(a)]
        if actions:
            action, action_input = actions[0].groups()
            if action not in known_actions:
                raise Exception(f"Unknown action: {action}: {action_input}")
            print(" -- running {} {}".format(action, action_input))
            observation = known_actions[action](action_input)
            print("Observation:", observation)
            next_prompt = f"Observation: {observation}"
        else:
            return result
With this setup, the AI agent can reason about the input, perform actions, observe the results, and generate responses.
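In the loop described above, an unknown action or a failing tool raises an exception and ends the run. An alternative, sketched below, is to turn such failures into observations so the model can try something else, and to log each action for later debugging. `run_action` and the `double` action are hypothetical names and not part of the article's code.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent")


def run_action(known_actions, name, action_input):
    """Run an action and always return a string usable as an Observation."""
    if name not in known_actions:
        available = ", ".join(sorted(known_actions))
        return f"Error: unknown action '{name}'. Available: {available}"
    try:
        observation = known_actions[name](action_input)
    except Exception as exc:  # report the failure to the model instead of crashing
        logger.warning("action %s failed: %s", name, exc)
        return f"Error: {name} failed with {type(exc).__name__}: {exc}"
    logger.info("action %s(%r) -> %r", name, action_input, observation)
    return str(observation)


actions = {"double": lambda text: int(text) * 2}  # hypothetical action
print(run_action(actions, "double", "21"))
print(run_action(actions, "double", "abc"))
print(run_action(actions, "missing", "x"))
```

Inside the loop, the line that calls `known_actions[action](action_input)` could call a wrapper like this instead, at the cost of possibly spending extra turns on retries.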
Let us now follow the steps for testing and debugging.
To test the AI agent, you can run sample queries and observe the results. Here are a few examples:
print(query("What does England share borders with?"))
print(query("Has Simon been to Madagascar?"))
print(query("Fifteen * twenty five"))
While testing, you might encounter some common issues. Here are a few tips to debug them:
Let us now improve AI agents.
To make the AI agent more robust and secure:
To enhance the AI agent’s capabilities, you can add more actions such as:
The future of AI agents is promising, with advancements in machine learning, natural language processing, and AI ethics. Emerging trends include:
In this comprehensive guide, we explored the concept of AI agents, their significance, and the ReAct pattern that enhances their capabilities. We covered the necessary tools and libraries, set up the environment, and walked through building an AI agent from scratch. We also discussed implementing actions, integrating them with the AI agent, and testing and debugging the system. Finally, we looked at real-world applications and future prospects of AI agents.
By following this guide, you now have the knowledge to build your own AI agents from scratch. Experiment with different actions, enhance the agent’s capabilities, and explore new possibilities in the exciting field of artificial intelligence.
A. The ReAct pattern (Reason + Act) involves implementing additional actions that an AI agent can take, like searching Wikipedia or running calculations, and teaching the agent to request these actions and process their results.
A. Essential tools and libraries include Python, OpenAI API, httpx for HTTP requests, and Python’s regular expressions (re) library.
A. Validate inputs thoroughly to prevent injection attacks, use sandboxing techniques where possible, implement error handling, and log actions for monitoring and debugging.
A. Yes, you can add various actions such as fetching weather information, searching for news articles, or translating text using appropriate APIs and integrating them into the AI agent’s reasoning loop.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.