Embark on a thrilling journey into the future of software development with ‘Launching into Autogen: Exploring the Basics of a Multi-Agent Framework.’ In the wake of OpenAI’s ChatGPT, a specialized realm known as LLM agents is experiencing an unprecedented surge, revolutionizing AI agent development. From automating mundane tasks to tackling challenges in dynamic decision-making, LLM agents are pushing the boundaries of what was once deemed impossible.
As we step into the era of spatial computing, envision a world where computers seamlessly merge with reality, and the significance of AI agents becomes paramount. Imagine instructing agents through words and gestures as they execute tasks with unparalleled reasoning and acting capabilities. However, we’re at the dawn of the AI agent revolution, witnessing the birth of new infrastructures, tools, and frameworks that empower agents to tackle increasingly complex tasks. Autogen, a cutting-edge framework for crafting multi-agent chat systems, takes center stage in our exploration.
Join us in this article as we unravel the intricacies of AI agents in the early stages of the revolution, delving into the capabilities of Autogen and discovering how to bring these intelligent entities to life.
This article was published as a part of the Data Science Blogathon.
Vanilla language models are great at many tasks, such as translation and question answering. However, their knowledge and capabilities are limited; they are like a mason building a house without tools. It has been observed, though, that LLMs can reason and act when given the necessary tools. Most LLMs have limited knowledge of the world, but we can augment them with information from custom sources via prompting.
We can achieve this via two methods: Retrieval Augmented Generation (RAG) and LLM agents. In RAG, we feed models information via custom, hard-coded pipelines. With agents, the LLM, based on its reasoning, uses the tools at its disposal. For example, GPT-4 with a Serper tool can browse the internet and answer accordingly, or it can fetch and analyze stock performance when it has access to a Yahoo Finance tool. This combination of an LLM, tools, and a framework for reasoning and taking action is what an AI agent is.
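To make the reason-then-act idea concrete, here is a toy sketch in plain Python (not Autogen code). The tool names and the `pick_tool` stub are made up for illustration; a real agent would ask the LLM which tool fits and then feed the tool's observation back to the LLM for a final answer.

```python
# Toy illustration of the agent loop: decide on a tool, run it, use the result.
def pick_tool(question):
    # A real agent would let the LLM reason about tool choice;
    # this stub just keys on a word in the question.
    return "finance" if "stock" in question else "search"

TOOLS = {
    "search": lambda q: f"web results for: {q}",
    "finance": lambda q: f"price history for: {q}",
}

def answer(question):
    tool = pick_tool(question)
    observation = TOOLS[tool](question)
    # A real agent would pass the observation back to the LLM here.
    return f"[{tool}] {observation}"

print(answer("How did AAPL stock perform this year?"))
```

The point of the sketch is only the control flow: the model, not a hard-coded pipeline, decides which tool runs.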
There has been a rapid rise in platforms and tools for building LLM agents. Autogen is one such tool. So, let’s understand what Autogen is and how to create LLM agents with it.
Autogen is an open-source tool from Microsoft for building robust multi-agent applications. It is designed from the ground up with multi-agent communication in mind. It lets us create LLM applications where multiple agents converse with each other to solve the problems provided. The agents are highly customizable, meaning we can guide them to perform specific tasks. Autogen also integrates well with the Langchain tooling ecosystem, which means we can leverage existing Langchain tools to augment our agents.
To accomplish tasks, Autogen provides different types of agents, such as the AssistantAgent and the UserProxyAgent. For most use cases, we only need an Assistant Agent and a User Proxy Agent. There are also other agents, such as the RetrieveAssistantAgent and RetrieveUserProxyAgent, configured for RAG. So, let's see how we can configure agents with Autogen.
Here is a diagram of a typical multi-agent workflow.
Now, let's dive into configuring Autogen agents. But before that, set up the environment. If the use case requires code execution, the agents will do it in the current environment, and per the official documentation, this is best done inside a container. To get started quickly, you can use GitHub Codespaces. Make sure you install "pyautogen".
As of now, Autogen supports only OpenAI models. To use agents effectively, we need to configure our models. We can configure multiple OpenAI models and use the ones we need. There are different ways to configure models; here, we will define a JSON file.
#OAI_CONFIG_LIST
[
    {
        "model": "gpt-4",
        "api_key": "<your OpenAI API key here>"
    },
    {
        "model": "gpt-4",
        "api_key": "<your Azure OpenAI API key here>",
        "base_url": "<your Azure OpenAI API base here>",
        "api_type": "azure",
        "api_version": "2023-07-01-preview"
    },
    {
        "model": "gpt-3.5-turbo",
        "api_key": "<your OpenAI API key here>"
    }
]
Now, define a config list for models.
import autogen

config_list = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": {"gpt-3.5-turbo"},
    },
)
This method first searches for OAI_CONFIG_LIST in an environment variable. If unsuccessful, it searches for an OAI_CONFIG_LIST JSON file in the current directory. The filter_dict parameter filters models based on some criteria; here, it is set to filter by model name.
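Conceptually, the filtering step boils down to keeping only the entries whose fields match the filter values. Here is a simplified sketch of that behavior (our own simplification; the real function also handles the environment-variable lookup and other options):

```python
# Rough sketch of what config_list_from_json's filter_dict does:
# keep config entries whose fields match the allowed values.
import json

raw = json.loads("""
[
  {"model": "gpt-4", "api_key": "sk-placeholder"},
  {"model": "gpt-3.5-turbo", "api_key": "sk-placeholder"}
]
""")

filter_dict = {"model": {"gpt-3.5-turbo"}}
config_list = [
    c for c in raw
    if all(c.get(key) in allowed for key, allowed in filter_dict.items())
]
print(config_list)  # only the gpt-3.5-turbo entry remains
```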
Now, we define the configuration for the LLM. In this example, we will use a Jupyter Notebook function tool to run Python scripts to accomplish a simple graph plotting task.
llm_config = {
    "functions": [
        {
            "name": "python",
            "description": "run cell in ipython and return the execution result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "cell": {
                        "type": "string",
                        "description": "Valid Python cell to execute.",
                    }
                },
                "required": ["cell"],
            },
        },
    ],
    "config_list": config_list,
    "timeout": 120,
}
We shall define the function for running Python scripts in the IPython notebook.
from IPython import get_ipython

def exec_python(cell):
    ipython = get_ipython()
    result = ipython.run_cell(cell)
    log = str(result.result)
    if result.error_before_exec is not None:
        log += f"\n{result.error_before_exec}"
    if result.error_in_exec is not None:
        log += f"\n{result.error_in_exec}"
    return log
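Note that get_ipython() returns None outside an IPython session, so exec_python only works inside a notebook. If you want to sanity-check the idea from a plain Python script, here is a rough standalone equivalent (the function name and error handling are our own; it captures stdout instead of IPython's richer result object):

```python
import io
import traceback
from contextlib import redirect_stdout

def exec_python_standalone(cell):
    """Run a code string and return its stdout, plus the traceback on error."""
    buffer = io.StringIO()
    try:
        with redirect_stdout(buffer):
            exec(cell, {})
    except Exception:
        return buffer.getvalue() + traceback.format_exc()
    return buffer.getvalue()

print(exec_python_standalone("print(2 + 2)"))  # 4
```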
We are almost done building our multi-agent system. The only missing pieces are the agents we talked about earlier. Here, we will need an Assistant Agent and a User Proxy Agent. This is how we can define them.
chatbot = autogen.AssistantAgent(
    name="chatbot",
    system_message="For coding tasks, only use the functions you have been provided with. "
    "Reply TERMINATE when the task is done.",
    llm_config=llm_config,
)

# create a UserProxyAgent instance named "user_proxy"
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    is_termination_msg=lambda x: x.get("content", "")
    and x.get("content", "").rstrip().endswith("TERMINATE"),
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config={"work_dir": "coding"},
)
The UserProxyAgent has a human_input_mode parameter, which puts an actual human in the agent loop depending on its value. When set to ALWAYS, it asks for input after every response; with TERMINATE, it asks only at the end of the execution; and with NEVER, it never asks for user input.
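The is_termination_msg callable decides when the auto-reply loop stops. The lambda passed above ends the chat when a reply ends with "TERMINATE"; written out as a named function, it behaves like this in isolation:

```python
def is_termination_msg(msg):
    # Mirrors the lambda passed to UserProxyAgent: non-empty content
    # that ends with "TERMINATE" (ignoring trailing whitespace).
    content = msg.get("content", "")
    return bool(content) and content.rstrip().endswith("TERMINATE")

print(is_termination_msg({"content": "All done. TERMINATE"}))  # True
print(is_termination_msg({"content": "Still working..."}))     # False
print(is_termination_msg({}))                                  # False
```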
Register the functions with the user proxy agent.
user_proxy.register_function(
    function_map={
        "python": exec_python,
    }
)
Now, run the agents.
# start the conversation
user_proxy.initiate_chat(
    chatbot,
    message="plot a curve for sin wave",
)
Running the agents will print an execution log showing what is happening.
In the above execution log, you can see that the Assistant agent "chatbot" generates the code, the code is run, and an error is found. The user proxy sends the error back to the assistant, which then runs the fix, in this case installing Matplotlib, and finally runs the corrected code and returns the output.
We can also extend this by adding another Assistant agent to the conversation, such as a Critic or Reviewer. This will help make the output more personalized. Here is how you can do that.
critic = autogen.AssistantAgent(
    name="Critic",
    system_message="""Critic. You are a helpful assistant highly skilled in
evaluating the quality of a given visualization code by providing a score
from 1 (bad) - 10 (good) while providing clear rationale.
YOU MUST CONSIDER VISUALIZATION BEST PRACTICES for each evaluation.
Specifically, you can carefully evaluate the code across the following dimensions:
- bugs (bugs): are there bugs, logic errors, syntax errors, or typos?
  Are there any reasons why the code may fail to compile? How should it be
  fixed? If ANY bug exists, the bug score MUST be less than 5.
- Data transformation (transformation): Is the data transformed appropriately
  for the visualization type? E.g., is the dataset appropriately filtered,
  aggregated, or grouped if needed? If a date field is used, is the date field
  first converted to a date object, etc.?
YOU MUST PROVIDE A SCORE for each of the above dimensions.
{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0, aesthetics: 0}
Do not suggest code.
Finally, based on the critique above, suggest a concrete list of actions that
the coder should take to improve the code.
""",
    llm_config=llm_config,
)
# the "chatbot" Assistant agent defined earlier acts as the coder in the group chat
groupchat = autogen.GroupChat(agents=[user_proxy, chatbot, critic], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
user_proxy.initiate_chat(
    manager,
    message="plot a curve for inverse sin wave in the -pi/2 to pi/2 region",
)
The above run adds a critic that suggests improvements for the plot, and the Assistant tries to generate compliant code. When we need more than two agents, we use a GroupChatManager to coordinate the conversation.
In this example, we used GPT-3.5. For more complicated coding and reasoning tasks, GPT-4 is preferred; we can do more with fewer agents when using capable models. GPT-3.5 also sometimes tends to get stuck in a loop, so GPT-4 is a much better choice for serious applications. A new type of experimental agent, the EcoAssistant, is also being developed (code). This agent addresses the higher cost of using capable models like GPT-4 via a model hierarchy. The idea is to start a conversation with cost-effective models, and if they do not achieve the end goal, escalate to the capable yet costly ones. One significant benefit of this approach is the synergistic effect: as the agents share a single database, code written by bigger models can later be retrieved by smaller models, improving efficiency and reducing costs.
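The escalation idea can be sketched in a few lines. This is our own simplification, not the EcoAssistant code: the function names, the `try_model(model, task)` signature, and the stub runner are all assumptions for illustration.

```python
def solve_with_hierarchy(task, models, try_model):
    """Try models from cheapest to most capable until one succeeds.

    `try_model(model, task)` returns a result string, or None on failure;
    that contract is an assumption made for this sketch.
    """
    for model in models:
        result = try_model(model, task)
        if result is not None:
            return model, result
    raise RuntimeError("no model in the hierarchy solved the task")

# Stub runner: pretend only the capable model can handle "hard" tasks.
def fake_runner(model, task):
    if "hard" in task and model != "gpt-4":
        return None
    return f"{model} solved: {task}"

print(solve_with_hierarchy("hard proof", ["gpt-3.5-turbo", "gpt-4"], fake_runner))
```

Easy tasks stop at the cheap model, so the expensive model is only billed for the tasks that actually need it.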
The scope of AI agents in the real world is immense, and many companies have started integrating agents into their existing systems.
AI agents are growing in popularity. In the times to come, there is little doubt they will be integrated into most software systems in one way or another. This is still the earliest stage of agent development, similar to the internet in the 90s. Before long, there will be much better agents solving novel problems, and libraries and tools like Langchain will only continue to evolve.
Q1. What is Autogen?
A. Autogen is an open-source Python framework for building personalized multi-agent systems as a high-level abstraction.
Q2. What is Autogen used for?
A. Autogen provides a high-level abstraction for building multi-agent chat solutions for complex LLM workflows.
Q3. What are AI agents?
A. AI agents are software programs that interact with their environment, make decisions, and act to achieve an end goal.
Q4. Which model should I use for AI agents?
A. This depends on your use cases and budget. GPT-4 is the most capable but expensive, while GPT-3.5 and Cohere models are less capable but fast and cheap.
Q5. What is the difference between chains and agents?
A. Chains are a sequence of hard-coded actions to follow, while agents use LLMs and other tools (also chains) to reason and act based on the information available.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.