In the rapidly evolving field of Generative AI, powerful models can act only through human prompting until agents enter the picture: if models are the brains, agents are the limbs. Agentic workflows use agents that leverage GenAI models to perform tasks autonomously. In AI development, agents are the future because they can carry out complex tasks without direct human involvement. Microsoft's AutoGen framework stands out as a powerful tool for creating and managing multi-agent conversations. AutoGen simplifies the process of building AI systems that can collaborate, reason, and solve complex problems through agent-to-agent interactions.
In this article, we will explore the key features of AutoGen, how it works, and how you can leverage its capabilities in your projects.
This article was published as a part of the Data Science Blogathon.
An agent is an entity that can send messages, receive messages, and generate responses using GenAI models, tools, human input, or a mixture of all three. This abstraction not only allows agents to model real-world and abstract entities, such as people and algorithms, but also simplifies the implementation of complex workflows.
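To make the abstraction concrete, here is a minimal toy sketch in plain Python. This is not the actual AutoGen API; the class and method names are illustrative only, and the reply function stands in for whatever backs the agent (an LLM, a tool, or a human):

```python
class ToyAgent:
    """Minimal illustration of the agent abstraction: an entity
    that receives messages and generates replies."""

    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # stands in for an LLM, tool, or human input

    def receive(self, message, sender):
        # Generate a response using whatever backs this agent.
        return self.reply_fn(message)

    def send(self, message, recipient):
        # Deliver a message and return the recipient's reply.
        return recipient.receive(message, sender=self)


# Two toy agents: one asks, one answers with a canned function.
answerer = ToyAgent("answerer", reply_fn=lambda m: f"Echo: {m}")
asker = ToyAgent("asker", reply_fn=lambda m: m)
print(asker.send("hello", answerer))  # Echo: hello
```

The point is only that "agent" is a uniform interface over very different backends, which is what lets AutoGen compose them freely.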
AutoGen was developed by a community of researchers and engineers. It incorporates the latest research in multi-agent systems and has been used in many real-world applications. The AutoGen framework is extensible and composable, meaning you can extend a simple agent with customizable components and create workflows that combine these agents into a more powerful one. It is modular and easy to implement.
Let us now explore agents of AutoGen.
At the heart of AutoGen is the ConversableAgent. It provides the base functionality and is the base class for all other AutoGen agents. A ConversableAgent is capable of engaging in conversations, processing information, and performing tasks.
AutoGen provides several pre-defined agent types, each designed for a specific role, such as AssistantAgent (an LLM-backed problem solver) and UserProxyAgent (a proxy for the human user that can also execute code).
These patterns enable complex problem-solving and task completion through collaborative agent interactions.
AutoGen facilitates multi-agent conversation and task execution through a sophisticated orchestration of AI agents.
Agent Initialization: In AutoGen, we first initialize the agents. This involves creating instances of the agent types you need and configuring them with specific parameters.
Example:
from autogen import AssistantAgent, UserProxyAgent
assistant1 = AssistantAgent("assistant1", llm_config={"model": "gpt-4","api_key":"<YOUR API KEY>"})
assistant2 = AssistantAgent("assistant2", llm_config={"model": "gpt-4","api_key":"<YOUR API KEY>"})
Conversation Flow: Once the agents are initialized, AutoGen manages the flow of conversation between them.
Typical flow pattern:
1. An initiating agent (often a UserProxyAgent) sends the first message.
2. The receiving agent generates a reply using its LLM, tools, or human input.
3. The reply goes back to the sender, which responds in turn.
4. The exchange repeats until a termination condition is met.
This is the basic conversation flow in AutoGen. For more complex task processes, we can combine multiple agents into a GroupChat and then use a GroupChatManager to manage the conversation. Each group and its manager are responsible for specific tasks.
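The idea behind a group chat can be sketched in a few lines of plain Python. This is an illustration of round-robin speaker selection, not AutoGen's actual implementation (which can also pick speakers with an LLM); the names are made up:

```python
def toy_group_chat(agents, first_message, max_round=4):
    """Illustrative round-robin group chat: each 'agent' is just a
    function that maps the last message to a new message."""
    history = [first_message]
    for round_no in range(max_round):
        speaker = agents[round_no % len(agents)]  # round-robin selection
        history.append(speaker(history[-1]))
    return history

# Toy "agents" that annotate the running message.
coder = lambda m: m + " -> code"
critic = lambda m: m + " -> review"
history = toy_group_chat([coder, critic], "task", max_round=4)
print(history[-1])  # task -> code -> review -> code -> review
```

A GroupChatManager plays the role of this loop: it holds the shared history, chooses the next speaker, and stops when a termination condition or the round limit is reached.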
As the conversation progresses, agents may need to perform specific tasks. AutoGen supports various task execution methods, including executing generated code, calling registered tools and functions, and requesting human input.
AutoGen implements a robust error-handling process. If an agent encounters an error, it can often diagnose and attempt to fix the issue autonomously. This creates a cycle of continuous improvement and problem-solving.
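This feedback cycle can be illustrated with a toy repair loop in plain Python (again a sketch of the pattern, not AutoGen's internals): an agent attempts a step, and on failure the error message is fed back so the next attempt can correct it.

```python
def toy_repair_loop(attempt_fn, max_attempts=3):
    """Try a task; on error, feed the error message back into the next attempt."""
    feedback = None
    for _ in range(max_attempts):
        try:
            return attempt_fn(feedback)
        except Exception as exc:
            feedback = str(exc)  # the "error message" given back to the agent
    raise RuntimeError("gave up after repeated failures")

# Toy task that only succeeds once it has seen feedback.
def flaky(feedback):
    if feedback is None:
        raise ValueError("missing import")
    return f"fixed after feedback: {feedback}"

print(toy_repair_loop(flaky))  # fixed after feedback: missing import
```

In AutoGen, the same shape appears when a coder agent's code raises an exception and the traceback is returned as the next message in the conversation.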
Conversations in AutoGen can terminate based on predefined conditions, such as reaching a maximum number of rounds, an agent sending a termination message (for example, "TERMINATE"), or a human choosing to stop. The flexibility of these termination conditions allows for both quick and targeted interactions.
Let us now explore use cases and examples of Microsoft’s AutoGen Framework.
AutoGen excels at breaking down and solving complex problems through multi-agent collaboration. It can be used in scientific research to analyze data, formulate hypotheses, and design experiments.
AutoGen can generate, execute, and debug code across various programming languages. This is particularly useful for software development and automation tasks.
The AutoGen framework is well suited for multi-agent automated advertising management. It can track customer reviews and ad clicks, run automated A/B tests on targeted advertising, and use GenAI models such as Gemini and Stable Diffusion to generate customer-specific advertisements.
AutoGen can create interactive tutoring experiences, where different agents take on roles such as teacher, student, and evaluator.
Let us now explore a simple example of the Teacher-Student-Evaluator model.
from autogen import AssistantAgent, UserProxyAgent

teacher = AssistantAgent("Teacher", llm_config={"model": "gpt-4", "api_key": "<YOUR API KEY>"})
student = UserProxyAgent("Student")
evaluator = AssistantAgent("Evaluator", llm_config={"model": "gpt-4", "api_key": "<YOUR API KEY>"})

def tutoring_session():
    student.initiate_chat(teacher, message="I need help understanding quadratic equations.")
    # Teacher explains the concept
    student.send("Did I understand correctly? A quadratic equation is ax^2 + bx + c = 0", evaluator)
    # Evaluator assesses understanding and provides feedback
    teacher.send("Let's solve this equation: x^2 - 5x + 6 = 0", student)
    # Student attempts a solution
    evaluator.send("Assess the student's solution and provide guidance if needed.", teacher)

tutoring_session()
So far, we have gathered the necessary knowledge for working with the AutoGen framework. Now, let's implement a hands-on project to cement our understanding.
In this project, we will use AutoGen agents to download a dataset from the web and analyze it using an LLM.
# create a conda environment
$ conda create -n autogen python=3.11
# after creating the env, activate it
$ conda activate autogen
# install autogen and the necessary libraries
$ pip install numpy pandas matplotlib seaborn python-dotenv jupyterlab
$ pip install pyautogen
Now, open VS Code and start the project by creating a Jupyter notebook.
import os
import autogen
from autogen.coding import LocalCommandLineCodeExecutor
from autogen import ConversableAgent
from dotenv import load_dotenv
Now, collect your API keys for the generative models from the respective sites and put them into a .env file at the root of the project. The below code will load all the API keys into the environment.
load_dotenv()
google_api_key = os.getenv("GOOGLE_API_KEY")
open_api_key = os.getenv("OPENAI_API_KEY")
os.environ["GOOGLE_API_KEY"] = google_api_key.strip('"')
os.environ["OPENAI_API_KEY"] = open_api_key.strip('"')
seed = 42
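For reference, the .env file read above is just key=value lines at the project root; for example (placeholder values, not real keys):

```
GOOGLE_API_KEY="your-gemini-api-key"
OPENAI_API_KEY="your-openai-api-key"
```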
I am using the Gemini free tier to test the code, so I set the Gemini safety thresholds to BLOCK_NONE.
safety_settings = [
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]
llm_config = {
    "config_list": [
        {
            "model": "gemini-1.5-flash",
            "api_key": os.environ["GOOGLE_API_KEY"],
            "api_type": "google",
            "safety_settings": safety_settings,
        }
    ]
}

# Alternatively, to use OpenAI instead of Gemini:
# llm_config = {
#     "config_list": [{"model": "gpt-4", "api_key": os.getenv("OPENAI_API_KEY")}]
# }
coding_task = [
    """download data from https://raw.githubusercontent.com/vega/vega-datasets/main/data/penguins.json""",
    """find descriptive statistics of the dataset, plot a chart of the relation between species and beak length, and save the plot to beak_length_depth.png""",
    """Develop a short report using the data from the dataset, save it to a file named penguin_report.md.""",
]
I will use four agents: a user proxy, a coder, a writer, and a critic.
The user proxy is AutoGen's UserProxyAgent, a subclass of ConversableAgent. Its human_input_mode defaults to ALWAYS, meaning it acts as a human agent, and its LLM configuration is False. By default it asks a human for input, but here we set human_input_mode to NEVER so it works autonomously.
user_proxy = autogen.UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
code_execution_config={
"last_n_messages": 3,
"work_dir": "groupchat",
"use_docker": False,
}, # Please set use_docker=True if docker is available to
#run the generated code. Using docker is safer than running the generated code directly.
human_input_mode="NEVER",
)
To build the coder and writer agents we will leverage AutoGen's AssistantAgent, a subclass of ConversableAgent designed to solve tasks with an LLM. Its human_input_mode is NEVER. We can pass a system message prompt to an assistant agent.
coder = autogen.AssistantAgent(
name="Coder", # the default assistant agent is capable of solving problems with code
llm_config=llm_config,
)
writer = autogen.AssistantAgent(
name="writer",
llm_config=llm_config,
system_message="""
You are a professional report writer, known for
your insightful and engaging report for clients.
You transform complex concepts into compelling narratives.
Reply "TERMINATE" in the end when everything is done.
""",
)
The critic is an assistant agent that will assess the quality of the code created by the coder agent and suggest any improvements needed.
system_message="""Critic. You are a helpful assistant highly skilled in
evaluating the quality of a given visualization code by providing a score
from 1 (bad) - 10 (good) while providing clear rationale. YOU MUST CONSIDER
VISUALIZATION BEST PRACTICES for each evaluation. Specifically, you can
carefully evaluate the code across the following dimensions
- bugs (bugs): are there bugs, logic errors, syntax error or typos? Are
there any reasons why the code may fail to compile? How should it be fixed?
If ANY bug exists, the bug score MUST be less than 5.
- Data transformation (transformation): Is the data transformed
appropriately for the visualization type? E.g., is the dataset appropriated
filtered, aggregated, or grouped if needed? If a date field is used, is the
date field first converted to a date object etc?
- Goal compliance (compliance): how well the code meets the specified
visualization goals?
- Visualization type (type): CONSIDERING BEST PRACTICES, is the
visualization type appropriate for the data and intent? Is there a
visualization type that would be more effective in conveying insights?
If a different visualization type is more appropriate, the score MUST
BE LESS THAN 5.
- Data encoding (encoding): Is the data encoded appropriately for the
visualization type?
- aesthetics (aesthetics): Are the aesthetics of the visualization
appropriate for the visualization type and the data?
YOU MUST PROVIDE A SCORE for each of the above dimensions.
{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0,
aesthetics: 0}
Do not suggest code.
Finally, based on the critique above, suggest a concrete list of actions
that the coder should take to improve the code.
"""
critic = autogen.AssistantAgent(
name="Critic",
system_message = system_message,
llm_config=llm_config,
)
In AutoGen, we use the GroupChat feature to group multiple agents together for specific tasks, and then a GroupChatManager to control the GroupChat's behavior.
groupchat_coder = autogen.GroupChat(
agents=[user_proxy, coder, critic], messages=[], max_round=10
)
groupchat_writer = autogen.GroupChat(
agents=[user_proxy, writer, critic], messages=[], max_round=10
)
manager_1 = autogen.GroupChatManager(
groupchat=groupchat_coder,
llm_config=llm_config,
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "groupchat",
"use_docker": False,
},
)
manager_2 = autogen.GroupChatManager(
groupchat=groupchat_writer,
name="Writing_manager",
llm_config=llm_config,
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "groupchat",
"use_docker": False,
},
)
Now, we will create a user agent to initiate the chat process and detect the termination command. It is a simple UserProxyAgent that acts as a human.
user = autogen.UserProxyAgent(
name="User",
human_input_mode="NEVER",
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "tasks",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the
#generated code. Using docker is safer than running the generated
#code directly.
)
user.initiate_chats(
[
{"recipient": coder, "message": coding_task[0], "summary_method": "last_msg"},
{
"recipient": manager_1,
"message": coding_task[1],
"summary_method": "last_msg",
},
{"recipient": manager_2, "message": coding_task[2]},
]
)
The output of this process is very lengthy; for brevity, I will show only some of the initial output.
Here, you can see the agents work in steps: first the penguin dataset is downloaded, then the coder agent generates code, the critic agent checks the code and suggests improvements, and the coder agent then revises the code as suggested.
This is a simple AutoGen agentic workflow; you can experiment with the code and use different LLMs.
You can get all the code used in this article here
The future of AI is not just individual LLMs, but about creating ecosystems of AI entities that can work together seamlessly. AutoGen is at the forefront of this paradigm shift, paving the way for a new era of collaborative artificial intelligence. As you explore AutoGen’s capabilities, remember that you are not just working with a tool, you are partnering with an evolving ecosystem of AI agents. Embrace the possibilities, and experiment with different agent configurations and LLMs.
A. AutoGen was created by Microsoft to simplify the building of multi-agent AI systems. Its developers applied the latest agent-workflow research and techniques, which makes the APIs very easy to use. Unlike single-agent frameworks, AutoGen facilitates agent-to-agent communication and task delegation.
A. Since you are working with AI, I assume you already know Python reasonably well. That is all you need to start with AutoGen; learn incrementally and always read the official documentation. The framework provides high-level abstractions that simplify creating and managing AI agents.
A. AutoGen agents can be configured to access external data sources and APIs. This allows them to retrieve real-time information, interact with databases, or utilize external services as part of their problem-solving process.
A. AutoGen is highly flexible and customizable. You can easily use it with different frameworks. Follow the official documentation and ask specific questions in forums for better use cases.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.