In the rapidly evolving field of Generative AI, powerful models can act only through human prompting until agents enter the picture: if models are the brains, agents are the limbs. Agentic workflows use agents that leverage GenAI models to perform tasks autonomously. In AI development, agents are the future because they can carry out complex tasks without direct human involvement. Microsoft's AutoGen framework stands out as a powerful tool for creating and managing multi-agent conversations. AutoGen simplifies the process of building AI systems that can collaborate, reason, and solve complex problems through agent-to-agent interactions.
In this article, we will explore the key features of AutoGen, how it works, and how you can leverage its capabilities in your projects.
This article was published as a part of the Data Science Blogathon.
An agent is an entity that can send messages, receive messages, and generate responses using GenAI models, tools, human input, or a mixture of all three. This abstraction not only allows agents to model real-world and abstract entities, such as people and algorithms, but also simplifies the implementation of complex workflows.
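To make the abstraction concrete, here is a minimal toy sketch in plain Python. This is not the actual AutoGen API; the class and method names are illustrative only, and the reply function stands in for whatever backs the agent (an LLM, a tool, or a human):

```python
class ToyAgent:
    """Minimal illustration of the agent abstraction: an entity
    that receives messages and generates replies."""

    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # stands in for an LLM, tool, or human input

    def receive(self, message, sender):
        # Generate a response using whatever backs this agent.
        return self.reply_fn(message)

    def send(self, message, recipient):
        # Deliver a message and return the recipient's reply.
        return recipient.receive(message, sender=self)


# Two toy agents: one asks, one answers with a canned function.
answerer = ToyAgent("answerer", reply_fn=lambda m: f"Echo: {m}")
asker = ToyAgent("asker", reply_fn=lambda m: m)
print(asker.send("hello", answerer))  # Echo: hello
```

The point is only that "agent" is a uniform interface over very different backends, which is what lets AutoGen compose them freely.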
AutoGen was developed by a community of researchers and engineers. It incorporates the latest research in multi-agent systems and has been used in many real-world applications. The AutoGen framework is extensible and composable, meaning you can extend a simple agent with customizable components and create workflows that combine these agents into a more powerful one. It is modular and easy to implement.
Let us now explore agents of AutoGen.
At the heart of AutoGen is the ConversableAgent. It provides the base functionality and is the base class for all other AutoGen agents. A ConversableAgent is capable of engaging in conversations, processing information, and performing tasks.
AutoGen provides several pre-defined agent types, each designed for a specific role, such as AssistantAgent (an LLM-backed problem solver) and UserProxyAgent (a proxy for the human user that can also execute code).
These patterns enable complex problem-solving and task completion through collaborative agent interactions.
AutoGen facilitates multi-agent conversation and task execution through a sophisticated orchestration of AI agents.
Agent Initialization: In AutoGen, we first initialize the agents. This involves creating instances of the agent types you need and configuring them with specific parameters.
Example:
from autogen import AssistantAgent, UserProxyAgent
assistant1 = AssistantAgent("assistant1", llm_config={"model": "gpt-4","api_key":"<YOUR API KEY>"})
assistant2 = AssistantAgent("assistant2", llm_config={"model": "gpt-4","api_key":"<YOUR API KEY>"})
Conversation Flow: Once the agents are initialized, AutoGen manages the flow of conversation between them.
Typical flow pattern:
1. An initiating agent (often a UserProxyAgent) sends the first message.
2. The receiving agent generates a reply using its LLM, tools, or human input.
3. The reply goes back to the sender, which responds in turn.
4. The exchange repeats until a termination condition is met.
This is the basic conversation flow in AutoGen. For more complex task processes, we can combine multiple agents into a GroupChat and then use a GroupChatManager to manage the conversation. Each group and its manager are responsible for specific tasks.
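The idea behind a group chat can be sketched in a few lines of plain Python. This is an illustration of round-robin speaker selection, not AutoGen's actual implementation (which can also pick speakers with an LLM); the names are made up:

```python
def toy_group_chat(agents, first_message, max_round=4):
    """Illustrative round-robin group chat: each 'agent' is just a
    function that maps the last message to a new message."""
    history = [first_message]
    for round_no in range(max_round):
        speaker = agents[round_no % len(agents)]  # round-robin selection
        history.append(speaker(history[-1]))
    return history

# Toy "agents" that annotate the running message.
coder = lambda m: m + " -> code"
critic = lambda m: m + " -> review"
history = toy_group_chat([coder, critic], "task", max_round=4)
print(history[-1])  # task -> code -> review -> code -> review
```

A GroupChatManager plays the role of this loop: it holds the shared history, chooses the next speaker, and stops when a termination condition or the round limit is reached.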
As the conversation progresses, agents may need to perform specific tasks. AutoGen supports various task execution methods, including executing generated code, calling registered tools and functions, and requesting human input.
AutoGen implements a robust error-handling process. If an agent encounters an error, it can often diagnose and attempt to fix the issue autonomously. This creates a cycle of continuous improvement and problem-solving.
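This feedback cycle can be illustrated with a toy repair loop in plain Python (again a sketch of the pattern, not AutoGen's internals): an agent attempts a step, and on failure the error message is fed back so the next attempt can correct it.

```python
def toy_repair_loop(attempt_fn, max_attempts=3):
    """Try a task; on error, feed the error message back into the next attempt."""
    feedback = None
    for _ in range(max_attempts):
        try:
            return attempt_fn(feedback)
        except Exception as exc:
            feedback = str(exc)  # the "error message" given back to the agent
    raise RuntimeError("gave up after repeated failures")

# Toy task that only succeeds once it has seen feedback.
def flaky(feedback):
    if feedback is None:
        raise ValueError("missing import")
    return f"fixed after feedback: {feedback}"

print(toy_repair_loop(flaky))  # fixed after feedback: missing import
```

In AutoGen, the same shape appears when a coder agent's code raises an exception and the traceback is returned as the next message in the conversation.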
Conversations in AutoGen can terminate based on predefined conditions, such as reaching a maximum number of rounds, an agent sending a termination message (for example, "TERMINATE"), or a human choosing to stop. The flexibility of these termination conditions allows for both quick and targeted interactions.
Let us now explore use cases and examples of Microsoft’s AutoGen Framework.
AutoGen excels at breaking down and solving complex problems through multi-agent collaboration. It can be used in scientific research to analyze data, formulate hypotheses, and design experiments.
AutoGen can generate, execute, and debug code across various programming languages. This is particularly useful for software development and automation tasks.
The AutoGen framework is well suited for multi-agent automated advertising management. It can track customer reviews and ad clicks, run automated A/B tests on targeted advertising, and use GenAI models such as Gemini and Stable Diffusion to generate customer-specific advertisements.
AutoGen can create interactive tutoring experiences, where different agents take on roles such as teacher, student, and evaluator.
Let us now explore a simple example of the Teacher-Student-Evaluator model.
from autogen import AssistantAgent, UserProxyAgent

teacher = AssistantAgent("Teacher", llm_config={"model": "gpt-4", "api_key": "<YOUR API KEY>"})
student = UserProxyAgent("Student")
evaluator = AssistantAgent("Evaluator", llm_config={"model": "gpt-4", "api_key": "<YOUR API KEY>"})

def tutoring_session():
    student.initiate_chat(teacher, message="I need help understanding quadratic equations.")
    # Teacher explains the concept
    student.send("Did I understand correctly? A quadratic equation is ax^2 + bx + c = 0", evaluator)
    # Evaluator assesses understanding and provides feedback
    teacher.send("Let's solve this equation: x^2 - 5x + 6 = 0", student)
    # Student attempts a solution
    evaluator.send("Assess the student's solution and provide guidance if needed.", teacher)

tutoring_session()
So far, we have gathered the necessary knowledge for working with the AutoGen framework. Now, let's implement a hands-on project to cement our understanding.
In this project, we will use AutoGen agents to download a dataset from the web and analyze it using an LLM.
# create a conda environment
$ conda create -n autogen python=3.11
# after creating the env, activate it
$ conda activate autogen
# install autogen and the necessary libraries
$ pip install numpy pandas matplotlib seaborn python-dotenv jupyterlab
$ pip install pyautogen
Now, open VS Code and start the project by creating a Jupyter notebook.
import os
import autogen
from autogen.coding import LocalCommandLineCodeExecutor
from autogen import ConversableAgent
from dotenv import load_dotenv
Now, collect your API keys for the generative models from the respective sites and put them into a .env file at the root of the project. The below code will load all the API keys into the environment.
load_dotenv()
google_api_key = os.getenv("GOOGLE_API_KEY")
open_api_key = os.getenv("OPENAI_API_KEY")
os.environ["GOOGLE_API_KEY"] = google_api_key.strip('"')
os.environ["OPENAI_API_KEY"] = open_api_key.strip('"')
seed = 42
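For reference, the .env file read above is just key=value lines at the project root; for example (placeholder values, not real keys):

```
GOOGLE_API_KEY="your-gemini-api-key"
OPENAI_API_KEY="your-openai-api-key"
```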
I am using the Gemini free tier to test the code, so I set the Gemini safety thresholds to BLOCK_NONE.
safety_settings = [
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]
llm_config = {
    "config_list": [
        {
            "model": "gemini-1.5-flash",
            "api_key": os.environ["GOOGLE_API_KEY"],
            "api_type": "google",
            "safety_settings": safety_settings,
        }
    ]
}

# Alternatively, to use OpenAI instead of Gemini:
# llm_config = {
#     "config_list": [{"model": "gpt-4", "api_key": os.getenv("OPENAI_API_KEY")}]
# }
coding_task = [
    """download data from https://raw.githubusercontent.com/vega/vega-datasets/main/data/penguins.json""",
    """find descriptive statistics of the dataset, plot a chart of the relation between species and beak length, and save the plot to beak_length_depth.png""",
    """Develop a short report using the data from the dataset, save it to a file named penguin_report.md.""",
]
I will use four agents: a user proxy, a coder, a writer, and a critic.
The user proxy is AutoGen's UserProxyAgent, a subclass of ConversableAgent. Its human_input_mode defaults to ALWAYS, meaning it acts as a human agent, and its LLM configuration is False. By default it asks a human for input, but here we set human_input_mode to NEVER so it works autonomously.
user_proxy = autogen.UserProxyAgent(
name="User_proxy",
system_message="A human admin.",
code_execution_config={
"last_n_messages": 3,
"work_dir": "groupchat",
"use_docker": False,
}, # Please set use_docker=True if docker is available to
#run the generated code. Using docker is safer than running the generated code directly.
human_input_mode="NEVER",
)
To build the coder and writer agents we will leverage AutoGen's AssistantAgent, a subclass of ConversableAgent designed to solve tasks with an LLM. Its human_input_mode is NEVER. We can pass a system message prompt to an assistant agent.
coder = autogen.AssistantAgent(
name="Coder", # the default assistant agent is capable of solving problems with code
llm_config=llm_config,
)
writer = autogen.AssistantAgent(
name="writer",
llm_config=llm_config,
system_message="""
You are a professional report writer, known for
your insightful and engaging report for clients.
You transform complex concepts into compelling narratives.
Reply "TERMINATE" in the end when everything is done.
""",
)
The critic is an assistant agent that will assess the quality of the code created by the coder agent and suggest any improvements needed.
system_message="""Critic. You are a helpful assistant highly skilled in
evaluating the quality of a given visualization code by providing a score
from 1 (bad) - 10 (good) while providing clear rationale. YOU MUST CONSIDER
VISUALIZATION BEST PRACTICES for each evaluation. Specifically, you can
carefully evaluate the code across the following dimensions
- bugs (bugs): are there bugs, logic errors, syntax error or typos? Are
there any reasons why the code may fail to compile? How should it be fixed?
If ANY bug exists, the bug score MUST be less than 5.
- Data transformation (transformation): Is the data transformed
appropriately for the visualization type? E.g., is the dataset appropriated
filtered, aggregated, or grouped if needed? If a date field is used, is the
date field first converted to a date object etc?
- Goal compliance (compliance): how well the code meets the specified
visualization goals?
- Visualization type (type): CONSIDERING BEST PRACTICES, is the
visualization type appropriate for the data and intent? Is there a
visualization type that would be more effective in conveying insights?
If a different visualization type is more appropriate, the score MUST
BE LESS THAN 5.
- Data encoding (encoding): Is the data encoded appropriately for the
visualization type?
- aesthetics (aesthetics): Are the aesthetics of the visualization
appropriate for the visualization type and the data?
YOU MUST PROVIDE A SCORE for each of the above dimensions.
{bugs: 0, transformation: 0, compliance: 0, type: 0, encoding: 0,
aesthetics: 0}
Do not suggest code.
Finally, based on the critique above, suggest a concrete list of actions
that the coder should take to improve the code.
"""
critic = autogen.AssistantAgent(
name="Critic",
system_message = system_message,
llm_config=llm_config,
)
In AutoGen, we use the GroupChat feature to group multiple agents together for specific tasks, and then a GroupChatManager to control the GroupChat's behavior.
groupchat_coder = autogen.GroupChat(
agents=[user_proxy, coder, critic], messages=[], max_round=10
)
groupchat_writer = autogen.GroupChat(
agents=[user_proxy, writer, critic], messages=[], max_round=10
)
manager_1 = autogen.GroupChatManager(
groupchat=groupchat_coder,
llm_config=llm_config,
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "groupchat",
"use_docker": False,
},
)
manager_2 = autogen.GroupChatManager(
groupchat=groupchat_writer,
name="Writing_manager",
llm_config=llm_config,
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "groupchat",
"use_docker": False,
},
)
Now, we will create a user agent to initiate the chat process and detect the termination command. It is a simple UserProxyAgent that acts as a human.
user = autogen.UserProxyAgent(
name="User",
human_input_mode="NEVER",
is_termination_msg=lambda x: x.get("content", "").find("TERMINATE") >= 0,
code_execution_config={
"last_n_messages": 1,
"work_dir": "tasks",
"use_docker": False,
}, # Please set use_docker=True if docker is available to run the
#generated code. Using docker is safer than running the generated
#code directly.
)
user.initiate_chats(
[
{"recipient": coder, "message": coding_task[0], "summary_method": "last_msg"},
{
"recipient": manager_1,
"message": coding_task[1],
"summary_method": "last_msg",
},
{"recipient": manager_2, "message": coding_task[2]},
]
)
The output of this process is very lengthy; for brevity, I will show only some of the initial output.
Here, you can see the agents work in steps: first the penguin dataset is downloaded, then the coder agent generates code, the critic agent checks the code and suggests improvements, and the coder agent then revises the code as suggested.
This is a simple AutoGen agentic workflow; you can experiment with the code and use different LLMs.
You can get all the code used in this article here
The future of AI is not just individual LLMs, but about creating ecosystems of AI entities that can work together seamlessly. AutoGen is at the forefront of this paradigm shift, paving the way for a new era of collaborative artificial intelligence. As you explore AutoGen’s capabilities, remember that you are not just working with a tool, you are partnering with an evolving ecosystem of AI agents. Embrace the possibilities, and experiment with different agent configurations and LLMs.
A. AutoGen was created by Microsoft to simplify the building of multi-agent AI systems. Its developers applied the latest agent-workflow research and techniques, which makes the APIs very easy to use. Unlike single-agent frameworks, AutoGen facilitates agent-to-agent communication and task delegation.
A. Since you are working with AI, I assume you already know Python reasonably well. That is all you need to start with AutoGen; learn incrementally and always read the official documentation. The framework provides high-level abstractions that simplify creating and managing AI agents.
A. AutoGen agents can be configured to access external data sources and APIs. This allows them to retrieve real-time information, interact with databases, or utilize external services as part of their problem-solving process.
A. AutoGen is highly flexible and customizable. You can easily use it with different frameworks. Follow the official documentation and ask specific questions in forums for better use cases.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.