Building a Multi-Agent System for Automatic Code Error Detection from Screenshots

Nibedita Dutta Last Updated : 30 Mar, 2025
10 min read

Can AI detect and fix coding errors just by analyzing a screenshot? With a Multi-Agent System for Automatic Code Error Detection, the answer is yes. This innovative approach uses artificial intelligence and reasoning to identify coding mistakes from images, propose accurate solutions, and explain the logic behind them. At the core is a decentralized Multi-Agent System, where autonomous agents—such as AI models, tools, or services—work collaboratively. Each agent gathers data, makes localized decisions, and contributes to solving complex debugging tasks. By automating this process, developers can save time, improve accuracy, and avoid the manual hassle of searching for solutions online.

Learning Objectives

  • Understand the Multi-Agent System with Reasoning and how it automates error detection and solution generation from screenshots.
  • Explore the role of artificial intelligence in enhancing the efficiency of a Multi-Agent System with Reasoning for software debugging.
  • Learn how Griptape simplifies the development of multi-agent systems with modular workflows.
  • Implement a multi-agent system for detecting coding errors from screenshots using AI models.
  • Utilize vision language models and reasoning-based LLMs for automated error detection and explanation.
  • Build and deploy AI agents specialized in web searching, reasoning, and image analysis.
  • Develop structured workflows to extract, analyze, and resolve coding errors efficiently.
  • Optimize security, scalability, and reliability in multi-agent system implementations.

This article was published as a part of the Data Science Blogathon.

Multi-Agent Systems: A Brief Introduction

Multi-Agent Systems (MAS) represent intricate frameworks consisting of numerous interactive intelligent agents, each possessing unique skills and objectives. These agents can take various forms, including software applications, robotic entities, drones, sensors, and even humans, or a blend of these elements. The primary purpose of MAS is to address challenges that individual agents struggle to manage independently by harnessing the power of collective intelligence, collaboration, and coordinated efforts among the agents.

Distinctive Features of Multi-Agent Systems

  • Autonomy: Each agent functions with a level of self-governance, making choices based on its localized understanding of the surroundings.
  • Decentralization: Authority is spread across the agents, enabling the system to maintain operations even if certain parts fail.
  • Self-Organization: Agents possess the ability to adjust and arrange themselves according to emergent behaviors, resulting in effective task distribution and conflict management.
  • Real-Time Functionality: MAS can swiftly react to dynamic conditions without requiring human oversight, which makes them ideal for scenarios such as emergency response and traffic regulation.

Some Practical Examples of Multi-Agent Systems

Multi-agent systems are transforming various industries by enabling intelligent collaboration among autonomous agents. Here are some practical examples showcasing their real-world applications.

  • Agent For Resolving Queries Dynamically: This is a sophisticated multi-agent system designed to address customer inquiries effectively. It begins by tapping into its extensive knowledge base and, when necessary, retrieves pertinent information from integrated tools to deliver precise answers.
  • Dynamic Assignment of Tickets: This advanced multi-agent system streamlines the ticket management workflow within the Customer Support division by automatically directing incoming support tickets to the most appropriate agents. Utilizing generative AI, it evaluates each ticket based on established criteria such as category, severity, and agent specialization.
  • Agent For Analyzing Gaps in Knowledge Base: This specialized multi-agent system aims to enhance the efficacy of the knowledge base by pinpointing recurring support challenges that require better coverage in existing articles. Through the use of generative AI, this agent examines trends in support tickets and customer inquiries to identify areas needing improvement.

Building Multi-Agent Systems with Griptape

The Griptape framework streamlines the development of collaborative AI agents by balancing predictability and creativity through modular design and secure workflows.

Agent Specialization and Coordination

Griptape enables developers to define agents with distinct roles, such as:

  • Research Agents: Gather data using tools like web search and scraping
  • Writer Agents: Transform insights into narratives tailored to specific audiences
  • Analytical Agents: Validate outputs against predefined schemas or business rules

Agents interact through workflows that parallelize tasks while maintaining dependencies. For example, a research agent’s findings can trigger multiple writer agents to generate content simultaneously.
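The fan-out pattern described here can be pictured without any framework. The following is a minimal sketch and not Griptape's API: `research` and `write_for` are hypothetical stand-ins for agents, and a thread pool plays the role of the workflow scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def research(topic: str) -> str:
    """Hypothetical research agent: returns findings for a topic."""
    return f"findings about {topic}"


def write_for(audience: str, findings: str) -> str:
    """Hypothetical writer agent: tailors the findings to one audience."""
    return f"[{audience}] summary of {findings}"


def fan_out(topic: str, audiences: list[str]) -> list[str]:
    # The research step must finish first; every writer depends on it.
    findings = research(topic)
    # Writers are independent of each other, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=len(audiences)) as pool:
        return list(pool.map(lambda a: write_for(a, findings), audiences))


drafts = fan_out("Python tracebacks", ["beginners", "experts"])
```

`pool.map` returns results in the order of `audiences`, so the drafts stay aligned with their intended readers even though the writers run in parallel.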

Workflow Design

The framework supports two approaches:

  • Sequential Pipelines: For linear task execution (e.g., data ingestion → analysis → reporting)
  • DAG-Based Workflows: For complex, branching logic where agents dynamically adjust based on intermediate outputs
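To make the difference between the two approaches concrete, here is a small sketch that uses Python's standard `graphlib` module rather than Griptape itself. The task names are hypothetical. It groups a dependency graph into batches, where every task in a batch could run in parallel.

```python
from graphlib import TopologicalSorter

# Each key lists the tasks it depends on (hypothetical task names).
dependencies = {
    "ingest": set(),
    "analyze": {"ingest"},
    "visualize": {"ingest"},
    "report": {"analyze", "visualize"},
}


def execution_batches(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; tasks in one batch could run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(dependencies)
```

A sequential pipeline is the special case in which every batch contains exactly one task.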

Security and Scalability

Key safeguards include:

  • Off-prompt data handling: Minimizes exposure of sensitive information during LLM interactions
  • Permission controls: Restricts tool usage based on agent roles
  • Cloud integration: Deploys agents independently via services like Griptape Cloud for horizontal scaling
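The idea behind role-based permission controls can be illustrated with a plain allowlist. This is only a sketch of the concept, not how Griptape implements it; the role names, tool names, and the `check_tool` helper are all hypothetical.

```python
# Hypothetical role-to-tool allowlist; the names are illustrative only.
ALLOWED_TOOLS = {
    "researcher": {"web_search", "web_scraper"},
    "writer": {"prompt_summary"},
}


class ToolNotPermitted(PermissionError):
    """Raised when an agent role requests a tool outside its allowlist."""


def check_tool(role: str, tool: str) -> None:
    # Unknown roles get an empty allowlist, so they are denied by default.
    if tool not in ALLOWED_TOOLS.get(role, set()):
        raise ToolNotPermitted(f"role {role!r} may not use tool {tool!r}")


check_tool("researcher", "web_search")  # permitted, returns silently
```

Denying unknown roles by default keeps a misconfigured agent from silently gaining access to every tool.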

Implementation Best Practices

  • Use Rulesets to enforce agent behavior (e.g., formatting standards, ethical guidelines)
  • Leverage Memory types: Short-term for creative tasks, long-term for structured processes
  • Test workflows locally before deploying to distributed environments

Griptape’s modular architecture reduces reliance on prompt engineering by prioritizing Python code for logic definition, making it ideal for enterprise-grade applications like customer support automation and real-time data analysis pipelines.

Hands-on Implementation: Developing a Multi-Agent System for Resolving Coding Errors

In this tutorial, we will be creating a multi-agent system aimed at automatically detecting errors from coding screenshots, specifically using Python for our example. This system will not only identify errors but also offer users clear explanations on how to resolve them. Throughout this process, we will utilize vision language models in conjunction with reasoning-based large language models to enhance the functionality of our multi-agent framework.

Step1: Installing and Importing Necessary Libraries

First we will install all required libraries below:

!pip install griptape
!sudo apt update
!sudo apt install -y pciutils
!pip install langchain-ollama
!curl -fsSL https://ollama.com/install.sh | sh
!pip install ollama==0.4.2
!pip install "duckduckgo-search>=7.0.1"


import os

# Drivers: local file access, Ollama models, local structure runs, web search
from griptape.drivers.file_manager.local import LocalFileManagerDriver
from griptape.drivers.prompt.ollama import OllamaPromptDriver
from griptape.drivers.structure_run.local import LocalStructureRunDriver
from griptape.drivers.web_search.duck_duck_go import DuckDuckGoWebSearchDriver

# Building blocks for agents, tasks, and tools
from griptape.loaders import ImageLoader
from griptape.structures import Agent, Workflow
from griptape.tasks import StructureRunTask
from griptape.tools import FileManagerTool, ImageQueryTool, PromptSummaryTool, WebSearchTool

Step2: Running the Ollama Server and Pulling the Models

The following code starts the Ollama server in the background. We then pull the “minicpm-v” model from Ollama so that this vision model can be used to extract the text from coding screenshots.

import threading
import subprocess
import time

def run_ollama_serve():
  subprocess.Popen(["ollama", "serve"])

thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)

!ollama pull minicpm-v
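The fixed `time.sleep(5)` above assumes the server is ready within five seconds. As a hedged alternative, the sketch below polls the server's TCP port until it accepts connections. `wait_for_port` is a hypothetical helper, and the address in the usage comment assumes Ollama is listening on its default port, 11434.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # server not accepting connections yet; retry
    return False


# Usage (assumes Ollama listens on its default address, 127.0.0.1:11434):
# if not wait_for_port("127.0.0.1", 11434):
#     raise RuntimeError("Ollama server did not start in time")
```

Polling returns as soon as the server is up and fails loudly if it never starts, instead of letting a later `ollama pull` fail with a less obvious error.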

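Next we set the OpenAI API key. The screenshot-analysis and web-search agents defined below do not specify a prompt driver, so they fall back to Griptape’s default OpenAI driver, which requires this key; the image query and reasoning steps use the local Ollama models.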

import os
os.environ["OPENAI_API_KEY"] = ""
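Rather than pasting the key into the notebook, it can be read from the environment and validated up front. The helper below is a minimal sketch; `require_env` is a hypothetical name, not part of Griptape or the OpenAI SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, failing fast if unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the workflow")
    return value


# Usage: fail immediately if the key is missing, before any agent is built.
# api_key = require_env("OPENAI_API_KEY")
```

Failing early gives a clearer message than an authentication error raised in the middle of a workflow run.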

We will also leverage a powerful LLM to assist with reasoning by explaining the coding error and the provided solution with sufficient context. For this, we pull the phi4-mini model.

!ollama pull phi4-mini

Step3: Creating an Agent to Analyze Screenshots

We start with creating an agent to analyze screenshots of Python Coding Errors. This agent leverages a vision language model (minicpm-v) in the backend.

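images_dir = os.getcwd()  # screenshots are read from the current working directory


def analyze_screenshots():
    # The file manager and the image loader share one driver rooted at images_dir.
    driver = LocalFileManagerDriver(workdir=images_dir)
    return Agent(
        tools=[
            FileManagerTool(file_manager_driver=driver),
            # The vision language model (minicpm-v) answers queries about the image.
            ImageQueryTool(
                prompt_driver=OllamaPromptDriver(model="minicpm-v"),
                image_loader=ImageLoader(file_manager_driver=driver),
            ),
        ]
    )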
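Because the agent reads files relative to `images_dir`, a quick pre-flight check can catch a missing or misnamed screenshot before any model is called. This is a sketch under stated assumptions: `find_screenshot` is a hypothetical helper, and the set of accepted suffixes is an illustrative guess, not a list of what the vision model supports.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to whatever your model accepts.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_screenshot(directory: str, file_name: str) -> Path:
    """Return the path to a screenshot, or raise if it is missing or not an image."""
    path = Path(directory) / file_name
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported image type: {path.suffix!r}")
    if not path.is_file():
        raise FileNotFoundError(f"screenshot not found: {path}")
    return path
```

Calling it with the working directory and the screenshot's file name before building the workflow turns a confusing tool failure into an immediate, readable error.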

Step4: Creating Agents For Web searching and Reasoning

We then create two agents, one for web searching on possible solutions of the coding error and another for reasoning behind the error and its found solutions.

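def websearching_agent():
    # Searches the web with DuckDuckGo; PromptSummaryTool condenses long results.
    return Agent(
        tools=[
            WebSearchTool(web_search_driver=DuckDuckGoWebSearchDriver()),
            PromptSummaryTool(off_prompt=False),
        ],
    )


def reasoning_agent():
    # Uses the local phi4-mini model to explain the error and its solution.
    return Agent(
        prompt_driver=OllamaPromptDriver(model="phi4-mini"),
    )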

Step5: Defining Tasks for Analyzing Screenshots, Finding Solutions and Providing Reasoning

We use a screenshot of a Python coding error as input and save it in our current working directory as “pythonerror1.jpg”. The agentic system first extracts the text from the screenshot, then searches for possible solutions to the coding errors it contains, and finally provides the reasoning behind the errors and their solutions.

image_file_name = "pythonerror1.jpg"

# Task 1: the vision agent transcribes the screenshot into text.
screenshotanalysis_task = StructureRunTask(
    (
        """Extract IN TEXT FORMAT ALL THE LINES FROM THE GIVEN SCREEN SHOT %s""" % (image_file_name),
    ),
    id="research",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=analyze_screenshots,
    ),
)

# Task 2: the web-search agent looks up solutions for the transcribed errors.
findingsolution_task = StructureRunTask(
    (
        """FIND SOLUTION TO ONLY THE CODING ERRORS FOUND in the TEXT {{ parent_outputs["research"] }}. DO NOT INCLUDE ANY ADDITIONAL JUNK NON CODING  LINES WHILE FINDING THE SOLUTION.

        """,
    ),
    id="evaluate",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=websearching_agent,
    ),
)

# Task 3: the reasoning agent expands on the solution found by Task 2.
reasoningsolution_task = StructureRunTask(
    (
        """ADD TO THE PREVIOUS OUTPUT, EXPANDED VERSION OF REASONING ON HOW TO SOLVE THE ERROR BASED ON {{ parent_outputs["evaluate"] }}.
        DO INCLUDE THE WHOLE OUTPUT FROM THE PREVIOUS AGENT {{ parent_outputs["evaluate"] }}  AS WELL IN THE FINAL OUTPUT.

        """,
    ),
    structure_run_driver=LocalStructureRunDriver(
        create_structure=reasoning_agent,
    ),
)
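The `{{ parent_outputs["..."] }}` placeholders are what pass one task's result into the next task's prompt. Griptape renders them as Jinja templates; the sketch below is only a standard-library stand-in that mimics the lookup to show the data flow. `render_prompt` is a hypothetical helper and the sample error text is made up.

```python
import re

# Matches {{ parent_outputs["some_id"] }} and captures the task id.
_PLACEHOLDER = re.compile(r'\{\{\s*parent_outputs\["([^"]+)"\]\s*\}\}')


def render_prompt(template: str, parent_outputs: dict[str, str]) -> str:
    """Replace each {{ parent_outputs["id"] }} with that parent's output text."""
    return _PLACEHOLDER.sub(lambda m: parent_outputs[m.group(1)], template)


prompt = render_prompt(
    'FIND SOLUTION TO THE ERRORS IN {{ parent_outputs["research"] }}',
    {"research": "TypeError: unsupported operand type(s)"},
)
```

The keys in `parent_outputs` correspond to the `id` values given to the parent tasks, which is why the first two tasks are assigned the ids "research" and "evaluate".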
            

Step6: Executing the Workflow 

Now we will execute the workflow.

# Wire the dependencies: research -> evaluate -> reasoning.
screenshotanalysis_task.add_child(findingsolution_task)
findingsolution_task.add_child(reasoningsolution_task)
# The reasoning task is also made a direct child of the screenshot task.
screenshotanalysis_task.add_child(reasoningsolution_task)

team = Workflow(
    tasks=[screenshotanalysis_task, findingsolution_task, reasoningsolution_task],
)
answer = team.run()
print(answer.output)

Input Screenshot

[Image: input screenshot]

Output From Agentic System


Certainly! Here is an expanded explanation of how you can solve this error in
Python:
When working with strings and integers together, it's important that both elements
are either numbers (integers or floats) for numerical operations like addition. In
your case, you're trying to concatenate a string ("hello world") The error occurs
because Python does not allow direct concatenation of strings and integers without
explicitly handling them as separate types first (i.e., by conversion). The
solution is straightforward: convert both elements to com Here's an expanded
explanation along with your corrected code:

```python
try:
    # Initialize variable 'a' as 1234 (an integer)
    a = 1234
    # Convert 'a' from int to str and then concatenate " hello world"
    print(str(a) + "hello world")
except Exception as error:  # Catch any exceptions that might occur
    print("Oops! An exception has occured: ", error)
    # Print the type of the caught exception for debugging purposes.
    print("Exception TYPE:", type(error))
    # Explicitly stating what class TypeError is expected in this context,
    # though it's redundant since we've already captured and printed it above.
    print("Exception TYPE: <class 'TypeError'>")
```
In summary, converting an integer to a string before concatenation solves the issue
by ensuring both elements are strings. This allows for seamless addition
(concatenation) of these two pieces into one coherent output.
Remember that this approach is not limited just to adding integers and strings; it's
applicable whenever you need to concatenate different data types in Python,
provided they can be converted or handled as compatible formats first.

As seen from the above output, the system not only explains the error with sufficient reasoning but also provides a solution along with the reasoning behind it.
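The failure diagnosed above can be reproduced in a few lines. This sketch assumes the screenshot's code added an integer to a string, as the agent's output suggests; the variable names are illustrative.

```python
a = 1234

try:
    result = a + "hello world"  # int + str is not defined in Python
except TypeError as error:
    failure = type(error).__name__
    # Explicit conversion makes both operands strings, so + concatenates.
    result = str(a) + "hello world"
```

Catching `TypeError` specifically, rather than a bare `Exception`, keeps unrelated bugs from being silently swallowed by the same handler.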

Analyzing with Other Queries

Let us try another screenshot containing a different Python coding issue.

[Image: second input screenshot]

Output From Agentic System


Certainly! Here is an expanded explanation of how we can solve any potential issues
with this corrected version:
1. **Syntax Correction**: The original error was due to improper indentation, which
caused a `IndentationError`. In Python, proper block structure and consistent
indentations are crucial for the interpreter.
2. **Handling ZeroDivisionError Properly**:
- A division by zero is an exceptional case that needs special handling.
- Using try-except blocks allows us gracefully handle this exception without
crashing our program unexpectedly (i.e., `ZeroDivisionError`).
3. **Logging and Debugging Information**:
- The use of the logging module (`log.debug`) helps in tracking what values are being
processed, which is useful for debugging.
Here's a complete example with proper indentation:
```python
import logging
# Configure logger to display debug level messages.
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)
def division(a, b):
    log.debug(f"Dividing {a} by {b}.")
    try:
        return a / b  # Attempt the division operation here
    except ZeroDivisionError as e:  # Catch and handle zero-division error specifically.
        log.exception("Oh noes!")  # Log an exception message with traceback
# Example usage of function to demonstrate logging output
if __name__ == "__main__":
    print(division(10, 2))  # Should work fine
    try:
        division(5, 0)  # This should raise a ZeroDivisionError and log it.
    except Exception as e:  # Catch any unexpected exceptions (though unlikely here)
        pass
```

As seen from the above output, the error is correctly explained with sufficient reasoning. Additionally, an alternative coding solution is provided to avoid this error, along with enough reasoning.

Let us try with another query screenshot with a different Python coding issue.

[Image: third input screenshot, a Python code error]

Output From Agentic System


Certainly! Here's an expanded version of how to solve this error based on the
previous output:
1. **Identify Missing Dependency:**
The error message indicates that there is a missing dependency, specifically for the
`packaging` module which setuptools requires.
2. **Install 'packaging' Package Manually (if not already installed):**
```bash
pip install packaging
```
3. **Upgrade Setuptools to Ensure Compatibility:**
Upgrading setuptools can help resolve any compatibility issues and ensure that all
dependencies are correctly managed:
```bash
pip install --upgrade setuptools
```
4. **Re-run the Installation Command:**
After installing `packaging` manually (if it wasn't installed previously) or upgrading
setuptools, re-execute your original installation command to see if this
resolves any issues.
5. **Verify Dependency Resolution and Reinstallation Attempts:**
If you encounter further errors related specifically to missing dependencies after
following the above steps:
- Check for additional required packages by reviewing error messages.
- Install those specific requirements using pip, e.g., `pip install <missing-package-name>`.
6. **Check Environment Consistency:**
Ensure that your Python environment is consistent and not conflicting with other
installations or virtual environments:
```bash
# List installed packages to verify consistency across different setups (if applicable)
pip list
# If using a specific version of setuptools, ensure it's correctly configured:
```
7. **Consult Documentation:**
Refer to the official documentation for both `packaging` and `setuptools` if you
encounter persistent issues or need more detailed guidance on resolving complex
dependency problems.
8. **Seek Community Help (if needed):**
If after following these steps, you're still facing difficulties:
- Post a question with specific error messages in relevant forums like Stack
Overflow.
- Provide details about your environment setup and the commands you've run for
better assistance from community members or experts.
By carefully addressing each step above based on what you encounter during
installation attempts (as indicated by any new errors), you'll be able to resolve
missing dependencies effectively. This systematic approach ensures that all
required packages are correctly installed

Conclusion

Integrating a multi-agent system (MAS) to automatically detect coding errors from screenshots offers significant improvements in developer efficiency. By leveraging AI and tools like Griptape, this approach provides timely and accurate solutions with detailed reasoning, saving valuable time for developers. Additionally, the flexibility and scalability of MAS can be applied across various industries, enabling seamless task management and enhanced productivity.

Key Takeaways

  • Integrating an automated system to identify coding errors from screenshots saves developers significant time by providing accurate solutions with detailed reasoning, reducing the need for manual error searching.
  • MAS is a decentralized architecture that uses autonomous agents to collaborate in solving complex problems, enhancing task management and scalability across industries.
  • The Griptape framework simplifies the development of multi-agent systems, offering modular design, agent specialization, secure workflows, and scalability, making it ideal for enterprise-level AI solutions.
  • MAS can dynamically adapt to changing conditions, making them ideal for real-time applications such as coding error detection, customer support automation, and data analysis.

Frequently Asked Questions

Q1. What is a multi-agent system (MAS), and how does it work?

A. A multi-agent system (MAS) consists of multiple autonomous agents that work together in a decentralized manner to solve complex problems. These agents communicate and collaborate within a shared environment to achieve individual and collective objectives, using their own localized data to make informed decisions.

Q2. How can a multi-agent system improve coding error detection from screenshots?

A. By integrating AI-driven multi-agent systems, developers can automate the process of detecting coding errors in screenshots. These systems can analyze the visual data, identify mistakes, and provide solutions with logical explanations, significantly reducing the time spent manually searching for errors.

Q3. What is Griptape, and how does it aid in developing multi-agent systems?

A. Griptape is a flexible framework designed for developing multi-agent systems. It simplifies the creation of collaborative AI agents by providing modular design, secure workflows, and scalability, making it suitable for complex applications like error detection, customer support automation, and real-time data analysis.

Q4. What are some real-world applications of multi-agent systems?

A. Multi-agent systems are used in various industries, such as customer support (e.g., ticket assignment agents), content quality control (e.g., redundancy deduction agents), and knowledge base enhancement (e.g., knowledge gap analysis agents). These systems help streamline processes, improve productivity, and maintain quality standards.

Q5. How does Griptape ensure security and scalability in multi-agent systems?

A. Griptape ensures security by implementing off-prompt data handling, which minimizes the exposure of sensitive information, and permission controls to restrict tool usage based on agent roles. It also supports cloud integration, allowing for scalable deployment of agents and facilitating horizontal scaling as system demands grow.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Nibedita completed her master’s in Chemical Engineering from IIT Kharagpur in 2014 and is currently working as a Senior Data Scientist. In her current capacity, she works on building intelligent ML-based solutions to improve business processes.
