Can AI detect and fix coding errors just by analyzing a screenshot? With a Multi-Agent System for Automatic Code Error Detection, the answer is yes. This innovative approach uses artificial intelligence and reasoning to identify coding mistakes from images, propose accurate solutions, and explain the logic behind them. At the core is a decentralized Multi-Agent System, where autonomous agents—such as AI models, tools, or services—work collaboratively. Each agent gathers data, makes localized decisions, and contributes to solving complex debugging tasks. By automating this process, developers can save time, improve accuracy, and avoid the manual hassle of searching for solutions online.
This article was published as a part of the Data Science Blogathon.
Multi-Agent Systems (MAS) represent intricate frameworks consisting of numerous interactive intelligent agents, each possessing unique skills and objectives. These agents can take various forms, including software applications, robotic entities, drones, sensors, and even humans, or a blend of these elements. The primary purpose of MAS is to address challenges that individual agents struggle to manage independently by harnessing the power of collective intelligence, collaboration, and coordinated efforts among the agents.
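Multi-agent systems are transforming various industries by enabling intelligent collaboration among autonomous agents. Practical examples include ticket assignment agents in customer support, redundancy deduction agents in content quality control, and knowledge gap analysis agents in knowledge base enhancement.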
The Griptape framework streamlines the development of collaborative AI agents by balancing predictability and creativity through modular design and secure workflows.
Griptape enables developers to define agents with distinct roles, such as:
Agents interact through workflows that parallelize tasks while maintaining dependencies. For example, a research agent’s findings can trigger multiple writer agents to generate content simultaneously.
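The fan-out pattern described above can be sketched with the Python standard library alone. This is an illustration of the idea, not Griptape's implementation: `research`, `write`, and `run_pipeline` are hypothetical stand-ins for agents, and a thread pool plays the role of the workflow engine.

```python
from concurrent.futures import ThreadPoolExecutor


def research(topic: str) -> str:
    """Hypothetical stand-in for a research agent; returns findings as text."""
    return f"findings about {topic}"


def write(style: str, findings: str) -> str:
    """Hypothetical stand-in for a writer agent that depends on the research output."""
    return f"[{style}] draft based on {findings}"


def run_pipeline(topic: str, styles: list[str]) -> list[str]:
    # The research step must finish first because every writer depends on it.
    findings = research(topic)
    # The writers do not depend on each other, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=len(styles)) as pool:
        # pool.map returns results in the same order as `styles`.
        return list(pool.map(lambda style: write(style, findings), styles))


drafts = run_pipeline("multi-agent systems", ["formal", "casual"])
for draft in drafts:
    print(draft)
```

The dependency (research before writing) is enforced by ordinary sequencing, while the independent writer steps share a pool of workers.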
The framework supports two approaches:
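Key safeguards include off-prompt data handling, which minimizes the exposure of sensitive information, and permission controls that restrict tool usage based on agent roles.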
Griptape’s modular architecture reduces reliance on prompt engineering by prioritizing Python code for logic definition, making it well suited to enterprise-grade applications such as customer support automation and real-time data analysis pipelines.
In this tutorial, we will be creating a multi-agent system aimed at automatically detecting errors from coding screenshots, specifically using Python for our example. This system will not only identify errors but also offer users clear explanations on how to resolve them. Throughout this process, we will utilize vision language models in conjunction with reasoning-based large language models to enhance the functionality of our multi-agent framework.
First, we install the required libraries:
!pip install griptape
!sudo apt update
!sudo apt install -y pciutils
!pip install langchain-ollama
!curl -fsSL https://ollama.com/install.sh | sh
!pip install ollama==0.4.2
!pip install "duckduckgo-search>=7.0.1"
import os
import requests
from griptape.drivers.file_manager.local import LocalFileManagerDriver
from griptape.drivers.prompt.ollama import OllamaPromptDriver
from griptape.drivers.prompt.openai import OpenAiChatPromptDriver
from griptape.drivers.structure_run.local import LocalStructureRunDriver
from griptape.drivers.web_search.duck_duck_go import DuckDuckGoWebSearchDriver
from griptape.loaders import ImageLoader
from griptape.structures import Agent, Workflow
from griptape.tasks import PromptTask, StructureRunTask
from griptape.tools import FileManagerTool, ImageQueryTool, PromptSummaryTool, WebSearchTool
The following code starts the Ollama server. We also pull the “minicpm-v” model from Ollama so that this vision model can be used to extract text from the coding screenshots.
import threading
import subprocess
import time
def run_ollama_serve():
    subprocess.Popen(["ollama", "serve"])
thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)
!ollama pull minicpm-v
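A fixed `time.sleep(5)` may be too short on a slow machine. As an alternative, here is a minimal sketch that polls the server until it answers. The helper name `wait_for_server` and its parameters are hypothetical, and the address in the usage comment assumes Ollama is listening on its default port, 11434.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: wait a little and try again.
            time.sleep(interval)
    return False


# Example usage (assumes the default Ollama address):
# if not wait_for_server("http://localhost:11434"):
#     raise RuntimeError("Ollama server did not start in time")
```

Calling this right after starting the server thread would replace the fixed sleep with a bounded wait.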
Next, we set the OpenAI API key. The agents below that do not specify their own prompt driver rely on Griptape's default OpenAI driver, so the key is needed alongside the local Ollama models.
import os
os.environ["OPENAI_API_KEY"] = ""
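Rather than pasting the key into the notebook, it can be read from the environment and checked before any agent runs. The sketch below is one way to do that; `require_env` is a hypothetical helper, not part of Griptape or the OpenAI client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Example usage: fail early with a clear message instead of a later API error.
# api_key = require_env("OPENAI_API_KEY")
```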
We will also leverage a powerful LLM to assist with reasoning by explaining the coding error and the provided solution with sufficient context. For this, we pull the phi4-mini model.
!ollama pull phi4-mini
We start by creating an agent to analyze screenshots of Python coding errors. This agent uses a vision language model (minicpm-v) on the backend.
images_dir = os.getcwd()
def analyze_screenshots():
    driver = LocalFileManagerDriver(workdir=images_dir)
    return Agent(
        tools=[
            FileManagerTool(file_manager_driver=driver),
            ImageQueryTool(
                prompt_driver=OllamaPromptDriver(model="minicpm-v"),
                image_loader=ImageLoader(file_manager_driver=driver),
            ),
        ]
    )
We then create two more agents: one searches the web for possible solutions to the coding error, and the other reasons about the error and the solutions found.
def websearching_agent():
    return Agent(
        tools=[
            WebSearchTool(web_search_driver=DuckDuckGoWebSearchDriver()),
            PromptSummaryTool(off_prompt=False),
        ],
    )

def reasoning_agent():
    return Agent(
        prompt_driver=OllamaPromptDriver(
            model="phi4-mini",
        )
    )
We use a screenshot of a Python error as input and save it in our current working directory as “pythonerror1.jpg”. The agentic system first extracts the error text from the screenshot and searches for possible solutions. It then explains the reasoning behind the error and its solutions.
image_file_name = "pythonerror1.jpg"
team = Workflow()
screenshotanalysis_task = StructureRunTask(
    (
        """Extract IN TEXT FORMAT ALL THE LINES FROM THE GIVEN SCREEN SHOT %s""" % (image_file_name),
    ),
    id="research",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=analyze_screenshots,
    ),
)
findingsolution_task = StructureRunTask(
    (
        """FIND SOLUTION TO ONLY THE CODING ERRORS FOUND in the TEXT {{ parent_outputs["research"] }}. DO NOT INCLUDE ANY ADDITIONAL JUNK NON CODING LINES WHILE FINDING THE SOLUTION.
""",
    ),
    id="evaluate",
    structure_run_driver=LocalStructureRunDriver(
        create_structure=websearching_agent,
    ),
)
reasoningsolution_task = StructureRunTask(
    (
        """ADD TO THE PREVIOUS OUTPUT, EXPANDED VERSION OF REASONING ON HOW TO SOLVE THE ERROR BASED ON {{ parent_outputs["evaluate"] }}.
DO INCLUDE THE WHOLE OUTPUT FROM THE PREVIOUS AGENT {{ parent_outputs["evaluate"] }} AS WELL IN THE FINAL OUTPUT.
""",
    ),
    structure_run_driver=LocalStructureRunDriver(
        create_structure=reasoning_agent,
    ),
)
Now we will execute the workflow.
screenshotanalysis_task.add_child(findingsolution_task)
findingsolution_task.add_child(reasoningsolution_task)
screenshotanalysis_task.add_child(reasoningsolution_task)
team = Workflow(
    tasks=[screenshotanalysis_task, findingsolution_task, reasoningsolution_task],
)
answer = team.run()
print(answer.output)
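The `add_child` calls above define a small dependency graph: the solution search depends on the screenshot analysis, and the reasoning step depends on both. To see the order this implies, the graph can be sketched with the standard library's `graphlib`. This is only an illustration of the dependency structure, not how Griptape schedules tasks internally. The names "research" and "evaluate" are the task ids from the code, while "reasoning" is a label assumed here because the third task has no id.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on (its parents).
dependencies = {
    "research": set(),                      # screenshot analysis has no parents
    "evaluate": {"research"},               # solution search needs the extracted text
    "reasoning": {"research", "evaluate"},  # reasoning needs both earlier outputs
}

# static_order() yields every task after all of its parents.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['research', 'evaluate', 'reasoning']
```

Because each task has a parent in the previous step, the three tasks here run one after another. Parallelism would only appear if two tasks shared the same parents without depending on each other.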
Input Screenshot
Certainly! Here is an expanded explanation of how you can solve this error in
Python:
When working with strings and integers together, it's important that both elements
are either numbers (integers or floats) for numerical operations like addition. In
your case, you're trying to concatenate a string ("hello world") The error occurs
because Python does not allow direct concatenation of strings and integers without
explicitly handling them as separate types first (i.e., by conversion). The
solution is straightforward: convert both elements to com Here's an expanded
explanation along with your corrected code:
```python
try:
    # Initialize variable 'a' as 1234 (an integer)
    a = 1234
    # Convert 'a' from int to str and then concatenate " hello world"
    print(str(a) + "hello world")
except Exception as error:  # Catch any exceptions that might occur
    print("Oops! An exception has occured: ", error)
    # Print the type of the caught exception for debugging purposes.
    print("Exception TYPE:", type(error))
    # Explicitly stating what class TypeError is expected in this context,
    # though it's redundant since we've already captured and printed it above.
    print("Exception TYPE: <class 'TypeError'>")
```
In summary, converting an integer to a string before concatenation solves the issue
by ensuring both elements are strings. This allows for seamless addition
(concatenation) of these two pieces into one coherent output.
Remember that this approach is not limited just to adding integers and strings; it's
applicable whenever you need to concatenate different data types in Python,
provided they can be converted or handled as compatible formats first.
As the output above shows, the system both explains the error correctly and justifies the proposed solution.
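To make the underlying error concrete, here is a minimal sketch that reproduces it and applies the fix. The exact code in the screenshot is not reproduced here, so the variable names and the operand order are assumptions.

```python
a = 1234

try:
    # Adding an int and a str raises TypeError in Python.
    message = a + " hello world"
except TypeError as error:
    print(f"Caught: {error}")
    # Fix: format the integer into the string (str(a) + " hello world" also works).
    message = f"{a} hello world"

print(message)
```

An f-string avoids the explicit `str()` call and tends to read more clearly when several values are combined.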
Let us try another query screenshot with a different Python coding issue.
Output From Agentic System
Certainly! Here is an expanded explanation of how we can solve any potential issues
with this corrected version:
1. **Syntax Correction**: The original error was due to improper indentation, which
caused a `IndentationError`. In Python, proper block structure and consistent
indentations are crucial for the interpreter.
2. **Handling ZeroDivisionError Properly**:
- A division by zero is an exceptional case that needs special handling.
- Using try-except blocks allows us gracefully handle this exception without
crashing our program unexpectedly (i.e., `ZeroDivisionError`).
3. **Logging and Debugging Information**:
- The use of the logging module (`log.debug`) helps in tracking what values are being
processed, which is useful for debugging.
Here's a complete example with proper indentation:
```python
import logging

# Configure logger to display debug level messages.
logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger(__name__)

def division(a, b):
    log.debug(f"Dividing {a} by {b}.")
    try:
        return a / b  # Attempt the division operation here
    except ZeroDivisionError as e:  # Catch and handle zero-division error specifically.
        log.exception("Oh noes!")  # Log an exception message with traceback

# Example usage of function to demonstrate logging output
if __name__ == "__main__":
    print(division(10, 2))  # Should work fine
    try:
        division(5, 0)  # This should raise a ZeroDivisionError and log it.
    except Exception as e:  # Catch any unexpected exceptions (though unlikely here)
        pass
```
As seen from the above output, the error is correctly explained with sufficient reasoning. Additionally, an alternative coding solution is provided to avoid this error, along with enough reasoning.
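One detail in the generated code is easy to miss: when the division fails, the function logs the exception and then returns `None` implicitly. A small variation, sketched below, makes that behaviour explicit in the signature and the return statement. The name `safe_divide` is hypothetical and is not part of the system's output.

```python
import logging
from typing import Optional

log = logging.getLogger(__name__)


def safe_divide(a: float, b: float) -> Optional[float]:
    """Return a / b, or None (after logging the traceback) when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        log.exception("Division by zero: %s / %s", a, b)
        return None


print(safe_divide(10, 2))
print(safe_divide(5, 0))
```

Callers can then test the result against `None` instead of relying on a side effect in the log.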
Let us try with another query screenshot with a different Python coding issue.
Output From Agentic System
Certainly! Here's an expanded version of how to solve this error based on the
previous output:
1. **Identify Missing Dependency:**
The error message indicates that there is a missing dependency, specifically for the
`packaging` module which setuptools requires.
2. **Install 'packaging' Package Manually (if not already installed):**
```bash
pip install packaging
```
3. **Upgrade Setuptools to Ensure Compatibility:**
Upgrading setuptools can help resolve any compatibility issues and ensure that all
dependencies are correctly managed:
```bash
pip install --upgrade setuptools
```
4. **Re-run the Installation Command:**
After installing `packaging` manually (if it wasn't installed previously) or upgrading setuptools, re-execute your original installation command to see if this
resolves any issues.
5. **Verify Dependency Resolution and Reinstallation Attempts:**
If you encounter further errors related specifically to missing dependencies after
following the above steps:
- Check for additional required packages by reviewing error messages.
- Install those specific requirements using pip, e.g., `pip install <missing-package-name>`.
6. **Check Environment Consistency:**
Ensure that your Python environment is consistent and not conflicting with other
installations or virtual environments:
```bash
# List installed packages to verify consistency across different setups (if applicable)
pip list
# If using a specific version of setuptools, ensure it's correctly configured:
```
7. **Consult Documentation:**
Refer to the official documentation for both `packaging` and `setuptools` if you
encounter persistent issues or need more detailed guidance on resolving complex
dependency problems.
8. **Seek Community Help (if needed):**
If after following these steps, you're still facing difficulties:
- Post a question with specific error messages in relevant forums like Stack
Overflow.
- Provide details about your environment setup and the commands you've run for
better assistance from community members or experts.
By carefully addressing each step above based on what you encounter during
installation attempts (as indicated by any new errors), you'll be able to resolve
missing dependencies effectively. This systematic approach ensures that all
required packages are correctly installed
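The dependency check suggested in the output above can also be done from Python before re-running an installation. The sketch below uses the standard library's `importlib.util.find_spec`. `missing_modules` is a hypothetical helper, and it is meant for top-level module names only.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: check the two packages mentioned in the error message.
print(missing_modules(["packaging", "setuptools"]))
```

An empty list means both modules can be found in the current environment. Any names in the list are candidates for `pip install`.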
Integrating a multi-agent system (MAS) to automatically detect coding errors from screenshots offers significant improvements in developer efficiency. By leveraging AI and tools like Griptape, this approach provides timely and accurate solutions with detailed reasoning, saving valuable time for developers. Additionally, the flexibility and scalability of MAS can be applied across various industries, enabling seamless task management and enhanced productivity.
A. A multi-agent system (MAS) consists of multiple autonomous agents that work together in a decentralized manner to solve complex problems. These agents communicate and collaborate within a shared environment to achieve individual and collective objectives, using their own localized data to make informed decisions.
A. By integrating AI-driven multi-agent systems, developers can automate the process of detecting coding errors in screenshots. These systems can analyze the visual data, identify mistakes, and provide solutions with logical explanations, significantly reducing the time spent manually searching for errors.
A. Griptape is a flexible framework designed for developing multi-agent systems. It simplifies the creation of collaborative AI agents by providing modular design, secure workflows, and scalability, making it suitable for complex applications like error detection, customer support automation, and real-time data analysis.
A. Multi-agent systems are used in various industries, such as customer support (e.g., ticket assignment agents), content quality control (e.g., redundancy deduction agents), and knowledge base enhancement (e.g., knowledge gap analysis agents). These systems help streamline processes, improve productivity, and maintain quality standards.
A. Griptape ensures security by implementing off-prompt data handling, which minimizes the exposure of sensitive information, and permission controls to restrict tool usage based on agent roles. It also supports cloud integration, allowing for scalable deployment of agents and facilitating horizontal scaling as system demands grow.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.