SQL is one of the most important languages in computing. It is the primary means of communicating with relational databases, where most organizations store their crucial data. SQL plays a central role in analyzing complex data, building data pipelines, and managing data warehouses efficiently. However, writing optimized SQL queries can be challenging and cumbersome. Thanks to the rapid progress in AI over the past few years, we now have AI agents augmented with Large Language Models that can write queries on our behalf.
This article demonstrates how to build an AI agent using CrewAI, Composio, and Gemini to access databases and execute SQL queries to retrieve data.
This article was published as a part of the Data Science Blogathon.
CrewAI is an open-source collaborative multi-agent framework. It lets you build a crew of AI agents with various tasks, tools, roles, and motivations akin to a real-world crew. CrewAI manages the flow of information from one agent to another, letting you build autonomous efficient agentic workflows.
CrewAI mainly consists of five core components: Agents, Tasks, Tools, Processes, and Crews.
Here is a mind map of CrewAI.
Composio is an open-source platform that provides tooling solutions for building reliable and useful AI agents. Composio provides over 150 tools and applications with built-in user authentication and authorization to help developers build reliable, secure, and production-ready agentic workflows. The tools have been designed from the ground up keeping real-world readiness of agents in mind.
Composio offers several advantages over other tooling solutions, including managed user authentication and authorizations, a wide array of tools and integrations, a dashboard for monitoring live integrations, and the flexibility to add custom tools.
Composio is built around four key concepts: apps, actions, triggers, and entities.
Here, you will create an agentic system, which takes a user query regarding the data stored in a database, fetches the relevant data using an SQL agent, and creates nice plots to visualize the data. For this workflow, we shall use CrewAI to orchestrate agents and Composio for tooling support.
The agentic system will have an SQL query writer agent and a coding agent to write and execute the queries. The SQL agent will have access to Composio's SQL tool, and the coding agent will be able to use E2B's Code Interpreter via Composio. The Code Interpreter provides a sandboxed environment for executing Python programs.
The SQL agent will connect to a local database and query from an appropriate table. The data fetched from the SQL agent will be used by the Coding agent to create plots. The crew will return the plot as the final output.
To run this project, you will need a Composio API key; first, create an account on Composio (we will retrieve the key from the Composio CLI below). Also, get a free Gemini API key from Google AI Studio, but remember that the free tier is rate-limited, so execution may take longer.
Furthermore, you will need a dummy database for executing queries. If you do not have a spare database, run the following code to create an employee table with names, departments, and salaries.
import sqlite3
# Connect to the SQLite database
connection = sqlite3.connect('company.db')
# Create a cursor object
cursor = connection.cursor()
# Create the 'employee' table
create_table_query = '''
CREATE TABLE IF NOT EXISTS employee (
id INTEGER PRIMARY KEY AUTOINCREMENT,
employee_name TEXT NOT NULL,
department TEXT NOT NULL,
salary INTEGER NOT NULL
);
'''
cursor.execute(create_table_query)
# Data to insert into the 'employee' table
employees = [
("John Doe", "Engineering", 70000),
("Jane Smith", "Human Resources", 65000),
("Alice Johnson", "Marketing", 72000),
("Bob Brown", "Sales", 68000),
("Charlie Black", "Engineering", 71000),
("Daisy White", "Human Resources", 66000),
("Edward Green", "Marketing", 69000),
("Fiona Grey", "Sales", 64000),
("George Yellow", "Engineering", 73000),
("Hannah Blue", "Human Resources", 61000),
("Ivan Purple", "Marketing", 75000),
("Jessica Cyan", "Sales", 70000),
("Kyle Red", "Engineering", 68000),
("Lily Orange", "Human Resources", 67000),
("Martin Indigo", "Marketing", 72000),
("Nina Teal", "Sales", 65000),
("Oscar Lime", "Engineering", 73000),
("Penny Olive", "Human Resources", 62000),
("Quentin Silver", "Marketing", 74000),
("Rachel Maroon", "Sales", 69000),
("Steve Pink", "Engineering", 71000),
("Tina Violet", "Human Resources", 68000),
("Ursula Gold", "Marketing", 76000),
("Victor Bronze", "Sales", 64000),
("Wendy Mauve", "Engineering", 69000),
("Xavier Cream", "Human Resources", 65000),
("Yolanda Peach", "Marketing", 70000),
("Zack Sage", "Sales", 68000),
("Abby Coral", "Engineering", 72000),
("Bill Moss", "Human Resources", 63000)
]
# Insert data into the 'employee' table
insert_query = 'INSERT INTO employee (employee_name, department, salary) VALUES (?, ?, ?);'
cursor.executemany(insert_query, employees)
# Commit the changes
connection.commit()
# Close the connection
connection.close()
print("Table created and data inserted successfully.")
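Before wiring up the agents, it is worth sanity-checking the kind of GROUP BY query the SQL agent will eventually run. The sketch below is self-contained: it builds the same schema in an in-memory database and seeds it with a small hypothetical sample, so the numbers are illustrative only. Replace ":memory:" with "company.db" to run the same aggregation against the file created above.

```python
import sqlite3

# In-memory DB so the snippet runs standalone;
# use "company.db" instead to query the real file.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("""
CREATE TABLE employee (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    employee_name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary INTEGER NOT NULL
);
""")

# Small hypothetical sample, for illustration only.
sample = [
    ("John Doe", "Engineering", 70000),
    ("Charlie Black", "Engineering", 71000),
    ("Jane Smith", "Human Resources", 65000),
]
cursor.executemany(
    "INSERT INTO employee (employee_name, department, salary) VALUES (?, ?, ?);",
    sample,
)

# The kind of aggregation the SQL agent is expected to produce.
rows = cursor.execute("""
SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM employee
GROUP BY department
ORDER BY department;
""").fetchall()

for department, headcount, avg_salary in rows:
    print(department, headcount, avg_salary)
connection.close()
```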
Now that the basics are covered, we can start with the coding part. As with any Python project, we will first set up a virtual environment and environment variables and install the libraries. The project uses Gemini 1.5 Pro as the language model.
Create a virtual environment using Python Venv.
python -m venv sqlagent
cd sqlagent
source bin/activate
Install the required libraries using pip. Note that the pip package providing import dotenv is python-dotenv.
pip install composio-core composio-crewai langchain-google-genai python-dotenv
To use Composio toolsets, you need to authenticate your Composio account. Run the below command to log in to Composio and follow the login flow.
composio login
Now, get your Composio API key.
composio whoami
Create a .env file and add COMPOSIO_API_KEY and GOOGLE_API_KEY variables to it.
COMPOSIO_API_KEY=your Composio API key
GOOGLE_API_KEY=your Gemini API key
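For reference, load_dotenv simply reads KEY=value pairs from the .env file into the process environment. Here is a minimal stdlib-only sketch of that behavior (a simplification: the real python-dotenv also handles quoting, comments inside values, and interpolation). The keys shown are placeholders, not real credentials.

```python
import os

def load_env_file(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Example .env content with placeholder keys.
content = "COMPOSIO_API_KEY=abc123\nGOOGLE_API_KEY=xyz789\n"
env = load_env_file(content)
os.environ.update(env)  # what load_dotenv does after parsing
print(env["COMPOSIO_API_KEY"])
```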
Now, create a Python file and import the necessary libraries.
import os
import dotenv
from composio_crewai import App, ComposioToolSet
from crewai import Agent, Crew, Process, Task
from langchain_google_genai import ChatGoogleGenerativeAI
# Load environment variables from the .env file
dotenv.load_dotenv()
Define Composio tools.
# Initialize the ComposioToolSet
toolset = ComposioToolSet(api_key=os.environ["COMPOSIO_API_KEY"])
code_interpreter_tools = toolset.get_tools([App.CODEINTERPRETER])
sql_tools = toolset.get_tools([App.SQLTOOL])
Now, define the LLM with Gemini 1.5 Pro.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro",
api_key=os.environ['GOOGLE_API_KEY']
)
As discussed earlier, we will define two agents, the SQL agent and the coding agent, along with their respective tasks.
code_interpreter_agent = Agent(
role="Python Code Interpreter Agent",
goal="Run code to achieve a task given by the user",
backstory="You are an agent that helps users run Python code.",
verbose=True,
tools=code_interpreter_tools,
llm=llm,
)
sql_agent = Agent(
role="SQL Agent",
goal="Run SQL queries to achieve a task given by the user",
backstory=(
"You are an agent that helps users run SQL queries. "
"Connect to the local SQLite DB at connection string = company.db. "
"Try to analyze the tables first by listing all the tables and columns "
"and checking distinct values for each column, and once sure, make a "
"query to get the data you need."
),
verbose=True,
tools=sql_tools,
llm=llm,
allow_delegation=True,
)
In the above code snippet, we defined the agents, each with a role, goal, and backstory. This extra information gives the LLM context before it generates responses to queries. Tools equip each agent to perform its actions.
Now, define the tasks. The main_task variable holds the user's request; for example:
main_task = "Create a bar plot of the number of employees in each department."
code_interpreter_task = Task(
description=f"Run Python code to achieve the task - {main_task}. \
Exit once the image has been created.",
expected_output="Python code executed successfully. Return the image path.",
agent=code_interpreter_agent,
)
sql_task = Task(
description=f"Run SQL queries to achieve a task - {main_task}",
expected_output=f"SQL queries executed successfully. The result of the task \
is returned - {main_task}",
agent=sql_agent,
)
We defined the tasks that the agents will perform. Each task has a description, expected output, and the agent responsible for performing it.
Now, define the Crew with the agents and the tasks.
crew = Crew(
agents=[sql_agent, code_interpreter_agent],
tasks=[sql_task, code_interpreter_task],
)
result = crew.kickoff()
print(result)
You can put this flow in a while loop to make it interactive:
while True:
    main_task = input("Enter the task you want to perform (or type 'exit' to quit): ")
    if main_task.lower() == "exit":
        break
    code_interpreter_agent = Agent(
        role="Python Code Interpreter Agent",
        goal="Run code to achieve a task given by the user",
        backstory="You are an agent that helps users run Python code.",
        verbose=True,
        tools=code_interpreter_tools,
        llm=llm,
    )
    sql_agent = Agent(
        role="SQL Agent",
        goal="Run SQL queries to achieve a task given by the user",
        backstory=(
            "You are an agent that helps users run SQL queries. "
            "Connect to the local SQLite DB at connection string = company.db. "
            "Try to analyze the tables first by listing all the tables and columns "
            "and checking distinct values for each column, and once sure, make a "
            "query to get the data you need."
        ),
        verbose=True,
        tools=sql_tools,
        llm=llm,
        allow_delegation=True,
    )
    code_interpreter_task = Task(
        description=f"Run Python code to achieve the task - {main_task}. "
                    "Exit once the image has been created.",
        expected_output="Python code executed successfully. Return the image path.",
        agent=code_interpreter_agent,
    )
    sql_task = Task(
        description=f"Run SQL queries to achieve a task - {main_task}",
        expected_output=f"SQL queries executed successfully. The result of the task "
                        f"is returned - {main_task}",
        agent=sql_agent,
    )
    crew = Crew(
        agents=[sql_agent, code_interpreter_agent],
        tasks=[sql_task, code_interpreter_task],
    )
    result = crew.kickoff()
    print(result)
This will prompt you to enter a query, which will then be passed to the Crew of AI agents. After execution, you will have the option to either ask another question or exit the loop.
Once the execution of a query is completed, it will output the file path for the plot’s image.
I asked it to create a bar plot of the number of employees in each department. This was the outcome.
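The agent's bar plot can be cross-checked directly against the seed data: the departments in the insert list above cycle through Engineering, Human Resources, Marketing, and Sales seven times, followed by two extra rows (Engineering and Human Resources), so the expected bar heights can be computed with a quick count:

```python
from collections import Counter

# Departments of the 30 seed rows, in insertion order:
# seven full Engineering/HR/Marketing/Sales cycles plus two extra rows.
departments = (
    ["Engineering", "Human Resources", "Marketing", "Sales"] * 7
    + ["Engineering", "Human Resources"]
)

# Expected bar heights for the agent's plot.
counts = Counter(departments)
print(counts)
```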
GitHub Gist: SQLagent
These steps demonstrated how to build an agentic workflow that automates SQL data extraction and visualization. However, you can go further and make it more robust and reliable by adding a memory component to the agents and the Crew. This helps the agents remember their past outcomes, letting them steer the workflow better. You can also add a frontend with Streamlit or Gradio to make it more interactive.
The AI landscape is evolving at an unprecedented pace. As the quality of AI models, frameworks, and tools continues to improve, building powerful AI agents is becoming increasingly convenient each day. The future of the workforce is agentic, where humans and AI will complement each other to create even more efficient systems. With frameworks like CrewAI and Composio, you can conveniently create AI workflows to automate many routine tasks. This article demonstrates how to automate data extraction and visualization. You can expand this workflow to handle even more complex scenarios.
Q1. What is CrewAI?
A. CrewAI is an open-source agent orchestration framework for building role-playing and collaborative agents.
Q2. How is CrewAI different from Autogen?
A. In Autogen, orchestrating agents' interactions requires additional programming, which can become complex and cumbersome as the scale of tasks grows. CrewAI offers a simpler multi-agent implementation.
Q3. What is CrewAI used for?
A. CrewAI lets you build collaborative multi-agent AI systems to accomplish complex automation workflows.
Q4. Is CrewAI free to use?
A. Yes. CrewAI is an open-source AI agent orchestration framework distributed under the MIT license.
Q5. What is an SQL agent?
A. An SQL agent is AI-augmented software that can autonomously perform SQL operations such as querying, inserting, deleting, and updating data.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.