Building a Conversational AI SQL Assistant with LangChain, GROQ, and Streamlit

Gourav Lohar Last Updated : 14 Nov, 2024
8 min read

Have you ever wished you could simply chat with your database, asking questions in plain language and getting instant, relevant answers? Imagine the possibilities – no more complex SQL queries or digging through spreadsheets. Well, with the power of LangChain and its new SQL toolkit, that’s exactly what you can do!

Diving into the new exciting world of conversational database interactions, we will explore how with GROQ API, Streamlit, and LangChain SQL toolkit, you can create your chat application in your MySQL database. Be it as a data enthusiast, developer, or just enthusiastic to see potential, everyone is going to gain value insights and some practical knowledge to unseal the true power of your data.

In this article, you will discover how to leverage SQL Assistant tools like langchain_groq and Streamlit to build interactive applications. We’ll explore how to use Streamlit Python to create an effective AI SQL Assistant that simplifies data querying and enhances your data science projects.

Learning Outcomes

  • Understand how to integrate LangChain’s SQL toolkit with a MySQL database to enable natural language queries.
  • Learn how to use the GROQ API for efficient and precise data retrieval.
  • Gain insights into building an interactive UI for database interaction using Streamlit.
  • Explore best practices for establishing database connections with SQLAlchemy.
  • Discover how to create a conversational AI-powered SQL assistant with seamless query handling.

This article was published as a part of the Data Science Blogathon.

Chat and Visualize MySQL Data with LangChain, GROQ, and Streamlit

Databases-the warehouses of highly useful, structured information-have been the backbone of thousands and thousands of applications. In robustness, ease of use, and wide usage, however, one stands out: MySQL, the popular open-source relational database management system. It offers a structured way of storing, organizing, and retrieving data in an efficient manner, making it everyone’s go-to business or developer’s choice.

However, to a nontechnical individual, the direct manipulation of MySQL or any other database, for that matter, would be akin to trying to crack a cryptic code. Powerful as they are, the old way of using SQL queries often acts as a barrier against access into insights locked away in tables and columns.

From SQL Queries to Natural Language Conversations

But what if we could converse with our MySQL database in natural language, much like chatting with a colleague? What if we could visualize the data instantly in beautiful charts and graphs, making trends and patterns leap off the screen?

That brings us to the exciting trio of LangChain, GROQ, and Streamlit. LangChain translates our human-like questions into structured languages understandable to the database for which it uses its SQL toolkit. GROQ fetches the specific data that we are looking for with pinpoint accuracy due to its intuitive API. Finally, Streamlit is a powerful Python library for building web applications, providing the canvas that allows us to paint the masterpiece driven by our data. Streamlit supports simplicity and ease in rapid development, making it easier to build interactive interfaces with minimal effort.

Read about the Guide to Understand and Implement Natural Language Processing

Imagine asking your MySQL database: “What were our best-selling products last quarter?” “Here is a graph of customer growth over the past year.” Now you can, thanks to LangChain, GROQ, and Streamlit. In this tutorial, we will learn how to build a powerful, user-friendly application that lets you chat with your MySQL database like an informed assistant while enjoying the real-time data presentation beauty of Streamlit.

Get ready to unlock the full potential of your MySQL data, making it accessible, understandable, and actionable like never before.

Tech Stack Used

Database Queries with LangChain’s SQL Toolkit

LangChain is therefore an adaptable platform for AI that enables the interaction between a human and a variety of data sources to flow. Much like a bridge helps connect our natural expressions in language with the structured world of databases, APIs, and other kinds of knowledge repositories.

The LangChain SQL toolkit acts as an intelligent translator, converting our human-like questions into SQL queries that the database understands. It leverages the power of language models to grasp the intent behind our requests and generate the corresponding SQL code. This not only simplifies the data retrieval process but also opens up database access to a wider audience, including those without extensive SQL knowledge.

Setting Up Required Packages

To get your application up and running smoothly, you’ll need to install several essential packages. This step ensures that all necessary libraries and tools are available for the seamless integration of LangChain, GROQ, and Streamlit with your MySQL database.

pip install streamlit
pip install mysql-connector-python
pip install langchain-community
pip install python-dotenv
pip install langchain
pip install langchain-groq
pip install sqlalchemy

Developing Your MySQL Chat Interface

Let’s bring our concepts to life! The following code combines the power of LangChain’s SQL toolkit, GROQ’s data fetching capabilities, and Streamlit’s intuitive UI to create a conversational interface for your MySQL database. We’ll walk through the key steps, from establishing the database connection to handling user queries and displaying results.

SQL Code for Database Initialization

To get started, we need to set up our MySQL database and define its schema. Below is the SQL code required to initialize the database and create the necessary tables for storing user and post information.

create database instagramdb;
use instagramdb;
create table if not exists users(
userId int primary key,
userName varchar(50),
email varchar(100)
);
create table if not exists posts(
postId int primary key,
userId int,
caption varchar(100)
);

insert into users(userId,userName,email)
values
(1,"Gourav","[email protected]"),
(2,"Tushar","[email protected]"),
(3,"Adi","[email protected]");

insert into posts(postId,userId,caption)
values
(51,"587","rain"),
(2,"963","water"),
(3,"821","sunny");

Checkout this article about SQL For Data Science

Python Code for Interactive Application

To bring our concept to life, we’ll write the Python script that integrates LangChain, GROQ, and Streamlit. This code will set up the interactive chat interface, handle database connections, and enable natural language interactions with your MySQL database.

Import Required Libraries

Let us start by importing required libraries below:

import streamlit as st
from pathlib import Path
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities import SQLDatabase
from langchain.agents.agent_types import AgentType
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler
from langchain_community.agent_toolkits.sql.toolkit import SQLDatabaseToolkit
from sqlalchemy import create_engine
from langchain_groq import ChatGroq

These imports lay the foundation for building a Streamlit application that leverages LangChain, GROQ, and SQLAlchemy to enable natural language interactions with your MySQL database. The code sets the stage for creating an SQL agent that understands user queries, translates them into SQL, executes them on the database, and displays the results in a user-friendly manner within your Streamlit app.

Streamlit Configuration

In this section, we’ll set up the foundation for our Streamlit application, including its initial layout and configuration. This will establish the environment where users will interact with the AI SQL Assistant.

st.set_page_config(page_title="AI SQL Assistant", page_icon="🤖", layout="wide")

st.title("🤖 AI SQL Assistant")

MYSQL = "USE_MYSQL"

Sidebar and Database Configuration

We’ll dive into the sidebar setup, allowing users to input their MySQL connection details and API keys. This configuration ensures seamless connectivity to the database and integration with the necessary tools.

with st.sidebar:
    st.header("Database Configuration")
    selected_opt = st.radio("Choose an option", ["How to Use", "Connect to MySQL Database"]

    if selected_opt == "Connect to MySQL Database":
        db_uri = MYSQL
        mysql_host = st.text_input("MySQL Host")
        mysql_user = st.text_input("MySQL User")
        mysql_password = st.text_input("MySQL Password", type="password")
        mysql_db = st.text_input("MySQL Database")
        api_key = st.text_input("Groq API Key", type="password")
    else:
        db_uri = None
        api_key = None

    if st.button("Clear Chat History"):
        st.session_state["messages"] = []

    st.markdown("---")
    st.subheader("About")
    st.info("This AI SQL Assistant uses LangChain and Groq to provide a natural language interface to your MySQL database. Ask questions in plain English, and get SQL-powered answers!")
  • This code sets up the initial layout and configuration options for your AI SQL Assistant app.
  • The sidebar allows users to either learn how to use the app or provide their MySQL database connection details and GROQ API key.
  • The “Clear Chat History” button enables users to reset the conversation.
  • The “About” section provides a brief description of the app’s functionality.

Remember, this is just the initial setup. You’ll likely have additional code to handle the actual database connection, query processing using LangChain and GROQ, and displaying the results in the main area of the Streamlit app.

Learn How to Generate Your Own OpenAI API Key and Add Credits

Handling User Input and Query Execution

This part covers the core functionality of processing user queries and executing them against the MySQL database. We’ll detail how the application translates natural language inputs into SQL commands and displays the results interactively.

if db_uri == MYSQL and not api_key:
    st.sidebar.error("Please add the Groq API key")
    st.stop()

if api_key:
    llm = ChatGroq(groq_api_key=api_key, model_name="llama-3.1-70b-versatile", streaming=True)
  • This code demonstrates good practice by including input validation to improve the robustness of your application.
  • The use of st.stop() effectively prevents the app from running without the required API key.
  • The initialization of ChatGroq sets the stage for using GROQ’s capabilities within your LangChain-powered SQL agent.
@st.cache_resource(ttl="2h")
def configure_db(db_uri, mysql_host=None, mysql_user=None, mysql_password=None, mysql_db=None):
    if db_uri == MYSQL:
        if not (mysql_host and mysql_user and mysql_password and mysql_db):
            st.error("Please provide all MySQL connection details.")
            st.stop()
        return SQLDatabase(create_engine(f"mysql+mysqlconnector://{mysql_user}:{mysql_password}@{mysql_host}/{mysql_db}"))
  • The @st.cache_resource decorator optimizes performance by caching the database connection
  • Input validation helps prevent errors and improves the user experience
  • SQLAlchemy provides a robust way to connect to and interact with your MySQL database
  • LangChain’s SQLDatabase class bridges the gap between SQLAlchemy and LangChain’s natural language processing capabilities
if db_uri == MYSQL:
    db = configure_db(db_uri, mysql_host, mysql_user, mysql_password, mysql_db)
else:
    st.header("How to Use")
    st.markdown("""
    1. Select "Connect to MySQL Database" in the sidebar.
    2. Fill in your MySQL connection details and Groq API key.
    3. Once connected, you can start chatting with your SQL database!
    4. Ask questions in natural language, and the AI will translate them into SQL queries.
    """)
    st.stop()
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

streamlit_callback = StreamlitCallbackHandler(st.container())

agent = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    callbacks=[streamlit_callback]
)

st.header("Chat Interface")

if "messages" not in st.session_state:
    st.session_state["messages"] = [{"role": "assistant", "content": "How can I help you?"}]

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

user_query = st.chat_input(placeholder="Ask anything from the database")

if user_query:
    st.session_state.messages.append({"role": "user", "content": user_query})
    st.chat_message("user").write(user_query)

    with st.chat_message("assistant"):
        response_container = st.container()
        with response_container:
            response = agent.run(user_query)
            response_container.markdown(response)
        
        st.session_state.messages.append({"role": "assistant", "content": response})

In essence, this code snippet brings the interactive chat functionality to life. It enables users to ask questions about their database in plain language and receive informative responses, all within a visually appealing and user-friendly Streamlit interface.

AI SQL Assistant

ReadMore About Python Tutorial to Learn Data Science from Scratch

Conclusion

We have a good amount of fun in this article walking through how to connect humans and databases. By combining LangChain, GROQ, and Streamlit, we had this driven AI SQL assistant providing natural language question answering back into your MySQL database.

We’ve witnessed how LangChain’s SQL toolkit acts as the translator, converting our plain-English questions into the structured language of SQL. GROQ, with its efficient API, seamlessly fetches and transforms the data we need. And Streamlit, with its intuitive interface, presents the results in a visually engaging and interactive manner.

You can find Requirements.txt file here: GitHub

Key Takeaway

  • The code enables users to query their MySQL database using plain English, eliminating the need for complex SQL syntax.
  • LangChain SQL Toolkit leverages the LangChain framework to translate natural language queries into SQL, making database interactions more accessible.
  • GROQ is used for efficient data retrieval and transformation, potentially enhancing performance and enabling complex data operations.

Frequently Asked Questions

Q1. What is LangChain, and how does it help in this project?

A. LangChain is a powerful framework for building applications that leverage language models. It acts as an orchestrator, connecting different components like language models, data loaders, and tools to create complex workflows. In our project, LangChain’s SQL toolkit is crucial for translating natural language queries into SQL, enabling us to interact with the MySQL database using plain English.

Q2. What role does GROQ play in this setup?

A. GROQ is a query language and API platform designed for efficient data transformation and delivery. It allows us to fetch data from the MySQL database and apply any necessary transformations on the fly before presenting it to the user. GROQ’s flexibility and performance make it ideal for handling complex data operations and ensuring a responsive user experience.

Q3. Why use Streamlit for this project?

A. Streamlit is a Python library that simplifies the creation of interactive web applications. It provides a user-friendly interface for building chatbots, dashboards, and other data-driven tools. In our project, Streamlit is responsible for creating the chat interface, handling user input, and displaying the responses generated by the LangChain SQL agent.

Q4. Can I use this approach with databases other than MySQL?

A. Yes, LangChain’s SQL toolkit is designed to work with various relational databases. While this guide focuses on MySQL, you can adapt the code to connect to other databases like PostgreSQL, SQLite, or Oracle by modifying the connection details and potentially adjusting the SQL syntax if needed.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Hi I'm Gourav, a Data Science Enthusiast with a medium foundation in statistical analysis, machine learning, and data visualization. My journey into the world of data began with a curiosity to unravel insights from datasets.

Responses From Readers

Clear

We use cookies essential for this site to function well. Please click to help us improve its usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy & Cookies Policy.

Show details