Machine learning is a fast-growing field, and its applications have become ubiquitous in our day-to-day lives. As the demand for ML models increases, so does the demand for user-friendly interfaces to interact with these models. This blog is a tutorial for building intuitive frontend interfaces for machine learning models using two popular open-source libraries: Streamlit and Gradio.
Streamlit is a Python library for building data-driven applications, designed specifically for machine learning and data science. It lets you create a feature-rich frontend UI in a short amount of time. Gradio, on the other hand, is a library that makes it possible to quickly and easily create web-based interfaces for your machine learning models.
Both libraries offer a practical way to build interfaces for machine learning that are functional and user-friendly. In this blog, we will see how to build interactive interfaces using Streamlit and Gradio and how they can improve the user experience when interacting with ML models. Whether you are a beginner or an experienced data scientist, this blog will give you the tools you need to create your own interfaces.
This article was published as a part of the Data Science Blogathon.
Streamlit is a modern, easy-to-use, open-source Python library that allows developers to build interactive data applications, including visualizations and dashboards. Streamlit apps are written in Python, which makes the library handy for data scientists and machine learning engineers already familiar with the language. Even if you're new to Python, Streamlit is designed to be easy to learn, with a simple and intuitive API that makes it easy to get started.
Streamlit works by running your Python code on a server, built on the Tornado web framework, and using React in the browser to render and interact with the results of that code.
When a Streamlit application is run, the server starts and the user's browser connects to it over a WebSocket. When a user interacts with the application, the React front end sends the new widget state to the server, which reruns the Python script and returns the results to the user's browser, where they are rendered and displayed.
This approach keeps the application responsive and lets users interact with it in real time. Additionally, because the Python code runs in a server-side environment, data scientists and machine learning engineers can take advantage of the power and scalability of server-side computing, allowing them to build applications that handle large amounts of data and complex computations.
Let’s start by installing and importing Streamlit and the other libraries needed for this tutorial. We will use scikit-learn’s logistic regression model throughout.
!pip install streamlit
#Import Streamlit
import streamlit as st
#Other imports
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
We will create the dashboard for model training. Here’s an example of how you can create a dashboard using streamlit to train a scikit-learn model, with the option to upload the input data:
# Title of the dashboard
st.title("Streamlit Dashboard for Model Training")
# Let the user upload the input CSV file
st.write("Upload your dataset (CSV file format)")
file = st.file_uploader("", type="csv")
# Read the CSV file and display the dataframe
if file is not None:
    data = pd.read_csv(file)
    st.write("Preview of the uploaded dataset:")
    st.dataframe(data)
    # Let the user pick the target column (defaults to the last column)
    target = st.selectbox('Select the target variable: ',
                          list(data.columns), index=len(data.columns) - 1)
    X = data.drop(columns=target)
    y = data[target]
    # Split the dataset into train and test sets and train a logistic regression model
    st.write("Splitting the dataset into training and testing sets:")
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                        random_state=0)
    # Standardize the features (fit the scaler on the training set only)
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)
    st.write("Training a Logistic Regression Model:")
    model = LogisticRegression(random_state=0, solver='lbfgs')
    model.fit(X_train, y_train)
    # Evaluate the model and print the accuracy score
    st.write("Evaluating the Model:")
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    st.write("Accuracy: ", accuracy)
    st.write("End of Training")
In this example, we use a file uploader to let the user upload a CSV file. Once a file is uploaded, we display it as a dataframe and let the user choose the target variable column. The code then splits and standardizes the data, trains a logistic regression model, and prints the accuracy.
Let’s try it with the iris dataset, which can be downloaded from several sources. Save the code as a script (for example app.py, a filename assumed here) and start it with streamlit run app.py. Below is a screenshot of the dashboard:
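The flow above scales every feature column, so it assumes all features are numeric. Here is a minimal sketch, not the author's code, of the same steps wrapped in a reusable function that keeps only numeric features and chains scaling and training in a scikit-learn pipeline. The function name train_and_score, the max_iter value, and the use of scikit-learn's bundled iris data in place of an uploaded file are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_and_score(data: pd.DataFrame, target: str) -> float:
    """Train logistic regression on the numeric features; return test accuracy."""
    # Keep numeric feature columns only; StandardScaler cannot handle strings.
    X = data.drop(columns=[target]).select_dtypes(include="number")
    y = data[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    # The pipeline fits the scaler on the training split only.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Example run on the bundled iris data (features plus a "target" column).
iris = load_iris(as_frame=True).frame
accuracy = train_and_score(iris, "target")
print(f"Accuracy: {accuracy:.3f}")
```

Inside the dashboard, such a function could be called with the uploaded dataframe and the selected target, keeping the Streamlit code focused on layout.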
Streamlit supports creating multiple pages for the app, which can be added with a few lines of code as below:
menu = ["Homepage", "Page 1", "Page 2"]
choice = st.sidebar.selectbox("Select a page", menu)
if choice == "Homepage":
    st.write("Welcome to the Homepage!")
elif choice == "Page 1":
    st.write("This is Page 1")
elif choice == "Page 2":
    st.write("This is Page 2")
An app built this way can have separate pages for model training, predictions, and visualizations. The user picks a page from the sidebar, and the app displays the corresponding content and options.
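Streamlit provides several built-in functions for creating different types of visualizations, including st.line_chart, st.bar_chart, st.area_chart, and st.map. It can also render Matplotlib figures through st.pyplot.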
Here’s a simple example that shows how you can create a histogram in Streamlit:
nums_data = np.random.normal(1, 1, size=100)
fig, ax = plt.subplots()
ax.hist(nums_data, bins=20)
st.pyplot(fig)
This creates the following histogram:
Now let’s look at Gradio and build a new dashboard to compare the libraries.
Gradio is an open-source library that provides tools for building and deploying interactive interfaces for machine learning models. It allows you to easily turn your machine learning models into web-based applications that anyone can use, including people with little to no coding experience. With Gradio, you can create interactive sliders, dropdown menus, and checkboxes to control the inputs to your model and display the outputs as charts, tables, and images. It can be integrated with PyTorch and TensorFlow for deep learning, making it easy to use your existing models or train new ones.
One of the key features of the Gradio library is its modular architecture. This allows developers to easily add or remove components and functionality as needed, creating a wide range of interfaces.
Let’s look at the process of creating a dashboard with Gradio:
Let’s start by installing and importing Gradio and other necessary libraries.
#Install Gradio
!pip install gradio
import gradio as gr
#Other Imports
import os
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
We will build the dashboard with similar features as we did with Streamlit and let the user choose the input file. The data frame will be displayed on the right, and the model training will happen in the background. Let’s start by defining the model training steps:
# Train the model
def train_model(data, target):
    # dependent and independent variables
    X = data.drop(columns=target)
    y = data[target]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    # standardize the data (fit the scaler on the training set only)
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)
    # train the model
    model = LogisticRegression(random_state=0, solver='lbfgs')
    model.fit(X_train, y_train)
    # compute and return the accuracy score
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    return accuracy
The user will be given the option to upload or select the file location. In this case, let’s try a dropdown where the user selects a CSV file, with the list of options showing all CSV files in the current working directory.
# Read the selected CSV file and train the model
def upload_csv(Input_CSV, Target_Variable):
    data = pd.read_csv('./' + Input_CSV)
    # fall back to the last column if the given target is not a column
    if Target_Variable not in data.columns:
        Target_Variable = data.columns[-1]
    accuracy = train_model(data, Target_Variable)
    return data.head(4), Target_Variable, accuracy

# list the CSV files in the current working directory
files = [f for f in os.listdir('.') if os.path.isfile(f) and f.endswith('.csv')]
# set the inputs and corresponding outputs
inputs = [gr.Dropdown(files), gr.Textbox()]
outputs = ['dataframe', gr.Textbox(label="Target Variable"), gr.Textbox(label="Accuracy Score")]
# launch the dashboard
demo = gr.Interface(upload_csv, inputs, outputs)
demo.launch()
# If launch() fails because localhost is not reachable (for example in a
# hosted notebook), creating a public share link may help:
# demo.launch(share=True)
The dashboard is launched locally. From the browser view, we can select the CSV file and train a logistic regression model, and the dashboard displays the accuracy score.
Comparing the two dashboards and the process of building them: Streamlit, with its well-documented modules and support for various popular machine learning libraries, including TensorFlow, Keras, and PyTorch, makes it easy to build the interface quickly. Gradio, on the other hand, is useful for simple and easy-to-use interfaces with a list of inputs on the left, the function executing in the background on these inputs, and the outputs displayed on the right.
The dashboards built using Gradio and Streamlit are user-friendly and efficient tools for training any ML models and displaying the visualizations, outputs, graphs, and metrics. Streamlit provides a larger support base and detailed documentation and examples, whereas Gradio is for quick visualization of inputs and outputs side by side. Both libraries provide easy-to-use, quick dashboard-building modules that are user-friendly, fast, and efficient, and the choice is left to the end user to decide the better library based on their use case.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.