In the fast-changing landscape of AI, efficiency and scalability are paramount. Developers are actively reaching for models that deliver high performance at reduced cost, with lower latency and better scalability. Enter Gemini Flash 1.5: a new release that retains the strengths of its predecessors while offering even better performance on many image-related tasks. As part of the Gemini 1.5 release, which also includes the Gemini 1.5 Pro variant, Flash 1.5 stands out as a model built for fast, efficient, high-volume tasks. In this blog, let's consider the importance of Gemini Flash 1.5 and build a Food Vision WebApp with Flask.
This article was published as a part of the Data Science Blogathon.
With AI being integrated into different industries, fast and efficient models that can process large amounts of data are needed. Traditional AI models are often resource-intensive, high in latency, and difficult to scale. This creates a significant challenge for developers working on applications that require real-time responses or that are deployed in resource-constrained environments such as mobile devices or edge computing platforms.
Recognizing these challenges, Google introduced the Gemini Flash 1.5 model—a lightweight AI solution tailored to meet the needs of modern developers. Gemini Flash 1.5 is designed to be cost-efficient, fast, and scalable, making it an ideal choice for high-volume tasks where performance and cost are critical considerations.
Flask is a lightweight micro web framework that allows developers to build web applications using Python. It’s called a “micro” framework because it doesn’t require a lot of setup or configuration, unlike other frameworks like Django. Flask is perfect for building small to medium-sized web applications, prototyping, and even large-scale applications with the right architecture.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

if __name__ == "__main__":
    app.run(debug=True)
Output: the development server starts at http://127.0.0.1:5000/ and the page displays "Hello, World!".
Read the Flask documentation for more details.
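As a quick sanity check, the route above can be exercised without a browser using Flask's built-in test client (a minimal sketch, assuming Flask is installed as described later in this article):

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"

# The test client issues requests directly to the app; no server is needed
with app.test_client() as client:
    resp = client.get("/")
    print(resp.status_code)    # 200
    print(resp.data.decode())  # <p>Hello, World!</p>
```

This is handy for verifying routes during development before wiring up the full frontend.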
The Food Vision WebApp is organized into several key components: a virtual environment folder (myenv/), static files for frontend assets (static/), HTML templates (templates/), and a main application file (app.py). The .env file stores sensitive configuration details. This structure ensures a clean separation of concerns, making the project easier to manage and scale.
This section outlines the folder structure of the Food Vision WebApp, detailing where various components are located. Understanding this organization is crucial for maintaining and expanding the application efficiently.
project/
├── myenv/          # Folder for the virtual environment
├── static/         # Folder for static files
│   ├── scripts.js
│   └── styles.css
├── templates/      # Folder for HTML templates
│   └── index.html
├── .env            # Environment variables file
└── app.py          # Main application file
Creating a virtual environment ensures that your project dependencies are isolated from the global Python environment. Follow these steps to set up and activate a virtual environment for the Food Vision WebApp.
python -m venv myenv

# Activate on Windows (Command Prompt)
.\myenv\Scripts\activate

# Activate on Windows (PowerShell)
.\myenv\Scripts\Activate.ps1

# Activate on macOS/Linux
source myenv/bin/activate
Install the required Python packages to run the Food Vision WebApp effectively. These dependencies include libraries for web development, image processing, and environment management.
pip install google-generativeai
pip install flask
pip install pillow
pip install python-dotenv
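Equivalently, these four dependencies can be recorded in a requirements.txt file (left unpinned here; pin versions as appropriate for your project) and installed in one step with pip install -r requirements.txt:

```
google-generativeai
flask
pillow
python-dotenv
```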
The HTML template provides the structure for the Food Vision WebApp’s front-end. This section covers the layout, file upload form, and placeholders for displaying the uploaded image and results.
<!-- templates/index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Nutrify</title>
    <link rel="stylesheet" href="{{ url_for('static', filename='styles.css') }}">
    <script src="{{ url_for('static', filename='scripts.js') }}" defer></script>
</head>
<body>
    <div class="container">
        <div class="upload-section">
            <div class="upload-form">
                <h2>Upload a file</h2>
                <p>Attach the file below</p>
                <form id="uploadForm" method="post" enctype="multipart/form-data">
                    <div class="upload-area" id="uploadArea">
                        <input type="file" id="uploadInput" name="uploadInput" accept=".jpg, .jpeg, .png" required>
                        <label for="uploadInput">Drag file(s) here to upload.<br>Alternatively, you can select a file by <a href="#" onclick="document.getElementById('uploadInput').click(); return false;">clicking here</a></label>
                    </div>
                    <div id="fileName" class="file-name"></div>
                    <button type="submit" id="submitBtn">Upload File</button>
                </form>
                <div id="loadingIndicator" style="display: none;">
                    <div class="spinner"></div>
                    <p>Loading...</p>
                </div>
            </div>
            <div id="imageDisplay" class="image-display"></div>
        </div>
        <div id="responseOutput" class="response-output"></div>
    </div>
</body>
</html>
The CSS file enhances the visual presentation of the Food Vision WebApp. It includes styles for layout, buttons, loading indicators, and responsive design to ensure a seamless user experience.
body {
font-family: 'Roboto', sans-serif;
background-color: #f4f4f4;
margin: 0;
padding: 0;
color: #333;
overflow-y: auto; /* Allows scrolling as needed */
min-height: 100vh; /* Ensures at least full viewport height */
display: flex;
flex-direction: column; /* Adjusts direction for content flow */
}
.center-container {
display: flex;
align-items: center;
justify-content: center;
flex-grow: 1; /* Allows the container to expand */
}
.container {
display: flex;
flex-direction: column;
justify-content: center;
align-items: center;
width: 100%;
max-width: 100%;
padding: 20px;
background-color: #fff;
box-shadow: 0 5px 15px rgba(0, 0, 0, 0.1);
border-radius: 8px;
flex-grow: 1;
box-sizing: border-box; /* Include padding and border in the width calculation */
}
.upload-section {
display: flex;
width: 100%;
justify-content: space-between;
align-items: flex-start;
margin-bottom: 20px;
}
.upload-form {
width: 48%;
}
.image-display {
width: 48%;
text-align: center;
}
h2 {
color: #444;
margin-bottom: 10px;
}
p {
margin-bottom: 20px;
color: #666;
}
/* Upload area styles */
.upload-area {
border: 2px dashed #ccc;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
cursor: pointer;
}
.upload-area input[type="file"] {
display: none;
}
.upload-area label {
display: block;
color: #666;
cursor: pointer;
}
.upload-area a {
color: #007bff;
text-decoration: none;
}
.upload-area a:hover {
text-decoration: underline;
}
.file-name {
margin-bottom: 20px;
font-weight: bold;
color: #444;
}
/* Button styles */
button {
padding: 10px 20px;
border: none;
border-radius: 8px;
cursor: pointer;
font-size: 1em;
transition: background-color 0.3s ease, transform 0.2s ease;
background-color: #007bff;
color: #fff;
}
button:hover {
background-color: #0056b3;
transform: translateY(-2px);
}
/* Loading indicator styles */
#loadingIndicator {
display: none;
text-align: center;
margin-top: 20px;
}
.spinner {
border: 4px solid rgba(0, 0, 0, 0.1);
border-top: 4px solid #007bff;
border-radius: 50%;
width: 40px;
height: 40px;
animation: spin 1s linear infinite;
margin: 0 auto;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
/* Image display styles */
#imageDisplay img {
max-width: 100%;
height: auto;
border-radius: 8px;
box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
}
/* Response output styles */
.response-output {
width: 100%;
}
#responseOutput {
text-align: left;
margin-top: 20px;
}
#responseOutput h2 {
color: #333;
margin-bottom: 10px;
font-size: 1.5em;
}
#responseOutput pre {
white-space: pre-wrap;
padding: 10px;
background-color: #f9f9f9;
border: 1px solid #ddd;
border-radius: 8px;
font-size: 1em;
}
The app.py file powers the Food Vision WebApp by managing routes and handling image uploads. It integrates with the Gemini Flash 1.5 model to provide nutritional analysis and responses.
This section imports the necessary libraries and modules for the Flask application. These include Flask for web development, google.generativeai for interacting with the Gemini API, and PIL for image processing.
from flask import Flask, render_template, request, redirect, url_for, jsonify
import google.generativeai as genai
from PIL import Image
import base64
import io
import os
Here, you configure the Gemini AI library using your API key. This setup ensures that the application can communicate with the Gemini API to process image data and generate nutritional information.
from dotenv import load_dotenv

load_dotenv()  # read GOOGLE_API_KEY from the .env file into the environment
my_api_key_gemini = os.getenv('GOOGLE_API_KEY')
genai.configure(api_key=my_api_key_gemini)
Obtain your API key from Google AI Studio. This key is crucial for authenticating requests to the Gemini API.
Save your API key in a .env file to keep it secure and easily accessible. The application retrieves the key from this file to configure the Gemini API.
GOOGLE_API_KEY="Your_API_KEY"
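python-dotenv does this parsing for you via load_dotenv(), but the mechanism is simple enough to sketch. This illustrative helper (not part of the project code) shows how a KEY=VALUE line like the one above is split and unquoted:

```python
def parse_env_line(line):
    """Split one KEY=VALUE line from a .env file, stripping optional quotes."""
    line = line.strip()
    if not line or line.startswith("#") or "=" not in line:
        return None  # blank lines and comments carry no variable
    key, _, value = line.partition("=")
    return key.strip(), value.strip().strip('"').strip("'")

print(parse_env_line('GOOGLE_API_KEY="Your_API_KEY"'))
# ('GOOGLE_API_KEY', 'Your_API_KEY')
```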
In this step, you create the routes for the Flask application. These routes handle requests and responses, including rendering the homepage and processing file uploads.
app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')
Next, create a well-structured Flask route that handles the image upload, processes the file, and sends it to the Gemini Flash 1.5 model.
@app.route('/upload', methods=['POST'])
def upload():
    uploaded_file = request.files['uploadInput']
    if uploaded_file:
        image = Image.open(uploaded_file)

        # Ensure correct mime type based on file extension
        if uploaded_file.filename.endswith('.jpg') or uploaded_file.filename.endswith('.jpeg'):
            mime_type = 'image/jpeg'
        elif uploaded_file.filename.endswith('.png'):
            mime_type = 'image/png'
        else:
            return jsonify(error='Unsupported file format'), 400

        # Encode image to base64 for sending to the API
        buffered = io.BytesIO()
        image.save(buffered, format=image.format)
        encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

        image_parts = [{
            "mime_type": mime_type,
            "data": encoded_image
        }]

        input_prompt = """
        You are an expert nutritionist. Look at the food items in the image
        and calculate the total calories. Also provide the details of every
        food item with its calorie intake in the format below:

        1. Item 1 - no of calories, protein
        2. Item 2 - no of calories, protein
        ----
        ----

        Also mention any disease risk from these items.
        Finally, mention whether the food items are healthy or not and suggest
        some healthy alternatives in the format below:

        1. Item 1 - no of calories, protein
        2. Item 2 - no of calories, protein
        ----
        ----
        """

        # Call the Gemini Flash 1.5 model with the prompt and image
        model1 = genai.GenerativeModel('gemini-1.5-flash')
        response = model1.generate_content([input_prompt, image_parts[0]])
        result = response.text

        return jsonify(result=result, image=encoded_image)
    return jsonify(error='No file uploaded'), 400
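Note that the extension checks above are case-sensitive, so a file named photo.JPG would be rejected. One more robust alternative (a sketch, not part of the original app) is the standard library's mimetypes module, which matches extensions case-insensitively:

```python
import mimetypes

ALLOWED_TYPES = {"image/jpeg", "image/png"}

def mime_for_upload(filename):
    """Return the MIME type for an allowed image file, or None if unsupported."""
    mime, _ = mimetypes.guess_type(filename)  # extension match is case-insensitive
    return mime if mime in ALLOWED_TYPES else None

print(mime_for_upload("dish.JPG"))   # image/jpeg
print(mime_for_upload("salad.png"))  # image/png
print(mime_for_upload("menu.pdf"))   # None
```

Dropping this helper into the route would replace the chain of endswith checks with a single lookup.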
Execute the Flask app with app.run(debug=True) to start the server. This provides a local development environment where you can test and debug the application.
from flask import Flask, render_template, request, redirect, url_for, jsonify
import google.generativeai as genai
from PIL import Image
from dotenv import load_dotenv
import base64
import io
import os

load_dotenv()  # read GOOGLE_API_KEY from the .env file into the environment
my_api_key_gemini = os.getenv('GOOGLE_API_KEY')
genai.configure(api_key=my_api_key_gemini)

app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/upload', methods=['POST'])
def upload():
    uploaded_file = request.files['uploadInput']
    if uploaded_file:
        image = Image.open(uploaded_file)

        # Ensure correct mime type based on file extension
        if uploaded_file.filename.endswith('.jpg') or uploaded_file.filename.endswith('.jpeg'):
            mime_type = 'image/jpeg'
        elif uploaded_file.filename.endswith('.png'):
            mime_type = 'image/png'
        else:
            return jsonify(error='Unsupported file format'), 400

        # Encode image to base64 for sending to the API
        buffered = io.BytesIO()
        image.save(buffered, format=image.format)
        encoded_image = base64.b64encode(buffered.getvalue()).decode('utf-8')

        image_parts = [{
            "mime_type": mime_type,
            "data": encoded_image
        }]

        input_prompt = """
        You are an expert nutritionist. Look at the food items in the image
        and calculate the total calories. Also provide the details of every
        food item with its calorie intake in the format below:

        1. Item 1 - no of calories, protein
        2. Item 2 - no of calories, protein
        ----
        ----

        Also mention any disease risk from these items.
        Finally, mention whether the food items are healthy or not and suggest
        some healthy alternatives in the format below:

        1. Item 1 - no of calories, protein
        2. Item 2 - no of calories, protein
        ----
        ----
        """

        # Call the Gemini Flash 1.5 model with the prompt and image
        model1 = genai.GenerativeModel('gemini-1.5-flash')
        response = model1.generate_content([input_prompt, image_parts[0]])
        result = response.text

        return jsonify(result=result, image=encoded_image)
    return jsonify(error='No file uploaded'), 400

if __name__ == "__main__":
    app.run(debug=True)
Output:
The output will be a JSON response containing the nutritional analysis and health recommendations based on the uploaded food image. The analysis includes details like calories, protein content, potential health risks, and suggestions for healthier alternatives.
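To illustrate the response shape, here is a small sketch (with made-up values) of how a client could decode the image field that the /upload route echoes back alongside the analysis text:

```python
import base64
import json

# Hypothetical payload mirroring the jsonify(result=..., image=...) response
raw_bytes = b"\x89PNG...image bytes..."
payload = json.dumps({
    "result": "1. Rice - 200 calories, 4g protein",
    "image": base64.b64encode(raw_bytes).decode("utf-8"),
})

# A client parses the JSON and recovers the original image bytes
data = json.loads(payload)
recovered = base64.b64decode(data["image"])
print(recovered == raw_bytes)  # True
```

The frontend's scripts.js performs the same decoding in the browser to redisplay the uploaded image next to the analysis.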
Get the code from my GitHub Repo: here.
Gemini Flash 1.5 advances the state of AI models by delivering the speed, efficiency, and scalability that today's fast-moving digital world demands. With strong performance, flexible tuning support, and broad coverage of text, image, and structured-data tasks, it empowers developers to build highly creative AI solutions that are both powerful and cost-effective. Because it is lightweight and built for high-volume processing, it is a strong choice for everything from real-time mobile apps to large enterprise systems.
Q1. What is Gemini Flash 1.5?
A. Gemini Flash 1.5 is a lightweight, cost-efficient AI model developed by Google, optimized for high-volume tasks with low latency. It is part of the Gemini 1.5 release, alongside the Gemini 1.5 Pro variant.
Q2. How does Gemini Flash 1.5 differ from Gemini 1.5 Pro?
A. Gemini Flash 1.5 is designed for faster and more cost-effective processing, making it ideal for high-volume tasks. While both models share similarities, Flash 1.5 optimizes speed and scalability for scenarios where these factors are critical.
Q3. What are the key features of Gemini Flash 1.5?
A. Key features include enhanced performance with 1000 requests per minute, tuning support for customization, JSON schema mode for structured outputs, and mobile support with light mode in Google AI Studio.
Q4. Can I fine-tune Gemini Flash 1.5?
A. Yes, tuning support is available for Gemini Flash 1.5, allowing you to customize the model according to your specific needs. Tuning is currently free of charge, with no additional per-token costs.
Q5. Does Gemini Flash 1.5 support image processing?
A. Yes, Gemini Flash 1.5 supports image processing, making it suitable for tasks such as image classification and object detection, in addition to text and JSON handling.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.