Style your Pandas DataFrame and Make it Stunning

Kaustubh Gupta Last Updated : 12 Oct, 2024

10 min read

Pandas is a widely-used data science library that presents data in table format, similar to Excel. Just like in Excel, you can customize tables by adding colors and highlighting important values. The Pandas Style API allows for similar styling within dataframes to enhance presentation and make data more visually appealing. This article covers the features of Pandas styling, built-in functions, customizations, and advanced usage for improving dataframe aesthetics using df.style.

This article was published as a part of the Data Science Blogathon.

What is Pandas Style?
Styling the DataFrame
Create your Own Styling Method
- Target: apply function
- Target: apply map function
Table Styles
Export to Excel
How to apply style to DataFrame?
Conclusion

What is Pandas Style?

A pandas dataframe is a tabular structure comprising rows and columns. One prevalent environment for data-related tasks is Jupyter notebooks, which are web-based, platform-independent integrated development environments (IDEs). In Jupyter notebooks, the pandas style of the dataframe is achieved through the use of HTML tags and CSS for rendering. Consequently, you have the flexibility to customize the appearance of these web elements.

We will see this in action in upcoming sections. For now, let’s create a sample dataset and display the output dataframe.

import pandas as pd
import numpy as np
np.random.seed(88)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})
df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
df.iloc[3, 3] = np.nan
df.iloc[0, 2] = np.nan
print(df)

Doesn’t this look boring to you? What if you transform this minimal table to this:

The transformed table above has:

Maximum values marked yellow for each column
Null values marked red for each column
More appealing table style, better fonts for header, and increased font size.

Now, we will be exploring all the possible ways of styling the dataframe and making it similar to what you saw above, so let’s begin!

Styling the DataFrame

To leverage all the pandas styling properties for the dataframe, employ the pandas styling accessor (assuming the dataframe is stored in the variable “df”):

df.style

This accessor helps in the modification of the styler object (df.style), which controls the display of the dataframe on the web. Let’s look at some of the methods to style the dataframe.

1. Highlight Min-Max values

The dataframes can take a large number of values but when it is of a smaller size, then it makes sense to print out all the values of the dataframe. Now, you might be doing some type of analysis and you wanted to highlight the extreme values of the data. For this purpose, you can add style to your dataframe that highlights these extreme values.

For Highlighting Maximum Values

Chain “.highlight_max()” function to the styler object. Additionally, you can also specify the axis for which you want to highlight the values. (axis=1: Rows, axis=0: Columns – default).

df.style.highlight_max()

For Highlighting Maximum Values | Pandas style

For Highlighting Minimum Values

Chain “.highlight_min()” function to the styler object. Here also, you can specify the axis at which these values will be highlighted.

df.style.highlight_min()

Both Min-Max highlight functions support the parameter “color” to change the highlight color from yellow.

2. Highlight Null Values

Every dataset has some or the other null/missing values. These values should be either removed or handled in such a way that it doesn’t introduce any biasness. To highlight such values, you can chain the “.highlight_null()” function to the styler object. This function doesn’t support the axis parameter and the color control parameter here is “null_color” which takes the default value as “red”

df.style.highlight_null(null_color="green")

set_na_rep()

Along with highlighting the missing values, they may be represented as “nan”. You can change the representation of these missing values using the set_na_rep() function. This function can also be chained with any styler function but chaining it with highlight_null will provide more details.

df.style.set_na_rep("OutofScope").highlight_null(null_color="orange")

3. Create Heatmap within dataframe

Heatmaps are used to represent values with the color shades. The higher is the color shade, the larger is the value present. These color shades represent the intensity of values as compared to other values. To plot such a mapping in the dataframe itself, there is no direct function but the “styler.background_gradient()” workaround does the work.

df.style.background_gradient()

Create Heatmap within dataframe | Pandas style

There are few parameters you can pass to this function to further customize the output generated:

cmap: By default, the “PuBu” colormap is selected by pandas You can create a custom matplotlib colormap and pass it to the camp parameter.
axis: Generating heat plot via rows or columns criteria, by default: columns
text_color_threshold: Controls text visibility across varying background colors.

4. Table Properties

As mentioned earlier also, the dataframe presented in the Jupyter notebooks is a table rendered using HTML and CSS. The table properties can be controlled using the “set_properties” method. This method is used to set one or more data-independent properties.

This means that the modifications are done purely based on visual appearance and no significance as such. This method takes in the properties to be set as a dictionary.

Example: Making table borders green with text color as purple.

df.style.set_properties(**{'border': '1.3px solid green',
                          'color': 'magenta'})

Also Read: How to Create a Pandas DataFrame from Lists ?

5. Create Bar Charts

Just as the heatmap, the bar charts can also be plotted within the dataframe itself. The bars are plotted in each cell depending upon the axis selected. By default, the axis=0 and the plot color are also fixed by pandas but it is configurable. To plot these bars, you simply need to chain the “.bar()” function to the styler object.

df.style.bar()

6. Control Precision

The current values of the dataframe have float values and their decimals have no boundary condition. Even the column “A”, which had to hold a single value is having too many decimal places. To control this behavior, you can use the “.set_precision()” function and pass the value for maximum decimals to be allowed.

df.style.set_precision(2)

Now the dataframe looks clean.

7. Add Captions

Like every image has a caption that defines the post text, you can add captions to your dataframes. This text will depict what the dataframe results talk about. They may be some sort of summary statistics like pivot tables.

df.style.set_caption("This is Analytics Vidhya Blog").set_precision(2).background_gradient()

(Here, different methods have been changed along with the caption method)

8. Hiding Index or Column

As the title suggests, you can hide the index or any particular column from the dataframe. Hiding index from the dataframe can be useful in cases when the index doesn’t convey anything significant about the data. The column hiding depends on whether it is useful or not.

df.style.hide_index()

9. Control Display Values

Using the styler object’s “.format()” function, you can distinguish between the actual values held by the dataframe and the values you present. The “format” function takes in the format spec string that defines how individual values are presented.

You can directly specify the specification which will apply to the whole dataset or you can pass the specific column on which you want to control the display values.

df.style.format("{:.3%}")

You may notice that the missing values have also been marked by the format function. This can be skipped and substituted with a different value using the “na_rep” (na replacement) parameter.

df.style.format("{:.3%}", na_rep="&&")

Also Read: The Ultimate Guide to Pandas For Data Science!

Create your Own Styling Method

Although you have many methods to style your dataframe, it might be the case that your requirements are different and you need a custom styling function for your analysis. You can create your function and use it with the styler object in two ways:

apply function: When you chain the “apply” function to the styler object, it sends out the entire row (series) or the dataframe depending upon the axis selected. Hence, if you make your function work with the “apply” function, it should return the series or dataframe with the same shape and CSS attribute-value pair.
apply map function: This function sends out scaler values (or element-wise) and therefore, your function should return a scaler only with CSS attribute-value pair.

Let’s implement both types:

Target: apply function

def highlight_mean_greater(s):
    '''
    highlight yellow is value is greater than mean else red.
    '''
    is_max = s > s.mean()
    return ['background-color: yellow' if i else 'background-color: red' for i in is_max]

df.style.apply(highlight_mean_greater)

Target: apply map function

def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the css property `'color: red'` for negative
    strings, black otherwise.
    """
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color

df.style.apply(color_negative_red)

Target: apply map function | Pandas style

Table Styles

These are styles that apply to the table as a whole, but don’t look at the data. It is very similar to the set_properties function but here, in the table styles, you can customize all web elements more easily.

The function of concern here is the “set_table_styles” that takes in the list of dictionaries for defining the elements. The dictionary needs to have the selector (HTML tag or CSS class) and its corresponding props (attributes or properties of the element). The props need to be a list of tuples of properties for that selector.

The images shown in the beginning, the transformed table has the following style:

styles = [
    dict(selector="tr:hover",
                props=[("background", "#f4f4f4")]),
    dict(selector="th", props=[("color", "#fff"),
                               ("border", "1px solid #eee"),
                               ("padding", "12px 35px"),
                               ("border-collapse", "collapse"),
                               ("background", "#00cccc"),
                               ("text-transform", "uppercase"),
                               ("font-size", "18px")
                               ]),
    dict(selector="td", props=[("color", "#999"),
                               ("border", "1px solid #eee"),
                               ("padding", "12px 35px"),
                               ("border-collapse", "collapse"),
                               ("font-size", "15px")
                               ]),
    dict(selector="table", props=[
                                    ("font-family" , 'Arial'),
                                    ("margin" , "25px auto"),
                                    ("border-collapse" , "collapse"),
                                    ("border" , "1px solid #eee"),
                                    ("border-bottom" , "2px solid #00cccc"),                                    
                                      ]),
    dict(selector="caption", props=[("caption-side", "bottom")])
]

And the required methods which created the final table:

df.style.set_table_styles(styles).set_caption("Image by Author (Made in Pandas)").highlight_max().highlight_null(null_color='red')

Export to Excel

You can store all the styling you have done on your dataframe in an excel file. The “.to_excel” function on the styler object makes it possible. The function needs two parameters: the name of the file to be saved (with extension XLSX) and the “engine” parameter should be “openpyxl”.

df.style.set_precision(2).background_gradient().hide_index().to_excel('styled.xlsx', engine='openpyxl')

How to apply style to DataFrame?

Using `Styler.apply()`

What it does: This method lets you apply styles to the entire DataFrame, either by rows or columns.
How to use it:
- Create a function that returns a style (like color) for each row or column.
- Use Styler.apply() to apply your function.
Example: If you want to color the text in a column based on its value, you can write a function that checks the value and returns a color.

2. Using `Styler.applymap()`

What it does: This method applies styles to each individual cell in the DataFrame.
How to use it:
- Create a function that returns a style for a single cell.
- Use Styler.applymap() to apply your function to every cell.
Example: If you want to change the background color of cells based on their values, this is the method to use.

3. Formatting Values

What it does: This method allows you to change how numbers or text look in the DataFrame.
How to use it:
- Use Styler.format() to specify how you want certain columns to be displayed (like showing numbers with two decimal places).
Example: You can format a column of prices to always show two decimal points.

4. Setting Table Styles

What it does: This method lets you add CSS styles to the entire table.
How to use it:
- Use Styler.set_table_styles() to define styles for headers, rows, or cells.
Example: You can set all table headers to have a bold font and a specific background color.

Additional Options

Background Gradient: Use Styler.background_gradient() to create a color gradient based on the values in the DataFrame.
Highlight Maximum Value: Use Styler.highlight_max() to automatically highlight the highest value in a column.
Set Caption: Use Styler.set_caption() to add a title to your table.

Conclusion

In this detailed article, we saw all the built-in methods to style the dataframe. Then we looked at how to create custom styling functions and then we saw how to customize the dataframe by modifying it at HTML and CSS level. We also saw how to save our styled dataframe into excel files.

Hope you like the article! To enhance data presentation, use pandas to HTML style with df.style. The pandas style feature allows customization, enabling df.style.format for elegant outputs, making df.style pandas more visually appealing.

Q1. What is pandas style?

A. Pandas styling refers to the capability in the Python library Pandas to apply formatting and styling to tabular data frames. It allows users to customize the visual representation of their data, such as changing cell colors, fonts, and highlighting specific values, making it easier to analyze and present data in a more visually appealing and informative manner.

Q2. What is the code style of pandas?

A. Pandas follows the PEP 8 style guide, which is the Python Enhancement Proposal for code style conventions. This style guide outlines various recommendations for writing clean, readable, and consistent Python code. Key aspects of the Pandas code style include:
1. Indentation: Pandas code uses 4 spaces for indentation, following the standard Python convention.
2. Variable and Function Naming: Descriptive, lowercase variable names with underscores (snake_case) are preferred. Function names should also be lowercase with underscores.
3. Line Length: Pandas code adheres to the recommended line length of 79 characters per line, with a maximum of 72 characters for docstrings and comments.
4. Import Statements: Imports are organized with standard libraries first, followed by third-party libraries, and then Pandas imports.
5. Whitespace: Consistent use of whitespace around operators and after commas is encouraged.
6. Comments: Code should be well-documented with clear and concise comments.
By adhering to these guidelines, Pandas code remains consistent, readable, and easy to maintain.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Kaustubh Gupta

Kaustubh Gupta is a skilled engineer with a B.Tech in Information Technology from Maharaja Agrasen Institute of Technology. With experience as a CS Analyst and Analyst Intern at Prodigal Technologies, Kaustubh excels in Python, SQL, Libraries, and various engineering tools. He has developed core components of product intent engines, created gold tables in Databricks, and built internal tools and dashboards using Streamlit and Tableau. Recognized as India’s Top 5 Community Contributor 2023 by Analytics Vidhya, Kaustubh is also a prolific writer and mentor, contributing significantly to the tech community through speaking sessions and workshops.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Style your Pandas DataFrame and Make it Stunning

Table of contents

What is Pandas Style?

Styling the DataFrame

1. Highlight Min-Max values

For Highlighting Maximum Values

For Highlighting Minimum Values

2. Highlight Null Values

set_na_rep()

3. Create Heatmap within dataframe

4. Table Properties

5. Create Bar Charts

6. Control Precision

7. Add Captions

8. Hiding Index or Column

9. Control Display Values

Create your Own Styling Method

Target: apply function

Target: apply map function

Table Styles

Export to Excel

How to apply style to DataFrame?

Using Styler.apply()

2. Using Styler.applymap()

3. Formatting Values

4. Setting Table Styles

Additional Options

Conclusion

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Responses From Readers

Write for us

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC

G_ENABLED_IDPS

test_cookie

Webengage (2)

_we_us

WebKlipperAuth

LinkedIn (16)

ln_or

JSESSIONID

li_rm

AnalyticsSyncHistory

lms_analytics

liap

visit

li_at

s_plt

lang

s_tp

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

s_pltp

s_tslv

li_theme

li_theme_set

Google (11)

Using `Styler.apply()`

2. Using `Styler.applymap()`