How to choose the right chart for your data visualization
By the end of this article, you will know how to choose the right chart for your data visualization.
Introduction
I love data visualization. The sheer amount of knowledge it conveys to an audience in such a limited space is astonishing, and it lets the audience grasp insights quickly and easily. I’ve worked with many data visualization tools, such as Power BI, Tableau, and MS Excel. These are brilliant tools for data cleaning, data preprocessing, and data visualization in analytics projects. They offer a wide variety of charts, such as bar graphs, line charts, scatter plots, dual-axis charts, sparklines, waterfall charts, pie charts, area charts, column charts, and many more.
In this article, I want to answer the eternal question: “How do you decide which chart to choose for your problem or your project?” Choosing the right chart is very important, and the number of options can be overwhelming if you are new to the field.
If you are new to the data visualization field and excited to learn more, make sure you check out the FREE “MS Excel” and “Tableau” courses. You will learn the basic functionalities and how to create different charts. It’s a perfect starting point.
Table of contents
Importance of Data Visualization
The objective of your visual
Choosing the right visualization for your data
Comparison charts
Distribution charts
Breakup of a whole charts
Relationship charts
Trend charts
Importance of Data Visualization
Data visualization is the graphical representation of data. Presenting data as visual content plays a vital role in helping people understand information more easily.
Look at the data that is displayed below:
Picture 1: Doesn’t make any sense
Picture 2: This makes sense (because of visualization)
Which picture lets you grasp the insights more easily?
Of course, it is the second picture, because the data is represented graphically.
I’ve listed some benefits of visualization:
It helps us convey the right message to the audience through visuals.
It helps us find outliers in our data.
It helps business leaders make accurate decisions.
It helps us understand how the data is distributed and how it changes over time.
The objective of your visual
Before making the visualization, it is best to ask yourself what the audience will be looking for in your chart. Understand the requirements and preferences of your viewer. Know their background. Do they have enough time for a detailed visualization? How aware are they of the context of the visualization? What additional information are they looking for? Are they aware of the graphs being used? And so on. Your viewer’s information needs should be your guide in creating effective and compelling data visualizations.
Choosing the right visualization for your data
There is a tremendous number of charts available, and choosing the right one is paramount, especially when you’re presenting to a senior leader. It is not as easy as it sounds: an incorrect representation can send the wrong message or lead the audience to a wrong decision, and the message you had in mind when you created the chart might never reach them. Your focus should be on conveying the right message to your audience in the most effective way. Let me take you through the types of messages we usually send when we create impactful visualizations in business.
These are the types of messages you will usually work with. Maybe you want to show a comparison between values (for example, region-wise sales), the distribution of the data, the breakup of a whole into its parts, the relationship between variables, or a trend (for example, sales trends).
Let’s look at all these one by one and see what kinds of charts we can use to convey the right message.
1) Comparison charts
These charts compare one value with another, for example region-wise sales or the economy rates of bowlers in cricket. We can use the following charts for comparison.
Column charts
It is used to compare values across multiple categories.
Here, the categories appear horizontally (X-axis) and the values vertically (Y-axis).
A column chart can also show parts of a whole across different categories, in absolute values as well as in relative terms. This is where stacked column charts (absolute values) and 100% stacked column charts (relative shares) come in.
Bar charts
If you’re familiar with column charts, you will find that working with bar charts is very similar.
The only difference is that in a bar chart, values are represented on the X-axis and categories on the Y-axis.
We typically use a bar chart to show values across categories when the category labels are long or the values are durations.
Stacked bar charts are used to compare parts of a whole (in relative or absolute terms) and to compare change across categories or over time.
Line charts
It is one of the most popular charts and is widely used in most industries.
Whether you’re analyzing sales data, looking at year-on-year profit, or tracking how a person’s salary has grown over the years, line charts are very helpful.
The line chart is used to show trends over time or across categories.
Here, the categories appear horizontally (X-axis) and the values vertically (Y-axis).
Scatter plots
An XY (scatter) chart uses numerical values along both axes.
Scatter plots are useful for showing a correlation between two variables that may not be easy to see from the data alone.
It is used for displaying and comparing numerical values, such as scientific or statistical data.
2) Distribution charts
These charts show how data values are spread across categories or a continuous range, for example the distribution of bugs found during 10 weeks of a software testing phase. We can use the following charts to visualize the distribution of the data.
Histogram
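It is used to graph the frequency of values across a distribution: the data range is split into intervals (bins), and the height of each bar shows how many values fall in that bin. It is a very useful graph in the analytics world, and many insights can be inferred from it.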
Visually, all the bars are touching each other with no space between them.
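To make the idea of bins concrete, here is a minimal sketch in plain Python of the counting a histogram is built on. The function name `histogram_counts`, the choice of equal-width bins, and the sample data are my own assumptions for illustration; charting tools usually do this step for you.

```python
def histogram_counts(values, bin_count):
    """Count how many values fall into each of `bin_count` equal-width bins."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("all values are identical, so no bin width can be derived")
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for value in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    # Bin edges: bin i covers edges[i] up to edges[i + 1].
    edges = [low + i * width for i in range(bin_count + 1)]
    return edges, counts


# Hypothetical sample: minutes taken to resolve 12 support tickets.
minutes = [5, 7, 8, 12, 13, 15, 18, 21, 22, 24, 30, 45]
edges, counts = histogram_counts(minutes, bin_count=4)

for i, count in enumerate(counts):
    print(f"{edges[i]:>5.1f} to {edges[i + 1]:>5.1f}: {'#' * count}")
```

Each entry in `counts` becomes the height of one bar. Because the bins cover a continuous range with no gaps, the bars touch each other.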
Box plot
It is also known as a box-and-whisker plot.
The line in the middle of the box is the median value. This means that 50% of the data is above the median and 50% is below it.
Medians are useful because they’re not swayed by outliers the way the mean is.
Within the box itself, 25% of the data lies above the median and 25% below it, so 50% of the data is within the box.
Using this plot, we can easily spot outliers and see how the data is distributed.
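The numbers behind a box plot can be sketched with Python’s standard `statistics` module. Two things here are my assumptions rather than anything stated above: the 1.5 × IQR rule for flagging outliers (a common convention, but not the only one) and the made-up sample scores. Different tools may also compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles and outliers that a box plot would display."""
    # With n=4, quantiles() returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # height of the box: the middle 50% of the data
    # Assumption: points beyond 1.5 * IQR from the box count as outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical sample: ten test scores, one of them unusually high.
scores = [52, 55, 58, 60, 61, 63, 65, 67, 70, 120]
summary = box_plot_summary(scores)

print(summary)
print("mean:", statistics.mean(scores))
```

In this sample the single extreme score pulls the mean well above the median, which is why the median is the more robust center line for the box.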
KDE Plot
KDE stands for kernel density estimation.
It’s a smoothed version of a histogram.
A kernel density estimate (KDE) plot is a method for visualizing the distribution of observations in a dataset, analogous to a histogram.
Relative to a histogram, KDE can produce a plot that is less cluttered and more interpretable, especially when drawing multiple distributions.
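To show where the smooth curve comes from, here is a minimal sketch of a kernel density estimate in plain Python. I am assuming a Gaussian kernel and a hand-picked bandwidth of 0.5; the function name `gaussian_kde` and the sample values are illustrative only. Plotting libraries choose the bandwidth for you, and their defaults will give a different curve.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at any point x."""
    n = len(samples)
    # Normalizing constant so that the whole curve integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Every sample contributes a small bell curve centered on itself.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical sample with two clusters, around 2.5 and around 6.
samples = [2.1, 2.4, 2.5, 3.0, 3.2, 5.8, 6.1, 6.3]
density = gaussian_kde(samples, bandwidth=0.5)

step = 0.1
grid = [i * step for i in range(91)]  # x values from 0.0 to 9.0
curve = [density(x) for x in grid]
```

Plotting `curve` against `grid` as a line gives the KDE plot. A smaller bandwidth follows the data more closely, while a larger one smooths more aggressively.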
3) Breakup of a whole charts
These charts are used to analyze how various parts make up the whole. They are handy in many scenarios, such as analyzing revenue contribution by region or which sides of the ground a batsman scored runs on. The charts used for this are listed below.
Pie Chart
If you want to represent your categorical data as part of the whole, then you should use a pie chart.
Each slice represents the percentage that the given category occupies out of the whole.
It’s best to use a pie chart only if you have fewer than five categories.
Donut Chart
It is a variant of the pie chart, with a hole in the center.
It displays the categories as arcs rather than slices.
Stacked Column Chart
A stacked column chart shows how multiple data series contribute to each column’s total. In the 100% stacked variant, each series is shown as a relative percentage, so every column always totals 100%.
The 100% stacked column chart can show the part-to-whole proportions over time, for example, the proportion of quarterly sales per region or the proportion of monthly mortgage payment that goes toward interest vs. principal.
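The difference between absolute values and relative shares is easy to see in a small calculation. The sketch below, in plain Python, converts absolute sales into the percentages a 100% stacked column chart would draw. The function name `to_percent_shares` and the sales figures are made up for illustration.

```python
def to_percent_shares(sales_by_period):
    """Convert absolute values per period into each part's share of 100%."""
    shares = {}
    for period, by_region in sales_by_period.items():
        total = sum(by_region.values())
        shares[period] = {
            region: 100 * value / total for region, value in by_region.items()
        }
    return shares


# Hypothetical quarterly sales per region (absolute values).
quarterly_sales = {
    "Q1": {"North": 120, "South": 80, "West": 200},
    "Q2": {"North": 150, "South": 150, "West": 200},
}
shares = to_percent_shares(quarterly_sales)

for period, by_region in shares.items():
    print(period, by_region)
```

In this sample, North’s sales grow from 120 to 150, yet its share of the total stays at 30% in both quarters. A regular stacked column chart would show the growth, while the 100% stacked version would show the unchanged proportion, so pick the one that matches the message you want to send.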
Stacked Bar Chart
A stacked bar chart shows the same part-to-whole breakdown with horizontal bars. In its 100% stacked variant, each series is shown as a relative percentage of the bar.
4) Relationship charts
Relationship charts are very helpful when we want to know how different variables relate to each other. The charts used to visualize the relationship between variables are listed below.
Scatter Plot
A scatter chart uses numerical values along both axes.
It uses dots to represent the values of two different numerical variables.
The position of each dot on the horizontal and vertical axes indicates the values of an individual data point.
It is useful for showing a correlation between two variables that may not be easy to see from the data alone.
It is used for displaying and comparing numerical values, such as scientific or statistical data.
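A scatter plot lets you see a correlation, and you can also put a number on it. The sketch below uses the Pearson correlation coefficient, which is my choice of measure for illustration (it captures linear relationships only). The function name and the ad spend and sales figures are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together around their means.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical data: each (ad_spend, sales) pair is one dot on the scatter plot.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

r = pearson_correlation(ad_spend, sales)
print(f"correlation: {r:.3f}")
```

A value close to 1 means the dots rise together along a line, a value close to -1 means one variable falls as the other rises, and a value near 0 means there is no clear linear pattern.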
Line Chart
As discussed above, a line chart is also used to find the relationship between the two variables.
5) Trend charts
These charts are used to visualize trends in values over time or across categories. Data recorded over time is known as “time series” data in the data-driven world. Examples include an over-by-over run rate tracker or the hourly temperature variation during a day. Listed below are the charts used to represent time series data.
Line Chart
The best way to visualize trend data is with a line chart.
Line charts are also used to see the trends in various domains.
Area Chart
It is used to see the magnitude of the values.
It shows the relative importance of values over time.
It is similar to a line chart, but because the area between lines is filled in, the area chart emphasizes the magnitude of values more than the line chart does.
Column Chart
A column chart, as discussed above, can also be used to show trends in values over time and across categories.
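The five message types and the charts listed under each of them in this article can be collected into a small lookup, sketched here in Python. The names `CHARTS_BY_MESSAGE` and `suggest_charts` are my own; treat this as a starting point for your choice, not a rule.

```python
# Message type -> charts listed under that heading in this article.
CHARTS_BY_MESSAGE = {
    "comparison": ["column chart", "bar chart", "line chart", "scatter plot"],
    "distribution": ["histogram", "box plot", "KDE plot"],
    "breakup of a whole": [
        "pie chart",
        "donut chart",
        "stacked column chart",
        "stacked bar chart",
    ],
    "relationship": ["scatter plot", "line chart"],
    "trend": ["line chart", "area chart", "column chart"],
}


def suggest_charts(message_type):
    """Return the candidate charts for a message type (case-insensitive)."""
    key = message_type.strip().lower()
    if key not in CHARTS_BY_MESSAGE:
        known = ", ".join(CHARTS_BY_MESSAGE)
        raise ValueError(f"unknown message type {message_type!r}; expected one of: {known}")
    return CHARTS_BY_MESSAGE[key]


print(suggest_charts("Trend"))
```

Start from the message you want to send, look up the candidates, and then narrow them down using what you know about your audience.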
End Notes
With this, we’ve reached the end of the article. To keep it short and concise, I’ve listed only some of the basic charts that we can use in different scenarios. Let me know in the comments if you want me to cover more visualization concepts in the future.