This article was published as a part of the Data Science Blogathon
Data is everywhere; you just need an eye for which data is useful and a way to keep the story interesting. That does not mean you can show a graph and call the work done. It is the data visualizer's job to present the right data in a way that helps the business grow and makes a strong impact.
The data we are going to use is available here, and the description of the data is available here.
The data tells us which products are recommended on the basis of ratings, product reviews, and many other factors.
Clothing ID: Integer categorical variable that refers to the specific piece being reviewed.
Age: Age of the reviewer.
Title: The title of the review.
Review Text: The customer's description of the product.
Rating: Rating given by the customer to the product, from 1 (worst) to 5 (best).
Recommended IND: Binary variable stating whether the customer recommends the product, where 1 is recommended and 0 is not recommended.
Division Name: Categorical name of the product's high-level division.
Department Name: Categorical name of the product's department.
Class Name: Categorical name of the product's class.
Positive Feedback Count: Positive integer documenting the number of other customers who found this review positive.
The original DataFrame looks like:
1. What is Plotly
2. Points to keep in mind while designing a graph
3. Data visualization graph configuration
4. Chart Types
5. Embedding charts in a blog with Chart Studio
6. Plotly Dash
Plotly is an open-source library that provides a wide range of chart types, as well as callback tools for building dashboards. The charts embedded here were all made in Plotly's Chart Studio, which makes it easy to embed charts anywhere you want.
The main advantages of Plotly are its interactivity and its visual quality, which is why many people choose it over libraries such as Matplotlib and Seaborn. Plotly offers 1D, 2D, and 3D charts, including animated ones; for more details on the charts, check here.
If you just want to embed charts in your blog, you don't need prior knowledge of coding or JavaScript. You can use Chart Studio, where you select the parameters and your chart is ready.
If you want to make a dynamic dashboard, Plotly provides Dash, a framework for developing web applications. For more details, check the Plotly documentation here.
1) There is no need to put all the data in one graph.
2) Sometimes displaying data in the form of a card is also a great way of representing it.
I will show you two charts; tell me which one helps you understand better.
The graph shows how many people have given positive, negative, and neutral reviews for a product.
3) Styling the graph
What I have observed is that people often overdo it, for example by mixing several different styles in a single graph.
I will show you two charts: one to follow and one to avoid.
There are other things to keep in mind while designing graphs, which we will discuss in a later section.
Keeping these simple points in mind will help you get your work done easily.
Mainly, there are three types of analysis for Data Visualization:
Let's start using Plotly to make graphs.
Installation
Install with pip or conda
# pip
pip install plotly
# anaconda
conda install -c anaconda plotly
Install the pandas library before importing Plotly Express; otherwise the import may raise an error.
# Importing the library
import plotly.express as px
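# General pattern: build a figure, then refine it
fig.update_layout()  # layout parameters (title, margins, colors) or annotations
fig.update_traces()  # trace-level parameters (text font, marker style)
fig.update_xaxes()   # axis parameters; use update_yaxes() for the y-axis
fig.show()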
Using update_traces, we can change trace-level properties such as the text font color and size.
Using update_layout, we can set layout parameters such as the title, margins, and background colors. Below I have explained every parameter.
A pie chart is mostly used for categorical data. When you have more than two categories, it makes their shares easy to compare.
# Donut chart of ratings. Note: values='Rating' sums the rating values in each
# slice, so slice sizes are weighted by rating; omit `values` to plot plain counts.
division_rat = px.pie(df, names='Rating', values='Rating', hole=0.6,
                      title='Overall Ratings of Products',
                      color_discrete_sequence=px.colors.qualitative.T10)
division_rat.update_traces(textfont=dict(color='#fff'))
division_rat.update_layout(
    autosize=True, height=200, width=800,
    margin=dict(t=80, b=30, l=70, r=40),
    plot_bgcolor='#2d3035', paper_bgcolor='#2d3035',
    title_font=dict(size=25, color='#a5a7ab', family="Muli, sans-serif"),
    font=dict(color='#8a8d93'),
    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
)
Interpret:
As we see in the graph, 5-star ratings make up 66% of the chart, so overall the products are well rated.
From a histogram, we can see how the categories differ from one another, such as which has the highest count and which the lowest.
Interpret:
From a stacked histogram we can easily compare two quantities against each other.
Interpret:
Most of the products are recommended, and the ratio of recommended to non-recommended products is high, which is a great sign.
A box plot is a great option whenever we want to look for outliers. It shows the quartile ranges in which most of the data lies.
# Box plot of reviewer age
fig_box = px.box(df, x='Age', title='Distribution of Age', height=250,
                 color_discrete_sequence=['#03DAC5'])
fig_box.update_xaxes(showgrid=False)
fig_box.update_layout(
    margin=dict(t=100, b=0, l=70, r=40),
    xaxis_tickangle=360,
    xaxis_title=' ', yaxis_title=" ",
    plot_bgcolor='#2d3035', paper_bgcolor='#2d3035',
    title_font=dict(size=25, color='#a5a7ab', family="Muli, sans-serif"),
    font=dict(color='#8a8d93'),
    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
)
A funnel chart is mainly used when the values decrease from one stage to the next, as in sales data or company size.
Interpret:
Interpret:
Whenever we need to see the correlation between features, a heatmap is usually the best option.
import plotly.figure_factory as ff

# Heatmap: correlation between the features
# dff is assumed to be a DataFrame holding the numeric columns of df
corrs = dff.corr()
fig_heatmap = ff.create_annotated_heatmap(
    z=corrs.values,
    x=list(corrs.columns),
    y=list(corrs.index),
    annotation_text=corrs.round(2).values,
    showscale=True)
fig_heatmap.update_layout(
    title='Correlation of whole Data',
    plot_bgcolor='#2d3035', paper_bgcolor='#2d3035',
    title_font=dict(size=25, color='#a5a7ab', family="Muli, sans-serif"),
    font=dict(color='#8a8d93'))
A pair plot is mostly used when we need to find the relationships between different variables.
Interpret:
As we see, there is a positive relation between Age and Recommended IND.
Products with 1-star and 2-star ratings are generally not recommended.
Installing Chart Studio
# pip
pip install chart_studio
Setting up Chart Studio
1. First, create an account on Chart Studio. Then go to your profile, select Settings, and scroll down to API Keys, where you enter the username and password you set. After that, click the Generate Key button; the key that appears is your API key.
Procedure: Profile >> Settings >> API Keys
import chart_studio
import chart_studio.plotly as py
import chart_studio.tools as tls

# Fill in your Chart Studio username and API key
chart_studio.tools.set_credentials_file(username=' ', api_key=' ')
2. After installing the library, run any of the chart code above, for example the pie chart.
3. Run the code below:
py.plot(figure_name, filename='Pie chart', auto_open=True)
After completing all three steps, Chart Studio will open. Scroll down to the embed option, copy the link, and paste it where you want the graph embedded.
If you want to make a dynamic dashboard, Plotly provides Dash, a framework for developing web applications. For more details, check the Plotly documentation here.
To make the dashboard look good, you can style it with CSS and HTML components, and Bootstrap and React components can be used as well.