Data visualization is probably the most important and typically the least talked about area of data science.
I say that because how you create data stories and visualizations has a huge impact on how your customers look at your work. Ultimately, data science is not only about how complicated and sophisticated your models are. It is about solving problems using data-based insights. And in order to implement these solutions, your stakeholders need to understand what you are proposing.
One of the challenges in creating effective visualizations is to create images that speak for themselves. This article describes one way to do so using animated GIF (Graphics Interchange Format) images. This is particularly helpful when you want to tell time- or flow-based stories. Using animation, you can plot comparable data over time for a specific set of parameters. In other words, it becomes easy to see how a certain parameter grows over time.
Let me show this with an example.
Let us say you want to show how GDP and life expectancy have changed for various continents / countries over time. What do you think is the best way to represent this relationship?
You can think of multiple options like:
Now, let us look at the same data as an animated plot in a .gif file:
The recent development of the gganimate package has made this possible and easier. By the end of this article, you will be able to make your own .gif file and create your own customised frames to compare different parameters on a global or local scale.
Please install the following packages, which the code below loads: plyr, dplyr, ggmap, ggplot2, gganimate and animation.
In addition to the above R libraries, you will also need the ImageMagick software on your system. You can download and install it from the ImageMagick website.
This article is an attempt to make a .gif file from earthquake data for 1965-2016. Plotting global seismic activity year by year is more informative than a static map that shows all the values at once. The earthquake data set is available on Kaggle.
The data set contains data for global seismic activity from 1965 to 2016. Please visit the above link and scroll down to get the .csv file.
The data set has been filtered so that only earthquakes of magnitude 7 or above on the Richter scale are considered in this study.
From the .csv file, we have selected only a few parameters for the sake of simplicity.
We are all set to start coding in R. I have used the RStudio environment. You are free to use any environment you prefer.
## Read the dataset and load the necessary packages
library(plyr)   # load plyr before dplyr so that dplyr's verbs are not masked
library(dplyr)
library(ggmap)
library(ggplot2)
library(gganimate)
EQ <- read.csv("eq.csv", stringsAsFactors = FALSE)
names(EQ)
## Select only the rows with magnitude greater than or equal to 7
EQ <- EQ %>% filter(Magnitude >= 7)
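For readers who prefer not to depend on dplyr, the same filter can be sketched in base R. The data frame below is a toy sample with invented values (it is not taken from the Kaggle file), and toy_eq and strong_eq are hypothetical names; only the Magnitude column name comes from the code above.

```r
# Toy stand-in for the earthquake data; all values are invented for illustration.
toy_eq <- data.frame(
  Date      = c("02-01-1965", "15-08-1970", "03-06-1984", "21-11-1999"),
  Magnitude = c(6.0, 7.0, 8.2, 7.4),
  stringsAsFactors = FALSE
)

# Keep only rows with magnitude >= 7, mirroring filter(Magnitude >= 7).
strong_eq <- toy_eq[toy_eq$Magnitude >= 7, ]

# Number of rows that pass the threshold.
nrow(strong_eq)
```

Logical indexing like this returns the same rows as the dplyr version whenever the column has no missing values.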
The next steps split the date to obtain the year, which serves as the frame of the plot. The core of the approach is to treat frame (as in, the time point within an animation) as another dimension, just like x, y, size, color, and so on. Thus, a variable in your data can be mapped to frame just as others are mapped to x or y.
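One way to picture frame as a dimension is to build the per-frame subsets by hand. The sketch below uses base R and an invented toy data frame; frame_rows is a hypothetical helper written only for this illustration and is not part of gganimate. It mimics the idea that, with cumulative = TRUE, each frame shows every point up to and including that year.

```r
# Toy data: one row per earthquake; values are invented for illustration.
toy_eq <- data.frame(
  Year      = c(1965, 1965, 1966, 1968),
  Magnitude = c(7.1, 7.4, 8.0, 7.2)
)

# Each distinct value of the frame variable becomes one frame of the animation.
frames <- sort(unique(toy_eq$Year))

# Hypothetical helper: the rows that would be drawn in a given frame.
# With cumulative = TRUE, all earlier years stay on screen as well.
frame_rows <- function(data, frame, cumulative = TRUE) {
  if (cumulative) data[data$Year <= frame, ] else data[data$Year == frame, ]
}

# Number of points visible in each frame, cumulative versus non-cumulative.
cumulative_counts <- sapply(frames, function(f) nrow(frame_rows(toy_eq, f)))
single_counts     <- sapply(frames, function(f) nrow(frame_rows(toy_eq, f, cumulative = FALSE)))
```

Comparing cumulative_counts with single_counts shows why the cumulative option suits this map: the points accumulate across the years instead of vanishing after their own frame.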
## Convert the dates to character in order to split the column into "dd", "mm" and "yy" columns
EQ$Date <- as.character(EQ$Date)
## Split the date and create a list of the parts
## (named date_parts so that it does not shadow base R's list() function)
date_parts <- strsplit(EQ$Date, "-")
## Convert the list into a data frame (ldply() comes from plyr, loaded earlier)
EQ_Date1 <- ldply(date_parts)
colnames(EQ_Date1) <- c("Day", "Month", "Year")
## Column bind with the main data frame
EQ <- cbind(EQ, EQ_Date1)
names(EQ)
## Convert Year to numeric
EQ$Year <- as.numeric(EQ$Year)
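As an alternative sketch, base R's date parsing can extract the year without splitting strings. This assumes the dates are stored as day-month-year separated by hyphens, as the split above implies. The sample values are invented, and the format string would need to change if the file uses another layout.

```r
# Invented sample dates in the assumed "dd-mm-yyyy" layout.
dates <- c("02-01-1965", "15-08-1970", "26-12-2004")

# Parse the strings as Date objects using an explicit format.
parsed <- as.Date(dates, format = "%d-%m-%Y")

# Extract the four-digit year and convert it to numeric for use as the frame.
years <- as.numeric(format(parsed, "%Y"))

years
```

A date that does not match the format is parsed as NA, so checking any(is.na(parsed)) is a quick way to spot a wrong format string.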
## Get the world map for the plot and load the necessary package
## (map_data() comes from ggplot2 and needs the maps package to be installed)
library(ggmap)
world <- map_data("world")
## Remove the Antarctica region from the world map
world <- world[world$region != "Antarctica", ]
map <- ggplot() + geom_map(data = world, map = world, aes(x = long, y = lat, map_id = region), color = '#333300', fill = '#663300')
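# Plot points on the world map
# frame = Year makes each year one frame; cumulative = TRUE keeps earlier years visible
# The original also mapped size to EQ$Magnitude, but the constant size = 2.5 overrides that mapping, so it is dropped
p <- map + geom_point(data = EQ, aes(x = Longitude, y = Latitude,
frame = Year,
cumulative = TRUE), alpha = 0.3,
size = 2.5, color = "#336600") +
geom_jitter(width = 0.1) + labs(title = "Earthquakes of magnitude 7 or above on the Richter scale") + theme_void()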
# Plot .gif file using gganimate function
gganimate(p)
As we can see, the plot covers many years, from 1965 to 2016. To speed up the visualization, we can use the animation package to fast-forward it with ani.options().
library(animation)
ani.options(interval=0.15)
gganimate(p)
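To see what the interval setting does to the running time, the arithmetic can be sketched as follows. It assumes one frame per year from 1965 to 2016, and it uses 1 second as the comparison value, which I believe is the animation package's default interval; treat both as assumptions.

```r
# One frame per year, assuming every year from 1965 to 2016 has at least one event.
n_frames <- length(1965:2016)

# Seconds per frame: the assumed default versus the value set with ani.options().
default_interval <- 1
fast_interval    <- 0.15

# Total playing time of one loop of the animation, in seconds.
default_duration <- n_frames * default_interval
fast_duration    <- n_frames * fast_interval
```

Under these assumptions, one loop shrinks from 52 seconds to about 7.8 seconds.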
This article was an introductory tutorial to the world of animated maps. Readers can try this out and apply the same approach in other projects. Some examples are:
I hope you found the article useful. If you have any questions, please feel free to ask in the comments below.
Aritra Chatterjee is a professional in the field of Data Science and Operations Management with more than 5 years of experience. He aspires to develop skills in the fields of Automation, Data Science and Machine Learning.
This post was received as part of our blogging competition – The Mightiest Pen.
gganimate package is not available for R version 3.4.0. How can I get it to work?
To install gganimate, follow the steps below:
library(devtools)
install.packages("Rcpp")
devtools::install_github("dgrtwo/gganimate")
library(gganimate)
Please install cowplot package and also update ggplot2 package before installing gganimate.