This article was published as a part of the Data Science Blogathon.
The global battle against the COVID-19 pandemic can be won only if a large part of the world's population is vaccinated against the SARS-CoV-2 virus. Vaccination rates in low-income countries, however, have been considerably lower than elsewhere. In this blog, we study COVID-19 vaccination trends across the world using Python, and we aim to derive key insights from the data that can help policymakers adjust their policies.
The country vaccinations data were downloaded from Kaggle and were last updated on March 8, 2022.
Link: https://www.kaggle.com/code/terencemao/covid-vaccination-rates/data
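The country vaccinations data contain the following columns: country, iso_code, date, total_vaccinations, people_vaccinated, people_fully_vaccinated, daily_vaccinations_raw, daily_vaccinations, total_vaccinations_per_hundred, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, daily_vaccinations_per_million, vaccines, source_name, and source_website.
This study uses Python's plotly, pandas, matplotlib, and seaborn libraries for data handling and visualization.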
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
import plotly.express as px
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objects as go
import plotly.figure_factory as ff
from plotly.colors import n_colors
from plotly.subplots import make_subplots
from wordcloud import WordCloud, ImageColorGenerator
from pywaffle import Waffle
import warnings

init_notebook_mode(connected=True)
warnings.filterwarnings("ignore")
The pandas library is used to read the CSV files in Python.
# Raw strings (r"...") keep the backslashes in Windows paths from being read as escape sequences
df_vaccination = pd.read_csv(r"C:\Users\ASUS\Downloads\archive\country_vaccinations.csv", parse_dates=['date'])
df_manufacture = pd.read_csv(r"C:\Users\ASUS\Downloads\archive\country_vaccinations_by_manufacturer.csv", parse_dates=['date'])
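As a minimal sketch of what parse_dates does, the snippet below reads a tiny in-memory CSV and checks the resulting column types. The two rows and the name sample_csv are invented for illustration and are not taken from the Kaggle file.

```python
import io

import pandas as pd

# Hypothetical two-row sample that mimics a few columns of country_vaccinations.csv
sample_csv = io.StringIO(
    "country,date,daily_vaccinations\n"
    "India,2021-01-16,1000\n"
    "India,2021-01-17,1500\n"
)

sample = pd.read_csv(sample_csv, parse_dates=["date"])

# With parse_dates, the date column is read as datetime64 rather than as plain strings
print(sample.dtypes)
```

Having a true datetime column is what makes the later time-series plots and any date-based grouping straightforward.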
DataFrame.info() prints a summary of a data frame in pandas. The df_vaccination dataset has 81,976 rows and 15 columns, and the df_manufacture dataset has 31,127 rows and 4 columns. Both datasets contain missing values.
df_vaccination.info()
RangeIndex: 81976 entries, 0 to 81975
Data columns (total 15 columns):
 #   Column                               Non-Null Count  Dtype
---  ------                               --------------  -----
 0   country                              81976 non-null  object
 1   iso_code                             81976 non-null  object
 2   date                                 81976 non-null  datetime64[ns]
 3   total_vaccinations                   41873 non-null  float64
 4   people_vaccinated                    39638 non-null  float64
 5   people_fully_vaccinated              37119 non-null  float64
 6   daily_vaccinations_raw               34033 non-null  float64
 7   daily_vaccinations                   81697 non-null  float64
 8   total_vaccinations_per_hundred       41873 non-null  float64
 9   people_vaccinated_per_hundred        39638 non-null  float64
 10  people_fully_vaccinated_per_hundred  37119 non-null  float64
 11  daily_vaccinations_per_million       81697 non-null  float64
 12  vaccines                             81976 non-null  object
 13  source_name                          81976 non-null  object
 14  source_website                       81976 non-null  object
dtypes: datetime64[ns](1), float64(9), object(5)
memory usage: 9.4+ MB
df_manufacture.info()
RangeIndex: 31127 entries, 0 to 31126
Data columns (total 4 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   location            31127 non-null  object
 1   date                31124 non-null  datetime64[ns]
 2   vaccine             31127 non-null  object
 3   total_vaccinations  31127 non-null  int64
dtypes: datetime64[ns](1), int64(1), object(2)
memory usage: 972.8+ KB
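The non-null counts above show how many values are missing only indirectly. One possible way to get the count and share of missing values per column is sketched below. The toy data frame is made up for illustration; on the real data the same calls would be applied to df_vaccination.

```python
import numpy as np
import pandas as pd

# Invented rows for illustration only, not values from the dataset
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "total_vaccinations": [10.0, np.nan, np.nan, 40.0],
    "daily_vaccinations": [10.0, 12.0, 20.0, 20.0],
})

missing_count = toy.isna().sum()                      # missing cells per column
missing_share = toy.isna().mean().mul(100).round(1)   # same, as a percentage of rows

summary = pd.DataFrame({"missing": missing_count, "missing_%": missing_share})
print(summary)
```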
# Creating a new dataset df with limited set of columns
df = df_vaccination.groupby(["country"])['people_fully_vaccinated_per_hundred'].max().reset_index()
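Taking the per-country maximum works as a stand-in for "latest reported value" under the assumption that people_fully_vaccinated_per_hundred is cumulative and therefore does not decrease over time. Since max() skips missing values by default, it also copes with countries whose most recent rows are empty. A small sketch with invented numbers:

```python
import numpy as np
import pandas as pd

# Invented rows: the last report of each country is missing
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-01", "2022-01-02"]
    ),
    "people_fully_vaccinated_per_hundred": [10.0, 12.5, np.nan, 3.0, np.nan],
})

# max() ignores NaN, so each country keeps its highest reported value
latest = (
    toy.groupby("country")["people_fully_vaccinated_per_hundred"]
    .max()
    .reset_index()
)
print(latest)
```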
Africa has the lowest percentage of fully vaccinated population of all the continents.
fig = px.choropleth(
    df,
    locations='country',
    locationmode='country names',
    color='people_fully_vaccinated_per_hundred',
    title='people_fully_vaccinated %',
    hover_data=['people_fully_vaccinated_per_hundred'],
)
fig.show()
Various vaccine schemes were used for immunization across the world. India used Covaxin and Oxford/AstraZeneca, while Russia used EpiVacCorona and Sputnik V. Pfizer, Oxford/AstraZeneca, and Moderna were used by Australia, Canada, and the UK. China used CanSino, Sinopharm, Sinovac, and ZF2001.
fig = px.choropleth(
    df_vaccination,
    locations='country',
    locationmode='country names',
    color='vaccines',
    title='Vaccination by Country',
    height=1000,
)
# Place the legend horizontally and give it a readable title
fig.update_layout({'legend_orientation': 'h'})
fig.update_layout({'legend_title': 'Vaccine scheme'})
fig.show()
The average daily vaccination count (in millions) is highest in China, followed by India, the United States, and Brazil.
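# dfdailyvaccination is not defined in the original listing; it is assumed here
# to hold the mean of daily_vaccinations per country
dfdailyvaccination = df_vaccination.groupby(["country"])['daily_vaccinations'].mean().reset_index()
fig = px.choropleth(
    dfdailyvaccination,
    locations='country',
    locationmode='country names',
    color='daily_vaccinations',
    title='Average daily_vaccinations',
    hover_data=['daily_vaccinations'],
)
fig.show()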
China has the highest total vaccination count, followed by India and the United States, owing to their large populations. Total vaccinations can also exceed a country's population, because most immunization programs give each person two doses of a COVID-19 vaccine.
# Top 10 countries by total vaccinations
vaccine = df_vaccination.groupby(["country"])['total_vaccinations'].max().nlargest(10).reset_index()
vaccine.columns = ["country", "Total vaccinations"]
fig = px.bar(vaccine, x='country', y='Total vaccinations')
fig.show()
# Top 10 countries by total vaccinations per hundred people
df1 = df_vaccination.groupby(["country"])['total_vaccinations_per_hundred'].max().nlargest(10).reset_index()
df1.columns = ["country", "total_vaccinations_per_capita"]
fig = px.bar(df1, x='country', y='total_vaccinations_per_capita', height=1000, width=1000)
fig.show()
Total vaccinations per capita are among the highest in Gibraltar, Cuba, Chile, Singapore, the UAE, Malta, and Brunei.
African countries such as Burundi, the Democratic Republic of Congo, Chad, Madagascar, and Tanzania have the lowest vaccination counts per capita.
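A per-hundred figure above 100 is possible because most programs give each person more than one dose. The arithmetic can be sketched as follows, using invented round numbers (not figures from the dataset) and a hypothetical helper named doses_per_hundred:

```python
def doses_per_hundred(total_doses: float, population: float) -> float:
    """Return the number of doses administered per 100 inhabitants."""
    return 100.0 * total_doses / population


# Invented example: 1,000,000 inhabitants, 800,000 of whom received two doses each
population = 1_000_000
total_doses = 800_000 * 2

rate = doses_per_hundred(total_doses, population)
print(rate)  # 160.0
```

Here only 80% of the population is vaccinated, yet the total vaccinations per hundred come out at 160, which is why this metric should not be read as a percentage of people.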
# Bottom 15 countries by total vaccinations per hundred people
df1 = df_vaccination.groupby(["country"])['total_vaccinations_per_hundred'].max().nsmallest(15).reset_index()
df1.columns = ["country", "total_vaccinations_per_capita"]
fig = px.bar(df1, x='country', y='total_vaccinations_per_capita', height=1000, width=1000)
fig.show()
Gibraltar, Pitcairn, the United Arab Emirates, Portugal, Cuba, Chile, the Cayman Islands, Brunei, Singapore, and Malta have the highest share of population vaccinated with at least one dose.
# Top 10 countries by people vaccinated (at least one dose) per hundred
vaccine = df_vaccination.groupby(["country"])['people_vaccinated_per_hundred'].max().nlargest(10).reset_index()
vaccine.columns = ["country", "people_vaccinated_per_capita"]
fig = px.bar(vaccine, x='country', y='people_vaccinated_per_capita')
fig.show()
Low-income countries, many of them in Africa, such as Burundi, the Democratic Republic of Congo, Haiti, Chad, Yemen, Papua New Guinea, Madagascar, and Tanzania, have the smallest share of population vaccinated with at least one dose.
# Bottom 10 countries by people vaccinated (at least one dose) per hundred
vaccine = df_vaccination.groupby(["country"])['people_vaccinated_per_hundred'].max().nsmallest(10).reset_index()
vaccine.columns = ["country", "people_vaccinated_per_capita"]
fig = px.bar(vaccine, x='country', y='people_vaccinated_per_capita')
fig.show()
By total doses administered, China's vaccine scheme (CanSino, Sinopharm, and Sinovac) is the most used, followed by India's (Covaxin, Oxford/AstraZeneca, and Sputnik V) and that of the United States (Johnson, Moderna, and Pfizer).
colors = ['#fae588', '#f79d65', '#f9dc5c', '#e8ac65', '#e76f51', '#ef233c', '#b7094c']  # color palette
# Three largest (country, vaccine scheme) pairs by total vaccinations
vaccinetotalpop = df_vaccination.groupby(["country", "vaccines"])['total_vaccinations'].max().nlargest(3).reset_index()
fig = px.treemap(
    vaccinetotalpop,
    path=['country', 'vaccines'],
    values='total_vaccinations',
    title="Total vaccinations per country grouped by vaccine scheme",
    height=800,
    width=1000,
)
fig.update_layout(font_family="Courier New", font_color="black", treemapcolorway=colors)
fig.show()
Daily vaccinations peaked in Q2 2021 in the United States and China. India and Indonesia saw a rise in Q3 2021, while daily vaccinations in Pakistan and Bangladesh spiked in March 2022.
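# Copy the columns of interest so that renaming does not touch df_vaccination
country_vaccine_time = df_vaccination[["country", "date", 'daily_vaccinations']].copy()
country_vaccine_time.columns = ["Country", "Date", "Daily vaccinations"]
countries = ['India', 'Germany', 'United Kingdom', 'United States', 'China', 'Brazil', 'Indonesia', 'Japan', 'Pakistan', 'Bangladesh']
# Keep only the selected countries (this filtering step is missing from the original listing)
country_vaccine_time1 = country_vaccine_time[country_vaccine_time['Country'].isin(countries)]
fig = px.line(country_vaccine_time1, x="Date", y="Daily vaccinations", color='Country')
fig.show()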
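Reading peak quarters off a line chart can be cross-checked numerically. One possible approach, sketched below on invented data, is to label each row with its calendar quarter, average the daily vaccinations per country and quarter, and keep the quarter with the highest average. The toy values and the names quarterly and peak are assumptions for illustration; the column names follow the renamed data frame above.

```python
import pandas as pd

# Invented daily figures for two placeholder countries, one row per quarter of 2021
toy = pd.DataFrame({
    "Country": ["A"] * 4 + ["B"] * 4,
    "Date": pd.to_datetime(
        ["2021-02-15", "2021-05-15", "2021-08-15", "2021-11-15"] * 2
    ),
    "Daily vaccinations": [100, 900, 400, 200, 50, 100, 700, 300],
})

# Label each row with its calendar quarter, e.g. 2021Q2
toy["Quarter"] = toy["Date"].dt.to_period("Q")

# Mean daily vaccinations per country and quarter
quarterly = (
    toy.groupby(["Country", "Quarter"])["Daily vaccinations"]
    .mean()
    .reset_index()
)

# For each country, keep the row of the quarter with the highest mean
peak = quarterly.loc[quarterly.groupby("Country")["Daily vaccinations"].idxmax()]
print(peak)
```

On the real data, the same steps could be applied to country_vaccine_time1 in place of the toy frame.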
The total COVID-19 vaccination count (in billions) is highest in China, followed by India, the United States, and Brazil. Total vaccinations per capita, however, are among the highest in Gibraltar, Cuba, Chile, Singapore, the UAE, Malta, and Brunei.
By total doses administered, China's vaccine scheme (CanSino, Sinopharm, and Sinovac) is the most used, followed by India's (Covaxin, Oxford/AstraZeneca, and Sputnik V) and that of the United States (Johnson, Moderna, and Pfizer).
The analysis suggests that countries in Africa have extremely low vaccination rates and are far behind the other continents. International organizations such as the WHO should therefore intervene to ensure equitable distribution of vaccines across the world.