Automating work is one of the quickest ways to improve efficiency. When success depends on speed, automating repetitive tasks matters in every industry, even at the most basic level. Yet many of us never learn how to automate tasks and end up doing the same things by hand again and again. One skill worth learning is using Python to automate tasks in Google Sheets.
This article walks through setting up a Google service account step by step. We will use Python to read Google Sheets data through the Google APIs and then write results back to the spreadsheet. Specifically, we will read cricket commentary data from a spreadsheet, calculate the runs scored by each batsman, and upload the results to a separate sheet.
So, you will learn how to read Google Sheets using Python. We will explore the essential libraries, such as gspread, and demonstrate how to authenticate and access your data effectively. By the end, you’ll understand how to leverage Python in Google Sheets for enhanced data manipulation and analysis. Whether you’re a beginner or looking to refine your skills, this guide will provide actionable steps and valuable insights on working with Google Sheets in Python.
This tutorial requires Python 3 and pip3 to be installed on your local computer. In case you are unfamiliar with Python, do have a look at our free course Introduction to Python.
This article was published as a part of the Data Science Blogathon.
We often spend hours every day extracting data, copy-pasting it into spreadsheets, and creating reports. It would be much better to run a script that uploads the data to the spreadsheet and prepares the report with a single click. Report automation has several advantages: it saves time on data collection, removes typos, and lets us focus on the analysis. Let’s find out how we can do this.
There are several ways to get Python code to output to Google Sheets.
However, we are using gspread in this tutorial.
To read and update data in Google Sheets from Python, we have to create a service account. According to the Google Cloud docs, it is a special type of account used to make authorized API calls to Google Cloud services.
First of all, make sure that you have a Google account. If you do, you can follow these steps to create a Google service account.
Next, we will add two APIs to our project: the Google Drive API and the Google Sheets API.
The Google Drive API allows you to access resources from Google Drive.
We will read the commentary data of the India-Bangladesh cricket match. You can access the data (.csv) from here.
We have ball-by-ball data of the complete match in the spreadsheet. Now, we will do a fundamental task and calculate how many runs are scored by each batsman. We can do this by using a simple groupby in pandas. And finally, we will append the results in a separate sheet.
Provide access to the Google sheet.
Now we need to share the Google Sheet so that the API can access it. Open the JSON file that we downloaded from the developer’s console, look for the client_email field, and copy its value.
Then click the Share button on the spreadsheet and give this client email access.
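Instead of opening the key file by hand, the address can also be read with a few lines of Python. This is a minimal sketch that assumes the key file is the standard JSON download containing a client_email field; the helper name read_client_email is only illustrative.

```python
import json


def read_client_email(key_path):
    """Return the service account's email address from a JSON key file."""
    with open(key_path, encoding="utf-8") as handle:
        key_data = json.load(handle)
    # "client_email" is the field to share the spreadsheet with; if it is
    # missing, the file is probably not a service-account key.
    if "client_email" not in key_data:
        raise ValueError(f"{key_path} has no 'client_email' field")
    return key_data["client_email"]
```

Calling read_client_email('add_json_file_here.json') should give the address to paste into the Share dialog.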
Now we are ready to access the Google Sheet data from Python. The steps are as follows:
1. Import libraries
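We will use the gspread and oauth2client libraries to authorize and make API calls to Google Cloud services.
You can use the following commands to install the gspread and oauth2client Python libraries. The leading ! runs them from a notebook cell; drop it when running them in a terminal.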
!pip3 install gspread
!pip3 install --upgrade google-api-python-client oauth2client
#importing the required libraries
import gspread
import pandas as pd
from oauth2client.service_account import ServiceAccountCredentials
2. Define the scope of the application
Then, we will define the scope of the application and add the JSON file with the credentials to access the API.
# define the scope
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
# add credentials to the account
creds = ServiceAccountCredentials.from_json_keyfile_name('add_json_file_here.json', scope)
# authorize the client
client = gspread.authorize(creds)
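The oauth2client library is deprecated, and recent gspread releases include a service_account helper that performs the same authorization without it. The sketch below assumes such a gspread release is installed; the function name open_client is only illustrative, and the scopes listed are the Sheets and Drive scopes.

```python
# Scopes for the Google Sheets and Google Drive APIs.
SCOPES = [
    "https://www.googleapis.com/auth/spreadsheets",
    "https://www.googleapis.com/auth/drive",
]


def open_client(key_path):
    """Authorize with a service-account key file and return a gspread client."""
    # Imported inside the function so this module still loads on machines
    # where gspread has not been installed yet.
    import gspread

    return gspread.service_account(filename=key_path, scopes=SCOPES)
```

The returned client can be used in the same way as the one created with gspread.authorize.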
3. Create the sheet instance
Use the client object and open the sheet. You can either pass the title of the sheet as the argument or pass the URL of the sheet.
Access a particular sheet: A single spreadsheet can contain multiple sheets. You can access a particular one by passing its index to the get_worksheet function. For the first sheet, pass the index 0, and so on.
# get the instance of the Spreadsheet
sheet = client.open('commentary data')
# get the first sheet of the Spreadsheet
sheet_instance = sheet.get_worksheet(0)
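Opening by title or by URL can be wrapped in one small helper. This is a sketch, not part of gspread: open_worksheet is an illustrative name, and it only relies on the client's open and open_by_url methods and the spreadsheet's get_worksheet method.

```python
def open_worksheet(client, title=None, url=None, index=0):
    """Open a spreadsheet by title or by URL and return one of its worksheets."""
    if (title is None) == (url is None):
        raise ValueError("pass exactly one of title or url")
    # client.open looks the spreadsheet up by its title;
    # client.open_by_url uses the link copied from the browser.
    if title is not None:
        spreadsheet = client.open(title)
    else:
        spreadsheet = client.open_by_url(url)
    # Worksheets are numbered from 0, so index 0 is the first sheet.
    return spreadsheet.get_worksheet(index)
```

For example, open_worksheet(client, title='commentary data') returns the same first sheet as the code above.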
Basic functionalities
The API provides some basic functionality, such as getting the number of columns with col_count and reading the value of a particular cell. Here are some examples.
# get the total number of columns
sheet_instance.col_count
## >> 26
# get the value at the specific cell
sheet_instance.cell(col=3,row=2)
## >> <Cell R2C3 '63881'>
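The R2C3 in the output means row 2, column 3, which is the cell shown as C2 in the spreadsheet's own A1 notation. The hand-written helper below only illustrates that mapping; gspread ships its own converter in gspread.utils, and a worksheet's acell method accepts A1 labels directly.

```python
def rowcol_to_a1(row, col):
    """Convert 1-based (row, col) coordinates to an A1 label, e.g. (2, 3) -> 'C2'."""
    if row < 1 or col < 1:
        raise ValueError("row and col are 1-based")
    letters = ""
    while col > 0:
        # Columns behave like a base-26 number without a zero digit.
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return f"{letters}{row}"


print(rowcol_to_a1(2, 3))   # C2
print(rowcol_to_a1(1, 27))  # AA1
```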
4. Get all records
Then, we will get all the data in the sheet using the get_all_records function. It returns a list of dictionaries, one per row, keyed by the column headers in the first row.
# get all the records of the data
records_data = sheet_instance.get_all_records()
# view the data
records_data
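To make the shape of that result concrete, the sketch below builds the same kind of structure from plain rows. It is an illustration only, not gspread's implementation, and the sample rows are made up; the real sheet has more columns.

```python
def rows_to_records(rows):
    """Turn [header, row1, row2, ...] into a list of dicts keyed by the header."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample rows for illustration.
sample_rows = [
    ["Batsman_Name", "Runs"],
    ["Player A", 4],
    ["Player B", 0],
]
records = rows_to_records(sample_rows)
print(records[0])  # {'Batsman_Name': 'Player A', 'Runs': 4}
```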
5. Convert the dictionary to the dataframe
In data science, pandas is one of the most preferred libraries for data manipulation tasks. So we will first convert the list of records to a pandas dataframe.
In case you are not comfortable with pandas, I would highly recommend you enroll in this free course: Pandas for Data Analysis in Python
# convert the list of records to a dataframe
records_df = pd.DataFrame(records_data)
# view the top records
records_df.head()
6. Grouping batsman
Then, we will group the data by batsman, add up the runs each one scored, and upload that dataframe to a separate sheet.
# total runs scored by each batsman (sum adds the runs; count would only count rows)
runs = records_df.groupby('Batsman_Name')['Runs'].sum().reset_index()
runs
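When grouping ball-by-ball data, adding up the Runs values and counting the rows give different answers: the first is the runs scored, the second is the number of deliveries recorded for the batsman. The plain-Python sketch below shows both side by side on made-up records; the function name and the sample data are only illustrative.

```python
from collections import defaultdict


def runs_and_balls(records):
    """Return {batsman: (total_runs, rows_counted)} from ball-by-ball records."""
    totals = defaultdict(lambda: [0, 0])
    for record in records:
        entry = totals[record["Batsman_Name"]]
        entry[0] += record["Runs"]  # what a groupby sum adds up
        entry[1] += 1               # what a groupby count counts
    return {name: tuple(entry) for name, entry in totals.items()}


# Made-up sample records for illustration.
sample = [
    {"Batsman_Name": "Player A", "Runs": 4},
    {"Batsman_Name": "Player A", "Runs": 0},
    {"Batsman_Name": "Player B", "Runs": 1},
]
print(runs_and_balls(sample))  # {'Player A': (4, 2), 'Player B': (1, 1)}
```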
Now, we will add this dataframe to Google Sheets.
The following are the steps to update data in Google Sheets.
1. Create a separate sheet.
First, we will create a separate sheet to store the results. For that, use the add_worksheet function and pass the sheet’s title and the number of rows and columns required. After that, get the instance of the second sheet by providing its index, which is 1. Once you run this command, you will see that a separate sheet has been created.
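The step above can be sketched as one helper that creates the sheet and writes a table to it. This is a sketch under assumptions: upload_table is an illustrative name, the data is written with the worksheet's update method starting at cell A1, and the worksheet returned by add_worksheet is used directly instead of being fetched again by index.

```python
def upload_table(spreadsheet, title, header, rows):
    """Create a worksheet sized to the table and write header + rows to it."""
    table = [list(header)] + [list(row) for row in rows]
    worksheet = spreadsheet.add_worksheet(
        title=title, rows=len(table), cols=len(header)
    )
    # Keyword arguments are used because the positional order of
    # Worksheet.update differs between gspread releases.
    worksheet.update(range_name="A1", values=table)
    return worksheet
```

With the runs dataframe from step 6, a call could look like upload_table(sheet, 'runs', runs.columns.tolist(), runs.values.tolist()), where the title 'runs' is just an example.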
To summarize, in this article we went through the steps involved in creating a service account and saw how to read and write Google Sheets right from your Python console using the Google Sheets API. We downloaded the spreadsheet data, converted it into a pandas dataframe, created a groupby table, and uploaded it to the spreadsheet again. This API can be very useful for automating reports.
I hope you liked the article and now understand how to use Python to read Google Sheets. If you have any questions about Python in Google Sheets or how to read a Google Sheet in Python, feel free to ask!
A. We can use the gspread and oauth2client Python libraries to read Google spreadsheets. The article above explains how to read data from Google spreadsheets using Python.
A. With Google Sheets, you can create and edit spreadsheets directly in your web browser without any specific software, and use it in place of Excel. Multiple people can work simultaneously, you can see people’s changes as they make them, every change is saved automatically, and you can download the data as CSV files.
A. Yes, you can write Python in VS Code and get started with Python programming quickly.
Yes, you can code in Google Sheets using Google Apps Script. It lets you automate tasks and extend functionality using JavaScript.
You can automate Google Sheets using Google Apps Script, which is based on JavaScript. It allows you to create custom functions, automate tasks, format data, send emails, generate reports, and integrate with other Google services. Just open your Google Sheet, go to “Extensions” > “Apps Script,” and start writing your scripts to automate tasks.
Hi, unable to open commentary data from spyder, console throws SpreadsheetNotFound error. while sharing the spreadsheet on email in the json file the mail bounced back.
Hi Puneet, Please make sure that you have a copy of the spreadsheet on your google drive and the spreadsheet is shared with your Google service account.
Bro thankyou very much. It worked...
Hi, unable to open commentary data from spyder, console throws SpreadsheetNotFound error. while sharing the spreadsheet on email in the json file the mail bounced back. I made sure that I have a copy of the spreadsheet on my google drive and the spreadsheet is shared with my Google service account. but it still won't work.