SQL commands for Commonly Used Excel Operations

Sunil Ray Last Updated : 05 Mar, 2016
7 min read

Introduction

Learning SQL after Excel couldn’t be simpler!

I’ve spent more than a decade working on Excel. Yet, there is so much to learn. If you dislike coding, excel could be your rescue into data science world (to some extent). Once you understand Excel operations, learning SQL is very easy.

Why can’t you use Excel for serious data science work?

Now at this stage, you might ask, why can’t I use excel for all my work. There are several reasons for it:

  1. For large datasets, excel is not effective. Computations on large datasets either won’t happen or would take a lot of time. Just a caveat: Microsoft recently launched Power BI and I need to explore it. It might have changed the boundaries of large data.
  2. There is no audit trail in Excel. With tools based on coding and work flow management, you can re-look and re-run the process again and again. It is very difficult to do so in excel. If you change or delete a cell in Excel in an accidental manner, it is difficult to trace it back.
  3. Finally, Excel takes a lot of time to update libraries with latest algorithms in data science and machine learning. Try searching for XGboost and FTRL in excel!

Moving to SQL would address point 1 and point 2 to some extent. Moreover, SQL is one of the most sought out skills in a data scientist.

If you don’t know SQL yet and have worked in Excel, you can get started right now. I’ve designed this tutorial with keeping in mind, the most commonly used excel operations. Your previous experience blended with this tutorial can quickly make you a SQL expert. (Note: If you find any trouble, please write to me in comments section below.

Related : Basics of SQL and RDBMS for Beginners

abc

 

List of Common Excel Operations

Here is the list of commonly used excel operation. In this tutorial, I’ve performed all these operations in SQL:

  1. View Data
  2. Sort Data
  3. Filter Data
  4. Delete Records
  5. Add Records
  6. Update Data in Existing Record
  7. Show Unique Values
  8. Write an expression to generate new column
  9. LookUp data from another table
  10. Pivot Table

To perform operations listed above, I’ll use the data listed below (Employee)\:
Table

 

1. View Data

In excel, we can view all the records directly. But, SQL requires a command to process this request. This can be done by using SELECT command.

Syntax:

SELECT column_1,column_2,…column_n | *  FROM table_name;

Exercise:

A. View all data of Employee table

Select * from Employee;All

B. View only ECODE and Gender data of Employee table

Select ECODE, Gender from EmployeeSelected_Cols

 

2. Sort Data

Organizing information becomes important when you have more data. It helps to generate quick inferences. You can quickly organize an excel worksheet by sorting your data in ascending or descending order.

Syntax:

SELECT column_1,column_2,…column_n | *  FROM table_name order by column_1 [desc], column_2 [desc];

Exercise:

A. Arrange records of Employee table in Descending order of Total_Payout.

Select * from Employee order by Total_Payout desc;

Sort_Data_Set

B. Arrange records of Employee table by City (ascending) and Total_Payout(descending).

Select * from Employee order by City, Total_Payout desc;

Multi_Variable_Sort

 

3. Filter Data

In addition to sorting, we often apply filter to analyze data in a better way. When data is filtered, only rows that meet the filter criteria is displayed while other rows get hidden. Also, we can apply multiple criterion to filter data.

Syntax:

SELECT column_1,column_2,…column_n | *  FROM table_name where column_1 operator value;

Below are the common list of operators, we can use to form a condition.

Operator Description
= Equal
<> Not equal. Note: In some versions of SQL this operator may be written as !=
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
IN To specify multiple possible values for a column

 

Exercise:

A. Filter observations associated with city “Delhi”

Select * from Employee where City="Delhi";

Subset

B. Filter observations of Department “Admin” and Total_Payout >= 500

Select * from Employee where Department="Admin" and Total_Payout >=500;

Subset_1

 

4. Delete Records

Deleting records or columns is a commonly used operation in Excel. In excel, we simply press ‘Delete’ key on keyboard to delete a record. Similarly, SQL has command DELETE to remove records from a table.

Syntax:

DELETE FROM table_name WHERE some_column=some_value;

Exercise:

A. Delete observations which have Total_Payout >=600

Delete * from Employee where Total_Payout >=600;

It removes two record only since these two observations satisfy the condition stated above. But be careful! if we do not provide any condition, it will remove all records from a table.

B. Delete observations which have Total_Payout >=600 and Department =”Admin”

Delete * from Employee where Total_Payout >=600 and Department ="Admin";

Above command will remove only one record which satisfies the condition.

 

5. Add records

We have seen methods to remove records, we can also add records to SQL table as we do in excel. INSERT command helps to perform this operation.

Syntax:

INSERT INTO table_name VALUES (value1value2value3,…); -> Insert values to all columns

OR,

INSERT INTO table_name (column1,column2,column3,…) VALUES (value1,value2,value3,…); -> Insert values to selected columns

Exercise:

A. Add below records to the table ‘Employee’
Add_Records

Insert into employee values('A002','05-Nov-12',0.8,'Female','Admin',12.05,26,313.3,'Mumbai');
Select * from Employee where ECODE='A002';

Insert_All

B. Insert values to ECODE (A016) and Department (HR) only.

Insert into employee (ECODE, Department) values('A016','HR');
Select * from Employee where Department='HR';

Insert_Selected

 

6. Update Data in Existing Observations

Suppose, we want to update the name of “HR” department to “Manpower” for all employees. For such cases, SQL has a command UPDATE which performs this function.

Syntax:

UPDATE table_name SET column1=value1,column2=value2,… WHERE some_column=some_value;

Exercise: Rename Department “HR” to “Manpower”

Update Employee SET Department='Manpower' where Department='HR';

Select * from Employee; Update

7. Show unique values

We can show unique values of variable(s) by applying DISTINCT keyword prior to variable name.

Syntax: 

SELECT DISTINCT column_name,column_name FROM table_name;

Exercise: Show unique values of City

Select distinct City from Employee;City

8. Write an expression to generate new column

In excel, we can create a column, based on existing column using functions or operators. This can be done in SQL using the commands below.

Exercise:

A. Create a new column Incentive which is 10% of Total_Payout

Select *, Total_Payout*01 as Incentive from Employee; Incentive

B. Create a new column City_Code which has first three characters of City.

Select *, Left(City,3) as City_Code from Employee where Department="Admin";

String

For more details on SQL functions, I would recommend you to refer this link.

 

9. LookUp data from another table

The most used function of excel by any BI professional / data analyst is VLOOKUP(). It helps to map data from other table to parent table. In other words, we can say that it is the ‘excel’ way of joining 2 data sets through a common key.

In SQL, we have similar functionality known as JOIN.

SQL JOIN is used to combine rows from two or more tables, based on a common field between them. It has multiple types:

  • INNER JOIN: Return rows when there is a match in both tables
  • LEFT JOIN: Return all rows from the left table and the matched rows from the right table
  • RIGHT JOIN: Return all rows from the right table and the matched rows from the left table
  • FULL JOIN: Return all rows when there is a match in ONE of the tables

Syntax: 

SELECT table1.column1, table2.column2..... FROM table1 INNER | LEFT| RIGHT| FULL JOIN table2 ON table1.column = table2.column;

Exercise: Below is city category table “City_Cat”, now I want to map city category to Employee table and show all records of Employee table.City_MappingHere, I want to show all records of table Employee. So, we will use left join.

SELECT Employee.*,City_Cat.City_Category FROM Employee LEFT JOIN City_Cat ON Employee.City = City_Cat.City;

Left_Join

To know more about JOIN operations, I would recommend you to refer this link.

 

10. Pivot Table

Pivot Table is an advanced way of analyzing data in excel. Not only it is useful, but it allows you to extract the hidden insights from data.

Moreover, it helps us to generate inference by summarizing data and allow us to manipulate it in different ways. This operation can be done in SQL by using aggregate functions and GROUP BY command.

Syntax:

SELECT column, aggregate_function(column) FROM table WHERE column operator value GROUP BY column;

Exercise: 

A. Show sum of Total_Payout by Gender

SELECT Gender, Sum(Total_Payout) from Employee Group by Gender;

Pivot1B. Show sum of Total_Payout and Count of Records by Gender and City

SELECT Gender, City, Count(City), Sum(Total_Payout) from Employee Group by Gender, City;

Pivot2

 

End Notes

Once you work on SQL, you would realize data handling and manipulation can be much faster. For installation, mysql is open source. You can install it and get started.
In this article, we have looked at the SQL commands for 10 common excel operations like view, sort, filter, delete, lookup and summarizing data. We also looked at the different operators and types of Joins to perform SQL operation seamless.
Also See: If you have any doubts pertaining to SQL, feel free to discuss with us.

Did you find the article useful? Do let us know your thoughts about this transition guide in the comments section below

If you like what you just read & want to continue your analytics learning, subscribe to our emailsfollow us on twitter or like our facebook page.

Sunil Ray is Chief Content Officer at Analytics Vidhya, India's largest Analytics community. I am deeply passionate about understanding and explaining concepts from first principles. In my current role, I am responsible for creating top notch content for Analytics Vidhya including its courses, conferences, blogs and Competitions.

I thrive in fast paced environment and love building and scaling products which unleash huge value for customers using data and technology. Over the last 6 years, I have built the content team and created multiple data products at Analytics Vidhya.

Prior to Analytics Vidhya, I have 7+ years of experience working with several insurance companies like Max Life, Max Bupa, Birla Sun Life & Aviva Life Insurance in different data roles.

Industry exposure: Insurance, and EdTech

Major capabilities: Content Development, Product Management, Analytics, Growth Strategy.

Responses From Readers

Clear

Srini
Srini

Just to make sure... Does this mean you can use SQL in Excel instead of filters? or is this an article on how to use SQL by exporting it to a RDBMS?

A. Kumar
A. Kumar

I am completely new for this (SQL). I want to learn SQL because it will help me in learning of Analytics (R). This beginner information is invaluable for me. Thanks for sharing ! Trust you will continue your Zeal in New Year 2016. I wish you a very Happy New Year 2016 in advance ! Regards, A. Kumar

Kiran
Kiran

Really useful blog for beginner like I am.. very basic question .. pls dont laugh :) but on which platform should we write these command .. in excel only ? or any SQL has separate application where we connect excel as a database.

We use cookies essential for this site to function well. Please click to help us improve its usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy & Cookies Policy.

Show details