In structured query language (SQL), mastering Common Table Expressions (CTEs) is akin to unlocking a powerful data manipulation and analysis toolset. CTEs SQL, often abbreviated as CTE, are the cornerstone of efficient SQL coding. They offer a streamlined approach to handling complex queries and enhance code readability.
CTE SQL, or Common Table Expression, serves as a temporary table within SQL, providing a concise and efficient way to reference and manipulate data from the original table. With CTEs, developers can streamline their code, focusing only on the necessary columns and operations, thus enhancing efficiency and readability. This comprehensive guide delves deep into the realm of CTEs, exploring their syntax, applications, and best practices.
This article was published as a part of the Data Science Blogathon.
CTE is the short form for Common Table Expressions. CTE is one of the most powerful tools of SQL (Structured Query Language), and it also helps to clean the data. SQL is the concept used to simplify coding and help get results as quickly as possible. CTE in SQL is the temporary table used to reference the original table. If the original table contains too many columns and requires only a few, we can make CTE (a temporary table) containing only the required columns.
Complex problems can be solved easily with good readability of code using CTE. A code should always be easier to read so that if you refer to it after some time, you will immediately get that code. Even if the third person is referring to your code, he/she can understand it very well if the readability of your code is easy and good. CTE function in SQL is one of the most important concepts in SQL. If you know about CTE, you are stepping into the world of advanced SQL. Common Table Expressions (CTE) helps Data analysts, data scientists, or any other analysts who want to extract their data efficiently and effectively from a table containing big data.
Non-recursive Common Table Expressions simplify complex queries and improve code readability. They serve as a temporary result set, allowing developers to logically break down and organize various parts of the query, which can be referenced later within the same query.
Recursive Common Table Expressions enables querying hierarchical or iterative data structures, such as parent-child relationships, tree traversal, or finding the shortest path in a graph. They consist of an initial, non-recursive part (called anchor member) and a recursive part (called recursive member), referencing the CTE in SQL.
CTEs provide various advantages that make them a valuable tool for SQL developers:
CTEs break down complex queries into smaller, more manageable parts. By naming each part and isolating it within a SQL CTE, developers can easily understand the purpose of each component and the overall query logic, making the code more readable.
CTE function in SQL improve readability and make the code easier to maintain. When updating or modifying a query, developers can quickly identify the relevant CTEs SQL and make changes without affecting the rest. This modularity simplifies troubleshooting and reduces the likelihood of introducing errors.
CTEs in SQL allow developers to encapsulate specific portions of a query’s logic, promoting a modular approach to query design. This abstraction makes the query more maintainable and promotes code reusability, as the same CTE can be referenced multiple times within a query, reducing code duplication.
Let us now learn how to create and use CTE in SQL.
To define a CTE, use the WITH keyword, followed by the CTE name, optional column list, and the AS keyword with a query expression in parentheses:
WITH CTE_NAME AS
(
SELECT column_name1, column_name2,..., column_nameN
FROM table_name
WHERE condition
)cte in sql
SELECT column_name1, column_name2,..., column_nameN
FROM CTE_NAME;
Let us now utilize Common Table Expressions (CTEs) to select and manipulate data in SQL queries.
After defining a CTE, you can select data from it using a standard SELECT statement:
SELECT column1, column2
FROM CTE_name
WHERE ...
You can use CTEs to manipulate data in your base tables. To insert, update, or delete data, reference the CTE in the respective statement:
-- Insert
INSERT INTO target_table
SELECT ... FROM CTE_name ...
-- Update
UPDATE target_table
SET ...
FROM CTE_name
WHERE ...
-- Delete
DELETE target_table
FROM target_table
JOIN CTE_name ON …
To use multiple CTEs, chain them together, separating each definition with a comma:
Here is a tutorial on how to implement CTE in SQL server.
Below code is executed in MySQL:
-- Creating databse
CREATE DATABASE employee_details;
-- Use database
USE employee_details;
-- Creating table
CREATE TABLE employee
(
emp_id INT PRIMARY KEY,
emp_name `VARCHAR`(20),
email_id `VARCHAR`(30),
city `VARCHAR`(25),
Verification_status `CHAR`(3)
);
-- Inserting values into the table
INSERT INTO employee VALUES (1,'Suhas','[email protected]','Mumbai','Yes');
INSERT INTO employee VALUES (2,'Yohani','[email protected]','Mumbai','No');
INSERT INTO employee VALUES (3,'Reshama','[email protected]','Pune','Yes');
INSERT INTO employee VALUES (4,'Raj','[email protected]','Bangalore','No');
INSERT INTO employee VALUES (5,'Shivani','[email protected]','Bangalore','Yes');
SELECT * FROM employee;
The Output is given below:
The Output is given below:
In the example above, I created a table named ’employee’ containing employee details.
The description of the column list of the ’employee’ table is listed below:
Suppose we want to extract specific information about employees; then, we can achieve this by creating SQL CTE.
If we want to check the verification status of employees, whether they are verified or not, and see the records of the employees whose verification is not done yet, we require only two columns in our CTE (temporary table): emp_name and Verification_status, which contain the value ‘No’ from the table ’employee.’
The following code shows the example of creating CTE in SQL using the table ’employee’:
-- Creating CTE
WITH mycte AS
(
SELECT emp_name , Verification_status
FROM employee
WHERE Verification_status = 'No'
)
SELECT * FROM mycte;
The Output is given below:
Here, ” mycte ” is the name of the CTE (temporary table). ” mycte contains the columns emp_name and Verification_status. It contains the details of the employees whose incomplete verification is what we wanted to achieve.
We can make one or more than one CTE from the same table separated by a comma.
This error occurs because you create the Common Table Expression but do not use it when selecting any of the columns from CTE. See the corrected code above for creating CTE, where I have created CTE and then selected the columns from this CTE in SQL. So, it is necessary to use the CTE function in SQL by selecting the required columns to avoid the error. It is important to note that you can use the CTE in a query where you created it. But you cannot use this Common Table Expression in later queries, that is, inside the queries that will be created later. For example, I created a Common Table Expression named “mycte” in the above example. Then my next SQL query is as follows:
SELECT *
FROM mycte;
This SQL query is not valid. Because here in this SQL query, I cannot use “mycte” CTE.
” mycte ” CTE does not exists for this SQL query.
” mycte ” CTE exists for the only SQL query where it has been created. That is why CTE (Common Table Expression) is known as a Temporary table in SQL.
Now, let us understand how to use the JOIN query inside CTE. Consider the MySQL code below.
-- Creating table product
CREATE TABLE product
(
p_id INT PRIMARY KEY,
p_name `VARCHAR`(20),
category `VARCHAR`(30)
);
-- Creating table sales
CREATE TABLE sales
(
p_id INT PRIMARY KEY,
p_name `VARCHAR`(20),
gross_sales DECIMAL
);
-- Inserting values into the table 'product'
INSERT INTO product VALUES (1, 'Mobile', 'Electronics');
INSERT INTO product VALUES (2, 'TV', 'Electronics');
INSERT INTO product VALUES (3, 'Car', 'Toy');
INSERT INTO product VALUES (4, 'Video game', 'Toy');
INSERT INTO product VALUES (5, 'Earphones', 'Electronics');
-- Inserting values into the table 'sales'
INSERT INTO sales VALUES (1, 'Mobile', 50000);
INSERT INTO sales VALUES (2, 'TV', 40000);
INSERT INTO sales VALUES (3, 'Car', 50000);
INSERT INTO sales VALUES (5, 'Earphones', 500000);
-- Show all columns from the table 'product'
SELECT * FROM product;
-- Show all columns from the table 'product'
SELECT * FROM sales;
Output :
“product ” table:
” sales ” table:
In the above example, I have created two tables product and sales.
The description of the columns of the table ” product ” :
The description of the columns of the table ” sales ” :
Then, I inserted values into the tables product and sales .
-- Creating CTE
WITH TEMP_CTE AS
(
SELECT p.category AS category,
COUNT(*) AS No_of_products,
SUM(s.gross_sales) AS Total_gross_sales
FROM product p JOIN sales s
ON p.p_id=s.p_id
GROUP BY category
ORDER BY Total_gross_sales DESC
)
SELECT * FROM TEMP_CTE;
Now, we have to find the gross sales in each product category and the number of products in each category.
To achieve this, I created the Common Table Expression named “TEMP_CTE”. I have used JOIN to join the two tables, “product” and “sales.” I wanted to join the tables over the records with matches in both tables. Hence, I have used INNER JOIN. INNER JOIN is also called JOIN. So, if you write JOIN in your query instead of INNER JOIN, it will be valid only. p_id is the common column in both tables.
To find the total gross sales and no. of products, we need to use aggregate functions sum() and count(), respectively. To use the aggregate functions, the GROUP BY clause has to be there. So, we have grouped our results by category and found the sum of the gross sales in each category and the product count.
Lastly, Total gross sales are ordered in descending order to see the category with the highest gross sales.
CTEs in SQL Server provide numerous benefits compared to traditional methods for creating complex queries. Here is the reason why they are helpful:
Enhanced Clarity: Breaking down complex queries into smaller, manageable CTEs can greatly improve the readability of your code. This will simplify the process of maintaining and altering the queries for both you and others in the future.
Modularization: Common Table Expressions behave as temporary named result sets that can be referenced within the main query. This methodical strategy helps in concentrating on precise data alterations and computations in every CTE, thus simplifying the overall logic comprehension.
Reusability allows for referencing a CTE multiple times in a query because it is a named result set. This eliminates the necessity of repeating intricate subqueries, cutting down on redundancy and enhancing code efficiency.
Hierarchical Data Management: Recursive CTEs are a unique form of CTE designed for handling hierarchical data structures. This is especially handy for tasks such as navigating bills-of-materials or organizational charts
CTEs and subqueries often have similar performance characteristics. However, CTEs can improve readability and maintainability in complex scenarios. In some cases, the query optimizer might generate better execution plans for CTEs, resulting in faster performance.
The Common Table Expressions are temporary result sets that exist only for the scope of a single query and are not stored in the database. They are used for readability and maintainability.
Temp tables are physical tables stored temporarily in the database. They allow indexing, modification, and access across multiple queries within the same session.
Common Table Expressions (CTEs) in SQL are a powerful tool for simplifying complex queries and enhancing code readability. By breaking down intricate SQL operations into modular, maintainable components, CTEs empower developers to unleash the full potential of their database queries. Whether basic data retrieval or tackling recursive operations, CTEs offer a versatile solution for various SQL challenges. With their ability to streamline coding processes and improve query structure, CTEs are indispensable for data analysts, data scientists, and other professionals working with large datasets.
By diving into the world of CTEs, SQL developers can elevate their skills and optimize their database operations, ultimately driving greater efficiency and effectiveness in data manipulation. Some of the key takeaways are:
If you are giving an interview for any analyst position or data scientist, the interviewer checks the knowledge of SQL. That time, CTE plays a vital role in showcasing your knowledge and letting the interviewer see your problem-solving skills and capabilities with greater complexity.
I hope this article about Common Table Expressions finds you insightful and helps you in your knowledgeable data journey with SQL. Hopefully, your knowledge of salt in the food has increased! Also, you will get a proper information about the cte in SQL server.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
A. A Common Table Expression (CTE) in SQL is a temporary result set that you can reference within a SELECT
, INSERT
, UPDATE
, or DELETE
statement. It helps to simplify complex queries by breaking them into more manageable parts.
A. CTEs can be better than subqueries for readability and maintenance. They make queries easier to understand by giving parts of the query a name, which is helpful in complex operations. However, performance-wise, they are often similar to subqueries.
A. In an SQL interview, a CTE might be discussed to test your understanding of writing and optimizing queries. You might be asked to explain its syntax, advantages, and how it differs from subqueries or temporary tables.
A. A CTE is a temporary, named result set defined within the execution scope of a single SELECT
, INSERT
, UPDATE
, or DELETE
statement. A table, on the other hand, is a permanent data storage structure in the database. CTEs are for short-term use within a query, while tables store data persistently.