Nested Queries in SQL

Ayushi Trivedi Last Updated : 24 Sep, 2024

Introduction

Imagine you’re trying to find a specific piece of information from a giant library where some books have other smaller books inside them. To find the right answer, you may need to first look at the smaller books, then use that information to find the larger one. This is exactly how nested queries in SQL work! By placing one query inside another, you can extract complex data with ease. In this guide, we’ll explore how nested queries function and how you can harness their power in SQL for more efficient database management.

Learning Outcome

Understand what nested queries (subqueries) are in SQL.
Write and implement nested queries within various SQL statements.
Differentiate between correlated and non-correlated nested queries.
Optimize SQL queries using nested structures for improved performance.

What Are Nested Queries in SQL?
Types of Nested Queries in SQL
Use Cases for Nested Queries
Common Mistakes with Nested Queries
Frequently Asked Questions

What Are Nested Queries in SQL?

A nested query, also known as a subquery, is an SQL query placed inside another SQL query. The result of the inner query (the subquery) is used by the outer query to achieve the desired outcome. This approach is particularly useful when the outer query needs a value or set of values that must first be computed from the data, such as an average or a list of matching IDs.

Basic Syntax

SELECT column_name(s)  
FROM table_name  
WHERE column_name = (SELECT column_name FROM table_name WHERE condition);

Types of Nested Queries in SQL

Nested queries let you perform complex data retrieval by embedding one SQL query within another. In this section, we’ll explore the different types of nested queries, complete with examples and expected outputs.

Single-row Subquery in SQL

A single-row subquery is a nested query that returns exactly one row, which may contain one or more columns. It is commonly used where the outer query compares against a single value with an operator such as =, <, or >.

Key Characteristics of Single-row Subqueries

Returns One Row: As the name suggests, the subquery returns a single row of data.
Used with Comparison Operators: Typically paired with operators such as =, >, <, >=, and <=.
Can Return One or More Columns: Although it returns a single row, that row can contain multiple columns.

Example: Find Employees Earning More Than the Average Salary

Table: employees

employee_id	first_name	last_name	salary	department_id
1	John	Doe	90000	1
2	Jane	Smith	95000	1
3	Alice	Johnson	60000	2
4	Bob	Brown	65000	2
5	Charlie	Davis	40000	3
6	Eve	Adams	75000	3

Table: departments

department_id	department_name	location_id
1	Sales	1700
2	Marketing	1700
3	IT	1800
4	HR	1900
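
To try the examples in this guide, the two sample tables above can be recreated with a short script. This is a minimal sketch: the column types and constraints are assumptions (the article only shows the values), and the syntax is intended to be plain SQL that runs on common databases such as PostgreSQL, MySQL, and SQLite.

```sql
-- Sample schema for the examples in this article.
-- Column types and keys are assumed; only the row values come from the tables above.
CREATE TABLE departments (
    department_id   INTEGER PRIMARY KEY,
    department_name VARCHAR(50) NOT NULL,
    location_id     INTEGER
);

CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    first_name    VARCHAR(50),
    last_name     VARCHAR(50),
    salary        DECIMAL(10, 2),
    department_id INTEGER REFERENCES departments (department_id)
);

INSERT INTO departments (department_id, department_name, location_id) VALUES
    (1, 'Sales',     1700),
    (2, 'Marketing', 1700),
    (3, 'IT',        1800),
    (4, 'HR',        1900);

INSERT INTO employees (employee_id, first_name, last_name, salary, department_id) VALUES
    (1, 'John',    'Doe',     90000, 1),
    (2, 'Jane',    'Smith',   95000, 1),
    (3, 'Alice',   'Johnson', 60000, 2),
    (4, 'Bob',     'Brown',   65000, 2),
    (5, 'Charlie', 'Davis',   40000, 3),
    (6, 'Eve',     'Adams',   75000, 3);
```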

SELECT first_name, last_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);

Output:

| first_name | last_name | salary |
|------------|-----------|--------|
| John       | Doe       | 90000  |
| Jane       | Smith     | 95000  |
| Eve        | Adams     | 75000  |

In this example, the inner query (SELECT AVG(salary) FROM employees) computes the average salary of all employees, which is about 70,833 for the sample data. The outer query then returns the first name, last name, and salary of every employee who earns more than that.

Multi-row Subquery in SQL

A multi-row subquery is a nested query that returns more than one row. It is usually used with the IN, ANY, or ALL operators to compare a column against the set of values returned by the subquery. This lets the outer query filter on a whole list of values in a single condition.

Example: Find Employees in Certain Departments

SELECT first_name, last_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1700);

Output:

| first_name | last_name |
|------------|-----------|
| John       | Doe       |
| Jane       | Smith     |
| Alice      | Johnson   |
| Bob        | Brown     |

Here, the inner query retrieves the department_ids from the departments table where the location_id is 1700, which are departments 1 and 2. The outer query then finds the employees who work in these departments.
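
ANY and ALL follow the same pattern as IN but pair the subquery with a comparison operator. The sketch below assumes the sample employees table from this article and a database that supports these operators, such as PostgreSQL or MySQL (SQLite does not).

```sql
-- Employees who earn more than every employee in department 2.
-- "> ALL" is true only if the comparison holds for each value the subquery returns.
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ALL (SELECT salary FROM employees WHERE department_id = 2);

-- Employees who earn more than at least one employee in department 1.
-- "> ANY" is true if the comparison holds for any one returned value.
SELECT first_name, last_name, salary
FROM employees
WHERE salary > ANY (SELECT salary FROM employees WHERE department_id = 1);
```

With the sample data, the first query returns John, Jane, and Eve (salaries above 65000), and the second returns only Jane (salary above 90000).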

Correlated Subquery in SQL

A correlated subquery is a nested query that depends on the outer query for its values. While a regular subquery can execute independently, a correlated subquery is evaluated in relation to each row processed by the outer query, which makes it dynamic and context-sensitive.

Characteristics of Correlated Subqueries

Dependency: The inner query references columns from the outer query, establishing a direct dependency.
Row-by-row Execution: Conceptually, the inner query runs once for each row processed by the outer query, although a query optimizer may rewrite it into a more efficient plan.
Performance Considerations: Because the inner query runs repeatedly, correlated subqueries can be slower than their non-correlated counterparts, especially on large datasets.

Example: Find Employees with Salaries Above Their Department’s Average

SELECT first_name, salary
FROM employees e1
WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e1.department_id = e2.department_id);

Output:

| first_name | salary |
|------------|--------|
| Jane       | 95000  |
| Bob        | 65000  |
| Eve        | 75000  |

In this case, the inner query calculates the average salary of the current employee’s department as the outer query processes each employee. For the sample data those averages are 92,500 (Sales), 62,500 (Marketing), and 57,500 (IT), so the outer query selects Jane, Bob, and Eve, who each earn more than their own department’s average.
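
Correlated subqueries are also commonly paired with EXISTS, which only checks whether the inner query finds any row for the current outer row. A minimal sketch, assuming the sample tables from this article:

```sql
-- Departments that have at least one employee.
-- The inner query refers to d.department_id from the outer query, so it is correlated.
SELECT d.department_id, d.department_name
FROM departments d
WHERE EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.department_id = d.department_id
);
```

With the sample data this returns Sales, Marketing, and IT. HR is left out because no employee row references department 4.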

Nested Subqueries in SQL

A nested subquery takes the idea one level further: it is a subquery that itself contains another subquery. Multi-level nesting like this is handy for breaking a complex problem into smaller, manageable steps, with each inner query producing a result that the next level uses.

Structure of Nested Subqueries

A nested subquery typically consists of two main components:

Outer Query: This is the main query that contains the subquery. It uses the result of the subquery to filter or manipulate data.
Inner Query (Subquery): This query is embedded within the outer query and provides a result set that can be utilized by the outer query.

Example: Find Departments with Employees Earning More Than the Average Salary

SELECT department_id, department_name
FROM departments
WHERE department_id IN (
    SELECT department_id
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
);

Output:

| department_id | department_name |
|---------------|-----------------|
| 1             | Sales           |
| 3             | IT              |

In this example, the innermost query (SELECT AVG(salary) FROM employees) computes the average salary. The middle query fetches the department_ids of employees earning more than that average (John and Jane in department 1, Eve in department 3), and the outer query retrieves the names of those departments.
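
One way to keep multi-level nesting readable is to name each level with a common table expression (a WITH clause). The sketch below is intended as an equivalent rewrite of the query above, assuming the sample tables from this article and a database that supports CTEs (for example PostgreSQL, MySQL 8+, or SQLite). The CTE names are illustrative.

```sql
WITH overall_average AS (
    -- Innermost level: the average salary across all employees.
    SELECT AVG(salary) AS avg_salary
    FROM employees
),
high_earner_departments AS (
    -- Middle level: departments with at least one employee above that average.
    SELECT DISTINCT department_id
    FROM employees
    WHERE salary > (SELECT avg_salary FROM overall_average)
)
-- Outer level: look up the department names.
SELECT d.department_id, d.department_name
FROM departments d
WHERE d.department_id IN (SELECT department_id FROM high_earner_departments);
```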

Scalar Subquery

A scalar subquery is a subquery that returns a single value: one row with one column. Scalar subqueries are handy wherever the main query needs a single value, and they can be used in many SQL clauses, including SELECT, WHERE, and HAVING.

Characteristics of Scalar Subqueries

Returns One Value: As the name suggests, a scalar subquery returns a single value. A subquery that returns more than one row or more than one column where a single value is expected will typically lead to an error.
Used in Various Clauses: Scalar subqueries can compute derived columns in the SELECT list, narrow down results in WHERE clauses, and add conditions on grouped data in HAVING clauses.
Efficient for Comparisons: They are often used for making comparisons against a single value derived from another query.

Example: Retrieve Employees and Their Salary Difference from the Average Salary

SELECT first_name, last_name, salary - (SELECT AVG(salary) FROM employees) AS salary_difference
FROM employees;

Output:

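| first_name | last_name | salary_difference |
|------------|-----------|-------------------|
| John       | Doe       | 19166.67          |
| Jane       | Smith     | 24166.67          |
| Alice      | Johnson   | -10833.33         |
| Bob        | Brown     | -5833.33          |
| Charlie    | Davis     | -30833.33         |
| Eve        | Adams     | 4166.67           |

In this case, the scalar subquery computes the average salary once (about 70,833.33 for the sample data), and the outer query subtracts it from each employee’s salary. The differences shown are rounded to two decimal places.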
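
Scalar subqueries in the SELECT list can also be correlated. As a small sketch, assuming the sample tables from this article and that department_id is unique in departments, the query below looks up each employee’s department name one value at a time:

```sql
SELECT e.first_name,
       e.last_name,
       -- Yields exactly one value per employee because department_id
       -- is assumed to be unique in the departments table.
       (SELECT d.department_name
        FROM departments d
        WHERE d.department_id = e.department_id) AS department_name
FROM employees e;
```

If the subquery finds no matching department, it yields NULL rather than dropping the employee row, which is the same behavior as a LEFT JOIN.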

Use Cases for Nested Queries

Nested queries, or subqueries, are powerful tools in SQL that can solve a variety of complex data retrieval challenges. Here are some common use cases:

Data Filtering

Nested queries can be used to filter results based on values derived from another query.

Example: Find employees whose salaries are above the average salary in their respective departments.

SELECT first_name, last_name, salary
FROM employees e1
WHERE salary > (SELECT AVG(salary) FROM employees e2 WHERE e1.department_id = e2.department_id);

Calculating Aggregates

You can calculate aggregates in a nested query and use those results in the outer query.

Example: Retrieve departments with an average salary greater than the overall average salary.

SELECT department_id, AVG(salary) AS average_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (SELECT AVG(salary) FROM employees);

Conditional Logic

Nested queries allow you to implement conditional logic within your SQL statements.

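Example: List employees who belong to departments located in a specific city. Note that this assumes the departments table has a city column, which the sample table shown earlier does not include.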

SELECT first_name, last_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE city = 'New York');

Correlated Subqueries for Row-Level Calculations

Correlated subqueries enable row-level calculations based on values from the current row in the outer query.

Example: Get a list of products with a price higher than the average price of products in the same category.

SELECT product_name, price
FROM products p1
WHERE price > (SELECT AVG(price) FROM products p2 WHERE p1.category_id = p2.category_id);
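
The article does not define a products table, so the sketch below supplies a few hypothetical rows inline to make the query runnable on its own. The product names, prices, and category IDs are invented for illustration, and the inline VALUES syntax is assumed to be PostgreSQL or SQLite style.

```sql
-- Hypothetical sample data; not part of the article's sample tables.
WITH products (product_name, price, category_id) AS (
    VALUES ('Laptop', 1200, 1),
           ('Mouse',    25, 1),
           ('Desk',    300, 2),
           ('Chair',   150, 2)
)
SELECT product_name, price
FROM products p1
WHERE price > (SELECT AVG(price)
               FROM products p2
               WHERE p1.category_id = p2.category_id);
```

With these rows, the category averages are 612.5 and 225, so the query returns Laptop and Desk.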

Differences Between Nested Queries and Other SQL Queries

Let us now look into the difference between nested queries and other SQL queries below:

Feature	Nested Queries	Joins	Simple Queries
Definition	A query placed inside another query	Combines rows from two or more tables based on a related column	A single SQL statement that retrieves data
Execution	A non-correlated inner query runs once and feeds its result to the outer query; a correlated inner query is evaluated for each row of the outer query	Matches rows across tables in a single set-based operation	Executes independently without any dependencies
Use Case	Useful for complex calculations and filtering based on another query	Ideal for combining related data from multiple tables	Suitable for straightforward data retrieval
Performance	Can be slower when a correlated inner query is evaluated repeatedly	Often more efficient because related rows are matched in one pass, though this depends on the optimizer	Fastest for simple data retrieval
Complexity	Can become complex and difficult to read	Can also be complex but typically clearer with explicit relationships	Simple and easy to understand
Data Dependency	The inner query can depend on the current row of the outer query (correlated)	Tables are related to each other only through the join condition	Data retrieved is independent, no subqueries involved
Example	`SELECT first_name FROM employees WHERE salary > (SELECT AVG(salary) FROM employees);`	`SELECT e.first_name, d.department_name FROM employees e JOIN departments d ON e.department_id = d.department_id;`	`SELECT * FROM employees;`

Common Mistakes with Nested Queries

While nested queries can be incredibly useful, they also come with pitfalls. Here are some common mistakes to watch out for:

Returning Multiple Rows

A scalar subquery must return a single value; if it returns multiple rows, most databases raise an error.

Mistake:

SELECT first_name
FROM employees
WHERE salary = (SELECT salary FROM employees);

Solution: Ensure the inner query uses aggregation or filtering to return a single value.
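
Two common ways to apply that fix, sketched against the sample employees table from this article: aggregate so the subquery returns one value, or switch to IN when several values are acceptable.

```sql
-- Option 1: aggregate so the subquery is guaranteed to return one value.
SELECT first_name
FROM employees
WHERE salary = (SELECT MAX(salary) FROM employees);

-- Option 2: keep the multi-row subquery but compare with IN instead of =.
SELECT first_name
FROM employees
WHERE salary IN (SELECT salary FROM employees WHERE department_id = 1);
```

With the sample data, the first query returns Jane (the highest salary), and the second returns John and Jane.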

Performance Issues

Nested queries can sometimes lead to performance bottlenecks, especially if they are executed for each row in the outer query.

Mistake: Using a nested query inside a large outer query without considering performance implications.

Solution: Analyze query execution plans and consider alternative methods, like joins, when dealing with large datasets.
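
As one possible alternative, the correlated department-average query from earlier in this article can be rewritten as a join against a derived table, so the averages are computed once per department. This is a sketch assuming the sample employees table; whether it is actually faster depends on the database and its optimizer, so compare execution plans before committing to either form.

```sql
SELECT e.first_name, e.salary
FROM employees e
JOIN (
    -- One row per department with its average salary.
    SELECT department_id, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) dept_avg ON dept_avg.department_id = e.department_id
WHERE e.salary > dept_avg.avg_salary;
```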

Improper Use of Parentheses

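Incorrect placement of parentheses, or an incomplete condition inside them, can lead to unexpected results or errors.

Mistake:

SELECT first_name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id);

Here the inner WHERE clause names a column but never compares it to anything. Some databases reject this as an error, while others treat the column as a truth value and silently filter on non-zero values.

Solution: Ensure the logic of your query is clear, every condition is complete, and parentheses are used appropriately to group conditions.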
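
A corrected version gives the inner WHERE clause a complete comparison. The department value 1 below is only an illustrative choice, and the query assumes the sample employees table from this article:

```sql
SELECT first_name
FROM employees
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department_id = 1   -- complete condition: column, operator, value
);
```

With the sample data, department 1 averages 92500, so only Jane is returned.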

Not Considering NULL Values

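Nested queries can produce unexpected results when NULL values are present in the data. For example, the subquery below averages only the employees that have a department, so the threshold it produces changes if some rows have a NULL department_id.

SELECT first_name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id IS NOT NULL);

Solution: Decide how NULL values should be treated and handle them explicitly, for example with IS NULL / IS NOT NULL filters or functions like COALESCE, to avoid unintended filtering.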
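
A classic NULL pitfall with subqueries involves NOT IN: if the subquery returns even one NULL, the NOT IN condition can never be true and the outer query returns no rows. The sketch below assumes the sample tables from this article and shows two safer forms.

```sql
-- Risky: returns no rows at all if any employees.department_id is NULL.
SELECT department_name
FROM departments
WHERE department_id NOT IN (SELECT department_id FROM employees);

-- Safer: filter NULLs out of the subquery...
SELECT department_name
FROM departments
WHERE department_id NOT IN (
    SELECT department_id
    FROM employees
    WHERE department_id IS NOT NULL
);

-- ...or use NOT EXISTS, which is not affected by NULLs in the inner column.
SELECT d.department_name
FROM departments d
WHERE NOT EXISTS (
    SELECT 1
    FROM employees e
    WHERE e.department_id = d.department_id
);
```

With the sample data as shown, where no department_id is NULL, all three queries return HR.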

Conclusion

SQL nested queries, also known as subqueries, are very useful for carrying out complex data retrieval operations. By embedding one query inside another, you can perform calculations that simple queries alone cannot express. It helps to know the four main types: single-row, multi-row, correlated, and scalar subqueries. By applying best practices and avoiding common pitfalls, you can tap into the full potential of nested queries to improve your database management and performance.

Frequently Asked Questions

Q1. What is a nested query in SQL?

A. A nested query, or subquery, is an SQL query placed inside another query. The inner query’s result is used by the outer query to perform complex data retrieval.

Q2. What are the types of nested queries?

A. The main types include single-row subqueries, multi-row subqueries, correlated subqueries, and scalar subqueries, each serving different use cases.

Q3. When should I use a correlated subquery?

A. Use a correlated subquery when the inner query needs to reference a column from the outer query, allowing for dynamic row-by-row evaluations.

Q4. Can nested queries impact performance?

A. Yes, nested queries can lead to performance issues, especially if they are executed for every row in the outer query. Analyzing execution plans and considering alternatives like joins can help improve efficiency.

