SQL joins might appear daunting, especially for newcomers. If programming is unfamiliar, comprehending SQL joins and their types could be challenging. Yet, as a data science enthusiast or practitioner, understanding SQL joins is crucial. They empower efficient data retrieval and manipulation across tables. This article simplifies SQL joins’ learning curve, unveiling their simplicity. We’ll start by demystifying SQL joins and delve into mastering the four essential types.
Let’s answer the million-dollar question first before we look at the different types of joins in SQL.
Here is an example that will make the concept easy for you. Consider these two collections:
Let’s say that the blue circle represents the set of all boys (BOYS) and grey represents the set of people who love watching Messi play (MESSI). How would you proceed if we wanted the set of all boys who love watching Messi play?
There is a very procedural way of approaching this problem:
This is quite similar to the ‘for loop’ concept and is called sub-select in SQL.
SELECT * FROM BOYS
WHERE id IS IN (SELECT DISTINCT id FROM MESSI);
But in SQL, there is another way of approaching this problem.
To begin to understand joins, we must first have a different perspective on what we really want. In set terminology: we want the intersection of BOYS and MESSI. In graphical terms, this is expressed like:
We’re interested in the light blue part, right? This part, or the inner part (hint), are all the boys who love watching Messi. All we have to do now is express this in SQL:
SELECT * FROM BOYS
INNER JOIN MESSI
ON BOYS.id = MESSI.id;
See what the (inner) join does? It couldn’t be simpler! This is the intuitive approach on how to understand joins.
Note: Venn diagrams don’t apply directly to SQL because the items in the collections (the tables) are not identical. But because they refer to each other, we can use Venn diagrams to understand the concept better.
Also Read: Top 10 SQL Projects for Data Analysis
Now. we’ll extend this to the big picture and learn about the different types of SQL joins. Consider the below sample tables:
Want to learn the basics of what SQL is and how it can be applied in data science? Check out the popular course SQL for Data Science.
This is what we covered in the above section. Inner Join returns records that have matching values in both tables:
Let’s see what the output is using the above example:
SELECT * FROM BOYS INNER JOIN MESSI
ON BOYS.id = MESSI.id;
Output:
The inner join gives the intersection of two tables, i.e. rows which are common in both the tables.
Suppose we want ID and Name of all the people who love watching Messi play. Obviously, there are many ways of writing this query but we’ll understand with the help of joins.
Let’s see what the output is:
SELECT * FROM BOYS RIGHT JOIN MESSI
ON BOYS.id = MESSI.id;
Output:
Can you figure out what happened here? The right outer join gives us the rows that are common in both the tables as well as extra rows from the Messi table which are not present in the intersection. In other words, a right join returns all records from the right table and the matched records from the left table.
Let’s say we want the list of all the boys who love watching Messi play as well as not love watching Messi play using joins.
I want you to guess the final output before you read further.
SELECT * FROM BOYS LEFT JOIN MESSI ON BOYS.id = MESSI.id;
OUTPUT:
The left outer join gives us the rows that are common in both the tables as well as extra rows from the Boys table which are not present in the intersection. In other words, a left join returns all records from the left table and the matched records from the right table.
Finally, let’s say we want the list of all the people, including boys who love watching Messi play.
SELECT * FROM BOYS FULL OUTER JOIN MESSI
ON BOYS.id = MESSI.id;
Output:
Perfect! A full outer join gives us the rows that are common in both the tables as well as extra rows from both tables which are not present in the intersection. We get all records when there is a match on either the left or the right table.
Situation: You’re managing a database for an online store, and you want to analyze sales data along with customer information. The goal is to identify which products are popular among different customer segments.
Use an INNER JOIN when you want to match records that exist in both tables. In this scenario, you’ll retrieve sales data along with customer information for customers who have made purchases.
Example:
SELECT customers.name, products.product_name, sales.quantity
FROM customers
INNER JOIN sales ON customers.id = sales.customer_id
INNER JOIN products ON sales.product_id = products.id;
Use a LEFT JOIN when you want to retrieve all records from the left table and matching records from the right. This is useful to see which products have been purchased and by whom, including customers who haven’t made any purchases.
Example:
SELECT customers.name, products.product_name, sales.quantity
FROM customers
LEFT JOIN sales ON customers.id = sales.customer_id
LEFT JOIN products ON sales.product_id = products.id;
Use a RIGHT JOIN when you want to retrieve all records from the right table and matching records from the left. This can be helpful to identify products that have been purchased and by whom, including products that haven’t been purchased by any customer.
Example:
SELECT customers.name, products.product_name, sales.quantity
FROM customers
RIGHT JOIN sales ON customers.id = sales.customer_id
RIGHT JOIN products ON sales.product_id = products.id;
Use a FULL JOIN when you want to retrieve all records from both tables, including products that have been purchased and by whom, and also customers who haven’t made any purchases.
Example:
SELECT customers.name, products.product_name, sales.quantity
FROM customers
FULL JOIN sales ON customers.id = sales.customer_id
FULL JOIN products ON sales.product_id = products.id;
Select the appropriate join type based on your analysis goals and the relationship between tables to obtain the desired insights from your data.
In conclusion, mastering SQL joins opens the door to powerful data manipulation and analysis capabilities. Whether you’re a budding data enthusiast or a seasoned analyst, understanding the nuances of INNER, LEFT, RIGHT, and FULL joins empowers you to extract valuable insights from complex relational databases. By seamlessly combining data from multiple tables, you gain the ability to unravel intricate relationships, spot trends, and make informed decisions. Embrace the flexibility of SQL joins to efficiently explore your data’s potential, turning raw information into actionable knowledge. With this fundamental skill in your toolkit, you’re well-equipped to navigate the intricate landscape of database management and elevate your data-driven journey.
Do you want to learn how SQL can be used in data science? We highly recommend checking out this amazing course – Structured Query Language (SQL) for Data Science.
A. The types of joins in SQL include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL JOIN, allowing data retrieval from multiple related tables based on specified conditions.
A. In SQL, a join combines data from two or more tables based on a shared column, creating a unified result set. It enhances data analysis by linking related information.
A. Join syntax in SQL involves using keywords like INNER JOIN, LEFT JOIN, RIGHT JOIN, or FULL JOIN, followed by the table names and ON clause specifying the matching conditions.
A. Yes, we can join two instances of the same table in SQL. It’s termed a self-join, useful when analyzing relationships within a single table, often utilizing aliases to differentiate between the instances.
What is the difference between right join and select from the second table?
Answer from both the queries will be the same in this case but can differ in another case when the attributes in the first table are not the same as second table then null values have to be introduced in tuples of the second table after performing right join.
in output it should show Gender field as well. Please explain if not then why?