How to Implement Normalization with SQL?

Ayushi Trivedi Last Updated : 11 Jul, 2024
4 min read

Introduction

Envision organizing a disorganized garage into a well-lit area where everything is readily available and arranged appropriately. Within the domain of databases, this procedure is referred to as normalization. A database functions better when its data is well structured and clutter-free, just like your garage does when it is kept tidy. Are you eager to find out more? The first three normal forms—1NF, 2NF, and 3NF—will be discussed in this article along with some useful, real-world examples of normalization in SQL. You can learn how to make your databases more scalable and efficient, regardless of your experience level with database design. Are you prepared to modify your data? Come on, let’s get started!

Overview

  • Understand the principles and objectives of database normalization with SQL.
  • Apply the first normal form (1NF) to ensure atomic values and primary keys.
  • Identify and eliminate partial dependencies to achieve the second normal form (2NF).
  • Remove transitive dependencies to conform to the third normal form (3NF).
  • Implement normalized database structures using practical SQL queries.
How to Implement Normalization in SQL

What is Normalization?

An essential step in relational database architecture is normalization. It facilitates effective data organization by lowering redundancy and enhancing data integrity. To minimize anomalies, the procedure entails splitting a database into tables and establishing rules-based associations between them. Let’s delve deeper into each normal form, explaining the principles and providing practical SQL examples.

First Normal Form (1NF)

Objective: Ensure each table has a primary key and each column contains atomic (indivisible) values. A table is in 1NF if it follows these rules:

  • Single Valued Attributes: Each column should contain only one value per row.
  • Unique Column Names: Each column must have a unique name.
  • Order of Storage is Insignificant: The order in which data is stored does not matter.

Example:

Consider a non-normalized table with repeating groups:

OrderIDCustomerNameProductsQuantities
1John DoePen, Pencil2, 3
2Jane SmithNotebook, Eraser1, 2

This table violates 1NF because the Products and Quantities columns contain multiple values.

Convert to 1NF:

OrderIDCustomerNameProductQuantity
1John DoePen2
1John DoePencil3
2Jane SmithNotebook1
2Jane SmithEraser2

SQL Implementation:

CREATE TABLE Orders (
    OrderID INT,
    CustomerName VARCHAR(255),
    Product VARCHAR(255),
    Quantity INT,
    PRIMARY KEY (OrderID, Product)
);

Second Normal Form (2NF)

Objective: Ensure the table is in 1NF and all non-key attributes are fully dependent on the primary key. This applies mainly to tables with composite primary keys.

Steps to achieve 2NF:

  • Ensure 1NF Compliance: The table must already be in 1NF.
  • Remove Partial Dependencies: Ensure that non-key attributes are dependent on the whole primary key, not just part of it.

Example:

Consider a table that is in 1NF but has partial dependencies:

OrderIDCustomerIDProductIDQuantityCustomerName
1112John Doe
2221Jane Smith

Here, CustomerName depends only on CustomerID, not on the composite key (OrderID, ProductID).

Convert to 2NF:

  1. Create separate tables for Orders and Customers:

Orders Table:

OrderIDCustomerIDProductIDQuantity
1112
2221

Customers Table:

CustomerIDCustomerName
1John Doe
2Jane Smith

SQL Implementation:

CREATE TABLE Orders (
    OrderID INT,
    CustomerID INT,
    ProductID INT,
    Quantity INT,
    PRIMARY KEY (OrderID, ProductID)
);

CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    CustomerName VARCHAR(255)
);

Third Normal Form (3NF)

Objective: Ensure the table is in 2NF and all attributes are only dependent on the primary key.

Steps to achieve 3NF:

  • Ensure 2NF Compliance: The table must already be in 2NF.
  • Remove Transitive Dependencies: Ensure that non-key attributes are not dependent on other non-key attributes.

Example:

Consider a table that is in 2NF but has transitive dependencies:

OrderIDCustomerIDProductIDQuantityProductName
1112Pen
2221Notebook

Here, ProductName depends on ProductID, not directly on OrderID.

Convert to 3NF:

  1. Create separate tables for Orders and Products:

Orders Table:

OrderIDCustomerIDProductIDQuantity
1112
2221

Products Table:

ProductIDProductName
1Pen
2Notebook

SQL Implementation:

CREATE TABLE Orders (
    OrderID INT,
    CustomerID INT,
    ProductID INT,
    Quantity INT,
    PRIMARY KEY (OrderID, ProductID)
);

CREATE TABLE Customers (
    CustomerID INT PRIMARY KEY,
    CustomerName VARCHAR(255)
);

CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(255)
);

Practical Example: Putting It All Together

Let’s say we start with the following non-normalized data:

OrderIDCustomerNameProductsQuantities
1John DoePen, Pencil2, 3
2Jane SmithNotebook, Eraser1, 2

Step 1: Convert to 1NF

Split the multi-valued columns into atomic values:

OrderIDCustomerNameProductQuantity
1John DoePen2
1John DoePencil3
2Jane SmithNotebook1
2Jane SmithEraser2

Step 2: Convert to 2NF

Identify partial dependencies and separate them:

  1. Orders Table:
OrderIDCustomerIDProductIDQuantity
1112
1123
2231
2242
  1. Customers Table:
CustomerIDCustomerName
1John Doe
2Jane Smith
  1. Products Table:
ProductIDProductName
1Pen
2Pencil
3Notebook
4Eraser

Step 3: Convert to 3NF

Ensure no transitive dependencies by maintaining direct dependencies only on primary keys:

  • The tables created in 2NF are already in 3NF since all non-key attributes depend only on the primary key.

Conclusion

In this article we explored how we can implement normalization with SQL. Becoming proficient in SQL normalization is essential to building dependable and effective databases. Redundancy may be greatly decreased and data integrity can be improved by comprehending and putting the first three normal forms’ (1NF, 2NF, and 3NF) ideas into practice. This procedure enhances overall database performance in addition to streamlining data management. Now that you have access to the useful SQL examples, you can turn any complicated, disjointed data collection into an efficient, well-structured database. Adopt these strategies to guarantee the stability, scalability, and maintainability of your databases.

Frequently Asked Questions

Q1. What is database normalization?

A. Normalization is a process of organizing data in a database to reduce redundancy and improve data integrity by dividing it into well-structured tables.

Q2. Why is normalization important?

A. Normalization helps minimize duplicate data, ensures data consistency, and makes database maintenance easier.

Q3. What are the normal forms?

A. The normal forms are stages in the normalization process: 1NF (First Normal Form), 2NF (Second Normal Form), and 3NF (Third Normal Form).

My name is Ayushi Trivedi. I am a B. Tech graduate. I have 3 years of experience working as an educator and content editor. I have worked with various python libraries, like numpy, pandas, seaborn, matplotlib, scikit, imblearn, linear regression and many more. I am also an author. My first book named #turning25 has been published and is available on amazon and flipkart. Here, I am technical content editor at Analytics Vidhya. I feel proud and happy to be AVian. I have a great team to work with. I love building the bridge between the technology and the learner.

Responses From Readers

Clear

We use cookies essential for this site to function well. Please click to help us improve its usefulness with additional cookies. Learn about our use of cookies in our Privacy Policy & Cookies Policy.

Show details