An Intuitive and Easy Guide to Python Sets- Must for Becoming Data Science Professional

Chirag Goyal Last Updated : 12 Nov, 2024
12 min read

Introduction

In today’s world, data storage is central to every domain. Python provides several built-in data structures to organize your data, so learning them is an essential part of learning Python, whether from a Software Engineering or a Data Science perspective. Some of these data structures are mutable and some are immutable. In this article, our main focus will be on sets.

Now, a question comes to mind: “When do we use sets in Python?” Sets are used in Python when:

  • The order of data does not matter.
  • We do not want any repetitions in the data elements.
  • We have to perform mathematical operations such as union, intersection, etc.

So, without any further delay, let’s get started. 😎

This article was published as a part of the Data Science Blogathon

What is a Python set?

A set is a data type that holds an unordered, mutable (changeable) collection of unique elements, i.e., it does not keep repeated copies of an element. The elements of a single set can be of different data types (as long as each element is hashable), unlike arrays, which are type-specific. The values of a set are unindexed; therefore, indexing operations cannot be performed on sets.

Sets are written with curly brackets ({}), with the elements separated by commas.
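The properties described above can be checked with a minimal sketch. The name sample_set is only an illustration and does not appear elsewhere in this article.

```python
# A set literal with a repeated value and mixed (hashable) element types
sample_set = {1, 'Analytics Vidhya', 3.45, 1}

# The duplicate 1 is stored only once, so the set has three elements
print(len(sample_set))

# Sets are unindexed, so subscripting raises a TypeError
try:
    sample_set[0]
except TypeError as error:
    print("Indexing failed:", error)
```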

How do you create a set in Python?

We can create the sets in Python using either of the following two ways:

  • Enclosing elements within curly braces ({})
  • By using the set() function

Using curly braces

Sets in Python can be created with the help of curly braces({}).

For Example-

Python Code:

# Create a set using curly braces
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print(my_set)

Output:

{'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}

Using set() function

Sets in Python can also be created using the built-in function set([iterable]). This function takes an iterable (i.e. any type of sequence, collection, or iterator), as an argument and returns a set that contains unique items from the input i.e, duplicated values are removed.

For Example-

# Create a set using set() function
my_set = set(['Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'])
print(my_set)
# Check the Data Type of my_set
print(type(my_set))

Output:

{'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
<class 'set'>

Creating an Empty Python Set

We can also create an empty set using the set() function.

For Example-

# Creating an empty set using set() function
empty_set = set()
print(empty_set)

Output:

set()
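A common pitfall is worth a short sketch here: empty curly braces do not create an empty set, because Python reads {} as an empty dictionary. The variable names below are illustrative only.

```python
# Empty curly braces create a dictionary, not a set
empty_braces = {}
# set() with no arguments creates an empty set
empty_set = set()

print(type(empty_braces))
print(type(empty_set))
```

This is why set() is the way to create an empty set.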

Basic Functionalities on Sets in Python

We can perform a number of operations on sets, such as adding elements, deleting elements, finding the length of a set, etc. To see all the methods that can be used on sets, we can use the dir() function. Let’s try this on a given set.

# Check all functionalities which we can perform on a Python Set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print(dir(my_set))

Output:

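(The output is a list of the attribute and method names available on a set, including add, clear, discard, pop, remove, union, and update.)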

1. Finding the Length of a Python Set

To find the length of a set in Python, we use the len() function. This function takes the set as an argument and returns an integer equal to the number of elements present in the set.

For Example-

# Finding the length of a set

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print("The length of a given set is", len(my_set))

Output:

The length of a given set is 4

2. Accessing the Elements of a Set

As mentioned before, the elements of a set are not indexed, so we cannot access them using index numbers. Instead, we can use a for loop to iterate over the set and access its elements one by one.

For Example-

# Printing all the elements of a set using For loop

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
for element in my_set:
    print(element)

Output:

Data Scientist
Data Engineer
Data Analyst
Analytics Vidhya
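Besides looping, two related checks are often useful: testing whether a single element is present with the in operator, and using sorted() when a predictable order is needed. The following is a small sketch; the variable names has_analyst, has_business, and ordered_elements are illustrative.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# Membership test with the in operator
has_analyst = 'Data Analyst' in my_set
has_business = 'Business Analyst' in my_set
print(has_analyst, has_business)

# sorted() returns a new list, so the order is predictable
ordered_elements = sorted(my_set)
print(ordered_elements)
```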

3. Adding the elements to an existing Set

We can add new elements to a set using either of the following two methods:

  • Adding a single element – add() method
  • Adding more than one element – update() method

For Example-

# Adding a single element ‘Business Analyst’ to an existing set

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.add('Business Analyst')
print(my_set)

Output:

{'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}

For Example-

# Adding more than one element (‘Business Analyst’ and ‘Data Mastermind’) to an existing set

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.update(['Business Analyst', 'Data Mastermind'])
print(my_set)

Output:

{'Data Mastermind', 'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}
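Two details of add() and update() are easy to miss, so here is a brief sketch: adding an element that already exists leaves the set unchanged, and update() accepts several iterables at once. The variable size_after_duplicate_add is illustrative.

```python
my_set = {'Data Scientist', 'Data Analyst'}

# Adding an element that is already present does nothing
my_set.add('Data Analyst')
size_after_duplicate_add = len(my_set)
print(size_after_duplicate_add)

# update() accepts one or more iterables (list, tuple, set, ...)
my_set.update(['Data Engineer'], ('Business Analyst',), {'Data Analyst'})
print(my_set)
```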

4. Removing Elements from a Set or Deleting a Set Completely

We can use the following ways to either remove elements from a set or delete a complete set:

  • Using the set.remove(x) method
  • Using the set.discard(x) method
  • Using the set.pop() method
  • Using the set.clear() method
  • Using the del Keyword

The remove function

The set.remove(x) method takes one parameter x and removes that element from the set. If the given element does not exist in the set, the method raises an exception (KeyError).

In the example below, ‘Analytics Vidhya’ is removed from the set using the remove() method. But when we pass ‘Business Analyst’, an element that does not exist in the set, remove() throws an error.

For Example-

# Remove the element using remove() function

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.remove('Analytics Vidhya')
print(my_set)
my_set.remove('Business Analyst')
print(my_set)

Output:

{'Data Scientist', 'Data Engineer', 'Data Analyst'}
KeyError: 'Business Analyst'
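If a missing element should not stop the program, one option is to catch the KeyError raised by remove(). This is only a sketch of that pattern; the flag was_removed is a hypothetical name used for illustration.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

try:
    my_set.remove('Business Analyst')
    was_removed = True
except KeyError as error:
    # remove() raises KeyError when the element is not in the set
    was_removed = False
    print("Could not remove missing element:", error)

print(my_set)
```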

The discard function

The set.discard(x) method takes one parameter x and removes that element x from a set if it is present. Now, if we want to remove some element from the set, and we are not sure whether that element is actually present in the set or not, then we can use this function. In comparison to the remove method, the discard method does not raise an exception (KeyError) if the element to be removed does not exist.

For Example-

# Remove the element using discard() function

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.discard('Analytics Vidhya')
print(my_set)
my_set.discard('Business Analyst')
print(my_set)

Output:

{'Data Scientist', 'Data Engineer', 'Data Analyst'}
{'Data Scientist', 'Data Engineer', 'Data Analyst'}

In the above example, you can see that ‘Analytics Vidhya’ has been removed from my_set, but discard() has not thrown an error when I used my_set.discard(‘Business Analyst’), even though ‘Business Analyst’ is not present in my_set.

The pop function

The set.pop() method removes and returns one element from the set, but since a set is unordered, we cannot know in advance which element will be removed.

In the example below, you can see that the pop() method removes an arbitrary element, which in this case is ‘Data Scientist’.

For Example-

# Remove the element using pop() function

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.pop()
print(my_set)

Output:

{'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
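Because pop() returns the element it removes, the removed value can be captured in a variable. The sketch below also shows that calling pop() on an empty set raises a KeyError. The names removed_element and empty_set are illustrative.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# pop() returns the element that it removed
removed_element = my_set.pop()
print("Removed:", removed_element)
print("Remaining:", my_set)

# Popping from an empty set raises a KeyError
empty_set = set()
try:
    empty_set.pop()
except KeyError:
    print("Cannot pop from an empty set")
```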

The clear function

The set.clear() method deletes all the elements present in a given set.

For Example-

# Remove all the elements using clear() function

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.clear()
print(my_set)

Output:

set()

In the above example, we can see that after the clear() operation, my_set becomes an empty set.

The del Keyword

When we want to completely delete the set, we can use the del keyword to do so.

For Example-

# Delete the complete set

my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
del my_set
print(my_set)

Output:

NameError: name 'my_set' is not defined

In the above example, we can see that running the code throws an error (NameError), because the name my_set no longer exists after the del statement.

Mathematical operations on Python Sets

We can use sets in Python to compute mathematical operations such as,

  • Union,
  • Intersection,
  • Difference, and
  • Symmetric difference

These set operations can be represented with a diagram known as a Venn diagram. Venn diagrams are widely used in Mathematics, Statistics, and Computer Science to visualize the differences and similarities between sets.

Image Showing all the Mathematical Operations which we can perform on Sets

1. Union of Sets

The union of two sets A and B is defined as the set containing the elements that are in A, B, or both, and is denoted by A ∪ B.

Figure Showing the Union of Two Sets

To compute this operation with Python, we can use either of the following two ways:

  • Using the ‘|’ operator
  • Using set.union() method

Using the ‘|’ operator

We can compute the union of two sets in Python using the ‘|’ operator.

For Example-

# Union of two sets using ‘|’ operator

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The union of two sets is given by\n", my_set_1 | my_set_2)

Output:

 The union of two sets is given by
 {1, 'Analytics Vidhya', 3.45, '1'}

Using set.union() method

To compute the union of two or more sets, we can also use the union() method.

For Example-

# Union of two sets using set.union() method

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The union of two sets is given by\n", my_set_1.union(my_set_2))

Output:

 The union of two sets is given by
 {1, 3.45, 'Analytics Vidhya', '1'}
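Since union() works on more than two sets, a short sketch with three small integer sets may help. The names set_a, set_b, and set_c are illustrative. Note that the union() method accepts any iterable as an argument, while the ‘|’ operator requires both operands to be sets.

```python
set_a = {1, 2}
set_b = {2, 3}
set_c = {3, 4}

# union() takes any number of arguments
union_by_method = set_a.union(set_b, set_c)
# The | operator can be chained for the same result
union_by_operator = set_a | set_b | set_c
print(union_by_method, union_by_operator)

# union() also accepts non-set iterables such as a list
union_with_list = set_a.union([5, 6])
print(union_with_list)
```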

2. Intersection of Sets

The intersection of two sets A and B is defined as the set that consists of the elements that are common to both sets and is denoted by A ∩ B.

Figure Showing the Intersection of Two Sets

In Python, we can compute the intersection of two sets using either of the following two ways: 

  • Using the ‘&’ operator
  • Using set.intersection() method

Using the ‘&’ operator

We can determine the intersection of two or more sets using the ‘&’ operator.

For Example-

# Intersection of two sets using ‘&’ operator

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The intersection of two sets is given by\n", my_set_1 & my_set_2)

Output:

The intersection of two sets is given by
 {1, 3.45, 'Analytics Vidhya'}

Using set.intersection() method

We can also determine the intersection of two or more sets using the intersection() function.

For Example-

# Intersection of two sets using set.intersection() method

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The intersection of two sets is given by\n", my_set_1.intersection(my_set_2))

Output:

 The intersection of two sets is given by
 {1, 3.45, 'Analytics Vidhya'}
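The “two or more sets” case can be sketched with three integer sets. Only the elements present in every set survive. The set names below are illustrative.

```python
set_a = {1, 2, 3, 4}
set_b = {2, 3, 4, 5}
set_c = {3, 4, 5, 6}

# Elements common to all three sets
common_by_method = set_a.intersection(set_b, set_c)
common_by_operator = set_a & set_b & set_c
print(common_by_method, common_by_operator)
```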

3. Difference of Sets

When we take the difference between two sets, it produces a new set that consists of the elements of the first set that are not present in the second set. This means that the elements common to both sets are left out, and so are the elements that belong only to the second set.

In simple and short terms, the difference between two sets A and B is defined as the set of all elements of set A that are not contained in set B, and is denoted by A-B. Note that the order matters: A-B is in general not the same as B-A.

Figure Showing the Difference Between the Two Sets

To compute the difference between two sets in Python, we can use either of the following two ways:

  • Using the ‘-’ operator
  • Using set.difference() method

Using the ‘-’ operator

To find the difference between the two sets, we can use the ‘-’ operator.

For Example-

# Difference of two sets using ‘-‘ operator

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The difference of set 1 from set 2 is given by\n", my_set_1 - my_set_2)
print(" The difference of set 2 from set 1 is given by\n", my_set_2 - my_set_1)

Output:

 The difference of set 1 from set 2 is given by
 set()
 The difference of set 2 from set 1 is given by
 {'1'}

Using set.difference() method

The difference of sets can also be determined using the built-in difference() method.

For Example-

# Difference of two sets using set.difference() method

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The difference of set 1 from set 2 is given by\n", my_set_1.difference(my_set_2))
print(" The difference of set 2 from set 1 is given by\n", my_set_2.difference(my_set_1))

Output:

 The difference of set 1 from set 2 is given by
 set()
 The difference of set 2 from set 1 is given by
 {'1'}
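In the examples above, one of the two differences happens to be empty. A sketch with plain integers, using the illustrative names set_a and set_b, makes it easier to see that the difference depends on the order of the operands.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}

# Elements of set_a that are not in set_b
a_minus_b = set_a - set_b
# Elements of set_b that are not in set_a
b_minus_a = set_b.difference(set_a)
print(a_minus_b, b_minus_a)
```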

4. Symmetric difference of Sets

The symmetric difference of two sets A and B is defined as the set of elements that are in either of the sets A and B, but not in both, and is denoted by A △ B.

Figure Showing the Symmetric Difference Between Two Sets

In Python, we can find the symmetric difference of two sets using either of the following two ways:

  • Using the ‘^’ operator
  • Using set.symmetric_difference() method

Using the ‘^’ operator

To find the symmetric difference between two sets, we can use the ‘^’ operator.

For Example-

# Symmetric Difference of two sets using ‘^’ operator

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The symmetric difference of two sets is given by\n", my_set_1 ^ my_set_2)
print(" The symmetric difference of two sets is given by\n", my_set_2 ^ my_set_1)

Output:

 The symmetric difference of two sets is given by
 {'1'}
 The symmetric difference of two sets is given by
 {'1'}

Using set.symmetric_difference() method

The symmetric difference of sets can also be determined using the built-in symmetric_difference() method.

For Example-

# Symmetric Difference of two sets using set.symmetric_difference() method

my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The symmetric difference of two sets is given by\n", my_set_1.symmetric_difference(my_set_2))
print(" The symmetric difference of two sets is given by\n", my_set_2.symmetric_difference(my_set_1))

Output:

 The symmetric difference of two sets is given by
 {'1'}
 The symmetric difference of two sets is given by
 {'1'}
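The definition of the symmetric difference can be tied back to the earlier operations. As a small sketch with illustrative integer sets, the result of ‘^’ equals the union minus the intersection, and also equals the union of the two one-sided differences.

```python
set_a = {1, 2, 3}
set_b = {3, 4}

sym_diff = set_a ^ set_b
# Union minus intersection
union_minus_common = (set_a | set_b) - (set_a & set_b)
# Union of the two one-sided differences
both_differences = (set_a - set_b) | (set_b - set_a)

print(sym_diff, union_minus_common, both_differences)
```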

Alternative container: Frozenset

Sometimes we need a set whose elements never change. In that case, we can make use of frozen sets, which are discussed in this section.

A frozenset object is a Python set that cannot be modified once it is created. This means that it is immutable, unlike the normal sets discussed previously. One of the applications of frozen sets is to serve as keys in a dictionary.

Since frozen sets are immutable, you cannot perform operations such as add(), remove(), update(), etc. on them. If you try to add an element to a frozenset, it raises an exception (AttributeError), as the example below shows.

For Example-

simple_set = {1,'Analytics Vidhya', 4.6, 'r'}
frozen_set = frozenset(simple_set)
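print("The frozenset corresponding to a given set is \n", frozen_set)
frozen_set.add('Business Analyst')
print(frozen_set)

Output:

The frozenset corresponding to a given set is 
 frozenset({1, 'r', 4.6, 'Analytics Vidhya'})
AttributeError: 'frozenset' object has no attribute 'add'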

How to Create a Frozenset?

We create a frozenset in Python using the frozenset([iterable]) function, providing an iterable as input. It takes the items of any iterable and returns an immutable set containing them.

For Example-

simple_set = {1,'Analytics Vidhya', 4.6, 'r'}
frozen_set = frozenset(simple_set)
print("The frozenset corresponding to a given set is \n", frozen_set)

Output:

The frozenset corresponding to a given set is 
 frozenset({1, 'r', 4.6, 'Analytics Vidhya'})

The above output consists of the set frozen_set which is a frozen version of simple_set.
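As mentioned earlier, a frozenset can serve as a dictionary key because it is hashable, while a normal set cannot. The sketch below uses a made-up dictionary, role_by_skills, purely for illustration. The skill and role names are assumptions, not data from this article.

```python
# Hypothetical mapping from a combination of skills to a role
role_by_skills = {
    frozenset({'Python', 'SQL'}): 'Data Analyst',
    frozenset({'Python', 'Spark'}): 'Data Engineer',
}

# Element order does not matter when looking up a frozenset key
found_role = role_by_skills[frozenset({'SQL', 'Python'})]
print(found_role)

# A normal set is unhashable, so it cannot be used as a key
try:
    invalid = {set(['Python']): 'Unknown'}
    set_key_allowed = True
except TypeError as error:
    set_key_allowed = False
    print("A normal set cannot be a key:", error)
```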

Accessing Elements of a Frozen Set

Elements of a frozen set can be accessed by using a for loop.

For Example-

frozen_set = frozenset([1, 'Analytics Vidhya', 4.6, 'r'])
for element in frozen_set:
    print(element)

Output:

1
r
4.6
Analytics Vidhya

The above output shows that using the for loop, all the elements of the frozen_set have been returned one after the other.

Conclusion

This article covered the basics of Python sets: what they are, how to create them, and some of their methods. We also had a look at different operations such as adding and removing elements and performing mathematical set operations. When considering Python sets vs lists, sets are ideal when you need to store unique elements without any order, making them highly efficient for operations like membership testing and removing duplicates.
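As a closing sketch of the duplicate-removal use case, converting a list to a set drops the repeated entries. The list job_titles is illustrative. Keep in mind that the original order of the list is not preserved by the set.

```python
# A list with repeated entries
job_titles = ['Data Analyst', 'Data Engineer', 'Data Analyst',
              'Data Scientist', 'Data Engineer']

# Converting to a set keeps one copy of each element
unique_titles = set(job_titles)
print(len(job_titles), len(unique_titles))

# Membership testing works the same way on the set
print('Data Scientist' in unique_titles)
```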

Other Blog Posts by Me

You can also check my previous blog posts.

Previous Data Science Blog posts.

LinkedIn

Here is my Linkedin profile in case you want to connect with me. I’ll be happy to be connected with you.

Email

For any queries, you can mail me on [email protected]

End Notes

Thanks for reading!

I hope that you have enjoyed the article. If you like it, share it with your friends also. Something not mentioned or want to share your thoughts? Feel free to comment below and I’ll get back to you. 😉

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

I am a B.Tech. student (Computer Science major) currently in the pre-final year of my undergrad. My interest lies in the field of Data Science and Machine Learning. I have been pursuing this interest and am eager to work more in these directions. I feel proud to share that I am one of the best students in my class who has a desire to learn many new things in my field.
