In today’s world, data storage is of the utmost importance in every domain. Python provides different types of data structures to organize your data, so learning them is an essential part of your journey with Python, whether from a Software Engineering perspective or from a Data Science perspective. Among the data structures available in Python, some are mutable and some are immutable. In this article, our main focus will be on sets.
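Now, a question comes to mind: “When do we use sets in Python?” Sets are a good fit when you need to store unique elements without any particular order, for example for membership testing or for removing duplicates.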
So, without any further delay, let’s get started. 😎
This article was published as a part of the Data Science Blogathon
A set is a data type that holds an unordered, mutable (changeable) collection of unique elements, i.e., it does not keep repeated copies of an element. The elements of a set can be of different data types, unlike arrays, which are type-specific; however, each element must be hashable, so mutable objects such as lists cannot be stored in a set. The values of a set are unindexed; therefore, indexing operations cannot be performed on sets.
Sets are written with curly brackets ({}), with the elements separated by commas.
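As a small illustration of the properties described above, the sketch below uses a hypothetical set named colors (not one of the article's examples) to show that duplicates are stored only once and that indexing a set raises a TypeError.

```python
# Hypothetical example set; the duplicate 'red' is stored only once
colors = {'red', 'green', 'red', 'blue'}
print(len(colors))  # 3

# Sets are unindexed, so colors[0] raises a TypeError
indexing_failed = False
try:
    colors[0]
except TypeError as error:
    indexing_failed = True
    print("Indexing failed:", error)
```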
We can create sets in Python in either of the following two ways:
Sets in Python can be created with the help of curly braces ({}).
Python Code:
# Create a set using curly braces
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print(my_set)
Output:
{'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
Sets in Python can also be created using the built-in function set([iterable]). This function takes an iterable (i.e., any type of sequence, collection, or iterator) as an argument and returns a set that contains the unique items from the input, i.e., duplicated values are removed.
# Create a set using set() function
my_set = set(['Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'])
print(my_set)
# Check the Data Type of my_set
print(type(my_set))
Output:
{'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
<class 'set'>
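The example above has no repeated values, so the removal of duplicates is not visible. A minimal sketch with made-up data (the names roles and letters are only illustrative) shows it more directly, and also shows that set() accepts other iterables such as strings.

```python
# A list with a repeated value (hypothetical data for illustration)
roles = ['Data Analyst', 'Data Engineer', 'Data Analyst', 'Data Scientist']
unique_roles = set(roles)
print(len(roles))         # 4
print(len(unique_roles))  # 3

# set() accepts any iterable; a string is split into its characters
letters = set('hello')
print(sorted(letters))    # ['e', 'h', 'l', 'o']
```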
We can also create an empty set using the set() function.
# Creating an empty set using set() function
empty_set = set()
print(empty_set)
Output:
set()
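One thing to keep in mind here is that empty curly braces do not create an empty set. The short sketch below compares the two; the variable name empty_braces is just for illustration.

```python
# Empty curly braces create a dictionary, not a set
empty_braces = {}
empty_set = set()

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
```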
We can perform a number of operations on sets, such as adding elements, deleting elements, finding the length of a set, etc. To see all the methods that can be used on sets, we can use the dir() function. Let’s see this on a given set.
# Check all functionalities which we can perform on a Python Set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print(list(dir(my_set)))
Output:
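The list returned by dir() also contains special methods whose names begin with underscores. As a rough sketch, the snippet below filters those out so that only the public set methods remain; the variable name public_methods is just illustrative, and the exact list can vary between Python versions.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# Keep only the names that do not start with an underscore
public_methods = [name for name in dir(my_set) if not name.startswith('_')]
print(public_methods)
```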
To find the length of a set in Python, we use the len() function. This function takes the set as an argument and returns an integer value that is equal to the number of elements present in the set.
# Finding the length of a set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print("The length of a given set is", len(my_set))
Output:
The length of a given set is 4
As mentioned before, the elements of a set are not indexed, so we cannot access them using index numbers. Therefore, if we want to access the elements of a set, we can use a for loop.
# Printing all the elements of a set using For loop
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
for element in my_set:
    print(element)
Output:
Data Scientist
Data Engineer
Data Analyst
Analytics Vidhya
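Besides looping, a common way to work with set elements is to check whether a value is present using the in operator. The sketch below also uses sorted(), which is one possible way to get a predictable order when printing, since the iteration order of a set is not guaranteed.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# Membership testing with the in operator
print('Data Analyst' in my_set)      # True
print('Business Analyst' in my_set)  # False

# sorted() returns a list, so the printing order is predictable
for element in sorted(my_set):
    print(element)
```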
We can add new elements to a set using either of two methods: add(), which adds a single element, and update(), which adds all the elements of an iterable.
# Adding a single element ‘Business Analyst’ to an existing set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.add('Business Analyst')
print(my_set)
Output:
{'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}
# Adding more than one element, ‘Business Analyst’ and ‘Data Mastermind’, to an existing set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.update(['Business Analyst', 'Data Mastermind'])
print(my_set)
Output:
{'Data Mastermind', 'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}
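The difference between the two methods matters most with strings: add() stores the whole string as one element, while update() treats it as an iterable of characters. A small sketch with a hypothetical skills set:

```python
skills = {'Python'}

# add() stores the whole string as a single element
skills.add('SQL')

# update() iterates over its argument, so a string is split into characters
skills.update('AB')

# update() also accepts several iterables at once
skills.update(['Excel'], ('Tableau',))

print(sorted(skills))  # ['A', 'B', 'Excel', 'Python', 'SQL', 'Tableau']
```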
We can use the following ways to either remove elements from a set or delete a complete set:
The set.remove(x) method takes one parameter x and removes that element x from a set. If the element passed to this method does not exist in the set, it raises an exception (KeyError).
In the below example, you can see that ‘Analytics Vidhya’ has been removed from the set using the remove() function. But when we specify ‘Business Analyst’ i.e, some element as a parameter to remove() that does not exist in the set, it will throw an error.
# Remove the element using remove() function
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.remove('Analytics Vidhya')
print(my_set)
my_set.remove('Business Analyst')
print(my_set)
Output:
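The second remove() call above stops the program with a KeyError. A minimal sketch of how that error could be handled, assuming we simply want to report the missing element (the variable name missing is illustrative):

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.remove('Analytics Vidhya')

missing = None
try:
    my_set.remove('Business Analyst')
except KeyError as error:
    # The missing element is available as the first argument of the exception
    missing = error.args[0]
    print("Could not remove:", missing)
```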
The set.discard(x) method takes one parameter x and removes that element x from a set if it is present. Now, if we want to remove some element from the set, and we are not sure whether that element is actually present in the set or not, then we can use this function. In comparison to the remove method, the discard method does not raise an exception (KeyError) if the element to be removed does not exist.
# Remove the element using discard() function
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.discard('Analytics Vidhya')
print(my_set)
my_set.discard('Business Analyst')
print(my_set)
Output:
{'Data Scientist', 'Data Engineer', 'Data Analyst'}
{'Data Scientist', 'Data Engineer', 'Data Analyst'}
In the above example, you can see that ‘Analytics Vidhya’ has been removed from my_set but discard() has not thrown an error when I used my_set.discard(‘Business Analyst’) even though ‘Business Analyst’ is not present in my set.
The set.pop() method also removes an element from a set and returns it, but since a set is unordered, we cannot know in advance which element will be removed.
In the below example, you can see that the pop() function removes an arbitrary element, which in this case is ‘Data Scientist’.
# Remove the element using pop() function
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.pop()
print(my_set)
Output:
{'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
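Since pop() returns the element it removes, we can capture that value instead of guessing it from the remaining set. The sketch below also shows that calling pop() on an empty set raises a KeyError; the variable names removed and pop_on_empty_failed are illustrative.

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# pop() returns the arbitrary element that was removed
removed = my_set.pop()
print("Removed:", removed)
print(len(my_set))  # 3

# pop() on an empty set raises a KeyError
pop_on_empty_failed = False
try:
    set().pop()
except KeyError:
    pop_on_empty_failed = True
```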
The set.clear() method deletes all the elements present in a given set.
# Remove all the elements using clear() function
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
my_set.clear()
print(my_set)
Output:
set()
In the above example, we can see that after the clear() operation, my_set becomes an empty set.
When we want to completely delete the set, we can use the del keyword to do so.
# Delete the complete set
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
del my_set
print(my_set)
Output:
In the above example, we can see that running the code throws an error (NameError), because my_set no longer exists after the del statement.
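To make the contrast between clear() and del concrete, here is a short sketch: after clear() the name still refers to an (empty) set, whereas after del the name itself is gone. The flag name_error_raised is only there for illustration.

```python
my_set = {'Analytics Vidhya', 'Data Scientist'}

# clear() empties the set, but the variable still exists
my_set.clear()
print(my_set)  # set()

# del removes the variable itself
del my_set

name_error_raised = False
try:
    print(my_set)
except NameError as error:
    name_error_raised = True
    print("my_set no longer exists:", error)
```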
We can use sets in Python to compute mathematical operations such as union, intersection, difference, and symmetric difference.
These logical operations can be represented with a diagram, which is known as the Venn diagram. Venn diagrams are widely used in Mathematics, Statistics, and Computer science to visualize the differences and similarities between the sets.
Image Showing all the Mathematical Operations which we can perform on Sets
The union of two sets A and B is defined as the set containing the elements that are in A, B, or both, and is denoted by A ∪ B.
Figure Showing the Union of Two Sets
To compute this operation with Python, we can use either of the following two ways:
We can compute the union of two sets in Python using the ‘|’ operator.
# Union of two sets using ‘|’ operator
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The union of two sets is given by\n", my_set_1 | my_set_2)
Output:
The union of two sets is given by
{1, 'Analytics Vidhya', 3.45, '1'}
To compute the union of two or more sets, we can also use the union() method.
# Union of two sets using set.union() method
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The union of two sets is given by\n", my_set_1.union(my_set_2))
Output:
The union of two sets is given by
{1, 3.45, 'Analytics Vidhya', '1'}
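The two approaches are not fully interchangeable. As a sketch with hypothetical sets set_a, set_b, and set_c: the union() method accepts several arguments and any iterable, while the ‘|’ operator requires both operands to be sets. The ‘|=’ form updates a set in place.

```python
set_a = {1, 2, 3}
set_b = {3, 4}
set_c = {5}

# union() can take more than one set at once
all_values = set_a.union(set_b, set_c)
print(all_values == {1, 2, 3, 4, 5})  # True

# union() also accepts any iterable, such as a list
from_list = set_a.union([3, 4, 4])
print(from_list == {1, 2, 3, 4})      # True

# |= adds the elements of set_b to set_a in place
set_a |= set_b
print(set_a == {1, 2, 3, 4})          # True
```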
The intersection of two sets A and B is defined as the set that consists of the elements that are common to both sets and is denoted by A ∩ B.
Figure Showing the Intersection of Two Sets
In Python, we can compute the intersection of two sets using either of the following two ways:
We can determine the intersection of two or more sets using the ‘&’ operator.
# Intersection of two sets using ‘&’ operator
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The intersection of two sets is given by\n", my_set_1 & my_set_2)
Output:
The intersection of two sets is given by
{1, 3.45, 'Analytics Vidhya'}
We can also determine the intersection of two or more sets using the intersection() function.
# Intersection of two sets using set.intersection() method
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The intersection of two sets is given by\n", my_set_1.intersection(my_set_2))
Output:
The intersection of two sets is given by
{1, 3.45, 'Analytics Vidhya'}
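For more than two sets, the intersection() method can take all of them in one call. The sketch below uses hypothetical numeric sets and also shows isdisjoint(), a related built-in method that reports whether two sets have no elements in common.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}
set_c = {4, 6}

# Elements common to set_a and set_b
common_ab = set_a & set_b
print(common_ab == {3, 4})   # True

# Elements common to all three sets
common_all = set_a.intersection(set_b, set_c)
print(common_all == {4})     # True

# isdisjoint() returns True when the intersection is empty
print(set_a.isdisjoint({7, 8}))  # True
```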
When we take the difference between two sets, it produces a new set that consists of the elements of the first set that are not present in the second set. The order of the operands matters: in general, A-B is not equal to B-A.
In simple and short terms, we can say that the difference between two sets A and B is defined as the set of all elements of set A that are not contained in set B and is denoted by A-B.
Figure Showing the Difference Between the Two Sets
To compute the difference between two sets in Python, we can use either of the following two ways:
To find the difference between the two sets, we can use the ‘-’ operator.
# Difference of two sets using ‘-‘ operator
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The difference of set 1 from set 2 is given by\n", my_set_1 - my_set_2)
print(" The difference of set 2 from set 1 is given by\n", my_set_2 - my_set_1)
Output:
The difference of set 1 from set 2 is given by
set()
The difference of set 2 from set 1 is given by
{'1'}
The difference of sets can also be determined using the built-in difference() method.
# Difference of two sets using set.difference() method
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The difference of set 1 from set 2 is given by\n", my_set_1.difference(my_set_2))
print(" The difference of set 2 from set 1 is given by\n", my_set_2.difference(my_set_1))
Output:
The difference of set 1 from set 2 is given by
set()
The difference of set 2 from set 1 is given by
{'1'}
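Because one of the results above is an empty set, the direction of the difference can be easy to miss. A small sketch with hypothetical numeric sets, where both directions give a non-empty result:

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}

# Elements of set_a that are not in set_b
a_minus_b = set_a - set_b
print(a_minus_b == {1, 2})  # True

# Elements of set_b that are not in set_a
b_minus_a = set_b.difference(set_a)
print(b_minus_a == {5})     # True
```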
The symmetric difference of two sets A and B is defined as the set of elements that are in either of the sets A and B, but not in both, and is denoted by A △ B.
Figure Showing the Symmetric Difference Between Two Sets
In Python, we can find the symmetric difference of two sets using either of the following two ways:
To find the symmetric difference between two sets, we can use the ‘^’ operator.
# Symmetric Difference of two sets using ‘^’ operator
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The symmetric difference of two sets is given by\n", my_set_1 ^ my_set_2)
print(" The symmetric difference of two sets is given by\n", my_set_2 ^ my_set_1)
Output:
The symmetric difference of two sets is given by
{'1'}
The symmetric difference of two sets is given by
{'1'}
The symmetric difference of sets can also be determined using the built-in symmetric_difference() method.
# Symmetric Difference of two sets using set.symmetric_difference() method
my_set_1 = {1, 'Analytics Vidhya', 3.45}
my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
print(" The symmetric difference of two sets is given by\n", my_set_1.symmetric_difference(my_set_2))
print(" The symmetric difference of two sets is given by\n", my_set_2.symmetric_difference(my_set_1))
Output:
The symmetric difference of two sets is given by
{'1'}
The symmetric difference of two sets is given by
{'1'}
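The symmetric difference can also be expressed through the other operations covered in this article. The sketch below, using hypothetical numeric sets, checks that A △ B equals both (A ∪ B) - (A ∩ B) and (A - B) ∪ (B - A).

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}

sym_diff = set_a ^ set_b
print(sym_diff == {1, 2, 5})  # True

# Union minus intersection gives the same result
via_union = (set_a | set_b) - (set_a & set_b)

# So does the union of the two one-sided differences
via_differences = (set_a - set_b) | (set_b - set_a)

print(sym_diff == via_union == via_differences)  # True
```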
Sometimes we make a set that we do not want to change at all. In such cases, we can make use of frozen sets, which are discussed in this section.
A frozenset object is a Python set that cannot be modified i.e, once created, cannot be changed. This means that it is immutable, unlike a normal set that I have discussed previously. One of the applications of frozen sets is to serve as a key in dictionary key-value pairs.
Since frozen sets are immutable, you cannot perform operations such as add(), remove(), update(), etc. If you try to add an element to a frozenset, it raises an exception (AttributeError). (See the below example.)
simple_set = {1,'Analytics Vidhya', 4.6, 'r'}
frozen_set = frozenset(simple_set)
print("The frozenset corresponding to a given set is \n", frozen_set)
frozen_set.add('Business Analyst')
print(frozen_set)
Output:
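The add() call above fails because a frozenset has no add() method at all. A minimal sketch of catching that error, which also shows that non-mutating operations such as ‘|’ still work and return a new frozenset (the name extended is illustrative):

```python
frozen_set = frozenset([1, 'Analytics Vidhya', 4.6, 'r'])

add_failed = False
try:
    frozen_set.add('Business Analyst')
except AttributeError as error:
    add_failed = True
    print("Cannot modify a frozenset:", error)

# Operations that build a new object are still allowed
extended = frozen_set | {'Business Analyst'}
print(type(extended))   # <class 'frozenset'>
print(len(frozen_set))  # 4, the original is unchanged
```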
We create a frozenset in Python using the built-in frozenset([iterable]) function, providing an iterable as input. This function takes the items of any iterable and returns them as an immutable frozenset.
simple_set = {1,'Analytics Vidhya', 4.6, 'r'}
frozen_set = frozenset(simple_set)
print("The frozenset corresponding to a given set is \n", frozen_set)
Output:
The frozenset corresponding to a given set is
frozenset({1, 'r', 4.6, 'Analytics Vidhya'})
The above output consists of the set frozen_set which is a frozen version of simple_set.
Elements of a frozen set can be accessed by using a for loop.
frozen_set = frozenset([1, 'Analytics Vidhya', 4.6, 'r'])
for element in frozen_set:
    print(element)
Output:
1
r
4.6
Analytics Vidhya
The above output shows that using the for loop, all the elements of the frozen_set have been returned one after the other.
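As mentioned earlier, one application of frozen sets is to serve as dictionary keys. The sketch below uses a made-up dictionary named team_projects to illustrate this; a normal set cannot be used the same way because it is not hashable.

```python
# Hypothetical mapping from a group of roles to a project name
team_projects = {
    frozenset({'Data Analyst', 'Data Engineer'}): 'Pipeline audit',
}

# Element order does not matter when looking up a frozenset key
lookup = frozenset({'Data Engineer', 'Data Analyst'})
print(team_projects[lookup])  # Pipeline audit

# A normal (mutable) set is unhashable and cannot be a key
set_key_failed = False
try:
    invalid = {set(lookup): 'Pipeline audit'}
except TypeError as error:
    set_key_failed = True
    print("A set cannot be a dictionary key:", error)
```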
This article covered the basics of Python sets: what they are, how to create them, and some of their properties. We also had a look at different operations like adding and removing elements and performing mathematical set operations. When considering Python sets vs lists, sets are ideal when you need to store unique elements without any order, making them highly efficient for operations like membership testing and removing duplicates.
You can also check my previous blog posts.
Here is my Linkedin profile in case you want to connect with me. I’ll be happy to be connected with you.
Thanks for reading!
I hope that you have enjoyed the article. If you like it, share it with your friends also. Is something not mentioned, or do you want to share your thoughts? Feel free to comment below and I’ll get back to you. 😉
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.