In the previous article, we gave a comprehensive overview of the “Python Programming Language and its built-in Data Structure.” We also discussed why learning Python matters for staying relevant in today’s competitive data science market, where task automation and the rise of Gen-AI and LLMs have already led to layoffs.
In this article, I’ll help you understand core Advanced Python topics, such as Classes and Generators, along with some additional topics that matter from a data scientist’s perspective, each with a sample example.
By the end of this article, you will have a solid understanding of the Python programming language, which is helpful for both interview preparation and day-to-day work as a data scientist and Python developer. By adopting these tips, you will write more efficient code and boost your productivity while working with a team.
Advanced Python Programming is the study and application of sophisticated Python concepts beyond basic programming. It includes topics like object-oriented programming (OOP), decorators, generators, context managers, and metaclasses. It also covers advanced data structures, algorithms, concurrency, parallelism, and techniques for optimizing code performance. Mastery of these concepts enables developers to write more efficient, scalable, and maintainable code suitable for complex applications in fields like data science, machine learning, web development, and software engineering.
Python lets developers create custom objects using the `class` keyword. A class is the object’s blueprint: it bundles attributes (the encapsulated data) with methods (the behavior).
class Container:
    def __init__(self, data):
        self.data = data
`Container` is a simple class that wraps a primitive value (in this case, an integer).
# The code defines a class called `Container` with a constructor method `__init__`
# that takes a parameter `data` and assigns it to an instance variable `self.data`.
class Container:
    def __init__(self, data):
        self.data = data
# `calculate` mutates the object it receives, so the caller sees the change.
# (The parameter is named `box` to avoid shadowing the built-in `input`.)
def calculate(box):
    box.data **= 5
container = Container(5)
calculate(container)
print(container.data)
3125
c1 = Container(5)
c2 = Container(5)
print(id(c1), id(c2))
print(id(c1) == id(c2))  # False because they are different objects.
print(c1 is c2)  # False: same value, but two distinct instances.
print(c1 == c2)  # False: without a custom __eq__, == falls back to identity.
1274963509840 1274946106128
False
False
False
Once an `__eq__` method is added to the class (dynamically, in this example), it is used for the equality check (`c1 == c2`).
c1 = Container(5)
c2 = Container(5)
print(c1 == c2) # Compares by value
# This time, the result is True because the custom __eq__ method compares
# the values inside the Container instances.
print(c1 is c2) # Compares by identity (address)
# The is operator still returns False because it checks for identity.
True
False
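The snippet above relies on an `__eq__` method whose definition is not shown here. A minimal sketch of how one might be attached to the class after its definition, assuming equality should compare the wrapped `data` values (the helper name `_container_eq` is hypothetical):

```python
class Container:
    def __init__(self, data):
        self.data = data


def _container_eq(self, other):
    # Hypothetical helper: two containers are equal when their data matches.
    if not isinstance(other, Container):
        return NotImplemented
    return self.data == other.data


# Attach the method to the class after it has been defined.
Container.__eq__ = _container_eq

c1 = Container(5)
c2 = Container(5)
print(c1 == c2)  # compares by value
print(c1 is c2)  # compares by identity
```

In everyday code, `__eq__` is usually written inside the class body; attaching it afterwards mainly illustrates that Python classes can be modified at runtime.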
Generators are a special type of iterator, created by a function that uses the `yield` keyword (or by a generator expression), that produce values on the fly instead of storing them all in memory.
import sys
my_list = [i for i in range(1000)]
print(sum(my_list))
print("Size of list", sys.getsizeof(my_list), "bytes")
my_gen = (i for i in range(1000))
print(sum(my_gen))
print("Size of generator", sys.getsizeof(my_gen), "bytes")
499500
Size of list 8856 bytes
499500
Size of generator 112 bytes
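The memory saving comes from laziness: a generator computes each value only when it is asked for, and it can be consumed only once. A small sketch, using a hypothetical `countdown` generator function written for illustration:

```python
def countdown(start):
    # Hypothetical generator function: yields start, start - 1, ..., 1.
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # consumes the remaining values
again = list(gen)   # the generator is exhausted by now
print(first, rest, again)
```

If the values are needed more than once, either build a list or create a fresh generator by calling the function again.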
def fib(count):
    a, b = 0, 1
    while count:
        yield a
        a, b = b, b + a
        count -= 1
gen = fib(100)
print(next(gen), next(gen), next(gen), next(gen), next(gen))
for i in fib(20):
    print(i, end=" ")
0 1 1 2 3
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
This code demonstrates positive infinity, a Fibonacci number generator, and how to break out of the generator loop based on a condition.
import math
# Printing Infinity: special floating-point representation
print(math.inf)
# Assign infinity to a variable and perform an operation
inf = math.inf
print(inf, inf - 1)  # Still infinity, even after subtracting 1
# Fibonacci Generator:
def fib(count):
    a, b = 0, 1
    while count:
        yield a
        a, b = b, b + a
        count -= 1
# Use the Fibonacci generator with an infinite count
f = fib(math.inf)
# Iterate through the Fibonacci numbers until a condition is met
for i in f:
    if i >= 200:
        break
    print(i, end=" ")
inf
inf inf
0 1 1 2 3 5 8 13 21 34 55 89 144
import math
# The code is creating a Fibonacci sequence generator using a generator function called `fib`.
def fib(count):
    a, b = 0, 1
    while count:
        yield a
        a, b = b, b + a
        count -= 1
# The `fib` function takes a parameter `count` which determines the number of Fibonacci numbers to generate.
f = fib(10)
# This code generates Fibonacci numbers and creates a list containing the square root of each Fibonacci number.
data = [round(math.sqrt(i), 3) for i in f]
print(data)
[0.0, 1.0, 1.0, 1.414, 1.732, 2.236, 2.828, 3.606, 4.583, 5.831]
The generator function can be simpler if it does not take a `count` parameter at all: make it infinite and let `itertools` limit how many values are consumed.
import itertools
def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, b + a
# itertools.islice is used to get the first 20 values from an infinite generator.
print(list(itertools.islice(fib(), 20)))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
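`islice` limits an infinite generator by position. When the stopping rule is a condition on the values instead (as in the earlier loop that breaks at 200), `itertools.takewhile` is one possible alternative. A minimal sketch:

```python
import itertools


def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Keep taking values for as long as the predicate holds, then stop.
below_200 = list(itertools.takewhile(lambda n: n < 200, fib()))
print(below_200)
```

This works here because the Fibonacci sequence never decreases; `takewhile` stops at the first value that fails the test and does not look any further.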
# Defines a custom LinkedList class with a Node class as an element.
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node
class LinkedList:
    def __init__(self, start):
        self.start = start
    # The __iter__ method is implemented to allow iteration over the linked list.
    def __iter__(self):
        node = self.start
        while node:
            yield node
            node = node.next
ll = LinkedList(Node(5, Node(10, Node(15, Node(20)))))
for node in ll:
    print(node.data)
5
10
15
20
Literals are constant values written directly in the source code; they can be assigned to variables or used directly in expressions. A literal is simply the syntax Python uses to express a fixed value of a specific data type.
For Example:
4x - 7 = 9
# 4 : Coefficient
# x : Variable
# - and = : Operator
# 7 and 9 : Literals (constants)
Python supports various types of literals, such as string, numeric, Boolean, special (`None`), and collection literals.
In my previous article, I discussed the collection literals, which you can refer to here. In this article, we will discuss string, numeric, Boolean, and special literals.
A string literal is created by writing text (i.e., a group of characters) inside single quotes (' '), double quotes (" "), or triple quotes (to store multi-line strings).
For instance,
# in single quote
s = 'AnalyticalNikita.io'
print(s)
# in double quotes
d = "AnalyticalNikita.io"
print(d)
# multi-line String
m = '''Analytical
Nikita.
io'''
print(m)
# Character Literal
char = "A"
print(char)
# Unicode Literal
unicodes = u"\u0041"
print(unicodes)
# Raw String
raw_str = r"raw \n string"
print(raw_str)
AnalyticalNikita.io
AnalyticalNikita.io
Analytical
Nikita.
io
A
A
raw \n string
There are three types of numeric literals in Python, all of which are immutable: integer, float, and complex.
# integer literal
# Binary Literals
a = 0b10100
# Decimal Literal
b = 50
# Octal Literal
c = 0o320
# Hexadecimal Literal
d = 0x12b
print(a, b, c, d)
# Float Literal
e = 24.8
print(e)
# Complex Literal
f = 2+3j
print(f)
20 50 208 299
24.8
(2+3j)
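Binary, octal, and hexadecimal literals are only different spellings of the same integers. As a quick sketch, the built-in `bin()`, `oct()`, `hex()`, and `int()` functions convert between these spellings (the variable names are chosen for illustration):

```python
value = 0x12b  # hexadecimal literal
same_as_decimal = (value == 299)

# Integer -> string in literal-style notation
as_text = (bin(20), oct(208), hex(299))

# String -> integer, given the base
parsed = (int("0b10100", 2), int("320", 8), int("12b", 16))

# Underscores may be used as digit separators in numeric literals
big = 1_000_000

print(same_as_decimal, as_text, parsed, big)
```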
As in many other programming languages, Python has only two Boolean literals: `True` and `False`. In arithmetic expressions, Python treats `True` as 1 and `False` as 0.
Such as:
a = (1 == True)
b = (1 == False)
c = True + 4
d = False + 10
print("a is", a)
print("b is", b)
print("c:", c)
print("d:", d)
a is True
b is False
c: 5
d: 10
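This numeric behavior follows from `bool` being a subclass of `int`, and it is handy for counting how many items satisfy a condition. A short sketch with made-up sample data:

```python
is_int = isinstance(True, int)  # bool is a subclass of int

ages = [6, 61, 36, 17]  # hypothetical sample data
adults = sum(age >= 18 for age in ages)  # each True counts as 1

print(is_int, adults)
```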
Typically, `None` is used to represent a null value, i.e., the absence of a value.
hi = None
print(hi)
None
Note: Comparing `None` with any value other than `None` using `==` normally returns False.
hi = None
bye = "ok"
print(hi == bye)
False
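When the goal is to check whether a variable is `None`, the usual idiom is the identity test `is None` instead of `==`. A minimal sketch, using a hypothetical `describe` helper:

```python
def describe(value=None):
    # Hypothetical helper: `is None` tests identity with the single None object.
    if value is None:
        return "no value given"
    return f"got {value!r}"


print(describe())
print(describe(0))  # 0 is falsy, but it is not None
```

Testing with `is None` also avoids confusing `None` with other falsy values such as `0`, `""`, or an empty list.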
`None` is known as a special literal because it is also commonly used as a placeholder when declaring a variable. If you do not yet know a variable’s value, you can assign `None` to it without causing any error.
k = None
a = 7
print("Program is running..")
Program is running..
We saw this function previously in the context of Python’s built-in data structures: given iterables of equal length (such as lists or dictionaries) as arguments, `zip()` aggregates the corresponding elements from each iterable.
# Example using zip with two lists
numbers = [1, 2, 3]
letters = ['a', 'b', 'c']
# Zip combines corresponding elements from both lists
zipped_result = zip(numbers, letters)
# Iterate over the zipped result
for number, letter in zipped_result:
    print(f"Number: {number}, Letter: {letter}")
Number: 1, Letter: a
Number: 2, Letter: b
Number: 3, Letter: c
But what if the iterables have unequal lengths? In this case, we can use `zip_longest()` from the `itertools` module to aggregate the elements. Missing values are filled with `fillvalue`, which defaults to `None`; the example below uses `'N/A'`.
from itertools import zip_longest
# Example using zip_longest with two lists of different lengths
numbers = [1, 2, 3]
letters = ['a', 'b']
# zip_longest fills missing values with a specified fillvalue (default is None)
zipped_longest_result = zip_longest(numbers, letters, fillvalue='N/A')
# Iterate over the zipped_longest result
for number, letter in zipped_longest_result:
    print(f"Number: {number}, Letter: {letter}")
Number: 1, Letter: a
Number: 2, Letter: b
Number: 3, Letter: N/A
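For comparison, plain `zip()` silently stops at the shortest input, which is why `zip_longest()` is needed when no element should be dropped. A quick sketch with the same two lists:

```python
numbers = [1, 2, 3]
letters = ['a', 'b']

# zip() stops as soon as the shorter iterable runs out, so 3 is dropped.
pairs = list(zip(numbers, letters))
print(pairs)
```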
When parameters have default values, you can pass arguments by name; positional arguments must come first, to the left of any named arguments.
from itertools import zip_longest
def zip_lists(list1=[], list2=[], longest=True):
    if longest:
        return [list(item) for item in zip_longest(list1, list2)]
    else:
        return [list(item) for item in zip(list1, list2)]
names = ['Alice', 'Bob', 'Eva', 'David', 'Sam', 'Ace']
points = [100, 250, 30, 600]
print(zip_lists(names, points))
[['Alice', 100], ['Bob', 250], ['Eva', 30], ['David', 600], ['Sam', None], ['Ace', None]]
You can also pass named arguments in any order, and even skip those that have default values.
from itertools import zip_longest
def zip_lists(list1=[], list2=[], longest=True):
    if longest:
        return [list(item) for item in zip_longest(list1, list2)]
    else:
        return [list(item) for item in zip(list1, list2)]
print(zip_lists(longest=True, list2=['Eva']))
[[None, 'Eva']]
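One caveat about default values: `zip_lists` uses lists as defaults, which is safe there because the function never modifies them. A default value is created only once, when the function is defined, so a function that does mutate its default will share it across calls. The sketch below illustrates the pitfall and a common workaround; `append_bad` and `append_good` are hypothetical names used only for this example.

```python
def append_bad(item, bucket=[]):
    # The default list is created once and shared across calls.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # A fresh list is created on each call when none is passed in.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))   # copy, to capture the state after call 1
bad_second = list(append_bad(2))  # the 1 from the previous call is still there
good_first = append_good(1)
good_second = append_good(2)
print(bad_first, bad_second)
print(good_first, good_second)
```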
while True:
    print("""Choose an option:
1. Do this
2. Do that
3. Do this and that
4. Quit""")
    # if input() == "4":
    if True:
        break
Choose an option:
1. Do this
2. Do that
3. Do this and that
4. Quit
fruits = ['apple', 'banana', 'kiwi', 'orange']
# Using enumerate to iterate over the list with both index and value
for index, fruit in enumerate(fruits):
    print(f"Index {index}: {fruit}")
print("\n")
# You can also specify a start index (default is 0)
for index, fruit in enumerate(fruits, start=1):
    print(f"Index {index}: {fruit}")
Index 0: apple
Index 1: banana
Index 2: kiwi
Index 3: orange
Index 1: apple
Index 2: banana
Index 3: kiwi
Index 4: orange
import time
def done():
    print("done")
def do_something(callback):
    time.sleep(2)  # pause for 2 seconds to simulate work
    print("Doing things....")  # runs before the callback is invoked
    callback()  # call the provided callback function
# Call do_something with the done function as the callback
do_something(done)
Doing things....
done
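A callback can also receive data from the function that calls it. The following sketch assumes a hypothetical `process` helper that passes each computed result to whatever callable it is given:

```python
def process(values, on_result):
    # Hypothetical helper: squares each value and hands it to the callback.
    for value in values:
        on_result(value * value)


collected = []
process([1, 2, 3], collected.append)  # a bound method works as a callback
print(collected)
```

Any callable will do here: a named function, a lambda, or a bound method such as `collected.append`.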
dictionary_data = [{"name": "Max", "age": 6},
{"name": "Max", "age": 61},
{"name": "Max", "age": 36},
]
sorted_data = sorted(dictionary_data, key=lambda x : x["age"])
print("Sorted data: ", sorted_data)
Sorted data: [{'name': 'Max', 'age': 6}, {'name': 'Max', 'age': 36}, {'name': 'Max', 'age': 61}]
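The `key` argument accepts any callable, so the lambda above is one option among several. As a sketch, `operator.itemgetter` does the same lookup, and `reverse=True` flips the order (the sample records are made up for illustration):

```python
from operator import itemgetter

people = [
    {"name": "Max", "age": 6},
    {"name": "Eva", "age": 61},
    {"name": "Bob", "age": 36},
]  # hypothetical sample data

# itemgetter("age") behaves like lambda x: x["age"]
oldest_first = sorted(people, key=itemgetter("age"), reverse=True)
names = [person["name"] for person in oldest_first]
print(names)
```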
If you’re curious to find the Python version on which you are working, you can use this code:
from platform import python_version
print(python_version())
3.9.13
We can also use `__doc__` to return the docstring of a function, which describes the object, its parameters, and its default behavior.
print(print.__doc__)
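The same attribute is available on functions you write yourself: the string placed at the top of the function body becomes its `__doc__`. A small sketch with a hypothetical `area` function:

```python
def area(width, height=1.0):
    """Return the area of a rectangle.

    `height` defaults to 1.0.
    """
    return width * height


print(area.__doc__)
```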
You can use the `.setdefault()` method to insert a key with a specified default value if the key is not already present in your dictionary. In contrast, `.get()` only looks the key up and returns `None` (or a default you supply) when it is missing, without modifying the dictionary.
my_dict = {"name": "Max", "age": 6}
count = my_dict.get("count")
print("Count is there or not:", count)
# Setting default value if count is none
count = my_dict.setdefault("count", 9)
print("Count is there or not:", count)
print("Updated my_dict:", my_dict)
Count is there or not: None
Count is there or not: 9
Updated my_dict: {'name': 'Max', 'age': 6, 'count': 9}
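A common use of `.setdefault()` is grouping items under a key without first checking whether the key exists. A minimal sketch with made-up records:

```python
records = [("fruit", "apple"), ("veg", "kale"), ("fruit", "kiwi")]  # hypothetical data

groups = {}
for category, item in records:
    # Create the list the first time a category is seen, then append to it.
    groups.setdefault(category, []).append(item)

print(groups)
```

`collections.defaultdict(list)` is another way to get the same result.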
from collections import Counter
my_list = [1,2,1,2,2,2,4,3,4,4,5,4]
counter = Counter(my_list)
print("Count of the numbers are: ", counter)
most_common = counter.most_common(2)  # the argument is how many of the most common elements to return
print("Most Common Number is: ", most_common[0])  # first of the 2 most common (element, count) pairs
Count of the numbers are: Counter({2: 4, 4: 4, 1: 2, 3: 1, 5: 1})
Most Common Number is: (2, 4)
d1 = {"name": "Max", "age": 6}
d2 = {"name": "Max", "city": "NY"}
merged_dict = {**d1, **d2}
print("Here is merged dictionary: ", merged_dict)
Here is merged dictionary: {'name': 'Max', 'age': 6, 'city': 'NY'}
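On Python 3.9 or newer, the merge operator `|` offers another way to write the same thing. A brief sketch reusing the two dictionaries above:

```python
d1 = {"name": "Max", "age": 6}
d2 = {"name": "Max", "city": "NY"}

merged = d1 | d2      # new dictionary; requires Python 3.9+
d1 |= {"age": 7}      # in-place update; the right-hand value wins
print(merged)
print(d1)
```

As with `{**d1, **d2}`, when both dictionaries contain the same key, the value from the right-hand side is kept.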
There are mainly two types of errors that can occur in a program, namely: syntax errors and runtime errors (exceptions).
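To make the distinction concrete, here is a small sketch. It uses the built-in `compile()` only so that the syntax error can be caught and shown from a running script; normally Python reports a syntax error before any of the code runs.

```python
# A syntax error is detected while the source is parsed, before execution.
try:
    compile("print('hello'", "<example>", "exec")  # missing closing parenthesis
    syntax_ok = True
except SyntaxError:
    syntax_ok = False

# A runtime error (exception) is raised while valid code is executing.
try:
    result = 10 / 0
except ZeroDivisionError as exc:
    result = None
    message = str(exc)

print(syntax_ok, result, message)
```

Catching the exception with `try`/`except` lets the program continue instead of crashing.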
Congratulations! By now, I believe you have built a strong foundation in Python Programming. We have covered everything from Python Basics, including Operators and literals (numbers, strings, lists, dictionaries, sets, tuples), to Advanced Python topics such as Classes and Generators.
To level up your production-level coding skills, I have also discussed the two types of errors that can occur while writing a program. This way, you’ll be aware of them, and you can also refer to this article, where I have discussed how to Debug those Errors.
Additionally, I have compiled all the code in a Jupyter Notebook, which you can find here. It will serve as a quick syntax reference in the future.
Ans. Python literals are fixed values that we define in the source code, such as numbers, strings or booleans. They can be used in the program later as required.
Ans. A function is a block of code designed to perform a specific task; it runs, and can return a value, only when it is called. On the other hand, Python classes are blueprints used for creating application-specific custom objects.
Ans. Here’s the difference:
A. Iterators are objects with a `__next__()` method that helps retrieve the next element while iterating over an iterable.
B. Generators are a special type of iterator, defined like a regular Python function, but they `yield` values instead of returning them.
Ans. Syntax errors are raised by the interpreter while it parses (compiles) the program, when the code does not follow the language’s grammar. Runtime errors, or exceptions, occur while the program is executing and crash it unless they are handled.