In Python programming, threading is a powerful technique for achieving concurrency and optimizing performance. By allowing the execution of multiple tasks simultaneously within a single process, threading opens up new horizons for efficiently utilizing system resources. In this article, we explore the intricacies of threading in Python, uncovering its potential to enhance speed, responsiveness, and scalability in your applications. Get ready to embark on a journey where concurrency becomes a reality!
This article was published as a part of the Data Science Blogathon.
In the past, machines had a single core within the CPU, limiting their ability to handle multiple tasks simultaneously. However, the number of cores in a machine has become crucial as it determines its capacity for multitasking. For instance, a machine with 16 cores can perform 16 operations concurrently. To illustrate, if you need to perform 16 addition operations taking 1 second each, a single-core machine would require 16 seconds, whereas a 16-core machine could complete them all in just 1 second. This ability to execute tasks simultaneously across multiple cores is known as parallelism, a key advantage of modern computing architectures.
Thread is a set of operations that needs to execute. The thread will be deployed in one of the cores in the CPU. Note- 1 thread can be deployed only in 1 core, it cannot be transferred/switched tooth
Let us have deployed two threads to a core. Note- A core can do only one thing at a time.
Now we can process the two threads in the way we want.
First, we can process half of the first thread.
Half of the next thread can be processed now.
The remaining half of the threads can be processed in a similar fashion.
This what threading is- It is how do we run different things on the same CPU core. TLDR- Threading is about how do we handle the threads in a core.
Note- Threading does not involve running on multiple cores. It is about how to sequence the set of programs(threads) in the same core. In the above example, we are gonna tell the CPU how to sequence and execute the two threads on the given core.
You can actually see the number of threads that are currently running on your machine. In Windows- Go to Task Manager → Performance → CPU.
Why can’t we actually process one thread at a time and move on to the next?
Sometimes, a thread can go into a hanging, which means it is supposed to be idle at that point in time. The best example is time.sleep() function, which does nothing but waits for a given time. While one thread is idle/hanging, we can move on and process the other thread until the previous thread becomes active. TLDR- When one thread is waiting, you can process the other thread meanwhile.
This exactly what we call Concurrent Computing.
Let us explain the threading with a small example. Look at the code snippet below.
#Part One of the code
import time
print(1)
time.sleep(10)
print('Done sleeping)
print(2)
#Part Two of the code
print(3)
When you execute the whole code as a single thread, the code is executed step by step. First, the library is imported. Then ‘1’ is printed. The threads sleep for 10 seconds. Next ‘2’ is printed followed by ‘Done sleeping’ and finally ‘3’ printed.
1
Done sleeping
2
3
Now let us say you are executing the code as two threads. Part one as a thread and part two as a thread. (Note- By default, the Python code is not provisioned with threading- we need to import the threading library to do so.)
First, the library is imported, and then ‘1’ is printed. Now the thread goes to sleep. This is where threading comes into action.
The core now switches to the other thread.
Now ‘3’ is printed. Since all the process is done in Thread 2, the core now switches back to Thread 1 (which is still in sleep).
Now after the sleep duration, ‘2’ is printed.
So the output will be
1
3
Done sleeping
2
As always, a concept is only clear when we explain it with a real-world example. I/O processes are the ones that benefit from threading.
Let us say you are watching Shawshank Redemption on Netflix. Now two things happen while you are watching Andy Dufresne suffering in jail- One- the application fetches data from the server, Two- The fetched data is shown to you like a movie on your screen.
Imagine what would be the situation without threading. You would have to wait for the video to get downloaded once in a while, watch the segment that was fetched, wait for the next segment to get downloaded, and so on.
Thanks to threading, we can divide the two processes into different threads. While one thread fetches data (that is, it is in hang/sleep mode), the other thread can show you the amazing performance of Morgan Freeman.
It is also much useful for you as a Data Scientist. For example, when you scrape the data from multiple web pages, you can simply deploy them in multiple threads and make it faster. Even when you push the data to a server, you can do so in multiple threads, so that when one thread is idle others can be triggered.
As said before, by default, the Python code is not provisioned with threading- we need to import the threading library to do so.
Take a look at the code.
import threading
import time
def sleepy_man(secs):
print('Starting to sleep inside')
time.sleep(secs)
print('Woke up inside')
x = threading.Thread(target = sleepy_man, args = (1,))
x.start()
print(threading.activeCount())
time.sleep(1.2)
print('Done')
First, let me explain the code step by step. Then we will analyze the output.
x = threading.Thread(target = sleepy_man, args = (10,))
x.start()
print(threading.activeCount())
Now let us look at the flow of control. Once you call the start() method, it triggers sleepy_man() and it runs in a separate thread. The main program will also run in parallel as another thread. The flow is shown in the image below.
Now let us increase the time in which the program sleeps inside the function.
import threading
import time
def sleepy_man(secs):
print('Starting to sleep inside')
time.sleep(secs)
print('Woke up inside')
x = threading.Thread(target = sleepy_man, args = (4,))
x.start()
print(threading.activeCount())
time.sleep(1.2)
print('Done')
Starting to sleep inside
2
Done
Woke up inside
The flow is given in the diagram below:
Now let’s spice things a bit. Let us run a for loop that triggers multiple threads.
import threading
import time
def sleepy_man(secs):
print('Starting to sleep inside - Iteration {}'.format(5-secs))
time.sleep(secs)
print('Woke up inside - Iteration {}'.format(5-secs))
for i in range(3):
x = threading.Thread(target = sleepy_man, args = (5-i,))
x.start()
print('Active threads- ', threading.activeCount())
At every iteration, we trigger a thread. Note that we pass the arguments 5, 4, 3 at 1st, 2nd, and 3rd iteration respectively. Thus the sleepy_man() sleeps 5 seconds, 4 seconds, and 3 seconds respectively.
Output :
Starting to sleep inside - Iteration 0
Starting to sleep inside - Iteration 1
Starting to sleep inside - Iteration 2
Active threads- 4
Woke up inside - Iteration 2
Woke up inside - Iteration 1
Woke up inside - Iteration 0
Thus we have seen how multiple threads can be defined and triggered, ensuring a better way of processing which is very essential for heavy I/O operations.
These threading functions play crucial roles in managing threads, synchronization, and concurrency in Python programs. Understanding their purposes and how to use them effectively is essential for developing efficient and responsive applications.
Function | Description |
---|---|
threading.Thread(target, args) | Creates a new thread object and specifies the target function to be executed in that thread. Additional arguments can be passed through the args parameter. |
threading.Thread.start() | Starts the execution of a thread by calling its target function. |
threading.Thread.join() | Waits for the thread to complete its execution before proceeding with the rest of the program. |
threading.active_count() | Returns the number of currently active threads. |
threading.current_thread() | Returns the currently executing thread object. |
threading.enumerate() | Returns a list of all currently active thread objects. |
threading.Lock() | Creates a lock object that can be used for synchronization to prevent multiple threads from accessing shared resources simultaneously. |
threading.Timer(interval, function) | Creates a timer object that waits for a specified interval and then executes the specified function. |
In conclusion, threading in Python unlocks the potential for concurrency and enhanced application performance. By harnessing the power of multiple threads, you can efficiently utilize system resources, achieve parallel execution, and easily handle complex tasks. Take your Python skills to the next level with the Blackbelt Program and become a master of threading and advanced Python concepts. Upgrade your abilities and unlock new possibilities in your programming journey.
A. Threading in Python allows concurrent execution of tasks. For example, using the threading module, you can create multiple threads to perform different operations simultaneously, such as downloading files while processing data in the background.
A. Threading and multithreading in Python both involve executing multiple threads concurrently. Threading generally refers to managing threads in a program, while multithreading focuses explicitly on using multiple threads to improve performance and handle parallel tasks. Python provides threading modules and libraries to facilitate multithreading and efficiently utilize system resources.
A. Threading in Python refers to executing multiple threads (lightweight units of execution) within a single process. It allows concurrent execution of multiple tasks, enabling efficient utilization of CPU resources and handling of I/O operations.
The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion.
Thanks a lot!!! You did an amazing job explaining the useful concepts
Very well explained.
Nicely explained. Thank you