To use LRU caching in Python, you only need to add two lines: an import and the @lru_cache decorator declaration. In this article, we show with examples how and why to use it.
Caching is one approach that, used correctly, significantly speeds up work and reduces the load on computational resources. The Python standard library's functools module provides the @lru_cache decorator, which makes it possible to cache the output of functions using the Least Recently Used (LRU) strategy. This is a simple yet powerful technique that lets you leverage caching capabilities in your code.
In this guide, we’ll cover:
which caching strategies are available and how to implement them with decorators;
what the LRU strategy is and how it works;
how to improve program performance using the @lru_cache decorator;
how to extend the @lru_cache decorator's functionality and make cache entries expire after a certain time interval.
Caching is an optimization technique in which a copy of data is kept in fast storage so that it can be accessed more quickly than at its original source.
Imagine we are creating a newsreader application that aggregates news from various sources. The user navigates through the list, the application downloads the articles and displays them on the screen.
What will the program do if the reader decides to compare a couple of articles and starts moving back and forth between them? Without caching, the application has to fetch the same content every time. That wastes resources on both the user's device and the article server, which has to handle additional load.
A better approach is to store the content locally after the article has been fetched. The next time the user opens the article, the app can show the saved copy instead of reloading the material from the source. This procedure is called caching.
In Python, caching can be implemented with a dictionary. Instead of going to the server every time, you can check whether the content is already in the cache and query the server only if it is not. You can use the article's URL as the key and its content as the value:
import requests

link = "https://www.analyticsvidhya.com/blog/2021/08/edge-contour-detection-an-application-of-computer-vision/"
cache = dict()

def get_article_from_server(url):
    print("Fetching the article from the server...")
    response = requests.get(url)
    return response.text

def get_article(url):
    print("Getting the article...")
    if url not in cache:
        cache[url] = get_article_from_server(url)
    return cache[url]

get_article(link)
get_article(link)
Getting the article...
Fetching the article from the server...
Getting the article...
Note: To run this example, you must have the requests library installed:
pip install requests
Although get_article() is called twice, the article is downloaded from the server only once. After the first access, the URL and content are stored in the cache dictionary, so the second call does not need to retrieve the article from the server again.
This simple caching implementation has an obvious problem: the dictionary will grow indefinitely – the more articles the user opens, the more memory is used.
To work around this problem, we need a strategy that lets the program decide which articles to delete. There are several strategies for removing items from the cache and preventing it from exceeding its maximum size. The five most popular are listed in the table below.
Strategy | Which entry is evicted | Entries most likely to be reused
--- | --- | ---
First-In / First-Out (FIFO) | The oldest | Newer entries
Last-In / First-Out (LIFO) | The newest | Older entries
Least Recently Used (LRU) | The least recently used | Recently used entries
Most Recently Used (MRU) | The most recently used | Entries accessed longest ago
Least Frequently Used (LFU) | The least frequently used | Frequently used entries
A cache implemented with the LRU strategy orders its items by how recently they were used. Each time we access an entry, the LRU algorithm moves it to the top of the cache. This way, the algorithm can quickly identify the entry that has gone unused the longest by looking at the end of the list.
The following figure shows the cache view after a user has requested an article from the web.
LRU Cache Population Process Step 1
The article is saved in the last cache slot before being sent to the user. The following figure shows what happens when the user requests the next article.
LRU Cache Population Process Step 2
The second article takes up the last slot, moving the first article down the list.
The LRU strategy works on the assumption that the more recently an object was used, the more likely it is to be needed again in the future. The algorithm keeps such an object in the cache for as long as possible.
A common way to implement an LRU cache in Python is to combine a doubly linked list with a hash table. The head of the doubly linked list points to the most recently used entry, and the tail to the least recently used one.
The figure below shows a possible structure for an LRU cache implementation.
Caching implementation diagram
Using the hash table, we provide access to each item in the cache by mapping each entry to its location in the doubly linked list. As a result, accessing the most recently used item and updating the cache are both constant-time operations (that is, with a time complexity of O(1)).
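To make the idea concrete, here is a minimal sketch of such a cache built on collections.OrderedDict, which itself combines a hash table with a doubly linked list; the class name and capacity below are illustrative choices, not part of the standard library:

from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache sketch: the most recently used entries sit at the front."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        # Mark the entry as most recently used by moving it to the front.
        self._data.move_to_end(key, last=False)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key, last=False)
        if len(self._data) > self.capacity:
            # Evict the least recently used entry from the tail.
            self._data.popitem(last=True)

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used entry
cache.put("c", 3)     # evicts "b", the least recently used entry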
Since version 3.2, Python has included the @lru_cache decorator for implementing the LRU strategy.
Behind the scenes, the @lru_cache decorator uses a dictionary: the result of a function call is cached under a key derived from the call's arguments. For the decorator to work, the arguments must therefore be hashable.
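For instance, a cached function can be called with a tuple but not with a list, because only the tuple is hashable (a small illustrative sketch; the function name here is arbitrary):

from functools import lru_cache

@lru_cache(maxsize=None)
def total(items):
    return sum(items)

print(total((1, 2, 3)))   # 6 – a tuple is hashable, so the result is cached
print(total([1, 2, 3]))   # TypeError: unhashable type: 'list'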
Imagine that we want to determine the number of ways to reach a particular rung of a staircase. How many ways are there, for example, to get to the fourth step if we can climb 1, 2, or 3 (but no more) steps at a time? The figure below shows the corresponding combinations.
A path is shown under each figure, indicating the number of steps covered in each jump. The number of ways to reach the fourth step equals the total number of ways to reach the third, second, and first steps.
It turns out that the solution to the problem can be decomposed into smaller subtasks. To define the different paths to the fourth step, we can add four ways to reach the third step, two ways to reach the second step, and the only way to reach the first. That is, a recursive approach can be used.
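In other words, for the fourth step the count works out as steps_to(4) = steps_to(3) + steps_to(2) + steps_to(1) = 4 + 2 + 1 = 7 (using the function name steps_to that we define below).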
Let's code the recursive solution exactly as we have just described it:
def steps_to(stair):
    if stair == 1:
        # The first step can be reached from the floor
        # in only one way
        return 1
    elif stair == 2:
        # To reach the 2nd step: one step at a time,
        # or both steps at once
        return 2
    elif stair == 3:
        # To reach the 3rd step:
        # 1. Jump straight to the third
        # 2. Two steps, then one
        # 3. One step, then two
        # 4. One step at a time
        return 4
    else:
        # Every higher step can be reached from the three
        # steps below it, so the total number of options
        # is the sum of those combinations
        return (
            steps_to(stair - 3)
            + steps_to(stair - 2)
            + steps_to(stair - 1)
        )
>>> print(steps_to(4))
OUTPUT
7
The code works for 4 rungs. Let's check how it handles a staircase of 30 steps.
>>> steps_to(30)
OUTPUT
53798080
That is more than 53 million combinations. However, finding the answer for step 30 this way already takes quite a long time.
Let’s measure how long it takes to execute the code.
For this, we can use Python's timeit module or the corresponding %%timeit magic command in a Jupyter notebook; a sketch of the module-based approach is shown below, followed by the notebook version.
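A minimal sketch using the timeit module (it assumes steps_to() is defined in the current module; the exact figure will vary from machine to machine):

import timeit

# Time a single evaluation of steps_to(30).
elapsed = timeit.timeit("steps_to(30)", globals=globals(), number=1)
print(f"steps_to(30) took {elapsed:.2f} seconds")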
%%timeit
steps_to(30)
The number of seconds depends on the specifications of the computer being used. On my system, the calculation took 3 seconds, which is quite slow for only thirty steps. This solution can be greatly improved with memoisation.
Our recursive implementation solves the problem by breaking it down into smaller steps that complement each other. The following figure shows the tree of steps_to() calls for seven rungs, with each node representing a specific call:
The steps_to() function call tree
You may notice that the algorithm has to call steps_to() with the same argument multiple times. For example, steps_to(5) is evaluated twice, steps_to(4) four times, steps_to(3) seven times, and so on. Calling the same function repeatedly triggers computations that are not necessary – the result is always the same.
To solve this problem, we can use memoisation: we store the result computed for a given input in memory and return it on the next request with the same input. This is a great opportunity to apply @lru_cache as a decorator!
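To see what memoisation does conceptually, here is a hand-rolled sketch using a plain dictionary (for illustration only – below we let @lru_cache handle this for us):

memo = {}

def steps_to_memo(stair):
    # Return the cached result if this argument has been seen before.
    if stair in memo:
        return memo[stair]
    if stair <= 2:
        result = stair
    elif stair == 3:
        result = 4
    else:
        result = (steps_to_memo(stair - 3)
                  + steps_to_memo(stair - 2)
                  + steps_to_memo(stair - 1))
    memo[stair] = result
    return result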
We import the decorator from the functools module and apply it to the main function.
from functools import lru_cache

@lru_cache()
def steps_to(stair):
    if stair == 1:
        return 1
    elif stair == 2:
        return 2
    elif stair == 3:
        return 4
    else:
        return (steps_to(stair - 3)
                + steps_to(stair - 2)
                + steps_to(stair - 1))
%%timeit
steps_to(30)
Going from seconds to tens of nanoseconds is a tremendous improvement, achieved because, behind the scenes, the @lru_cache decorator stores the result of steps_to() for each unique input value.
By attaching the @lru_cache decorator, we store each call and its response in memory for later access. But how many of these entries can we keep before memory runs out?
The @lru_cache decorator has a maxsize attribute that defines the maximum number of entries the cache holds before it starts evicting old ones. The default is 128. If we set maxsize to None, the cache will grow without evicting any entries, which can become a problem if we store too many different calls in memory.
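For example, passing maxsize=None disables eviction entirely (a quick illustrative snippet; the function here is hypothetical):

from functools import lru_cache

@lru_cache(maxsize=None)   # unbounded cache: entries are never evicted
def square(n):
    return n * n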
Let's apply @lru_cache with the maxsize attribute and add a cache_info() call:
@lru_cache(maxsize=16)
def steps_to(stair):
    if stair == 1:
        return 1
    elif stair == 2:
        return 2
    elif stair == 3:
        return 4
    else:
        return (steps_to(stair - 3)
                + steps_to(stair - 2)
                + steps_to(stair - 1))
>>> steps_to(30)
OUTPUT
53798080
>>> steps_to.cache_info()
OUTPUT
CacheInfo(hits=52, misses=30, maxsize=16, currsize=16)
We can use the information returned by cache_info() to understand how the cache behaves and tune it to find the right balance between speed and memory usage:
hits=52 – the number of calls that @lru_cache returned directly from memory because they were already in the cache;
misses=30 – the number of calls that were not taken from memory and had to be computed (in our problem, this is every new step);
maxsize=16 – the size of the cache, which we set by passing maxsize to the decorator;
currsize=16 – the current size of the cache; in this case, the cache is full.
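In addition to cache_info(), a function wrapped with @lru_cache also exposes cache_clear(), which empties the cache and resets the statistics; the timed decorator later in this article relies on it. A quick illustration:

steps_to.cache_clear()
print(steps_to.cache_info())
# CacheInfo(hits=0, misses=0, maxsize=16, currsize=0)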
Let's move from a toy example to a more realistic one. Imagine that we want to track the appearance of articles containing the word "python" in the title on the Real Python site – display the title, download the article, and show its length (number of characters).
Real Python exposes an Atom feed, so we can use the feedparser library to parse the feed and the requests library to fetch the article content, as we did before.
import feedparser  # pip install feedparser
import requests
import ssl
import time

if hasattr(ssl, "_create_unverified_context"):
    ssl._create_default_https_context = ssl._create_unverified_context

def get_article_from_server(url):
    print("Fetching the article from the server...")
    response = requests.get(url)
    return response.text

def monitor(url):
    maxlen = 45
    while True:
        try:
            print("\nChecking the feed...")
            feed = feedparser.parse(url)
            for entry in feed.entries[:5]:
                if "python" in entry.title.lower():
                    truncated_title = (
                        entry.title[:maxlen] + "..."
                        if len(entry.title) > maxlen
                        else entry.title
                    )
                    print(
                        "Match:",
                        truncated_title,
                        len(get_article_from_server(entry.link)),
                    )
            time.sleep(5)
        except KeyboardInterrupt:
            break
The script will run continuously until we stop it by pressing [Ctrl+C] in the terminal window (or by interrupting execution in a Jupyter notebook).
>>> monitor("https://realpython.com/atom.xml")
OUTPUT
Checking the feed...
Fetching the article from the server...
Match: The Real Python Podcast - Episode #35: Securi... 28704
Fetching the article from the server...
Match: PyPy: Faster Python With Minimal Effort 67387
Fetching the article from the server...
Match: Handling Missing Keys With the Python default... 33224
Fetching the article from the server...
Match: Use Sentiment Analysis With Python to Classif... 158401
Fetching the article from the server...
Match: The Real Python Podcast - Episode #34: The Py... 29576

Checking the feed...
Fetching the article from the server...
Match: The Real Python Podcast - Episode #35: Securi... 28704
Fetching the article from the server...
Match: PyPy: Faster Python With Minimal Effort 67389
Fetching the article from the server...
Match: Handling Missing Keys With the Python default... 33224
Fetching the article from the server...
Match: Use Sentiment Analysis With Python to Classif... 158400
Fetching the article from the server...
Match: The Real Python Podcast - Episode #34: The Py... 29576
The code downloads and parses the XML file from Real Python. The loop then iterates over the first five entries in the feed. If the word "python" is part of the title, the code prints the title and the length of the article. The code then "sleeps" for 5 seconds, after which monitoring starts again.
Each time the script downloads an article, the message "Fetching the article from the server..." is displayed in the console. If we let the script run long enough, we will see this message reappear even when loading the same link.
We could simply apply the @lru_cache decorator here, but the content of an article may change over time. The decorator would keep returning the data it cached the first time the article was opened; if the article is updated, the monitoring script would never know about it. To solve this problem, we need to set an expiration time for the entries in the cache.
We can implement this idea in a new decorator that extends @lru_cache. The cache should return a cached result only if the entry's caching time has not yet expired – otherwise, the result should be fetched from the server. Here's a possible implementation of the new decorator:
from functools import lru_cache, wraps
from datetime import datetime, timedelta

def timed_lru_cache(seconds: int, maxsize: int = 128):
    def wrapper_cache(func):
        func = lru_cache(maxsize=maxsize)(func)
        # Instrument the decorated function with two attributes:
        # the lifetime of the cache and its expiration date
        func.lifetime = timedelta(seconds=seconds)
        func.expiration = datetime.utcnow() + func.lifetime

        @wraps(func)
        def wrapped_func(*args, **kwargs):
            if datetime.utcnow() >= func.expiration:
                # The cache has expired: clear it and reset the timer
                func.cache_clear()
                func.expiration = datetime.utcnow() + func.lifetime
            return func(*args, **kwargs)

        return wrapped_func

    return wrapper_cache
The @timed_lru_cache decorator adds control over the lifetime of cache entries (in seconds) and the maximum cache size.
The code wraps the function with the @lru_cache decorator, which lets us reuse the familiar caching functionality.
Before accessing a cache entry, the decorator checks to see if the expiration date has passed. If so, the decorator clears the cache and recalculates the lifetime and expiration date.
Article caching with the new decorator
We can now use the new @timed_lru_cache decorator with the monitor() function to prevent the article content from being downloaded from the server on every request. Collecting the code in one place, we get the following result:
import feedparser
import requests
import ssl
import time
from functools import lru_cache, wraps
from datetime import datetime, timedelta

if hasattr(ssl, "_create_unverified_context"):
    ssl._create_default_https_context = ssl._create_unverified_context

def timed_lru_cache(seconds: int, maxsize: int = 128):
    def wrapper_cache(func):
        func = lru_cache(maxsize=maxsize)(func)
        func.lifetime = timedelta(seconds=seconds)
        func.expiration = datetime.utcnow() + func.lifetime

        @wraps(func)
        def wrapped_func(*args, **kwargs):
            if datetime.utcnow() >= func.expiration:
                func.cache_clear()
                func.expiration = datetime.utcnow() + func.lifetime
            return func(*args, **kwargs)

        return wrapped_func

    return wrapper_cache

@timed_lru_cache(60)
def get_article_from_server(url):
    print("Fetching the article from the server...")
    response = requests.get(url)
    return response.text

def monitor(url):
    maxlen = 45
    while True:
        try:
            print("\nChecking the feed...")
            feed = feedparser.parse(url)
            for entry in feed.entries[:5]:
                if "python" in entry.title.lower():
                    truncated_title = (
                        entry.title[:maxlen] + "..."
                        if len(entry.title) > maxlen
                        else entry.title
                    )
                    print(
                        "Match:",
                        truncated_title,
                        len(get_article_from_server(entry.link)),
                    )
            time.sleep(5)
        except KeyboardInterrupt:
            break
>>> monitor("https://realpython.com/atom.xml")
OUTPUT
Fetching the article from the server...
Match: The Real Python Podcast - Episode #35: Securi... 28704
Fetching the article from the server...
Match: PyPy: Faster Python With Minimal Effort 67387
Fetching the article from the server...
Match: Handling Missing Keys With the Python default... 33224
Fetching the article from the server...
Match: Use Sentiment Analysis With Python to Classif... 158400
Fetching the article from the server...
Match: The Real Python Podcast - Episode #34: The Py... 29576

Checking the feed...
Match: The Real Python Podcast - Episode #35: Securi... 28704
Match: PyPy: Faster Python With Minimal Effort 67387
Match: Handling Missing Keys With the Python default... 33224
Match: Use Sentiment Analysis With Python to Classif... 158400
Match: The Real Python Podcast - Episode #34: The Py... 29576
Notice how the code prints "Fetching the article from the server..." the first time each article is accessed. After that, until the 60-second lifetime expires, the script serves the articles from the cache before going back to the server.
Caching is an important optimization technique that improves the performance of any software system. Understanding how caching works is a fundamental step towards effectively incorporating it into your code.
In this tutorial, we briefly covered:
which caching strategies exist;
how LRU caching works in Python;
how to use the @lru_cache decorator;
how a recursive approach combined with caching helps to solve a problem quickly.