functools module is an interesting assortment of utilities that alter functions or methods in some way. We’ll divide this lesson into four parts:
- Caching
- Partial and Reduce
- Wrapping
- Dispatches and Total Ordering
1. Caching
functools offers three main cache decorators: cache, cached_property, lru_cache.
1.1. lru_cache
In Python, lru in lru_cache stands for “least recently used.” This means that once the cache reaches the maximum size specified by maxsize, adding a new item will automatically remove the oldest, least-used item from the cache to make space. For example:
lru_cache(maxsize=128, typed=False)if the cache limit is 128 items, the 129th addition will cause the most outdated entry to be discarded, ensuring the cache stays within the defined limit. Furthermore, lru_cache adds some support for strict type checking for cached items. Most important the only things that can be cashed are things that can be hashable (you can check if something is hashable by running hash function on it).
Let’s define a function with this decorator:
from functools import lru_cache
@lru_cache(maxsize=3)
def add_5(num):
print(f"Adding 5 to {num}")
return num + 5Then:
>>> add_5(15)
Adding 5 to 15
20
>>> add_5(15)
20
>>> add_5(12)
Adding 5 to 12
17
>>> add_5(15)
20
>>> add_5(27)
Adding 5 to 27
32
>>> add_5(30)
Adding 5 to 30
35
>>> add_5(15)
Adding 5 to 15
20
>>> add_5(12)
Adding 5 to 12
17- Line 5 & 12: if we call again
add_5(15)after computed it the first time at line 2, it doesn’t print anything out because it doesn’t need to recalculate it. - Line 27: the result of
add_5(12), computed the first time at line 8, was pushed out from the cache because it’s the oldest cached entry. So, it needs to recompute again and that’s the reason why we see the print statement.
lru_cache decorator adds some information to the function itself, such as .cache_clear(), .cache_info(), .cache_parameters():
>>> add_5.cache_info()
CacheInfo(hits=2, misses=5, maxsize=3, currsize=3)hits=2: how many times it returned a cache valuemisses=5: how many times it had to calculate a valuemaxsize=3: the defined max sizecurrsize=3: the current size (in our example we are at the limit)
>>> add_5.cache_parameters()
{'maxsize': 3, 'typed': False}It returns the explicit and implicit arguments that were passed into the lru_cache.
If you want to invalidate the entire cache you can clear it with cache_clear() method:
>>> add_5.cache_clear()
>>> add_5.cache_info()
CacheInfo(hits=0, misses=0, maxsize=3, currsize=0)Note that if you set maxsize=None, the cache can grow without bound potentially until as much as your system has memory for. The author of Python decides then to create an alias for @lru_cache(maxsize=None), that is @cache decorator.
1.2. cache
Since @cache is an alias for @lru_cache(maxsize=None), you don’t need to specify the maxsize parameter with this decorator. Let’s give an example:
def fib(n):
if n <= 1:
return n
return fib(n-1) + fib(n-2)This function to compute the fibonacci sequence is pretty inefficient and relies heavily on recursion. We can use cache in this recursion though:
from functools import cache
@cache
def cached_fib(n):
if n <= 1:
return n
return cached_fib(n-1) + cached_fib(n-2)Let’s create a benchmark to test them:
import time
def bench_fib():
goal = 38
start = time.time()
fib(goal)
print(f"Time taken without caching: {time.time() - start:.5f} seconds")
start = time.time()
cached_fib(goal)
print(f"Time taken with caching: {time.time() - start:.5f} seconds")Then:
>>> bench_fib()
Time taken without caching: 6.17687 seconds
Time taken with caching: 0.00002 secondsThis result is self-explanatory.
1.3. cached_property
Let’s look now at cached_property. Let’s write a new class:
from functools import cached_property
import json
import urllib.request
class METAR:
def __init__(self, icao):
self.icao = icao.upper()
@property
def data(self):
print(f"Getting METAR for {self.icao}")
res = urllib.request.urlopen(
f'https://api.weather.gov/stations/{self.icao.upper()}/observations/latest'
).read()
return json.loads(res)
@property
def raw(self):
return self.data['properties']['rawMessage']
@property
def temp(self):
return self.data['properties']['temperature']['value']
@property
def dewpoint(self):
return self.data['properties']['dewpoint']['value']
@property
def wind(self):
return "{direction}{speed}KT".format(
direction=self.data['properties']['windDirection']['value'],
speed=self.data['properties']['windSpeed']['value']
)This class is going to take an icao value, which is an international code for airports. We use it to take information about weather information at the airports. It contains only properties and we haven’t added any caching into this. Let’s take information about a specific airport:
>>> krdu = METAR("krdu")
>>> krdu.data
Getting METAR for RKDU
{'@context': ['https://geojson.org/geojson-ld/geojson-context.jsonld',
{'@version': '1.1',
'wx': 'https://api.weather.gov/ontology#',
's': 'https://schema.org/',
'geo': 'http://www.opengis.net/ont/geosparql#',
'unit': 'http://codes.wmo.int/common/unit/',
'@vocab': 'https://api.weather.gov/ontology#',
'geometry': {'@id': 's:GeoCoordinates', '@type': 'geo:wktLiteral'},
'city': 's:addressLocality',
'state': 's:addressRegion',
'distance': {'@id': 's:Distance', '@type': 's:QuantitativeValue'},
...
}Say we want to call raw property:
>>> krdu.raw
Getting METAR for KRDU
'KRDU 031251Z 03004KT 10SM FEW060 SCT250 11/07 A3038 RMK AO2 SLP289 T01060072 $'Here, what happened is that we see it’s getting again the METAR data for KRDU, it’s having to download that data again and process it trough the json loader before it gives back to us, which are then accessing the items from the dictionary. This is especially bad for wind property because we need to put two different fields together because it’s using self.data twice:
>>> krdu.wind
Getting METAR for KRDU
Getting METAR for KRDU
'307.56KT'This is not ideal not only because it takes time, but if this were some type of production application, we’d be hitting the API a lot and this particular API does have rate limiting. The way we get around this is very easy and this is done by changing data decorator from @property to @cache_property:
from functools import cached_property
import json
import urllib.request
class METAR:
def __init__(self, icao):
self.icao = icao.upper()
@cached_property
def data(self):
print(f"Getting METAR for {self.icao}")
res = urllib.request.urlopen(
f'https://api.weather.gov/stations/{self.icao.upper()}/observations/latest'
).read()
return json.loads(res)
@property
def raw(self):
return self.data['properties']['rawMessage']
@property
def temp(self):
return self.data['properties']['temperature']['value']
@property
def dewpoint(self):
return self.data['properties']['dewpoint']['value']
@property
def wind(self):
return "{direction}{speed}KT".format(
direction=self.data['properties']['windDirection']['value'],
speed=self.data['properties']['windSpeed']['value']
)Then let’s use it:
>>> krdu = METAR("krdu")
>>> krdu.raw
Getting METAR for KRDU
'KRDU 031251Z 03004KT 10SM FEW060 SCT250 11/07 A3038 RMK AO2 SLP289 T01060072 $'
>>> krdu.wind
'307.56KT'We can see that when you call wind property, it’s not getting the METAR again.
2. Partial and Reduce
2.1. partial and partialmethod
Where partial is used it to build in an argument into an existing function in a way that you don’t have to pass it again. Let’s write this function:
from urllib.request import urlopen
def get_site_status(url):
try:
return urlopen(url).getcode()
except:
returnThen:
>>> get_site_status("https://google.com")
200
>>> get_site_status("https://facebook.com")
200This is a normal function. Where partial lets us do is to make things a bit more simple moving forward. Say we wanted a new function dedicated solely to getting the status of Google website. All we have to do is call partial, pass get_site_status function and the argument that we’re going to bake into it:
from functools import partial
get_google_status = partial(get_site_status, "https://google.com")
get_facebook_status = partial(get_site_status, "https://facebook.com")Then:
>>> get_google_status()
200
>>> get_facebook_status()
200Similar to partial is partialmethod. Let’s create a new class:
class VMManager:
def toggle_power(self, to_state):
if to_state == "on":
print("Powering on VM")
elif to_state == "off":
print("Powering off VM")Say we want to use the partial function on .toggle_power() method: it really wouldn’t work out too well because the first argument that goes is self, but the argument we want to bake into partial would be to_state. So, we need to use partialmethod:
from functools import partialmethod
class VMManager:
def toggle_power(self, to_state):
if to_state == "on":
print("Powering on VM")
elif to_state == "off":
print("Powering off VM")
power_on = partialmethod(toggle_power, "on")
power_off = partialmethod(toggle_power, "off")Then, let’s use it:
>>> vm = VMManager()
>>> vm.power_on()
Powering on VM
>>> vm.power_off()
Powering off VMOf course, at any time we could always call vm.toggle_power("on") and/or vm.toggle_power("off"), but partialmethod gives us a way to make these help methods by baking an argument into another method.
2.2. reduce
reduce function is a little bit weird and the uses of this are definitely much less common than partial in Python. Let’s create a simple function:
def multiply(num1, num2):
print(f"Multiplying {num1=} by {num2=}")
return num1*num2Then, let’s first look at the docstring of reduce function, especially the signature:
>>> from functools import reduce
>>> reduce?
Docstring:
reduce(function, iterable[, initial]) -> value
Apply a function of two arguments cumulatively to the items of a sequence
or iterable, from left to right, so as to reduce the iterable to a single
value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5). If initial is present, it is placed before the items
of the iterable in the calculation, and serves as a default when the
iterable is empty.
Type: builtin_function_or_methodLet’s try to use this function:
>>> reduce(multiply, range(1,5)) # [1, 2, 3, 4]
Multiplying num1=1 by num2=2
Multiplying num1=2 by num2=3
Multiplying num1=6 by num2=4
24What’s happened?
- first we get multiplying
num1bynum2 - then it’s taking the output of the previous run and it’s feeding it into the next run as the first value.
There are some uses for this. One use, if it didn’t exists in the Python Standard library, would be to construct a factorial function (this function actually exists in Python standard library: from math import factorial):
>>> factorial = partial(reduce, multiply)
>>> factorial(range(1, 20))
Multiplying num1=1 by num2=2
Multiplying num1=2 by num2=3
Multiplying num1=6 by num2=4
Multiplying num1=24 by num2=5
Multiplying num1=120 by num2=6
Multiplying num1=720 by num2=7
Multiplying num1=5040 by num2=8
Multiplying num1=40320 by num2=9
Multiplying num1=362880 by num2=10
Multiplying num1=3628800 by num2=11
Multiplying num1=39916800 by num2=12
Multiplying num1=479001600 by num2=13
Multiplying num1=6227020800 by num2=14
Multiplying num1=87178291200 by num2=15
Multiplying num1=1307674368000 by num2=16
Multiplying num1=20922789888000 by num2=17
Multiplying num1=355687428096000 by num2=18
Multiplying num1=6402373705728000 by num2=19
1216451004088320003. Wrappers
These are almost exclusively going to be used when you’re working with decorators. Let’s consider the print_time decorator and the perfect_function I want to decorate:
def print_time(f):
def wrapper(*args, **kwargs):
start_time = time.time()
result = f(*args, **kwargs)
print(f"Function {f.__name__} took {time.time() - start_time:.2f} seconds to execute")
return result
return wrapper
def perfect_function():
"""This is a perfect docstring."""
time.sleep(1)
print("Finished being perfect")We can take a look at perfect_function() and, in particular, some of its attribute:
>>> perfect_function.__doc__
'This is a perfect docstring.'
>>> perfect_function.__name__
'perfect_function'If we were apply the print_time() decorator manually overriding perfect_function(), we now have a new decorated perfect_function():
>>> perfect_function = print_time(perfect_function)
>>> perfect_function
<function __main__.print_time.<locals>.wrapper(*args, **kwargs)>However, this one isn’t as perfect: if we look at the docstring it’s missing and if wr look at the name it’s now changed because this decorator print_time() is actually replacing perfect_function() with the wrapper() function:
>>> perfect_function.__doc__
>>> perfect_function.__name__
'wrapper'So, let’s store the decorated perfect_function() as a new function called decorated() and, as before, this new function has 'wrapper' name:
>>> decorated = print_time(perfect_function)
>>> decorated.__name__
'wrapper'What we can do here is we can use update_wrapper() function to updated decorated() function with perfect_function() (Note, you have to redefine perfect_function() as the original function because currently in our code perfect_function = print_time(perfect_function), and I don’t want that; I want a fresh perfect_function() without decorator):
>>> from functools import update_wrapper
>>> update_wrapper(decorated, perfect_function)
<function __main__.perfect_function()>Then, let’s check:
>>> perfect_function.__doc__
'This is a perfect docstring.'
>>> decorated.__name__
'perfect_function'So, what update_wrapper function does is it copies important attributes from the function that you’re wrapping over to the function that is wrapping it. Let’s see the signature of this function:
>>> update_wrapper?
Signature:
update_wrapper(
wrapper,
wrapped,
assigned=('__module__', '__name__', '__qualname__', '__doc__', '__annotations__'),
updated=('__dict__',),
)We can see that in the assigned argument there are all the default attributes that are being copied over from the wrapped function to the wrapper function. There also the assigned field, which will copy over the contents of the __dict__ from the original function. This isn’t used nearly as much, but it does have some uses and we can demonstrate it. Let’s look at this dunder __dic__ of the perfect_function(), which is empty:
>>> perfect_function.__dict__
{}An interesting thing about functions in Python is that they are still objects, so you can assign random attributes to them:
>>> perfect_function.something = "test"So, now this function has this new attribute and this will now show up in its __dict__:
>>> perfect_function.something
'test'
>>> perfect_function.__dict__
{'something': 'test'}The function decorated() currently doesn’t have that:
>>> decorated.something
AttributeError: 'function' object has no attribute 'something'but we can call update_wrapper function again:
>>> update_wrapper(decorated, perfect_function)
<function __main__.perfect_function()>
>>> decorated.something
'test'Here, we can see 'test' because it copied over the entire contents of perfect_function() dictionary. So, how to use update_wrapper function in our code?
def print_time(f):
def wrapper(*args, **kwargs):
start_time = time.time()
result = f(*args, **kwargs)
print(f"Function {f.__name__} took {time.time() - start_time:.2f} seconds to execute")
return result
update_wrapper(wrapper, f)
return wrapper
def perfect_function():
"""This is a perfect docstring."""
time.sleep(1)
print("Finished being perfect")And this is perfectly valid. Let’s check:
>>> decorated = print_time(perfect_function)
>>> decorated.__name__
'perfect_function'However, there is a cleaner way to do the same, that is by using wraps function from functools module:
from functools import wraps
def print_time(f):
@wraps(f)
def wrapper(*args, **kwargs):
start_time = time.time()
result = f(*args, **kwargs)
print(f"Function {f.__name__} took {time.time() - start_time:.2f} seconds to execute")
return result
return wrapper
@print_time
def perfect_function():
"""This is a perfect docstring."""
time.sleep(1)
print("Finished being perfect")Then:
>>> perfect_function.__doc__
'This is a perfect docstring.'
>>> perfect_function.__name__
'perfect_function'As far as we can tell, nothing has really changed here, but let’s run this function:
>>> perfect_function()
Finished being perfect
Function perfect_function took 1.00 seconds to executeHere, we see different behaviour to perfect_function(). That’s the beauty of wraps: it keeps all the metadata that someone expect from a decorated function in place instead of overwriting it when you’re replacing it with a wrapper function.
4. Dispatches and Ordering
4.1. singledispatch and singledispatchmethod
Functions singledispatch and total_ordering from functools module are some interesting helpers that you may not come across often.
What singledispatch allows us to do is to handle different types of arguments differently. If you know some other languages that have operator overloading, this is somewhat equivalent to that. However, it only considers the type of the first argument. Let’s give an example and we’ll use singledispatch as a decorator and it’s gonna decorate a single function:
from functools import singledispatch
@singledispatch
def handle_error(error):
raise NotImplementedError("Can't handle this error type")This function is going to take in some type of error and we’ll use it to provide a default behaviour. So far, nothing interesting. However, that decorating handle_function() gives us a new decorator and that decorator provides another decorator called register and in here we can register specific behaviour given a specific type (for instance in the following example we want to handle a TypeError):
@handle_error.register(TypeError)
def _(error):
print("Handling TypeError")
print(error)A common convention when dealing with singledispatch is not to actually name the additional register functions and give them a single underscore _ because the name isn’t really going to matter in this regard.
With this code, handle_error() function implement this behaviour if it receives a TypeError. Of course we can extend with any other type of errors:
@handle_error.register(TypeError)
def _(error):
print("Handling TypeError")
print(error)
@handle_error.register(ValueError)
def _(error):
print("Handling ValueError")
print(error)
@handle_error.register(ZeroDivisionError)
def _(error):
print("Handling ZeroDivisionError")
print(error)Let’s see their usage:
try:
1 + "1"
except Exception as e:
handle_error(e)output:
Handling TypeError
unsupported operand type(s) for +: 'int' and 'str'
Or:
try:
int("a")
except Exception as e:
handle_error(e)output:
Handling ValueError
invalid literal for int() with base 10: 'a'
Like we saw before with caching, we also have a method version of this: singledispatchmethod because remember with singledispatch it only checks the type of the first argument. So, if I were to create a new class:
class NyNum:
def __init__(self, num):
self.num = num
def add_it(self, another):
passWhat we want to do with this class it to be able to handle three different things being added to num: an integer, a string, and a list. Each of these needs to behave differently. Like before, let’s first implement a default behaviour and then we’ll implement the specific ones:
from functools import singledispatchmethod
class MyNum:
def __init__(self, num):
self.num = num
@singledispatchmethod
def add_it(self, another):
raise NotImplemented("Can't add these two things!")
@add_it.register(int)
def _(self, another):
self.num += another
@add_it.register(str)
def _(self, another):
self.num += int(another)
@add_it.register(list)
def _(self, another):
for item in another:
self.add_it(item)Then:
>>> the_num = MyNum(5)
>>> the_num.num
5
>>> the_num.add_it(13)
>>> the_num.num
18
>>> the_num.add_it("7")
>>> the_num.num
25
>>> the_num.add_it([1, "2", 3])
>>> the_num.num
31Of course, this could have been done with if-else statements handling each of the different cases here.
4.2. total_ordering
What this function total_ordering does is it helps to fill in missing comparison methods for your class, meaning you really only need to define __eq__ and one other method. In order to demonstrate that, we’ll define a class called BadInt and this is an integer class whose comparison is based solely on the length of the integer, not on the value of the integer:
class BadInt:
def __init__(self, value):
self.value = value
def __eq__(self, other):
if isinstance(other, int | BadInt):
return len(str(self.value)) == len(str(other))
return NotImplemented
def __lt__(self, other):
if isinstance(other, int | BadInt):
return len(str(self.value)) < len(str(other))
return NotImplementedLet’s make an example:
>>> five = BadInt(5)
>>> five == 7
True
>>> five < 15
TrueThey give True because it’s checking just by size. If we want to know if five is greater than 15:
>>> five > 15
TypeError: '>' not supported between instances of 'BadInt' and 'int'There’s an easy fix to this because, since we’ve already defined __eq__ and __lt__, what we have to do is to decorate BadInt with total_ordering:
from functools import total_ordering
@total_ordering
class BadInt:
def __init__(self, value):
self.value = value
def __eq__(self, other):
if isinstance(other, int | BadInt):
return len(str(self.value)) == len(str(other))
return NotImplemented
def __lt__(self, other):
if isinstance(other, int | BadInt):
return len(str(self.value)) < len(str(other))
return NotImplementedThen we will not get the previous error anymore:
>>> five = BadInt(5)
>>> 5 > 15
FalseNow, it gives False because, since we’ve implemented __eq__ and __lt__ dunder methods, Python uses them as the basis for the rest of the equalities (indeed now also >= and <= will work).