The philosophical idea behind Abstraction is that it’s a way to summarize complexity. A sailboat and an airplane have a common, abstract relationship of being vehicles, but they differ a lot as vehicles. In Python, we have two approaches to defining similar things:

  • Duck Typing (already discussed here): briefly, “If it looks like a duck and quacks like a duck, it must be a duck”.
  • Inheritance: when two class definitions have common aspects, a subclass can share common features of a superclass. The implementation details of the two classes may vary, but the classes should be interchangeable when we use the common features defined by the superclass.

We can take inheritance one step further: we can have superclass definitions that are abstract, meaning that they’re not directly usable, but they can be used through inheritance to create concrete classes. Let’s make a visual example of inheritance:

  • our base class BaseClass has a special class, abc.ABC, as its parent class. This provides some special metaclass features that help make sure the concrete classes have replaced the abstractions;
  • the abstract BaseClass has an abstract method, a_method(), which doesn’t have an implementation, so a subclass must provide one;
  • the two concrete classes, ConcreteClass_1 and ConcreteClass_2, extend the abstract BaseClass and provide a concrete implementation of the abstract a_method() method.
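
The hierarchy described above could be sketched in code roughly as follows. This is only an illustration: the return type and the returned strings are assumptions made up for the example, since the text does not say what a_method() does.

```python
from abc import ABC, abstractmethod


class BaseClass(ABC):
    """Abstract superclass: declares a_method() but gives no implementation."""

    @abstractmethod
    def a_method(self) -> str:
        ...


class ConcreteClass_1(BaseClass):
    def a_method(self) -> str:
        # Illustrative return value only.
        return "implementation 1"


class ConcreteClass_2(BaseClass):
    def a_method(self) -> str:
        # Illustrative return value only.
        return "implementation 2"


# Both subclasses are interchangeable wherever a BaseClass is expected.
results = [obj.a_method() for obj in (ConcreteClass_1(), ConcreteClass_2())]
print(results)  # ['implementation 1', 'implementation 2']
```

Calling BaseClass() directly raises a TypeError, because a_method() is still abstract there.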

1. Creating an abstract base class

Imagine we are creating a Media Player with third-party plugins. We’ll create an Abstract Base Class (ABC) to document what API the third-party plugins should provide. Documentation is one of the stronger use cases for ABCs.

The general design is to define the media player as an abstraction (possibly with features that are common to all the concrete classes we’ll implement); each unique kind of media format then provides a concrete implementation of the abstraction. The abc module allows us to do that:

from abc import ABC, abstractmethod
 
class MediaLoader(ABC):
    @abstractmethod
    def play(self) -> None:
        pass
 
    @property
    @abstractmethod
    def ext(self) -> str:
        pass

Using @abstractmethod, we mark a method or a property as abstract, so any subclass must implement that method or property in order to be a concrete implementation. The MediaLoader class now has a special attribute, __abstractmethods__, that lists all the names that need to be filled in:

>>> MediaLoader.__abstractmethods__
frozenset({'ext', 'play'})

Let’s see what happens if we try to instantiate a subclass of the MediaLoader abstract class that doesn’t implement the methods or properties marked with @abstractmethod:

class Wav(MediaLoader):
    pass
 
x = Wav()

As expected, the output will be:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[51], line 4
      1 class Wav(MediaLoader):
      2 	pass
----> 4 x = Wav()

TypeError: Can't instantiate abstract class Wav with abstract methods ext, play

Now, a proper concrete implementation of the superclass:

class Ogg(MediaLoader):
    ext = '.ogg'
    def play(self):
        pass
 
o = Ogg()

One important use case for ABCs is the collections module. This module defines the built-in generic collections using a sophisticated set of base classes and mixins.

1.1. The ABCs of collections

The collections module makes comprehensive use of abstract base classes in the Python standard library. Indeed, the collections we commonly use are extensions of the Collection abstract class. Furthermore, Collection is an extension of an even more fundamental abstraction, Container.

Since the foundation is the Container class, let’s inspect which methods this class requires:

>>> from collections.abc import Container
 
>>> Container.__abstractmethods__
frozenset({'__contains__'})

To see the signature that this method has you can use the help() function:

>>> help(Container.__contains__)
Help on function __contains__ in module collections.abc:
 
__contains__(self, x)

We can see that this method takes a single argument x and, even though there is no further explanation, we can guess that this argument is the value the user wants to look for in the container. This special method implements the Python in operator (in is syntactic sugar that delegates to the __contains__() method) and it’s implemented by set, list, str, tuple, and dict. However, we can also define a new container whose __contains__() checks whether a given value is an odd integer:

class OddIntegers:
    def __contains__(self, value: int) -> bool:
        return value % 2 != 0
    
odd = OddIntegers()

The interesting part is that this class behaves as a Container object, even though we didn’t extend from it:

>>> isinstance(odd, Container)
True
 
>>> issubclass(OddIntegers, Container)
True

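That’s why duck typing is way more awesome than classical polymorphism (“if odd behaves like a Container, then it’s a Container”). We can create an is-a relationship without writing the code to set up inheritance. Under the hood, this works because Container defines a __subclasshook__() that looks for a __contains__() method.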

One thing to notice about the Container ABC is that any class that extends it gets to use the in keyword for free:

Tip

Any class that has a __contains__() method is a Container and can therefore be queried by the in keyword.

A quote from the book Fluent Python by Luciano Ramalho (p. 402): “[…] we saw that you don’t need to inherit from any special class to create a fully functional sequence type in Python; you just need to implement the methods that fulfill the sequence protocol.”

>>> odd = OddIntegers()
>>> 1 in odd
True
>>> 2 in odd
False
>>> 3 in odd
True

The real value here is the possibility of creating new kinds of collections that are completely compatible with Python’s built-in generic collections. For example, we could create a dictionary that uses a binary tree structure to retain keys instead of a hashed lookup (more details about how Python dictionaries are implemented here) by starting from the MutableMapping abstract base class and properly defining methods like __getitem__(), __setitem__(), and __delitem__().

Python’s duck typing works (in part) via the isinstance() and issubclass() built-in functions. They’re used to determine class relationships and they rely on two special methods that a class’s metaclass can provide: __instancecheck__() and __subclasscheck__(). For abstract base classes, abc.ABCMeta implements both.
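
As a rough sketch of how an ABC can take part in these checks without inheritance: classes built on abc.ABC may define a __subclasshook__() class method, which issubclass() and isinstance() consult. The names Quacker, Duck, Stone, and quack() below are hypothetical and exist only for this illustration.

```python
from abc import ABC, abstractmethod


class Quacker(ABC):
    """Hypothetical ABC: anything with a quack() method counts as a Quacker."""

    @abstractmethod
    def quack(self) -> str:
        ...

    @classmethod
    def __subclasshook__(cls, subclass: type):
        # Only answer for Quacker itself, not for classes derived from it.
        if cls is Quacker:
            # Look for quack() anywhere in the candidate's MRO.
            if any("quack" in vars(base) for base in subclass.__mro__):
                return True
        # Fall back to the normal inheritance/registration check.
        return NotImplemented


class Duck:
    # Note: no inheritance from Quacker.
    def quack(self) -> str:
        return "Quack!"


class Stone:
    pass


print(issubclass(Duck, Quacker))    # True
print(isinstance(Duck(), Quacker))  # True
print(issubclass(Stone, Quacker))   # False
```

Returning NotImplemented rather than False lets the usual mechanisms (real subclassing, or register()) still apply.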

1.2. Abstract base classes and type hints

todo

1.3. The collections.abc module

Abstract base classes are widely used in the collections.abc module. This module provides the abstract base class definitions for Python’s built-in collections, and that’s how list, set, and dict (etc.) can be built from individual component definitions. Indeed, this module doesn’t contain definitions for list, dict, or set, but it contains definitions like MutableSequence, MutableMapping, and MutableSet, which are the abstract base classes for which list, dict, and set are the concrete implementations.

Let’s see graphically how the Python dict class is implemented:

  • the dict class is a concrete implementation of the MutableMapping class;
  • the MutableMapping class depends on the Mapping class, an immutable, frozen dictionary (indeed, it doesn’t have __setitem__() and __delitem__() methods);
  • the Mapping class depends, via Collection, on three other abstract base classes: Container, Iterable, and Sized.

If we want to create a lookup-only dictionary - a concrete Mapping implementation - we’ll need to implement at least the following methods:

  • The Sized abstraction requires an implementation for the __len__() method; this lets an instance of our class respond to the len() function.
  • The Iterable abstraction requires an implementation for the __iter__() method; this lets an object work with the for statement and the iter() function.
  • The Container abstraction requires an implementation for the __contains__() method; this permits the in and not in operators to work.
  • The Collection abstraction combines Sized, Iterable, and Container without introducing additional abstract methods.
  • The Mapping abstraction requires __getitem__(), __iter__(), and __len__(); in return it supplies mixin methods such as __contains__(), keys(), items(), values(), and get().

By building this new dictionary-like immutable class from these abstractions, we can be sure that our class will collaborate seamlessly with other Python generic classes.
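
Before the full example, here is a minimal sketch of what the list above means in practice: implementing only the three abstract methods of Mapping is enough to get the mixin methods for free. The Weekdays class and its data are hypothetical, chosen only to keep the example short.

```python
from collections.abc import Iterator, Mapping


class Weekdays(Mapping):
    """Hypothetical read-only mapping from day number (1-7) to day name."""

    _names = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

    def __getitem__(self, key: int) -> str:
        if isinstance(key, int) and 1 <= key <= len(self._names):
            return self._names[key - 1]
        raise KeyError(key)

    def __iter__(self) -> Iterator[int]:
        return iter(range(1, len(self._names) + 1))

    def __len__(self) -> int:
        return len(self._names)


days = Weekdays()
print(days[1])                 # Mon
print(3 in days)               # True: __contains__() is a mixin built on __getitem__()
print(days.get(9, "?"))        # ?
print(list(days.items())[:2])  # [(1, 'Mon'), (2, 'Tue')]
```

Assignment such as days[1] = "Lun" fails with a TypeError, because Mapping defines no __setitem__().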

Example: Lookup class

Let’s define an immutable Mapping object implementation by extending the abstract classes. The goal is:

  1. to be able to create a dictionary-like mapping with keys and values;
  2. then extract a value from its corresponding key.

The goal is to create a Lookup class with the following type hint:

abc.Mapping[Comparable, Any]

So, we want to create a dictionary-like mapping from some key (with the type Comparable, because we want to be able to compare the keys and sort them into order; searching through an ordered list is often more efficient) to an object of any possible type (Any). Note that the dict built-in class has the following type hint: MutableMapping[_KT, _VT].

A Python dictionary can be built from two different kinds of data structures and we want this flexibility in our new mapping class:

  • from an existing mapping (1)
  • from a sequence of two-tuples with keys and values (2)

x = dict({"a": 42, "b": 7, "c": 6}) # (1)
y = dict([("a", 42), ("b", 7), ("c", 6)]) # (2)
 
x == y # Output: True

This means that we need two separate definitions for __init__() that represent two alternatives to create an instance of our new Lookup class:

  1. def __init__(self, source: Mapping[Comparable, Any]) -> None
  2. def __init__(self, source: Iterable[tuple[Comparable, Any]]) -> None

We’ll provide two method definitions with these two alternatives (using the @overload decorator) and then the real method definition with the actual logic.

Let’s implement both the Lookup and Comparable classes:

# A __future__ import must be the first statement in the module
from __future__ import annotations

import bisect
from collections.abc import Iterable, Iterator, Mapping, Sequence
from typing import Any, Protocol, Union, overload
 
 
class Comparable(Protocol):
    def __eq__(self, other: Any) -> bool: ...
    def __ne__(self, other: Any) -> bool: ...
    def __le__(self, other: Any) -> bool: ...
    def __lt__(self, other: Any) -> bool: ...
    def __ge__(self, other: Any) -> bool: ...
    def __gt__(self, other: Any) -> bool: ...
 
 
class Lookup(Mapping[Comparable, Any]):
    @overload
    def __init__(self, source: Iterable[tuple[Comparable, Any]]) -> None:
        ...
 
    @overload
    def __init__(self, source: Mapping[Comparable, Any]) -> None:
        ...
 
    def __init__(self, source: Union[Iterable[tuple[Comparable, Any]], Mapping[Comparable, Any], None] = None) -> None:
        sorted_pairs: Sequence[tuple[Comparable, Any]]
        if isinstance(source, Sequence):
            print("The source is of Sequence type")
            sorted_pairs = sorted(source)
        elif isinstance(source, Mapping):
            print("The source is of Mapping type")
            sorted_pairs = sorted(source.items())
        else:
            sorted_pairs = []
        self.key_list = [p[0] for p in sorted_pairs]
        self.value_list = [p[1] for p in sorted_pairs]
 
        
    def __len__(self) -> int:
        return len(self.key_list)
    
    def __iter__(self) -> Iterator[Comparable]:
        return iter(self.key_list)
    
    def __contains__(self, key: object) -> bool:
        index = bisect.bisect_left(self.key_list, key)
        # bisect_left() returns len(key_list) when key sorts after every stored key
        return index < len(self.key_list) and key == self.key_list[index]
    
    def __getitem__(self, key: Comparable) -> Any:
        index = bisect.bisect_left(self.key_list, key)
        if index < len(self.key_list) and key == self.key_list[index]:
            return self.value_list[index]
        raise KeyError(key)

Now we’ll create two instances of the Lookup class first using a Sequence as input and then a Mapping:

sequence_source = Lookup(
    [
        ("z", "Zillah"),
        ("a", "Amy"),
        ("c", "Clara"),
        ("b", "Basil")
    ]
)
 
mapping_source = Lookup(
    {
        "z": "Zillah",
        "a": "Amy",
        "c": "Clara",
        "b": "Basil"
    }
)

(PS: some type hints in the Lookup class are not completely clear to me, e.g. why use Iterable in __init__() and then Sequence in the if-else statement?) Note that the print() calls live in __init__(), so each message appears once, when the instance is created, and not on later lookups. Then:

>>> print(sequence_source["c"])
Clara

>>> print(mapping_source["c"])
Clara

>>> mapping_source["m"] = "Maud"
TypeError: 'Lookup' object does not support item assignment

  • The __init__() handles 3 cases for loading a mapping:
    1. building the values from a sequence of pairs
    2. building the values from a mapping object
    3. creating an empty sequence of values;
  • we provide concrete implementations of:
    • __len__(), __iter__(), __contains__(), required by the Sized, Iterable, and Container abstract classes. The Collection class then combines all three
    • __getitem__(), required by the Mapping abstract class
    • Note that the __contains__() definition has the object type as the type hint because Python’s in operation needs to support any kind of object, even ones that don’t obviously support the Comparable protocol;
  • this class doesn’t support item assignment because we didn’t define a __setitem__() method;
  • the Comparable class defines the minimum set of features - the protocol - for the keys of the object used to instantiate the Lookup class. It’s a way to formalize the comparison rules required to keep the keys for the mapping in order. These methods have no implementation because the class is only used to introduce a new type hint. We provide ... as the bodies because the real bodies will be provided by existing class definitions like str and int.

The general approach to using abstract classes is:

  1. find the class that does most of what you need;
  2. identify the methods in the collections.abc definitions that are marked as abstract;
  3. subclass the abstract class, filling in the missing methods.
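
The three steps above can be sketched with collections.abc.Sequence. The Countdown class is a hypothetical example, and it deliberately ignores slices and negative indexes to stay short.

```python
from collections.abc import Sequence

# Step 2: see which methods the ABC leaves abstract.
print(sorted(Sequence.__abstractmethods__))  # ['__getitem__', '__len__']


# Step 3: subclass and fill in only those methods.
class Countdown(Sequence):
    """Hypothetical read-only sequence start, start-1, ..., 1."""

    def __init__(self, start: int) -> None:
        self.start = start

    def __len__(self) -> int:
        return self.start

    def __getitem__(self, index: int) -> int:
        # Sketch only: slices and negative indexes are not handled.
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index


c = Countdown(3)
print(list(c))     # [3, 2, 1]  (__iter__() is a mixin method)
print(2 in c)      # True       (__contains__() is a mixin method)
print(c.index(1))  # 2          (index() is a mixin method)
```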

1.4. Creating your own abstract base class

There are two general paths to create classes that are similar:

  1. duck typing: offers the most flexibility, but it may sacrifice the ability to use mypy;
  2. define common abstractions: can be wordy and confusing.

Let’s tackle a problem: we want to build a simulation of games that involve polyhedral dice. One problem to solve is how best to simulate rolls of these differently shaped dice; Python offers three ways to get random data:

  1. the random module
  2. the os module
  3. the secrets module
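
To make the three options concrete, here is a rough sketch of one die roll drawn from each source. The function names are hypothetical, and the os.urandom() variant uses a simple modulo, which introduces a slight bias and is acceptable only as an illustration.

```python
import os
import random
import secrets


def roll_with_random(sides: int = 6) -> int:
    # Pseudo-random generator; fine for simulations and can be seeded.
    return random.randint(1, sides)


def roll_with_os(sides: int = 6) -> int:
    # os.urandom() returns random bytes from the operating system.
    # The modulo step is slightly biased; this is a sketch, not a recipe.
    return int.from_bytes(os.urandom(4), "big") % sides + 1


def roll_with_secrets(sides: int = 6) -> int:
    # secrets.randbelow(n) returns an int in the range [0, n).
    return secrets.randbelow(sides) + 1


for roll in (roll_with_random, roll_with_os, roll_with_secrets):
    print(roll.__name__, roll())
```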

Rather than bake the choice of random number generator into a class, we can define an abstract class that has the general features of a die and concrete subclasses can supply the missing randomization capability.

Let’s use the abc module that has the foundational definitions for abstract classes (note that the abc module is distinct from the collections.abc module):

from abc import ABC, abstractmethod
 
class Die(ABC):
    def __init__(self) -> None:
        self.face: int
        self.roll()
 
    @abstractmethod
    def roll(self) -> None:
        pass
 
    def __repr__(self) -> str:
        return f"{self.face}"

This class inherits from the abc.ABC class that assures us that any attempt to create an instance of the Die class directly will raise a TypeError exception:

>>> d = Die()
TypeError: Can't instantiate abstract class Die with abstract method roll

Let’s extend this abstract class with two subclasses using two different approaches to select a random value:

import random
 
class D4(Die):
    def roll(self) -> None:
        self.face = random.choice((1, 2, 3, 4))
 
class D6(Die):
    def roll(self) -> None:
        self.face = random.randint(1,6)

Now, let’s create another abstract class that represents a handful of dice. In some games, the rules require the player to roll all the dice; in others, exactly two dice; in others, at most two dice. So, there are different rules that apply to a simple list of Die instances. Here’s a class that leaves the roll implementation as an abstraction:

from typing import Type
 
class Dice(ABC):
    def __init__(self, n: int, die_class: Type[Die]) -> None:
        self.dice = [die_class() for _ in range(n)]
 
    @abstractmethod
    def roll(self) -> None:
        pass
 
    @property
    def total(self) -> int:
        return sum(d.face for d in self.dice)

  • n is an integer representing how many dice are needed;
  • the type hint Type[Die] means that any subclass of the abstract base class Die is accepted. die_class is expected to be a class object (like Die or one of its subclasses, like D4 and D6), not an instance of that class.

Now we’ll extend the Dice class with a subclass that implements the roll-all-the-dice rule:

class SimpleDice(Dice):
    def roll(self) -> None:
        for d in self.dice:
            d.roll()

Now we’ll create an instance of this class using 6 instances of D6 and we’ll calculate the sum of the faces after rolling this handful of 6 dice:

sd = SimpleDice(6, D6)
sd.roll()
sd.total # Output: 30 (you'll get a different number each time, of course)

Let’s extend the Dice class with another implementation:

from collections.abc import Iterable, Set
 
 
class YachtDice(Dice):
    def __init__(self) -> None:
        super().__init__(5, D6)
        self.saved: Set[int] = set()
 
    def saving(self, positions: Iterable[int]) -> "YachtDice":
        # Valid positions are 0 .. len(self.dice) - 1 (five dice here)
        if not all(0 <= n < len(self.dice) for n in positions):
            raise ValueError("Invalid position")
        self.saved = set(positions)
        return self
 
    def roll(self) -> None:
        for i, d in enumerate(self.dice):
            if i not in self.saved:
                d.roll()
        self.saved = set()

Let’s see how this works:

>>> sd = YachtDice()
 
>>> sd.roll()
>>> sd.dice
[3, 3, 1, 4, 2]
 
>>> sd.saving([0, 1, 2]).roll()
>>> sd.dice
[3, 3, 1, 3, 4]

In this example, we saved the dice in positions 0, 1, and 2. When rolling the dice a second time, the first three dice remain unchanged, and only the last two are rolled.