Guido van Rossum, Jukka Lehtosalo, and Łukasz Langa, PEP 484—Type Hints

It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.

Type hints are one of the biggest changes in the history of Python.

PEP 484 - Type Hints - introduced syntax and semantics for explicit type declarations in function arguments, return values, and variables. The goal is to help developer tools find bugs in Python codebases via static analysis, i.e., without actually running the code through tests.

However, type hints do not benefit all Python users equally, and that’s the reason why they should always be optional. For most of them (for example, for those who have small codebases and small teams), the cost of learning type hints is likely too high, and the benefits will be too low.

About Gradual Typing

PEP 484 introduced a Gradual Type System (invented by Jeremy Siek) and it’s explained in PEP 483. Characteristics of a gradual type system:

  • It is optional. By default, the type checker should not emit warnings for code that has no type hints (in this case, the type is inferred to Any). This is the best usability feature of gradual typing.
  • It does not catch type errors at runtime. Type hints do not prevent inconsistent values from being passed to functions or assigned to variables at runtime.
  • It does not enhance performance. Type annotations provide data that could, in theory, allow optimizations in the generated bytecode, but such optimizations are not implemented in any Python runtime (at least until July 2021).

The best usability feature of gradual typing is that annotations are always optional.

Gradual Typing in Practice

Let’s define a function without annotations, and let’s save it into messages.py file:

def show_count(count, word):
    if count == 1:
        return f'1 {word}'
    count_str = str(count) if count else 'no'
    return f'{count_str} {word}s'

Here’s its usage:

>>> show_count(99, "bird")
'99 birds'
 
>>> show_count(1, "bird")
'1 bird'
 
>>> show_count(0, "bird")
'no birds'

Starting with Mypy

In the following examples, we’ll use Mypy, which is a Python type checker. Let’s start with type checking by running:

$ mypy messages.py
Success: no issues found in 1 source file

Mypy with default settings finds no problem because, if a function has no annotations, it ignores it by default.

Let’s create some unit tests in another file called messages_test.py:

from messages import show_count
from pytest import mark
 
@mark.parametrize('qty, expected', [
    (1, '1 part'),
    (2, '2 parts'),
])
def test_show_count(qty, expected):
    got = show_count(qty, 'part')
    assert got == expected
 
def test_show_count_zero():
    got = show_count(0, 'part')
    assert got == 'no parts'

To make Mypy more strict, we can use --disallow-untyped-defs command-line option which makes Mypy flag any function definition that does not have type hints for all its parameters and for its return value:

$ mypy --disallow-untyped-defs messages_test.py
python_code/08/messages.py:1: error: Function is missing a type annotation  [no-untyped-def]
python_code/08/messages_test.py:8: error: Function is missing a type annotation  [no-untyped-def]
python_code/08/messages_test.py:13: error: Function is missing a return type annotation  [no-untyped-def]
python_code/08/messages_test.py:13: note: Use "-> None" if function does not return a value
Found 3 errors in 2 files (checked 1 source file)

The fully annotated signature that satisfies Mypy is:

def show_count(count: int, word: str) -> str:

A default Parameter Value

The show_count function works only with regular nouns. In cases where appending -s doesn’t work, we should let the user provide the plural form like this:

>>> show_count(3, 'mouse', 'mice')
3 mice

Let’s write a new unit test:

def test_irregular() -> None:
    got = show_count(2, 'child', 'children')
    assert got == '2 children'

Then let’s use Mypy:

$ mypy messages_test.py
python_code/08/messages_test.py:18: error: Too many arguments for "show_count"  [call-arg]
Found 1 error in 1 file (checked 1 source file)

Now, let’s modify show_count function:

def show_count(count: int, singular: str, plural: str = '') -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'

Now Mypy will no longer return errors:

$ mypy messages_test.py
Success: no issues found in 1 source file

Good Style for Type Hints

  • No space between the parameter name and the :; one space after the : (i.e. def example(parameter: str))
  • Spaces on both sides of the = that precedes a default parameter value (i.e. def example(parameter: str = ''))

On the other hand, PEP 8 says there should be no spaces around the = if there is no type hint for that particular parameter (i.e. def example(parameter='')).

Code Style: Use flake8 and blue

Instead of memorizing these rules, use tools like flake8 and blue:

  • flake8 reports on code styling, among many other issues.
  • blue rewrites source code according to (most) rules embedded in black code formatting tool.

Given the goal of enforcing a standard coding style, blue is better than black because it follows Python’s own style of using single quotes by default, double quotes as an alternative:

>>> "I prefer single quotes"
'I prefer single quotes'

The preference for single quotes is embedded in repr(), among other places in CPython.

Using None as a Default

Instead of using '' as a default as we did in the previous example, it’s better to use None (remember that, as we said in a previous paragraph Mutable Types as Parameter Default: Bad Idea!, you have to pay attention even when using None as a default for a mutable object).

Here’s the signature for having None as default for a parameter:

from typing import Optional
 
def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
  • Optional[str] means plural may be str or None
  • We have to explicitly provide the default value None because, otherwise, the Python runtime will treat it as a required parameter (remember: at runtime, type hints are ignored).

Actually, Optional is not a great name since it can be misleading because it might imply that the parameter itself becomes optional (i.e., it doesn’t need to be passed when calling the function). However, what Optional really does is allow the parameter to take on the value None in addition to another type. In other words, it means “this parameter can either be of type str or None”.

Types are defined by Supported Operations

Let’s see what the concept of “type” means in practice. What is a type?

Pep 483 - The Theory of Type Hints

There are many definitions of the concept of type in the literature. Here, we assume that type is a set of values and a set of functions that one can apply to those values.

There are several ways to define a particular type:

  • By explicitly listing all values. E.g., True and False form the type bool.
  • By specifying functions that can be used with variables of a type. E.g. all objects that have __len__method form the type Sized. Both [1, 2, 3] and 'abc' belong to this type, since one can call len() on them:
    >>> len([1, 2, 3]) # OK
    >>> len('abc') # also OK
    >>> len(42) # not a member of Sized
  • By a simple class definition, for example, if one defines a class:
    class UserID(int):
    	pass
    Then all the instances of this class also form a type.
  • There are also more complex types. E.g., one can define the type FancyList as all lists containing only instances of int, str, or their subclasses. The value [1, 'abc', UserID(42)] has this type.

Type as “set of values. Python doesn’t provide syntax to control the set of possible values for a type—except in Enum types. For example, using type hints, you can’t define Quantity as an integer between 1 and 1000, or AirportCode as a 3-letter combination. NumPy offers uint8, int16, and other machine-oriented numeric types, but in the Python standard library, we only have types with very small sets of values (NoneType, bool [True or False]) or extremely large sets (float, int, str, all possible tuples, etc.).

Type as “set of functions that one can apply to those values”. In practice, it’s more useful to consider the set of supported operations as the defining characteristic of a type. For example, from the point of view of applicable operations, what are the valid types for x in the following function?

def double(x):
	return x * 2

The x parameter may be numeric (int, complex, Fraction, numpy.uint32, etc.), or a sequence (str, tuple, list), or any other type that implements the __mul__ method that accepts an integer argument. Now, let’s look at this annotated version:

from collections import abc
 
def double(x: abc.Sequence):
	return x * 2

Note that:

  • A type checker would reject that code by declaring x * 2 as wrong because the Sequence abstract class doesn’t implement or inherit the __mul__ method.
  • At runtime, that code will work with concrete sequences such as str, tuple, list, etc, as well as numbers (because the type hints are ignored at runtime). The computation may work, or it may raise TypeError if the operation is not supported by x. In contrast, Mypy will declare x * 2 as wrong after analyzing the annotated double parameter, because it’s an unsupported operation for the declared type abc.Sequence.

That’s the reason why the title of this paragraph is “Types are defined by Supported Operations”.

In a Gradual Type System, we have the interplay of two different views of types:

  • Duck Typing:

    1. Objects have types:
      print(type('Hello')) # <class 'str'>
    2. But variables (including parameters) are untyped (a variable is just a name pointing to an object, and that object can be of any type, and can even change later).
      x = 42 # points to an int
      x = 'hello' # now x points to a str

    In practice, it doesn’t matter what the declared type of the object is, only what operations it actually supports. By definition, duck typing is only enforced at runtime, when operations on objects are attempted. This view is adopted by Smalltalk, Python, JavaScript, and Ruby. Note that Duck Typing is an implicit form of Structural Typing, which Python ≥ 3.8 also supports with the introduction of typing.Protocol (what “implicit form” means is covered later in this chapter—in Static Protocol —and also with more details in Chapter 13).

  • Nominal Typing:

    1. Objects and variables have types:
      Dog d = new Dog();
      In this example variable d has type Dog and the object new Dog() is also of type Dog. The types are fixed and checked at compile time.
    2. But objects only exists at runtime.

    So, in practice:

    • When you write Dog d = new Dog();, the type Dog is known at compile time (types are just labels or contracts used at compile time to enforce correctness).
    • But the actual Dog object created by new Dog() only exists when you run the program.

    This view is adopted by C++, Java, and C#, and is supported by annotated Python.

Here’s an example that contrasts Duck Typing and Nominal Typing, as well as static type checking and runtime behaviour in bird.py python module:

class Bird:
    pass
 
class Duck(Bird):
    def quack(self):
        print('Quack!')
 
def alert(birdie):
    birdie.quack()
 
def alert_duck(birdie: Duck) -> None:
    birdie.quack()
 
def alert_bird(birdie: Bird) -> None:
    birdie.quack() 

We can see that alert_bird() function is problematic because Bird class has no quack() method. Let’s do a type check on this file:

>>> mypy birds.py 
birds.py:15: error: "Bird" has no attribute "quack"  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

Now, let’s use this module in daffy.py python file:

from birds import *
 
daffy = Duck()
alert(daffy)
alert_duck(daffy)
alert_bird(daffy)

Mypy sees no problem with daffy.py, but raises the same error as before on bird.py module:

>>> mypy daffy.py
birds.py:15: error: "Bird" has no attribute "quack"  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

However, everything works if we run this file:

>>> python daffy.py 
Quack!
Quack!
Quack!

At runtime, Python doesn’t care about declared types. It uses duck typing only. Of course, a static type checker would raise an error (even though the program actually executes).

Now, let’s execute the same functions, but using an instance of Bird:

from birds import *
 
woody = Bird()
alert(woody)
alert_duck(woody)
alert_bird(woody)

First, let’s do type checking with Mypy:

>>> mypy woody.py
birds.py:15: error: "Bird" has no attribute "quack"  [attr-defined]
woody.py:5: error: Argument 1 to "alert_duck" has incompatible type "Bird"; expected "Duck"  [arg-type]
Found 2 errors in 2 files (checked 1 source file)
  • The first error is the same as before.
  • The second error is in woody.py file because the function alert_duck() requires a Duck, but woody is a Bird.

None of these calls succeeded because Bird instances (like woody in this case) have no quack() method:

>>> alert(woody)
Traceback (most recent call last):
  ...
AttributeError: 'Bird' object has no attribute 'quack'
 
>>> alert_duck(woody)
Traceback (most recent call last):
  ...
AttributeError: 'Bird' object has no attribute 'quack'
 
>>> alert_bird(woody)
Traceback (most recent call last):
  ...
AttributeError: 'Bird' object has no attribute 'quack'

This experiment shows that:

  • Duck Typing is easier to get started and is more flexible, but allows unsupported operations to cause errors at runtime (such as Line 4 in the above terminal);
  • Nominal Typing detects errors before runtime, but sometimes can reject code that actually runs (such as the call alert_bird(duffy)).

Gradual Type System by Gemini 2.5 Flash

I asked Gemini (with Flash 2.5) to make a summary about type systems, specifically in Python. That’s because I am confused about some concepts. Here’s what Gemini wrote.

Duck Typing

Duck typing is a programming philosophy where the type of an object is determined by its behavior, not its explicit class or name. The core idea is “If it walks like a duck and it quacks like a duck, then it must be a duck.” This means that as long as an object has the required methods and attributes, it can be used in a particular context. Python, by default, is a duck-typed language.

In practice:

  • Variables don’t have a fixed type; they can hold objects of any type.
  • The interpreter checks for the existence of a method or attribute at runtime (when the code is running).
  • This provides flexibility, allowing you to use different objects interchangeably as long as they support the same operations.

Example:

class Duck:
    def quack(self):
        print("Quack!")
 
class Person:
    def quack(self):
        print("I'm not a duck, but I can imitate one: Quack!")
 
def make_it_quack(animal):
    # This function doesn't care if 'animal' is a Duck or a Person.
    # It just checks if the 'quack' method exists at runtime.
    animal.quack()
 
make_it_quack(Duck())
make_it_quack(Person())

Nominal Typing

Nominal typing is a system where the type of an object is based on its name or its declared class. This is the more traditional approach seen in languages like C++, Java, and C#. With Python’s type hints, you can adopt a form of nominal typing.

In practice:

  • Both objects and variables are assigned a specific type.
  • type checker (a static analysis tool like mypy) analyzes the code before it runs to ensure types are used correctly.
  • The type checker only looks at the declared type hints, not the actual object that will be used at runtime.
  • This provides rigidity and helps catch bugs early, but it can be less flexible than duck typing.

Example:

class Bird:
    def fly(self):
        print("Flying...")
 
class Duck(Bird):
    def quack(self):
        print("Quack!")
 
def get_a_bird(birdie: Bird):
    # A type checker would see that 'birdie' is of type 'Bird'.
    # It would then flag the following line as an error because
    # the 'Bird' class does not have a 'quack' method.
    # It doesn't matter that we might pass a 'Duck' object at runtime.
    birdie.quack() # A type checker would raise an error here.
 
# Even though this would work at runtime, the type checker would flag the call.
get_a_bird(Duck())

Gradual Typing

Gradual typing is a blend of both duck typing and nominal typing. Python, with its introduction of optional type hints, has become a gradually typed language. This means you can add type hints to your code where you want to, and still use the flexible duck typing for the rest.

  • The Python interpreter itself ignores type hints at runtime, so the code behaves as it would with duck typing.
  • You use external tools like mypy to perform static analysis and get the benefits of nominal typing.
  • This allows developers to incrementally add type checking to their codebase, providing a middle ground between the full flexibility of duck typing and the safety of static typing.

Duck Typing as Implicit Form of Structural Typing

When the author says “Duck typing is an implicit form of structural typing, which Python ≥ 3.8 also supports with the introduction of typing.Protocol” it means that it doesn’t care about a variable’s declared type; it just tries to perform the operation. If the object has the required structure (the right methods and attributes), the code works.

Structural vs. Nominal Typing

Nominal typing (which we discussed earlier) is about a type’s name. For a function expecting a Bird, you must pass an object that is an instance of the Bird class or one of its subclasses. - When you write def my_func(arg: Bird):, you’re telling the type checker that arg must be an instance of the Bird class or a subclass of Bird.

Structural typing is all about an object’s structure—its methods and attributes. It doesn’t care about the object’s class name. As long as an object has the right “shape” (i.e., the required methods and attributes), it’s considered compatible. When you write def my_func(arg: Quacking):, you’re telling the type checker that arg can be an instance of any class, as long as that class has a quack() method. So, the class doesn’t have to inherit from a base class or implement any specific interface named Quacking.

Duck Typing as Implicit Structural Typing

Python’s duck typing is the implicit version of structural typing. At runtime, the interpreter checks for the existence of methods or attributes on the object. If the required method is there, the code works. The structure of the object is what matters, not its name or class hierarchy.

typing.Protocol for Explicit Structural Typing

With the introduction of typing.Protocol in Python 3.8, you can now enforce structural typing statically using type hints. typing.Protocol allows us to make duck typing explicit for static type checkers. A Protocol defines a “contract” or a “blueprint” of a class’s required methods and attributes without needing to inherit from a base class.

You can create a Protocol that specifies a certain structure, and a static type checker like mypy will then verify that any class passed to a function adheres to that structure.

Example

Let’s revisit the duck example, but this time using a Protocol.

from typing import Protocol
 
class Quacking(Protocol):
    def quack(self) -> None:
        ...
 
class Duck:
    def quack(self) -> None:
        print("Quack!")
 
class Person:
    def quack(self) -> None:
        print("I can quack too!")
 
def make_it_quack(animal: Quacking):
    # This function expects any object that adheres to the Quacking protocol.
    animal.quack()
 
make_it_quack(Duck()) # mypy would accept this.
make_it_quack(Person()) # mypy would also accept this.

In this example, the Quacking protocol says, “I don’t care about the class name, I just need an object with a quackmethod that takes no arguments and returns None.” mypy will check if Duck and Person conform to this protocol, and because they do, the code passes the static check.

This is why typing.Protocol is so powerful: it allows you to get the flexibility of structural typing (which is what duck typing does at runtime) but with the safety and early error-checking benefits of a static type checker. It bridges the gap between Python’s dynamic nature and the desire for more robust, statically checked code.

Types usable in annotations

The Any Type

The keystone of any gradual type system is Any type, also known as the dynamic type. When a type checker sees an untyped function, it assumes that the untyped arguments are of type Any. So, when the type checker sees:

 def double(x):
	 return x * 2

actually it assumes:

def double(x: Any) -> Any:
	return x * 2

This means that x and the returned value of double() function can be of any type, including different types.

Any is supposed to support every possible operation, and this is in contrast with object. For example, let’s consider this function:

def double(x: object) -> object:
	return x * 2

Mypy rejects this function, and the reason is that object does not support the __mul__ operation:

>>> mypy double_object.py 
double_object.py:2: error: Unsupported operand types for * ("object" and "int")  [operator]
Found 1 error in 1 file (checked 1 source file)

Actually, this is not completely unpredictable because more general types (and object is a general type) have narrower interfaces (i.e. they support fewer operations): object class implements fewer operations than abc.Sequence, which implements fewer operations that abc.MutableSequence, which implements fewer operations than list.

Any is a special type that exists both at the highest and lowest points of any type hierarchy. It’s at once the broadest type and the most specific, allowing any kind of operation. This is, at least, how the type checker interprets Any. Naturally, no real type can truly handle every operation, so relying on Any stops the type checker from doing its main job: spotting operations that might be invalid before they cause your program to fail with a runtime error.

Subtype-of versus Consistent-with

Subtype-of in Nominal Type System

Traditional object-oriented Nominal Type Systems rely on the “is subtype-of” relationship. Given a class T1 and a subclass T2, then T2 is subtype-of T1. Let’s consider this code:

class T1;
	...
 
class T2(T1):
	...
 
def f1(p: T1) -> None:
	...
 
o2 = T2()
 
f1(o2) # OK

The call f1(o2) is an application of the Liskov Substitution Principle - LSP. Barbara Liskov actually defined “is subtype-of” in terms of supported operations: if an object of type T2 substitutes an object of type T1 and the program still behaves correctly, then T2 is subtype-of T1. Instead, if you consider:

def f2(p: T2):
	pass
 
o1 = T1()
 
f2(o1) # type error

this is a violation of the LSP.

This behaviour makes perfect sense:

  • T2 inherits and must support all operations of T1. So, an instance of T2 can be used anywhere an instance of T1 is expected.
  • The reverse is not necessarily true: T2 may implement additional methods that T1 doesn’t have, so an instance of T1 may not be used everywhere an instance of T2 is expected.

Consistent-with in Gradual Type System

Gradual Type Systems rely on the “consistent with” relationship. It applies wherever subtype-of applies, with special provisioning for type Any. The rules for consistent-with are:

  1. Given T1 and a subtype T2, then T2 is consistent-with T1 (Liskov substitution);
  2. Every type is consistent-with Any: you can pass objects of every type to an argument declared of type Any (but every type is not a subtype of Any):
    def f3(p: Any) -> None
     
    o0 = object()
    o1 = T1()
    o2 = T2()
     
    f3(o0) # OK
    f3(o1) # OK
    f3(o2) # OK
  3. Any is consistent-with every type: you can always pass an object of type Any where an argument of another type is expected (but Any is not a subtype of every type):
    def f4(): # implicit return type: `Any`
    	...
     
    o4 = f4() # inferred type: `Any`
     
    # since o4 is inferred of type `Any`, I can use it everywhere
    f1(o4)
    f2(o4)
    f3(o4)

Gradual typing allows one to annotate only part of a program, thus leveraging desirable aspects of both dynamic and static typing.

Simple Types and Classes

The following object can be used directly in type hints:

  • simple types like int, float, str, etc.
  • concrete classes from the standard library;
  • concrete classes from external packages;
  • user-defined classes;
  • abstract base classes (we’ll start in one of the next chapters Abstract Base Classes).

Among classes, consistent-with is defined like subtype-of: a subclass is consistent-with all its superclasses. However, since “practicality beats purity”, there is an important exception with int, float, and complex: there is no nominal subtype relationship between int, float, and complex because they’re direct subclasses of object. But PEP 484 declares that:

  • int is consistent-with float (even though int is not a subclass of float)
  • float is consistent-with complex (even though float is not a subtype of complex).

It makes sense in practice: int implements all operations that float does, and int implements additional ones as well (like bitwise operations like &, |, <<, etc.)

Optional and Union Types

We have already seen in Using None as a Default paragraph that Optional is a special type used to set None as a default and we said that arg: Optional[str] simply means that this parameter can either be of type str or None”.

Indeed, Optional[T] is just a shortcut for Union[T, None], where Union simply allows a parameter to be one of the listed types.

Here’s an example of Union usage:

def fun(arg: Union[str, int]) -> Union[str, int]
	pass

Here:

  • arg may be either str or int;
  • The function may return a str or an int.

It is very common to try to avoid creating functions that return Union types because this forces the user to check the type of the returned values at runtime to know what to do with them.

Union is useful with types that are not consistent among themselves. For example, Union[int, float] is redundant because, as we said in Simple Types and Classes, int is consistent-with float. If you just use float to annotate a parameter, it will accept int type as well

Note that Python 3.10 (with PEP 604) introduces a new syntax for expressing optional and union by using | operator:

  • instead of arg: Union[str, int] we can write arg: str | int;
  • instead of arg: Union[str, int] = None we can write arg: str | int = None;

Generic Collections

In Python collections, you can put a mixture of different types. In practice, this is not very useful because it’s likely to want to operate on them. Generic Types can be declared with type parameters to specify the type of the items they can handle.

For example, you can parameterize a listto constrain the type of the elements in it:

def tokenize(text: str) -> list[str]:
	return text.upper.split()

PEP 585 lists collections from the standard library accepting generic type hints.

Python 3.7 and 3.8

For these Python versions, we need a __future__ import to make the [] notation work with built-in collections:

from ___future__ import annotations
 
def tokenize(text: str) -> list[str]:
	return text.upper().split()

Python 3.5 and 3.6

The __future__ import does not work with Python 3.6 or earlier. For Python >= 3.5:

from typing import List
 
def tokenize(text: str) -> List[str]:
	return text.upper().split()

To provide the initial support for generic type hints, the author of PEP 484 created dozens of generic types in the typing module. Some of them are:

Past and Future of Generic Collections

PEP 585 started a multiyear process to improve the usability of generic type hints. We can summarize that process in four steps:

  1. Introduce from __future__ import annotations in Python 3.7 to enable the use of standard library classes as generics with list[str] notation.
  2. Make that behavior the default in Python 3.9: list[str] now works without the future import.
  3. Deprecate all the redundant generic types from the typing module.
  4. Remove those redundant generic types in the first version of Python released 5 years after Python 3.9 (that could be Python 3.14).

Tuple Types

There are three ways to annotate tuple types:

  1. Tuples as records
  2. Tuples as records with named fields
  3. Tuples as immutable sequences

Tuples as records

If you use a tuple as a record, use the tuple built-in and declare the types of the fields within [], such as arg: tuple[str, float, str].

Tuples as records with named fields

To annotate a tuple with many fields, it is recommended to use typing.NamedTuple, as already seen in 05. Data Class Builders. For example:

from typing import NamedTuple
 
class Coordinate(NamedTuple):
	lat: float
	lon: float
 
def find_location(lat_lon: Coordinate) -> str:
	pass

Considering that:

Therefore: Coordinate is consistent-with tuple[float, float], but the reverse is not true (Coordinate has extra methods added by NamedTuple). This means that it is type-safe to pass a Coordinate instance to the display function:

def display(lat_lon: tuple[float, float]) -> str:
	pass

Tuples as immutable sequences

To annotate tuples of unspecified length that are used as immutable lists, you must specify a single type followed by a comma and ... (that’s Python’s ellipsis token).

For example, tuple[int, ...] is a tuple with int items where the ellipsis indicates that any number of elements >= 1 is acceptable. Actually, arg: tuple[Any, ...] is the same as arg: tuple.

Generic Mappings

Generic mapping types are annotated as Mapping[KeyType, ValueType]. The built-in dict and the mapping types in collections (such as collections.defaultdict, collections.OrderedDict, collections.Counter, collections.ChainMap, etc.) and collections.abc (such as collections.abc.Mapping, collections.abc.MutableMapping) accept that notation in Python >= 3.9. Mapping types from the typing module are necessary for earlier versions.

Abstract Base Classes

First, let’s introduce Postel’s Law:

Postel's law

Be conservative in what you send, be liberal in what you accept.

Be liberal in what you accept

When annotating function arguments, it’s best to use abstract classes (such as those listed in Python 3.5 and 3.6) or their equivalents in the typing module (for versions before Python 3.9), rather than concrete types. This provides greater flexibility for the caller.

For example, using arg: Mapping[str, int] instead of arg: dict[str, int] allows the caller to provide an instance of dict, defaultdict, ChainMap, a UserDict subclass, or any other type that is a subtype-of Mapping.

On the other hand, arg: dict[str, int] allows the caller to provide a dict or one of its subtypes, such as defaultDict or OrderedDict. Indeed, a subclass of collections.UserDict would not pass the type check, despite being the recommended way to create user-defined mappings, as we saw in 5.5. Subclassing UserDict instead of dict, because it is not a subclass of dict; they are siblings. Both are subclasses of abc.MutableMapping (actually dict is a virtual subclasses of abc.MutableMapping; this concept will be explained in chapter 13).

Therefore, in general, it’s better to use abc.Mapping or abc.MutableMapping in parameter type hints instead of dict.

Be conservative in what you send

The return value of a function is always a concrete object, so the return type hint should be a concrete type, such as def test_function(text: str) -> list[str]

Iterable

The typing.List documentation recommends Sequence and Iterable for function parameter type hints.

As an example, let’s use a function taking as a parameter an Iterable that produces items (I was wondering why the book uses the word “produces”. The following is an explanation given by ChatGPT: “An Iterable is something that yields or provides items one by one when iterated over. This wording emphasizes that the function does not necessarily take a concrete data structure [like a list or a tuple], but rather any object that can be iterated over to generate the required items.) that are tuple[str, str]:

from collections.abc import Iterable
 
FromTo = tuple[str, str]
 
def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
	for from_, to in changes:
		text = text.replace(from_, to)
	return text
 
 
l33t = [('a', '4'), ('e', '3'), ('i', '1'), ('o', '0')]
text = 'mad skilled noob powned leet'
zip_replace(text, l33t) # Output: 'm4d sk1ll3d n00b p0wn3d l33t'

Note that PEP 613 introduces TypeAlias, which is used to indicate that a particular name is meant to be a type alias, making it clearer and preventing confusion with regular type annotations.

Parametrized Generics and TypeVar

A Parametrized Generic is a generic type written as list[T], where T is a type variable that will be bound to a specific type with each usage. This allows a parameter type to be reflected on the result type. Let’s give this example:

from collections.abc import Sequence
from random import shuffle
from typing import TypeVar
 
T = TypeVar('T')
 
def sample(population: Sequence[T], size: int) -> list[T]:
	if size < 1:
		raise ValueError('size must be >= 1')
	result = list(population)
	shuffle(result)
	return result[:size]

We used a type variable because:

  • if we consider a tuple of type tuple[int, ...] - which is consistent-with Sequence[int] - then the type parameter is int, so the return type is list[int];
  • if we consider a tuple of type tuple[str, ...] - which is consistent-with Sequence[str] - then the type parameter is str, so the return type is list[str].

Another interesting example involves the statistics.mode function from the standard liberay, which returns the most common data point from a series. Without using TypeVar this function could have this signature:

from collections import Counter
from collections.abd import Iterable
 
def mode(data: Iterable[float]) -> float:
	pairs = Counter(data).most_common(1)
	if len(pairs) == 0:
		raise ValueError('no mode for empty data')
	return pairs[0][0]

This signature is great for int or float values, but not for other numerical types. We can improve the signature using TypeVar. Let’s start with a wrong parametrized signature:

from typing import TypeVar
 
T = TypeVar('T')
 
def mode(data: Iterable[T]) -> T:
	...

This signature has a problem: every iterable is consistent-with Iterable[T], including iterables of unhashable types that collections.Counter cannot handle. Therefore, we need to restrict the possible types assigned to T.

Restricted TypeVar

TypeVar accepts extra positional arguments to restrict the type parameter:

from collections.abc import Iterable
from decimal import Decimal
from fractions import Fraction
from typing import TypeVar
 
NumberT = TypeVar('NumberT', float, Decimal, Fraction)
 
def mode(data: Iterable[NumberT]) -> NumberT:
	...

If we want to add strings too, we can simply extend the list of arguments in TypeVar:

NumberT = TypeVar('NumberT', float, Decimal, Fraction, str)
 
def mode(data: Iterable[NumberT]) -> NumberT:
	...

This code works, but:

  • NumberT is badly misnamed since it accepts str;
  • we can’t keep listing types forever.

To solve this, we can use another feature of TypeVar.

Bounded TypeVar

As we mentioned before, collections.Counter cannot handle iterables of unhashable types, so the first (wrong) solution that comes to mind could be:

from collections.abc import Iterable, Hashable
 
def mode(data: Iterable[Hashable]) -> Hashable:
	...

The problem is that the type of the returned item is an Hashable, an abstract class that implements only the __hash__ method. So, the type checker will not let us do anything with the return value except call hash() on it. The solution is to use TypeVar with the bound parameter, which sets an upper boundary for the acceptable types. For example, bound=Hashable means the type parameter may be Hashable or any subtype-of it. Let’s rewrite the mode function:

from collections.abc import Iterable, Hashable
from collections import Counter
from typing import TypeVar
 
 
HashableT = TypeVar('HashableT', bound=Hashable)
 
def mode(data: Iterable[HashableT]) -> HashableT:
	pairs = Counter(data).most_common(1)
	if len(pairs) == 0:
		raise ValueError('no mode for empty data')
	return pairs[0][0]

Then:

>>> mode(['cat', 'dog', 'cat'])
'cat'
 
>>> mode([1,2,3,4,4,5])
4

str and int implement __hash__(), so they are subtypes-of (or consistent-with) the Hashable abstract class.

Static Protocol

The Protocol type, as presented in PEP 544, is similar to interfaces in Go: a protocol type is defined by specifying one or more methods, and the type checker verifies that those methods are implemented where that protocol type is required.

In Python, a protocol definition is written as a typing.Protocol subclass. However, classes that implement a protocol don’t need to inherit, register, or declare any relationship with the class that defines the protocol. It’s up to the type checker to find the available protocol types and enforce their usage.

Let’s create a function top(it, n) that returns the largest n elements of the iterable it by using Protocol and TypeVar:

from collections.abc import Iterable
 
def top(series: Iterable[T], length: int) -> list[T]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

The problem is: how to constrain T? It cannot be Any or object, because series must work with sorted() function. This problem could not arise when in the presence of the optional parameter key in the sorted() function. Consider the following scenario:

class A:
	def __init__(self, val):
		self.val = val
 
	def __repr__(self):
		return f'Obejct with val {self.val}'
 
items = [A(5), A(1), A(3)]
sorted_items = sorted(items, key=lambda a: a.val)

Then:

>>> sorted_items
[Obejct with val 1, Obejct with val 3, Obejct with val 5]

In this case, Any would be correct because Python’s sorted() doesn’t care what types are in the iterable, as long as you tell it how to compare them using the key parameter.

On the other hand, the problem arises when the key argument is not present:

>>> l = [object() for _ in range(4)]
>>> sorted(l)
TypeError: '<' not supported between instances of 'object' and 'object'

We got the error because the objects we are trying to sort don’t implement the __lt__ special method that supports the < operator. This means that the T parameter must be limited to types that implement __lt__ special method. Remember that in the example Bounded TypeVar, we resolved the issue by using typing.Hashable as the upper bound of T; but now for this case, there is no suitable type in typing or abc, so we need to create it:

from typing import Protocol, Any
from collections.abc import Iterable
from typing import TypeVar
 
class SupportsLessThan(Protocol):
    def __lt__(self, other: Any) -> bool:
        ...
 
T = TypeVar('T', bound=SupportsLessThan)
 
def top(series: Iterable[T], length: int) -> list[T]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

A type T is consistent-with a protocol P if T implements all the methods defined in P, with matching type signatures.

protocol type is a new feature in Python that allows you to specify a required “interface” for a function parameter or variable without needing to use inheritance. This is also called structural subtyping or static duck typing.

Duck Typing vs. Static Duck Typing:

  • Duck typing (the traditional Python way) is implicit: “If it walks like a duck and quacks like a duck, it’s a duck.” You can pass any object to a function as long as it has the required methods, but this is invisible to a static type checker.
  • Static duck typing (using typing.Protocol) makes this explicit for static type checkers. It allows you to define a protocol that outlines the required methods, so type checkers can verify the code without actually running it.

Advantages of Protocol Types:

  • No Special Declarations: A type doesn’t need to be derived from or registered with a protocol class. It just needs to implement the required methods. For example, str and float can be used with a SupportsLessThan protocol as long as they implement the __lt__ method.
  • Works with Existing Code: You can create a protocol that works with classes and types that you don’t control or can’t modify.
  • Type Checker Visibility: Protocols make the implicit rules of duck typing visible to static type checkers, allowing them to perform their job effectively and catch potential errors before runtime. This is not possible with traditional duck typing.

The typing.Protocol class was introduced in PEP 544. By inheriting from Protocol, you can define a class that serves as a blueprint for the required methods. For example, to create a protocol for a parameter that needs to support the < operator, you would define a protocol class with an __lt__ method. This tells the type checker that any object passed to this parameter must have an __lt__ method.

The nominal type of series doesn’t matter, as long as it implements the __lt__ method.

Python duck typing always allowed us to say that implicitly, leaving static type checkers clueless because a type checker can’t read CPython’s source code in C, or perform console experiments to find out that sorted only requires that the elements support <.

Protocol types are precisely what is needed to make duck typing explicit for static type checkers.

That’s why it makes sense to say that typing.Protocol gives us static duck typing.