18. Iterables and Iterators

The most basic thing that comes in mind in Python are lists, so let’s create one:

basic_list = [1, 2, 3, 4, 5]

We can also check its attribute:

>>> dir(basic_list)
['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

Among these methods, we have __iter__, which is called when you run constructs like for loops. This method returns the iterator object, which then provides a new value each time the loop iterates. The key difference between iterables and iterators is that iterables are objects that can return an iterator (via the iter method), while iterators are objects that define how to retrieve values from an iterable.

Let’s call this dunder __iter__ directly on our list:

>>> basic_list.__iter__()
<list_iterator at 0x1031cfbe0>

This method provides a new object back. A shorthand for this is just to call the iter method on our list, which does the exact same thing:

>>> basic_iter = iter(basic_list)
>>> basic_iter
<list_iterator at 0x1031cf490>

Let’s check the type:

>>> type(basic_iter)
list_iterator

which is different than the type on basic_list:

>>> type(basic_list)
list

The new iterator object basic_iter has a new magic method (visibile with dir(basic_iter)) called __next__. This is the magic method that defines the behaviour for getting a new value out of list.

Let’s call this method directly on the basic_iter:

>>> basic_iter.__next__()
1

Of course, we can alternatively use the next() function:

>>> next(basic_iter)
2

Let’s call again this function:

>>> next(basic_iter)
3
 
>>> next(basic_iter)
4
>>> next(basic_iter)
5
 
>>> next(basic_iter)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[9], line 4
      2 print(next(basic_iter))
      3 print(next(basic_iter))
----> 4 print(next(basic_iter))
 
StopIteration:

This StopIteration exception is how your for loops know when to stop, otherwise they just continue on forever.

Let’s try to call next() function on our list:

>>> next(basic_list)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 next(basic_list)
 
TypeError: 'list' object is not an iterator

We are understanding that iterators work by maintaining state as demonstrated above.

Let’s create a new class to see better these concepts in action:

import random
from string import ascii_lowercase
 
class RandomLetters:
    def __init__(self, count=5):
        self.count = count
 
    def __iter__(self):
        print("__iter__ called")
        self.i = 0
        return self
    
    def __next__(self):
        print("__next__ called")
        if self.i < self.count:
            self.i += 1
            return random.choice(ascii_lowercase)
        else:
            raise StopIteration

Now, let’s use a for loop on an instance of this class:

for l in RandomLetters():
    print(l)

Output:

__iter__ called
__next__ called
b
__next__ called
j
__next__ called
z
__next__ called
x
__next__ called
y
__next__ called

def __iter__(self): it stores the current iteration number and we’ll start from 0.
without the if self.i < self.count: statement, the code would be executed foerver indefinetely
raise StopIteration: as we’ve seen before, the only way the for loop knew how to exit the iteration was beacuse of the StopIteration exception.

From the output above, you can see that the __iter__ method is called at the beginning. This happens because when the for loop starts, it calls the __iter__ method to get the iterator. As the loop progresses, the __next__ method is called to retrieve the first value, then called again to get the second value, and so on until it reaches the end.

If you know generators, you can notice that the behaviour seems similar and actually it is. Technically, all generators follow the iterator protocol, but not all iterators are generators.

Something like this behaviour is more easily defined as a generator, so why would you choose defining your own interator instead of just defining a generator? Iterators in this form provide a little bit more control, especially if you want to work with more complex data or if you want to do more complex state management. So, as new example, we’re going to define a new class:

class RandomPoints:
    def __init__(self, l_bound=0, u_bound=100, count=10) -> None:
        self.l_bound = l_bound
        self.u_bound = u_bound
        self.count = count
 
    def __iter__(self):
        self.i = 0
        return self
    
    def __next__(self):
        if self.i < self.count:
            self.i += 1
            return (
                random.randint(self.l_bound, self.u_bound),
                random.randint(self.l_bound, self.u_bound)
            )
        else:
            raise StopIteration

This class is quite similar to the previous one, except for the fact that now we’re returning two values instead of a single value. Let’s use this class in a for loop:

for x, y in RandomPoints():
    print(f"{x=}, {y=}")

Output:

x=53, y=4
x=39, y=60
x=90, y=2
x=39, y=24
x=18, y=16
x=54, y=94
x=46, y=96
x=69, y=65
x=54, y=89
x=25, y=40

Quartz 4

Explorer

18. Iterables and Iterators

Graph View