The Single Responsibility Principle

The Single Responsibility Principle (SRP) is a fundamental rule in software design. It says that every part of the code—especially a class—should have just one responsibility. In simpler terms, a class should only do one specific job, and it should only change if there’s a change in that particular area of the software.

When a class takes on too many tasks, it’s often called a God object—it tries to do everything and becomes too complex and hard to manage. This makes the code harder to read, maintain, or extend in the future.

To fix this, we should break the class into smaller, more focused classes, each handling one responsibility.

The SRP is connected to another idea in software design called cohesion. Cohesion means that the parts of a class (its methods and attributes) should be closely related and used together. If that’s not the case—if different parts of the class never interact or feel unrelated—it’s a sign the class might be handling multiple concerns and should be split.

One practical test for SRP: Look at the class and its methods. If some methods don’t relate to each other at all, or they serve different purposes, it’s a clue that the class is doing more than one job. Those responsibilities should probably be moved into separate classes.

A class with too many responsibilities

This section builds on the Single Responsibility Principle by showing a practical example of what happens when a class tries to do too much.

The example is a class named SystemMonitor, which has three methods:

class SystemMonitor:
    def load_activity(self):
        """Get the events from a source, to be processed."""
    def identify_events(self):
        """Parse the source raw data into events (domain objects)."""
    def stream_events(self):
        """Send the parsed events to an external agent."""

At first glance, it may seem like a useful class that handles all the tasks related to event monitoring. But there’s a problem: each method does something very different, and they don’t rely on each other. These are orthogonal actions—meaning they could happen independently and aren’t tightly connected.

This is a design flaw. Why? Because each method represents a separate responsibility. That means the class could need to change for multiple different reasons:

  • If we decide to change how data is loaded from the source, the load_activity() method changes.
  • If we need to adjust how events are parsed, identify_events() changes.
  • If we change the way events are sent somewhere else, stream_events() changes.

All of these are unrelated reasons for the class to change. That’s a violation of SRP.

Take the load_activity() method as an example. Imagine we change the data source or the structure used to store that data. It doesn’t make sense that this would affect the whole SystemMonitor class. Why should the system monitor itself need to change just because we altered a detail about data loading?

The same applies to parsing or streaming events. All these tasks should live in different, specialized classes, each with its own responsibility. Right now, the SystemMonitor is like a “kitchen sink” class that handles too many jobs. This makes it:

  • Fragile – A small change might break something unrelated.
  • Hard to maintain – Too many reasons to change.
  • Inflexible – Difficult to reuse or extend parts of the class without touching others.

The solution is to refactor this large class into smaller, focused abstractions. Each new class would take care of just one part of the process (e.g., loading, parsing, or streaming), following SRP and making the code easier to maintain and evolve.

Distributing responsibilities

To make the software easier to maintain, the solution is to separate each responsibility into its own class. Instead of having one big class that does everything (like the earlier SystemMonitor), we split the responsibilities into smaller classes, each handling a single, specific task.

Even though we might still need an object (like a coordinator or manager) that brings all these smaller classes together and interacts with them, the key idea is that each class is self-contained. Each one has only the logic it needs and does not depend on how others work.
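As a rough illustration of this split, the sketch below gives each responsibility its own class and adds a small coordinator. ActivityWatcher is the only class name taken from these notes; Output, AlertSystem, the method names, and the in-memory sample data are assumptions made for the example, not the book's implementation.

```python
class ActivityWatcher:
    """Loads raw activity records (here from an in-memory list)."""

    def __init__(self, source):
        self.source = source

    def load(self):
        return list(self.source)


class SystemMonitor:
    """Turns a raw record into an event description; knows nothing about I/O."""

    def identify_event(self, record):
        return {"type": record.get("type", "unknown"), "raw": record}


class Output:
    """Delivers identified events (here it only collects them)."""

    def __init__(self):
        self.sent = []

    def stream(self, events):
        self.sent.extend(events)


class AlertSystem:
    """Hypothetical coordinator that wires the three collaborators together."""

    def __init__(self, watcher, monitor, output):
        self.watcher = watcher
        self.monitor = monitor
        self.output = output

    def run(self):
        records = self.watcher.load()
        events = [self.monitor.identify_event(record) for record in records]
        self.output.stream(events)


output = Output()
AlertSystem(ActivityWatcher([{"type": "login"}, {}]), SystemMonitor(), output).run()
```

Swapping the in-memory list for a file or a queue would only touch ActivityWatcher, as long as load() keeps returning a list of records.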

Why this approach is better:

  • If something changes—like the way we load events—only the related class needs to be updated.
  • The rest of the system (like alerting or streaming data) is not aware of these changes and it remains untouched, as long as the interface (how they communicate) stays the same.
  • This keeps the impact of changes local and small, making the system less fragile and much easier to maintain.

Another benefit is that the new classes are more reusable. For example, if in another part of the app we want to read logs for a different reason, we can reuse the new class that handles just the log-reading part (e.g., ActivityWatcher). This wouldn’t be possible in the old design because the class included unrelated methods like identify_events() or stream_events()—methods we wouldn’t need in that new context.

Important note: Following SRP doesn’t mean every class must have only one method. A class can have several methods, as long as they all deal with the same responsibility or concern.

Final thoughts and practical advice:

  • You don’t have to perfectly follow SRP from the very beginning. That’s not the goal.
  • Instead, think of SRP as a guideline or thought process.
  • When designing a class that seems to be doing too many things, it’s a sign you should think about splitting it into smaller parts.
  • A good way to figure out how to split responsibilities is to start by writing a larger, monolithic class. Once you see how the parts interact internally, you’ll have a better idea of how to break it down into smaller, more logical pieces.

The open/closed principle

The Open/Closed Principle (OCP) is one of the fundamental ideas in writing clean and maintainable code. It says that a software entity (a class, a function, or even a whole module) should be open for extension, but closed for modification.

Let’s break that down:

  • Open for extension means we should be able to add new behaviors or features without touching the existing code.
  • Closed for modification means that once code is written and tested, you shouldn’t have to edit it every time requirements change.

The main goal of this principle is to improve maintainability. When the rules or features of our software domain change (which happens a lot in real life), we want to be able to adapt to those changes by adding code, not by changing code that already works. This reduces the risk of introducing bugs into tested and stable parts of the system.

If we often need to modify the original code to handle new cases, that’s a sign that our design could be improved—it may not be following the Open/Closed Principle properly.

This concept can be applied to different levels of software: individual classes, modules, or larger components. The basic idea stays the same: design your code so that it grows by adding, not by rewriting.

Example of maintainability perils for not following the OCP

In this example, the book shows a system that does not follow the Open/Closed Principle (OCP). The goal is to identify the problems that come with this kind of design—mainly, that it’s hard to maintain and inflexible when new features are needed.

We have a system that monitors events happening in another system. The events are described by some data (in the form of dictionaries), and a component tries to identify the type of event based on that data. The code is structured with a class called SystemMonitor, which uses the data to decide whether an event is a login, logout, or unknown:

There is a simple class hierarchy:

  • A base Event class, with subclasses:
    • LoginEvent
    • LogoutEvent
    • UnknownEvent (used when the type can’t be determined)

The logic inside the SystemMonitor.identify_event() method tries to detect the type of event based on simple session flags:

  • If a session changes from 0 to 1 → it’s a login
  • If it changes from 1 to 0 → it’s a logout
  • Otherwise → it’s unknown

Here’s the corresponding code:

from dataclasses import dataclass

@dataclass
class Event:
    raw_data: dict

class UnknownEvent(Event):
    """A type of event that cannot be identified from its data."""

class LoginEvent(Event):
    """An event representing a user that has just entered the system."""

class LogoutEvent(Event):
    """An event representing a user that has just left the system."""

class SystemMonitor:
    """Identify events that occurred in the system."""

    def __init__(self, event_data):
        self.event_data = event_data

    def identify_event(self):
        # Every supported event type needs its own branch in this method.
        if (
            self.event_data["before"]["session"] == 0
            and self.event_data["after"]["session"] == 1
        ):
            return LoginEvent(self.event_data)
        elif (
            self.event_data["before"]["session"] == 1
            and self.event_data["after"]["session"] == 0
        ):
            return LogoutEvent(self.event_data)
        return UnknownEvent(self.event_data)

When we run this code, it behaves as expected. For example:

>>> l1 = SystemMonitor({"before": {"session": 0}, "after": {"session": 1}})
>>> l1.identify_event().__class__.__name__
'LoginEvent'

So far, it seems to work. But here’s the design problem: even though the events form a class hierarchy, the identify_event() method is not designed for growth. All the logic to detect the event type is packed into one method using a chain of if and elif conditions. This creates multiple problems:

  • Violation of the Open/Closed Principle: Every time a new type of event is introduced, we have to modify the identify_event() method. That’s exactly what OCP says we should avoid.
  • Poor scalability: As more event types are added, the method will get longer and more complex. It becomes harder to read, understand, and test. This makes the code fragile and harder to maintain.
  • Lack of separation of concerns: The method is doing too much. It’s responsible for checking many different conditions and choosing which subclass to use. This goes against the principle of writing code that does one thing and does it well.
  • Harder to test and reuse: Because all event-detection logic is in one place, it’s harder to test new behavior in isolation.
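To make the first point concrete, here is a sketch of what supporting one more event type would look like in this design. The TransactionEvent class and its "transaction" key serve only as an illustration; the point is that the body of identify_event(), which was already written and tested, has to be edited.

```python
from dataclasses import dataclass


@dataclass
class Event:
    raw_data: dict


class UnknownEvent(Event):
    """Cannot be identified from its data."""


class LoginEvent(Event):
    """A user has just entered the system."""


class LogoutEvent(Event):
    """A user has just left the system."""


class TransactionEvent(Event):
    """Illustrative new event type."""


class SystemMonitor:
    def __init__(self, event_data):
        self.event_data = event_data

    def identify_event(self):
        before = self.event_data["before"]
        after = self.event_data["after"]
        if before["session"] == 0 and after["session"] == 1:
            return LoginEvent(self.event_data)
        elif before["session"] == 1 and after["session"] == 0:
            return LogoutEvent(self.event_data)
        # New requirement: this branch had to be added to existing code.
        elif after.get("transaction") is not None:
            return TransactionEvent(self.event_data)
        return UnknownEvent(self.event_data)


monitor = SystemMonitor(
    {"before": {"session": 1}, "after": {"session": 1, "transaction": "Tx001"}}
)
identified = monitor.identify_event()
```

Each further event type would add another branch to the same method, which is the growth pattern the list above warns about.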

One good thing in this design is that instead of returning None when no event is recognized, it returns an instance of UnknownEvent. This follows the null object pattern, which avoids having to constantly check for None in the rest of the code. This pattern will be explained more in Chapter 9.

What we want is:

  • to add new types of events without having to change the identify_event method (closed for modification);
  • to be able to support new types of events (open for extension) by adding code, not by changing the code that already exists.

Refactoring the events system for extensibility

In the original version:

  • The SystemMonitor class had all the logic to detect events inside a single method.
  • It directly used specific event types (LoginEvent, LogoutEvent, etc.).
  • Every time a new event type was added, we had to modify the identify_event() method, making the system harder to maintain.

To follow the OCP, we need to design the system so that:

  • It interacts with abstractions (not specific classes).
  • The system is open for adding new features, but closed for modifying existing code.

So, instead of having SystemMonitor contain all the decision logic, we now delegate that responsibility to each specific event class. Here’s what changes:

  1. A common interface (a base class Event) is defined.
  2. Each specific event subclass (LoginEvent, LogoutEvent) implements a polymorphic method called meets_condition() to determine if it matches the data.
  3. The SystemMonitor doesn’t decide anything itself. Instead, it loops through all event subclasses and asks them: “Do you match this data?”

Here’s the new code:

class Event:
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @staticmethod
    def meets_condition(event_data):
        return False  # Default, override in subclasses

class UnknownEvent(Event):
    """A type of event that cannot be identified from its data."""

class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"]["session"] == 0
            and event_data["after"]["session"] == 1
        )

class LogoutEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"]["session"] == 1
            and event_data["after"]["session"] == 0
        )

class SystemMonitor:
    """Identify events that occurred in the system."""

    def __init__(self, event_data):
        self.event_data = event_data

    def identify_event(self):
        for event_cls in Event.__subclasses__():
            try:
                if event_cls.meets_condition(self.event_data):
                    return event_cls(self.event_data)
            except KeyError:
                # The data lacks a key this event type needs, so it cannot match.
                continue
        return UnknownEvent(self.event_data)

Why This Design Is Better

  • Closed for modification: The identify_event() method no longer needs to change when new events are added.
  • Open for extension: To support a new event, we just create a new class that extends Event and defines its own meets_condition() logic.
  • Decoupled logic: Each class is responsible for its own behavior. No long if/elif chains anymore.
  • Polymorphism: All event classes follow the same interface (meets_condition), making the system flexible and consistent.

How Events Are Discovered

In this example, the system finds all possible event types using Python’s built-in __subclasses__() method. While this is enough to demonstrate the idea, in real-world projects you might use more advanced patterns like:

  • Class registries
  • The abc module (for abstract base classes)

But no matter the method, the core idea stays the same: keep the system’s behavior generic and delegate specific rules to specialized classes.
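As one possible shape for the class-registry option, the sketch below has each subclass register itself through __init_subclass__, a standard Python hook that runs whenever a subclass is defined. The registry attribute, the SessionEvent intermediate class, and the module-level identify_event() function are assumptions for illustration, not the book's design. One practical difference: __subclasses__() lists only direct subclasses, while a registry like this one also records subclasses of subclasses.

```python
class Event:
    registry: list = []  # hypothetical: every Event subclass defined so far

    def __init__(self, raw_data):
        self.raw_data = raw_data

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Event.registry.append(cls)

    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return False


class UnknownEvent(Event):
    """Fallback; inherits meets_condition(), so it never matches."""


class SessionEvent(Event):
    """Hypothetical intermediate class; it never matches on its own."""


class LoginEvent(SessionEvent):
    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return (
            event_data["before"].get("session") == 0
            and event_data["after"].get("session") == 1
        )


def identify_event(event_data: dict) -> Event:
    """Ask every registered class whether it matches the data."""
    for event_cls in Event.registry:
        if event_cls.meets_condition(event_data):
            return event_cls(event_data)
    return UnknownEvent(event_data)


found = identify_event({"before": {"session": 0}, "after": {"session": 1}})
```

Here LoginEvent is found through the registry even though it is not a direct subclass of Event.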

Extending the events system

Now we’ll see how easy it is to add a new feature—without modifying any existing code.

Let’s say the system needs to support a new kind of event: one that represents a user transaction. This means whenever a transaction happens in the monitored system, it should be detected and classified as a TransactionEvent.

Since we’re following the Open/Closed Principle, we don’t have too much to change. To support this new requirement, we only have to do one thing: create a new class that extends the Event base class and defines its own detection logic. Here’s the new class:

class TransactionEvent(Event):
    """Represents a transaction that has just occurred on the system."""
    @staticmethod
    def meets_condition(event_data: dict):
        return event_data["after"].get("transaction") is not None

Why This Is a Good Design

The most important points here are:

  • The rest of the system remains untouched.
  • The SystemMonitor.identify_event() method works just as before, because it loops through all subclasses of Event and checks their meets_condition() method.

In other words:

  • identify_event() is closed for modification—we didn’t need to change it.
  • The event system is open for extension—we were able to add a new class to handle a new case.
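The following self-contained sketch puts the pieces from this section together so the claim can be checked by running it. The classes are condensed copies of the ones above (LogoutEvent is omitted for brevity), and the sample payloads are made up for the example.

```python
class Event:
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return False


class UnknownEvent(Event):
    """Fallback when no other event matches."""


class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return (
            event_data["before"]["session"] == 0
            and event_data["after"]["session"] == 1
        )


class SystemMonitor:
    def __init__(self, event_data):
        self.event_data = event_data

    def identify_event(self):
        for event_cls in Event.__subclasses__():
            try:
                if event_cls.meets_condition(self.event_data):
                    return event_cls(self.event_data)
            except KeyError:
                continue
        return UnknownEvent(self.event_data)


# Added later, without touching anything above this line.
class TransactionEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return event_data["after"].get("transaction") is not None


login = SystemMonitor({"before": {"session": 0}, "after": {"session": 1}})
transaction = SystemMonitor({"before": {}, "after": {"transaction": "Tx001"}})
login_name = type(login.identify_event()).__name__
transaction_name = type(transaction.identify_event()).__name__
```

Note that when a payload satisfies more than one event type, the first matching subclass in definition order wins, so overlapping conditions would need an explicit priority rule.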

Final Thoughts on the Open/Closed Principle

This last part reflects on the deeper meaning and practical implications of the Open/Closed Principle in real-world software design.

1. OCP and Polymorphism Go Hand-in-Hand.

OCP is strongly tied to polymorphism:

  • To design systems that follow OCP, we need to work with abstractions.
  • These abstractions must define a polymorphic contract, meaning different components (like event types) can be used interchangeably as long as they implement the same interface or method.
  • The client code (like the SystemMonitor) should be able to rely on this contract, without needing to know the details of each implementation.

In simpler words: we work with general structures, and the specific behavior is handled by specialized classes that “plug into” the system.

2. The Real Goal: Maintainability

At its core, OCP helps solve a big problem in software: maintainability. When we don’t follow OCP:

  • Small changes in requirements can lead to changes all across the codebase.
  • This creates ripple effects—a change in one place causes unexpected problems in others.
  • It becomes risky and time-consuming to add new features or fix bugs.

3. It’s Not Always Easy or Possible

The author also gives a realistic perspective: while OCP is a great goal, it’s not always achievable everywhere. Sometimes:

  • One abstraction works well for some types of changes, but not for others.
  • It’s hard to find a single design that’s “closed” against all possible future modifications.

Liskov’s Substitution Principle

Liskov’s Substitution Principle (LSP) is a rule in object-oriented programming that helps ensure that our code stays reliable and easy to maintain when using inheritance (i.e., creating subclasses from a parent class).

The core idea is this: if you have a class (let’s say Class T), and you create another class (Class S) that inherits from it, then you should be able to use Class S anywhere you use Class T—without breaking anything or changing how the program behaves. The program should continue to work just as expected.

In practical terms:

  • If you have some code (a client class) that depends on a certain type (like an interface or an abstract class), that code should work perfectly with any subclass of that type.
  • The client shouldn’t need to know whether it’s dealing with the base type or a specific subtype. Everything should behave the same way, following the same rules.

This principle ties in with other important ideas in software design:

  • Designing for interfaces: It’s better to program using interfaces or abstract classes rather than specific implementations. This makes the code more flexible.
  • Design by contract: Think of the class and the client as having a contract. As long as subclasses keep the promises made by the parent class (for example, not changing how a method is supposed to work), then everything stays consistent.

Detecting LSP issues with tools

This section explains how static analysis tools like mypy and pylint can help us find problems in our code related to the Liskov Substitution Principle (LSP)—which, as you’ve seen, says that subclasses should behave in a way that doesn’t break what the parent class promises.

Using mypy to detect incorrect method signatures

When we use type annotations and run mypy, it can automatically detect when a subclass doesn’t follow the correct method signature of its parent class.

1st Violation Type of LSP. Let’s consider:

class Event:
    def meets_condition(self, event_data: dict) -> bool:
        return False
 
class LoginEvent(Event):
    def meets_condition(self, event_data: list) -> bool:
        return bool(event_data)

Here, the subclass LoginEvent changes the parameter type from dict to list. This breaks LSP because a client expecting to pass a dict to an Event should not suddenly get an error when that Event turns out to be a LoginEvent.

  • mypy will raise an error: it notices that the method signature has changed in an incompatible way.
  • This kind of change makes the subclass unusable in place of the parent, which defeats the purpose of inheritance and breaks polymorphism.

2nd Violation Type of LSP. Also, if you change the return type (e.g., from bool to str), it would be a problem too. Clients expect a Boolean value, and changing it would break the “contract” of the method.

A good tip: sometimes the data types used (like list vs dict) are not the same but they may still work logically if they share a common interface (so they share behaviour, like both being iterables). In such cases, maybe the problem is just the type annotation. You might fix it by using a union or a more abstract type like Iterable, instead of changing the logic.
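A minimal sketch of that tip, assuming the subclass really only needs to iterate over the data: annotating both methods with the abstract Iterable type keeps the two signatures compatible. The "login" marker checked here is a made-up condition for illustration.

```python
from collections.abc import Iterable


class Event:
    def meets_condition(self, event_data: Iterable) -> bool:
        return False


class LoginEvent(Event):
    # Same parameter type as the parent, so the override stays compatible.
    def meets_condition(self, event_data: Iterable) -> bool:
        return any(item == "login" for item in event_data)


from_list = LoginEvent().meets_condition(["login", "other"])
from_dict = LoginEvent().meets_condition({"login": 1})  # iterates over the keys
```

Both a list and a dict satisfy the annotation, so clients can pass either one to any Event.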

Furthermore, don’t use # type: ignore to bypass errors. These tools are showing you real design problems. Fix them properly by refactoring the code.

LSP also makes sense from an object-oriented design perspective. Remember that subclassing should create more specific types, but each subclass must be what the parent class declares.

Detecting incompatible signatures with pylint

3rd Violation Type of LSP. Another way LSP can be broken is when a subclass changes the method signature more drastically, like adding an extra parameter:

class LogoutEvent(Event):
    def meets_condition(self, event_data: dict, override: bool) -> bool:
        ...

This kind of mismatch might not be caught right away. Python is interpreted, so there is no compile step that checks method signatures ahead of time, and the error only surfaces when the call is executed:

  • That’s why static analyzers like pylint are useful.
  • pylint will catch this issue (mypy also reports this kind of error, so running both tools is worthwhile) and show a message like:
    Parameters differ from overridden 'meets_condition' method (arguments-differ)

Just like with mypy, the recommendation is never to ignore these warnings. They’re pointing to real violations of LSP that can cause errors in the future, especially when the code is extended or maintained.
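To see why this matters at runtime, consider a sketch of client code written against the parent's signature. The helper name matches() is hypothetical. The call works for Event but fails for a LogoutEvent like the one above, because the required override argument is never supplied.

```python
class Event:
    def meets_condition(self, event_data: dict) -> bool:
        return False


class LogoutEvent(Event):
    # Incompatible override: adds a required parameter.
    def meets_condition(self, event_data: dict, override: bool) -> bool:
        return override


def matches(event: Event, event_data: dict) -> bool:
    """Client code: it only knows the signature declared by Event."""
    return event.meets_condition(event_data)


data = {"before": {}, "after": {}}
base_result = matches(Event(), data)

try:
    matches(LogoutEvent(), data)
    failed = False
except TypeError:
    # The call is missing the required positional argument 'override'.
    failed = True
```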

More subtle cases of LSP Violations

Sometimes, it’s not easy for tools to automatically find when the Liskov Substitution Principle (LSP) is broken. We need to carefully check the code, especially when the rules (contracts) of how classes should work are changed in their subclasses.

The main idea of LSP is that you should be able to use a subclass instead of its parent class without problems. This also means that the contracts defined by the parent class must be followed by its subclasses.

Think back to the idea of “design by contract”, discussed in Chapter 3, where the contract between the client (the code using a class or a method) and the supplier (the class or method being used) sets some rules:

  • The client must provide certain things (preconditions) to the method.
  • The supplier might check these preconditions.
  • The supplier then returns a result that the client will check (postconditions).

In an object-oriented hierarchy (when a class has subclasses), the parent class defines a contract. The subclasses must follow it, or else we break the LSP. More specifically:

  • A child class cannot make the preconditions stricter than they are in the parent class. If the parent says “you need to give me a number,” the child can’t say “you need to give me a positive number.”
  • A child class cannot make the postconditions weaker than they are in the parent class. If the parent says “I will return a positive number,” the child can’t just return any number (positive or negative).
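The first rule can be seen in a small sketch that is separate from the event example. Both classes and the client function are hypothetical: the parent accepts any number, while the child rejects values that the parent's contract allows.

```python
class Scaler:
    """Contract: accepts any number and returns it doubled."""

    def scale(self, value: float) -> float:
        return value * 2


class PositiveScaler(Scaler):
    """Violates LSP: strengthens the precondition to positive numbers only."""

    def scale(self, value: float) -> float:
        if value <= 0:
            raise ValueError("value must be positive")
        return value * 2


def client(scaler: Scaler) -> float:
    # A valid call under the parent's contract, which allows any number.
    return scaler.scale(-3)


parent_result = client(Scaler())

try:
    client(PositiveScaler())
    substitutable = True
except ValueError:
    substitutable = False
```

A type checker sees nothing wrong here because the signatures are identical; the violation lives in the behavior, which is why it takes a careful reading of the code to find.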

To explain this idea, let’s consider a hierarchy of event classes as an example:

  • There’s a base class Event with a method validate_precondition() that checks if the input data (event_data) is a dictionary containing the keys "before" and "after", and that both of these are dictionaries too.
  • This validation means that clients (like another class called SystemMonitor) can safely assume that if these preconditions are met, they don’t need to worry about catching exceptions like KeyError.
  • The SystemMonitor class calls Event.validate_precondition() once. Then it checks each subclass of Event to see which one matches the event type using a method called meets_condition().

Here’s the code representing the client (SystemMonitor class) and the superclass of the supplier (Event class):

from collections.abc import Mapping
 
class Event:
	def __init__(self, raw_data):
		self.raw_data = raw_data
 
	@staticmethod
	def meets_condition(event_data: dict) -> bool:
		return False
	
	@staticmethod
	def validate_precondition(event_data: dict):
		"""Precondition of the contract of this interface.
		Validate that the ``event_data`` parameter is properly formed.
		"""
		if not isinstance(event_data, Mapping):
			raise ValueError(f"{event_data!r} is not a dict")
		for moment in ("before", "after"):
			if moment not in event_data:
				raise ValueError(f"{moment} not in {event_data}")
			if not isinstance(event_data[moment], Mapping):
				raise ValueError(f"event_data[{moment!r}] is not a dict")
 
 
class SystemMonitor:
	"""Identify events that occurred in the system."""
	
	def __init__(self, event_data):
		self.event_data = event_data
		
	def identify_event(self):
		Event.validate_precondition(self.event_data)
		event_cls = next(
			(
				event_cls
				for event_cls in Event.__subclasses__()
				if event_cls.meets_condition(self.event_data)
			),
			UnknownEvent,
		)
		return event_cls(self.event_data)

A subclass like TransactionEvent is written correctly:

class TransactionEvent(Event):
    """Represents a transaction that has just occurred on the system."""
    @staticmethod
    def meets_condition(event_data: dict):
        return event_data["after"].get("transaction") is not None

It checks if "transaction" is in the "after" data using .get(), which is safe because .get() won’t raise a KeyError if the key is missing.

The problem arises when a subclass demands more than what the base class contract defines. That’s the case for subclasses like LoginEvent or LogoutEvent:

class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"]["session"] == 0 and
            event_data["after"]["session"] == 1
        )
 
class LogoutEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"]["session"] == 1 and
            event_data["after"]["session"] == 0
        )

  • These classes are using square bracket syntax (like event_data["after"]["session"]), which does raise a KeyError if "session" is missing.
  • This breaks the contract, because the base class doesn’t say that "session" must be present. So, subclasses are demanding something extra, which violates LSP.
  • After changing those subclasses to use .get("session"), the contract is respected again, and the polymorphism works as expected: you can use any subclass of Event in the same way, without surprises or crashes.

Here’s the corrected version of these two classes:

class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"].get("session") == 0 and
            event_data["after"].get("session") == 1
        )
 
class LogoutEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict):
        return (
            event_data["before"].get("session") == 1 and
            event_data["after"].get("session") == 0
        )
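A quick check of the corrected behavior, using made-up payloads: data that satisfies the base precondition but has no "session" key now yields False instead of raising KeyError.

```python
class Event:
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return False


class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return (
            event_data["before"].get("session") == 0
            and event_data["after"].get("session") == 1
        )


# Well-formed under the base contract ("before" and "after" are mappings),
# but without a "session" key.
no_session = {"before": {}, "after": {"transaction": "Tx001"}}
is_login_without_session = LoginEvent.meets_condition(no_session)

is_login = LoginEvent.meets_condition(
    {"before": {"session": 0}, "after": {"session": 1}}
)
```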

Remarks on the LSP

This short section highlights why the Liskov Substitution Principle (LSP) is so important in object-oriented programming (OOP). At its core, the LSP helps make sure that our use of polymorphism—one of the key features of OOP—is safe and correct.

Key idea: LSP supports polymorphism

Polymorphism means that you can use a subclass wherever a parent class is expected, and everything will still work as intended. The LSP makes sure that this kind of substitution is reliable.

So when we design a class that inherits from another, we must ensure it acts in a compatible way with the parent. Otherwise, problems arise when a client tries to use the subclass like it used the base class—and things break because the subclass behaves differently or expects different things.

Connection with the Open/Closed Principle (OCP)

This section also links the LSP to another key design principle called the Open/Closed Principle (OCP), which says that:

Software should be open for extension but closed for modification.

In other words, we should be able to add new features by extending existing code, not by changing code that already works (especially code used by clients).

But here’s the problem:

  • If we extend a class incorrectly (violating LSP), the new subclass might not behave the same as the parent.
  • This will break the expectations of the client code that uses it.
  • Then, to fix it, we might have to change the client code, which violates the OCP (because we’re modifying what should have been closed).

So, breaking LSP also leads to breaking OCP, which is a serious design flaw.

Interface Segregation

The Interface Segregation Principle (ISP) is all about keeping interfaces small and focused. It says that an interface shouldn’t try to do too many things. Instead, it should be broken into smaller, more specific parts—ideally, each interface should have just one or a few related methods.

An interface is represented by the set of methods and behaviors that a class exposes.

The interface separates the definition of a class’s exposed behavior from its implementation. Basically, it separates what to do from how to do it.

This idea has already come up several times in clean code discussions, but ISP puts it into a formal principle.

In traditional object-oriented languages like Java or C#, an interface is explicitly defined. But in Python, things are more flexible:

  • Python uses duck typing. That means an object’s type is defined by what it can do, not by its official class or what it inherits from. This means that, regardless of the type of the class, its name, docstring, class attributes, or instance attributes, what ultimately defines the essence of the object is the methods it has.
  • In other words, “if it walks like a duck and quacks like a duck, it’s a duck.” If a class has the right methods, Python considers it valid for use—even if it doesn’t formally declare that it implements an interface.

This makes interfaces implicit in Python: they’re not always declared in a strict way, but they’re defined by behavior.

Later, Python introduced “Abstract Base Classes” (ABCs) as another way to define interfaces more explicitly. ABCs allow you to define a basic set of methods that derived classes must implement. This is useful for ensuring that important behaviors are actually provided by subclasses and can also affect how functions like isinstance() work.

For example, if we have a general Event class, we might want to make it an abstract base class to indicate that we should only work with specific event types (like LoginEvent). The Event class then acts as an interface, defining a common set of behaviors that all event types should have. We can even use @abstractmethod to force subclasses to provide their own implementation of certain methods, like meets_condition. This is useful when you want to guarantee that all subclasses follow a certain contract.
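A minimal sketch of that idea, assuming meets_condition() stays a static method as in the earlier examples: the base class becomes abstract, so it can no longer be instantiated directly, and every concrete event must supply its own implementation.

```python
from abc import ABC, abstractmethod


class Event(ABC):
    def __init__(self, raw_data):
        self.raw_data = raw_data

    @staticmethod
    @abstractmethod
    def meets_condition(event_data: dict) -> bool:
        """Return True if event_data describes this kind of event."""


class LoginEvent(Event):
    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return (
            event_data["before"].get("session") == 0
            and event_data["after"].get("session") == 1
        )


try:
    Event({})
    instantiated = True
except TypeError:
    # A class with unimplemented abstract methods cannot be instantiated.
    instantiated = False

login = LoginEvent({"before": {"session": 0}, "after": {"session": 1}})
```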

Python also supports virtual subclasses, a looser version of inheritance. The abc module lets you register a class as a “virtual subclass” of an ABC. This extends duck typing a bit further: an object can be considered a certain type if it has the right methods or if it has been explicitly registered as belonging to that type:

  • A class can be registered as part of an abstract hierarchy without formally inheriting from it.
  • This combines duck typing with a bit more structure, so now it’s “walks like a duck, quacks like a duck, or says it’s a duck.”
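Registration can be sketched with the register() method that every ABC provides. The AuditRecord class is hypothetical. Note that registering is a declaration only: Python does not verify that the registered class actually implements the abstract methods.

```python
from abc import ABC, abstractmethod


class Event(ABC):
    @staticmethod
    @abstractmethod
    def meets_condition(event_data: dict) -> bool:
        """Return True if event_data describes this kind of event."""


class AuditRecord:
    """Hypothetical class that does not inherit from Event."""

    @staticmethod
    def meets_condition(event_data: dict) -> bool:
        return "audit" in event_data


before_registration = issubclass(AuditRecord, Event)
Event.register(AuditRecord)
after_registration = issubclass(AuditRecord, Event)
```

After the call, isinstance() and issubclass() treat AuditRecord as an Event, although Event does not appear among its base classes.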

Now, back to the Interface Segregation Principle (ISP) itself:

  • If you have an interface with lots of methods, it’s better to split it up into smaller, more focused interfaces (preferably just one method).
  • For example, rather than one large class implementing all methods of a general-purpose interface, each class should only implement what it truly needs.

Why this is good:

  • Improves reusability – small, well-defined behaviors can be reused more easily.
  • Increases cohesion – each class does one job and does it well.

So, in short: keep interfaces small and precise, and your code will be easier to maintain, understand, and extend.

An interface that provides too much

Imagine we want to create events from different types of data, like XML and JSON. A common practice is to depend on an interface rather than a specific class. We might initially design a single interface with methods to parse from both XML and JSON.

To create this in Python, we’d use an abstract base class with abstract methods from_xml() and from_json(). Any event class that inherits from this would have to implement both methods.

However, what if a specific event class only needs to be created from JSON data and doesn’t care about XML? It would still inherit the from_xml() method from the interface and would have to provide some implementation, even if it does nothing. This isn’t very flexible because it creates a dependency and forces classes to deal with methods they don’t need.
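The problem can be sketched as follows. The class name JSONOnlyEvent and the choice to raise NotImplementedError are assumptions made for illustration; the point is only that the class is forced to carry a method it has no use for.

```python
import json
from abc import ABC, abstractmethod


class EventParser(ABC):
    """A single interface that bundles two unrelated parsing formats."""

    @abstractmethod
    def from_xml(self, xml_data: str):
        """Parse an event from XML."""

    @abstractmethod
    def from_json(self, json_data: str):
        """Parse an event from JSON."""


class JSONOnlyEvent(EventParser):
    """Hypothetical event that is only ever created from JSON."""

    def from_json(self, json_data: str) -> dict:
        return json.loads(json_data)

    def from_xml(self, xml_data: str):
        # Required by the interface even though it is never needed here.
        raise NotImplementedError("This event is never created from XML")


parsed = JSONOnlyEvent().from_json('{"type": "login"}')
```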

The smaller the interface, the better

A better approach is to split this single large interface into two smaller, more specific ones: one for parsing XML and one for parsing JSON.

We can still have a single EventParser class that handles both XML and JSON by implementing both of these smaller interfaces, which works because Python supports multiple inheritance.

The advantage of this design is that each method is now declared in a more focused interface. If we only need to parse XML somewhere else in our code, we can just depend on the XMLEventParser interface.

Here’s how this might look in Python code:

from abc import ABCMeta, abstractmethod

class XMLEventParser(metaclass=ABCMeta):
    @abstractmethod
    def from_xml(self, xml_data: str):
        """Parse an event from a source in XML representation."""

class JSONEventParser(metaclass=ABCMeta):
    @abstractmethod
    def from_json(self, json_data: str):
        """Parse an event from a source in JSON format."""

class EventParser(XMLEventParser, JSONEventParser):
    """An event parser that can create an event from source data either
    in XML or JSON format.
    """
    def from_xml(self, xml_data: str):
        pass  # XML parsing logic goes here

    def from_json(self, json_data: str):
        pass  # JSON parsing logic goes here

Notice that the EventParser class must implement the abstract methods from both XMLEventParser and JSONEventParser. If it doesn’t, Python will raise a TypeError when you try to create an instance of EventParser.
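To illustrate that check, the sketch below defines a hypothetical IncompleteParser that implements only one of the two abstract methods and then tries to instantiate it.

```python
from abc import ABC, abstractmethod


class XMLEventParser(ABC):
    @abstractmethod
    def from_xml(self, xml_data: str):
        """Parse an event from XML."""


class JSONEventParser(ABC):
    @abstractmethod
    def from_json(self, json_data: str):
        """Parse an event from JSON."""


class IncompleteParser(XMLEventParser, JSONEventParser):
    """Hypothetical parser that forgets to implement from_json()."""

    def from_xml(self, xml_data: str):
        return xml_data


message = None
try:
    IncompleteParser()
except TypeError as error:
    # The error text names the abstract method that is still missing.
    message = str(error)
```

The class definition itself succeeds; the error is raised only when an instance is created.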

This principle is similar to the Single Responsibility Principle (SRP), but it applies to interfaces (abstract definitions of behavior) rather than concrete classes. An interface on its own contains no behavior, so it has no reason to change until it is actually implemented. However, violating ISP leads to interfaces that are tied to unrelated functionality. Consequently, classes that implement such bloated interfaces will likely violate SRP as well, because they will have multiple reasons to change.

How small should an interface be?

The idea of small interfaces is important, but the real goal is cohesion – an interface should have a clear and single purpose. While the previous example of splitting XML and JSON parsing was valid because those were distinct tasks, an interface doesn’t need to have only one method. Sometimes, several methods are inherently linked and necessary for a specific behavior. For instance, a context manager requires both __enter__ and __exit__ to function correctly. Separating such related methods would break the intended functionality. Therefore, the size of an interface should be guided by logical grouping and the need to avoid forcing implementing classes to depend on irrelevant methods.
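The context manager case can be sketched with a small class; ManagedResource and its log attribute are hypothetical names used only to show that the two methods work as a pair.

```python
class ManagedResource:
    """Hypothetical resource whose setup and cleanup belong together."""

    def __init__(self) -> None:
        self.log = []

    def __enter__(self):
        self.log.append("acquired")
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.log.append("released")
        return False  # do not suppress exceptions raised in the block


resource = ManagedResource()
with resource:
    resource.log.append("working")
```

Neither method is useful without the other, so splitting them into two one-method interfaces would gain nothing.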

Dependency Inversion (DIP)

The Dependency Inversion Principle (DIP) is a powerful concept in clean code. It helps you protect your code by making it independent of things that change often or are outside your control (like specific tools or libraries).

Instead of your code changing to fit these details, you want the details (the concrete implementations) to change to fit your code. This is done by using abstractions, which are like general blueprints or rules. That’s why this principle is called dependency inversion: because we’re inverting the dependency relationship.

Here’s a simple way to think about it:

  • Imagine you have two parts of your code, A and B.
  • Normally, A uses B directly. But if B is an external tool or something outside your control, changes in B can break A.
  • To “invert” the dependency, you create an interface (a set of rules or methods).
  • Now, A doesn’t depend on the specific B; it depends on this interface.
  • It’s B’s job to follow the rules of that interface.

In Python, interfaces are often created using abstract base classes (ABCs). These act as flexibility points, meaning your system can change or grow without you having to change the core abstract parts.

A Case of Rigid Dependencies

Imagine you have an EventStreamer that sends event data to Syslog (a specific data collector):

  • This is a bad design because EventStreamer (a high-level part) depends directly on Syslog (a low-level detail).
  • If Syslog changes, or if you want to send data to a different place (like email), you’d have to constantly change your EventStreamer code. This makes your code rigid and hard to change.

The solution to these problems is to make EventStreamer work with an interface, rather than a specific, concrete class:

  • This means EventStreamer will no longer directly know about Syslog or any other specific way of sending data.
  • Instead, EventStreamer will interact with a general concept of a “data target” that has a method for sending data.
  • It is then the responsibility of the low-level classes (like Syslog or an email sender) to implement this interface. They must adapt to the rules defined by the interface.

Now, let’s break down how this inversion works:

  • There is a new interface that represents a generic place where data can be sent (e.g., let’s call it DataTargetClient).
  • The dependencies are now “inverted”: EventStreamer no longer depends on a specific data target (like Syslog). This means EventStreamer does not have to change if the way data is sent to Syslog changes, or if you decide to use a completely different data target.
  • Instead, each specific data target (like Syslog or an email service) must implement this DataTargetClient interface correctly (that is, by implementing the .send() method). If there are changes, they are the ones that need to adapt to the interface’s requirements.

Even though Python allows you to pass any object with a .send() method (called duck typing), defining an ABC like DataTargetClient is a good practice because:

  • It makes your code more readable and helps people understand the relationships between your classes (e.g., “Syslog is a DataTargetClient”).
  • It helps you create a cleaner design and avoid common mistakes that Python’s flexibility might otherwise allow.
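A minimal sketch of such an interface might look like this. The Syslog and EmailSender classes here are stand-ins that only record what they receive, the content parameter type is an assumption, and deliver() is a hypothetical helper that represents any high-level code.

```python
from abc import ABC, abstractmethod


class DataTargetClient(ABC):
    """Generic place where data can be sent."""

    @abstractmethod
    def send(self, content: str) -> None:
        """Deliver already-serialised content to the target."""


class Syslog(DataTargetClient):
    """Stand-in for a real syslog client (no actual I/O)."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, content: str) -> None:
        self.sent.append(f"syslog: {content}")


class EmailSender(DataTargetClient):
    """Stand-in for an email client (no actual I/O)."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, content: str) -> None:
        self.sent.append(f"email: {content}")


def deliver(target: DataTargetClient, content: str) -> None:
    # High-level code depends only on the abstraction.
    target.send(content)


syslog, email = Syslog(), EmailSender()
deliver(syslog, "user logged in")
deliver(email, "user logged in")
```

Adding a new target means writing another subclass; deliver() stays untouched.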

Dependency Injection

In the previous section, we learned to avoid direct reliance on specific implementations (like Syslog) by using abstractions (like the DataTargetClient interface). This makes our code more flexible, allowing us to easily add new clients (like email senders) as long as they implement the send method. This idea follows the “open for extension, closed for modification” principle.

Now, let’s look at how these dependencies are actually provided to the objects that need them.

One simple way to provide a dependency would be for an object to create it directly. For example, EventStreamer could create its own Syslog object:

class EventStreamer:
    def __init__(self):
        self._target = Syslog()  # Direct creation of Syslog
        
    def stream(self, events: list[Event]) -> None:
        for event in events:
            self._target.send(event.serialise())

However, this design is not very flexible. It forces EventStreamer to always use Syslog. It also makes testing harder because you’d have to jump through hoops to replace Syslog with a test version. And if creating a Syslog has a side effect (such as opening a connection), that side effect happens every time EventStreamer is initialized, which might not be what you want.

A much better design uses dependency injection. Instead of EventStreamer creating its own dependency, the dependency is provided to it, typically through its __init__ method:

class EventStreamer:
    def __init__(self, target: DataTargetClient): # Dependency provided here
        self._target = target
 
    def stream(self, events: list[Event]) -> None:
        for event in events:
            self._target.send(event.serialise())

This approach has several key benefits:

  • Flexibility and Polymorphism: EventStreamer now works with any object that implements the DataTargetClient interface. You can pass a Syslog object, an email sender object, or any other compatible client at the time EventStreamer is created. This allows for polymorphism, meaning EventStreamer can work with different types of “targets” in a unified way.
  • Easier Testing: If you’re unit testing EventStreamer, you don’t need a real Syslog. You can easily provide a “test double” (a simple object that mimics the DataTargetClient interface) to simulate the behavior you need without dealing with actual system interactions.
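The testing benefit can be sketched with an in-memory test double. InMemoryTarget and FakeEvent are hypothetical helpers written for this example, and the interface definition is repeated so the snippet runs on its own.

```python
from abc import ABC, abstractmethod


class DataTargetClient(ABC):
    @abstractmethod
    def send(self, content: str) -> None:
        """Deliver already-serialised content to the target."""


class EventStreamer:
    def __init__(self, target: DataTargetClient) -> None:
        self._target = target

    def stream(self, events) -> None:
        for event in events:
            self._target.send(event.serialise())


class InMemoryTarget(DataTargetClient):
    """Test double: records what it receives instead of sending it."""

    def __init__(self) -> None:
        self.received = []

    def send(self, content: str) -> None:
        self.received.append(content)


class FakeEvent:
    """Minimal event stand-in that exposes serialise()."""

    def __init__(self, payload: str) -> None:
        self.payload = payload

    def serialise(self) -> str:
        return self.payload


target = InMemoryTarget()
EventStreamer(target).stream([FakeEvent("login"), FakeEvent("logout")])
```

The test can then inspect target.received without any real Syslog or network connection being involved.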

Key takeaway: Don’t force your classes to create their dependencies directly in their __init__ method. Instead, let the dependencies be provided as arguments to the __init__ method, making your code more flexible and easier to test.

Managing Complex Dependencies with Libraries

When you have many objects with complex relationships, manually passing all dependencies can become messy. In such cases, you might consider using a dependency injection library (like pinject). These libraries help you:

  • Declare dependencies: You define how your objects depend on each other in a clear, central place.
  • Automate creation: The library then handles the actual creation and “wiring” of these objects for you, reducing repetitive “glue code.”

For instance, using pinject, you could define how EventStreamer gets its target:

import pinject

# ... EventStreamer class definition (same as the flexible version above) ...
 
class _EventStreamerBindingSpec(pinject.BindingSpec):
    def provide_target(self):
        return Syslog() # Here we define that the 'target' dependency is a Syslog
 
object_graph = pinject.new_object_graph(
    binding_specs=[_EventStreamerBindingSpec()])
 
# To get an EventStreamer with its dependencies automatically provided:
event_streamer = object_graph.provide(EventStreamer)
# This will give you an EventStreamer whose target is a Syslog instance.

This approach centralizes the creation logic and is especially useful for large projects, but it doesn’t remove the core flexibility of your EventStreamer class. You can still create EventStreamer instances manually by passing any DataTargetClient object.