This chapter introduces the concept of node-based data structures, where data elements (nodes) are dispersed throughout the computer’s memory. The linked list is the simplest example of this structure and forms the foundation for more complex ones. While linked lists function like arrays, their underlying structure leads to different performance trade-offs.

Linked Lists

A linked list represents a list of items, but unlike an array, its data elements are not stored in a contiguous block of memory but instead they’re scattered across different cells throughout the computer’s memory.

The core components of a linked list are:

Node: A piece of data that represents one item in the list.
Link: An extra piece of information within the node that contains the memory address (pointer) of the next node in the sequence.

Note that:

The first node in the list is called the head.
The last node in the list is called the tail. Its link is set to None to signify the end of the list.

The computer can traverse the entire list by starting at the head and following the links from one node to the next. This scattered memory allocation is a potential advantage over arrays, which must find large, contiguous blocks of empty memory.

Implementing a Linked List

Linked lists are typically implemented using two classes: Node and LinkedList.

The Node class holds the actual data and the pointer to the next node:

class Node:
    def __init__(self, data):
        self.data = data
        self.next_node = None # The link/pointer to the next Node instance

In this implementation, the next_node attribute points directly to another Node object, which achieves the same effect as pointing to a memory address.

The LinkedList class acts as the handle for the entire structure, as it only needs to know where the list begins:

import node
 
class LinkedList:
    def __init__(self, first_node: Node | None = None):
        self.first_node = first_node # Stores a reference to the head node

Once a LinkedList instance is created (e.g., list = LinkedList(node_1)), the program has immediate access only to its head. To reach any other node, the program must start at the head and sequentially follow the links. This constraint has major ramifications for the time complexity of the list’s operations.

Next, we’ll compare the performance of linked lists against arrays for the classic operations: reading, searching, insertion, and deletion.

Reading from a Linked List

The efficiency of reading an element from a linked list is a major contrast to an array:

Array Reading: An array can be read in $O (1)$ (constant time) because its data is stored contiguously in memory, allowing the computer to instantly calculate the memory address of any index.
Linked List Reading: A linked list, where nodes are dispersed throughout memory, takes $O (N)$ (linear time) in the worst case. To read the value of the $k$ -th item in a linked list, the computer must start at the head (the only node it knows) and sequentially follow the link from one node to the next until it reaches the desired node.

If you want to read the last node in a list of $N$ nodes, it will require $N$ steps to traverse the entire list. This means the worst-case read performance for a linked list is $O (N)$ . This $O (N)$ reading efficiency is a significant disadvantage when compared to the array’s $O (1)$ read time.

Code Implementation: Linked List Reading

The read method for the LinkedList class demonstrates this sequential traversal:

def read(self, index):
    current_node = self.first_node
    current_index = 0
    while current_index < index:
        current_node = current_node.next_node # Follow the link to the next node
        current_index += 1
    if not current_node:
        return None
    return current_node.data

Initialization:
- current_node is set to self.first_node (the head).
- current_index is set to 0.
Traversal Loop: The while loop runs until current_index matches the requested index.
- In each iteration, the program moves to the next node: current_node = current_node.next_node.
- If the loop executes $k$ times to reach the $k$ -th node, $k$ steps have been taken.
End Check: The line if not current_node: handles cases where the requested index is beyond the length of the list. If current_node is the last node, current_node.next_node is None. When this None is assigned back to current_node, the check triggers, and the function returns None.
Return Value: If the loop finishes, current_node.data returns the value of the element at the requested index.

Searching

Searching for a value in a linked list, like a linear search on an array, requires inspecting each element sequentially. Therefore, the search speed for a linked list is also $O (N)$ (linear time), where $N$ is the number of nodes.

Since a linked list only provides immediate access to its head, to find a target value, the algorithm must begin at the head and follow the link from node to node, checking the data in each node until a match is found or the end of the list is reached.

Code Implementation: Linked List Search

The search method encapsulates this traversal and comparison logic:

def search(self, value):
    current_node = self.first_node
    current_index = 0
    while True:
        if current_node.data == value:
            return current_index
        
        current_node = current_node.next_node
        
        if not current_node:
            break
            
        current_index += 1
        
    return None

Initialization: current_node starts at the head (self.first_node), and current_index starts at 0.
Comparison: Inside the while True loop, the current_node.data is immediately compared to the value. If they match, the current index is returned.
Traversal: If no match is found, the loop prepares for the next iteration by moving to the next node: current_node = current_node.next_node.
End Check: The check if not current_node: determines if the end of the list (where the link is None) has been reached. If so, the loop breaks, and the function returns None, signifying the value was not found.

The algorithm stops either when the value is found (returning the index) or when the entire list has been traversed (returning None).

Insertion

Insertion is the operation where linked lists gain a performance advantage over arrays, particularly when inserting at the beginning:

Array (Worst Case): $O (N)$ - requires shifting all elements.
Linked List (Best Case): $O (1)$ - requires only creating a new node and updating the head pointer.

Consider that you have the following linked list and you want to insert a new node (“yellow”) at the beginning of the list:

In this case, only two pointer manipulations are needed:

The new node’s link is set to point to the current head (“blue”).
The LinkedList head pointer is updated to point to the new node (“yellow”).

Since the program always has immediate access to the head, this takes just one step ( $O (1)$ ).

While the act of creating a new node and updating two links is an $O (1)$ operation, the “gotcha” is accessing the insertion point.

To insert a node at index $k$ , the program must first find the node at index $k - 1$ .
As established in the reading analysis, accessing a node at a given index requires sequentially traversing the list, which takes $O (N)$ steps in the worst case (inserting at the end).

Suppose you want to insert “purple” at index 2:

In this case, adding “purple” took three steps: two steps to access index 1 and one step to insert the new node. Therefore, practically speaking, insertion anywhere other than the head is dominated by the traversal time:

$O (N) to find the spot + O (1) to insert = O (N)$

The best- and worst-case insertion scenarios for arrays and linked lists are opposites:

Scenario	Array	Linked List
Insert at beginning	Worst case ( $O (N)$ )	Best case ( $O (1)$ )
Insert at middle	Average case ( $O (N)$ )	Average case ( $O (N)$ )
Insert at end	Best case ( $O (1)$ )	Worst case ( $O (N)$ )

This makes linked lists the superior choice for applications that frequently add elements to the start of the list.

Code Implementation: Linked List Insertion

The insert method handles the two main cases: $O (1)$ insertion at the head and $O (N)$ insertion elsewhere.

def insert(self, index, value):
    new_node = node.Node(value)
    
    # Case 1: Insertion at the head (O(1))
    if index == 0:
        new_node.next_node = self.first_node # New node points to old head
        self.first_node = new_node            # Head points to new node
        return
        
    # Case 2: Insertion anywhere else (O(N) due to traversal)
    current_node = self.first_node
    current_index = 0
    
    # Traverse to the node *before* the insertion spot
    while current_index < (index - 1):
        current_node = current_node.next_node
        current_index += 1
        
    # Link manipulation (O(1) operation)
    new_node.next_node = current_node.next_node # New node links to the node *after* current_node
    current_node.next_node = new_node           # Current node links to the new node

Node Creation: A new_node is created with the provided value.
Case 1: Insert at Index 0 ( $O (1)$ ):
- The method checks if index == 0.
- If true, it performs a fast $O (1)$ insertion by setting the new_node.next_node to the current head (self.first_node).
- It then updates the list’s head to the new_node (self.first_node = new_node) and returns immediately.
Case 2: Insert Elsewhere ( $O (N)$ ):
- If the index is not 0, the method initiates a traversal loop.
- It iterates through the list until current_node points to the node located immediately before the target insertion index (i.e., it stops at index - 1).
- Relinking (Insertion): Once the preceding node is found, two pointer manipulations finalize the insertion: a. The new_node is linked to the rest of the list: new_node.next_node = current_node.next_node. b. The current_node (the predecessor) is linked to the new_node: current_node.next_node = new_node.

The use of the separate if index == 0: block ensures that the most efficient operation for linked lists (head insertion) is executed without unnecessary logic.

Deletion

Linked lists excel at deletion from the beginning of the list, providing a clear advantage over arrays in this specific scenario.

The efficiency analysis for deletion is identical to insertion because the time complexity is dominated by the time required to access the node(s) involved in the operation.

Scenario	Array	Linked List
Delete at beginning	Worst case ( $O (N)$ )	Best case ( $O (1)$ )
Delete at middle	Average case ( $O (N)$ )	Average case ( $O (N)$ )
Delete at end	Best case ( $O (1)$ )	Worst case ( $O (N)$ )

To delete the first node (“once”), the operation is simple and fast. Since we have direct access to the first_node (the head), we only need to perform one step:

$list.first_node = second_node$

This changes the list’s starting point to the second node, effectively removing the original head in $O (1)$ time. Contrast this with an array, where deleting the first element requires shifting all remaining $N - 1$ elements, taking $O (N)$ time.

When deleting a node in the middle or at the end, the process requires:

Traversal ( $O (N)$ ): Traversing the list to find the node immediately preceding the one to be deleted.
Relinking ( $O (1)$ ): Changing the preceding node’s link to bypass the node being deleted.

For example, to delete “purple”, we find “blue” (the preceding node) and change its link to point directly to “green” (the subsequent node). The $O (N)$ traversal time dominates the overall efficiency.

Note on Memory: When a node is deleted, it remains in memory until the programming language’s garbage collection process detects that it is no longer being referenced by any active link and frees the memory.

Code Implementation: Linked List Deletion

The delete method uses a simple conditional structure to handle the two cases:

def delete(self, index):
	if index == 0:
		self.first_node = self.first_node.next_node # Head is reassigned to the second node
		return
		
	current_node = self.first_node
	current_index = 0
	while current_index < (index - 1):
		current_node = current_node.next_node
		current_index += 1
	node_after_deleted_node = current_node.next_node.next_node
	current_node.next_node = node_after_deleted_node

Case 1: Delete at Index 0 ( $O (1)$ ): This is the highly efficient, one-step head deletion.
Case 2: Delete Anywhere Else ( $O (N)$ ):
- The code traverses the list to find the node before the one to be deleted (current_node).
- It then identifies the node after the deleted node using the “trick” node_after_deleted_node = current_node.next_node.next_node. This jumps past the node intended for deletion.
- Finally, it performs the deletion by updating the link with current_node.next_node = node_after_deleted_node

The $O (1)$ relinking effectively bypasses and removes the target node from the logical list structure.

Efficiency of Linked List Operations

The comparison of the fundamental operations between Arrays and Linked Lists shows that linked lists are often equal or worse in terms of raw Big O efficiency:

Operation	Array	Linked List
Reading	$O (1)$	$O (N)$
Search	$O (N)$	$O (N)$
Insertion	$O (N)$ ( $O (1)$ at end)	$O (N)$ ( $O (1)$ at beginning)
Deletion	$O (N)$ ( $O (1)$ at end)	$O (N)$ ( $O (1)$ at beginning)

While the overall $O (N)$ complexity for insertion and deletion seems “lackluster,” the power of the linked list is in the fact that the actual relinking step is always $O (1)$ . This $O (1)$ advantage becomes relevant in specific scenarios where the traversal time is already necessary or if you can access the relevant node in constant time.

Linked Lists in Action

Linked lists shine when you need to traverse a list and perform many deletions or insertions along the way, such as when cleaning a list of invalid email addresses.

When iterating through an array and deleting an element (e.g., an invalid email address), the following occurs:

Read/Traverse: $O (N)$ steps to inspect all $N$ elements.
Deletion Cost: Each deletion requires an additional $O (N)$ steps to shift all subsequent elements to the left to close the gap.

Example: In a list of 1,000 emails with 100 invalid addresses:

Total Steps (Array) $\approx 1000$ (Reads) $+ (100 \times up to 1000 shifts) \approx 101000 steps$ (Worst case).

With a linked list, as the algorithm traverses the list (an unavoidable $O (N)$ process), the deletion step itself is performed locally in $O (1)$ time by simply adjusting the links of the neighboring nodes.

Read/Traverse: $O (N)$ steps to inspect all $N$ elements.
Deletion Cost: Each deletion takes only $O (1)$ because no data shifting is required.

Example: In the same list of 1,000 emails with 100 invalid addresses:

Total Steps (Linked List) $\approx 1, 000$ (Reads) $+ 100$ (Deletions at $O (1)$ each) $\approx 1, 100 steps$ .

In scenarios requiring multiple deletions or insertions during a full list traversal, the linked list’s ability to modify the structure without shifting data provides a massive performance gain over an array.

Quartz 4

Explorer

14. Node-Based Data Structures