Big O notation is a powerful tool, but it’s not the only tool for comparing algorithms. Sometimes, two competing algorithms can have the same Big O efficiency, yet one is still faster in practice. This chapter explores how to identify and select the faster algorithm in such situations.

Selection Sort

Selection Sort is another sorting algorithm that aims to arrange an array in ascending order, much like Bubble Sort, which was $O (N^{2})$ .

The fundamental idea of Selection Sort is to make passes through the unsorted portion of the array to find the absolute smallest value and then swap it into its correct, final position at the beginning of the unsorted section.

The steps are:

Search for Lowest: Conduct a pass-through of the unsorted portion of the array (starting from the current index). As you check each cell, keep track of the index that holds the lowest value encountered so far.
Swap: Once the lowest value in the entire unsorted portion is found, swap that value with the value at the index where the pass-through began.
Repeat: Repeat these pass-throughs, starting at the next index each time, until the entire array is sorted.

Selection Sort in Action

Let’s trace the sort of the example array $[4, 2, 7, 1, 3]$ .

First Pass-Through (Starts at index 0)

We look through the whole array $[4, 2, 7, 1, 3]$ to find the lowest value.
We find that 1 is the lowest value (at index 3).
We swap the lowest value (1) with the starting value (4 at index 0).
Result: $[1, 2, 7, 4, 3]$ . The value 1 is now sorted.

Second Pass-Through (Starts at index 1)

We look through the unsorted portion $[2, 7, 4, 3]$ to find the lowest value.
The lowest value found is 2 (at index 1).
Since 2 is already at the starting index (index 1), no swap is performed.
Result: $[1, 2, 7, 4, 3]$ . The value 2 is now sorted.

Third Pass-Through (Starts at index 2)

We look through the unsorted portion $[7, 4, 3]$ .
We find the lowest value is 3 (at index 4).
We swap the lowest value (3) with the starting value (7 at index 2).
Result: $[1, 2, 3, 4, 7]$ . The value 3 is now sorted.

Fourth Pass-Through (Starts at index 3)

We look through the final unsorted portion $[4, 7]$ .
The lowest value found is 4 (at index 3).
Since 4 is already at the starting index, no swap is performed.
Result: $[1, 2, 3, 4, 7]$ . The array is fully sorted.

Code Implementation: Selection Sort

def selection_sort(array):
	for i in range(len(array) - 1):
		lowest_number_index = i
		for j in range(i + 1, len(array)):
			if array[j] < array[lowest_number_index]:
				lowest_number_index = j
		if lowest_number_index != i:
			array[i], array[lowest_number_index] = array[lowest_number_index], array[i]
	return array

The Python code for selection_sort(array) uses nested loops to accomplish the sorting:

The outer loop (for i in range(len(array) - 1):) represents the pass-throughs, starting one index further each time. The variable i is the index where the smallest value will be placed.
lowest_number_index = i: This sets the starting assumption for the lowest value’s location.
The inner loop (for j in range(i + 1, len(array)):) is the search that checks the rest of the array (j starts from $i + 1$ ) to find the true lowest value. If a lower value is found (if array[j] < array[lowest_number_index]:), its index is stored in lowest_number_index.
After the inner search loop completes, the code checks if a swap is needed (if lowest_number_index != i:). If the lowest value is not already in the starting spot (i), the values at $i$ and lowest_number_index are swapped.

The function continues until the outer loop completes, returning the sorted array.

The Efficiency of Selection Sort

Selection Sort involves comparisons (searching for the minimum value) and swaps (moving the minimum into position).

The total number of comparisons decreases in each pass-through because the elements at the beginning of the array are already sorted and don’t need to be re-checked. Looking back at our example array that contains five elements, we had to make a total of 10 comparisons. Let’s break it down in the following table:

Pass-Through #1: #4 comparisons
Pass-Through #2: #3 comparisons
Pass-Through #3: #2 comparisons
Pass-Through #4: #1 comparisons

In general:

For an array of $N$ elements, the total number of comparisons is the sum of $(N - 1) + (N - 2) + (N - 3) + \dots + 1$ .
For a 5-element array, this is $4 + 3 + 2 + 1 = 10$ comparisons.

Comparison to Bubble Sort

Selection Sort is highly efficient when it comes to swaps. In each pass-through, it performs a maximum of one swap (it either swaps the lowest found value into the starting position or performs zero swaps if the starting value was already the minimum). This contrasts sharply with Bubble Sort, which might perform a swap for every single comparison in the worst-case scenario. Here’s a side-by-side comparison of Bubble Sort and Selection Sort:

While the maximum total steps for Selection Sort (comparisons + swaps) are roughly half the maximum total steps for Bubble Sort, the algorithms are classified identically in Big O notation.

Ignoring Constants

Despite Selection Sort being about twice as fast as Bubble Sort, both are described in Big O notation as $O (N^{2})$ . This is due to a major rule of Big O: Big O notation ignores constants.

This rule means that Big O expressions never include regular numbers (constants) that are not exponents. We simply drop them from the expression.

Selection Sort takes approximately $\frac{N ^{2}}{2}$ steps, but the $\frac{1}{2}$ is dropped, resulting in $O (N^{2})$ .
An algorithm that takes $\frac{N}{2}$ steps is expressed as $O (N)$ .
An algorithm that takes $N^{2} + 10$ steps is expressed as $O (N^{2})$ (the 10 is dropped).
Even an algorithm 100 times slower than $O (N)$ , described as $O (100 N)$ , is simply expressed as $O (N)$ .

At first glance, this rule seems to make Big O useless, as it means two algorithms (like Selection Sort and Bubble Sort) can be described identically, even if one is significantly faster. The reason for this simplification lies in the “soul of Big O”—its concern with the rate of growth as $N$ approaches infinity, rather than the exact step count at small $N$ .

Big O Categories

Big O notation simplifies efficiency down to general categories of algorithm speeds.

Think of it like buildings: a single-family house and a skyscraper. While you could count the exact number of floors, it’s more useful to categorize them as “house” versus “skyscraper” to signify their vast difference in scale and function.

Similarly, when comparing an $O (N)$ algorithm with an $O (N^{2})$ algorithm, the difference is so vast that it doesn’t matter if the $O (N)$ algorithm is actually $O (2 N)$ or $O (N /2)$ or even $O (100 N)$ .

$O (N)$ and $O (N^{2})$ are separate categories because of the Soul of Big O: the concern with the long-term trajectory of steps as data increases.

$O (N)$ tells a story of straight growth (linear growth), even if the steps are $100 N$ . The relationship between data and steps is proportional.
$O (N^{2})$ tells a different story: exponential (or quadratic) growth.

Le’t see how $O (N^{2})$ becomes slower than various factor of $N$ :

Exponential growth is fundamentally different from linear growth. As data increases, $O (N^{2})$ will inevitably become slower than $O (N)$ multiplied by any factor.

Therefore, multiplying or dividing the number of steps by a regular number (a constant) does not change the Big O category. $O (2 N)$ is simply classified into the general category of $O (N)$ .

However, it’s important to remember that when two algorithms fall under the same Big O classification (like Bubble Sort and Selection Sort, both $O (N^{2})$ ), further analysis is required to determine which one is truly faster, as one might contain a smaller constant (Selection Sort is twice as fast).

A Practical Example

Let’s look at two algorithms that print all even numbers starting from 2 up to an upper_limit ( $N$ ):

# Version 1
def print_numbers_version_one(upper_limit):
	number = 2
	while number <= upper_limit:
			if number % 2 == 0:
				print(number)
			number += 1
 
# Version 2
def print_numbers_version_two(upper_limit):
	number = 2
	while number <= upper_limit:
			print(number)
			number += 2

Version One: Checks every number and uses an if statement to print evens.
- Takes about $N$ steps (one loop iteration for almost every number up to $N$ ).
- Big O: $O (N)$ .
Version Two: Only iterates over even numbers (increments by 2).
- Takes $N /2$ steps (half the steps of version one).
- Big O: Although $O (N /2)$ , we drop the constant ( $\frac{1}{2}$ ), so it’s also $O (N)$ .

Even though both versions are $O (N)$ , the second version is twice as fast and is the better choice. This again shows that Big O is great for general categorization, but not for precise speed comparison within the same category.

Significant Steps

When expressing the Big O of an algorithm, how do we determine which steps are significant? Consider print_numbers_version_one again. Within the loop, multiple operations occur:

Comparison: if number % 2 == 0 (runs $N$ times)
Printing: print(number) (runs $N /2$ times)
Incrementing: number += 1 (runs $N$ times)

The total number of steps is $N + N + N /2 = 2.5 N$ steps.

The answer is that all steps are significant. However, when expressing the Big O, we drop the constant 2.5. This simplifies the expression to $O (N)$ . By dropping the constant, we effectively focus on the factor that changes with $N$ (the number of times the loop runs), rather than the exact count of operations within the loop.

Quartz 4

Explorer

05. Optimizing Code with and Without Big O