The efficiency of an algorithm is determined by the number of steps it takes. However, we can’t label an algorithm with a single step count (like “22 steps”) because the number of steps usually varies depending on the size of the data.

For example, linear search takes as many steps as there are elements in the array. If an array has $N$ elements, linear search takes $N$ steps. This is easier to express using a standard, concise language called Big O notation, which computer scientists use to easily categorize and communicate the efficiency, or time complexity, of algorithms.

Big O: How Many Steps Relative to N Elements?

Big O notation focuses on the number of steps an algorithm takes in a worst-case scenario and how that number relates to the size of the data, which is called $N$ . The core question Big O answers is: If there are N data elements, how many steps will the algorithm take?

The answer to this “key question” is placed inside the parentheses of the Big O expression.

O(N): Linear Time

Let’s look at linear search:

Key Question: If there are $N$ elements, how many steps will linear search take?
Answer: It will take $N$ steps (one step for every element).
Big O Expression: O(N)

An algorithm that is O(N) is said to have linear time because the number of steps increases directly in proportion to the number of data elements.

O(1): Constant Time

Now consider reading from a standard array.

Key Question: If there are $N$ elements, how many steps will reading from an array take?
Answer: It always takes just one step, regardless of $N$ .
Big O Expression: O(1)

An O(1) algorithm is considered the fastest kind of algorithm. It is called constant time because the number of steps remains constant (e.g., 1, 2, or 3), no matter how much the data size ( $N$ ) increases.

The Soul of Big O

Big O notation does more than just give a step count; it answers the key question: if there are $N$ data elements, how many steps will the algorithm take? However, there is a deeper purpose, which is the soul of Big O: how will an algorithm’s performance change as the data increases?

Because Big O is concerned with this rate of change, we don’t distinguish between an algorithm that takes 1 step ( $O (1)$ ) and an algorithm that takes 3 steps ( $O (3)$ ). They are both classified as O(1). This is because both algorithms are the same type—their number of steps remains constant and is not affected by an increase in data.

In contrast, $O (N)$ is a different type of algorithm because its performance is affected by the data. The steps increase in direct proportion to the data. This means $O (N)$ tells a story about the proportional relationship between the size of the data and the algorithm’s efficiency.

$O (N)$ is plotted as a perfect diagonal line . For every additional piece of data, the algorithm takes one additional step.
$O (1)$ is plotted as a perfect horizontal line. No matter how much data there is, the number of steps remains constant.

Deeper into the Soul of Big O

This concern for the rate of change explains why $O (N)$ is always considered less efficient than $O (1)$ , even if the $O (1)$ algorithm takes many steps.

Imagine an $O (1)$ algorithm that always takes 100 steps, and an $O (N)$ algorithm.

For small data sets (fewer than 100 elements), the $O (N)$ algorithm actually takes fewer steps.
At exactly 100 elements, they both take 100 steps.
The key point is: for all arrays greater than 100, the $O (N)$ algorithm takes more steps and the difference keeps growing.

Because there will always be a point (100 elements in this example) where the $O (N)$ algorithm begins to take more steps and will remain less efficient up to an infinite amount of data, $O (N)$ is considered less efficient overall than $O (1)$ , regardless of the $O (1)$ algorithm’s constant step count (even if it were one million steps).

An Algorithm of the Third Kind

We need a way to describe the efficiency of binary search using Big O notation. It’s not $O (1)$ (constant steps) because the steps increase as the data grows, but it’s much faster than $O (N)$ (linear steps).

Binary search is described as having a time complexity of: $O (lo g N)$ . This is pronounced “Oh of log N” and is also known as log time.

$O (lo g N)$ is the Big O way of describing an algorithm that increases by just one step each time the data is doubled, which is exactly how binary search works.

So far, you’ve learned three categories of algorithms, ordered from most efficient to least efficient:

O(1): Constant Time
O( $lo g N$ ): Log Time
O(N): Linear Time

The graph shows how $O (lo g N)$ curves slightly upward, making it less efficient than $O (1)$ but dramatically more efficient than $O (N)$ .

Logarithms

To understand $O (lo g N)$ , we need to understand logarithms (log). Logarithms are the inverse of exponents. For example, $2^{3} = 8$ . The logarithm $l o g_{2} 8$ asks: how many times do you have to multiply 2 by itself to get 8? The answer is 3.

A simpler way to think about it in computer science is: $l o g_{2} 8$ asks how many times you need to halve 8 until you end up with 1.

$8/2/2/2 = 1$

Since you halved it three times, $l o g_{2} 8 = 3$ . Similarly, to halve 64 until you get 1, you need to do it six times, so $l o g_{2} 64 = 6$ .

O(log N) Explained

In computer science, when we write $O (lo g N)$ , it is shorthand for $O (l o g_{2} N)$ ; the small 2 is usually omitted. The core Big O question is: if there are $N$ data elements, how many steps will the algorithm take?

$O (lo g N)$ means the algorithm will take $l o g_{2} N$ steps.

If $N = 8$ , the algorithm takes 3 steps.
This perfectly matches binary search, which works by repeatedly dividing (halving) the data until it narrows the search down to 1 element (that is, the alternative way we described the logarithms in the previous chapter.

Simply put, $O (lo g N)$ means the algorithm takes as many steps as it takes to keep halving the data elements until only 1 remains. This efficiency is striking because the $O (lo g N)$ algorithm only takes one additional step each time the data size is doubled, while an $O (N)$ algorithm takes the same number of steps as the data size.

Practical Examples

Example 1: Printing All Items

things = ['apples', 'baboons', 'cribs', 'dulcimers']
for thing in things:
    print("Here's a thing: " + thing)

This simple code is an algorithm that prints all items in a list. The for loop will run once for every element in the list. To analyze its Big O:

The number of steps increases proportionally with the number of elements in the list.
If the list has $N$ elements, the loop runs $N$ times.
Therefore, this is a classic example of O(N).

Example 2: Determining if a Number is Prime

def is_prime(number):
    for i in range(2, number):
        if number % i == 0:
            return False
    return True

In this case, the $N$ in our key question is the number itself that is passed into the function, not an array size. The size of this number directly affects how many times the loop runs.

If you pass in the number 7, the loop runs approximately 7 times.
If you pass in 101, the loop runs approximately 101 times.
Since the number of steps increases in lockstep with the input number ( $N$ ), this algorithm is also classified as O(N).

Quartz 4

Explorer

03. O Yes! Big O Notation