Networking Protocols Introduction

Now that the foundational protocols—IP, TCP, and UDP—have been established, it’s important to recognize that virtually every protocol used in modern networking builds upon these three foundations. Unless operating at very low layers (below IP), all higher-level protocols leverage IP, TCP, or UDP as their transport mechanism.

Consider the protocols encountered daily in software development:

HTTP runs on top of TCP, which runs on top of IP.
DNS typically uses UDP (though it can fall back to TCP for large responses), which runs on top of IP.
SSH operates directly on TCP, which runs on top of IP.
HTTP/3 uses QUIC, a newer protocol built on UDP, which runs on top of IP.
TLS—the encryption protocol that secures most internet traffic—runs on TCP (though it can adapt to other transports), enabling HTTPS through the combination of HTTP over TLS over TCP over IP.

This layering is fundamental to internet architecture. Each protocol solves specific problems while delegating other concerns to lower layers. HTTP doesn’t worry about packet retransmission or ordering—TCP handles that. TCP doesn’t concern itself with routing across networks—IP manages that. This separation of concerns allows protocols to evolve independently and enables engineers to reason about systems at the appropriate abstraction level.
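As a rough illustration of the layering described above, the dependencies can be written as a small lookup table and walked from any protocol down to IP. The `RUNS_ON` table and the `stack_for` helper are names invented for this sketch, and the table encodes only the common case for each protocol (for example, DNS over UDP, ignoring its TCP fallback).

```python
# Each protocol maps to the layer it most commonly runs on.
# None marks the bottom of this sketch (nothing below IP is modeled).
RUNS_ON = {
    "HTTP": "TCP",
    "HTTPS": "TLS",
    "TLS": "TCP",
    "SSH": "TCP",
    "DNS": "UDP",
    "HTTP/3": "QUIC",
    "QUIC": "UDP",
    "TCP": "IP",
    "UDP": "IP",
    "IP": None,
}

def stack_for(protocol):
    """Return the chain of protocols from `protocol` down to IP."""
    chain = []
    current = protocol
    while current is not None:
        chain.append(current)
        current = RUNS_ON[current]
    return chain

print(" over ".join(stack_for("HTTPS")))
print(" over ".join(stack_for("HTTP/3")))
```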

Scope of This Section

The protocols discussed in this section each warrant extensive study. HTTP alone could fill hours of content covering its evolution from HTTP/1.0 through HTTP/1.1, HTTP/2, and HTTP/3, examining persistent connections, multiplexing, server push, and binary framing. DNS encompasses not just basic name resolution but also security extensions (DNSSEC), load balancing strategies, and the complex distributed architecture of the global DNS system. TLS involves cryptographic algorithms, certificate hierarchies, handshake protocols, and performance optimizations that have evolved over decades.

Rather than attempting exhaustive coverage that would fail to do justice to these protocols’ complexity and history, this section provides high-level overviews. Each protocol receives brief treatment—typically five to ten minutes—focusing on core concepts, typical use cases, and how it fits into the broader networking stack. The goal is not comprehensive understanding but rather establishing familiarity: what the protocol does, why it exists, and how it relates to other protocols.

DNS: Domain Name System

The Domain Name System is one of the few protocols exercised constantly throughout daily computing. Every time a URL is typed into a browser, the first action is a DNS query to resolve the domain name into an IP address. Only after obtaining this IP address can the system establish a TCP connection to the target server. The Internet Protocol operates exclusively with numeric IP addresses, making DNS the critical translation layer between human-readable names and machine-routable addresses.

Why DNS Exists

The fundamental problem DNS solves is that people cannot remember IP addresses. Expecting users to memorize numeric addresses like 142.250.185.46 for every website they visit is impractical. Moreover, IP addresses change frequently due to infrastructure updates, load balancing, and failover configurations. If a service runs on seven servers with seven different IP addresses, users cannot be expected to track and remember all of them.

DNS introduces a layer of abstraction: a domain name like www.example.com remains constant from the user’s perspective, while the underlying IP addresses can change freely. This abstraction provides several powerful capabilities beyond simple name-to-address mapping.

DNS enables load balancing at the name resolution level. When querying a domain, the DNS server can return multiple IP addresses. The client can then select which server to connect to, distributing load across multiple backends. Netflix, in its early architecture, relied heavily on this approach—DNS responses contained multiple IP addresses, and clients performed client-side load balancing by choosing which one to use. This distributes traffic without requiring a centralized load balancer.
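A minimal sketch of client-side selection, assuming the DNS answer has already been obtained as a list of addresses. The `pick_backend` helper and the addresses are hypothetical (192.0.2.0/24 is reserved for documentation), and uniform random choice is only one possible selection policy.

```python
import random

def pick_backend(addresses):
    """Choose one address uniformly at random from a DNS answer."""
    if not addresses:
        raise ValueError("DNS answer contained no addresses")
    return random.choice(addresses)

# Hypothetical A-record answer containing several backends.
answer = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
print(pick_backend(answer))
```

Because every client draws independently, traffic spreads roughly evenly across the backends without any component in the middle.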

Geographic DNS (Geo-DNS) enables location-aware routing. Content delivery networks (CDNs) like Cloudflare and Fastly operate servers worldwide: in India, Australia, the United States, Russia, and elsewhere. When a user in India queries a domain hosted on a CDN, the CDN’s DNS answers with the IP address of a server physically located in or near India. This minimizes latency by reducing the physical distance packets must travel. Lower latency translates directly to faster TCP handshakes, quicker TLS negotiations, and better overall performance. A connection with 5 milliseconds of latency performs dramatically better than one with 70 milliseconds, especially for protocols requiring multiple round-trips.
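The idea can be sketched as a lookup from client region to the nearest edge address, plus a rough estimate of why the latency difference matters. Everything here is an assumption for illustration: the region codes, the documentation-range addresses, and the figure of three round-trips (a TCP handshake, one TLS round-trip, one request and response), with the 5 ms and 70 ms figures read as round-trip times.

```python
# Hypothetical mapping from client region to the closest edge server.
EDGE_BY_REGION = {
    "IN": "192.0.2.1",
    "AU": "198.51.100.1",
    "US": "203.0.113.1",
}
DEFAULT_EDGE = "203.0.113.1"

def geo_answer(client_region):
    """Return the edge address a Geo-DNS server might hand to this client."""
    return EDGE_BY_REGION.get(client_region, DEFAULT_EDGE)

def setup_time_ms(rtt_ms, round_trips=3):
    """Time spent waiting on the network before the first response arrives."""
    return rtt_ms * round_trips

print(geo_answer("IN"))
print(setup_time_ms(5), setup_time_ms(70))
```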

DNS Records and Query Types

DNS stores more than just IP addresses. Multiple record types enable different kinds of queries:

A Records: Return IPv4 addresses
AAAA Records: Return IPv6 addresses
CNAME Records: Provide canonical name aliases, pointing one domain to another
MX Records: Specify mail server addresses
TXT Records: Store arbitrary text information, often used for domain verification and configuration
NS Records: Identify authoritative name servers for a domain
SRV Records: Define service locations, including port numbers

This flexibility makes DNS a general-purpose distributed directory service, not merely an address lookup system.

DNS Architecture: Hierarchical Resolution

DNS doesn’t operate as a single centralized database—such an approach would create an impossibly large table with billions of records and catastrophic performance. Instead, DNS uses hierarchical partitioning, progressively narrowing the search space through multiple query stages.

The resolution process involves several layers:

Root Servers: There are 13 root server identities, named A through M, each replicated extensively around the world. Root servers don’t know the IP addresses of individual domains. They only know which servers handle top-level domains (TLDs) like .com, .org, .io, and .engineering.
Top-Level Domain (TLD) Servers: These servers manage specific TLDs. A .com TLD server doesn’t store the IP addresses of every .com domain—that would require enormous replication. Instead, it stores references to authoritative name servers responsible for individual domains.
Authoritative Name Servers: These servers contain the actual DNS records for specific domains. An authoritative name server for google.com holds the definitive IP addresses and other records for that domain. By limiting each authoritative server’s scope to specific domains, the database remains small and query performance stays fast.

DNS Resolution Flow

Consider a client attempting to resolve google.com:

Client Query: The client sends a DNS query to its configured resolver (often the local router, or public resolvers like 8.8.8.8 for Google or 1.1.1.1 for Cloudflare). This resolver is provided as an IP address when the client connects to the network—it must be an IP address, not a hostname, to avoid circular dependencies.
Resolver Cache Check: The resolver first checks its cache. If it recently resolved this domain, it returns the cached result immediately. If not, it begins iterative resolution.
Root Server Query: The resolver queries a root server: “Which TLD server handles .com domains?”
The root server responds with the IP address of a .com TLD server.
TLD Server Query: The resolver queries the TLD server: “Which authoritative name server handles google.com?”
The TLD server responds with the IP address of Google’s authoritative name server.
Authoritative Server Query: The resolver queries Google’s authoritative name server: “What is the IP address of google.com?”
The authoritative server responds with the actual IP address.
Client Response: The resolver returns the IP address to the client, which can now establish a TCP connection.

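With no caching, this process involves four query/response exchanges (client to resolver, then resolver to the root, TLD, and authoritative servers), which is eight messages in total. Each query is sent over UDP to port 53 on the server, and the response returns from port 53 to the sender’s ephemeral source port.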
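The resolution flow can be simulated in memory to make the referral chain and the effect of caching concrete. This is a toy sketch, not a real resolver: the three dictionaries stand in for the root, TLD, and authoritative tiers, and every server name and address in it is hypothetical.

```python
# Toy stand-ins for the three server tiers.
# 192.0.2.0/24 is reserved for documentation.
ROOT = {"com": "com-tld-server"}
TLD_SERVERS = {"com-tld-server": {"example.com": "ns.example.com"}}
AUTHORITATIVE = {"ns.example.com": {"example.com": "192.0.2.80"}}

cache = {}

def resolve(name):
    """Return (address, upstream_queries) for `name`."""
    if name in cache:
        return cache[name], 0                      # cache hit: nothing to ask

    tld = name.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]                         # query 1: root gives a TLD referral
    name_server = TLD_SERVERS[tld_server][name]    # query 2: TLD gives the authoritative server
    address = AUTHORITATIVE[name_server][name]     # query 3: authoritative gives the answer
    cache[name] = address
    return address, 3

print(resolve("example.com"))   # cold cache: three upstream queries
print(resolve("example.com"))   # warm cache: answered locally
```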

UDP and Transaction IDs

DNS runs on UDP, a connectionless protocol, creating a challenge: how does a resolver match responses to requests? DNS solves this with transaction IDs. Each query carries a 16-bit transaction ID in the DNS header, and the response echoes this ID back. When a resolver sends multiple queries in parallel, it uses transaction IDs to correlate responses with the correct requests.

This stateless design enables high performance but also introduces security vulnerabilities, as we’ll discuss later.
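A minimal sketch of how a resolver might correlate responses with outstanding queries. The `pending` table and both function names are assumptions made for this example. A real resolver would also check the source address and port and expire old entries.

```python
import secrets
from typing import Optional

pending = {}   # transaction ID -> name that was asked

def send_query(name: str) -> int:
    """Register an outstanding query and return its 16-bit transaction ID."""
    while True:
        txid = secrets.randbelow(1 << 16)   # unpredictable value in 0..65535
        if txid not in pending:
            pending[txid] = name
            return txid

def handle_response(txid: int, address: str) -> Optional[str]:
    """Match a response to its query; responses with unknown IDs are dropped."""
    name = pending.pop(txid, None)
    if name is None:
        return None
    return f"{name} -> {address}"

first = send_query("example.com")
second = send_query("example.org")
print(handle_response(second, "192.0.2.2"))    # responses may arrive out of order
print(handle_response(first, "192.0.2.1"))
print(handle_response(first, "192.0.2.66"))    # ID no longer pending: ignored
```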

DNS Packet Structure

A DNS packet consists of:

IP Header: Standard 20-byte IPv4 header with source/destination IPs
UDP Header: 8 bytes containing source/destination ports (port 53 for DNS), length, and checksum
DNS Header: Contains the transaction ID, operation codes, flags (authoritative answer, recursion desired, etc.), and counts indicating the number of questions, answers, name servers, and additional records
DNS Data: Variable-length section containing the actual queries and responses

The DNS data section can grow quite large when multiple questions are asked or multiple answers are returned. This variable sizing sometimes causes issues with MTU limits, particularly when DNSSEC (DNS Security Extensions) adds cryptographic signatures that significantly expand response sizes.
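The DNS header and question section can be assembled with Python’s `struct` module. The sketch below assumes the standard wire layout: a 12-byte header of six 16-bit big-endian fields, followed by a question made of length-prefixed labels, a type, and a class. The function names are chosen for this example, and only the query direction is built.

```python
import struct

# Numeric codes for the record types listed earlier.
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "MX": 15, "TXT": 16, "AAAA": 28, "SRV": 33}
FLAG_QR = 0x8000   # set in responses
FLAG_RD = 0x0100   # recursion desired

def encode_name(name: str) -> bytes:
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(txid: int, name: str, rtype: str = "A") -> bytes:
    """Build a DNS query: header (1 question, no other records) plus question."""
    header = struct.pack("!6H", txid, FLAG_RD, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!2H", QTYPE[rtype], 1)  # class 1 = IN
    return header + question

def parse_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte header."""
    txid, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    return {
        "id": txid,
        "is_response": bool(flags & FLAG_QR),
        "recursion_desired": bool(flags & FLAG_RD),
        "questions": qd, "answers": an, "authority": ns, "additional": ar,
    }

packet = build_query(0x1234, "example.com")
print(len(packet), parse_header(packet))
```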

Distributed Design and Centralization Trade-offs

The DNS architecture is often called “decentralized,” but this characterization is nuanced. The root and TLD infrastructure is genuinely distributed, with extensive replication ensuring resilience. However, authoritative name servers introduce centralization. If a company’s authoritative name servers fail—as happened to Microsoft and Facebook in separate incidents—the entire domain becomes unreachable even though the root and TLD infrastructure remains operational.

This single point of failure demonstrates that DNS is only as decentralized as an organization makes its authoritative name servers. Companies hosting authoritative servers in a single datacenter or behind a single provider create centralized failure modes.

Security Concerns

DNS is unencrypted by default. Queries and responses traverse the network in plaintext UDP packets. This has several implications:

ISP Visibility: Internet Service Providers see every DNS query their customers make, logging which domains are accessed even if the subsequent HTTPS traffic is encrypted. ISPs can block domains by examining port 53 traffic and dropping packets for prohibited names.
DNS Hijacking: Attackers can inject themselves as fake authoritative name servers or TLD servers, responding to queries with malicious IP addresses. Victims are redirected to attacker-controlled servers without realizing anything is wrong.
DNS Cache Poisoning: A resolver accepts a UDP response when its transaction ID and destination port match an outstanding query, so an attacker who can predict or guess both values can send forged responses. If the forged response arrives before the legitimate one, the resolver caches the poisoned result, and subsequent clients querying the same domain receive the attacker’s malicious IP address. Modern resolvers randomize both the transaction ID and the source port, which makes this attack difficult but not impossible, particularly against resolvers with weak randomization.
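A back-of-the-envelope estimate shows why randomization matters for cache poisoning. The sketch assumes the attacker sends distinct guesses inside one race window and that every combination is equally likely. The port-space figure of 2**16 is an upper bound used for illustration, since real resolvers draw from a smaller ephemeral range.

```python
TXID_SPACE = 2 ** 16   # the transaction ID is a 16-bit field

def guess_probability(forged_responses: int, port_space: int = 1) -> float:
    """Chance that one of `forged_responses` distinct guesses matches the query."""
    combinations = TXID_SPACE * port_space
    return min(1.0, forged_responses / combinations)

# Fixed, known source port: only the transaction ID must be guessed.
print(guess_probability(1000))
# Randomized source port as well: the search space grows by the port range.
print(guess_probability(1000, port_space=2 ** 16))
```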

Encrypted DNS: DoT and DoH

Two standardized protocols address DNS’s lack of encryption:

DNS over TLS (DoT): Encrypts DNS queries using TLS over a dedicated port (typically 853). This makes DNS traffic distinguishable from other traffic, allowing network administrators to apply different policies to DNS versus general web traffic.
DNS over HTTPS (DoH): Encapsulates DNS queries within HTTPS requests on port 443. This makes DNS indistinguishable from regular web traffic, preventing network filtering of DNS queries but also preventing network administrators from monitoring or controlling DNS behavior.

The debate between DoT and DoH reflects tensions between privacy (where indistinguishability is desirable) and network management (where visibility into DNS traffic assists troubleshooting and security). Adoption remains partial—Firefox and Chrome support DoH, some resolvers support both, but universal deployment has not occurred.
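To show how a DoH request resembles ordinary web traffic, the sketch below builds the URL for a DoH GET request without sending it. It assumes the RFC 8484 convention of a `dns` query parameter carrying the base64url-encoded DNS message with padding removed. The endpoint `https://dns.example/dns-query` is a placeholder, not a real resolver.

```python
import base64
import struct

def build_query(name: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query; the ID is 0, as is common for DoH (cache-friendly)."""
    header = struct.pack("!6H", 0, 0x0100, 1, 0, 0, 0)   # RD flag set, one question
    labels = b"".join(bytes([len(p)]) + p.encode("ascii") for p in name.split("."))
    return header + labels + b"\x00" + struct.pack("!2H", qtype, 1)

def doh_get_url(endpoint: str, dns_message: bytes) -> str:
    """Encode the DNS message as base64url without padding and attach it to the URL."""
    encoded = base64.urlsafe_b64encode(dns_message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"

url = doh_get_url("https://dns.example/dns-query", build_query("www.example.com"))
print(url)
```

On the wire this is an HTTPS request to port 443, which is why an observer cannot easily tell it apart from other web traffic.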

Practical DNS Tools

nslookup is available on most operating systems and performs basic DNS queries. It accepts a domain name and optionally a specific DNS server to query:

nslookup husseinnasser.com

This queries the default resolver.

To query a specific server:

nslookup husseinnasser.com 8.8.8.8

This directs the query to Google’s public DNS resolver.

To query for specific record types:

nslookup -type=TXT husseinnasser.com

When nslookup returns “Non-authoritative answer,” it indicates the response came from a server that is not authoritative for the domain, typically a recursive resolver answering from its cache. To query the authoritative server directly, first find the name servers:

nslookup -type=ns husseinnasser.com

Then query one directly:

nslookup husseinnasser.com ns1.husseinnasser.com

dig (Domain Information Groper) provides more detailed output and is preferred by many network engineers. It shows full DNS responses including headers, transaction IDs, query times, and complete record information.

TLS: Transport Layer Security

Transport Layer Security provides the fundamental mechanism for encrypting communication between two parties on the internet. Without encryption, IP packets carry plain-text data that anyone along the network path can read. TLS establishes a standard for securing this communication, ensuring that sensitive data remains confidential even as it traverses untrusted networks.

While TLS can theoretically encrypt any protocol operating at layer seven or below, it’s most commonly encountered in HTTPS—HTTP over TLS. TLS itself occupies an ambiguous position in the OSI model. It’s arguably best classified as layer five (session layer) because it maintains stateful connections with session variables, symmetric keys, and connection context—similar to how TCP maintains state at layer four. However, this classification isn’t definitive; TLS doesn’t map cleanly to a single OSI layer.

The Problem: Unencrypted HTTP

Consider standard HTTP communication. A client establishes a TCP connection to a server on port 80, sends an HTTP request (GET / HTTP/1.1 followed by headers and optional body), and receives a response containing headers and content. Every router and intermediary device between client and server can read this traffic. Internet Service Providers, network administrators, and attackers with access to network infrastructure can inspect requests and responses in their entirety.

This transparency is unacceptable for any sensitive communication—passwords, financial transactions, personal messages, or proprietary business data all require confidentiality.

The Solution: Symmetric Encryption

TLS encrypts traffic using symmetric encryption, where the same key encrypts and decrypts data, so the same key must exist on both client and server. Symmetric encryption is extremely fast because ciphers such as AES and ChaCha20 are built from cheap operations like XOR, substitution, and bit rotation, and AES is hardware-accelerated on most modern CPUs. This makes it suitable for encrypting large payloads like multi-megabyte JavaScript bundles or video streams.
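The defining property of symmetric encryption, one key for both directions, can be shown with a toy repeating-key XOR cipher. This is deliberately not a secure cipher and is not what TLS uses. It only demonstrates that applying the same key twice recovers the plaintext. The `xor_bytes` helper is a name chosen for this sketch.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = secrets.token_bytes(16)            # both sides must hold this same key
plaintext = b"GET / HTTP/1.1"

ciphertext = xor_bytes(plaintext, key)   # "encrypt"
recovered = xor_bytes(ciphertext, key)   # "decrypt" with the identical key

print(ciphertext.hex())
print(recovered)
```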

The challenge is key exchange: how can client and server establish a shared symmetric key without transmitting it in plaintext where attackers could intercept it? Simply sending the key across the network defeats the entire purpose of encryption.

Asymmetric Encryption for Key Exchange

TLS solves this problem using asymmetric encryption for the key exchange phase. Asymmetric algorithms use a public key for encryption and a private key for decryption (for digital signatures the roles are reversed: the private key signs and the public key verifies). The public key can be shared freely. Even if intercepted, it cannot decrypt messages encrypted with it. Only the corresponding private key, kept secret, can decrypt those messages.

However, asymmetric encryption is computationally expensive, relying on operations like modular exponentiation that consume far more CPU cycles than symmetric algorithms. This makes asymmetric encryption impractical for encrypting large volumes of data. TLS therefore uses asymmetric encryption only during the initial handshake to securely exchange a symmetric key, then switches to symmetric encryption for all subsequent data transfer.

TLS uses asymmetric encryption to set up the connection, then symmetric encryption to protect the data.

TLS 1.2 with RSA Key Exchange

TLS 1.2, while still in use for backward compatibility, often employs RSA—one of the most widely known asymmetric encryption algorithms—for key exchange. The process works as follows:

TCP Connection Establishment: Client and server complete the three-way TCP handshake.
Client Hello: The client sends a TLS Client Hello message indicating it wants to establish an encrypted connection. This message specifies which key exchange algorithm (e.g., RSA) and which symmetric encryption algorithm (e.g., AES or ChaCha20) the client supports.
Server Hello and Certificate: The server responds with its certificate, which contains the server’s public key. The certificate is signed by a Certificate Authority (CA), allowing the client to verify the server’s identity through the public key infrastructure (PKI) chain of trust.
Pre-Master Secret Exchange: The client generates a pre-master secret (essentially the foundation for the symmetric key). It encrypts this pre-master secret using the server’s public key and sends it to the server. Anyone intercepting this transmission sees only encrypted data—without the server’s private key, the pre-master secret cannot be extracted.
Symmetric Key Derivation: The server decrypts the pre-master secret using its private key. Both client and server now possess the same pre-master secret and use it to derive the actual symmetric encryption key. This completes the handshake.
Encrypted Communication: All subsequent traffic uses the symmetric key for fast, efficient encryption and decryption.
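The pre-master secret exchange in the steps above can be sketched with textbook RSA on tiny numbers. This is an illustration only: the primes are far too small to be secure, there is no padding, and real TLS 1.2 also mixes in random values from both hello messages when deriving the final keys. The variable names are chosen for this example.

```python
import secrets

# Server key pair (toy sizes). Public key: (n, e). Private key: d.
p, q = 61, 53
n = p * q                      # modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

# Client: generate a pre-master secret and encrypt it with the public key.
pre_master = secrets.randbelow(n - 2) + 2
encrypted = pow(pre_master, e, n)     # this is what crosses the network

# Server: recover the pre-master secret with the private key.
server_pre_master = pow(encrypted, d, n)

print(encrypted, server_pre_master == pre_master)
```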

The Forward Secrecy Problem

While RSA key exchange works, it suffers from a critical vulnerability: lack of perfect forward secrecy (PFS). Consider an attacker who cannot break the encryption in real-time but records all encrypted traffic for later analysis. If the attacker later obtains the server’s private key—through a vulnerability like the Heartbleed bug in OpenSSL, which allowed attackers to extract private keys from server memory—they can retroactively decrypt all recorded sessions.

The attacker uses the leaked private key to decrypt each session’s pre-master secret, derives the symmetric key for that session, and decrypts the entire conversation. Even though each session uses a different symmetric key, the single compromised private key unlocks all of them.

This threat model is not hypothetical. Intelligence agencies, compromised network equipment, and persistent attackers can record encrypted traffic for later decryption if keys become available. To mitigate this, certificate lifetimes have shortened dramatically. Large operators such as Cloudflare issue short-lived certificates and rotate them automatically, limiting the window of exposure if a key is compromised.

Diffie-Hellman Key Exchange

Diffie-Hellman provides perfect forward secrecy by ensuring that compromising long-term keys doesn’t compromise past sessions. Unlike RSA, where the server’s static private key can decrypt all past traffic, Diffie-Hellman generates ephemeral (temporary) keys for each session. Even if an attacker obtains the server’s long-term authentication key, they cannot decrypt previous sessions.

Diffie-Hellman works through elegant mathematics. Both parties generate private keys that never leave their systems. They combine these private keys with shared public parameters to create values that can be safely transmitted. When these transmitted values are combined with the recipient’s private key, both parties arrive at the same shared secret—without ever transmitting either private key.

The Mathematics

Diffie-Hellman relies on modular exponentiation. Two public parameters, g (a generator) and n (a large prime modulus), are shared openly. The client generates a private key x, and the server generates a private key y.

The client computes g^x mod n and sends this to the server. The server computes g^y mod n and sends this to the client. Neither transmitted value reveals the underlying private key due to the difficulty of computing discrete logarithms in modular arithmetic.

The server takes the client’s transmitted value and raises it to the power of y: (g^x)^y mod n = g^(xy) mod n. The client takes the server’s transmitted value and raises it to the power of x: (g^y)^x mod n = g^(xy) mod n. Both arrive at g^(xy) mod n, which becomes the shared secret used to derive the symmetric encryption key.

An attacker observing the exchange sees g^x mod n and g^y mod n, but without knowing x or y, cannot compute g^(xy) mod n. Breaking this requires solving the discrete logarithm problem, which is computationally infeasible for sufficiently large primes.
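A worked example with deliberately small numbers makes the algebra concrete. The values g = 5, n = 23, x = 6, and y = 15 are chosen only so the arithmetic can be checked by hand. Real deployments use moduli of 2048 bits or more, or elliptic curves.

```python
g, n = 5, 23          # public parameters, visible to everyone
x = 6                 # client private key, never transmitted
y = 15                # server private key, never transmitted

client_public = pow(g, x, n)   # g^x mod n, sent to the server
server_public = pow(g, y, n)   # g^y mod n, sent to the client

# Each side combines the other side's public value with its own private key.
client_secret = pow(server_public, x, n)   # (g^y)^x mod n
server_secret = pow(client_public, y, n)   # (g^x)^y mod n

print(client_public, server_public)
print(client_secret, server_secret)        # both sides hold the same value
```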

Ephemeral Diffie-Hellman (DHE) and Elliptic Curve Variants (ECDHE)

Ephemeral Diffie-Hellman (DHE) generates new private keys for every session. Even if an attacker compromises the server’s long-term authentication key (used to sign the Diffie-Hellman parameters and prove the server’s identity), they cannot decrypt past sessions because the ephemeral keys are discarded after each session.

Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) improves on DHE by using elliptic curve cryptography, which provides equivalent security with smaller key sizes and better performance. ECDHE has become the preferred key exchange mechanism in modern TLS.
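The ephemeral part can be sketched by drawing fresh private keys inside each handshake and letting them go out of scope afterwards. This is a toy under stated assumptions: the modulus `P` is the Mersenne prime 2**127 - 1, far smaller than anything used in practice, the base `G = 3` is picked for illustration without checking its group properties, and the authentication signature is omitted entirely.

```python
import secrets

P = 2 ** 127 - 1   # toy prime modulus (assumption for this sketch)
G = 3              # toy base (assumption for this sketch)

def run_session() -> int:
    """Perform one toy DHE exchange and return the shared secret."""
    x = secrets.randbelow(P - 2) + 1    # client ephemeral private key
    y = secrets.randbelow(P - 2) + 1    # server ephemeral private key
    client_share = pow(G, x, P)
    server_share = pow(G, y, P)
    client_secret = pow(server_share, x, P)
    server_secret = pow(client_share, y, P)
    if client_secret != server_secret:
        raise RuntimeError("key agreement failed")
    # x and y are local variables: once this function returns they are gone,
    # so nothing stored long-term can reconstruct this session's secret.
    return client_secret

session_one = run_session()
session_two = run_session()
print(hex(session_one))
print(hex(session_two))
```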

TLS 1.3: Faster and More Secure

TLS 1.3 eliminates the negotiation round-trip present in TLS 1.2. In TLS 1.2, the Client Hello merely proposes algorithms, and the Server Hello selects from those proposals before beginning key exchange. TLS 1.3 removes this step by including Diffie-Hellman key share parameters directly in the Client Hello. The server can immediately compute the shared secret and begin encrypted communication in its first response.

This reduces the TLS handshake from two round-trips (four messages: Client Hello, Server Hello, Client Key Exchange, Finished) to one round-trip (two messages: Client Hello with key share, Server Hello with key share and encrypted application data).

TLS 1.3 also removes support for RSA key exchange entirely, mandating Diffie-Hellman or elliptic curve variants. This ensures perfect forward secrecy by default. Additionally, TLS 1.3 removes obsolete cryptographic primitives and cipher suites with known vulnerabilities, reducing the attack surface.

Zero Round-Trip Time (0-RTT)

TLS 1.3 introduces 0-RTT resumption for clients reconnecting to servers they’ve previously communicated with. The client can send encrypted application data in the very first message, using a pre-shared key (PSK) derived from the previous session. The server recognizes the PSK, validates it, and processes the request immediately without waiting for a full handshake.

While 0-RTT improves performance, it introduces replay attack risks—an attacker can capture and retransmit 0-RTT data. Applications using 0-RTT must ensure that replayed requests don’t cause unintended side effects (idempotency), or implement additional protections like single-use tokens.
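One way to apply both protections is sketched below: requests that are expected to be free of side effects are accepted in early data, while anything else must carry a single-use token that the server remembers. The function name, the token scheme, and the choice of GET and HEAD as safe methods are assumptions for this example. Whether a given GET is truly idempotent depends on the application.

```python
SAFE_METHODS = {"GET", "HEAD"}   # assumed to have no side effects
seen_tokens = set()              # tokens already consumed by early data

def accept_early_data(method: str, token: str = "") -> bool:
    """Decide whether a request arriving as 0-RTT data may be processed."""
    if method in SAFE_METHODS:
        return True              # a replay repeats a harmless read
    if not token or token in seen_tokens:
        return False             # missing token, or a replay of a used one
    seen_tokens.add(token)
    return True

print(accept_early_data("GET"))
print(accept_early_data("POST", "token-123"))   # first use: accepted
print(accept_early_data("POST", "token-123"))   # replay: rejected
```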

Certificate Verification and PKI

Beyond key exchange, TLS provides server authentication through certificates. A certificate contains the server’s public key and is digitally signed by a Certificate Authority (CA). The client verifies the signature by checking against a list of trusted root CAs built into the operating system or browser.

This chain of trust extends from root CAs (which browsers trust implicitly) through intermediate CAs (which root CAs sign) to end-entity certificates (which intermediate CAs sign). If any link in this chain is compromised or coerced into issuing fraudulent certificates, attackers can impersonate legitimate servers. This happened in 2011, when the Dutch CA DigiNotar was breached and fraudulent certificates were issued in its name, including one for google.com.

Certificate Transparency logs, certificate pinning, and other mechanisms attempt to mitigate these risks, but PKI remains a potential vulnerability in TLS.
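In Python, the standard library’s `ssl` module performs this chain and hostname verification when a default context is used. The sketch below inspects those defaults and defines a helper, `open_verified`, whose name is chosen for this example. The helper is defined but not called, so nothing here touches the network.

```python
import socket
import ssl

# The default context loads the platform's trusted root CAs and turns on
# both certificate chain verification and hostname checking.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse older protocol versions

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)

def open_verified(host: str, port: int = 443) -> str:
    """Connect, complete the TLS handshake, and return the negotiated version.

    Raises ssl.SSLCertVerificationError if the chain or hostname does not verify.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Disabling either check (for example, setting `verify_mode` to `ssl.CERT_NONE`) removes the protection against man-in-the-middle attacks that the certificate chain provides.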

TLS in Practice

TLS negotiation adds latency to connection establishment. Each round-trip consumes time proportional to the network latency between client and server. Over a 50-millisecond connection, TLS 1.2’s two round-trips add 100 milliseconds before application data can flow. TLS 1.3’s single round-trip halves this overhead to 50 milliseconds, and 0-RTT eliminates it entirely for repeat connections.
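The arithmetic above generalizes to any round-trip time. The sketch below uses the round-trip counts stated in this section and reads the 50-millisecond figure as a round-trip time. The table and function names are chosen for this example, and TCP handshake time is not included.

```python
# Round-trips spent on the TLS handshake before application data can flow.
HANDSHAKE_ROUND_TRIPS = {
    "TLS 1.2": 2,
    "TLS 1.3": 1,
    "TLS 1.3 0-RTT": 0,
}

def tls_overhead_ms(rtt_ms: float, version: str) -> float:
    """Handshake latency added by TLS for a given round-trip time."""
    return rtt_ms * HANDSHAKE_ROUND_TRIPS[version]

for version in HANDSHAKE_ROUND_TRIPS:
    print(version, tls_overhead_ms(50, version))
```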

The server’s certificate chain sent during the handshake can be surprisingly large, often several kilobytes, requiring multiple TCP segments. A TLS certificate compression extension exists (RFC 8879), but many deployments still send certificates uncompressed, which contributes to handshake latency.

TLS sessions are tied to TCP connections. When a TCP connection closes, the TLS session typically ends, requiring a full handshake for the next connection. Session resumption mechanisms (session IDs and session tickets in TLS 1.2, PSKs in TLS 1.3) allow clients to skip the full handshake when reconnecting, improving performance for applications that open multiple short-lived connections.

Summary

TLS encrypts communication through a carefully orchestrated combination of asymmetric and symmetric cryptography. Asymmetric algorithms securely exchange keys during the handshake, while symmetric algorithms efficiently encrypt bulk data. TLS 1.3 improves on TLS 1.2 by reducing handshake latency, mandating perfect forward secrecy, and removing obsolete cryptographic primitives.

Understanding TLS is essential for anyone building or operating networked systems. The protocol’s design reflects decades of cryptographic research and real-world attack mitigation. Performance optimizations like session resumption and 0-RTT must be weighed against security implications like replay attacks. Certificate validation requires proper implementation to prevent man-in-the-middle attacks.

Every HTTPS request begins with TLS negotiation—understanding this overhead helps explain why connection reuse, HTTP/2 multiplexing, and connection warming strategies matter for web performance. The protocol’s evolution from SSL through TLS 1.0, 1.1, 1.2, and now 1.3 demonstrates the ongoing tension between security, compatibility, and performance in internet protocols.
