This chapter introduces communication models and protocols essential for understanding how machines communicate over the internet. These systems rely on protocols, which are sets of rules, similar to grammar in human language, that govern the exchange of data.
Communication Models and Protocols
For machines to communicate reliably over the internet, they must adhere to communication protocols. These protocols operate at different layers within a communication model. These layers represent the steps data takes to travel from one point to another, ensuring the data is correctly transformed so it can be understood by both machines (in binary) and humans (in application-level data). The chapter focuses on two models: the Open Systems Interconnection (OSI) model and the Transmission Control Protocol/Internet Protocol (TCP/IP) model.
OSI Model
The OSI model is a reference model that divides the network communication process into seven layers (L1 to L7), with each layer performing a specific function. The top layer, L7 (Application layer), deals with data as humans perceive it (e.g., text), while the bottom layer, L1 (Physical layer), deals with data as machines perceive it (e.g., binary bit streams). For communication to occur, data travels through the layers from top to bottom on the sender’s side and bottom to top on the receiver’s side.
Here is a breakdown of the seven layers:
| Layer Number | Layer Name | Layer Function |
|---|---|---|
| Layer 7 | Application layer | Has direct access to user data. Software applications rely on it via protocols like HTTP, SMTP, and SSH. It also helps in service advertisement. |
| Layer 6 | Presentation layer | Responsible for formatting, encryption, decryption, and compression of data. It ensures L7 receives data in the expected format. |
| Layer 5 | Session layer | Manages the session lifecycle, including its initiation, maintenance, and termination. |
| Layer 4 | Transport layer | Responsible for data transport between devices using protocols like TCP and UDP. It manages data buffering, error control (in TCP, requesting retransmission if data segments are lost), and windowing (determining how much data can be sent before an acknowledgment is needed). When the buffer queue is full, new data segments are dropped (tail drop). |
| Layer 3 | Network layer | Responsible for data transfer between different networks using logical addressing, such as IP addresses. It receives data packets (smaller chunks of data broken down by L4). |
| Layer 2 | Data link layer | Transfers data between devices within the same network in the form of frames (packets broken down further). It has two parts, MAC and LLC, described in the next two rows. |
| | MAC (Medium Access Control) | Handles the physical addressing (MAC address) of devices, which is a globally unique identifier for data delivery. |
| | LLC (Logical Link Control) | Manages flow control and error control inside the same network. |
| Layer 1 | Physical layer | Converts data frames into bit streams (ones and zeros). It involves physical devices like cables and hubs. |
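The top-to-bottom and bottom-to-top traversal of the layers can be pictured as nested wrapping. The following is a toy sketch only: the bracketed tags, the three-layer subset, and the function names are illustrative assumptions, not real header formats.

```python
# Toy model of encapsulation. The text tags stand in for real protocol
# headers, and only three of the seven layers are modelled (an assumption
# made to keep the sketch short).
LAYERS = ["transport", "network", "data-link"]  # L4 down to L2


def encapsulate(data: bytes) -> bytes:
    """Sender side: wrap the data top to bottom, so L2 ends up outermost."""
    for layer in LAYERS:
        data = f"[{layer}]".encode() + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: strip the tags bottom to top, L2 first."""
    for layer in reversed(LAYERS):
        tag = f"[{layer}]".encode()
        if not frame.startswith(tag):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(tag):]
    return frame


frame = encapsulate(b"hello")
print(frame)               # b'[data-link][network][transport]hello'
print(decapsulate(frame))  # b'hello'
```

Each layer only looks at its own tag, which mirrors how each OSI layer handles its own function without needing to understand the layers above it.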
The OSI model is primarily a reference model and is rarely implemented as-is in modern internet applications. Internet communication typically follows the TCP/IP model, which shares similarities with the OSI model but does not have a one-to-one layer correspondence.
TCP/IP Model
The TCP/IP model, also known as the Internet Protocol Suite, is the communication model used in practice for internet communication. It simplifies the seven layers of the OSI model by combining several into single layers with similar functionality.
| OSI model layer | TCP/IP model layer | TCP/IP protocol examples |
|---|---|---|
| Application, Presentation, Session | Application | SMTP, HTTP |
| Transport | Transport | TCP, UDP |
| Network | Internet | IP, ICMP |
| Data Link | Data Link | IEEE 802.2 |
| Physical | Physical Network | Ethernet |
Network Layer Protocols
Protocols in the network/internet layer are crucial for routing data between different networks.
- Internet Protocol (IP): This is the most widely used protocol at this layer. IP is a set of rules responsible for delivering data packets from a source IP address to a destination IP address. The IP address is the unique identifier that allows routers to correctly route information across the network. Key facts about IP packets:
  - Fragmentation: Data is often broken into smaller packets so that no single packet is too large for the links it crosses, which also helps reduce latency.
  - Maximum Transmission Unit (MTU): There is a maximum limit to the size of a packet that can be transmitted over a network interface (e.g., 1,500 bytes for Ethernet).
  - IP Header and Payload: Each IP packet contains an IP header (with metadata like source/destination IP addresses, IP version, packet size, and TTL) and a payload (the actual data being carried). The packet as a whole is also called an IP datagram.
- Internet Control Message Protocol (ICMP): This protocol is mainly used for network diagnostics and error reporting. A common use is the `ping` command, which checks if a server is responding to traffic. For instance, if a packet cannot be delivered to its destination, a router can send an ICMP message back to the sender.
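To make the `ping` example more concrete, here is a minimal sketch of how an ICMP echo request (type 8, code 0) could be assembled. It only builds the bytes; actually sending them requires a raw socket, which usually needs elevated privileges, so that part is left out. The function names, the identifier and sequence values, and the payload are assumptions chosen for illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (and by IP headers)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP echo request: type, code, checksum, identifier, sequence."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1)
print(packet.hex())
# A receiver validates the packet by checksumming all of it; a valid
# packet yields zero.
print(internet_checksum(packet))  # 0
```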
Network layer protocols like IP and ICMP work together with transport layer protocols (TCP and UDP) to ensure successful data delivery.
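The relationship between the MTU and fragmentation described above can be sketched as follows. This is a simplified illustration, assuming a 20-byte IPv4 header without options; the function name and the tuple layout are hypothetical. One detail that is part of IPv4 itself: fragment offsets are expressed in units of 8 bytes, so every fragment except the last carries a multiple of 8 payload bytes.

```python
IP_HEADER_SIZE = 20  # bytes; assumes an IPv4 header with no options


def fragment(payload: bytes, mtu: int = 1500) -> list[tuple[int, bool, bytes]]:
    """Split a payload into (offset_in_8_byte_units, more_fragments, chunk)."""
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_data = (mtu - IP_HEADER_SIZE) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more_fragments = start + max_data < len(payload)
        fragments.append((start // 8, more_fragments, chunk))
    return fragments


for offset, more, chunk in fragment(bytes(4000)):
    print(offset, more, len(chunk))
# 0 True 1480
# 185 True 1480
# 370 False 1040
```

With the Ethernet MTU of 1,500 bytes, a 4,000-byte payload becomes three fragments, and the receiver uses the offsets to put them back in order.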
Transmission Control Protocol (TCP)
The Transmission Control Protocol (TCP) is a transport layer protocol that provides reliable communication between the sender and receiver, including the capability to retransmit data if packets are lost.
Connection Establishment (Three-Way Handshake)
Communication via TCP begins with a three-way handshake, an agreement between the two parties to exchange and accept data: the client sends a SYN segment, the server replies with a SYN-ACK, and the client confirms with an ACK.
- The TCP header contains vital information such as the source and destination port numbers, the sequence number (identifies the data sent), the acknowledgment number (used by the receiver to request the next data segment), and the window (the amount of data the receiver can currently accept).
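The header fields listed above sit at fixed positions in the first 20 bytes of every TCP segment, so they can be read with `struct`. The sketch below packs a made-up header resembling the second handshake message (SYN and ACK flags both set) and parses it back; the port numbers, sequence numbers, and dictionary keys are illustrative assumptions.

```python
import struct

# Fixed 20-byte TCP header: source port, destination port, sequence number,
# acknowledgment number, data offset + flags, window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_SYN = 0x02
FLAG_ACK = 0x10


def parse_tcp_header(segment: bytes) -> dict:
    """Extract the main fields from the first 20 bytes of a TCP segment."""
    (src, dst, seq, ack, offset_flags,
     window, _checksum, _urgent) = TCP_HEADER.unpack(segment[:TCP_HEADER.size])
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "header_length_bytes": (offset_flags >> 12) * 4,  # offset is in 32-bit words
        "syn": bool(offset_flags & FLAG_SYN),
        "ack": bool(offset_flags & FLAG_ACK),
        "window": window,
    }


# Hypothetical SYN-ACK from a server on port 443 to a client on port 50000.
raw = TCP_HEADER.pack(443, 50000, 1000, 501, (5 << 12) | FLAG_SYN | FLAG_ACK, 65535, 0, 0)
print(parse_tcp_header(raw))
```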
Congestion and Flow Control
After establishing a connection, TCP uses two main mechanisms—slow start and congestion avoidance—to find the optimal amount of data (window size) that can be transmitted before waiting for an acknowledgment (ACK), which ensures network bandwidth is used efficiently without causing congestion.
- Slow Start: Data transfer begins with a small congestion window (one segment). The sender then grows the window exponentially: it adds one segment for every ACK received, which roughly doubles the window each round trip. The maximum size is limited by the window size advertised by the receiver.
- Congestion Detection: Congestion is detected if there are connection timeouts or if duplicate ACKs are received, indicating packets were dropped.
- Congestion Avoidance: When congestion is detected, this mechanism activates to reduce network load. The congestion window is halved when duplicate ACKs signal a loss, and it is reset to one segment after a timeout.
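The interplay of these mechanisms can be sketched as a small simulation that tracks the congestion window once per round trip. This is a simplified model and not how any particular TCP stack is implemented: the slow-start threshold (`ssthresh`), its initial value, the one-event-per-round-trip granularity, and the event names are all assumptions made for illustration.

```python
def simulate_cwnd(events, receiver_window=64, ssthresh=16):
    """Return the congestion window (in segments) after each round trip.

    Each event is "ack" (a round trip with no loss), "dup_acks"
    (loss signalled by duplicate ACKs), or "timeout".
    """
    cwnd = 1
    history = [cwnd]
    for event in events:
        if event == "timeout":
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1  # start over from one segment
        elif event == "dup_acks":
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh  # halve the window
        elif event == "ack":
            if cwnd < ssthresh:
                cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
            else:
                cwnd += 1  # past the threshold: linear growth
        else:
            raise ValueError(f"unknown event: {event}")
        cwnd = min(cwnd, receiver_window)  # never exceed the advertised window
        history.append(cwnd)
    return history


events = ["ack"] * 5 + ["dup_acks"] + ["ack"] * 2 + ["timeout"] + ["ack"] * 3
print(simulate_cwnd(events))
# [1, 2, 4, 8, 16, 17, 8, 9, 10, 1, 2, 4, 5]
```

The output shows the characteristic shape: rapid doubling at the start, a halving when duplicate ACKs arrive, and a drop back to one segment after the timeout.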
To improve performance, it is recommended to deploy applications close to each other (e.g., in the same geographic region). This closeness reduces the Round Trip Time (RTT), which allows the sender to quickly adjust the congestion window size when congestion occurs.
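A rough way to see why RTT matters: a sender can have at most one window of data in flight per round trip, so throughput is bounded by the window size divided by the RTT. The helper below is a back-of-the-envelope sketch; the function name and the example window and RTT values are assumptions.

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1_000_000


# Same 65,535-byte window, two different round-trip times.
print(round(max_throughput_mbps(65_535, rtt_ms=100), 2))  # about 5.24 Mbps
print(round(max_throughput_mbps(65_535, rtt_ms=10), 2))   # about 52.43 Mbps
```

Cutting the RTT by a factor of ten raises the bound by the same factor, which is the effect that deploying applications in the same region aims for.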
Ports
A port is a virtual point managed by the operating system (OS) that serves as an entry or exit point for a software application. Ports are numbered from 0 to 65535 and are categorized into three divisions:
- Well-known ports (0 to 1023): Also called system ports. These are controlled by the Internet Assigned Numbers Authority (IANA) and used by system processes. Examples: Port 22 (SSH) and Port 80 (HTTP).
- Registered ports (1024 to 49151): Also called user ports. These are assigned by IANA and used by user processes. Example: Port 1194 (OpenVPN). Both well-known and registered ports are referred to as nonephemeral ports.
- Ephemeral ports (49152 to 65535): Also called private or dynamic ports. These are not controlled by IANA and are used for private or temporary purposes.
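The three divisions translate directly into a range check. The function below is a small sketch of that classification; its name and return labels are chosen for illustration.

```python
def port_category(port: int) -> str:
    """Classify a port number into the three ranges described above."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


for port in (22, 80, 1194, 49152):
    print(port, port_category(port))
# 22 well-known
# 80 well-known
# 1194 registered
# 49152 ephemeral
```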
The overhead of establishing a connection and ensuring reliable delivery (like retransmitting lost packets) with TCP adds latency. For applications that do not require this reliability, the User Datagram Protocol (UDP) is used instead.
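The difference is visible in code: with UDP there is no handshake, and a datagram is simply sent to an address. The sketch below sends one datagram over the loopback interface within a single process; the function name, the loopback address, and the two-second timeout are assumptions made so the example is self-contained.

```python
import socket


def udp_roundtrip(message: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: the OS picks a free port
        receiver.settimeout(2.0)         # UDP gives no delivery guarantee
        address = receiver.getsockname()
        # No connect/accept step: the datagram is sent straight away.
        sender.sendto(message, address)
        data, _peer = receiver.recvfrom(1024)
        return data


print(udp_roundtrip(b"hello"))  # b'hello'
```

If the datagram were lost, `recvfrom` would raise a timeout error rather than trigger a retransmission; any recovery is left to the application.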
Hypertext Transfer Protocol (HTTP)
Most communication on the internet follows a client-server model, where a client sends an HTTP request and the server sends back an HTTP response containing data. This communication typically occurs on port 80 (or port 443 for the secure version, HTTPS).
HTTP Request Structure and Methods
An HTTP request includes the HTTP method, the version, and the host.

- HTTP Methods: These specify the desired action to be performed on the server. The most widely used methods are:
  - GET: Used to retrieve data without modifying the state on the server (ensures idempotency). Example: getting order details.
  - POST: Used to create a new resource or send data for processing. It is non-idempotent. Example: placing a food order.
  - PUT: Used to update an existing resource. It is idempotent. Example: updating a saved address.
  - DELETE: Used to remove a resource. It is idempotent.
- Request Target: Specifies the location (resource) on the host (the server) the client is trying to access (e.g., `/api/orders`).
- HTTP Header: Contains additional metadata about the request, such as the `User-Agent` (where the request is coming from) and `Accept` (the content format expected from the server).
HTTP Versions
HTTP has evolved to improve performance and efficiency:
- HTTP/1.1: The most widely used version. It supports the reuse of TCP connections for multiple requests.
- HTTP/2: Optimizes performance with header compression (reducing bandwidth) and uses a single TCP connection for a domain. It also supports server push to avoid client polling for asynchronous calls.
- HTTP/3: Improves speed over HTTP/2 by running over QUIC (Quick UDP Internet Connections), a transport protocol built on UDP, instead of TCP. QUIC implements its own congestion control to mitigate the congestion issues seen with TCP.
HTTP Response and Status Codes
The HTTP response contains the requested data and includes the HTTP status code, which indicates the outcome of the request.
| Status Code Series | Meaning and Examples |
|---|---|
| 1xx | Informational (request received and processing). Example: 100 Continue, 101 Switching Protocols. |
| 2xx | Success (request successfully processed). Example: 200 OK, 201 Created (new resource created). |
| 3xx | Redirection (server directs the request to another resource). Example: 301 Moved Permanently (client should update references), 302 Found (temporarily moved). |
| 4xx | Client Error (issue with the client’s request). Example: 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found. |
| 5xx | Server Error (server encountered an issue while processing). Example: 500 Internal Server Error, 503 Service Unavailable. |
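Because the series is determined by the first digit of the code, a client can classify any response with integer division. The sketch below uses Python's standard `http.HTTPStatus` enum for the reason phrases; the `describe` function and the `SERIES` dictionary are illustrative names.

```python
from http import HTTPStatus

SERIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return the code, its standard phrase, and the series it belongs to."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not know
    return f"{code} {status.phrase} ({SERIES[code // 100]})"


for code in (201, 301, 404, 503):
    print(describe(code))
# 201 Created (Success)
# 301 Moved Permanently (Redirection)
# 404 Not Found (Client Error)
# 503 Service Unavailable (Server Error)
```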
While HTTP is a general protocol for application communication, some protocols, such as Simple Mail Transfer Protocol (SMTP), are designed for specific uses (like email).