On October 29, 1969, at 10:30 PM, a UCLA computer science student named Charley Kline attempted to log in remotely to a computer at the Stanford Research Institute — the first message ever sent on what would become the internet. The intended message was "login." The system crashed after the first two letters. The first internet communication was "lo."

From that inauspicious beginning, the network has grown into the most complex technological infrastructure in human history: approximately 5.4 billion users, hundreds of thousands of interconnected networks, roughly 500 undersea cable systems spanning 1.3 million kilometers of ocean floor, and data centers consuming more electricity than many countries. At any moment, incomprehensible volumes of data — emails, video streams, financial transactions, voice calls, sensor readings — are flowing across this infrastructure in the form of discrete packets of bits, navigating their way from source to destination through routing decisions made in milliseconds.

Understanding how this works requires understanding the layered architecture that made it possible: a stack of protocols, each addressing a specific problem, that together enable any device anywhere to communicate with any other device, regardless of the physical medium carrying the data.

"The internet is not something you just dump something on. It's not a big truck. It's a series of tubes." — Senator Ted Stevens (2006), memorably misunderstanding the internet — but the underlying question of how data actually travels is one most people, including senators, cannot answer.


Key Definitions

Internet — A global network of networks: thousands of interconnected autonomous systems (ISPs, corporations, universities, governments) that exchange data using common protocols, primarily the Internet Protocol suite (TCP/IP). Not owned by any single entity.

World Wide Web (WWW) — A service built on top of the internet: a system of interlinked hypertext documents and media accessed via browsers using HTTP/HTTPS. The web is one application running on the internet; email, streaming, gaming, and IoT are others.

Packet — The fundamental unit of data transmission on the internet. Data to be transmitted is broken into packets, each containing a header (source and destination IP addresses, protocol information, and other control fields) and a payload (the actual data). Packets from the same transmission may travel different routes; the transport layer (for example, TCP with its sequence numbers) puts the data back in order at the destination.

IP address (Internet Protocol address) — A numerical identifier assigned to each device on a network. IPv4 uses 32-bit addresses (e.g., 192.168.1.1), providing approximately 4.3 billion unique addresses. IPv6 uses 128-bit addresses, providing 3.4 × 10³⁸ unique addresses — effectively unlimited.

TCP (Transmission Control Protocol) — A connection-oriented protocol that provides reliable, ordered, error-checked delivery of data between applications. TCP uses a three-way handshake to establish a connection, numbers the data it sends, acknowledges what it receives, and retransmits segments that are not acknowledged. Used by HTTP, email, and file transfer.

UDP (User Datagram Protocol) — A connectionless protocol that sends data without establishing a connection or guaranteeing delivery. Faster and lower overhead than TCP, but unreliable. Used by real-time video, online gaming, DNS, and VoIP, where a late packet is often worse than a lost one.

DNS (Domain Name System) — A hierarchical distributed database that translates human-readable domain names (google.com) into IP addresses (142.250.80.46) that routers use to direct traffic. DNS is the internet's phone book.

Router — A network device that forwards packets between networks based on destination IP addresses. Routers maintain routing tables — maps of which paths lead to which IP address ranges — and make forwarding decisions for each incoming packet. The internet's routing infrastructure consists of hundreds of thousands of routers operated by different organizations.

BGP (Border Gateway Protocol) — The protocol that routers use to exchange routing information between autonomous systems. BGP is the "glue" of the internet: it allows the networks operated by different ISPs, corporations, and organizations to exchange information about how to reach each other. BGP misconfigurations can cause widespread internet disruptions.

HTTP (HyperText Transfer Protocol) — The application-layer protocol for the World Wide Web. Defines how browsers and servers communicate: a browser sends an HTTP request (e.g., GET /index.html) and the server responds with the resource and status code. HTTP/2 and HTTP/3 are more efficient successors.

HTTPS (HTTP Secure) — HTTP with TLS (Transport Layer Security) encryption. The data exchanged between browser and server is encrypted, preventing eavesdropping and tampering. TLS also authenticates the server's identity using digital certificates issued by Certificate Authorities.

ISP (Internet Service Provider) — A company that provides internet access to consumers or businesses. ISPs connect customers' devices to the broader internet infrastructure. Tier 1 ISPs (AT&T, Deutsche Telekom, NTT) form the internet backbone, peering with each other to exchange traffic for free. Tier 2 and 3 ISPs pay Tier 1 ISPs for access.

Bandwidth — The maximum rate of data transfer across a network connection, typically measured in Mbps (megabits per second) or Gbps (gigabits per second). Bandwidth is the width of the pipe; latency is how long data takes to travel through it.

Latency — The time delay for a data packet to travel from source to destination and back (round-trip time). Measured in milliseconds. Primarily determined by the physical distance data must travel (light through fiber moves at about 2/3 the speed of light in vacuum) and the number of routing hops. Critical for real-time applications like gaming and video calls.
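
The difference between bandwidth and latency can be made concrete with some arithmetic. The sketch below is only a rough model: it uses the 2/3 figure quoted above for light in fiber, ignores routing hops and queuing, and the 5,600 km path, 10 MB file, and 100 Mbps link are made-up values chosen for illustration.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBER_FRACTION = 2 / 3              # approximate slowdown in fiber, as quoted above


def propagation_delay_ms(distance_km: float) -> float:
    """One-way propagation delay through fiber, in milliseconds."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION) * 1000


def transfer_time_ms(size_megabytes: float, bandwidth_mbps: float, rtt_ms: float) -> float:
    """Rough fetch time: one round trip plus the time to push the bits through the link."""
    size_megabits = size_megabytes * 8
    return rtt_ms + size_megabits / bandwidth_mbps * 1000


# Hypothetical 5,600 km fiber path (an assumed distance, not a measured route).
one_way = propagation_delay_ms(5600)
rtt = 2 * one_way
print(f"one-way delay: {one_way:.1f} ms, round trip: {rtt:.1f} ms")
print(f"10 MB over 100 Mbps: {transfer_time_ms(10, 100, rtt):.0f} ms")
```

In this model, more bandwidth shrinks only the second term; the round-trip term is fixed by distance, which is why faster links do little for the responsiveness of small requests.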


The Physical Infrastructure

Undersea Cables

Most people imagine the internet as wireless — data floating through the air between invisible towers. The reality is more literal: approximately 95% of international internet traffic travels through physical cables lying on the ocean floor.

There are currently approximately 500 undersea cable systems, totaling over 1.3 million kilometers, crossing every ocean. These cables carry fiber optic strands, glass threads thinner than a human hair, that transmit data as pulses of light. A modern cable system can carry over 250 terabits per second (Tbps). Assuming about 5 Mbps per HD video stream, that is enough capacity for roughly 50 million simultaneous streams.

The cables are remarkably thin: roughly the diameter of a garden hose in deep ocean sections (where threats are minimal), and wrapped in multiple layers of steel armor near shorelines where they face damage from anchors, fishing equipment, and underwater landslides.

Cable faults do happen: industry estimates put them at roughly 100 to 200 per year globally, most caused by fishing gear and ship anchors, with natural hazards such as undersea landslides and earthquakes accounting for much of the rest. Repair ships with specialized equipment locate and splice the break, which typically takes days to weeks. Redundancy in the cable network means that an individual break rarely causes a complete internet outage for a country.

Terrestrial Networks

On land, internet infrastructure includes:

  • Long-haul fiber: High-capacity fiber optic cables running between cities, often buried alongside highways or rail lines
  • Metropolitan networks: Local fiber networks serving individual cities
  • Last-mile connections: The connection from local infrastructure to individual homes and businesses — delivered via fiber (fastest), coaxial cable (common), DSL over telephone lines (slower), or fixed wireless
  • Cell towers and wireless: Mobile internet (4G, 5G) provides wireless access by connecting devices to cell towers, which connect to fiber backhaul

Data Centers

The servers that store and serve web content, video, email, and cloud services live in data centers: large facilities housing tens of thousands of servers with extensive power, cooling, and network infrastructure. A modern hyperscale data center campus (like those operated by Amazon, Google, or Microsoft) may contain hundreds of thousands of servers and consume hundreds of megawatts of electricity.

Content delivery networks (CDNs) distribute popular content — Netflix videos, news articles, software updates — across hundreds of data centers worldwide, ensuring that content is physically close to users who request it. When you stream a Netflix video, the data typically comes from a CDN node within a few hundred kilometers of you, not from Netflix's central servers.


The Protocol Stack: How Data Actually Travels

The Layered Model

The internet's design principle is separation of concerns: each layer handles a specific problem independently, passing data to the layer above or below. This allows any layer to be upgraded or replaced without affecting the others.

The TCP/IP model (used in practice) has four layers:

| Layer | Name | Examples | Function |
|---|---|---|---|
| 4 | Application | HTTP, DNS, SMTP, FTP | What data means; application-specific communication |
| 3 | Transport | TCP, UDP | Reliability, ordering, multiplexing connections |
| 2 | Internet | IP | Addressing and routing between networks |
| 1 | Link | Ethernet, Wi-Fi, fiber | Physical transmission on a single network segment |

When you send an email, the data passes down through these layers on your device (being wrapped in headers at each layer) and back up through the layers at the receiving server (with headers being stripped at each layer).
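
The wrapping and unwrapping described above can be sketched in a few lines. This is a deliberately simplified illustration: real headers are binary structures defined by each protocol, whereas the text headers, addresses, ports, and function names here are all invented for the example.

```python
def encapsulate(message: str) -> bytes:
    """Wrap an application message in simplified transport, internet, and link headers."""
    application = message.encode("utf-8")
    # Each layer prepends its own header to whatever the layer above handed it.
    transport = b"TCP|src_port=50000|dst_port=25|" + application       # port 25 is SMTP
    internet = b"IP|src=192.0.2.10|dst=198.51.100.20|" + transport     # documentation addresses
    link = b"ETH|src_mac=aa:aa|dst_mac=bb:bb|" + internet              # made-up MAC addresses
    return link


def strip_header(frame: bytes, expected_layer: bytes) -> bytes:
    """Remove one layer's header (a name plus two fields) and return its payload."""
    name, _field1, _field2, payload = frame.split(b"|", 3)
    if name != expected_layer:
        raise ValueError(f"expected {expected_layer!r} header, found {name!r}")
    return payload


def decapsulate(frame: bytes) -> str:
    """Strip headers in the reverse order they were added, as the receiver would."""
    payload = frame
    for layer in (b"ETH", b"IP", b"TCP"):
        payload = strip_header(payload, layer)
    return payload.decode("utf-8")


frame = encapsulate("Subject: hello")
print(frame)
print(decapsulate(frame))
```

Because each layer only reads its own header, the link layer in this sketch could be swapped for a different one without touching the transport or application code, which is the point of the layered design.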

Step-by-Step: What Happens When You Load a Website

Consider typing https://www.example.com/page in your browser. Here is what happens:

1. DNS Resolution: Your browser needs the IP address for www.example.com. It sends a query to your local DNS resolver (typically operated by your ISP or a public DNS service like Google's 8.8.8.8). The resolver checks its cache; if not cached, it queries the DNS hierarchy — first a root nameserver, then the .com top-level domain nameserver, then the authoritative nameserver for example.com. The IP address is returned and cached.

2. TCP Connection: Your browser opens a TCP connection to the server's IP address on port 443 (HTTPS). The three-way handshake: SYN (your device sends a synchronize packet), SYN-ACK (server responds with synchronize-acknowledge), ACK (your device acknowledges). A reliable connection is now established.

3. TLS Handshake: Since this is HTTPS, a TLS handshake follows. The server presents its digital certificate (issued by a trusted Certificate Authority). Your browser verifies the certificate is valid and authentic. Both sides negotiate an encryption algorithm and exchange keys. Subsequent communication is encrypted.

4. HTTP Request: Your browser sends an HTTP GET request: GET /page HTTP/1.1, plus headers (Host, Accept, Cookie, etc.).

5. Routing: The request travels in IP packets from your device, through your home router, to your ISP, then through a series of routers across the internet toward the server. Each router examines the packet's destination IP address and forwards it to the next hop indicated by its routing table. Packets commonly traverse 10 to 20 router hops.

6. Server Processing: The web server receives the request, retrieves or generates the requested page, and sends an HTTP response: status code 200 (OK), headers, and the HTML content.

7. TCP Acknowledgment: TCP acknowledges each received segment. If a segment is lost, the sender retransmits it after a timeout or after receiving duplicate acknowledgments. The receiver reassembles the segments in order.

8. Rendering: Your browser receives the HTML, parses it, discovers referenced CSS, JavaScript, images, and fonts, and issues additional requests for each (many in parallel using HTTP/2 multiplexing). The page is rendered as resources arrive.

The entire process, from pressing Enter to seeing a rendered page, typically takes 200-500 milliseconds over a fast connection.
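
Steps 2, 4, and 6 can be tried locally without touching the real internet. The sketch below starts a throwaway web server on the loopback interface and fetches /page from it over a raw TCP socket. It skips DNS and TLS (plain HTTP only, since TLS would need a certificate), and PageHandler, fetch, and demo are names made up for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PageHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server in step 6."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(host: str, port: int, path: str) -> bytes:
    """Open a TCP connection, send an HTTP/1.1 GET, and return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept: text/html\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    # create_connection triggers the three-way handshake in the OS (step 2).
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)  # step 4: the HTTP request
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 lets the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1], "/page")
    finally:
        server.shutdown()
        server.server_close()


response = demo()
status_line = response.split(b"\r\n", 1)[0]
print(status_line.decode("ascii"))
```

The handshake itself never appears in the code: the operating system's TCP implementation performs it inside the connect call, which is typical of how applications use the transport layer.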


Routing: How Packets Find Their Way

IP Routing

Every router on the internet maintains a routing table: a map of which network interfaces to use to forward packets toward different destination IP address ranges. When a packet arrives, the router looks up the destination IP, finds the most specific matching route, and forwards the packet toward that next hop.

Routers do not know the complete path to every destination — only the next step. The packet progressively makes its way from router to router, with each router making a local forwarding decision, until it reaches its destination.
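
The "most specific matching route" rule is usually called longest-prefix matching. A minimal sketch using Python's ipaddress module follows; the routing table entries and next-hop names are invented for illustration, and real routers use specialized data structures rather than a linear scan.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop). All values are illustrative.
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-uplink"),       # default route matches everything
    (ipaddress.ip_network("203.0.113.0/24"), "peer-a"),
    (ipaddress.ip_network("203.0.113.128/25"), "peer-b"),    # more specific than the /24
    (ipaddress.ip_network("198.51.100.0/24"), "peer-c"),
]


def next_hop(destination: str) -> str:
    """Return the next hop of the matching route with the longest prefix."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in ROUTING_TABLE if address in network]
    _best_network, best_hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return best_hop


for destination in ("203.0.113.200", "203.0.113.5", "8.8.8.8"):
    print(destination, "->", next_hop(destination))
```

Note that the function returns only a next hop, never a full path, matching the point above that each router makes a purely local decision.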

BGP: The Internet's Routing Backbone

The internet consists of approximately 70,000 autonomous systems (AS): separately administered networks operated by ISPs, corporations, universities, and other organizations. Each AS is identified by a unique AS number (ASN).

BGP is the protocol these autonomous systems use to exchange routing information. Each AS announces to its BGP neighbors which IP address ranges it can reach. Those neighbors propagate the announcements to their neighbors. Eventually, every AS learns how to reach every IP address range on the internet.
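
The way an announcement spreads from neighbor to neighbor can be sketched as a toy simulation. This is not how a real BGP implementation works: the topology below is invented (using AS numbers reserved for documentation), and route selection is reduced to "prefer the shortest AS path", whereas real routers apply operator policy first.

```python
from collections import deque

# Hypothetical topology: AS number -> neighboring AS numbers.
NEIGHBORS = {
    64500: [64501, 64502],
    64501: [64500, 64503],
    64502: [64500, 64503],
    64503: [64501, 64502, 64504],
    64504: [64503],
}


def propagate(origin: int) -> dict[int, list[int]]:
    """Return the AS path each AS ends up using to reach a prefix announced by `origin`."""
    best_path = {origin: [origin]}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for neighbor in NEIGHBORS[current]:
            if neighbor in best_path[current]:
                continue  # loop prevention: ignore paths that already contain this AS
            # The neighbor records the path it heard with its own number prepended.
            candidate = [neighbor] + best_path[current]
            if neighbor not in best_path or len(candidate) < len(best_path[neighbor]):
                best_path[neighbor] = candidate
                queue.append(neighbor)  # the neighbor re-announces its new best path
    return best_path


paths = propagate(64500)
for asn in sorted(paths):
    print(asn, "reaches the prefix via", paths[asn])
```

Even in this simplified form, every AS accepts whatever its neighbors announce, which is why a single wrong announcement can spread so widely.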

BGP is essentially a policy protocol: each AS can configure BGP policies to prefer certain routes, filter others, and control how its routes are announced. This flexibility is powerful but creates risks. A single misconfigured BGP announcement can propagate globally and redirect internet traffic in seconds.

BGP incidents: In 2008, Pakistan Telecom accidentally made YouTube unreachable globally for about 2 hours. In 2010, China Telecom accidentally announced routes to over 37,000 networks, briefly drawing traffic for those networks through China. BGP's trust-based model, designed when the internet was a small community of operators who largely trusted one another, remains a security vulnerability.


DNS: The Internet's Phone Book

DNS is one of the internet's most elegant designs — a hierarchical, distributed database that scales to hundreds of millions of domains while remaining fast and resilient.

The DNS Hierarchy

DNS is organized as a tree:

  • Root zone (.): Managed by ICANN. 13 named root servers (a.root-servers.net through m.root-servers.net), each replicated across many physical servers worldwide via anycast, answer queries about top-level domains.
  • Top-Level Domains (TLDs): .com, .net, .org, .uk, .de, etc. Each is managed by a registry (VeriSign manages .com).
  • Second-level domains: example.com, google.com — registered by domain owners.
  • Subdomains: www.example.com, mail.example.com — configured by domain owners.

DNS Lookup Process

When your browser needs to resolve www.google.com and it's not in cache:

  1. Your device queries your local DNS resolver (e.g., 8.8.8.8)
  2. Resolver queries a root nameserver: "Who handles .com?"
  3. Root responds with the address of VeriSign's TLD nameservers
  4. Resolver queries VeriSign: "Who handles google.com?"
  5. VeriSign responds with Google's authoritative nameserver addresses
  6. Resolver queries Google's nameserver: "What is the IP of www.google.com?"
  7. Google responds: 142.250.80.46 (or similar)
  8. Resolver caches the result and returns it to your device

The entire process typically takes 20-120 milliseconds and is repeated only when cached entries expire (according to TTL values set by domain owners).
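
The lookup steps and the caching behavior can be modeled with plain dictionaries. The sketch below is a toy, not a DNS client: the three "nameservers" are in-memory tables, the name www.example.com, the address 192.0.2.44, and the 300-second TTL are placeholder values, and the Resolver class is invented for this example.

```python
import time

# Hypothetical in-memory stand-ins for the three levels of nameservers.
ROOT = {"com": "tld-com"}
TLD = {"tld-com": {"example.com": "ns-example"}}
AUTHORITATIVE = {"ns-example": {"www.example.com": ("192.0.2.44", 300)}}  # (address, TTL in seconds)


class Resolver:
    """Toy recursive resolver that walks root -> TLD -> authoritative and caches answers."""

    def __init__(self):
        self.cache = {}            # name -> (address, expiry time)
        self.upstream_queries = 0  # how many nameserver lookups were needed

    def resolve(self, name, now=None):
        now = time.monotonic() if now is None else now
        cached = self.cache.get(name)
        if cached and cached[1] > now:
            return cached[0]       # still fresh: no upstream queries at all

        labels = name.split(".")
        tld_server = ROOT[labels[-1]]                    # "Who handles .com?"
        self.upstream_queries += 1
        auth_server = TLD[tld_server][".".join(labels[-2:])]  # "Who handles example.com?"
        self.upstream_queries += 1
        address, ttl = AUTHORITATIVE[auth_server][name]  # "What is the IP of www.example.com?"
        self.upstream_queries += 1

        self.cache[name] = (address, now + ttl)
        return address


resolver = Resolver()
print(resolver.resolve("www.example.com", now=0), resolver.upstream_queries)
print(resolver.resolve("www.example.com", now=100), resolver.upstream_queries)  # served from cache
print(resolver.resolve("www.example.com", now=301), resolver.upstream_queries)  # TTL expired
```

The query counter makes the effect of the TTL visible: the second call costs nothing upstream, while the third repeats the full walk because the cached entry has expired.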


Security: How the Internet Protects Data

TLS Encryption

Every HTTPS connection uses TLS (Transport Layer Security) to encrypt data between browser and server. TLS provides:

  • Confidentiality: Data is encrypted; eavesdroppers see only ciphertext
  • Integrity: Any tampering with data in transit is detectable
  • Authentication: Digital certificates prove the server's identity

TLS uses asymmetric cryptography (typically Elliptic Curve Diffie-Hellman) to establish a shared secret, then switches to symmetric encryption (AES) for the actual data transfer.
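
The idea of agreeing on a shared secret over a public channel can be shown with the classic finite-field form of Diffie-Hellman. This is a swapped-in simplification, not what TLS does in practice: TLS typically uses the elliptic-curve variant with standardized parameters, while the modulus, base, and function names below are toy choices that are not secure. The symmetric encryption step (AES) is omitted because Python's standard library does not include it.

```python
import hashlib
import secrets

# Toy public parameters, known to everyone including eavesdroppers. NOT secure.
P = 2**127 - 1   # a Mersenne prime, far smaller than real-world parameters
G = 3            # base chosen arbitrarily for illustration


def keypair():
    """Return (private, public): the private value never leaves its owner."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # G^private mod P, safe to send in the clear
    return private, public


def shared_key(own_private: int, peer_public: int) -> bytes:
    """Combine our private value with the peer's public value, then hash to a 256-bit key."""
    secret = pow(peer_public, own_private, P)   # (G^b)^a mod P equals (G^a)^b mod P
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


browser_private, browser_public = keypair()
server_private, server_public = keypair()

# Only the public values cross the network; each side derives the key locally.
browser_key = shared_key(browser_private, server_public)
server_key = shared_key(server_private, browser_public)
print(browser_key == server_key)
```

Both sides arrive at the same key without ever transmitting it, and that key is what a symmetric cipher would then use for the bulk of the traffic.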

Common Internet Security Threats

| Threat | Mechanism | Defense |
|---|---|---|
| Man-in-the-middle | Attacker intercepts traffic between two parties | TLS encryption and certificate verification |
| DDoS (Distributed Denial of Service) | Flooding a server with traffic from many sources | Rate limiting, traffic scrubbing, CDN distribution |
| BGP hijacking | Malicious or erroneous route announcements redirect traffic | RPKI (Resource Public Key Infrastructure) |
| DNS poisoning | Corrupting cached DNS records with false entries | DNSSEC (DNS Security Extensions) |
| Phishing | Impersonating legitimate sites to steal credentials | Certificate authorities, browser warnings |

The Future of the Internet

IPv6 adoption: IPv4's 4.3 billion addresses are effectively exhausted. IPv6, with its 3.4 × 10³⁸ addresses, is being deployed gradually and carried about 40% of traffic to Google as of 2024. The transition is underway but incomplete.
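
The two address-space figures follow directly from the address widths. The short check below derives them and confirms the result against Python's ipaddress module; nothing here is specific to any network.

```python
import ipaddress

ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses

print(f"IPv4 addresses: {ipv4_total:,}")     # about 4.3 billion
print(f"IPv6 addresses: {ipv6_total:.2e}")   # about 3.4 x 10^38

# The same totals, read off the networks that cover each entire address space.
print(ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total)
print(ipaddress.ip_network("::/0").num_addresses == ipv6_total)

# IPv6 offers 2^96 addresses for every single IPv4 address.
print(ipv6_total // ipv4_total == 2 ** 96)
```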

HTTP/3 and QUIC: HTTP/3 replaces TCP with QUIC, a transport protocol built on UDP that provides reliability, security, and multiplexing with lower latency. It handles packet loss more gracefully (a single lost packet doesn't stall all concurrent streams) and folds the transport and TLS handshakes together, saving the extra round trips that TCP connection setup followed by a separate TLS handshake requires.

Satellite internet: SpaceX's Starlink and Amazon's Kuiper are deploying low-Earth orbit satellite constellations providing broadband to underserved areas. Starlink has over 2 million subscribers as of 2024, offering speeds of 50-200 Mbps with latency around 20-40ms — competitive with terrestrial broadband.


Frequently Asked Questions

How does data travel across the internet?

Data is broken into small packets, each containing a portion of the data plus addressing information. Packets travel independently through a network of routers, which forward each packet toward its destination based on routing tables. Packets may take different paths and are reassembled at the destination.
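
Splitting and reassembly can be sketched in a few lines. This is a simplified model rather than a real protocol: the dictionary fields, the 8-byte payload size, and the function names are invented for illustration, and the "seq" field plays the role that sequence numbers play in TCP.

```python
import random


def packetize(data: bytes, payload_size: int, src: str, dst: str) -> list[dict]:
    """Split data into packets, each carrying addressing info and a byte offset."""
    return [
        {"src": src, "dst": dst, "seq": offset, "payload": data[offset:offset + payload_size]}
        for offset in range(0, len(data), payload_size)
    ]


def reassemble(packets: list[dict]) -> bytes:
    """Put packets back in order by offset and join their payloads."""
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return b"".join(packet["payload"] for packet in ordered)


message = b"Packets may take different paths and arrive out of order."
packets = packetize(message, payload_size=8, src="192.0.2.10", dst="198.51.100.20")

random.shuffle(packets)  # simulate out-of-order arrival
print(len(packets), "packets")
print(reassemble(packets) == message)
```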

What is the difference between the internet and the World Wide Web?

The internet is the physical and logical infrastructure — the cables, routers, and protocols that connect computers globally. The World Wide Web is a service built on top of the internet — a system of linked hypertext documents accessed via browsers using HTTP. Email, streaming, and gaming also run on the internet but are not part of the web.

What is an IP address?

An IP (Internet Protocol) address is a numerical label assigned to every device on a network, used to identify and locate it for communication. IPv4 uses 32-bit addresses (like 192.168.1.1), providing about 4.3 billion unique addresses. IPv6 uses 128-bit addresses, providing a virtually unlimited supply.

What does DNS do?

DNS (Domain Name System) translates human-readable domain names (like google.com) into IP addresses that computers use to locate servers. It functions as the internet's phone book — you look up a name and get a number to call.

What is HTTPS and why does it matter?

HTTPS (HyperText Transfer Protocol Secure) adds TLS encryption to HTTP, protecting data in transit from interception. It authenticates that you're communicating with the genuine server (not an impersonator) and encrypts the content of communications. The padlock icon in your browser indicates an HTTPS connection.

What is the physical infrastructure of the internet?

The internet runs on a physical layer including undersea fiber optic cables, terrestrial cables, cell towers, data centers, and satellite links. About 95% of international internet traffic travels through approximately 500 undersea cables. Data centers house the servers that store and serve content.

Who owns the internet?

No single entity owns the internet. It is a network of networks — thousands of interconnected ISPs, corporations, governments, and organizations that have each built part of the infrastructure and agree to exchange traffic using shared protocols. Organizations like ICANN coordinate domain names, while the IETF maintains the technical standards.