Networking Fundamentals for System Design

Before you can design distributed systems, you need to understand how machines talk to each other. IP addresses identify devices, the OSI model describes how data moves through a network, TCP and UDP define how reliably it's delivered, and DNS translates human-readable names into the IP addresses computers actually use. These four concepts underpin every system design decision that follows.

IP Addresses

An IP (Internet Protocol) address is a unique identifier assigned to every device on a network. It serves two purposes: identifying the host and providing its location for routing.

IPv4 vs IPv6

IPv4 uses a 32-bit address written in dotted-decimal notation, giving roughly 4.3 billion unique addresses. Example: 102.22.192.181. IANA handed out its last unallocated IPv4 blocks in 2011.

IPv6 was introduced to solve IPv4 exhaustion. It uses a 128-bit address written in hexadecimal, providing approximately 3.4 × 10³⁸ unique addresses. Taking a common estimate of about 7.5 × 10¹⁸ grains of sand on Earth, that is enough to give every grain roughly 45 quintillion (4.5 × 10¹⁹) addresses. Example: 2001:0db8:85a3:0000:0000:8a2e:0370:7334.

Property	IPv4	IPv6
Address length	32-bit	128-bit
Notation	Dotted decimal (4 octets)	Hexadecimal (8 groups)
Address space	~4.3 billion	~3.4 × 10³⁸
Header size	20–60 bytes	Fixed 40 bytes
NAT	Commonly used (address exhaustion)	Generally not needed
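The differences in the table can be checked with Python's standard `ipaddress` module. This is a small sketch using the two example addresses from this section; the variable names are illustrative only.

```python
import ipaddress

# The two example addresses used in this section.
v4 = ipaddress.ip_address("102.22.192.181")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

for addr in (v4, v6):
    # max_prefixlen is the address length in bits (32 or 128).
    bits = addr.max_prefixlen
    print(f"IPv{addr.version}: {addr} ({bits} bits, {2 ** bits:.3e} addresses)")

# IPv6 has a shorter canonical form: leading zeros are dropped and
# one run of all-zero groups collapses to "::".
print(v6.compressed)  # 2001:db8:85a3::8a2e:370:7334
print(v6.exploded)    # 2001:0db8:85a3:0000:0000:8a2e:0370:7334
```

The compressed form is what most tools print, so it is worth recognising both spellings as the same address.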

Types of IP Addresses

Public IP — assigned to your network by your ISP. All devices behind a home router share one public IP. This is the address the wider internet sees.

Private IP — assigned within a local network (home or office). Ranges reserved for private use: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16. Not routable on the public internet.

Static IP — manually configured; does not change. Used for servers, load balancers, and anything that must be consistently reachable at a known address.

Dynamic IP — assigned by a DHCP server; changes over time. Standard for consumer devices. Cheaper to manage at scale because addresses are recycled when devices leave the network.
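To make the private ranges concrete, here is a minimal sketch that tests whether an address falls inside one of the three reserved blocks listed above (defined in RFC 1918). The helper name `is_rfc1918` and the sample addresses are chosen for illustration.

```python
import ipaddress

# The three private IPv4 ranges listed above (RFC 1918).
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(address: str) -> bool:
    """Return True if the address falls in one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_RANGES)

for candidate in ["10.1.2.3", "172.31.255.254", "172.32.0.1", "192.168.1.10", "8.8.8.8"]:
    kind = "private" if is_rfc1918(candidate) else "not private"
    print(f"{candidate:>15} -> {kind}")
```

Note that 172.16.0.0/12 ends at 172.31.255.255, so 172.32.0.1 is outside the private space even though it looks similar.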

💡

Why This Matters in System Design

When designing services, use private IPs for inter-service communication and only expose public IPs at load balancers or API gateways. This reduces your attack surface and avoids unnecessary egress costs.

OSI Model

The Open Systems Interconnection (OSI) model is a conceptual framework that breaks network communication into seven layers. Each layer has a specific responsibility and communicates with the layers directly above and below it.

Understanding OSI helps you reason about where a problem or feature lives: a firewall operates at layers 3–4, an API gateway at layer 7, and TLS at layers 5–6. When someone says “L4 load balancer” or “L7 routing,” they’re referring to OSI layers.

Layer	Name	Responsibility	Examples
7	Application	User-facing protocols and services	HTTP, SMTP, DNS, FTP
6	Presentation	Data encoding, encryption, compression	TLS/SSL, JPEG, UTF-8
5	Session	Managing connections between applications	TLS handshake, RPC sessions
4	Transport	End-to-end delivery, segmentation, flow control	TCP, UDP
3	Network	Logical addressing and routing between networks	IP, ICMP, routers
2	Data Link	Node-to-node framing, MAC addressing, error detection	Ethernet, Wi-Fi, switches
1	Physical	Transmission of raw bits over a medium	Cables, radio, fibre optics

How Data Flows Through the Layers

When your browser makes an HTTP request, data travels down the stack on the sender side and up on the receiver side. Each layer wraps the data in its own header (encapsulation), and each layer on the receiving end strips its header (decapsulation).

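For example, an HTTP request starts at layer 7, is encrypted by TLS at layer 6, is tied to a session at layer 5, is split into TCP segments at layer 4, wrapped in IP packets at layer 3, framed as Ethernet frames at layer 2, and finally transmitted as electrical or optical signals at layer 1. Real protocols do not map perfectly onto the model (TLS and TCP each cover parts of the session layer's job), so treat the layer 5 and 6 assignments as approximate.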
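Encapsulation and decapsulation can be illustrated with a toy model in which each layer prepends a labelled header. The bracketed headers below are placeholders, not real protocol formats, and only layers 4, 3, and 2 are modelled.

```python
# Toy model of encapsulation: each layer prepends a labelled header.
# Header contents are placeholders, not real protocol formats.
LAYERS = ["TCP", "IP", "ETH"]  # layers 4, 3, 2 on the way down the stack

def encapsulate(payload: bytes) -> bytes:
    """Wrap the payload going down the stack (sender side)."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload

def decapsulate(frame: bytes) -> bytes:
    """Strip headers going up the stack (receiver side), outermost first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(header):]
    return frame

request = b"GET / HTTP/1.1"
frame = encapsulate(request)
print(frame)               # b'[ETH][IP][TCP]GET / HTTP/1.1'
print(decapsulate(frame))  # b'GET / HTTP/1.1'
```

The outermost header belongs to the lowest layer, which is why the receiver removes them in the reverse order from how the sender added them.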

💡

Practical Reference

In system design conversations, layers 3, 4, and 7 come up most often. L3 = IP routing (firewalls, VPNs). L4 = TCP/UDP (NLBs, stateful firewalls). L7 = HTTP/application (API gateways, WAFs, ALBs). Layers 1–2 are typically managed by your cloud provider.

TCP vs UDP

Both TCP and UDP are Layer 4 transport protocols that carry application data across IP networks. They represent a fundamental trade-off: reliability vs speed.

TCP — Transmission Control Protocol

TCP is connection-oriented. Before any data is exchanged, both sides perform a three-way handshake: the client sends SYN, the server responds with SYN-ACK, and the client completes with ACK. Only then does data flow.

TCP guarantees: delivery (lost packets are retransmitted), ordering (segments are reassembled in sequence), and error-checking (checksums on every segment). These guarantees come with overhead — more round trips, more bookkeeping per connection.
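The following sketch runs a one-shot echo server and a client over the loopback interface using Python's standard `socket` module. The handshake is performed by the operating system inside `create_connection()`; the helper name `echo_once` and the message are illustrative.

```python
import socket
import threading

def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo back whatever arrives until the peer closes."""
    conn, _ = server.accept()
    with conn:
        while chunk := conn.recv(1024):
            conn.sendall(chunk)

# Port 0 asks the OS for any free port; loopback keeps the demo local.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
worker.start()

# create_connection() returns only after the three-way handshake completes.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello over tcp")
    client.shutdown(socket.SHUT_WR)  # tell the server no more data is coming
    reply = b""
    while chunk := client.recv(1024):
        reply += chunk

worker.join(timeout=5)
server.close()
print(reply)  # the same bytes, delivered completely and in order
```

The application never sees SYN, SYN-ACK, or ACK directly. It only observes the result: a byte stream that arrives intact or raises an error.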

UDP — User Datagram Protocol

UDP is connectionless. There is no handshake — the sender just fires packets at the destination. There is no delivery guarantee, no ordering, and no retransmission. What it lacks in reliability, it makes up for in speed and simplicity.

UDP is preferred when low latency matters more than perfect delivery: live video and conferencing (a late frame is worse than a dropped frame), DNS lookups (a small query that usually fits in one packet), VoIP, and online games. On-demand video, by contrast, is normally delivered over HTTP on top of TCP.
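For comparison with TCP, here is a minimal UDP exchange over loopback, again with the standard `socket` module. There is no `listen()`, `accept()`, or `connect()` step; the payload `frame-1` is a made-up example.

```python
import socket

# Receiver: bind a datagram socket; there is no listen() or accept().
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

# Sender: no connection setup, sendto() hands the datagram straight to the network.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame-1", ("127.0.0.1", port))

try:
    data, source = receiver.recvfrom(2048)
except socket.timeout:
    # UDP gives no delivery guarantee, so the application must handle loss itself.
    data, source = None, None
finally:
    sender.close()
    receiver.close()

print(data, source)
```

On loopback the datagram will almost always arrive, but the timeout branch shows where a real application would have to decide whether to retry, skip, or give up.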

Feature	TCP	UDP
Connection	Connection-oriented (3-way handshake)	Connectionless
Delivery guarantee	Yes — retransmits lost packets	No
Ordering	In-order delivery	No ordering
Error checking	Checksum (corrupt packets discarded + retransmitted)	Checksum only (corrupt packets silently dropped)
Speed	Slower (overhead per packet)	Faster (minimal overhead)
Broadcasting	No	Yes
Use cases	HTTP/S, SMTP, SSH, FTP	DNS, live video streaming, VoIP, gaming

⚠️

When to Use Which

Default to TCP for anything requiring correctness: APIs, databases, file transfers, authentication. Switch to UDP when you control both endpoints, can tolerate loss, and need the lowest possible latency — real-time media, sensor telemetry, or custom game networking protocols.

DNS

DNS is the internet’s phonebook. Humans remember google.com; computers need 142.250.80.46. DNS translates between the two via a distributed, hierarchical system of servers. It’s also a critical part of system design — DNS-level load balancing, health checks, and failover are common patterns.

How DNS Resolution Works

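When you type example.com into a browser, a chain of lookups fires before a single byte of the website is fetched. The browser and operating system check their local caches first. On a miss, the query goes to a recursive resolver, which asks a root name server, then the TLD name server for .com, then the domain's authoritative name server, and finally returns the answer to the client.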

The Four DNS Server Types

DNS Resolver (recursive resolver) — the first stop for your query. Usually run by your ISP or a public resolver like Google (8.8.8.8) or Cloudflare (1.1.1.1). It does the work of asking other servers on your behalf and caches results.

Root Name Server — knows the address of every TLD server. There are 13 root server names (a.root-servers.net through m.root-servers.net), each served by many physical machines worldwide via anycast. They don’t know IPs for individual domains; they just refer queries to the right TLD server.

TLD Name Server — manages a top-level domain zone (.com, .org, .uk, etc.). It knows which authoritative name server is responsible for each registered domain in its zone.

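Authoritative Name Server — the final authority for a specific domain. It holds the actual DNS records (A, CNAME, MX, etc.) and returns the definitive answer. If the queried name does not exist in its zone, it returns NXDOMAIN; if the name exists but has no record of the requested type, it returns an empty answer instead.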

Query Types

Recursive query — the client asks the resolver to do all the work and return a final answer (or an error). Most browser queries are recursive.

Iterative query — the resolver asks each server in turn; each either answers or redirects to another server. The resolver does the legwork.

Non-recursive query — the resolver already has the answer cached and returns it immediately without hitting any upstream servers.
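The interaction between the server types and the query types can be sketched with in-memory dictionaries standing in for each tier. Everything here is hypothetical (server names such as `tld-com` and `ns-example`, and the data they hold); a real resolver speaks the DNS wire protocol over the network.

```python
# Hypothetical in-memory stand-ins for the server tiers; all data is made up.
ROOT = {"com": "tld-com"}                                 # TLD -> TLD server
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}  # domain -> authoritative server
AUTHORITATIVE = {"ns-example": {"example.com": "93.184.216.34"}}

cache = {}  # resolver cache: name -> IP

def resolve(name):
    """Return (ip, steps), where steps lists what the resolver consulted."""
    if name in cache:
        return cache[name], ["cache"]            # non-recursive: answered locally
    steps = []
    tld = name.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]                       # root refers us to the TLD server
    steps.append("root")
    auth_server = TLD_SERVERS[tld_server][name]  # TLD refers us to the authoritative server
    steps.append(tld_server)
    ip = AUTHORITATIVE[auth_server][name]        # authoritative server returns the record
    steps.append(auth_server)
    cache[name] = ip
    return ip, steps

print(resolve("example.com"))  # ('93.184.216.34', ['root', 'tld-com', 'ns-example'])
print(resolve("example.com"))  # ('93.184.216.34', ['cache'])
```

From the client's point of view both calls are one recursive query. The iterative walk through root, TLD, and authoritative servers happens only inside the resolver, and only on a cache miss.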

Key DNS Record Types

Record	Purpose	Example
`A`	Maps a domain to an IPv4 address	`example.com → 93.184.216.34`
`AAAA`	Maps a domain to an IPv6 address	`example.com → 2606:2800::1`
`CNAME`	Alias from one name to another	`www.example.com → example.com`
`MX`	Mail server for a domain	`example.com → mail.example.com`
`NS`	Authoritative name servers for a domain	`example.com → ns1.registrar.com`
`TXT`	Arbitrary text; used for SPF, DKIM, verification	`"v=spf1 include:..."`
`PTR`	Reverse lookup — IP to domain name	`34.216.184.93.in-addr.arpa → example.com`
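The reversed name used by a `PTR` record can be derived with the standard `ipaddress` module, which exposes it as the `reverse_pointer` attribute. The IPv6 address below comes from the documentation prefix 2001:db8::/32 and is only an example.

```python
import ipaddress

addr = ipaddress.ip_address("93.184.216.34")
# Octets are reversed and the in-addr.arpa suffix is appended.
print(addr.reverse_pointer)  # 34.216.184.93.in-addr.arpa

# IPv6 uses the same idea, one hex digit per label, under ip6.arpa.
v6 = ipaddress.ip_address("2001:db8::1")
print(v6.reverse_pointer)
```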

DNS Caching and TTL

Every DNS record has a TTL (Time to Live) in seconds. Resolvers cache records for the TTL duration. A record with TTL=300 is cached for 5 minutes; after that, the resolver must re-query. Cached results return instantly (non-recursive query).

Low TTL = faster propagation of changes but more DNS queries and higher latency on cold lookups. High TTL = fewer queries but slower rollout of IP changes. For deployments where you need rapid failover (e.g. DNS-based health checks), use a TTL of 60–120 seconds.
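A resolver-style cache with per-record TTLs might look like the sketch below. The class name `TTLCache` and its methods are invented for illustration, and a fake clock is injected so the expiry behaviour is deterministic.

```python
import time

class TTLCache:
    """Minimal resolver-style cache: entries expire TTL seconds after insertion."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry timestamp)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must re-query upstream
            return None
        return value

# A fake clock keeps the example deterministic.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com", "93.184.216.34", ttl=300)

now[0] = 299
print(cache.get("example.com"))  # 93.184.216.34 (still cached)
now[0] = 300
print(cache.get("example.com"))  # None (the 5-minute TTL has elapsed)
```

The trade-off described above is visible here: a smaller `ttl` means `get()` returns `None` sooner, forcing more upstream queries but picking up changes faster.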

💡

DNS in System Design Interviews

DNS is often the first layer of load balancing in large systems. Services like Route 53 and Cloudflare DNS support health checks, weighted routing, geo-routing, and latency-based routing. When you need global load distribution or multi-region failover, DNS is the first tool to reach for — before application-layer load balancers.
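As a rough illustration of weighted routing combined with health checks, the sketch below picks an endpoint with probability proportional to its weight and skips unhealthy ones. It is not how Route 53 or Cloudflare implement these features; the endpoint list, weights, and function name are assumptions, and the addresses come from reserved documentation ranges.

```python
import random

# Hypothetical endpoints: (address, weight, healthy).
ENDPOINTS = [
    ("192.0.2.10", 70, True),
    ("198.51.100.20", 30, True),
    ("203.0.113.30", 50, False),  # failed its health check, so never returned
]

def pick_endpoint(endpoints, rng=random):
    """Choose one healthy endpoint with probability proportional to its weight."""
    healthy = [(addr, weight) for addr, weight, ok in endpoints if ok]
    if not healthy:
        raise RuntimeError("no healthy endpoints")
    addresses, weights = zip(*healthy)
    return rng.choices(addresses, weights=weights, k=1)[0]

rng = random.Random(42)  # seeded so repeated runs give the same sample
sample = [pick_endpoint(ENDPOINTS, rng) for _ in range(10_000)]
for addr, _, _ in ENDPOINTS:
    print(addr, sample.count(addr))
```

Over many lookups the two healthy endpoints receive roughly 70% and 30% of answers. In practice resolver caching skews this, since every client behind one resolver sees the same cached answer until the TTL expires.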

Managed DNS Services

Amazon Route 53 — AWS’s DNS service with health checks, routing policies (weighted, latency, geolocation, failover), and integration with AWS services.
Cloudflare DNS — authoritative DNS with DDoS protection and optional proxying built in. Cloudflare also operates the 1.1.1.1 public resolver, which is a separate service.
Google Cloud DNS — globally distributed, anycast DNS with 100% uptime SLA.
Azure DNS — hosts DNS zones in Azure with integration into Azure networking and RBAC.