Content Delivery Networks (CDN)
When a user in Sydney requests an image stored on a server in Virginia, physics gets in the way. That round-trip takes roughly 200–300ms before a single byte of content is transferred. A Content Delivery Network solves this by placing copies of your content on servers located close to every major population center on earth. Instead of 200ms to Virginia, the request hits a node 5ms away in Sydney. CDNs are one of the most impactful and widely deployed optimizations in web infrastructure — and a near-universal component in any system that serves users globally.
What Is a CDN?
A Content Delivery Network (CDN) is a globally distributed network of servers — called edge nodes or Points of Presence (PoPs) — that cache and serve content from locations geographically close to end users. The CDN acts as a distributed caching layer that sits between your origin server and your users.
Without a CDN, every user — regardless of location — fetches content from your origin server. With a CDN, content is served from the nearest edge node. Users get faster responses, and your origin server handles a fraction of the traffic it would otherwise.
The origin is your primary server — the authoritative source of truth for your content. Edge nodes are the CDN’s distributed servers that cache content from the origin and serve it to nearby users. The origin only receives traffic when an edge node doesn’t have a cached copy.
How CDNs Work
When a user requests a URL served by a CDN, the request flow works like this:
- The user’s DNS lookup returns the IP address of the nearest edge node (CDN providers use Anycast routing or GeoDNS to route users to the closest PoP).
- The request hits the edge node. If the content is cached and not expired, the edge node returns it immediately — the origin is never contacted.
- If the content is not cached (cache miss), the edge node fetches it from the origin server, caches a copy, and returns it to the user. Subsequent requests for the same content from users near that edge node are served from cache.
The key insight is that popular content gets cached after the first request. Every subsequent user in that region is served from the edge — not the origin. A single cache miss pays the latency cost once; all future requests pay the low edge-node cost.
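The pull flow above can be sketched as a minimal in-memory edge cache. This is illustrative only: `CACHE`, `fetch_from_origin`, and the fixed 60-second TTL are assumptions standing in for a real edge node's storage and origin fetch.

```python
import time

# Hypothetical in-memory edge cache: URL -> (body, expiry timestamp).
CACHE = {}

def fetch_from_origin(url):
    """Placeholder for the HTTP request a real edge node would make."""
    return f"content for {url}", 60  # (body, TTL in seconds)

def handle_request(url):
    """Serve from the edge cache; fall back to the origin on a miss."""
    entry = CACHE.get(url)
    if entry is not None and entry[1] > time.time():
        return entry[0], "HIT"           # cached and not yet expired
    body, ttl = fetch_from_origin(url)   # cache miss: contact the origin
    CACHE[url] = (body, time.time() + ttl)
    return body, "MISS"
```

The first request for a URL pays the origin round trip; every later request within the TTL is a `HIT` served entirely at the edge.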
CDN providers use Anycast: the same IP address is announced from multiple PoP locations around the world. The internet’s routing infrastructure automatically delivers traffic to the topologically nearest node advertising that IP. No application-level routing logic is required — proximity routing is handled at the network layer.
Push vs Pull CDNs
There are two fundamental models for how content gets onto CDN edge nodes: push and pull.
Pull CDN
The CDN fetches content from the origin on demand — when a user requests something that isn’t cached. The first request for new content is a cache miss (slightly slower), but all subsequent requests are cache hits. Edge nodes populate themselves lazily.
- Advantage: No upfront work. Content is only fetched when needed. Storage on edge nodes stays lean.
- Disadvantage: The first user in a region after a cache miss (or cache expiry) experiences origin latency.
Best for: Most web applications. Pull CDNs are the dominant model because they are simple to operate — you just point the CDN at your origin and configure TTLs.
Push CDN
You explicitly upload content to CDN edge nodes in advance. The CDN doesn’t fetch from origin — you push files to it. Changes require re-pushing updated files.
- Advantage: Zero cache misses for pre-pushed content. The first user always gets edge-node speed. The origin can even be unavailable after the initial push without affecting delivery.
- Disadvantage: You must manage what’s on the CDN. Content that is rarely accessed still consumes edge storage. Keeping CDN content in sync with origin changes requires tooling.
Best for: Large static assets that change infrequently and must be globally available from the first request — software downloads, video files, game assets.
| Model | Cache population | First-request latency | Operational complexity | Best for |
|---|---|---|---|---|
| Pull | Lazy (on first miss) | Origin latency on miss | Low | Web apps, APIs, images |
| Push | Explicit upload | Always fast | Higher | Large infrequent files, videos |
CDN Caching
CDNs cache content at edge nodes based on HTTP caching headers from the origin. The most important headers are:
Cache-Control
The primary header for controlling CDN and browser caching behavior:
- `Cache-Control: public, max-age=86400` — cache for 24 hours at both the CDN and the browser
- `Cache-Control: public, s-maxage=3600` — `s-maxage` overrides `max-age` for shared caches (CDNs); the browser may use a different TTL
- `Cache-Control: no-store` — do not cache anywhere
- `Cache-Control: no-cache` — revalidate with the origin before serving (confusingly named; it does cache)
- `Cache-Control: private` — the browser may cache, but the CDN must not (for user-specific content)
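A shared cache applies these directives in a defined order of precedence. The sketch below is a simplified illustration (it ignores revalidation semantics like `no-cache` and treats a directive-free header as a zero TTL), showing how a CDN might derive its storage TTL from the header:

```python
def shared_cache_ttl(cache_control: str):
    """Return how long a shared cache (CDN) may store a response, in seconds,
    or None if it must not cache at all. Simplified sketch: ignores
    revalidation (no-cache) and all other directives."""
    directives = {}
    for part in cache_control.split(","):
        key, _, value = part.strip().partition("=")
        directives[key.lower()] = value
    if "no-store" in directives or "private" in directives:
        return None                         # CDN must not cache
    if "s-maxage" in directives:
        return int(directives["s-maxage"])  # shared-cache TTL wins over max-age
    if "max-age" in directives:
        return int(directives["max-age"])
    return 0
```

Note how `s-maxage` takes precedence over `max-age` for the shared cache, while a browser would still honor `max-age`.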
Cache Invalidation at the Edge
When you deploy a new version of a file, CDN edge nodes may still hold the old cached version for the remainder of its TTL. You have two options:
- Cache purge / invalidation: Call the CDN API to immediately remove cached copies of specific URLs. All CDN providers support this (usually within seconds to a few minutes for global propagation).
- Cache-busting via URL versioning: Append a version hash to the file URL (`app.js?v=abc123` or `app.abc123.js`). The CDN treats it as a new URL and fetches it fresh. The old URL continues to be served until its TTL expires. This is the most reliable technique for static assets.
The standard pattern for static assets is: set a very long TTL (1 year) on the CDN and browser, and use cache-busting URLs with content hashes. You get maximum cache efficiency and the ability to instantly deploy changes. Reserve short TTLs for content where you can’t control the URL (like HTML pages).
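Build tools normally generate these content-hash names automatically; here is a minimal sketch of the idea (the 8-character SHA-256 prefix length is an arbitrary choice):

```python
import hashlib

def hashed_filename(name: str, content: bytes, length: int = 8):
    """Build a cache-busting filename like app.2d711642.js from file content.
    Any change to the content changes the hash, and therefore the URL."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"
```

Because the URL is derived from the content, an unchanged file keeps its URL (and its year-long cache entry), while a changed file gets a brand-new URL that is never stale.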
Cache Keys
By default, CDNs use the URL (and sometimes the Host header) as the cache key. Two users requesting the same URL get the same cached response. This is correct for public, non-personalized content.
For dynamic content that varies by user properties (language, device type, A/B test group), you can configure the CDN to vary the cache key by additional dimensions — usually by including specific request headers or query parameters. Be careful: a cache key that is too granular defeats the purpose of caching (many unique keys means low hit rate).
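A cache key that varies on a small, explicit set of headers might be built like this sketch (the `vary_on` tuple is a hypothetical configuration knob, not a real CDN API):

```python
def cache_key(url: str, headers: dict, vary_on=("accept-language",)):
    """Build a CDN cache key from the URL plus a small, explicit set of
    request headers. Header values are normalized (lowercased) so trivially
    different values don't fragment the cache."""
    parts = [url]
    for name in vary_on:
        parts.append(f"{name}={headers.get(name, '').lower()}")
    return "|".join(parts)
```

Every extra dimension multiplies the key space, so each one should earn its place; varying on a header like `User-Agent` verbatim would make nearly every request a unique key.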
Benefits
Reduced Latency
Serving content from an edge node 5ms away vs an origin 200ms away is a 40× improvement. For users on mobile connections with higher round-trip times, the gain is even larger. Perceived page load time, Core Web Vitals scores, and conversion rates all improve.
Reduced Origin Load
A CDN with a 95% hit rate means your origin handles only 5% of the raw request volume. Traffic spikes that would overwhelm an unprotected origin are absorbed by the CDN. You can run a smaller (cheaper) origin fleet.
High Availability
CDN edge nodes continue serving cached content even if the origin is temporarily unavailable (within the cache TTL). Some CDNs support stale-while-revalidate and stale-if-error — serving cached content even after TTL expiry when the origin returns an error. This makes CDN-served content resilient to origin outages.
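The stale-if-error behavior (specified in RFC 5861) boils down to a freshness decision at request time. A sketch, assuming a `(body, expiry)` cache entry and an illustrative one-hour stale window:

```python
import time

def should_serve(entry, origin_ok: bool, now=None, stale_if_error=3600):
    """Decide whether a cached entry may be served. entry = (body, expiry).
    Fresh entries are always served; expired ones are served only while the
    origin is failing and we are within the stale-if-error window."""
    now = time.time() if now is None else now
    body, expiry = entry
    if now <= expiry:
        return body                      # still fresh: serve normally
    if not origin_ok and now <= expiry + stale_if_error:
        return body                      # origin is down: serve stale copy
    return None                          # must refetch from the origin
```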
DDoS Protection
A CDN’s global network can absorb volumetric DDoS attacks by distributing the traffic across hundreds of PoPs. The attack traffic is spread thin and doesn’t reach your origin. Most CDN providers (Cloudflare, Akamai, AWS Shield) include DDoS mitigation as part of their service.
Security (TLS Termination)
CDNs terminate TLS at the edge, close to the user. TLS handshake latency is sensitive to round-trip time — terminating TLS at an edge node 5ms away is far faster than a full handshake to an origin 200ms away. The CDN handles certificate management and renewals. Your origin only needs to trust the CDN’s IP ranges.
Drawbacks
Cost
CDN providers charge for bandwidth, requests, and sometimes storage. For high-traffic services, CDN costs can be significant. However, these costs are usually offset by the savings in origin bandwidth and compute.
Complexity and Debugging
An extra caching layer means more places to check when content appears stale or incorrect. `X-Cache: HIT` / `MISS` headers help debug whether a response came from cache or the origin. Aggressive caching can cause users to see outdated content if invalidation is not handled carefully.
Dynamic Content Limitations
CDNs are most effective for cacheable content. Highly personalized, authenticated, or rapidly changing responses have a low cache hit rate. For these, the CDN is essentially a pass-through proxy — you pay CDN costs but get little caching benefit. Some CDNs offer edge compute (Cloudflare Workers, Lambda@Edge) to run logic at the edge, which can help with partially-dynamic content.
Geographic Blind Spots
Most CDN providers have excellent coverage in North America, Europe, and East Asia. Coverage in parts of Africa, South America, and Southeast Asia varies. If your users are concentrated in an underserved region, a CDN may not provide the latency improvement you expect — benchmark with your target audience.
| Benefit | Mechanism | Impact |
|---|---|---|
| Lower latency | Edge nodes near users | 40× latency reduction possible |
| Origin offload | High cache hit rate | 95%+ of traffic never hits origin |
| High availability | Stale content served during origin failure | Resilience to origin outages |
| DDoS absorption | Attack traffic distributed across PoPs | Origin shielded from volumetric attacks |
| Faster TLS | Handshake at nearby edge | 100ms+ saved on first connection |
CDN in System Design
In a typical system design interview, CDNs are relevant any time you need to serve content to a geographically distributed user base. Here is how to apply them confidently:
What to Put on a CDN
- Static assets: JavaScript, CSS, images, fonts, videos. These are the primary use case. Set long TTLs with cache-busting URLs.
- Cacheable API responses: Public API endpoints that return the same data for all users (e.g. product catalog, trending items). Use `Cache-Control: s-maxage` to control the CDN TTL independently of the browser TTL.
- HTML pages: For content-heavy sites (blogs, news). Use short TTLs (60s–5min) or cache invalidation on publish.
- Media files: Large video and audio files. Use a push CDN or direct upload to CDN storage (e.g. S3 + CloudFront).
What Not to Put on a CDN
- Authenticated API endpoints: Responses are user-specific and cannot be shared. Use `Cache-Control: private` or `no-store`.
- Real-time data: If accuracy within seconds is required (live prices, inventory), a cached copy with any TTL introduces unacceptable staleness.
- Write requests: `POST`, `PUT`, and `DELETE` should go directly to the origin. Some CDNs can pass them through, but caching semantics don't apply.
CDN Architecture Pattern
The standard architecture for a globally distributed web service:
- Static assets are built with content-hash filenames and uploaded to CDN (S3 + CloudFront, or similar). TTL = 1 year.
- The HTML entry point (e.g. `index.html`) is served from CDN with a short TTL or `no-cache` (it must always be fresh to pick up new asset hash names).
- API requests bypass the CDN and hit origin servers (load balanced, auto-scaled). Alternatively, public API endpoints use the CDN with appropriate TTLs.
- Origin servers sit behind the CDN with IP allowlisting — they only accept requests from CDN IP ranges, making direct-origin attacks much harder.
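The allowlisting step can be sketched with Python's standard `ipaddress` module. The ranges below are reserved documentation (TEST-NET) blocks, standing in for a real provider's published egress list:

```python
import ipaddress

# Hypothetical CDN egress ranges; real providers publish their current lists,
# which should be fetched and refreshed automatically, not hardcoded.
CDN_RANGES = [ipaddress.ip_network(n) for n in ("203.0.113.0/24", "198.51.100.0/24")]

def is_from_cdn(client_ip: str) -> bool:
    """Return True if the connecting IP belongs to one of the CDN's ranges,
    so the origin can reject direct traffic that bypasses the CDN."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in CDN_RANGES)
```

An origin would run this check (or the firewall equivalent) on every inbound connection and drop anything that did not arrive via the CDN.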
Mention CDNs early when designing any system with a global audience or high read traffic for static content. State what you’re putting on the CDN and why (latency, origin offload). Be ready to discuss cache invalidation strategy (URL versioning for static assets, TTL + manual purge for dynamic content), and what happens if the CDN is unavailable (fallback to origin, stale content policy). Interviewers appreciate specificity: a pull CDN with `s-maxage` headers is far more impressive than “we use a CDN.”