Design Netflix

Netflix serves more than 200 million subscribers in 190 countries and, by some estimates, accounts for roughly 15% of global downstream internet traffic. Its technical story is one of the best documented in engineering: Netflix pioneered the microservices pattern at scale, originated chaos engineering, and built Open Connect, one of the world’s largest private CDNs. Designing Netflix means designing for massive read throughput, global distribution, and graceful degradation when services fail.

Requirements

Functional:

Non-functional:

Capacity Estimation

Monthly active users: 200M
Peak concurrent streams: ~15M
Average stream bitrate: 5 Mbps (mix of SD/HD/4K)
Peak streaming bandwidth: 15M × 5 Mbps = 75 Tbps

Content library: ~36,000 titles
Encoded versions per title: ~1,200 (device types × resolutions × codecs × audio tracks)
Average encoded size per version: 5 GB
Total storage: 36,000 × 1,200 × 5 GB = ~216 PB
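
The two estimates above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this section and assumes decimal units (1 Tbps = 10^6 Mbps, 1 PB = 10^6 GB); the function and constant names are illustrative.

```python
# Back-of-envelope capacity estimate using the figures quoted in this section.
# Assumption: decimal units (1 Tbps = 1,000,000 Mbps; 1 PB = 1,000,000 GB).

MBPS_PER_TBPS = 1_000_000
GB_PER_PB = 1_000_000


def peak_bandwidth_tbps(concurrent_streams: int, avg_bitrate_mbps: float) -> float:
    """Aggregate egress needed when every concurrent stream pulls the average bitrate."""
    return concurrent_streams * avg_bitrate_mbps / MBPS_PER_TBPS


def library_storage_pb(titles: int, versions_per_title: int, avg_version_gb: float) -> float:
    """Total encoded library size, before any replication onto CDN nodes."""
    return titles * versions_per_title * avg_version_gb / GB_PER_PB


if __name__ == "__main__":
    print(f"Peak bandwidth: {peak_bandwidth_tbps(15_000_000, 5):.0f} Tbps")
    print(f"Library storage: {library_storage_pb(36_000, 1_200, 5):.0f} PB")
```

Keeping the inputs as parameters makes it easy to see how sensitive the totals are: halving the average bitrate, for example, halves the peak bandwidth.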

Video Ingestion & Transcoding

Before a video can be streamed, Netflix processes it through an extensive encoding pipeline. A single movie arrives as a high-quality master file (often several hundred GB) and must be transformed into thousands of versions for every device and network condition.

Transcoding pipeline:

  1. Ingest: Studio uploads the master file to Netflix’s ingest servers. The file is validated, checksummed, and stored in blob storage (S3).
  2. Scene analysis: Netflix’s proprietary encoding pipeline (Dynamic Optimizer) analyzes the video scene by scene. Complex scenes (action, high motion) get higher bitrates; simple scenes (talking heads, static backgrounds) get lower bitrates. This per-scene optimization reduces file size by 20–40% vs fixed bitrate encoding.
  3. Parallel transcoding: The video is split into chunks (typically 2–4 minutes each), and each chunk is transcoded in parallel across a cluster of worker machines. A 2-hour movie split into 2-minute chunks produces ~60 chunks; encoding each chunk at ~20 resolutions × 5 codecs (~100 variants per chunk) yields ~6,000 independent jobs. Total encoding time: hours rather than days.
  4. Audio and subtitle tracks: Audio is encoded separately in multiple languages and formats (Dolby Atmos, stereo). Subtitle tracks are timed text files (WebVTT). All are stored alongside the video chunks.
  5. Quality control: Automated quality metrics (VMAF — Video Multimethod Assessment Fusion, Netflix’s own metric) check each encoded chunk. Chunks below threshold are re-encoded.
  6. Packaging: Video and audio chunks are packaged into streaming formats (DASH and HLS) and pushed to the CDN.

Figure: Netflix architecture (transcoding pipeline, Open Connect CDN distribution, and streaming path to client)
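
The fan-out in step 3 and the quality gate in step 5 can be sketched as follows. This is a minimal illustration, not Netflix's actual pipeline: `EncodeJob`, `split_into_chunks`, `plan_jobs`, and `needs_reencode` are hypothetical names, and the VMAF threshold of 90 is an assumed value chosen only for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EncodeJob:
    """One independent unit of work: a time range encoded at one resolution and codec."""
    chunk_index: int
    start_s: int
    end_s: int
    resolution: str
    codec: str


def split_into_chunks(duration_s: int, chunk_s: int) -> list[tuple[int, int]]:
    """Cut [0, duration_s) into consecutive ranges; the last one may be shorter."""
    return [(start, min(start + chunk_s, duration_s))
            for start in range(0, duration_s, chunk_s)]


def plan_jobs(duration_s: int, chunk_s: int,
              resolutions: list[str], codecs: list[str]) -> list[EncodeJob]:
    """Cross every chunk with every (resolution, codec) pair."""
    chunks = split_into_chunks(duration_s, chunk_s)
    return [EncodeJob(i, start, end, res, codec)
            for (i, (start, end)), res, codec
            in product(enumerate(chunks), resolutions, codecs)]


def needs_reencode(vmaf_score: float, threshold: float = 90.0) -> bool:
    """Quality gate: flag a chunk whose VMAF score is below the (assumed) threshold."""
    return vmaf_score < threshold


if __name__ == "__main__":
    jobs = plan_jobs(duration_s=2 * 60 * 60, chunk_s=120,
                     resolutions=["1080p", "720p", "480p"],
                     codecs=["h264", "av1"])
    print(len(jobs), "jobs; first:", jobs[0])
```

Because each job depends only on its own time range and target format, the list can be handed to any work queue, and a job that fails the quality gate can be resubmitted without touching the rest.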

CDN Strategy

Netflix built Open Connect, a private CDN deployed inside ISP networks worldwide. Rather than routing video traffic over the public internet (expensive, slow), Netflix places its own servers inside Internet Service Providers’ data centers. When a Comcast subscriber in Chicago plays Stranger Things, the video comes from an Open Connect Appliance (OCA) inside Comcast’s Chicago network, so the stream stays within the ISP instead of crossing the public internet.
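
The locality preference described above can be illustrated with a small selection routine. This is a sketch under stated assumptions, not Open Connect's real steering logic: the `Appliance` type, the `pick_appliance` function, and the three-tier ranking (same ISP and region, then same region, then anything else) are all hypothetical, and a production system would weigh many more signals such as load and link capacity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Appliance:
    """Hypothetical model of a cache server; isp is None when not embedded in an ISP."""
    name: str
    isp: Optional[str]
    region: str
    healthy: bool = True
    titles: set = field(default_factory=set)


def pick_appliance(appliances: list, client_isp: str,
                   client_region: str, title: str) -> Optional[Appliance]:
    """Prefer a healthy appliance inside the client's own ISP and region."""
    candidates = [a for a in appliances if a.healthy and title in a.titles]

    def rank(a: Appliance) -> int:
        if a.isp == client_isp and a.region == client_region:
            return 0  # embedded in the subscriber's ISP: traffic stays on-net
        if a.region == client_region:
            return 1  # nearby, but outside the subscriber's ISP
        return 2      # last resort: any appliance that holds the title

    # min() returns None via default when no appliance can serve the title.
    return min(candidates, key=rank, default=None)


if __name__ == "__main__":
    fleet = [
        Appliance("oca-regional-chi", None, "chicago", titles={"stranger-things"}),
        Appliance("oca-comcast-chi", "comcast", "chicago", titles={"stranger-things"}),
    ]
    chosen = pick_appliance(fleet, "comcast", "chicago", "stranger-things")
    print(chosen.name if chosen else "no appliance available")
```

Filtering on health before ranking means an unhealthy embedded appliance degrades the request to a more distant server rather than failing it.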

How Open Connect works:

Streaming Protocol

Netflix uses DASH (Dynamic Adaptive Streaming over HTTP) for most platforms and HLS (HTTP Live Streaming) for Apple devices. Both work similarly:

Netflix’s client-side adaptive bitrate (ABR) logic is buffer-based, in the same family as BOLA (Buffer Occupancy based Lyapunov Algorithm, which originated in academic work and is used in open-source DASH players): it optimizes for buffer occupancy, not just current download speed. It prefers keeping the buffer full at lower quality over high quality with an empty buffer, because a buffer underrun (playback stall) is worse UX than a slightly lower resolution.
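
The buffer-first idea can be illustrated with a simplified buffer-to-bitrate map. This is not BOLA itself, which derives its decision from a Lyapunov utility function; it is a plain linear map in the same buffer-based spirit. The function name `choose_bitrate`, the `reservoir_s` and `cushion_s` parameters, their default values, and the example ladder are all assumptions made for the sketch.

```python
def choose_bitrate(buffer_s: float, ladder_kbps: list,
                   reservoir_s: float = 10.0, cushion_s: float = 40.0) -> int:
    """Pick a bitrate from the buffer level alone.

    Below reservoir_s the player protects against stalls with the lowest rung;
    above cushion_s it can afford the highest rung; in between, the target
    bitrate rises linearly with the buffer level.
    """
    ladder = sorted(ladder_kbps)
    if buffer_s <= reservoir_s:
        return ladder[0]
    if buffer_s >= cushion_s:
        return ladder[-1]
    fraction = (buffer_s - reservoir_s) / (cushion_s - reservoir_s)
    target = ladder[0] + fraction * (ladder[-1] - ladder[0])
    # Round down to the highest rung that does not exceed the target.
    return max(rate for rate in ladder if rate <= target)


if __name__ == "__main__":
    ladder = [235, 750, 1750, 3000, 5800]  # illustrative rungs in kbps
    for level in (5, 15, 25, 35, 50):
        print(f"buffer={level:>2}s -> {choose_bitrate(level, ladder)} kbps")
```

Note that measured throughput never appears in the decision: a slow network shows up indirectly as a draining buffer, which pulls the selected bitrate down before playback stalls.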

Recommendation System

Netflix’s recommendation engine drives roughly 80% of what members watch on the platform. It runs as a pipeline of models:

Microservices Architecture

Netflix pioneered the microservices pattern at scale. Its backend runs 700+ microservices on AWS. Key architectural decisions:

Scaling Considerations