Case Studies

Design a URL Shortener

● Advanced ⏱ 14 min read case-study

A URL shortener converts a long URL into a short alias (e.g., bit.ly/3xKj9p) that redirects to the original. It’s a classic system design interview question because it touches encoding, database design, caching, and high-throughput reads — all in a system that’s easy to reason about but has non-obvious scaling challenges. Services like Bit.ly, TinyURL, and t.co (Twitter’s shortener) handle billions of redirects per day.

Requirements

Functional:

Given a long URL, generate a unique short URL (alias).
Given a short URL, redirect to the original long URL.
Aliases should be as short as possible (6–8 characters).
Users can optionally specify a custom alias.
URLs can have an optional expiry time.

Non-functional:

Redirects must be fast — high availability, low latency (<100ms p99).
The system is read-heavy: many more redirects than URL creations (100:1 ratio typical).
Short URLs must be unique — no collisions.
The system should be horizontally scalable.

Capacity Estimation

Start with rough numbers to size the system:

Write throughput: 100 new URLs/second
Read throughput: 10,000 redirects/second (100:1 read:write)

Storage per URL record: ~500 bytes (long URL + short code + metadata)
URLs created per day: 100/s × 86,400s = ~8.6M URLs/day
URLs created per year: ~3B URLs/year
Storage per year: 3B × 500B = ~1.5 TB/year

Bandwidth (reads): 10,000 req/s × 500B = ~5 MB/s (manageable)

Key insight: 10,000 redirects/second is high — every redirect requires a database or cache lookup. Caching is essential. A 6-character base-62 alias gives 62^6 = ~56 billion unique codes — more than enough for years.

API Design

POST /api/urls
Body: { "long_url": "https://...", "custom_alias": "mylink", "expires_at": "2026-01-01" }
Response: { "short_url": "https://short.ly/abc123", "alias": "abc123" }

GET /{alias}
Response: HTTP 301/302 redirect to long_url
          HTTP 404 if alias not found or expired

GET /api/urls/{alias}
Response: { "alias": "...", "long_url": "...", "created_at": "...", "clicks": 42 }

301 vs 302 redirect: A 301 (Moved Permanently) tells browsers to cache the redirect permanently — future clicks skip your server entirely. A 302 (Found / Moved Temporarily) forces every redirect through your server. For analytics (counting clicks), use 302. For performance and reduced server load, use 301. Most shorteners use 302 by default to track clicks, with an option for 301 for performance-sensitive use cases.

Encoding Strategy

The short alias is the critical design decision. Several approaches:

Option 1: MD5/SHA hash of the long URL

alias = base62_encode(md5(long_url))[:6]

Problem: same URL always produces the same alias (good for deduplication, but collisions occur when truncating to 6 chars). Different URLs may hash to the same 6-character prefix — collision handling required.

Option 2: Auto-increment counter + base-62 encoding

id = next_id_from_counter_service()
alias = base62_encode(id)  # e.g., id=125 → "cb"

Base-62 uses characters [0-9, a-z, A-Z]. ID 1 → “1”, ID 62 → “10”, ID 3844 → “100”. A 6-char base-62 string covers IDs up to 62^6 ≈ 56 billion. Short, predictable length. Problem: a single counter is a bottleneck; counter value reveals total URL count (enumerable).

Option 3: Distributed ID generation (Snowflake-style)

Generate a 64-bit unique ID using timestamp + machine ID + sequence (like Twitter’s Snowflake). Encode in base-62. IDs are globally unique across multiple servers without coordination. IDs are sortable by time. This is the recommended approach for high-throughput systems.

Option 4: Random alias

alias = random_string(6, charset=BASE62)

Simple. Must check for collision in the database before confirming. With 56 billion possible aliases and low fill rate, collision probability is negligible at first, but grows as the namespace fills (birthday problem). Acceptable for low-scale systems.

URL shortener architecture: write path (alias generation and storage) and read path (cache-first redirect)

Database Design

The core table is simple:

CREATE TABLE urls (
  alias       VARCHAR(8)   PRIMARY KEY,
  long_url    TEXT         NOT NULL,
  user_id     BIGINT,
  created_at  TIMESTAMP    NOT NULL DEFAULT NOW(),
  expires_at  TIMESTAMP,
  click_count BIGINT       DEFAULT 0
);

CREATE INDEX idx_urls_user ON urls(user_id);
CREATE INDEX idx_urls_expires ON urls(expires_at) WHERE expires_at IS NOT NULL;

Database choice: The access pattern is almost entirely key-value lookups by alias. A relational database works fine for write operations; a key-value store (Redis, DynamoDB, Cassandra) can serve reads with lower latency. Common approach: write to PostgreSQL, cache in Redis with a short TTL.

Expiry handling: Expired URLs can be filtered in the read path (WHERE expires_at IS NULL OR expires_at > NOW()) or cleaned up with a background job. Don’t do expiry cleanup on the hot read path — add a conditional check and let a background cleaner purge old rows.

Redirect Architecture

The read path is the critical path. Every click on a short URL must:

Look up the alias → long URL.
Return a redirect response.

Without caching: at 10,000 req/s, the database handles 10,000 reads/second — feasible for a well-indexed PostgreSQL or MySQL instance, but fragile under spikes.

With Redis caching:

GET /{alias}:
  long_url = redis.get(alias)
  if not long_url:
    long_url = db.query("SELECT long_url FROM urls WHERE alias = ? AND ...", alias)
    if long_url:
      redis.set(alias, long_url, ttl=3600)  // cache for 1 hour
  if long_url:
    return redirect(long_url, status=302)
  return 404

With a hot cache, 99%+ of redirects are served from Redis at sub-millisecond latency. The database only handles cache misses (new or rarely accessed URLs).

Cache eviction: URLs accessed frequently stay cached; rarely accessed URLs expire. With a 1-hour TTL, a URL visited once per hour stays warm; a URL that goes viral gets cached after the first miss and stays warm.

Analytics

Click counting and analytics are a separate concern from the redirect path. Don’t increment a database counter on every redirect — it adds write latency to the hot read path and creates contention.

Better approaches:

Async event stream: On each redirect, publish a click event (alias, timestamp, IP, user-agent, referrer) to a message queue (Kafka, SQS). A separate analytics service consumes the stream, aggregates counts, and writes to an analytics store. The redirect itself returns instantly.
Redis counter: INCR alias:clicks in Redis on each redirect (atomic, very fast). Periodically flush Redis counters to the database in batch. Good for real-time click counts without database write pressure.
Log-based analytics: Let your load balancer or CDN log all requests. Process logs offline (hourly, daily) to aggregate click counts. Simplest to implement; not real-time.

Scaling Considerations

Database sharding by alias prefix. At billions of URLs, a single database instance becomes a bottleneck. Shard by the first character of the alias — 62 shards, each handling ~1/62 of the URL space. Or use a consistent hash on the alias to distribute across N shards. The alias-based lookup maps directly to a shard without a routing table.
CDN for redirect caching. A CDN (Cloudflare, Fastly) can cache 301 redirects at the edge. A click from Tokyo serves the redirect from a Tokyo edge node, not your origin. 302 redirects are generally not cached by CDNs — you need 301 for CDN caching. Trade-off: CDN-cached 301s don’t count as click events on your servers.
Custom domain support. Enterprise users want branded short domains (go.company.com instead of bit.ly). Each custom domain maps to a namespace in your system. Use wildcard TLS certificates or per-domain certificates (automated with Let’s Encrypt). Route custom domain traffic through the same redirect infrastructure.
Abuse prevention. URL shorteners are abused to hide phishing and malware links. Rate-limit URL creation per IP and per account. Integrate with Google Safe Browsing or similar to scan submitted URLs. Allow users to report abusive links. Block known malicious domains at submission time.
Pre-generate alias pools. For very high write throughput, a background service pre-generates alias IDs and stores them in a pool (Redis list). The write path pops an alias from the pool instead of generating one on the fly — removes the counter service from the critical write path. Replenish the pool asynchronously.