Case Studies

Design Uber

● Advanced ⏱ 15 min read case-study

Uber is a real-time marketplace that matches riders and drivers across 10,000 cities. The defining technical challenges are geospatial — tracking millions of moving drivers, finding the closest available driver for any rider, and doing it all in under a second. Uber also introduced surge pricing, which requires near-real-time supply/demand analysis over geographic regions. Designing Uber brings together location tracking, geospatial indexing, real-time communication, and marketplace coordination.

Requirements

Functional:

Riders request a ride from a pickup location to a destination.
System matches the rider to the nearest available driver.
Rider and driver can track each other’s real-time location during the trip.
Fare estimation before the ride; final fare calculation after.
Surge pricing based on local supply/demand imbalance.
Trip history, ratings, and payments.

Non-functional:

Matching latency: a ride request should be matched within 5 seconds.
Location update frequency: drivers send GPS updates every 4 seconds.
High availability: downtime during peak hours (Friday night) is extremely costly.
Consistency: a driver should not be matched to two riders simultaneously (strong consistency required for dispatch).

Capacity Estimation

Daily trips: 15M
Peak trips per second: ~5,000 concurrent trip requests
Active drivers: 5M globally
Location updates: 5M drivers × 1 update/4s = 1.25M location updates/second
Riders actively tracking driver: 500K concurrent websocket connections

Location storage: 5M × 1 update/4s × 50 bytes = 62.5 MB/s
Only current location needs to be queryable fast — history can be async

Driver Location Tracking

Driver location is the foundation of everything — matching, ETAs, and live tracking all depend on having accurate, fresh driver positions.

Location update pipeline:

Driver app sends GPS coordinates every 4 seconds via WebSocket or HTTP POST.
Location update hits a location service, which writes to Redis: driver:{driver_id} → {lat, lng, timestamp, status} with a TTL of 30 seconds (if a driver stops sending updates, they age out of the active pool).
The location is also indexed in a geospatial index for proximity queries.
For trips in progress, the location is forwarded to the rider’s WebSocket connection for live tracking.

Geospatial indexing for proximity: To find the nearest driver to a pickup location, the system needs a geospatial index. Options:

Redis GEO: GEOADD drivers lng lat driver_id and GEORADIUS lat lng 1km. Redis implements GEO using geohash-scored sorted sets. Fast, in-memory, supports radius queries. Works well for Uber’s use case — all driver locations fit in memory (5M drivers × ~50 bytes = ~250MB).
Geohash partitioning: Partition drivers by geohash cell. Each cell covers a fixed area (e.g., 5km × 5km at precision 5). A proximity search queries the center cell and its 8 neighbors. Each cell is a separate Redis key — queries fan out to 9 cells and merge results. This distributes load across Redis nodes.
S2 geometry (Google): Uber uses S2 cells, which partition Earth’s surface into a hierarchical grid of square cells. S2 is more uniform than geohash (avoids distortion near poles) and supports efficient range queries. Uber’s Hexagonal Hierarchical Spatial Index (H3) is their open-sourced geospatial library built on this principle.

Uber architecture: location ingestion, geospatial indexing, matching service, and real-time trip tracking

Ride Matching

When a rider requests a ride, the matching service finds the best available driver:

Rider submits request with pickup location and desired vehicle type.
Matching service queries the geospatial index for available drivers within radius R (starting at 1km, expanding if needed).
For each candidate driver, compute the estimated time of arrival (ETA) from the driver’s current location to the pickup — not straight-line distance, but actual drive time on the road network.
Rank candidates by ETA (and optionally other factors: rating, acceptance rate, trip history with the rider).
Offer the ride to the top-ranked driver. If the driver accepts within N seconds, the match is confirmed. If they decline or time out, offer to the next candidate.

Preventing double-dispatch: Two riders must not be matched to the same driver simultaneously. The matching service uses an optimistic locking pattern: when offering a ride to a driver, it atomically sets a driver_status = offered flag with a TTL (e.g., 10 seconds). If the driver accepts, status transitions to on_trip. If another ride offer tries to set the flag while it’s already offered, it fails — the driver is skipped. Redis atomic operations (SET NX EX) implement this without distributed locking overhead.

Dispatch & Trip Lifecycle

Once matched, the trip moves through a state machine:

REQUESTED → MATCHED → DRIVER_EN_ROUTE → ARRIVED → IN_PROGRESS → COMPLETED
                                                               ↘ CANCELLED

Each state transition is persisted to the trip database (Schemaless — Uber’s MySQL-backed distributed datastore) and published to a Kafka topic. Services that need trip events (billing, notifications, driver payout) subscribe to the topic independently.

Real-time tracking during the trip: Both rider and driver apps maintain WebSocket connections to the trip service. As the driver moves, their location updates are forwarded to the rider’s WebSocket. The driver sees the rider’s pickup pin; the rider sees the driver’s moving icon. Connection management uses the same chat-server-style architecture: a connection registry maps user ID → WebSocket server, and servers forward location events between each other.

Surge Pricing

Surge pricing (dynamic pricing multipliers) increases fares when demand exceeds supply in a geographic area. It serves two purposes: incentivizes more drivers to go online, and reduces demand by discouraging non-urgent trips.

Computing surge:

Divide the city into geographic cells (H3 hexagons, typically ~1km diameter).
For each cell, track: active drivers, open ride requests, historical conversion rate.
Compute supply-demand ratio per cell: surge_factor = f(open_requests / available_drivers). When requests far outnumber drivers, the multiplier rises.
Smooth over time and adjacent cells (a 3× surge in one cell with 1× in the adjacent cell would send drivers a block away).
Apply the multiplier to fare estimates in the affected cells. Show the surge map and multiplier to both riders and drivers in real time.

The surge computation runs as a streaming job (Kafka + Flink) — processing location events and ride requests continuously, updating cell surge values every 30–60 seconds. Results are cached in Redis per cell for fast read access during fare estimation.

Maps & ETA

Uber needs accurate ETAs for matching, fare estimation, and rider/driver communication. Computing ETAs requires a road network graph and real-time traffic data.

Road network: Stored as a directed graph — nodes are intersections, edges are road segments with weights (travel time, distance, turn penalties). The graph is partitioned geographically for efficient shortest-path computation. Dijkstra or A* over the local partition gives the route.
Real-time traffic: Uber uses GPS traces from its own driver fleet as traffic probes. Millions of driver position updates per hour, aggregated per road segment, give near-real-time travel speed. These update edge weights in the road graph.
Historical patterns: Historical speed data by time of day and day of week improves ETA accuracy. A segment that’s fast at 10am is slow at 5pm — historical models capture this pattern.
ETA serving: Pre-computed ETAs between common locations are cached. On-demand ETAs are computed with a constrained graph search (limited to a bounding box around start and end points to avoid global graph traversal).

Scaling Considerations

City-level partitioning. Uber’s dispatch and matching are city-scoped — a driver in London is never matched to a rider in Paris. This partitions the problem naturally: each city’s matching and location services are independent. Cross-city coordination is rare (airport pickup handoff between city partitions). City-level sharding limits the blast radius of failures and allows independent scaling per city.
Ring-based architecture for location. At peak, 1.25M location updates/second exceed what a single Redis instance can handle. Uber distributes location data across a ring of Redis nodes using consistent hashing on geohash prefix. Proximity queries fan out to the relevant nodes and merge. Adding nodes to the ring redistributes load gradually.
DISCO (Dispatch System). Uber’s actual dispatch system (DISCO) uses a supply-aware probabilistic matching model — instead of greedily assigning the nearest driver, it models the probability of each driver accepting a ride and balances the pool to minimize total wait time globally. This requires solving a bipartite matching problem over the city-level driver-rider graph, solved continuously with approximation algorithms.
Payment processing isolation. Payments are the most consistency-sensitive operation in the system. Trip fare calculation and payment authorization run in a separate payment service with synchronous database writes (PostgreSQL for ACID guarantees), isolated from the eventually-consistent location and matching infrastructure. Outbox pattern ensures payment events reach downstream processors exactly once.
Graceful degradation. If the surge pricing service is slow, serve the last-known surge value rather than blocking the ride request. If ETA computation is slow, show a range (“3–7 min”) rather than timing out. Uber’s services have explicitly defined degraded modes — real-time systems that fail hard on any dependency being slow are brittle in production.