Databases

Indexes

● Intermediate ⏱ 14 min read database

An index is a separate data structure that the database maintains alongside a table, enabling it to find rows that match a query condition without scanning every row. Indexes are the single most effective tuning lever available to a database engineer — a missing index on a 100-million-row table turns a millisecond lookup into a seconds-long full scan. But indexes cost write performance and storage, and adding the wrong index can make things worse. Understanding how indexes work internally is what separates informed index design from guesswork.

What Is an Index?

Without an index, finding all users with email = 'alice@example.com' requires reading every row in the table — a sequential scan (also called a full table scan). For a table with 50 million rows this is catastrophically slow.

An index is a sorted copy of one or more columns, stored in a structure designed for fast lookup. The index stores the indexed column value alongside a pointer (usually the primary key or a physical row location) to the full row. A lookup on an indexed column walks the index structure to find matching values, then follows the pointers to retrieve the full rows — touching a tiny fraction of the table.

Indexes come at a cost:

B-Tree Indexes

The B-tree (Balanced Tree) is the default index type in PostgreSQL, MySQL, SQL Server, and virtually every relational database. Understanding B-trees explains why indexes are fast for some queries and useless for others.

Structure

A B-tree is a self-balancing tree where all leaf nodes are at the same depth. Each node holds multiple keys (column values) and pointers to child nodes or row locations. The tree is always sorted — every key in the left subtree of a node is less than the node’s key; every key in the right subtree is greater.

B-tree index structure: sorted keys at each level with pointers to child nodes and leaf-level row pointers

How a Lookup Works

  1. The query planner starts at the root node.
  2. It compares the search key against the node’s keys and follows the appropriate child pointer.
  3. This continues level by level until a leaf node is reached.
  4. The leaf node contains the matching key(s) and row pointer(s).
  5. The database fetches the full row(s) from the table using the pointers.

A B-tree with height h requires at most h node reads to find a value. For a billion-row table, a typical B-tree has height 3–4, meaning 3–4 I/Os to find any row. This is the magic of logarithmic lookup.

What B-Trees Support Well

⚠️
Functions Break Indexes

Wrapping an indexed column in a function defeats the index. WHERE LOWER(email) = 'alice@example.com' cannot use an index on email because the index stores the original values, not the lowercased ones. The fix: store data normalized (e.g., always lowercase emails on insert), or create an expression index on LOWER(email).

Hash Indexes

Hash indexes use a hash function to map a column value to a bucket. Lookups are O(1) for exact equality. They are useless for range queries — hashing destroys sort order, so there is no way to scan a range of values.

PostgreSQL supports explicit hash indexes (CREATE INDEX ... USING hash). MySQL’s InnoDB engine uses a self-tuning adaptive hash index in memory for frequently accessed B-tree pages. In practice, B-tree indexes satisfy most workloads — use hash indexes only when you have proven equality-only access patterns and need the O(1) benefit.

Composite Indexes

A composite index (also called a multi-column index) covers two or more columns. The index entries are sorted first by the first column, then by the second column within equal first-column values, and so on.

CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);

The Leftmost Prefix Rule

A composite index on (a, b, c) can be used by queries that filter on:

It cannot be used by queries that filter on b alone, or c alone, or b, c. The index can only be entered from the leftmost column.

This is why column order in a composite index matters enormously. Put the most-selective equality column first. Put range-predicate columns last (a range scan on column 2 stops the index from being used for column 3).

💡
Index Column Order Example

For a query WHERE status = 'active' AND created_at > '2024-01-01', put status first (equality predicate), then created_at (range predicate). The index (status, created_at) narrows to all active rows using the equality on status, then scans the range on created_at within that subset. Reversing the order wastes the index on status because the range on created_at prevents the optimizer from skipping to specific status values.

Covering Indexes

A covering index includes all columns that a query needs — the SELECT columns, WHERE columns, and ORDER BY columns. When a query is covered by an index, the database performs an index-only scan: it reads the index and returns results without ever touching the main table rows.

-- Query
SELECT user_id, amount FROM orders WHERE status = 'pending';

-- Covering index: includes all columns referenced in the query
CREATE INDEX idx_orders_status_covering ON orders (status, user_id, amount);

Index-only scans can be dramatically faster than regular index scans because:

The trade-off: wider indexes consume more storage and add more write overhead since every covered column must be maintained in the index.

Index Selectivity

Selectivity is the ratio of distinct values to total rows. High selectivity (many distinct values, like email or UUID) means an index lookup returns very few rows. Low selectivity (few distinct values, like a boolean is_active flag) means a lookup returns many rows — possibly most of the table.

The database query planner estimates selectivity using column statistics (the pg_statistic catalog in PostgreSQL, updated by ANALYZE). It uses this estimate to decide whether to use an index or a sequential scan.

If 80% of orders have status = 'pending', an index on status is nearly useless for that value — a sequential scan is cheaper (sequential I/O is faster than random I/O for large fractions of the table). The same index is very useful when filtering for status = 'refunded' if only 0.1% of orders have that status.

Rule of Thumb

An index is most beneficial when it selects fewer than ~10–20% of table rows. For highly skewed data, partial indexes (see below) can solve the problem by indexing only the selective subset.

Partial and Expression Indexes

Partial Indexes

A partial index covers only a subset of rows defined by a WHERE clause in the index definition.

-- Index only unprocessed orders — the selective, hot subset
CREATE INDEX idx_orders_pending ON orders (created_at) WHERE status = 'pending';

Benefits:

Partial indexes are ideal for: soft-deleted records (WHERE deleted_at IS NULL), queue-like workloads (WHERE status = 'pending'), sparse flags (WHERE is_premium = true).

Expression Indexes

An expression index stores the result of a function or expression rather than the raw column value.

-- Enables case-insensitive email lookup
CREATE INDEX idx_users_email_lower ON users (LOWER(email));

-- Query that uses it
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';

Expression indexes solve the "functions break indexes" problem. They are computed once at index build time and maintained on every write, so reads are fast at the cost of additional write-time computation.

Full-Text Indexes

B-tree and hash indexes operate on exact or prefix values. Full-text search requires matching any word within a text column — a fundamentally different problem.

Inverted indexes are the standard data structure for full-text search. An inverted index maps each word (token) to the list of rows that contain it, along with positional information. To find all rows containing "distributed systems", the database intersects the posting lists for both words.

In PostgreSQL, full-text indexes use the GIN (Generalized Inverted Index) type:

CREATE INDEX idx_articles_body_fts ON articles USING GIN (to_tsvector('english', body));

Dedicated search engines like Elasticsearch and OpenSearch build on inverted indexes but add ranking (TF-IDF, BM25), stemming, synonyms, and distributed sharding — capabilities that outstrip what a relational database’s built-in FTS can offer at scale.

Write Overhead

Every index adds work to every write operation on the table:

OperationIndex cost
INSERTInsert entry into every index on the table
UPDATE (indexed column)Delete old entry and insert new entry in every affected index
UPDATE (non-indexed column)No index cost (index entry unchanged)
DELETEDelete entry from every index on the table

A table with 15 indexes processes a bulk INSERT 15 times slower (from an index maintenance perspective) than a table with no indexes. For write-heavy workloads — event logs, IoT telemetry, append-only tables — aggressive indexing can cripple write throughput. Drop unused indexes before they become a write bottleneck.

Index bloat is also a concern: B-trees do not immediately reclaim space when rows are deleted. Dead index entries accumulate over time, slowing scans and consuming storage. PostgreSQL’s VACUUM reclaims dead tuples; periodic REINDEX or REINDEX CONCURRENTLY rebuilds bloated indexes.

When to Index (and When Not To)

Strong candidates for indexing

Avoid or reconsider indexing when

Reading EXPLAIN Output

The database query planner’s decision-making is visible via EXPLAIN. In PostgreSQL:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE user_id = 42;

Key plan nodes to recognize:

-- Look for high actual rows vs estimated rows — a sign of stale statistics
-- Run ANALYZE to refresh
ANALYZE orders;

When EXPLAIN ANALYZE shows a Seq Scan on a large table for a query that should use an index, the likely causes are: no index exists, the index doesn’t match the query shape (function on column, wrong column order in composite), or statistics are stale and the planner overestimates selectivity.

Design Considerations

In System Design Interviews

When discussing database performance at scale, mention indexes explicitly. “We’ll index user_id and created_at on the orders table. A composite index (user_id, created_at) serves both the per-user listing query and the date-range filter without needing two separate indexes. We’ll use a partial index on status = ‘pending’ to keep the queue-scan fast as the table grows.” This shows you think about access patterns, not just columns.