Databases & DBMS
Every serious application eventually confronts the same problem: where to put the data, and how to get it back reliably under load. A database is the answer, and a DBMS is the machinery that makes it work. Understanding what a database management system actually does — underneath the SQL queries or API calls — puts you in a much better position to make good architectural decisions about which database to use, how to configure it, and what can go wrong at scale.
What Is a Database?
A database is an organized collection of structured data stored so that it can be easily accessed, managed, and updated. The key word is organized: raw files sitting on a disk are not a database. A database imposes structure on the data — a schema, relationships, indexes — and provides mechanisms for querying that structure efficiently.
Databases are distinct from the application code that uses them. The application is stateless between requests (or should be); the database is where persistent state lives. This separation is fundamental: it allows multiple application instances to share the same data, and it allows the data to outlive any particular instance of the application.
In practice, engineers use “database” loosely to cover any system that persists data — Redis, S3, Kafka, Elasticsearch. In a precise technical sense, a database provides structured storage with query capabilities and transactional guarantees. When this guide says “database,” it means a system that offers at minimum: persistent storage, a query interface, and some form of consistency guarantee. The guides on NoSQL, SQL, replication, and sharding that follow explore the full landscape.
What Is a DBMS?
A Database Management System (DBMS) is the software layer that sits between applications and the raw storage. It manages the physical storage of data, enforces schema and constraints, processes queries, handles concurrent access from multiple clients, and provides transactional guarantees. When you run PostgreSQL, MySQL, MongoDB, or Cassandra, you are running a DBMS.
The distinction between the database (the data and its schema) and the DBMS (the software) is often blurred in conversation, but it matters: you can migrate the same data from one DBMS to another, and the same DBMS can host many different databases.
What a DBMS Gives You
- Data abstraction: Applications deal with tables, documents, or key-value pairs — not raw disk sectors. The DBMS handles physical layout.
- Query processing: You describe what data you want (SQL, query API); the DBMS figures out how to retrieve it efficiently using indexes and query plans.
- Concurrency control: Multiple clients read and write simultaneously without corrupting each other’s work. The DBMS enforces isolation using locks, MVCC, or optimistic concurrency.
- Durability: Committed transactions survive crashes. The DBMS writes ahead to a log (WAL) before modifying data pages, so it can recover to a consistent state after a restart.
- Integrity enforcement: Constraints (foreign keys, unique indexes, NOT NULL) are checked by the DBMS, not left to the application layer.
- Access control: Users and roles with permissions. The DBMS enforces who can read or write which data.
DBMS Components
A DBMS is not a single monolithic program. It is composed of several subsystems that each handle a different concern. Understanding them helps you reason about where bottlenecks arise and what tuning knobs matter.
Query Processor
Receives queries (SQL or otherwise), parses them into an abstract syntax tree, validates them against the schema, and generates a query plan — a step-by-step execution strategy. The query optimizer evaluates multiple possible plans (which index to use, in which order to join tables) and selects the one with the lowest estimated cost. Poor query plans are the most common cause of unexpected database slowness.
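To see the optimizer at work, most DBMSs expose the chosen plan: `EXPLAIN` in PostgreSQL and MySQL, `EXPLAIN QUERY PLAN` in SQLite. A minimal sketch using Python's built-in sqlite3 (the table and index names are made up), showing the plan flip from a full scan to an index lookup once an index exists:

```python
import sqlite3

# In-memory database with a hypothetical table and, initially, no index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# Without an index on email, the planner has to scan the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.com",)
).fetchall()
print(plan[0][3])  # e.g. "SCAN users" (exact wording varies by SQLite version)

# After adding an index, the planner switches to an index lookup.
conn.execute("CREATE INDEX idx_users_email ON users(email)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.com",)
).fetchall()
print(plan[0][3])  # e.g. "SEARCH users USING INDEX idx_users_email (email=?)"
```

Running `EXPLAIN` on a slow query and checking for unexpected scans is usually the first step when debugging database slowness.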
Storage Engine
The storage engine is responsible for how data is physically laid out on disk. It manages data files, indexes, and the buffer pool (the in-memory page cache). Different storage engines make different trade-offs:
- B-tree engines (InnoDB, the default in MySQL; most relational DBs): Good for reads and point lookups. Row data is stored in the B-tree leaf pages ordered by primary key, enabling efficient range scans.
- LSM-tree engines (RocksDB, used by TiKV and MyRocks; Cassandra and many NoSQL stores ship their own LSM implementations): Optimized for write-heavy workloads. Writes go to an in-memory structure (memtable) first, then are flushed to disk as sorted files (SSTables). Reads can be slower because a lookup may have to check the memtable and several on-disk levels and merge the results.
- Column-store engines (ClickHouse, Redshift; the Parquet file format uses the same columnar layout): Store data column-by-column rather than row-by-row. Extremely fast for analytical queries that aggregate over a subset of columns; poor for OLTP workloads that update individual rows frequently.
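The LSM write path described above can be sketched in a few lines. This is a toy model, not how RocksDB is actually implemented (no compaction, no bloom filters, everything in memory):

```python
import bisect

class ToyLSM:
    """Toy LSM tree: writes land in an in-memory memtable; when it fills,
    it is flushed as an immutable sorted run (an 'SSTable')."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.sstables = []              # each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value      # O(1) in-memory write
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        run = sorted(self.memtable.items())   # sequential, sorted write to "disk"
        self.sstables.append(run)
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:              # check the memtable first
            return self.memtable[key]
        for run in reversed(self.sstables):   # then each run, newest first
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

db = ToyLSM()
for i in range(10):
    db.put(f"k{i}", i)
print(db.get("k3"))   # 3, found in a flushed SSTable
```

The sketch shows why reads pay for write speed: a single `get` may probe the memtable plus every run, which is the cost real engines reduce with bloom filters and compaction.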
Transaction Manager
Coordinates the execution of transactions to ensure ACID properties. Uses a combination of:
- Locking: Readers and writers acquire locks on rows, pages, or tables to prevent conflicting concurrent access.
- MVCC (Multi-Version Concurrency Control): Keeps multiple versions of a row so readers see a consistent snapshot without blocking writers. PostgreSQL and InnoDB both use MVCC. This is why long-running read transactions can cause bloat in PostgreSQL (old row versions accumulate until vacuumed).
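A stripped-down sketch of the MVCC idea (transaction ids and visibility rules are drastically simplified compared to PostgreSQL's):

```python
class ToyMVCC:
    """Toy MVCC row store: each write appends a new version tagged with the
    writing transaction id; a reader sees the latest version visible at its
    snapshot, so readers never block writers."""

    def __init__(self):
        self.versions = {}   # key -> list of (txid, value), oldest first
        self.next_txid = 1

    def write(self, key, value):
        txid = self.next_txid
        self.next_txid += 1
        self.versions.setdefault(key, []).append((txid, value))

    def snapshot(self):
        """A reader's snapshot: the highest txid committed so far."""
        return self.next_txid - 1

    def read(self, key, snapshot_txid):
        # Latest version whose txid was committed as of the snapshot.
        for txid, value in reversed(self.versions.get(key, [])):
            if txid <= snapshot_txid:
                return value
        return None

db = ToyMVCC()
db.write("balance", 100)
snap = db.snapshot()         # a long-running reader takes its snapshot here
db.write("balance", 50)      # a later writer updates the row
print(db.read("balance", snap))           # 100, the reader's consistent view
print(db.read("balance", db.snapshot()))  # 50, a new snapshot sees the update
```

Note that the old version (100) is still stored after the update; this accumulation of dead versions is exactly what vacuuming cleans up.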
Write-Ahead Log (WAL)
Before modifying any data page on disk, the DBMS writes a record of the intended change to the WAL (also called the redo log or journal). If the system crashes mid-write, the DBMS can replay the WAL on restart to recover committed transactions and discard uncommitted ones. The WAL is also the foundation for replication — replicas consume the primary’s WAL stream to stay up to date.
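A toy illustration of WAL recovery (real logs store page-level redo records and checkpoints; this sketch uses simple key-value operations to show the idea):

```python
# Toy write-ahead log: every change is appended to the log before it is
# applied to the "data pages" (a dict). After a crash, replaying the log
# recovers committed transactions and discards uncommitted ones.

def apply_wal(log):
    """Rebuild data-page state from a WAL, keeping only committed txns."""
    committed = {txid for op, txid, *_ in log if op == "COMMIT"}
    pages = {}
    for op, txid, *rest in log:
        if op == "SET" and txid in committed:
            key, value = rest
            pages[key] = value
    return pages

wal = [
    ("SET", 1, "a", 10),
    ("COMMIT", 1),
    ("SET", 2, "a", 99),   # txn 2 never committed before the crash
]
# Crash happens here; on restart the DBMS replays the log:
print(apply_wal(wal))   # {'a': 10}, txn 1 recovered, txn 2 discarded
```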
Buffer Pool / Page Cache
Databases work in fixed-size pages (commonly 4–16KB; PostgreSQL uses 8KB pages, InnoDB defaults to 16KB). The buffer pool is an in-memory cache of recently accessed pages. Reads that hit the buffer pool return in microseconds; reads that miss require a disk I/O, which is 100–1000× slower. Database performance is strongly correlated with buffer pool hit rate — which is why giving a database server enough RAM is so important. A rule of thumb: the working set (the pages frequently accessed) should fit in the buffer pool.
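The effect of working-set size on hit rate can be illustrated with a toy LRU buffer pool (real pools use more sophisticated eviction, but the principle holds):

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool: an LRU cache of fixed capacity over 'disk' pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()    # page_id -> page data, in LRU order
        self.hits = self.misses = 0

    def read_page(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # hit: refresh LRU position
            self.hits += 1
        else:
            self.misses += 1                    # miss: simulate a disk read
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict least-recently-used
            self.pages[page_id] = f"page-{page_id}"
        return self.pages[page_id]

pool = BufferPool(capacity=100)
# A working set of 80 pages fits in the pool: after the first warm-up
# pass, every subsequent read is served from memory.
for _ in range(10):
    for page in range(80):
        pool.read_page(page)
print(f"hit rate: {pool.hits / (pool.hits + pool.misses):.0%}")  # hit rate: 90%
```

Shrink the capacity below 80 in this sketch and the hit rate collapses, which is the in-miniature version of a working set outgrowing RAM.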
Types of Databases
The landscape of database systems is wide. Different workloads need fundamentally different data models and access patterns, and no single database type serves all of them well.
Relational Databases (RDBMS)
Data is organized into tables with rows and columns. Relationships between tables are expressed via foreign keys. Queries are written in SQL. The schema is defined up front and enforced strictly. Strong ACID guarantees make relational databases the default choice for any workload where correctness matters more than extreme scale.
Examples: PostgreSQL, MySQL, SQLite, Oracle, SQL Server, CockroachDB (distributed SQL)
Best for: Financial transactions, e-commerce orders, user accounts, anything with complex relationships and strong consistency requirements.
Document Databases
Data is stored as semi-structured documents (typically JSON or BSON). Each document is self-contained and can have different fields. No joins required — related data is embedded in the document. Schema is flexible; documents in the same collection need not have the same shape.
Examples: MongoDB, Couchbase, Firestore
Best for: Content management, catalogs, user profiles, event data with variable structure.
Key-Value Stores
The simplest data model: every value is identified by a unique key. No schema. Operations are GET, SET, DELETE. Extremely fast at what they do. Cannot express complex queries or relationships natively.
Examples: Redis, DynamoDB (also supports documents), Memcached, etcd
Best for: Session storage, caching, leaderboards, rate limiting, distributed locks, real-time counters.
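The entire interface fits in a few lines. A toy in-memory sketch with Redis-style lazy TTL expiry (class and key names are illustrative, not any real store's API):

```python
import time

class KVStore:
    """Minimal key-value store: GET, SET, DELETE, plus optional TTL."""

    def __init__(self):
        self.data = {}   # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires = time.monotonic() + ttl if ttl is not None else None
        self.data[key] = (value, expires)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and time.monotonic() > expires:
            del self.data[key]   # expire lazily on read, as Redis does
            return None
        return value

    def delete(self, key):
        self.data.pop(key, None)

kv = KVStore()
kv.set("session:42", {"user": "alice"}, ttl=30)   # session expires in 30s
print(kv.get("session:42"))   # {'user': 'alice'}
```

The simplicity is the point: every operation is a single hash lookup, which is why these stores sustain such high throughput.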
Wide-Column Stores
Data is organized into rows and columns like a relational database, but columns are grouped into “column families” and each row can have a different set of columns. Designed for very high write throughput and petabyte-scale datasets distributed across many nodes.
Examples: Apache Cassandra, HBase, Google Bigtable
Best for: Time-series data, IoT telemetry, activity feeds, anything with high write volume and known access patterns.
Search Engines
Specialized for full-text search and relevance ranking. Data is indexed as inverted indexes (word → list of documents containing that word). Not suitable as a primary database — typically used alongside a primary store to power search features.
Examples: Elasticsearch, OpenSearch, Apache Solr, Typesense, Meilisearch
Best for: Product search, log search, full-text content search, autocomplete.
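The inverted-index idea itself is compact. A toy sketch (documents and tokenization are deliberately naive; real engines add stemming, relevance scoring, and positional data):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """AND semantics: documents containing every query word."""
    word_sets = [index.get(w, set()) for w in query.lower().split()]
    return set.intersection(*word_sets) if word_sets else set()

docs = {
    1: "red running shoes",
    2: "blue running jacket",
    3: "red winter jacket",
}
index = build_inverted_index(docs)
print(search(index, "red jacket"))   # {3}
```

Queries never scan document text; they intersect precomputed word-to-document sets, which is what makes full-text search fast at scale.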
Graph Databases
Data is modeled as nodes and edges in a graph. Optimized for traversing relationships many hops deep, which is extremely slow with SQL joins. The query language (Cypher, Gremlin) expresses graph traversal naturally.
Examples: Neo4j, Amazon Neptune, JanusGraph
Best for: Social networks, fraud detection, recommendation engines, knowledge graphs.
Time-Series Databases
Optimized for storing and querying sequences of timestamped measurements. Data arrives in append-only order. Queries aggregate over time windows. Heavy compression is applied because successive values are often similar.
Examples: InfluxDB, TimescaleDB (PostgreSQL extension), Prometheus (metrics storage), OpenTSDB
Best for: Application metrics, infrastructure monitoring, IoT sensor data, financial tick data.
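Delta encoding, one of the simpler compression tricks these systems build on, exploits exactly that similarity between successive values:

```python
def delta_encode(values):
    """Store the first value plus successive differences. Timestamps and
    slowly-changing metrics compress well because the deltas are small."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

# Regularly-spaced timestamps: large absolute values, tiny deltas.
timestamps = [1700000000, 1700000010, 1700000020, 1700000030]
encoded = delta_encode(timestamps)
print(encoded)   # [1700000000, 10, 10, 10]
assert delta_decode(encoded) == timestamps
```

The small, repetitive deltas are then a much easier target for further compression (run-length or variable-width encoding) than the raw 10-digit values.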
Large systems rarely use a single database type. A common pattern is: a relational database for core transactional data, a cache (Redis) for hot reads, an Elasticsearch cluster for search, and a time-series database for metrics. Each database type handles what it’s best at. The cost is operational complexity — multiple systems to monitor, backup, and tune. Start with one database and add others only when you have a clear performance problem that the existing database genuinely cannot solve.
Key Challenges
Databases are a solved problem for most small-to-medium systems. The hard problems emerge at scale. Here are the challenges that motivate the more advanced topics in this chapter.
Scalability
A single database server has a ceiling — bounded by CPU, RAM, disk IOPS, and network bandwidth. As data volume and query throughput grow, you hit that ceiling. The standard escalation path:
- Vertical scaling: Add more RAM, faster CPUs, faster SSDs. This works surprisingly far — a high-end server with 768GB RAM and NVMe SSDs can handle millions of queries per second. But it has a cost ceiling and a single point of failure.
- Read replicas: Distribute read load across multiple replicas while writes go to a single primary. Most production systems do this first.
- Sharding: Partition data horizontally across multiple database servers, each owning a slice of the dataset. Dramatically increases capacity but adds significant complexity (cross-shard queries, distributed transactions, resharding).
Reliability and Durability
Disks fail. Machines crash. Power cuts happen. A database must survive hardware failures without losing committed data:
- Replication: Keep copies of the data on multiple machines. If the primary fails, promote a replica.
- WAL and checkpointing: The write-ahead log ensures committed transactions are recoverable even if the server crashes between checkpoints.
- Backups: Point-in-time backups to separate storage (object storage, offsite). Replication is not a substitute for backups — it replicates corruption and accidental deletes too.
Consistency
When data is replicated across multiple nodes, keeping all replicas in sync requires carefully chosen trade-offs. A write committed on the primary may not yet be visible on replicas (replication lag). A system relying on reading from replicas may serve stale data. The CAP theorem and PACELC theorem formalize these trade-offs — covered in detail in later guides.
Query Performance
A query that runs in 2ms in development can run in 20 seconds in production with 100M rows. The most common causes:
- Missing indexes — the query does a full table scan instead of an index lookup.
- N+1 query problem — application code runs one query to fetch a list of N rows, then N more queries in a loop (one per row), instead of a single batched query or join.
- Unoptimized joins — joining large tables without appropriate indexes causes nested loop joins over millions of rows.
- Contention on hot rows — many transactions locking the same rows (e.g., a global counter) serialize all writes through a single point.
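The N+1 pattern is easiest to see in code. A sketch with Python's sqlite3 (the schema is made up): the loop issues one extra query per author, while the join fetches everything in a single round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bo');
    INSERT INTO books VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

# N+1 pattern: one query for the list, then one query per row.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
for author_id, name in authors:   # N extra round trips to the database
    conn.execute(
        "SELECT title FROM books WHERE author_id = ?", (author_id,)
    ).fetchall()

# Batched alternative: a single join, one round trip.
rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors a JOIN books b ON b.author_id = a.id
""").fetchall()
print(len(rows))   # 3
```

With an in-memory database the difference is invisible; over a network, each of those N queries adds a full round-trip latency, which is why ORMs with lazy loading produce this bug so often.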
Data Integrity
Without constraints and transactions, data integrity must be enforced entirely in application code — and it eventually drifts. Relational databases enforce integrity at the storage layer via foreign key constraints, unique indexes, NOT NULL, check constraints, and transactional atomicity. NoSQL databases that sacrifice constraints for flexibility push this burden onto the application.
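A sketch of storage-layer enforcement using SQLite (the schema is illustrative; note that SQLite requires foreign keys to be enabled per connection):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite makes FK checks opt-in
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    INSERT INTO users VALUES (1, 'a@example.com');
""")

# The storage layer rejects bad data no matter which code path writes it.
try:
    conn.execute("INSERT INTO users VALUES (2, 'a@example.com')")  # duplicate
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # UNIQUE constraint failed: users.email

try:
    conn.execute("INSERT INTO orders VALUES (1, 999)")  # no such user
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # FOREIGN KEY constraint failed
```

Because the checks live in the DBMS, a second service, a migration script, or a one-off admin query all hit the same wall; application-level validation protects only the code paths that remember to run it.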
Databases in System Design
In a system design interview or architectural review, choosing the right database is one of the most impactful decisions you make. The wrong choice becomes very expensive to undo.
Decision Framework
- What is the primary data model? Structured tabular data with relationships → relational. Hierarchical or variable-structure documents → document store. Simple key-based lookups → key-value. Graph traversal → graph database.
- What are the consistency requirements? Financial transactions, inventory → strong ACID guarantees, relational or distributed SQL. User feeds, activity logs where eventual consistency is acceptable → wider choice including NoSQL.
- What is the read/write ratio? Read-heavy → add read replicas, consider caching. Write-heavy with high throughput → consider LSM-based stores (Cassandra, RocksDB). Mixed → B-tree relational is a solid default.
- What scale do you need? Start with a single relational database. Add read replicas when read throughput grows. Shard when a single primary becomes a bottleneck. Most systems never need to shard.
- What are your query patterns? Known, limited query patterns → NoSQL works well. Ad-hoc complex queries → relational database with a query optimizer handles this better.
Always justify your database choice. “We use PostgreSQL because we have complex transactional requirements and need strong consistency” is far better than “we use a relational database.” When asked about scaling, walk through the escalation path — vertical scaling first, then read replicas, then sharding if truly needed. Mention that you’d add caching (Redis) in front of any hot-read database layer. If the interviewer pushes on NoSQL vs. SQL, explain the trade-off rather than declaring a winner.