Databases

ACID and BASE

● Intermediate ⏱ 15 min read database

ACID and BASE are two opposing consistency models for database systems. ACID (Atomicity, Consistency, Isolation, Durability) describes the guarantees provided by traditional relational databases for transactions. BASE (Basically Available, Soft state, Eventually consistent) describes the relaxed consistency model adopted by distributed NoSQL systems that prioritize availability and performance over strict correctness. Understanding both is essential for choosing the right database technology and reasoning about the behavior of distributed systems.

ACID Properties

ACID was formalized in the 1970s–80s to describe what it means for a database transaction to be reliably processed. A transaction is a unit of work that groups multiple reads and writes into an atomic operation. Either all of it commits or none of it does.

ACID is not a binary feature — different databases implement the four properties with different strengths and trade-offs. The acronym is a useful checklist, but the real work is understanding each property’s implications.

Atomicity

A transaction is atomic: it either completes entirely or has no effect at all. There is no partial success.

Consider a bank transfer: debit $100 from Alice’s account, credit $100 to Bob’s account. If the debit succeeds but the credit fails (due to a crash, network error, or constraint violation), the $100 must disappear. Atomicity ensures this cannot happen — either both operations commit or both are rolled back.

Implementation: databases implement atomicity via an undo log (or rollback segment). Before modifying a page, the database writes the original value to the undo log. If the transaction rolls back or the database crashes before commit, the original values are restored from the log.

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE user_id = 'alice';
  UPDATE accounts SET balance = balance + 100 WHERE user_id = 'bob';
COMMIT;
-- If any statement fails, ROLLBACK is automatic.
-- Both changes are applied or neither is.

Consistency

A transaction must bring the database from one valid state to another valid state. All defined rules — constraints, foreign keys, triggers, check constraints, unique constraints — must hold before and after the transaction.

Consistency is partially the database’s job (enforcing constraints) and partially the application’s job (writing semantically correct logic). The database cannot enforce business invariants it doesn’t know about: if the rule is “an account balance must never go negative”, the database can enforce this with a CHECK (balance >= 0) constraint, but only if the developer adds it.

💡
Consistency is the Weakest ACID Property

Consistency in ACID is sometimes described as “the C that doesn’t do anything.” It simply means the database won’t allow a transaction that violates defined constraints to commit. It doesn’t mean the database magically enforces all business rules — only the ones explicitly coded into the schema. The other three properties (A, I, D) are mechanical guarantees; C depends heavily on the application.

Isolation

Isolation ensures that concurrent transactions behave as if they executed serially — each transaction sees a consistent snapshot of the database, unaffected by the partial results of other in-flight transactions.

Without isolation, concurrent transactions interfere with each other, causing:

Isolation Levels

Full isolation (serializable execution) is expensive. SQL defines four standard isolation levels as a trade-off between correctness and performance:

LevelDirty ReadNon-Repeatable ReadPhantom ReadWrite Skew
Read Uncommitted✔ possible✔ possible✔ possible✔ possible
Read Committed✘ prevented✔ possible✔ possible✔ possible
Repeatable Read✘ prevented✘ prevented✔ possible*✔ possible
Serializable✘ prevented✘ prevented✘ prevented✘ prevented

* PostgreSQL’s Repeatable Read uses MVCC snapshots, which also prevents phantom reads in practice.

Read Committed (most common default)

Each statement sees only data committed before it began. A transaction can see different committed data across its statements. This is the default in PostgreSQL and Oracle, and is appropriate for most OLTP workloads where non-repeatable reads are acceptable.

Repeatable Read

The transaction sees a snapshot taken at the start of the transaction. Re-reading the same row always returns the same value. This is the default in MySQL InnoDB. It prevents the most common anomalies without the overhead of full serializability.

Serializable

The database guarantees that the outcome is identical to some serial execution of the transactions. This is the strongest guarantee. PostgreSQL implements Serializable Snapshot Isolation (SSI), which detects dangerous transaction orderings and aborts one of the conflicting transactions. Serializable adds latency and increases abort rates under contention.

MVCC: How Databases Implement Isolation

Most modern databases implement isolation via Multi-Version Concurrency Control (MVCC). Instead of locking rows for reads, the database keeps multiple versions of each row. Writers create new row versions without blocking readers. Readers see a consistent snapshot of row versions committed before their transaction began.

MVCC enables high read concurrency: readers never block writers, writers never block readers. The cost: old row versions accumulate and must be cleaned up. In PostgreSQL, this is done by VACUUM. In MySQL InnoDB, the undo log is cleaned up by the purge thread.

Durability

Once a transaction commits, its changes are permanent — they survive crashes, power failures, and OS restarts. The database must ensure that committed data is on durable storage before acknowledging the commit.

Implementation: the Write-Ahead Log (WAL). Before any data page is modified, the change is written to the WAL on disk. If the database crashes, recovery replays the WAL to restore all committed transactions. The data pages may not have been written to disk yet (they may be in the buffer pool), but the WAL records the changes, so no committed data is lost.

The critical operation is fsync() — flushing the WAL to the physical disk. Without fsync, the WAL may be in the OS page cache and lost on power failure. Some databases (and some cloud configurations) disable fsync for performance, accepting the risk of data loss on crash. This should be a deliberate, documented decision.

🚨
Disabling fsync is Dangerous

Several well-known incidents — including GitLab’s 2017 data loss event — involved databases configured with fsync disabled. A power failure or kernel panic with fsync off can corrupt the database or silently lose committed transactions. Never disable fsync in production without understanding the risk and having tested recovery procedures.

BASE Properties

BASE was coined by Eric Brewer (of CAP Theorem fame) as an alternative consistency model for distributed systems that cannot provide ACID guarantees without sacrificing availability or performance.

BASE describes systems that favor availability over consistency, accepting that data may be temporarily inconsistent across nodes, with the guarantee that it will eventually converge.

Basically Available

The system guarantees a response to every request, but the response may not be the most current data. Even during partial failures — a node going down, a network partition — the system continues to respond rather than refusing requests. The response might be stale, incomplete, or come from a replica that hasn’t received the latest writes.

This contrasts with ACID availability under Serializable isolation, where a database may block or timeout a transaction rather than return potentially inconsistent data.

Soft State

The state of the system may change over time, even without new input from clients, as replicas receive updates that were delayed or reordered. There is no guarantee that reading the same key twice returns the same value — a replica may have caught up with a write between the two reads.

Soft state reflects the reality of distributed systems: data propagates asynchronously, and at any point in time, different nodes may have different views of the world.

Eventually Consistent

If no new writes occur, all replicas will eventually converge to the same value. The system does not guarantee when convergence happens — it could be milliseconds or seconds after the write, depending on replication topology and network conditions.

Eventually consistent systems require application-level handling of consistency anomalies:

ACID vs BASE

ACIDBASE
ConsistencyStrong, immediateEventual, tunable
AvailabilityMay block on conflictAlways responds
ScalabilityHarder to distributeDesigned for horizontal scale
Data modelRelational, structuredFlexible, varied
TransactionsMulti-row, multi-tableSingle entity or limited scope
Use casesFinancial, ERP, e-commerce ordersSocial feeds, IoT, analytics, caching
ExamplesPostgreSQL, MySQL, CockroachDB, OracleCassandra, DynamoDB, CouchDB, Redis

The choice is not binary. Modern systems often blend both:

Design Considerations

In System Design Interviews

When choosing a database, justify your consistency model. “The payment service writes to PostgreSQL with Read Committed isolation — we need ACID guarantees so a failed charge never partially debits an account. The activity feed reads from a Cassandra cluster with eventual consistency — a few seconds of lag before a new post appears is fine, and Cassandra’s write throughput handles the fan-out to followers without backpressure.” This demonstrates that you pick consistency based on business requirements, not habit.