ACID and BASE
ACID and BASE are two opposing consistency models for database systems. ACID (Atomicity, Consistency, Isolation, Durability) describes the guarantees provided by traditional relational databases for transactions. BASE (Basically Available, Soft state, Eventually consistent) describes the relaxed consistency model adopted by distributed NoSQL systems that prioritize availability and performance over strict correctness. Understanding both is essential for choosing the right database technology and reasoning about the behavior of distributed systems.
ACID Properties
ACID was formalized in the 1970s–80s to describe what it means for a database transaction to be reliably processed. A transaction is a unit of work that groups multiple reads and writes into an atomic operation. Either all of it commits or none of it does.
ACID is not a binary feature — different databases implement the four properties with different strengths and trade-offs. The acronym is a useful checklist, but the real work is understanding each property’s implications.
Atomicity
A transaction is atomic: it either completes entirely or has no effect at all. There is no partial success.
Consider a bank transfer: debit $100 from Alice’s account, credit $100 to Bob’s account. If the debit succeeds but the credit fails (due to a crash, network error, or constraint violation), the $100 must disappear. Atomicity ensures this cannot happen — either both operations commit or both are rolled back.
Implementation: databases implement atomicity via an undo log (or rollback segment). Before modifying a page, the database writes the original value to the undo log. If the transaction rolls back or the database crashes before commit, the original values are restored from the log.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE user_id = 'alice';
UPDATE accounts SET balance = balance + 100 WHERE user_id = 'bob';
COMMIT;
-- If any statement fails, ROLLBACK is automatic.
-- Both changes are applied or neither is.
Consistency
A transaction must bring the database from one valid state to another valid state. All defined rules — constraints, foreign keys, triggers, check constraints, unique constraints — must hold before and after the transaction.
Consistency is partially the database’s job (enforcing constraints) and partially the application’s job (writing semantically correct logic). The database cannot enforce business invariants it doesn’t know about: if the rule is “an account balance must never go negative”, the database can enforce this with a CHECK (balance >= 0) constraint, but only if the developer adds it.
Consistency in ACID is sometimes described as “the C that doesn’t do anything.” It simply means the database won’t allow a transaction that violates defined constraints to commit. It doesn’t mean the database magically enforces all business rules — only the ones explicitly coded into the schema. The other three properties (A, I, D) are mechanical guarantees; C depends heavily on the application.
Isolation
Isolation ensures that concurrent transactions behave as if they executed serially — each transaction sees a consistent snapshot of the database, unaffected by the partial results of other in-flight transactions.
Without isolation, concurrent transactions interfere with each other, causing:
- Dirty read: Transaction A reads data written by Transaction B, which has not yet committed. If B rolls back, A has read data that never existed.
- Non-repeatable read: Transaction A reads a row, then reads it again within the same transaction and gets a different value because B committed a change between the two reads.
- Phantom read: Transaction A queries a set of rows matching a condition, then re-queries and finds different rows because B inserted or deleted rows matching the condition between the two queries.
- Write skew: Two transactions each read overlapping data, make decisions based on what they read, and write back values that together violate an invariant. Example: two doctors both see that at least two doctors are on call, both decide to take the day off, leaving no doctor on call.
Isolation Levels
Full isolation (serializable execution) is expensive. SQL defines four standard isolation levels as a trade-off between correctness and performance:
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Write Skew |
|---|---|---|---|---|
| Read Uncommitted | ✔ possible | ✔ possible | ✔ possible | ✔ possible |
| Read Committed | ✘ prevented | ✔ possible | ✔ possible | ✔ possible |
| Repeatable Read | ✘ prevented | ✘ prevented | ✔ possible* | ✔ possible |
| Serializable | ✘ prevented | ✘ prevented | ✘ prevented | ✘ prevented |
* PostgreSQL’s Repeatable Read uses MVCC snapshots, which also prevents phantom reads in practice.
Read Committed (most common default)
Each statement sees only data committed before it began. A transaction can see different committed data across its statements. This is the default in PostgreSQL and Oracle, and is appropriate for most OLTP workloads where non-repeatable reads are acceptable.
Repeatable Read
The transaction sees a snapshot taken at the start of the transaction. Re-reading the same row always returns the same value. This is the default in MySQL InnoDB. It prevents the most common anomalies without the overhead of full serializability.
Serializable
The database guarantees that the outcome is identical to some serial execution of the transactions. This is the strongest guarantee. PostgreSQL implements Serializable Snapshot Isolation (SSI), which detects dangerous transaction orderings and aborts one of the conflicting transactions. Serializable adds latency and increases abort rates under contention.
MVCC: How Databases Implement Isolation
Most modern databases implement isolation via Multi-Version Concurrency Control (MVCC). Instead of locking rows for reads, the database keeps multiple versions of each row. Writers create new row versions without blocking readers. Readers see a consistent snapshot of row versions committed before their transaction began.
MVCC enables high read concurrency: readers never block writers, writers never block readers. The cost: old row versions accumulate and must be cleaned up. In PostgreSQL, this is done by VACUUM. In MySQL InnoDB, the undo log is cleaned up by the purge thread.
Durability
Once a transaction commits, its changes are permanent — they survive crashes, power failures, and OS restarts. The database must ensure that committed data is on durable storage before acknowledging the commit.
Implementation: the Write-Ahead Log (WAL). Before any data page is modified, the change is written to the WAL on disk. If the database crashes, recovery replays the WAL to restore all committed transactions. The data pages may not have been written to disk yet (they may be in the buffer pool), but the WAL records the changes, so no committed data is lost.
The critical operation is fsync() — flushing the WAL to the physical disk. Without fsync, the WAL may be in the OS page cache and lost on power failure. Some databases (and some cloud configurations) disable fsync for performance, accepting the risk of data loss on crash. This should be a deliberate, documented decision.
Several well-known incidents — including GitLab’s 2017 data loss event — involved databases configured with fsync disabled. A power failure or kernel panic with fsync off can corrupt the database or silently lose committed transactions. Never disable fsync in production without understanding the risk and having tested recovery procedures.
BASE Properties
BASE was coined by Eric Brewer (of CAP Theorem fame) as an alternative consistency model for distributed systems that cannot provide ACID guarantees without sacrificing availability or performance.
BASE describes systems that favor availability over consistency, accepting that data may be temporarily inconsistent across nodes, with the guarantee that it will eventually converge.
Basically Available
The system guarantees a response to every request, but the response may not be the most current data. Even during partial failures — a node going down, a network partition — the system continues to respond rather than refusing requests. The response might be stale, incomplete, or come from a replica that hasn’t received the latest writes.
This contrasts with ACID availability under Serializable isolation, where a database may block or timeout a transaction rather than return potentially inconsistent data.
Soft State
The state of the system may change over time, even without new input from clients, as replicas receive updates that were delayed or reordered. There is no guarantee that reading the same key twice returns the same value — a replica may have caught up with a write between the two reads.
Soft state reflects the reality of distributed systems: data propagates asynchronously, and at any point in time, different nodes may have different views of the world.
Eventually Consistent
If no new writes occur, all replicas will eventually converge to the same value. The system does not guarantee when convergence happens — it could be milliseconds or seconds after the write, depending on replication topology and network conditions.
Eventually consistent systems require application-level handling of consistency anomalies:
- Read-your-own-writes: After writing, the client may read from a replica that hasn’t received the write yet. Session consistency (routing a user’s reads to the same replica they wrote to) mitigates this.
- Causal consistency: If event B depends on event A, a system that sees B should also see A. Without causal tracking, a replica might serve B without having applied A.
- Conflict resolution: If two clients write to the same key on different replicas before either replica has received the other’s write, the system must resolve the conflict. Common strategies: last-write-wins (by timestamp), application-defined merge, CRDTs.
ACID vs BASE
| ACID | BASE | |
|---|---|---|
| Consistency | Strong, immediate | Eventual, tunable |
| Availability | May block on conflict | Always responds |
| Scalability | Harder to distribute | Designed for horizontal scale |
| Data model | Relational, structured | Flexible, varied |
| Transactions | Multi-row, multi-table | Single entity or limited scope |
| Use cases | Financial, ERP, e-commerce orders | Social feeds, IoT, analytics, caching |
| Examples | PostgreSQL, MySQL, CockroachDB, Oracle | Cassandra, DynamoDB, CouchDB, Redis |
The choice is not binary. Modern systems often blend both:
- DynamoDB is BASE by default but offers optional strongly consistent reads at higher cost and latency.
- Cassandra allows tuning consistency level per operation: ONE (fastest, least consistent), QUORUM (middle ground), ALL (strongest, slowest).
- CockroachDB and Spanner provide ACID semantics across a globally distributed cluster by using consensus protocols (Raft, TrueTime) — they are distributed SQL, not BASE.
- Many architectures use a SQL database (ACID) for the write path and an eventually consistent cache or read replica (BASE) for the read path, combining the benefits of both.
Design Considerations
- Money moves on ACID. Financial transactions, inventory decrement, seat reservation — anything where a partial update causes real-world harm requires ACID guarantees. Accept the performance trade-off.
- Social interactions tolerate BASE. If a user’s like count is 1000 vs 1001 for a few hundred milliseconds, the difference is imperceptible. The availability and throughput gains of eventual consistency outweigh the momentary inconsistency.
- Read your isolation level contract. Many engineers assume their database provides Serializable isolation when it actually defaults to Read Committed. Check your database’s default isolation level and understand the anomalies it allows. Several famous data corruption bugs have stemmed from assuming stronger isolation than the database provides.
- Idempotency for eventually consistent writes. When writing to a BASE system, design operations to be idempotent — executing the same operation twice should produce the same result as executing it once. This handles the case where a write succeeds but the acknowledgment is lost, causing the client to retry.
When choosing a database, justify your consistency model. “The payment service writes to PostgreSQL with Read Committed isolation — we need ACID guarantees so a failed charge never partially debits an account. The activity feed reads from a Cassandra cluster with eventual consistency — a few seconds of lag before a new post appears is fine, and Cassandra’s write throughput handles the fan-out to followers without backpressure.” This demonstrates that you pick consistency based on business requirements, not habit.