NoSQL Databases

NoSQL databases emerged to solve problems that relational databases handle poorly: extremely high write throughput, flexible schemas that evolve without migrations, and horizontal scaling across many cheap commodity servers. Understanding the different NoSQL families — document, key-value, wide-column, graph — and what each one actually optimizes for prevents the mistake of reaching for “NoSQL” as a generic solution to a problem that may not need it.

What Is NoSQL?

NoSQL (Not Only SQL) is a broad term for databases that do not use the traditional relational table model. The defining characteristics are:

Schema flexibility: Documents or rows can have different fields. Schema changes do not require ALTER TABLE migrations and associated locks.
Horizontal scaling by design: Data is partitioned across many nodes. Adding capacity means adding nodes, not upgrading a single server.
Relaxed consistency: Many NoSQL systems trade strict ACID consistency for availability and partition tolerance, accepting eventual consistency as a trade-off.
Access pattern optimization: Each NoSQL family is designed for a specific access pattern. Choosing the wrong type for your pattern is more painful than it would be with a general-purpose relational database.

💡

NoSQL Is Not One Thing

Document databases (MongoDB), key-value stores (Redis), wide-column stores (Cassandra), and graph databases (Neo4j) are all called “NoSQL,” but they have almost nothing in common beyond rejecting the relational model. Each is designed for a fundamentally different use case. Saying “we use NoSQL” tells you almost nothing about the system; saying “we use Cassandra for time-series writes and Redis for session caching” tells you everything.

Document Databases

Document databases store data as semi-structured documents, typically JSON or BSON. Each document is self-contained: the data and its metadata live together. Documents in the same collection can have different shapes; by default, the database does not enforce a schema.

How They Work

Instead of rows in tables, you have documents in collections. A user document might contain their name, email, preferences, and an embedded array of addresses — all in a single document. When you fetch the user, you get everything without a join. When you update the user’s email, you update one document.

Documents are indexed by a primary key (typically _id). Secondary indexes can be created on any field, including fields inside nested objects and arrays. Queries look like { "country": "US", "age": { "$gte": 18 } } rather than SQL.
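
To make the query shape above concrete, here is a minimal sketch of how such a filter could be evaluated against plain Python dicts. It is not how any real document database executes queries (there are no indexes here), and the `matches` helper, the operator table, and the sample users are all hypothetical names introduced for illustration.

```python
# Hypothetical in-memory matcher: supports equality and four comparison operators only.
_OPERATORS = {
    "$gte": lambda value, bound: value >= bound,
    "$gt": lambda value, bound: value > bound,
    "$lte": lambda value, bound: value <= bound,
    "$lt": lambda value, bound: value < bound,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False  # a missing field never matches in this sketch
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 18}
            if not all(_OPERATORS[op](value, bound) for op, bound in condition.items()):
                return False
        elif value != condition:
            return False
    return True


# Documents in one collection with different shapes (made-up sample data).
users = [
    {"_id": 1, "name": "Ana", "country": "US", "age": 34},
    {"_id": 2, "name": "Ben", "country": "US", "age": 16},
    {"_id": 3, "name": "Chloe", "country": "FR", "age": 41},
    {"_id": 4, "name": "Dev", "country": "US"},  # no "age" field at all
]

query = {"country": "US", "age": {"$gte": 18}}
adult_us_ids = [doc["_id"] for doc in users if matches(doc, query)]
print(adult_us_ids)  # [1]
```

Note that the document without an `age` field is simply skipped rather than raising an error, which is the flexible-schema behavior described above.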

Strengths

Schema evolution: Add or remove fields from documents without a migration. Old documents with missing fields simply don’t have them.
Data locality: Embedding related data in one document avoids joins. A blog post document can contain its author info, tags, and recent comments all in one read.
Developer ergonomics: Documents map directly to objects in application code (JSON ↔ Python dict / JavaScript object). Less impedance mismatch than SQL rows.

Weaknesses

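Limited joins: Most document databases have no general-purpose join. If data that lives in two collections must be queried together, you combine it in application code or use the database's aggregation features (MongoDB's $lookup stage, for example), which are usually less flexible than SQL joins. Relationships between collections are clumsy.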
Denormalization pressure: To avoid cross-collection lookups, you embed data. When the embedded data changes (e.g., a user’s name changes), you must update it everywhere it is embedded.
Inconsistent documents: Because the schema is unenforced, bugs can write malformed documents that are accepted silently at write time and only cause failures at read time.
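
One common mitigation for the last weakness is to validate documents in application code before writing them. The sketch below assumes a made-up user schema (`REQUIRED_FIELDS`, `OPTIONAL_FIELDS`) and a hypothetical `validation_errors` helper; real projects often use a validation library or the database's own optional schema validation instead.

```python
# Hypothetical schema for a "users" collection: field name -> expected Python type.
REQUIRED_FIELDS = {"email": str, "name": str}
OPTIONAL_FIELDS = {"age": int, "addresses": list}


def validation_errors(document):
    """Return a list of human-readable problems; an empty list means the document is OK."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in document:
            errors.append(f"missing required field: {field}")
        elif not isinstance(document[field], expected):
            errors.append(f"{field} should be {expected.__name__}")
    for field, expected in OPTIONAL_FIELDS.items():
        # Optional fields may be absent, but if present they must have the right type.
        if field in document and not isinstance(document[field], expected):
            errors.append(f"{field} should be {expected.__name__}")
    return errors


good = {"name": "Ana", "email": "ana@example.com"}
bad = {"name": "Ben", "age": "sixteen"}

print(validation_errors(good))  # []
print(validation_errors(bad))   # ['missing required field: email', 'age should be int']
```

Rejecting the bad document before the write moves the failure to the place where the bug is, instead of to some later read.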

Examples

MongoDB: the most widely deployed document database. BSON storage, rich query language, aggregation pipeline, Atlas managed service.
Couchbase: document store with built-in caching and N1QL (SQL-like query language).
Firestore: Google’s serverless document database, popular for mobile and web apps.
Amazon DocumentDB: MongoDB-compatible managed service on AWS.

Best for: Content management, product catalogs, user profiles, event data, mobile app backends where the schema evolves frequently.

Key-Value Stores

The simplest data model: every piece of data is a value identified by a unique key. The database has no knowledge of the value’s structure — it’s an opaque blob. Operations are: GET, SET, DELETE. Some stores add TTL (time-to-live) for automatic expiry, atomic counters, and pub/sub.

How They Work

Data is stored in a hash table (in-memory stores like Redis) or an LSM-tree (disk-backed stores like RocksDB). Hash-table lookups are O(1) on average; LSM-tree lookups may have to check several on-disk files, but they are still direct lookups by key. There is no query language: you cannot ask “give me all keys where value.age > 18.” You must know the key to retrieve the value.
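
As a rough illustration of the model, the sketch below implements GET, SET, and DELETE with an optional TTL on top of a Python dict. `TinyKV` is a hypothetical class written for this page, not the design of any real store; the injectable clock is an assumption added so that expiry can be shown without sleeping.

```python
import time


class TinyKV:
    """In-memory key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable so expiry can be demonstrated without sleeping

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry: remove the key when a read finds it stale
            return None
        return value

    def delete(self, key):
        """Remove the key; return True if it existed."""
        return self._data.pop(key, None) is not None


# Usage with a fake clock (a one-element list so the lambda can see updates).
now = [0.0]
store = TinyKV(clock=lambda: now[0])
store.set("session:42", b"opaque-bytes", ttl_seconds=30)

first_read = store.get("session:42")   # still within the TTL
now[0] = 31.0
second_read = store.get("session:42")  # TTL has elapsed, so the key is gone
```

The store never inspects the value, which is why the only way to find a record is to already know its key.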

Redis in Depth

Redis is more than a simple key-value store. It supports rich data structures — strings, lists, sorted sets, hashes, bitmaps, HyperLogLogs, streams — each with dedicated atomic operations. This makes it extremely versatile:

Caching: Store serialized objects with a TTL. Eviction policies (LRU, LFU) automatically remove old data when memory fills.
Session storage: Serialize session state to a string; expire it after inactivity.
Rate limiting: Increment a counter per user per minute using atomic INCR and EXPIRE.
Leaderboards: Sorted sets maintain a ranked list of scores in O(log N) per update.
Distributed locks: SET with NX (only if not exists) and a TTL implements a lock with automatic expiry.
Pub/Sub and Streams: Decouple producers and consumers with Redis Streams (persistent) or basic pub/sub (fire-and-forget).
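
The rate-limiting item above can be sketched as a fixed-window counter. To keep the example runnable without a server, `FakeRedis` below is a hypothetical stand-in that imitates only the behavior of the INCR and EXPIRE commands; it is not a real client library. The key format, the limit of 3, and the 60-second window are likewise assumptions chosen for illustration.

```python
class FakeRedis:
    """Stand-in imitating INCR and EXPIRE for this example; not a real Redis client."""

    def __init__(self, clock):
        self._values = {}
        self._expires = {}
        self._clock = clock

    def _purge_if_expired(self, key):
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key):
        # Like INCR: a missing key is treated as 0, then incremented.
        self._purge_if_expired(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        if key in self._values:
            self._expires[key] = self._clock() + seconds


def allow_request(store, user_id, now, limit=3, window_seconds=60):
    """Fixed-window limiter: at most `limit` requests per user per window."""
    window = int(now // window_seconds)
    key = f"rate:{user_id}:{window}"   # hypothetical key naming scheme
    count = store.incr(key)
    if count == 1:
        store.expire(key, window_seconds)  # first hit in the window sets the expiry
    return count <= limit


now = [0.0]
store = FakeRedis(clock=lambda: now[0])

decisions = [allow_request(store, "alice", now[0]) for _ in range(5)]
now[0] = 61.0  # move into the next window
next_window = allow_request(store, "alice", now[0])

print(decisions)    # [True, True, True, False, False]
print(next_window)  # True
```

One design caveat: issuing the increment and the expiry as two separate steps is not atomic, so a client that crashes between them could leave a counter without an expiry. Against a real server this is usually handled by running both commands in a transaction or a script.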

Examples

Redis: in-memory, supports persistence via RDB snapshots and an append-only file (AOF), cluster mode for horizontal scaling.
Memcached: simpler in-memory cache, multi-threaded, no persistence.
DynamoDB: AWS-managed, supports key-value and document models, single-digit millisecond latency at any scale.
etcd: distributed key-value store designed for configuration and service discovery (powers Kubernetes).

Best for: Session storage, caching database queries, real-time counters, rate limiting, distributed coordination.

Wide-Column Stores

Wide-column stores organize data into rows with a key, but each row can have a different set of columns, and columns are grouped into column families. Unlike document databases, wide-column stores are designed for extreme write throughput and massive datasets distributed across many nodes with no single point of failure.

How They Work

In Cassandra, you design your data model around your query patterns — not your data relationships. You create a table whose partition key determines which node stores the data, and whose clustering key determines the sort order within that partition. Writes go to the node(s) responsible for that partition key; reads go to the same node(s). This enables linear horizontal scalability: double the nodes, roughly double the throughput.
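
The roles of the two keys can be sketched in a few lines. This is a deliberately simplified model, not Cassandra's actual implementation: it picks a node with a hash modulo the node count, whereas real clusters use a token ring with a dedicated partitioner and replicate each partition to several nodes. The node names, `node_for`, `write`, and `read_range` are all hypothetical.

```python
import hashlib
from bisect import insort
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical three-node cluster


def node_for(partition_key):
    """Map a partition key to one node (simplified: hash modulo node count)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return NODES[token % len(NODES)]


# node -> partition key -> rows kept sorted by clustering key (timestamp)
storage = defaultdict(lambda: defaultdict(list))


def write(sensor_id, timestamp, reading):
    rows = storage[node_for(sensor_id)][sensor_id]
    insort(rows, (timestamp, reading))  # clustering key decides the sort order


def read_range(sensor_id, start, end):
    """Range scan inside one partition; only a single node is consulted."""
    rows = storage[node_for(sensor_id)][sensor_id]
    return [row for row in rows if start <= row[0] <= end]


write("sensor-1", 30, 21.5)
write("sensor-1", 10, 20.9)
write("sensor-1", 20, 21.1)
write("sensor-2", 15, 18.0)

recent = read_range("sensor-1", 15, 30)
print(recent)  # [(20, 21.1), (30, 21.5)]
```

Because every row for `sensor-1` lives in one partition and is already sorted by timestamp, the range read touches one node and needs no sorting, which is the property the query-first modeling style is designed to exploit.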

Under the hood, Cassandra uses an LSM-tree storage engine. Writes go to an in-memory structure (memtable) and a commit log. The memtable is periodically flushed to disk as immutable SSTables. Reads may need to merge data from the memtable and multiple SSTables, plus tombstones for deleted data — making reads more expensive than writes in write-heavy LSM systems.
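
The write and read paths described above can be sketched with a toy LSM structure. `TinyLSM` is a hypothetical class for illustration only: it leaves out the commit log, compaction, bloom filters, and on-disk files, and it flushes after just two entries so the behavior is easy to follow.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}   # mutable, in-memory, receives all writes
        self.sstables = []   # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is just another write: it records a tombstone.
        self.put(key, TOMBSTONE)

    def flush(self):
        """Freeze the memtable into an immutable sorted run."""
        if self.memtable:
            run = tuple(sorted(self.memtable.items(), key=lambda item: item[0]))
            self.sstables.append(run)
            self.memtable = {}

    def get(self, key):
        # Check the memtable first, then runs from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return None if value is TOMBSTONE else value
        for run in reversed(self.sstables):
            for stored_key, value in run:  # linear scan keeps the sketch short
                if stored_key == key:
                    return None if value is TOMBSTONE else value
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)    # second entry triggers a flush: first run holds a=1, b=2
db.put("a", 10)
db.delete("b")    # triggers another flush: second run holds a=10, b=tombstone
db.put("c", 3)    # stays in the memtable

print(db.get("a"), db.get("b"), db.get("c"))  # 10 None 3
```

A write touches only the memtable, while a read for an old key may have to look through every run, which is the asymmetry the paragraph above describes.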

Strengths

Linear write scalability: Adding nodes linearly increases write throughput. No single primary bottleneck.
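No single point of failure: Data is replicated across multiple nodes (configurable replication factor). With a replication factor of 3, for example, a single node can fail without downtime as long as the requested consistency level can still be met by the remaining replicas.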
Tunable consistency: Per-query consistency level (ONE, QUORUM, ALL). Trade consistency for latency depending on the operation.
Time-series workloads: Clustering key on timestamp makes time-ordered appends efficient and range scans fast.

Weaknesses

Query by partition key only: Without a secondary index, you can only query by the partition key. Secondary indexes in Cassandra are local and inefficient at scale.
No joins, no transactions: Cassandra has lightweight transactions (compare-and-set) but no multi-row ACID transactions. Cross-partition operations require application-level coordination.
Schema design is hard: You must know your query patterns before designing the schema. Changing query patterns often requires a new table.

Examples

Apache Cassandra: open-source, widely deployed at Netflix, Apple, Discord.
Amazon Keyspaces: managed Cassandra-compatible service.
HBase: Hadoop ecosystem wide-column store.
Google Bigtable: the original wide-column store, powering Gmail, Google Search indexing, and many Google services.

Best for: IoT sensor data, time-series metrics, activity feeds, audit logs — any workload with very high write rates and known, narrow query patterns.

Graph Databases

Graph databases store data as nodes (entities) and edges (relationships). Each node and edge can carry properties. The database is optimized for traversing relationships — following edges from node to node — which is extremely slow in a relational database with deep joins.

How They Work

In a relational database, finding all friends-of-friends requires a JOIN on the friends table with itself. With 5 degrees of separation and millions of users, this becomes a recursive join over a massive table, which is impractically slow. In a native graph database, following an edge is roughly an O(1) pointer dereference, so the cost of a traversal depends on how many edges it visits rather than on the total size of the graph. The graph is stored as adjacency lists optimized for traversal.

Queries use graph-specific languages. Cypher (Neo4j) expresses graph patterns declaratively:

MATCH (u:User {name: "Alice"})-[:FRIEND*2]->(fof:User)
WHERE fof <> u AND NOT (u)-[:FRIEND]->(fof)
RETURN DISTINCT fof.name
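
For readers unfamiliar with Cypher, the same friends-of-friends question can be sketched over a plain adjacency list. This is an illustration of the traversal idea, not of how Neo4j stores or executes anything; the `friends` graph and the `friends_of_friends` function are made up for the example.

```python
# Adjacency list: each person maps to the set of people they are friends with.
friends = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice", "Dave"},
    "Carol": {"Alice", "Dave", "Erin"},
    "Dave": {"Bob", "Carol"},
    "Erin": {"Carol"},
}


def friends_of_friends(graph, person):
    """People exactly two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        # One lookup per hop: cost depends on the edges visited, not on graph size.
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


suggestions = sorted(friends_of_friends(friends, "Alice"))
print(suggestions)  # ['Dave', 'Erin']
```

Each hop is a dictionary lookup followed by a walk over that node's own edges, so the work done is proportional to the neighborhood being explored, no matter how many other people the graph contains.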

Examples

Neo4j: the dominant graph database, native graph storage, Cypher query language.
Amazon Neptune: managed graph database, supports both property graph (Gremlin) and RDF (SPARQL).
JanusGraph: distributed graph database backed by Cassandra or HBase.

Best for: Social networks (friend recommendations), fraud detection (connected account rings), knowledge graphs, recommendation engines, identity and access management graphs.

Eventual Consistency

Most NoSQL databases sacrifice strong consistency for availability and partition tolerance (the CAP theorem trade-off). In an eventually consistent system, a write on one node is not immediately visible on other nodes — it propagates asynchronously. Given enough time without new writes, all nodes will converge to the same value.

This has real consequences:

A user updates their email. Another region reads their profile a millisecond later and sees the old email.
Two users simultaneously increment a shared counter in different regions. Both see the counter go from 10 to 11. The true count should be 12.
A user deletes a post. Another node doesn’t yet know it’s deleted and serves it to another request.
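
The counter scenario above can be reproduced in a few lines. The sketch assumes the simplest possible reconciliation rule, last write wins, with an integer version standing in for a write timestamp; `Replica` and `last_write_wins` are hypothetical names, and real systems differ in how they detect and resolve conflicts.

```python
class Replica:
    """One region's copy of a shared counter."""

    def __init__(self, value, version=0):
        self.value = value
        self.version = version  # stands in for a write timestamp

    def increment(self):
        # Read-modify-write against the local copy only.
        self.value += 1
        self.version += 1


def last_write_wins(a, b):
    """Reconcile two replicas by keeping the copy with the higher version (ties keep a)."""
    winner = a if a.version >= b.version else b
    value, version = winner.value, winner.version
    a.value = b.value = value
    a.version = b.version = version


us = Replica(10)
eu = Replica(10)

us.increment()  # US sees 10 -> 11
eu.increment()  # EU also sees 10 -> 11, unaware of the US write

last_write_wins(us, eu)
print(us.value, eu.value)  # 11 11, although two increments happened
```

Both replicas converge, which is all that eventual consistency promises, but they converge on 11: one increment has been silently lost.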

Many NoSQL databases offer tunable consistency. Cassandra lets you set the consistency level per query: reading from a QUORUM of replicas gives stronger guarantees than reading from ONE. DynamoDB offers strongly consistent reads at extra cost. MongoDB offers configurable write concerns and read concerns.
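
The usual rule of thumb behind quorum settings is that, with N replicas, a read that contacts R replicas is guaranteed to overlap with a write acknowledged by W replicas when R + W > N. The helper names below are hypothetical, and the check is only the overlap arithmetic; it says nothing about failed writes, clock skew, or repair, which real systems also have to handle.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(n, w, r):
    """True if every read set of size r must share a replica with every write set of size w."""
    return r + w > n


N = 3
print(quorum(N))                                      # 2
print(read_overlaps_write(N, w=quorum(N), r=quorum(N)))  # True:  2 + 2 > 3
print(read_overlaps_write(N, w=1, r=1))                  # False: 1 + 1 <= 3
print(read_overlaps_write(N, w=N, r=1))                  # True:  3 + 1 > 3
```

Writing and reading at ONE is the fastest combination but gives no overlap guarantee, which is why stale reads are possible at that setting.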

⚠️

Eventual Consistency Is Not “Fine for Most Cases”

Eventual consistency is correct for some use cases (activity feeds, view counts, analytics) and catastrophically wrong for others (financial balances, inventory, authentication tokens). The mistake is assuming eventual consistency is “good enough” for a use case without carefully analyzing what happens when reads return stale data. Think through: what is the worst-case impact if a user sees a value that is 500ms or 10 seconds out of date?

When to Use NoSQL

Each NoSQL type has a clear sweet spot. Use NoSQL when a specific, genuine constraint makes SQL the wrong fit:

Use Case	Recommended Type	Why
Caching, sessions, rate limiting	Key-Value (Redis)	Sub-millisecond latency, TTL, atomic ops
Variable-schema content, catalogs	Document (MongoDB)	Schema flexibility, embedded documents
High-volume time-series / IoT writes	Wide-Column (Cassandra)	Linear write scalability, partition-aware storage
Social graphs, recommendations	Graph (Neo4j)	Efficient multi-hop traversal
Full-text search	Search Engine (Elasticsearch)	Inverted index, relevance ranking
Metrics and monitoring	Time-Series (InfluxDB, Prometheus)	Optimized compression and time-window aggregations

✅

Start SQL, Add NoSQL Precisely

The pattern that works at scale is: a relational database for core transactional data, with one or two NoSQL stores added for specific performance requirements. Every NoSQL system you add is another thing to monitor, back up, tune, and be on call for. Add Redis when your cache hit rate analysis shows you need sub-millisecond lookups. Add Cassandra when your write throughput analysis shows a single PostgreSQL primary is the bottleneck. Don’t add NoSQL because you think you might need scale someday.