Foundations

Storage


Every system stores data somewhere. The question is not whether to store it but how — and the answer has profound consequences for performance, cost, scalability, and durability. Block storage, file storage, and object storage are fundamentally different models for organizing and accessing data, each suited to different workloads. Understanding them clearly lets you make deliberate choices instead of defaulting to whatever is familiar.

Storage Types

There are three primary storage abstractions used in modern system design: block storage, file storage, and object storage.

Each model exposes a different interface to the data, has different performance characteristics, and scales differently. They are not interchangeable.

Block Storage

Block storage presents raw storage as a sequence of fixed-size blocks (typically 512 bytes to 4 KB). The storage system has no knowledge of the data structure — it just reads and writes blocks at addresses. The operating system or application is responsible for organizing data on top of the raw blocks, typically using a filesystem (ext4, XFS, NTFS) or a database engine.

How It Works

A block storage volume appears to the OS as a local disk. The OS formats it with a filesystem, and applications access files through normal file I/O syscalls. Alternatively, a database can use a block device directly without a filesystem (raw I/O), bypassing the OS page cache for more predictable performance.
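The block model can be sketched in a few lines of Python. Here a regular file stands in for the volume (the filename `volume.img` and the helper names are illustrative; a real database doing raw I/O would open an actual device node such as `/dev/nvme0n1`):

```python
import os

BLOCK_SIZE = 4096  # a typical block size

def write_block(fd, block_num, data):
    """Write one fixed-size block at its byte offset on the 'device'."""
    assert len(data) == BLOCK_SIZE
    os.pwrite(fd, data, block_num * BLOCK_SIZE)

def read_block(fd, block_num):
    """Read one block by address; the device knows nothing about contents."""
    return os.pread(fd, BLOCK_SIZE, block_num * BLOCK_SIZE)

# Simulate a small 16-block volume with a regular file.
fd = os.open("volume.img", os.O_RDWR | os.O_CREAT)
os.ftruncate(fd, 16 * BLOCK_SIZE)

write_block(fd, 3, b"x" * BLOCK_SIZE)
assert read_block(fd, 3) == b"x" * BLOCK_SIZE
os.close(fd)
```

Note that the interface is purely address plus bytes: everything above that (filenames, directories, records) is the filesystem's or database's job.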

Characteristics

- Lowest latency of the three models (sub-millisecond on NVMe) and the highest random IOPS
- Mutable in place: any block can be rewritten at any time
- A volume is typically attached to a single host at a time
- Scales to terabytes per volume; durability depends on RAID or replication underneath

Examples

AWS EBS volumes, local NVMe/SATA drives, SAN LUNs exposed over iSCSI or Fibre Channel.

💡
SSD vs HDD

HDDs (spinning disks) are cheap per GB but have high seek latency (~5–10ms) due to mechanical movement. SSDs have no moving parts, delivering ~0.1ms latency and much higher IOPS at higher cost per GB. NVMe SSDs (attached directly to the PCIe bus rather than through SATA/SAS) are another 5–10× faster than SATA SSDs. For databases, NVMe SSDs are the default choice whenever the budget allows. HDDs survive for cold archival storage where cost dominates over latency.
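A back-of-envelope calculation shows why those latencies matter: a single queue of random I/Os cannot complete faster than one access time per operation. This sketch ignores queue depth and parallelism (real NVMe devices reach far higher IOPS by serving many requests concurrently):

```python
def max_random_iops(access_latency_s):
    """Upper bound on serial random IOPS given per-access latency."""
    return 1.0 / access_latency_s

hdd_iops = max_random_iops(0.005)    # ~5 ms seek
ssd_iops = max_random_iops(0.0001)   # ~0.1 ms access
print(round(hdd_iops), round(ssd_iops))  # 200 10000
```

That 50x gap per queue is why a database that fits its working set on an HDD can feel unusable, while the same workload on SSD is routine.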

File Storage

File storage organizes data as a hierarchy of directories and named files, accessible through standard filesystem protocols (POSIX, NFS, SMB/CIFS). Clients mount the storage and interact with it using familiar file operations: open, read, write, seek, close.

How It Works

A file storage server (NAS appliance or cloud service) exports one or more shares over a network protocol. Clients mount the share and it appears as a local filesystem. Multiple clients can mount the same share simultaneously, enabling shared access to the same files from many servers — the key capability that distinguishes file storage from block storage.
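The transparency of the mount is the point: the same POSIX calls work whether the path is on a local disk or an NFS/SMB share. A sketch of the classic shared-access pattern, using a POSIX byte-range lock so concurrent clients on the share do not clobber each other's read-modify-write (the path here is illustrative; on a real share it would be something like a directory under an NFS mount point):

```python
import fcntl
import os

path = "/tmp/shared/counter.txt"  # stand-in for a path on a mounted share
os.makedirs(os.path.dirname(path), exist_ok=True)

with open(path, "a+") as f:
    # POSIX record lock: honored across NFS clients, unlike simple
    # flock() on older NFS versions.
    fcntl.lockf(f, fcntl.LOCK_EX)
    f.seek(0)
    value = int(f.read() or 0)
    f.seek(0)
    f.truncate()
    f.write(str(value + 1))
    fcntl.lockf(f, fcntl.LOCK_UN)
```

Every client mounting the share sees the same file and the same lock, which is exactly the capability block storage cannot offer.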

Characteristics

- Network latency on the order of 1-5 ms per operation
- Full POSIX semantics: permissions, partial in-place updates, locking
- Hierarchical namespace of directories and named files
- Shared access from many clients simultaneously; scales to hundreds of terabytes

Examples

NFS and SMB shares, NAS appliances, managed cloud file services such as AWS EFS.

Object Storage

Object storage stores data as discrete objects in a flat namespace. Each object consists of the data itself, a unique key (the object’s identifier), and metadata (content type, size, custom attributes). There is no directory hierarchy — all objects live in a flat bucket, though keys can contain slashes to simulate folder paths.

How It Works

Objects are accessed via an HTTP API (typically S3-compatible). You PUT an object to store it, GET it to retrieve it, and DELETE it to remove it. Objects are immutable — you cannot update a portion of an object in place; you must write a new version. This immutability is fundamental to how object storage achieves its durability and scalability guarantees.
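The semantics can be captured in a toy in-memory store (the class and method names here are illustrative, not a real client library): whole-object PUT/GET/DELETE over a flat key space, with "folders" existing only as a key-prefix convention.

```python
class ToyObjectStore:
    """Minimal sketch of S3-style semantics: flat namespace,
    whole-object writes, no partial updates."""

    def __init__(self):
        self._objects = {}  # key -> (bytes, metadata)

    def put(self, key, data, **metadata):
        # Whole-object write: replaces any existing object under the key.
        self._objects[key] = (bytes(data), metadata)

    def get(self, key):
        data, _ = self._objects[key]
        return data

    def delete(self, key):
        self._objects.pop(key, None)

    def list(self, prefix=""):
        # No real directories: listing is just a key-prefix filter.
        return sorted(k for k in self._objects if k.startswith(prefix))

store = ToyObjectStore()
store.put("photos/2024/cat.jpg", b"jpeg-bytes-1", content_type="image/jpeg")
store.put("photos/2024/dog.jpg", b"jpeg-bytes-2")
print(store.list("photos/"))  # ['photos/2024/cat.jpg', 'photos/2024/dog.jpg']
```

To "edit" an object you `put` a complete replacement under the same key; a real backend turns that replace-only discipline into versioning, replication, and erasure coding behind the scenes.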

Characteristics

- HTTP access latency (tens of milliseconds), but very high throughput for large sequential reads
- Flat key-value namespace; immutable objects (replace, never update in place)
- Effectively unlimited scale (exabytes) and extreme durability (e.g. 11 nines on S3)

Examples

AWS S3, Google Cloud Storage, Azure Blob Storage, MinIO, Ceph RGW.

S3 as the Universal Interface

The AWS S3 API has become the de facto standard for object storage. Every major cloud provider, most self-hosted solutions (MinIO, Ceph), and dozens of data tools (Spark, Flink, DVC, MLflow) support the S3 API natively. When designing a system that needs object storage, designing for the S3 API gives you maximum portability and tool compatibility, regardless of which backend you actually use.

Comparing Storage Types

| | Block | File | Object |
|---|---|---|---|
| Access method | Raw I/O / filesystem | NFS, SMB (POSIX) | HTTP API (S3) |
| Namespace | Blocks at addresses | Hierarchical (directories) | Flat (key-value) |
| Latency | Sub-ms (NVMe) | 1-5 ms (network) | 10-100 ms (HTTP) |
| Throughput | Very high (random I/O) | High | Very high (large sequential) |
| Shared access | No (single host) | Yes (multi-client) | Yes (HTTP) |
| Mutability | Mutable | Mutable | Immutable (replace only) |
| Scale | TB per volume | Hundreds of TB | Exabytes |
| Durability | Depends on RAID/replication | Depends on NAS config | 11 nines (e.g. S3) |
| Best for | Databases, OS volumes | Shared filesystems | Media, backups, data lakes |

RAID

RAID (Redundant Array of Independent Disks) combines multiple physical drives into a single logical volume that provides redundancy, performance, or both. RAID is implemented either in hardware (a dedicated RAID controller) or in software (Linux mdadm, ZFS).

Common RAID Levels

RAID 0 — Striping

Data is split (striped) across multiple disks. Reads and writes are parallelized, doubling throughput with two disks, tripling with three. No redundancy — if any single disk fails, all data is lost. Used only when performance is paramount and data loss is acceptable (scratch disks, temporary caches).
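Striping is simple enough to sketch directly (function names and the 4-byte chunk size are illustrative; real arrays stripe in chunks of tens or hundreds of KB):

```python
def stripe(data, num_disks, chunk=4):
    """RAID 0: deal out fixed-size chunks round-robin across disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % num_disks] += data[i:i + chunk]
    return [bytes(d) for d in disks]

def unstripe(disks, total_len, chunk=4):
    """Reassembly needs EVERY disk -- lose one and all data is gone."""
    out = bytearray()
    positions = [0] * len(disks)
    i = 0
    while len(out) < total_len:
        d = i % len(disks)
        out += disks[d][positions[d]:positions[d] + chunk]
        positions[d] += chunk
        i += 1
    return bytes(out)

data = b"ABCDEFGHIJKLMNOP"
disks = stripe(data, 2)
assert disks == [b"ABCDIJKL", b"EFGHMNOP"]
assert unstripe(disks, len(data)) == data
```

Because consecutive chunks land on different disks, sequential reads and writes proceed in parallel; the assert also makes the failure mode visible, since no single disk holds a usable copy of anything.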

RAID 1 — Mirroring

Every write goes to two (or more) disks simultaneously. The content of all disks is identical. If one disk fails, the other contains a complete copy. Read performance can be doubled (reads served from either disk). 50% storage efficiency — two 1 TB disks give 1 TB usable. Simple and robust. Commonly used for OS boot drives and small critical datasets.

RAID 5 — Striping with Distributed Parity

Data and parity information are striped across at least three disks. Parity allows reconstruction of any single failed disk. Reads are fast (parallelized). Writes require parity calculation (some overhead). Can tolerate one disk failure. Storage efficiency is (N-1)/N — three 1 TB disks give 2 TB usable. The most common RAID level for NAS devices.
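The parity trick is plain XOR: the parity block is the XOR of the data blocks, so XOR-ing the survivors with parity reproduces any one lost block. A sketch (a real RAID 5 array rotates which disk holds parity from stripe to stripe; it is fixed here for clarity):

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Three data disks contribute one stripe each; parity goes to a fourth.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Disk 1 dies: XOR the surviving data with parity to rebuild it.
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1
```

The same identity explains the write overhead: every small write must also read and rewrite the parity block for its stripe.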

RAID 6 — Striping with Double Parity

Like RAID 5 but with two parity blocks, allowing two simultaneous disk failures. Requires at least four disks. Higher write overhead than RAID 5. Preferred for large arrays where the probability of a second disk failing during a rebuild is non-negligible.

RAID 10 — Mirroring + Striping

Combines RAID 1 mirroring with RAID 0 striping. Requires at least four disks. Provides both the redundancy of mirroring and the performance of striping. 50% storage efficiency. The preferred choice for databases: high IOPS, high throughput, and can survive multiple simultaneous failures as long as the failed drives are not the same mirror pair.

| Level | Redundancy | Min disks | Efficiency | Best for |
|---|---|---|---|---|
| RAID 0 | None | 2 | 100% | Max performance, no durability |
| RAID 1 | 1 disk failure | 2 | 50% | Simple redundancy, small arrays |
| RAID 5 | 1 disk failure | 3 | (N-1)/N | NAS, balanced performance/redundancy |
| RAID 6 | 2 disk failures | 4 | (N-2)/N | Large arrays, higher safety |
| RAID 10 | 1 per mirror pair | 4 | 50% | Databases, high-performance workloads |
⚠️
RAID Is Not a Backup

RAID protects against disk hardware failure. It does not protect against accidental deletion, ransomware, filesystem corruption, or datacenter failure. A RAID 1 mirror that contains corrupted data mirrors the corruption to both drives. Always maintain separate backups (offsite, point-in-time snapshots) in addition to RAID.

NAS and SAN

NAS and SAN are two approaches to providing networked storage, differing fundamentally in what they expose over the network.

NAS — Network Attached Storage

A NAS device presents a filesystem over the network using file-level protocols (NFS, SMB). Clients mount the share and see files and directories. The NAS handles the filesystem internally; clients just see file operations. Multiple clients can share the same NAS simultaneously. NAS is easy to set up and manage, and is the standard approach for shared file storage in small-to-medium environments.

SAN — Storage Area Network

A SAN presents raw block devices over a dedicated high-speed network (typically Fibre Channel or iSCSI). The client OS sees a disk device, formats it with a filesystem, and manages it like a local drive. SANs provide very low latency (comparable to local block storage) and are used for performance-critical workloads like large databases and virtualization platforms. SANs are expensive and complex to operate — primarily used in enterprise data centers.

| | NAS | SAN |
|---|---|---|
| What it exposes | Files (NFS, SMB) | Raw blocks (FC, iSCSI) |
| Who manages filesystem | NAS device | Client OS |
| Simultaneous clients | Yes (file sharing) | Typically one per LUN |
| Performance | Good | Excellent (near-local) |
| Complexity | Low | High |
| Cost | Moderate | High |

Distributed File Systems

When storage requirements exceed what a single machine can provide — petabytes of data, thousands of concurrent clients — distributed file systems spread data across many machines while presenting a single unified namespace.

HDFS — Hadoop Distributed File System

HDFS was designed for batch processing of very large files (think: 128 MB–GB+ per file). It stores files by splitting them into large blocks (default 128 MB) and replicating each block across multiple DataNodes (typically 3×). A centralized NameNode tracks block locations.
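The NameNode's bookkeeping can be sketched as a block-placement table (a simplification: real HDFS placement is rack-aware rather than round-robin, and the function and node names here are illustrative):

```python
from itertools import cycle

def place_blocks(file_size, datanodes, block_size=128 * 2**20, replicas=3):
    """Split a file into fixed-size blocks and assign each block's
    replicas to DataNodes round-robin."""
    num_blocks = -(-file_size // block_size)  # ceiling division
    ring = cycle(range(len(datanodes)))
    table = {}
    for b in range(num_blocks):
        table[b] = [datanodes[next(ring)] for _ in range(replicas)]
    return table

nodes = ["dn1", "dn2", "dn3", "dn4"]
layout = place_blocks(file_size=300 * 2**20, datanodes=nodes)
print(layout)  # {0: ['dn1', 'dn2', 'dn3'], 1: ['dn4', 'dn1', 'dn2'], ...}
```

A 300 MB file becomes three 128 MB blocks, each stored on three of the four DataNodes; losing any single node leaves every block with at least two live replicas.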

Ceph

A self-hosted distributed storage platform that provides block storage (RBD), file storage (CephFS), and object storage (RGW with S3-compatible API) over the same cluster. Designed for large-scale on-premises deployments where you need the flexibility of all three storage types without separate systems.

GFS — Google File System

The predecessor to HDFS, GFS was designed for Google's workloads: large files, high sequential throughput, fault tolerance at scale. Its design influenced HDFS and much of the thinking behind distributed storage systems. GFS is not publicly available; HDFS is the open-source equivalent.

Storage in System Design

Every component of a system stores data somewhere. Matching the storage type to the workload is one of the most important architectural decisions:

Decision Framework

- Need sub-millisecond latency and random in-place writes (databases, VM disks)? Block storage.
- Need many servers to share the same files with POSIX semantics? File storage.
- Storing large, mostly immutable data (media, backups, data lakes) at scale? Object storage.

Common Patterns

Application servers are stateless: Application code and configuration come from object storage or are baked into container images. Local disk on application servers is ephemeral. User-generated content goes directly to object storage, never to the local filesystem of an application server.

Databases use block storage: Attach a high-performance block volume (EBS io2, local NVMe) to your database server. The database manages the filesystem. Use RAID 10 or cloud-managed replication for durability.

Object storage as the data lake: Ingest raw data into S3 (or equivalent). Query it with serverless tools (AWS Athena, BigQuery) or batch processing frameworks (Spark). Avoid moving large datasets to HDFS on-premises unless you have a specific reason to — managed object storage is simpler to operate and often cheaper.

💡
In System Design Interviews

When a component needs to store data, explicitly name the storage type and justify it. “User profile photos go to S3 — object storage gives us eleven-nines durability and virtually unlimited scale at low cost” is far stronger than “we store images somewhere.” For databases, mention block storage and the volume type (SSD/NVMe vs HDD). If asked about large-scale data processing, distinguish HDFS from modern S3-based data lake architectures. Bring up RAID only if the question involves on-premises infrastructure or data durability at the disk level.