VMs & Containers
Virtual machines and containers are two approaches to running isolated workloads on shared hardware. Both let you run multiple applications on the same physical server without interference. They achieve isolation differently — VMs virtualize hardware; containers virtualize the operating system. Understanding the distinction matters when choosing infrastructure, sizing workloads, and designing deployment pipelines.
Virtual Machines
A virtual machine (VM) is a complete emulation of a physical computer. A hypervisor — software that sits between the hardware and the VMs — creates and manages these virtual environments. Each VM gets its own virtual CPU, memory, disk, and network interface.
Two types of hypervisors:
- Type 1 (bare-metal): Runs directly on hardware with no host OS. Examples: VMware ESXi, Microsoft Hyper-V, Xen. This is what cloud providers use — AWS EC2 instances run as VMs on Type 1 hypervisors on physical servers in data centers.
- Type 2 (hosted): Runs on top of a host operating system as an application. Examples: VirtualBox, VMware Workstation, Parallels. Used for development and testing on laptops, not for production workloads.
Each VM runs its own full operating system — its own kernel, device drivers, system libraries, and application stack. A VM image for a Linux web server might be 10–20 GB. Boot time is typically 30–60 seconds, since an entire OS must initialize.
Containers
A container is a lightweight, isolated process that shares the host machine’s operating system kernel. Containers don’t virtualize hardware — they use Linux kernel features to create the illusion of isolation:
- Namespaces: Isolate what a process can see. Each container gets its own namespace for process IDs (PID), network interfaces, filesystem mount points, hostname, and user IDs. A process inside a container can only see other processes in the same container; it can’t see processes on the host or in other containers.
- cgroups (control groups): Limit what a process can consume. cgroups enforce CPU, memory, and disk I/O limits per container; network bandwidth is usually shaped separately with traffic control. Without cgroups, a single container could exhaust all host resources and starve other containers.
- Union filesystem (OverlayFS): Layers read-only image layers with a writable container layer. This enables the efficient image layer-sharing model Docker uses (described below).
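As a rough illustration of how cgroup limits are expressed, the sketch below parses the text format of two cgroup v2 interface files, `cpu.max` and `memory.max`. It assumes cgroup v2 (cgroup v1 uses different file names), the function names are made up for this example, and it works on strings rather than reading `/sys/fs/cgroup` so it can run on any machine.

```python
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Return the CPU limit in cores from a cgroup v2 cpu.max value.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when no limit is set. Returns None for the unlimited case.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Return the memory limit in bytes from a cgroup v2 memory.max value.

    The file holds a byte count, or "max" when no limit is set.
    """
    value = text.strip()
    return None if value == "max" else int(value)


if __name__ == "__main__":
    print(parse_cpu_max("50000 100000"))  # quota is half the period: half a core
    print(parse_memory_max("268435456"))  # a 256 MiB limit, in bytes
    print(parse_cpu_max("max 100000"))    # no CPU limit set
```

On a cgroup v2 host, a container started with `docker run --cpus=0.5 --memory=256m` would typically show values like the first two inside its cgroup.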
Because containers share the host kernel, they have no OS boot sequence. A container starts in milliseconds — it’s just launching a process. A Docker container image for a web server might be 50–200 MB, compared to 10–20 GB for a VM image.
How Docker Works
Docker is the most widely used container runtime. Its architecture has three components:
- Docker daemon (`dockerd`): A background process that manages containers, images, networks, and volumes. Exposes a REST API.
- Docker CLI: The `docker` command-line tool. Sends API requests to `dockerd`.
- containerd: The actual container runtime that manages the container lifecycle (create, start, stop, delete). Docker delegates to containerd; Kubernetes can use containerd directly without Docker.
When you run `docker run nginx`:
- The CLI sends a request to `dockerd`.
- `dockerd` pulls the `nginx` image from Docker Hub if not cached locally.
- `dockerd` asks `containerd` to create a container from the image.
- `containerd` sets up namespaces and cgroups, mounts the image layers, and starts the `nginx` process.
- The container is running: `nginx` is now an isolated process on the host.
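The CLI-to-daemon part of this flow can be sketched as a list of Docker Engine API requests. This is a simplified plan, not what the real CLI does step for step: it sends nothing, it leaves out the API version prefix, stream attachment, and error handling, and `plan_docker_run` is a hypothetical helper. The `{id}` in the last path stands for the container ID that the create call would return.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def plan_docker_run(image: str, tag: str = "latest",
                    cached: bool = False) -> List[Tuple[str, str]]:
    """List the (method, path) pairs a simplified `docker run` would send."""
    calls: List[Tuple[str, str]] = []
    if not cached:
        # Pull the image first when it is not in the local image store.
        query = urlencode({"fromImage": image, "tag": tag})
        calls.append(("POST", f"/images/create?{query}"))
    # Create the container, then start it by ID.
    calls.append(("POST", "/containers/create"))
    calls.append(("POST", "/containers/{id}/start"))
    return calls


if __name__ == "__main__":
    for method, path in plan_docker_run("nginx"):
        print(method, path)
```

Everything after the create and start requests (namespaces, cgroups, layer mounts) happens on the daemon side and is not visible at this level.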
Images and Layers
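A Docker image is built from layers. Instructions that add or change files (RUN, COPY, ADD) each create a new layer on top of the base image named in FROM; instructions such as CMD only record metadata: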
# Layer 1: base Ubuntu filesystem
FROM ubuntu:22.04
# Layer 2: nginx binary and dependencies (refresh the package index first; -y skips the prompt)
RUN apt-get update && apt-get install -y nginx
# Layer 3: your application code
COPY ./app /var/www/html
# Metadata only (no new layer)
CMD ["nginx", "-g", "daemon off;"]
Layers are content-addressed and cached. If layers 1 and 2 haven’t changed, rebuilding after modifying your app code only creates a new layer 3. Layers are also shared between containers: if 10 containers use the same Ubuntu base image, that base layer exists once on disk.
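The caching behavior can be illustrated with a toy model in which each layer ID is a hash of its parent's ID plus the build step. This is a sketch of the idea, not Docker's actual algorithm: real builds key COPY steps on file checksums, and the step strings and function names here are invented for the example.

```python
import hashlib
from typing import Dict, List


def layer_ids(steps: List[str]) -> List[str]:
    """Derive one ID per build step by hashing the parent ID plus the step.

    Changing a step changes its ID and every ID after it, while earlier
    IDs stay the same, which is what lets a build cache reuse them.
    """
    ids = []
    parent = ""
    for step in steps:
        parent = hashlib.sha256((parent + "\n" + step).encode()).hexdigest()
        ids.append(parent)
    return ids


def build(steps: List[str], cache: Dict[str, str]) -> List[str]:
    """Return 'cached' or 'built' per step, filling the cache as it goes."""
    results = []
    for layer_id, step in zip(layer_ids(steps), steps):
        if layer_id in cache:
            results.append("cached")
        else:
            cache[layer_id] = step  # stand-in for storing the layer contents
            results.append("built")
    return results


if __name__ == "__main__":
    cache: Dict[str, str] = {}
    v1 = ["FROM ubuntu:22.04", "RUN install nginx", "COPY app (contents v1)"]
    v2 = ["FROM ubuntu:22.04", "RUN install nginx", "COPY app (contents v2)"]
    print(build(v1, cache))  # first build: nothing is cached yet
    print(build(v2, cache))  # only the changed COPY step is rebuilt
```

Because each ID depends on its parent, a change early in the Dockerfile invalidates every later layer, which is why frequently changing steps are usually placed last.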
When a container runs, Docker adds a thin writable layer on top of the read-only image layers. All writes go to this container layer. When the container is deleted, the writable layer is deleted — the underlying image layers are unchanged. This is why containers should treat their filesystem as ephemeral; persistent data must be stored in volumes (host directory mounts or Docker-managed volumes).
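A minimal model of this read-only-plus-writable arrangement, assuming files can be represented as dictionary entries, might look like the following. `ContainerFS` is a hypothetical class for illustration; a real union filesystem such as OverlayFS works on directories in the kernel, but the lookup order and the use of deletion markers follow the same idea.

```python
from typing import Any, Dict, List, Optional

_WHITEOUT = object()  # marks a path as deleted in the writable layer


class ContainerFS:
    """Toy union filesystem: read-only image layers plus one writable layer."""

    def __init__(self, image_layers: List[Dict[str, str]]):
        self.image_layers = image_layers  # lowest layer first, never mutated
        self.upper: Dict[str, Any] = {}   # per-container writable layer

    def read(self, path: str) -> Optional[str]:
        # The writable layer is consulted first, then image layers top-down.
        if path in self.upper:
            value = self.upper[path]
            return None if value is _WHITEOUT else value
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        return None

    def write(self, path: str, data: str) -> None:
        # Writes never touch the image; they land in the writable layer.
        self.upper[path] = data

    def delete(self, path: str) -> None:
        # Hide the path without modifying the image layer underneath.
        self.upper[path] = _WHITEOUT


if __name__ == "__main__":
    base = {"/etc/os-release": "ubuntu"}
    app = {"/var/www/html/index.html": "v1"}
    a = ContainerFS([base, app])
    b = ContainerFS([base, app])  # shares the same image layers as a
    a.write("/var/www/html/index.html", "patched")
    print(a.read("/var/www/html/index.html"))  # a sees its own write
    print(b.read("/var/www/html/index.html"))  # b still sees the image's copy
```

Discarding a `ContainerFS` object discards its `upper` dictionary, which mirrors why data written outside a volume does not survive container removal.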
VM vs Container
| | Virtual Machine | Container |
|---|---|---|
| Isolation level | Full hardware virtualization — own kernel, OS | Process-level — shared kernel, isolated namespaces |
| Startup time | 30–60 seconds (OS boot) | Milliseconds (process start) |
| Image size | Gigabytes | Megabytes |
| Density | Tens to hundreds of VMs per host | Potentially thousands of containers per host |
| Security boundary | Strong: a compromised guest kernel still needs a hypervisor escape to reach the host | Weaker: a host kernel vulnerability can expose every container |
| OS flexibility | Any OS — Windows, Linux, different kernel versions | Must match host kernel type (Linux containers need Linux host) |
| Overhead | Higher — full OS per VM | Near-native — minimal overhead |
| Use case | Strong isolation, multi-tenant cloud, different OS requirements | Microservices, CI/CD, high-density workloads |
In cloud environments, you typically run containers inside VMs — not directly on bare metal. An EC2 instance is a VM; you run Docker containers on that VM. This gives you the cloud’s VM-level isolation between tenants, plus containers’ density and fast startup within your own VM. Managed Kubernetes services (EKS, GKE, AKS) provision VMs as worker nodes and run containers on them.
Container Orchestration
Running containers on a single host is straightforward. Running thousands of containers across hundreds of hosts — with load balancing, health checks, rolling deploys, auto-scaling, and service discovery — requires an orchestrator.
Kubernetes is the dominant container orchestrator. It manages:
- Scheduling: Deciding which host runs which container based on resource requests and constraints.
- Health management: Restarting failed containers, replacing containers on failed nodes, running the desired number of replicas.
- Service discovery: Assigning stable DNS names and virtual IPs to groups of containers.
- Rolling deploys: Replacing old containers with new versions without downtime, at a controllable rate.
- Auto-scaling: Horizontal Pod Autoscaler scales replica count based on CPU/memory metrics or custom metrics.
- Configuration and secrets: ConfigMaps and Secrets inject configuration into containers without baking it into images.
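The auto-scaling item can be made concrete with the replica formula given in the Kubernetes documentation for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that formula with a tolerance band and min/max bounds. The default bounds are arbitrary choices for the example, and the real controller also handles missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Compute a replica count from the ratio of observed to target metric."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target; avoid flapping
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 3 replicas averaging 90% CPU against a 60% target: scale out.
    print(desired_replicas(3, 90, 60))
    # 4 replicas averaging 30% CPU against a 60% target: scale in.
    print(desired_replicas(4, 30, 60))
```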
Kubernetes abstracts the underlying infrastructure — you declare desired state (3 replicas of this container, limit to 500m CPU) and Kubernetes reconciles reality to match. This declarative model is what makes large-scale container operations manageable.
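The reconcile step at the heart of this model can be sketched as a function that compares desired state with observed state and returns the actions that close the gap. This is a toy version under simple assumptions: replicas are plain names, `reconcile` and the `new-N` naming are invented for the example, and a real controller runs this comparison continuously against the cluster API.

```python
from typing import List, Tuple


def reconcile(desired: int, running: List[str]) -> List[Tuple[str, str]]:
    """Return the actions needed to move the running set to `desired` replicas."""
    actions: List[Tuple[str, str]] = []
    if len(running) < desired:
        # Too few replicas: start enough new ones to make up the difference.
        for i in range(desired - len(running)):
            actions.append(("start", f"new-{i}"))
    elif len(running) > desired:
        # Too many replicas: stop the surplus ones.
        for name in running[desired:]:
            actions.append(("stop", name))
    return actions


if __name__ == "__main__":
    # Desired state is 3 replicas, but two have crashed and one remains.
    print(reconcile(3, ["web-a"]))
    # Observed state already matches: nothing to do.
    print(reconcile(3, ["web-a", "web-b", "web-c"]))
```

The caller never issues "start two containers" directly; it only states the target, and the same function handles crashes, scale-ups, and scale-downs.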
Design Considerations
- Build stateless containers. Containers are ephemeral: they start, run, stop, and are replaced. Any state written to the container filesystem is lost when the container is removed and replaced. Externalize all state to databases, object storage, or distributed caches. Stateless containers are trivial to scale horizontally and replace after failures.
- Keep images small. Large images slow down deploys (more to pull), waste registry storage, and increase the attack surface. Use minimal base images (`alpine`, `distroless`), multi-stage builds to exclude build tools from the final image, and only install what the running application needs.
- Don’t run containers as root. By default, containers run as root inside the container. If a container is compromised and there’s a kernel vulnerability, root inside the container is closer to root on the host. Add a non-root user in your Dockerfile and run as that user.
- Use VMs when strong isolation is required. Multi-tenant SaaS where one customer’s code runs next to another’s should use VM-level isolation. Containers’ shared-kernel model means a kernel exploit in one tenant’s container could affect others. For most internal microservices, container isolation is sufficient.
- Resource limits are not optional. Always set CPU and memory limits on containers in production. An unlimited container can consume all host memory and trigger OOM kills of other containers on the node. Kubernetes uses resource requests for scheduling and enforces limits on the node through the kubelet and container runtime, so set both.
- Treat image tags as mutable by default. The tag `nginx:latest` can point to a different image tomorrow. In production, pin to digest (`nginx@sha256:abc123...`) or to immutable tags. Unpinned tags cause “it worked yesterday” failures after base image updates.
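One way to act on the pinning advice is a small check that flags image references not pinned by digest, for example as a lint step over deployment manifests. The sketch below assumes SHA-256 digests written as 64 lowercase hex characters; the helper names are hypothetical, and the digest in the demo is a placeholder rather than a real image.

```python
import re
from typing import Iterable, List

# A digest-pinned reference ends with "@sha256:" plus 64 hex characters.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_ref: str) -> bool:
    """True if the reference names an image by content digest, not just a tag."""
    return bool(_DIGEST.search(image_ref))


def unpinned(images: Iterable[str]) -> List[str]:
    """Return the references that could resolve to a different image later."""
    return [ref for ref in images if not is_pinned_by_digest(ref)]


if __name__ == "__main__":
    placeholder = "nginx@sha256:" + "a" * 64  # not a real digest
    print(unpinned([placeholder, "nginx:latest", "nginx:1.25"]))
```

A check like this does not know whether a registry treats some tags as immutable, so it only covers the digest half of the advice.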