Event-Driven Architecture & Event Sourcing
Event-driven architecture (EDA) makes events — records of things that happened — the primary communication mechanism between components. Instead of services calling each other directly, they publish events describing state changes; other services subscribe and react. This inverts the dependency direction: producers don’t know who consumes their events, and consumers don’t know who produced them.
What Is Event-Driven Architecture
In a traditional request-response system, Service A knows about Service B and calls it explicitly. In an event-driven system, Service A publishes OrderPlaced to a broker. The Inventory, Email, Fraud, and Analytics services each subscribe to OrderPlaced and react independently. Service A knows nothing about them — new consumers can be added without touching Service A.
This inversion has profound consequences:
- Open-closed principle at the system level: Adding a new consumer (a new business capability) requires zero changes to existing producers.
- Temporal decoupling: Consumers don’t need to be online when the event is produced. The broker retains events until consumers process them.
- Natural audit log: The event stream is a chronological record of everything that happened in the system.
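The decoupling described above can be illustrated with a minimal in-process sketch. `InMemoryBroker` and the two handlers are hypothetical names, not part of any real broker's API, and the toy delivers synchronously, so it shows only the dependency inversion, not the temporal decoupling a durable broker provides.

```python
from collections import defaultdict
from typing import Callable

Event = dict  # assumed event shape: {"type": ..., plus payload fields}
Handler = Callable[[Event], None]


class InMemoryBroker:
    """Toy stand-in for a real broker; delivers synchronously, in-process."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: Event) -> None:
        # The producer only names the event type; it never references a consumer.
        for handler in self._subscribers[event["type"]]:
            handler(event)


broker = InMemoryBroker()
reserved: list[int] = []
emails: list[str] = []

# Inventory and Email subscribe independently of each other and of the producer.
broker.subscribe("OrderPlaced", lambda e: reserved.append(e["orderId"]))
broker.subscribe("OrderPlaced", lambda e: emails.append(f"Order {e['orderId']} confirmed"))

# The order service publishes without knowing who is listening.
broker.publish({"type": "OrderPlaced", "orderId": 42, "total": 150})
```

Adding a Fraud or Analytics consumer here would be one more `subscribe` call; the `publish` call site would not change.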
Event Types
Not all events are the same. Understanding the distinctions matters when designing your event schema and choosing how to route events:
Domain events: Something meaningful that happened in the business domain. OrderPlaced, PaymentFailed, UserRegistered. These represent facts — things that already happened and cannot be undone. They are named in past tense. Domain events are the most common event type in EDA.
Integration events: Domain events that cross service boundaries, published to a broker for external consumers. They may be a subset or transformation of internal domain events, shaped for public consumption. Define them carefully — they are a public API for your service.
Commands (as messages): A message that requests an action — SendEmail, ProcessPayment. Unlike events, commands have a specific intended recipient and imply an obligation to act. Some systems use messaging for commands too, but it’s important to distinguish semantically: events are facts, commands are requests.
Change Data Capture (CDC) events: Row-level change events streamed directly from a database’s transaction log (for example the Postgres WAL), using tools such as Debezium or Postgres logical replication. Useful for integrating systems that can’t be modified to publish domain events, or for building event-sourced read models from existing state-based databases.
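One way to keep these distinctions visible in code is to give each message kind its own type. The sketch below is illustrative only: the field names, the `internal_risk_score` field, and the `to_integration_event` helper are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Internal domain event: a fact, named in the past tense."""
    order_id: int
    customer_id: int
    total: int
    internal_risk_score: float  # hypothetical field that should stay private


@dataclass(frozen=True)
class SendEmail:
    """Command: a request addressed to one specific recipient."""
    to: str
    template: str


def to_integration_event(event: OrderPlaced) -> dict:
    """Shape the internal event for external consumers, dropping private fields."""
    return {
        "type": "OrderPlaced",
        "version": 1,
        "orderId": event.order_id,
        "customerId": event.customer_id,
        "total": event.total,
    }


public_event = to_integration_event(OrderPlaced(42, 7, 150, 0.12))
```

Treating the returned dictionary as the public contract means the internal `OrderPlaced` class can change without breaking subscribers, as long as the mapping function keeps producing the same shape.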
EDA Patterns
Event notification: The simplest form. A service publishes a lightweight event (often just IDs and a type) to notify others that something happened. Consumers query the producer’s API if they need details. Low coupling, but adds a round-trip for consumers that need data.
Event-carried state transfer: Events carry all the data consumers need — no callback to the producer required. OrderPlaced includes the full order details. Consumers are self-sufficient. This increases event size and schema coupling, but eliminates the callback round-trip.
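The trade-off between the two styles shows up in the consumer. In this sketch `fetch_order` is a hypothetical stand-in for a call to the producer's API, and the payload fields are invented for illustration.

```python
# Hypothetical stand-in for the producer's API, needed only by the notification style.
ORDERS_API = {42: {"orderId": 42, "items": ["book", "pen"], "total": 150}}
api_calls: list[int] = []  # records each round-trip to the producer


def fetch_order(order_id: int) -> dict:
    api_calls.append(order_id)
    return ORDERS_API[order_id]


# Event notification: IDs and a type only.
notification = {"type": "OrderPlaced", "orderId": 42}

# Event-carried state transfer: everything the consumer needs travels with the event.
state_transfer = {
    "type": "OrderPlaced",
    "orderId": 42,
    "items": ["book", "pen"],
    "total": 150,
}


def order_total(event: dict) -> int:
    """Consumer logic: use the carried data if present, otherwise call back."""
    if "total" in event:
        return event["total"]
    return fetch_order(event["orderId"])["total"]
```

Calling `order_total(state_transfer)` touches no external service, while `order_total(notification)` adds one entry to `api_calls`, which is the extra round-trip the text describes.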
Event streaming: A continuous, ordered stream of events that consumers can replay from any position. Kafka is the canonical implementation. Consumers maintain their own offset and can re-read historical events. This enables replaying history to rebuild state, backfilling new consumers, and time-travel debugging.
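The offset mechanics can be sketched with a toy append-only log. `EventLog` and `Consumer` are hypothetical classes that model a single ordered partition in memory; they do not reflect Kafka's actual client API.

```python
class EventLog:
    """Append-only, ordered log; a toy model of a single partition."""

    def __init__(self) -> None:
        self._events: list[dict] = []

    def append(self, event: dict) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the event just written

    def read_from(self, offset: int) -> list[dict]:
        return self._events[offset:]


class Consumer:
    """Each consumer tracks its own position; the log does not track it."""

    def __init__(self, log: EventLog, start_offset: int = 0) -> None:
        self.log = log
        self.offset = start_offset
        self.seen: list[str] = []

    def poll(self) -> None:
        for event in self.log.read_from(self.offset):
            self.seen.append(event["type"])
            self.offset += 1


log = EventLog()
for event_type in ("OrderPlaced", "PaymentCharged", "OrderShipped"):
    log.append({"type": event_type, "orderId": 42})

live = Consumer(log)
live.poll()

# A consumer added later backfills from offset 0.
late = Consumer(log, start_offset=0)
late.poll()

# Rewinding an existing consumer replays history from that position.
live.offset = 1
live.poll()
```

Because reading never removes events, the late consumer sees the same three events the live one did, and rewinding `live` re-delivers the last two.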
Saga (event-driven): A sequence of events and reactions that implement a long-running business process across services. Each service completes a step and publishes an event; the next service reacts. Compensating events undo steps if the saga fails. See the Distributed Transactions guide for choreography vs orchestration.
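A choreographed saga with one compensating step might look like the sketch below. The event names beyond OrderPlaced, the `CARD_LIMIT` rule, and the in-memory queue are all assumptions made for illustration; a real saga would run across separate services and a durable broker.

```python
from collections import defaultdict, deque

handlers = defaultdict(list)
queue: deque = deque()
history: list[str] = []  # event types in the order they were processed


def on(event_type: str):
    """Register a handler for one event type."""
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register


def publish(event: dict) -> None:
    queue.append(event)


def run() -> None:
    while queue:
        event = queue.popleft()
        history.append(event["type"])
        for handler in handlers[event["type"]]:
            handler(event)


CARD_LIMIT = 100  # hypothetical rule: charges above this amount are declined
stock = {"book": 1}
order_status: dict[int, str] = {}


@on("OrderPlaced")
def reserve_inventory(event: dict) -> None:
    stock[event["item"]] -= 1
    publish({**event, "type": "InventoryReserved"})


@on("InventoryReserved")
def charge_payment(event: dict) -> None:
    outcome = "PaymentCharged" if event["amount"] <= CARD_LIMIT else "PaymentFailed"
    publish({**event, "type": outcome})


@on("PaymentFailed")
def release_inventory(event: dict) -> None:
    stock[event["item"]] += 1  # compensating action undoes the reservation
    publish({**event, "type": "InventoryReleased"})


@on("PaymentCharged")
def confirm_order(event: dict) -> None:
    order_status[event["orderId"]] = "confirmed"


@on("InventoryReleased")
def cancel_order(event: dict) -> None:
    order_status[event["orderId"]] = "cancelled"


publish({"type": "OrderPlaced", "orderId": 42, "item": "book", "amount": 150})
run()
```

No handler calls another directly; each reacts to an event and emits the next one. The failed payment triggers the compensating release, which returns the stock to its original level.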
Event Sourcing
Event sourcing is a persistence pattern where the system stores the sequence of events that led to the current state, rather than storing the current state itself. Instead of UPDATE orders SET status = 'shipped', you append OrderShipped to the event log. The current state is always derived by replaying the event log.
// Event log for order #42
OrderPlaced { orderId: 42, items: [...], total: 150 }
PaymentCharged { orderId: 42, amount: 150, card: "****1234" }
OrderShipped { orderId: 42, trackingId: "UPS-9999" }
OrderDelivered { orderId: 42, deliveredAt: "2024-01-15T14:00Z" }
To get the current state of order #42, replay these four events in order. To get the state at any point in time, replay up to that timestamp. The complete history is always available.
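Replay is essentially a left fold of a transition function over the log. The sketch below assumes each event carries an `at` timestamp recorded when it was appended. That envelope field, the resulting state shape, and every date except the delivery time are hypothetical.

```python
from datetime import datetime
from functools import reduce

# Assumed envelope: "at" is the time the event was appended to the log.
EVENTS = [
    {"type": "OrderPlaced", "at": datetime(2024, 1, 10, 9, 0), "orderId": 42, "total": 150},
    {"type": "PaymentCharged", "at": datetime(2024, 1, 10, 9, 1), "orderId": 42, "amount": 150},
    {"type": "OrderShipped", "at": datetime(2024, 1, 12, 8, 0), "orderId": 42, "trackingId": "UPS-9999"},
    {"type": "OrderDelivered", "at": datetime(2024, 1, 15, 14, 0), "orderId": 42},
]


def apply(state: dict, event: dict) -> dict:
    """Pure transition function: returns a new state, never mutates the old one."""
    kind = event["type"]
    if kind == "OrderPlaced":
        return {"orderId": event["orderId"], "total": event["total"],
                "status": "placed", "paid": False}
    if kind == "PaymentCharged":
        return {**state, "paid": True}
    if kind == "OrderShipped":
        return {**state, "status": "shipped", "trackingId": event["trackingId"]}
    if kind == "OrderDelivered":
        return {**state, "status": "delivered"}
    return state  # unknown event types leave the state unchanged


def replay(events: list, until=None) -> dict:
    """Fold events into state, optionally stopping at a point in time."""
    selected = [e for e in events if until is None or e["at"] <= until]
    return reduce(apply, selected, {})


current = replay(EVENTS)
on_jan_13 = replay(EVENTS, until=datetime(2024, 1, 13))
```

Here `current` reflects all four events, while `on_jan_13` stops after the shipment, so the same log answers both "what is the state now" and "what was the state then".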
Snapshots: Replaying a long event log on every read is expensive. After N events, store a snapshot of the current state. On subsequent reads, load the latest snapshot and replay only events that occurred after it. Snapshots are an optimization — the event log remains the source of truth.
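A minimal sketch of snapshot-then-tail loading follows. To keep the arithmetic obvious it uses a hypothetical account aggregate with `Deposited` and `Withdrawn` events rather than orders. The `Snapshot` type and the `SNAPSHOT_EVERY` threshold (the "N" above) are assumptions as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    version: int  # number of events already folded into `state`
    state: dict


SNAPSHOT_EVERY = 3  # hypothetical threshold


def apply(state: dict, event: dict) -> dict:
    delta = event["amount"] if event["type"] == "Deposited" else -event["amount"]
    return {"balance": state["balance"] + delta}


def load(events: list, snapshot: Optional[Snapshot]) -> tuple:
    """Return (state, events_replayed), starting from the snapshot if there is one."""
    state = dict(snapshot.state) if snapshot else {"balance": 0}
    start = snapshot.version if snapshot else 0
    tail = events[start:]
    for event in tail:
        state = apply(state, event)
    return state, len(tail)


def maybe_snapshot(events: list, snapshot: Optional[Snapshot]) -> Optional[Snapshot]:
    """Take a new snapshot once enough events have accumulated since the last one."""
    state, replayed = load(events, snapshot)
    if replayed >= SNAPSHOT_EVERY:
        return Snapshot(version=len(events), state=state)
    return snapshot


events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]
snap = maybe_snapshot(events, None)

events.append({"type": "Withdrawn", "amount": 25})
state, replayed = load(events, snap)
```

Loading with the snapshot replays only the one event appended after it, and loading with `None` replays all four and arrives at the same state. That equivalence is what keeps the snapshot disposable.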
Projections and Read Models
In event sourcing, the event log is the write model (source of truth). Read models — also called projections — are derived views built by processing the event log, optimized for specific query patterns.
A projection subscribes to the event stream and maintains its own denormalized read store:
- The Order Summary projection maintains a table of {orderId, status, total, customerName}, optimized for listing orders.
- The Customer Order History projection maintains a per-customer list of orders, optimized for the "my orders" page.
- The Analytics projection aggregates order counts and revenue by day, optimized for dashboards.
Projections can be rebuilt from scratch by replaying the event log. This means you can add new read models retroactively — something impossible with state-based storage. Change a projection’s logic, replay the log, and the new read model reflects the correct view of all historical events.
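As a rough illustration, a projection is an event handler plus a store it owns. The class names, the `day` and `customerName` payload fields, and the in-memory dictionaries standing in for read stores are all hypothetical.

```python
from collections import defaultdict

LOG = [
    {"type": "OrderPlaced", "orderId": 42, "total": 150, "customerName": "Ada", "day": "2024-01-10"},
    {"type": "OrderPlaced", "orderId": 43, "total": 60, "customerName": "Lin", "day": "2024-01-10"},
    {"type": "OrderShipped", "orderId": 42},
]


class OrderSummaryProjection:
    """One row per order, shaped for listing orders."""

    def __init__(self) -> None:
        self.rows: dict = {}

    def handle(self, event: dict) -> None:
        if event["type"] == "OrderPlaced":
            self.rows[event["orderId"]] = {
                "status": "placed",
                "total": event["total"],
                "customerName": event["customerName"],
            }
        elif event["type"] == "OrderShipped":
            self.rows[event["orderId"]]["status"] = "shipped"


class DailyRevenueProjection:
    """Revenue aggregated by day, shaped for dashboards."""

    def __init__(self) -> None:
        self.revenue = defaultdict(int)

    def handle(self, event: dict) -> None:
        if event["type"] == "OrderPlaced":
            self.revenue[event["day"]] += event["total"]


def rebuild(projection, log: list):
    """Build a read model from scratch by replaying the whole log."""
    for event in log:
        projection.handle(event)
    return projection


summary = rebuild(OrderSummaryProjection(), LOG)

# Added later: the new read model still covers every historical event.
analytics = rebuild(DailyRevenueProjection(), LOG)
```

The analytics projection did not exist when the events were written, yet replaying the log gives it the full history.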
Event sourcing pairs naturally with CQRS (Command Query Responsibility Segregation) — the event log serves as the write model, and projections serve as read models. They are distinct patterns: you can use CQRS without event sourcing (separate read/write databases with state-based storage), and you can use event sourcing without strict CQRS. But they complement each other well and are often adopted together.
Event Sourcing vs State-Based Storage
| | State-Based (CRUD) | Event Sourcing |
|---|---|---|
| Storage | Current state only | Full event history |
| Audit log | Separate audit table (often incomplete) | Built-in — the event log is the audit log |
| Time travel | Requires separate history tables | Replay events up to any timestamp |
| New read models | Cannot retroactively derive from history | Replay log to build any new projection |
| Complexity | Low — familiar CRUD operations | High — projections, snapshots, schema evolution |
| Query flexibility | Arbitrary queries on current state | Only via pre-built projections |
| Storage growth | Proportional to current data size | Grows indefinitely with history |
Event sourcing is not a universal default. It excels in domains where history, auditability, and retroactive analysis are core requirements: financial systems (every transaction is permanent), e-commerce (full order lifecycle), reservation systems. It adds significant complexity — schema evolution of past events is notoriously difficult — and is overkill for simple CRUD domains.
Design Considerations
- Events are immutable facts. Once an event is written to the log, it cannot be changed or deleted. If a mistake was made (an order was incorrectly placed), model the correction as a new event (OrderCancelled, CorrectionApplied), not by mutating history. This makes schema evolution and event versioning critical: plan event schemas carefully before going to production.
- Design for idempotent consumers. Events will be redelivered. Every consumer must handle duplicates safely. Include a unique event ID in every event and track which IDs have been processed.
- Event schema versioning. Event schemas evolve. Old events in the log must remain readable by newer consumers. Use additive-only changes (new optional fields), a schema registry, or upcasting (transforming old event formats on read). Never change the meaning of an existing field.
- Eventual consistency is the default. Projections lag behind the event log, typically by milliseconds to seconds, so users may briefly see stale read models after a write. Design UIs and workflows that tolerate this, or use optimistic UI updates that assume the command succeeded while the projection catches up.
- Don’t event-source everything. Apply event sourcing to aggregates where history and auditability matter. Simple reference data (product catalog, configuration) is better served by plain CRUD. A hybrid approach — event sourcing for core business aggregates, state-based for supporting data — is common and practical.
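Two of the considerations above, idempotent consumers and upcasting, can be sketched together. The `eventId` and `version` field names, the v1-to-v2 schema change, and the "USD" default are assumptions made for this example. A production consumer would also need to record the processed ID atomically with its side effect, for instance in the same database transaction.

```python
def upcast(event: dict) -> dict:
    """Transform old event formats on read; stored events are never rewritten."""
    if event.get("version", 1) == 1:
        # Hypothetical change: v2 added a "currency" field; v1 events are assumed USD.
        return {**event, "version": 2, "currency": "USD"}
    return event


class IdempotentRevenueConsumer:
    """Tracks processed event IDs so that redelivered events have no effect."""

    def __init__(self) -> None:
        self.processed_ids: set = set()
        self.totals: dict = {}

    def handle(self, raw_event: dict) -> bool:
        event = upcast(raw_event)
        if event["eventId"] in self.processed_ids:
            return False  # duplicate delivery: skip it
        currency = event["currency"]
        self.totals[currency] = self.totals.get(currency, 0) + event["total"]
        self.processed_ids.add(event["eventId"])
        return True


old_format = {"eventId": "evt-1", "type": "OrderPlaced", "orderId": 42, "total": 150}
new_format = {"eventId": "evt-2", "version": 2, "type": "OrderPlaced",
              "orderId": 43, "total": 60, "currency": "EUR"}

consumer = IdempotentRevenueConsumer()
# evt-1 is delivered twice, as can happen with at-least-once delivery.
results = [consumer.handle(e) for e in (old_format, old_format, new_format)]
```

The second delivery of `evt-1` is rejected, so totals are counted once, and the consumer's logic only ever deals with the v2 shape because upcasting happens before handling.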