Databases

Database Federation

● Intermediate ⏱ 10 min read

Database federation (also called functional partitioning) splits a single large database into multiple smaller databases, each responsible for a specific domain or function. Instead of one database containing users, orders, products, and inventory tables, you have a Users database, an Orders database, a Products database, and an Inventory database. Each database is independently operated and owned by the team or service responsible for that domain.

What Is Database Federation

In a typical monolithic application, all tables live in a single database. As the system grows, this creates contention: all reads and writes from every part of the application compete for connections, buffer pool, and lock resources on one server. Schema changes in one domain require coordination with every team whose queries touch the affected tables. The single database becomes a bottleneck and a coordination overhead.

Federation solves this by splitting along functional boundaries:

| Monolithic DB tables | Federated (split by function) |
|---|---|
| users, profiles, sessions | Users DB |
| orders, order_items, payments | Orders DB |
| products, categories, inventory | Products DB |
| messages, notifications | Messaging DB |

Each federated database has its own connection pool, its own hardware or container, and its own deployment lifecycle. A schema change in the Orders database does not require a deployment freeze on the Products database.
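The per-domain independence described above can be sketched as a configuration map; the service names, DSNs, and pool sizes here are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatabaseConfig:
    dsn: str
    pool_size: int

# Each federated database is configured, sized, and deployed independently.
# Changing the Orders settings never touches the Products configuration.
FEDERATED_DATABASES = {
    "users":    DatabaseConfig("postgres://users-db:5432/users", pool_size=20),
    "orders":   DatabaseConfig("postgres://orders-db:5432/orders", pool_size=50),
    "products": DatabaseConfig("postgres://products-db:5432/products", pool_size=10),
}

def config_for(domain: str) -> DatabaseConfig:
    """Look up the connection settings owned by a single domain."""
    return FEDERATED_DATABASES[domain]
```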

Federation vs Sharding

Sharding and federation are complementary, not mutually exclusive, but they solve different problems:

| | Sharding | Federation |
|---|---|---|
| Split dimension | Horizontal (rows of the same table across nodes) | Vertical (different tables on different nodes) |
| Problem solved | Single table too large / too many writes | One database shared by too many teams / too much contention |
| Schema | Same schema replicated across shards | Different schemas per federated database |
| Cross-node queries | Scatter-gather across shards | Application-level joins across databases |

A federated database can itself be sharded. If the Orders database grows too large for a single server, you shard the Orders database across multiple nodes. Federation happens first (split by function), sharding happens within a domain when needed.

Benefits

Independent scaling. Each database scales based on its own load profile. The Messaging database might need high write throughput and time-series optimizations; the Products database might need full-text search; the Users database might need strong consistency. Each can use the database technology and hardware that fits its requirements — a practice sometimes called polyglot persistence.

Reduced connection pool contention. A single database shared by 50 microservices exhausts connection limits quickly. With federation, each service connects only to its own database. Connection pools are smaller and never shared across unrelated domains.

Schema autonomy. The Orders team can add a column, change an index, or migrate a table without coordinating with the Products team. Deploy cycles are decoupled. Risk is contained: a botched migration in Orders doesn't affect the availability of Products.

Blast radius reduction. A runaway query or high-write batch job in one domain can saturate that database's I/O or CPU without affecting other domains. Query isolation prevents one domain's load spike from degrading all other domains.

Security and compliance boundaries. Sensitive data (payment card data, health records) can be isolated in a dedicated database with stricter access controls, encryption keys, and audit logging — without those constraints affecting the rest of the system.

Trade-offs

No cross-database foreign keys. Referential integrity between tables in different databases cannot be enforced by the database engine. An order references a user_id, but the Orders database cannot have a foreign key constraint to the Users database. Consistency between references must be maintained by the application.
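A minimal sketch of application-enforced referential integrity, using two in-memory SQLite databases to stand in for the federated Users and Orders databases (the schemas are illustrative):

```python
import sqlite3

# Two independent databases: no engine-level foreign key can span them.
users_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
orders_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
users_db.execute("INSERT INTO users VALUES (1, 'Ada')")

def create_order(order_id: int, user_id: int, total: float) -> None:
    # The application validates the reference the engine cannot enforce.
    row = users_db.execute(
        "SELECT 1 FROM users WHERE id = ?", (user_id,)).fetchone()
    if row is None:
        raise ValueError(f"user {user_id} does not exist")
    orders_db.execute(
        "INSERT INTO orders VALUES (?, ?, ?)", (order_id, user_id, total))

create_order(100, 1, 49.99)  # succeeds: user 1 exists in the Users database
```

Note this check is not race-free: the user could be deleted between the lookup and the insert, which is why federated systems often pair such checks with soft deletes or eventual cleanup jobs.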

No cross-database transactions. A single ACID transaction cannot span two federated databases. Operations that must be atomic across domains require distributed transaction protocols (2PC) or eventual consistency via the Saga pattern. See the Distributed Transactions guide.

No cross-database JOINs. SQL JOINs operate within a single database connection. To join orders with user profiles, the application must query both databases separately and merge the results in memory. This increases application complexity and latency.
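An application-level join can be sketched as: fetch from one database, then fetch only the referenced rows from the other, then merge in memory. Two in-memory SQLite databases stand in for the federated stores; the schemas are illustrative:

```python
import sqlite3

users_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
orders_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 5.0), (11, 2, 7.5)])

# Step 1: fetch the orders from the Orders database.
orders = orders_db.execute("SELECT id, user_id, total FROM orders").fetchall()

# Step 2: fetch only the referenced users from the Users database.
user_ids = tuple({user_id for _, user_id, _ in orders})
placeholders = ",".join("?" * len(user_ids))
names = dict(users_db.execute(
    f"SELECT id, name FROM users WHERE id IN ({placeholders})",
    user_ids).fetchall())

# Step 3: merge the two result sets in application memory.
enriched = [
    {"order_id": oid, "customer": names[uid], "total": total}
    for oid, uid, total in orders
]
```

The second query fetches only the user IDs actually referenced, which keeps the in-memory merge bounded by the first result set rather than the full Users table.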

Operational overhead. More databases mean more servers to monitor, back up, patch, and operate. Each federated database needs its own backup schedule, failover strategy, and DBA attention. This overhead is manageable for a mature platform team but significant for a small team.

Cross-Database Operations

Most cross-domain queries in well-federated systems are simpler than they appear, because data that belongs together is usually read together at the domain boundary. An Order detail page needs order data from Orders DB and the customer's name from Users DB — but that's two targeted lookups, not a join across millions of rows.

Common patterns for cross-database data access:

API composition: Each service exposes an API for its data. The client or a BFF (Backend for Frontend) calls multiple APIs and merges the responses. Clean, decoupled, but adds network round-trips.
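A minimal sketch of API composition in a BFF. The two fetch functions stand in for HTTP calls to the Users and Orders services; the endpoints and payload shapes are hypothetical:

```python
def fetch_order(order_id: int) -> dict:
    # Stands in for e.g. GET /orders/{id} on the Orders service.
    return {"id": order_id, "user_id": 1, "total": 49.99}

def fetch_user(user_id: int) -> dict:
    # Stands in for e.g. GET /users/{id} on the Users service.
    return {"id": user_id, "name": "Ada"}

def order_detail(order_id: int) -> dict:
    """BFF endpoint: compose two domain APIs into one response."""
    order = fetch_order(order_id)
    user = fetch_user(order["user_id"])
    return {**order, "customer_name": user["name"]}
```

In a real BFF the two calls would often be issued concurrently to hide one of the two network round-trips.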

Data denormalization: Duplicate the data you need across domains at write time. When a user updates their name, publish an event; the Orders service listens and stores a copy of the customer name on each order. Reads are fast; writes require event publishing; eventual consistency must be accepted.
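The event-driven denormalization above can be sketched with an in-process publish/subscribe loop; a real system would use a message broker (Kafka, RabbitMQ) rather than a Python list, and the event shape here is hypothetical:

```python
subscribers = []

def publish(event: dict) -> None:
    # Stands in for publishing to a message broker topic.
    for handler in subscribers:
        handler(event)

# The Orders service keeps its own denormalized copy of the customer
# name on each order, so reads never cross into the Users database.
orders = {10: {"user_id": 1, "customer_name": "Ada"}}

def on_user_renamed(event: dict) -> None:
    if event["type"] != "user.renamed":
        return
    for order in orders.values():
        if order["user_id"] == event["user_id"]:
            order["customer_name"] = event["new_name"]

subscribers.append(on_user_renamed)

# The Users service publishes when a user changes their name; until the
# handler runs, readers see the old name (eventual consistency).
publish({"type": "user.renamed", "user_id": 1, "new_name": "Ada Lovelace"})
```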

Read replicas for reporting: For analytics and reporting that require joins across domains, replicate all federated databases into a single analytics warehouse (Redshift, BigQuery, ClickHouse). OLTP databases stay federated; OLAP queries use the warehouse where JOINs are cheap.

💡 Federation and Microservices

Database federation is the data layer pattern that enables microservices. A microservice that shares a database with another service is not truly independent — schema coupling is as tight as code coupling. The rule of thumb: one database per service (or per bounded context). Federation is what makes that rule possible at scale.

Migrating to Federation

Splitting a monolithic database is not a one-day operation. A practical migration follows the Strangler Fig pattern:

  1. Identify domain boundaries. Map table ownership — which tables belong to which domain? Tables that are only read/written by one service are easy candidates. Tables with cross-domain writes are harder.
  2. Extract the new database. Create the new federated database with the target schema. Set up replication from the monolith to the new database using CDC (Debezium, logical replication).
  3. Dual-write. Update the application to write to both the monolith and the new database. This ensures the new database stays current while reads still come from the monolith.
  4. Migrate reads. Gradually shift read traffic to the new database. Monitor for discrepancies between the two databases.
  5. Cut over writes. Once reads are verified, stop writing to the monolith tables and write only to the new database. Remove the dual-write code.
  6. Drop tables from monolith. After a safe observation period, remove the migrated tables from the monolith database.
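The dual-write step (3) can be sketched as follows, with two in-memory SQLite databases standing in for the monolith and the new Orders database; the error-handling policy shown is one common choice, not the only one:

```python
import sqlite3

monolith = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")
for db in (monolith, orders_db):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")

def create_order(order_id: int, total: float) -> None:
    # The monolith remains the source of truth during migration:
    # its write must succeed or the request fails.
    monolith.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
    try:
        # Shadow write to the new federated database.
        orders_db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
    except sqlite3.Error:
        # Never fail the request on the shadow write; log the miss and
        # let a reconciliation job backfill it from the monolith later.
        pass

create_order(1, 9.99)
```

A periodic reconciliation job comparing the two databases catches both shadow-write failures and the discrepancies mentioned in step 4.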

This process takes weeks to months depending on table complexity and cross-domain dependencies. The key is never cutting over all at once — incremental migration allows rollback at any step.

Design Considerations