Database Replication: When One Database Is Not Enough
A story-driven guide to database replication: why one node stops scaling, how replicas help, and why copies can lag or disagree.
TechFable Starts
TechFable Trading Co. opens as a simple trading shop.
One room and a handful of traders.
At the front of the room, one giant price board.
Throughout the day, a clerk updates the board as gold and silver prices change.
When prices change, everyone sees it happen.
One room. One board. One truth.
Single Database Node
The database equivalent of this simple shop is a single database node.
One Clerk + One Board
↓
Single Database Node
When the clerk changes a price, that is a write to the database.
When traders check the board, that is a read from the database.
Because all reads and writes go through one database node, the model is easy to reason about.
One place to read. One place to write. One source of truth.
TechFable Grows
TechFable gets busier.
The fix is obvious: move into a bigger room and put up a bigger board.
In database terms, this is scaling up: keeping the same architecture, but moving to a bigger machine.
Small Database Node → Larger Database Node
For a while, it works — and that is a good plan.
More traders fit in the room. More people can watch prices move. The setup stays simple.
Still one board. Still one source of truth.
But TechFable keeps growing.
The bigger room slowly starts to feel small as more traders join.
DATABASE MELTDOWN
Now it is not just crowded.
The traders are getting in each other’s way.
In database terms, this is the limit of only scaling up.
When all reads and writes hit one machine, they compete for the same resources:
CPU
Memory
Disk
Network
A larger database node can handle more traffic.
That is why scaling up is a good first move.
But it is still one machine.
All reads go there.
All writes go there.
So eventually, that one machine becomes the bottleneck.
Back in TechFable, the next fix cannot be “get an even bigger room.”
Warning: Scaling up has a ceiling
A larger database node can buy you time, but it cannot grow forever. At some point, bigger becomes unavailable, too expensive, or still not enough.
Single-Leader: One Main Room, Many Replicas
TechFable has a better plan:
Get more trading rooms.
The original room becomes the main room: one official price board, one chief clerk.
Each new room gets a room clerk who is allowed into the main room.
When the chief clerk updates the official price board, each room clerk returns with the new price.
They update their room’s price board, and traders there see the new price.
Chief Clerk updates Main Board
↓
Room Clerks return with new price
↙ ↓ ↘
Room Board Room Board Room Board
This is exactly how databases scale reads.
The main room is the leader — also called the primary or source.
It is the only database allowed to accept writes.
The other rooms are followers, also called read replicas.
They do not accept writes.
Instead, they copy the leader’s changes through a replication log or change stream and apply them in the same order.
Their job is to serve reads — letting traders check prices without crowding the leader.
This database design is called single-leader replication.
Writes → Leader only
Reads → Leader or Followers
One source accepts writes. Many copies serve reads.
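The routing rule above can be sketched in a few lines. This is a minimal illustration, not a real driver: `Node`, `SingleLeaderCluster`, and the room names are hypothetical, and replication is shown as an instant copy so the focus stays on where reads and writes go.

```python
import itertools


class Node:
    """A hypothetical in-memory database node holding a price table."""

    def __init__(self, name):
        self.name = name
        self.prices = {}


class SingleLeaderCluster:
    """Routes writes to the leader and spreads reads across followers."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers
        # Round-robin over followers so no single replica takes every read.
        self._next_follower = itertools.cycle(followers)

    def write(self, key, value):
        # Only the leader accepts writes.
        self.leader.prices[key] = value
        # Assumption: replication is an immediate copy to keep the sketch
        # short. Real followers apply changes from a log, often with a delay.
        for follower in self.followers:
            follower.prices[key] = value
        return self.leader.name

    def read(self, key):
        # Reads are served by the next follower in the rotation.
        node = next(self._next_follower)
        return node.name, node.prices.get(key)


cluster = SingleLeaderCluster(Node("main-room"), [Node("room-a"), Node("room-b")])
written_to = cluster.write("silver", 45)
first_read = cluster.read("silver")
second_read = cluster.read("silver")
```

Every write lands on `main-room`, while the two reads are answered by different follower rooms.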
How Replication Works in Practice
At first, room clerks seem to need only the latest price.
They see silver changed at 11:45, then head back to update their board.
But while they are gone, the main room keeps moving:
11:45  #104  Silver  $50 → $45  (copied)
11:46  #105  Gold  $100 → $102  (missed)
11:47  #106  Silver  $45 → $44  (missed)
The clerk cannot carry just one price.
They need to know what changed while they were away.
So the chief clerk keeps a Daybook: every price change, in order.
In databases, that Daybook is the replication log or change stream.
Room clerks copy missed entries and replay them on their boards.
A follower is caught up when it has replayed the latest entry.
A follower is lagging when it still has entries left to replay.
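Here is a small sketch of the Daybook idea, using the three entries from the example. The class names and the `last_applied` counter are assumptions made for illustration; real databases track replay position in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    seq: int      # position in the Daybook, e.g. 104
    item: str
    price: int


@dataclass
class Leader:
    log: list = field(default_factory=list)

    def record(self, seq, item, price):
        self.log.append(Entry(seq, item, price))


@dataclass
class Follower:
    board: dict = field(default_factory=dict)
    last_applied: int = 0   # sequence number of the last replayed entry

    def catch_up(self, leader):
        # Replay only the entries this follower has not seen, in log order.
        for entry in leader.log:
            if entry.seq > self.last_applied:
                self.board[entry.item] = entry.price
                self.last_applied = entry.seq

    def is_caught_up(self, leader):
        return not leader.log or self.last_applied == leader.log[-1].seq


leader = Leader()
follower = Follower()

leader.record(104, "silver", 45)
follower.catch_up(leader)          # the follower copies entry #104

leader.record(105, "gold", 102)    # happens while the clerk is away
leader.record(106, "silver", 44)
lagging = not follower.is_caught_up(leader)

follower.catch_up(leader)          # replays #105 and #106 in order
```

After the second `catch_up`, the follower's board shows silver at 44 and gold at 102, the same as the leader.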
Databases can build this Daybook in different ways.
Statement-Based Replication
In statement-based replication, the Daybook stores the instruction.
Entry #104:
Set silver to $45.
Database version:
UPDATE prices SET silver = 45;
Followers read the instruction and run it themselves.
That sounds simple.
But some instructions are dangerous because each replica may calculate a different result.
Entry #105:
Set silver_updated_at to the current time.
Entry #106:
Pick a random gold discount.
Database version: NOW() and RAND().
These are non-deterministic functions.
If each follower runs them separately, each one may get a different result.
That creates data drift: the replicas slowly stop matching the leader.
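A tiny simulation makes the drift visible. The statement strings and the `gold_discount` column are made up for this sketch, and two differently seeded generators stand in for two independent machines.

```python
import random


def apply_statement(board, statement, rng):
    """Run one logged instruction against a replica's board."""
    if statement == "SET silver = 45":
        # Deterministic: every replica computes the same value.
        board["silver"] = 45
    elif statement == "SET gold_discount = RAND()":
        # Non-deterministic: the value depends on the replica's own generator.
        board["gold_discount"] = rng.random()


log = ["SET silver = 45", "SET gold_discount = RAND()"]

leader_board, follower_board = {}, {}
leader_rng = random.Random(1)     # different seeds stand in for
follower_rng = random.Random(2)   # independent machines

for statement in log:
    apply_statement(leader_board, statement, leader_rng)
    apply_statement(follower_board, statement, follower_rng)

silver_matches = leader_board["silver"] == follower_board["silver"]
discount_matches = leader_board["gold_discount"] == follower_board["gold_discount"]
```

Both replicas agree on silver, but each one rolled its own discount, so the copies no longer match.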
Physical Replication: WAL Shipping
In physical replication, the Daybook stores the low-level storage changes.
Not the business instruction.
Not the friendly sentence.
The exact storage-level change the database made.
Entry #104:
On this storage page, change these bytes.
Database version: the leader writes changes to a Write-Ahead Log (WAL).
The WAL already exists so the database can recover after a crash.
With WAL shipping, followers receive that log and replay the same low-level changes.
This is fast and faithful.
But it is tightly tied to the database engine and version, because the followers must understand the same storage format.
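To make "change these bytes on this page" concrete, here is a toy model. The page size, record layout, and the 4-byte price encoding are all assumptions for illustration. A real WAL format is far richer and specific to each engine.

```python
import struct

PAGE_SIZE = 16


def new_storage(pages=2):
    return [bytearray(PAGE_SIZE) for _ in range(pages)]


def apply_record(storage, record):
    """Overwrite bytes on one page exactly as the log record describes."""
    page_id, offset, data = record
    storage[page_id][offset:offset + len(data)] = data


leader_pages = new_storage()
follower_pages = new_storage()
wal = []

# Assumption: silver's price is a 4-byte little-endian integer
# stored on page 0 at offset 4.
record = (0, 4, struct.pack("<I", 45))
wal.append(record)                    # log first, then change the page
apply_record(leader_pages, record)

# The follower receives the same records and replays them byte for byte.
for shipped in wal:
    apply_record(follower_pages, shipped)

silver_on_follower = struct.unpack_from("<I", follower_pages[0], 4)[0]
```

The follower never sees the words "silver" or "price". It only works because it lays out pages exactly like the leader does, which is why this style ties both sides to the same storage format.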
Logical Replication
In logical replication, the Daybook stores the business-level fact.
Entry #104:
Silver price changed from $50 to $45.
Entry #105:
Gold price changed from $100 to $102.
Entry #106:
Trade #77 was added.
Database version:
Row updated
Row inserted
Row deleted
Logical replication sends what changed, not the raw storage bytes underneath.
That makes it easier for other systems to consume.
For example, the same change stream can feed a search index, cache, analytics pipeline, or another service.
This is the idea behind Change Data Capture (CDC).
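As a rough sketch, a logical change stream can be modeled as a list of row-level events that several consumers read independently. The event fields (`op`, `key`, `before`, `after`) and both consumers are hypothetical. Real CDC tools define their own event formats.

```python
def apply_to_cache(cache, event):
    """Keep a key-value cache in step with row-level change events."""
    if event["op"] in ("insert", "update"):
        cache[event["key"]] = event["after"]
    elif event["op"] == "delete":
        cache.pop(event["key"], None)


def apply_to_audit(audit, event):
    """An analytics-style consumer that just counts operations."""
    audit[event["op"]] = audit.get(event["op"], 0) + 1


change_stream = [
    {"seq": 104, "op": "update", "table": "prices", "key": "silver",
     "before": 50, "after": 45},
    {"seq": 105, "op": "update", "table": "prices", "key": "gold",
     "before": 100, "after": 102},
    {"seq": 106, "op": "insert", "table": "trades", "key": "trade-77",
     "before": None, "after": {"id": 77}},
]

cache, audit = {}, {}
for event in change_stream:
    apply_to_cache(cache, event)
    apply_to_audit(audit, event)
```

Neither consumer needs to know how the database stores its pages. They only need to understand the events.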
The important idea:
Replication is not copying one final price.
It is replaying every change that led there.
The Chatty Clerk and the Lag
The main room updates first.
But the other rooms only update after their room clerks return.
And not every clerk moves the same way.
One clerk walks fast.
Another walks slowly.
Another gets stopped by a familiar face in the hallway.
So one room updates quickly.
Another room updates a little later.
Another room is still showing the old price.
In database terms, this delay is called replication lag.
The leader has accepted the latest write.
But a follower may still be catching up.
This is common with asynchronous replication.
Asynchronous Replication
In TechFable, asynchronous replication means the chief clerk updates the main room board and keeps going.
He does not wait for room clerks to come back and confirm:
My room board is updated.
In database terms, the leader records the write, sends it through a replication log or change stream, and moves on.
The followers apply those changes at their own pace.
If a read goes to a follower before it catches up, it can return the old value.
That old value is a stale read.
The system is temporarily out of sync.
But if no new writes arrive and followers get time to catch up, the copies eventually agree again.
That is the idea behind eventual consistency.
The leader has the present.
A late follower can still show the past.
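The stale read can be reproduced with a queue between leader and follower. This is a single-process sketch with made-up class names. The `outbox` stands in for changes that have been recorded but not yet applied on the follower.

```python
from collections import deque


class AsyncLeader:
    def __init__(self):
        self.board = {}
        self.outbox = deque()   # changes recorded but not yet applied by the follower

    def write(self, key, value):
        self.board[key] = value
        self.outbox.append((key, value))
        return "ok"             # acknowledged without waiting for the follower


class AsyncFollower:
    def __init__(self, board):
        self.board = dict(board)

    def apply_pending(self, leader):
        while leader.outbox:
            key, value = leader.outbox.popleft()
            self.board[key] = value


leader = AsyncLeader()
leader.board["silver"] = 50
follower = AsyncFollower(leader.board)

leader.write("silver", 45)
stale_read = follower.board["silver"]    # the follower has not caught up yet

follower.apply_pending(leader)
fresh_read = follower.board["silver"]    # copies agree once the queue drains
```

The first read returns the old price of 50. Once the pending change is applied, the follower returns 45.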
Synchronous Replication
There is another option: synchronous replication.
In TechFable, synchronous replication means the chief clerk waits.
He updates the main room board, then waits until the required room clerks come back and confirm their boards were updated.
Only then does he call the price change complete.
In database terms, the leader waits until at least one follower, or a configured set of followers, confirms the update before calling the write successful.
That gives you fresher copies.
But it also means the write can slow down if a follower is slow.
Or stop if the required follower is unavailable.
Insight: Replication is full of this trade-off:
Do you wait for the rooms to catch up,
or keep the trading floor moving?
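The waiting rule can be sketched as a leader that counts confirmations. All names here, including `required_acks`, are assumptions for illustration. The sketch also skips what a real database must handle when confirmation fails, such as blocking, retrying, or rolling back the local change.

```python
class ReplicaUnavailable(Exception):
    pass


class SyncFollower:
    def __init__(self, available=True):
        self.board = {}
        self.available = available

    def apply(self, key, value):
        if not self.available:
            raise ReplicaUnavailable()
        self.board[key] = value
        return True   # confirmation sent back to the leader


class SyncLeader:
    def __init__(self, followers, required_acks):
        self.board = {}
        self.followers = followers
        self.required_acks = required_acks

    def write(self, key, value):
        self.board[key] = value
        acks = 0
        for follower in self.followers:
            try:
                if follower.apply(key, value):
                    acks += 1
            except ReplicaUnavailable:
                continue
        # The write is reported as successful only with enough confirmations.
        # Simplification: the local change is not undone on failure.
        return acks >= self.required_acks


healthy = SyncLeader([SyncFollower(), SyncFollower()], required_acks=1)
ok = healthy.write("silver", 45)

degraded = SyncLeader([SyncFollower(available=False)], required_acks=1)
blocked = degraded.write("silver", 45)
```

With healthy followers the write succeeds. With the only required follower down, the same write cannot be confirmed.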
The Old-New Price Inconsistency
Replication lag feels worst when someone knows the update already happened.
A trader is standing in the main room when the chief clerk changes silver:
Silver: $50 → $45
Then the trader walks into another room.
That room’s board still says:
Silver: $50
Now the trader is confused.
They just saw the price change.
Why is this board still showing the old price?
The update is not lost.
That room just has not caught up yet.
In database terms, this is a read-after-write problem.
A user makes a write, then immediately performs a read, but the read goes to a follower that has not copied the write yet.
The guarantee we want is called read-your-writes consistency: after a user makes a write, that user's own later reads always reflect it.
Insight: You upload a new profile picture from your beach vacation.
The upload succeeds. But when you refresh the app, you still see the old picture.
Your write worked. Your next read came from a replica that had not caught up yet.
Need a Fresh Price? Ask the Leader
So how do we fix it?
By choosing the read path based on how fresh the answer must be.
If the user must see the latest value, use a strongly consistent read from the leader.
If a slight delay is acceptable, use an eventually consistent read from a replica.
Need fresh data now? → Strong read → Leader
Can tolerate delay? → Replica read → Follower
Only the application owner can make that call.
After you upload a new profile picture, your own refresh should read from the leader so you see the new photo immediately.
Your friends can read from replicas.
If they see the old photo for a few seconds, that is usually acceptable.
A few seconds of delay on a photo is not the same as a delay on the trading floor, where missing the latest price can change a buy or sell decision.
So for your own profile refresh, choose the leader.
For everyone else’s feed, replicas are usually good enough.
As replication catches up, everyone eventually sees the new photo.
That is the trade-off:
Strong reads when freshness matters.
Replica reads when scale matters.
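One common way to apply this is to send a user's reads to the leader for a short window after that user writes. The sketch below assumes a 5-second window (`pin_seconds`) and uses plain dictionaries in place of real database connections. Both are illustrative choices, not a recommendation.

```python
import time


class ReadRouter:
    """Sends a user's reads to the leader shortly after that user writes."""

    def __init__(self, leader, replica, pin_seconds=5.0, clock=time.monotonic):
        self.leader = leader          # dict standing in for the leader
        self.replica = replica        # dict standing in for a lagging replica
        self.pin_seconds = pin_seconds
        self.clock = clock
        self._last_write = {}         # user id -> time of that user's last write

    def write(self, user, key, value):
        self.leader[key] = value
        self._last_write[user] = self.clock()

    def read(self, user, key):
        wrote_at = self._last_write.get(user)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return self.leader.get(key)    # strong read for the recent writer
        return self.replica.get(key)       # replica read for everyone else


leader = {"alice/photo": "old.jpg"}
replica = {"alice/photo": "old.jpg"}   # has not received the new photo yet
router = ReadRouter(leader, replica)

router.write("alice", "alice/photo", "beach.jpg")
own_view = router.read("alice", "alice/photo")
friend_view = router.read("bob", "alice/photo")
```

Alice sees her new photo right away because her read is pinned to the leader. Bob still sees the old one until the replica catches up.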
Cross-Region Expansion: New York to London
TechFable opens a second trading floor in London.
But the original leader still lives in New York.
That means a London write has to cross the ocean before it is accepted:
London write → New York leader → London result
For a trading floor, that delay hurts.
This is not like waiting a few seconds for a profile picture to refresh.
A stale price can become a bad trade that costs real money.
Multi-Leader: A Leader in Every City
So TechFable changes the rule:
New York and London can both accept writes.
In database terms, this is multi-leader replication: multiple database nodes can accept writes.
Each leader records local changes, then replicates them to the other leaders.
New York Leader ⇄ London Leader
London writes stay near London.
New York writes stay near New York.
But now there is a new cost: conflict.
New York writes: Gold: SELL
At almost the same time, London writes: Gold: BUY
Both writes were accepted.
Later, when the leaders sync, TechFable must answer:
Which truth wins?
That is the core problem with multi-leader replication.
More writers means faster local writes.
But more writers also means more chances to disagree.
Conflicts need rules:
- pick the latest write
- merge the changes
- ask a human to resolve it
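The first rule, often called last-write-wins, can be sketched as follows. The `Write` record, the timestamps, and the tie-break on leader name are assumptions for illustration. Note the cost: the losing write is discarded, and clocks on different machines are never perfectly aligned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: float   # time the accepting leader recorded the write
    origin: str        # which leader accepted it


def last_write_wins(a, b):
    """Pick one of two conflicting writes to the same key."""
    # Compare timestamps first. Fall back to the origin name so that both
    # leaders pick the same winner even when timestamps are equal.
    return max(a, b, key=lambda w: (w.timestamp, w.origin))


new_york = Write("gold", "SELL", timestamp=1000.001, origin="new-york")
london = Write("gold", "BUY", timestamp=1000.002, origin="london")

# Each leader resolves the conflict on its own and must reach the same answer.
winner_seen_in_new_york = last_write_wins(new_york, london)
winner_seen_in_london = last_write_wins(london, new_york)
```

Both leaders settle on London's BUY, so the copies converge. New York's SELL is silently lost, which is why some systems merge changes or escalate to a human instead.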
Single-leader asks: Who is allowed to write?
Multi-leader asks: What happens when two writers disagree?
Leaderless: No Main Room
After New York and London argue over writes, TechFable tries one last design.
No main room.
No special city.
Every room keeps a copy of the prices.
When a price changes, the update is sent to several rooms.
If enough rooms accept it, the write counts.
In database terms, this is leaderless replication.
There is no leader that owns all writes.
Reads and writes go to multiple replicas.
Write → ask several replicas
Read → ask several replicas
The magic word is quorum.
A quorum means “enough replicas agreed.”
For example:
Total replicas: 5
Write must reach: 3
Read must check: 3
Why does that work?
Because any 3 rooms and any other 3 rooms must overlap when there are only 5 rooms total.
So as long as the write quorum plus the read quorum is larger than the total (3 + 3 > 5), every read touches at least one room that accepted the latest successful write.
No main room.
Just enough rooms must agree.
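The overlap claim is easy to check by brute force for small clusters. The function name below is made up for this sketch. It simply tries every possible write set against every possible read set.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """Check that every write set of size w shares a replica with every read set of size r."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


five_three_three = quorums_always_overlap(5, 3, 3)   # 3 + 3 > 5
five_two_two = quorums_always_overlap(5, 2, 2)       # 2 + 2 <= 5
```

With 5 replicas, quorums of 3 and 3 always share a replica. Quorums of 2 and 2 can miss each other entirely, so a read could see only stale copies.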
Leaderless replication can be resilient because one slow or broken replica does not stop everything.
But it is not free.
Replicas can still disagree for a while.
Some may miss writes while they are down.
So the system needs repair mechanisms to help old replicas catch up.
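One such mechanism is read repair: when a read notices that some replicas hold an older value, it pushes the newer one back to them. The sketch below assumes each value carries a version number so the reader can tell old from new. The function names and the version scheme are illustrative only.

```python
def quorum_write(replicas, targets, key, value, version):
    """Store a versioned value on the replicas that are reachable."""
    for i in targets:
        replicas[i][key] = (version, value)
    return len(targets)


def quorum_read(replicas, targets, key):
    """Return the newest value among the contacted replicas and repair stale ones."""
    responses = {i: replicas[i].get(key, (0, None)) for i in targets}
    newest = max(responses.values(), key=lambda pair: pair[0])
    for i, seen in responses.items():
        if seen[0] < newest[0]:
            replicas[i][key] = newest     # read repair: push the newer version back
    return newest[1]


replicas = [dict() for _ in range(5)]
quorum_write(replicas, [0, 1, 2, 3, 4], "gold", 100, version=1)

# Replicas 3 and 4 are down for the next write, so only three accept it.
quorum_write(replicas, [0, 1, 2], "gold", 102, version=2)

# The read contacts replicas 2, 3 and 4: one current, two stale.
value = quorum_read(replicas, [2, 3, 4], "gold")
```

The read returns 102 because replica 2 overlaps with the write quorum, and replicas 3 and 4 are brought up to date as a side effect.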
Leaderless does not remove the truth problem.
It changes how the system finds truth:
No official writer.
Just enough replicas must agree.
Three Replication Shapes
By now, TechFable has tried three shapes.
Single-leader → one writer, many readers
Multi-leader → many writers, conflict risk
Leaderless → no special writer, enough agreement
Each shape solves one pain and creates another.
Single-leader keeps writes simple, but followers can lag.
Multi-leader makes local writes faster, but writers can disagree.
Leaderless avoids one special writer, but the system must decide what “enough agreement” means.
Replication is never just:
Make more copies.
It is always:
Which copy should we trust, and when?
Daybook Recap
Single Database Node
One room, one board: all reads and writes go to one database.