How Databases Checkpoint to Disk Without Stopping the World

What problem does it solve?

“The key invariant is not 'all pages are consistent with each other.' It is 'all pages are at least as old as the redo point, and the WAL from the redo point forward is complete.'” (Gaurav Sarma, explaining PostgreSQL's fuzzy checkpointing)

You know that feeling when your database latency histogram shows a cliff every few minutes? That is often checkpointing. Your database holds gigabytes of dirty pages in memory, and they must reach disk eventually so that crash recovery has a bounded amount of log to replay. The naive approach (pause all writes, flush everything, resume) gives you a consistent snapshot, but it also gives you stalls of several hundred milliseconds. For a system doing 50,000 writes per second, each stall shows up in p99 latency every time the checkpoint fires. Every major database has had to solve this, and the solutions are more varied than you'd expect.

Tags: databases, systems-design, postgresql, redis, sqlite, rocksdb, mongodb

How it works

Think of it like taking a photo of a moving crowd. You can't freeze everyone in place, so you use tricks: take multiple photos and stitch them together (PostgreSQL's WAL replay), have people step into a side room for their photo (SQLite's WAL file), or create a duplicate crowd that stands still while the original keeps moving (Redis's fork).

In concrete terms: PostgreSQL records a 'redo point' in its write-ahead log (WAL), flushes dirty pages in the background while writes continue, and on recovery replays the log forward from that point. SQLite in WAL mode writes all changes to a separate WAL file; a checkpoint copies pages back into the main file, but only up to the point that the oldest active reader still needs. Redis calls fork() to get a child process with a copy-on-write view of memory; the child serializes everything to disk while the parent keeps serving requests.
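To make the redo-point idea concrete, here is a minimal Python sketch of a toy key-value store. Everything in it (ToyDB, its fields, the concurrent_writes parameter) is hypothetical and invented for illustration. It is not PostgreSQL's implementation, and it simulates concurrency on a single thread by interleaving writes between page flushes.

```python
class ToyDB:
    """Toy model of fuzzy checkpointing. All names are hypothetical."""

    def __init__(self):
        self.wal = []        # append-only log of (lsn, key, value)
        self.buffer = {}     # in-memory pages: key -> (value, lsn)
        self.dirty = set()   # keys changed in memory but not yet flushed
        self.disk = {}       # simulated data files: key -> (value, lsn)
        self.redo_point = 0  # recovery replays the WAL from this LSN

    def write(self, key, value):
        lsn = len(self.wal)
        self.wal.append((lsn, key, value))  # log first (write-ahead)
        self.buffer[key] = (value, lsn)
        self.dirty.add(key)
        return lsn

    def checkpoint(self, concurrent_writes=()):
        # Step 1: note the redo point before flushing anything.
        redo = len(self.wal)
        pending = iter(concurrent_writes)
        # Step 2: flush the pages that were dirty at the start, one at a
        # time, letting one "concurrent" write slip in before each flush.
        for key in sorted(self.dirty):
            nxt = next(pending, None)
            if nxt is not None:
                self.write(*nxt)
            self.disk[key] = self.buffer[key]
            self.dirty.discard(key)
        for late in pending:
            self.write(*late)
        # Step 3: publish the redo point only after the flush finished.
        self.redo_point = redo

    def crash_and_recover(self):
        # Memory is lost; start from disk and replay the WAL forward.
        pages = dict(self.disk)
        for lsn, key, value in self.wal[self.redo_point:]:
            # Skip records the on-disk page already reflects.
            if key not in pages or pages[key][1] < lsn:
                pages[key] = (value, lsn)
        return {key: value for key, (value, _) in pages.items()}
```

After a checkpoint with interleaved writes, the disk dict holds a mix of page versions, some older and some newer than the redo point, so it is not a consistent snapshot on its own. Recovery still converges because replay starts at the redo point and skips any record a page already reflects.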

Key takeaways

01

Three fundamental primitives — every non-blocking checkpoint is either WAL replay (PostgreSQL, WiredTiger), side-channel merge (SQLite, RocksDB), or fork-based snapshot (Redis), and understanding which primitive your database uses tells yo...

02

PostgreSQL fuzzy checkpointing — records a redo point, flushes dirty pages in background via bgwriter/checkpointer processes, spreads I/O over time with checkpoint_completion_target (default 0.9), and on recovery replays WAL forward from r...

03

SQLite WAL mode — writes only to WAL file during transactions, reads check WAL first then main file, checkpoint copies pages back only up to minimum read mark, and PASSIVE mode never blocks but WAL can grow unboundedly with long-running re...

04

Redis BGSAVE fork strategy — fork() creates child with copy-on-write view of parent memory, child walks data structures and writes RDB sequentially, parent continues serving writes, and worst case memory doubles if every page modified duri...

05

RocksDB immutable memtable flush — active MemTable rotates to immutable when full, background thread flushes immutable to L0 SSTable, writes accumulate in new active MemTable during flush, and GetLiveFiles() creates instant checkpoint via ...

06

WiredTiger hazard pointers — maintains two checkpoints (durable and in-progress), uses hazard pointers for lock-free reader synchronization, writes to new disk locations (append-only), and atomically updates metadata file on checkpoint com...
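The SQLite takeaway above is the easiest one to try locally. The sketch below uses Python's standard sqlite3 module with SQLite's documented PRAGMA journal_mode, PRAGMA wal_autocheckpoint, and PRAGMA wal_checkpoint(PASSIVE). The function name wal_checkpoint_demo, the kv table, and the row counts are assumptions made for this example. Automatic checkpoints are switched off only so the WAL visibly accumulates before the manual checkpoint.

```python
import os
import sqlite3
import tempfile


def wal_checkpoint_demo(db_path):
    """Write in WAL mode, then run one PASSIVE checkpoint (illustrative)."""
    conn = sqlite3.connect(db_path)
    # Switch to WAL mode; the pragma returns the resulting journal mode.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchall()[0][0]
    # Disable automatic checkpoints so the WAL grows until we checkpoint.
    conn.execute("PRAGMA wal_autocheckpoint=0").fetchall()

    conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
    rows = [(f"k{i}", "x" * 100) for i in range(1000)]
    conn.executemany("INSERT INTO kv VALUES (?, ?)", rows)
    conn.commit()

    # Committed changes live in the side file "<db>-wal" at this point.
    wal_bytes = os.path.getsize(db_path + "-wal")

    # PASSIVE copies back as many frames as it can without waiting.
    # The result row is (busy, frames in WAL, frames checkpointed).
    busy, log_frames, checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(PASSIVE)"
    ).fetchall()[0]
    conn.close()
    return mode, wal_bytes, busy, log_frames, checkpointed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(wal_checkpoint_demo(os.path.join(tmp, "demo.db")))
```

With a single connection and no open readers, the checkpoint should be able to copy back every frame in the WAL. If another connection held a read transaction open, a PASSIVE checkpoint would be expected to stop at the frames that reader still needs, which is how a long-running reader lets the WAL keep growing.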

Should you care?

Who it’s for

If you're a backend engineer who's ever stared at a latency spike and wondered 'is this checkpointing?', or a DBA tuning PostgreSQL's checkpoint_completion_target without understanding why it matters, this is for you. Especially valuable if you're choosing between databases and want to understand their persistence trade-offs. Not useful if you don't work with databases that have crash recovery requirements, or if you only use managed database services where you can't tune checkpoint behavior.

Worth exploring

Yes, this is one of the clearest explanations of database checkpointing you'll find. The author works on Stripe's database engine team and has 10 years of infrastructure experience, and it shows—the article includes actual code snippets, disk layouts, and a comparison table of trade-offs. Published March 15, 2026, so it's current. The only caveat: it assumes you know what WAL and buffer pools are, so complete beginners should read the PostgreSQL docs first.
