MVCC in Postgres: When It Bites You
Postgres uses MVCC for concurrency. Most of the time you don't think about it. Then you do. Here are the four ways MVCC bites in production, and how to handle each.
Multi-Version Concurrency Control is what makes Postgres feel fast even under heavy concurrent writes. Every transaction sees a consistent snapshot; readers don't block writers; writers don't block readers. It works so well most teams never think about it.
Then they hit a 200 GB table that they swore was 20 GB. Or a replication lag that grew during a long-running report. Or a query that worked yesterday and times out today. That's MVCC biting.
What MVCC is, briefly
Every row in Postgres is actually a chain of row versions. An UPDATE doesn't change the row in place; it writes a new version and marks the old one as "invisible after transaction X". A DELETE marks the row as invisible; the bytes stay.
Older versions get cleaned up by VACUUM (manual or autovacuum) once no running transaction can possibly see them.
Four ways MVCC bites
- Bloat from long-running transactions.
- Transaction ID wraparound on busy databases.
- Repeatable-read isolation hiding new commits from your read.
- Autovacuum not keeping up with churn on big tables.
Bloat from long transactions
The most common one. A long-running transaction (a 2-hour analytics query, a forgotten BEGIN in a debugging session, a sleeping connection that holds a snapshot) prevents VACUUM from cleaning up any row versions newer than the moment that transaction started.
Meanwhile your app keeps writing. Every UPDATE creates a new row version. Dead row versions pile up on the heap; the table grows even though row count is stable.
-- Find transactions that have been open for more than 5 minutes
SELECT pid, usename, state, query_start, query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
AND now() - query_start > interval '5 minutes'
ORDER BY query_start;Transaction ID wraparound
Postgres uses a 32-bit transaction ID counter that wraps every ~4 billion transactions. Before it wraps, autovacuum must "freeze" rows to mark them visible to all future transactions. If autovacuum can't keep up, Postgres eventually shuts down to prevent data corruption.
On databases that write very heavily, this becomes a real operational concern past ~1 billion transactions per month. Monitor:
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;If xid_age approaches 200 million, your autovacuum config is too conservative for your workload.
Repeatable-read surprises
Most Postgres apps use the default isolation level (read committed). Some use repeatable read for stronger consistency. The latter has a surprise: a transaction in repeatable read sees the data as it was at the transaction's start. New commits by other transactions are invisible to it.
This is fine for analytics, terrible for "read latest user balance" logic. Watch for: a transaction that reads, computes, writes, and you find the write was based on stale data. That's repeatable read being too strong for your use case.
When autovacuum can't keep up
Default autovacuum settings are tuned for moderate workloads. High- churn tables (sessions, queue tables, append-then-delete patterns) often need per-table tuning:
ALTER TABLE sessions SET (
autovacuum_vacuum_scale_factor = 0.05, -- vacuum when 5% dead (default 20%)
autovacuum_vacuum_cost_limit = 1000, -- run faster
autovacuum_analyze_scale_factor = 0.02
);Watch pg_stat_user_tables.n_dead_tup over time. If it keeps climbing, autovacuum isn't winning the race.
MVCC is a great trade-off. The cost is that you have to operate VACUUM thoughtfully on high-volume databases. Treat it as a first-class operational concern and it rarely bites; ignore it and it bites at the worst possible time.
Suparbase is an admin workspace for Supabase. Encrypted credentials, server-side proxy, RLS debugger, SQL playground, AI assistant with diff-confirmed writes. Free tier for solo projects.
Related articles
- postgres · performance
Reading Postgres EXPLAIN ANALYZE: The 2026 Guide
A field guide to Postgres EXPLAIN ANALYZE in 2026. The operators that matter, the numbers to look at, and the four patterns that explain 90% of slow queries.
Read article - postgres · partitioning
Postgres Partitioning at Scale
Declarative partitioning landed years ago; in 2026 it's the boring-and-correct answer for tables past 100M rows. The patterns that work, the gotchas, and when to skip it.
Read article - postgres · observability
The Modern Postgres Observability Stack in 2026
The metrics that actually matter, the tools that work in 2026, and the alerts to set up before your database becomes someone else's problem.
Read article