PostgreSQL Under Load - Part 2: Vacuum, Analyze, and Index

October 2025 | Tags: Postgres, PgSQL, Table Partition, Performance Fix

In Part 1, we tackled partitioning — the first step to keeping Postgres performant under heavy write loads. Partitioning solved routing, not cleanup.

Postgres uses MVCC (Multi-Version Concurrency Control) to handle concurrent reads and writes. It’s brilliant, but it comes with a caveat: every update leaves behind dead tuples, every delete lingers until vacuumed, and every index quietly bloats over time.

This post dives into the cleanup side of scaling:

  •   How MVCC creates bloat
  •   How to tune autovacuum and ANALYZE
  •   How to keep indexes lean and writes fast

MVCC Fragmentation: Why Postgres Bloats

Postgres doesn't overwrite rows; it creates new versions. This is the heart of MVCC. Dead tuples pile up in both the table and its indexes, and over time that causes bloated pages, index churn, and autovacuum lag.

Behind the scenes:

  •   UPDATE → creates a new row version, marks the old one as dead.
  •   DELETE → marks the row as dead, but doesn’t remove it.
  •   SELECT → still reads pages containing dead tuples; visibility checks filter them out, but they cost I/O until vacuumed.
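You can watch this versioning directly through the hidden system columns xmin and xmax. A minimal sketch, assuming the public.logs table used later in this post:

	-- xmin: transaction that created this row version
	-- xmax: transaction that deleted or superseded it (0 while the version is live)
	SELECT xmin, xmax, ctid, *
	FROM public.logs
	LIMIT 5;

After an UPDATE, the old version's xmax is set to the updating transaction's ID, and a new version appears with that same ID as its xmin.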

How to Spot Fragmentation

  •   Use pg_stat_user_tables to check n_dead_tup (query below)
  •   Use the pgstattuple extension to inspect the exact bloat percentage (sketch after the query)
	-- Query to get the number of dead tuples
	SELECT 
	  schemaname,
	  relname AS table_name,
	  n_dead_tup AS dead_tuples
	FROM 
	  pg_stat_user_tables
	ORDER BY 
	  n_dead_tup DESC
	LIMIT 10;
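For a more precise picture, a minimal sketch using pgstattuple; note that it scans the whole table, so run it sparingly on large relations:

	-- Install once per database
	CREATE EXTENSION IF NOT EXISTS pgstattuple;

	-- dead_tuple_percent and free_percent approximate the table's bloat
	SELECT tuple_percent, dead_tuple_percent, free_percent
	FROM pgstattuple('public.logs');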

Vacuum & Analyze

Postgres relies on autovacuum to clean up dead tuples and refresh query planner stats. But in high-ingest systems, default settings often lag behind reality — leaving bloated tables and stale plans.

Vacuum - Reclaim Space

Use pg_stat_user_tables to spot tables with lots of dead tuples and slow vacuum cycles.
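A quick way to surface both signals at once; a sketch against pg_stat_user_tables:

	-- Tables with the most dead tuples and their most recent vacuum runs
	SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
	FROM pg_stat_user_tables
	ORDER BY n_dead_tup DESC
	LIMIT 10;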

	VACUUM public.logs;
  • VACUUM - Marks dead tuple space as reusable. Doesn't block reads or writes. Run it after bulk updates or deletes, and tune autovacuum to match your traffic (see the sketch after this list).
  • VACUUM FULL - Rewrites the entire table. LOCKS the table and blocks reads/writes. Use with caution.
  • pg_repack (extension) - Rewrites the table online; it needs an exclusive lock only briefly at the beginning and end. Use only when absolutely required.
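Autovacuum thresholds can also be tuned per table. A minimal sketch for a high-churn table like public.logs (set on each partition if the table is partitioned); the values here are illustrative, not prescriptive:

	-- Fire autovacuum after ~2% of rows are dead (default is 20%), plus a fixed floor
	ALTER TABLE public.logs SET (
	  autovacuum_vacuum_scale_factor = 0.02,
	  autovacuum_vacuum_threshold    = 1000
	);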

Analyze - Keeping Stats Fresh

Postgres uses statistics to plan queries. After a bulk insert or partition creation, stats may be stale, leading to bad plans. Run ANALYZE manually after COPY, backfills, or partition swaps to keep planner decisions accurate.

	ANALYZE public.logs;
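To check whether stats are stale in the first place, pg_stat_user_tables records when a table was last analyzed:

	-- When were stats last refreshed, and by whom?
	SELECT relname, last_analyze, last_autoanalyze
	FROM pg_stat_user_tables
	WHERE relname = 'logs';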

Index Strategy for Write-Heavy Tables

Indexes speed up reads — but they come at a cost. Every insert, update, or delete must also update the indexes. In write-heavy systems, too many indexes = slow writes + bloated storage.

  •   Index only what you query. Skip speculative or “just in case” indexes.
  •   Use partial indexes for recent rows; they're smaller, faster, and cheaper to maintain (see the sketch after this list).
  •   Drop indexes before bulk loads (COPY, large INSERTs) and recreate them afterward; Postgres can't temporarily disable an index, so drop-and-rebuild is the usual approach.
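A minimal partial-index sketch, assuming public.logs has a created_at column; the predicate must be immutable, hence a fixed cutoff date rather than now():

	-- Only rows past the cutoff enter the index; older rows add no index overhead
	CREATE INDEX logs_recent_created_at_idx
	  ON public.logs (created_at)
	  WHERE created_at >= DATE '2025-01-01';

Queries filtering on the same predicate can use this index, and it stays small because historical rows never enter it.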

Tuning work_mem - When Queries Spill

work_mem controls how much memory Postgres can use for operations like sorting, hashing, and joins. If a query exceeds this limit, it spills to disk, slowing everything down.

The default work_mem is just 4MB, which is often too small for production workloads involving joins, sorts, or aggregates. Note that the limit applies per sort or hash operation, not per connection, so one complex query can consume several multiples of it. Tune it with the trade-off in mind:

  •   Too low → sorts and hashes spill to disk-based temp files, slowing queries.
  •   Too high → faster joins, sorts, and aggregates, but concurrent queries risk exhausting memory.
	SHOW work_mem;           -- inspect the current value
	SET work_mem = '64MB';   -- applies to the current session only
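To confirm whether a query is actually spilling, a sketch using EXPLAIN ANALYZE (the created_at column is the same assumption as above):

	EXPLAIN (ANALYZE, BUFFERS)
	SELECT * FROM public.logs ORDER BY created_at;
	-- "Sort Method: external merge  Disk: ..." means the sort exceeded work_mem
	-- and spilled; "Sort Method: quicksort  Memory: ..." means it fit.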

Postgres won’t slow down all at once — it degrades quietly. But with the right cleanup strategy, you can keep performance sharp even under heavy write pressure.

Let's Connect

For a cup of coffee, challenges, or conversations that spark something new:

dakshin.g [at] outlook [dot] com
www.dakshin.cc