Vacuuming and Reindexing in PostgreSQL

PostgreSQL's MVCC requires regular maintenance. This guide explains dead tuples, VACUUM vs VACUUM FULL, autovacuum tuning, REINDEX strategies, and how to monitor bloat with pg_stat_user_tables.



PostgreSQL uses Multi-Version Concurrency Control (MVCC) to allow concurrent reads and writes without locking. This design gives PostgreSQL its excellent concurrency characteristics. But MVCC is not free—every UPDATE and DELETE leaves behind dead tuples that need to be cleaned up. Ignore this cleanup long enough, and your tables bloat, your queries slow down, and you wonder why your 100GB database is eating 500GB of disk.

This guide covers the maintenance tasks that keep PostgreSQL running smoothly: vacuuming and reindexing.

Understanding MVCC and Dead Tuples

When you UPDATE a row in PostgreSQL, it does not modify the existing row. Instead, it marks the old row as dead and inserts a new row. Similarly, DELETE marks rows as dead rather than removing them. This is how MVCC works—transactions see the version of the data that existed when they started.

The problem is that dead tuples accumulate. They take up space, and they slow down scans because every scan must read and step over them. Eventually, if you never clean them up, you run out of disk space or your tables become so bloated that performance grinds to a halt.

VACUUM is PostgreSQL's garbage collector. It marks the space used by dead tuples as free for future use. It does not return that space to the operating system (except for completely empty pages at the very end of the table)—it just marks the pages as reusable. To shrink the file itself, you need VACUUM FULL.
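A toy Python model makes the mechanics concrete. This is purely illustrative—real PostgreSQL storage is page-based, not a flat list of slots—but it captures the lifecycle: UPDATE marks the old version dead, VACUUM turns dead slots into reusable free space without shrinking the "file", and a later INSERT reuses them.

```python
# Toy model of MVCC row versions. Each slot is either a row version dict
# or None (freed space). VACUUM never shrinks the slot list -- it only
# makes dead slots reusable, just as VACUUM keeps the table file size.
class Table:
    def __init__(self):
        self.slots = []

    def insert(self, value):
        row = {"value": value, "dead": False}
        for i, slot in enumerate(self.slots):   # reuse freed space first
            if slot is None:
                self.slots[i] = row
                return
        self.slots.append(row)                  # otherwise grow the "file"

    def update(self, old_value, new_value):
        for slot in self.slots:
            if slot and not slot["dead"] and slot["value"] == old_value:
                slot["dead"] = True             # old version stays on disk
        self.insert(new_value)

    def vacuum(self):
        for i, slot in enumerate(self.slots):
            if slot and slot["dead"]:
                self.slots[i] = None            # mark the space reusable

t = Table()
t.insert("a")
t.update("a", "b")   # two slots now: one dead ("a"), one live ("b")
t.vacuum()           # dead slot freed; slot count (file size) unchanged
t.insert("c")        # reuses the freed slot -- no growth
```

After the final insert the table still holds only two slots: the freed space was reused, which is exactly what routine VACUUM buys you.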

VACUUM vs VACUUM FULL

Let me distinguish between the two clearly.

VACUUM (the standard command) marks dead tuples as free space within the table. It does not shrink the table’s physical size on disk. The freed space is immediately available for new rows within the same table. This is the command you should run most of the time.

VACUUM FULL rewrites the entire table, compacting it and returning unused space to the operating system. It requires an exclusive lock on the table—writes and reads are blocked for the duration. On a large table, this can take hours. Avoid this in production unless you have a maintenance window and genuinely need the space back.

-- Standard vacuum (run regularly)
VACUUM users;

-- Full vacuum (rarely needed, blocks everything)
VACUUM FULL users;

Let autovacuum handle routine vacuuming. Only use VACUUM FULL when you need to reclaim disk space after deleting a large amount of data—and even then, do it during a maintenance window.

Autovacuum Tuning

PostgreSQL’s autovacuum daemon automates vacuuming, but the defaults are conservative. For a busy database, you will likely need to tune these settings.

The key parameters are:

  • autovacuum_vacuum_threshold (default 50): Minimum number of dead tuples before VACUUM is triggered
  • autovacuum_vacuum_scale_factor (default 0.2): Fraction of table size to add to threshold
  • autovacuum_analyze_threshold (default 50): Minimum number of tuple changes before ANALYZE
  • autovacuum_analyze_scale_factor (default 0.1): Fraction of table size for ANALYZE threshold
  • autovacuum_vacuum_cost_delay (default 2ms since PostgreSQL 12, 20ms before): How long the worker sleeps each time it reaches its cost limit

For a table with 10 million rows, the scale factor approach means autovacuum waits until there are roughly 2 million dead tuples. That is often far too lax for busy tables—bloat piles up long before cleanup starts.
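The gap is easy to quantify. A quick sketch of the formula autovacuum applies (vacuum threshold + vacuum scale factor × live tuple count):

```python
# Dead tuples needed before autovacuum vacuums a table:
# autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
def autovacuum_trigger(reltuples, scale_factor=0.2, threshold=50):
    return threshold + scale_factor * reltuples

# Defaults on a 10M-row table: about 2 million dead tuples accumulate
# before autovacuum even starts.
default_trigger = autovacuum_trigger(10_000_000)

# Per-table tuning (scale_factor=0.01, threshold=1000): about 101,000.
tuned_trigger = autovacuum_trigger(10_000_000, scale_factor=0.01,
                                   threshold=1000)
```

The tuned numbers correspond to the per-table ALTER TABLE example below: roughly a 20x earlier trigger on the same table.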

You can configure autovacuum per-table:

ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,
  autovacuum_vacuum_threshold = 1000
);

This reduces the scale factor to 1% and sets a minimum threshold of 1000 dead tuples. For large tables, this triggers vacuuming much sooner.

Watch out for tables with very high update rates. If you are updating millions of rows per hour, the default settings will let bloat accumulate faster than autovacuum can handle. You might need to manually VACUUM those tables on a schedule.
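A minimal way to schedule that, assuming the standard vacuumdb client and cron are available (the database and table names here are illustrative):

```shell
# Crontab entry: VACUUM ANALYZE a high-churn orders table hourly,
# instead of waiting for autovacuum thresholds to trip.
0 * * * * vacuumdb --dbname=shop --table=orders --analyze --quiet
```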

REINDEX Strategies

Indexes in PostgreSQL can also become bloated over time. B-tree indexes, the default index type, can develop page-level fragmentation. REINDEX rebuilds the index, compacting it and improving performance.

Blocking vs Concurrent REINDEX

The standard REINDEX command blocks writes and reads while rebuilding:

REINDEX INDEX idx_users_email;

For production systems with large indexes, use CONCURRENTLY:

REINDEX INDEX CONCURRENTLY idx_users_email;

REINDEX CONCURRENTLY builds a new index alongside the existing one, then swaps them, so writes continue throughout (apart from brief locks around the swap). The main caveats are that it cannot run inside a transaction block, and it requires extra disk space for the temporary new index.

When to REINDEX

Index bloat is not always the problem people think it is. B-tree indexes self-balance, and some fragmentation is normal. REINDEX is most valuable when:

  • After a bulk DELETE on a heavily indexed table
  • When an index has become significantly bloated (verify with pgstattuple's pgstatindex())
  • After a PostgreSQL major version upgrade (some versions change the on-disk index format)
  • When query performance on an index stays poor even after ANALYZE, which points to physical fragmentation rather than stale statistics

For routine maintenance, weekly or monthly REINDEX on problematic indexes is reasonable. Do not reindex everything blindly—focus on indexes that show bloat in your monitoring.

When to Use / When Not to Use

Use VACUUM (standard) for routine maintenance — it reclaims dead tuple space within the table without blocking reads or writes. Run it after large batch DELETE or UPDATE operations, or when n_dead_tup climbs above meaningful levels but disk space is not yet critical.

Use VACUUM FULL only when you genuinely need to return disk space to the operating system and can afford a maintenance window with exclusive table access. Reserve this for tables with severe bloat above 50% that cannot wait for normal cleanup.

Use CLUSTER when you want to physically reorder table data for I/O efficiency on a specific index — particularly useful after bulk loads where the table is accessed sequentially on an indexed column.

Use REINDEX CONCURRENTLY in production systems where index rebuilds must not block writes, or after bulk deletes that significantly fragmented specific indexes.

Never run VACUUM FULL on a large table without a maintenance window — it holds an exclusive lock for the entire duration. Never use standard REINDEX on busy production indexes — use CONCURRENTLY. Never blindly reindex everything — target bloated ones only based on monitoring data.

The lifecycle—dead tuple creation by writes, cleanup by VACUUM, index rebuild by REINDEX—is sketched below (Mermaid source):

flowchart TB
    subgraph Write["UPDATE/DELETE Transaction"]
        W1[Mark old row as DEAD] --> W2[Insert new row version]
    end

    subgraph Vacuum["VACUUM Process"]
        V1[Scan table pages] --> V2[Find dead tuples]
        V2 --> V3[Mark page space as FREE]
        V3 --> V4[Update FSM<br/>Free Space Map]
    end

    subgraph Reindex["REINDEX Process"]
        R1[Build new index<br/>alongside old] --> R2[Swap when complete]
        R2 --> R3[Drop old index]
    end

    Write --> Vacuum
    Write --> Reindex

When to Manually VACUUM/REINDEX

Autovacuum handles most situations. Manual intervention is warranted in specific scenarios.

After Large Batch Operations

If you have just loaded 10 million rows via COPY or a bulk UPDATE, autovacuum might not catch up immediately. Running VACUUM ANALYZE afterward ensures statistics are updated and dead tuples are cleaned up promptly.

COPY users FROM '/tmp/users.csv';
VACUUM ANALYZE users;

After Major Deletions

When you delete millions of rows (say, purging old orders), autovacuum will eventually clean up, but you might want to run VACUUM immediately if disk space is a concern. If you need the space back immediately, VACUUM FULL (during maintenance) or CLUSTER (which reorders physical storage) might be necessary.

Before Major Queries

If you are about to run a long report that will scan millions of rows, running ANALYZE first ensures the query planner has accurate statistics. This is especially relevant after large data changes.

ANALYZE users;
EXPLAIN ANALYZE SELECT * FROM users WHERE created_at > '2025-01-01';

Monitoring Bloat with pg_stat_user_tables

PostgreSQL provides built-in statistics for monitoring table health. The pg_stat_user_tables view is your first stop for diagnosing bloat issues.

SELECT
  relname,
  n_live_tup,
  n_dead_tup,
  last_vacuum,
  last_autovacuum,
  vacuum_count,
  autovacuum_count,
  n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0) AS dead_tuple_pct
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_dead_tup DESC;

This query shows dead tuple counts and percentages for all tables. A table with 100 million live tuples and 10 million dead tuples has 9% bloat. Tables approaching 20-30% bloat benefit from manual vacuuming.
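If you pull these numbers into a monitoring script, the same percentage and a simple vacuum decision look like this (the 20% manual-vacuum threshold is illustrative, echoing the 20-30% guidance above):

```python
# Mirrors the dead_tuple_pct column from the SQL query above.
def dead_tuple_pct(n_live_tup, n_dead_tup):
    total = n_live_tup + n_dead_tup
    return 100.0 * n_dead_tup / total if total else 0.0

def needs_manual_vacuum(n_live_tup, n_dead_tup, pct_threshold=20.0):
    """Flag tables whose bloat percentage crosses the manual threshold."""
    return dead_tuple_pct(n_live_tup, n_dead_tup) >= pct_threshold

# 100M live / 10M dead is ~9%: leave it to autovacuum.
# 10M live / 4M dead is ~29%: worth a manual VACUUM.
```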

For more detailed bloat analysis, the pgstattuple extension provides actual bloat estimates:

SELECT * FROM pgstattuple('users');

This returns estimated free space, dead tuples, and bloat percentage. You need to install the extension first:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

Another useful view is pg_stat_user_indexes, which shows index usage statistics. Indexes that are never scanned (idx_scan = 0) might be unnecessary:

SELECT
  indexrelname,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan ASC;

Table Bloat Detection Query

For a broader sweep without installing extensions, you can approximate the expected table size from planner statistics and compare it to the actual page count. This is a rough heuristic—it assumes a ~28-byte per-tuple overhead and a 24-byte page header, and ignores fill factor, alignment padding, and TOAST—so treat the output as a screening tool and confirm suspects with pgstattuple:

WITH est AS (
  SELECT
    c.oid,
    c.relname,
    c.relpages,
    c.reltuples,
    current_setting('block_size')::numeric AS bs,
    (SELECT sum(st.avg_width)
       FROM pg_stats st
      WHERE st.schemaname = n.nspname
        AND st.tablename = c.relname) AS row_width
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE n.nspname = 'public'
    AND c.relkind = 'r'
    AND c.reltuples >= 0  -- skip never-analyzed tables
)
SELECT
  relname,
  relpages AS actual_pages,
  CEIL(reltuples::numeric * (row_width + 28) / (bs - 24)) AS expected_pages,
  ROUND(100 * (relpages - CEIL(reltuples::numeric * (row_width + 28) / (bs - 24)))
        / NULLIF(relpages, 0), 1) AS bloat_pct
FROM est
WHERE row_width IS NOT NULL
ORDER BY bloat_pct DESC NULLS LAST;

A negative bloat_pct simply means the table is denser than the naive estimate; focus on the large positive outliers.

Putting It Together: A Maintenance Routine

A reasonable PostgreSQL maintenance routine looks like this:

  1. Daily: Check pg_stat_user_tables for high dead tuple counts. Manually VACUUM tables that exceed thresholds.

  2. Weekly: Run VACUUM ANALYZE on all tables. Consider REINDEX on indexes with high bloat.

  3. Monthly: Review autovacuum configuration. Adjust thresholds for tables with changing patterns.

  4. After major data changes: Always run ANALYZE. Consider VACUUM if deleting significant data.

  5. Monitoring: Graph key metrics over time—dead tuple counts, table size trends, autovacuum frequency.

The specifics depend on your workload. A database with heavy UPDATE/DELETE patterns needs more frequent maintenance than a mostly-append database. Watch your data and adjust accordingly.

Production Failure Scenarios

| Failure | Cause | Mitigation |
| --- | --- | --- |
| Autovacuum thrashing | Tables with very high update rates generate dead tuples faster than autovacuum can process | Lower per-table autovacuum thresholds, run manual VACUUM on a schedule, or partition high-churn tables |
| Index bloat emergency | Bulk deletes left indexes fragmented; query performance degrades | Run REINDEX CONCURRENTLY on affected indexes; monitor with pg_stat_user_indexes |
| VACUUM FULL locks table for hours | Large table with extreme bloat requires exclusive access | Use CLUSTER instead (faster when I/O-ordered results help), or do incremental cleanup via small batch DELETEs |
| Transaction ID wraparound | Long-running transactions prevent VACUUM from freezing old XIDs | Monitor pg_database.datfrozenxid, never hold transactions open for days, set idle_in_transaction_session_timeout |
| Table size explosion without dead tuples | Rapid data growth crowds the buffer cache and slows scans | Partition tables by time range and rely on partition pruning to limit scans |

Trade-Off Table: VACUUM FULL vs CLUSTER vs REINDEX

| Dimension | VACUUM FULL | CLUSTER | REINDEX CONCURRENTLY |
| --- | --- | --- | --- |
| Lock behavior | Access exclusive — blocks all reads and writes | Access exclusive — blocks all reads and writes | ShareUpdateExclusive — allows reads and writes |
| Disk space needed | Extra space for the rewritten table | Extra space for the new table copy | Extra space for the new index |
| Physical order | Compacts but no ordering guarantee | Reorders rows by an index for I/O efficiency | Preserves existing order |
| Speed | Slowest for large tables | Faster than VACUUM FULL for sequential workloads | Roughly 2-3x slower than standard REINDEX |
| Use when | Need OS space back urgently | Table is accessed sequentially on a key after bulk load | Production index rebuild with zero downtime |

Capacity Estimation: Bloat Threshold and VACUUM Cost

Bloat percentage formula:

bloat_percentage = (dead_tuple_pages / (dead_tuple_pages + live_tuple_pages)) × 100

The pg_stat_user_tables view gives n_dead_tup and n_live_tup. Divide by the average tuples-per-page (which depends on row width; narrow rows can fit on the order of ~300 per 8KB page) to estimate pages. For a table with 1M dead tuples and 9M live tuples at 300 tuples/page: 1M/300 ≈ 3,333 dead pages, 9M/300 ≈ 30,000 live pages. Bloat = 3,333 / 33,333 ≈ 10%.
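The same arithmetic in code. Note that the tuples-per-page density cancels out of the percentage—the assumption only matters for the absolute page counts:

```python
# Estimate page counts from tuple counts and an assumed density, then
# the bloat percentage, mirroring the worked example above.
def estimated_pages(n_tup, tuples_per_page=300):
    return n_tup / tuples_per_page

def bloat_pct(n_dead_tup, n_live_tup, tuples_per_page=300):
    dead = estimated_pages(n_dead_tup, tuples_per_page)
    live = estimated_pages(n_live_tup, tuples_per_page)
    return 100.0 * dead / (dead + live)

# 1M dead / 9M live at ~300 tuples per page: ~3,333 dead pages,
# ~30,000 live pages, 10% bloat.
```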

Autovacuum cost throttling:

pages_per_cycle = vacuum_cost_limit / avg_cost_per_page
effective_page_rate ≈ pages_per_cycle / vacuum_cost_delay

Autovacuum accrues cost points as it processes pages (vacuum_cost_page_hit = 1, vacuum_cost_page_dirty = 20, vacuum_cost_page_miss = 2 since PostgreSQL 14, previously 10). Each time the accumulated cost reaches the limit (autovacuum_vacuum_cost_limit, which defaults to vacuum_cost_limit = 200), the worker sleeps for autovacuum_vacuum_cost_delay. Note that the scale factor controls when vacuum starts, not how hard it runs: setting it too high (e.g., autovacuum_vacuum_scale_factor = 0.5, meaning half the table must be dead) makes autovacuum trigger too rarely and lets bloat pile up between runs. For high-churn tables, use lower scale factors (0.05-0.1) and lower cost delays (1-2ms) to keep vacuum running continuously at low intensity.
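A back-of-envelope throughput estimate under this cost model. The assumptions (an average of ~10 cost points per page, work time ignored) are simplifications, so treat the result as an upper bound:

```python
# Autovacuum throughput under cost-based throttling: the worker sleeps
# cost_delay each time accumulated cost reaches cost_limit.
def vacuum_pages_per_second(cost_limit=200, avg_cost_per_page=10,
                            cost_delay_ms=2.0):
    pages_per_cycle = cost_limit / avg_cost_per_page
    cycles_per_second = 1000.0 / cost_delay_ms  # ignores actual work time
    return pages_per_cycle * cycles_per_second

# Modern defaults (2ms delay): 20 pages/cycle * 500 cycles/s
# = 10,000 pages/s, roughly 80 MB/s of 8KB pages at most.
# The pre-PG12 20ms delay caps the same workload at 1,000 pages/s.
```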

Index bloat estimation:

index_bloat_ratio = index_page_count / expected_page_count_for_live_tuples

For a B-tree index with high turnover, bloated pages contain many dead index tuples. The pgstattuple extension provides pgstatindex() to report fragmentation. An index bloat ratio above 1.5 (50% more pages than necessary) warrants REINDEX.

REINDEX lock time estimation:

reindex_time = index_size_gb × reindex_speed_seconds_per_gb

For typical B-tree indexes on spinning disk: ~30-60 seconds per GB. On NVMe: ~5-10 seconds per GB. For a 50GB index: expect 4-8 minutes on NVMe. Use CONCURRENTLY to avoid locks—REINDEX CONCURRENTLY takes 2-3x longer but maintains availability.
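As a calculator, with the caveat that the per-GB figures and the 2-3x CONCURRENTLY penalty are the ballpark numbers from the text, not measurements:

```python
# Rough REINDEX duration from index size and a throughput guess.
def reindex_seconds(index_size_gb, seconds_per_gb, concurrently=False):
    base = index_size_gb * seconds_per_gb
    # CONCURRENTLY runs roughly 2-3x longer; use the midpoint.
    return base * 2.5 if concurrently else base

# 50GB index on NVMe at ~5-10 s/GB: ~250-500s (4-8 minutes) blocking,
# ~2.5x that with REINDEX CONCURRENTLY but without blocking writes.
```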

Real-World Case Study: E-Commerce Site Bloat Incident

A major e-commerce platform experienced a severe bloat incident during a flash sale event that provides a cautionary tale about autovacuum tuning in production.

What happened: During a 24-hour sale event, the platform processed 2M orders. Each order generated 3-5 rows in various tables, and completed orders were soft-deleted (marked as cancelled/completed with a status flag). By the end of the event, 500GB of dead tuple space had accumulated across 12 tables. Their autovacuum workers, running at default settings (autovacuum_vacuum_scale_factor = 0.2, meaning vacuum only triggers at 20% dead tuples), could not keep up with the deletion rate.

The impact: Query performance degraded by 80% on order lookup tables. Page read times went from ~1ms to ~50ms as PostgreSQL scanned through dead tuple chains. The checkout flow started timing out. By the time the on-call DBA investigated, the autovacuum workers were running but could not catch up—a 50GB table with 20% bloat threshold requires ~10GB of dead tuple accumulation before triggering.

The fix: The team temporarily lowered autovacuum_vacuum_scale_factor to 0.05 (5%) and autovacuum_vacuum_cost_delay to 2ms on the affected tables using ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05, autovacuum_vacuum_cost_delay = 2). This caused autovacuum to trigger more frequently and run at lower intensity. Within 2 hours, bloat dropped from 50GB to 8GB. A VACUUM ANALYZE on the worst offending tables provided immediate relief.

The lesson: Default autovacuum settings are calibrated for modest workloads, not high-churn production tables. After any event that generates bulk deletions, run VACUUM on the affected tables immediately rather than waiting for autovacuum. For tables with high DELETE rates, set autovacuum_vacuum_scale_factor to 0.05-0.1 and the cost delay to 1-2ms. Monitor n_dead_tup per table and alert when dead tuple counts exceed 1M.

Interview Questions

Q1: What is the difference between VACUUM and VACUUM FULL, and when would you use each?

A: VACUUM (standard) reclaims dead tuples in place, marking pages as free space without changing physical file size. It runs concurrently with production traffic, acquires only ShareUpdateExclusiveLock (which does not block reads or writes), and is the default maintenance operation. VACUUM FULL rewrites the entire table to a new file, physically compacting it and returning space to the OS. It requires AccessExclusiveLock (blocks all reads and writes) and cannot run concurrently. Use VACUUM for regular maintenance (daily or weekly, depending on churn). Use VACUUM FULL only when you need to shrink a table’s disk footprint urgently AND can afford a maintenance window, or use pg_repack as a non-blocking alternative that achieves the same result with zero downtime.

Q2: A table has 80GB of disk space but only 10GB of live data. How do you diagnose and fix this?

A: First, diagnose with SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum FROM pg_stat_user_tables WHERE relname = 'your_table'. A high dead tuple count with no recent vacuum confirms the bloat comes from uncleared dead tuples. For a table with 10GB of live data and 70GB of dead space, a standard VACUUM will reclaim the free space within the table file but not return it to the OS. To actually shrink the file, use VACUUM FULL (blocks reads and writes) or pg_repack -k -t your_table (zero-downtime compaction). Also check for bloated indexes with pgstatindex() on each index: bloated indexes often contribute as much or more to storage bloat as the table itself—REINDEX each one that shows low leaf density.

Q3: How does autovacuum tuning differ for a write-heavy OLTP table versus an analytical table with bulk inserts?

A: For write-heavy OLTP tables with frequent UPDATEs and DELETEs, autovacuum must run frequently and cheaply enough to keep up with continuous dead tuple generation. Configure: autovacuum_vacuum_scale_factor = 0.05, autovacuum_vacuum_cost_delay = 2ms, and a raised autovacuum_vacuum_cost_limit (e.g., 2000) to allow more work per cycle. Monitor n_dead_tup and set an alert at thresholds specific to that table. For analytical tables with bulk loads and no updates or deletes between loads, dead tuple generation is bursty rather than continuous, so autovacuum rarely needs to run. Leave the scale factor at the default (0.2) and instead run VACUUM ANALYZE immediately after each bulk load to update statistics.

Q4: What causes index bloat, and how do you decide when to REINDEX?

A: Index bloat accumulates when many rows in an indexed table are updated or deleted. VACUUM does remove the dead index entries, but emptied B-tree pages are only recycled for reuse within the same index—they are never merged or returned to the OS—so a bloated index does not shrink on its own; only REINDEX rebuilds it compactly. Symptoms: an index grows significantly larger than expected for its row count; query performance degrades despite table-level vacuum; pgstatindex() shows avg_leaf_density below 50%. REINDEX is warranted when the bloat ratio exceeds roughly 1.5 (50% more pages than necessary). Use pgstatindex() output to compare index_size, tree_height, and leaf_pages—a tree that is larger than its row count justifies indicates fragmentation. Schedule REINDEX during low-write windows or use REINDEX CONCURRENTLY to avoid blocking production traffic.

Conclusion

MVCC is what makes PostgreSQL handle concurrency so well, but it requires maintenance. Vacuuming reclaims space from dead tuples. Reindexing compacts bloated indexes. Autovacuum handles the routine work, but you need to tune it for busy tables and watch for situations that overwhelm the defaults.

Build monitoring into your routine. Track dead tuple counts, vacuum timing, and table sizes. Bloat caught early is far easier to fix than bloat that has been compounding for months.

PostgreSQL’s visibility into its own internals is excellent. The tools exist—you just need to use them.

