Index Maintenance: Bloat, Rebuild, Reindex, and Fillfactor Tuning

Indexes degrade over time. PostgreSQL's MVCC architecture creates dead index entries that waste space and slow down scans. Regular maintenance keeps indexes healthy and query performance predictable.

Index Bloat

Index bloat occurs when dead tuples leave empty space in index pages. Unlike tables, PostgreSQL does not automatically reuse index page space aggressively.

Measuring Bloat

-- Quick bloat estimation using pg_stat_user_indexes

SELECT

indexrelname AS index_name,

relname AS table_name,

idx_scan,

idx_tup_read,

idx_tup_fetch,

pg_size_pretty(pg_relation_size(indexrelid)) AS index_size

FROM pg_stat_user_indexes

WHERE schemaname = 'public'

ORDER BY pg_relation_size(indexrelid) DESC;

For detailed bloat estimation, the `pgstattuple` extension provides accurate measurements:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT * FROM pgstatindex('idx_orders_user_id');

-- version, tree_level, index_size, root_block_no, internal_pages, leaf_pages,

-- empty_pages, deleted_pages, avg_leaf_density, leaf_fragmentation

Key metrics:

* `avg_leaf_density`: Below 50% indicates significant bloat.

* `leaf_fragmentation`: High fragmentation slows sequential index scans.

* `empty_pages + deleted_pages`: Pages that are allocated but useless.

External Tools

`pg_repack` rebuilds indexes without locks:

# Rebuild all indexes on a table without blocking writes

pg_repack -h localhost -d mydb --table orders -o idx_orders_user_id

REINDEX

`REINDEX` rebuilds an index from scratch, eliminating bloat and restoring optimal structure:

-- Rebuild a single index

REINDEX INDEX idx_orders_user_id;

-- Rebuild all indexes on a table

REINDEX TABLE orders;

-- Rebuild all indexes in a schema

REINDEX SCHEMA public;

-- Rebuild all indexes in a database (offline)

REINDEX DATABASE mydb;

CONCURRENTLY

`REINDEX` by default takes an `ACCESS EXCLUSIVE` lock, blocking reads and writes. The `CONCURRENTLY` option avoids this:

REINDEX INDEX CONCURRENTLY idx_orders_user_id;

Concurrent reindexing creates a new index, starts a new transaction, and drops the old index when complete. It uses more resources (CPU, I/O, temporary disk space) but does not block queries.

-- For production systems, always use CONCURRENTLY

-- But monitor for failure:

REINDEX INDEX CONCURRENTLY idx_orders_user_id;

-- If the process fails, an "invalid" index remains:

SELECT indexrelid::regclass, indisvalid

FROM pg_index

WHERE indisvalid = false;

-- Drop and retry the invalid index

DROP INDEX IF EXISTS idx_orders_user_id_ccnew;

Fillfactor Tuning

Fillfactor controls how full each index page is when it is first built. The default is 90% for B-tree indexes (10% free space per page).

-- Create index with 70% fillfactor (30% free space)

CREATE INDEX idx_orders_user_id ON orders (user_id) WITH (fillfactor = 70);

-- Alter existing index (only affects future page splits)

ALTER INDEX idx_orders_user_id SET (fillfactor = 80);

When to Adjust Fillfactor

| Workload | Recommended Fillfactor | Reason | |----------|----------------------|--------| | Read-only, bulk-loaded | 100 | No updates expected | | Standard OLTP (default) | 90 | Balance reads and writes | | Write-heavy (updates) | 70-80 | Reduce page splits | | HOT updates via fillfactor | 50-60 | Maximize HOT update ratio |

Heap-Only Tuples (HOT) optimization occurs when updated row versions fit in the same page. Lower fillfactor means more page space for HOT updates:

-- Increase HOT update ratio with lower fillfactor

CREATE TABLE frequently_updated (

id INTEGER PRIMARY KEY,

status TEXT,

counter INTEGER

) WITH (fillfactor = 70);

Monitor HOT update ratio:

SELECT relname, n_tup_upd, n_tup_hot_upd,

ROUND(n_tup_hot_upd * 100.0 / NULLIF(n_tup_upd, 0), 2) AS hot_pct

FROM pg_stat_user_tables

WHERE n_tup_upd > 0

ORDER BY hot_pct;

A low HOT ratio (below 50%) indicates many index-only updates that cause index bloat. Lowering fillfactor or adding INCLUDE columns to avoid index updates can help.

Automated Maintenance

Autovacuum Configuration

Autovacuum manages both table and index cleanup:

# postgresql.conf

autovacuum = on

autovacuum_naptime = 1min

autovacuum_vacuum_scale_factor = 0.01

autovacuum_vacuum_threshold = 1000

autovacuum_vacuum_cost_limit = 1000

autovacuum_vacuum_cost_delay = 5ms

For write-heavy tables, configure per-table autovacuum:

ALTER TABLE orders SET (

autovacuum_vacuum_scale_factor = 0.005,

autovacuum_vacuum_threshold = 1000,

autovacuum_vacuum_cost_limit = 2000

);

Scheduled REINDEX

For tables with regular bloat patterns, schedule REINDEX:

-- Using pg_cron extension

SELECT cron.schedule('reindex-orders', '0 3 * * 0',

'REINDEX INDEX CONCURRENTLY idx_orders_user_id'

);

Monitoring Index Health

-- Index size trends

SELECT

indexrelid::regclass,

pg_size_pretty(pg_relation_size(indexrelid)) AS current_size,

pg_size_pretty(pg_relation_size(indexrelid) -

pg_indexes_size(indexrelid::regclass::text::regclass)) AS estimated_bloat

FROM pg_stat_user_indexes;

-- Top 10 largest indexes

SELECT

indexrelid::regclass,

pg_size_pretty(sum(pg_relation_size(indexrelid))) AS total_size

FROM pg_stat_user_indexes

GROUP BY indexrelid

ORDER BY sum(pg_relation_size(indexrelid)) DESC

LIMIT 10;

-- Index scan frequency vs bloat

SELECT

indexrelid::regclass,

idx_scan,

idx_tup_read,

idx_tup_fetch,

pg_size_pretty(pg_relation_size(indexrelid))

FROM pg_stat_user_indexes

WHERE idx_scan = 0

ORDER BY pg_relation_size(indexrelid) DESC;

Best Practices

* **Set up bloat monitoring** with thresholds (e.g., alert when avg_leaf_density < 60%).

2\. **Use CONCURRENTLY** for all production REINDEX operations. 3\. **Tune fillfactor** per table based on update frequency. 4\. **Schedule index maintenance** during low-traffic periods using pg_cron. 5\. **Drop unused indexes** identified by `pg_stat_user_indexes` with `idx_scan = 0`. 6\. **Consider BRIN indexes** for append-only tables to avoid B-tree bloat entirely.

Index maintenance is not a one-time activity. Build monitoring and automation into your database operations, and check index health as part of your regular performance review cycle.

Index Maintenance: Bloat, Rebuild, Reindex, and Fillfactor Tuning

Index Maintenance: Bloat, Rebuild, Reindex, and Fillfactor Tuning

Related Articles