Database Capacity Planning: Sizing, Growth Forecasting, and Scaling

Capacity planning ensures your database has enough resources to handle current and future workloads without over-provisioning. It is a continuous process that combines monitoring, forecasting, and proactive scaling.

Key Capacity Metrics

Storage

Storage is the most predictable resource to plan. Track:




```sql
-- Database sizes
SELECT datname,
       pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Table sizes (top 10)
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       pg_size_pretty(pg_relation_size(relid)) AS table_size,
       pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;

-- Growth by day
SELECT date(created_at) AS day,
       count(*) AS rows_added,
       count(*) * 200 AS estimated_bytes  -- assumes ~200 bytes per row
FROM orders
WHERE created_at > now() - interval '30 days'
GROUP BY day
ORDER BY day;
```





Compute (CPU)

CPU usage correlates with query complexity and concurrency:




```sql
-- Queries with the highest total execution time (a good proxy for CPU)
SELECT queryid, query,
       total_exec_time,
       calls,
       mean_exec_time,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
```





In addition to slow queries, monitor overall CPU utilization and the CPU consumed by replication and autovacuum workers.
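
To see where concurrent activity is coming from, a breakdown by backend type separates autovacuum workers and WAL senders from client backends. A minimal sketch; the connection string is an assumption, and this counts activity rather than measuring CPU directly (use OS or cloud metrics for actual CPU):

```python
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # assumed connection string
cur = conn.cursor()

# Count backends per type and how many are currently active.
# Autovacuum workers and WAL senders appear as their own backend types.
cur.execute("""
    SELECT backend_type,
           count(*) AS backends,
           count(*) FILTER (WHERE state = 'active') AS active
    FROM pg_stat_activity
    GROUP BY backend_type
    ORDER BY backends DESC
""")
for backend_type, backends, active in cur.fetchall():
    print(f"{backend_type:<30} total={backends:>3} active={active:>3}")
```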

Memory




```sql
-- Memory-related settings
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size',
               'work_mem', 'maintenance_work_mem');

-- Buffer cache hit ratio (should be >99%)
SELECT 'shared_buffers' AS area,
       sum(blks_hit)::float / nullif(sum(blks_hit) + sum(blks_read), 0) AS hit_ratio
FROM pg_stat_database;
```





Connections




```sql
SELECT max_conn.setting AS max_connections,
       used_conn.count AS used_connections,
       round(100.0 * used_conn.count / max_conn.setting::int, 1) AS utilization_pct
FROM (SELECT setting FROM pg_settings WHERE name = 'max_connections') max_conn,
     (SELECT count(*) AS count FROM pg_stat_activity) used_conn;
```





Growth Forecasting

Simple Linear Model




```python
import psycopg2
import numpy as np

conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()

# Get daily row counts for the last 90 days
cur.execute("""
    SELECT date(created_at) AS day, count(*) AS rows
    FROM orders
    WHERE created_at > now() - interval '90 days'
    GROUP BY day
    ORDER BY day
""")
data = cur.fetchall()

days = np.array([(row[0] - data[0][0]).days for row in data])
rows = np.array([row[1] for row in data])

# Linear regression on the cumulative row count:
# the slope of the fitted line is the average rows added per day.
cumulative = np.cumsum(rows)
coefficients = np.polyfit(days, cumulative, 1)
daily_growth = coefficients[0]

# Forecast: 90 days out
forecast_days = 90
current_total = int(rows.sum())
forecast_total = current_total + daily_growth * forecast_days

print(f"Daily growth: {daily_growth:.0f} rows")
print(f"Rows added in the last 90 days: {current_total}")
print(f"Forecast in 90 days: {forecast_total:.0f} rows")
```





Projecting Storage




```python
# Continues from the forecast above: daily_growth is rows added per day
avg_row_size_bytes = 250  # measured with pgstattuple
bytes_per_day = daily_growth * avg_row_size_bytes
gb_per_month = bytes_per_day * 30 / (1024**3)

current_gb = 10          # current database size in GB
storage_limit_gb = 50    # assumed storage limit
months_to_full = (storage_limit_gb - current_gb) / gb_per_month

print(f"Growth: {gb_per_month:.1f} GB/month")
print(f"Time to {storage_limit_gb} GB: {months_to_full:.0f} months")
```





Scaling Strategies

Vertical Scaling (Scale Up)

Increase the size of the existing database instance:




```bash
# AWS RDS
aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --db-instance-class db.r6g.xlarge \
    --apply-immediately
```

Or via Terraform:

```hcl
resource "aws_db_instance" "mydb" {
  instance_class    = "db.r6g.xlarge"   # was db.r6g.large
  allocated_storage = 200               # was 100
}
```





**Pros**: simple, and no application changes are required. **Cons**: instance size ceilings, downtime for some changes, and no path to unlimited scale.

Horizontal Scaling (Scale Out)

Add read replicas for read workloads:




```bash
# Add read replica
aws rds create-db-instance-read-replica \
    --db-instance-identifier mydb-replica-1 \
    --source-db-instance-identifier mydb \
    --db-instance-class db.r6g.large
```





For write scaling, consider sharding (Citus):




```sql
-- Citus distributed table
SELECT create_distributed_table('orders', 'user_id');

-- Queries that filter on the distribution key route to a single shard
SELECT * FROM orders WHERE user_id = 42;
```





Autoscaling Configuration

Aurora Serverless




```yaml
# Automatic scaling based on load
Resources:
  MyDBCluster:
    Type: AWS::RDS::DBCluster
    Properties:
      Engine: aurora-postgresql
      ServerlessV2ScalingConfiguration:
        MinCapacity: 0.5
        MaxCapacity: 64
```





Storage Autoscaling




```bash
# Enable storage autoscaling on RDS
aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --max-allocated-storage 1000 \
    --apply-immediately
```





Monitoring Dashboard

Build a capacity monitoring dashboard with these key metrics:

| Metric | Alert Threshold | Action |
|--------|-----------------|--------|
| Storage utilization | >80% | Increase storage or archive data |
| CPU utilization | >80% for 5 minutes | Scale up or optimize queries |
| Connection utilization | >80% | Increase max_connections or add a pooler |
| Replication lag | >60 seconds | Investigate replica or network |
| IOPS utilization | >80% of provisioned | Increase provisioned IOPS |
| Cache hit ratio | <99% | Increase shared_buffers |
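
One way to turn this table into automated alerts is a small threshold check over whatever metrics you already collect. A minimal sketch; the metric names and example values are illustrative assumptions, so wire them to your own monitoring source (CloudWatch, Prometheus, or the pg_stat_* queries above):

```python
# Thresholds mirror the dashboard table above; utilization values are fractions.
THRESHOLDS = {
    "storage_utilization": 0.80,
    "cpu_utilization": 0.80,         # sustained for 5 minutes
    "connection_utilization": 0.80,
    "replication_lag_seconds": 60,
    "iops_utilization": 0.80,        # of provisioned IOPS
}

def check_capacity(metrics: dict) -> list:
    """Return alert messages for any metric that breaches its threshold."""
    alerts = []
    for name, limit in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            alerts.append(f"{name}={value} exceeds {limit}")
    # Cache hit ratio alerts in the opposite direction: lower is worse.
    hit_ratio = metrics.get("cache_hit_ratio")
    if hit_ratio is not None and hit_ratio < 0.99:
        alerts.append(f"cache_hit_ratio={hit_ratio} below 0.99")
    return alerts

# Example with made-up numbers: storage is over the line, cache is healthy.
print(check_capacity({"storage_utilization": 0.86, "cache_hit_ratio": 0.995}))
```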

Seasonal Capacity

Many workloads are not uniform. Plan for seasonal peaks:




```yaml
# Example: Black Friday capacity plan
# Normal:       4 vCPU, 16 GB RAM, 1000 connections
# Black Friday: 16 vCPU, 64 GB RAM, 5000 connections

# Pre-scale 2 weeks before
- date: 2026-11-10
  action: scale_up
  target: db.r6g.4xlarge

# Post-scale 1 week after
- date: 2026-12-07
  action: scale_down
  target: db.r6g.xlarge
```





The Planning Cycle


1. **Monitor**: Continuously collect metrics.
2. **Analyze**: Identify trends and anomalies.
3. **Forecast**: Project future resource needs.
4. **Plan**: Schedule scaling or optimization.
5. **Execute**: Apply changes during maintenance windows.
6. **Review**: Validate that changes had the expected effect.

Capacity planning is not a one-time exercise. Review your database capacity monthly, adjust forecasts quarterly, and always maintain headroom for unexpected traffic spikes. Over-provision by 20-30% for production databases to absorb traffic surges without degradation.
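
As a quick worked example of that headroom rule (the numbers are illustrative, not a recommendation for any particular workload):

```python
# Illustrative sizing: forecast peak demand plus 20-30% headroom.
forecast_peak_vcpus = 10      # from your growth forecast
headroom = 0.25               # middle of the 20-30% range
required_vcpus = forecast_peak_vcpus * (1 + headroom)
print(f"Provision at least {required_vcpus:.1f} vCPUs")  # -> 12.5
```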