Database Scalability


Scalability Options

Database scalability options range from simple to complex. Start with the simplest approach and evolve.

Vertical Scaling

Upgrade to a larger server with more CPU, RAM, and storage.




    # AWS RDS instance upgrade
    resource "aws_db_instance" "main" {
      instance_class    = "db.r6g.8xlarge" # 32 vCPU, 256GB RAM
      allocated_storage = 5000             # 5TB SSD
    }





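Vertical scaling is the simplest option, but it has a ceiling: cost climbs steeply at the top instance sizes, and no single server can grow past the largest hardware available.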

Read Replicas

Offload read traffic to replicas:




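    import random

    class DatabaseRouter:
        """Send writes to the primary and spread reads across replicas."""

        def __init__(self, primary, replicas):
            self.primary = primary
            self.replicas = replicas

        def get_conn(self, write=False):
            if write:
                return self.primary
            # Fall back to the primary if no replicas are configured
            return random.choice(self.replicas) if self.replicas else self.primary

    # Route reads to replicas, writes to primary
    db_router = DatabaseRouter(primary, replicas)  # existing connection objects
    db_router.get_conn(write=True).execute("INSERT INTO ...")
    results = db_router.get_conn(write=False).execute("SELECT ...")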





Read replicas are effective for read-heavy workloads. They do not help with write scaling, because every write still goes to the single primary.
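
One practical wrinkle with replica routing is replication lag: under asynchronous replication, a replica can briefly return data older than what was just written. The sketch below shows one possible way to handle that, and it is not part of the router above. It pins a session's reads to the primary for a short window after that session writes. The class name `StickyRouter`, the `pin_seconds` parameter, and the session IDs are all hypothetical, and the connections are plain strings so the example runs on its own.

```python
import random
import time


class StickyRouter:
    """Route reads to replicas, except shortly after a session has written."""

    def __init__(self, primary, replicas, pin_seconds=2.0, clock=time.monotonic):
        self.primary = primary
        self.replicas = list(replicas)
        self.pin_seconds = pin_seconds  # assumed upper bound on replica lag
        self.clock = clock              # injectable so the behavior is testable
        self._last_write = {}           # session_id -> time of last write

    def get_conn(self, session_id, write=False):
        now = self.clock()
        if write:
            self._last_write[session_id] = now
            return self.primary
        last = self._last_write.get(session_id)
        if last is not None and now - last < self.pin_seconds:
            return self.primary  # replicas may not have this session's write yet
        if not self.replicas:
            return self.primary  # nothing to offload to
        return random.choice(self.replicas)


if __name__ == "__main__":
    router = StickyRouter("primary", ["replica-1", "replica-2"])
    print(router.get_conn("alice", write=True))  # primary
    print(router.get_conn("alice"))              # primary (inside the pin window)
    print(router.get_conn("bob"))                # one of the replicas
```

The pin window trades some read offloading for consistency. A production version would also need to expire old entries in `_last_write`, and the right window depends on the lag you actually measure.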

Caching

Reduce database load with in-memory caching:




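    import json

    def get_user(user_id):
        key = f"user:{user_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: decode the stored JSON
        # Cache miss: read from the database, then cache for one hour
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)
        cache.setex(key, 3600, json.dumps(user))
        return user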
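
The read path above is the cache-aside pattern. A runnable sketch of the full cycle, including invalidation on write so that a cached row does not outlive an update, might look like the following. `TTLCache` is a hypothetical in-process stand-in for Redis or Memcached, and the `USERS` dictionary stands in for the database; neither comes from the original example.

```python
import json
import time


class TTLCache:
    """Tiny in-process stand-in for a cache server (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}    # key -> (expires_at, value)
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (self._clock() + ttl_seconds, value)

    def delete(self, key):
        self._data.pop(key, None)


USERS = {1: {"id": 1, "name": "Ada"}}  # stands in for the users table
cache = TTLCache()
stats = {"db_reads": 0}                # counts how often the "database" is hit


def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)      # hit: decode the stored JSON
    stats["db_reads"] += 1
    user = USERS.get(user_id)          # miss: read from the "database"
    if user is not None:
        cache.setex(key, 3600, json.dumps(user))
    return user


def update_user_name(user_id, name):
    USERS[user_id]["name"] = name      # write to the "database" first
    cache.delete(f"user:{user_id}")    # then drop the stale cache entry


if __name__ == "__main__":
    get_user(1)
    get_user(1)
    print(stats["db_reads"])           # 1: the second read came from cache
    update_user_name(1, "Grace")
    print(get_user(1)["name"])         # Grace
```

Deleting the key on write, rather than overwriting it, keeps the write path simple: the next read repopulates the cache from the database.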





Horizontal Scaling (Sharding)

Distribute data across multiple database servers:




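    import zlib

    class ShardManager:
        def __init__(self, shards):
            self.shards = shards

        def get_shard(self, customer_id):
            # Python's built-in hash() is salted per process for strings,
            # so use a stable checksum to get the same shard everywhere
            key = str(customer_id).encode("utf-8")
            return self.shards[zlib.crc32(key) % len(self.shards)]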





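Sharding is the most complex option, because cross-shard queries, transactions, and rebalancing all become your problem. Prefer tools that handle this for you, such as Vitess (MySQL), Citus (PostgreSQL), or CockroachDB.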
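
One source of that complexity is resharding. With modulo-based routing like the `ShardManager` above, changing the number of shards changes the target shard for most keys, so their data has to be moved. The sketch below measures that effect using a stable hash. The function names `shard_index` and `fraction_moved` and the synthetic customer IDs are illustrative assumptions, not part of the original code.

```python
import hashlib


def shard_index(customer_id, num_shards):
    """Map a customer ID to a shard number using a process-independent hash."""
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_index(k, old_shards) != shard_index(k, new_shards)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"customer-{i}" for i in range(10_000)]
    # Going from 4 to 5 shards relocates most keys under modulo routing
    print(f"{fraction_moved(keys, 4, 5):.0%} of keys move")
```

With a uniform hash, roughly four out of five keys change shards when going from 4 to 5, which is why schemes such as consistent hashing or a lookup directory are commonly used to limit data movement.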

Scaling Decision Tree




    Is DB overloaded?
    ├── Read-heavy? → Add read replicas
    ├── Write-heavy?
    │   ├── Can you cache? → Add Redis/memcached
    │   └── Cache insufficient? → Shard
    └── Both? → Scale vertically first, then shard
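
The same tree can be written as a small function, which makes the order of the checks explicit. This is only one reading of the diagram: the function name `recommend_strategy` and its boolean parameters are hypothetical, the final fallback message is an added assumption for the case where the database is not overloaded, and in practice the inputs would come from monitoring rather than hand-set flags.

```python
def recommend_strategy(read_heavy, write_heavy, cacheable=False):
    """Return a scaling suggestion following the decision tree above."""
    if read_heavy and write_heavy:
        return "Scale vertically first, then shard"
    if read_heavy:
        return "Add read replicas"
    if write_heavy:
        if cacheable:
            return "Add Redis/memcached"
        return "Shard"
    # Assumption: neither flag set means the database is not overloaded
    return "No scaling change needed; keep monitoring"


if __name__ == "__main__":
    print(recommend_strategy(read_heavy=True, write_heavy=False))
    # Add read replicas
    print(recommend_strategy(read_heavy=False, write_heavy=True, cacheable=True))
    # Add Redis/memcached
    print(recommend_strategy(read_heavy=True, write_heavy=True))
    # Scale vertically first, then shard
```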





Conclusion

Scale vertically first, because it is the simplest change. Add read replicas for read-heavy loads and caching for repeated queries. Shard only when necessary; most applications never need it. Whichever path you take, measure where the bottleneck is before choosing a strategy.