Database Connection Management: Pooling, PgBouncer, HikariCP, and Tuning


Database Connection Management: Pooling, PgBouncer, HikariCP, and Tuning

Every database connection consumes memory, file descriptors, and CPU. Connection management directly impacts database performance and application scalability. This article covers pooling strategies, popular poolers, and tuning guidelines.

Why Connection Pooling Matters

Each PostgreSQL backend process consumes approximately 5-10 MB of RAM, even when idle. A server with 500 connections uses 2.5-5 GB just for connection overhead. Beyond memory, connection creation is expensive: a new connection requires a TCP handshake, SSL negotiation, authentication, and backend forking.

The "Too Many Connections" Problem




-- Check current connections


SELECT count(*), state FROM pg_stat_activity GROUP BY state;




-- Find connections by application


SELECT application_name, count(*), sum(waiting) as waiting


FROM pg_stat_activity


GROUP BY application_name;





Connection pooling maintains a persistent set of database connections that applications borrow and return, avoiding the overhead of establishing connections on every request.

PgBouncer (Server-Side Pooling)

PgBouncer is a lightweight connection pooler for PostgreSQL. It runs as a separate process and proxies connections to the database.

Installation and Configuration




# pgbouncer.ini


[databases]


mydb = host=127.0.0.1 port=5432 dbname=mydb




[pgbouncer]


listen_addr = 0.0.0.0


listen_port = 6432


auth_type = md5


auth_file = /etc/pgbouncer/userlist.txt


pool_mode = transaction


max_client_conn = 500


default_pool_size = 25


reserve_pool_size = 5


reserve_pool_timeout = 3


server_idle_timeout = 300


query_timeout = 30





Pooling Modes

| Mode | Description | Use Case | |------|-------------|----------| | `session` | Connection assigned for entire client session | Legacy apps, long-lived transactions | | `transaction` | Connection assigned per transaction | Default for web applications | | `statement` | Connection assigned per statement | Rare; only when no session state needed |

User List




echo '"app_user"' '"secure_password_hash"' > /etc/pgbouncer/userlist.txt





Connecting Through PgBouncer




import psycopg2




# Connect to PgBouncer port, not PostgreSQL directly


conn = psycopg2.connect(


host="localhost",


port=6432,


dbname="mydb",


user="app_user",


password="secure_password"


)





Monitoring PgBouncer




-- Connect to PgBouncer's admin database


psql -p 6432 -U admin pgbouncer




-- Show pool statistics


SHOW POOLS;


SHOW STATS;


SHOW CLIENTS;


SHOW SERVERS;





Key metrics: `clients_active`, `clients_waiting`, `servers_active`, `servers_idle`, `maxwait`.

HikariCP (Client-Side Pooling)

HikariCP is the most popular connection pool for Java applications. It manages connections from within the application process.

Spring Boot Configuration




# application.yml


spring:


datasource:


url: jdbc:postgresql://localhost:5432/mydb


username: app_user


password: secure_password


hikari:


maximum-pool-size: 20


minimum-idle: 5


idle-timeout: 300000


connection-timeout: 10000


max-lifetime: 1800000


pool-name: MyPool


connection-test-query: SELECT 1





HikariCP Validation




// Programmatic configuration


HikariConfig config = new HikariConfig();


config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb");


config.setUsername("app_user");


config.setPassword("secure_password");


config.setMaximumPoolSize(20);


config.setMinimumIdle(5);


config.setConnectionTimeout(10000);


config.setIdleTimeout(300000);


config.setMaxLifetime(1800000);


config.setConnectionTestQuery("SELECT 1");


config.addDataSourceProperty("cachePrepStmts", "true");


config.addDataSourceProperty("prepStmtCacheSize", "250");


config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");




HikariDataSource ds = new HikariDataSource(config);





Pool Sizing Formula

The conventional formula for `max_connections` in PostgreSQL:




max_connections = (max_client_conn / default_pool_size) * 2 + superuser_reserved_connections





But the **Formula rule of thumb** for optimal throughput is:




connections = 2 * CPU_cores + effective_spindle_count





A 4-core machine with SSDs: `2 * 4 + 1 = 9` concurrent connections for optimal throughput. Beyond this, context switching and lock contention degrade performance.

Max Connections Tuning




# postgresql.conf


max_connections = 200


superuser_reserved_connections = 5





Application Statement Timeout




-- Prevent runaway queries from holding connections


ALTER DATABASE mydb SET statement_timeout = '30s';


ALTER DATABASE mydb SET idle_in_transaction_session_timeout = '5min';





Common Pitfalls

Connection Leaks




# Bad: connection not returned to pool


def get_user(user_id):


conn = pool.getconn()


cursor = conn.cursor()


cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))


result = cursor.fetchone()


# Missing: pool.putconn(conn)


return result




# Good: always use try/finally or context manager


def get_user(user_id):


conn = pool.getconn()


try:


cursor = conn.cursor()


cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))


return cursor.fetchone()


finally:


pool.putconn(conn)





Long-Running Queries in Pool




-- Set statement timeout at database level


ALTER DATABASE mydb SET statement_timeout = '30000'; -- 30 seconds




-- Kill queries exceeding threshold


SELECT pg_terminate_backend(pid)


FROM pg_stat_activity


WHERE state = 'active'


AND now() - query_start > interval '5 minutes';





Connection Starvation

When all pool connections are busy and requests queue:




spring.datasource.hikari:


maximum-pool-size: 20


connection-timeout: 5000 # 5 seconds max wait


# After 5 seconds, throw SQLException instead of hanging forever





Monitoring Connection Health




-- Connection utilization


SELECT


count(*) AS total_connections,


count(*) FILTER (WHERE state = 'active') AS active,


count(*) FILTER (WHERE state = 'idle') AS idle,


count(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_txn


FROM pg_stat_activity;




-- Waiting queries


SELECT count(*) AS waiting_count


FROM pg_stat_activity


WHERE wait_event IS NOT NULL AND state = 'active';




-- Connection age


SELECT pid, now() - backend_start AS connection_age, state, query


FROM pg_stat_activity


ORDER BY connection_age DESC


LIMIT 10;





Best Practices


* **Always use a pooler**: Direct connections from every application instance are not sustainable.

2\. **Set pool size to CPU*2+disk**: More connections than that add latency, not throughput. 3\. **Use PgBouncer in transaction mode**: Best balance for web applications. 4\. **Set connection timeouts**: Prevent applications from hanging indefinitely. 5\. **Monitor and alert** on connection utilization trends. 6\. **Use prepared statement caching**: Reduces query planning overhead. 7\. **Separate pools for OLTP and analytics**: Different pool sizes and timeout values for different workloads.

Connection management is invisible when done correctly and catastrophic when done poorly. A well-tuned pool keeps your database responsive under load and your applications free from connection-related failures.