Home / Articles / Optimizing PostgreSQL for High-Traffic Applications
Optimizing PostgreSQL for High-Traffic Applications
Database
Written by: Anant
Software Engineer | Systems & Web
PostgreSQL is a powerful relational database, but it requires proper optimization to handle high-traffic production workloads. This guide covers essential techniques to maximize performance.
Understanding Query Performance
Before optimizing, you need to understand what's slow. PostgreSQL provides excellent tools for this.
Using EXPLAIN ANALYZE
EXPLAIN ANALYZESELECT u.name, COUNT(o.id) as order_countFROM users uLEFT JOIN orders o ON u.id = o.user_idWHERE u.created_at > '2024-01-01'GROUP BY u.id, u.nameORDER BY order_count DESCLIMIT 10;- Key metrics to watch:
- Execution Time: Total time to run the query
- Planning Time: Time spent planning the query
- Seq Scan: Sequential scans (usually bad for large tables)
- Index Scan: Using an index (usually good)
Indexing Strategies
Indexes are crucial for query performance.
B-Tree Indexes (Default)
CREATE INDEX idx_users_email ON users(email);CREATE INDEX idx_orders_user_id ON orders(user_id);CREATE INDEX idx_orders_created_at ON orders(created_at);Composite Indexes
CREATE INDEX idx_orders_user_status ON orders(user_id, status);Partial Indexes
CREATE INDEX idx_active_users ON users(email) WHERE active = true;Index Maintenance
-- Rebuild indexesREINDEX TABLE users;-- Remove unused indexesDROP INDEX IF EXISTS idx_old_unused;Query Optimization Techniques
1. Avoid SELECT *
-- BadSELECT * FROM users WHERE id = 1;-- GoodSELECT id, name, email FROM users WHERE id = 1;2. Use LIMIT for Large Result Sets
SELECT id, name FROM users ORDER BY created_at DESC LIMIT 100;3. Optimize JOINs
-- Ensure foreign keys are indexedCREATE INDEX idx_orders_user_id ON orders(user_id);-- Use INNER JOIN when possible (faster than LEFT JOIN)SELECT u.name, o.totalFROM users uINNER JOIN orders o ON u.id = o.user_id;4. Avoid N+1 Queries
// Bad: N+1 queriesconst users = await db.query('SELECT * FROM users');for (const user of users) { const orders = await db.query('SELECT * FROM orders WHERE user_id = $1', [user.id]);}// Good: Single query with JOINconst result = await db.query(` SELECT u.*, json_agg(o.*) as orders FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.id`);Connection Pooling
Never create a new connection for each query.
const { Pool } = require('pg');const pool = new Pool({ host: 'localhost', database: 'myapp', max: 20, // Maximum number of connections idleTimeoutMillis: 30000, connectionTimeoutMillis: 2000,});// Use the poolasync function getUser(id) { const result = await pool.query('SELECT * FROM users WHERE id = $1', [id]); return result.rows[0];}Configuration Tuning
Edit postgresql.conf for better performance:
# Memory Settingsshared_buffers = 4GB # 25% of total RAMeffective_cache_size = 12GB # 75% of total RAMwork_mem = 64MB # Per operationmaintenance_work_mem = 1GB # For VACUUM, CREATE INDEX# Checkpoint Settingscheckpoint_completion_target = 0.9wal_buffers = 16MBmax_wal_size = 4GB# Query Planningrandom_page_cost = 1.1 # For SSDseffective_io_concurrency = 200 # For SSDs# Connectionsmax_connections = 200Caching Strategies
1. Application-Level Caching
const Redis = require('ioredis');const redis = new Redis();async function getUser(id) { // Try cache first const cached = await redis.get(`user:${id}`); if (cached) return JSON.parse(cached); // Query database const user = await pool.query('SELECT * FROM users WHERE id = $1', [id]); // Cache for 1 hour await redis.setex(`user:${id}`, 3600, JSON.stringify(user.rows[0])); return user.rows[0];}2. Materialized Views
CREATE MATERIALIZED VIEW user_order_stats ASSELECT u.id, u.name, COUNT(o.id) as total_orders, SUM(o.total) as total_spentFROM users uLEFT JOIN orders o ON u.id = o.user_idGROUP BY u.id, u.name;-- Refresh periodicallyREFRESH MATERIALIZED VIEW user_order_stats;Monitoring and Maintenance
Regular VACUUM
-- Automatic vacuum (enabled by default)-- Manual vacuum for specific tablesVACUUM ANALYZE users;VACUUM ANALYZE orders;Monitor Slow Queries
-- Enable slow query logging in postgresql.conflog_min_duration_statement = 1000 # Log queries taking > 1 second-- View slow queriesSELECT query, calls, total_time, mean_timeFROM pg_stat_statementsORDER BY mean_time DESCLIMIT 10;Check Index Usage
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetchFROM pg_stat_user_indexesWHERE idx_scan = 0ORDER BY schemaname, tablename;Partitioning Large Tables
For tables with millions of rows:
-- Create partitioned tableCREATE TABLE orders ( id SERIAL, user_id INTEGER, created_at TIMESTAMP, total DECIMAL) PARTITION BY RANGE (created_at);-- Create partitionsCREATE TABLE orders_2024_q1 PARTITION OF orders FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');CREATE TABLE orders_2024_q2 PARTITION OF orders FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');Final Takeaway
PostgreSQL optimization is an ongoing process. Monitor your queries, analyze slow operations, and continuously refine your indexes and configuration. With these techniques, you can handle millions of requests efficiently.