2024-09-05
Master SQL optimization techniques, indexing strategies, and query performance tuning for scalable applications.
Database performance is critical for application scalability. This guide covers SQL optimization techniques, indexing strategies, query analysis, and best practices for maintaining high-performance databases.
Common query optimization techniques:
-- Inefficient: SELECT * with unnecessary columns
SELECT * FROM users WHERE status = 'active';
-- Optimized: Select only needed columns
SELECT id, name, email FROM users WHERE status = 'active';
-- Inefficient: Using functions in WHERE clause
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- Optimized: Use date range instead
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- Inefficient: NOT IN with subquery
SELECT * FROM products WHERE id NOT IN (SELECT product_id FROM orders);
-- Optimized: Use LEFT JOIN
SELECT p.* FROM products p
LEFT JOIN orders o ON p.id = o.product_id
WHERE o.product_id IS NULL;
-- B-Tree Index (default)
CREATE INDEX idx_users_email ON users(email);
-- Composite Index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Unique Index
CREATE UNIQUE INDEX idx_users_username ON users(username);
-- Partial Index (PostgreSQL)
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';
-- Full-text Index (MySQL)
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);
✓ Index columns used in WHERE, JOIN, and ORDER BY clauses
✓ Consider composite indexes for queries with multiple conditions
✓ Place most selective columns first in composite indexes
✗ Avoid over-indexing (slows down writes)
✗ Don't index low-cardinality columns (like boolean fields)
✗ Avoid indexing frequently updated columns
Analyzing query performance with EXPLAIN:
-- MySQL EXPLAIN
EXPLAIN SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;
-- PostgreSQL EXPLAIN ANALYZE
EXPLAIN ANALYZE SELECT * FROM large_table WHERE status = 'active';
-- Look for:
-- • Table scans (might need index)
-- • High row counts (filter earlier)
-- • Sorting operations (consider index)
-- • Nested loops with large datasets
INNER JOIN
Fastest, returns matching rows only
LEFT/RIGHT JOIN
Slower, includes unmatched rows
CROSS JOIN
Slowest, Cartesian product
Semi-join (EXISTS)
Often faster than IN for large datasets
• Join on indexed columns
• Filter data before joining
• Join smaller tables first
• Use EXISTS instead of IN for subqueries
• Consider denormalization for read-heavy queries
-- Update statistics (PostgreSQL)
ANALYZE table_name;
-- Rebuild indexes (MySQL)
ALTER TABLE table_name ENGINE=InnoDB;
-- Vacuum (PostgreSQL)
VACUUM ANALYZE table_name;
-- Check for missing indexes
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE n_distinct > 100 AND correlation < 0.1
ORDER BY n_distinct DESC;
Slow Query Log
Enable slow query logging to identify performance bottlenecks over time.
Connection Pooling
Implement connection pooling to reduce connection overhead and improve response times.
Caching Strategy
Use query result caching for frequently accessed, rarely changing data.
Published on 2024-09-05 • Category: Database
← Back to BlogFree online developer tools and utilities for encoding, formatting, generating, and analyzing data. No registration required - all tools work directly in your browser.
Built for developers, by developers. Privacy-focused and open source.
Free online tools for Base64 encoding, JSON formatting, URL encoding, hash generation, UUID creation, QR codes, JWT decoding, timestamp conversion, regex testing, and more.
© 2024 NarvikHub. All rights reserved.