Optimization Best Practices

Performance optimization is often about doing less work. By writing more precise SQL, you reduce the burden on the database’s CPU, Memory, and Disk I/O.

Definition: The “Less is More” Philosophy

Every optimization strategy boils down to one goal: Reduce the amount of data the database has to touch.

Context: Top 5 Performance Boosters

1. Stop using SELECT *

Always name your columns.

Why: SELECT * forces the database to fetch data for every column, even those you don’t need (like large BLOB or TEXT fields). It also prevents “Covering Index” optimizations.

2. Use UNION ALL instead of UNION

Why: UNION performs a hidden DISTINCT operation, which requires an expensive sort and duplicate check. If you know your data sets are unique or you don’t care about duplicates, UNION ALL is nearly instant.

3. Filter Early (The “Narrowing” Principle)

Put your most restrictive filters in the WHERE clause rather than the HAVING clause or the final SELECT.

Why: The earlier you remove rows, the less data the subsequent JOIN, GROUP BY, and ORDER BY operations have to process.

4. Avoid Wildcards at the Start of a String

Slow: LIKE '%abc'
Fast: LIKE 'abc%'
Why: Leading wildcards make a query non-SARGable (see Chapter 8.3).

5. Be Careful with OR

Why: The optimizer often struggles with OR. Sometimes, it’s faster to write two separate queries combined with a UNION ALL.

Detailed Explanation: The Covering Index

A Covering Index is the ultimate performance optimization. It occurs when an index contains every column requested by the query.

-- Query
SELECT Email FROM Users WHERE Email = 'test@test.com';

-- Index
CREATE INDEX idx_email ON Users(Email);

In this case, the database engine never touches the actual table. It finds the email in the index, sees that the query only asked for the email, and returns it directly from the index. This avoids a “Bookmark Lookup” (jumping from index to table), which is the most expensive part of a query.

Example: Before and After Optimization

The Slow Way

SELECT *
FROM Orders
WHERE UPPER(Status) = 'SHIPPED'
UNION
SELECT *
FROM Archives
WHERE UPPER(Status) = 'SHIPPED';

The Optimized Way

SELECT OrderID, CustomerID, OrderDate
FROM Orders
WHERE Status = 'Shipped'
UNION ALL
SELECT OrderID, CustomerID, OrderDate
FROM Archives
WHERE Status = 'Shipped';

What changed?

Column Selection: Removed SELECT *.
SARGability: Removed UPPER() from the WHERE clause.
Set Operator: Changed UNION to UNION ALL.

Under the Hood: The Cost of Sorting

Many SQL operations trigger a Sort:

ORDER BY
GROUP BY
DISTINCT
UNION (not UNION ALL)
JOIN (specifically Merge Joins)

Sorting is a CPU-intensive operation. If the data to be sorted is larger than the available work_mem (in Postgres) or Sort Area (in SQL Server), the database must use External Merge Sort, which writes data to the disk. This is a common cause of “Disk I/O spikes.”