Skip to content

Execution Plans

When you send a SQL query to the database, you are telling it what you want. The Query Optimizer then decides how to get it. The resulting “map” of how the database will execute your query is called the Execution Plan.

An Execution Plan is a series of steps (or operators) that the database engine follows to retrieve the requested data. It shows which indexes will be used, how tables will be joined, and the estimated “cost” of each step.

To see the plan, you prefix your query with the EXPLAIN keyword (or use a graphical tool like SQL Server Management Studio’s “Execution Plan” tab).

Using EXPLAIN
EXPLAIN SELECT * FROM Users WHERE ID = 50;

Detailed Explanation: Key Operators to Watch

Section titled “Detailed Explanation: Key Operators to Watch”

When reading a plan, you are looking for specific “Red Flags” that indicate slow performance.

Red Flag! The database is reading every single row of the table from the disk. This is usually caused by a missing index or a non-SARGable query.

Good Signal. The database is using an index. A “Seek” is generally better than a “Scan.”

  • Nested Loop: Good for small sets or when joining on an index.
  • Hash Join: Good for large, unindexed sets. If the plan shows a Hash Join for a small query, it might mean an index is missing.

Imagine we have a Products table with 1 million rows.

EXPLAIN SELECT * FROM Products WHERE Price = 19.99;

  • Result: Seq Scan on Products (cost=0.00..18500.00 rows=12 width=45)
  • The “cost” is high because it must read the whole table.

After running: CREATE INDEX idx_price ON Products(Price); EXPLAIN SELECT * FROM Products WHERE Price = 19.99;

  • Result: Index Seek using idx_price on Products (cost=0.42..8.44 rows=12 width=45)
  • The cost dropped from 18,500 to 8.44!

Under the Hood: Statistics and the Cost-Based Optimizer

Section titled “Under the Hood: Statistics and the Cost-Based Optimizer”

How does the optimizer know which index is best? It uses Statistics.

The database maintains a “Metadata” layer that stores:

  • The total number of rows in each table.
  • A “Histogram” of values (e.g., “30% of users are from the USA”).
  • The “Density” of columns (how many unique values exist).

If your statistics are “Stale” (outdated), the optimizer might choose a terrible plan. Most databases have an ANALYZE or UPDATE STATISTICS command to refresh this metadata.

If your query asks for 90% of the rows in a table, the optimizer might choose a Sequential Scan even if an index exists. Why? Because jumping back and forth between the index and the table data (Random I/O) is slower than just reading the whole table in order (Sequential I/O).