The Select Statement
The SELECT statement is the most fundamental tool in SQL. It is how we ask the database for information. However, there is a major difference between how we write a query and how the database executes it.
Definition: The “Big 6” Clauses
Section titled “Definition: The “Big 6” Clauses”Almost every retrieval query in SQL is built using these six clauses, and they must be written in this specific order:
SELECT- Which columns do you want?FROM- Which table(s) are they in?WHERE- Which rows meet your criteria?GROUP BY- How should rows be aggregated?HAVING- Which groups meet your criteria?ORDER BY- How should the results be sorted?
Context: Syntax vs. Logic
Section titled “Context: Syntax vs. Logic”If you were explaining a task to a human, you might say: “Go to the kitchen (FROM), find the cookies (WHERE), and bring me two (SELECT).”
SQL syntax puts SELECT first for readability, but logically, the database cannot “select” something until it knows where it is coming from.
Detailed Explanation: Logical Order of Execution
Section titled “Detailed Explanation: Logical Order of Execution”Understanding the order in which the database processes your query is the “Superpower” of SQL experts. It explains why certain aliases work in some places but fail in others.
The logical order is:
- FROM: The database first identifies the source table.
- WHERE: It filters the raw rows based on your conditions.
- GROUP BY: It groups the remaining rows.
- HAVING: It filters the groups.
- SELECT: It finally picks the columns (and calculates any expressions or aliases).
- ORDER BY: It sorts the final result set.
- LIMIT / OFFSET: It discards rows outside the requested range.
Example: A Standard Query
Section titled “Example: A Standard Query”Imagine a Products table:
| ProductID | Name | Category | Price |
|---|---|---|---|
| 1 | Laptop | Electronics | 1200 |
| 2 | Mouse | Electronics | 25 |
| 3 | Desk | Furniture | 350 |
| 4 | Phone | Electronics | 800 |
We want to find the names of Electronics that cost more than $100.
SELECT Name, PriceFROM ProductsWHERE Category = 'Electronics' AND Price > 100;Output
Section titled “Output”| Name | Price |
|---|---|
| Laptop | 1200 |
| Phone | 800 |
Under the Hood: Projection and Selection
Section titled “Under the Hood: Projection and Selection”In relational algebra:
- Selection ($\sigma$): This is the
WHEREclause. It selects specific rows. - Projection ($\pi$): This is the
SELECTclause. It selects specific columns (attributes).
When you run SELECT Name FROM Products WHERE ID = 1, the engine:
- Opens the
Productstable (FROM). - Finds the row where
ID = 1(Selection). - Discards all columns except
Name(Projection). - Returns the result.
The SELECT * Performance Hit
Section titled “The SELECT * Performance Hit”While SELECT * (select all columns) is convenient for exploration, it is generally discouraged in production code.
- Network Traffic: Sending unneeded data over the network wastes bandwidth.
- Memory/I/O: The database must read more data from the disk and store it in the buffer cache.
- Fragility: If a column is added or removed from the table, your application might break if it expects a specific number of columns.