Sorting and Limiting
Raw data is often unordered. To make data readable for humans or to perform “Top-N” analysis (e.g., “Top 10 Customers”), we use the ORDER BY and LIMIT clauses.
Definition: Ordering and Limiting
Section titled “Definition: Ordering and Limiting”ORDER BY: Sorts the result set by one or more columns in either Ascending (ASC) or Descending (DESC) order.LIMIT: Restricts the number of rows returned by the query.
Context: The Illusion of Order
Section titled “Context: The Illusion of Order”As discussed in the Relational Model, a table is a Set, and sets have no inherent order. While it might look like rows are returned in the order they were inserted, this is a side effect of how the database physically stores data on the disk.
If you delete a row and insert a new one, the new row might take the physical spot of the old one, changing the “default” order.
Detailed Explanation: Syntax and Nuances
Section titled “Detailed Explanation: Syntax and Nuances”1. Sorting by Multiple Columns
Section titled “1. Sorting by Multiple Columns”You can sort by multiple columns. The database sorts by the first column, and for any rows where that value is identical, it sorts by the second column.
SELECT FirstName, LastNameFROM EmployeesORDER BY LastName ASC, FirstName ASC;2. Positional Sorting
Section titled “2. Positional Sorting”Some dialects allow you to sort by the column number in the SELECT list.
SELECT Name, Salary FROM Employees ORDER BY 2 DESC; (This sorts by Salary).
Caution: This is generally considered bad practice in production code as it makes the query harder to read and fragile if columns are reordered.
3. Limiting Results (Dialect Differences)
Section titled “3. Limiting Results (Dialect Differences)”Unfortunately, “Limit” is one area where SQL dialects vary the most.
| Dialect | Syntax |
|---|---|
| PostgreSQL / MySQL / SQLite | LIMIT 10 |
| SQL Server | SELECT TOP 10 ... |
| Oracle | FETCH FIRST 10 ROWS ONLY |
Example: Top Earners
Section titled “Example: Top Earners”We have a Salaries table:
| Name | Dept | MonthlySalary |
|---|---|---|
| Alice | Engineering | 8000 |
| Bob | Sales | 5000 |
| Charlie | Engineering | 9000 |
| Diana | HR | 5500 |
| Eve | Engineering | 8000 |
Task: Find the top 3 highest-paid employees.
Section titled “Task: Find the top 3 highest-paid employees.”If salaries are equal, sort by Name alphabetically.
SELECT Name, MonthlySalaryFROM SalariesORDER BY MonthlySalary DESC, Name ASCLIMIT 3;Output
Section titled “Output”| Name | MonthlySalary |
|---|---|
| Charlie | 9000 |
| Alice | 8000 |
| Eve | 8000 |
Under the Hood: The External Merge Sort
Section titled “Under the Hood: The External Merge Sort”When you request an ORDER BY, the database engine does the following:
- Check for Index: If the column being sorted is already indexed (B-Tree), the engine can simply walk the index in order. This is extremely fast.
- In-Memory Sort: If there is no index, the engine tries to sort the data in a dedicated memory area (often called
work_memin Postgres orSort Areain others). - External Sort: If the data is too large for memory, the engine must write “sort runs” to temporary files on the disk and merge them. This is significantly slower and is a common performance bottleneck.
The Cost of LIMIT
Section titled “The Cost of LIMIT”When you use LIMIT 10 with ORDER BY, the database engine often uses a Heap Sort (specifically a “top-N” sort). Instead of sorting all 1 million rows, it maintains a small “heap” of the top 10 values as it scans the data. This is much more efficient than a full sort.