Ranking Functions

Ranking functions allow you to assign a numerical rank to each row within a partition. This is the most common use case for window functions, especially for finding “Top-N” records (e.g., “The top 3 sales per region”).

Definition: The 4 Ranking Specialists

ROW_NUMBER(): A unique, sequential integer for every row. Even if values are identical, the row numbers will be different.
RANK(): Assigns the same rank to identical values. However, it leaves gaps in the sequence. (e.g., 1, 2, 2, 4).
DENSE_RANK(): Like RANK(), it handles ties identically, but it never leaves gaps. (e.g., 1, 2, 2, 3).
NTILE(n): Divides the rows into n roughly equal buckets and returns the bucket number. (e.g., NTILE(4) for quartiles).

Context: Handling Ties

The difference between RANK and DENSE_RANK is best explained with an example. Imagine a race where two people tie for 2nd place.

RANK: The next person is in 4th place. (1, 2, 2, 4).
DENSE_RANK: The next person is in 3rd place. (1, 2, 2, 3).

Detailed Explanation: Syntax

SELECT
  Name,
  Score,
  ROW_NUMBER() OVER(ORDER BY Score DESC) AS RowNum,
  RANK()       OVER(ORDER BY Score DESC) AS Rnk,
  DENSE_RANK() OVER(ORDER BY Score DESC) AS DenseRnk
FROM Students;

Example: Top Salesperson per Region

We have a SalesPerformance table:

Name	Region	Sales
Alice	East	1000
Bob	West	1200
Charlie	East	1000
Diana	West	1500

Task: Identify the top seller in each region.

WITH RankedSales AS (
  SELECT
      Name, Region, Sales,
      DENSE_RANK() OVER(PARTITION BY Region ORDER BY Sales DESC) AS SalesRank
  FROM SalesPerformance
)
SELECT *
FROM RankedSales
WHERE SalesRank = 1;

Output

Name	Region	Sales	SalesRank
Alice	East	1000	1
Charlie	East	1000	1
Diana	West	1500	1

(Note: Both Alice and Charlie are returned for the East region because they tied for 1st place.)

Under the Hood: The Cost of Sorting

To calculate a rank, the database must sort the data (internally) by the ORDER BY clause of the window.

1. In-Memory Sort

If the partition fits in memory, the engine sorts it and then iterates through the rows, keeping track of the current rank and the previous value to determine if a tie has occurred.

2. Index Optimization

If an index exists on the PARTITION BY and ORDER BY columns, the database can skip the sort entirely. It simply walks the index and assigns ranks as it goes. This is the difference between a query taking 1 second vs. 1 minute on large datasets.