Value Extraction

In standard SQL, rows are independent. You cannot easily compare “today’s price” with “yesterday’s price” without performing a complex self-join. LEAD and LAG solve this by allowing you to “reach” into other rows.

Definition: LEAD and LAG

LAG(column, offset, default): Returns the value from a row behind (above) the current row.
LEAD(column, offset, default): Returns the value from a row ahead (below) the current row.

offset: How many rows to reach (Default is 1).
default: What to return if the reach goes out of bounds (Default is NULL).

Context: Period-over-Period Analysis

LEAD and LAG are the primary tools for time-series analysis. They are used to calculate:

Month-over-Month growth.
The time elapsed between two events.
Price changes between transactions.

Detailed Explanation: Syntax

SELECT
  Date,
  Price,
  LAG(Price) OVER(ORDER BY Date) AS PrevPrice,
  Price - LAG(Price) OVER(ORDER BY Date) AS PriceChange
FROM StockPrices;

Example: Customer Re-ordering

We have an Orders table:

CustomerID	OrderDate
1	2023-01-01
1	2023-01-15
2	2023-01-05
1	2023-02-01

Task: For each order, find how many days have passed since that customer’s previous order.

SELECT
  CustomerID,
  OrderDate,
  OrderDate - LAG(OrderDate) OVER(PARTITION BY CustomerID ORDER BY OrderDate) AS DaysSinceLastOrder
FROM Orders;

Output

CustomerID	OrderDate	DaysSinceLastOrder
1	2023-01-01	NULL
1	2023-01-15	14
1	2023-02-01	17
2	2023-01-05	NULL

(Note: The first order for each customer returns NULL because there is no “previous” row within their partition.)

Under the Hood: The Row Buffer

When executing LAG, the database engine keeps a small buffer of the previous rows in memory as it iterates through the partition. Because the window is already sorted (required by the ORDER BY in the OVER clause), it simply looks back at the $n$-th previous row in its memory buffer.

Performance Tip

LEAD and LAG are extremely efficient because they only require a single pass through the sorted data. A self-join achieving the same result would require a second scan of the table, making it twice as slow.