Skip to content

Common Table Expressions (CTEs)

Common Table Expressions (CTEs) are the modern standard for writing complex SQL. They allow you to define a “virtual table” (a named result set) that exists only for the duration of a single query.

A CTE is defined using the WITH keyword at the very beginning of your query.

CTE Syntax
WITH MyVirtualTable AS (
SELECT ... (Your Subquery Here)
)
SELECT * FROM MyVirtualTable;

Context: Why CTEs are Better than Subqueries

Section titled “Context: Why CTEs are Better than Subqueries”
  1. Readability: CTEs allow for “Top-Down” logic. You define your data components first and then use them in the final SELECT. Subqueries force you to read “Inside-Out”.
  2. Modularity: You can define multiple CTEs in a single query, separated by commas.
  3. Reusability: You can reference the same CTE multiple times in your main query (e.g., joining a CTE to itself).
  4. Recursion: Only CTEs support recursive logic (needed for tree structures).

Think of CTEs as “variables” for tables.

Multi-Step Logic
WITH
DepartmentPayroll AS (
SELECT Dept, SUM(Salary) AS TotalDeptSalary
FROM Employees
GROUP BY Dept
),
CompanyAverage AS (
SELECT AVG(TotalDeptSalary) AS AvgPayroll
FROM DepartmentPayroll
)
SELECT Dept, TotalDeptSalary
FROM DepartmentPayroll
WHERE TotalDeptSalary > (SELECT AvgPayroll FROM CompanyAverage);

Often, you want to clean or aggregate data from one table before joining it to another. Doing this in a CTE makes the join logic much clearer.

CTE for Pre-Aggregation
WITH CleanedSales AS (
SELECT
ProductID,
SUM(Quantity) AS TotalSold
FROM Sales
WHERE SaleDate >= '2023-01-01'
GROUP BY ProductID
)
SELECT
p.ProductName,
s.TotalSold
FROM Products p
JOIN CleanedSales s ON p.ID = s.ProductID;

How does a CTE affect performance compared to a subquery?

In most modern versions of PostgreSQL and SQL Server, the optimizer treats a CTE exactly like a subquery. It “inlines” the logic into the main query to find the best execution plan.

Historically, PostgreSQL always “materialized” CTEs—meaning it ran the CTE and saved it to a temp table before doing anything else. While this was great for preventing expensive subqueries from running multiple times, it prevented the optimizer from using indexes from the outer query.

  • Note: Since Postgres 12, the engine now inlines CTEs automatically unless you explicitly tell it not to using the MATERIALIZED keyword.

A CTE only exists for the specific query it is attached to. You cannot reference a CTE in a different tab or a separate database connection.