JOIN Operations

JOINs are the most powerful—and most misunderstood—feature of SQL. They allow you to reconstruct normalized data into meaningful information.

Definition: The JOIN Clause

A JOIN clause is used to combine rows from two or more tables based on a related column between them (usually a Primary Key and a Foreign Key).

Context: The Common Scenarios

We will use two simple tables for our examples:

Table: Customers (C)

ID	Name
1	Alice
2	Bob

Table: Orders (O)

OrderID	CustID	Amount
101	1	50.00
102	3	20.00

Detailed Explanation: The 5 Major JOINs

1. INNER JOIN

Returns rows when there is a match in both tables. If a customer hasn’t placed an order, they are excluded. If an order has a non-existent customer, it is excluded.

Result: Alice (Order 101).

2. LEFT (OUTER) JOIN

Returns all rows from the left table, and the matched rows from the right table. If there is no match, the right side results in NULL.

Result: Alice (Order 101), Bob (NULL).

3. RIGHT (OUTER) JOIN

The opposite of LEFT JOIN. Returns all rows from the right table.

Result: Alice (Order 101), NULL (Order 102).

4. FULL (OUTER) JOIN

Returns all rows when there is a match in either left or right table.

Result: Alice (Order 101), Bob (NULL), NULL (Order 102).

5. CROSS JOIN

Returns the Cartesian Product of both tables (every row in A joined with every row in B). No ON clause is used.

Result: 2 customers $ imes$ 2 orders = 4 rows.

Example: Reporting on Customer Orders

Task: List every customer and their total order amount. Include customers with zero orders.

SELECT
  C.Name,
  SUM(O.Amount) AS TotalSpent
FROM Customers C
LEFT JOIN Orders O ON C.ID = O.CustID
GROUP BY C.Name;

Output

Name	TotalSpent
Alice	50.00
Bob	NULL

Under the Hood: Join Algorithms

When you join two tables, the database engine chooses an algorithm to execute the search:

1. Nested Loop Join

The engine takes one row from Table A and scans Table B for a match. Then it takes the next row from Table A and scans Table B again.

Efficiency: $O(N imes M)$.
Best for: Small tables or when the join column is indexed.

2. Hash Join

The engine builds a hash table in memory for the smaller table. Then it scans the larger table and looks up each row in the hash table.

Efficiency: $O(N + M)$.
Best for: Large, unindexed tables.

3. Merge Join

Both tables are sorted by the join column, and then the engine “merges” them like a zipper.

Best for: Very large tables that are already sorted or indexed.

Summary Table

Join Type	Left Table (A)	Right Table (B)
`INNER`	Only matches	Only matches
`LEFT`	All rows	Only matches
`RIGHT`	Only matches	All rows
`FULL`	All rows	All rows
`CROSS`	All rows	All rows (Cartesian)