Handling Missing Values
As we learned in Module 1, NULL values can break logic and calculations. To write resilient SQL, you need specific functions to replace NULLs with defaults or to create NULLs when they are needed for safety.
Definition: COALESCE and NULLIF
Section titled “Definition: COALESCE and NULLIF”These two functions are the “Guardians” of your data integrity.
COALESCE(val1, val2, ... valN): Returns the first non-null value in the list.NULLIF(val1, val2): Returns NULL ifval1equalsval2. Otherwise, it returnsval1.
Context: The Zero-Problem
Section titled “Context: The Zero-Problem”In math, dividing by zero is undefined. In SQL, it crashes your query.
NULLIF is the standard way to prevent “Division by Zero” errors by turning the zero into a NULL. Since any number divided by NULL is NULL, the query survives.
Detailed Explanation: COALESCE
Section titled “Detailed Explanation: COALESCE”COALESCE is often used to provide fallback values for reports.
SELECT COALESCE(PhoneNumber, 'No Phone Provided') FROM Customers;If PhoneNumber is NULL, the user sees “No Phone Provided” instead of a blank space or an error.
Why not use CASE?
Section titled “Why not use CASE?”You could use a CASE statement:
CASE WHEN PhoneNumber IS NULL THEN 'No Phone Provided' ELSE PhoneNumber END
However, COALESCE is more concise and can handle multiple fallbacks (e.g., Check Phone, then Email, then “No Contact”).
Detailed Explanation: NULLIF
Section titled “Detailed Explanation: NULLIF”NULLIF is commonly used to clean data or prevent errors.
1. Cleaning Legacy Data
Section titled “1. Cleaning Legacy Data”Imagine a database where an empty string '' was used to mean “No Data”. You want to treat those as proper SQL NULLs.
NULLIF(ColumnName, '')
2. Preventing Division by Zero
Section titled “2. Preventing Division by Zero”SELECT Sales / NULLIF(ItemsSold, 0) FROM Reports;Example: Comprehensive Data Cleanup
Section titled “Example: Comprehensive Data Cleanup”We have a ProductStock table:
| Name | RetailPrice | DiscountPrice | StockCount |
|---|---|---|---|
| Widget | 100.00 | 80.00 | 50 |
| Gadget | 50.00 | NULL | 10 |
| Gizmo | 20.00 | 15.00 | 0 |
Task: Find the “Best Price” (Discount if available, else Retail) and calculate the Stock Ratio.
Section titled “Task: Find the “Best Price” (Discount if available, else Retail) and calculate the Stock Ratio.”SELECT Name, COALESCE(DiscountPrice, RetailPrice) AS BestPrice, 100.0 / NULLIF(StockCount, 0) AS ScarcityFactorFROM ProductStock;Output
Section titled “Output”| Name | BestPrice | ScarcityFactor |
|---|---|---|
| Widget | 80.00 | 2.0 |
| Gadget | 50.00 | 10.0 |
| Gizmo | 15.00 | NULL |
Note: Gizmo’s ScarcityFactor is NULL because we used NULLIF to prevent division by zero.
Under the Hood: Variable Arguments
Section titled “Under the Hood: Variable Arguments”COALESCE is a “Variadic” function—it can take an unlimited number of arguments. Internally, the database engine implements COALESCE as a shorthand for a searched CASE expression.
COALESCE(a, b, c) is translated by the query optimizer to:
CASE WHEN a IS NOT NULL THEN a WHEN b IS NOT NULL THEN b ELSE cENDBecause it’s a CASE expression, it also supports Short-Circuiting. If a is not null, the engine never even looks at b or c. This is useful if b is a complex, expensive subquery or calculation.