The Relational Model

The foundation of SQL lies in the Relational Model, a mathematical approach to managing data proposed by Edgar F. Codd in 1970. To master SQL, you must first understand how data is organized and, more importantly, how to think about that data.

Definition: What is the Relational Model?

In simple terms, the Relational Model represents data as a collection of relations. In the world of SQL, we call these relations Tables.

A table is a structured set of data consisting of:

Columns (Attributes): The categories of data being stored (e.g., Email, OrderDate, Price).
Rows (Tuples): Individual records or instances of those categories (e.g., a specific customer or a single transaction).

Context: Why It Matters

Before the Relational Model, data was often stored in hierarchical or network structures where you had to know the physical path to the data to retrieve it. If the structure changed, your code broke.

The Relational Model decouples the logical structure (how we see the data) from the physical storage (how the computer saves it). This allows you to ask for “All customers from New York” without caring whether they are stored on a hard drive in California or a cloud server in Ireland.

Thinking in Sets (The Mental Shift)

The biggest hurdle for most programmers is moving from Imperative Thinking to Declarative Set-Based Thinking.

Imperative (The “How”)

In languages like Python or Java, you often use loops to process data:

Start at the first row.
Check if the city is “New York”.
If yes, add it to a list.
Move to the next row.
Repeat until the end.

Declarative (The “What”)

In SQL, you describe the result set you want: “Give me the subset of the ‘Customers’ table where the ‘City’ attribute is ‘New York’.”

The database engine handles the “how”—it might use an index, scan the whole table, or use multiple processor cores. Your job is only to define the criteria.

Detailed Explanation: Table Anatomy

Let’s look at a sample table called Employees:

EmployeeID	FirstName	Department	Salary
101	Alice	Engineering	95000
102	Bob	Marketing	72000
103	Charlie	Engineering	88000

1. The Schema

The schema is the blueprint of the table. It defines the names of the columns and the type of data they can hold (e.g., Integers for EmployeeID, Strings for FirstName).

2. Cardinality and Degree

Cardinality: The number of rows in the table (In our example: 3).
Degree (Arity): The number of columns in the table (In our example: 4).

3. Primary Keys

Every table should have a Primary Key—a column (or set of columns) that uniquely identifies each row. No two rows can have the same Primary Key. In our example, EmployeeID is the Primary Key.

Example: Conceptualizing a Set

Imagine two sets of data:

Set A: All Employees.
Set B: All people in the Engineering Department.

A SQL query to find Engineering employees is essentially finding the Intersection of these sets or selecting a Subset based on a predicate (a condition).

-- This is how we express that set-based requirement in SQL
SELECT *
FROM Employees
WHERE Department = 'Engineering';

Output

The result is a new, virtual table containing only the rows that meet the criteria:

EmployeeID	FirstName	Department	Salary
101	Alice	Engineering	95000
103	Charlie	Engineering	88000

Under the Hood: Set Theory

SQL is based on Relational Algebra and Tuple Relational Calculus. When you write a query, the database’s “Query Optimizer” converts your declarative statement into a series of algebraic operations (like Selection, Projection, and Join).

Because tables are mathematically “sets,” the order of rows in a table technically doesn’t matter. Unless you explicitly ask the database to sort the results (using ORDER BY), you should never rely on the order in which rows are returned. They are a bag of data, not a sorted list.