Dictionaries & The Hash Map
The Dictionary (dict) is arguably Python’s most powerful data structure. In other languages, this is known as a Hash Map, Associative Array, or Hash. It allows you to store data as key-value pairs, providing near-instantaneous lookups regardless of how many millions of items you store.
1. Anatomy of a Dictionary
Section titled “1. Anatomy of a Dictionary”A dictionary maps a unique Key to a Value.
- Keys: Must be Hashable (immutable objects like strings, numbers, or tuples).
- Values: Can be any Python object (lists, functions, even other dictionaries).
# Literal syntaxuser = {"id": 1, "name": "Alice"}
# Dictionary Constructoruser = dict(id=1, name="Alice")2. Advanced Retrieval: Beyond []
Section titled “2. Advanced Retrieval: Beyond []”Using user["id"] is fine, but if the key doesn’t exist, your program crashes with a KeyError.
1. The .get() Method
Section titled “1. The .get() Method”Safely returns a default value if the key is missing.
email = user.get("email", "unknown@example.com")2. The .setdefault() Method
Section titled “2. The .setdefault() Method”Returns the value if the key exists. If not, it inserts the key with the provided default value and returns that value.
# If "tags" doesn't exist, create it as an empty listtags = user.setdefault("tags", [])tags.append("python")3. Professional Tooling: defaultdict
Section titled “3. Professional Tooling: defaultdict”If you are building a dictionary where every key should start with a specific type (like a list or an integer), use collections.defaultdict.
from collections import defaultdict
# Every new key automatically starts as an empty listgroups = defaultdict(list)
groups["admin"].append("Alice")groups["user"].append("Bob")
print(groups["admin"]) # ['Alice']print(groups["guest"]) # [] (No KeyError!)4. Under the Hood: The Hash Table
Section titled “4. Under the Hood: The Hash Table”How is a dictionary so fast?
1. The Hash Function
Section titled “1. The Hash Function”When you add a key, Python runs it through a hash() function to get a large integer. This integer is used as an index in an internal array. Python doesn’t “search” for your key; it calculates its exact location.
2. Compact Dictionaries (Python 3.7+)
Section titled “2. Compact Dictionaries (Python 3.7+)”In older versions of Python, dictionaries used a lot of memory and were unordered. Since Python 3.7, dictionaries are Compact (using much less RAM) and Ordered (they remember the order in which you added items).
3. Hash Collisions
Section titled “3. Hash Collisions”What if two different keys result in the same hash index? This is a Collision. Python handles this using “Open Addressing”—it simply looks for the next empty “slot” in the array. While collisions slow down the dictionary slightly, Python’s hash function is designed to make them extremely rare.
5. Summary Table: Dictionary Performance
Section titled “5. Summary Table: Dictionary Performance”| Operation | Complexity | Description |
|---|---|---|
| Lookup | $O(1)$ | Near-instant retrieval by key. |
| Insertion | $O(1)$ | Adding a new pair. |
| Deletion | $O(1)$ | Removing by key. |
| Iteration | $O(n)$ | Walking through all keys/values. |
6. Detailed Example: Counting Frequencies
Section titled “6. Detailed Example: Counting Frequencies”This is a classic use case for dictionaries: counting how many times items appear in a collection.
text = "apple banana apple cherry banana apple"counts = {}
for word in text.split(): counts[word] = counts.get(word, 0) + 1
print(counts)# Output: {'apple': 3, 'banana': 2, 'cherry': 1}Note: For this specific task, Python also provides collections.Counter, which is even faster.