The Generational Garbage Collector
While Reference Counting is Python’s primary memory manager, it has a blind spot: Circular References. If two objects point to each other but are no longer used by the program, reference counting will never delete them.
To solve this, Python includes a secondary system: the Generational Garbage Collector (GC).
1. The Cyclic Problem
Section titled “1. The Cyclic Problem”Imagine a parent and a child object that both hold references to each other.
class Node: def __init__(self, name): self.name = name self.link = None
# Create cyclea = Node("Parent")b = Node("Child")a.link = bb.link = a
# Delete the external labelsdel adel bEven though you can no longer access these objects, their reference count remains at 1. They are “Islands of Isolation” that would leak memory forever if not for the GC.
2. The Generational Hypothesis
Section titled “2. The Generational Hypothesis”The Python GC is based on a simple observation in computer science: Most objects die young.
Python organizes all objects into three Generations:
- Generation 0: Newly created objects.
- Generation 1: Objects that survived one GC sweep of Gen 0.
- Generation 2: Objects that survived multiple GC sweeps.
The Collection Strategy
Section titled “The Collection Strategy”- Frequent: Python scans Gen 0 very often.
- Less Frequent: If Gen 0 is scanned many times, Python scans Gen 1.
- Rare: Only when Gen 1 has been scanned many times does Python perform a full sweep of Gen 2.
This strategy is efficient because it focuses the CPU’s effort on the objects most likely to be garbage (the new ones).
3. Under the Hood: Tri-color Marking
Section titled “3. Under the Hood: Tri-color Marking”How does the GC know which objects are part of a cycle? It uses an algorithm that “walks” the graph of objects.
- Root Set: The GC starts with objects it knows are alive (globals, stack variables).
- Marking: It follows every reference from the root set and marks those objects as “Reachable.”
- Sweeping: Any object that was not reached during the walk is considered “Unreachable” and is deleted, even if its reference count is not zero.
4. Manual Control with the gc Module
Section titled “4. Manual Control with the gc Module”You can inspect and tune the garbage collector using the built-in gc library.
import gc
# 1. Check current thresholds (when GC triggers)print(gc.get_threshold()) # Default: (700, 10, 10)
# 2. Force a full collection manuallycollected = gc.collect()print(f"Garbage collector: cleaned {collected} objects.")
# 3. Disable GC (Use with extreme caution!)# gc.disable()5. Summary Table
Section titled “5. Summary Table”| Feature | Reference Counting | Generational GC |
|---|---|---|
| Trigger | Immediate (on refcnt == 0). | Periodic (based on thresholds). |
| Complexity | Simple and Fast. | Complex and Expensive. |
| Handles Cycles? | No. | Yes. |
| Predictability | Deterministic. | Non-deterministic. |