
Profiling & Performance Optimization

“Premature optimization is the root of all evil.” — Donald Knuth.

Before you spend hours trying to make your code “fast,” you must first prove that it is “slow” and identify exactly where the bottleneck is. This is the domain of Profiling.


If you want to compare two small snippets of code (e.g., is a list comprehension faster than a loop?), use the timeit module. It runs the code many thousands of times, so tiny per-call differences add up into a stable, repeatable measurement.

benchmark.py
import timeit
# Compare list comprehension vs. loop
setup = "nums = range(100)"
stmt1 = "[x**2 for x in nums]"
stmt2 = """
res = []
for x in nums:
    res.append(x**2)
"""
print(f"Comp: {timeit.timeit(stmt1, setup, number=100000):.4f}s")
print(f"Loop: {timeit.timeit(stmt2, setup, number=100000):.4f}s")

To find which function in your entire program is taking the most time, use cProfile. This is a “Deterministic Profiler” that records every function call.

Run from the terminal:

Terminal window
python -m cProfile -s cumtime my_script.py
  • ncalls: How many times the function was called.
  • tottime: Total time spent in this function (excluding calls to sub-functions).
  • cumtime: Total time spent in this function and all sub-functions. This is usually the most useful metric.
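The same statistics can also be collected from inside a script, which is handy when you only want to profile one hot code path. A minimal sketch using cProfile together with the standard pstats module (slow_sum is a made-up workload for illustration):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately slow: a Python-level loop instead of the built-in sum()
    total = 0
    for i in range(n):
        total += i
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(1_000_000)
profiler.disable()

# Sort by cumulative time and print the top entries, like -s cumtime
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumtime")
stats.print_stats(5)
print(stream.getvalue())
```

The report lists ncalls, tottime, and cumtime per function, exactly as the command-line invocation does.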

Once you’ve found the bottleneck, use these “Pythonic” techniques to speed it up:

Looking up a global variable is slower than looking up a local one, because globals go through a dictionary lookup on every access while locals use a fast array index. If you use a global constant or function inside a tight loop, bind it to a local variable first.
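As a rough sketch of the local-binding trick (function and variable names here are illustrative):

```python
import math
import timeit

def global_lookup(values):
    # math.sqrt is re-resolved (module attribute lookup) on every iteration
    return [math.sqrt(v) for v in values]

def local_lookup(values):
    sqrt = math.sqrt  # bind the function to a local name once
    return [sqrt(v) for v in values]

values = list(range(10_000))
t_global = timeit.timeit(lambda: global_lookup(values), number=200)
t_local = timeit.timeit(lambda: local_lookup(values), number=200)
print(f"global: {t_global:.4f}s  local: {t_local:.4f}s")
```

Both functions return identical results; only the name-resolution cost differs, so the gain is modest and matters mainly in very hot loops.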

Python’s built-ins (like sum(), max(), map()) are written in highly optimized C. They are almost always faster than a manual Python loop.
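A quick benchmark in the same timeit style as above, comparing a manual accumulation loop against the built-in sum():

```python
import timeit

setup = "nums = list(range(10_000))"
manual = """
total = 0
for n in nums:
    total += n
"""
builtin = "total = sum(nums)"

print(f"manual loop: {timeit.timeit(manual, setup, number=1000):.4f}s")
print(f"built-in sum: {timeit.timeit(builtin, setup, number=1000):.4f}s")
```

The built-in version does the iteration in C, so it typically wins by a wide margin while also being shorter and clearer.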

If your memory profile shows you have millions of small objects, define __slots__ on the class to eliminate the per-instance __dict__. This commonly reduces per-instance memory consumption by around 40-50% and can slightly speed up attribute access.
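A small sketch of the difference (class names are illustrative), using sys.getsizeof to compare a regular instance plus its __dict__ against a slotted instance:

```python
import sys

class PointDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointSlots:
    __slots__ = ("x", "y")  # fixed attribute layout, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p1, p2 = PointDict(1, 2), PointSlots(1, 2)
# The regular instance carries a separate dict object for its attributes
print(sys.getsizeof(p1) + sys.getsizeof(p1.__dict__), "bytes with __dict__")
print(sys.getsizeof(p2), "bytes with __slots__")
```

The trade-off: slotted classes cannot gain new attributes at runtime, so reserve this for simple, numerous data objects.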

Checking whether an item exists in a list is $O(n)$. Checking a set is $O(1)$ on average. Switching to a set for large membership tests can take your code from minutes to milliseconds.
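A benchmark sketch of the list-vs-set membership gap, testing a value near the end of the list (the worst case for a linear scan):

```python
import timeit

setup = """
items_list = list(range(100_000))
items_set = set(items_list)
"""
list_test = "99_999 in items_list"
set_test = "99_999 in items_set"

print(f"list: {timeit.timeit(list_test, setup, number=1000):.4f}s")
print(f"set:  {timeit.timeit(set_test, setup, number=1000):.4f}s")
```

Building the set costs some up-front time and memory, so the conversion pays off when you perform many lookups against the same collection.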


If your program’s RAM usage keeps growing, you may have a memory leak. tracemalloc allows you to take snapshots of memory and compare them.

leak_hunt.py
import tracemalloc
tracemalloc.start()
# ... run your code ...
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
for stat in top_stats[:5]:
    print(stat)
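To actually compare snapshots, take one before and one after the suspect code and diff them with compare_to. A minimal sketch, where the ever-growing list stands in for a leak:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulate a leak: keep appending buffers that are never released
leak = []
for _ in range(10_000):
    leak.append(bytearray(100))

after = tracemalloc.take_snapshot()
# Lines that allocated the most new memory since the first snapshot
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```

The diff points you at the exact source line whose allocations grew, which is usually enough to find what is holding the references.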

Tool            | Usage               | Purpose
timeit          | timeit.timeit()     | Benchmarking small snippets.
cProfile        | python -m cProfile  | Identifying the slowest function in a program.
tracemalloc     | tracemalloc.start() | Finding memory leaks and tracking RAM.
memory_profiler | @profile decorator  | Line-by-line memory usage analysis.