
File I/O & Buffering

Interacting with the file system is a core task for almost any application. Whether you are logging errors, reading configuration, or processing massive datasets, you must understand how Python manages file streams.

In Python, we use the built-in open() function to create a “file object” that acts as a bridge between your code and the physical disk.


file_obj = open("filename", mode)

| Mode | Name | Behavior |
| --- | --- | --- |
| r | Read | Default. Error if file doesn’t exist. |
| w | Write | Creates new file or erases existing content. |
| a | Append | Adds to the end of the file. |
| x | Exclusive | Creates new file; errors if it already exists. |
| b | Binary | For non-text files (images, PDFs). Combined with another mode, e.g. rb or wb. |
| t | Text | Default. For text files (automatic encoding and decoding). |
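As a rough illustration of the modes in the table, the sketch below runs w, a, x, and r against a scratch file in a temporary directory. The function name demo_modes and the file name notes.txt are made up for this example; only open() and the standard library are assumed.

```python
import os
import tempfile


def demo_modes(directory):
    """Exercise w, a, x and r on a scratch file and report what happened."""
    path = os.path.join(directory, "notes.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:  # creates the file
        f.write("first\n")
    with open(path, "w", encoding="utf-8") as f:  # erases the existing content
        f.write("second\n")
    with open(path, "a", encoding="utf-8") as f:  # adds to the end
        f.write("third\n")

    try:
        open(path, "x").close()  # the file already exists, so this should fail
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    return content, exclusive_failed


with tempfile.TemporaryDirectory() as tmp:
    content, exclusive_failed = demo_modes(tmp)
    print(repr(content), exclusive_failed)
```

The second w open discards "first", which is why only the last two lines survive.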

Depending on the size of your file, you should choose different reading methods.

Calling .read() with no argument loads the entire file into memory as one string.

with open("data.txt", "r") as f:
    content = f.read()  # Don't do this for 10GB files!

Calling .readline() reads only the next line, including its trailing newline.

with open("data.txt", "r") as f:
    line1 = f.readline()
    line2 = f.readline()
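One detail worth seeing in code: readline() returns an empty string at end of file, while a blank line comes back as "\n". The sketch below uses that to read a file line by line. The helper name read_lines_manually and the sample data are invented for illustration.

```python
import os
import tempfile


def read_lines_manually(path):
    """Collect lines with readline() until it signals end of file."""
    lines = []
    with open(path, "r", encoding="utf-8") as f:
        while True:
            line = f.readline()
            if line == "":  # empty string means end of file, not a blank line
                break
            lines.append(line.rstrip("\n"))
    return lines


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "people.csv")  # hypothetical sample file
    with open(sample, "w", encoding="utf-8") as f:
        f.write("name,age\n\nAda,36\n")
    print(read_lines_manually(sample))
```

The blank second line is kept as an empty entry because it arrives as "\n", not as "".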

Method C: The Iterator (The Professional Way)


You can loop directly over the file object. This is memory efficient because Python hands you one line at a time, so memory use depends on the length of the longest line rather than the size of the file.

with open("huge_data.txt", "r") as f:
    for line in f:
        # process() is a placeholder for your own per-line function
        process(line.strip())
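To make the loop concrete, here is a minimal runnable variant in which the per-line work is a keyword check. The function count_matching_lines and the sample log content are assumptions for this sketch, not part of the original example.

```python
import os
import tempfile


def count_matching_lines(path, keyword):
    """Count lines containing keyword without loading the whole file."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object yields one line per iteration
            if keyword in line:
                count += 1
    return count


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")  # hypothetical log file
    with open(log_path, "w", encoding="utf-8") as f:
        f.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    print(count_matching_lines(log_path, "ERROR"))
```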

When writing, Python handles the buffer automatically.

writing.py
with open("output.log", "a") as f:
    f.write("New event logged\n")
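Two behaviors of write() are easy to miss: it does not add a newline for you, and in text mode it returns the number of characters written. The sketch below shows both with a small append helper. The name append_events and the event strings are hypothetical.

```python
import os
import tempfile


def append_events(path, events):
    """Append each event on its own line; return total characters written."""
    written = 0
    with open(path, "a", encoding="utf-8") as f:
        for event in events:
            # write() adds no newline by itself, so we add one explicitly
            written += f.write(event + "\n")
    return written


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "output.log")
    first = append_events(log_path, ["start"])
    second = append_events(log_path, ["stop", "exit"])
    with open(log_path, "r", encoding="utf-8") as f:
        print(first, second, repr(f.read()))
```

Because the file is opened with a each time, the second call extends the file instead of replacing it.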

When working with images, audio, or other binary data, you must add the b flag to the mode (for example rb or wb). In this mode, Python returns bytes objects instead of strings and performs no encoding or newline translation.

copy_image.py
with open("input.png", "rb") as source:
    data = source.read()
with open("copy.png", "wb") as dest:
    dest.write(data)
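The copy above holds the whole file in memory, which is fine for a small image. For larger binary files, one possible approach is to copy in fixed-size chunks, as sketched below. The function copy_in_chunks and the 64 KB chunk size are illustrative choices, not recommendations from the original text; the standard library's shutil.copyfile covers the same need in production code.

```python
import os
import tempfile


def copy_in_chunks(src_path, dst_path, chunk_size=64 * 1024):
    """Copy a binary file piece by piece; return the number of bytes copied."""
    total = 0
    with open(src_path, "rb") as source, open(dst_path, "wb") as dest:
        while True:
            chunk = source.read(chunk_size)  # returns b"" at end of file
            if not chunk:
                break
            dest.write(chunk)
            total += len(chunk)
    return total


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.bin")  # stand-in for a real image file
    dst = os.path.join(tmp, "copy.bin")
    with open(src, "wb") as f:
        f.write(bytes(range(256)) * 1000)  # 256,000 bytes of sample data
    print(copy_in_chunks(src, dst))
```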

Disk operations are incredibly slow compared to CPU operations. To speed things up, Python uses a Buffer (a small slice of RAM).

  1. When you call .write(), Python doesn’t go to the disk immediately. It puts the data in the buffer.
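  2. When the buffer is full (the default size is io.DEFAULT_BUFFER_SIZE, commonly 8KB, though it varies by Python version and platform), it “flushes” the whole batch to the operating system at once.
  3. The with statement ensures that the file is closed, and the buffer flushed, when the block exits, even if an exception is raised inside it. It cannot help if the process is killed outright (for example by a power loss or a forced kill).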
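The steps above can be observed directly by checking the file's size on disk before and after an explicit flush(). This is a minimal sketch assuming CPython's default buffering on a regular file; the helper name observe_flush is made up for the example.

```python
import io
import os
import tempfile


def observe_flush(path):
    """Return the on-disk size before and after an explicit flush()."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("hello")  # small write: stays in Python's buffer
        size_before = os.path.getsize(path)
        f.flush()  # push buffered data to the operating system
        size_after = os.path.getsize(path)
    return size_before, size_after


with tempfile.TemporaryDirectory() as tmp:
    print("default buffer size:", io.DEFAULT_BUFFER_SIZE)
    print(observe_flush(os.path.join(tmp, "buffered.txt")))
```

With default settings, the size before the flush is expected to be 0 and the size after it 5. Note that flush() hands data to the operating system; forcing it onto the physical disk additionally requires os.fsync(f.fileno()).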

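| Method | Returns | Best Use Case |
| --- | --- | --- |
| .read(n) | str (bytes in binary mode) | Small files, or a specific number of characters (bytes in binary mode). |
| .readline() | str | Parsing specific headers or fixed-length lines. |
| .readlines() | list of str | Small files where you need random access to lines. |
| for line in f | str per iteration | Large files (memory efficient). |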
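To see the return types from the table side by side, the following sketch calls each method on the same three-line file. The function summarize_methods and the sample contents are hypothetical.

```python
import os
import tempfile


def summarize_methods(path):
    """Call each reading method once and return what it produced."""
    with open(path, "r", encoding="utf-8") as f:
        first_chars = f.read(3)       # str with at most 3 characters
        rest_of_line = f.readline()   # str: remainder of the current line
        remaining = f.readlines()     # list of str: all lines that are left
    with open(path, "r", encoding="utf-8") as f:
        iterated = [line for line in f]  # one str per iteration
    return first_chars, rest_of_line, remaining, iterated


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "words.txt")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("alpha\nbeta\ngamma\n")
    for value in summarize_methods(sample):
        print(type(value).__name__, repr(value))
```

Each call continues from where the previous one stopped, which is why readline() returns only the tail of the first line here.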