
Data Serialization: JSON, CSV, & Pickle

Serialization is the process of converting a Python object (like a dictionary or a list) into a format that can be stored on a disk or sent over a network. Deserialization is the reverse process.

In Python, we primarily work with three formats: JSON for web APIs, CSV for spreadsheets, and Pickle for internal Python-only storage.


JSON is the universal language of data on the web. It is text-based and readable by almost every programming language.

json_demo.py
import json

user = {"name": "Alice", "id": 101, "admin": True}

# Serialize to a string
json_str = json.dumps(user, indent=4)

# Serialize directly to a file
with open("user.json", "w") as f:
    json.dump(user, f)
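Deserialization goes the other way. The following is a minimal sketch that reuses the `user` dictionary; the names `restored` and `point` are chosen here for illustration only. It also shows one caveat worth knowing: JSON has no tuple type, so a tuple does not survive a round trip unchanged.

```python
import json

user = {"name": "Alice", "id": 101, "admin": True}

# Serialize to a JSON string, then parse it back into Python objects
json_str = json.dumps(user)
restored = json.loads(json_str)

# JSON has no tuple type: a tuple is written as an array
# and comes back as a list
point = json.loads(json.dumps({"xy": (1, 2)}))
```

To read from a file instead of a string, `json.load(f)` takes an open file object in the same way that `json.dump` does.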

CSV is the standard plain-text format for tabular data and opens directly in spreadsheet tools such as Excel and Google Sheets.

csv_usage.py
import csv

data = [
    {"name": "Alice", "score": 95},
    {"name": "Bob", "score": 88},
]

# DictWriter maps each dictionary's keys to the columns named in fieldnames
with open("scores.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(data)
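Reading the data back can be sketched with `csv.DictReader`. To keep the example runnable on its own, it assumes an in-memory `io.StringIO` buffer standing in for `scores.csv`; the names `buffer`, `rows`, and `scores` are illustrative. Note that CSV carries no type information, so every value comes back as a string.

```python
import csv
import io

# In-memory stand-in for scores.csv so the sketch runs on its own
buffer = io.StringIO("name,score\r\nAlice,95\r\nBob,88\r\n")

# DictReader uses the header row as the keys of each row dictionary
reader = csv.DictReader(buffer)
rows = list(reader)

# Values are read as strings, so convert numeric columns explicitly
scores = {row["name"]: int(row["score"]) for row in rows}
```

With a real file, the buffer would be replaced by `open("scores.csv", newline="")` inside a `with` block.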

pickle is Python’s built-in serialization module for complex objects. It can save almost any Python object, including custom class instances.

pickle_demo.py
import pickle

data = {"complex": {1, 2, 3}, "math": 3.14}

with open("data.pkl", "wb") as f:  # Must use binary mode 'wb'
    pickle.dump(data, f)

with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)
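To illustrate what "complex objects" means in practice, the sketch below round-trips a dictionary holding a set, a tuple, and a `datetime.date` instance through pickle, then shows that `json.dumps` rejects the same object. The `state` dictionary and the `json_ok` flag are made up for this example.

```python
import json
import pickle
from datetime import date

state = {"tags": {1, 2, 3}, "point": (4, 5), "created": date(2024, 1, 31)}

# pickle round-trips the set, tuple, and date with their types intact
restored = pickle.loads(pickle.dumps(state))

# json rejects the same object: sets and dates are not JSON types
try:
    json.dumps(state)
    json_ok = True
except TypeError:
    json_ok = False
```

One design consequence: because unpickling can execute arbitrary code, `pickle.load` should only be used on data from a source you trust.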

When you serialize data, Python converts high-level objects into a byte stream (a sequence of bytes) that can be written to disk or sent over a network.

  • JSON produces text, which is normally encoded as UTF-8.
  • Pickle produces a Python-specific binary format that is often more compact but can only be read by Python.
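The difference between the two encodings can be seen by serializing the same object both ways. This is a small sketch with an illustrative `record` dictionary; it makes no claim about relative sizes, which depend on the data and the pickle protocol in use.

```python
import json
import pickle

record = {"name": "Alice", "id": 101}

# json.dumps returns text (str); encoding it yields the bytes that get stored
json_bytes = json.dumps(record).encode("utf-8")

# pickle.dumps returns bytes directly, in pickle's own binary format
pickle_bytes = pickle.dumps(record)

# Both are byte strings, but only the JSON one is readable as text
sizes = (len(json_bytes), len(pickle_bytes))
```

Printing `json_bytes` shows the familiar JSON text, while `pickle_bytes` begins with a binary protocol header and is not meant to be read by eye.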

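| Format | Readability  | Speed     | Compatibility      | Best Use Case                |
|--------|--------------|-----------|--------------------|------------------------------|
| JSON   | High (Text)  | Medium    | Universal          | Web APIs, Config files.      |
| CSV    | High (Text)  | High      | Excel/Data Science | Large tables of simple data. |
| Pickle | Low (Binary) | Very High | Python Only        | Saving complex app state.    |
| XML    | Medium       | Low       | Universal          | Legacy systems, Enterprise.  |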