Data Serialization: JSON, CSV, & Pickle
Serialization is the process of converting a Python object (such as a dictionary or a list) into a format that can be stored on disk or sent over a network. Deserialization is the reverse: rebuilding the object from that stored or transmitted form.
In Python, we primarily work with three formats: JSON for web APIs, CSV for spreadsheets, and Pickle for internal Python-only storage.
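As a minimal sketch of the round trip described above, the snippet below serializes a dictionary to a JSON string and then parses it back. The `user` value is illustrative only.

```python
import json

# Illustrative data; any JSON-compatible dictionary would work here.
user = {"name": "Alice", "id": 101, "admin": True}

# Serialization: Python object -> text
encoded = json.dumps(user)

# Deserialization: text -> a new, equal Python object
decoded = json.loads(encoded)

print(type(encoded).__name__)  # str
print(decoded == user)         # True
print(decoded is user)         # False: a fresh copy, not the same object
```

The deserialized value compares equal to the original but is a separate object, which is what makes serialization suitable for moving data between processes or machines.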
1. JSON (JavaScript Object Notation)
JSON is the universal language of data on the web. It is text-based and readable by almost every programming language.
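Because JSON has to be readable from other languages, it supports only a small set of types. The sketch below, using a hypothetical `record` dictionary, shows how a few Python values are translated and what happens with a type JSON cannot represent.

```python
import json

# Hypothetical record mixing several Python types.
record = {"active": True, "nickname": None, "scores": (95, 88)}

text = json.dumps(record)
print(text)  # {"active": true, "nickname": null, "scores": [95, 88]}

# Tuples come back as lists because JSON has only one array type.
restored = json.loads(text)
print(restored["scores"])  # [95, 88]

# Sets have no JSON equivalent, so json.dumps raises TypeError.
try:
    json.dumps({"tags": {1, 2, 3}})
except TypeError as exc:
    print("Cannot serialize:", exc)
```

A common workaround is to convert unsupported values, such as sets, to lists before serializing.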
import json

user = {"name": "Alice", "id": 101, "admin": True}

# Serialize to a string
json_str = json.dumps(user, indent=4)

# Serialize directly to a file
with open("user.json", "w") as f:
    json.dump(user, f)

2. CSV (Comma Separated Values)
CSV is a widely used plain-text format for tabular data, and spreadsheet tools such as Excel and Google Sheets can open it directly.
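One consequence of CSV being plain text is that it carries no type information. The following sketch writes two illustrative rows to an in-memory buffer (standing in for a file) and reads them back, showing that numbers return as strings.

```python
import csv
import io

# Illustrative rows; the column names are assumptions for this sketch.
rows = [{"name": "Alice", "score": 95}, {"name": "Bob", "score": 88}]

# An in-memory buffer stands in for a file so the sketch has no side effects.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "score"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same text back as dictionaries keyed by the header row.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
print(repr(loaded[0]["score"]))  # '95'

# CSV stores text only, so numeric columns must be converted back by hand.
scores = [int(row["score"]) for row in loaded]
print(scores)  # [95, 88]
```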
import csv

data = [
    {"name": "Alice", "score": 95},
    {"name": "Bob", "score": 88},
]

# DictWriter maps each dictionary's keys to the named columns
with open("scores.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(data)

3. Pickle (The Python Native Format)
pickle is Python’s built-in serialization module for complex objects. It can save almost anything, including custom class instances. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.
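To illustrate the claim about custom class instances, here is a small sketch that pickles an instance of a hypothetical `Player` dataclass in memory and restores it. The class and its fields are invented for this example.

```python
import pickle
from dataclasses import dataclass


@dataclass
class Player:  # hypothetical class, used only for illustration
    name: str
    level: int


original = Player("Alice", 7)

# dumps/loads work on bytes in memory, mirroring dump/load on files.
payload = pickle.dumps(original)
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == original)    # True
print(restored is original)    # False
```

Pickle records the class by module and name, not its source code, so the class definition must be importable wherever the data is loaded.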
import pickle

data = {"complex": {1, 2, 3}, "math": 3.14}

with open("data.pkl", "wb") as f:  # Must use binary mode 'wb'
    pickle.dump(data, f)

with open("data.pkl", "rb") as f:
    loaded = pickle.load(f)

4. Under the Hood: The Bytestream
When you serialize data, Python converts high-level objects into a bytestream: a linear sequence of bytes that can be written to a file or sent over a network.
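- JSON produces text, which is typically encoded as UTF-8 when it is written to disk or sent over a network.
- Pickle produces a Python-specific binary format. It can be more compact than the equivalent JSON, but it is only meant to be read by Python.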
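The difference between the two bytestreams can be inspected directly. This is a rough sketch with an illustrative `data` value; the exact pickle bytes and sizes depend on the Python version and pickle protocol in use.

```python
import json
import pickle

data = {"name": "Alice", "scores": [95, 88]}  # illustrative value

json_bytes = json.dumps(data).encode("utf-8")
pickle_bytes = pickle.dumps(data)

# JSON bytes decode to readable text.
print(json_bytes.decode("utf-8"))  # {"name": "Alice", "scores": [95, 88]}

# Pickle bytes are instructions for Python's unpickler. With protocol 2 and
# later, the stream starts with the byte 0x80 followed by the protocol number.
print(pickle_bytes[:2])

# Sizes vary with the data and the pickle protocol.
print(len(json_bytes), len(pickle_bytes))
```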
5. Summary Table: Choosing a Format
| Format | Readability | Speed | Compatibility | Best Use Case |
|---|---|---|---|---|
| JSON | High (Text) | Medium | Universal | Web APIs, Config files. |
| CSV | High (Text) | High | Excel/Data Science | Large tables of simple data. |
| Pickle | Low (Binary) | Very High | Python Only | Saving complex app state. |
| XML | Medium | Low | Universal | Legacy systems, Enterprise. |