Data Formats
CSV and JSON serve different purposes. CSV is ideal for tabular data like databases and spreadsheets. JSON is perfect for nested, hierarchical data like API responses and configurations.
CSV Structure

A CSV file stores one record per line, with fields separated by commas. A header row naming the columns typically comes first:

```
name,age,city
Alice,30,New York
Bob,25,Los Angeles
Charlie,35,Chicago
```

Each line after the header is one data row.
JSON Structure

JSON documents are built from a handful of value types:

- `"name"`: key-value string pair
- `30`: number value (no quotes)
- `true`: boolean (true/false)
- `[...]`: array (maps to a Python list)
- `{...}`: nested object (maps to a Python dict)
CSV Best For
- Spreadsheet data
- Database exports
- Simple tabular records
- Excel compatibility
JSON Best For
- API responses
- Configuration files
- Nested/hierarchical data
- Web applications
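To make the contrast concrete, here is the same record rendered in both formats. This is a minimal sketch: the record is made up, and `io.StringIO` stands in for real files so the snippet runs without touching disk.

```python
import csv
import io
import json

record = {'name': 'Alice', 'age': 30, 'city': 'New York'}

# CSV: flat rows and columns, field names appear once in the header
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=record.keys())
writer.writeheader()
writer.writerow(record)
print(buf.getvalue())
# name,age,city
# Alice,30,New York

# JSON: keys travel with every value, and nesting is allowed
print(json.dumps(record))
# {"name": "Alice", "age": 30, "city": "New York"}
```

The trade-off is visible already: CSV is compact for uniform rows, while JSON spends bytes repeating keys but can represent structure CSV cannot.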
Reading CSV
Python's csv module provides two main ways to read CSV files: csv.reader for lists and csv.DictReader for dictionaries with column names as keys.
Using csv.reader
```python
import csv

with open('data.csv', 'r') as file:
    reader = csv.reader(file)
    header = next(reader)  # Get header row
    print(f"Columns: {header}")
    for row in reader:
        print(row)  # Each row is a list
```
csv.reader returns each row as a list. Use next() to skip the header row if you want to process data separately.
Using csv.DictReader
```python
import csv

with open('users.csv', 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        # row is a dict with column names as keys
        print(f"Name: {row['name']}, Age: {row['age']}")
```
DictReader automatically uses the first row as field names. Each row becomes a dictionary, making code more readable.
csv.reader
- Returns lists
- Access by index: row[0]
- Faster for large files
- Less readable code
csv.DictReader
- Returns dicts
- Access by name: row['name']
- Self-documenting code
- Slight overhead
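The difference shows up clearly when both read the same data. A small side-by-side sketch, using an in-memory `io.StringIO` buffer in place of a real CSV file:

```python
import csv
import io

csv_text = "name,age\nAlice,30\nBob,25\n"

# csv.reader: every row is a list, including the header
rows = list(csv.reader(io.StringIO(csv_text)))
print(rows[0])  # ['name', 'age']
print(rows[1])  # ['Alice', '30']  (note: field values are always strings)

# csv.DictReader: the header row becomes the keys
for row in csv.DictReader(io.StringIO(csv_text)):
    print(row['name'], row['age'])
# Alice 30
# Bob 25
```

Either way, the csv module gives you strings; convert with `int()` or `float()` yourself when a column is numeric.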
Writing CSV
Write CSV files using csv.writer for lists or csv.DictWriter when you have dictionaries.
Using csv.writer
```python
import csv

data = [
    ['name', 'age', 'city'],
    ['Alice', 30, 'New York'],
    ['Bob', 25, 'Los Angeles']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
```
Always pass newline='' when opening a CSV file for writing; on Windows, omitting it doubles the line endings and produces a blank row after every record.
Using csv.DictWriter
```python
import csv

users = [
    {'name': 'Alice', 'age': 30, 'city': 'New York'},
    {'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}
]

with open('users.csv', 'w', newline='') as file:
    fieldnames = ['name', 'age', 'city']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()  # Write column names
    writer.writerows(users)
```
DictWriter requires fieldnames to define column order. Use writeheader() to write the header row.
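DictWriter also has two options worth knowing for imperfect data: `restval` supplies a fill value when a row dict is missing a field, and `extrasaction` controls what happens to keys that are not in `fieldnames` (the default, `'raise'`, throws a ValueError). A sketch with made-up rows, written to an in-memory buffer:

```python
import csv
import io

fieldnames = ['name', 'age', 'city']
rows = [
    {'name': 'Alice', 'age': 30},                    # missing 'city'
    {'name': 'Bob', 'age': 25, 'email': 'b@x.com'},  # missing 'city', extra 'email'
]

buf = io.StringIO()
writer = csv.DictWriter(
    buf,
    fieldnames=fieldnames,
    restval='N/A',          # fill value for missing keys
    extrasaction='ignore',  # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
# name,age,city
# Alice,30,N/A
# Bob,25,N/A
```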
Practice: CSV Files
Task: Write read_csv(filename) that prints each row as a formatted string.
Solution:

```python
import csv

def read_csv(filename):
    with open(filename, 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            print(', '.join(f"{key}: {value}" for key, value in row.items()))

# Test
read_csv('users.csv')
```
Task: Write save_to_csv(data, filename) that saves a list of dicts to CSV.
Solution:

```python
import csv

def save_to_csv(data, filename):
    if not data:
        return
    fieldnames = data[0].keys()
    with open(filename, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)

# Test
users = [{'name': 'Alice', 'age': 30}]
save_to_csv(users, 'test.csv')
```
Task: Write filter_csv(input, output, column, value) that copies only matching rows.
Solution:

```python
import csv

def filter_csv(input_file, output_file, column, value):
    with open(input_file, 'r') as f_in:
        reader = csv.DictReader(f_in)
        rows = [r for r in reader if r[column] == value]
    if rows:
        with open(output_file, 'w', newline='') as f_out:
            writer = csv.DictWriter(f_out, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)

# Test
filter_csv('users.csv', 'nyc.csv', 'city', 'New York')
```
Reading JSON
Python's json module converts JSON data to Python objects. Objects become dicts, arrays become lists, and JSON types map naturally to Python types.
Loading from File
```python
import json

# Load from file
with open('config.json', 'r') as file:
    data = json.load(file)

print(data['name'])
print(data['settings']['theme'])
```
json.load() reads from a file object. The result is a Python dict (or list if the JSON root is an array).
Loading from String
```python
import json

json_string = '{"name": "Alice", "age": 30}'
data = json.loads(json_string)
print(data['name'])  # Alice
print(type(data))    # <class 'dict'>
```
json.loads() (with 's' for string) parses a JSON string. Use this for API responses or data from other sources.
Type Mapping
| JSON Type | Python Type |
|---|---|
| object {} | dict |
| array [] | list |
| string "" | str |
| number (int) | int |
| number (float) | float |
| true/false | True/False |
| null | None |
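The table above can be checked directly by parsing a string that uses every type. A small illustrative sketch (the document content is made up):

```python
import json

doc = '{"name": "Alice", "scores": [95, 87.5], "active": true, "manager": null}'
data = json.loads(doc)

print(type(data))               # <class 'dict'>   (object -> dict)
print(type(data['name']))       # <class 'str'>    (string -> str)
print(type(data['scores']))     # <class 'list'>   (array -> list)
print(type(data['scores'][0]))  # <class 'int'>    (number -> int)
print(type(data['scores'][1]))  # <class 'float'>  (number -> float)
print(data['active'])           # True             (true -> True)
print(data['manager'])          # None             (null -> None)
```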
Writing JSON
Convert Python objects to JSON using json.dump() for files or json.dumps() for strings.
Writing to File
```python
import json

data = {
    'name': 'Alice',
    'age': 30,
    'skills': ['Python', 'SQL', 'Git']
}

with open('user.json', 'w') as file:
    json.dump(data, file, indent=2)
```
Use indent=2 for human-readable output. Without indent, JSON is written as a single line.
Converting to String
```python
import json

data = {'name': 'Bob', 'active': True}

# Compact JSON
json_str = json.dumps(data)
print(json_str)  # {"name": "Bob", "active": true}

# Pretty JSON
pretty = json.dumps(data, indent=2)
print(pretty)
```
json.dumps() returns a string. Useful for sending JSON in API responses or storing in databases.
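Two other dumps() options come up often: sort_keys for stable, diff-friendly output, and ensure_ascii for non-ASCII text. A quick sketch (the data is made up):

```python
import json

data = {'city': 'Zürich', 'name': 'Eva'}

# sort_keys=True orders keys alphabetically, so repeated dumps are comparable
print(json.dumps(data, sort_keys=True))
# {"city": "Z\u00fcrich", "name": "Eva"}

# By default non-ASCII characters are escaped; ensure_ascii=False keeps them readable
print(json.dumps(data, ensure_ascii=False))
# {"city": "Zürich", "name": "Eva"}
```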
Practice: JSON Files
Task: Write load_json(filename) that loads and returns JSON data from a file.
Solution:

```python
import json

def load_json(filename):
    with open(filename, 'r') as f:
        return json.load(f)

# Test
data = load_json('config.json')
print(data)
```
Task: Write save_json(data, filename) that saves a dict to a pretty-printed JSON file.
Solution:

```python
import json

def save_json(data, filename):
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

# Test
config = {'theme': 'dark', 'lang': 'en'}
save_json(config, 'settings.json')
```
Task: Write update_json(filename, key, value) that updates a key in a JSON file.
Solution:

```python
import json

def update_json(filename, key, value):
    with open(filename, 'r') as f:
        data = json.load(f)
    data[key] = value
    with open(filename, 'w') as f:
        json.dump(data, f, indent=2)

# Test
update_json('settings.json', 'theme', 'light')
```
Nested Data
Real-world JSON often contains nested objects and arrays. Navigate them using chained key access and handle missing keys gracefully.
Accessing Nested Data
```python
data = {
    'user': {
        'name': 'Alice',
        'address': {
            'city': 'New York',
            'zip': '10001'
        }
    },
    'orders': [
        {'id': 1, 'total': 99.99},
        {'id': 2, 'total': 149.99}
    ]
}

# Access nested values
city = data['user']['address']['city']
first_order = data['orders'][0]['total']
print(f"City: {city}, First order: ${first_order}")
```
Chain keys to drill down into nested structures. Use indices for arrays within the JSON.
Safe Access with .get()
```python
# Risky: raises KeyError if any key is missing
# city = data['user']['location']['city']

# Safe: returns a default if any key is missing
city = data.get('user', {}).get('location', {}).get('city', 'Unknown')

# Or use try/except
try:
    city = data['user']['location']['city']
except KeyError:
    city = 'Unknown'
```
Chained .get() calls prevent KeyError on missing keys. Provide empty dicts as defaults to continue the chain.
Iterating Nested Arrays
```python
data = {
    'users': [
        {'name': 'Alice', 'skills': ['Python', 'SQL']},
        {'name': 'Bob', 'skills': ['JavaScript', 'React']}
    ]
}

for user in data['users']:
    print(f"{user['name']}:")
    for skill in user['skills']:
        print(f"  - {skill}")
```
Iterate over nested arrays using standard for loops. Each element can be a dict, list, or primitive value.
Practice: Nested Data
Task: Write get_nested(data, keys) that safely gets a value by a list of keys.
Solution:

```python
def get_nested(data, keys, default=None):
    result = data
    for key in keys:
        if isinstance(result, dict):
            result = result.get(key)
        elif isinstance(result, list) and isinstance(key, int):
            result = result[key] if key < len(result) else None
        else:
            return default
        if result is None:
            return default
    return result

# Test
data = {'user': {'address': {'city': 'NYC'}}}
print(get_nested(data, ['user', 'address', 'city']))  # NYC
```
Task: Write flatten_json(data) that converts nested dict to flat with dot notation keys.
Solution:

```python
def flatten_json(data, prefix=''):
    result = {}
    for key, value in data.items():
        new_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            result.update(flatten_json(value, new_key))
        else:
            result[new_key] = value
    return result

# Test
nested = {'user': {'name': 'Alice', 'address': {'city': 'NYC'}}}
print(flatten_json(nested))
# {'user.name': 'Alice', 'user.address.city': 'NYC'}
```
Task: Write csv_to_json(csv_file, json_file) that converts a CSV file to JSON array.
Solution:

```python
import csv
import json

def csv_to_json(csv_file, json_file):
    with open(csv_file, 'r') as f:
        reader = csv.DictReader(f)
        data = list(reader)
    with open(json_file, 'w') as f:
        json.dump(data, f, indent=2)

# Test
csv_to_json('users.csv', 'users.json')
```
Task: Write json_to_csv(json_file, csv_file) that converts a JSON array to CSV.
Solution:

```python
import csv
import json

def json_to_csv(json_file, csv_file):
    with open(json_file, 'r') as f:
        data = json.load(f)
    if not data:
        return
    with open(csv_file, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=data[0].keys())
        writer.writeheader()
        writer.writerows(data)

# Test
json_to_csv('users.json', 'users_export.csv')
```
Task: Write merge_json(files, output) that merges multiple JSON files into one.
Solution:

```python
import json

def merge_json(files, output):
    merged = {}
    for filename in files:
        with open(filename, 'r') as f:
            data = json.load(f)
        merged.update(data)
    with open(output, 'w') as f:
        json.dump(merged, f, indent=2)

# Test
merge_json(['config1.json', 'config2.json'], 'merged.json')
```
Key Takeaways
CSV for Tabular Data
Use CSV for spreadsheet-like data with rows and columns. Simple and universal.
JSON for Structured Data
Use JSON for nested, hierarchical data. Perfect for APIs and configs.
DictReader is Cleaner
csv.DictReader makes code readable by using column names as keys.
newline for CSV
Always use newline='' when opening CSV files for writing on Windows.
indent for Readability
Use indent=2 in json.dump() for human-readable JSON output.
Safe Nested Access
Use .get() chains or try/except for safely accessing nested JSON data.