Why File I/O?
Programs that only work with data in memory lose everything when they stop. File I/O allows your code to persist data, read configuration, process logs, and interact with other programs through files.
Files Are Persistent Storage
Files store data beyond program execution. Text files hold human-readable content, while binary files store raw bytes for images, audio, and executables.
Why it matters: File I/O is essential for configuration files, data processing, logging, exports, and integration with other systems.
Configuration Files
Store settings, preferences, and API keys that persist across runs.
Data Processing
Read CSV, JSON, and text files to analyze and transform data.
Logging
Write application logs for debugging and monitoring.
Data Exchange
Share data between programs using standard file formats.
File Modes
When opening a file, you specify a mode that determines what operations are allowed. Choose the wrong mode and you might accidentally overwrite important data or fail to read existing content.
File Opening Modes
'r'
Read
Open for reading (default). The file must exist; otherwise open() raises FileNotFoundError.
'w'
Write
Open for writing. Creates the file if it does not exist. Truncates existing content!
'a'
Append
Open for appending. Creates the file if it does not exist. Preserves existing content.
'rb' / 'wb'
Binary Modes
Read/write raw bytes. Use for images, audio, PDFs, and other non-text files.
'r+' / 'w+'
Read + Write
Open for both reading and writing. 'r+' keeps existing content and requires the file to exist, so use it to modify a file in place. 'w+' truncates the file first, just like 'w'.
# Different ways to open files
file = open('data.txt', 'r') # Read mode (default)
file = open('data.txt', 'w') # Write mode (overwrites!)
file = open('data.txt', 'a') # Append mode
file = open('image.png', 'rb') # Binary read
file = open('output.bin', 'wb') # Binary write
Warning: Write mode ('w') truncates the file immediately when opened, erasing all existing content!
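The difference between the modes is easiest to see by using them one after another on the same file. The following is a minimal sketch, not a definitive reference: the helper name demo_modes and the file names are made up for illustration, and everything happens inside a temporary directory so no real data is touched.

```python
import os
import tempfile

def demo_modes(directory):
    """Return the file content observed after each mode is used in turn."""
    path = os.path.join(directory, "modes_demo.txt")  # hypothetical file name
    seen = {}

    # 'w' creates the file, or truncates it if it already exists.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\n")
    with open(path, "w", encoding="utf-8") as f:
        f.write("second\n")  # "first" is gone at this point
    with open(path, "r", encoding="utf-8") as f:
        seen["after_w"] = f.read()

    # 'a' keeps existing content and writes at the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("third\n")
    with open(path, "r", encoding="utf-8") as f:
        seen["after_a"] = f.read()

    # 'r+' keeps existing content and starts at position 0,
    # so writing overwrites characters in place.
    with open(path, "r+", encoding="utf-8") as f:
        f.write("SECOND")
    with open(path, "r", encoding="utf-8") as f:
        seen["after_r_plus"] = f.read()

    # 'r' on a missing file raises FileNotFoundError.
    try:
        open(os.path.join(directory, "missing.txt"), "r")
    except FileNotFoundError:
        seen["missing"] = "FileNotFoundError"

    return seen

with tempfile.TemporaryDirectory() as tmp:
    results = demo_modes(tmp)

for step, value in results.items():
    print(step, repr(value))
```

Note that 'r+' does not insert text: it replaces whatever was already at the write position, which is why only the first word changes.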
Context Managers
The with statement ensures files are properly closed even if errors occur. This is the recommended way to work with files in Python.
[Figure: Context Manager Flow for with open('file.txt') as f:]
Manual vs Context Manager
Bad: Manual Close
# Manual close - risky!
file = open('data.txt')
content = file.read()
# If error happens here...
# file never gets closed!
file.close()
Good: With Statement
# Context manager - safe!
with open('data.txt') as file:
    content = file.read()
    # Even if an error happens here...
# ...the file is ALWAYS closed
Always use the with statement for file operations. It handles cleanup automatically, even when exceptions occur.
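To make the guarantee concrete, the sketch below raises an error inside a with block and then checks the file object's closed attribute. It also writes out the try/finally pattern that the with statement roughly corresponds to. The function names read_then_fail and read_with_try_finally are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

def read_then_fail(path):
    """Open a file with `with`, raise inside the block, and return the file object."""
    handle = None
    try:
        with open(path, "r", encoding="utf-8") as f:
            handle = f
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass  # the error is swallowed only so we can inspect the handle
    return handle

def read_with_try_finally(path):
    """Roughly what the with statement does, written out by hand."""
    f = open(path, "r", encoding="utf-8")
    try:
        return f.read()
    finally:
        f.close()  # runs whether or not read() raised

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(demo_path, "w", encoding="utf-8") as out:
        out.write("hello")

    leaked = read_then_fail(demo_path)
    text = read_with_try_finally(demo_path)

print(leaked.closed)  # the file was closed despite the exception
print(text)
```

The with form is shorter and harder to get wrong, which is why it is preferred over writing try/finally by hand.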
Reading Files
Python offers multiple ways to read file content: all at once, line by line, or in chunks. Choose based on file size and your processing needs.
Read Entire File
# Read entire file as string
with open('data.txt', 'r') as file:
    content = file.read()
    print(content)
# Read all lines as list
with open('data.txt', 'r') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())
Use read() for small files. readlines() returns a list where each element is a line including the newline character.
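The trailing newline that readlines() keeps is a common source of surprises. As a small illustration, the sketch below reads the same file three ways; compare_read_methods and the file name are assumptions made for this example, and str.splitlines() is shown as one way to get lines without the newline characters.

```python
import os
import tempfile

def compare_read_methods(path):
    """Read the same file three ways and return all three results."""
    with open(path, "r", encoding="utf-8") as f:
        whole = f.read()                   # one string
    with open(path, "r", encoding="utf-8") as f:
        with_newlines = f.readlines()      # list, newlines kept
    without_newlines = whole.splitlines()  # list, newlines removed
    return whole, with_newlines, without_newlines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "names.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("alpha\nbeta\ngamma")
    whole, with_newlines, without_newlines = compare_read_methods(path)

print(repr(whole))
print(with_newlines)
print(without_newlines)
```

Because the last line of this file has no trailing newline, the last element returned by readlines() has none either.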
Read Line by Line (Memory Efficient)
# Iterate directly - best for large files
with open('big_file.txt', 'r') as file:
    for line in file:
        print(line.strip())
# Read single line
with open('data.txt', 'r') as file:
    first_line = file.readline()
    second_line = file.readline()
Iterating over the file object is memory efficient because it reads one line at a time instead of loading everything into memory.
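Here is one way the line-by-line approach might be used to summarize a file without ever building a list of its lines. The helper names line_stats and read_with_readline are hypothetical. The second helper also shows that readline() returns an empty string at the end of the file, which is how a manual loop knows when to stop.

```python
import os
import tempfile

def line_stats(path):
    """Return (line_count, longest_line_length), processing one line at a time."""
    count = 0
    longest = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            count += 1
            longest = max(longest, len(line.rstrip("\n")))
    return count, longest

def read_with_readline(path):
    """Collect lines using readline(), which returns '' at end of file."""
    lines = []
    with open(path, "r", encoding="utf-8") as f:
        while True:
            line = f.readline()
            if line == "":
                break  # empty string means there is nothing left to read
            lines.append(line)
    return lines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rows.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        for i in range(1000):
            f.write(f"row {i}\n")
    stats = line_stats(path)
    first_three = read_with_readline(path)[:3]

print(stats)
print(first_three)
```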
Reading with Encoding
# Specify encoding for non-ASCII characters
with open('data.txt', 'r', encoding='utf-8') as file:
    content = file.read()
# Handle encoding errors
with open('data.txt', 'r', encoding='utf-8', errors='ignore') as f:
    content = f.read()
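Specify encoding='utf-8' whenever a file may contain non-ASCII text. If you omit encoding, Python may fall back to a platform-dependent default, so the same code can behave differently on another machine. The errors parameter controls how undecodable bytes are handled; errors='ignore' silently drops them, so use it only when losing characters is acceptable.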
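The effect of the errors parameter can be seen by reading a file that is not valid UTF-8. In this sketch, the helper decode_with and the file name are invented for illustration; the file deliberately contains a Latin-1 byte (0xE9) that UTF-8 cannot decode in this position.

```python
import os
import tempfile

def decode_with(path, errors):
    """Read a file as UTF-8 using the given error-handling strategy."""
    with open(path, "r", encoding="utf-8", errors=errors) as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "latin1.txt")  # hypothetical file name
    # Bytes for "café au lait" in Latin-1; 0xE9 is invalid UTF-8 here.
    with open(path, "wb") as f:
        f.write(b"caf\xe9 au lait")

    ignored = decode_with(path, "ignore")    # drops the bad byte
    replaced = decode_with(path, "replace")  # inserts U+FFFD in its place
    try:
        decode_with(path, "strict")          # the default behavior
        strict_failed = False
    except UnicodeDecodeError:
        strict_failed = True

print(repr(ignored))
print(repr(replaced))
print(strict_failed)
```

The default, errors='strict', is usually the safest choice because it surfaces the problem instead of hiding it.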
Practice: Reading Files
Task: Create a file 'sample.txt' with three lines, then read and print its content.
Solution:
# First create the file
with open('sample.txt', 'w') as f:
    f.write("Line 1\nLine 2\nLine 3")
# Then read and print
with open('sample.txt', 'r') as f:
    content = f.read()
    print(content)
Task: Write a function count_lines(filename) that returns the number of lines in a file.
Solution:
def count_lines(filename):
    with open(filename, 'r') as f:
        return len(f.readlines())
# Test
print(count_lines('sample.txt')) # 3
Task: Write search_file(filename, word) that returns all lines containing the word.
Solution:
def search_file(filename, word):
    matches = []
    with open(filename, 'r') as f:
        for line in f:
            if word in line:
                matches.append(line.strip())
    return matches
# Test
print(search_file('sample.txt', 'Line'))
Writing Files
Writing files creates new content or overwrites existing files. Append mode adds to the end without destroying existing data.
Write Mode (Overwrites)
# Write creates/overwrites file
with open('output.txt', 'w') as file:
    file.write("Hello, World!\n")
    file.write("This is line 2.\n")
# Write multiple lines at once
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open('output.txt', 'w') as file:
    file.writelines(lines)
Write mode erases everything in the file when opened. Use append mode if you want to preserve existing content.
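One detail worth illustrating is that writelines() does not add newline characters for you, despite its name. The sketch below contrasts the two outcomes; write_items and the file names are hypothetical and exist only for this example.

```python
import os
import tempfile

def write_items(path, items):
    """Write each item on its own line by adding the newline explicitly."""
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(f"{item}\n" for item in items)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.txt")      # hypothetical file names
    fixed_path = os.path.join(tmp, "fixed.txt")

    # Without newlines in the strings, everything lands on one line.
    with open(raw_path, "w", encoding="utf-8") as f:
        f.writelines(["a", "b", "c"])
    with open(raw_path, "r", encoding="utf-8") as f:
        joined = f.read()

    write_items(fixed_path, ["a", "b", "c"])
    with open(fixed_path, "r", encoding="utf-8") as f:
        separated = f.read()

print(repr(joined))
print(repr(separated))
```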
Append Mode (Preserves)
# Append adds to end of file
with open('log.txt', 'a') as file:
    file.write("New log entry\n")
# Append with timestamp
from datetime import datetime
with open('log.txt', 'a') as file:
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    file.write(f"[{timestamp}] Event occurred\n")
Append mode is perfect for log files where you want to add new entries without losing history.
Writing with Print
# Use print() with file parameter
with open('output.txt', 'w') as file:
    print("Using print!", file=file)
    print("Automatic newlines", file=file)
    print("Value:", 42, file=file)
Using print() with the file parameter adds a newline after each call and accepts multiple arguments, just like a regular print().
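Since print() behaves the same when writing to a file, its sep and end parameters work there too. As a brief sketch (the file name is an assumption), the following shows exactly what ends up in the file:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "printed.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        print("Value:", 42, file=f)            # arguments joined by a space
        print("a", "b", "c", sep=",", file=f)  # custom separator
        print("no newline", end="", file=f)    # suppress the trailing newline
    with open(path, "r", encoding="utf-8") as f:
        written = f.read()

print(repr(written))
```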
Practice: Writing Files
Task: Write a function save_list(filename, items) that writes each item on a new line.
Solution:
def save_list(filename, items):
    with open(filename, 'w') as f:
        for item in items:
            f.write(f"{item}\n")
# Test
fruits = ['apple', 'banana', 'cherry']
save_list('fruits.txt', fruits)
Task: Write log_message(message) that appends timestamped messages to 'app.log'.
Solution:
from datetime import datetime
def log_message(message):
    with open('app.log', 'a') as f:
        ts = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        f.write(f"[{ts}] {message}\n")
# Test
log_message("Application started")
log_message("User logged in")
Task: Write copy_file(source, dest) that copies content from one file to another.
Solution:
def copy_file(source, dest):
    with open(source, 'r') as src:
        content = src.read()
    with open(dest, 'w') as dst:
        dst.write(content)
# Test
copy_file('sample.txt', 'sample_copy.txt')
Task: Write filter_file(source, dest, keyword) that copies only lines containing keyword.
Solution:
def filter_file(source, dest, keyword):
    with open(source, 'r') as src:
        lines = src.readlines()
    with open(dest, 'w') as dst:
        for line in lines:
            if keyword in line:
                dst.write(line)
# Test
filter_file('app.log', 'errors.log', 'ERROR')
Task: Write number_lines(source, dest) that adds line numbers to each line.
Solution:
def number_lines(source, dest):
    with open(source, 'r') as src:
        lines = src.readlines()
    with open(dest, 'w') as dst:
        for i, line in enumerate(lines, 1):
            dst.write(f"{i:4d}: {line}")
# Test
number_lines('sample.txt', 'numbered.txt')
Binary Mode
Binary mode reads and writes raw bytes without text encoding. Use it for images, audio, PDFs, executables, and any non-text file.
Reading Binary Files
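# Read binary file
with open('image.png', 'rb') as file:
    data = file.read()
    print(f"File size: {len(data)} bytes")
    print(f"First 10 bytes: {data[:10]}")
# Placeholder: replace with the real work done on each chunk
def process(chunk):
    print(f"Got {len(chunk)} bytes")
# Read binary in chunks (the := operator requires Python 3.8+)
with open('large_file.bin', 'rb') as file:
    while chunk := file.read(1024):
        process(chunk)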
Binary mode returns bytes objects instead of strings. Reading in chunks is memory efficient for large files.
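A typical use of chunked reading is computing a checksum of a file that may be too large to load at once. The sketch below uses hashlib from the standard library; the function name sha256_of_file, the chunk size, and the file name are assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=64 * 1024):
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # read() returns b'' at end of file, which ends the loop (Python 3.8+)
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "payload.bin")  # hypothetical file name
    payload = bytes(range(256)) * 1000
    with open(path, "wb") as f:
        f.write(payload)

    chunked = sha256_of_file(path, chunk_size=4096)
    expected = hashlib.sha256(payload).hexdigest()  # hash of the data in one go

print(chunked == expected)
```

Feeding the hash object one chunk at a time gives the same digest as hashing all the bytes at once.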
Writing Binary Files
# Copy binary file
with open('source.png', 'rb') as src:
    data = src.read()
with open('copy.png', 'wb') as dst:
    dst.write(data)
# Write raw bytes
with open('data.bin', 'wb') as file:
    file.write(b'\x00\x01\x02\x03')
Binary write expects bytes, not strings. Use the b prefix for byte literals or encode strings with .encode().
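To illustrate the str and bytes distinction, the sketch below encodes text before writing it in binary mode, decodes it after reading, and shows that passing a plain string to a binary file raises TypeError. The helper names and the file name are hypothetical.

```python
import os
import tempfile

def write_text_as_bytes(path, text):
    """Encode text as UTF-8 and write it in binary mode; return bytes written."""
    with open(path, "wb") as f:
        return f.write(text.encode("utf-8"))

def read_bytes_as_text(path):
    """Read raw bytes and decode them back into a string."""
    with open(path, "rb") as f:
        return f.read().decode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "text.bin")  # hypothetical file name
    # "h\u00e9llo" is "hello" with an accented e, which needs two bytes in UTF-8
    byte_count = write_text_as_bytes(path, "h\u00e9llo")
    round_trip = read_bytes_as_text(path)

    try:
        with open(path, "wb") as f:
            f.write("plain str")  # not bytes, so this is rejected
        str_rejected = False
    except TypeError:
        str_rejected = True

print(byte_count)
print(round_trip == "h\u00e9llo")
print(str_rejected)
```

Five characters became six bytes here, which is a reminder that character counts and byte counts are not the same thing.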
Text Mode
- Returns strings
- Handles encoding
- Converts line endings
- Cannot handle binary data
Binary Mode
- Returns bytes
- No encoding applied
- Preserves exact bytes
- Works with any file
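The line-ending conversion in the comparison above can be observed by reading the same file in both modes. This is a small sketch with an assumed file name; it writes Windows-style line endings as raw bytes and then reads them back each way, relying on Python's default newline handling when reading text.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "endings.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"one\r\ntwo\r\n")  # exact bytes, no translation

    with open(path, "rb") as f:
        raw = f.read()   # bytes, exactly as stored
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()  # str, with \r\n translated to \n

print(raw, len(raw))
print(repr(text), len(text))
```

This translation is harmless for text but would corrupt an image or archive, which is why non-text files should be opened in binary mode.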
Practice: Binary Files
Task: Write get_file_size(filename) that returns file size in bytes using binary read.
Solution:
def get_file_size(filename):
    with open(filename, 'rb') as f:
        data = f.read()
    return len(data)
# Test
print(f"Size: {get_file_size('sample.txt')} bytes")
Task: Write copy_binary(src, dst) that copies any file type using binary mode.
Solution:
def copy_binary(src, dst):
    with open(src, 'rb') as source:
        data = source.read()
    with open(dst, 'wb') as dest:
        dest.write(data)
# Test - works for any file type
copy_binary('image.png', 'image_backup.png')
Task: Write is_png(filename) that checks if file starts with PNG signature bytes.
Solution:
def is_png(filename):
    # PNG signature: 89 50 4E 47 0D 0A 1A 0A
    png_sig = b'\x89PNG\r\n\x1a\n'
    with open(filename, 'rb') as f:
        header = f.read(8)
    return header == png_sig
# Test
print(is_png('image.png')) # True or False
Key Takeaways
Choose the Right Mode
Use 'r' for reading, 'w' for overwriting, 'a' for appending, and 'b' suffix for binary files.
Always Use With
Context managers ensure files are closed even if errors occur. Prefer them over calling close() yourself, which is easy to skip when an exception is raised.
Iterate for Large Files
Loop over file objects line by line instead of read() to handle large files efficiently.
Write Mode Truncates
Opening with 'w' erases existing content immediately. Use 'a' to preserve data.
Binary for Non-Text
Use 'rb' and 'wb' for images, audio, PDFs, and any file that is not plain text.
Specify Encoding
Use encoding='utf-8' for international text to avoid encoding errors.