3.2. Generators, Decorators and Context Managers
Overview
These three features represent some of Python's most powerful and "Pythonic" constructs. They allow you to write cleaner, more efficient, and more maintainable code by managing state, resources, and behaviors with minimal boilerplate. Mastering these tools will elevate your Python code from functional to elegant and professional.
Generators & Iterators
Understanding Iterators
Before diving into generators, let's understand what an iterator is. An iterator is an object that implements the iterator protocol, which consists of the __iter__() and __next__() methods. When you use a for loop, Python automatically calls these methods behind the scenes.
# Example: Manual iteration
my_list = [1, 2, 3, 4, 5]
iterator = iter(my_list)
print(next(iterator)) # Output: 1
print(next(iterator)) # Output: 2
print(next(iterator)) # Output: 3
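Once the iterator is exhausted, calling next() raises StopIteration - the signal a for loop uses to know when to stop. Continuing the example above:

print(next(iterator)) # Output: 4
print(next(iterator)) # Output: 5
try:
    next(iterator)
except StopIteration:
    print("Iterator exhausted")
# next() also accepts a default to return instead of raising:
print(next(iterator, 'done')) # Output: done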
Introduction to Generators
Generators are a special type of iterator, defined as functions that use the yield keyword instead of return. They're incredibly memory-efficient because they generate values on demand rather than storing them all in memory at once.
def simple_generator():
    yield 1
    yield 2
    yield 3

# Using the generator
gen = simple_generator()
for value in gen:
    print(value)
# Output: 1, 2, 3
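A detail worth seeing up close: the function body doesn't run when you call the generator function. Execution starts on the first next() and pauses at each yield. A small sketch, with print statements added purely to make the pausing visible:

def chatty_generator():
    print("Starting")
    yield 1
    print("Resumed after first yield")
    yield 2

gen = chatty_generator()  # Nothing is printed yet
print(next(gen))          # Prints "Starting", then 1
print(next(gen))          # Prints "Resumed after first yield", then 2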
Why Generators Matter
Consider the difference between these two approaches:
# Memory-intensive approach
def get_squares_list(n):
    return [x**2 for x in range(n)]

# Memory-efficient approach
def get_squares_generator(n):
    for x in range(n):
        yield x**2

# With a large n, the list approach uses a lot of memory
large_squares_list = get_squares_list(1000000)  # ~8MB for the list's references alone, plus the int objects themselves
# The generator approach uses minimal memory
large_squares_gen = get_squares_generator(1000000)  # ~100-200 bytes, regardless of n
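You can check the numbers yourself with sys.getsizeof. Exact figures vary by Python version, and note that for a list it counts only the array of references, not the int objects it points to:

import sys

print(sys.getsizeof(get_squares_list(1000000)))       # ~8 MB for the list's references
print(sys.getsizeof(get_squares_generator(1000000)))  # ~100-200 bytes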
Practical Generator Examples
Here's a more practical example - reading large files:
def read_large_file(file_path):
    """Generator that reads a file line by line without loading it entirely into memory"""
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

# Usage
for line in read_large_file('huge_data.txt'):
    process_line(line)  # Process each line individually
Another useful pattern is the infinite sequence generator:
def fibonacci():
    """Infinite Fibonacci sequence generator"""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Usage
fib = fibonacci()
first_10_fibs = [next(fib) for _ in range(10)]
print(first_10_fibs)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
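Since the sequence is infinite, you can't loop over it to the end; itertools.islice is the standard tool for taking a bounded slice of an endless generator:

from itertools import islice

# First 10 Fibonacci numbers, without ever materializing the full sequence
print(list(islice(fibonacci(), 10)))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]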
Generator Expressions & Comprehensions
Generator expressions provide a concise way to create generators, similar to list comprehensions but with parentheses instead of square brackets:
# List comprehension (creates entire list in memory)
squares_list = [x**2 for x in range(10)]
# Generator expression (creates generator object)
squares_gen = (x**2 for x in range(10))
print(type(squares_list)) # <class 'list'>
print(type(squares_gen)) # <class 'generator'>
Practical Generator Expression Examples
# Processing large datasets efficiently
def process_large_dataset(data_source):
    # Filter and transform data without loading everything into memory
    filtered_data = (
        transform(item)
        for item in data_source
        if is_valid(item)
    )
    return filtered_data

# Chaining generator expressions
numbers = range(1000000)
even_squares = (x**2 for x in numbers if x % 2 == 0)
large_even_squares = (x for x in even_squares if x > 1000)

# Only compute values as needed
for value in large_even_squares:
    if value > 10000:
        break
    print(value)
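Generator expressions are also handy when passed directly to functions that consume iterables; when the expression is the sole argument, you can even drop the extra parentheses:

# Lazily sum the even squares below one million
total = sum(x**2 for x in range(1000000) if x % 2 == 0)
print(total)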
Decorators
Decorators are a powerful feature that allows you to modify or enhance functions without changing their code. They're essentially functions that take another function as an argument and return a modified version of that function.
Basic Decorator Syntax
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

# This is equivalent to:
# say_hello = my_decorator(say_hello)

say_hello()
# Output:
# Something is happening before the function is called.
# Hello!
# Something is happening after the function is called.
Decorators with Arguments
To handle functions with arguments, write the wrapper to accept *args and **kwargs:
def my_decorator(func):
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__} with args: {args}, kwargs: {kwargs}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned: {result}")
        return result
    return wrapper

@my_decorator
def add(a, b):
    return a + b

result = add(3, 5)
# Output:
# Calling add with args: (3, 5), kwargs: {}
# add returned: 8
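A closely related pattern, not covered above, is a decorator that itself takes arguments - a function that returns a decorator. A minimal sketch, with repeat as a purely illustrative name:

from functools import wraps

def repeat(times):
    """Decorator factory: the argument configures the decorator it returns"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorator

@repeat(3)
def say_hi():
    print("Hi!")

say_hi()  # Prints "Hi!" three times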
Practical Decorator Examples
Timing Decorator
import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)  # Preserves original function metadata
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"{func.__name__} took {end_time - start_time:.4f} seconds")
        return result
    return wrapper

@timing_decorator
def slow_function():
    time.sleep(1)
    return "Done!"

slow_function()  # Output: slow_function took 1.0041 seconds
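One refinement worth knowing: for measuring elapsed time, time.perf_counter() is generally a better choice than time.time(), since it uses a high-resolution clock that isn't affected by system clock adjustments:

import time

start = time.perf_counter()
time.sleep(0.5)
print(f"Elapsed: {time.perf_counter() - start:.4f} seconds")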
Caching Decorator
import time
from functools import wraps

def cache_decorator(func):
    cache = {}
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Create a cache key from the arguments
        key = str(args) + str(sorted(kwargs.items()))
        if key in cache:
            print(f"Cache hit for {func.__name__}")
            return cache[key]
        print(f"Cache miss for {func.__name__}")
        result = func(*args, **kwargs)
        cache[key] = result
        return result
    return wrapper

@cache_decorator
def expensive_calculation(n):
    time.sleep(1)  # Simulate an expensive operation
    return n ** 2

print(expensive_calculation(5))  # Cache miss, takes 1 second
print(expensive_calculation(5))  # Cache hit, instant
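Hand-rolling a cache like this is instructive, but for real code the standard library already provides the pattern: functools.lru_cache memoizes results for you (the function's arguments must be hashable):

import time
from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None lets the cache grow without bound
def expensive_calculation(n):
    time.sleep(1)  # Simulate an expensive operation
    return n ** 2

print(expensive_calculation(5))  # Slow the first time
print(expensive_calculation(5))  # Instant: result comes from the cache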
Authentication Decorator
from functools import wraps

def require_auth(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # In a real app, this would check a session, JWT, etc.
        user_authenticated = check_user_auth()
        if not user_authenticated:
            raise PermissionError("Authentication required")
        return func(*args, **kwargs)
    return wrapper

@require_auth
def sensitive_operation():
    return "Sensitive data accessed"
Chaining Decorators
You can apply multiple decorators to a single function. They're applied from bottom to top:
def bold(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return f"<b>{result}</b>"
    return wrapper

def italic(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return f"<i>{result}</i>"
    return wrapper

@bold
@italic
def greet(name):
    return f"Hello, {name}!"

print(greet("World"))  # Output: <b><i>Hello, World!</i></b>
# This is equivalent to:
# greet = bold(italic(greet))
Practical Decorator Chaining
@timing_decorator
@cache_decorator
@require_auth
def complex_calculation(data):
    # Some expensive authenticated operation
    return sum(x**2 for x in data)

# This function is now:
# - protected by authentication
# - cached for performance
# - timed for monitoring
# Note that the order matters: because cache_decorator wraps require_auth,
# a cache hit returns before the auth check runs. If every call must be
# re-authenticated, put @require_auth on the outside instead.
Context Managers & the with Statement
Context managers provide a clean way to manage resources like files, network connections, or locks. They ensure that setup and cleanup code is executed properly, even if an error occurs.
Basic Context Manager Usage
# Without context manager (prone to errors)
file = open('data.txt', 'r')
data = file.read()
file.close() # What if an error occurs before this line?
# With context manager (guaranteed cleanup)
with open('data.txt', 'r') as file:
    data = file.read()
# File is automatically closed here, even if an error occurs
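Under the hood, the with statement behaves roughly like a try/finally block - which is what you would otherwise have to write by hand:

# Roughly what 'with open(...) as file:' does for you
file = open('data.txt', 'r')
try:
    data = file.read()
finally:
    file.close()  # Runs whether or not an exception occurred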
Multiple Context Managers
# Managing multiple resources
with open('input.txt', 'r') as infile, open('output.txt', 'w') as outfile:
    data = infile.read()
    processed_data = process(data)
    outfile.write(processed_data)
# Both files are automatically closed
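When the number of resources isn't known until runtime, contextlib.ExitStack lets you enter an arbitrary number of context managers and still get guaranteed cleanup (the file names here are just for illustration):

from contextlib import ExitStack

filenames = ['a.txt', 'b.txt', 'c.txt']
with ExitStack() as stack:
    files = [stack.enter_context(open(name)) for name in filenames]
    # All files are open here; all are closed when the block exits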
Context Managers for Non-File Resources
import threading

# Thread lock context manager
lock = threading.Lock()
shared_resource = 0  # Defined here so the example is self-contained

with lock:
    # Critical section - only one thread can execute this at a time
    shared_resource += 1
# Lock is automatically released
Writing Custom Context Managers
Class-Based Context Managers
To create a context manager using a class, implement the __enter__ and __exit__ methods:
class DatabaseConnection:
    def __init__(self, database_url):
        self.database_url = database_url
        self.connection = None

    def __enter__(self):
        print(f"Connecting to {self.database_url}")
        self.connection = connect_to_database(self.database_url)
        return self.connection

    def __exit__(self, exc_type, exc_value, traceback):
        if self.connection:
            if exc_type is not None:
                print(f"Error occurred: {exc_value}")
                self.connection.rollback()
            else:
                self.connection.commit()
            self.connection.close()
            print("Database connection closed")
        return False  # Don't suppress exceptions

# Usage
with DatabaseConnection("postgresql://localhost/mydb") as db:
    db.execute("INSERT INTO users (name) VALUES ('John')")
# Connection is automatically managed
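If __exit__ returns True instead, the exception is suppressed and execution continues after the with block. The standard library's contextlib.suppress is a ready-made context manager built on exactly this behavior:

import os
from contextlib import suppress

# Ignore the error if the file doesn't exist
with suppress(FileNotFoundError):
    os.remove('might_not_exist.txt')
# Execution continues here either way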
Function-Based Context Managers with contextlib
The contextlib module provides a simpler way to create context managers:
from contextlib import contextmanager
import os

@contextmanager
def change_directory(path):
    old_path = os.getcwd()
    try:
        os.chdir(path)
        yield path  # This is what gets returned by 'as'
    finally:
        os.chdir(old_path)

# Usage
with change_directory('/tmp'):
    print(os.getcwd())  # /tmp
    # Do work in /tmp
print(os.getcwd())  # Back to original directory
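This particular helper is common enough that Python 3.11 added it to the standard library as contextlib.chdir, so on recent versions you can skip the custom implementation:

# Python 3.11+
from contextlib import chdir
import os

with chdir('/tmp'):
    print(os.getcwd())  # /tmp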
Advanced Context Manager Example
Here's a more sophisticated example that manages a temporary environment:
from contextlib import contextmanager
import os
import tempfile
import shutil

@contextmanager
def temporary_environment(env_vars=None):
    """Context manager that creates a temporary directory and sets environment variables"""
    # Setup
    temp_dir = tempfile.mkdtemp()
    old_env = os.environ.copy()
    old_cwd = os.getcwd()
    try:
        # Set new environment variables
        if env_vars:
            os.environ.update(env_vars)
        # Change to the temp directory
        os.chdir(temp_dir)
        yield temp_dir
    finally:
        # Cleanup
        os.chdir(old_cwd)
        os.environ.clear()
        os.environ.update(old_env)
        shutil.rmtree(temp_dir)

# Usage
with temporary_environment({'DEBUG': 'True', 'ENV': 'test'}) as temp_path:
    print(f"Working in: {temp_path}")
    print(f"DEBUG env var: {os.environ.get('DEBUG')}")
    # Create temporary files, run tests, etc.
# Everything is cleaned up automatically
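If all you need is the temporary directory, tempfile.TemporaryDirectory is already a context manager that handles both creation and removal for you:

import tempfile

with tempfile.TemporaryDirectory() as temp_dir:
    print(f"Working in: {temp_dir}")
    # Create files, run tests, etc.
# The directory and everything in it are deleted here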
Putting It All Together
Here's a comprehensive example that combines generators, decorators, and context managers:
from contextlib import contextmanager
from functools import wraps
import time
import logging

# Decorator for logging function calls
def log_calls(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        logging.info(f"Calling {func.__name__}")
        result = func(*args, **kwargs)
        logging.info(f"{func.__name__} completed")
        return result
    return wrapper

# Context manager for timing operations
@contextmanager
def timing_context(operation_name):
    start_time = time.time()
    try:
        yield
    finally:
        end_time = time.time()
        logging.info(f"{operation_name} took {end_time - start_time:.4f} seconds")

# Generator for processing large datasets
# (Note: because this is a generator, log_calls fires when the generator
# object is created, not when its items are actually consumed.)
@log_calls
def process_large_dataset(data_source):
    """Generator that processes data in chunks"""
    for chunk in data_source:
        with timing_context(f"Processing chunk of size {len(chunk)}"):
            # Process each item in the chunk
            for item in chunk:
                if validate_item(item):
                    yield transform_item(item)

# Usage
def main():
    large_dataset = get_data_chunks()  # Returns chunks of data
    with timing_context("Full dataset processing"):
        processed_items = process_large_dataset(large_dataset)
        # Only process items as needed (lazy evaluation)
        for item in processed_items:
            if item.priority > 5:
                handle_high_priority_item(item)
Best Practices and Tips
- Use functools.wraps in decorators to preserve the original function's metadata
- Generators are memory-efficient - use them for large datasets or infinite sequences
- Context managers ensure cleanup - always use them for resource management
- Chain decorators thoughtfully - the order of application matters
- Generator expressions are powerful - use them for data-processing pipelines
- Custom context managers should handle exceptions gracefully
- Document your decorators - make it clear what they do and any side effects they have
These advanced Python features will help you write more professional, efficient, and maintainable code. They're the hallmarks of experienced Python developers and are essential for building robust applications.
Now, you're ready to learn about concurrency and parallelism!