Lesson 8: Itertools & Functional Patterns
Unlock Python's powerful itertools module for advanced iteration, and master functools tools like lru_cache, reduce, partial, and wraps to write cleaner, faster, more composable code.
Learning Objectives
By the end of this lesson, you will be able to:
- Use itertools infinite iterators: count, cycle, and repeat
- Generate combinations and permutations with itertools combinatoric tools
- Chain, slice, group, and filter iterables with itertools terminating iterators
- Aggregate data with functools.reduce
- Cache expensive function calls with functools.lru_cache
- Create specialized functions with functools.partial
- Preserve function metadata in decorators with functools.wraps
- Recognize when a functional approach is clearer than an imperative one
Estimated Time: 60-75 minutes
Prerequisites: Generators, comprehensions, and basic decorator knowledge
Why itertools & functools?
In the previous lesson, you learned to build generators and chain them into pipelines. Python's standard library takes this further with two battle-tested modules:
- itertools: a toolkit of fast, memory-efficient building blocks for iteration. Think of it as a box of Lego pieces for working with sequences.
- functools: higher-order functions and utilities that let you transform, cache, and compose functions themselves.
Together, they let you express complex data transformations in a few readable lines, without reinventing the wheel.
import itertools
import functools
# Example: get the 2 most expensive items per category
# Without itertools: lots of manual bookkeeping
# With itertools: clean and declarative
from itertools import groupby
from operator import itemgetter
products = [
    ("Electronics", "Laptop", 999),
    ("Electronics", "Phone", 699),
    ("Electronics", "Tablet", 449),
    ("Books", "Python Crash Course", 35),
    ("Books", "Clean Code", 40),
    ("Books", "DDIA", 45),
]
# Sort by category, then by descending price; groupby needs input sorted by its key
products.sort(key=lambda p: (p[0], -p[2]))
for category, items in groupby(products, key=itemgetter(0)):
    top = list(itertools.islice(items, 2))
    print(f"{category}: {[name for _, name, _ in top]}")
# Books: ['DDIA', 'Clean Code']
# Electronics: ['Laptop', 'Phone']
Key Terms
Higher-order function: A function that takes another function as an argument or returns one. Examples: map(), filter(), reduce(), decorators.
Pure function: A function that always returns the same output for the same input and has no side effects. Easy to test, cache, and reason about.
Lazy iterator: An object that produces values on demand rather than storing them all in memory. Every itertools function returns one.
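To make "on demand" concrete, here is a minimal sketch. The generator records every value it actually computes, and islice pulls only three items from an otherwise endless source. The names noisy_squares and produced are illustrative only and do not come from any library.

```python
from itertools import count, islice

produced = []  # records which inputs were actually computed


def noisy_squares():
    """Yield squares forever, logging each input as it is computed."""
    for n in count(1):
        produced.append(n)
        yield n * n


squares = noisy_squares()  # creating the generator runs no code yet
print(produced)            # []

first_three = list(islice(squares, 3))
print(first_three)         # [1, 4, 9]
print(produced)            # [1, 2, 3]
```

Only three squares were ever computed, even though the source never ends.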
Infinite Iterators
These three itertools functions produce values forever, so you must limit them with islice(), takewhile(), or a break.
count(start=0, step=1): Endless Counter
from itertools import count, islice
# count() is like range() with no stop
for i in count(10, 3):
    if i > 25:
        break
    print(i, end=" ")
# 10 13 16 19 22 25
# Auto-generate IDs
ids = count(1)
users = ["Alice", "Bob", "Carlos"]
user_records = [{"id": next(ids), "name": name} for name in users]
print(user_records)
# [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}, {'id': 3, 'name': 'Carlos'}]
# Use with zip to add indices (like enumerate, but customizable)
data = ["a", "b", "c", "d"]
indexed = list(zip(count(100), data))
print(indexed)
# [(100, 'a'), (101, 'b'), (102, 'c'), (103, 'd')]
cycle(iterable): Loop Forever
from itertools import cycle, islice
# Cycle repeats the iterable endlessly
colors = cycle(["red", "green", "blue"])
print([next(colors) for _ in range(7)])
# ['red', 'green', 'blue', 'red', 'green', 'blue', 'red']
# Round-robin assignment
teams = cycle(["Alpha", "Beta", "Gamma"])
tasks = ["Task A", "Task B", "Task C", "Task D", "Task E"]
assignments = {task: next(teams) for task in tasks}
print(assignments)
# {'Task A': 'Alpha', 'Task B': 'Beta', 'Task C': 'Gamma',
# 'Task D': 'Alpha', 'Task E': 'Beta'}
# Alternating row styles for a table
styles = cycle(["even", "odd"])
rows = ["Row 1", "Row 2", "Row 3", "Row 4"]
styled = [(row, next(styles)) for row in rows]
print(styled)
# [('Row 1', 'even'), ('Row 2', 'odd'), ('Row 3', 'even'), ('Row 4', 'odd')]
repeat(value, times=None): Same Value Repeatedly
from itertools import repeat
# Repeat a value n times
zeros = list(repeat(0, 5))
print(zeros) # [0, 0, 0, 0, 0]
# Useful with map() for element-wise operations
import operator
bases = [2, 3, 4, 5]
# Raise each base to the power of 3
cubes = list(map(operator.pow, bases, repeat(3)))
print(cubes) # [8, 27, 64, 125]
# Create a grid of default values
grid = [list(repeat(".", 5)) for _ in range(3)]
for row in grid:
    print(row)
# ['.', '.', '.', '.', '.']
# ['.', '.', '.', '.', '.']
# ['.', '.', '.', '.', '.']
Infinite Means Infinite
Never pass count() or cycle() directly to list(): the call never finishes and keeps consuming memory until the program crashes. Always limit them with islice(), takewhile(), zip() with a finite iterable, or a manual break.
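The three non-break ways of limiting an infinite iterator can be sketched side by side. The variable names here are illustrative.

```python
from itertools import count, cycle, islice, takewhile

# 1. islice: take a fixed number of items
first_four = list(islice(count(0, 5), 4))

# 2. takewhile: stop as soon as the predicate fails
under_20 = list(takewhile(lambda n: n < 20, count(0, 5)))

# 3. zip: the finite iterable ends the pairing
labels = list(zip("abc", cycle([True, False])))

print(first_four)  # [0, 5, 10, 15]
print(under_20)    # [0, 5, 10, 15]
print(labels)      # [('a', True), ('b', False), ('c', True)]
```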
Combinatoric Iterators
These functions generate every possible arrangement or selection from an iterable, which is useful for brute-force search, testing, scheduling, and puzzle solving.
product(*iterables, repeat=1): Cartesian Product
from itertools import product
# All combinations of two dice
dice_rolls = list(product(range(1, 7), repeat=2))
print(f"Total rolls: {len(dice_rolls)}") # 36
print(dice_rolls[:6])
# [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
# Cross-join two lists
sizes = ["S", "M", "L"]
colors = ["Red", "Blue"]
variants = list(product(sizes, colors))
print(variants)
# [('S', 'Red'), ('S', 'Blue'), ('M', 'Red'), ('M', 'Blue'), ('L', 'Red'), ('L', 'Blue')]
# Binary strings of length 4
bits = list(product("01", repeat=4))
print(["".join(b) for b in bits[:5]])
# ['0000', '0001', '0010', '0011', '0100']
permutations(iterable, r=None): All Orderings
from itertools import permutations
# All orderings of 3 items
perms = list(permutations(["A", "B", "C"]))
print(f"Total: {len(perms)}") # 6 (= 3!)
for p in perms:
    print(p)
# ('A', 'B', 'C')
# ('A', 'C', 'B')
# ('B', 'A', 'C')
# ('B', 'C', 'A')
# ('C', 'A', 'B')
# ('C', 'B', 'A')
# 2-letter permutations from 4 letters
pairs = list(permutations("ABCD", 2))
print(f"Total: {len(pairs)}") # 12 (= 4 × 3)
print(pairs[:6])
# [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'A'), ('B', 'C'), ('B', 'D')]
combinations(iterable, r): Choose Without Order
from itertools import combinations, combinations_with_replacement
# Choose 2 from 4 players: order doesn't matter
teams = list(combinations(["Alice", "Bob", "Carlos", "Dana"], 2))
print(f"Total teams: {len(teams)}") # 6 (= 4C2)
for team in teams:
    print(team)
# ('Alice', 'Bob')
# ('Alice', 'Carlos')
# ('Alice', 'Dana')
# ('Bob', 'Carlos')
# ('Bob', 'Dana')
# ('Carlos', 'Dana')
# combinations_with_replacement allows repeated elements
# Scoops of ice cream (can pick same flavor twice)
scoops = list(combinations_with_replacement(["vanilla", "chocolate", "strawberry"], 2))
print(scoops)
# [('vanilla', 'vanilla'), ('vanilla', 'chocolate'),
# ('vanilla', 'strawberry'), ('chocolate', 'chocolate'),
# ('chocolate', 'strawberry'), ('strawberry', 'strawberry')]
Which Combinatoric Do I Need?
Ask two questions: Does order matter (is AB different from BA)? Can elements repeat?
- Order matters, repeats allowed → product
- Order matters, no repeats → permutations
- Order ignored, no repeats → combinations
- Order ignored, repeats allowed → combinations_with_replacement
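Running all four tools on the same two-letter input makes the two questions visible in the output. This is a small sketch; the helper joined is a hypothetical convenience for printing, not part of itertools.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)

letters = "AB"


def joined(tuples):
    """Turn tuples of characters into short strings for display."""
    return ["".join(t) for t in tuples]


ordered_with_repeats = joined(product(letters, repeat=2))
ordered_no_repeats = joined(permutations(letters, 2))
unordered_no_repeats = joined(combinations(letters, 2))
unordered_with_repeats = joined(combinations_with_replacement(letters, 2))

print(ordered_with_repeats)    # ['AA', 'AB', 'BA', 'BB']
print(ordered_no_repeats)      # ['AB', 'BA']
print(unordered_no_repeats)    # ['AB']
print(unordered_with_repeats)  # ['AA', 'AB', 'BB']
```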
Terminating Iterators
These are the workhorses of itertools: they consume one or more iterables and produce a new iterable. All of them are lazy.
chain(*iterables): Concatenate Iterables
from itertools import chain
# Flatten multiple lists into one stream
list1 = [1, 2, 3]
list2 = [4, 5]
list3 = [6, 7, 8, 9]
combined = list(chain(list1, list2, list3))
print(combined) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
# chain.from_iterable: when you have a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = list(chain.from_iterable(nested))
print(flat) # [1, 2, 3, 4, 5, 6]
# Combine generators without loading everything
def evens(n):
    return (x for x in range(0, n, 2))
def odds(n):
    return (x for x in range(1, n, 2))
all_nums = list(chain(evens(6), odds(6)))
print(all_nums) # [0, 2, 4, 1, 3, 5]
islice(iterable, stop): Slice Any Iterable
from itertools import islice, count
# islice works on ANY iterable, unlike [start:stop], which needs a sequence
# Take the first 5 values from an infinite counter
first_five = list(islice(count(100), 5))
print(first_five) # [100, 101, 102, 103, 104]
# islice(iterable, start, stop, step)
# Skip first 2, take next 4
middle = list(islice(range(20), 2, 6))
print(middle) # [2, 3, 4, 5]
# Every 3rd item from a generator
thirds = list(islice(range(30), 0, 30, 3))
print(thirds) # [0, 3, 6, 9, 12, 15, 18, 21, 24, 27]
# Head of a file, without reading the entire file
def head(filepath, n=5):
    """Print first n lines of a file."""
    with open(filepath) as f:
        for line in islice(f, n):
            print(line, end="")
groupby(iterable, key=None): Group Consecutive Items
from itertools import groupby
from operator import itemgetter
# IMPORTANT: groupby only groups CONSECUTIVE equal elements
# You must sort by the key first!
sales = [
("Electronics", "Laptop", 999),
("Books", "Python Crash Course", 35),
("Electronics", "Phone", 699),
("Books", "Clean Code", 40),
("Electronics", "Tablet", 449),
]
# Sort by category first
sales.sort(key=itemgetter(0))
# Now group
for category, items in groupby(sales, key=itemgetter(0)):
    item_list = list(items)
    total = sum(price for _, _, price in item_list)
    print(f"{category}: {len(item_list)} items, ${total}")
# Books: 2 items, $75
# Electronics: 3 items, $2147
groupby Requires Sorted Input
groupby() only groups consecutive elements with the same key. If your data isn't sorted by the grouping key, you'll get multiple groups for the same key. Sort by that key first, or use defaultdict(list) for unsorted data.
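The pitfall and both remedies can be seen in a few lines. This is a sketch with made-up data; first_letter is an illustrative helper.

```python
from collections import defaultdict
from itertools import groupby

words = ["apple", "banana", "avocado", "blueberry"]


def first_letter(word):
    return word[0]


# Unsorted input: 'a' and 'b' each appear as two separate groups
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a', 'b']

# Remedy 1: sort by the same key, then group
sorted_groups = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(sorted_groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry']}

# Remedy 2: defaultdict(list) needs no sorting at all
by_letter = defaultdict(list)
for word in words:
    by_letter[first_letter(word)].append(word)
print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry']}
```

The defaultdict version holds every group in memory at once, whereas groupby streams one group at a time, so the better choice depends on the size of the data.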
takewhile & dropwhile: Conditional Slicing
from itertools import takewhile, dropwhile
# takewhile: yield items WHILE the condition is true, stop at the first False
data = [2, 4, 6, 8, 1, 3, 5, 10, 12]
leading_evens = list(takewhile(lambda x: x % 2 == 0, data))
print(leading_evens) # [2, 4, 6, 8] (stops at 1)
# dropwhile: skip items WHILE the condition is true, then yield the rest
after_evens = list(dropwhile(lambda x: x % 2 == 0, data))
print(after_evens) # [1, 3, 5, 10, 12] (starts from 1)
# Practical: Skip header lines in a file
lines = ["# Comment 1", "# Comment 2", "Name,Age", "Alice,30", "Bob,25"]
data_lines = list(dropwhile(lambda l: l.startswith("#"), lines))
print(data_lines)
# ['Name,Age', 'Alice,30', 'Bob,25']
zip_longest(*iterables, fillvalue=None)
from itertools import zip_longest
# Built-in zip stops at the shortest iterable
names = ["Alice", "Bob", "Carlos"]
scores = [95, 87]
print(list(zip(names, scores)))
# [('Alice', 95), ('Bob', 87)] (Carlos dropped!)
# zip_longest pads the shorter iterables
print(list(zip_longest(names, scores, fillvalue="N/A")))
# [('Alice', 95), ('Bob', 87), ('Carlos', 'N/A')]
Other Useful Tools
from itertools import accumulate, starmap, filterfalse
import operator
# accumulate: running totals (or any running operation)
values = [1, 2, 3, 4, 5]
running_sum = list(accumulate(values))
print(running_sum) # [1, 3, 6, 10, 15]
running_product = list(accumulate(values, operator.mul))
print(running_product) # [1, 2, 6, 24, 120]
running_max = list(accumulate(values, max))
print(running_max) # [1, 2, 3, 4, 5]
# starmap: like map(), but unpacks argument tuples
points = [(2, 5), (3, 2), (10, 3)]
powers = list(starmap(pow, points))
print(powers) # [32, 9, 1000] (2**5, 3**2, 10**3)
# filterfalse: opposite of filter
nums = range(10)
odds = list(filterfalse(lambda x: x % 2 == 0, nums))
print(odds) # [1, 3, 5, 7, 9]
functools.reduce
reduce() takes a two-argument function and applies it cumulatively to the items of an iterable, reducing it to a single value. It works like accumulate(), but returns only the final result.
from functools import reduce
# Sum of all values (same as sum(), but shows the concept)
total = reduce(lambda a, b: a + b, [1, 2, 3, 4, 5])
print(total) # 15
# How it works step by step:
# Step 1: a=1, b=2 → 3
# Step 2: a=3, b=3 → 6
# Step 3: a=6, b=4 → 10
# Step 4: a=10, b=5 → 15
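The relationship between accumulate() and reduce() can be checked directly: accumulate exposes every intermediate step, and its last item is what reduce returns. A brief sketch:

```python
from functools import reduce
from itertools import accumulate
import operator

values = [1, 2, 3, 4, 5]

running = list(accumulate(values, operator.add))  # every intermediate total
final = reduce(operator.add, values)              # only the last one

print(running)               # [1, 3, 6, 10, 15]
print(final)                 # 15
print(running[-1] == final)  # True
```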
Practical Uses
from functools import reduce
import operator
# Product of all numbers
numbers = [2, 3, 4, 5]
product = reduce(operator.mul, numbers)
print(product) # 120
# With an initial value (handles empty lists safely)
total = reduce(operator.add, [], 0)
print(total) # 0 (without initializer, empty list raises TypeError)
# Flatten a list of lists
nested = [[1, 2], [3, 4], [5, 6]]
flat = reduce(operator.add, nested)
print(flat) # [1, 2, 3, 4, 5, 6]
# Merge dictionaries left to right (shallow merge; later keys win)
dicts = [
{"a": 1, "b": 2},
{"b": 3, "c": 4},
{"c": 5, "d": 6},
]
merged = reduce(lambda a, b: {**a, **b}, dicts)
print(merged) # {'a': 1, 'b': 3, 'c': 5, 'd': 6}
# Find the longest string
words = ["hello", "magnificent", "world", "python"]
longest = reduce(lambda a, b: a if len(a) >= len(b) else b, words)
print(longest) # magnificent
reduce vs. Built-in Alternatives
Python has built-in functions for common reductions: sum(), max(), min(), any(), all(). Prefer these when they fit, because they're faster and more readable. Reach for reduce() when you have a custom two-argument operation (like merging dicts or finding the GCD of a list).
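As an illustration of a custom two-argument operation, the sketch below folds a list with a least-common-multiple function. The helper lcm and the sample data are assumptions made for this example.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


periods = [4, 6, 10]

total = sum(periods)           # a built-in already covers addition
common = reduce(lcm, periods)  # no built-in folds lcm over a list

print(total)   # 20
print(common)  # 60
```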
functools.lru_cache
lru_cache is a decorator that caches function results. If the function is called again with the same arguments, it returns the cached result instead of recomputing it. "LRU" stands for Least Recently Used: when the cache is full, the entry that was used least recently is evicted.
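Eviction is easiest to see with a deliberately tiny cache. In this sketch, maxsize=2 and the calls list are chosen purely for demonstration.

```python
from functools import lru_cache

calls = []  # records every real (uncached) computation


@lru_cache(maxsize=2)
def square(n):
    calls.append(n)
    return n * n


square(1)  # miss: cache holds 1
square(2)  # miss: cache holds 1, 2
square(1)  # hit:  1 becomes the most recently used entry
square(3)  # miss: cache is full, so 2 (least recently used) is evicted
square(1)  # hit:  1 survived the eviction
square(2)  # miss: 2 has to be recomputed

print(calls)                # [1, 2, 3, 2]
print(square.cache_info())  # CacheInfo(hits=2, misses=4, maxsize=2, currsize=2)
```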
The Fibonacci Problem
import time
from functools import lru_cache
# Without cache: exponential time, VERY slow for large n
def fib_slow(n):
    if n < 2:
        return n
    return fib_slow(n - 1) + fib_slow(n - 2)
start = time.perf_counter()
print(fib_slow(35)) # 9227465
print(f"Time: {time.perf_counter() - start:.3f}s")
# Time: typically a few seconds, depending on your machine
# With lru_cache: each value is computed only once
@lru_cache(maxsize=None) # None = unlimited cache
def fib_fast(n):
    if n < 2:
        return n
    return fib_fast(n - 1) + fib_fast(n - 2)
start = time.perf_counter()
print(fib_fast(35)) # 9227465
print(f"Time: {time.perf_counter() - start:.6f}s")
# Time: typically well under a millisecond, orders of magnitude faster
How It Works
import time
from functools import lru_cache
@lru_cache(maxsize=128) # Cache up to 128 unique argument combos
def expensive_lookup(user_id, include_details=False):
    """Simulate a slow database query."""
    print(f"  [DB Query] Looking up user {user_id}...")
    time.sleep(0.5)
    return {"id": user_id, "name": f"User_{user_id}"}
# First call: actually runs the function
result1 = expensive_lookup(42) # Prints [DB Query] and takes 0.5s
# Second call with same args: returns the cached result
result2 = expensive_lookup(42) # No print, instant!
# Different args: cache miss, runs the function
result3 = expensive_lookup(99) # Prints [DB Query] and takes 0.5s
# Check cache stats
print(expensive_lookup.cache_info())
# CacheInfo(hits=1, misses=2, maxsize=128, currsize=2)
Cache Management
from functools import lru_cache
@lru_cache(maxsize=256)
def fetch_config(key):
    print(f"  Loading {key} from disk...")
    return f"value_for_{key}"
fetch_config("database_url")
fetch_config("api_key")
fetch_config("database_url") # Cached: no disk read
# View cache stats
print(fetch_config.cache_info())
# CacheInfo(hits=1, misses=2, maxsize=256, currsize=2)
# Clear the cache (useful when underlying data changes)
fetch_config.cache_clear()
print(fetch_config.cache_info())
# CacheInfo(hits=0, misses=0, maxsize=256, currsize=0)
lru_cache Requirements
All function arguments must be hashable, which in practice usually means immutable. You can't cache a call that passes a list or dict as an argument; use tuples or frozensets instead. Also, don't cache functions that have side effects or that depend on external state (like the current time).
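The hashability requirement shows up as a TypeError at call time. The sketch below uses a hypothetical total function to show the failure and the tuple workaround.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def total(values):
    """Sum a tuple of numbers; a tuple is hashable, so it can be a cache key."""
    return sum(values)


try:
    total([1, 2, 3])  # a list cannot be used as a cache key
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")

first = total(tuple([1, 2, 3]))  # convert to a tuple first
second = total((1, 2, 3))        # equal tuple, so this is a cache hit

print(first, second)             # 6 6
print(total.cache_info().hits)   # 1
```

Converting to a tuple costs a copy on each call, so this works best when the inputs are small or the cached computation is expensive.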
Python 3.9+: @cache
Python 3.9 added functools.cache, a simpler alias for lru_cache(maxsize=None). Use it when you want unlimited caching without specifying a max size:
from functools import cache
@cache
def factorial(n):
    return n * factorial(n - 1) if n else 1
functools.partial
partial() creates a new function with some arguments pre-filled, a technique called partial application. It's useful for adapting a general function to a more specific use case.
from functools import partial
# Start with a general function
def power(base, exponent):
    return base ** exponent
# Create specialized versions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5)) # 25
print(cube(5)) # 125
print(square(9)) # 81
Practical Examples
from functools import partial
# 1. Pre-configure a logging function
def log(level, message, timestamp=None):
    ts = timestamp or "now"
    print(f"[{level}] {ts}: {message}")
info = partial(log, "INFO")
error = partial(log, "ERROR")
debug = partial(log, "DEBUG")
info("Server started") # [INFO] now: Server started
error("Disk full!") # [ERROR] now: Disk full!
debug("x = 42") # [DEBUG] now: x = 42
# 2. Pre-configure int() for different bases
from_binary = partial(int, base=2)
from_hex = partial(int, base=16)
print(from_binary("1010")) # 10
print(from_hex("ff")) # 255
# 3. Create a specialized sort
students = [
{"name": "Alice", "gpa": 3.9},
{"name": "Bob", "gpa": 3.2},
{"name": "Carlos", "gpa": 3.7},
]
# Instead of writing a lambda each time
sort_by_gpa = partial(sorted, key=lambda s: s["gpa"], reverse=True)
top_students = sort_by_gpa(students)
print([s["name"] for s in top_students])
# ['Alice', 'Carlos', 'Bob']
partial with Callbacks
from functools import partial
def apply_discount(price, discount_pct):
    """Apply a percentage discount to a price."""
    return round(price * (1 - discount_pct / 100), 2)
# Create discount tiers
black_friday = partial(apply_discount, discount_pct=30)
member_discount = partial(apply_discount, discount_pct=10)
clearance = partial(apply_discount, discount_pct=50)
print(black_friday(100)) # 70.0
print(member_discount(100)) # 90.0
print(clearance(100)) # 50.0
# Use with map for batch processing
prices = [29.99, 49.99, 99.99, 149.99]
sale_prices = list(map(black_friday, prices))
print(sale_prices) # [20.99, 34.99, 69.99, 104.99]
partial vs. Lambda
Both partial(power, exponent=2) and lambda x: power(x, exponent=2) do the same thing. Prefer partial when you're simply freezing arguments: it's more explicit about intent, has a useful repr, and is usually slightly faster. Use lambdas when you need to transform arguments or add logic.
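A short sketch of the practical difference: a partial object exposes what it wraps, while a lambda is opaque. The power function is redefined here so the snippet runs on its own.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


square = partial(power, exponent=2)
square_lambda = lambda x: power(x, exponent=2)

# Both compute the same result
print(square(7), square_lambda(7))  # 49 49

# partial records the wrapped function and the frozen arguments
print(square.func is power)  # True
print(square.args)           # ()
print(square.keywords)       # {'exponent': 2}

# A frozen keyword argument can still be overridden at call time
print(square(2, exponent=10))  # 1024
```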
Decorators & functools.wraps
A decorator is a function that takes a function and returns a modified version of it. It's one of the most powerful patterns in Python for adding behavior without changing the original function's code.
Building a Decorator
import time
def timer(func):
    """Decorator that measures execution time."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"  {func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper
@timer
def slow_function(n):
    """Simulate a slow operation."""
    total = sum(i ** 2 for i in range(n))
    return total
result = slow_function(1_000_000)
#   slow_function took 0.1234s (actual time varies)
The Problem: Lost Metadata
# Without @wraps, the decorator hides the original function's identity
print(slow_function.__name__) # 'wrapper', not 'slow_function'!
print(slow_function.__doc__) # None: the docstring is gone!
help(slow_function) # Shows wrapper's info, not the original
The Fix: @functools.wraps
import time
from functools import wraps
def timer(func):
    """Decorator that measures execution time."""
    @wraps(func) # Preserves the original function's metadata
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"  {func.__name__} took {elapsed:.4f}s")
        return result
    return wrapper
@timer
def slow_function(n):
    """Simulate a slow operation."""
    total = sum(i ** 2 for i in range(n))
    return total
# Now metadata is preserved!
print(slow_function.__name__) # 'slow_function'
print(slow_function.__doc__) # 'Simulate a slow operation.'
More Decorator Patterns
import random
import time
from functools import wraps
# Retry decorator: retries a function on failure
def retry(max_attempts=3, delay=1):
    """Decorator factory that retries on exceptions."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    # Out of attempts: re-raise the last exception
                    if attempt == max_attempts:
                        raise
                    print(f"  Attempt {attempt} failed: {e}. Retrying...")
                    time.sleep(delay)
        return wrapper
    return decorator
@retry(max_attempts=3, delay=0.5)
def fetch_data(url):
    """Fetch data from an API (simulated with random failures)."""
    if random.random() < 0.7:
        raise ConnectionError("Network timeout")
    return {"status": "ok"}
# Argument-validation decorator
def validate_positive(func):
    """Ensure all positional numeric arguments are positive."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        for arg in args:
            # Reject zero as well as negatives, so "positive" is accurate
            if isinstance(arg, (int, float)) and arg <= 0:
                raise ValueError(f"Expected positive number, got {arg}")
        return func(*args, **kwargs)
    return wrapper
@validate_positive
def calculate_area(width, height):
    """Calculate rectangle area."""
    return width * height
print(calculate_area(5, 3)) # 15
# calculate_area(-1, 3) # ValueError: Expected positive number, got -1
Always Use @wraps
Every time you write a decorator, include @wraps(func) on the inner wrapper function. It costs nothing and preserves __name__, __doc__, __module__, and __qualname__. Without it, debugging tools, documentation generators, and serialization libraries can break.
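Besides copying metadata, @wraps also stores the undecorated function on the wrapper as __wrapped__, which helps when testing or debugging. In this sketch, shout is a hypothetical decorator invented for the example.

```python
from functools import wraps


def shout(func):
    """Hypothetical decorator that upper-cases a function's string result."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet(name):
    """Return a greeting."""
    return f"hello, {name}"


print(greet("ada"))              # HELLO, ADA
print(greet.__name__)            # greet
print(greet.__doc__)             # Return a greeting.
print(greet.__wrapped__("ada"))  # hello, ada
```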
Functional vs. Imperative Style
Python supports both imperative (step-by-step mutation) and functional (transformation pipeline) styles. Neither is universally better; the best choice depends on the task.
Side-by-Side Comparison
# TASK: From a list of orders, get the total revenue
# for completed orders over $50, with a 10% tax.
orders = [
{"id": 1, "status": "complete", "amount": 120},
{"id": 2, "status": "pending", "amount": 45},
{"id": 3, "status": "complete", "amount": 30},
{"id": 4, "status": "complete", "amount": 200},
{"id": 5, "status": "cancelled", "amount": 75},
{"id": 6, "status": "complete", "amount": 55},
]
# --- IMPERATIVE STYLE ---
total = 0
for order in orders:
    if order["status"] == "complete":
        if order["amount"] > 50:
            total += order["amount"] * 1.10
print(f"Total: ${total:.2f}")
# Total: $412.50
# --- FUNCTIONAL STYLE ---
from functools import reduce
total = reduce(
    lambda acc, amt: acc + amt,
    (order["amount"] * 1.10
     for order in orders
     if order["status"] == "complete" and order["amount"] > 50),
    0 # initial value
)
print(f"Total: ${total:.2f}")
# Total: $412.50
# --- PYTHONIC MIDDLE GROUND ---
# Use comprehensions/generators but stay readable
qualifying = (
    order["amount"] * 1.10
    for order in orders
    if order["status"] == "complete" and order["amount"] > 50
)
total = sum(qualifying)
print(f"Total: ${total:.2f}")
# Total: $412.50
When to Use Each Style
| Use Functional When... | Use Imperative When... |
|---|---|
| Transforming data through a pipeline (filter → map → reduce) | Logic has complex branching or state that changes over time |
| Each step is a pure function (no side effects) | You need side effects (print, write to file, update a database) |
| You want to compose reusable pieces | The step-by-step process is clearer than a one-liner |
| Caching or memoization is beneficial | Performance-critical mutation of large data structures in place |
| You're working with itertools pipelines | Collaborators are more comfortable with loops |
The Pythonic Way
Python isn't a purely functional language; it's multi-paradigm. The Pythonic approach is pragmatic: use map()/filter() when the function already exists (like str.upper), use comprehensions for simple transforms, use loops for complex logic, and use itertools/functools when they make the code more readable, not less.
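A small sketch of that rule of thumb, using made-up names as data:

```python
names = ["ada", "grace", "linus"]

# The function already exists, so map() reads cleanly
upper_names = list(map(str.upper, names))

# A small inline transform reads better as a comprehension
initials = [name[0].upper() for name in names]

# Filtering plus transforming is still a comprehension
long_names = [name.title() for name in names if len(name) > 3]

print(upper_names)  # ['ADA', 'GRACE', 'LINUS']
print(initials)     # ['A', 'G', 'L']
print(long_names)   # ['Grace', 'Linus']
```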
Hands-on Exercises
Exercise 1: itertools Toolkit
Objective: Practice itertools combinatoric and terminating iterators.
Requirements:
- Use itertools.product to generate all possible 3-character passwords from lowercase letters "a" through "d" and digits "0" through "2". Print the total count.
- Use itertools.combinations to find all 3-person teams from a list of 6 people. Print each team.
- Use itertools.groupby to group a list of words by their first letter (sort first!). Print each group with its count.
- Use itertools.chain and islice to merge three generators and take only the first 10 items.
Starter Code:
from itertools import product, combinations, groupby, chain, islice
# 1. All 3-char passwords from "abcd" + "012"
chars = "abcd012"
passwords = ... # TODO: use product with repeat=3
print(f"Total passwords: {len(passwords)}")
# Should be 7**3 = 343
# 2. All 3-person teams from 6 people
people = ["Alice", "Bob", "Carlos", "Dana", "Eve", "Frank"]
teams = ... # TODO: use combinations
for team in teams:
    print(team)
# 3. Group words by first letter
words = ["apple", "avocado", "banana", "blueberry", "cherry", "cantaloupe", "apricot"]
# TODO: sort words, then groupby first letter, print each group
# 4. Merge three generators, take first 10
gen1 = (x ** 2 for x in range(5))
gen2 = (x ** 3 for x in range(5))
gen3 = (x * 10 for x in range(5))
first_10 = ... # TODO: chain, then islice
print(list(first_10))
Hint
For passwords: product(chars, repeat=3) gives tuples, join each with "".join(). For groupby, remember to sort() by the same key first. The key function for first letter is lambda w: w[0].
Solution
from itertools import product, combinations, groupby, chain, islice
# 1. All 3-char passwords from "abcd" + "012"
chars = "abcd012"
passwords = ["".join(p) for p in product(chars, repeat=3)]
print(f"Total passwords: {len(passwords)}")
# Total passwords: 343
print("First 5:", passwords[:5])
# First 5: ['aaa', 'aab', 'aac', 'aad', 'aa0']
# 2. All 3-person teams from 6 people
people = ["Alice", "Bob", "Carlos", "Dana", "Eve", "Frank"]
teams = list(combinations(people, 3))
print(f"\nTotal teams: {len(teams)}")
# Total teams: 20
for team in teams[:5]:
    print(team)
# ('Alice', 'Bob', 'Carlos')
# ('Alice', 'Bob', 'Dana')
# ...
# 3. Group words by first letter
words = ["apple", "avocado", "banana", "blueberry", "cherry", "cantaloupe", "apricot"]
words.sort() # full alphabetical sort: same-letter words end up adjacent and ordered
print("\nWords grouped by first letter:")
for letter, group in groupby(words, key=lambda w: w[0]):
    items = list(group)
    print(f"  {letter.upper()}: {items} ({len(items)} words)")
# A: ['apple', 'apricot', 'avocado'] (3 words)
# B: ['banana', 'blueberry'] (2 words)
# C: ['cantaloupe', 'cherry'] (2 words)
# 4. Merge three generators, take first 10
gen1 = (x ** 2 for x in range(5)) # 0, 1, 4, 9, 16
gen2 = (x ** 3 for x in range(5)) # 0, 1, 8, 27, 64
gen3 = (x * 10 for x in range(5)) # 0, 10, 20, 30, 40
first_10 = list(islice(chain(gen1, gen2, gen3), 10))
print(f"\nFirst 10 from chained generators: {first_10}")
# [0, 1, 4, 9, 16, 0, 1, 8, 27, 64]
Exercise 2: functools Power-Ups
Objective: Practice lru_cache, partial, reduce, and wraps.
Requirements:
- Write a recursive factorial(n) function and decorate it with @lru_cache. Print cache stats after several calls.
- Use functools.partial to create a to_celsius converter from a general convert_temp(value, from_unit, to_unit) function.
- Use functools.reduce to compute the greatest common divisor (GCD) of a list of numbers.
- Write a @log_calls decorator that prints the function name and arguments on each call. Use @wraps to preserve metadata.
Hint
For GCD of a list, reduce over pairs using math.gcd: reduce(math.gcd, numbers). For the temperature converter, the general function might look like convert_temp(value, from_unit, to_unit) and you'd partial-apply from_unit="F" and to_unit="C".
Solution
from functools import lru_cache, partial, reduce, wraps
import math
# -- 1. Cached factorial --
@lru_cache(maxsize=None)
def factorial(n):
    """Compute n! recursively with caching."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)
print("10! =", factorial(10)) # 3628800
print("20! =", factorial(20)) # 2432902008176640000 (reuses the cached 10!)
print("15! =", factorial(15)) # 1307674368000 (served straight from the cache)
print("Cache:", factorial.cache_info())
# CacheInfo(hits=2, misses=20, maxsize=None, currsize=20)
# -- 2. Partial temperature converter --
def convert_temp(value, from_unit, to_unit):
    """Convert temperature between C, F, and K."""
    # First convert to Celsius
    if from_unit == "F":
        celsius = (value - 32) * 5 / 9
    elif from_unit == "K":
        celsius = value - 273.15
    else:
        celsius = value
    # Then convert from Celsius to target
    if to_unit == "F":
        return round(celsius * 9 / 5 + 32, 2)
    elif to_unit == "K":
        return round(celsius + 273.15, 2)
    else:
        return round(celsius, 2)
# Create specialized converters
f_to_c = partial(convert_temp, from_unit="F", to_unit="C")
c_to_f = partial(convert_temp, from_unit="C", to_unit="F")
f_to_k = partial(convert_temp, from_unit="F", to_unit="K")
print(f"\n212°F = {f_to_c(212)}°C") # 100.0
print(f"0°C = {c_to_f(0)}°F") # 32.0
print(f"72°F = {f_to_k(72)}K") # 295.37
# -- 3. GCD of a list using reduce --
numbers = [48, 36, 60, 12, 84]
gcd_result = reduce(math.gcd, numbers)
print(f"\nGCD of {numbers} = {gcd_result}")
# GCD of [48, 36, 60, 12, 84] = 12
# -- 4. @log_calls decorator with @wraps --
def log_calls(func):
    """Decorator that logs function calls with arguments."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        args_str = ", ".join(
            [repr(a) for a in args] +
            [f"{k}={v!r}" for k, v in kwargs.items()]
        )
        print(f"  Calling {func.__name__}({args_str})")
        result = func(*args, **kwargs)
        print(f"  {func.__name__} returned {result!r}")
        return result
    return wrapper
@log_calls
def add(a, b):
    """Add two numbers."""
    return a + b
@log_calls
def greet(name, greeting="Hello"):
    """Greet someone."""
    return f"{greeting}, {name}!"
add(3, 4)
#   Calling add(3, 4)
#   add returned 7
greet("Alice", greeting="Hi")
#   Calling greet('Alice', greeting='Hi')
#   greet returned 'Hi, Alice!'
# Metadata preserved!
print(f"\n{add.__name__}: {add.__doc__}")
# add: Add two numbers.
Quick Quiz
Question 1: What happens if you call list(itertools.count(1))?
Question 2: Why must you sort data before using itertools.groupby()?
Question 3: What does @functools.wraps(func) do inside a decorator?
Best Practices
Do's
- Use itertools for standard iteration patterns: its functions are implemented in C and are usually faster than hand-rolled loops
- Always use @wraps in decorators: it costs nothing and saves debugging headaches
- Cache pure functions with @lru_cache, especially recursive ones like Fibonacci, factorials, and dynamic programming solutions
- Prefer partial over lambdas when you're just freezing arguments: it's more readable and introspectable
- Sort before groupby, or use defaultdict(list) if sorting isn't an option
- Always limit infinite iterators: use islice(), takewhile(), or zip() with a finite sequence
Don'ts
- Don't use reduce when a built-in works: sum(), max(), min() are clearer and faster
- Don't cache functions with side effects: the cached call skips the function body, so the side effect won't happen on cache hits
- Don't pass unhashable arguments to cached functions: lists and dicts aren't hashable; convert to tuples or frozensets
- Don't over-functionalize: reduce(lambda a, b: a + b, items) is just a worse sum(items)
- Don't list() infinite iterators: the call never returns and eventually exhausts memory
Pro Tips
- itertools.tee(iterable, n) creates n independent iterators over the same source, but use it sparingly, as it buffers values internally
- The operator module provides function versions of operators: operator.add, operator.mul, operator.itemgetter. These are usually faster than equivalent lambdas
- more-itertools (third-party) extends itertools with dozens of additional recipes: chunked, flatten, unique_everseen, and more
- Python 3.12 added itertools.batched(iterable, n) to the standard library. It splits an iterable into chunks of size n (the last chunk may be shorter)
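The sketch below combines two of these tips: tee to read a one-shot generator twice, and an islice-based chunking helper for interpreters older than 3.12. The helper chunks is an assumption written for this example; unlike itertools.batched, it yields lists rather than tuples.

```python
from itertools import islice, tee


def chunks(iterable, n):
    """Yield lists of up to n items; a rough stand-in for itertools.batched."""
    iterator = iter(iterable)
    while chunk := list(islice(iterator, n)):
        yield chunk


readings = (x * x for x in range(7))  # a one-shot generator
for_total, for_chunks = tee(readings, 2)

total = sum(for_total)                # tee buffers all 7 values for the other copy
grouped = list(chunks(for_chunks, 3))

print(total)    # 91
print(grouped)  # [[0, 1, 4], [9, 16, 25], [36]]
```

Because the first copy is fully consumed before the second is touched, tee has to hold every value in memory, which is the buffering cost mentioned above.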
Summary
Key Takeaways
- itertools infinite: count, cycle, repeat produce values forever (always limit them!)
- itertools combinatoric: product, permutations, combinations for exhaustive search
- itertools terminating: chain, islice, groupby, accumulate, takewhile/dropwhile, zip_longest
- functools.reduce: fold a sequence into a single value with a custom function
- functools.lru_cache: memoize expensive pure functions for dramatic speedups
- functools.partial: freeze some arguments to create specialized versions of general functions
- functools.wraps: always use in decorators to preserve function metadata
- Functional style is great for data pipelines; imperative style is better for complex logic with side effects
| Tool | Purpose | Example |
|---|---|---|
| chain() | Concatenate iterables | chain([1,2], [3,4]) → 1,2,3,4 |
| islice() | Slice any iterable | islice(count(), 5) → 0,1,2,3,4 |
| groupby() | Group consecutive items | Group sales by category |
| product() | Cartesian product | All dice roll combinations |
| reduce() | Fold to single value | reduce(gcd, [48,36,12]) → 12 |
| lru_cache() | Memoize results | Cache recursive Fibonacci |
| partial() | Freeze arguments | partial(int, base=2) |
| wraps() | Preserve metadata | Inside every decorator |
Additional Resources
- Python Docs: itertools
- Python Docs: functools
- Python Docs: operator module
- Python HOWTO: Functional Programming
- more-itertools: Extended Recipes
What's Next?
In the next lesson, we'll cover Virtual Environments & Package Management: creating isolated Python environments with venv, managing dependencies with pip and requirements.txt, and structuring your projects into proper packages and modules.
Level Up!
You now have a powerful toolkit for advanced iteration and functional programming. Combined with generators and comprehensions from the previous lesson, you can write concise, composable, and memory-efficient Python code for any data processing task.