Free Python Standard Library Expansion Cheat Sheet Online
Python's "batteries included" philosophy is not marketing copy. It is a major reason Python is so widely used in fields from data science to web development to DevOps automation. The standard library ships with every Python installation and provides production-grade modules for file system operations, data structures, text processing, network communication, concurrency, and more. You do not need to install anything. You do not need to evaluate competing third-party packages. The modules are tested, documented, and battle-hardened across millions of deployments.
But the standard library is large. Really large. Nobody has memorized every function in os, itertools, functools, and the rest. Even experienced Python developers frequently pause to look up argument order, return types, or edge case behavior. That is precisely why we built the free interactive Python Standard Library Expansion Cheat Sheet: a searchable, filterable, copyable reference covering 120+ entries across the 10 most essential modules. No signup required. No ads. No data collection. It runs entirely in your browser. Just open it and go.
The cheat sheet covers the modules that most Python developers use every day, plus the lesser-known corners of each module that can save you from reinventing the wheel. All entries use Python 3.12 syntax and conventions. The design is inspired by "The Grand Archive's Reading Room" aesthetic: warm amber-gold lighting, parchment-textured cards, Playfair Display headings, and Lora body text. It is a reference tool that respects both the content and the reading experience. For deeper dives into specific Python topics, check our Python Data Structures Deep-Dive and our Python Built-in Functions Cheat Sheet, which pair naturally with the standard library reference.
Why the Python Standard Library Matters
The Python standard library is one of the most comprehensive among major programming languages. It includes over 200 modules spanning operating system interfaces, file formats, network protocols, text processing, data persistence, concurrency, testing, debugging, and much more. Guido van Rossum and the Python core team made a deliberate decision early on: ship with everything developers need for common tasks. That decision means you can write a web scraper, a command-line tool, a log parser, or a configuration file reader without touching pip. With no external dependencies, there is no third-party supply chain to audit, no dependency resolution to debug, and no breaking changes from third-party maintainers to absorb.
Using the standard library also means your code is more portable. The same os.path call works on Windows, macOS, and Linux. The same json serialization produces identical output regardless of platform. The same re pattern matches identically everywhere. When you minimize external dependencies, you maximize the chance that someone else can clone your repo, run your code, and get the same result. In an era of dependency-fatigue and left-pad incidents, that is a meaningful competitive advantage.
But the standard library is not just about avoiding dependencies. It is about developer speed. The functions in itertools, collections, and functools have been refined over many releases. In CPython, many of them are implemented in C for performance-critical paths. They handle edge cases you may not have thought of. Using them means writing less code, shipping faster, and having fewer bugs. That is the core argument for mastering the standard library: it makes you a more productive Python programmer.
What This Cheat Sheet Covers
Our Python Standard Library Expansion Cheat Sheet covers more than 120 entries across 10 essential modules. Every entry includes the function signature, a plain-English description, and a working code example. The interface allows you to search across all entries, filter by module, and copy any example with one click. The 10 modules covered are:
- os — File system operations, environment variables, process management.
- sys — Interpreter runtime: stdin/stdout/stderr, command-line arguments, recursion limits.
- collections — Specialized containers: namedtuple, deque, Counter, defaultdict, OrderedDict, ChainMap.
- itertools — Iterator building blocks: chain, cycle, product, combinations, groupby, islice.
- functools — Higher-order functions: lru_cache, partial, reduce, wraps, singledispatch.
- datetime — Date and time manipulation: datetime, date, time, timedelta, timezone, strptime, strftime.
- json — JSON serialization: dump, dumps, load, loads, custom encoders and decoders.
- re — Regular expressions: compile, search, match, findall, sub, split, flags.
- math — Mathematical functions: sqrt, ceil, floor, gcd, lcm, comb, perm, isclose, tau.
- pathlib — Object-oriented file paths: Path, glob, read_text, write_text, iterdir, stat.
Each module section on the cheat sheet is visually distinct, using color-coded headers and card-based layouts. You can click any module name in the sidebar to jump directly to that section. The search bar supports fuzzy matching across function names, module names, and descriptions. If you need to reference something offline, bookmark the page. It works without an internet connection once loaded.
os — Operating System Interface
The os module is your primary interface to the operating system. It handles file and directory operations, environment variables, process management, and OS-specific functionality. It is one of the most frequently imported modules in Python, and for good reason: almost every non-trivial program interacts with the file system.
Working with the current working directory:
import os
# Get current working directory
cwd = os.getcwd()
# Change directory
os.chdir('/home/user/projects')
# List directory contents
entries = os.listdir('.')
# ['file1.py', 'file2.py', 'subdir']
Path manipulation with os.path (the traditional approach; see pathlib below for the modern alternative):
import os
# Join path components (handles trailing slashes correctly)
path = os.path.join('home', 'user', 'docs', 'file.txt')
# 'home/user/docs/file.txt' on Unix, 'home\user\docs\file.txt' on Windows
# Split a path into directory and filename
dir_name, file_name = os.path.split('/home/user/file.txt')
# ('/home/user', 'file.txt')
# Get file extension
root, ext = os.path.splitext('script.py')
# ('script', '.py')
# Check if path exists and is a file/directory
os.path.exists('config.json') # True/False
os.path.isfile('script.py') # True/False
os.path.isdir('/usr/local') # True/False
# Get the absolute path
abs_path = os.path.abspath('../data')
# '/home/user/data'
Environment variables are accessed through os.environ, which behaves like a dictionary but reflects the actual process environment:
import os
# Get an environment variable with a default
db_host = os.environ.get('DB_HOST', 'localhost')
api_key = os.environ.get('API_KEY') # None if not set
# Set an environment variable for the current process
os.environ['DEBUG'] = 'true'
# List all environment variables (careful — can be very long)
for key, value in os.environ.items():
    print(f'{key}={value}')
Directory creation and file operations:
import os
# Create a single directory (raises FileExistsError if exists)
os.mkdir('new_dir')
# Create nested directories (like mkdir -p)
os.makedirs('parent/child/grandchild', exist_ok=True)
# Rename a file or directory
os.rename('old_name.txt', 'new_name.txt')
# Remove a file
os.remove('temp.txt')
# Remove an empty directory
os.rmdir('empty_dir')
# Remove a directory tree (dangerous — no recycle bin)
import shutil
shutil.rmtree('dir_to_delete')
# Get file size in bytes
size = os.path.getsize('large_file.bin')
# Get file modification time
mtime = os.path.getmtime('file.txt') # Unix timestamp
Process management and OS identity:
import os
# Current process ID
pid = os.getpid()
# Parent process ID
ppid = os.getppid()
# Run a shell command (os.system is not deprecated, but subprocess.run is the recommended replacement)
# os.system('ls -la') # not recommended
# Get the current user's home directory
home = os.path.expanduser('~')
# '/home/username'
# Get OS-specific path separator
sep = os.sep # '/' on Unix, '\' on Windows
# Get OS name
os_name = os.name # 'posix' on Linux/macOS, 'nt' on Windows
# Get login name
user = os.getlogin() # may raise OSError in some environments
sys — System-Specific Parameters and Functions
While os deals with the operating system, sys deals with the Python interpreter itself. It gives you access to command-line arguments, standard I/O streams, module search paths, recursion limits, and interpreter-level configuration. If os is your interface to the machine, sys is your interface to the Python runtime.
Command-line arguments are the most common use of sys:
import sys
# sys.argv[0] is always the script name
# sys.argv[1:] contains the user-supplied arguments
if len(sys.argv) < 2:
    print(f'Usage: {sys.argv[0]} <filename>')
    sys.exit(1)
filename = sys.argv[1]
print(f'Processing {filename}...')
Standard I/O streams give you explicit control over input and output:
import sys
# Write to stdout (like print, but no newline is added for you)
sys.stdout.write('Processing complete.\n')
# Write to stderr for error messages and diagnostics
sys.stderr.write('Error: file not found\n')
# Read from stdin line by line
for line in sys.stdin:
    # Process each line (useful for pipe input: cat file | python script.py)
    print(line.strip().upper())
# Alternative: read all of stdin at once (returns '' if the loop above already consumed it)
data = sys.stdin.read()
Interpreter introspection and control:
import sys
# Python version as a string
print(sys.version) # '3.12.0 (main, Oct 2 2023, 13:03:27) [GCC 12.2.0]'
print(sys.version_info) # sys.version_info(major=3, minor=12, micro=0, releaselevel='final', serial=0)
# Platform identifier
print(sys.platform) # 'linux', 'darwin' (macOS), 'win32'
# Module search path (where import looks for modules)
print(sys.path)
# sys.path[0] is the script's directory
# You can append to it: sys.path.append('/custom/modules')
# List of loaded modules
print(sys.modules.keys())
# Get the recursion limit
print(sys.getrecursionlimit()) # typically 1000
sys.setrecursionlimit(2000) # increase for deep recursion
# Get the size of an object in bytes (shallow: excludes the objects it references)
size = sys.getsizeof([1, 2, 3]) # size of the list object itself, not its elements
# Exit the interpreter
sys.exit(0) # 0 = success, non-zero = error
The sys.path list is especially important for understanding Python's import system. When you run a script, Python inserts the script's directory at sys.path[0]. When you import a module, Python searches the entries of sys.path in order and uses the first match. This is why a local module can shadow an installed package of the same name, and why adding a directory to sys.path at runtime can resolve an import error.
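To see the search order in action, here is a minimal sketch. It assumes a throwaway module named demo_settings (a hypothetical name, created in a temporary directory purely for illustration) and shows that putting its directory at the front of sys.path makes it importable:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a temporary directory holding a hypothetical module, demo_settings.py.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_settings.py").write_text("VALUE = 42\n", encoding="utf-8")

    # Earlier entries in sys.path are searched first, so inserting at index 0
    # gives this directory priority over later entries such as site-packages.
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()  # make sure the new file is noticed
        demo_settings = importlib.import_module("demo_settings")
        loaded_value = demo_settings.VALUE
        loaded_from = Path(demo_settings.__file__).parent
        found_in_tmp = loaded_from.resolve() == Path(tmp).resolve()
    finally:
        # Undo the changes so the rest of the program is unaffected.
        sys.path.remove(tmp)
        sys.modules.pop("demo_settings", None)

print(loaded_value)
print(found_in_tmp)
```

Cleaning up in the finally block matters: a directory left at the front of sys.path can silently shadow other modules later in the program.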
collections — Container Datatypes
The built-in list, dict, set, and tuple cover most use cases, and collections fills the gaps with specialized, high-performance containers. These are not niche tools. defaultdict, Counter, and namedtuple appear in production codebases across every Python domain.
namedtuple
namedtuple creates lightweight, immutable data classes with named fields. They support both attribute access and tuple unpacking, are more memory-efficient than full classes, and are perfect for records, coordinates, and configuration:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
p = Point(10, 20)
print(p.x, p.y) # 10 20
print(p[0], p[1]) # 10 20 (tuple access still works)
x, y = p # unpacking works
# With default values (Python 3.7+)
Person = namedtuple('Person', ['name', 'age', 'city'], defaults=['Unknown', 0, 'NYC'])
p1 = Person('Alice') # age=0, city='NYC' by default
p2 = Person('Bob', 30) # city='NYC' by default
# Convert to dict
print(p._asdict()) # {'x': 10, 'y': 20}
# Create from existing sequence
p3 = Point._make([30, 40])
deque
deque (double-ended queue, pronounced "deck") provides O(1) append and pop from both ends. This is dramatically faster than list.pop(0), which is O(n) because it shifts all remaining elements:
from collections import deque
# Create a deque
d = deque(['a', 'b', 'c'])
d.append('d') # add to right
d.appendleft('z') # add to left
d.pop() # remove from right → 'd'
d.popleft() # remove from left → 'z'
# Bounded deque (drops items from the opposite end when full)
history = deque(maxlen=5)
for i in range(10):
    history.append(i)
print(history) # deque([5, 6, 7, 8, 9], maxlen=5)
# Rotate
d = deque([1, 2, 3, 4, 5])
d.rotate(2) # deque([4, 5, 1, 2, 3])
d.rotate(-1) # deque([5, 1, 2, 3, 4])
Counter
Counter is a dict subclass for counting hashable objects. It is a fast, concise way to count frequencies, find the most common elements, and perform multiset operations:
from collections import Counter
# Count characters
c = Counter('abracadabra')
print(c) # Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
print(c.most_common(2)) # [('a', 5), ('b', 2)]
# Count words from a list
words = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
word_counts = Counter(words)
print(word_counts['apple']) # 3
# Counter arithmetic (multiset operations)
c1 = Counter(a=3, b=1)
c2 = Counter(a=1, b=2)
print(c1 + c2) # Counter({'a': 4, 'b': 3}) — sum
print(c1 - c2) # Counter({'a': 2}) — difference (drops zero/negative)
print(c1 & c2) # Counter({'a': 1, 'b': 1}) — intersection (min)
print(c1 | c2) # Counter({'a': 3, 'b': 2}) — union (max)
# Total count of all elements
print(c.total()) # 11 (Python 3.10+)
defaultdict
defaultdict eliminates the "check if key exists" boilerplate. When you access a missing key, it calls the factory function you provide and inserts the result:
from collections import defaultdict
# Group items by category
items = [('fruit', 'apple'), ('fruit', 'banana'), ('veg', 'carrot'), ('fruit', 'orange')]
groups = defaultdict(list)
for category, item in items:
    groups[category].append(item)
print(dict(groups))
# {'fruit': ['apple', 'banana', 'orange'], 'veg': ['carrot']}
# Count occurrences
counts = defaultdict(int)
for letter in 'mississippi':
    counts[letter] += 1
print(dict(counts))
# {'m': 1, 'i': 4, 's': 4, 'p': 2}
# Nested defaultdict for arbitrary-depth trees
def tree():
    return defaultdict(tree)
nested = tree()
nested['a']['b']['c'] = 42
OrderedDict and ChainMap
OrderedDict remembers insertion order. Since Python 3.7, regular dict also maintains insertion order, but OrderedDict has additional methods for reordering. ChainMap links multiple dicts for layered lookups — command-line args override config files, which override defaults:
from collections import OrderedDict, ChainMap
# OrderedDict: move_to_end is unique to OrderedDict
od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
od.move_to_end('a') # OrderedDict([('b', 2), ('c', 3), ('a', 1)])
od.move_to_end('c', last=False) # OrderedDict([('c', 3), ('b', 2), ('a', 1)])
# ChainMap: layered lookup
defaults = {'host': 'localhost', 'port': 5432, 'debug': False}
cli_args = {'host': 'db.example.com'}
env_vars = {'port': 15432}
config = ChainMap(cli_args, env_vars, defaults)
print(config['host']) # 'db.example.com' (first match wins)
print(config['port']) # 15432 (env_vars overrides defaults)
print(config['debug']) # False (only in defaults)
itertools — Functions Creating Iterators for Efficient Looping
itertools is a treasure chest of iterator building blocks. These functions generate and transform iterators lazily, meaning they compute values on demand and never build large intermediate lists. For data processing pipelines, combinatorics, and any situation where you work with sequences, itertools replaces handwritten loops with tested, C-optimized primitives.
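Laziness is easy to verify for yourself. The sketch below is illustrative rather than taken from the cheat sheet: the square helper and the computed list are hypothetical names, used only to record which values were actually processed when five items are pulled from an infinite iterator:

```python
from itertools import count, islice

computed = []  # records which inputs were actually processed

def square(n):
    computed.append(n)
    return n * n

# count() is infinite and map() is lazy, so nothing is computed yet.
squares = map(square, count(1))

# islice pulls only the first five values through the pipeline.
first_five = list(islice(squares, 5))

print(first_five)  # [1, 4, 9, 16, 25]
print(computed)    # [1, 2, 3, 4, 5]
```

No sixth value is ever computed, which is why the same pattern works on inputs too large (or too endless) to hold in a list.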
Infinite Iterators
from itertools import count, cycle, repeat
# count(start, step) — infinite arithmetic progression
for i in count(10, 2): # 10, 12, 14, 16, ...
    if i > 20:
        break
    print(i)
# cycle(iterable) — repeat the iterable forever
colors = cycle(['red', 'green', 'blue'])
for _ in range(5):
    print(next(colors)) # red, green, blue, red, green
# repeat(object, times) — repeat an object
for x in repeat('hello', 3):
    print(x) # hello, hello, hello
Combining and Chaining Iterables
from itertools import chain, zip_longest
# chain — flatten nested iterables without copying
nested = [[1, 2], [3, 4], [5]]
flat = list(chain.from_iterable(nested)) # [1, 2, 3, 4, 5]
# Chain multiple iterables directly
result = list(chain('ABC', 'DEF')) # ['A', 'B', 'C', 'D', 'E', 'F']
# zip_longest — zip but fill missing values instead of truncating
a = [1, 2, 3]
b = ['a', 'b']
pairs = list(zip_longest(a, b, fillvalue=None))
# [(1, 'a'), (2, 'b'), (3, None)]
Filtering and Slicing
from itertools import islice, takewhile, dropwhile, filterfalse, compress
# islice — slice any iterator without copying
r = range(100)
first_ten = list(islice(r, 10)) # [0, 1, ..., 9]
middle = list(islice(r, 45, 55)) # [45, 46, ..., 54]
every_10th = list(islice(r, 0, 100, 10)) # [0, 10, 20, ..., 90]
# takewhile / dropwhile — split based on a predicate
data = [1, 3, 5, 2, 4, 6]
head = list(takewhile(lambda x: x % 2 == 1, data)) # [1, 3, 5]
tail = list(dropwhile(lambda x: x % 2 == 1, data)) # [2, 4, 6]
# filterfalse — keep elements where predicate is False
evens = list(filterfalse(lambda x: x % 2, range(10))) # [0, 2, 4, 6, 8]
# compress — filter with a boolean mask
data = ['a', 'b', 'c', 'd']
selectors = [True, False, True, False]
result = list(compress(data, selectors)) # ['a', 'c']
Combinatoric Generators
from itertools import product, permutations, combinations, combinations_with_replacement
# product — Cartesian product (nested for-loops flattened)
for x, y in product('AB', '12'):
    print(f'{x}{y}') # A1, A2, B1, B2
# With repeat — product of iterable with itself
dice = list(product(range(1, 7), repeat=2))
# [(1,1), (1,2), ..., (6,6)] — all 36 dice outcomes
# permutations — all possible orderings (r-length)
perms = list(permutations('ABC', 2))
# [('A','B'), ('A','C'), ('B','A'), ('B','C'), ('C','A'), ('C','B')]
# combinations — r-length combinations, no repeats, order doesn't matter
combos = list(combinations('ABC', 2))
# [('A','B'), ('A','C'), ('B','C')]
# combinations_with_replacement — like combinations but allows repeats
combos_r = list(combinations_with_replacement('AB', 2))
# [('A','A'), ('A','B'), ('B','B')]
groupby
groupby groups consecutive elements that share a key. It makes a single O(n) pass and yields each group lazily, so it needs very little extra memory even on large datasets. The catch: to get exactly one group per key, the data must be sorted by that key before grouping; otherwise non-consecutive matching elements form separate groups:
from itertools import groupby
# Group consecutive runs
data = [1, 1, 2, 2, 2, 3, 1, 1]
for key, group in groupby(data):
    print(key, list(group))
# 1 [1, 1]
# 2 [2, 2, 2]
# 3 [3]
# 1 [1, 1] — note: separate group because not consecutive with first 1's
# Real-world: group log lines by severity
logs = [
    ('ERROR', 'disk full'),
    ('ERROR', 'connection refused'),
    ('WARN', 'retry attempt 1'),
    ('INFO', 'server started'),
    ('INFO', 'listening on port 8080'),
]
for severity, entries in groupby(sorted(logs, key=lambda x: x[0]), key=lambda x: x[0]):
    messages = [e[1] for e in entries]
    print(f'{severity}: {messages}')
functools — Higher-Order Functions and Operations on Callable Objects
functools is the module for working with functions as first-class objects. It contains decorators for caching and dispatch, utilities for creating partial functions, and tools for function composition. If you have ever written a decorator by hand or cached computation results manually, functools has a production-tested alternative.
lru_cache and cache
The @lru_cache decorator is arguably the most valuable single decorator in the standard library. It memoizes function results — caching the return value for each unique combination of arguments — with a Least Recently Used eviction policy:
from functools import lru_cache, cache
import time
# Recursive Fibonacci without cache: O(2^n)
# With cache: O(n)
@lru_cache(maxsize=128)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
print(fib(100)) # instant; impractical without caching
# Inspect cache statistics
print(fib.cache_info()) # CacheInfo(hits=98, misses=101, maxsize=128, currsize=101)
fib.cache_clear() # clear the cache
# Python 3.9+: @cache is @lru_cache(maxsize=None) — simpler, no eviction limit
@cache
def expensive_query(query_id: int) -> dict:
    time.sleep(2) # simulated DB query
    return {'id': query_id, 'result': '...'}
print(expensive_query(42)) # 2-second wait
print(expensive_query(42)) # instant from cache
print(expensive_query.cache_info()) # CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)
partial and partialmethod
partial creates a new function with some arguments pre-filled. This is partial application (closely related to currying, but not the same thing): you fix some arguments now and supply the rest later:
from functools import partial
def power(base, exponent):
    return base ** exponent
# Create specialized functions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5)) # 25
print(cube(5)) # 125
# Practical: pre-fill common arguments
import json
pretty_json = partial(json.dumps, indent=2, sort_keys=True)
print(pretty_json({'b': 2, 'a': 1}))
# {
# "a": 1,
# "b": 2
# }
# With pre-filled keyword arguments: useful with map and thread pools
# (requests is a third-party package, not part of the standard library)
from multiprocessing.pool import ThreadPool
import requests
fetch = partial(requests.get, timeout=5, headers={'User-Agent': 'my-app'})
with open('urls.txt') as f:
    urls = [line.strip() for line in f if line.strip()]
with ThreadPool(4) as pool:
    results = pool.map(fetch, urls)
reduce
reduce applies a binary function cumulatively to the items of an iterable, reducing the iterable to a single value. It was moved to functools from the built-in namespace in Python 3:
from functools import reduce
import operator
# Sum
total = reduce(operator.add, [1, 2, 3, 4, 5]) # 15
# Product
product = reduce(operator.mul, [1, 2, 3, 4, 5]) # 120
# Find maximum
maximum = reduce(lambda a, b: a if a > b else b, [3, 1, 9, 2, 7]) # 9
# Practical: deep merge of nested dicts
import copy
def deep_merge(a, b):
    # Merges b into a in place; values are copied so the input dicts are never shared or mutated
    for key in b:
        if key in a and isinstance(a[key], dict) and isinstance(b[key], dict):
            deep_merge(a[key], b[key])
        else:
            a[key] = copy.deepcopy(b[key])
    return a
configs = [
    {'database': {'host': 'localhost', 'port': 5432}},
    {'database': {'host': 'prod-db', 'pool': 10}, 'cache': {'ttl': 300}},
    {'logging': {'level': 'INFO'}},
]
merged = reduce(deep_merge, configs, {})
# All three configs merged into one
wraps and singledispatch
wraps is essential for writing decorators that preserve the wrapped function's metadata. Without it, the decorated function loses its docstring, name, and signature. singledispatch provides function overloading based on the type of the first argument:
from functools import wraps, singledispatch
import time
# wraps: preserve function metadata through decoration
def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f'{func.__name__} took {elapsed:.4f}s')
        return result
    return wrapper
@timer
def compute_heavy(n):
    """Performs heavy computation on n."""
    return sum(range(n))
help(compute_heavy)
# Help on function compute_heavy in module __main__:
# compute_heavy(n)
# Performs heavy computation on n.
# Without @wraps, it would show 'wrapper' and no docstring
# singledispatch: function overloading by type of first argument
@singledispatch
def format_value(arg):
    raise NotImplementedError(f'No formatter for {type(arg)}')
@format_value.register
def _(arg: int) -> str:
    return f'0x{arg:x}'
@format_value.register
def _(arg: list) -> str:
    return ', '.join(str(v) for v in arg)
@format_value.register(dict)
def _(arg: dict) -> str:
    return '; '.join(f'{k}={v}' for k, v in arg.items())
print(format_value(255)) # '0xff'
print(format_value([1, 2, 3])) # '1, 2, 3'
print(format_value({'a': 1})) # 'a=1'
datetime — Basic Date and Time Types
The datetime module is the standard library's answer to dates, times, time intervals, and timezone handling. It provides four main classes: date (year, month, day), time (hour, minute, second, microsecond), datetime (combines date and time), and timedelta (duration). Nearly every application that deals with timestamps, scheduling, or data with temporal dimensions uses datetime.
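A quick sketch of how the four classes relate, using arbitrary example values: a date and a time combine into a datetime, and subtracting two datetimes produces a timedelta:

```python
from datetime import date, time, datetime, timedelta

d = date(2026, 5, 17)
t = time(14, 30)

# datetime.combine joins a date and a time into one datetime.
dt = datetime.combine(d, t)

# A datetime can be split back into its date and time parts.
same_parts = (dt.date() == d) and (dt.time() == t)

# Subtracting two datetimes yields a timedelta.
gap = dt - datetime(2026, 5, 16, 12, 0)

print(dt)          # 2026-05-17 14:30:00
print(same_parts)  # True
print(gap)         # 1 day, 2:30:00
```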
Creating and Manipulating Dates and Times
from datetime import date, time, datetime, timedelta, timezone
# Current date and time
today = date.today() # 2026-05-17
now = datetime.now() # local time, naive (no timezone)
now_utc = datetime.now(timezone.utc) # aware datetime with UTC timezone
# Construct specific dates
d = date(2026, 5, 17) # May 17, 2026
dt = datetime(2026, 5, 17, 14, 30, 0) # May 17, 2026 2:30 PM
t = time(14, 30, 45) # 2:30:45 PM
# Extract components
print(dt.year, dt.month, dt.day) # 2026 5 17
print(dt.hour, dt.minute) # 14 30
print(dt.weekday()) # 6 (Monday=0, Sunday=6)
print(dt.isoweekday()) # 7 (Monday=1, Sunday=7)
Parsing and Formatting: strptime and strftime
The most common datetime operations are converting strings to datetime objects (parsing) and converting datetime objects to display strings (formatting). The format codes are consistent between parsing and formatting:
from datetime import datetime
# strptime — parse string to datetime
dt = datetime.strptime('2026-05-17 14:30:00', '%Y-%m-%d %H:%M:%S')
date_only = datetime.strptime('May 17, 2026', '%B %d, %Y') # time defaults to 00:00:00
short_year = datetime.strptime('17/05/26', '%d/%m/%y') # also midnight on 2026-05-17
# strftime — format datetime to string (using dt, which carries the 14:30 time)
print(dt.strftime('%Y-%m-%d')) # '2026-05-17'
print(dt.strftime('%B %d, %Y')) # 'May 17, 2026'
print(dt.strftime('%A, %B %d, %Y')) # 'Sunday, May 17, 2026'
print(dt.strftime('%I:%M %p')) # '02:30 PM'
# ISO 8601 format (recommended for APIs and data exchange)
print(dt.isoformat()) # '2026-05-17T14:30:00'
parsed = datetime.fromisoformat('2026-05-17T14:30:00+00:00')
# Key format codes (abbreviated):
# %Y = 4-digit year %y = 2-digit year %m = month (01-12)
# %d = day (01-31) %B = full month name %b = abbreviated month
# %H = hour (00-23) %I = hour (01-12) %M = minute (00-59)
# %S = second (00-59) %p = AM/PM %A = full weekday name
# %a = abbreviated weekday %w = weekday (0=Sunday, 6=Saturday)
timedelta and Date Arithmetic
timedelta represents a duration, the difference between two dates or times. It supports addition, subtraction, multiplication, and division:
from datetime import datetime, timedelta, date
# Create timedeltas
one_day = timedelta(days=1)
one_week = timedelta(weeks=1)
two_hours = timedelta(hours=2)
thirty_minutes = timedelta(minutes=30)
combined = timedelta(days=3, hours=4, minutes=30)
# Date arithmetic
today = date.today()
tomorrow = today + one_day
yesterday = today - one_day
next_week = today + one_week
# Datetime arithmetic
now = datetime.now()
future = now + timedelta(days=7, hours=2)
past = now - timedelta(hours=12)
# Difference between two dates
delta = date(2026, 12, 31) - date(2026, 1, 1)
print(delta.days) # 364
# Difference between two datetimes
start = datetime(2026, 1, 1, 0, 0)
end = datetime(2026, 1, 3, 6, 30)
diff = end - start
print(diff) # 2 days, 6:30:00
print(diff.total_seconds()) # 196200.0
Timezones
Working with timezones correctly is essential for any application dealing with users across geographic regions. The standard library provides timezone for fixed UTC offsets and zoneinfo (Python 3.9+) for IANA timezone database support:
from datetime import datetime, timezone, timedelta
from zoneinfo import ZoneInfo # Python 3.9+
# Create timezone-aware datetimes
utc_now = datetime.now(timezone.utc)
est = timezone(timedelta(hours=-5))
et_now = datetime.now(est)
# Convert between timezones
tokyo_tz = ZoneInfo('Asia/Tokyo')
tokyo_time = utc_now.astimezone(tokyo_tz)
ny_tz = ZoneInfo('America/New_York')
ny_time = utc_now.astimezone(ny_tz)
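# Make a naive datetime timezone-aware (replace attaches the zone without shifting the clock time)
naive = datetime(2026, 5, 17, 14, 30)
aware = naive.replace(tzinfo=timezone.utc)
# Do not mix naive and aware datetimes: ordering comparisons (<, >) and subtraction raise TypeError,
# and equality comparisons simply return False
# Always use timezone-aware datetimes in production code
json — JSON Encoder and Decoder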
JSON (JavaScript Object Notation) is the de facto data interchange format on the web. Most REST APIs, and many configuration files and data export pipelines, use it. The json module handles serialization (Python objects to JSON strings) and deserialization (JSON strings to Python objects) with a simple, battle-tested API.
Core Operations: dump, dumps, load, loads
The naming convention is straightforward: functions ending in s work with strings, functions without s work with file objects:
import json
# Python → JSON string
data = {'name': 'Alice', 'age': 30, 'city': 'New York'}
json_str = json.dumps(data) # compact
json_str = json.dumps(data, indent=2, sort_keys=True) # pretty-printed
# JSON string → Python
parsed = json.loads('{"name": "Bob", "age": 25}')
# Python → JSON file
with open('data.json', 'w') as f:
    json.dump(data, f, indent=2)
# JSON file → Python
with open('data.json', 'r') as f:
    loaded = json.load(f)
# Type mapping:
# Python dict → JSON object
# Python list → JSON array
# Python str → JSON string
# Python int/float → JSON number
# Python True/False → JSON true/false
# Python None → JSON null
Custom Encoders and Decoders
Not all Python objects are JSON-serializable by default. datetime objects, Decimal values, and custom classes require custom encoding. The standard approach is to subclass JSONEncoder or provide a default function:
import json
from datetime import datetime, date
from decimal import Decimal
# Approach 1: default function
def custom_encoder(obj):
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return float(obj) # may lose precision; use str(obj) to keep exact digits
    if isinstance(obj, set):
        return list(obj)
    raise TypeError(f'Object of type {type(obj)} is not JSON serializable')
data = {'timestamp': datetime.now(), 'amount': Decimal('99.99')}
json_str = json.dumps(data, default=custom_encoder)
# Approach 2: subclass JSONEncoder
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return {'__type__': 'datetime', 'value': obj.isoformat()}
        if isinstance(obj, Decimal):
            return str(obj) # without this branch, the Decimal in data raises TypeError
        return super().default(obj)
json_str = json.dumps(data, cls=CustomEncoder)
# Approach 3: object_hook for deserialization
def datetime_decoder(dct):
    if dct.get('__type__') == 'datetime':
        return datetime.fromisoformat(dct['value'])
    return dct
parsed = json.loads(json_str, object_hook=datetime_decoder)
Additional useful parameters:
import json
data = {'users': [{'name': 'Alice'}, {'name': 'Bob'}]}
# ensure_ascii=False — output non-ASCII characters directly
json.dumps({'greeting': '你好'}, ensure_ascii=False) # '{"greeting": "你好"}'
# separators — compact output (remove whitespace)
json.dumps(data, separators=(',', ':')) # '{"users":[{"name":"Alice"},{"name":"Bob"}]}'
# allow_nan — control NaN and Infinity handling
json.dumps({'value': float('nan')}, allow_nan=True) # '{"value": NaN}'
# json.dumps({'value': float('nan')}, allow_nan=False) # ValueError
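# skipkeys — skip dict keys that are not str, int, float, bool, or None instead of raising TypeError
# (int keys such as 1 are not skipped; they are converted to the string "1")
json.dumps({(1, 2): 'value', 'key': 'val'}, skipkeys=True) # '{"key": "val"}'
re — Regular Expression Operations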
Regular expressions are a domain-specific language for pattern matching in strings. The re module provides both a function-based API (convenient for one-off operations) and an object-oriented API built around compiled pattern objects (more efficient for repeated use). The cheat sheet covers both approaches, all major functions, and the full set of regex flags.
Compiling and Matching
import re
# Compile for repeated use (avoids re-parsing the pattern each time)
pattern = re.compile(r'\d{4}-\d{2}-\d{2}') # date pattern YYYY-MM-DD
# search — find first match anywhere in string
m = pattern.search('Date: 2026-05-17, Time: 14:30')
if m:
    print(m.group()) # '2026-05-17'
    print(m.start()) # 6
    print(m.end()) # 16
# match — match at the START of the string only
m = re.match(r'\d{4}', '2026-05-17')
print(m.group()) # '2026'
m = re.match(r'\d{4}', 'Date: 2026') # None — doesn't start with digits
# fullmatch — entire string must match the pattern
m = re.fullmatch(r'\d{4}-\d{2}-\d{2}', '2026-05-17')
print(m.group()) # '2026-05-17'
m = re.fullmatch(r'\d{4}-\d{2}-\d{2}', 'Date: 2026-05-17') # None
# findall — find all non-overlapping matches as a list of strings
text = 'emails: alice@example.com, bob@test.org'
emails = re.findall(r'[\w.+-]+@[\w-]+\.[a-z]{2,}', text)
# ['alice@example.com', 'bob@test.org']
Groups, Substitution, and Splitting
import re
# Groups — extract parts of a match
pattern = re.compile(r'(\d{4})-(\d{2})-(\d{2})')
m = pattern.search('Date: 2026-05-17')
print(m.group(0)) # '2026-05-17' — entire match
print(m.group(1)) # '2026' — first group
print(m.group(2)) # '05' — second group
print(m.groups()) # ('2026', '05', '17') — all groups as tuple
# Named groups
pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
m = pattern.search('2026-05-17')
print(m.group('year')) # '2026'
print(m.groupdict()) # {'year': '2026', 'month': '05', 'day': '17'}
# sub — search and replace
text = 'My number is 555-1234'
result = re.sub(r'\d', '*', text) # 'My number is ***-****'
# sub with backreferences
result = re.sub(r'(\d{3})-(\d{4})', r'(\1) \2', '555-1234')
# '(555) 1234'
# sub with a function
def double(match):
    return str(int(match.group()) * 2)
result = re.sub(r'\d+', double, 'Price: 10, Qty: 5')
# 'Price: 20, Qty: 10'
# subn — like sub but returns (result, count)
text = 'cat dog cat'
result, count = re.subn(r'cat', 'bird', text)
# ('bird dog bird', 2)
# split — split string by pattern
parts = re.split(r'[,;\s]+', 'apple, banana; cherry date')
# ['apple', 'banana', 'cherry', 'date']
Regex Flags
Flags modify how the regex engine interprets the pattern. They can be set inline in the pattern using (?flags) syntax or passed as the flags parameter:
import re
# re.IGNORECASE (re.I) — case-insensitive matching
pattern = re.compile(r'python', re.IGNORECASE)
print(pattern.findall('Python PYTHON python')) # ['Python', 'PYTHON', 'python']
# re.DOTALL (re.S) — dot matches newlines
pattern = re.compile(r'hello.world', re.DOTALL)
m = pattern.search('hello\nworld')
print(m.group()) # 'hello\nworld' — without DOTALL this would be None
# re.MULTILINE (re.M) — ^ and $ match line boundaries, not just string boundaries
text = 'apple\nbanana\ncherry'
print(re.findall(r'^\w+', text)) # ['apple'] — only start-of-string
print(re.findall(r'^\w+', text, re.MULTILINE)) # ['apple', 'banana', 'cherry'] — start of each line
# With MULTILINE: ^ matches after each \n, $ matches before each \n
# re.VERBOSE (re.X) — allows comments and whitespace in patterns
pattern = re.compile(r'''
    (\d{4})  # year
    -        # separator
    (\d{2})  # month
    -        # separator
    (\d{2})  # day
''', re.VERBOSE)
# Combine flags with bitwise OR
pattern = re.compile(r'pattern', re.IGNORECASE | re.DOTALL | re.MULTILINE)
Lookahead, Lookbehind, and Non-Capturing Groups
import re
# Lookahead (?=...) — match if followed by, but don't consume
prices = re.findall(r'\d+(?=\.\d{2})', '$12.99 $5.00 $100')
# ['12', '5'] — matches digits only when followed by .XX, doesn't consume the decimal
# Negative lookahead (?!...) — match if NOT followed by
no_cents = re.findall(r'\$(\d+)\b(?!\.\d{2})', '$12.99 $5.00 $100')
# ['100'] — dollar amounts NOT followed by .XX; the \b keeps \d+ from backtracking to a partial match such as '1'
# Lookbehind (?<=...) — match if preceded by
dollar_amounts = re.findall(r'(?<=\$)\d+', '$12 $5 $100')
# ['12', '5', '100'] — matches digits preceded by $
# Negative lookbehind (?<!...) — match if NOT preceded by
no_dollar = re.findall(r'(?<!\$)\b\d+\b', '12 $5 100')
# ['12', '100'] — digits NOT preceded by $
# Non-capturing group (?:...) — group without creating a backreference
pattern = re.compile(r'(?:https?://)?(?:www\.)?([\w.-]+\.[a-z]{2,})')
m = pattern.search('Visit www.example.com today!')
print(m.group(1)) # 'example.com' — only the domain is captured
math — Mathematical Functions
The math module provides access to the C standard math library. It includes number-theoretic and representation functions, power and logarithmic functions, trigonometric functions, angular conversion, hyperbolic functions, and constants like pi, e, tau, and inf. Most functions convert their arguments to float and return a float, while the integer functions (factorial, gcd, lcm, comb, perm) and the rounding helpers ceil, floor, and trunc return exact ints:
import math
# Constants
print(math.pi) # 3.141592653589793
print(math.e) # 2.718281828459045
print(math.tau) # 6.283185307179586 (Python 3.6+, = 2*pi)
print(math.inf) # float('inf')
print(math.nan) # float('nan')
# Number-theoretic functions
print(math.factorial(5)) # 120 (= 5*4*3*2*1)
print(math.gcd(48, 18)) # 6 (greatest common divisor)
print(math.lcm(12, 18)) # 36 (least common multiple, Python 3.9+)
print(math.comb(5, 2)) # 10 (5 choose 2, nCr, Python 3.8+)
print(math.perm(5, 2)) # 20 (5P2, permutations, Python 3.8+)
# Rounding and truncation
print(math.ceil(3.1)) # 4
print(math.floor(3.9)) # 3
print(math.trunc(-3.9)) # -3 (truncation toward zero)
print(round(2.5)) # 2 (banker's rounding in Python 3)
# Powers and logarithms
print(math.sqrt(16)) # 4.0
print(math.pow(2, 10)) # 1024.0
print(math.exp(2)) # 7.389... (e^2)
print(math.log(100, 10)) # 2.0 (log base 10 of 100)
print(math.log2(8)) # 3.0
print(math.log10(1000)) # 3.0
# Angle conversion
print(math.degrees(math.pi)) # 180.0
print(math.radians(180)) # 3.141592653589793
# Trigonometric functions (radians)
print(math.sin(math.pi / 2)) # 1.0
print(math.cos(math.pi)) # -1.0
print(math.tan(math.pi / 4)) # 0.999... (approximately 1.0)
# Floating-point comparisons
print(math.isclose(0.1 + 0.2, 0.3)) # True (safe float comparison)
print(math.isinf(math.inf)) # True
print(math.isnan(math.nan)) # True
# prod — product of all elements (Python 3.8+)
print(math.prod([1, 2, 3, 4, 5])) # 120
# dist — Euclidean distance between two points (Python 3.8+)
print(math.dist((0, 0), (3, 4))) # 5.0
# hypot — sqrt(x*x + y*y), useful for vector magnitude
print(math.hypot(3, 4)) # 5.0
# fsum — precise floating-point sum (avoids accumulation errors)
numbers = [0.1] * 10
print(sum(numbers)) # 0.9999999999999999 on Python 3.11 and earlier; 1.0 on 3.12+, where sum() uses compensated summation
print(math.fsum(numbers)) # 1.0 (correctly rounded on every version)
pathlib — Object-Oriented Filesystem Paths
pathlib was introduced in Python 3.4 as a modern, object-oriented alternative to os.path and related functions. Instead of passing strings through functions, you create Path objects and call methods on them. The result is code that reads more naturally, is easier to chain, and handles cross-platform path separators automatically. pathlib is the recommended way to work with file paths in all new Python code.
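As a small illustration of the separator handling described above, the pure path classes show how the same joins render under each flavor without touching the file system. This is only a sketch; the directory and file names are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths perform no I/O, so both flavors can be built on any operating system.
posix_path = PurePosixPath('/home/user') / 'documents' / 'report.txt'
windows_path = PureWindowsPath('C:/Users/user') / 'documents' / 'report.txt'

print(posix_path)          # /home/user/documents/report.txt
print(windows_path)        # C:\Users\user\documents\report.txt
print(posix_path.stem)     # report
print(windows_path.parts)  # ('C:\\', 'Users', 'user', 'documents', 'report.txt')
```

In everyday code you would normally use Path, which picks the flavor that matches the running platform.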
Path Construction and Inspection
from pathlib import Path
# Create Path objects
p = Path('/home/user/documents/report.txt')
cwd = Path.cwd() # current working directory
home = Path.home() # user's home directory
# Path components
print(p.name) # 'report.txt'
print(p.stem) # 'report' (name without suffix)
print(p.suffix) # '.txt'
print(p.suffixes) # ['.txt'] (Path('archive.tar.gz').suffixes gives ['.tar', '.gz'])
print(p.parent) # Path('/home/user/documents')
print(p.parent.parent) # Path('/home/user')
print(p.anchor) # '/' (Windows: 'C:\')
print(p.parts) # ('/', 'home', 'user', 'documents', 'report.txt')
# Relative and absolute paths
print(p.is_absolute()) # True
p2 = Path('data/config.json')
print(p2.is_absolute()) # False
absolute = p2.resolve() # Resolve to absolute path
# Relative path between two paths
rel = Path('/a/b/c').relative_to('/a') # Path('b/c')
Path Construction with / Operator
One of pathlib's most readable features is the / operator for path joining. It is a deliberate design choice that makes path construction look natural:
from pathlib import Path
base = Path('/home/user')
config_dir = base / '.config' # Path('/home/user/.config')
config_file = config_dir / 'settings.json' # Path('/home/user/.config/settings.json')
# Chain multiple joins
log_dir = Path.home() / 'logs' / 'app' / '2026' / '05'
# Path('/home/username/logs/app/2026/05')
# Works with string components too
data_file = Path.cwd() / 'data' / 'input.csv'
output_dir = Path('results') / 'run_01'
# create subdirectories
(Path.home() / 'projects' / 'myapp' / 'tests').mkdir(parents=True, exist_ok=True)
File and Directory Operations
from pathlib import Path
p = Path('data.txt')
# Reading and writing
text = p.read_text(encoding='utf-8') # read entire file
p.write_text('Hello, World!', encoding='utf-8') # write (overwrites)
binary = p.read_bytes() # read as bytes
p.write_bytes(b'\x89PNG') # write bytes
# Iterate over lines (memory-efficient for large files)
with p.open('r') as f:
    for line in f:
        print(line.strip())
# Check properties
print(p.exists()) # True/False
print(p.is_file()) # True if regular file
print(p.is_dir()) # True if directory
print(p.is_symlink()) # True if symbolic link
# File metadata
stat = p.stat()
print(stat.st_size) # file size in bytes
print(stat.st_mtime) # modification time (Unix timestamp)
# Glob (match files by pattern)
for py_file in Path.cwd().glob('*.py'):
    print(py_file)
# Recursive glob: rglob('*.py') is shorthand for glob('**/*.py')
for py_file in Path.cwd().rglob('*.py'):
    print(py_file)
# iterdir — list directory contents
for entry in Path.cwd().iterdir():
    if entry.is_dir():
        print(f'[DIR] {entry.name}')
    else:
        print(f'[FILE] {entry.name}')
Directory and File Manipulation
from pathlib import Path
# Create directories
Path('new_dir').mkdir() # creates single directory
Path('a/b/c').mkdir(parents=True) # creates a/b/c like mkdir -p
Path('nested/structure').mkdir(parents=True, exist_ok=True) # no error if exists
# Rename / Move
Path('old.txt').rename('new.txt')
Path('temp/data.csv').replace('archive/data.csv') # replace destination if exists
# Copy (use shutil for full copy support)
import shutil
shutil.copy2(Path('source.txt'), Path('dest.txt'))
# Delete
Path('temp.txt').unlink() # delete file
Path('temp.txt').unlink(missing_ok=True) # no error if missing (Python 3.8+)
Path('empty_dir').rmdir() # delete empty directory
# Symlinks
Path('link_name').symlink_to('/actual/path') # create symlink
target = Path('link_name').readlink() # read symlink target (Python 3.9+)
# Touch (create empty file or update timestamp)
Path('marker.txt').touch()
Path('marker.txt').touch(exist_ok=True)
pathlib vs os.path Comparison
The following table shows equivalent operations in both APIs to help you transition from os.path to pathlib:
# Operation            os.path                                    pathlib
# ─────────────────────────────────────────────────────────────────────────────────────────
# Join paths           os.path.join('a', 'b')                     Path('a') / 'b'
# Parent directory     os.path.dirname(p)                         p.parent
# Filename             os.path.basename(p)                        p.name
# Stem (no extension)  os.path.splitext(os.path.basename(p))[0]   p.stem
# Extension            os.path.splitext(p)[1]                     p.suffix
# Absolute path        os.path.abspath(p)                         p.resolve() (also follows symlinks)
# Check exists         os.path.exists(p)                          p.exists()
# Check is file        os.path.isfile(p)                          p.is_file()
# Check is dir         os.path.isdir(p)                           p.is_dir()
# File size            os.path.getsize(p)                         p.stat().st_size
# Glob files           glob.glob('*.py')                          Path.cwd().glob('*.py')
# Recursive glob       glob.glob('**/*.py', recursive=True)       Path.cwd().rglob('*.py')
# Read text            open(p).read()                             p.read_text()
# Write text           open(p, 'w').write(t)                      p.write_text(t)
Practical Use Cases: Combining Modules
The real power of the standard library emerges when you combine modules. A single script might use pathlib to find files, json to parse their contents, datetime to filter by date, re to extract specific fields, and collections.Counter to aggregate results. The three examples below sketch patterns of this kind.
Build a Log File Analyzer
This script uses pathlib, re, and collections.Counter to parse server logs and report the URLs that most often return server errors:
from pathlib import Path
import re
from collections import Counter
# Pattern for Apache/Nginx combined log format
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)
log_dir = Path('/var/log/nginx')
error_counts = Counter()
status_counts = Counter()
for log_file in log_dir.glob('access.log*'):
    # Rotated logs are often gzip-compressed; read_text() cannot decode those
    if log_file.suffix == '.gz':
        continue
    for line in log_file.read_text().splitlines():
        m = LOG_PATTERN.search(line)
        if m:
            status = int(m.group('status'))
            status_counts[status] += 1
            if status >= 500:
                error_counts[m.group('url')] += 1
print('Status code distribution:')
for status, count in status_counts.most_common():
    print(f' {status}: {count}')
print('\nTop error URLs:')
for url, count in error_counts.most_common(5):
    print(f' {url}: {count} occurrences')
Batch JSON Data Processor
Combining pathlib, json, datetime, and functools to cache expensive JSON processing:
from pathlib import Path
import json
from datetime import datetime, timedelta
from functools import lru_cache
from collections import defaultdict
@lru_cache(maxsize=256)
def load_json_file(filepath: str) -> dict:
    """Cache loaded JSON files to avoid re-reading in the same run."""
    return json.loads(Path(filepath).read_text())
def aggregate_recent_data(data_dir: str, days: int = 7):
    """Load all JSON files from the last N days and aggregate by category."""
    cutoff = datetime.now() - timedelta(days=days)
    data_dir = Path(data_dir)
    aggregated = defaultdict(list)
    for json_file in sorted(data_dir.glob('*.json')):
        mtime = datetime.fromtimestamp(json_file.stat().st_mtime)
        if mtime < cutoff:
            continue
        try:
            data = load_json_file(str(json_file))
            category = data.get('category', 'uncategorized')
            aggregated[category].append(data)
        except (json.JSONDecodeError, KeyError):
            continue
    return dict(aggregated)
# Usage:
results = aggregate_recent_data('/data/exports', days=30)
for category, items in results.items():
    print(f'{category}: {len(items)} records')
Configuration File System
A common pattern is layering configuration from multiple sources using ChainMap, json, and os.environ:
import json
import os
from pathlib import Path
from collections import ChainMap
def load_config(app_name: str):
    """Load config with layered overrides: defaults < config file < env vars."""
    # 1. Hard-coded defaults
    defaults = {
        'host': 'localhost',
        'port': 8080,
        'debug': False,
        'log_level': 'INFO',
    }
    # 2. JSON config file
    config_file = Path.home() / '.config' / app_name / 'settings.json'
    file_config = {}
    if config_file.exists():
        file_config = json.loads(config_file.read_text())
    # 3. Environment variables (uppercase, prefixed with APP_NAME_)
    env_config = {}
    prefix = f'{app_name.upper()}_'
    for key, value in os.environ.items():
        if key.startswith(prefix):
            config_key = key[len(prefix):].lower()
            # Cast numeric values
            if value.isdigit():
                value = int(value)
            elif value.lower() in ('true', 'false'):
                value = value.lower() == 'true'
            env_config[config_key] = value
    return ChainMap(env_config, file_config, defaults)
config = load_config('myapp')
print(config['host']) # env var overrides file overrides default
print(config['port']) # all sources visible through one interface
When to Use Each Module (Decision Guide)
With ten modules, it helps to have a quick decision matrix. Here is when to reach for each one:
- Need to read a file path or check if it exists? Use pathlib. It is more readable than os.path and handles cross-platform quirks automatically.
- Need to manipulate environment variables? Use os. os.environ is the canonical way to access env vars in Python. To run a subprocess, reach for the subprocess module.
- Need to parse command-line arguments? Use sys.argv for simple scripts, argparse for anything with flags and help text.
- Need to count frequencies or find most-common elements? Use collections.Counter. It is purpose-built and saves you a hand-rolled dict.
- Need a fast queue or stack? Use collections.deque. O(1) on both ends versus O(n) for list.pop(0).
- Need to group items by category without checking keys? Use collections.defaultdict(list). It eliminates the if-key-exists conditional.
- Need a lightweight immutable data record? Use collections.namedtuple. For mutable records with type hints, use dataclasses.
- Need to flatten nested lists or chain sequences? Use itertools.chain.from_iterable(). It avoids creating intermediate copies.
- Need to generate combinations or permutations? Use itertools.combinations() and itertools.permutations(). They are tested, correct, and memory-efficient.
- Need to cache expensive function results? Use functools.lru_cache. A one-line decorator can turn an exponential-time recursive algorithm into a linear-time one.
- Need to pre-fill some function arguments? Use functools.partial. It makes callbacks and pipeline stages cleaner.
- Need to parse or format dates and times? Use datetime.strptime() and datetime.strftime(). The format codes are worth memorizing because they come up constantly.
- Need to calculate date differences or schedule future events? Use datetime.timedelta. Date arithmetic with timedelta is far more readable than manual epoch-seconds math.
- Need to serialize Python objects to JSON? Use json.dumps() with a custom default function for non-standard types like datetime and Decimal.
- Need to find patterns in text (emails, URLs, dates)? Use re.compile() to precompile the pattern, then findall() or search().
- Need precise mathematical calculations? Use math. For arbitrary-precision decimals, use the decimal module. For arrays and linear algebra, use NumPy.
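A few of the picks from the guide above can be sketched in one short script. The sample data and the encode_extra helper are hypothetical, chosen only to keep the example self-contained.

```python
import json
from collections import Counter, deque
from datetime import datetime
from decimal import Decimal
from functools import partial

# Counter: frequency counting with most_common()
status_counts = Counter([200, 404, 200, 500, 200, 404])
top_two = status_counts.most_common(2)

# deque: constant-time pops from the left, unlike list.pop(0)
queue = deque(['job1', 'job2', 'job3'])
first_job = queue.popleft()

# partial: pre-fill an argument to get a simpler callable
parse_binary = partial(int, base=2)
ten = parse_binary('1010')

# json.dumps calls default= only for objects it cannot serialize itself
def encode_extra(obj):
    """Hypothetical helper: convert datetime and Decimal to strings."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(f'Cannot serialize {type(obj).__name__}')

payload = json.dumps(
    {'when': datetime(2026, 5, 17, 14, 30), 'price': Decimal('9.99')},
    default=encode_extra,
)

print(top_two)    # [(200, 3), (404, 2)]
print(first_job)  # job1
print(ten)        # 10
print(payload)    # {"when": "2026-05-17T14:30:00", "price": "9.99"}
```

Serializing Decimal as a string is one possible choice; it preserves the exact digits at the cost of the consumer having to parse them back.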
How to Use the Interactive Python Standard Library Cheat Sheet
Our Python Standard Library Expansion Cheat Sheet is designed for developers who need fast, accurate reference material during coding sessions. When you open the tool, you see more than 120 entries organized into ten color-coded module sections. Each entry displays the function signature, a concise description, and a copyable code example. Click any module name in the sidebar to jump directly to that section. Type in the search box to find functions by name, module, or keyword. Click the copy icon on any code block to copy the example to your clipboard.
The tool is 100% client-side. No data is sent to any server. No signup is required. No cookies track your browsing. The page works offline once loaded, making it ideal for working on airplanes, in secure environments, or anywhere with unreliable internet. The interface is styled with "The Grand Archive's Reading Room" aesthetic: warm amber-gold lighting, parchment-textured cards, Playfair Display headings, and Lora body text. Subtle dust mote particle effects drift across the background, evoking the atmosphere of a classical European library. It is a reference experience that feels different from typical developer tools — calmer, more focused, and designed for long reading sessions.
Whether you are a junior developer learning the standard library for the first time, an experienced engineer looking up function signatures during a code review, or a data scientist who needs quick access to itertools and collections, this cheat sheet is the fastest path from "how do I..." to copy-pasteable working code. Bookmark it. Share it with your team. Keep it in a pinned tab where it earns its place.
Related Tools
DevToolkit offers a growing collection of interactive Python references and developer tools. Explore these related resources:
- Python Built-in Functions Cheat Sheet — All 71 Python built-in functions with categorized examples, from abs() to zip().
- Python Data Structures Deep-Dive — Lists, dicts, sets, and tuples with time complexity analysis and optimization patterns.
- Python Type Hints Deep-Dive — Complete reference for Python type annotations, generics, protocols, and TypedDict.
- Python Advanced Patterns Cheat Sheet — Decorators, context managers, descriptors, metaclasses, and concurrency patterns.
- Python Comprehensions Cheat Sheet — List, dict, set, and generator comprehensions with nested and conditional patterns.
- Python Dictionary Methods Cheat Sheet — Every dict method from get() to setdefault() with practical examples.
- Python List Methods Cheat Sheet — Complete list method reference: append, extend, sort, pop, and more.
- Python Set Methods Cheat Sheet — Set operations, method reference, and performance characteristics.
- Regex Tester Online — Interactive regex playground. Test patterns, see matches highlighted in real time, and debug your regular expressions.
- JSON Formatter Online — Format, validate, minify, and tree-view JSON data. Paste messy JSON and get pretty output instantly.
FAQ
What are the most important Python standard library modules every developer should know?
The essential modules every Python developer should know are os (file system and OS operations), sys (interpreter runtime), collections (specialized containers like namedtuple and deque), itertools (iterator building blocks), functools (higher-order functions like lru_cache), datetime (dates and times), json (serialization), re (regular expressions), math (mathematical functions), and pathlib (object-oriented file paths). These ten modules cover the operations that appear in nearly every non-trivial Python program. Mastering them means you can write most scripts and applications without reaching for third-party packages. This cheat sheet covers all 10 with over 120 entries, each including practical, copyable code examples.
What is the difference between os.path and pathlib for file path handling in Python?
os.path uses string-based functions (os.path.join, os.path.exists, os.path.basename) while pathlib provides an object-oriented Path class (Path.home(), path.glob(), path.read_text()). pathlib is more readable and chainable: Path('.').glob('*.py') compared to glob.glob('*.py') using the glob module. It was introduced in Python 3.4 and has been improved in every release since. The Python documentation now recommends pathlib as the preferred approach for new code. However, os.path is still widely used in existing codebases, and knowing both APIs is important for reading and maintaining legacy code. The cheat sheet includes a side-by-side comparison table to help with the transition.
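To make the contrast concrete, here is a minimal sketch that performs the same steps with both APIs inside a temporary directory. The file name notes.txt is an arbitrary choice for the example.

```python
import os.path
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # String-based API: each step is a function call on a str
    legacy = os.path.join(tmp, 'notes.txt')
    with open(legacy, 'w', encoding='utf-8') as f:
        f.write('hello')
    legacy_name = os.path.basename(legacy)
    legacy_stem = os.path.splitext(legacy_name)[0]
    legacy_exists = os.path.exists(legacy)

    # Object-oriented API: the same steps as attributes and methods
    modern = Path(tmp) / 'notes.txt'
    modern_text = modern.read_text(encoding='utf-8')
    same_answers = (
        modern.name == legacy_name
        and modern.stem == legacy_stem
        and modern.exists() == legacy_exists
    )

print(modern_text)   # hello
print(same_answers)  # True
```

Both halves operate on the same file, so mixing the two styles during a gradual migration is safe: Path objects are accepted by open() and by the os.path functions.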
How do itertools help write more efficient Python code?
itertools provides memory-efficient iterator building blocks that process data lazily. chain() concatenates several iterables into one stream, and chain.from_iterable() flattens one level of nesting, without creating intermediate lists. islice() lets you slice any iterator without copying its contents into memory. groupby() groups consecutive items that share a key and holds only the current group in memory, so sort the input by that key first if equal keys are not already adjacent. product(), combinations(), and permutations() generate combinatoric results lazily, computing each result only when requested. These tools let you process large or infinite data streams without loading everything into memory at once. For data pipelines, ETL operations, and any code that loops over sequences, itertools replaces nested for-loops with tested, C-optimized primitives that are often both faster and more readable.
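The following sketch exercises the functions named in this answer on small, made-up inputs so the lazy behavior is easy to see.

```python
from itertools import chain, combinations, count, groupby, islice

# count() is infinite; islice() takes a finite window without building a list first
first_evens = list(islice(count(0, 2), 5))

# chain() concatenates iterables; chain.from_iterable() flattens one level of nesting
merged = list(chain([1, 2], (3, 4)))
flattened = list(chain.from_iterable([[1, 2], [3], [4, 5]]))

# groupby() only groups adjacent items, so sort by the same key first
words = sorted(['kiwi', 'fig', 'plum', 'pear', 'date'], key=len)
by_length = {n: list(group) for n, group in groupby(words, key=len)}

# combinations() yields tuples one at a time
pairs = list(combinations('ABC', 2))

print(first_evens)  # [0, 2, 4, 6, 8]
print(merged)       # [1, 2, 3, 4]
print(flattened)    # [1, 2, 3, 4, 5]
print(by_length)    # {3: ['fig'], 4: ['kiwi', 'plum', 'pear', 'date']}
print(pairs)        # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

The list() calls are there only to display the results; in a real pipeline you would usually keep iterating lazily.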
What is functools.lru_cache and when should I use it?
@functools.lru_cache(maxsize=128) is a decorator that memoizes function results — it caches the return value for each unique set of arguments, with a Least Recently Used eviction policy that keeps the most recently accessed results and discards the least recently used when the cache reaches maxsize. Use it for expensive pure functions (same input always produces same output) that are called repeatedly with the same arguments. Classic examples include recursive Fibonacci, dynamic programming solutions, API response parsing with repeated queries, and database query result caching within a single request. For unbounded caching where memory is not a concern, use @functools.cache (Python 3.9+), which is equivalent to @lru_cache(maxsize=None). The cache_info() method lets you inspect hit rates to verify the cache is effective.
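A minimal sketch of the recursive Fibonacci case mentioned above, with cache_info() used to confirm that the cache is doing the work:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fib(n: int) -> int:
    """Naive recursive Fibonacci; the cache removes the repeated subcalls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

result = fib(30)
info = fib.cache_info()

print(result)       # 832040
print(info.misses)  # 31 (one real computation for each n from 0 to 30)
print(info.hits)    # 28 (every fib(n - 2) call for n from 3 to 30 is served from the cache)
```

Without the decorator the same call makes over a million recursive calls, which is why memoization pays off so quickly for this shape of recursion.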
How do Python collections like namedtuple and defaultdict improve code quality?
namedtuple creates lightweight, immutable data classes with named fields that support both attribute access and tuple unpacking. They are perfect for records, coordinates, configuration objects, and any situation where you need a simple data carrier. They are more memory-efficient than full classes and their immutability prevents accidental mutation. defaultdict eliminates the boilerplate of checking whether a key exists before accessing it: use defaultdict(list) for grouping items by category, defaultdict(int) for counters, and defaultdict(set) for deduplication. Counter provides dict-like frequency counting with a convenient .most_common() method. deque gives O(1) append and pop from both ends, making it ideal for queues, stacks, and sliding windows. In CPython, deque and defaultdict are implemented in C and Counter uses a C helper for counting, so they are usually faster than hand-rolled alternatives.
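Here is a short sketch of three of these containers in use. The Point type, the team data, and the readings are invented for the example.

```python
from collections import defaultdict, deque, namedtuple

# namedtuple: named fields, tuple unpacking, and immutability
Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)
x, y = p
moved = p._replace(x=10)  # returns a new Point; p itself is unchanged
try:
    p.x = 99
    mutated = True
except AttributeError:
    mutated = False

# defaultdict(list): group items without checking whether the key exists
groups = defaultdict(list)
for name, team in [('Alice', 'red'), ('Bob', 'blue'), ('Cara', 'red')]:
    groups[team].append(name)

# deque(maxlen=3): a sliding window that drops the oldest item automatically
window = deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:
    window.append(reading)

print(moved)         # Point(x=10, y=4)
print(mutated)       # False
print(dict(groups))  # {'red': ['Alice', 'Cara'], 'blue': ['Bob']}
print(list(window))  # [3, 4, 5]
```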
How do I parse and format dates with Python's datetime module?
Use strptime() to parse strings into datetime objects: datetime.strptime("2026-05-17", "%Y-%m-%d"). The second argument is the format string, where %Y is a four-digit year, %m is a zero-padded month, and %d is a zero-padded day. Use strftime() to format datetime objects back into display strings: dt.strftime("%B %d, %Y") yields "May 17, 2026" where %B is the full month name. timedelta represents durations for date arithmetic: dt + timedelta(days=7) adds one week. For timezone-aware dates, use timezone.utc for UTC or the standard-library zoneinfo module (Python 3.9+) for IANA timezone database support; on systems without a time zone database, such as Windows, zoneinfo also needs the tzdata package from PyPI. The cheat sheet includes a complete reference of format codes and common date manipulation patterns.
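The calls described in this answer fit together as follows. The target date of 25 December is an arbitrary choice for the subtraction example.

```python
from datetime import datetime, timedelta, timezone

# Parse, then format back for display (%B uses the locale's month name; 'May' by default)
dt = datetime.strptime('2026-05-17', '%Y-%m-%d')
label = dt.strftime('%B %d, %Y')

# Date arithmetic with timedelta
next_week = dt + timedelta(days=7)
gap = datetime(2026, 12, 25) - dt  # subtracting datetimes gives a timedelta

# Attach UTC to make the naive datetime timezone-aware
aware = dt.replace(tzinfo=timezone.utc)

print(label)              # May 17, 2026
print(next_week.date())   # 2026-05-24
print(gap.days)           # 222
print(aware.isoformat())  # 2026-05-17T00:00:00+00:00
```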
How do re.compile and regex flags improve regular expression performance?
re.compile(pattern, flags) precompiles a regex pattern into a pattern object, avoiding the overhead of re-parsing the pattern on every call. The module-level functions such as re.search() keep a cache of recently compiled patterns, so the saving is usually modest: compiling once mainly skips the cache lookup inside hot loops and gives the pattern a reusable name. Flags modify how the regex engine interprets the pattern: re.IGNORECASE makes matching case-insensitive, re.DOTALL makes the dot (.) match newline characters in addition to all other characters, and re.MULTILINE changes ^ and $ to match the start and end of each line rather than just the start and end of the entire string. Multiple flags are combined with the bitwise OR operator: re.compile(r'pattern', re.IGNORECASE | re.DOTALL). The cheat sheet covers all flags and includes performance comparisons to illustrate when compilation matters.
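As a sketch of how compiled patterns and flags combine, the snippet below scans a made-up three-line log. The LOG_LINE name and the sample text are assumptions for the example.

```python
import re

# Compile once with combined flags, then reuse the pattern object
LOG_LINE = re.compile(
    r'^(?P<level>error|warning): (?P<msg>.+)$',
    re.IGNORECASE | re.MULTILINE,
)
text = 'ERROR: disk full\ninfo: started\nWarning: low memory'
hits = [(m['level'], m['msg']) for m in LOG_LINE.finditer(text)]

# The same flags can be set inline at the start of the pattern: (?im)
inline = re.findall(r'(?im)^(?:error|warning)', text)

# DOTALL lets the dot cross a newline
without_dotall = re.search(r'full.info', text)
with_dotall = re.search(r'full.info', text, re.DOTALL)

print(hits)            # [('ERROR', 'disk full'), ('Warning', 'low memory')]
print(inline)          # ['ERROR', 'Warning']
print(without_dotall)  # None
print(with_dotall.group() == 'full\ninfo')  # True
```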
Conclusion
Python's standard library is a competitive advantage. It is the reason Python scripts can be shared as a single file with no setup instructions. It is the reason Python is the default language for system administration, data processing, and DevOps automation. The modules covered in this cheat sheet — os, sys, collections, itertools, functools, datetime, json, re, math, and pathlib — form the core of daily Python development. Knowing them deeply means writing less code, shipping faster, and producing fewer bugs.
But knowledge fades without a good reference. Even the most experienced Python developer pauses to check whether it is strptime or strftime, whether permutations or combinations accounts for order, or what the exact syntax is for opening a file with pathlib. Our free Python Standard Library Expansion Cheat Sheet is built for those moments. It is fast, comprehensive, and works offline. It covers 120-plus entries across ten essential modules. It is searchable, filterable, and designed for a pleasant reading experience with "The Grand Archive's Reading Room" aesthetic.
Bookmark the Python Standard Library Expansion Cheat Sheet today. Keep it pinned. The next time you need to chain iterators, cache a recursive function, parse a date string, or flatten a nested list, the answer will be one click away.