Lab 11: Advanced Concurrency

Objective

Build multi-process pipelines with ProcessPoolExecutor, use multiprocessing.Queue and Pipe for inter-process communication, apply thread-safe data structures and the Actor pattern, and coordinate CPU-bound and I/O-bound work across process and thread pools.

Background

Python's GIL (Global Interpreter Lock) allows only one thread to execute Python bytecode at a time, so threads only win for I/O-bound work. For CPU-bound work (parsing, math, compression), use ProcessPoolExecutor, which creates real OS processes, each with its own GIL. The challenge: processes can't share memory, so data must be serialised (pickled) whenever it crosses the process boundary.
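A minimal sketch of that pickling boundary (the function name `square` is illustrative, not part of the lab code). Module-level functions are pickled by reference, which is why workers can find them; the arguments and results are pickled by value each time they cross into or out of a worker process:

```python
import pickle
from concurrent.futures import ProcessPoolExecutor

def square(n: int) -> int:
    # Module-level function: pickled by name, so worker
    # processes can import the module and look it up.
    return n * n

if __name__ == "__main__":
    # Pickling a module-level function round-trips to the same object.
    assert pickle.loads(pickle.dumps(square)) is square
    with ProcessPoolExecutor(max_workers=2) as pool:
        # Each argument is pickled, sent to a worker, and the
        # result is pickled back to the parent process.
        print(list(pool.map(square, range(5))))  # [0, 1, 4, 9, 16]
```

The `if __name__ == "__main__":` guard matters on platforms that spawn (rather than fork) workers, because each worker re-imports the module and must not re-create the pool.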

Time

35 minutes

Prerequisites

  • Practitioner Lab 04 (Concurrency basics), Lab 05 (Async)

Tools

  • Docker: zchencow/innozverse-python:latest


Lab Instructions

Step 1: ProcessPoolExecutor — CPU-Bound Work

💡 Functions passed to ProcessPoolExecutor.map() must be picklable: defined at the module level, not lambdas or nested functions. Each worker process imports the module and looks the function up by name. A lambda has no stable name, so it can't be pickled. This is a fundamental constraint of cross-process serialisation.
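A sketch of the pattern the step describes, using an illustrative CPU-bound worker (`count_primes` is a name chosen for this example). `map()` pickles each argument, runs the function in a worker process, and yields results in input order:

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit: int) -> int:
    """CPU-bound work: naive count of primes below limit."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [10_000, 20_000, 30_000]
    with ProcessPoolExecutor() as pool:
        # Results come back in input order, regardless of which
        # worker finishes first.
        for limit, primes in zip(limits, pool.map(count_primes, limits)):
            print(f"primes below {limit}: {primes}")
    # Passing a lambda here instead of count_primes would raise a
    # PicklingError: lambdas can't be serialised by name.
```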

📸 Verified Output:


Step 2: submit() + as_completed() — Heterogeneous Tasks
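A sketch of the submit()/as_completed() pattern for tasks of varying duration (`simulate` and the timings are illustrative). Unlike `map()`, `as_completed()` yields futures in finish order, so fast tasks aren't blocked behind slow ones:

```python
import time
from concurrent.futures import ProcessPoolExecutor, as_completed

def simulate(task_id: int, duration: float) -> str:
    """Heterogeneous task: each call takes a different amount of time."""
    time.sleep(duration)
    return f"task {task_id} done after {duration}s"

if __name__ == "__main__":
    jobs = [(1, 0.3), (2, 0.1), (3, 0.2)]
    with ProcessPoolExecutor(max_workers=3) as pool:
        # submit() returns a Future immediately; map the futures back
        # to task ids so we can label results as they arrive.
        futures = {pool.submit(simulate, tid, dur): tid for tid, dur in jobs}
        for fut in as_completed(futures):
            # Finish order here, not submission order: task 2 first.
            print("task", futures[fut], "->", fut.result())
```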

📸 Verified Output:


Steps 3–8: Thread vs Process, Actor pattern, Shared state, Pipeline, Semaphores, Capstone
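One of these steps, the Actor pattern, can be sketched as follows (the `CounterActor` class is an illustrative example, not the lab's own code). All mutable state lives in a single thread; callers interact only through messages on a queue, with a per-request reply queue for answers, so no lock is needed:

```python
import queue
import threading

class CounterActor:
    """Actor: one thread owns the state; messages mutate or read it."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The only thread that ever touches self._count.
        while True:
            msg, reply_q = self._inbox.get()
            if msg == "stop":
                break
            elif msg == "incr":
                self._count += 1
            elif msg == "get":
                reply_q.put(self._count)

    def incr(self):
        self._inbox.put(("incr", None))

    def get(self) -> int:
        reply_q = queue.Queue()            # per-request reply channel
        self._inbox.put(("get", reply_q))
        return reply_q.get()               # blocks until the actor replies

    def stop(self):
        self._inbox.put(("stop", None))

actor = CounterActor()
for _ in range(100):
    actor.incr()
# FIFO ordering guarantees every incr is processed before the get.
print(actor.get())  # 100
actor.stop()
```

Because the inbox is FIFO and only the actor thread mutates the counter, reads are always consistent without any explicit locking.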

📸 Verified Output:


Summary

  • Thread pool (ThreadPoolExecutor): I/O-bound work (HTTP, DB, files)
  • Process pool (ProcessPoolExecutor): CPU-bound work (math, parsing)
  • as_completed (concurrent.futures.as_completed): process results as they arrive
  • Thread queue (queue.Queue): thread-safe work distribution
  • Actor (Thread + Queue + reply_q): encapsulated mutable state
  • Condition var (threading.Condition): bounded producer-consumer
  • Semaphore (threading.BoundedSemaphore): resource limiting
  • Hybrid (thread + process pools): mixed I/O + CPU work

Further Reading
