Lab 01: CPython Internals

Time: 60 minutes | Level: Architect | Docker: docker run -it --rm python:3.11-slim bash

Overview

Dive deep into CPython's object model, bytecode compilation, and runtime machinery. Understanding CPython internals lets you write code that works with the interpreter rather than against it.

Step 1: The PyObject Model

Every Python object begins with a C struct header (PyObject) that has two key fields: ob_refcnt (the reference count) and ob_type (a pointer to the object's type). sys.getrefcount exposes the first of these:

import sys

x = "hello"
print(sys.getrefcount(x))   # Always +1 because getrefcount itself holds a ref

a = x
b = x
print(sys.getrefcount(x))   # +2 more

del a
print(sys.getrefcount(x))   # back down by 1

# Object identity
print(id(x))                # memory address of PyObject
print(type(x))              # ob_type -> str
print(x.__class__)          # same

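💡 sys.getrefcount() always shows count + 1 because the argument passed to the call is itself a temporary reference. Read the numbers relative to each other rather than as absolute values: the interpreter holds references of its own, for example from the constants of the compiled code.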

Step 2: The dis Module — Bytecode Disassembly

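💡 Since Python 3.6, each bytecode instruction is a 2-byte code unit (1-byte opcode + 1-byte argument). In Python 3.11, some instructions are followed by inline cache entries, so instruction offsets are always even but do not always advance by exactly 2. LOAD_FAST is the cheapest variable access: it reads from the frame's local variable array by index, with no dictionary lookup.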
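A minimal sketch of disassembling a small function, assuming the Python 3.11 image from the lab header. The function add is a made-up example, and the exact opcode names differ between CPython versions.

```python
import dis

def add(a, b):
    total = a + b
    return total

# Human-readable listing: source line, offset, opcode name, argument.
dis.dis(add)

# The same instructions as objects, for programmatic inspection.
instructions = list(dis.get_instructions(add))
offsets = [ins.offset for ins in instructions]
opnames = [ins.opname for ins in instructions]

# Offsets are byte positions; each code unit is 2 bytes wide, so all are even.
print(offsets)
print(opnames)
```

The arguments a and b are read with a LOAD_FAST-family instruction because they live in the frame's local variable array.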

Step 3: Code Objects
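Every function carries a compiled code object in its __code__ attribute. The sketch below prints a few of its fields; the function area is a hypothetical example, and the contents of co_consts and co_stacksize depend on the CPython version.

```python
def area(width, height):
    scale = 2
    return width * height * scale

code = area.__code__

print(code.co_name)        # function name
print(code.co_argcount)    # number of positional parameters
print(code.co_varnames)    # parameters first, then the other locals
print(code.co_consts)      # constants referenced by the bytecode
print(code.co_stacksize)   # evaluation stack depth reserved by the compiler
print(len(code.co_code))   # length of the raw bytecode in bytes
```

Code objects are immutable and contain no runtime state, which is why the same code object can back many simultaneous calls.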

Step 4: Frame Objects

Frame objects represent the execution state of a function call — the runtime equivalent of a stack frame.

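💡 sys._getframe(n) returns the frame n levels up the call stack (0 is the current frame). Debuggers, profilers, and libraries such as the standard logging module use this kind of frame inspection to find their caller at runtime. The leading underscore marks it as a CPython implementation detail.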
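The sketch below shows frame inspection in its simplest form: one function looks up the name of its caller, and another copies its own local variables. The function names caller_name and outer are illustrative only.

```python
import sys

def caller_name():
    # Frame 0 is caller_name itself; frame 1 is whoever called it.
    frame = sys._getframe(1)
    return frame.f_code.co_name

def outer():
    local_value = 42
    # Copy this frame's local variables as they are right now.
    snapshot = dict(sys._getframe(0).f_locals)
    return caller_name(), snapshot

name, snapshot = outer()
print(name)
print(snapshot)
```

Each frame links to its code object through f_code and to its caller through f_back, which is how a traceback is assembled.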

Step 5: sys.getsizeof — Object Memory Layout
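sys.getsizeof reports the size of a single object in bytes, not of the objects it refers to. The sketch below compares a few built-in values; the exact numbers depend on the CPython version and platform, so none are listed here.

```python
import sys

samples = {
    "int": 1,
    "big int": 2**100,
    "empty str": "",
    "empty list": [],
    "3-item list": [1, 2, 3],
    "empty dict": {},
    "empty tuple": (),
}

sizes = {label: sys.getsizeof(obj) for label, obj in samples.items()}
for label, size in sizes.items():
    print(f"{label:12} {size:4d} bytes")

# getsizeof is shallow: the outer list only stores one pointer,
# so it is much smaller than the 1000-element list it contains.
nested = [[0] * 1000]
print(sys.getsizeof(nested), sys.getsizeof(nested[0]))
```

To measure a whole object graph, the sizes of the referenced objects have to be added up separately.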

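💡 CPython preallocates the integers from -5 to 256, so any computation that produces one of these values returns the same object, and a = 256; b = 256; a is b is True. Outside that range identity is not guaranteed: two separately computed 257s are usually distinct objects, although the compiler may share equal constants within one code block. Compare integers with ==, not is.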
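A small sketch of the cache in action. The values are computed at runtime through a helper (add, a made-up name) so that the compiler cannot merge them as constants. This is CPython-specific behavior, not a language guarantee.

```python
def add(a, b):
    # Computed at runtime, so the results are not shared compile-time constants.
    return a + b

small_shared = add(255, 1) is add(255, 1)   # 256 is inside the cached range
large_shared = add(256, 1) is add(256, 1)   # 257 is outside the cached range
equal_anyway = add(256, 1) == add(256, 1)   # value equality always holds

print(small_shared, large_shared, equal_anyway)
```

On CPython the first comparison is True because both calls return the cached 256 object, while the second is generally False because each call allocates a new int.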

Step 6: Bytecode Opcodes Deep Dive
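The dis module also exposes the opcode tables themselves. The sketch below looks up an opcode number, asks for its stack effect, and lists the instruction fields of a small function; describe and greet are illustrative names, and opcode numbers change between CPython versions.

```python
import dis

# opmap maps opcode names to numbers; opname is the reverse lookup table.
load_fast = dis.opmap["LOAD_FAST"]
print(load_fast, dis.opname[load_fast])

# stack_effect reports how an instruction changes the evaluation stack depth.
print(dis.stack_effect(load_fast, 0))          # pushes one value
print(dis.stack_effect(dis.opmap["POP_TOP"]))  # pops one value

def describe(func):
    """Return (offset, opname, arg, argval) for each instruction of func."""
    return [(i.offset, i.opname, i.arg, i.argval) for i in dis.get_instructions(func)]

def greet(name):
    return "hi " + name

rows = describe(greet)
for row in rows:
    print(row)
```

arg is the raw numeric operand, while argval is the resolved meaning of that operand, such as a variable name or a constant.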

Step 7: The Peephole Optimizer

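CPython optimizes code at compile time in two places. Since Python 3.7, constant folding happens in an AST optimizer; a later bytecode-level (peephole) pass then simplifies jumps and removes unreachable code.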

💡 The AST optimizer runs before bytecode generation. Expressions like 2*3 are folded to 6 at compile time — no BINARY_OP in the bytecode.
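Constant folding can be observed by compiling a statement and checking which instructions remain. This is a sketch with a hypothetical helper binary_ops; it matches on the BINARY_ name prefix because the exact opcode names vary across versions.

```python
import dis

def binary_ops(source):
    """Compile a statement and list any BINARY_* instructions it contains."""
    code = compile(source, "<example>", "exec")
    return [i.opname for i in dis.get_instructions(code)
            if i.opname.startswith("BINARY_")]

folded = binary_ops("x = 2 * 3")       # both operands are literals
not_folded = binary_ops("x = a * 3")   # 'a' is only known at runtime

print(folded)
print(not_folded)

# The folded result is stored as a constant of the compiled code.
day = compile("seconds = 60 * 60 * 24", "<example>", "exec")
print(day.co_consts)
```

The first list is empty because the multiplication was done by the compiler; the second still contains one binary instruction that runs at execution time.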

Step 8: Capstone — Bytecode Analyzer Tool

Build a tool that analyzes a function's bytecode and generates a report:
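One possible shape for such a tool is sketched below. The names analyze, report, and sample are assumptions made for this example, and the report fields are a minimal selection rather than a required format.

```python
import dis
from collections import Counter

def analyze(func):
    """Collect simple bytecode statistics for a plain Python function."""
    code = func.__code__
    instructions = list(dis.get_instructions(func))
    return {
        "name": code.co_name,
        "args": code.co_argcount,
        "locals": list(code.co_varnames),
        "constants": list(code.co_consts),
        "stack_size": code.co_stacksize,
        "bytecode_bytes": len(code.co_code),
        "instruction_count": len(instructions),
        "opcode_histogram": Counter(ins.opname for ins in instructions),
    }

def report(func, top=5):
    """Format the statistics from analyze() as a short text report."""
    stats = analyze(func)
    lines = [
        f"Function:      {stats['name']}",
        f"Arguments:     {stats['args']}",
        f"Locals:        {', '.join(stats['locals'])}",
        f"Stack size:    {stats['stack_size']}",
        f"Bytecode size: {stats['bytecode_bytes']} bytes",
        f"Instructions:  {stats['instruction_count']}",
        f"Top {top} opcodes:",
    ]
    for opname, count in stats["opcode_histogram"].most_common(top):
        lines.append(f"  {opname:<20} {count}")
    return "\n".join(lines)

def sample(values):
    total = 0
    for value in values:
        total += value
    return total

print(report(sample))
```

The histogram is a quick way to spot hot patterns, such as many global lookups inside a loop, before reaching for a profiler.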

Summary

Concept | Tool/API | Use Case
Object model | sys.getrefcount, id() | Memory debugging
Bytecode | dis.dis, dis.get_instructions | Performance analysis
Code objects | func.__code__ | Introspection
Frame objects | sys._getframe() | Debuggers/profilers
Memory size | sys.getsizeof() | Memory optimization
Constant folding | dis + compile() | Compiler optimization
Peephole optimizer | Bytecode inspection | Understanding compiler
