Lab 03: Neural Network from Scratch

Objective

Build a fully connected neural network using only NumPy: forward propagation through multiple layers, ReLU and sigmoid activations, backpropagation with the chain rule, mini-batch gradient descent, and weight initialisation strategies, trained to classify Surface products by tier.

Background

A neural network stacks layers of weighted connections. Each layer computes Z = XW + b, then applies a non-linear activation (ReLU: max(0, z); sigmoid: 1/(1 + e^(−z))). Backpropagation computes gradients layer by layer using the chain rule: ∂L/∂W = (∂L/∂Z) · (∂Z/∂W). This lab implements a full 3-layer network (input → hidden → hidden → output) without any framework.
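The forward and backward passes above can be sketched for a single layer. This is a minimal illustration with toy dimensions, not the lab's actual data or solution code: Z = XW + b, ReLU, then the chain rule mapping an upstream gradient dL/dA down to dL/dW and dL/db.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration: 4 samples, 3 features, 2 hidden units
X = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 2)) * 0.1
b = np.zeros(2)

# Forward pass: Z = XW + b, then ReLU non-linearity
Z = X @ W + b
A = np.maximum(0, Z)

# Pretend an upstream gradient dL/dA arrives from the next layer
dA = rng.standard_normal(A.shape)

# Chain rule: dL/dZ = dL/dA * ReLU'(Z), where ReLU'(Z) is 0 or 1
dZ = dA * (Z > 0)

# dL/dW = X^T dL/dZ; dL/db sums dL/dZ over the batch
dW = X.T @ dZ
db = dZ.sum(axis=0)

print(dW.shape, db.shape)  # gradients must match their parameter shapes
```

Note that each gradient has exactly the shape of the parameter it updates; the 3-layer network in this lab repeats this pattern once per layer, passing dZ @ W.T backwards as the next upstream gradient.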

Time

35 minutes

Prerequisites

  • Lab 01 (Linear Regression), Lab 02 (Logistic Regression)

Tools

  • Docker: zchencow/innozverse-python:latest


Lab Instructions

💡 He initialisation mitigates the vanishing/exploding gradient problem. If weights start at zero, all neurons in a layer compute identical gradients and the network never learns (the symmetry problem). If weights are too large, gradients explode during backprop. He initialisation scales weights by √(2/fan_in), which keeps activation variance stable through ReLU layers. Xavier initialisation (√(1/fan_in)) is better suited to sigmoid/tanh. This is why torch.nn.Linear uses Kaiming uniform by default.
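The three initialisation schemes above can be compared directly. This is a standalone sketch with an illustrative fan_in of 256, not tied to the lab's network sizes:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 256, 128  # illustrative layer sizes

# He initialisation: std = sqrt(2 / fan_in), suited to ReLU
W_he = rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)

# Xavier initialisation: std = sqrt(1 / fan_in), suited to sigmoid/tanh
W_xavier = rng.standard_normal((fan_in, fan_out)) * np.sqrt(1.0 / fan_in)

# Zero initialisation: every neuron gets identical gradients (symmetry problem)
W_zero = np.zeros((fan_in, fan_out))

# The empirical std of W_he should sit near sqrt(2/256) ~ 0.088
print(W_he.std(), W_xavier.std())
```

Swapping W_he for W_zero in the lab's network is a quick experiment: training loss will stay flat, because all hidden units remain copies of one another.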

📸 Verified Output:


Summary

| Component | Purpose | Key detail |
| --- | --- | --- |
| ReLU max(0, z) | Non-linearity | Derivative is 0 or 1 (no vanishing) |
| Softmax | Multi-class output | Outputs sum to 1 (probabilities) |
| Cross-entropy | Classification loss | Penalises confident wrong predictions |
| Backprop | Gradient computation | Chain rule, layer by layer |
| He init | Weight initialisation | √(2/fan_in) for ReLU |
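The softmax and cross-entropy rows above pair naturally in code. A minimal sketch, with made-up logits and labels for illustration; the max-subtraction trick keeps the exponentials numerically stable:

```python
import numpy as np

def softmax(z):
    # Subtract each row's max before exponentiating: this avoids overflow
    # and leaves the result unchanged, since softmax is shift-invariant
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    # Mean negative log-probability of the true class; a confident wrong
    # prediction puts a tiny probability on the true class, so -log blows up
    n = len(y)
    return -np.log(probs[np.arange(n), y] + 1e-12).mean()

# Toy logits for 2 samples over 3 classes, with true labels y
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
y = np.array([0, 2])

p = softmax(logits)
print(p.sum(axis=1))        # each row sums to 1
print(cross_entropy(p, y))  # small, since both predictions are correct
```

A convenient property used in the lab's backprop: with softmax outputs and one-hot labels, the gradient at the output layer reduces to simply (probs − one_hot) / n.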
