Lab 12: Federated Learning & Privacy-Preserving ML

Objective

Implement federated learning from scratch: FedAvg algorithm, secure aggregation, differential privacy with noise addition, gradient clipping, and privacy budget tracking — applied to a cross-organisation threat sharing scenario.

Time: 50 minutes | Level: Advanced | Docker Image: zchencow/innozverse-ai:latest


Background

Centralised ML:  all data → central server → train → model
Problem:         GDPR, HIPAA, competitive concerns, data sovereignty

Federated Learning:
  Each org trains locally → share model gradients (NOT data) → aggregate globally
  Data never leaves the organisation.

Privacy-preserving additions:
  Differential Privacy (DP): add calibrated noise to gradients
  Secure Aggregation: server can't see individual gradients
  Homomorphic Encryption: compute on encrypted gradients

Step 1: FedAvg Algorithm

📸 Verified Output:


Step 2: Differential Privacy

📸 Verified Output:

💡 The ε–utility tradeoff is clear: ε=10 costs only 1.2% AUC, while ε=0.1 gives near-theoretical privacy but loses 20% AUC. Most production systems use ε between 1 and 10.
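The two DP ingredients above are gradient clipping (bounding each client's contribution, i.e. the query's sensitivity) and calibrated Gaussian noise, plus a running tally of the ε budget. A hedged sketch, with illustrative names (`clip_and_noise`, `PrivacyBudget`) and naive linear composition rather than a tight accountant:

```python
import numpy as np

def clip_and_noise(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip the update's L2 norm, then add Gaussian noise scaled to that bound."""
    rng = rng if rng is not None else np.random.default_rng(0)
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    clipped = grad * scale                             # sensitivity now <= clip_norm
    noise = rng.normal(0.0, noise_multiplier * clip_norm, grad.shape)
    return clipped + noise

class PrivacyBudget:
    """Tracks cumulative epsilon under basic (linear) composition."""
    def __init__(self, eps_total):
        self.eps_total = eps_total
        self.spent = 0.0
    def spend(self, eps_round):
        if self.spent + eps_round > self.eps_total:
            raise RuntimeError("privacy budget exhausted")
        self.spent += eps_round

budget = PrivacyBudget(eps_total=10.0)
g = np.array([3.0, 4.0])           # ||g|| = 5, so clipping rescales it to norm 1
noisy_g = clip_and_noise(g)
budget.spend(1.0)                  # one training round at eps = 1
print(budget.spent)                # -> 1.0
```

Real deployments replace the linear tally with a tighter accountant (e.g. Rényi DP accounting), but the enforcement pattern, refusing further rounds once ε is exhausted, is the same.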


Steps 3–8: Capstone — Cross-Organisation Threat Sharing Network

📸 Verified Output:


Summary

Technique               Protects                 Cost
FedAvg                  Raw data stays local     None (communication overhead only)
Differential Privacy    Individual samples       ~5–15% accuracy loss
Secure Aggregation      Individual gradients     Computation overhead
Gradient clipping       Gradient leakage         Slight accuracy loss
