Lab 08: Transfer Learning

Objective

Understand and apply transfer learning — one of the most powerful techniques in practical deep learning. Use features from a pretrained model to achieve high accuracy on a small custom dataset, a task that would otherwise require millions of images and training from scratch.

Time: 45 minutes | Level: Practitioner | Docker Image: zchencow/innozverse-ai:latest


Background

Training a CNN like ResNet-50 from scratch requires:

  • ~1.2 million images (ImageNet)

  • ~1 week on 8 GPUs

  • ~$50,000 in cloud compute

Transfer learning: take a pretrained model, freeze its layers, replace the classification head, and train only the head on your data. You get 90%+ of the performance with 1% of the data and compute.

Pretrained ResNet-50:
  [Conv Block 1] → [Conv Block 2] → ... → [Conv Block 49] → [FC: 1000 classes]
                    frozen (don't train)                      ↑ replace with your head

Your custom model:
  [frozen Conv Block 1..49] → [New FC: your N classes]
                               ↑ only this gets trained

Step 1: Environment Setup
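
Before running the steps below, a quick sanity check that the container provides the scientific Python stack used in this lab (a minimal sketch; the exact package versions shipped in the image are not specified here):

```python
# Verify the libraries used in the rest of this lab are importable.
import numpy
import sklearn

print("numpy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)
```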

📸 Verified Output:


Step 2: Simulating Pretrained Feature Extraction

We simulate CNN feature vectors as if extracted by ResNet-50 from real images:
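
One way to build such a simulation (a sketch under stated assumptions: 5 classes, 30 samples per class, and 2048-dimensional features matching ResNet-50's penultimate layer; the class-prototype construction is this sketch's own, not necessarily the lab's exact generator):

```python
import numpy as np

rng = np.random.default_rng(42)
n_classes, n_per_class, dim = 5, 30, 2048

# Each class gets a random "prototype" direction in feature space;
# individual samples scatter around their class prototype.
prototypes = rng.normal(size=(n_classes, dim))
X = np.vstack([
    prototypes[c] + 0.8 * rng.normal(size=(n_per_class, dim))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Clip at zero to mimic post-ReLU, globally pooled CNN activations
X = np.maximum(X, 0)
print(X.shape, y.shape)  # (150, 2048) (150,)
```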

📸 Verified Output:


Step 3: Linear Probe (Fastest Transfer Learning)
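
A linear probe is simply a logistic-regression layer trained on frozen features. A self-contained sketch (it regenerates synthetic features like those in Step 2; the split sizes are assumptions, and accuracy on this clean synthetic data will not match the lab's exact figure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic "pretrained" features: 5 classes, 30 samples/class, 2048-d
rng = np.random.default_rng(42)
n_classes, n_per_class, dim = 5, 30, 2048
prototypes = rng.normal(size=(n_classes, dim))
X = np.maximum(np.vstack([
    prototypes[c] + 0.8 * rng.normal(size=(n_per_class, dim))
    for c in range(n_classes)
]), 0)
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# The "probe": one linear layer on top of frozen features
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"linear-probe test accuracy: {probe.score(X_te, y_te):.2%}")
```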

📸 Verified Output:

💡 93% accuracy with only 30 samples per class! Without transfer learning, a dataset this small could not train a deep network from scratch.


Step 4: Comparing Classifiers on Top of Pretrained Features
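
The comparison can be sketched with cross-validation over a few standard classifiers (the classifier choices and hyperparameters here are assumptions; on clean synthetic features the ranking may differ from the lab's real run, where SVM-RBF came out ahead):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic "pretrained" features, as in Step 2
rng = np.random.default_rng(3)
n_classes, n_per_class, dim = 5, 30, 2048
prototypes = rng.normal(size=(n_classes, dim))
X = np.maximum(np.vstack([
    prototypes[c] + rng.normal(size=(n_per_class, dim))
    for c in range(n_classes)
]), 0)
y = np.repeat(np.arange(n_classes), n_per_class)

classifiers = {
    "LogReg": LogisticRegression(max_iter=1000),
    "SVM (RBF)": SVC(kernel="rbf"),
    "kNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
}
results = {}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    results[name] = scores.mean()
    print(f"{name:<12} {scores.mean():.2%} (+/- {scores.std():.2%})")
```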

📸 Verified Output:

💡 SVM with RBF kernel performs best — SVMs are excellent for high-dimensional feature spaces like CNN embeddings. LogReg is a close second and much faster at inference.


Step 5: The Effect of Dataset Size
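
The experiment can be sketched by training the same linear head on pretrained-style features (class clusters) versus random features (no class structure), across several training-set sizes. The sample counts and feature generator below are this sketch's assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_classes, dim = 5, 2048
prototypes = rng.normal(size=(n_classes, dim))

def sample(n_per_class, structured):
    # structured=True mimics pretrained features (samples cluster by class);
    # structured=False mimics a random, untrained network (no class signal).
    base = prototypes if structured else np.zeros((n_classes, dim))
    X = np.vstack([base[c] + rng.normal(size=(n_per_class, dim))
                   for c in range(n_classes)])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return np.maximum(X, 0), y

X_test_p, y_test = sample(50, True)   # fixed test sets, 50/class
X_test_r, _ = sample(50, False)

rows = []
for n in [5, 10, 30, 100]:
    Xp, yp = sample(n, True)
    Xr, yr = sample(n, False)
    acc_p = LogisticRegression(max_iter=1000).fit(Xp, yp).score(X_test_p, y_test)
    acc_r = LogisticRegression(max_iter=1000).fit(Xr, yr).score(X_test_r, y_test)
    rows.append((n, acc_p, acc_r))
    print(f"{n:>4}/class   pretrained-like: {acc_p:.0%}   random: {acc_r:.0%}")
```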

📸 Verified Output:

💡 With only 5 samples per class, transfer learning achieves 84%, while random features achieve 21% (barely above chance for 5 classes). The gap is massive and persists even with 500 samples per class.


Step 6: Fine-Tuning vs Feature Extraction

Two transfer learning strategies:

  • Feature extraction: keep the backbone frozen and train only the new head (fast, works with very little data)

  • Fine-tuning: unfreeze some (or all) backbone layers and train them alongside the head, usually with a smaller learning rate (needs more data, but adapts the features to your domain)

📸 Verified Output:


Step 7: Domain Adaptation

What if your images are very different from ImageNet (e.g., medical X-rays, satellite imagery, security screenshots)?
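
Why domain shift hurts a frozen backbone can be simulated directly: scale down how much class structure survives in the pretrained features and watch the linear probe degrade. The `signal` knob and generator below are this sketch's own construction, not the lab's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 5, 30, 2048
prototypes = rng.normal(size=(n_classes, dim))

def make_features(signal):
    # 'signal' scales how much class-discriminative structure survives
    # in the frozen features; a large domain shift means little survives.
    X = np.vstack([signal * prototypes[c] + rng.normal(size=(n_per_class, dim))
                   for c in range(n_classes)])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return np.maximum(X, 0), y

acc = {}
for domain, signal in [("similar domain", 1.0), ("large domain shift", 0.02)]:
    X, y = make_features(signal)
    acc[domain] = cross_val_score(
        LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    print(f"{domain:<18} linear-probe accuracy: {acc[domain]:.0%}")
```

When frozen features degrade like this, the remedy is to unfreeze deeper layers so the backbone itself can adapt to the new domain.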

📸 Verified Output:

💡 When domain shift is large (security screenshots are very different from everyday photos), you need to unfreeze deeper layers and fine-tune them — not just train the classification head.


Step 8: Real-World Capstone — Security Log Screenshot Triage

📸 Verified Output:

💡 98.75% accuracy identifying security tools from screenshots — using only 40 training examples per class. Transfer learning made this possible. 92% of predictions can be auto-triaged, saving significant analyst time.


Summary

Strategy                  | Data Needed  | Training Time    | When to Use
--------------------------|--------------|------------------|----------------------------------------
Linear probe              | ≥10/class    | Seconds          | Same/similar domain
Feature extraction + SVM  | ≥20/class    | Seconds–minutes  | Best for small datasets
Fine-tune last layers     | ≥100/class   | Minutes–hours    | Different domain
Full fine-tune            | ≥1000/class  | Hours–days       | Very different domain
Train from scratch        | ≥100K/class  | Days–weeks       | Unique modality (e.g., network packets)

Key Takeaways:

  • Pretrained CNN features are incredibly powerful, even for non-natural-image domains

  • SVM + RBF kernel is often the best classifier for high-dimensional CNN features

  • Low-confidence predictions should trigger human review, not be blindly trusted

  • Domain shift determines how many layers to unfreeze

Further Reading
