Lab 05: Model Evaluation & Metrics
Objective
Background
Dataset: 950 benign, 50 attacks (imbalanced)
Naive model (always predict "benign"):
Accuracy = 950/1000 = 95% ← looks great!
Recall = 0/50 = 0% ← detects ZERO attacks
Useless.
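To see these numbers concretely, here is a minimal sketch (illustrative only, built on hand-made labels rather than the lab dataset, with 1 assumed to mean "attack") that scores the all-benign predictor:

import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# 950 benign (0) and 50 attack (1) labels, as in the example above
y_true = np.array([0] * 950 + [1] * 50)
# Naive model: predict "benign" for everything
y_pred = np.zeros_like(y_true)

print("accuracy:", accuracy_score(y_true, y_pred))  # 0.95, looks great
print("recall  :", recall_score(y_true, y_pred))    # 0.0, detects zero attacks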
Step 1: Environment Setup

Start the lab container:

docker run -it --rm zchencow/innozverse-ai:latest bash

Inside the container, launch Python and import the evaluation tooling:

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, average_precision_score,
                             confusion_matrix, classification_report)
from sklearn.model_selection import (cross_val_score, StratifiedKFold,
                                     learning_curve, validation_curve)
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import numpy as np
import warnings; warnings.filterwarnings('ignore')
print("Ready")

Step 2: The Confusion Matrix — Foundation of All Metrics
Step 3: ROC Curve and AUC
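A possible outline of this step, again on an assumed synthetic 95/5 dataset: ROC analysis needs scores or probabilities rather than hard labels, and on imbalanced data it is worth reporting PR-AUC (average precision) alongside ROC-AUC:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score, average_precision_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
clf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

# ROC is computed from scores, not from predict()'s hard 0/1 output
scores = clf.predict_proba(X_te)[:, 1]
fpr, tpr, thr = roc_curve(y_te, scores)   # fpr/tpr trace the curve; plot or inspect them
print("ROC curve computed over", len(thr), "candidate thresholds")
print("ROC-AUC:", roc_auc_score(y_te, scores))
# PR-AUC is more sensitive to performance on the rare positive class
print("PR-AUC :", average_precision_score(y_te, scores))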
Step 4: Stratified K-Fold Cross-Validation
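A minimal sketch of stratified cross-validation on the same kind of synthetic imbalanced data; the 5-fold split and F1 scoring are illustrative choices, not requirements of the lab:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, StratifiedKFold

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=42)

# StratifiedKFold keeps the 95/5 class ratio in every fold,
# so no fold ends up with zero attack samples
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(RandomForestClassifier(random_state=42), X, y,
                         cv=cv, scoring="f1")
print("F1 per fold:", scores.round(3))
print("mean / std :", scores.mean().round(3), scores.std().round(3))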
Step 5: Bias-Variance Tradeoff — Learning Curves
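The sketch below (synthetic data, illustrative train-size grid) prints training versus validation F1 at increasing training-set sizes; a large, persistent gap suggests high variance (overfitting), while two low, converged scores suggest high bias (underfitting):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve, StratifiedKFold

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=42)

sizes, train_scores, val_scores = learning_curve(
    RandomForestClassifier(random_state=42), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    scoring="f1", shuffle=True, random_state=42)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # Compare the two columns at each size to read off bias vs. variance
    print(f"n={n:4d}  train F1={tr:.3f}  val F1={va:.3f}")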
Step 6: Classification Threshold Tuning
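As an illustration (synthetic data, arbitrary threshold grid), sweeping the probability cutoff instead of relying on predict()'s fixed 0.5 shows the precision/recall trade directly:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=2000, n_features=20, weights=[0.95, 0.05],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
clf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# Lower thresholds catch more attacks (higher recall) at the cost of precision
for t in (0.1, 0.3, 0.5, 0.7):
    y_pred = (scores >= t).astype(int)
    p = precision_score(y_te, y_pred, zero_division=0)
    r = recall_score(y_te, y_pred, zero_division=0)
    print(f"threshold={t:.1f}  precision={p:.3f}  recall={r:.3f}")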
Step 7: Multiclass Evaluation
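A sketch with a hypothetical three-class labeling (the benign / scan / exploit naming is assumed, and the data is synthetic): classification_report gives per-class scores, and macro versus weighted averaging treats the rare classes very differently:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, f1_score

# Hypothetical 3-class problem, e.g. benign / scan / exploit
X, y = make_classification(n_samples=3000, n_features=20, n_informative=6,
                           n_classes=3, weights=[0.85, 0.10, 0.05],
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
y_pred = RandomForestClassifier(random_state=42).fit(X_tr, y_tr).predict(X_te)

# Per-class precision/recall/F1, plus macro (unweighted) and weighted averages
print(classification_report(y_te, y_pred, digits=3))
print("macro F1   :", f1_score(y_te, y_pred, average="macro"))
print("weighted F1:", f1_score(y_te, y_pred, average="weighted"))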
Step 8: Real-World Capstone — Security Alert Triage System
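One way the capstone pieces could fit together, sketched on synthetic alerts with an assumed 3% incident rate and an assumed 90% recall target; the real lab data, features, and operating targets may differ:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve, classification_report

# Synthetic "alert stream": about 3% of alerts are true incidents
X, y = make_classification(n_samples=10000, n_features=30, n_informative=10,
                           weights=[0.97, 0.03], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# Pick the highest threshold that still catches at least 90% of true incidents
precision, recall, thresholds = precision_recall_curve(y_te, scores)
ok = recall[:-1] >= 0.90          # precision/recall have one extra trailing point
threshold = thresholds[ok][-1]    # thresholds are sorted ascending

y_pred = (scores >= threshold).astype(int)
print(f"operating threshold: {threshold:.3f}")
print(f"alerts escalated   : {int(y_pred.sum())} of {len(y_pred)}")
print(classification_report(y_te, y_pred, digits=3))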
Summary
Metric | Best For | Avoid When
Accuracy | Roughly balanced classes | Imbalanced data (a do-nothing model can score 95%)
Precision | False positives are costly (analyst time) | Missed positives are the bigger risk
Recall | Missing a positive (an attack) is costly | Flooding analysts with false positives is unacceptable
F1 | Single-number balance of precision and recall | You need to tune the precision/recall trade explicitly
ROC-AUC | Threshold-free ranking quality | Heavily imbalanced data (can look optimistic)
PR-AUC (average precision) | Imbalanced data, focus on the positive class | Balanced problems where ROC-AUC is easier to interpret
Further Reading
