Lab 05: Adversarial ML & Model Robustness
Objective
Background
Normal ML pipeline: train on clean data → deploy → assume inputs are benign
Adversarial reality:
- Attacker adds imperceptible noise to input → model misclassifies
- Spam filter evasion: slightly alter email to bypass detector
- Malware evasion: add benign-looking bytes → bypass ML AV
- Intrusion detection bypass: craft network traffic to evade classifier

Step 1: Setup and Victim Model
```bash
docker run -it --rm zchencow/innozverse-ai:latest bash
```

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import warnings; warnings.filterwarnings('ignore')

np.random.seed(42)

# Malware classification dataset (features from PE file analysis)
X, y = make_classification(n_samples=5000, n_features=20, n_informative=12,
                           weights=[0.7, 0.3], random_state=42)
feature_names = [
    'pe_size', 'section_count', 'import_count', 'export_count', 'entropy',
    'has_tls', 'has_resources', 'debug_size', 'reloc_size', 'timestamp_delta',
    'string_count', 'url_count', 'ip_count', 'suspicious_api', 'packed',
    'crypto_api', 'network_api', 'process_api', 'registry_api', 'file_api',
]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_tr_s = scaler.fit_transform(X_tr)
X_te_s = scaler.transform(X_te)

model = GradientBoostingClassifier(n_estimators=200, max_depth=4, random_state=42)
model.fit(X_tr_s, y_tr)
clean_acc = accuracy_score(y_te, model.predict(X_te_s))
print(f"Victim model (malware classifier): accuracy={clean_acc:.4f}")
print(f"Features: {len(feature_names)} PE-file features")
```

Step 2: FGSM — Fast Gradient Sign Method
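The body of this step did not survive the page capture, so here is a minimal FGSM-style sketch, not necessarily the lab's exact code. Gradient-boosted trees expose no analytic gradients, so the `est_grad` helper (my name, an assumption) approximates them with central finite differences; a large step `h=0.5` is used because tree ensembles are piecewise constant. A small victim model is rebuilt locally so the snippet runs stand-alone.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

np.random.seed(42)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=12,
                           random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X, y)

def est_grad(model, x, h=0.5):
    # Central finite differences of P(class 1) w.r.t. each feature.
    g = np.zeros_like(x)
    for i in range(x.shape[0]):
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        g[i] = (model.predict_proba(xp.reshape(1, -1))[0, 1]
                - model.predict_proba(xm.reshape(1, -1))[0, 1]) / (2 * h)
    return g

def fgsm(model, x, y_true, eps=0.5):
    # One signed-gradient step of size eps, pushing the sample
    # away from its true class.
    g = est_grad(model, x)
    direction = g if y_true == 0 else -g
    return x + eps * np.sign(direction)

# Attack a correctly classified "malware" sample
idx = int(np.where((y == 1) & (model.predict(X) == 1))[0][0])
x_adv = fgsm(model, X[idx], y_true=1, eps=0.5)
print("clean pred:", model.predict(X[idx].reshape(1, -1))[0],
      "adv pred:", model.predict(x_adv.reshape(1, -1))[0])
```

Note the L∞ budget: each feature moves by at most `eps`, mirroring the "imperceptible noise" constraint from the Background section.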
Step 3: PGD — Projected Gradient Descent (Stronger Attack)
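This step's body is also missing; the sketch below shows the standard PGD idea under the same finite-difference assumption as the FGSM sketch: take several small signed-gradient steps and project back into the `eps` L∞-ball after each one. Helper names (`est_grad`, `pgd`) are mine, and the model is rebuilt locally.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

np.random.seed(42)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=12,
                           random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X, y)

def est_grad(model, x, h=0.5):
    # Finite-difference gradient of P(class 1), as in the FGSM sketch.
    g = np.zeros_like(x)
    for i in range(x.shape[0]):
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        g[i] = (model.predict_proba(xp.reshape(1, -1))[0, 1]
                - model.predict_proba(xm.reshape(1, -1))[0, 1]) / (2 * h)
    return g

def pgd(model, x, y_true, eps=1.0, alpha=0.25, steps=8):
    x_adv = x.copy()
    for _ in range(steps):
        g = est_grad(model, x_adv)
        direction = g if y_true == 0 else -g
        x_adv = x_adv + alpha * np.sign(direction)
        # Projection: stay inside the eps L-inf ball around the original x
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

idx = int(np.where((y == 1) & (model.predict(X) == 1))[0][0])
x_adv = pgd(model, X[idx], y_true=1)
print("clean pred:", model.predict(X[idx].reshape(1, -1))[0],
      "adv pred:", model.predict(x_adv.reshape(1, -1))[0])
```

PGD is "stronger" than FGSM because the iterated steps can follow a curved decision boundary that a single signed step overshoots.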
Step 4: Black-Box Query Attack
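With no step body preserved, here is one simple black-box (score-based) approach: random search that only calls `predict_proba`, never touching model internals. The attacker keeps any candidate perturbation that lowers the reported probability of the true class. The function name `query_attack` is my own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

np.random.seed(42)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=12,
                           random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X, y)

def query_attack(model, x, y_true, eps=1.0, n_queries=200, seed=0):
    # Score-based random search: each query proposes a fresh uniform
    # perturbation within the eps L-inf ball and keeps the best one.
    rng = np.random.default_rng(seed)
    best = x.copy()
    best_p = model.predict_proba(best.reshape(1, -1))[0, y_true]
    for _ in range(n_queries):
        cand = x + rng.uniform(-eps, eps, size=x.shape)
        p = model.predict_proba(cand.reshape(1, -1))[0, y_true]
        if p < best_p:
            best, best_p = cand, p
    return best, best_p

idx = int(np.where((y == 1) & (model.predict(X) == 1))[0][0])
p0 = model.predict_proba(X[idx].reshape(1, -1))[0, 1]
x_adv, p_adv = query_attack(model, X[idx], y_true=1)
print(f"P(malware): before={p0:.3f}  after {200} queries={p_adv:.3f}")
```

This is the weakest black-box strategy; real attacks (e.g. boundary or surrogate-model attacks) need far fewer queries, but the query-only threat model is the same.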
Step 5: Data Poisoning Attack
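The poisoning step's body is missing too. A minimal illustration of training-time poisoning is label flipping: corrupt a fraction of the training labels, retrain, and compare clean-test accuracy against the honestly trained model. All names below are my own; the 20% flip rate is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, n_informative=12,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

clean = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X_tr, y_tr)
clean_acc = accuracy_score(y_te, clean.predict(X_te))

# Poison: flip the labels of 20% of the training set
rng = np.random.default_rng(0)
n_flip = int(0.2 * len(y_tr))
flip_idx = rng.choice(len(y_tr), size=n_flip, replace=False)
y_pois = y_tr.copy()
y_pois[flip_idx] = 1 - y_pois[flip_idx]

pois = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X_tr, y_pois)
pois_acc = accuracy_score(y_te, pois.predict(X_te))
print(f"clean-trained acc={clean_acc:.3f}  poison-trained acc={pois_acc:.3f}")
```

Unlike the evasion attacks above, poisoning happens before deployment, which is why data-provenance controls matter as much as runtime defences.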
Step 6: Adversarial Training (Defence)
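For the missing defence step, the sketch below does one round of adversarial training: craft FGSM-style examples (batched finite-difference gradients, same assumption as earlier sketches) for a subset of the training data, append them with their true labels, and retrain. Caveat: evaluating the retrained model on the very examples it trained on is optimistic; a proper evaluation would craft fresh attacks against the robust model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1500, n_features=20, n_informative=12,
                           random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X, y)

def est_grad_batch(model, X, h=0.5):
    # Finite-difference gradient of P(class 1), one batched call per feature.
    G = np.zeros_like(X)
    for i in range(X.shape[1]):
        Xp, Xm = X.copy(), X.copy()
        Xp[:, i] += h
        Xm[:, i] -= h
        G[:, i] = (model.predict_proba(Xp)[:, 1]
                   - model.predict_proba(Xm)[:, 1]) / (2 * h)
    return G

def fgsm_batch(model, X, y, eps=0.5):
    G = est_grad_batch(model, X)
    direction = np.where(y[:, None] == 0, G, -G)
    return X + eps * np.sign(direction)

# Augment training data with adversarial copies of a 300-sample subset
X_sub, y_sub = X[:300], y[:300]
X_adv = fgsm_batch(model, X_sub, y_sub)
robust = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(
    np.vstack([X, X_adv]), np.concatenate([y, y_sub]))

acc_orig = accuracy_score(y_sub, model.predict(X_adv))
acc_rob = accuracy_score(y_sub, robust.predict(X_adv))
print(f"accuracy on adversarial inputs: original={acc_orig:.3f}  robust={acc_rob:.3f}")
```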
Step 7: Input Preprocessing Defence
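This step's content is missing as well; a common preprocessing defence it likely covers is feature squeezing: quantise every input feature onto a coarse grid before prediction, collapsing small adversarial perturbations back toward the original point. The grid step of 0.5 below is an arbitrary illustration, and `predict_squeezed` is my own wrapper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=12,
                           random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X, y)

def squeeze(X, step=0.5):
    # Round every feature to the nearest multiple of `step`.
    return np.round(np.asarray(X) / step) * step

def predict_squeezed(model, X, step=0.5):
    # Defence wrapper: quantise inputs before handing them to the model.
    return model.predict(squeeze(X, step))

x = X[:5]
noisy = x + np.random.default_rng(0).uniform(-0.2, 0.2, size=x.shape)
print("squeezed preds:", predict_squeezed(model, noisy))
```

The trade-off: a coarser grid absorbs larger perturbations but also discards legitimate signal, so the step size must be tuned against clean accuracy.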
Step 8: Capstone — ML Security Audit Report
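The capstone body is missing; as a stand-in skeleton, the sketch below measures a few robustness numbers (clean accuracy plus accuracy under uniform input noise at several budgets) and formats them into a plain-text report. In the actual lab you would presumably substitute the attack results from Steps 2–7; everything here is computed locally, nothing is hard-coded.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=20, n_informative=12,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model = GradientBoostingClassifier(n_estimators=50, random_state=42).fit(X_tr, y_tr)

# Collect findings: clean accuracy and accuracy under growing noise budgets
rng = np.random.default_rng(0)
findings = {"clean": accuracy_score(y_te, model.predict(X_te))}
for eps in (0.25, 0.5, 1.0):
    X_noisy = X_te + rng.uniform(-eps, eps, size=X_te.shape)
    findings[f"uniform noise eps={eps}"] = accuracy_score(y_te, model.predict(X_noisy))

report_lines = ["ML Security Audit Report", "=" * 24]
for name, acc in findings.items():
    report_lines.append(f"{name:<24} accuracy={acc:.3f}")
report = "\n".join(report_lines)
print(report)
```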
Summary

| Attack | Type | Threat Level | Defence |
|--------|------|--------------|---------|
Further Reading