Lab 09: AI Security Red Team
Overview
Architecture
┌─────────────────────────────────────────────────────────────┐
│ AI Security Threat Landscape │
├────────────────────────────────┬────────────────────────────┤
│ ATTACK SURFACE │ DEFENSES │
│ ───────────────────── │ ───────────────── │
│ Prompt Injection │ Input validation │
│ Jailbreaking │ Output filtering │
│ Model Extraction │ Rate limiting │
│ Membership Inference │ Differential privacy │
│ Data Poisoning │ Watermarking │
│ Adversarial Examples │ Adversarial training │
│ Model Inversion │ Homomorphic encryption │
└────────────────────────────────┴────────────────────────────┘Step 1: Prompt Injection Attacks
Step 2: Jailbreaking Techniques
Technique
Method
Example
Step 3: Model Extraction
Step 4: Membership Inference Attacks
Step 5: Data Poisoning
Step 6: MITRE ATLAS Framework
Tactic
ID
Examples
Step 7: Defense Architecture
Step 8: Capstone — AI Red Team Simulation
Summary
Concept
Key Points
Last updated
