Lab 18: AI in Cybersecurity — Threat Detection and AI Attacks

Objective

Understand the dual-use nature of AI in security — as both a defensive tool and an attack vector. By the end you will be able to:

  • Describe how AI improves threat detection and incident response

  • Explain common AI attack techniques: adversarial examples, data poisoning, model extraction

  • Understand prompt injection and jailbreaking

  • Apply AI safely when building security tooling


AI for Defence

Anomaly Detection at Scale

Security events generate vast quantities of data that humans cannot manually analyse. AI enables patterns to emerge from noise:

from sklearn.ensemble import IsolationForest
import pandas as pd

# Load network flow data
flows = pd.read_csv("network_flows.csv")
features = flows[["bytes_in", "bytes_out", "packets", "duration_s", "unique_dests", "port_entropy"]]

# Isolation Forest: anomalies are statistically "isolated" (short paths in trees)
detector = IsolationForest(
    contamination=0.001,   # expect 0.1% of traffic to be anomalous
    n_estimators=200,
    random_state=42
)
detector.fit(features)

# decision_function: continuous score (lower = more anomalous)
flows["anomaly_score"] = detector.decision_function(features)
# predict: -1 = anomaly, 1 = normal
flows["is_anomaly"] = detector.predict(features) == -1

# Investigate flagged flows
anomalies = flows[flows["is_anomaly"]].sort_values("anomaly_score")
print(f"Flagged {len(anomalies)} anomalous flows out of {len(flows)}")
print(anomalies[["src_ip", "dst_ip", "bytes_out", "anomaly_score"]].head(10))

Real deployments:

  • Darktrace — unsupervised AI learns "normal" behaviour for every device; flags deviations (C2 beaconing, lateral movement, data exfiltration)

  • Vectra AI — detects attacker behaviours (not just malware signatures) using ML on network metadata

  • Google Chronicle — AI-powered SIEM correlating petabyte-scale logs

AI-Powered Vulnerability Discovery

Phishing Detection


AI Attack Techniques

1. Adversarial Examples

Adversarial examples are small, often imperceptible perturbations to input data that cause a model to make the wrong prediction, often with high confidence.

Real-world implications:

  • Stop signs with stickers → autonomous vehicles see "speed limit 45"

  • Adversarial patches on clothing → CCTV face recognition fails

  • Adversarial audio → voice assistant executes hidden commands
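The perturbation idea can be sketched with the Fast Gradient Sign Method (FGSM) on a toy linear classifier. Everything here is illustrative (the model, weights, and epsilon are not from any real system); for a linear score w·x + b the input gradient is simply w, so the attack nudges every feature by epsilon in the direction sign(w). On a deep network you would compute the input gradient with autograd instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear classifier: w.x + b > 0 -> class 1 ("malicious"), else 0 ("benign")
w = rng.normal(size=20)
b = 0.0

def predict(x):
    return int(w @ x + b > 0)

# Start from an input the model labels benign (score = -0.2 * ||w||)
x = -0.2 * w / np.linalg.norm(w)
print("original prediction:", predict(x))        # 0

# FGSM step: move each feature by epsilon in the direction of the gradient's sign.
# Since ||w||_1 >= ||w||_2, epsilon = 0.25 is enough to cross this margin of 0.2.
epsilon = 0.25
x_adv = x + epsilon * np.sign(w)
print("adversarial prediction:", predict(x_adv))  # 1
print("max per-feature change:", np.max(np.abs(x_adv - x)))
```

Note that no single feature moved by more than 0.25, yet the predicted class flipped; this is the same mechanism behind the stop-sign and CCTV examples above, just in pixel space.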

2. Data Poisoning

Attackers inject malicious training examples to compromise model behaviour:

3. Model Extraction / Stealing


Prompt Injection and Jailbreaking

Prompt Injection

Malicious input in data sources (websites, PDFs, emails) that hijacks AI agent behaviour:

Jailbreaking

Techniques to bypass LLM safety filters:

Measuring Jailbreak Resistance


AI Security Best Practices for Developers


Further Reading

Last updated