Lab 11: Sentiment Analysis Pipeline

Objective

Build a production-grade sentiment analysis pipeline end-to-end: data cleaning, feature engineering, model training, evaluation, threshold tuning, and a REST API for serving predictions. The pipeline is applied to security community text — CVE discussions, threat actor chatter, and vulnerability disclosures.

Time: 50 minutes | Level: Practitioner | Docker Image: zchencow/innozverse-ai:latest


Background

Sentiment analysis in a security context is broader than positive/negative polarity:

Standard NLP:   "I love this product" → POSITIVE
Security NLP:   "This CVE is critical and actively exploited" → URGENT/HIGH_RISK
                "Patch available, low impact" → RESOLVED/LOW_RISK
                "Researchers found a pre-auth RCE" → CRITICAL

Security teams use sentiment/urgency analysis to:

  • Triage Twitter/Reddit/dark web chatter about vulnerabilities

  • Prioritise patch deployment from vendor advisories

  • Monitor threat actor forums for early warning signals
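
Before training any model, a simple lexicon captures the intuition above. The sketch below is illustrative only — the terms and weights are invented for this example, not a real security lexicon:

```python
# Minimal keyword-lexicon urgency scorer (term weights are invented for illustration)
URGENCY_LEXICON = {
    "actively exploited": 3, "pre-auth": 3, "rce": 3,
    "critical": 2, "exploit": 2,
    "patch available": -2, "low impact": -2,
}

def urgency_score(text: str) -> int:
    """Sum the weights of every lexicon term found in the text."""
    text = text.lower()
    return sum(w for term, w in URGENCY_LEXICON.items() if term in text)

print(urgency_score("Researchers found a pre-auth RCE"))  # → 6
print(urgency_score("Patch available, low impact"))       # → -4
```

Lexicons are brittle (substring matches, no context), which is why the rest of the lab moves to a trained classifier — but they make useful baseline features.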


Step 1: Environment Setup
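
A quick sanity check confirms the lab's dependencies are importable inside the Docker image. The package list below is an assumption based on what later steps use; the `zchencow/innozverse-ai` image may pin different versions:

```python
import importlib.util

# Packages the later steps rely on (assumed; adjust to the image's actual contents)
REQUIRED = ["sklearn", "numpy", "pandas", "fastapi", "uvicorn"]

missing = [pkg for pkg in REQUIRED if importlib.util.find_spec(pkg) is None]
print("Environment OK" if not missing else f"Missing packages: {missing}")
```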

📸 Verified Output:


Step 2: Text Preprocessing

📸 Verified Output:


Step 3: Build the Training Dataset
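
A tiny stand-in dataset illustrates the shape of the training data. The severity labels (CRITICAL/HIGH/MEDIUM/LOW) are an assumed schema for this sketch; the real lab dataset would contain hundreds of labelled advisories:

```python
# (text, severity label) pairs — illustrative examples only
DATASET = [
    ("Pre-auth RCE actively exploited in the wild", "CRITICAL"),
    ("Remote code execution, no patch yet", "CRITICAL"),
    ("Privilege escalation requires local access", "HIGH"),
    ("Authentication bypass under specific configs", "HIGH"),
    ("Information disclosure via verbose errors", "MEDIUM"),
    ("Timing side-channel leaks key bits slowly", "MEDIUM"),
    ("Patch available, low impact DoS", "LOW"),
    ("Minor issue fixed in latest release", "LOW"),
]

texts, labels = zip(*DATASET)
print(len(texts), "examples,", len(set(labels)), "classes")  # → 8 examples, 4 classes
```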

📸 Verified Output:


Step 4: Train the Pipeline
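
The TF-IDF + Logistic Regression pipeline from the Summary can be sketched as below. The toy dataset and hyperparameters (`ngram_range`, `class_weight`) are illustrative choices, not the lab's exact configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy stand-in for the lab's labelled dataset (labels are illustrative)
texts = [
    "Pre-auth RCE actively exploited in the wild",
    "Remote code execution, no patch yet",
    "Privilege escalation requires local access",
    "Authentication bypass under specific configs",
    "Information disclosure via verbose errors",
    "Timing side-channel leaks key bits slowly",
    "Patch available, low impact DoS",
    "Minor issue fixed in latest release",
]
labels = ["CRITICAL", "CRITICAL", "HIGH", "HIGH",
          "MEDIUM", "MEDIUM", "LOW", "LOW"]

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced")),
])
pipe.fit(texts, labels)
print(pipe.predict(["unauthenticated remote code execution exploited in the wild"])[0])
```

Bigrams matter here: "remote code" and "actively exploited" carry more signal than their individual words.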

📸 Verified Output:


Step 5: Probability Calibration and Threshold Tuning
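
One way to implement the calibration-plus-routing idea is scikit-learn's `CalibratedClassifierCV` wrapped in the pipeline, with a routing function that applies the 70% threshold from the tip below. The two-class toy data, label names, and `route` helper are assumptions for this sketch:

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Compact two-class toy set so the calibration CV has samples per class per fold
texts = [
    "Pre-auth RCE actively exploited", "Remote code execution, no patch",
    "Worm exploited across the internet", "Critical bug exploited in the wild",
    "Patch available, low impact", "Minor DoS fixed upstream",
    "Cosmetic issue in the UI", "Low severity, local access only",
]
labels = ["HIGH_RISK"] * 4 + ["LOW_RISK"] * 4

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    # Calibrate raw logistic-regression scores into usable probabilities
    ("clf", CalibratedClassifierCV(LogisticRegression(max_iter=1000), cv=2)),
])
pipe.fit(texts, labels)

def route(text: str, threshold: float = 0.70):
    """Return (label, confidence); below-threshold texts go to human review."""
    proba = pipe.predict_proba([text])[0]
    i = int(np.argmax(proba))
    label, conf = pipe.classes_[i], float(proba[i])
    return (label if conf >= threshold else "HUMAN_REVIEW", round(conf, 2))

print(route("actively exploited pre-auth RCE"))
```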

📸 Verified Output:

💡 Texts with confidence < 70% should be flagged for human review. A 65% MEDIUM prediction means the model is quite uncertain — the text might warrant escalation.


Step 6: Serving Predictions via FastAPI

📸 Verified Output:


Step 7: Model Monitoring — Detecting Concept Drift
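
One simple drift signal, per the takeaway below, is a drop in mean prediction confidence over a sliding window. The class name, window size, and tolerance below are illustrative choices:

```python
from collections import deque

class DriftMonitor:
    """Flag drift when mean confidence over a window falls below a baseline band."""
    def __init__(self, baseline_mean: float, window: int = 500, tolerance: float = 0.10):
        self.baseline = baseline_mean
        self.tolerance = tolerance
        self.recent = deque(maxlen=window)

    def observe(self, confidence: float) -> bool:
        """Record one prediction's confidence; return True if drift is suspected."""
        self.recent.append(confidence)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough data yet
        mean = sum(self.recent) / len(self.recent)
        return mean < self.baseline - self.tolerance

monitor = DriftMonitor(baseline_mean=0.85, window=5)
flags = [monitor.observe(c) for c in [0.9, 0.88, 0.6, 0.55, 0.5]]
print(flags)  # → [False, False, False, False, True]
```

A confidence drop is a proxy, not proof: it can also mean a batch of genuinely ambiguous texts, so drift alerts should trigger inspection, not automatic retraining.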

📸 Verified Output:


Step 8: Real-World Capstone — Security Advisory Triage System
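
The capstone wires the classifier and threshold together into a triage loop: high-confidence advisories are auto-assigned, the rest queue for an analyst. The `classify` stub below (with invented labels and confidences) stands in for the calibrated pipeline from Steps 4–5:

```python
# Stub classifier standing in for the trained, calibrated pipeline
def classify(text: str):
    t = text.lower()
    if "exploited" in t or "rce" in t:
        return "CRITICAL", 0.95
    if "side-channel" in t:
        return "MEDIUM", 0.62   # ambiguous case — lands below the threshold
    return "LOW", 0.85

advisories = [
    "Pre-auth RCE exploited in the wild",
    "Patch available, low impact",
    "Timing side-channel in TLS handshake",
    "Minor DoS fixed upstream",
    "Worm actively exploited across the internet",
]

auto, review = [], []
for adv in advisories:
    label, conf = classify(adv)
    (auto if conf >= 0.70 else review).append((adv, label, conf))

print(f"auto-triage rate: {len(auto)/len(advisories):.0%}")  # → auto-triage rate: 80%
for adv, label, conf in review:
    print("needs human review:", adv)
```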

📸 Verified Output:

💡 80% auto-triage rate means analysts focus their attention on genuinely ambiguous cases. Advisory 3 (timing side-channel) is correctly flagged for review — it has characteristics of both MEDIUM and LOW severity.


Summary

Pipeline components built:

  1. Text preprocessing (cleaning, normalisation, feature extraction)

  2. TF-IDF + Logistic Regression classifier

  3. Probability calibration and threshold tuning

  4. FastAPI serving layer

  5. Confidence-based human review routing

  6. Concept drift monitoring

Key Takeaways:

  • Security text has domain-specific vocabulary — build domain-specific lexicons

  • Confidence scores + thresholds control auto-triage vs human review

  • Monitor confidence distribution in production to detect concept drift

  • Always route low-confidence predictions to human review
