Lab 11: Web Application Reconnaissance
Objective
Perform systematic web application reconnaissance against a live vulnerable server from Kali Linux. You will:
Fingerprint the technology stack from HTTP headers using nmap and whatweb
Enumerate hidden directories and files with gobuster, finding admin panels, backups, and config files
Read robots.txt and sitemap.xml to discover intentionally hidden paths
Access sensitive files left exposed: .env, .git/config, backup.sql, phpinfo.php
Reach unauthenticated endpoints for user data, internal config, and admin panels
Build a complete recon report mapping the full attack surface
All phases run from the Kali attacker container against the victim Flask server. Nothing is simulated: every result is a real HTTP response.
Background
Reconnaissance is the first hands-on phase of a penetration test in methodologies such as PTES (the Penetration Testing Execution Standard) and the OWASP Testing Guide. Professional attackers commonly spend 60 to 80% of their engagement time on reconnaissance before attempting any exploitation.
Why recon matters:
Headers reveal framework and version, which map directly to known CVEs
robots.txt is a roadmap of what the developer tried to hide
.git/config exposes the private repository URL, which is often git clone-able
backup.sql from a web root has ended careers (and companies)
An unauthenticated /api/users endpoint is data breach #1 in any assessment
Real-world examples:
2021 Twitch breach (roughly 125 GB): source code with its full git commit history leaked after what Twitch described as a server configuration error
2019 GraphQL introspection: leaving introspection enabled on production APIs lets attackers map every query, mutation, and data type; the endpoint itself is found trivially with gobuster
WordPress version fingerprinting: a header such as X-Generator: WordPress/5.8 maps to 200+ known CVEs; attackers automate this with WPScan in under 60 seconds
Architecture
Time
45 minutes
Prerequisites
Docker installed and running
Tools
| Tool | Where | Purpose |
| --- | --- | --- |
| nmap | Kali | Port scan + HTTP header scripts |
| whatweb | Kali | Technology stack fingerprinting |
| gobuster | Kali | Directory and file enumeration |
| curl | Kali | Manual HTTP requests, header inspection |
| python3 | Kali | Parse JSON responses, build recon report |
Lab Instructions
Step 1: Environment Setup - Launch the Victim Server
📸 Verified Output:
Step 2: Launch the Kali Attacker Container
Set your target and confirm connectivity:
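A connectivity check of this kind can be sketched in Python with only the standard library. The host name `victim` and port `5000` in the usage comment are assumptions for illustration; substitute whatever address your victim container actually exposes.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage (hypothetical target; adjust to your lab network):
#   print(can_connect("victim", 5000))
```

A successful TCP handshake only shows that the port is open; the later steps confirm that an HTTP service is actually answering on it.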
Step 3: Service Fingerprinting - nmap + HTTP Scripts
📸 Verified Output:
💡 nmap's http-headers script pulls every response header in one scan. Notice the server is sending contradictory headers: it claims to be Apache, PHP, WordPress, AND ASP.NET simultaneously. In a real engagement, mismatched headers like this often indicate a reverse proxy in front of a different backend. The X-Debug-Info header is especially dangerous: it confirms the environment is production, leaks the version (2.3.1), and gives a build date, all of which are useful for CVE matching.
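Once the headers have been captured, pulling product and version pairs out of them is easy to automate. The sketch below is a minimal illustration, not the lab's own tooling: the list of disclosure headers and the sample values in the usage comment are assumptions chosen to resemble the output described above.

```python
import re

# Headers that commonly disclose technology or version details (illustrative list).
DISCLOSURE_HEADERS = (
    "server",
    "x-powered-by",
    "x-generator",
    "x-aspnet-version",
    "x-debug-info",
)

# Matches tokens such as "Apache/2.4.41" or "WordPress/5.8".
PRODUCT_VERSION = re.compile(r"([A-Za-z][\w.\-]*)/(\d+(?:\.\d+)*)")


def fingerprint_headers(headers):
    """Return a list of (header, product, version) tuples.

    version is None when the header names a product without a version.
    """
    findings = []
    for name, value in headers.items():
        if name.lower() not in DISCLOSURE_HEADERS:
            continue
        matches = PRODUCT_VERSION.findall(value)
        if matches:
            findings.extend((name, product, version) for product, version in matches)
        else:
            findings.append((name, value.strip(), None))
    return findings


# Example usage with made-up header values:
#   fingerprint_headers({"Server": "Apache/2.4.41 (Ubuntu)", "X-Generator": "WordPress/5.8"})
```

Each tuple with a version can then be looked up against a CVE database; several distinct products in one response is the "contradictory headers" signal worth recording in the report.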
Step 4: Technology Stack Fingerprinting - whatweb
📸 Verified Output:
Step 5: Directory Enumeration - gobuster
📸 Verified Output:
💡 gobuster's -x flag is critical for web recon: it appends each listed file extension to every wordlist entry. Developers leave backup files (backup.sql, config.php.bak), environment files (.env), and debug pages (phpinfo.php) in the web root during development and forget to remove them. Status: 200 means the file is readable by anyone. Status: 301/302 means it redirects somewhere, which is often worth following. Status: 403 means it exists but is blocked, which is worth noting for later bypass attempts.
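The extension expansion and the status-code triage described above can be sketched in a few lines of Python. This is an illustration of the idea rather than a reimplementation of gobuster; the function names and triage labels are hypothetical.

```python
def candidate_paths(words, extensions):
    """Yield each wordlist entry bare and with every extension appended,
    which is roughly the expansion gobuster's -x flag performs."""
    for word in words:
        yield f"/{word}"
        for ext in extensions:
            yield f"/{word}.{ext.lstrip('.')}"


def triage(status):
    """Map an HTTP status code to a recon note, following the tip above."""
    if status == 200:
        return "readable"            # content is served to anyone
    if status in (301, 302):
        return "redirect"            # follow it and see where it leads
    if status == 403:
        return "exists-but-blocked"  # note for later bypass attempts
    return "ignore"                  # 404 and everything else


# Example usage:
#   list(candidate_paths(["backup", "config"], ["sql", "bak"]))
```

Keeping the triage separate from the request loop makes it easy to re-run the classification on saved scan output.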
Step 6: Read robots.txt and sitemap.xml
📸 Verified Output:
💡 robots.txt is the attacker's cheat sheet. The Disallow entries are literally a list of "please don't look here", which means they are always the first things an attacker looks at. /.git/ being disallowed means a developer likely deployed the .git directory to the web root; if the repository files are served, an attacker can often reconstruct the entire source history from http://victim/.git/. Never use robots.txt as a security control: it only works on compliant crawlers (which attackers are not).
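Extracting the Disallow entries and sitemap URLs is simple to script once both files have been downloaded. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it deliberately ignores User-agent grouping because an attacker cares about every listed path.

```python
import xml.etree.ElementTree as ET


def disallowed_paths(robots_txt):
    """Return the unique Disallow paths from robots.txt text, in order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow means "allow everything", so skip it.
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


def sitemap_locations(sitemap_xml):
    """Return the text of every <loc> element in sitemap XML."""
    root = ET.fromstring(sitemap_xml)
    # Tags carry the sitemap namespace, e.g. "{http://...}loc".
    return [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]
```

Every path returned by `disallowed_paths` is a candidate for a direct request in the next step.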
Step 7: Access Sensitive Exposed Files
📸 Verified Output:
Step 8: Unauthenticated API Endpoints
📸 Verified Output:
Step 9: Build the Recon Report
📸 Verified Output:
Step 10: Cleanup
Attack Surface Summary
| Finding | Severity | Impact |
| --- | --- | --- |
| .env exposed | CRITICAL | All secrets: DB password, AWS key, JWT secret |
| backup.sql in web root | CRITICAL | Plaintext passwords for all users |
| /admin unauthenticated | CRITICAL | Full admin panel: user management, data export |
| /api/internal/config public | CRITICAL | DB host, Redis host, all internal addresses |
| .git/config exposed | HIGH | Private repo URL, so source code is accessible |
| phpinfo.php live | HIGH | DB connection string, document root, PHP config |
| /api/users unauthenticated | HIGH | Full user list with emails and roles |
| Version headers | MEDIUM | Direct CVE matching to known vulnerabilities |
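A summary like this can be produced from a list of findings with a short script. The sketch below is one possible shape for such a report, not the lab's own script: the dictionary keys and the `build_report` helper are hypothetical, and only a few of the findings above are included as sample data.

```python
import json

# Lower number sorts first.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

# A subset of the findings from the summary above.
FINDINGS = [
    {"finding": "Version headers", "severity": "MEDIUM"},
    {"finding": ".env exposed", "severity": "CRITICAL"},
    {"finding": ".git/config exposed", "severity": "HIGH"},
    {"finding": "backup.sql in web root", "severity": "CRITICAL"},
]


def build_report(findings):
    """Sort findings by severity and count them per severity level."""
    # sorted() is stable, so findings of equal severity keep their input order.
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
    counts = {}
    for item in ordered:
        counts[item["severity"]] = counts.get(item["severity"], 0) + 1
    return {"total": len(ordered), "by_severity": counts, "findings": ordered}


if __name__ == "__main__":
    print(json.dumps(build_report(FINDINGS), indent=2))
```

Emitting JSON keeps the report machine-readable, so it can be diffed between scans or fed into a ticketing system.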
Remediation
| Finding | Fix |
| --- | --- |
| Version disclosure headers | Remove Server, X-Powered-By, X-Generator at the nginx/Apache level |
| .env in web root | Move to a parent directory above the web root; block .env at the web server: `location ~ /\.env { deny all; }` |
| .git in web root | Exclude the .git directory from deployed artifacts (for example, deploy from an exported archive rather than a working clone); block via nginx: `location ~ /\.git { deny all; }` |
| Backup files in web root | Never keep *.sql, *.bak in the web root; use secure off-site storage |
| Unauthenticated endpoints | Require JWT validation on all /api/ endpoints; put /admin behind an IP allowlist + MFA |
| Verbose error pages | Return `{"error": "Internal Server Error", "id": "ERR-XXXX"}` and log details server-side only |
| phpinfo.php | Delete from production entirely; never deploy debug pages to production |
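The verbose-error fix can be illustrated with a small framework-independent helper. This is a sketch under assumptions: the `safe_error_response` name, the logger name, and the 8-character ID format are hypothetical; only the response shape follows the table above.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")  # hypothetical logger name


def safe_error_response(exc):
    """Log full exception details server-side and return a generic body.

    The client only receives an opaque ID that support staff can use to
    find the matching log entry.
    """
    error_id = f"ERR-{uuid.uuid4().hex[:8].upper()}"
    # exc_info accepts an exception instance and records its traceback.
    logger.error("Unhandled error %s", error_id, exc_info=exc)
    return {"error": "Internal Server Error", "id": error_id}
```

In a Flask application, a helper like this would typically be called from a registered error handler and its return value serialized as the JSON body of a 500 response.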
Further Reading