Lab 05: Hashing and Integrity
🎯 Objective
Master cryptographic hashing — MD5, SHA-1, SHA-256 — to verify file integrity, detect tampering, and understand why rainbow tables make weak hashing dangerous. Apply hashing to real security scenarios.
📚 Background
A cryptographic hash function takes any input and produces a fixed-length "fingerprint" (digest). Good hash functions have three key properties: preimage resistance (can't reverse the hash to find the input), second preimage resistance (can't find a different input with the same hash), and collision resistance (can't find any two inputs with the same hash).
Hashing is used everywhere in security: file integrity verification (download a file, hash it, compare with the published hash to confirm no tampering), password storage (store the hash, never the password), digital signatures (sign the hash of a document, not the document itself), and blockchain (each block contains the hash of the previous block, making tampering detectable).
Rainbow tables are precomputed tables of hash values for common passwords. If an attacker steals a database of MD5 password hashes, they can look each hash up in a rainbow table and instantly find the original password. Salting defeats rainbow tables by making each hash unique even for identical passwords.
⏱️ Estimated Time
35 minutes
📋 Prerequisites
Lab 4 (Cryptography Basics) completed
Docker with the innozverse-cybersec image
🛠️ Tools Used
md5sum, sha1sum, sha256sum: Hash computation
openssl dgst: Multi-algorithm hashing
python3: Demonstrating rainbow tables and salting
🔬 Lab Instructions
Step 1: Compute Hashes of Common Passwords
See why common passwords are instantly crackable:
📸 Verified Output:
💡 What this means: The MD5 hash 5f4dcc3b5aa765d61d8327deb882cf99 for "password" is one of the most searched strings in rainbow table databases, so any attacker can instantly identify it. These hashes are published in online databases (like https://crackstation.net/). If you see this in a database dump, the password is immediately known.
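The commands for this step are not reproduced here, so the following is only a minimal Python sketch of the same idea. The password list and the helper name hash_password are illustrative assumptions, not part of the lab image.

```python
import hashlib

# Hypothetical sample list; any short list of well-known weak passwords works.
COMMON_PASSWORDS = ["password", "123456", "qwerty", "letmein"]

def hash_password(password: str, algorithm: str = "md5") -> str:
    """Return the unsalted hex digest of a password (insecure, demonstration only)."""
    return hashlib.new(algorithm, password.encode("utf-8")).hexdigest()

for pw in COMMON_PASSWORDS:
    # Identical inputs always give identical digests, which is what makes
    # precomputed lookups possible.
    print(f"{pw:10s} md5={hash_password(pw)} sha256={hash_password(pw, 'sha256')}")
```

Running it twice prints the same digests both times, which is exactly why unsalted hashes of common passwords can be recognized on sight.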
Step 2: File Integrity Verification
Hash a file, tamper with it, detect the tampering:
📸 Verified Output:
💡 What this means: The two SHA-256 hashes are completely different because we added content to the file. This is how software distributors verify downloads — they publish the SHA-256 hash alongside the download. Before running software, compute the hash and compare. If it differs, the file was corrupted or tampered with (potentially injected with malware).
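As a rough illustration of the hash, tamper, re-hash cycle described above, here is a small Python sketch. The file name and file contents are made up for the example, and the file lives in a temporary directory so nothing on the system is touched.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "important.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("Transfer $100 to Alice\n")
    original_hash = sha256_of_file(path)

    # Simulate tampering by appending one extra line.
    with open(path, "a") as f:
        f.write("Transfer $9999 to Mallory\n")
    tampered_hash = sha256_of_file(path)

print("original:", original_hash)
print("tampered:", tampered_hash)
print("tampering detected:", original_hash != tampered_hash)
```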
Step 3: Hash Algorithm Comparison
📸 Verified Output:
💡 What this means: MD5 = 32 hex chars (128 bits), SHA-1 = 40 hex chars (160 bits), SHA-256 = 64 hex chars (256 bits), SHA-512 = 128 hex chars (512 bits). A longer digest raises the cost of a brute-force collision search (roughly 2^(n/2) attempts for an n-bit digest), but length alone does not rescue a broken design such as MD5. The empty string hash is a well-known value: attackers check whether password fields contain it to find users with blank passwords.
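The digest lengths listed above can be checked directly. This is a minimal sketch using Python's hashlib; the helper name digest_sizes is an illustrative choice.

```python
import hashlib

ALGORITHMS = ["md5", "sha1", "sha256", "sha512"]

def digest_sizes(data: bytes) -> dict:
    """Map each algorithm name to (hex length, bit length) of its digest."""
    sizes = {}
    for name in ALGORITHMS:
        hex_digest = hashlib.new(name, data).hexdigest()
        # Each hex character encodes 4 bits.
        sizes[name] = (len(hex_digest), len(hex_digest) * 4)
    return sizes

for name, (hex_len, bits) in digest_sizes(b"hello").items():
    print(f"{name:7s} {hex_len:3d} hex chars = {bits} bits")

# The well-known digests of the empty string.
empty_md5 = hashlib.md5(b"").hexdigest()
empty_sha256 = hashlib.sha256(b"").hexdigest()
print("md5('')    =", empty_md5)
print("sha256('') =", empty_sha256)
```

The output length depends only on the algorithm, never on the input size.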
Step 4: The Avalanche Effect
📸 Verified Output:
💡 What this means: This property (avalanche effect) is crucial for security. It means you cannot "guess" the hash of a slightly modified input by looking at another hash. Changing a single input bit flips about half of the output bits on average, so the new digest looks unrelated to the old one. This is why hashes are well suited to integrity checking: you can't predict what a modified file's hash will be.
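One way to see the avalanche effect numerically is to count how many output bits differ between two digests. The sketch below is an illustration under simple assumptions: the two inputs are arbitrary and differ in exactly one bit ('o' is 0x6f, 'n' is 0x6e).

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where SHA-256(a) and SHA-256(b) disagree."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    # XOR leaves a 1 wherever the two digests differ.
    return bin(da ^ db).count("1")

changed = differing_bits(b"hello", b"helln")
print(f"{changed} of 256 output bits differ after a one-bit input change")
```

The count should land near 128, half of the 256 output bits.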
Step 5: Rainbow Tables — Why They're Dangerous
📸 Verified Output:
💡 What this means: Real rainbow tables contain billions of entries. A plain lookup table answers in O(1) time per hash; a true rainbow table trades a little lookup time for far less storage, but is still fast. Three of four hashes were cracked in milliseconds. Public services such as CrackStation hold billions of precomputed entries for unsalted hashes. This is why you should NEVER use MD5 for passwords.
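The sketch below shows the precomputation idea with a plain dictionary. It is a simplification: a real rainbow table stores hash chains built with reduction functions rather than every hash, and the tiny word list here is a made-up stand-in for a real password corpus.

```python
import hashlib

# Hypothetical miniature word list; real tables cover billions of candidates.
WORDLIST = ["password", "123456", "qwerty", "letmein", "admin"]

def build_lookup_table(words):
    """Precompute MD5 digest -> plaintext for every candidate password."""
    return {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}

def crack(hash_hex, table):
    """Return the plaintext if the hash was precomputed, else None."""
    return table.get(hash_hex.lower())

TABLE = build_lookup_table(WORDLIST)

stolen_hashes = [
    "5f4dcc3b5aa765d61d8327deb882cf99",           # md5("password")
    hashlib.md5(b"Xk9#vT2!qLm").hexdigest(),      # not in the word list
]
for h in stolen_hashes:
    print(h, "->", crack(h, TABLE) or "not found")
```

The expensive work happens once when the table is built; each later lookup is a single dictionary access.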
Step 6: Salting Defeats Rainbow Tables
📸 Verified Output:
💡 What this means: With salting, even if 1,000 users all use "password" as their password, they all get different hashes. The attacker can't look any of them up in a rainbow table; they would need to brute-force each one individually with that specific salt prepended. Combined with slow hashing algorithms (bcrypt/argon2), this makes large-scale password cracking far more expensive, although a weak password can still be guessed one account at a time.
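A minimal sketch of per-user salting follows. It uses salted SHA-256 only to keep the mechanism visible; as noted above, a real system would use a deliberately slow function. The function names and the 16-byte salt length are illustrative assumptions.

```python
import hashlib
import hmac
import os

def hash_with_salt(password: str, salt: bytes = None):
    """Return (salt, hex digest of salt + password); a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

def verify(password: str, salt: bytes, expected: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_with_salt(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users pick the same password but receive different salts.
salt_a, hash_a = hash_with_salt("password")
salt_b, hash_b = hash_with_salt("password")
print("user A:", salt_a.hex(), hash_a)
print("user B:", salt_b.hex(), hash_b)
print("same hash?", hash_a == hash_b)
```

The salt is stored next to the hash in the clear. Its job is to make each hash unique, not to be secret.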
Step 7: Password-Based Key Derivation (PBKDF2)
📸 Verified Output:
💡 What this means: The $6$ prefix indicates SHA-512 crypt (a UNIX password hashing scheme). The format is $algorithm$salt$hash. If you see $1$ in /etc/shadow, that's MD5, which is dangerously weak. $6$ is acceptable. $y$ is yescrypt (modern). Modern systems should use argon2, which is even more configurable in terms of memory and computation cost.
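Two small pieces may help here, both sketches rather than the lab's own commands: a helper that reads the scheme identifier from a crypt-style string, and a PBKDF2 derivation using Python's hashlib.pbkdf2_hmac. The sample strings, the function names, and the iteration count of 100,000 are illustrative assumptions. The $5$ entry (SHA-256 crypt) is added for completeness.

```python
import hashlib
import hmac
import os

CRYPT_SCHEMES = {
    "1": "MD5 crypt",
    "5": "SHA-256 crypt",
    "6": "SHA-512 crypt",
    "y": "yescrypt",
}

def identify_scheme(shadow_hash: str) -> str:
    """Read the identifier between the first two '$' characters."""
    parts = shadow_hash.split("$")
    if len(parts) < 3 or parts[0] != "":
        return "unknown"
    return CRYPT_SCHEMES.get(parts[1], "unknown")

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2-HMAC-SHA256: repeat the HMAC many times to slow down guessing."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

print(identify_scheme("$6$examplesalt$examplehash"))  # made-up sample string

salt = os.urandom(16)
stored = derive_key("correct horse", salt)
print("match:", hmac.compare_digest(stored, derive_key("correct horse", salt)))
print("wrong:", hmac.compare_digest(stored, derive_key("guess", salt)))
```

Raising the iteration count makes every guess proportionally slower for an attacker, at the cost of slower logins.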
Step 8: HMAC — Hashing with Authentication
📸 Verified Output:
💡 What this means: HMAC combines hashing with a secret key — it proves both integrity (content wasn't changed) and authenticity (sender knows the secret key). HMAC is used in API authentication (AWS, Azure use HMAC-SHA256 for API request signing), JWT tokens, and cookie signing. An attacker without the secret key cannot produce a valid HMAC for modified data.
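The following sketch shows HMAC-SHA256 signing and verification with Python's standard hmac module. The key and messages are placeholder values chosen for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret-key"  # hypothetical shared secret

def sign(message: bytes, key: bytes = SECRET_KEY) -> str:
    """Return the HMAC-SHA256 tag of the message as hex."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare in constant time to avoid timing leaks."""
    return hmac.compare_digest(sign(message, key), tag)

tag = sign(b"amount=100")
print("tag:", tag)
print("original message valid:", verify(b"amount=100", tag))
print("modified message valid:", verify(b"amount=9999", tag))
print("wrong key valid:", verify(b"amount=100", tag, b"attacker-key"))
```

Unlike a plain hash, the tag cannot be recomputed by someone who changes the message but does not hold the key.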
Step 9: Check File Downloads with Hashes
📸 Verified Output:
💡 What this means: Always verify hash checksums when downloading sensitive software. Linux distributions (Ubuntu, Debian, Fedora) publish SHA-256 checksums for their ISO images. If your download was intercepted by a malicious proxy and modified to contain malware, the hash would not match. This is a critical security practice.
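A download check can be sketched as below. The file name and contents are stand-ins created in a temporary directory; in practice the published checksum would come from the distributor's site over a trusted channel.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str) -> str:
    """Hash a file in chunks so multi-gigabyte images are handled safely."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path: str, published_hex: str) -> bool:
    """Compare the local file's SHA-256 with the published checksum."""
    return sha256_of_file(path) == published_hex.strip().lower()

with tempfile.TemporaryDirectory() as workdir:
    iso_path = os.path.join(workdir, "example.iso")  # hypothetical download
    with open(iso_path, "wb") as f:
        f.write(b"pretend ISO contents")
    # Stand-in for the checksum the distributor would publish.
    published = hashlib.sha256(b"pretend ISO contents").hexdigest()
    genuine_ok = verify_download(iso_path, published)

    # Simulate a modification made in transit.
    with open(iso_path, "ab") as f:
        f.write(b"<injected payload>")
    tampered_ok = verify_download(iso_path, published)

print("genuine download verified:", genuine_ok)
print("tampered download verified:", tampered_ok)
```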
Step 10: Build a Simple Integrity Monitor
📸 Verified Output:
💡 What this means: This is a simplified version of what tools like Tripwire, AIDE, and Wazuh do — they maintain a baseline of file hashes and alert when files change unexpectedly. This detects rootkits (which modify system binaries), ransomware (which modifies/encrypts files), and insider threats (unauthorized changes to configurations).
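The baseline-and-compare idea can be sketched in a few lines of Python. This is a toy version under simple assumptions, not how Tripwire, AIDE, or Wazuh are implemented; the file names are invented and everything runs inside a temporary directory.

```python
import hashlib
import os
import tempfile

def hash_file(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def snapshot(directory: str) -> dict:
    """Record relative path -> SHA-256 for every file under the directory."""
    result = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            result[os.path.relpath(full, directory)] = hash_file(full)
    return result

def compare(baseline: dict, current: dict) -> dict:
    """Report files that were modified, added, or removed since the baseline."""
    return {
        "modified": sorted(p for p in baseline if p in current and baseline[p] != current[p]),
        "added": sorted(p for p in current if p not in baseline),
        "removed": sorted(p for p in baseline if p not in current),
    }

with tempfile.TemporaryDirectory() as watched:
    for name, content in [("passwd.conf", "root:x\n"), ("app.conf", "debug=false\n")]:
        with open(os.path.join(watched, name), "w") as f:
            f.write(content)
    baseline = snapshot(watched)

    # Simulate an intrusion: edit one file, drop a new one, delete another.
    with open(os.path.join(watched, "app.conf"), "a") as f:
        f.write("debug=true\n")
    with open(os.path.join(watched, "backdoor.sh"), "w") as f:
        f.write("#!/bin/sh\n")
    os.remove(os.path.join(watched, "passwd.conf"))

    report = compare(baseline, snapshot(watched))

print(report)
```

A real monitor also has to protect the baseline itself, since an attacker who can rewrite the stored hashes can hide any change.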
✅ Verification
📸 Verified Output:
🚨 Common Mistakes
Using MD5 for security: MD5 is broken (practical collisions can be generated) and should never be used for security purposes. Use SHA-256 at minimum.
Storing hashes without salts: Unsalted hashes are vulnerable to rainbow table attacks. Always use a unique random salt per password.
Confusing HMAC with plain hashing: A plain hash doesn't authenticate the source — anyone can compute it. HMAC requires a secret key, providing authentication.
📝 Summary
Cryptographic hashes provide fixed-length fingerprints of data; any change to the input produces a completely different hash (avalanche effect)
MD5 and SHA-1 are cryptographically broken — use SHA-256 or SHA-3 for integrity verification
Password hashing must include unique random salts to prevent rainbow table attacks; use purpose-built functions like bcrypt or argon2
HMAC combines hashing with a secret key for both integrity and authentication
🔗 Further Reading
