Lab 14: JMH Performance Tuning

Time: 60 minutes | Level: Architect | Docker: docker run -it --rm zchencow/innozverse-java:latest bash


Overview

Measure what you optimize. This lab uses JMH (Java Microbenchmark Harness) to write reliable microbenchmarks, selects and configures garbage collectors for different workloads, analyzes GC logs, tunes JIT compilation, and uses Epsilon GC, a no-op collector, to take measurements free of GC interference.


Step 1: Why Microbenchmarking is Hard

JVM Optimizations that fool naive benchmarks:
  Dead code elimination  — JIT removes code with no observable side effects
  Constant folding       — JIT computes constant expressions at compile time
  Loop unrolling         — JIT replicates loop body to reduce overhead
  Inlining               — JIT copies callee body into caller
  Warmup                 — first N iterations use interpreter, not JIT
  GC pauses              — garbage collection adds latency noise
  OSR                    — on-stack replacement changes benchmark behavior mid-run

JMH solutions:
  Blackhole.consume()   — prevents dead code elimination
  @Fork(1+)             — fresh JVM per benchmark
  @Warmup               — discard initial results
  @Measurement          — only measure after warmup
  @State                — isolate benchmark state
  @BenchmarkMode        — throughput, average time, sample time
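
To make the ideas in the list above concrete, here is a minimal hand-rolled sketch in plain Java. It is not JMH and not how JMH is implemented: `MiniHarness`, `averageNanos`, and `sink` are hypothetical names chosen for illustration. It only mimics three of the ideas: a discarded warmup phase, a separate measured phase, and a volatile sink that stands in for what a Blackhole achieves.

```java
import java.util.function.IntSupplier;

// Hypothetical hand-rolled harness: illustrates the ideas only, not JMH itself.
final class MiniHarness {
    // Volatile sink: every write is observable, so the JIT cannot discard
    // the computation that produced the value (a crude stand-in for Blackhole).
    static volatile int sink;

    /** Returns the average nanoseconds per call over the measured iterations. */
    static double averageNanos(IntSupplier work, int warmupIters, int measuredIters) {
        // Warmup phase: timings are not recorded, similar in spirit to @Warmup.
        for (int i = 0; i < warmupIters; i++) {
            sink = work.getAsInt();
        }
        // Measurement phase: only these iterations are timed, similar to @Measurement.
        long start = System.nanoTime();
        for (int i = 0; i < measuredIters; i++) {
            sink = work.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / measuredIters;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        double avg = averageNanos(() -> {
            int sum = 0;
            for (int v : data) {
                sum += v;
            }
            return sum;
        }, 10_000, 10_000);
        System.out.printf("avg %.1f ns/op, last result %d%n", avg, sink);
    }
}
```

A loop like this still runs in a single JVM with no statistical treatment of the results, which is the gap that forking and JMH's own measurement machinery are meant to close.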

Step 2: JMH Setup


Step 3: Benchmark Modes and Scopes
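
As a rough guide to reading scores, JMH's Mode.Throughput reports operations per unit of time and Mode.AverageTime reports time per operation, so the two are reciprocal views of the same measurement. The sketch below shows only that arithmetic in plain Java; `ModeMath` and its methods are hypothetical helpers, not part of JMH.

```java
// Hypothetical helper: shows how throughput and average-time scores relate.
final class ModeMath {
    /** Throughput-style score: operations per second. */
    static double opsPerSecond(long operations, long elapsedNanos) {
        return operations / (elapsedNanos / 1_000_000_000.0);
    }

    /** Average-time-style score: nanoseconds per operation. */
    static double nanosPerOp(long operations, long elapsedNanos) {
        return (double) elapsedNanos / operations;
    }

    public static void main(String[] args) {
        long ops = 2_000_000L;            // operations completed
        long elapsed = 1_000_000_000L;    // one second, in nanoseconds
        System.out.println(opsPerSecond(ops, elapsed) + " ops/s");
        System.out.println(nanosPerOp(ops, elapsed) + " ns/op");
    }
}
```

With these inputs, two million operations in one second correspond to 500 ns per operation. Sample-time and single-shot modes report distributions or one-off timings instead and are not covered by this arithmetic.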


Step 4: GC Algorithm Selection
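
One way to confirm which collector a JVM is actually running is to ask the standard management API. The following is a small sketch using `java.lang.management`; the class name `ActiveCollectors` is hypothetical. Running it with different flags such as -XX:+UseG1GC or -XX:+UseZGC should list different collector names, though the exact names depend on the JDK build.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the garbage collectors registered in this JVM.
final class ActiveCollectors {
    /** Names of the collectors the running JVM exposes through JMX. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since JVM start.
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```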

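💡 To eliminate GC noise in a JMH benchmark, run it under Epsilon GC in a forked JVM, for example @Fork(value = 1, jvmArgsAppend = {"-XX:+UnlockExperimentalVMOptions", "-XX:+UseEpsilonGC"}), and set the heap (-Xmx) large enough to hold everything the run allocates, because Epsilon never reclaims memory. @Fork(0) runs the benchmark inside the harness JVM, so per-benchmark JVM flags are not applied there.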


Step 5: GC Log Analysis
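
A log written with -Xlog:gc*:file=gc.log can be scanned for pause durations with a few lines of code. The sketch below assumes pause lines contain the word "Pause" and end with a duration such as 3.217ms, which is how unified GC logging commonly prints them. The sample lines in `main` are made up for illustration, not captured from a real run, and `GcPauseParser` is a hypothetical name.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the longest pause from GC log lines.
final class GcPauseParser {
    // Matches a line containing "Pause" and captures the trailing duration in ms.
    private static final Pattern PAUSE =
            Pattern.compile("\\bPause\\b.*?(\\d+(?:\\.\\d+)?)ms\\s*$");

    /** Returns the longest pause in milliseconds, or 0.0 if no pause line is found. */
    static double maxPauseMillis(List<String> lines) {
        double max = 0.0;
        for (String line : lines) {
            Matcher m = PAUSE.matcher(line);
            if (m.find()) {
                max = Math.max(max, Double.parseDouble(m.group(1)));
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Illustrative lines only; real logs vary by collector and JDK version.
        List<String> sample = List.of(
                "[2.345s][info][gc] GC(7) Pause Young (Normal) (G1 Evacuation Pause) 24M->6M(128M) 3.217ms",
                "[5.012s][info][gc] GC(8) Pause Young (Normal) (G1 Evacuation Pause) 30M->7M(128M) 12.5ms",
                "[5.100s][info][gc] GC(9) Concurrent Mark Cycle 45.001ms");
        System.out.println("max pause: " + maxPauseMillis(sample) + " ms");
    }
}
```

The concurrent-cycle line is skipped on purpose, since concurrent phases do not stop application threads for their full duration.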


Step 6: JIT Compilation Flags
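
Besides reading -XX:+PrintCompilation output, a program can ask the JVM how much time the JIT compiler has spent so far. This sketch uses the standard `CompilationMXBean`; the class name `JitInfo` and the loop workload are hypothetical. Comparing a normal run with one started under -Xint (interpreter only) is one way to see the effect of the compiler.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports JIT compilation time through the management API.
final class JitInfo {
    /** Total JIT compilation time in ms, or -1 if the JVM does not report it. */
    static long totalCompilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        System.out.println("JIT compiler: " + (jit == null ? "none" : jit.getName()));

        long before = totalCompilationMillis();
        long sum = 0;
        // A hot loop inside main gives the JIT a reason to compile while it runs.
        for (int i = 0; i < 50_000_000; i++) {
            sum += i % 7;
        }
        long after = totalCompilationMillis();
        System.out.println("sum=" + sum + ", compile time during loop: "
                + (after - before) + " ms");
    }
}
```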


Step 7: String Interning and Constant Pool
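
The effect of String.intern() is easiest to see with reference comparisons. In this sketch (the class name `InternDemo` and the method `isCanonical` are hypothetical), a string literal is already the pooled instance, while a copy created with `new String(...)` is a separate object until it is interned.

```java
// Hypothetical demo: distinguishes pooled strings from fresh heap copies.
final class InternDemo {
    /** True if s is itself the canonical instance held in the string pool. */
    static boolean isCanonical(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        String literal = "jmh";                 // literals are interned automatically
        String copy = new String("jmh");        // distinct object with equal content

        System.out.println(literal.equals(copy));      // true: same characters
        System.out.println(literal == copy);           // false: different objects
        System.out.println(literal == copy.intern());  // true: intern() returns the pooled one
    }
}
```

Reference equality is used here only to expose object identity. Application code should keep comparing strings with equals().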


Step 8: Capstone — JMH Benchmark

📸 Verified Output:

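💡 concat looks faster because javac folds "Hello" + " World!" into a single compile-time constant, so that benchmark measures loading a constant rather than concatenating strings. This is why benchmarking is tricky: feed inputs from non-final @State fields so they cannot be folded, and return results or pass them to a Blackhole so they cannot be eliminated.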
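
The folding of "Hello" + " World!" can be observed without JMH. The Java Language Specification treats a concatenation of literals as a constant expression, so the compiler emits one interned string, whereas a concatenation involving a method parameter is built at run time. The class and method names in this sketch are hypothetical.

```java
// Hypothetical demo: compile-time constant folding versus run-time concatenation.
final class ConstantFoldingDemo {
    /** Both operands are literals, so javac folds this into "Hello World!". */
    static String constantConcat() {
        return "Hello" + " World!";
    }

    /** The prefix is only known at run time, so a new String is created per call. */
    static String variableConcat(String prefix) {
        return prefix + " World!";
    }

    public static void main(String[] args) {
        String literal = "Hello World!";
        System.out.println(constantConcat() == literal);               // true: same pooled constant
        System.out.println(variableConcat("Hello") == literal);        // false: freshly built object
        System.out.println(variableConcat("Hello").equals(literal));   // true: equal content
    }
}
```

A benchmark that times `constantConcat()` therefore measures almost nothing, while one that passes its input in from benchmark state measures real concatenation work.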


Summary

Tool/Concept      CLI Flag / API                                       Purpose
JMH benchmark     @Benchmark + Runner                                  Reliable micro-benchmarks
Warmup            @Warmup(iterations=N)                                JIT stabilization
Blackhole         Blackhole.consume()                                  Prevent dead code elimination
G1GC              -XX:+UseG1GC                                         Balanced throughput/latency
ZGC               -XX:+UseZGC                                          Sub-10ms pause times
EpsilonGC         -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC   Benchmark without GC noise
GC logging        -Xlog:gc*:file=gc.log                                Pause analysis
JIT logging       -XX:+PrintCompilation                                Compilation analysis
String intern     String.intern()                                      Pool deduplication
Compact strings   Default Java 9+                                      byte[] for Latin-1
