Lab 13: Streams & Lambdas
Objective
Use the Java Streams API for data processing (filter, map, reduce, collect, flatMap, and parallel streams) and write concise lambda expressions and method references.
Background
The Streams API (Java 8+) brings functional-style data processing to Java. A Stream is a lazy sequence of elements supporting aggregate operations. Streams enable declarative, pipeline-based data transformation that replaces verbose for-loop boilerplate. Combined with lambdas and method references, they are the most impactful Java 8 feature for day-to-day code.
Time
45 minutes
Prerequisites
Lab 08 (Interfaces — Functional Interfaces)
Lab 09 (Collections)
Lab 12 (Generics)
Tools
Java 21 (Eclipse Temurin)
Docker image: innozverse-java:latest
Lab Instructions
Step 1: Stream Pipeline Basics
💡 Streams are lazy: intermediate operations (filter, map) don't execute until a terminal operation (collect, forEach, count) is called. Laziness also enables short-circuiting: stream().filter(...).findFirst() stops at the first match without processing the remaining elements.
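The laziness and short-circuiting described in the tip above can be observed directly. The following is a minimal sketch, not the lab's original code: the class name `LazyPipelineSketch` and its helper methods are hypothetical, and `peek` is used only to record which elements the pipeline actually touches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical helper class for illustration only.
class LazyPipelineSketch {

    // Returns the first even number and records every element pulled through the pipeline.
    static Optional<Integer> firstEven(List<Integer> numbers, List<Integer> inspected) {
        return numbers.stream()
                .peek(inspected::add)        // runs only when an element is actually consumed
                .filter(n -> n % 2 == 0)
                .findFirst();                // short-circuiting terminal operation
    }

    // Builds a pipeline but never calls a terminal operation on it.
    static int elementsSeenWithoutTerminalOp(List<Integer> numbers) {
        List<Integer> inspected = new ArrayList<>();
        Stream<Integer> pending = numbers.stream()
                .peek(inspected::add)
                .filter(n -> n % 2 == 0);
        // 'pending' is only a description of work; nothing has executed yet.
        return inspected.size();
    }

    public static void main(String[] args) {
        List<Integer> inspected = new ArrayList<>();
        Optional<Integer> result = firstEven(List.of(1, 3, 4, 5, 6), inspected);
        System.out.println("first even = " + result.orElse(-1)); // first even = 4
        System.out.println("inspected  = " + inspected);         // inspected  = [1, 3, 4]
        System.out.println("seen without terminal op = "
                + elementsSeenWithoutTerminalOp(List.of(1, 2, 3))); // 0
    }
}
```

Elements 5 and 6 are never inspected because `findFirst()` stops the sequential pipeline as soon as 4 passes the filter.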
📸 Verified Output:
Step 2: Lambdas & Method References
💡 Method references are shorthand for lambdas that just call a method: String::toUpperCase is equivalent to s -> s.toUpperCase(), and System.out::println is equivalent to x -> System.out.println(x). They're more readable for simple cases and adapt to any functional interface with a compatible signature.
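To make the equivalence in the tip above concrete, here is a small sketch that writes the same mapping twice, once as a lambda and once as a method reference. The class name `MethodRefSketch` and the sample words are assumptions for illustration, not part of the lab.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
class MethodRefSketch {

    // Lambda form: spells out the parameter and the call.
    static List<String> upperWithLambda(List<String> words) {
        return words.stream().map(s -> s.toUpperCase()).collect(Collectors.toList());
    }

    // Method-reference form: same behavior, less noise.
    static List<String> upperWithMethodRef(List<String> words) {
        return words.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "lambda");

        // Both forms produce the same list.
        System.out.println(upperWithLambda(words).equals(upperWithMethodRef(words))); // true

        // The same reference fits any functional interface with a compatible signature.
        Function<String, String> asFunction = String::toUpperCase;
        UnaryOperator<String> asOperator = String::toUpperCase;
        System.out.println(asFunction.apply("a") + asOperator.apply("b")); // AB

        // System.out::println stands in for x -> System.out.println(x).
        upperWithMethodRef(words).forEach(System.out::println);
    }
}
```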
📸 Verified Output:
Step 3: Collectors — Grouping & Partitioning
💡 Collectors.groupingBy(classifier, downstream) takes a downstream collector that decides what happens to each group: counting(), summingDouble(), joining(), another groupingBy() (nested grouping), or mapping() combined with toList(). A single expression can replace the map-and-loop bookkeeping of imperative grouping code.
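As a sketch of the downstream collectors named in the tip above, the example below groups a list of words three ways. The class name `GroupingSketch`, the sample data, and the use of `TreeMap::new` (only to get sorted keys when printing) are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
class GroupingSketch {

    // groupingBy + counting(): how many words have each length.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.counting()));
    }

    // groupingBy + joining(): words sharing a first letter, joined into one string.
    static Map<Character, String> joinByInitial(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(w -> w.charAt(0), TreeMap::new, Collectors.joining(", ")));
    }

    // partitioningBy: a two-way split on a predicate (keys are always true and false).
    static Map<Boolean, List<String>> partitionByLength(List<String> words, int minLength) {
        return words.stream()
                .collect(Collectors.partitioningBy(w -> w.length() >= minLength));
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "blueberry", "cherry", "fig");
        System.out.println(countByLength(words));        // {3=1, 5=1, 6=2, 7=1, 9=1}
        System.out.println(joinByInitial(words));
        System.out.println(partitionByLength(words, 6));
    }
}
```

`partitioningBy` is the better fit when the classifier is a yes/no question, because both keys are present in the result even if one side is empty.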
📸 Verified Output:
Step 4: flatMap & Optional
💡 flatMap maps each element to a stream and flattens the results, turning what would be a Stream<Stream<T>> into a Stream<T>. It's how you process nested collections in one pipeline. Optional is not a collection; it holds zero or one value. Avoid calling .get() without checking .isPresent() first; prefer .orElse(), .orElseGet(), or .orElseThrow().
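The two ideas in the tip above can be sketched together: flatten a nested list with flatMap, then handle a possibly missing result through Optional without calling `.get()`. The class name `FlatMapOptionalSketch` and the sample names are hypothetical.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
class FlatMapOptionalSketch {

    // Each inner list becomes a stream; flatMap concatenates them into one stream.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // findFirst returns an Optional because there may be no match.
    static Optional<String> firstStartingWith(List<String> words, String prefix) {
        return words.stream()
                .filter(w -> w.startsWith(prefix))
                .findFirst();
    }

    public static void main(String[] args) {
        List<List<String>> teams = List.of(List.of("ana", "bo"), List.of(), List.of("cy"));
        List<String> everyone = flatten(teams);
        System.out.println(everyone); // [ana, bo, cy]

        // orElse supplies a fallback instead of risking an exception from get().
        System.out.println(firstStartingWith(everyone, "b").orElse("<none>")); // bo
        System.out.println(firstStartingWith(everyone, "z").orElse("<none>")); // <none>

        // map transforms the value only if it is present.
        int length = firstStartingWith(everyone, "c")
                .map(String::length)
                .orElseThrow(() -> new IllegalStateException("no name starts with c"));
        System.out.println(length); // 2
    }
}
```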
📸 Verified Output:
Step 5: reduce & Custom Collectors
💡 Collector.of(supplier, accumulator, combiner, finisher) lets you build your own collection strategy. The combiner merges partial results and is only invoked when the stream runs in parallel. Use custom collectors when the built-in ones don't cover your aggregation needs (running statistics, several aggregates computed in one pass, etc.).
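The four parts of `Collector.of` from the tip above map onto code as follows. This is a minimal sketch under stated assumptions: the class name `ReduceCollectorSketch` and the averaging collector are illustrative choices (the JDK already ships `Collectors.averagingInt`), chosen only because the accumulator state is easy to follow.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collector;

// Hypothetical helper class for illustration only.
class ReduceCollectorSketch {

    // reduce with an identity value: always returns a result, even for an empty list.
    static int sum(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // reduce without an identity: returns Optional because the list may be empty.
    static Optional<Integer> max(List<Integer> values) {
        return values.stream().reduce(Integer::max);
    }

    // Custom collector: state is long[2] = {running sum, element count}.
    static Collector<Integer, long[], Double> averaging() {
        return Collector.of(
                () -> new long[2],                                   // supplier: fresh state
                (acc, v) -> { acc[0] += v; acc[1]++; },              // accumulator: fold one element in
                (left, right) -> {                                   // combiner: merge parallel partials
                    left[0] += right[0];
                    left[1] += right[1];
                    return left;
                },
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]); // finisher: state -> result
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4);
        System.out.println(sum(values));                                   // 10
        System.out.println(max(values).orElse(Integer.MIN_VALUE));         // 4
        System.out.println(values.stream().collect(averaging()));          // 2.5
        System.out.println(values.parallelStream().collect(averaging()));  // 2.5
    }
}
```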
📸 Verified Output:
Step 6: Parallel Streams
💡 Parallel streams split work across ForkJoinPool threads. They help when (1) the dataset is large (100K+ elements), (2) operations are CPU-intensive and independent, and (3) order doesn't matter. They hurt for small datasets (thread overhead dominates), I/O-bound operations, or when elements depend on each other. Never use a parallel forEach with shared mutable state.
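A CPU-bound, order-independent task such as counting primes fits the criteria in the tip above. The sketch below assumes a benchmark that counts primes up to 1,000,000, which is consistent with the count of 78498 quoted in this step; the class name `ParallelPrimesSketch` and the trial-division test are illustrative choices, not necessarily the lab's own code.

```java
import java.util.stream.IntStream;

// Hypothetical helper class for illustration only.
class ParallelPrimesSketch {

    // Simple trial division; deliberately CPU-heavy and free of shared state.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (int d = 3; (long) d * d <= n; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Same pipeline either way; only the execution mode changes.
    static long countPrimes(int limit, boolean parallel) {
        IntStream range = IntStream.rangeClosed(2, limit);
        if (parallel) {
            range = range.parallel();
        }
        // count() aggregates safely; no shared mutable counter is needed.
        return range.filter(ParallelPrimesSketch::isPrime).count();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long sequential = countPrimes(1_000_000, false);
        long t1 = System.nanoTime();
        long parallel = countPrimes(1_000_000, true);
        long t2 = System.nanoTime();

        // Both counts are 78498; the timings depend on the machine.
        System.out.printf("sequential: %d primes in %d ms%n", sequential, (t1 - t0) / 1_000_000);
        System.out.printf("parallel:   %d primes in %d ms%n", parallel, (t2 - t1) / 1_000_000);
    }
}
```

A single timed run like this is only a rough indication, since JIT warm-up affects the first measurement.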
📸 Verified Output:
(times vary by CPU; primes count is always 78498)
Step 7: Primitive Streams
💡 Primitive streams (IntStream, LongStream, DoubleStream) are faster than Stream<Integer> because they avoid boxing and unboxing. For numeric processing, prefer mapToInt(), mapToDouble(), and their specialized operations such as sum(), average(), and summaryStatistics(). The speedup can be significant for large datasets.
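The primitive-stream operations mentioned in the tip above look like this in practice. The class name `PrimitiveStreamSketch` and the sample words are assumptions for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical helper class for illustration only.
class PrimitiveStreamSketch {

    // mapToInt switches to IntStream, so sum() works on plain ints with no boxing.
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    // summaryStatistics() gathers count, min, max, sum, and average in one pass.
    static IntSummaryStatistics lengthStats(List<String> words) {
        return words.stream().mapToInt(String::length).summaryStatistics();
    }

    // IntStream.rangeClosed replaces a counting for-loop; mapToLong avoids int overflow.
    static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n).mapToLong(i -> (long) i * i).sum();
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "lambda", "map");
        System.out.println(totalLength(words));              // 15
        IntSummaryStatistics stats = lengthStats(words);
        System.out.println(stats.getMin() + ".." + stats.getMax() + " avg " + stats.getAverage());
        System.out.println(sumOfSquares(10));                // 385
    }
}
```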
📸 Verified Output:
Step 8: Complete Example — Sales Data Pipeline
💡 Nested groupingBy is the idiomatic Java way to build a pivot table: group by region, then by product, to get a Map<Region, Map<Product, Total>>. The intermediate stages are lazy and composable, and the pipeline can be parallelized with .parallel(). It plays the role of SQL GROUP BY when the aggregation happens in the application layer.
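A pivot of the kind described in the tip above can be sketched with two nested groupingBy calls. Everything here is hypothetical: the `Sale` class, its fields, and the fixed sample data stand in for the lab's randomly generated sales records.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration only.
class SalesPivotSketch {

    // Minimal immutable sale record (a plain class, so it also compiles on older JDKs).
    static final class Sale {
        private final String region;
        private final String product;
        private final double amount;

        Sale(String region, String product, double amount) {
            this.region = region;
            this.product = product;
            this.amount = amount;
        }

        String region() { return region; }
        String product() { return product; }
        double amount() { return amount; }
    }

    // Outer grouping by region, inner grouping by product, summing amounts per cell.
    static Map<String, Map<String, Double>> totalsByRegionAndProduct(List<Sale> sales) {
        return sales.stream()
                .collect(Collectors.groupingBy(Sale::region,
                        Collectors.groupingBy(Sale::product,
                                Collectors.summingDouble(Sale::amount))));
    }

    public static void main(String[] args) {
        List<Sale> sales = List.of(
                new Sale("EU", "Widget", 100.0),
                new Sale("EU", "Widget", 50.0),
                new Sale("EU", "Gadget", 75.0),
                new Sale("US", "Widget", 200.0));

        // Roughly: SELECT region, product, SUM(amount) ... GROUP BY region, product
        totalsByRegionAndProduct(sales).forEach((region, byProduct) ->
                byProduct.forEach((product, total) ->
                        System.out.println(region + " / " + product + " = " + total)));
    }
}
```

The result maps are unordered here; passing `TreeMap::new` as the map factory is one way to get sorted keys if the output order matters.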
📸 Verified Output:
(values vary by random seed; structure is always consistent)
Verification
Summary
You've worked through Java Streams: pipeline construction, lambdas, method references, the major collectors, flatMap, Optional, reduce, parallel streams, primitive streams, and a complete sales data pipeline. Streams are central to modern Java: they make code shorter and more readable, and parallel or primitive streams can also make it faster on large, CPU-bound workloads.
