Java Streams are one of the most powerful features introduced in Java 8, providing a modern, efficient way to process collections of data in a functional style. The Stream API enables you to perform operations on sequences of elements in a declarative manner, making your code more concise and readable.
In Java, a Stream is a sequence of elements that can be processed in parallel or sequentially. It’s not a data structure, but rather a concept that represents a flow of data. Streams allow you to perform complex operations like filtering, mapping, and reducing on data without modifying the underlying data structure.
A Stream does not store data. Instead, it operates on data provided by a source, such as a collection, an array, or an I/O channel.
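As a minimal sketch of this point, the snippet below runs a pipeline over a list and shows that the source list is left as it was. The class and method names (SourceUnchangedDemo, upperCased) are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SourceUnchangedDemo {
    // Returns an upper-cased copy; the input list is only read, never modified.
    static List<String> upperCased(List<String> source) {
        return source.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Jane");
        List<String> result = upperCased(names);
        System.out.println(names);  // [John, Jane]
        System.out.println(result); // [JOHN, JANE]
    }
}
```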
Creating a Stream
Streams can be created from various data sources. The most common ones are collections (like List, Set), arrays, or even I/O channels (such as reading from files).
Stream from a Collection
We can create a stream from a collection by calling the stream() method.

    import java.util.Arrays;
    import java.util.List;

    public class StreamExample {
        public static void main(String[] args) {
            List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill");
            names.stream()
                 .forEach(System.out::println); // Print each name
        }
    }
Stream from an Array
We can convert an array to a stream using Arrays.stream().

    String[] fruits = {"Apple", "Banana", "Cherry"};
    Arrays.stream(fruits)
          .forEach(System.out::println);
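Files are the third kind of source mentioned at the start of this section. The sketch below is one way to stream the lines of a text file with Files.lines(). The temporary file and the names FileStreamDemo and countNonBlankLines are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class FileStreamDemo {
    // Counts the non-blank lines of a text file. Files.lines reads lazily and
    // keeps the file open, so the stream is closed via try-with-resources.
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for real input data (illustration only).
        Path tmp = Files.createTempFile("fruits", ".txt");
        try {
            Files.write(tmp, Arrays.asList("Apple", "", "Banana", "Cherry"));
            System.out.println(countNonBlankLines(tmp)); // 3
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Unlike a stream over a collection, a file-backed stream holds an open resource, which is why it is wrapped in try-with-resources here.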
Stream Operations
The Stream API provides many methods to manipulate data in a pipeline style. These operations can be categorized into two types:
Intermediate Operations
These operations are lazy and return a new Stream. They allow you to transform or filter data but do not perform the actual computation until a terminal operation is invoked.
- filter(Predicate<T> condition): Filters elements based on a given condition.
- map(Function<T, R> mapper): Transforms elements into another form.
- distinct(): Removes duplicate elements.
- sorted(): Sorts elements in natural order or according to a comparator.
Example
    List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill", "John");
    names.stream()
         .filter(name -> name.startsWith("J")) // Keep names starting with "J"
         .distinct()                           // Remove duplicates
         .sorted()                             // Sort alphabetically
         .forEach(System.out::println);
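The laziness of intermediate operations can be made visible with a small experiment. The sketch below records when the filter lambda actually runs. The log list and the class name LazyPipelineDemo are illustrative additions, not part of the Stream API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazyPipelineDemo {
    // Builds a pipeline, records when each stage runs, and returns the log.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<String> names = Arrays.asList("John", "Jane", "Jack");

        // Only describes the pipeline; no element is filtered or mapped yet.
        Stream<Integer> lengths = names.stream()
                .filter(name -> {
                    log.add("filter " + name);
                    return name.startsWith("Ja");
                })
                .map(String::length);

        log.add("pipeline built");

        // The terminal operation pulls the elements through the pipeline.
        lengths.forEach(len -> log.add("length " + len));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline built", and the "filter ..." entries appear only after it, once forEach has started the computation.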
Terminal Operations
These operations trigger the actual computation and produce a result or a side-effect. Once a terminal operation is invoked, the Stream is considered consumed and cannot be used further.
- forEach(Consumer<T> action): Performs an action for each element.
- collect(Collectors.toList()): Collects the elements into a List. Other collectors, such as Collectors.toSet(), produce a Set or another container.
- reduce(BinaryOperator<T> accumulator): Performs a reduction on the elements of the stream using an associative accumulation function.
- count(): Returns the number of elements in the stream.
- anyMatch(), allMatch(), noneMatch(): Check whether any, all, or none of the elements satisfy a given condition.
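Several of the operations listed above can be sketched side by side. The helper names below (lengths, joined, anyLongerThan) and the class TerminalOpsDemo are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class TerminalOpsDemo {
    // collect: gathers the mapped elements into a new List.
    static List<Integer> lengths(List<String> names) {
        return names.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // reduce: combines the elements pairwise; the result is empty for an empty stream.
    static Optional<String> joined(List<String> names) {
        return names.stream().reduce((a, b) -> a + ", " + b);
    }

    // anyMatch: true as soon as one element satisfies the condition.
    static boolean anyLongerThan(List<String> names, int limit) {
        return names.stream().anyMatch(name -> name.length() > limit);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill");
        System.out.println(lengths(names));           // [4, 4, 4, 4]
        System.out.println(joined(names).orElse("")); // John, Jane, Jack, Jill
        System.out.println(anyLongerThan(names, 4));  // false
    }
}
```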
Example
    List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill");
    long count = names.stream()
                      .filter(name -> name.startsWith("J"))
                      .count(); // Count names starting with "J"
    System.out.println("Count of names starting with 'J': " + count);
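As noted at the start of this section, a stream is consumed by its terminal operation. The following sketch shows what happens on a second use: the Stream API rejects it with an IllegalStateException. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        Stream<String> stream = Arrays.asList("John", "Jane").stream();
        stream.forEach(name -> { });      // first terminal operation consumes the stream
        try {
            stream.forEach(name -> { });  // second use of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;                  // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails()); // true
    }
}
```

To run another pipeline over the same data, call stream() on the source again rather than keeping a reference to the old stream.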
Working with Optional
Several terminal operations return an Optional<T> because they may have no value to return. For example, findFirst() and findAny() return an empty Optional when no element reaches them, such as when a preceding filter matches nothing.
    Optional<String> firstName = names.stream()
                                      .filter(name -> name.startsWith("Z"))
                                      .findFirst();
    System.out.println(firstName.orElse("No name found"));
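An Optional can also be transformed before a fallback is applied. The sketch below chains map() and orElse() on the result of findFirst(). The method name firstStartingWith and the fallback text are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;

class OptionalDemo {
    // Returns the first matching name in upper case, or a fallback if none matches.
    static String firstStartingWith(List<String> names, String prefix) {
        return names.stream()
                .filter(name -> name.startsWith(prefix))
                .findFirst()               // Optional<String>, possibly empty
                .map(String::toUpperCase)  // applied only if a value is present
                .orElse("No name found");
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("John", "Jane", "Jack", "Jill");
        System.out.println(firstStartingWith(names, "Ja")); // JANE
        System.out.println(firstStartingWith(names, "Z"));  // No name found
    }
}
```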
Parallel Streams
One of the standout features of Streams is their ability to operate in parallel, enabling more efficient data processing on multi-core processors.
By invoking the parallelStream() method instead of stream(), you can process elements in parallel:
    names.parallelStream()
         .forEach(System.out::println); // Runs in parallel; output order is not guaranteed
However, parallelism should be used cautiously. It’s most effective when the data is large and operations are computationally intensive. Small datasets or simple operations might not benefit from parallelism and could even perform worse.
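A hedged sketch of a parallel pipeline follows. It avoids shared mutable state: one method sums with a built-in reduction, the other collects into a list, which keeps the encounter order of the source even when run in parallel. The names ParallelDemo, sumOfSquares, and upperCasedInOrder are illustrative, and no performance gain is claimed for inputs this small.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

class ParallelDemo {
    // Sums the squares of 1..n; sum() combines partial results safely across threads.
    static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    // collect() on an ordered source preserves encounter order, even in parallel.
    static List<String> upperCasedInOrder(List<String> names) {
        return names.parallelStream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000)); // 333833500
        System.out.println(upperCasedInOrder(Arrays.asList("John", "Jane", "Jack", "Jill")));
    }
}
```

Whether the parallel version is faster depends on data size and per-element cost, so it is worth measuring before switching from stream() to parallelStream().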
Use Cases of Streams
- Filtering and transforming data: Streams make it easy to filter out unwanted elements, transform elements, and create new collections.
- Aggregating results: Operations like reduce() can help aggregate data, such as summing or finding the maximum/minimum value in a collection.
- Processing large data sets: Streams can be used to process large data efficiently, especially in conjunction with parallelism.
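The aggregation use case can be sketched with reduce() on a list of numbers. The class AggregationDemo and its method names are hypothetical. The sum uses an identity value, while max and min return an Optional because an empty list has no maximum or minimum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class AggregationDemo {
    // Sum with an identity value: an empty list yields 0.
    static int sum(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // Maximum without an identity: empty Optional for an empty list.
    static Optional<Integer> max(List<Integer> values) {
        return values.stream().reduce(Integer::max);
    }

    // Minimum, built the same way.
    static Optional<Integer> min(List<Integer> values) {
        return values.stream().reduce(Integer::min);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 9, 4);
        System.out.println(sum(values));       // 16
        System.out.println(max(values).get()); // 9
        System.out.println(min(values).get()); // 3
    }
}
```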