Streams

The Stream API, introduced in Java 8 in the java.util.stream package, lets you process sequences of elements in a functional, declarative style. Instead of writing explicit loops, you describe what should happen to the data (filter it, transform it, aggregate it), which often makes code more concise and readable.

In Java, a Stream is a sequence of elements that can be processed sequentially or in parallel. It’s not a data structure, but rather an abstraction over a flow of data coming from some source. Streams allow you to perform operations like filtering, mapping, and reducing on that data without modifying the underlying source.

A Stream does not store data. Instead, it operates on data provided by a source, such as a collection, an array, or an I/O channel.

Creating a Stream

Streams can be created from various data sources. The most common ones are collections (such as List and Set) and arrays; I/O sources, such as the lines of a file, can also be streamed.

Stream from a Collection

We can create a stream from a collection by calling the stream() method.
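As a minimal sketch of this, the example below calls stream() on a List and filters it. The class name CollectionStreamExample, the helper namesStartingWith, and the sample names are illustrative assumptions, not part of the original text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectionStreamExample {
    // Hypothetical helper: returns the names that start with the given prefix,
    // in the order they appear in the list.
    static List<String> namesStartingWith(List<String> names, String prefix) {
        Stream<String> stream = names.stream(); // stream() is declared on Collection
        return stream.filter(name -> name.startsWith(prefix))
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Alice", "Bob", "Anna");
        System.out.println(namesStartingWith(names, "A")); // prints [Alice, Anna]
        System.out.println(names);                         // source list is unchanged
    }
}
```

The same stream() call is available on Set and other Collection types.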

Stream from an Array

We can convert an array to a stream using Arrays.stream().
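A short sketch of both common cases follows: an object array yields a Stream<T>, while an int[] yields the primitive specialization IntStream. The class and method names are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ArrayStreamExample {
    // Arrays.stream(int[]) returns an IntStream, which offers sum() directly.
    static int sum(int[] values) {
        IntStream stream = Arrays.stream(values);
        return stream.sum();
    }

    // Arrays.stream(T[]) returns a regular Stream<T>.
    static long countNonEmpty(String[] words) {
        Stream<String> stream = Arrays.stream(words);
        return stream.filter(word -> !word.isEmpty()).count();
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3, 4}));                // prints 10
        System.out.println(countNonEmpty(new String[] {"a", "", "b"})); // prints 2
    }
}
```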

Stream Operations

The Stream API provides many methods to manipulate data in a pipeline style. These operations can be categorized into two types:

Intermediate Operations

These operations are lazy and return a new Stream. They allow you to transform or filter data but do not perform the actual computation until a terminal operation is invoked.

  • filter(Predicate<T> condition): Filters elements based on a given condition.
  • map(Function<T, R> mapper): Transforms elements into another form.
  • distinct(): Removes duplicate elements.
  • sorted(): Sorts elements in natural order; the overload sorted(Comparator<T> comparator) sorts according to the given comparator.

Example
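The following sketch chains the four intermediate operations listed above and also illustrates laziness: the filter predicate is not called until a terminal operation runs. The class name, method names, and sample data are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IntermediateOpsExample {
    // Keeps even numbers, squares them, drops duplicates, and sorts ascending.
    static List<Integer> evenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .distinct()
                .sorted()
                .collect(Collectors.toList()); // terminal operation starts the work
    }

    // Returns how often the predicate ran {before, after} the terminal operation.
    static int[] predicateCalls(List<Integer> numbers) {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pipeline = numbers.stream()
                .filter(n -> { calls.incrementAndGet(); return true; });
        int before = calls.get();              // nothing has been evaluated yet
        pipeline.collect(Collectors.toList()); // now every element is tested
        return new int[] {before, calls.get()};
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(Arrays.asList(4, 1, 2, 4, 3, -2))); // prints [4, 16]
        int[] calls = predicateCalls(Arrays.asList(1, 2, 3));
        System.out.println(calls[0] + " then " + calls[1]);                // prints 0 then 3
    }
}
```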

Terminal Operations

These operations trigger the actual computation and produce a result or a side-effect. Once a terminal operation is invoked, the Stream is considered consumed and cannot be used further.

  • forEach(Consumer<T> action): Performs an action for each element.
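  • collect(Collector collector): Gathers the elements into a result container; for example, Collectors.toList() produces a List and Collectors.toSet() produces a Set.
  • reduce(BinaryOperator<T> accumulator): Performs a reduction on the elements of the stream using an associative accumulation function. This form returns an Optional<T>, which is empty if the stream has no elements.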
  • count(): Returns the number of elements in the stream.
  • anyMatch(), allMatch(), noneMatch(): Check if any, all, or none of the elements satisfy a given condition.

Example
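A minimal sketch of the terminal operations listed above, including what happens when a consumed stream is used again. All class and method names are illustrative assumptions. The Stream documentation says an implementation may throw IllegalStateException when it detects reuse, and the standard JDK implementation does so.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

class TerminalOpsExample {
    // reduce with only an accumulator returns Optional, since the stream may be empty.
    static Optional<Integer> sum(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    static long countGreaterThan(List<Integer> numbers, int threshold) {
        return numbers.stream().filter(n -> n > threshold).count();
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseIsRejected(List<Integer> numbers) {
        Stream<Integer> stream = numbers.stream();
        stream.forEach(n -> { });     // first terminal operation consumes the stream
        try {
            stream.forEach(n -> { }); // second use of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        numbers.stream().forEach(System.out::println);               // prints 1 to 5, one per line
        System.out.println(sum(numbers));                            // prints Optional[15]
        System.out.println(countGreaterThan(numbers, 2));            // prints 3
        System.out.println(numbers.stream().anyMatch(n -> n > 4));   // prints true
        System.out.println(numbers.stream().allMatch(n -> n > 0));   // prints true
        System.out.println(numbers.stream().noneMatch(n -> n > 5));  // prints true
        System.out.println(reuseIsRejected(numbers));                // prints true
    }
}
```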

Working with Optional

Several terminal operations return an Optional<T> because there may be no value to return. For example, findFirst() and findAny() return an empty Optional when the stream has no elements, such as when a preceding filter() matched nothing.
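As a sketch of how such a result might be handled, the example below uses findFirst() and then unwraps the Optional with orElse() and ifPresent(). The class name, the helper firstLongerThan, and the fallback string "none" are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class OptionalExample {
    // Hypothetical helper: the first word longer than the given length, if any.
    static Optional<String> firstLongerThan(List<String> words, int length) {
        return words.stream()
                .filter(word -> word.length() > length)
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("kiwi", "banana", "cherry");

        Optional<String> match = firstLongerThan(words, 4);
        System.out.println(match.orElse("none"));                      // prints banana

        // No word is longer than 10 characters, so the Optional is empty.
        System.out.println(firstLongerThan(words, 10).orElse("none")); // prints none

        // ifPresent runs the action only when a value exists.
        match.ifPresent(word -> System.out.println("found " + word));  // prints found banana
    }
}
```

Using orElse() or ifPresent() avoids calling get() on an empty Optional, which would throw NoSuchElementException.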

Parallel Streams

One of the standout features of Streams is their ability to operate in parallel, enabling more efficient data processing on multi-core processors.

By invoking the parallelStream() method on a collection instead of stream(), or by calling parallel() on an existing stream, you can process elements in parallel.
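A minimal sketch of a parallel pipeline is shown here. The class name, the helper sumOfSquares, and the input range 1 to 1000 are illustrative assumptions. The operations are stateless and the result does not depend on processing order, which is what makes this pipeline safe to run in parallel.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelStreamExample {
    // Sums the squares of all numbers; work may be split across several threads.
    static long sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToLong(n -> (long) n * n) // stateless, no shared mutable state
                .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            numbers.add(i);
        }
        System.out.println(sumOfSquares(numbers)); // prints 333833500
    }
}
```

This example is about correctness rather than speed; an input this small is unlikely to run faster in parallel.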

However, parallelism should be used cautiously. It’s most effective when the data is large and operations are computationally intensive. Small datasets or simple operations might not benefit from parallelism and could even perform worse.

Use Cases of Streams

  • Filtering and transforming data: Streams make it easy to filter out unwanted elements, transform elements, and create new collections.
  • Aggregating results: Operations like reduce() can help aggregate data, such as summing or finding the maximum/minimum value in a collection.
  • Processing large data sets: Streams can process large data sets efficiently, and parallel streams can help when the per-element work is computationally intensive.
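The aggregation use case can be sketched as follows, using reduce() with an identity value for a sum and max() with a comparator for the largest element. The class and method names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class AggregationExample {
    // reduce with an identity value returns a plain result, even for an empty list.
    static int total(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // max returns Optional because an empty list has no largest element.
    static Optional<Integer> largest(List<Integer> values) {
        return values.stream().max(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(7, 3, 9, 4);
        System.out.println(total(values));   // prints 23
        System.out.println(largest(values)); // prints Optional[9]
    }
}
```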