Order of Intermediate Operations – Streams
The order of intermediate operations in a stream pipeline can affect its performance. If intermediate operations that reduce the size of the stream are performed earlier in the pipeline, fewer elements need to be processed by the subsequent operations.
Moving intermediate operations such as filter(), distinct(), dropWhile(), limit(), skip(), and takeWhile() earlier in the pipeline can be beneficial, as they can all decrease the number of elements in the stream, provided that the reordering does not change the result of the pipeline. Example 16.4 implements two stream pipelines at (1) and (2) that create a list of CD titles, skipping the first three CDs. The map() operation transforms each CD to its title, resulting in an output stream with element type String. The example shows how the number of elements processed by the map() operation can be reduced if the skip() operation is performed before the map() operation (p. 921).
Example 16.4 Order of Intermediate Operations
import java.util.List;

public final class OrderOfOperations {
  public static void main(String[] args) {
    // CD and CD.cdList are defined elsewhere and are not shown in this example.
    List<CD> cdList = CD.cdList;

    // Map before skip.
    List<String> cdTitles1 = cdList
        .stream()                        // (1)
        .map(cd -> {                     // Map applied to all elements.
          System.out.println("Mapping: " + cd.title());
          return cd.title();
        })
        .skip(3)                         // Skip afterwards.
        .toList();
    System.out.println(cdTitles1);
    System.out.println();

    // Skip before map preferable.
    List<String> cdTitles2 = cdList
        .stream()                        // (2)
        .skip(3)                         // Skip first.
        .map(cd -> {                     // Map not applied to the first 3 elements.
          System.out.println("Mapping: " + cd.title());
          return cd.title();
        })
        .toList();
    System.out.println(cdTitles2);
  }
}
Output from the program:
Mapping: Java Jive
Mapping: Java Jam
Mapping: Lambda Dancing
Mapping: Keep on Erasing
Mapping: Hot Generics
[Keep on Erasing, Hot Generics]

Mapping: Keep on Erasing
Mapping: Hot Generics
[Keep on Erasing, Hot Generics]
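The same effect can be observed with filter(). The following is a minimal, self-contained sketch rather than part of Example 16.4: the class name FilterBeforeMap, the TITLES list, and the countMapCalls() helper are assumptions made for illustration. It counts how many times the mapping function runs when filter() is placed after or before map(). The counter is a side effect used only to make the calls visible in a sequential stream.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public final class FilterBeforeMap {
    // Hypothetical sample data, standing in for the CD titles.
    static final List<String> TITLES = List.of(
        "Java Jive", "Java Jam", "Lambda Dancing", "Keep on Erasing", "Hot Generics");

    // Runs one of two equivalent pipelines and returns how often map() was applied.
    static int countMapCalls(boolean filterFirst) {
        AtomicInteger mapCalls = new AtomicInteger();
        Function<String, String> countingMap = title -> {
            mapCalls.incrementAndGet();          // Demonstration-only side effect.
            return title.toUpperCase();
        };
        if (filterFirst) {
            // Filter on the original titles, then map only the survivors.
            TITLES.stream()
                  .filter(title -> title.startsWith("Java"))
                  .map(countingMap)
                  .toList();
        } else {
            // Map every title, then filter on the mapped values.
            TITLES.stream()
                  .map(countingMap)
                  .filter(title -> title.startsWith("JAVA"))
                  .toList();
        }
        return mapCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("map() first:    " + countMapCalls(false) + " calls");
        System.out.println("filter() first: " + countMapCalls(true) + " calls");
    }
}
```

Both pipelines produce the same list, but the mapping function runs for all five titles in the first case and only for the two matching titles in the second.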
Non-interfering and Stateless Behavioral Parameters
One of the main goals of the Stream API is that the code for a stream pipeline should execute and produce the same results whether the stream elements are processed sequentially or in parallel. In order to achieve this goal, certain constraints are placed on the behavioral parameters—that is, on the lambda expressions and method references that are implementations of the functional interface parameters in stream operations. These behavioral parameters, as the name implies, allow the behavior of a stream operation to be customized. For example, the predicate supplied to the filter() operation defines the criteria for filtering the elements.
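As a minimal sketch of what a behavioral parameter looks like, the following passes a lambda expression to filter() and a method reference to map(), and runs the same pipeline both sequentially and in parallel. The class name BehavioralParameters, the TITLES list, and the javaTitles() helper are assumptions made for illustration.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

public final class BehavioralParameters {
    // Hypothetical sample data.
    static final List<String> TITLES = List.of("Java Jive", "Java Jam", "Lambda Dancing");

    static List<String> javaTitles(boolean parallel) {
        // Behavioral parameter for filter(): a stateless, non-interfering predicate.
        Predicate<String> startsWithJava = title -> title.startsWith("Java");
        Stream<String> stream = parallel ? TITLES.parallelStream() : TITLES.stream();
        return stream
            .filter(startsWithJava)
            .map(String::toUpperCase)    // Behavioral parameter as a method reference.
            .toList();                   // Preserves encounter order in both modes.
    }

    public static void main(String[] args) {
        List<String> sequential = javaTitles(false);
        List<String> parallel = javaTitles(true);
        System.out.println(sequential);
        System.out.println(sequential.equals(parallel));
    }
}
```

Because neither behavioral parameter reads or writes shared mutable state, the two execution modes yield equal lists.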
Most stream operations require that their behavioral parameters are non-interfering and stateless. A non-interfering behavioral parameter does not change the stream data source during the execution of the pipeline, as such changes might lead to non-deterministic results. The exception to this is when the data source is concurrent, which guarantees that the source is thread-safe. A stateless behavioral parameter does not depend on any state that can change during the execution of the pipeline, as access to such state might not be thread-safe.
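The following sketch illustrates interference with a non-concurrent data source. The class name InterferenceDemo and its two helper methods are assumptions made for illustration. The first method adds elements to an ArrayList from within the pipeline that is traversing it. The spliterator of ArrayList is fail-fast on a best-effort basis, so this typically ends in a ConcurrentModificationException, but that outcome should not be relied upon. The second method computes the new elements first and modifies the list only after the pipeline has completed.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public final class InterferenceDemo {
    // Interfering: the lambda modifies the list that is the stream's data source.
    static boolean interferingPipelineFails() {
        List<String> titles = new ArrayList<>(List.of("Java Jive", "Java Jam"));
        try {
            titles.stream().forEach(title -> titles.add(title + " (remix)"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;                 // Typical outcome for an ArrayList source.
        }
    }

    // Non-interfering: the source is modified only after the pipeline has finished.
    static List<String> withRemixes() {
        List<String> titles = new ArrayList<>(List.of("Java Jive", "Java Jam"));
        List<String> remixes = titles.stream()
                                     .map(title -> title + " (remix)")
                                     .toList();
        titles.addAll(remixes);
        return titles;
    }

    public static void main(String[] args) {
        System.out.println("Interfering pipeline failed: " + interferingPipelineFails());
        System.out.println(withRemixes());
    }
}
```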
If the constraints are violated, all bets are off: the stream pipeline might compute incorrect results or fail with an exception. In addition to these constraints, care should be taken when introducing side effects via behavioral parameters, as these might introduce other concurrency-related problems during parallel execution of the pipeline.
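As a hedged illustration of the problems that side effects can cause, the sketch below computes squares in a parallel stream in two ways. The class name SideEffectDemo and both helper methods are assumptions made for illustration. The first method accumulates results through a side effect on a shared list. A synchronized wrapper keeps this thread-safe, but the order of the elements depends on thread scheduling, and with a plain ArrayList elements could be lost or an exception thrown. The second method has no side effects and lets the terminal operation build the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public final class SideEffectDemo {
    // Side effect: every thread adds to a shared list; the element order is unpredictable.
    static List<Integer> squaresViaSideEffect(int n) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(i -> shared.add(i * i));
        return shared;
    }

    // No side effect: the terminal operation assembles the list in encounter order.
    static List<Integer> squaresViaTerminalOperation(int n) {
        return IntStream.range(0, n).parallel().map(i -> i * i).boxed().toList();
    }

    public static void main(String[] args) {
        List<Integer> viaSideEffect = squaresViaSideEffect(1000);
        List<Integer> viaTerminalOperation = squaresViaTerminalOperation(1000);
        System.out.println("Same size:  " + (viaSideEffect.size() == viaTerminalOperation.size()));
        // The following comparison may print true or false from run to run.
        System.out.println("Same order: " + viaSideEffect.equals(viaTerminalOperation));
    }
}
```

The version without side effects is usually the better choice, as it needs no synchronization and yields the elements in encounter order regardless of how the work is split.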
The aspects of intermediate operations mentioned in this subsection will become clearer as we fill in the details in subsequent sections.