05/23/2023 – Oracle Certified Professional Java SE 17 Developer

Order of Intermediate Operations – Streams

Order of Intermediate Operations

The order of intermediate operations in a stream pipeline can affect its performance. If intermediate operations that reduce the size of the stream are performed earlier in the pipeline, fewer elements need to be processed by the subsequent operations.

Moving intermediate operations such as filter(), distinct(), dropWhile(), limit(), skip(), and takeWhile() earlier in the pipeline can be beneficial, as they can all reduce the number of elements passed on to subsequent operations. Example 16.4 implements two stream pipelines at (1) and (2), each creating a list of CD titles that omits the first three CDs. The map() operation transforms each CD to its title, resulting in an output stream with element type String. The example shows how the number of elements processed by the map() operation can be reduced if the skip() operation is performed before the map() operation (p. 921).

Example 16.4 Order of Intermediate Operations

import java.util.List;

public final class OrderOfOperations {
  public static void main(String[] args) {
    List<CD> cdList = CD.cdList;

    // Map before skip.
    List<String> cdTitles1 = cdList
        .stream()                    // (1)
        .map(cd -> {                 // Map applied to all elements.
          System.out.println("Mapping: " + cd.title());
          return cd.title();
        })
        .skip(3)                     // Skip afterwards.
        .toList();
    System.out.println(cdTitles1);
    System.out.println();

    // Skip before map preferable.
    List<String> cdTitles2 = cdList
        .stream()                    // (2)
        .skip(3)                     // Skip first.
        .map(cd -> {                 // Map not applied to the first 3 elements.
          System.out.println("Mapping: " + cd.title());
          return cd.title();
        })
        .toList();
    System.out.println(cdTitles2);
  }
}

Output from the program:

Mapping: Java Jive
Mapping: Java Jam
Mapping: Lambda Dancing
Mapping: Keep on Erasing
Mapping: Hot Generics
[Keep on Erasing, Hot Generics]

Mapping: Keep on Erasing
Mapping: Hot Generics
[Keep on Erasing, Hot Generics]
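The same reasoning applies to filter(). The following is a minimal sketch, not part of Example 16.4: it assumes a hypothetical list of plain strings in place of the CD class, and the mapCalls counter exists only to make the number of map() invocations visible.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class FilterBeforeMap {

  // Hypothetical sample data standing in for the CD titles.
  static final List<String> TITLES = List.of(
      "Java Jive", "Java Jam", "Lambda Dancing", "Keep on Erasing", "Hot Generics");

  // Counts how often the mapper runs when map() precedes filter().
  static int mapCallsWhenMappingFirst() {
    AtomicInteger mapCalls = new AtomicInteger();   // For illustration only.
    TITLES.stream()
        .map(t -> { mapCalls.incrementAndGet(); return t.toUpperCase(); })
        .filter(t -> t.startsWith("JAVA"))          // Filter afterwards.
        .toList();
    return mapCalls.get();
  }

  // Counts how often the mapper runs when filter() precedes map().
  static int mapCallsWhenFilteringFirst() {
    AtomicInteger mapCalls = new AtomicInteger();   // For illustration only.
    TITLES.stream()
        .filter(t -> t.startsWith("Java"))          // Filter first.
        .map(t -> { mapCalls.incrementAndGet(); return t.toUpperCase(); })
        .toList();
    return mapCalls.get();
  }

  public static void main(String[] args) {
    System.out.println(mapCallsWhenMappingFirst());    // 5
    System.out.println(mapCallsWhenFilteringFirst());  // 2
  }
}
```

Both pipelines produce the same two titles, but filtering first means that the mapper runs only for the elements that survive the filter.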

Non-interfering and Stateless Behavioral Parameters

One of the main goals of the Stream API is that the code for a stream pipeline should execute and produce the same results whether the stream elements are processed sequentially or in parallel. In order to achieve this goal, certain constraints are placed on the behavioral parameters—that is, on the lambda expressions and method references that are implementations of the functional interface parameters in stream operations. These behavioral parameters, as the name implies, allow the behavior of a stream operation to be customized. For example, the predicate supplied to the filter() operation defines the criteria for filtering the elements.

Most stream operations require that their behavioral parameters are non-interfering and stateless. A non-interfering behavioral parameter does not change the stream data source during the execution of the pipeline, as such changes might lead to nondeterministic results. The exception to this is when the data source is concurrent, which guarantees that the source is thread-safe. A stateless behavioral parameter does not access any state that can change during the execution of the pipeline, as such access might not be thread-safe.
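Interference can be sketched as follows. The class and method names are hypothetical, and the sample list is an assumption made for illustration. An ArrayList detects the modification on a best-effort basis, so the exception shown here should not be relied on in general.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class InterferenceDemo {

  // Returns true if modifying the source during the pipeline was detected.
  static boolean interferenceDetected() {
    List<String> words = new ArrayList<>(List.of("en", "to", "tre"));
    try {
      // Interfering: the lambda adds to the list that the stream is reading.
      words.stream().forEach(w -> words.add(w + "!"));
      return false;
    } catch (ConcurrentModificationException e) {
      return true;                 // ArrayList noticed the modification.
    }
  }

  // Non-interfering: results go to a new list; the source is left untouched.
  static List<String> withoutInterference() {
    List<String> words = new ArrayList<>(List.of("en", "to", "tre"));
    return words.stream().map(w -> w + "!").toList();
  }

  public static void main(String[] args) {
    System.out.println(interferenceDetected());   // true
    System.out.println(withoutInterference());    // [en!, to!, tre!]
  }
}
```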

If these constraints are violated, all bets are off: the stream pipeline might compute incorrect results or fail outright. In addition to these constraints, care should be taken when introducing side effects via behavioral parameters, as these might introduce other concurrency-related problems during parallel execution of the pipeline.
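One way to avoid such side effects is to let the terminal operation build the result rather than having a lambda add to a shared collection. The following is a minimal sketch under that assumption; the class name and the squares() helper are hypothetical.

```java
import java.util.List;
import java.util.stream.IntStream;

class SideEffectFreeCollect {

  // Collects the squares of 1..n without touching any shared mutable state.
  static List<Integer> squares(int n) {
    return IntStream.rangeClosed(1, n)
        .parallel()               // Safe here: the lambda below is stateless.
        .map(i -> i * i)
        .boxed()
        .toList();                // The terminal operation builds the result.
  }

  public static void main(String[] args) {
    // Discouraged alternative (not run here): adding to a shared ArrayList
    // from forEach() in a parallel stream is not thread-safe.
    //   List<Integer> shared = new ArrayList<>();
    //   IntStream.rangeClosed(1, 5).parallel().forEach(i -> shared.add(i * i));

    System.out.println(squares(5));   // [1, 4, 9, 16, 25]
  }
}
```

Because the stream is ordered, toList() preserves the encounter order even though the elements are processed in parallel.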

The aspects of intermediate operations mentioned in this subsection will become clearer as we fill in the details in subsequent sections.

Streams from Collections – Streams

Streams from Collections

The default methods stream() and parallelStream() of the Collection interface create streams with collections as the data source. Collections are the only data source that provides the parallelStream() method to create a parallel stream directly. Otherwise, the parallel() intermediate operation must be used in the stream pipeline.

The following default methods for building streams from collections are defined in the java.util.Collection interface:

default Stream<E> stream()
default Stream<E> parallelStream()

Return a finite sequential stream or a possibly parallel stream with this collection as its source, respectively. Whether it is ordered or not depends on the collection used as the data source.
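The execution mode and the ordering of such streams can be inspected directly. The following is a minimal sketch; the class name, the isOrderedSource() helper, and the sample data are hypothetical. It uses the ORDERED characteristic of the collection's spliterator as an indication of whether the resulting stream has an encounter order.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Spliterator;

class CollectionStreams {

  // True if the collection's spliterator reports an encounter order.
  static boolean isOrderedSource(Collection<?> source) {
    return source.spliterator().hasCharacteristics(Spliterator.ORDERED);
  }

  public static void main(String[] args) {
    List<String> list = new ArrayList<>(List.of("en", "to", "tre"));

    System.out.println(list.stream().isParallel());             // false
    System.out.println(list.parallelStream().isParallel());     // true for ArrayList
    System.out.println(list.stream().parallel().isParallel());  // true

    System.out.println(isOrderedSource(list));                  // true
    System.out.println(isOrderedSource(new HashSet<>(list)));   // false
  }
}
```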

We have already seen examples of creating streams from lists and sets, and several more examples can be found in the subsequent sections.

The code below illustrates two points about streams and their data sources. If the data source is modified before the terminal operation is initiated, the changes will be reflected in the stream. A stream is created at (2) with a list of CDs as the data source. Before a terminal operation is initiated on this stream at (4), an element is added to the underlying data source list at (3). Note that the list created at (1) is modifiable. The count() operation correctly reports the number of elements processed in the stream pipeline.

List<CD> listOfCDS = new ArrayList<>(List.of(CD.cd0, CD.cd1));       // (1)
Stream<CD> cdStream = listOfCDS.stream();                            // (2)
listOfCDS.add(CD.cd2);                                               // (3)
System.out.println(cdStream.count());                                // (4) 3
// System.out.println(cdStream.count());             // (5) IllegalStateException

Trying to initiate an operation on a stream whose elements have already been consumed results in a java.lang.IllegalStateException. This case is illustrated at (5). The elements in the cdStream were consumed after the terminal operation at (4). A new stream must be created on the data source before any stream operations can be run.
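Both points can be tried out without the CD class. The following is a minimal self-contained sketch that assumes plain strings as elements; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StreamReuse {

  // Adds an element after the stream is created but before count() runs.
  static long countAfterLateAdd() {
    List<String> titles = new ArrayList<>(List.of("Java Jive", "Java Jam"));
    Stream<String> stream = titles.stream();
    titles.add("Lambda Dancing");   // Source modified before the terminal operation.
    return stream.count();          // The added element is included.
  }

  // Returns true if a second terminal operation on the same stream is rejected.
  static boolean secondUseFails() {
    Stream<String> stream = List.of("Java Jive", "Java Jam").stream();
    stream.count();                 // Consumes the stream.
    try {
      stream.count();               // The stream has already been operated upon.
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(countAfterLateAdd());   // 3
    System.out.println(secondUseFails());      // true
  }
}
```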

To create a stream on the entries in a Map, a collection view can be used. In the code below, a Map is created at (1) and populated with some entries. An entry view on the map is obtained at (2) and used as a data source at (3) to create an unordered sequential stream. The terminal operation at (4) returns the number of entries in the map.

Map<Integer, String> dataMap = new HashMap<>();                     // (1)
dataMap.put(1, "en"); dataMap.put(2, "to");
dataMap.put(3, "tre"); dataMap.put(4, "fire");
long numOfEntries = dataMap
    .entrySet()                                                     // (2)
    .stream()                                                       // (3)
    .count();                                                       // (4) 4
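The key view and the entry view can be used for more than counting. The following is a minimal sketch that reuses the map contents from the code above; the class and helper method names are hypothetical. The values are sorted explicitly because a HashMap does not define an encounter order.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapViewStreams {

  // Builds the same map as in the example above.
  static Map<Integer, String> sampleMap() {
    Map<Integer, String> dataMap = new HashMap<>();
    dataMap.put(1, "en");  dataMap.put(2, "to");
    dataMap.put(3, "tre"); dataMap.put(4, "fire");
    return dataMap;
  }

  // Values whose keys are even, in sorted order.
  static List<String> valuesForEvenKeys(Map<Integer, String> map) {
    return map.entrySet().stream()          // Stream of Map.Entry objects.
        .filter(e -> e.getKey() % 2 == 0)
        .map(Map.Entry::getValue)
        .sorted()
        .toList();
  }

  // Sum of all keys, using the key view as the data source.
  static int sumOfKeys(Map<Integer, String> map) {
    return map.keySet().stream().mapToInt(Integer::intValue).sum();
  }

  public static void main(String[] args) {
    Map<Integer, String> dataMap = sampleMap();
    System.out.println(valuesForEvenKeys(dataMap));   // [fire, to]
    System.out.println(sumOfKeys(dataMap));           // 10
  }
}
```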

In the examples in this subsection, the call to the stream() method can be replaced by a call to the parallelStream() method. The stream will then execute in parallel, without the need for any additional synchronization code (p. 1009).
