Filtering – Streams
Filters are stream operations that select elements based on some criteria, usually specified as a predicate. This section discusses different ways of filtering elements, selecting unique elements, skipping elements at the head of a stream, and truncating a stream.
The following methods are defined in the Stream<T> interface, and analogous methods are also defined in the IntStream, LongStream, and DoubleStream interfaces:
// Filtering using a predicate.
Stream<T> filter(Predicate<? super T> predicate)
Returns a stream consisting of the elements of this stream that match the given non-interfering, stateless predicate.
This is a stateless intermediate operation that can reduce the stream size, but does not change the stream element type or the encounter order of the stream.
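As a minimal sketch of the filter() operation described above, the following code keeps only the even numbers of a list. The class name FilterSketch and the method evens() are illustrative names, not part of the stream API.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterSketch {
    // Keeps only the even numbers; surviving elements stay in encounter order.
    static List<Integer> evens(List<Integer> numbers) {
        return numbers.stream()
                      .filter(n -> n % 2 == 0)   // stateless, non-interfering predicate
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evens(List.of(5, 2, 8, 3, 6)));  // [2, 8, 6]
    }
}
```

The element type is still Integer and the relative order of 2, 8, and 6 is unchanged; only the number of elements differs.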
// Taking and dropping elements using a predicate.
default Stream<T> takeWhile(Predicate<? super T> predicate)
default Stream<T> dropWhile(Predicate<? super T> predicate)
The takeWhile() method puts an element from the input stream into its output stream, if it matches the predicate—that is, if the predicate returns the value true for this element. In this case, we say that the takeWhile() method takes the element.
The dropWhile() method discards an element from its input stream, if it matches the predicate—that is, if the predicate returns the value true for this element. In this case, we say that the dropWhile() method drops the element.
For an ordered stream:
The takeWhile() method takes elements from the input stream as long as an element matches the predicate, after which it short-circuits the stream processing.
The dropWhile() method drops elements from the input stream as long as an element matches the predicate, after which it passes through the remaining elements to the output stream.
In short, both methods find the longest prefix of elements to take or drop from the input stream, respectively.
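The prefix behavior for an ordered stream can be sketched as follows, with filter() shown for contrast. The class and method names are hypothetical, and the predicate n < 5 is chosen only for illustration. Both methods require Java 9 or later.

```java
import java.util.List;
import java.util.stream.Collectors;

class PrefixSketch {
    // Takes the longest prefix of elements that are less than 5.
    static List<Integer> takeSmall(List<Integer> input) {
        return input.stream().takeWhile(n -> n < 5).collect(Collectors.toList());
    }

    // Drops the longest prefix of elements that are less than 5.
    static List<Integer> dropSmall(List<Integer> input) {
        return input.stream().dropWhile(n -> n < 5).collect(Collectors.toList());
    }

    // Selects every element that is less than 5, wherever it occurs.
    static List<Integer> filterSmall(List<Integer> input) {
        return input.stream().filter(n -> n < 5).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 10, 4, 5);
        System.out.println(takeSmall(input));    // [1, 2, 3]
        System.out.println(dropSmall(input));    // [10, 4, 5]
        System.out.println(filterSmall(input));  // [1, 2, 3, 4]
    }
}
```

The element 4 matches the predicate but comes after 10, the first element that does not match, so takeWhile() never reaches it and dropWhile() passes it through.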
For an unordered stream, where the predicate matches some but not all elements in the input stream:
The elements taken by the takeWhile() method or dropped by the dropWhile() method are nondeterministic; that is, any subset of matching elements can be taken or dropped, respectively, including the empty set.
If the predicate matches all elements in the input stream, regardless of whether the stream is ordered or unordered:
The takeWhile() method takes all elements; that is, the result is the same as the input stream.
The dropWhile() method drops all elements; that is, the result is the empty stream.
If the predicate matches no elements in the input stream, regardless of whether the stream is ordered or unordered:
The takeWhile() method takes no elements; that is, the result is the empty stream.
The dropWhile() method drops no elements; that is, the result is the same as the input stream.
Note that the takeWhile() method is a short-circuiting stateful intermediate operation, whereas the dropWhile() method is a stateful intermediate operation.
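The following sketch illustrates the short-circuiting nature of takeWhile() on an infinite ordered stream, together with the cases where the predicate matches all or none of the elements. All class and method names are hypothetical, and powersOfTwoBelow() assumes a modest bound so that the int values do not overflow.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ShortCircuitSketch {
    // takeWhile() stops the infinite stream at the first power of two >= bound.
    // Assumption: bound <= 2^30, so doubling never overflows before the stop.
    static List<Integer> powersOfTwoBelow(int bound) {
        return IntStream.iterate(1, n -> n * 2)
                        .takeWhile(n -> n < bound)
                        .boxed()
                        .collect(Collectors.toList());
    }

    static List<Integer> takeWhilePositive(List<Integer> input) {
        return input.stream().takeWhile(n -> n > 0).collect(Collectors.toList());
    }

    static List<Integer> dropWhilePositive(List<Integer> input) {
        return input.stream().dropWhile(n -> n > 0).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(powersOfTwoBelow(100));               // [1, 2, 4, 8, 16, 32, 64]
        // Predicate matches all elements.
        System.out.println(takeWhilePositive(List.of(1, 2, 3))); // [1, 2, 3]
        System.out.println(dropWhilePositive(List.of(1, 2, 3))); // []
        // Predicate matches no elements.
        System.out.println(takeWhilePositive(List.of(-1, -2)));  // []
        System.out.println(dropWhilePositive(List.of(-1, -2)));  // [-1, -2]
    }
}
```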
// Selecting distinct elements.
Stream<T> distinct()
Returns a stream consisting of the distinct elements of this stream, where no two elements are equal according to the equals() method. The element class is therefore expected to override the Object.equals() method. The operation also uses the hashCode() method to keep track of the elements, so this method should be overridden as well, consistently with equals().
For ordered streams, the first occurrence of a duplicated element is selected in the encounter order; this is called the stability guarantee. This stateful operation is particularly expensive for a parallel ordered stream, which incurs buffering overhead to ensure the stability guarantee. There is no such guarantee for an unordered stream: which occurrence of a duplicated element is selected is unspecified.
This stateful intermediate operation changes the stream size, but not the stream element type.
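The role of equals() and hashCode() in the distinct() operation can be sketched with a small value class. The Tag class below is hypothetical: it treats two tags as equal when their names match regardless of case, and overrides hashCode() to agree with that definition.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class DistinctSketch {
    // Hypothetical element class: equality ignores the case of the name.
    static final class Tag {
        final String name;
        Tag(String name) { this.name = name; }

        @Override public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).name.equalsIgnoreCase(name);
        }
        // Must agree with equals(): equal tags produce equal hash codes.
        @Override public int hashCode() {
            return name.toLowerCase(Locale.ROOT).hashCode();
        }
        @Override public String toString() { return name; }
    }

    // For an ordered stream, the first occurrence of each duplicate is kept.
    static List<String> distinctNames(List<String> names) {
        return names.stream()
                    .map(Tag::new)
                    .distinct()
                    .map(Tag::toString)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(
            distinctNames(List.of("Java", "java", "Kotlin", "JAVA", "kotlin")));
        // [Java, Kotlin]
    }
}
```

Because the list is an ordered source, the spellings "Java" and "Kotlin" survive, as they are the first occurrences in encounter order.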
// Skipping elements.
Stream<T> skip(long n)
Returns a stream consisting of the remaining elements of this stream after discarding the first n elements of the stream in encounter order. If this stream has fewer than n elements, an empty stream is returned.
This stateful operation is expensive for a parallel ordered stream, because it must skip not just any n elements, but the first n elements in encounter order.
This is a stateful intermediate operation that changes the stream size, but not the stream element type.
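A minimal sketch of skip() on an ordered stream follows, including the case where the stream has fewer than n elements. The class name SkipSketch and the method afterFirst() are illustrative.

```java
import java.util.List;
import java.util.stream.Collectors;

class SkipSketch {
    // Discards the first n elements in encounter order and keeps the rest.
    static List<Integer> afterFirst(List<Integer> input, long n) {
        return input.stream().skip(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 4, 5);
        System.out.println(afterFirst(input, 2));   // [3, 4, 5]
        System.out.println(afterFirst(input, 10));  // []
    }
}
```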
// Truncating a stream.
Stream<T> limit(long maxSize)
Returns a stream consisting of elements from this stream, truncating the length of the returned stream to be no longer than the value of the parameter maxSize.
This stateful operation is expensive for a parallel ordered stream, because it must pass not just any maxSize elements, but the first maxSize elements in encounter order, from the input stream to the output stream.
This is a short-circuiting, stateful intermediate operation.
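The limit() operation can be sketched as below. The page() helper is a hypothetical illustration of one common way to combine skip() and limit(); it is not part of the stream API.

```java
import java.util.List;
import java.util.stream.Collectors;

class LimitSketch {
    // Keeps at most maxSize elements from the head of the stream.
    static List<Integer> firstN(List<Integer> input, long maxSize) {
        return input.stream().limit(maxSize).collect(Collectors.toList());
    }

    // Hypothetical helper: returns the zero-based page of the given size.
    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        return items.stream()
                    .skip((long) pageIndex * pageSize)  // drop the earlier pages
                    .limit(pageSize)                    // keep one page at most
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstN(List.of(1, 2, 3, 4, 5), 3));   // [1, 2, 3]
        System.out.println(firstN(List.of(1, 2, 3, 4, 5), 10));  // [1, 2, 3, 4, 5]
        List<String> letters = List.of("a", "b", "c", "d", "e", "f", "g");
        System.out.println(page(letters, 1, 3));                 // [d, e, f]
        System.out.println(page(letters, 2, 3));                 // [g]
    }
}
```

If the stream has fewer than maxSize elements, all of them pass through, which is why the last page above holds a single element.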
Using Generator Functions to Build Infinite Streams – Streams
The generate() and iterate() methods of the core stream interfaces can be used to create infinite sequential streams that are unordered or ordered, respectively.
Infinite streams need to be truncated explicitly in order for the terminal operation to complete execution, or the operation will not terminate. Some operations must process all elements of the stream in order to produce their results, for example, the sorted() stateful intermediate operation and the reduce() terminal operation. The limit(maxSize) intermediate operation can be used to limit the number of elements that are available for processing from a stream.
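The need to truncate before an operation that consumes the whole stream can be sketched as follows. The class name, the method name, and the decreasing sequence are chosen only for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TruncateSketch {
    // Truncates the infinite stream first, then sorts the finite remainder.
    static List<Integer> smallestFirst(int count) {
        return Stream.iterate(10, n -> n - 3)   // 10, 7, 4, 1, -2, ...
                     .limit(count)              // must come before sorted()
                     .sorted()
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(smallestFirst(4));   // [1, 4, 7, 10]
    }
}
```

If the order of limit() and sorted() were swapped, the sorting step would try to buffer every element of the infinite stream and the pipeline would never produce a result.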
Generate
The generate() method accepts a supplier that generates the elements of the infinite stream.
IntSupplier supplier = () -> (int) (6.0 * Math.random()) + 1; // (1)
IntStream diceStream = IntStream.generate(supplier); // (2)
diceStream.limit(5) // (3)
          .forEach(i -> System.out.print(i + " ")); // (4) e.g., 2 4 5 2 6
The IntSupplier at (1) generates a random integer from 1 to 6 to simulate the throw of a die every time it is executed. The supplier is passed to the generate() method at (2) to create an infinite unordered IntStream whose values simulate throwing a die. In the pipeline comprising (3) and (4), the number of values in the IntStream is limited to 5 at (3) by the limit() intermediate operation, and the value of each throw is printed by the forEach() terminal operation at (4). We can expect five values from 1 to 6 to be printed when the pipeline is executed; the actual values vary from run to run.
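When repeatable results are wanted, for example in a test, one possible variation is to draw the values from a java.util.Random created with a fixed seed. This is a sketch under that assumption; the class name, the method rollDice(), and its parameters are hypothetical.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class DiceSketch {
    // Simulates the given number of die throws; the same seed repeats the same throws.
    static List<Integer> rollDice(int numberOfThrows, long seed) {
        Random random = new Random(seed);
        return IntStream.generate(() -> random.nextInt(6) + 1)  // values 1 to 6
                        .limit(numberOfThrows)                  // truncate the infinite stream
                        .boxed()
                        .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(rollDice(5, 42L));  // five values from 1 to 6
    }
}
```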
Iterate
The iterate() method accepts a seed value and a unary operator. The method generates the elements of the infinite ordered stream iteratively: It applies the operator to the previous element to generate the next element, where the first element is the seed value.
In the code below, the seed value of 1 is passed to the iterate() method at (2), together with the unary operator uop defined at (1) that increments its argument by 2. The first element is 1 and the second element is the result of the unary operator applied to 1, and so on. The limit() operation limits the stream to five values. We can expect the forEach() operation to print the first five odd numbers.
IntUnaryOperator uop = n -> n + 2; // (1)
IntStream oddNums = IntStream.iterate(1, uop); // (2)
oddNums.limit(5)
       .forEach(i -> System.out.print(i + " ")); // 1 3 5 7 9
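The seed and the unary operator need not be simple numbers. As a sketch of the same iterate-then-limit pattern, the following code carries a pair of values from one element to the next in order to produce Fibonacci numbers. The class name, the method name, and the use of a long[] as the pair are illustrative choices.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateSketch {
    // Each element is a pair {current, next}; the operator shifts the pair forward.
    static List<Long> firstFibonacci(int count) {
        return Stream.iterate(new long[] {0, 1},
                              p -> new long[] {p[1], p[0] + p[1]})
                     .limit(count)          // truncate the infinite stream
                     .map(p -> p[0])        // keep the current value of each pair
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstFibonacci(8));  // [0, 1, 1, 2, 3, 5, 8, 13]
    }
}
```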
The following stream pipeline will really go bananas if the stream is not truncated by the limit() operation:
Stream.iterate("ba", b -> b + "na")
      .limit(5)
      .forEach(System.out::println);