Filtering – Streams

Filtering

Filters are stream operations that select elements based on some criteria, usually specified as a predicate. This section discusses different ways of filtering elements, selecting unique elements, skipping elements at the head of a stream, and truncating a stream.

The following methods are defined in the Stream<T> interface, and analogous methods are also defined in the IntStream, LongStream, and DoubleStream interfaces:

// Filtering using a predicate.
Stream<T> filter(Predicate<? super T> predicate)

Returns a stream consisting of the elements of this stream that match the given non-interfering, stateless predicate.

This is a stateless intermediate operation that can change the stream size, but not the stream element type or the encounter order of the stream.
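As a minimal sketch of the filter() method in use, the example below keeps only the strings that are longer than a given length. The class name FilterSketch, the method longerThan(), and the sample words are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class FilterSketch {
    // Keeps only the words longer than minLength, preserving encounter order.
    static List<String> longerThan(List<String> words, int minLength) {
        // The predicate is stateless and does not modify the stream source.
        Predicate<String> isLong = w -> w.length() > minLength;
        return words.stream()
                    .filter(isLong)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("map", "filter", "skip", "distinct", "limit");
        System.out.println(longerThan(words, 4)); // [filter, distinct, limit]
    }
}
```

The result has fewer elements than the source, but the element type (String) and the relative order of the surviving elements are unchanged.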

// Taking and dropping elements using a predicate.
default Stream<T> takeWhile(Predicate<? super T> predicate)
default Stream<T> dropWhile(Predicate<? super T> predicate)

The takeWhile() method puts an element from the input stream into its output stream if the element matches the predicate, that is, if the predicate returns true for this element. In this case, we say that the takeWhile() method takes the element.

The dropWhile() method discards an element from its input stream if the element matches the predicate, that is, if the predicate returns true for this element. In this case, we say that the dropWhile() method drops the element.

For an ordered stream:

The takeWhile() method takes elements from the input stream as long as an element matches the predicate, after which it short-circuits the stream processing.

The dropWhile() method drops elements from the input stream as long as an element matches the predicate, after which it passes through the remaining elements to the output stream.

In short, both methods find the longest prefix of elements to take or drop from the input stream, respectively.
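The prefix behavior on an ordered stream can be sketched as follows. The class PrefixSketch and its helper methods are hypothetical, and the example assumes Java 9 or later, where takeWhile() and dropWhile() are available. A filter() call with the same predicate is included for contrast.

```java
import java.util.List;
import java.util.stream.Collectors;

class PrefixSketch {
    // Takes the longest prefix of elements that are below the bound.
    static List<Integer> takeBelow(List<Integer> xs, int bound) {
        return xs.stream().takeWhile(x -> x < bound).collect(Collectors.toList());
    }

    // Drops the longest prefix of elements that are below the bound.
    static List<Integer> dropBelow(List<Integer> xs, int bound) {
        return xs.stream().dropWhile(x -> x < bound).collect(Collectors.toList());
    }

    // Selects every element below the bound, regardless of position.
    static List<Integer> filterBelow(List<Integer> xs, int bound) {
        return xs.stream().filter(x -> x < bound).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(1, 2, 5, 3, 8);
        System.out.println(takeBelow(xs, 4));   // [1, 2]
        System.out.println(dropBelow(xs, 4));   // [5, 3, 8]
        System.out.println(filterBelow(xs, 4)); // [1, 2, 3]
    }
}
```

The element 3 matches the predicate, but it comes after 5, the first element that does not match. It is therefore neither taken by takeWhile() nor dropped by dropWhile(), whereas filter() selects it.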

For an unordered stream, where the predicate matches some but not all elements in the input stream:

The elements taken by the takeWhile() method or dropped by the dropWhile() method are nondeterministic; that is, any subset of matching elements can be taken or dropped, respectively, including the empty set.

If the predicate matches all elements in the input stream, regardless of whether the stream is ordered or unordered:

The takeWhile() method takes all elements; that is, the result is the same as the input stream.

The dropWhile() method drops all elements; that is, the result is the empty stream.

If the predicate matches no elements in the input stream, regardless of whether the stream is ordered or unordered:

The takeWhile() method takes no elements; that is, the result is the empty stream.

The dropWhile() method drops no elements; that is, the result is the same as the input stream.
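The two boundary cases, where the predicate matches all elements or no elements, can be checked with a small sketch. The class EdgeCaseSketch and the generic helpers take() and drop() are hypothetical names, and Java 9 or later is assumed.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class EdgeCaseSketch {
    static <T> List<T> take(List<T> xs, Predicate<? super T> p) {
        return xs.stream().takeWhile(p).collect(Collectors.toList());
    }

    static <T> List<T> drop(List<T> xs, Predicate<? super T> p) {
        return xs.stream().dropWhile(p).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(2, 4, 6);

        // The predicate matches all elements.
        System.out.println(take(xs, x -> x % 2 == 0)); // [2, 4, 6]
        System.out.println(drop(xs, x -> x % 2 == 0)); // []

        // The predicate matches no elements.
        System.out.println(take(xs, x -> x > 10));     // []
        System.out.println(drop(xs, x -> x > 10));     // [2, 4, 6]
    }
}
```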

Note that the takeWhile() method is a short-circuiting stateful intermediate operation, whereas the dropWhile() method is a stateful intermediate operation.

// Selecting distinct elements.
Stream<T> distinct()

Returns a stream consisting of the distinct elements of this stream, where no two elements are equal according to the Object.equals() method; that is, the method assumes that the elements override the Object.equals() method. It also uses the hashCode() method to keep track of the elements, and this method should also be overridden from the Object class.

For ordered streams, the first occurrence of a duplicated element in the encounter order is selected. This is called the stability guarantee. The operation is particularly expensive for a parallel ordered stream, where ensuring the stability guarantee entails buffering overhead. There is no such guarantee for an unordered stream: which occurrence of a duplicated element is selected is not specified.

This stateful intermediate operation can change the stream size, but not the stream element type.
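The reliance on equals() and hashCode() can be illustrated with a small value class. In the sketch below, the class Tag is hypothetical and is assumed, purely for illustration, to define equality on its name field only, ignoring the id field.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctSketch {
    // Hypothetical value class: two tags are equal if their names are equal.
    static final class Tag {
        final int id;
        final String name;

        Tag(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Tag && ((Tag) o).name.equals(name);
        }

        // Consistent with equals(): equal tags have equal hash codes.
        @Override public int hashCode() {
            return name.hashCode();
        }

        @Override public String toString() {
            return name + "#" + id;
        }
    }

    static List<Tag> unique(List<Tag> tags) {
        return tags.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tag> tags = List.of(new Tag(1, "java"), new Tag(2, "streams"), new Tag(3, "java"));
        System.out.println(unique(tags)); // [java#1, streams#2]
    }
}
```

Because the stream created from a list is ordered, the first occurrence (java#1) is kept and the later equal element (java#3) is discarded.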

// Skipping elements.
Stream<T> skip(long n)

Returns a stream consisting of the remaining elements of this stream after discarding the first n elements of the stream in encounter order. If this stream has fewer than n elements, an empty stream is returned.

This stateful operation can be expensive for a parallel ordered stream, because it must discard exactly the first n elements in encounter order, not just any n elements.

This is a stateful intermediate operation that can change the stream size, but not the stream element type.
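A short sketch of the skip() method, including the case where the stream has fewer than n elements. The class SkipSketch and the method skipFirst() are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

class SkipSketch {
    // Discards the first n elements in encounter order and returns the rest.
    static List<Integer> skipFirst(List<Integer> xs, long n) {
        return xs.stream().skip(n).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> xs = List.of(10, 20, 30, 40);
        System.out.println(skipFirst(xs, 2));  // [30, 40]
        System.out.println(skipFirst(xs, 10)); // []
    }
}
```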

// Truncating a stream.
Stream<T> limit(long maxSize)

Returns a stream consisting of elements from this stream, truncating the length of the returned stream to be no longer than the value of the parameter maxSize.

This stateful operation can be expensive for a parallel ordered stream, because it must pass exactly the first maxSize elements in encounter order from the input stream to the output stream.

This is a short-circuiting, stateful intermediate operation.
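Because limit() is short-circuiting, it can truncate an infinite stream so that a terminal operation completes. The sketch below illustrates this with Stream.iterate(). The class LimitSketch and its methods are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LimitSketch {
    // Truncates an infinite stream of powers of two to the first count elements.
    static List<Integer> firstPowersOfTwo(int count) {
        return Stream.iterate(1, x -> x * 2)
                     .limit(count)
                     .collect(Collectors.toList());
    }

    // Truncates a finite list to at most maxSize elements.
    static List<Integer> truncate(List<Integer> xs, long maxSize) {
        return xs.stream().limit(maxSize).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5));            // [1, 2, 4, 8, 16]
        System.out.println(truncate(List.of(1, 2, 3), 10)); // [1, 2, 3]
    }
}
```

When the stream has fewer than maxSize elements, as in the second call, all elements pass through unchanged.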
