Java 8 Collect

26 October 2014
By Gonçalo Marques
 
In this article we will see how to use the collect operation over Java 8 streams in order to implement functional-style reductions.

Introduction

As we have seen previously in Java 8 Reduction, the reduce operation relies on an accumulator function that, as the name states, accumulates the current reduction result while the stream elements are processed.

However, the aforementioned accumulator function should always return a new result. What if we need to perform a reduction using a mutable result holder that is reused while processing multiple stream elements? On top of that, we may also need to run our stream processing in parallel while keeping the reduction thread-safe. The collect operation provides the mechanism we are looking for, as we will see in the next sections.
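To make the difference concrete, here is a minimal sketch that concatenates strings both ways. The class and method names (ReduceVersusCollectSketch, concatWithReduce, concatWithCollect) are hypothetical and exist only for this illustration; the point is that reduce builds a new String for every element, while collect keeps appending to a mutable StringBuilder.

```java
import java.util.Arrays;
import java.util.List;

class ReduceVersusCollectSketch {

  // reduce: the accumulator returns a brand new String for every element
  static String concatWithReduce(List<String> words) {
    return words.stream().reduce("", (acc, word) -> acc + word);
  }

  // collect: a mutable StringBuilder is reused across many elements
  static String concatWithCollect(List<String> words) {
    return words.stream()
      .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
      .toString();
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("a", "b", "c");
    System.out.println(concatWithReduce(words));  // abc
    System.out.println(concatWithCollect(words)); // abc
  }
}
```

Both methods return the same text; they differ only in how many intermediate objects are created along the way.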

The Collect operation

Let us start by defining an illustrative reduction using the collect operation. We have a collection of integers. From that collection, we want to count how many integers are less than or equal to 10, and how many are greater than 10. Additionally, we will consider only even integers.

The reduction data source will be the following collection:

Reduction data source

List<Integer> integerList = Arrays.asList(1, 2, 3, 4,
  5, 6, 8, 9, 11, 13, 14, 15, 17, 18, 19, 20);

Now we define the result holder that will be used in the reduction (we will come back to this definition later):

Reduction result holder

class Counter {

  private int countLessOrEqualThanTen = 0;
  private int countGreaterThanTen = 0;

  public void accept(int value) {
    if (value > 10) {
      countGreaterThanTen++;
    } else {
      countLessOrEqualThanTen++;
    }
  }

  public void combine(Counter other) {
    countLessOrEqualThanTen += other.countLessOrEqualThanTen;
    countGreaterThanTen += other.countGreaterThanTen;
  }

  public int getCountLessOrEqualThanTen() {
    return countLessOrEqualThanTen;
  }

  public int getCountGreaterThanTen() {
    return countGreaterThanTen;
  }

}

Finally we use the result holder in the collect operation:

Collect operation

Counter counter = integerList.parallelStream()
  .filter((i) -> i % 2 == 0)
  .collect(Counter::new, Counter::accept, Counter::combine);

// Will print 4
System.out.println(counter.getCountLessOrEqualThanTen());

// Will print 3
System.out.println(counter.getCountGreaterThanTen());

The collect operation we have just used expects 3 arguments:

  1. A supplier: The supplier is responsible for creating new instances of the result holder. Note that we are using a parallel stream (obtained through the parallelStream operation), so multiple threads will process the stream in parallel. Each thread calls the supplier in order to obtain its own result holder instance, and then proceeds with the stream processing.
  2. An accumulator: The accumulator is responsible for accumulating the elements processed by each thread into that thread's result holder. A thread processes an element by passing it, together with the result holder, to the accumulator. In our case it is the accept method of our result holder that increments the respective counter (countLessOrEqualThanTen or countGreaterThanTen).
  3. A combiner: The combiner is responsible for merging result holders. Supposing that multiple threads are processing the stream in parallel, and each one previously acquired its own result holder via the supplier, the result holders must be merged when the processing threads conclude their work. In our example the combiner is the combine method. If we inspect the method body we can see that it adds the counters of another result holder instance to the counters of the current instance.
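The three roles described above can be sketched with explicitly typed functions instead of method references. This is only an illustration: the class name CollectArgumentsSketch, the method collectEven and the holdersCreated counter are assumptions made for this example, and the result holder is a plain ArrayList. The counter merely records how often the supplier is invoked; for a parallel stream that number depends on how the runtime splits the work, so no particular value should be expected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
import java.util.stream.Stream;

class CollectArgumentsSketch {

  // Records how many result holders the supplier created (illustration only)
  static final AtomicInteger holdersCreated = new AtomicInteger();

  static List<Integer> collectEven(List<Integer> source, boolean parallel) {
    holdersCreated.set(0);

    // Supplier: creates a fresh, empty result holder
    Supplier<List<Integer>> supplier = () -> {
      holdersCreated.incrementAndGet();
      return new ArrayList<>();
    };
    // Accumulator: folds one stream element into a result holder
    BiConsumer<List<Integer>, Integer> accumulator = (holder, value) -> holder.add(value);
    // Combiner: merges the second result holder into the first one
    BiConsumer<List<Integer>, List<Integer>> combiner = (left, right) -> left.addAll(right);

    Stream<Integer> stream = parallel ? source.parallelStream() : source.stream();
    return stream.filter(i -> i % 2 == 0).collect(supplier, accumulator, combiner);
  }

  public static void main(String[] args) {
    List<Integer> integerList = Arrays.asList(1, 2, 3, 4, 5, 6, 8, 9,
      11, 13, 14, 15, 17, 18, 19, 20);
    System.out.println(collectEven(integerList, false)); // [2, 4, 6, 8, 14, 18, 20]
    System.out.println("sequential holders: " + holdersCreated.get());
    System.out.println(collectEven(integerList, true));  // same elements, same order
    System.out.println("parallel holders: " + holdersCreated.get());
  }
}
```

Because the list is an ordered source, the merged result keeps the encounter order even when the stream runs in parallel.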

Note: We used the collect operation with a parallel stream in order to also cover the multi-threading aspects of this kind of reduction. The collect operation may naturally also be used with sequential streams. More information about parallel streams is available in the following article: Java 8 Parallel Streams.

We are passing method references into the collect operation. More information about method references is available in the following article: Java 8 Method References.

Collectors

Collectors implement the java.util.stream.Collector interface, which bundles the functions needed for performing the collect operation: a supplier, an accumulator and a combiner, together with a finisher that transforms the result holder into the final result.

In addition, the java.util.stream.Collectors class provides some useful factory methods that create Collectors suitable for common needs.
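As a sketch of how the same three functions can be packaged into a reusable Collector, the example below uses the Collector.of factory method. The class CounterCollectorSketch and the helper toCounter are hypothetical names, and the nested Counter is a compact copy of the article's result holder with one assumed change: combine returns the merged holder, because Collector.of expects its combiner to be a BinaryOperator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class CounterCollectorSketch {

  // Compact copy of the result holder so that the sketch compiles on its own
  static class Counter {
    int countLessOrEqualThanTen;
    int countGreaterThanTen;

    void accept(int value) {
      if (value > 10) {
        countGreaterThanTen++;
      } else {
        countLessOrEqualThanTen++;
      }
    }

    // Unlike the original combine method, this one returns the merged holder
    Counter combine(Counter other) {
      countLessOrEqualThanTen += other.countLessOrEqualThanTen;
      countGreaterThanTen += other.countGreaterThanTen;
      return this;
    }
  }

  // Hypothetical helper: supplier, accumulator and combiner in a single object
  static Collector<Integer, Counter, Counter> toCounter() {
    return Collector.of(Counter::new, Counter::accept, Counter::combine);
  }

  public static void main(String[] args) {
    List<Integer> integerList = Arrays.asList(1, 2, 3, 4, 5, 6, 8, 9,
      11, 13, 14, 15, 17, 18, 19, 20);
    Counter counter = integerList.parallelStream()
      .filter(i -> i % 2 == 0)
      .collect(toCounter());
    System.out.println(counter.countLessOrEqualThanTen); // 4
    System.out.println(counter.countGreaterThanTen);     // 3
  }
}
```

The benefit of this form is that the reduction can be passed around as a single value and reused as a nested collector inside other Collectors.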

Using the initial integer collection, the following Collector may be used in order to reduce it to a new collection containing only the even integers:

Collectors.toList

List<Integer> evenIntegers = integerList
  .stream()
  .filter((i) -> i % 2 == 0)
  .collect(Collectors.toList());

// Will print [2, 4, 6, 8, 14, 18, 20]
System.out.println(evenIntegers);

The Collectors.toList() operation will produce a collector that will accumulate the processed stream elements into a new List instance.

Another useful Collectors class method is the groupingBy method:

Collectors.groupingBy

Map<String, List<Integer>> evenOddMap = integerList
  .stream()
  .collect(Collectors.groupingBy((i) -> (i % 2 == 0) ? "even" : "odd"));

// Will print
// even = [2, 4, 6, 8, 14, 18, 20]
// odd  = [1, 3, 5, 9, 11, 13, 15, 17, 19]
System.out.println(evenOddMap);

The groupingBy operation will produce a map whose keys are the results of the provided lambda expression. The value associated with each key is the list of elements for which the lambda expression evaluated to that key.
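When the classification has only two outcomes, as in the even/odd case, the Collectors class also offers partitioningBy, which takes a predicate and produces a map keyed by Boolean. The following is a minimal sketch; the class PartitioningSketch and the method partitionByEven are names made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningSketch {

  // Splits the integers into even (key true) and odd (key false)
  static Map<Boolean, List<Integer>> partitionByEven(List<Integer> source) {
    return source.stream()
      .collect(Collectors.partitioningBy(i -> i % 2 == 0));
  }

  public static void main(String[] args) {
    List<Integer> integerList = Arrays.asList(1, 2, 3, 4, 5, 6, 8, 9,
      11, 13, 14, 15, 17, 18, 19, 20);
    Map<Boolean, List<Integer>> parts = partitionByEven(integerList);
    System.out.println(parts.get(true));  // [2, 4, 6, 8, 14, 18, 20]
    System.out.println(parts.get(false)); // [1, 3, 5, 9, 11, 13, 15, 17, 19]
  }
}
```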

Collectors may also be nested:

Nested Collectors

Map<String, List<Integer>> evenOddSquareMap = integerList
  .stream()
  .collect(Collectors.groupingBy(
             (i) -> (i % 2 == 0) ? "even" : "odd",
             Collectors.mapping(
               (i) -> i * i,
               Collectors.toList()
             )
           )
  );

// Will print
// even = [4, 16, 36, 64, 196, 324, 400]
// odd  = [1, 9, 25, 81, 121, 169, 225, 289, 361]
System.out.println(evenOddSquareMap);

The mapping collector is passed to groupingBy as a downstream collector: within each group, every element is mapped to its square before being accumulated into the resulting list.
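Downstream collectors are not limited to mapping. As a sketch, the counting example from the beginning of the article can also be expressed with groupingBy and Collectors.counting as the downstream collector. The class GroupingCountSketch, the method countEvenByRange and the two key strings are assumptions chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingCountSketch {

  // Counts the even integers in each range; the keys are illustrative labels
  static Map<String, Long> countEvenByRange(List<Integer> source) {
    return source.stream()
      .filter(i -> i % 2 == 0)
      .collect(Collectors.groupingBy(
        i -> (i > 10) ? "greaterThanTen" : "lessOrEqualThanTen",
        Collectors.counting()));
  }

  public static void main(String[] args) {
    List<Integer> integerList = Arrays.asList(1, 2, 3, 4, 5, 6, 8, 9,
      11, 13, 14, 15, 17, 18, 19, 20);
    Map<String, Long> counts = countEvenByRange(integerList);
    System.out.println(counts.get("lessOrEqualThanTen")); // 4
    System.out.println(counts.get("greaterThanTen"));     // 3
  }
}
```

Note that counting produces Long values, whereas the hand-written result holder used int counters.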

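Finally we show how to apply a reduction inside a collector: Collectors.reducing is used as the downstream collector of groupingBy, so the elements of each group are reduced to a single value. The reduce operation itself is detailed in the following article: Java 8 Reduction.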

Reduce after collector

Map<String, Integer> evenOddSumMap = integerList
  .stream()
  .collect(
    Collectors.groupingBy(
      (i) -> (i % 2 == 0) ? "even" : "odd",
      Collectors.reducing(0, (i1, i2) -> i1 + i2)
    )
  );

// Will print even=72, odd=93
System.out.println(evenOddSumMap);
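For the specific case of summing integers, the Collectors class also provides summingInt, which can replace the generic reducing collector. The sketch below should yield the same sums as the example above; the class GroupingSumSketch and the method sumByParity are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSumSketch {

  // Sums the integers of each group using the specialized summingInt collector
  static Map<String, Integer> sumByParity(List<Integer> source) {
    return source.stream()
      .collect(Collectors.groupingBy(
        i -> (i % 2 == 0) ? "even" : "odd",
        Collectors.summingInt(Integer::intValue)));
  }

  public static void main(String[] args) {
    List<Integer> integerList = Arrays.asList(1, 2, 3, 4, 5, 6, 8, 9,
      11, 13, 14, 15, 17, 18, 19, 20);
    Map<String, Integer> sums = sumByParity(integerList);
    System.out.println(sums.get("even")); // 72
    System.out.println(sums.get("odd"));  // 93
  }
}
```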

Reference

Reduction (The Java(TM) Tutorials - Collections - Aggregate Operations)
