Overview of Java Stream API Extensions


The Stream API, introduced in Java 8, is probably still the most important feature added to Java in the last several years. I think every Java developer gets an opportunity to use the Stream API during their career, or rather, I should say that you probably use it on a day-to-day basis. However, if you compare the built-in functional programming features with those of some other languages, for example Kotlin, you will quickly realize that the number of methods provided by the Stream API is very limited. Therefore, the community has created several libraries that extend the API offered by pure Java. Today I’m going to show the most interesting Stream API extensions offered by three popular Java libraries: StreamEx, jOOλ and Guava.

This article covers only sequential Streams. If you use parallel Streams, you won’t be able to leverage jOOλ, since it is dedicated to sequential Streams only.

Dependencies

Here’s the list of the current releases (at the time of writing) of all three libraries compared in this article.

<dependencies>
	<dependency>
		<groupId>one.util</groupId>
		<artifactId>streamex</artifactId>
		<version>0.7.0</version>
	</dependency>
	<dependency>
		<groupId>org.jooq</groupId>
		<artifactId>jool</artifactId>
		<version>0.9.13</version>
	</dependency>
	<dependency>
		<groupId>com.google.guava</groupId>
		<artifactId>guava</artifactId>
		<version>28.1-jre</version>
	</dependency>
</dependencies>

1. Zipping

When working with Java Streams in more advanced applications you will often process multiple streams, and they often contain different kinds of objects. One useful operation in that case is zipping. A zipping operation returns a stream that combines corresponding elements of the two given streams, that is, elements at the same position in each stream. Let’s consider two classes, Person and PersonAddress. Assuming we have two streams, the first containing only Person objects and the second containing PersonAddress objects, and the order of elements clearly indicates their association, we may zip them to create a new stream of objects containing all the fields from Person and PersonAddress. Here’s the diagram that illustrates the described scenario.

[Figure: streams-zipping.png]

Zipping is supported by all three libraries described here. Let’s begin with the Guava example. It provides only one method dedicated to zipping: the static zip method, which takes three parameters: the first stream, the second stream and a mapping function.

Stream<Person> s1 = Stream.of(
	new Person(1, "John", "Smith"),
	new Person(2, "Tom", "Hamilton"),
	new Person(3, "Paul", "Walker")
);
Stream<PersonAddress> s2 = Stream.of(
	new PersonAddress(1, "London", "Street1", "100"),
	new PersonAddress(2, "Manchester", "Street1", "101"),
	new PersonAddress(3, "London", "Street2", "200")
);
Stream<PersonDTO> s3 = Streams.zip(s1, s2, (p, pa) -> PersonDTO.builder()
	.id(p.getId())
	.firstName(p.getFirstName())
	.lastName(p.getLastName())
	.city(pa.getCity())
	.street(pa.getStreet())
	.houseNo(pa.getHouseNo()).build());
s3.forEach(dto -> {
	Assertions.assertNotNull(dto.getId());
	Assertions.assertNotNull(dto.getFirstName());
	Assertions.assertNotNull(dto.getCity());
});

Both StreamEx and jOOλ offer more possibilities for zipping than Guava. We can choose between static methods and non-static methods invoked on a given stream. Let’s take a look at how we may do it using the StreamEx zipWith method.

StreamEx<Person> s1 = StreamEx.of(
	new Person(1, "John", "Smith"),
	new Person(2, "Tom", "Hamilton"),
	new Person(3, "Paul", "Walker")
);
StreamEx<PersonAddress> s2 = StreamEx.of(
	new PersonAddress(1, "London", "Street1", "100"),
	new PersonAddress(2, "Manchester", "Street1", "101"),
	new PersonAddress(3, "London", "Street2", "200")
);
StreamEx<PersonDTO> s3 = s1.zipWith(s2, (p, pa) -> PersonDTO.builder()
	.id(p.getId())
	.firstName(p.getFirstName())
	.lastName(p.getLastName())
	.city(pa.getCity())
	.street(pa.getStreet())
	.houseNo(pa.getHouseNo()).build());
s3.forEach(dto -> {
	Assertions.assertNotNull(dto.getId());
	Assertions.assertNotNull(dto.getFirstName());
	Assertions.assertNotNull(dto.getCity());
});

The example for jOOλ is almost identical. We have a zip method called on a given stream.

Seq<Person> s1 = Seq.of(
	new Person(1, "John", "Smith"),
	new Person(2, "Tom", "Hamilton"),
	new Person(3, "Paul", "Walker"));
Seq<PersonAddress> s2 = Seq.of(
	new PersonAddress(1, "London", "Street1", "100"),
	new PersonAddress(2, "Manchester", "Street1", "101"),
	new PersonAddress(3, "London", "Street2", "200"));
Seq<PersonDTO> s3 = s1.zip(s2, (p, pa) -> PersonDTO.builder()
	.id(p.getId())
	.firstName(p.getFirstName())
	.lastName(p.getLastName())
	.city(pa.getCity())
	.street(pa.getStreet())
	.houseNo(pa.getHouseNo()).build());
s3.forEach(dto -> {
	Assertions.assertNotNull(dto.getId());
	Assertions.assertNotNull(dto.getFirstName());
	Assertions.assertNotNull(dto.getCity());
});
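
For comparison, plain Java has no zip operation on streams. The following is only a minimal sketch of the idea using iterators; ZipSketch and its zip helper are hypothetical names of mine, not part of the JDK or of any of the three libraries. It stops at the end of the shorter input, which is the behaviour Guava documents for Streams.zip; I would check the Javadoc of the other libraries before relying on the same behaviour there.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Stream;

// Hypothetical helper for illustration only.
final class ZipSketch {

    // Pairs elements by position and stops as soon as either stream runs out.
    static <A, B, R> List<R> zip(Stream<A> first, Stream<B> second,
                                 BiFunction<A, B, R> zipper) {
        Iterator<A> a = first.iterator();
        Iterator<B> b = second.iterator();
        List<R> result = new ArrayList<>();
        while (a.hasNext() && b.hasNext()) {
            result.add(zipper.apply(a.next(), b.next()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> zipped = zip(
                Stream.of("John", "Tom", "Paul"),
                Stream.of("London", "Manchester"),
                (name, city) -> name + " lives in " + city);
        System.out.println(zipped);
    }
}
```

Running main prints [John lives in London, Tom lives in Manchester]; the third name is dropped because the second stream has only two elements.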

2. Joining

A zipping operation merges elements from two different streams according to their order in those streams. What if we would like to associate elements based on their fields, like id, rather than their order in a stream? Something like a LEFT JOIN or RIGHT JOIN between two entities. The result of the operation should be the same as in the previous section: a new stream of objects containing all the fields from Person and PersonAddress. The described operation is illustrated in the picture below.

[Figure: streams-join.png]

When it comes to join operations, only jOOλ provides methods for that. Since it is designed for object-oriented queries, we may choose between many join options. For example, there are innerJoin, leftOuterJoin, rightOuterJoin and crossJoin methods. In the source code below you can see an example of innerJoin usage. This method takes two parameters: the stream to join and a predicate for matching elements from the first stream with elements from the joined stream. It returns a stream of tuples, so if we would like to create a new object based on the innerJoin result we should additionally invoke the map operation.

Seq<Person> s1 = Seq.of(
		new Person(1, "John", "Smith"),
		new Person(2, "Tom", "Hamilton"),
		new Person(3, "Paul", "Walker"));
Seq<PersonAddress> s2 = Seq.of(
		new PersonAddress(2, "London", "Street1", "100"),
		new PersonAddress(3, "Manchester", "Street1", "101"),
		new PersonAddress(1, "London", "Street2", "200"));
Seq<PersonDTO> s3 = s1.innerJoin(s2, (p, pa) -> p.getId().equals(pa.getId())).map(t -> PersonDTO.builder()
		.id(t.v1.getId())
		.firstName(t.v1.getFirstName())
		.lastName(t.v1.getLastName())
		.city(t.v2.getCity())
		.street(t.v2.getStreet())
		.houseNo(t.v2.getHouseNo()).build());
s3.forEach(dto -> {
	Assertions.assertNotNull(dto.getId());
	Assertions.assertNotNull(dto.getFirstName());
	Assertions.assertNotNull(dto.getCity());
});
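
To make the semantics concrete, here is a rough plain-JDK sketch of an inner join written as a nested loop with flatMap and filter. I have not checked how jOOλ implements innerJoin internally, so treat this only as an illustration of the result; InnerJoinSketch, Person and Address are simplified, hypothetical stand-ins for the classes used above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
final class InnerJoinSketch {

    // Simplified stand-ins for the article's Person and PersonAddress classes.
    static final class Person {
        final int id;
        final String firstName;
        Person(int id, String firstName) { this.id = id; this.firstName = firstName; }
    }

    static final class Address {
        final int personId;
        final String city;
        Address(int personId, String city) { this.personId = personId; this.city = city; }
    }

    // Nested-loop inner join: every left element is tested against every right element,
    // and only matching pairs are combined into a result.
    static <L, R, T> List<T> innerJoin(List<L> left, List<R> right,
                                       BiPredicate<L, R> on, BiFunction<L, R, T> combine) {
        return left.stream()
                .flatMap(l -> right.stream()
                        .filter(r -> on.test(l, r))
                        .map(r -> combine.apply(l, r)))
                .collect(Collectors.toList());
    }

    static List<String> demo() {
        List<Person> people = Arrays.asList(
                new Person(1, "John"), new Person(2, "Tom"), new Person(3, "Paul"));
        // Deliberately out of order, and with no address for Paul (id 3).
        List<Address> addresses = Arrays.asList(
                new Address(2, "London"), new Address(1, "Manchester"));
        return innerJoin(people, addresses,
                (p, a) -> p.id == a.personId,
                (p, a) -> p.firstName + " lives in " + a.city);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

demo() returns "John lives in Manchester" and "Tom lives in London": the addresses are matched by id regardless of their position, and Paul is dropped because an inner join keeps only matching pairs. The nested loop compares every pair, so for large inputs indexing one side in a Map first would likely be faster.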

3. Grouping

The next useful operation is grouping, which the Java Stream API supports only through the static method groupingBy in java.util.stream.Collectors (s1.collect(Collectors.groupingBy(PersonDTO::getCity))). As a result of executing such an operation on a stream you get a map whose keys are the values resulting from applying the grouping function to the input elements, and whose corresponding values are lists containing the input elements. This operation is a kind of aggregation, so you get a java.util.Map of java.util.List objects as a result, not a java.util.stream.Stream.
Both StreamEx and jOOλ provide methods for grouping streams. Let’s start with an example of the StreamEx groupingBy operation. Assuming we have an input stream of PersonDTO objects, we will group them by the person’s home city.

StreamEx<PersonDTO> s1 = StreamEx.of(
	PersonDTO.builder().id(1).firstName("John").lastName("Smith").city("London").street("Street1").houseNo("100").build(),
	PersonDTO.builder().id(2).firstName("Tom").lastName("Hamilton").city("Manchester").street("Street1").houseNo("101").build(),
	PersonDTO.builder().id(3).firstName("Paul").lastName("Walker").city("London").street("Street2").houseNo("200").build(),
	PersonDTO.builder().id(4).firstName("Joan").lastName("Collins").city("Manchester").street("Street2").houseNo("201").build()
);
Map<String, List<PersonDTO>> m = s1.groupingBy(PersonDTO::getCity);
Assertions.assertNotNull(m.get("London"));
Assertions.assertEquals(2, m.get("London").size());
Assertions.assertNotNull(m.get("Manchester"));
Assertions.assertEquals(2, m.get("Manchester").size());

The result of the similar jOOλ groupBy method is the same. It also returns java.util.List objects inside a map.

Seq<PersonDTO> s1 = Seq.of(
		PersonDTO.builder().id(1).firstName("John").lastName("Smith").city("London").street("Street1").houseNo("100").build(),
		PersonDTO.builder().id(2).firstName("Tom").lastName("Hamilton").city("Manchester").street("Street1").houseNo("101").build(),
		PersonDTO.builder().id(3).firstName("Paul").lastName("Walker").city("London").street("Street2").houseNo("200").build(),
		PersonDTO.builder().id(4).firstName("Joan").lastName("Collins").city("Manchester").street("Street2").houseNo("201").build()
);
Map<String, List<PersonDTO>> m = s1.groupBy(PersonDTO::getCity);
Assertions.assertNotNull(m.get("London"));
Assertions.assertEquals(2, m.get("London").size());
Assertions.assertNotNull(m.get("Manchester"));
Assertions.assertEquals(2, m.get("Manchester").size());
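
For reference, here is how the Collectors.groupingBy call mentioned at the beginning of this section looks in a small self-contained program. Resident is a hypothetical, stripped-down stand-in for PersonDTO. The second method shows a downstream collector (Collectors.counting()), which is handy when you need one value per group rather than the list itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class for illustration only.
final class GroupingSketch {

    // Simplified stand-in for the article's PersonDTO.
    static final class Resident {
        private final String name;
        private final String city;
        Resident(String name, String city) { this.name = name; this.city = city; }
        String getName() { return name; }
        String getCity() { return city; }
    }

    // Same shape as the library calls above: city -> list of elements.
    static Map<String, List<Resident>> byCity(List<Resident> residents) {
        return residents.stream().collect(Collectors.groupingBy(Resident::getCity));
    }

    // A downstream collector turns each group into a count instead of a list.
    static Map<String, Long> countByCity(List<Resident> residents) {
        return residents.stream()
                .collect(Collectors.groupingBy(Resident::getCity, Collectors.counting()));
    }

    static List<Resident> sample() {
        return Arrays.asList(
                new Resident("John", "London"), new Resident("Tom", "Manchester"),
                new Resident("Paul", "London"), new Resident("Joan", "Manchester"));
    }

    public static void main(String[] args) {
        byCity(sample()).forEach((city, residents) -> System.out.println(
                city + ": " + residents.stream().map(Resident::getName).collect(Collectors.toList())));
        System.out.println(countByCity(sample()));
    }
}
```

Both methods are terminal operations, so, as with the library variants, you have to call stream() on the resulting lists if you want to continue with stream processing.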

4. Multiple Concatenation

That’s a pretty simple scenario. The Java Stream API provides a static method for concatenation, but only for two streams. Sometimes it is convenient to concatenate multiple streams in a single step. Guava and jOOλ provide a dedicated method for that.
Here’s an example of calling the concat method with jOOλ:

Seq<Integer> s1 = Seq.of(1, 2, 3);
Seq<Integer> s2 = Seq.of(4, 5, 6);
Seq<Integer> s3 = Seq.of(7, 8, 9);
Seq<Integer> s4 = Seq.concat(s1, s2, s3);
Assertions.assertEquals(9, s4.count());

And here’s a similar example for Guava:

Stream<Integer> s1 = Stream.of(1, 2, 3);
Stream<Integer> s2 = Stream.of(4, 5, 6);
Stream<Integer> s3 = Stream.of(7, 8, 9);
Stream<Integer> s4 = Streams.concat(s1, s2, s3);
Assertions.assertEquals(9, s4.count());
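
With the plain JDK you can get a similar effect by putting the streams into a stream and flattening it. A minimal sketch, where ConcatSketch and concatAll are hypothetical names of mine:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only.
final class ConcatSketch {

    // Flattens any number of streams into one, keeping the order of the arguments.
    @SafeVarargs
    static <T> Stream<T> concatAll(Stream<T>... streams) {
        return Stream.of(streams).flatMap(s -> s);
    }

    public static void main(String[] args) {
        List<Integer> all = concatAll(Stream.of(1, 2, 3), Stream.of(4, 5, 6), Stream.of(7, 8, 9))
                .collect(Collectors.toList());
        System.out.println(all);
    }
}
```

Nesting Stream.concat calls works too, but the Stream.concat Javadoc advises caution with repeated concatenation, since accessing elements of a deeply concatenated stream can lead to deep call chains or even a StackOverflowError.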

5. Partitioning

The partitioning operation is very similar to grouping, but it divides the input stream into two lists or streams, where elements in the first list fulfill a given predicate, while elements in the second list do not.
The StreamEx partitioningBy method returns two List objects inside a Map.

StreamEx<PersonDTO> s1 = StreamEx.of(
		PersonDTO.builder().id(1).firstName("John").lastName("Smith").city("London").street("Street1").houseNo("100").build(),
		PersonDTO.builder().id(2).firstName("Tom").lastName("Hamilton").city("Manchester").street("Street1").houseNo("101").build(),
		PersonDTO.builder().id(3).firstName("Paul").lastName("Walker").city("London").street("Street2").houseNo("200").build(),
		PersonDTO.builder().id(4).firstName("Joan").lastName("Collins").city("Manchester").street("Street2").houseNo("201").build()
);
Map<Boolean, List<PersonDTO>> m = s1.partitioningBy(dto -> dto.getStreet().equals("Street1"));
Assertions.assertEquals(2, m.get(true).size());
Assertions.assertEquals(2, m.get(false).size());

In contrast to StreamEx, jOOλ returns two streams (Seq) inside a Tuple2 object. This approach has one big advantage over StreamEx: you can still invoke stream operations on the result without any conversions.

Seq<PersonDTO> s1 = Seq.of(
		PersonDTO.builder().id(1).firstName("John").lastName("Smith").city("London").street("Street1").houseNo("100").build(),
		PersonDTO.builder().id(2).firstName("Tom").lastName("Hamilton").city("Manchester").street("Street1").houseNo("101").build(),
		PersonDTO.builder().id(3).firstName("Paul").lastName("Walker").city("London").street("Street2").houseNo("200").build(),
		PersonDTO.builder().id(4).firstName("Joan").lastName("Collins").city("Manchester").street("Street2").houseNo("201").build()
);
Tuple2<Seq<PersonDTO>, Seq<PersonDTO>> t = s1.partition(dto -> dto.getStreet().equals("Street1"));
Assertions.assertEquals(2, t.v1.count());
Assertions.assertEquals(2, t.v2.count());
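
The JDK counterpart is Collectors.partitioningBy, which, like StreamEx, gives you a Map<Boolean, List<T>>. Here is a small sketch with plain strings standing in for the street field of PersonDTO (PartitionSketch and byStreet are hypothetical names):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class for illustration only.
final class PartitionSketch {

    // Splits the input into elements equal to `wanted` (key true) and all others (key false).
    static Map<Boolean, List<String>> byStreet(List<String> streets, String wanted) {
        return streets.stream()
                .collect(Collectors.partitioningBy(street -> street.equals(wanted)));
    }

    public static void main(String[] args) {
        List<String> streets = Arrays.asList("Street1", "Street1", "Street2", "Street2");
        Map<Boolean, List<String>> parts = byStreet(streets, "Street1");
        System.out.println(parts.get(true).size() + " match, " + parts.get(false).size() + " do not");
    }
}
```

Unlike groupingBy, the map returned by partitioningBy always contains both the true and the false key, even when one of the lists is empty.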

6. Aggregation

Of the three libraries, only jOOλ provides dedicated methods for stream aggregation. For example, we can calculate sum, avg or median. Since jOOλ is developed by the team behind jOOQ, it is targeted at object-oriented queries, and in fact provides many operations that correspond to SQL SELECT clauses.
The code fragment below illustrates how easily we can calculate the sum of a selected field in a stream of objects, using the ages of all persons as an example.

Seq<Person> s1 = Seq.of(
	new Person(1, "John", "Smith", 35),
	new Person(2, "Tom", "Hamilton", 45),
	new Person(3, "Paul", "Walker", 20)
);
Optional<Integer> sum = s1.sum(Person::getAge);
Assertions.assertEquals(100, sum.get());
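
If you don't want an extra dependency just for this, the primitive streams in the JDK cover sum, min, max and average; a median has to be written by hand. Below is a minimal sketch with the ages from the example above as plain integers. AggregationSketch and median are hypothetical names, and the rule used for even-sized input (averaging the two middle values) is my assumption and may differ from what a library returns.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

// Hypothetical example class for illustration only.
final class AggregationSketch {

    // Count, sum, min, max and average in a single pass over the data.
    static IntSummaryStatistics ageStats(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).summaryStatistics();
    }

    // Sorts a copy of the values and picks the middle one
    // (or the mean of the two middle ones for an even count).
    static double median(List<Integer> ages) {
        int[] sorted = ages.stream().mapToInt(Integer::intValue).sorted().toArray();
        if (sorted.length == 0) {
            throw new IllegalArgumentException("median of an empty list is undefined");
        }
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(35, 45, 20);
        IntSummaryStatistics stats = ageStats(ages);
        System.out.println("sum=" + stats.getSum() + ", avg=" + stats.getAverage()
                + ", median=" + median(ages));
    }
}
```

For the three ages used above the sum is 100 and the median is 35.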

7. Pairing

StreamEx allows you to process pairs of adjacent objects in the stream and apply a given function to them. This can be achieved with the pairMap function. In the code fragment below I’m calculating the sum of each pair of adjacent numbers in the stream.

StreamEx<Integer> s1 = StreamEx.of(1, 2, 1, 2, 1);
StreamEx<Integer> s2 = s1.pairMap(Integer::sum);
s2.forEach(i -> Assertions.assertEquals(3, i));
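
The JDK has no direct counterpart. One way to approximate pairMap for data that is already in a List is to stream over the indexes, as in this sketch (PairMapSketch is a hypothetical helper; StreamEx itself works on the stream directly and does not need random access):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper for illustration only.
final class PairMapSketch {

    // Applies mapper to each pair of neighbours (e0, e1), (e1, e2), ... of the list.
    // For lists with fewer than two elements the index range is empty, so the result is empty.
    static <T, R> List<R> pairMap(List<T> items, BiFunction<T, T, R> mapper) {
        return IntStream.range(0, items.size() - 1)
                .mapToObj(i -> mapper.apply(items.get(i), items.get(i + 1)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> sums = pairMap(Arrays.asList(1, 2, 1, 2, 1), Integer::sum);
        System.out.println(sums);
    }
}
```

With the input from the StreamEx example this prints [3, 3, 3, 3]: five elements produce four adjacent pairs.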

Summary

While Guava Streams is just a part of Google’s bigger library, StreamEx and jOOλ are strictly dedicated to lambda streams. Compared to the other libraries described in this article, jOOλ provides the largest number of features and operations. If you are looking for a library that helps you perform object-oriented query operations on streams, jOOλ is definitely for you. Unlike the others, it provides operations such as joining and aggregation. StreamEx also provides many useful operations for manipulating streams. It is not related to object-oriented queries and SQL, so you won’t find methods for out-of-order joins or aggregation there, which does not change the fact that it is a very useful library that is worth recommending. Guava provides a relatively small number of features for streams. However, if you already use it in your application, it can be a nice addition for manipulating streams. The source code snippets with usage examples can be found on GitHub in the repository https://github.com/piomin/sample-java-playground.git.

One thought on “Overview of Java Stream API Extensions”

  1. OMG, just use Ix java and stop suffering. IMO the underlying parallel coupling was a tremendous design error in Java streams, making it unnecessarily complex. I have never seen an actual use case, especially if you compare it with the number of times streams are used in sequential situations. Ix is much more complete, simpler and easier to extend.
    https://github.com/akarnokd/ixjava
