Creating a Pivot Table with Java Steams

Creating a Pivot Table with Java Steams

[51CTO.com Quick Translation] This time, let's see how to use Java 8 Streams to implement a pivot table. Usually, the raw data itself is not easy for people to understand, so we need to perform some data aggregation operations to identify various regular patterns in the raw data. The pivot table is such a tool, which uses aggregation methods to display various visual graphs and charts.

[[189685]]

In previous articles, we have shown how to use Java 8 Streams to perform SQL-like effects on raw data. Today's article builds on those examples. If you find these a bit difficult for you, I suggest you browse those two articles first through the following links.

https://dzone.com/articles/java-streams-groupingby-examples

https://dzone.com/articles/using-java-collectors

If you don't like this "raw data" approach to creating a pivot table and prefer to use Excel, I have provided you with an alternative solution, please refer to the following link:

http://www.novixys.com/blog/excel-pivot-table-using-apache-poi/

Representing CSV data as POJO

We use the following POJO (Plain Ordinary Java Object) to represent baseball players and their salaries.

public class Player {

private int year;

private String teamID;

private String lgID;

private String playerID;

private int salary;

// defined getters and setters here

}

The data comes from a simple CSV file that has no quoted fields, no multi-line fields, and no commas within a single field. We can use a simple regular expression pattern to parse the CSV file and load the data into a list. The data looks like this, and it has a total of about 26428 rows:

yearID, teamID, lgID, playerID, salary

198***TL,NL,barkele01,870000

198***TL,NL,bedrost01,550000

198***TL,NL,benedbr01,545000

198***TL,NL,campri01,633333

198***TL,NL,ceronri01,625000

...

We load the CSV data using Streams similar to the following code:

Pattern pattern = Pattern.compile(",");

try (BufferedReader in = new BufferedReader(new FileReader(filename));){

List<Player> players = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.toList());

}

Class that defines pivot table columns

We use the following class to define the columns of the pivot table. These columns are used to group data. If you are using SQL, these columns will appear in the "GROUP BY" statement.

public class YearTeam

{

public int year;

public String teamID;

public YearTeam(int year, String teamID) {

this.year = year;

this.teamID = teamID;

}

@Override

public boolean equals(Object other)

{

if (other == null) return false;

if ( this == other ) return true;

if (other instanceof YearTeam) {

YearTeam yt = (YearTeam)other;

if ( year == yt.year && teamID.equals(yt.teamID) )

return true;

}

return false;

}

@Override

public int hashCode()

{

int hash = 1;

hash = hash * 17 + year;

hash = hash * 31 + teamID.hashCode();

return hash;

}

@Override

public String toString()

{

StringBuilder sbuf = new StringBuilder();

sbuf.append('[').append(year).append(", ").append(teamID)

.append(']');

return sbuf.toString();

}

}

These fields are defined as "public" access properties for convenience only. For your own applications, you are free to make them "private" properties and add getters and/or setters as needed.

This class overrides equals() and hashCode() because it will be used as a key when storing a Map.

Grouping Data with Streams

Here we read the data from the CSV, create a POJO for each row, and group the data using the pivot column class defined above.

Map<YearTeam, List<Player>> grouped = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.groupingBy(x-> new YearTeam(x.getYear(), x.getTeamID())));

At this point, the data has been correctly collected in a Map and grouped by the specified column items.

Print pivot table to CSV

Let’s print the data from the pivot table to a CSV file so that we can load it into Excel for comparison. When printing the data, we use the summingLong() function. Java 8 Streams also provides an averagingLong() function that lets you average the data. If you need to get it all at once quickly, summarizingLong() will give you all the information. Enjoy!

CSV column headers

We use the values ​​of teamID as the column headers. Collecting and printing them is as follows. We use TreeSet to sort them in alphabetical order.

Set<String> teams = grouped

.keySet()

.stream()

.map(x -> x.teamID)

.collect(Collectors.toCollection(TreeSet::new));

System.out.print(',');

teams.stream().forEach(t -> System.out.print(t + ","));

System.out.println();

Print data

This creates and prints a complete pivot table. For the team's annual totals, we extract them from the list of athletes and perform a sum and print operation.

Set<Integer> years = grouped

.keySet()

.stream()

.map(x -> x.year)

.collect(Collectors.toSet());

years

.stream()

.forEach(y -> {

System.out.print(y + ",");

teams.stream().forEach(t -> {

YearTeam yt = new YearTeam(y, t);

List<Player> players = grouped.get(yt);

if ( players != null ) {

long total = players

.stream()

.collect(Collectors.summingLong(Player::getSalary));

System.out.print(total);

}

System.out.print(',');

});

System.out.println();

});

Compare output from Excel

We load the CSV file into Excel and output the following data:

Comparing them using Excel's built-in pivot table function shows that the data is identical. (As shown below, for some reason the column "MON" appears at the front of Excel. Perhaps it's one of the many magical "features". Anyway, the values ​​are identical.)

That, my friends, is one way to create a pivot table using simple collections in Java. You can find more cool uses for it!

Summarize

Pivot table is indeed a very useful data summary tool. Most data analysis software, including Excel, use it. Here, we learned how to use Java 8 Streams to create the same data structure. At the same time, we also used grouping and summing to achieve this function.

【Original Title】Making Pivot Table Using Java Streams (Author: Jay Sridhar )

Original link: https://dzone.com/articles/java-pivot-table-using-streams

[Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]

<<:  Aiti Tribe Story Collection (16): Best Practices for Technical People During Their Fatigue Period

>>:  In the tenth year, what choice did Gudong make?

Recommend

50 proposal skills that planners must have in 2020

Introduction丨Proposal is also a technical job A g...

Danger! The same old notch screen...

Looking into the future from the current point of...

Note to developers: Android M's permission mechanism

Introduction Before the release of Android M, the...

Image social networking is counterattacking general social networking

Preface: If we look back at the original intentio...

Inphic i5 quad-core official in-depth review, the box is also cute

Hello guys! I am back again! Previously, one of ou...

Can Cordyceps sinensis and American ginseng be eaten together?

Whether it is Cordyceps or American ginseng, they...

How do brands follow hot topics? 3 methods

In the short video operation tutorials or practic...

What is Wenchang Tower? How to choose Wenchang Tower?

Wenchang Tower is one of the common Feng Shui orn...

Tips for placing Qianchuan ads on Douyin stores

The short video sales model has become a trend, e...

ExoPlayer supports multiple media formats and streaming protocols

ExoPlayer Introduction ExoPlayer is an open sourc...