Creating a Pivot Table with Java Steams

Creating a Pivot Table with Java Steams

[51CTO.com Quick Translation] This time, let's see how to use Java 8 Streams to implement a pivot table. Usually, the raw data itself is not easy for people to understand, so we need to perform some data aggregation operations to identify various regular patterns in the raw data. The pivot table is such a tool, which uses aggregation methods to display various visual graphs and charts.

[[189685]]

In previous articles, we have shown how to use Java 8 Streams to perform SQL-like effects on raw data. Today's article builds on those examples. If you find these a bit difficult for you, I suggest you browse those two articles first through the following links.

https://dzone.com/articles/java-streams-groupingby-examples

https://dzone.com/articles/using-java-collectors

If you don't like this "raw data" approach to creating a pivot table and prefer to use Excel, I have provided you with an alternative solution, please refer to the following link:

http://www.novixys.com/blog/excel-pivot-table-using-apache-poi/

Representing CSV data as POJO

We use the following POJO (Plain Ordinary Java Object) to represent baseball players and their salaries.

public class Player {

private int year;

private String teamID;

private String lgID;

private String playerID;

private int salary;

// defined getters and setters here

}

The data comes from a simple CSV file that has no quoted fields, no multi-line fields, and no commas within a single field. We can use a simple regular expression pattern to parse the CSV file and load the data into a list. The data looks like this, and it has a total of about 26428 rows:

yearID, teamID, lgID, playerID, salary

198***TL,NL,barkele01,870000

198***TL,NL,bedrost01,550000

198***TL,NL,benedbr01,545000

198***TL,NL,campri01,633333

198***TL,NL,ceronri01,625000

...

We load the CSV data using Streams similar to the following code:

Pattern pattern = Pattern.compile(",");

try (BufferedReader in = new BufferedReader(new FileReader(filename));){

List<Player> players = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.toList());

}

Class that defines pivot table columns

We use the following class to define the columns of the pivot table. These columns are used to group data. If you are using SQL, these columns will appear in the "GROUP BY" statement.

public class YearTeam

{

public int year;

public String teamID;

public YearTeam(int year, String teamID) {

this.year = year;

this.teamID = teamID;

}

@Override

public boolean equals(Object other)

{

if (other == null) return false;

if ( this == other ) return true;

if (other instanceof YearTeam) {

YearTeam yt = (YearTeam)other;

if ( year == yt.year && teamID.equals(yt.teamID) )

return true;

}

return false;

}

@Override

public int hashCode()

{

int hash = 1;

hash = hash * 17 + year;

hash = hash * 31 + teamID.hashCode();

return hash;

}

@Override

public String toString()

{

StringBuilder sbuf = new StringBuilder();

sbuf.append('[').append(year).append(", ").append(teamID)

.append(']');

return sbuf.toString();

}

}

These fields are defined as "public" access properties for convenience only. For your own applications, you are free to make them "private" properties and add getters and/or setters as needed.

This class overrides equals() and hashCode() because it will be used as a key when storing a Map.

Grouping Data with Streams

Here we read the data from the CSV, create a POJO for each row, and group the data using the pivot column class defined above.

Map<YearTeam, List<Player>> grouped = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.groupingBy(x-> new YearTeam(x.getYear(), x.getTeamID())));

At this point, the data has been correctly collected in a Map and grouped by the specified column items.

Print pivot table to CSV

Let’s print the data from the pivot table to a CSV file so that we can load it into Excel for comparison. When printing the data, we use the summingLong() function. Java 8 Streams also provides an averagingLong() function that lets you average the data. If you need to get it all at once quickly, summarizingLong() will give you all the information. Enjoy!

CSV column headers

We use the values ​​of teamID as the column headers. Collecting and printing them is as follows. We use TreeSet to sort them in alphabetical order.

Set<String> teams = grouped

.keySet()

.stream()

.map(x -> x.teamID)

.collect(Collectors.toCollection(TreeSet::new));

System.out.print(',');

teams.stream().forEach(t -> System.out.print(t + ","));

System.out.println();

Print data

This creates and prints a complete pivot table. For the team's annual totals, we extract them from the list of athletes and perform a sum and print operation.

Set<Integer> years = grouped

.keySet()

.stream()

.map(x -> x.year)

.collect(Collectors.toSet());

years

.stream()

.forEach(y -> {

System.out.print(y + ",");

teams.stream().forEach(t -> {

YearTeam yt = new YearTeam(y, t);

List<Player> players = grouped.get(yt);

if ( players != null ) {

long total = players

.stream()

.collect(Collectors.summingLong(Player::getSalary));

System.out.print(total);

}

System.out.print(',');

});

System.out.println();

});

Compare output from Excel

We load the CSV file into Excel and output the following data:

Comparing them using Excel's built-in pivot table function shows that the data is identical. (As shown below, for some reason the column "MON" appears at the front of Excel. Perhaps it's one of the many magical "features". Anyway, the values ​​are identical.)

That, my friends, is one way to create a pivot table using simple collections in Java. You can find more cool uses for it!

Summarize

Pivot table is indeed a very useful data summary tool. Most data analysis software, including Excel, use it. Here, we learned how to use Java 8 Streams to create the same data structure. At the same time, we also used grouping and summing to achieve this function.

【Original Title】Making Pivot Table Using Java Streams (Author: Jay Sridhar )

Original link: https://dzone.com/articles/java-pivot-table-using-streams

[Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]

<<:  Aiti Tribe Story Collection (16): Best Practices for Technical People During Their Fatigue Period

>>:  In the tenth year, what choice did Gudong make?

Recommend

Wow! It turns out that our food contains so much "technology and hard work"!

Reviewer: Wang Guoyi Postdoctoral fellow in Nutri...

iPhone 6s and 6s Plus may look like this

[[131414]] In recent years, the release of each g...

The hotter the foot bath water, the better? The truth is——

Review expert: Wang Xuejiang, professor of pathop...

Watch is a powerful open source project abroad

Source code introduction: Watch is a powerful ope...

Understand Toutiao’s information flow ads in one article!

As the largest information flow platform at prese...