Creating a Pivot Table with Java Steams

Creating a Pivot Table with Java Steams

[51CTO.com Quick Translation] This time, let's see how to use Java 8 Streams to implement a pivot table. Usually, the raw data itself is not easy for people to understand, so we need to perform some data aggregation operations to identify various regular patterns in the raw data. The pivot table is such a tool, which uses aggregation methods to display various visual graphs and charts.

[[189685]]

In previous articles, we have shown how to use Java 8 Streams to perform SQL-like effects on raw data. Today's article builds on those examples. If you find these a bit difficult for you, I suggest you browse those two articles first through the following links.

https://dzone.com/articles/java-streams-groupingby-examples

https://dzone.com/articles/using-java-collectors

If you don't like this "raw data" approach to creating a pivot table and prefer to use Excel, I have provided you with an alternative solution, please refer to the following link:

http://www.novixys.com/blog/excel-pivot-table-using-apache-poi/

Representing CSV data as POJO

We use the following POJO (Plain Ordinary Java Object) to represent baseball players and their salaries.

public class Player {

private int year;

private String teamID;

private String lgID;

private String playerID;

private int salary;

// defined getters and setters here

}

The data comes from a simple CSV file that has no quoted fields, no multi-line fields, and no commas within a single field. We can use a simple regular expression pattern to parse the CSV file and load the data into a list. The data looks like this, and it has a total of about 26428 rows:

yearID, teamID, lgID, playerID, salary

198***TL,NL,barkele01,870000

198***TL,NL,bedrost01,550000

198***TL,NL,benedbr01,545000

198***TL,NL,campri01,633333

198***TL,NL,ceronri01,625000

...

We load the CSV data using Streams similar to the following code:

Pattern pattern = Pattern.compile(",");

try (BufferedReader in = new BufferedReader(new FileReader(filename));){

List<Player> players = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.toList());

}

Class that defines pivot table columns

We use the following class to define the columns of the pivot table. These columns are used to group data. If you are using SQL, these columns will appear in the "GROUP BY" statement.

public class YearTeam

{

public int year;

public String teamID;

public YearTeam(int year, String teamID) {

this.year = year;

this.teamID = teamID;

}

@Override

public boolean equals(Object other)

{

if (other == null) return false;

if ( this == other ) return true;

if (other instanceof YearTeam) {

YearTeam yt = (YearTeam)other;

if ( year == yt.year && teamID.equals(yt.teamID) )

return true;

}

return false;

}

@Override

public int hashCode()

{

int hash = 1;

hash = hash * 17 + year;

hash = hash * 31 + teamID.hashCode();

return hash;

}

@Override

public String toString()

{

StringBuilder sbuf = new StringBuilder();

sbuf.append('[').append(year).append(", ").append(teamID)

.append(']');

return sbuf.toString();

}

}

These fields are defined as "public" access properties for convenience only. For your own applications, you are free to make them "private" properties and add getters and/or setters as needed.

This class overrides equals() and hashCode() because it will be used as a key when storing a Map.

Grouping Data with Streams

Here we read the data from the CSV, create a POJO for each row, and group the data using the pivot column class defined above.

Map<YearTeam, List<Player>> grouped = in

.lines()

.skip(1)

.map(line -> {

String[] arr = pattern.split(line);

return new Player(Integer.parseInt(arr[0]),

arr[1],

arr[2],

arr[3],

Integer.parseInt(arr[4]));

})

.collect(Collectors.groupingBy(x-> new YearTeam(x.getYear(), x.getTeamID())));

At this point, the data has been correctly collected in a Map and grouped by the specified column items.

Print pivot table to CSV

Let’s print the data from the pivot table to a CSV file so that we can load it into Excel for comparison. When printing the data, we use the summingLong() function. Java 8 Streams also provides an averagingLong() function that lets you average the data. If you need to get it all at once quickly, summarizingLong() will give you all the information. Enjoy!

CSV column headers

We use the values ​​of teamID as the column headers. Collecting and printing them is as follows. We use TreeSet to sort them in alphabetical order.

Set<String> teams = grouped

.keySet()

.stream()

.map(x -> x.teamID)

.collect(Collectors.toCollection(TreeSet::new));

System.out.print(',');

teams.stream().forEach(t -> System.out.print(t + ","));

System.out.println();

Print data

This creates and prints a complete pivot table. For the team's annual totals, we extract them from the list of athletes and perform a sum and print operation.

Set<Integer> years = grouped

.keySet()

.stream()

.map(x -> x.year)

.collect(Collectors.toSet());

years

.stream()

.forEach(y -> {

System.out.print(y + ",");

teams.stream().forEach(t -> {

YearTeam yt = new YearTeam(y, t);

List<Player> players = grouped.get(yt);

if ( players != null ) {

long total = players

.stream()

.collect(Collectors.summingLong(Player::getSalary));

System.out.print(total);

}

System.out.print(',');

});

System.out.println();

});

Compare output from Excel

We load the CSV file into Excel and output the following data:

Comparing them using Excel's built-in pivot table function shows that the data is identical. (As shown below, for some reason the column "MON" appears at the front of Excel. Perhaps it's one of the many magical "features". Anyway, the values ​​are identical.)

That, my friends, is one way to create a pivot table using simple collections in Java. You can find more cool uses for it!

Summarize

Pivot table is indeed a very useful data summary tool. Most data analysis software, including Excel, use it. Here, we learned how to use Java 8 Streams to create the same data structure. At the same time, we also used grouping and summing to achieve this function.

【Original Title】Making Pivot Table Using Java Streams (Author: Jay Sridhar )

Original link: https://dzone.com/articles/java-pivot-table-using-streams

[Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]

<<:  Aiti Tribe Story Collection (16): Best Practices for Technical People During Their Fatigue Period

>>:  In the tenth year, what choice did Gudong make?

Recommend

How to become a great advertising marketer?

Those who have watched the American TV series &qu...

Product Operation: What should I do if DAU decreases?

This article would like to use a case of data dec...

Latest ranking of global app downloads: TikTok continues to top the list

[[337384]] Sensor Tower, a mobile application dat...

From PPT-onlyism to self-licking philosophy, how did Hisense crush Samsung?

Spring has arrived again. At the beginning of Mar...

Samsung S5 octa-core version exposed

In the past, Samsung's official documents acci...

Musk becomes the world's richest man, is the secret the "first principles"?

Musk is the richest man in the world, and he owns...

World Obesity Day丨"Puffy" or Strong, which type of obesity are you?

May 11 is the annual World Obesity Day. The homon...

Chocolate + soda =?

One day, I was eating chocolate and drinking soda...

Using an ancient seed from 1,000 years ago to cultivate a "mysterious tree"?

Author: Huang Yanhong Duan Yuechu According to a ...