With a 99.41% win rate that crushes human Xiangqi (Chinese chess) players, will AI really beat humans this time?

This time, AI beat humans again.

A research team led by Huawei Cloud AI CTO Dai Zonghong and Peking University AI Institute Assistant Professor Yang Yaodong has developed JiangJun (Chinese for "general"), an algorithm that beats human opponents at Xiangqi (Chinese chess) with a 99.41% win rate.

The related research paper, titled “JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games”, has been published on the preprint website arXiv.

Playing against human opponents and improving through repeated trial, error, and iteration is the usual way that AI agents based on reinforcement learning evolve. In recent years, because real-world scenarios usually involve several agents at once, researchers have extended their focus from single-agent to multi-agent settings.

In fact, multi-agent reinforcement learning has achieved remarkable success across a range of games, including Hide and Seek (a game on Steam), Go, StarCraft II, Dota 2, and Military Chess.

However, algorithms such as AlphaZero and AlphaGo, which train against the most recent version of their opponent, may fail to win consistently or to reach the desired state in games with non-transitive structure. Although this problem has been studied intensively in imperfect information games, it has received much less attention in perfect information games.

Perfect information game: a game in which every participant has accurate information about the characteristics, strategies, and payoff functions of all other participants, such as Xiangqi.

Imperfect information game: a game in which at least one participant has incomplete knowledge of the above information, such as Western Army Chess.

Overcoming non-transitivity in perfect information games remains an open research problem. Recent work uses Policy Space Response Oracles (PSRO) algorithms to find Nash equilibria, but these methods have not yet been explored in perfect information games.
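Non-transitivity is easiest to see in rock-paper-scissors, where no strategy is best overall. The Python sketch below is only an illustration and is not taken from the paper: the payoff table and the helper names (find_cycles, best_response, chase_latest) are assumptions made for this example. It lists the "A beats B beats C beats A" cycles and shows how a learner that always best-responds to its latest opponent ends up going in circles.

```python
# Rock-paper-scissors as a toy non-transitive game (illustrative only).
# PAYOFF[i][j] is the result for strategy i against strategy j:
# +1 win, 0 draw, -1 loss.
STRATEGIES = ["rock", "paper", "scissors"]
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def beats(a, b):
    """True if strategy a wins against strategy b."""
    return PAYOFF[STRATEGIES.index(a)][STRATEGIES.index(b)] > 0


def find_cycles(strategies):
    """Return ordered triples (a, b, c) with a beats b, b beats c, c beats a."""
    cycles = []
    for a in strategies:
        for b in strategies:
            for c in strategies:
                distinct = len({a, b, c}) == 3
                if distinct and beats(a, b) and beats(b, c) and beats(c, a):
                    cycles.append((a, b, c))
    return cycles


def best_response(opponent):
    """Pure strategy with the highest payoff against a fixed opponent."""
    j = STRATEGIES.index(opponent)
    return max(STRATEGIES, key=lambda s: PAYOFF[STRATEGIES.index(s)][j])


def chase_latest(start, steps):
    """Repeatedly best-respond to the most recent strategy only."""
    history = [start]
    for _ in range(steps):
        history.append(best_response(history[-1]))
    return history


cycles = find_cycles(STRATEGIES)
print(cycles)
print(chase_latest("rock", 3))  # ['rock', 'paper', 'scissors', 'rock']
```

After three updates the learner is back where it started, which is the failure mode that training only against the latest opponent risks in a non-transitive game.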

The accessibility of Xiangqi makes it an excellent testbed for studying board games and the geometry of non-transitivity. The study examines the geometric structure of Xiangqi in depth, using a large-scale dataset of more than 10,000 human games to reveal pronounced non-transitivity in the middle region of the transitive dimension.

To address non-transitivity, the researchers proposed the JiangJun algorithm, which, unlike AlphaZero's self-play strategy, uses a Nash response to select opponents.

JiangJun consists of two basic modules: the MCTS Actor and the Populationer. Together, these components use Monte Carlo Tree Search (MCTS) to approximate a Nash equilibrium within the population of players.
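The idea of choosing opponents from a Nash mixture over a population can be sketched on a small payoff table. The Python example below is a simplified illustration under stated assumptions, not the authors' implementation: the four "checkpoints" and their payoffs in META_PAYOFF are invented, fictitious play stands in for whatever equilibrium solver the real system uses, and the function names are hypothetical.

```python
import random

# Hypothetical meta-game among four saved players (values are made up).
# META_PAYOFF[i][j] is the expected result of player i against player j;
# the table is antisymmetric because the game is zero-sum.
META_PAYOFF = [
    [0.0, -0.6, -0.6, -0.6],  # v0: weak early checkpoint, loses to all
    [0.6, 0.0, 0.4, -0.4],    # v1 beats v2, loses to v3
    [0.6, -0.4, 0.0, 0.4],    # v2 beats v3, loses to v1
    [0.6, 0.4, -0.4, 0.0],    # v3 beats v1, loses to v2
]


def fictitious_play(payoff, iterations=20000):
    """Approximate a Nash mixture of a symmetric zero-sum matrix game.

    At every step the strategy that does best against the empirical
    history is played once more; the play frequencies are returned.
    """
    n = len(payoff)
    counts = [0] * n
    counts[0] = 1  # arbitrary starting point
    for _ in range(iterations):
        values = [
            sum(payoff[i][j] * counts[j] for j in range(n))
            for i in range(n)
        ]
        best = max(range(n), key=lambda i: values[i])
        counts[best] += 1
    total = sum(counts)
    return [c / total for c in counts]


def sample_opponent(mixture, rng):
    """Draw the index of the next training opponent from the mixture."""
    return rng.choices(range(len(mixture)), weights=mixture, k=1)[0]


mixture = fictitious_play(META_PAYOFF)
rng = random.Random(0)
opponent = sample_opponent(mixture, rng)
print([round(p, 3) for p in mixture], opponent)
```

In this toy population the dominated checkpoint v0 receives almost no weight, while the three players that beat one another in a cycle share the rest roughly equally, so training opponents are spread across the whole cycle instead of being only the newest player.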

JiangJun's effectiveness was evaluated comprehensively across a range of metrics. The researchers proposed a training framework that used up to 90 V100 GPUs on the Huawei Cloud ModelArts platform to train JiangJun to master-level performance.

Multiple metrics, including relative population performance, visualization of the Nash distribution, and a low-dimensional visualization of the game landscape along its two main embedding dimensions, together confirm JiangJun's ability to handle non-transitivity in Xiangqi.

In addition, JiangJun significantly outperformed contemporary algorithms, with win rates exceeding 85% against standard AlphaZero Xiangqi and 96.40% against Behavior Clone Xiangqi. In the exploitability evaluation, a near-optimal response won only 8.41% of its games against JiangJun, compared with 25.53% against the standard AlphaZero Xiangqi algorithm, which indicates that JiangJun is significantly closer to the optimal strategy.
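Exploitability asks how well the best possible counter-strategy does against a fixed player: the lower the number, the closer that player is to optimal. The article measures it through the win rate of a near-optimal response; the Python sketch below uses a simpler stand-in, the best payoff any pure strategy achieves in a small symmetric zero-sum matrix game. The rock-paper-scissors table and the function name exploitability are assumptions for illustration only.

```python
# Rock-paper-scissors payoff for the row strategy (illustrative only):
# +1 win, 0 draw, -1 loss. The value of this symmetric game is 0.
RPS = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def exploitability(payoff, mixture):
    """Best payoff a pure counter-strategy gets against `mixture`.

    In a symmetric zero-sum game a value of 0 means the mixture cannot
    be exploited; larger values mean it is further from optimal.
    """
    n = len(payoff)
    return max(
        sum(payoff[i][j] * mixture[j] for j in range(n))
        for i in range(n)
    )


uniform = [1 / 3, 1 / 3, 1 / 3]    # the Nash mixture of this game
skewed = [0.5, 0.25, 0.25]         # plays rock too often
always_rock = [1.0, 0.0, 0.0]      # fully predictable

print(exploitability(RPS, uniform))      # close to 0
print(exploitability(RPS, skewed))       # 0.25, paper exploits it
print(exploitability(RPS, always_rock))  # 1.0, paper always wins
```

Read this way, the 8.41% versus 25.53% figures say that a strong counter-player finds far fewer winning chances against JiangJun than against the AlphaZero baseline.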

The researchers also built a Xiangqi mini program on the WeChat platform, which collected more than 7,000 game records between JiangJun and human opponents over a six-month period. According to these records, JiangJun defeated human opponents with a 99.41% win rate.

Beyond its win rate of nearly 100%, case studies of various endgames show that JiangJun also responds flexibly to the complexity of Xiangqi endgames.

JiangJun marks a notable achievement for AI in Xiangqi. By combining Nash response with Monte Carlo tree search to tackle non-transitivity in perfect information games, the research team has brought a new way of thinking to the field. The algorithm not only achieves a very high win rate, but also demonstrates the ability of AI to deal with complex and uncertain problems.

Reference Links:

https://arxiv.org/abs/2308.04719

https://openreview.net/forum?id=MMsyqXIJuk

https://sites.google.com/view/jiangjun-site/

Author: Hazel Yan
