No matter how terrifying a monster is, it has weaknesses. Why did AlphaGo make a bad move that surprised everyone? What is the Achilles' heel of the second and third generations of Go artificial intelligence?

Lee Sedol, a professional 9-dan who has won multiple world championships for Korean Go, faced off against Google's AlphaGo. After three straight defeats he actually won a game playing White. The public has taken this match far too seriously, treating it as a definitive contest between the human brain and the computer. It is nothing of the sort, because the contest is only about Go; even if humans lose at Go, it does not follow that the human brain can no longer compete with computers. Go is enormously complex, but it is not infinite: in theory it can be viewed as a mathematical problem that a computer program can work through. Sooner or later humans will be unable to beat computers at Go. Even if Lee Sedol had gone on to win the last two games after three straight losses, what then? Sooner or later, humans will be unable to compete with artificial intelligence at Go.

The first generation of Go AI algorithms used the exhaustive method, trying to calculate every possible continuation and then pick the winning move. Because Go has far too many variations, this is not yet computationally feasible. The second-generation algorithms use sampling evaluation, choosing the move with the highest estimated winning rate; this drastically reduces the amount of computation required and finally let Go AI compete with amateur players. Google's AlphaGo is a third-generation program: on top of sampling evaluation it adds self-learning, which improved its strength by leaps and bounds.

AlphaGo, which seemed able to crush the world's top professionals, unexpectedly lost the fourth game after some bad moves. What went wrong? The problem lies in AlphaGo's algorithm. The weakness of the third generation is really the weakness of the second, because the third generation self-learns on top of the second generation's sampling evaluation. Even though AlphaGo can improve by playing against itself, the way it chooses moves is still grounded in sampling evaluation, and sampling evaluation has its weaknesses: a move with a high estimated winning rate is not necessarily the correct move; it is, after all, just statistics, and some unpopular moves can lead to unexpected wins. The fact that AlphaGo answered very quickly when it made its bad move suggests that its sample size at that moment was small. Lee Sedol's move 78 was a surprise that very few players would have found. Based on its own calculations, AlphaGo judged Lee Sedol's winning chances to be low and replied almost instantly, and that reply was a huge mistake. Go players who have faced Go programs share the same impression: the computer becomes especially confused once it falls behind. This is true not only of AlphaGo but also of Zen, because when the program is in a low-winning-rate position it struggles to find a good continuation.
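To make "sampling evaluation" concrete, here is a minimal sketch (my own illustration in Python, not AlphaGo's actual code): each candidate move is scored by the fraction of random playouts it wins, and the move with the highest estimated winning rate is played. The `random_playout` hook is a hypothetical stand-in for a real playout engine; the point is that the score is only a statistic, and with too few samples a rare but correct reply may never be noticed.

```python
def choose_move(position, legal_moves, random_playout, n_playouts=1000):
    """Second-generation "sampling evaluation" in miniature.

    `random_playout(position, move)` is assumed to play the move, finish the
    game with (semi-)random moves, and return True if our side wins.
    """
    best_move, best_rate = None, -1.0
    for move in legal_moves:
        wins = sum(random_playout(position, move) for _ in range(n_playouts))
        win_rate = wins / n_playouts  # an estimate, not the truth
        # With a small or unrepresentative sample, an unpopular but winning
        # reply (think of Lee Sedol's move 78) can easily be ranked too low.
        if win_rate > best_rate:
            best_move, best_rate = move, win_rate
    return best_move, best_rate
```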
According to the sample statistics, positions with a low winning rate naturally contain mostly losses, and samples of a defeat being turned into a victory are hard to find. The lower the winning rate, the more muddled AlphaGo's thinking becomes. In the extreme case, the losing side in the retrieved samples may already have resigned, so there is no subsequent game record at all and AlphaGo simply does not know what to do.

AlphaGo's failure is also a human failure; after all, the program is written by humans. On another level, AlphaGo's failure also stems from the fact that its algorithm is built on human game samples. If there had been samples covering a move like Lee Sedol's, AlphaGo would not have made such a wrong judgment.

Hassabis, the creator of AlphaGo, said: "AlphaGo's training was not aimed specifically at Mr. Lee Sedol; it was just the everyday preparation an ordinary player does before a game. AlphaGo prepared by downloading a large number of amateur players' game records from the Internet to study; there was nothing special about it." He added: "It is also difficult to prepare against a specific player. We need at least millions, even hundreds of millions, of game records to feed AlphaGo as a whole so that it can take them in and carry out deep learning."

Hassabis's words confirm the Achilles' heel of the second and third generations of Go artificial intelligence: the problem of sampling evaluation. There are simply too few samples of a trick like the one with which Lee Sedol turned the game around. AlphaGo clearly needs a large number of high-quality, comprehensive professional games for reference, and that is not easy to obtain.

The problem of komi in Go still needs to be solved

In the fourth game Lee Sedol won with White, which was remarkable, because Black moves first and enjoys a higher winning rate. The man-machine match thus raised a problem that has long troubled the Go world: the rule on compensation points, or komi. People who do not play Go may not know it, but Go fans know it well. Black plays first and therefore has an advantage, so for the sake of fairness Black has to give points to White.

In 1949 (Showa 24) the rules of the Japanese Go Association set this compensation at 4.5 points. Starting from the 3rd King's Tournament in 1955, the komi was raised from 4.5 to 5.5 points; even so, according to the statistics, Black still had the upper hand. As of the end of 2001, across the roughly 15,000 official games held by the Japanese Go Association over the previous five years, Black's winning rate reached 51.86% (with a 5.5-point komi). The gap between Black's and White's winning rates may look small, but in fiercely contested tournaments such a difference is enough to be fatal. South Korea, which holds a clear edge in international competition, was the first to adopt 6.5 points in most of its events. China switched to 3 3/4 zi (equivalent to 7.5 points) in all of its competitions starting in the spring of 2002. The Japanese Go Association also reformed its 50-year-old 5.5-point rule and moved closer to China and South Korea: from 2003 its competitions adopted a 6.5-point komi.
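For readers who do not play Go, here is a minimal sketch of how komi enters the final result (my own illustration, using territory-style counting in which White simply receives the komi as extra points; the half point also rules out draws):

```python
def game_result(black_points, white_points, komi):
    """White is credited with `komi` extra points to offset Black's first move."""
    margin = black_points - (white_points + komi)
    return f"Black wins by {margin}" if margin > 0 else f"White wins by {-margin}"

# Black leads by 7 points on the board; the komi in force decides the game.
print(game_result(87, 80, komi=6.5))  # Black wins by 0.5
print(game_result(87, 80, komi=7.5))  # White wins by 0.5
```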
By the end of 2014, world-championship games played in mainland China with a komi of 3 3/4 zi (equivalent to 7.5 points) numbered 380, of which Black won 200, a winning rate of 52.6% (the first three Chunlan Cups, which used the equivalent of a 5.5-point komi, are not counted). In the Ing Cup, held in Taiwan (komi of 8 points, likewise equivalent to 7.5), Black won 100 games and White won 97. Even with a 7.5-point komi, then, Black still seems to hold a slight edge.

So how many points should Black give White for the game to be absolutely fair? At present this value is derived only from statistics over large numbers of human games; it is not an exact mathematical answer. Perhaps we could let AlphaGo imitate Zhou Botong and Guo Jing in Jin Yong's martial-arts novels, playing its left hand against its right, and work out a reasonable komi from the statistics of a vast number of self-play games? No, that is not a perfect solution either. As mentioned above, AlphaGo's ability to play against itself rests on sampling evaluation of human games, so this would be no different from gathering statistics on human games directly.

The perfect solution is to go back to the beginning and use the most primitive exhaustive method to find the optimal way to play Go. Only then could the value of komi be settled once and for all. When such an enormous amount of computation will become feasible, though, is anyone's guess.
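To show what "exhaustive" means here, below is a minimal game-solving sketch (my own illustration on a toy subtraction game, not Go): it walks the entire game tree and therefore returns the exact value of every position rather than a statistical estimate. Doing the same for Go would settle the perfect komi, but Go's tree is astronomically too large.

```python
from functools import lru_cache

# Toy game: players alternately take 1-3 stones from a pile;
# whoever takes the last stone wins.  The search is exhaustive,
# so the answer it gives is exact, not sampled.
@lru_cache(maxsize=None)
def player_to_move_wins(stones: int) -> bool:
    if stones == 0:
        return False  # the previous player took the last stone and won
    return any(not player_to_move_wins(stones - take)
               for take in (1, 2, 3) if take <= stones)

print(player_to_move_wins(20))  # False: multiples of 4 are lost for the mover
print(player_to_move_wins(21))  # True: take 1 stone, leaving the opponent 20
```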
We can imagine the scene once Go has been completely solved: an international tournament opens under the eyes of the crowd, Black plays the first move, and the referee immediately announces that White need not reply, since the result, win, lose or draw, is already determined. The audience cheers.

How does AlphaGo play mahjong? A discussion of randomness in board game design

Although humans lost at Go, some netizens said that Chinese mahjong still holds the wisdom to protect them (see the Titanium Media article "Winning the Go championship is nothing, does AlphaGo dare to challenge mahjong?"). However, Chen Pei, founder of Zhongsou Network and a champion of the Beijing amateur Go tournament, told the reporter who interviewed him: "If it were mahjong, humans would lose even more miserably! There are only so many mahjong tiles, and it is easy to calculate." Chen Pei is actually wrong about this. Suppose a fool who does not even know how to play mahjong is lucky enough to be dealt the Thirteen Orphans right at the start of a game: does AlphaGo have any trick to counter that? Games like mahjong involve an element of luck, because the tiles are drawn at random.

Some netizens joked that if three people teamed up to play mahjong against AlphaGo, AlphaGo would lose miserably. That is cheating, but it exposes a real problem: in multiplayer games the situation becomes extremely complicated, and how is artificial intelligence supposed to cope? Suppose three people play mahjong with AlphaGo and nobody cheats, but Player A plays poorly and discards the wrong tiles, handing Player B an advantage. That is something AlphaGo cannot control. And if AlphaGo were to learn mahjong, it would again be learning from a large number of human game records. At best it could work out which style of play has a higher winning rate, but in mahjong there is no style that guarantees a win.

In Go, every stone sits in plain view on the board. Have you heard of "dark chess", played with the pieces face down? Land Battle Chess also has a face-down variant. Mahjong resembles dark chess in that the tiles are concealed, and what the concealed tiles are remains a guess. AlphaGo could only calculate which hidden tiles are more probable; it could never deduce the one true answer. In other words, AlphaGo could never evolve to the point of winning 100% of the time. Go has no element of luck while mahjong does, and each has its own charm.

In the broad sense, games such as Go and mahjong all belong to the category of board games. The English name of Go is "Go", and the "Go" in AlphaGo comes from it. Go sits in BGG's top 100 board game rankings and is the highest-ranked game of Chinese origin there. Yet among ordinary players, luck-free games such as chess and Go have gradually declined. Of all the netizens following the man-machine match, how many are actually Go fans? What the public wants is entertainment, and in games with no luck element, such as chess and Go, playing strength is relatively stable: a master is a master, and an ordinary player can hardly ever beat one. Unless you find an opponent at exactly your own level, the game is no fun, because the strong are too strong and the weak are too weak.

The design principle of newer board games is therefore to build in randomness, a component of luck. Magic: The Gathering, Yu-Gi-Oh! and Hearthstone rely on random card draws; Ludo and Monopoly rely on dice rolls. All of them generate random numbers and bring luck into play, and with luck in the mix the levels of different players do not end up too far apart, so everyone can enjoy a game together.

On the other hand, board games without luck must in theory have an optimal solution, and once that solution is laid bare, people are bound to lose interest. Go has survived to this day precisely because its variations are too numerous and no optimal solution has yet been found. For now, AlphaGo has not completely defeated Lee Sedol, and Go still looks remarkably tenacious.