Fernand Gobet, Ph.D., and Morgan H. Ereku

Computer Program Beats European Go Champion

There are implications both for the psychology of expertise and the world of Go.

In 1997, the chess computer Deep Blue beat World Champion Garry Kasparov in a six-game match. The result was felt as a heavy blow to human pride: chess had long been seen as a symbol of humans' unique intellect. Licking its wounds, humankind searched for another game to replace chess as its symbol of intelligence. It chose the Asian game of Go.

Go is played on a 19 x 19 board between two players (Black and White). Once placed, a piece (called a "stone") cannot be moved again. The aim of the game is to gain more territory than the opponent by surrounding the opponent's stones. The rules are simple, but the game is devilishly complex, much more so than chess (Gobet, de Voogt, & Retschitzki, 2004): there are about 10¹⁷² possible positions (a one followed by 172 zeroes), many more than the number of atoms in the known universe. By comparison, the number of positions in chess is "only" about 10⁴³.

Compared to other board games such as chess and checkers, Go is more strategic and less tactical. That is, long-term plans dominate short-term combinations. This is due to the large size of the Go board, and to the fact that the stones do not move once placed on the board. One consequence is that the game taps into aspects of cognition where humans are strong (pattern recognition, intuition, planning) and where computers have traditionally struggled. By contrast, the game does not suit computers’ traditional strengths, most notably the ability to systematically search a large number of states by brute force.

Thus, while computers have long been stronger than humans in games such as chess, Othello and checkers, they had been rather poor at Go, unable to progress beyond the level of a good amateur. A major breakthrough came in 2006, when computer programs drastically increased their strength with a simple but surprising technique called Monte Carlo tree search (Lee et al., 2009). Rather than searching the tree of possible moves in a systematic way, this method generates games by randomly picking moves for the two players. The intuition is that if a move in the current position is better than the alternatives, it should lead to better results on average when many such games are played, even though each individual move is selected randomly. With more sophisticated variations of this technique, the choice of moves is biased by previous experience.
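To make this concrete, here is a minimal sketch, in Python, of the simplest form of the idea, often called flat Monte Carlo evaluation (full Monte Carlo tree search grows a tree of positions and a smarter selection rule on top of it). The simulate routine and the toy win rates are illustrative stand-ins of our own, not part of any real Go program:

    import random

    def flat_monte_carlo(legal_moves, simulate, n_playouts=200):
        """Return the move with the best average result over random playouts.

        simulate(move) is a hypothetical stand-in for an engine routine that
        plays the move, finishes the game with uniformly random moves for
        both players, and returns 1 for a win and 0 for a loss.
        """
        def average_result(move):
            return sum(simulate(move) for _ in range(n_playouts)) / n_playouts
        return max(legal_moves, key=average_result)

    # Toy demonstration: move "b" secretly wins 60% of random playouts,
    # the others only 40%, so it is almost always the one chosen.
    win_rates = {"a": 0.4, "b": 0.6, "c": 0.4}
    choice = flat_monte_carlo(
        ["a", "b", "c"],
        simulate=lambda m: 1 if random.random() < win_rates[m] else 0)
    print("chosen move:", choice)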

Breakthrough with AlphaGo

At the end of last January, the journal Nature reported another breakthrough (Silver et al., 2016). The program AlphaGo, developed by Google DeepMind, not only thrashed the best other Go programs (winning 99.8% of its games), but also defeated Fan Hui, a professional Go player who had won the European Championship three times. The result was brutally clear: five to nil.

AlphaGo uses a combination of three artificial-intelligence techniques: Monte Carlo tree search, which we have just discussed; deep learning; and reinforcement learning. Deep learning consists of adjusting the weights of an artificial neural network, using recently developed techniques (LeCun, Bengio, & Hinton, 2015). AlphaGo uses two networks: the first suggests a move in a given position, and the second evaluates the position as a whole. The program first learns by scanning a large number of master games (30 million positions). Then it plays a large number of games against itself, tuning the weights of its networks with a technique called reinforcement learning, which uses the outcomes of those games as feedback for further learning. Reinforcement learning had already been used successfully to produce top-level programs in several board games, including backgammon (Tesauro, 1995). The entire learning process is computationally very expensive and requires powerful computers.
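As a rough illustration of that last step, here is a minimal sketch of a policy-gradient (REINFORCE-style) update on a toy, single-layer "policy network", written in Python with NumPy. The feature vector, network sizes, and learning rate are made-up stand-ins; AlphaGo's real networks are deep convolutional networks trained at a vastly larger scale:

    import numpy as np

    rng = np.random.default_rng(0)

    def move_probabilities(weights, features):
        # Toy softmax "policy network": a single linear layer mapping a
        # board-feature vector to a probability for each move.
        logits = weights @ features
        exp = np.exp(logits - logits.max())   # numerically stable softmax
        return exp / exp.sum()

    def reinforce_update(weights, features, move, outcome, lr=0.05):
        # One REINFORCE step: shift probability toward the chosen move if
        # the game was won (outcome = +1), away from it if lost (-1).
        probs = move_probabilities(weights, features)
        grad_log_pi = -np.outer(probs, features)  # d log pi(move) / d weights
        grad_log_pi[move] += features
        return weights + lr * outcome * grad_log_pi

    # Usage sketch with made-up numbers: sample a move, pretend the
    # self-play game containing it was won, and update the policy.
    n_features, n_moves = 8, 4
    weights = rng.normal(scale=0.1, size=(n_moves, n_features))
    features = rng.normal(size=n_features)        # stand-in board encoding
    move = rng.choice(n_moves, p=move_probabilities(weights, features))
    weights = reinforce_update(weights, features, move, outcome=+1)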

When playing an opponent, AlphaGo uses its two networks to evaluate positions and to bias the selection of moves toward those that have proved useful in the past. The program also does some planning, using Monte Carlo tree search. The beauty of this approach is that AlphaGo uses only knowledge it has learned itself. This contrasts, for example, with Deep Blue, which used a great deal of knowledge hand-coded by its programmers (Campbell, Hoane, & Hsu, 2002).
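The biasing can be captured in a single formula. Below is a sketch of the kind of prior-biased selection rule reported for AlphaGo (a variant of "PUCT"; Silver et al., 2016): it favors moves whose playouts have scored well on average, while the policy network's prior carries more weight for moves that have rarely been tried. The statistics in the usage example are invented for illustration:

    import math

    def puct_score(mean_value, prior, parent_visits, child_visits, c_puct=1.0):
        # Exploit moves with a high average playout value, but let the
        # policy network's prior dominate for rarely visited moves.
        exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
        return mean_value + exploration

    # Usage sketch: (average playout value, policy prior, visit count)
    # for three candidate moves, with made-up statistics.
    candidates = [(0.52, 0.40, 120), (0.48, 0.35, 60), (0.30, 0.25, 10)]
    parent_visits = sum(visits for _, _, visits in candidates)
    best = max(candidates,
               key=lambda c: puct_score(c[0], c[1], parent_visits, c[2]))
    print("selected (value, prior, visits):", best)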

Lessons for human expertise

What does AlphaGo tell us about human expertise? What are the implications for the world of Go? A first important result is that AlphaGo confirms the importance of pattern recognition and intuition in board games, and presumably in other domains of expertise. Using only its pattern-recognition ability, without any search, AlphaGo still beats most other computer programs. This is not surprising, given that Go is a strategic game, but the way AlphaGo captures this aspect of human expertise so well is impressive. The importance of pattern recognition in human experts has long been emphasized by several researchers (e.g., Adriaan de Groot, Herbert A. Simon, and Hubert Dreyfus), even though there were important differences in the specifics of their theories (for details, see Gobet & Chassy, 2009).

By contrast, this project does not tell us much about human planning and search. Monte Carlo tree search is not very human-like: experts simply do not generate thousands of (pseudo-)random games, collecting statistics along the way. They carry out a more subtle and selective search, in which pattern recognition is intertwined with look-ahead (Gobet, 1997). While AlphaGo does use its knowledge to search selectively, it is far less selective than humans.

Computers have changed the way chess is played at the top level. They have opened up new conceptual avenues and exposed shocking limits in expert play. As a consequence of playing against computers, using computers for practicing, and using computerized databases, the quality of play has improved markedly in the last two decades. Opening variations that were thought unplayable are now employed, and others that were thought satisfactory have been refuted by computer analyses. Another consequence, this time an unwelcome one, is the emergence of cheating using computers. It will be interesting to see whether similar developments will occur with Go.

It is highly unlikely there will be universal acceptance of artificial intelligence as superior to human intellect. People will develop new games and activities in a bid to preserve human ascendancy over computers. This will lead to even better computer techniques. This arms race between human intelligence and computer intelligence will lead to an increased understanding of human and artificial intelligence, for the benefit of both.

The next challenge

While AlphaGo’s performance is remarkable, one must remember that it has not beaten the world champion (yet). Although European champion, Fan Hui is “only” a 2-dan professional, and thus clearly weaker than top-level Go professionals, who are ranked 9 dan. This is roughly equivalent to the difference, in chess, between a master and a world-class grandmaster. In other words, a 9-dan professional is likely to win more than 95% of the time against a 2-dan professional.

So, what is the real strength of AlphaGo? We shall know soon, as a match has been organized between AlphaGo and Lee Se-dol, a 9-dan South Korean professional considered to be one of the best players in the world. While the team behind AlphaGo is optimistic that it will win, Go masters believe the human mind will prevail. So does Jonathan Schaeffer, a computer scientist who has contributed to several breakthroughs in computer games: "Think of AlphaGo as a child prodigy. All of a sudden it has learned to play really good Go, very quickly. But it doesn’t have a lot of experience. What we saw in chess and checkers is that experience counts for a lot."

Fernand Gobet and Morgan Ereku

References

Campbell, M., Hoane, A. J., & Hsu, F. H. (2002). Deep Blue. Artificial Intelligence, 134, 57-83.

Gobet, F. (1997). A pattern-recognition theory of search in expert problem solving. Thinking and Reasoning, 3, 291-313.

Gobet, F., & Chassy, P. (2009). Expertise and intuition: A tale of three theories. Minds & Machines, 19, 151-180.

Gobet, F., de Voogt, A. J., & Retschitzki, J. (2004). Moves in mind. Hove, UK: Psychology Press.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521, 436-444.

Lee, C.-S., Wang, M.-H., Chaslot, G., Hoock, J.-B., Rimmel, A., Teytaud, O., et al. (2009). The computational intelligence of MoGo revealed in Taiwan's computer Go tournaments. IEEE Transactions on Computational Intelligence and AI in Games, 1, 73-89.

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529, 484-489.

Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38, 58-68.

About the Author
Fernand Gobet, Ph.D., and Morgan H. Ereku

Fernand Gobet, Ph.D., is a professor of cognitive psychology at the University of Liverpool. Morgan H. Ereku is a doctoral candidate in psychology at Brunel University in London.
