Back in January, the Libratus poker AI defeated four top heads-up no limit hold’em specialists, marking the first time that a computer showed it could defeat high-level professional players.
Now, the team of researchers from Carnegie Mellon University that created the silicon shark have revealed just how they pulled off that feat.
In a paper being published by Science, Professor Tuomas Sandholm and Ph.D. student Noam Brown detailed how their artificial intelligence was not only able to compete with top human poker players, but ultimately defeat them, achieving a superhuman level of performance.
The paper details how the team managed to teach a computer to play a game of incomplete information, a decidedly different challenge than learning a complete information game like chess or go.
Three Algorithms Power Victory
One of the key steps taken was breaking down Libratus’ strategy into three distinct algorithms to cover various aspects of Texas Hold’em. The first used limit hold’em as a guide to learning how to play no limit as well.
Similar hands and bet sizes were grouped together in order to cut down on the size of the decision tree, which was one of the serious hurdles to overcome given the number of possibilities in a no limit game.
“Rather than consider every possible bet between $100 and $20,000, we could instead just consider increments of $100,” the researchers wrote. “This drastically reduces the complexity of solving the game.”
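To make the idea in that quote concrete, here is a minimal Python sketch of grouping bet sizes into $100 increments. It is an illustration only, not the researchers' code: the function names, the assumption that bets are whole dollars, and the round-to-nearest rule are all hypothetical choices made for this example.

```python
# Illustrative sketch only: names and the rounding rule are assumptions,
# not taken from the Libratus paper or code.
MIN_BET = 100       # smallest bet in the quoted range, in dollars
MAX_BET = 20_000    # largest bet in the quoted range, in dollars
INCREMENT = 100     # abstract bet sizes are multiples of this amount


def abstract_bet(bet: int, increment: int = INCREMENT) -> int:
    """Map a whole-dollar bet to the nearest abstract bet size."""
    if not MIN_BET <= bet <= MAX_BET:
        raise ValueError(f"bet must be between {MIN_BET} and {MAX_BET}")
    # Round to the nearest multiple of `increment` (halves round up).
    rounded = ((bet + increment // 2) // increment) * increment
    # Keep the result inside the legal range.
    return min(max(rounded, MIN_BET), MAX_BET)


def count_bet_sizes(step: int) -> int:
    """Count the distinct bet sizes between MIN_BET and MAX_BET at a given step."""
    return len(range(MIN_BET, MAX_BET + 1, step))


if __name__ == "__main__":
    print(abstract_bet(101))           # a $101 bet is treated like a $100 bet
    print(count_bet_sizes(1))          # every whole-dollar bet size
    print(count_bet_sizes(INCREMENT))  # bet sizes after grouping
```

Under these assumptions, a single betting decision shrinks from 19,901 possible whole-dollar bet sizes to 200, and that saving compounds at every decision point in the tree.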
Secondly, the computer was programmed to use a more advanced strategic framework on the final two betting rounds (the turn and river). Known as “nested subgame solving,” this portion of the AI doesn’t abstract anything, instead considering each hand individually.
This portion of the program built on ideas from earlier poker AI efforts, but featured a number of complex advances that helped ensure that opponents couldn’t exploit the computer by changing their strategy.
Finally, Libratus included a self-improvement algorithm that continually refined the “blueprint strategy,” which ran in the background and formed the basis of the machine’s play.
This self-improver used the decisions made by opponents to help decide where the computer should look to learn more about the game of poker and calculate game-theoretic strategies.
Libratus Rises Above Previous AI, Human Play
Once all of these features were in place, Libratus proved far stronger than the previous best poker AI, Baby Tartanian8, decisively defeating it in heads-up play by more than 6 big blinds per 100 hands.
In January, the program then defeated a team of human experts made up of Jason Les, Dong Kim, Daniel McAulay, and Jimmy Chou, winning the 120,000-hand Brains vs. AI match by about 15 big blinds per 100 hands.
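For readers unfamiliar with the "big blinds per 100 hands" measure, the short Python sketch below converts between a win rate and a total margin. The function names are hypothetical, and the 15 big blinds per 100 hands figure is the approximate rate quoted above, so the total it produces is a rough estimate rather than an official match result.

```python
# Illustrative arithmetic only; function names are hypothetical.
def total_big_blinds(win_rate_bb_per_100: float, hands: int) -> float:
    """Total big blinds won, given a win rate in big blinds per 100 hands."""
    return win_rate_bb_per_100 * hands / 100


def bb_per_100(total_bb_won: float, hands: int) -> float:
    """Win rate in big blinds per 100 hands, given a total margin."""
    if hands <= 0:
        raise ValueError("hands must be positive")
    return 100 * total_bb_won / hands


MATCH_HANDS = 120_000    # length of the Brains vs. AI match
APPROX_WIN_RATE = 15.0   # approximate rate quoted in the article

if __name__ == "__main__":
    margin = total_big_blinds(APPROX_WIN_RATE, MATCH_HANDS)
    print(f"Approximate margin: {margin:,.0f} big blinds")
    print(f"Back to a rate: {bb_per_100(margin, MATCH_HANDS)} bb/100")
```

At roughly 15 big blinds per 100 hands over 120,000 hands, the margin works out to about 18,000 big blinds.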
While the research may initially seem like it is poker-specific, the researchers say their work will have impact well beyond the felt.
“The techniques that we developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including non-recreational applications,” the team wrote.
The performance of Libratus mirrors that of another family of super-powered AI programs: Google DeepMind’s AlphaGo and its successors. AlphaGo became the first program to defeat top professional players in go, and more recently its successor AlphaZero mastered the game of chess after just four hours of training from nothing but the rules of the game, defeating the previous best computer chess program without losing once in a 100-game match.