Cepheus, a computer program created by the Computer Science Department at the University of Alberta in Canada, has “solved” hold’em poker, its makers claim.
Or at least, the heads-up fixed-limit variation thereof. And like all the best poker players, Cepheus is completely self-taught. The program, or bot, uses a new algorithm, known as CFR+, which is able to analyze extensive-form games of incomplete information, such as poker, at a scale never before achieved.
In layman’s terms, that means that Cepheus is able to play literally trillions of hands of poker against itself, learning and memorizing optimal moves as it goes, rendering it virtually unbeatable. In the words of its creators, it has created a game theory strategy so close to optimal that it “can’t be beaten with statistical significance within a lifetime of human poker playing.”
Of course, Cepheus does not play an exactly “perfect” game theory strategy, merely a very close approximation of one. A perfect strategy is essentially one that can’t be beaten by any counter-strategy.
And the Alberta team thinks it has cracked it.
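To give a feel for what learning by self-play means, here is a minimal Python sketch of regret matching+, the kind of regret-minimizing update used in the counterfactual regret minimization family of poker algorithms. It is applied to rock-paper-scissors rather than poker, and it is not the Alberta team's code: the game, the function names, the starting biases and the iteration count are all illustrative assumptions.

```python
# Toy self-play with regret matching+ on rock-paper-scissors.
# Illustrative only: this is not Cepheus and not a full poker solver.

# PAYOFF[a][b] is a player's payoff for playing a when the opponent plays b.
# Actions: 0 = rock, 1 = paper, 2 = scissors. The game is symmetric, so the
# same table serves both players.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
N = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / N] * N


def action_values(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(N))
        for a in range(N)
    ]


def exploitability(strategy):
    """Best payoff a pure counter-strategy earns against `strategy` (0 = unbeatable)."""
    return max(action_values(strategy))


def self_play(iterations, start_regrets):
    """Run both players against each other and return their average strategies."""
    regrets = [list(r) for r in start_regrets]
    strategy_sums = [[0.0] * N for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in range(2):
            values = action_values(strategies[1 - player])
            expected = sum(s * v for s, v in zip(strategies[player], values))
            for a in range(N):
                # Regret matching+: accumulate regret, but never let it drop below zero.
                regrets[player][a] = max(regrets[player][a] + values[a] - expected, 0.0)
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


# Hypothetical starting point: one player leans toward rock, the other toward paper.
average = self_play(10_000, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
for player, strategy in enumerate(average):
    print(player, [round(p, 3) for p in strategy], round(exploitability(strategy), 4))
```

The average strategies, not the final ones, are what approach the unbeatable mix, and the exploitability figure printed for each player should shrink toward zero as the iteration count grows. The real program works on the same principle but over a vastly larger game.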
24 Trillion Hands
“We’re not quite perfect, but we’re so close that even after a lifetime of playing against it, you wouldn’t know it wasn’t perfect,” said Professor Michael Bowling, whose team created Cepheus, and published its study in the journal Science this week.
“Even if you played 60 million hands of poker for 70 years, 12 hours a day, and never made any mistakes, you still wouldn’t be able to say with statistical confidence you were better than this program,” the study claims.
“Our model has spent two months playing poker again and again. It’s playing 24 trillion hands of poker every second for two months. That’s more poker hands than all of humanity, so in some sense it’s not surprising that it has developed the perfect strategy.”
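The figures quoted above can be sanity-checked with a few lines of Python. This is a back-of-the-envelope sketch, not a calculation from the study: the rate of 200 hands per hour and the reading of "two months" as 60 days are assumptions made here for illustration, and the 24 trillion hands per second is taken at face value from the quote.

```python
# Rough arithmetic behind the quoted figures.

HANDS_PER_HOUR = 200        # assumption: not stated in the article
HOURS_PER_DAY = 12          # from the quote
DAYS_PER_YEAR = 365         # assumption: no days off
YEARS = 70                  # from the quote

lifetime_hands = HANDS_PER_HOUR * HOURS_PER_DAY * DAYS_PER_YEAR * YEARS

HANDS_PER_SECOND = 24 * 10**12   # "24 trillion hands ... every second", as quoted
TRAINING_DAYS = 60               # assumption: "two months" read as 60 days
SECONDS_PER_DAY = 24 * 60 * 60

training_hands = HANDS_PER_SECOND * SECONDS_PER_DAY * TRAINING_DAYS

print(f"Human lifetime: {lifetime_hands:,} hands")
print(f"Self-play:      {training_hands:.3e} hands")
print(f"Ratio:          {training_hands / lifetime_hands:.3e}")
```

Under these assumptions a lifetime of play comes to roughly 60 million hands, which is consistent with the figure in the quote, while the self-play total is larger by many orders of magnitude.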
Cepheus has been a long time in development, as the team has been engaged in this type of research since the nineties, when it built its first bot, “Loki.” The team’s refined second attempt, known as “Poki,” later provided the artificial intelligence for the video game Stacked.
A later incarnation, “PsOpti,” powered Poker Academy, an AI training software program popular in the mid-to-late 2000s. But Cepheus has blown its predecessors out of the water, and the publication of the University of Alberta’s study is being hailed as a momentous day in the development of game theory.
Dealer Advantage Quantified
An interesting offshoot of the research is that the team has been able to quantify, for the first time, the dealer advantage in heads-up limit hold’em, having had the opportunity to analyze it over trillions of hands.
“We actually can now prove that the dealer has an advantage of what we call ’88 millablinds’ per game,” explained team member Michael Johanson. “That’s .088 of a big blind per game.”
And the best bit is that you can now play against the best heads-up fixed-limit poker player in the cosmos, and query its strategy if you dare, at poker.srv.ualberta.ca.