Heads-up limit hold’em poker is solved

Here is a bit more on the poker story Tyler mentioned yesterday.

An important variant of poker, heads-up limit hold’em (HULHE), has been essentially solved–meaning that a computer can now play the game so well that it wouldn’t lose much and might even win against a theoretically perfect player over a lifetime of play. Solving the game required new algorithms and significant computational power.

HULHE has 3.16 × 1017 possible states the game can reach, making it larger than Connect Four and smaller than checkers. However, because HULHE is an imperfect-information game, many of these states cannot be distinguished by the acting player, as they involve information about unseen past events (i.e., private cards dealt to the opponent). As a result, the game has 3.19 × 1014 decision points where a player is required to make a decision….There are two challenges for established CFR variants to handle games at this scale: memory and computation.

The solution supports some conventional strategies but also contains new insights:

Human players have disagreed about whether it may be desirable to “limp” (i.e., call as the very first action rather than raise) with certain hands. Conventional wisdom is that limping forgoes the opportunity to provoke an immediate fold by the opponent, and so raising is preferred. Our solution emphatically agrees (see the absence of blue in Fig. 4A). The strategy limps just 0.06% of the time and with no hand more than 0.5%. In other situations, the strategy gives insights beyond conventional wisdom, indicating areas where humans might improve. The strategy almost never “caps” (i.e., makes the final allowed raise) in the first round as the dealer, whereas some strong human players cap the betting with a wide range of hands. Even when holding the strongest hand—a pair of aces—the strategy caps the betting less than 0.01% of the time, and the hand most likely to cap is a pair of twos, with probability 0.06%. Perhaps more important, the strategy chooses to play (i.e., not fold) a broader range of hands as the nondealer than most human players (see the relatively small amount of red in Fig. 4B). It is also much more likely to re-raise when holding a low-rank pair (such as threes or fours) (44).

Why is this important? The only previous games of any difficulty that have been solved are perfect information games–games where each player knows everything that has previously happened. Tic-tac-toe and chess are perfect information games because everything that has happened is summarized in the state of the board. In imperfect games there is hidden information, such as in card games where the opposing players cards are hidden.

It’s clear that most games in the real world (and that includes “games” of nuclear strategy, bargaining, and detection and monitoring) are imperfect information games. Even though the sample space for HULHE is very large it’s smaller than these real world strategy games (and smaller than other forms of poker). Nevertheless, it’s clear that people are “solving” the real world games not by working through the sample space but by pruning it. A combination of search and heuristic pruning in the perfect information game of chess has already produced computers that are better than any human player. What the solution to this relatively small and somewhat unimportant imperfect information game indicates is that the computers are soon going to be better than you and I at the “human” capabilities of threat, bluff and deception.


Comments for this post are closed