
Contents

1 Introduction
  1.1 Games
  1.2 History of Poker Games
  1.3 History of Poker Research
2 Overview
3 Modeling Games
  3.1 Players
  3.2 Moves, Actions and Game Tree
  3.3 Payoffs
  3.4 Information Sets
  3.5 Strategies
  3.6 Solutions
  3.7 Formalization
4 Modeling Poker
  4.1 Players
  4.2 Moves, Actions and Game Trees
  4.3 Payoffs and Rake
  4.4 Information Sets
  4.5 Strategies
  4.6 Solutions
  4.7 Cards and the Deck
  4.8 Other Conventions
  4.9 Toy Games
5 Discrete Toy Games
  5.1 3 Card Poker
  5.2 13 Card Poker
  5.3 A Case Study: Turn Play with Check-Raising
  5.4 A Case Study: Turn Play without Check-Raising
6 Continuous Toy Games: [0,1] Poker
  6.1 [0,1] Fixed Bet Size
  6.2 [0,1] Chosen Bet Size
  6.3 [0,1] Free Bet Size
7 Botting
8 Summary
  8.1 Further development
Appendix
Bibliography

Strategies For Complex Games: The Toy Game Method


Paul Otto
April 10, 2014

Abstract
Intro to poker. Intro to the poker model. Intro to the toy game method. Toy games are solved and interpreted. Bot programming and recent discoveries are touched upon.

1 Introduction
1.1 Games
Games play a dominant role in mammal and especially human life. Children spend most of their early years exclusively playing. Leading theories¹ suggest that games help the brain to focus on and master small skill sets in a friendly environment, which ultimately helps the brain conquer complex tasks. Humans don't stop playing games as grown-ups; games are enjoyed at all ages with family and friends, and increasingly online. Watching games fills up many a pastime, the largest spectacle being the Olympic Games, which draw massive worldwide attention.

Games and gaming being what they are, it is little surprise that the academic world takes interest². However, the study of games from a mathematical standpoint has a surprisingly short history. Perhaps this is because analyzing a game takes some of the magic out of it. Taking a great shot at a goal is much more enjoyable than running numbers on how hard to hit the ball, at what angle, and how to turn the foot to achieve the desired spin. With rising living standards around the globe, more careers in professional gaming are available now than ever before. This increases competitiveness, and the mathematical framework for understanding games is only slowly catching up.

1.2 History of Poker Games


Card games involving betting have likely been present ever since the invention of playing cards. Games which can be broadly classified as poker first appeared in the southern US in the 19th century. The modern community card rules, specifically the variant Texas Hold'em which is discussed in this writing, appeared in the early 20th century. By 1967 Texas Hold'em was played in Las Vegas casinos, from where its popularity has been growing consistently. Soon after the no-limit variant was introduced for tournaments, the variant took off.

No-limit hold'em features a set of rules with interesting properties. The game is easy to understand, thrilling and has tremendous strategic depth. As poker is a form of strategic gambling where players play against each other and not against the house, it is one of the few forms of gambling with a possible positive expected value. Unlike other forms of gambling where one cannot win in the long term, such as roulette, lotteries or bingo, professional poker players stand to win money consistently. In this regard, poker is unusual. There are certain parallels to professional sports betting. There, however, players bet against the bookie, so making a constant profit is at odds with the operator's interests.

Before we move on, it should be noted that poker should be thought of as a civilized pastime. There are clear, known rules, a verifiable pseudo-random shuffle and safe money transactions in place. A trusted third-party arbitrator³ handles disputes: the casino in a live setting and the poker room in an online setting. Scenes from Hollywood movies depicting players betting cars by placing car keys on the table are far from reality. Unregulated poker play in private clubs, while existing and illegal, is a very small part of the poker universe. That is not to say poker is completely safe. Cheating and insolvency of the game host resulting in loss of player funds have their part in poker history as much as in any other industry.

Fig. 1 Peak daily online real money poker players plotted against the time period 2003 to 2013. Courtesy of pokerhistory.eu.

¹ A broad introduction to children's play and its effects from the 1600s to online games [2] and a concise modern view [7, p 3: The Benefits of Play].
² Especially with the advent of computer gaming in the 1980s, game studies have taken off. The journal Games and Economic Behavior started publishing in 1989, the Game Studies journal in 2001, and the University of Amsterdam has offered an MSc in Game Studies since 2012. At the University of Alberta a poker research group has been studying poker since the mid 1990s.
Starting around 1990, poker for play money was first played online on IRC servers. The first successful poker bots were developed during this period. The first online real money poker site, Planet Poker, opened for business in early 1997. About a year later the second site, Paradise Poker, launched, which still operates today. Only the name has survived though; Paradise is just a vertical of Sportingbet plc using third-party poker software.

With the Moneymaker Effect⁴ in 2003, the online poker boom began. Newer and better operators opened their online card rooms. Poker magazines and news outlets popped up, and many books were written. Poker training and coaching websites launched their services, and software companies for statistical data analysis were founded. By 2010 online poker had become so competitive that professional players were using customized hardware and highly specialized software in addition to the software offered by the operator.

Fig. 2 Selection of software and hardware specialization of professional poker players. TL: Information set visualization for a strategy. TR: Statistical data on a player as collected by a program describing the player's frequencies. BL: A professional poker player at work. BR: Replay analysis of a game of poker with statistical data. Compilation of the author.

³ Implementing poker in a completely trustless fashion on a distributed ledger may soon be a possibility; for instance, the Ethereum project is working towards supplying a Turing-complete blockchain scripting language.
⁴ Chris Moneymaker, an accountant at the time, won the World Series of Poker main event in 2003, taking home a life-changing amount of money. He qualified to play in the event through a $39 online satellite tournament. This rags-to-riches story gained enormous public popularity.

1.3 History of Poker Research


The game of poker was first formally used as an example by von Neumann and Morgenstern [9, e.g. p. 186-219] in 1944, who, alongside Nash in 1951, founded the field of game theory. They considered a model of one-street⁵ simultaneous-action poker with two players and a betting tree of depth one. They solved the game for a continuous hand strength distribution and extrapolated to the discrete hands and deeper betting tree of actual poker.

Little has been done on the subject as it relates to poker until very recently. Informal works by David Sklansky, most famously [8], introduced the idea of using mixed strategies to the mathematical layman, showing the need to make certain plays occasionally but not always. Discussions of these and related ideas among players with a mathematical background on the early online poker message boards yielded the first useful applications of game theory to real-world poker. This resulted in the ground-breaking 2006 publication by Chen and Ankenman: The Mathematics of Poker [1].

Poker viewed as a two-person zero-sum game does not seem too interesting from a theoretical perspective. A Nash equilibrium exists and can be computed in polynomial time⁶. The difficulty lies in the size of the game. A recent calculation [3] on the size of heads-up no-limit hold'em at 200 blind stacks concluded there are about 10^50 possible game states. In chess there are about 10^47 distinct game states. The size of the game state space does not directly determine complexity. Consider the game "say the larger number", where two players consecutively state an integer and the larger number wins. The number of possible game states is infinite, as the first player can name any integer, yet the second player always wins with a very obvious strategy. Nonetheless, complexity necessitates a number of possible game states large enough to make exhaustive computation futile.

Since 2006 the Annual Computer Poker Competition has hosted a platform at the AAAI⁷ conference for academics and hobbyists to test their poker bots in a number of disciplines.

2 Overview
This paper gives a quick introduction to game theory, discusses how poker is best understood in its framework, and then applies the ideas in aid of strong poker play.

The section Modeling Games outlines how game theory, a modern branch of mathematics, tries to model conflicts of interest. Participants in a game, their possible actions and the outcomes of those actions are formalized. Information sets, game states and strategies are derived, and the extensive and normal form of a game are presented. The distinctions between two-player and n-player, constant sum and non-constant sum, complete information and hidden information, cooperative and competitive, as well as one-time and repeated games are made.

With the game theory framework available we move on to poker, with its intriguing hidden information property. The necessary notation and conventions are introduced. After the necessity for toy games has been shown, we move to the main part. We solve a number of toy games which are designed to model certain aspects of the full game of poker. The first games will be solved algebraically. The later games will be translated into LP form and solved by a commercial LP engine. The results will be analyzed and extrapolated.

⁵ Street is the shorthand for a draw of cards, a chance event. Modern poker is typically played with four streets.
⁶ Finding the solution to a two-person zero-sum game with finite pure strategy sets can be done by computing the solution to a corresponding LP, which can be solved in polynomial time (section 9).
⁷ Association for the Advancement of Artificial Intelligence.

3 Modeling Games
Game theory sometimes justifies its existence by claiming to be an adequate model for conflicts of interest. It can be used to examine economics, tactical warfare, political debates or the fight for resources of any kind. These ideas have pushed game-theoretic courses into the humanities. A problem that arises is that the mathematical framework makes many subtle assumptions, some fundamental ones of which do not translate well into the real world. As such, the author's view is that the game-theoretic framework is best applied to games in the narrow sense defined below, and any extrapolations to real-world situations need to be done with great care.

There exist many ways to model a game. In particular, a given game can be modeled differently, and a given model can be extrapolated to many different games. In this part the extensive form and normal form are introduced. These descriptions lack a time component: not in the sense that actions have no order, but in the sense that they do not model limited time resources of the players, such as reaction times or time to contemplate. As such these models are useful for games where players have sufficient time to learn, plan and play. These models succeed in describing turn-based games, such as chess or poker, by offering strategic advice. They tend to fail to describe real-time games, such as tennis or Starcraft, in a way that is useful to aid play, although with appropriate abstractions even parts of real-time games can be analyzed with the standard game-theoretic models.

A game features players, who take actions when it is their move. The players' moves form the vertices of the game tree; its root is the start of the game. The players' actions traverse the game tree until they reach a leaf in a finite number of steps. Each leaf assigns a payoff to each player. A player may not be able to differentiate between some of her moves. Indistinguishable moves form an information set. A line is a path from root to leaf, representing one full play of the game.

3.1 Players
The game's players consist of the participants 1, 2, 3, ... and player 0, a chance player, sometimes referred to as nature.

The participants represent the parties or individuals that take part in the game. Each participant is assumed to be fully aware of the game's structure. A participant knows about all players, all existing moves, all possible actions, all payoffs, and all information sets. Additionally, each participant is assumed to act rationally: a participant will choose, given their information, the action with the highest expected payoff (see 3.3).

3.2 Moves, Actions and Game Tree


At any point in time, the game is in a state, which is formalized by the notion of a move. Each move belongs to exactly one of the players. As such, the game states can be partitioned into one set of moves per player. The moves are represented by vertices in the game tree. The initial state of the game, the first move, is the root of the game tree. Terminal moves, at which play is halted and payoffs are determined, are leaves of the game tree.

Transitioning from move to move requires the active move's player to take an action. The possible actions at a move are the move's action set (or the move's information set's action set). The actions are represented by directed edges in the game tree. A move with no outgoing edges is a terminal vertex, or a leaf, of the game tree. A directed path from root to leaf represents a line: one play of the game from start to finish.

A cyclical game tree, where the same move can be reached more than once in a line, shall not be allowed. More restrictively, the game shall be finite in size. Note that this does not prohibit certain forms of repetition. In chess, for instance, repeated positions are possible, yet they are different moves, due to the draw rule for threefold repetition.

Player 0 always chooses randomly between its actions, and the probability distributions over 0's actions are known. Player 0 can be thought of as the shuffle of the deck, or the flip of a coin.
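The vocabulary of this section can be made concrete with a small data structure. The following is a minimal sketch, not the representation used later in this paper; the class name `Move`, the helper `lines` and the coin-flip toy game are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CHANCE = 0  # player 0 is the chance player ("nature")


@dataclass
class Move:
    """A vertex of the game tree; a move without actions is a leaf."""
    player: Optional[int] = None  # None marks a terminal move
    actions: Dict[str, "Move"] = field(default_factory=dict)  # action -> next move
    chance: Dict[str, float] = field(default_factory=dict)    # known odds, player 0 only

    def is_leaf(self) -> bool:
        return not self.actions


def lines(move: Move, prefix: Optional[List[str]] = None) -> List[List[str]]:
    """Enumerate all lines (root-to-leaf action sequences) below `move`."""
    prefix = prefix or []
    if move.is_leaf():
        return [prefix]
    result: List[List[str]] = []
    for label, child in move.actions.items():
        result.extend(lines(child, prefix + [label]))
    return result


def decision() -> Move:
    """A move of participant 1 with two actions, each leading to a leaf."""
    return Move(player=1, actions={"bet": Move(), "check": Move()})


# Hypothetical toy game: nature flips a fair coin, then participant 1 acts.
root = Move(
    player=CHANCE,
    actions={"heads": decision(), "tails": decision()},
    chance={"heads": 0.5, "tails": 0.5},
)

all_lines = lines(root)
print(all_lines)
```

Because every move stores its children directly, the structure is a tree by construction, and the recursion in `lines` terminates exactly when the game is finite.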

3.3 Payoffs
Payoffs are the outcome of a game. They describe who wins and, when relevant, how much. The difficulty is: how much of what? Payoffs are denominated in utility units, which formalize comparable value, interest and the measurable or unmeasurable desires of the participants. Finding a suitable utility unit and assigning it in a consistent manner to the outcomes of the game can be a daunting task. A player may be willing to endure a poor outcome if the outcome for his foe is even worse. Such preferences must not violate rationality requirements. Fortunately, in poker there is an inherent unit of utility: the wager. We make the obvious but explicit assumption that each participant is solely trying to maximize their expected wager units in each game.

The payoff is a function that assigns each leaf a tuple of real values, one per participant. The payoff function is said to be constant sum if the payoffs at every leaf sum to the same constant. If the constant happens to be zero, the payoff, and the game, are called zero sum.
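The constant sum property is easy to test mechanically. Below is a small sketch under the assumption that leaves are given as a dictionary from a leaf label to a tuple of payoffs; the function name `constant_sum` and the example leaves are hypothetical.

```python
from typing import Dict, Optional, Tuple


def constant_sum(payoffs: Dict[str, Tuple[float, ...]],
                 tol: float = 1e-9) -> Optional[float]:
    """Return the constant c if all leaf payoffs sum to c, otherwise None."""
    totals = [sum(z) for z in payoffs.values()]
    c = totals[0]
    return c if all(abs(t - c) <= tol for t in totals) else None


# Hypothetical leaves of a two-participant game, payoffs in wager units.
leaf_payoffs = {
    "fold":          (-1.0, 1.0),
    "showdown_win":  (2.0, -2.0),
    "showdown_lose": (-2.0, 2.0),
}

c = constant_sum(leaf_payoffs)
is_zero_sum = c is not None and abs(c) <= 1e-9
print(c, is_zero_sum)
```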

3.4 Information Sets


For games of complete information, such as chess or roulette, the notion of information sets is not necessary. Any two moves, that is any two game states, are distinguishable to every participant. For games of incomplete information, such as poker or rock-paper-scissors, an information set is a set of moves which, at the corresponding player's turn, are indistinguishable to her. When taking an action at an information set of size 2 or more, the current game state is not fully known; it may be any of the moves in the information set. For the moves to be indistinguishable, each move in a given information set must have the same set of possible actions.
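The requirement on information sets can be checked directly. The sketch below models rock-paper-scissors as a sequential game in which participant 2 cannot observe participant 1's action; the move names and the helper `is_valid_information_set` are assumptions for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple

RPS = frozenset({"rock", "paper", "scissors"})

# move name -> (player to act, set of available actions)
moves: Dict[str, Tuple[int, FrozenSet[str]]] = {
    "start":          (1, RPS),
    "after_rock":     (2, RPS),
    "after_paper":    (2, RPS),
    "after_scissors": (2, RPS),
}

# Participant 2 cannot tell which action participant 1 took,
# so her three moves form a single information set.
information_sets: List[List[str]] = [
    ["start"],
    ["after_rock", "after_paper", "after_scissors"],
]


def is_valid_information_set(info_set: List[str]) -> bool:
    """All moves must belong to one player and offer identical actions."""
    owners = {moves[m][0] for m in info_set}
    action_sets = {moves[m][1] for m in info_set}
    return len(owners) == 1 and len(action_sets) == 1


valid = all(is_valid_information_set(s) for s in information_sets)
print(valid)
```

If one of the moves offered a different action set, the acting participant could infer which move she is at from the options available, which would contradict the moves being indistinguishable.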

3.5 Strategies
With the framework of the game laid out, we consider how the participants' behavior can be modeled. During game play, each participant chooses an action from his action set at each of his reached information sets. However, the participant could note down his chosen action for each information set before the game starts, and the noted actions could be executed automatically. This predetermined choice of actions for all information sets is called a pure strategy.

A participant, especially in games of incomplete information, may not always choose the same action at a given information set, in order not to give off information that allows the other participants to narrow down the possible moves within their information sets. A predetermined choice, for each information set, of a probability distribution over all possible actions is called a strategy. A strategy is a sufficient notion for describing all possible game play of a participant. The set of all participants' strategies is sufficient to simulate the game. For a given set of participants' strategies, the average payoffs can be determined. A strategy is said to dominate another strategy if its (expected) payoff is always at least that of the dominated strategy.
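The difference between a pure strategy and a strategy can be sketched in a few lines. The information set names (`holding_king`, `holding_jack`), the probabilities and the helper functions below are hypothetical and only illustrate the two notions; they are not taken from the toy games solved later.

```python
import random
from typing import Dict, List

ActionDist = Dict[str, float]

# A pure strategy notes down exactly one action per information set.
pure_strategy: Dict[str, str] = {"holding_king": "bet", "holding_jack": "check"}

# A strategy notes down a probability distribution per information set.
strategy: Dict[str, ActionDist] = {
    "holding_king": {"bet": 1.0, "check": 0.0},
    "holding_jack": {"bet": 0.25, "check": 0.75},  # occasional bluff
}


def is_valid(strat: Dict[str, ActionDist]) -> bool:
    """Every distribution must be non-negative and sum to one."""
    return all(
        min(dist.values()) >= 0.0 and abs(sum(dist.values()) - 1.0) < 1e-9
        for dist in strat.values()
    )


def as_strategy(pure: Dict[str, str],
                action_sets: Dict[str, List[str]]) -> Dict[str, ActionDist]:
    """Embed a pure strategy as a degenerate probability distribution."""
    return {
        info: {a: (1.0 if a == pure[info] else 0.0) for a in action_sets[info]}
        for info in pure
    }


def act(strat: Dict[str, ActionDist], info_set: str, rng: random.Random) -> str:
    """Execute the noted strategy automatically at a reached information set."""
    actions = list(strat[info_set])
    weights = [strat[info_set][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


action_sets = {"holding_king": ["bet", "check"], "holding_jack": ["bet", "check"]}
embedded = as_strategy(pure_strategy, action_sets)
sample = act(strategy, "holding_jack", random.Random(0))
print(embedded, sample)
```

Every pure strategy is a special case of a strategy, which is why the embedding in `as_strategy` loses no information.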

3.6 Solutions
A solution to a game is an attempt to predict how the participants will act in the game. The more complex a game, the more difficult it is to define a notion of solution that correctly predicts behavior. At the core of all solution concepts lies the following idea. A participant can, given the strategies of all other participants, find a strategy that maximizes her (expected) payoff. If a number of participants are cooperating, whether or not payoffs are transferable, that group of participants can, given the strategies of all other participants, find a set of strategies that maximizes their (expected) payoff. If this process, for all participants or cooperating parties of participants, yields no change in strategy for an initial given set of strategies, the strategies are said to be in equilibrium. Such an equilibrium strategy set has the desirable property of being stable and can thus be seen as a solution.

A game may have none or many such equilibria. Their occurrence, structure, finding, approximation and pruning down to a smaller set or single solution are part of the goal of game theory. What qualifies as a solution and the structure of the solution space depend on the number of players, the repetition of game play, the properties of the payoff and the exchange of value and information among participants outside the game tree before, during or after the game.

Constant Sum or Non-Constant Sum: A constant sum game has a constant total payoff: there exists a $c \in \mathbb{R}$ such that $\sum_{i \in P \setminus \{0\}} Z_i(v) = c$ for all $v \in T$. A game is non-constant sum if there exists no such value.

Two Participants or Three+ Participants: A game has two participants if $P$ has two elements in addition to 0. A game has three or more participants if $P$ has three or more elements in addition to 0.

Preplay Negotiating or Preplay Announcements or No Preplay Communication: Preplay negotiations allow participants to discuss their actions before the game starts. Two participants may form a coalition, agreeing to adhere to specific strategies. This agreement may be enforceable (a contract between two companies in a legal system) or unenforceable (a verbal commitment which can be violated). These negotiations are potentially complex, and may constitute a game themselves. Preplay announcements allow participants to make promises and threats without feedback ("If this happens, I will do that"). No preplay communication is modeled most closely by our notion of a game, where no participant can take actions outside the game tree.

Single Game or Repeated Game: A game can be played a single time or be repeated many times. For repeated games, the execution of promises and threats, abiding by and defecting from coalitions, and other forms of contract negotiation play an important role.

Outside Payments or No Outside Payments: A game which permits outside payments allows participants to transfer part of their payoff to one another. This allows the formation of cooperations: groups of participants that try to maximize their joint payoff.
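The equilibrium idea described in this section, that no participant can improve by changing strategy while the others keep theirs, can be checked numerically for a small two-participant zero-sum game. The sketch below uses rock-paper-scissors written as a payoff matrix; the matrix encoding and the function names are assumptions for illustration. Checking only pure deviations suffices here because the expected payoff is linear in a participant's own mixing probabilities.

```python
from typing import List

# Row participant's payoffs in rock-paper-scissors (order: rock, paper,
# scissors). The game is zero sum: the column participant gets the negative.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]


def row_payoffs(A: List[List[float]], y: List[float]) -> List[float]:
    """Expected payoff of each pure row action against column strategy y."""
    return [sum(a * q for a, q in zip(row, y)) for row in A]


def col_payoffs(A: List[List[float]], x: List[float]) -> List[float]:
    """Expected payoff of each pure column action against row strategy x."""
    return [-sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def is_equilibrium(A, x, y, tol: float = 1e-9) -> bool:
    """True if neither participant gains by deviating unilaterally."""
    value = sum(x[i] * A[i][j] * y[j]
                for i in range(len(A)) for j in range(len(A[0])))
    row_ok = max(row_payoffs(A, y)) <= value + tol
    col_ok = max(col_payoffs(A, x)) <= -value + tol
    return row_ok and col_ok


uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(is_equilibrium(A, uniform, uniform))
print(is_equilibrium(A, always_rock, uniform))
```

Against a participant who always plays rock, the column participant gains by switching to paper, so that strategy pair is not stable in the sense above.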

3.7 Formalization
A game is described by $\mathcal{G} = (P, G, q, \mathcal{I}, A, a, x, Z)$:

1. $P = \{0, 1, \dots, n\}$, a set of players.
2. $G = (V, E)$, $v_0 \in V$, $T \subset V$, a finite tree with root $v_0$ as starting move, a subset $T$ of terminal moves and $E \subseteq (V \setminus T) \times V$ actions from move to move.
3. $q : V \setminus T \to P$, a map that assigns a player to each move.
4. $\mathcal{I}$, a partition of $V \setminus T$ into the players' information sets, $\forall I \in \mathcal{I} : I \subseteq V \setminus T$ while $\bigcup_{I \in \mathcal{I}} I = V \setminus T$, which satisfies $\forall I \in \mathcal{I}\ \forall v, w \in I : q(v) = q(w)$, naturally mapping a player to her information sets.
5. $A = (A_I)_{I \in \mathcal{I}}$, a partition of all actions into possible actions per information set.
6. $a : E \to \bigcup_{I \in \mathcal{I}} A_I$, a surjective map from edge to an action.
7. $A_v = \{a((v, w)) \mid (v, w) \in E\}$, the action set for a move $v$.
8. $A_I = A_v\ \forall v \in I$, a property: the action set is the same for each move in an information set.
9. $x = \{\Pr(A) \mid A \in A_v,\ q(v) = 0\}$, a family of probability distributions over all action sets of 0.
10. $Z : T \to \mathbb{R}^n$, a payoff to each participant at the terminal moves.

This complete form, and similar forms, are referred to as the extensive form of a game.

A pure strategy $S_i$ for a player $i$ chooses one possible action in each information set:

$$S_i = \{A \in A_I \mid v \in I,\ q(v) = i\} \quad \text{for } i \in P \setminus \{0\}.$$

A given strategy match-up $S = (S_1, \dots, S_n)$ yields a payoff to player $i \in P \setminus \{0\}$ of

$$Y_i(S) = \sum_{v \in T} \Pr(v)\, Z_i(v),$$

where $\Pr(v)$ denotes the probability of reaching leaf $v$. Let $H$ denote the set of all directed paths from the root to $v$ as an edge list, that is all lines that reach leaf $v$. Then $\Pr(v)$ is

$$\Pr(v) = \sum_{h \in H} \prod_{e \in h} \Pr(a(e)),$$

where

1. $\Pr(a((v, w))) \in x$ is given if $q(v) = 0$,
2. $\Pr(a((v, w))) = 1$ if $q(v) \neq 0$ with $a((v, w)) \in S_{q(v)}$, or
3. $\Pr(a((v, w))) = 0$ if $q(v) \neq 0$ with $a((v, w)) \notin S_{q(v)}$ otherwise.

We can now describe the game from a behavioral perspective, viewing the game in the light of what the players can do. The game is described by $\mathcal{G} = (P, \mathcal{S}, Y)$:

1. $P = \{1, \dots, n\}$, a set of active players.
2. $\mathcal{S} = \{\mathcal{S}_i \mid i \in P\}$, a set of finite sets of pure strategies available to each player.
3. $Y = \{Y_i \mid i \in P\}$, a set of linear payoffs for each player, where $Y_i : \prod_{j \in P} \mathcal{S}_j \to \mathbb{R}$ is the payoff player $i$ receives under a strategy match-up.

This is also referred to as the normal form of a game.

The solution theory for games is broad. A small overview is given below.

A constant sum game with two participants (2 players and nature) has a solution with unique payoff (Minimax theorem, section 9).

A non-constant sum game with two participants and enforceable preplay negotiations that allows outside payments will push the participants to form a coalition. This is also called a cooperative game, as participants can work together. They can reach a leaf $v \in T$ such that the total payoff $\sum_i Z_i(v)$ is maximal. Finding all strategy pairs that reach this maximal total payoff is straightforward. However, it is unclear how the maximal total payoff is to be distributed among the participants; typically an arbiter is useful to determine this distribution in a fair manner. Fair arbitration schemes are widely studied, e.g. as bargaining problems where under specific axiomatic desiderata a single solution can emerge [5, p 155-162].

A non-constant sum game with two participants and enforceable preplay negotiations that does not allow outside payments still pushes the participants to form a coalition; however, their objective is more complicated than maximizing the total payoff over $v \in T$. Determining a fair solution can also be seen as a bargaining problem.

A non-constant sum game with two participants without preplay communication allows no strategic agreement to be made. This is also called a non-cooperative game, as participants cannot work together despite potentially possible outside payments. There still exists at least one equilibrium strategy pair [6, p 286-295]. However, an equilibrium pair is a poor notion of solution in this case. A solution with a more applicable definition may not exist [4, p 106-109].
Interestingly, however, if such a game is repeated many times, implicit collusion may occur. Patterns in the choice of strategy can serve as a means of in-game communication strong enough to settle on a solution of the corresponding cooperative game [4, p 110-111].

A game with three+ participants with enforceable preplay negotiations and outside payments allows for the definition of a characteristic function $v : 2^{P \setminus \{0\}} \to \mathbb{R}$ which captures the maximal value a coalition of the participants can achieve. The value of coalition $S$, $v(S)$, is computed by assuming that the participants in $P \setminus S$ form an opposing coalition, in turn reducing the problem to a two participant game. Coalitions which are stable in the sense that no participant in $S$ gains by defecting are called imputations. Under different sets of assumptions different imputations qualify as solutions, e.g. introducing a domination relation on the set of imputations leads to useful results [9].

For other three+ participant games there exist various approaches; a broad overview can be found in [4, p 170].
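The payoff $Y_i(S)$ of a strategy match-up from the formalization in this section can be computed by walking the game tree and multiplying the probabilities along each line. The following is a minimal sketch under stated assumptions: the tuple encoding of nodes, the function `expected_payoffs` and the small one-card betting game are all hypothetical and chosen only to keep the example short.

```python
# Node encodings (all hypothetical):
#   ("leaf", (z1, z2))                       terminal move with payoffs Z
#   ("chance", {action: (prob, child)})      move of player 0, distribution x
#   ("move", player, info_set, {action: child})


def expected_payoffs(node, S, n=2, reach=1.0):
    """Sum Pr(v) * Z(v) over all leaves v reached under pure strategies S."""
    kind = node[0]
    if kind == "leaf":
        return [reach * z for z in node[1]]
    if kind == "chance":
        total = [0.0] * n
        for prob, child in node[1].values():
            sub = expected_payoffs(child, S, n, reach * prob)
            total = [t + s for t, s in zip(total, sub)]
        return total
    _, player, info_set, children = node
    chosen = S[player][info_set]  # Pr = 1 for the chosen action, 0 otherwise
    return expected_payoffs(children[chosen], S, n, reach)


def after_bet(z_call):
    """Participant 2 faces a bet without knowing participant 1's card."""
    return ("move", 2, "facing_bet",
            {"call": ("leaf", z_call), "fold": ("leaf", (1.0, -1.0))})


root = ("chance", {
    "high": (0.5, ("move", 1, "high",
                   {"bet": after_bet((2.0, -2.0)), "check": ("leaf", (1.0, -1.0))})),
    "low":  (0.5, ("move", 1, "low",
                   {"bet": after_bet((-2.0, 2.0)), "check": ("leaf", (-1.0, 1.0))})),
})

S = {1: {"high": "bet", "low": "check"}, 2: {"facing_bet": "call"}}
Y = expected_payoffs(root, S)
print(Y)
```

Evaluating `expected_payoffs` for every combination of pure strategies would fill in the payoff table of the normal form; the number of combinations grows quickly with the number of information sets, which is one motivation for working with small toy games.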

4 Modeling Poker
4.1 Players
Poker is played by two to ten participants, which map directly to the player set {1, 2, ...}. Player 0 takes the important role of the shuffle. A shuffle can be modeled to take place either once before any other player takes a move (pick a random permutation of the deck once, see section 4.7) or every time a street is dealt (pick the appropriate number of random cards from those cards which are left). The second methodology, although appearing less clean, has two big advantages. Firstly, the game tree is much smaller, as the order of undrawn cards is irrelevant. Even if we were to restrict permutations to those on n out of 52 cards, where n is the maximum number of known cards for a round, we would still specify cards for branches of the game tree where they are never dealt. Secondly, the notion of new cards being drawn at the time they influence the information sets interacts more naturally with the description of behavioral strategies.

That each participant knows the game structure, that is the other players, moves, actions, payoffs and information sets, is a given in poker, as the rules are known and easy to understand. As hinted at earlier, the structure of the game with units of wager allows for satisfying the rationality requirement: each participant simply aims to maximize their (expected) units of wager.
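The size argument for dealing per street can be illustrated with a short count. The sketch below assumes a game with two participants, so that at most n = 9 cards become known in a line (two hole cards each, three flop cards, one turn card and one river card); the variable names are illustrative only.

```python
import math

# Shuffle once: player 0 picks one of 52! orderings of the whole deck.
full_permutations = math.factorial(52)

# Restricting to ordered draws of the n = 9 cards that can become known.
ordered_nine = math.perm(52, 9)

# Dealing per street: only which cards arrive matters,
# not their order within a street.
per_street = (
    math.comb(52, 2) * math.comb(50, 2)  # hole cards of both participants
    * math.comb(48, 3)                   # flop
    * 45 * 44                            # turn and river
)

print(f"{full_permutations:.3e} {ordered_nine:.3e} {per_street:.3e}")
```

The per-street count is smaller than the ordered count by exactly the factor 2! * 2! * 3! = 24, the number of orderings within the hole cards and the flop. In addition, lines that end before the river never require the later chance moves at all.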

4.2 Moves, Actions and Game Trees


The participants in a poker game act strictly sequentially. The arbitrary ordering of moves which is necessary for games with simultaneous moves is therefore not needed and adds no confusion to the model for poker.

Each participant has a stack size at the start of the game. The stack size is the total amount the participant can wager throughout the game. Each participant has a position which determines the order in which they move in the game. The root of the game tree is a move by 0 dealing the initial starting hands of the participants. A participant's hole cards stay constant and known to them for the entire line. There are up to three more moves by 0, depending on the specific variant of poker and the line chosen by the participants. The moves of 0 separate the streets on which wagering takes place. The four streets, in order, are called preflop, flop, turn and river.

At the beginning of the first street, all participants are active. On each street, the (still active) participants engage in a round of wagering, a round of betting. The order in which participants bet is fixed for each round. The betting actions traverse this order in a cyclical manner until the end of the round of betting is reached. A round of betting is terminated when all (still active) participants have acted at least once and one of the two following conditions is met:

- All (still active) participants have bet the same amount.
- Only one participant remains active.

At the end of a betting round, if only one participant remains active, payoffs are determined. If two or more participants remain active, 0 acts to initiate the next betting round. If the last betting round, the river, ends with two or more active participants, there is a showdown: the starting hands are revealed, and payoffs are determined based on them.

At each move there is a current bet size. For the first move preflop, the starting bet size is set by the game rules to a fixed amount which a number of participants are forced to place. Without this initial wager, there would be no reason to engage in betting. In hold'em there are two starting bets, called small blind and big blind, forced onto two participants. The current bet size can be altered by the participants' actions. Whenever a new street starts by a move of 0, the current bet size is set to zero.

The actions the participants can take during a round of betting fall into three categories.

Fold: A fold is the action of passing. When folding, a participant discards her hole cards and surrenders. A fold is a terminal action for a participant; it is always her last action. The participant changes from being active to not being active. The payoff for the participant is fixed at this point to the negative total amount wagered up to that move.

Call: A call is the action of matching the current bet size. When the current bet size is zero, a call is called a check. When the current bet size is larger than the remaining stack size of the acting participant, prohibiting her from matching the bet size in full, the call action matches the bet size up to the maximum possible, and she is said to be all-in. An all-in participant is usually treated as automatically matching all future bets for free without the need to take the action explicitly.

Raise: A raise is the action of increasing the current bet size. When the current bet size is zero, a raise is called a bet. An action of raising requires a numerical value, either the amount of the increase or the total current bet size; the author prefers the latter notation. The size of the bet increase is subject to a number of restrictions. The minimum increase from zero on each street is given by the game rules. In no-limit hold'em it is equal to the initial big blind. The minimum increase from a non-zero amount is the previous increase. For example, the bet size progression may take the form 2, 4, 6, 12, 18 with increases of 2, 2, 6, 6 respectively. The progressions 2, 4, 5 and 2, 4, 6, 12, 14 are prohibited. The granularity of the increase is specified by the game rules. Typically the minimum currency denomination (1 cent) in an online setting and the smallest chip in a brick and mortar casino are used, but other conventions are possible.

Exception to the above: if a participant's stack size is not sufficient for a raise by the minimum increase amount, she has the option to raise all-in. Such an incomplete raise must be matched by other participants to continue to the next street. However, depending on the house rules, it may not constitute a full raise that increased the current bet size, and the minimum increase afterwards differs.
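The minimum-raise rule can be illustrated with a short check. This is only a sketch: the function name `is_legal_progression` and the parameter `min_open` (the minimum increase from zero, here the big blind) are illustrative and not part of the text, and the all-in exception is deliberately ignored.

```python
def is_legal_progression(bets, min_open):
    """Return True if every increase is at least as large as the previous one.

    bets     -- successive total bet sizes on one street, starting from zero
    min_open -- assumed minimum increase from zero (e.g. the big blind)
    """
    min_increase = min_open
    previous = 0
    for bet in bets:
        increase = bet - previous
        if increase < min_increase:
            return False
        # The increase just made becomes the new minimum increase.
        min_increase = increase
        previous = bet
    return True


legal = is_legal_progression([2, 4, 6, 12, 18], min_open=2)
too_small = is_legal_progression([2, 4, 5], min_open=2)
shrinking = is_legal_progression([2, 4, 6, 12, 14], min_open=2)
```

Under these assumptions the first progression from the text is accepted and the two prohibited progressions are rejected.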

4.3 Payoffs and Rake


The payoff for a participant is determined in three situations:

Folding. When the terminal action of folding is taken, that participant's payoff is equal to her negative total wager during the line.

Last Active. When a participant is left as the only active one at any point, her payoff is equal to the total amount wagered by all other participants during the line, minus rake.

Showdown. When two or more participants remain active until the last round of betting is completed, there is a showdown. Each participant reveals her hole cards, which were secret up to that point. The participant whose hole cards form the best hand achieves a payoff equal to the total amount wagered by all other participants, minus rake. If two or more participants hold hands of matching strength, they each achieve a payoff equal to the total amount wagered by all other participants, minus rake, split evenly among them. Those participants whose hole cards do not form the best hand achieve a payoff equal to their negative total wager during the line.⁸

Amounts wagered by the participants during a hand, starting with the small blind and big blind, are collected in the pot. In a real-life environment this means putting physical chips into the middle of a table. In an online environment the pot is displayed virtually for accounting purposes. When everyone but one participant folds, or a participant wins (has the best hand) at showdown, she is said to win the pot, minus rake.

A game of poker is run by a host. The host ensures the integrity of the game and charges a fee for the service. This fee is called rake; rake is typically charged in one of two forms. For tournaments, whose special properties are not part of this discussion, there is an entry fee, part of which is rake. Past the entry fee there is no rake, and the game is truly zero-sum. For cash games, the form of poker discussed in this writing, each played hand is raked. Whenever a flop is seen, a percentage of the pot is taken as rake. At low stakes (pot sizes rarely exceed $60), this can be as high as 5% of the pot size. At high stakes (pot sizes regularly exceed $60), the rake is typically capped at a fraction of a big blind. With this structure, the game is not zero-sum or even constant-sum. The rake can be modeled as a third player with a deterministic strategy. In this discussion we ignore these effects and model a game without fees, as they add more complexity than their negligible size justifies.

⁸ Situations in which some participants have wagered their entire stack and others have not are paid out in steps until all active participants have bet less than their starting stack size. For participants that are all-in: i) Determine the smallest starting stack size s among the active participants. ii) Wagers made up to s by each active participant are paid off as described above. iii) Participants with starting stack size s are removed from the list of active players. iv) All accounted-for total wagers are reduced by s. Rake is deducted in each step until the maximum is reached.

    Hand #113628743330: Hold'em No Limit ($5/$10 USD) [2014/03/21 17:51:39 ET]
    Seat #1 is the button
    Seat 1: Player1 ($2866.33 in chips)
    Seat 3: Player2 ($1898 in chips)
    Player1: posts small blind $5
    Player2: posts big blind $10
    *** HOLE CARDS ***
    Dealt to Player1 [6h 2d]
    Dealt to Player2 [Js 7s]
    Player1: raises $20 to $30
    Player2: calls $20
    *** FLOP *** [5s 8h 7c]
    Player2: checks
    Player1: bets $60
    Player2: calls $60
    *** TURN *** [5s 8h 7c] [Jc]
    Player2: checks
    Player1: checks
    *** RIVER *** [5s 8h 7c Jc] [Qs]
    Player2: bets $120
    Player1: folds
    Uncalled bet ($120) returned to Player2
    Player2 collected $178.50 from pot
    Player2: doesn't show hand
    *** SUMMARY ***
    Total pot $180 Rake $1.50
    Board [5s 8h 7c Jc Qs]
    Seat 1: Player1 (button) (small blind) folded on the River
    Seat 3: Player2 (big blind) collected ($178.50)

Fig. 3 A record of a poker game. The game setup is given with the description of stack sizes in chips and the seating arrangement. Hole cards are dealt by player 0 to initiate game play. Subsequently the players take actions up to the river, where only Player2 remains active after Player1 folds. This initiates payoffs without comparison of hole cards.
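The payoffs of the hand in Fig. 3 can be recomputed from the "Last Active" rule. The following is a sketch only: the function name `last_active_payoffs` is illustrative, and the wagers of 90 per player are read off the record (30 preflop plus 60 on the flop; the uncalled river bet is returned and therefore not counted).

```python
from fractions import Fraction


def last_active_payoffs(wagers, winner, rake):
    """Payoffs when every participant except `winner` has folded.

    wagers -- total amount each participant put into the pot
    rake   -- fee taken from the pot by the host
    """
    payoffs = {player: -amount for player, amount in wagers.items()}
    wagered_by_others = sum(a for p, a in wagers.items() if p != winner)
    payoffs[winner] = wagered_by_others - rake
    return payoffs


wagers = {"Player1": Fraction(90), "Player2": Fraction(90)}
payoffs = last_active_payoffs(wagers, winner="Player2", rake=Fraction(3, 2))
total = sum(payoffs.values())
```

Player2's payoff of 88.50 plus her own 90 gives the 178.50 she collects, and the payoffs sum to minus the rake, which is why the raked game is not zero-sum.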

4.4 Information Sets


The first move by player 0 is the only move⁹ where hidden information is created. Each participant is dealt a number of cards which are known only to her. At every succeeding move every participant has full knowledge of all prior actions taken by all players¹⁰ except the first action by player 0. An information set for a participant thus spans all moves reachable by the given action sequence with all admissible¹¹ hole cards of all other participants. The size of an information set in Texas Hold'em for n participants is

$$\prod_{i=2}^{n} \binom{54-2i}{2}$$

preflop, as every other participant is dealt two random cards; for n = 2 this is $\binom{50}{2} = 1225$. On the flop, turn and river the set of cards to choose from decreases by 3, then 1, then 1, shrinking the size of the information sets.
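The sizes of these information sets can be tabulated directly. This is a sketch under the assumption that hole cards are assigned to the other participants one after another (so the assignments are ordered by player); the names `information_set_size` and `BOARD_CARDS` are illustrative.

```python
from math import comb

# Number of board cards visible on each street.
BOARD_CARDS = {"preflop": 0, "flop": 3, "turn": 4, "river": 5}


def information_set_size(n_players, street="preflop"):
    """Number of hole-card assignments to the other n_players - 1 participants."""
    unseen = 52 - 2 - BOARD_CARDS[street]  # own hole cards and board are known
    size = 1
    for _ in range(n_players - 1):
        size *= comb(unseen, 2)
        unseen -= 2  # the next opponent draws from two fewer cards
    return size


heads_up = {street: information_set_size(2, street) for street in BOARD_CARDS}
three_handed_preflop = information_set_size(3)
```

For two participants this gives 1225 preflop and smaller sets on each later street, in line with the shrinking described above.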

4.5 Strategies
A strategy describes which action a player chooses in each information set. As the other participants' hole cards vary (take all possible values) across a player's information set, a strategy must be formulated without taking into account the cards of the opponents. A player must choose his actions solely based on his own hole cards and the preceding actions taken by the other players. Strategies for the first one or two preflop actions are simple enough to be written down, as there are only 169 distinct two-card starting hands in Hold'em¹² and only a few actions could have been taken by the other players before. Once the flop is dealt, however, the number of distinct hole cards can increase to up to 1176¹³ and the number of potential actions taken grows almost exponentially¹⁴.
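Both counts can be reproduced by enumeration. The sketch below assumes the usual class labels such as AKs, AKo and AA for suit-isomorphic starting hands; the helper name `preflop_class` is illustrative.

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def preflop_class(card_a, card_b):
    """Map two hole cards to their suit-isomorphic class, e.g. 'A2s'."""
    hi, lo = sorted((card_a, card_b),
                    key=lambda card: RANKS.index(card[0]), reverse=True)
    if hi[0] == lo[0]:
        return hi[0] + lo[0]  # a pair
    return hi[0] + lo[0] + ("s" if hi[1] == lo[1] else "o")


classes = {preflop_class(a, b) for a, b in combinations(DECK, 2)}
distinct_preflop = len(classes)
# With three board cards known, two hole cards are chosen from 49 cards.
distinct_on_flop = comb(52 - 3, 2)
```

The 169 classes split into 13 pairs, 78 suited and 78 offsuit hands.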
⁹ In forms other than Texas Hold'em and Omaha there can be more than one move by player 0 which creates hidden information.
¹⁰ Including her own; this may not be a given in games such as bridge.
¹¹ Cards which are held by the player or are on the board can't be hole cards of someone else.
¹² Due to suit isomorphism, Ac2c and As2s are strategically identical preflop (see 4.7 for notation).
¹³ If no suit isomorphism exists anymore, as on flops with three different suits.
¹⁴ The number of actions per move is almost constant; it only decreases slowly as the total amount bet increases. For example, with two active players, the game trees below the moves reached by the actions "bet b" and "check, bet b" are identical in size.


4.6 Solutions
Ignoring the rake, poker is a zero-sum game. For two players there exists a single value which can be reached by a pair of equilibrium strategies. For more than two players, preplay communication is explicitly prohibited by the rules; poker operators in fact use software to detect potential strategy synchronization. Outside payments are of course not allowed either, but that is virtually unenforceable. Implicit collusion poses a problem, though. The game is repeated around 100 times per hour per table online, and the sets of participants at multiple tables can overlap, so the same participants can play several hundred games per hour against or with each other.

4.7 Cards and the Deck


Poker is played with a specific set of cards, called the deck. Cards are differentiated by two properties: rank and suit. Each card has exactly one of thirteen ranks and exactly one of four suits. Two cards are different if they differ in rank and/or suit. A deck is the maximal set of cards: 52 cards (13 ranks times 4 suits).

Ranks are, in increasing order: Two, Three, Four, Five, Six, Seven, Eight, Nine, Ten, Jack, Queen, King, Ace. Suits are, in increasing order: Clubs, Diamonds, Hearts, Spades. This induces a total order on a set of cards. To determine the higher of two cards, one simply compares them by rank; if the ranks match, one compares them by suit. As such, the Eight of Hearts is said to be higher than the Four of Spades (any Eight is higher than any Four), while the Four of Spades is higher than the Four of Hearts.

Cards are noted in short form by first stating their rank as 2, 3, ..., 9, T, J, Q, K, A and then stating their suit as c, d, h, s. The short form As denotes the Ace of Spades, the highest-ranking card, whereas 2c denotes the Two of Clubs, the lowest-ranking card. While an isomorphism between a deck with its ordering and the pair ({0, 1, ..., 51}, ≤) is mathematically viable (and used in computer models), the properties of the studied games revolve around rank and suit, making the suggested notation most practical.

When the discussion is focused solely on rank, or when all suits are meant, cards can be denoted by rank only. T denotes the rank Ten, any Ten, or all Tens, based on context.

A set of cards is typically denoted by a list of the cards without spaces. AsAc and AcAs both denote the two cards Ace of Spades and Ace of Clubs. The former version, listing the cards in decreasing order, is the preferred default.

In part of the discussion in this writing, sets of two cards are studied. Certain shorthand notations are used when referring to sets of two-card sets. A Pair is two cards of matching rank. Two cards are called suited if their suits match and offsuit otherwise.
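The isomorphism with {0, 1, ..., 51} mentioned above can be written down in a few lines. This is one possible encoding, chosen here so that the integer order matches the card order (rank first, then suit); the function names are illustrative.

```python
RANKS = "23456789TJQKA"  # increasing order
SUITS = "cdhs"           # increasing order


def card_index(card):
    """Map a card in short form, e.g. 'As', to an integer in 0..51."""
    return 4 * RANKS.index(card[0]) + SUITS.index(card[1])


def index_card(index):
    """Inverse of card_index."""
    return RANKS[index // 4] + SUITS[index % 4]


def canonical(cards):
    """Write a set of cards without spaces, in decreasing order."""
    return "".join(sorted(cards, key=card_index, reverse=True))


lowest, highest = card_index("2c"), card_index("As")
preferred = canonical(["Ac", "As"])
```

With this encoding, comparing two cards reduces to comparing two integers, which is why computer models tend to prefer it over the rank-and-suit notation.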

4.8 Other Conventions


When the current bet has been increased multiple times on a street, the successive increases are referred to in shorthand as bet, raise, 3bet, 4bet and so on.


Notation        Meaning
**, ATC         two random cards, Any Two Cards
A*, Ax          an Ace of any suit and any other random card
T*              a Ten of any suit and any other random card
Tx              a Ten of any suit and a random card of rank 9 or lower
Q8              a Queen and an Eight of any suits
Q8s             a Queen and an Eight of matching suit, equal to Qs8s, Qh8h, Qc8c, Qd8d
Q8o             a Queen and an Eight of differing suit, equal to Q8 minus Q8s
As*s            the Ace of Spades and any other Spade
5cxc            the 5 of Clubs and any other Club of rank 4 or lower, equal to 5c4c, 5c3c, 5c2c
4c*o            the 4 of Clubs and any other non-Club
*h*h            any two Hearts
KK-99           two cards of equal rank between K and 9
AK-AT, AT+      an Ace and any card of rank K to T
22+             any Pair
T6s+            T9s, T8s, T7s, T6s

Fig. 4 Various shorthands for denoting specific sets of two cards.

The last street, if not the fourth in a given variant of poker, is still referred to as the river.

In no-limit hold'em, there is a distinct positioning of participants. With three or more participants, the players {1, 2, 3, ..., n} are ordered with 1 being the small blind, 2 being the big blind and n being the button. Before the game starts, the small blind and big blind place their forced initial wagers. Preflop action is ordered (3, ..., n, 1, 2), postflop¹⁵ action is ordered (1, 2, 3, ..., n). With two participants {1, 2}, player 2 is the small blind and button and player 1 is the big blind. Preflop action is ordered (2, 1), postflop (1, 2).

The research parts of this paper will deal with two participants exclusively. For this purpose we let, unless otherwise indicated, P = {0, 1, 2} whenever P is mentioned. 1 denotes the participant who acts last on the last street.
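Some of the shorthands of Fig. 4 can be expanded mechanically into explicit two-card sets. The sketch below covers only the rank-based notations (such as Q8, Q8s, Q8o, 22+, AT+ and T6s+); the function names `combos` and `expand_plus` are illustrative and not part of the text.

```python
RANKS = "23456789TJQKA"
SUITS = "cdhs"


def combos(high, low, kind=""):
    """All two-card sets for the ranks high/low; kind is '', 's' or 'o'."""
    result = []
    for s1 in SUITS:
        for s2 in SUITS:
            if high == low and s1 <= s2:
                continue  # pairs: skip identical cards and mirror images
            if kind == "s" and s1 != s2:
                continue
            if kind == "o" and s1 == s2:
                continue
            result.append(high + s1 + low + s2)
    return result


def expand_plus(token):
    """Expand shorthands of the form '22+', 'AT+' or 'T6s+'."""
    body = token.rstrip("+")
    high, low, kind = body[0], body[1], body[2:]
    if high == low:  # all pairs from this rank upward
        return [c for r in RANKS[RANKS.index(high):] for c in combos(r, r)]
    # second card from `low` up to one rank below `high`
    kickers = RANKS[RANKS.index(low):RANKS.index(high)]
    return [c for k in kickers for c in combos(high, k, kind)]


q8, q8s, q8o = combos("Q", "8"), combos("Q", "8", "s"), combos("Q", "8", "o")
all_pairs = expand_plus("22+")
```

The counts reflect the table: Q8 consists of 16 sets, of which 4 are suited and 12 offsuit, and 22+ covers all 13 · 6 = 78 pairs.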

4.9 Toy Games


A toy game tries to capture some of the elements of the real game.

¹⁵ Postflop refers to the set of streets flop, turn and river.


5 Discrete Toy Games


Our first type of toy game features discrete sets of cards, such as 3-card poker, which leads to small, enumerable game trees and strategy spaces. The games are therefore easy to understand and solve while still allowing some insight into the mechanics at work when playing poker.

One of the important aspects of discrete toy games is card removal. In 52-card poker, a player's hole cards (two in Hold'em) affect the distribution of hole cards available to the other players. A player holding the Ace of Spades knows that no one else can hold the same card. This blocking effect happens without the other players' knowledge: a strategy that plays the Ace of Spades in a specific way is never executed when an opponent holds that card, without the player noticing. In other words, the information sets of a player differ for each set of his own cards and always differ between two players.

This leads to curious results. For instance, in a given Hold'em situation a player may want to play a sixth of her 99 hole cards in a specific way. Since there are six ways to be dealt 99, she may choose one set of suits, such as the black nines 9s9c. This, however, gives away information to an opponent who happens to hold any 9. If the opponent holds a black nine, she can't be holding 9s9c, and if the opponent holds a red nine (9h or 9d), only three 99 combinations remain, so the combinations taking her other strategic option are cut from five to two, that is by 60%. For this reason, mixing must happen in a suit-isomorphic way: play a sixth of all 99 in the specific way, spread evenly over the six combinations.
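The blocking effect on the 99 example can be checked by enumeration. This is a small sketch; the names `ALL_99`, `SPECIAL` and `remaining_99` are illustrative.

```python
from itertools import combinations

NINES = ["9s", "9h", "9d", "9c"]
ALL_99 = [frozenset(pair) for pair in combinations(NINES, 2)]
SPECIAL = frozenset({"9s", "9c"})  # the combination played in the specific way


def remaining_99(opponent_card):
    """The 99 combinations still possible when the opponent holds one nine."""
    return [combo for combo in ALL_99 if opponent_card not in combo]


versus_black_nine = remaining_99("9s")
versus_red_nine = remaining_99("9h")
other_option_left = [c for c in versus_red_nine if c != SPECIAL]
```

Against a black nine the special combination is impossible; against a red nine it makes up one of three remaining combinations instead of one of six, and the other option shrinks from five combinations to two.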

5.1 3 Card Poker


In this first discrete poker game we aim to understand how poker works in one of the simplest toy game models: one street played with three cards, Ace, King and Queen. Each of the two participants is dealt one of the three cards at random; their information sets are therefore of size two, the smallest that supplies sufficient ambiguity to be called poker. At showdown, the highest card wins¹⁶. This game has been discussed from a practical perspective in [1, p140-147, 151-153]. The blinds are 1 unit per person. Player one has to choose a bet size b with each hand. Player two then has the option to call or fold to the bet. If 2 calls, there is a showdown. Formally¹⁷
¹⁶ A > K > Q
¹⁷ $G = (P, \mathcal{S} = (S, T), Y = (y, -y))$ with $y : S \times T \to \mathbb{R}$

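$$
\begin{aligned}
P &= \{1, 2\} \\
S &= [0, N]^{\{A,K,Q\}} \\
T &= \{\text{call}, \text{fold}\}^{\{A,K,Q\} \times [0,N]} \\
y(s,t) &= \sum_{(h_1,h_2) \in \{A,K,Q\}^2} \Big[ \\
&\quad + (s(h_1)+1)\,\mathbf{1}_{\{t(h_2,s(h_1)) = \text{call},\ h_1 > h_2\}}\,\rho(h_1,h_2) \\
&\quad - (s(h_1)+1)\,\mathbf{1}_{\{t(h_2,s(h_1)) = \text{call},\ h_1 < h_2\}}\,\rho(h_1,h_2) \\
&\quad + \mathbf{1}_{\{t(h_2,s(h_1)) = \text{fold}\}}\,\rho(h_1,h_2) \Big]
\end{aligned}
$$

Here $\rho(h_1,h_2)$ stands for the probability that player 0 deals $h_1$ to 1 and $h_2$ to 2.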

Figure 5 shows the payoffs Z = (z, −z) for the leaves of the game tree.

z(·, b, ·, call)    2: A        2: K        2: Q
1: A                            b + 1       b + 1
1: K                −(b + 1)                b + 1
1: Q                −(b + 1)    −(b + 1)

Fig. 5 Payoff z for 1 when h1 (row) is betting b and h2 (column) is calling. The payoff for h2 folding is always +1, irrespective of h1, h2, as no showdown is reached.

We can make the following observations on dominated strategies¹⁸. Recall that z is the payoff for 1, so 2 prefers smaller values of z.

• For 2, folding the Queen (z = 1) dominates calling (z = b + 1). For b = 0 both options are equal; for b > 0 folding is better by b. Thus, 2 will always fold a Queen.
• For 2, calling with the Ace (z = −(b + 1)) strictly dominates folding (z = 1). Thus, 2 will always call with the Ace.
• For 2, calling against b = 0 dominates folding. Thus, 2 will always call b = 0 without the Queen.
• For 1, assuming 2 folds all Queens and calls with all Aces, betting b = 0 with the King (z = 1 half the time against the Queen, z = −1 the other half against the Ace) strictly dominates betting b > 0 (z = 1 vs. the Queen, z = −(b + 1) vs. the Ace). Thus, 1 will bet b = 0 with the King.
• For 1, betting b > 0 with the Ace (z = +1 at some frequency of folds k ∈ [0, 1], z = b + 1 for calls the remaining (1 − k) of the time) dominates betting b = 0 (z = +1). Thus, 1 will bet some amount b > 0 with the Ace.

¹⁸ A strategy s1 [strictly] dominates s2 if y(s2, t) ≤ [<] y(s1, t) for all t.


In other words, 2 should not put in extra money when always losing at showdown with the Queen, but should always put in extra money when always winning with the Ace. 1 should not bet the King against that strategy, as there is no potential gain. 1 should always bet the Ace for more than 0. Eliminating dominated options, the potential solution space simplifies to:

$$
\begin{aligned}
S &= (0, N] \times \{0\} \times [0, N] \\
T &= \{\text{call}\} \times \{\text{call}, \text{fold}\}^{[0,N]} \times \{\text{fold}\}
\end{aligned}
$$

Next we consider the bet sizes 1 can choose.

• For 1, when holding the Queen, betting b = 0 (z = −1) dominates betting b > 2 (z = −(b + 1) at least half the time, when called by the Ace, and z = +1 at most half the time, when the King folds, which totals less than −1). The Queen will therefore bet b ∈ [0, 2].
• For 1, when holding the Ace, there exists a $b_A > 0$ such that the average payoff is maximal. If there were two such values $b_{A1}, b_{A2}$, 1 could choose either of them for the same payoff.¹⁹
• If 1 plays $b_Q > 0$ with the Queen and $b_A$ with the Ace, and $b_Q \neq b_A$, 2 will call with the King against $b_Q$ and fold against $b_A$. But then $b_A$ does not give the maximal payoff for the Ace of 1; $b_Q$ would be better. Therefore, if the Queen bets, it must bet $b_A$.

Does 1 bet $b_A$ with the Queen? Let $s_1 = (b_A, 0, b_A)$ (1 bets the Queen) and $s_2 = (b_A, 0, 0)$ (1 checks the Queen). Let $t_1 = (\text{call}, b_A \mapsto \text{call}, \text{fold})$ (2 calls with the King against $b_A$) and $t_2 = (\text{call}, b_A \mapsto \text{fold}, \text{fold})$ (2 folds the King against $b_A$). We find

$$
\begin{aligned}
y(s_1,t_1) &= \big((b_A+1) - (b_A+1) - (b_A+1) + 1\big)/6 \\
y(s_2,t_1) &= \big((b_A+1) - 1 - 1 + 1\big)/6 \\
y(s_1,t_2) &= \big(1 - (b_A+1) + 1 + 1\big)/6 \\
y(s_2,t_2) &= \big(1 - 1 - 1 + 1\big)/6
\end{aligned}
$$

where the first three summands are the payoffs of the card match-ups (A, K), (Q, A) and (Q, K), respectively. The fourth and last summand, 1, is the payoff achieved when 2 has the Queen against 1's Ace (payoff +1 for (A, Q)). The match-ups where 1 has the King, (K, A) and (K, Q), have average payoff 0.

¹⁹ Both 1 and 2 only have one decision point, so 2 can't force 1 to mix $b_{A1}, b_{A2}$ to achieve the maximum.


For example, $y(s_1,t_2)$ has 2 folding the King and 1 betting the Queen: (A, K) pays 1, (Q, A) pays $-(b_A+1)$ and (Q, K) pays 1. The four payoffs simplify to

$$
\begin{aligned}
y(s_1,t_1) &= -b_A/6 \\
y(s_2,t_1) &= b_A/6 \\
y(s_1,t_2) &= (2-b_A)/6 \\
y(s_2,t_2) &= 0
\end{aligned}
$$
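These four values can be cross-checked by enumerating all six deals. The sketch below assumes each deal has probability 1/6 and that 2 always calls a bet of 0; the helper names `payoff` and `make_t` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

RANK = {"A": 3, "K": 2, "Q": 1}


def payoff(s, t):
    """Average payoff y(s, t) for player 1.

    s maps player 1's card to a bet size; t(card, bet) returns
    "call" or "fold" for player 2.
    """
    total = Fraction(0)
    for h1, h2 in permutations("AKQ", 2):
        bet = s[h1]
        if t(h2, bet) == "fold":
            total += 1            # 1 wins the blind of 2
        elif RANK[h1] > RANK[h2]:
            total += bet + 1      # 1 wins blind and bet at showdown
        else:
            total -= bet + 1      # 1 loses blind and bet at showdown
    return total / 6


def make_t(king_calls_bet):
    """Player 2: the Ace calls, the Queen folds to bets, b = 0 is always called."""
    def t(card, bet):
        if card == "A" or bet == 0:
            return "call"
        if card == "K":
            return "call" if king_calls_bet else "fold"
        return "fold"
    return t


b = Fraction(1, 2)  # any bet size in (0, 2) can be used here
s1 = {"A": b, "K": 0, "Q": b}
s2 = {"A": b, "K": 0, "Q": 0}
t1, t2 = make_t(True), make_t(False)
values = {(i, j): payoff(s, t)
          for i, s in (("s1", s1), ("s2", s2))
          for j, t in (("t1", t1), ("t2", t2))}
```

For the chosen bet size the enumeration reproduces the simplified expressions above.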

We can see that no pair is in equilibrium. If 1 plays s1, 2 plays t1. If 2 plays t1, 1 plays s2. If 1 plays s2, 2 plays t2. And if 2 plays t2, 1 plays s1²⁰, completing a cycle. Translated into poker language: if 1 bluffs, 2 should call in the hope of catching the bluff. If 2 tries to catch all bluffs, 1 should never bluff. If 1 never bluffs, 2 should never try to catch a bluff, which in turn triggers 1 to always bluff. The solution is a mixed strategy of some bluffs and some calls where no player can unilaterally improve. This means that at equilibrium 1 has the same payoff when bluffing as when not bluffing, and 2 has the same payoff catching a bluff as not trying to catch a bluff; otherwise either could change their strategy to achieve a higher payoff. Let 1 play s1 with frequency u ∈ [0, 1] and 2 play t1 with frequency k ∈ [0, 1]. To satisfy the above constraint, it must hold that:

$$
\begin{aligned}
u\,y(s_1,t_1) + (1-u)\,y(s_2,t_1) &= u\,y(s_1,t_2) + (1-u)\,y(s_2,t_2) \\
k\,y(s_1,t_1) + (1-k)\,y(s_1,t_2) &= k\,y(s_2,t_1) + (1-k)\,y(s_2,t_2)
\end{aligned}
$$

which solves to

$$
u = \frac{b_A}{2+b_A}, \qquad k = \frac{2-b_A}{2+b_A}.
$$
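The two indifference conditions are linear in u and k and can be solved exactly. The following sketch takes the simplified payoffs as given; the function names are illustrative, and b should be passed as a `Fraction` to keep the arithmetic exact.

```python
from fractions import Fraction


def pure_payoffs(b):
    """Simplified payoffs y(s_i, t_j) of the 3-card game for bet size b."""
    return {
        ("s1", "t1"): -b / 6,
        ("s2", "t1"): b / 6,
        ("s1", "t2"): (2 - b) / 6,
        ("s2", "t2"): Fraction(0),
    }


def equilibrium_mix(b):
    y = pure_payoffs(b)
    # 2 is indifferent between t1 and t2 when 1 plays s1 with frequency u.
    u = (y["s2", "t2"] - y["s2", "t1"]) / (
        y["s1", "t1"] - y["s2", "t1"] - y["s1", "t2"] + y["s2", "t2"])
    # 1 is indifferent between s1 and s2 when 2 plays t1 with frequency k.
    k = (y["s2", "t2"] - y["s1", "t2"]) / (
        y["s1", "t1"] - y["s1", "t2"] - y["s2", "t1"] + y["s2", "t2"])
    value = k * y["s1", "t1"] + (1 - k) * y["s1", "t2"]
    return u, k, value


u, k, value = equilibrium_mix(Fraction(1))
```

At a bet size of 1 this gives u = 1/3, k = 1/3 and a game value of 1/18, matching the closed forms u = b/(2 + b) and k = (2 − b)/(2 + b).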

The payoffs match²¹:

$$
\begin{aligned}
y(s_1, ((k,t_1),(1-k,t_2))) &= y(s_2, ((k,t_1),(1-k,t_2))) \\
&= y(((u,s_1),(1-u,s_2)), t_1) \\
&= y(((u,s_1),(1-u,s_2)), t_2) \\
&= \frac{b_A(2-b_A)}{6(2+b_A)}
\end{aligned}
$$

²⁰ As argued above, $2 > b_A$.
²¹ $y(s_1, ((k,t_1),(1-k,t_2))) := k\,y(s_1,t_1) + (1-k)\,y(s_1,t_2)$; the shorthand notation expresses how 1 is truly indifferent between choosing $s_1$ and $s_2$ against the linear combination of $t_1, t_2$.

Fig. 6 Game value for player one plotted against his chosen bet size $b_A$. If 1 bets too small, she misses value with her Aces, as she could put in more money on average with a larger bet. If she bets too big, she gets too few bluff-catching calls from the King, and thus also misses value with her Aces. At $b_A = 0$ there is no value gained for Aces because the bet size is zero; at $b_A = 2$ there is no value gained for Aces, as no King must call to prevent profitable bluffs from 1's Queens.

This game value is positive: 1 has the advantage. To maximize his advantage, he chooses $b_A$ to maximize his game value. The game value function has one extremum per branch; the maximum falls into $b_A \in [0, 2]$.


$$
\begin{aligned}
\frac{d}{db_A}\,\frac{b_A(2-b_A)}{6(2+b_A)} &= 0 \\
b_A^2 + 4b_A - 4 &= 0 \\
b_A &= -2 + \sqrt{8} = 2(\sqrt{2}-1) \approx 0.83
\end{aligned}
$$
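The optimal bet size can be confirmed numerically. This sketch compares the closed form with a coarse grid search over [0, 2]; the grid resolution and the variable names are arbitrary choices made for illustration.

```python
from math import isclose, sqrt


def game_value(b):
    """Game value for player 1 as a function of the bet size b."""
    return b * (2 - b) / (6 * (2 + b))


b_star = 2 * (sqrt(2) - 1)

# Independent check: evaluate the game value on a grid over [0, 2].
grid = [i / 10000 for i in range(20001)]
b_grid = max(grid, key=game_value)

value_star = game_value(b_star)
```

Both approaches agree on a bet of roughly 0.83 blinds, with a game value of 1 − (2/3)√2.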

The solution to the game is $(s^*, t^*)$ with game value $1 - \tfrac{2}{3}\sqrt{2} \approx 0.0572$:

$$
\begin{aligned}
s_1(A,K,Q) &= (2(\sqrt{2}-1),\ 0,\ 2(\sqrt{2}-1)) \\
s_2(A,K,Q) &= (2(\sqrt{2}-1),\ 0,\ 0) \\
s^* &= \frac{2-\sqrt{2}}{2}\,s_1 + \frac{\sqrt{2}}{2}\,s_2 \\
t_1((A,b),(K,b),(Q,b)) &= (\text{call}, \text{call}, \text{fold}) \\
t_2((A,b),(K,b),(Q,b)) &= (\text{call}, \text{fold}, \text{fold}) \\
t^* &= \frac{2-b}{2+b}\,t_1 + \frac{2b}{2+b}\,t_2 \quad \text{for } b \in [0, 2) \\
t^* &= t_2 \quad \text{for } b \ge 2
\end{aligned}
$$

We learn a few poker essentials from this game and its solution.

• Strong hands perform best by putting in more money; they want to value bet.
• Medium-strength hands perform best by just seeing a showdown.
• Weak hands can perform best by putting in more money; they want to bluff. They counteract the value bets, to give the opponent a tough decision when she holds a medium-strength hand.
• Bet sizing is important; betting too small or too big achieves poor results.


5.2 13 Card Poker


In this second discrete poker game we aim to understand how poker works with more than one decision for player one, and we increase the number of cards while doing so. We have to fix the bet sizing in order to still be able to solve the game. One street is played with between three cards (Ace, King, Queen) and 13 cards (Ace, ..., Two). Each of the two participants is dealt one of the cards at random; at showdown, the highest card wins²². The blinds are 1 unit per person. Player one has the option to bet b = 1 or check b = 0 with every hand. Player two, if she faces no bet, has the option to bet b = 1 or check back b = 0, the latter leading to a showdown. If player two faces a bet, she can call, resulting in a showdown, or fold, resulting in a win for player one. If player one checked and player two bet, player one also has the option to call or fold. For deck D = {A, K, Q, ...} we have the game as:

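$$
\begin{aligned}
P &= \{1, 2\} \\
S &= \{0, 1\}^{D} \times \{0, 1\}^{\{D \to 0\}} \\
T &= \{00, 01, 10, 11\}^{D} \\
y(s,t) &= \sum_{(h_1,h_2) \in D^2} \Big[ \\
&\quad + 2 \cdot \mathbf{1}_{\{s(h_1)=01,\ t(h_2) \in \{11,01\},\ h_1>h_2\}}\,\rho(h_1,h_2) \\
&\quad - 2 \cdot \mathbf{1}_{\{s(h_1)=01,\ t(h_2) \in \{11,01\},\ h_1<h_2\}}\,\rho(h_1,h_2) \\
&\quad + 2 \cdot \mathbf{1}_{\{s(h_1)=1,\ t(h_2) \in \{11,10\},\ h_1>h_2\}}\,\rho(h_1,h_2) \\
&\quad - 2 \cdot \mathbf{1}_{\{s(h_1)=1,\ t(h_2) \in \{11,10\},\ h_1<h_2\}}\,\rho(h_1,h_2) \\
&\quad + \mathbf{1}_{\{s(h_1) \in \{00,01\},\ t(h_2) \in \{00,10\},\ h_1>h_2\}}\,\rho(h_1,h_2) \\
&\quad - \mathbf{1}_{\{s(h_1) \in \{00,01\},\ t(h_2) \in \{00,10\},\ h_1<h_2\}}\,\rho(h_1,h_2) \\
&\quad + \mathbf{1}_{\{s(h_1)=1,\ t(h_2) \in \{00,01\}\}}\,\rho(h_1,h_2) \\
&\quad - \mathbf{1}_{\{s(h_1)=00,\ t(h_2) \in \{01,11\}\}}\,\rho(h_1,h_2) \Big]
\end{aligned}
$$

Here $\rho(h_1,h_2)$ stands for the probability that player 0 deals $h_1$ to 1 and $h_2$ to 2.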

S is written in compact form, where {D → 0} is shorthand for the preimage of 0 under the anonymous function f : D → {0, 1} of the first part of the strategy space, that is {D → 0} = {d | d ∈ D, f(d) = 0}. Only cards which elect not to bet initially face a second decision. Treating the game as S = T and letting (1, 0) and (1, 1) denote the same strategy in S is also possible. Q ↦ 1 as part of s ∈ S means the card Queen bets. J ↦ 00 means the card Jack checks and then folds to a bet, if one is made. K ↦ 10 as part of t ∈ T means the card King would call a bet when bet into but not bet itself when checked to.
²² A > K > ... > 2

Fig. 7 The game tree for two decisions for player one. Under each possible deal of player 0 this tree hangs, with the payoffs depending on the hole cards (w expresses the two cases of who has the stronger hand). After the deal of player 0, 1 decides to bet (b = 1) or check (b = 0). Against bets, 2 can fold or call; folding, for instance, loses the blind of 1. Against checks, 2 can bet or check herself; a bet puts 1 to the call-or-fold decision.


Figure 7 depicts the game tree. For instance, strategy 01 for a hand of 1 follows the branches marked b = 0 and then b = 1 at the nodes marked with 1. 01 could also be called check-call, meaning checking with the intention of calling if a bet is made.

We can again make some observations on dominated strategies:

• No player folds an Ace when facing a bet.
• Both players fold the lowest card when facing a bet.
• 2 bets the Ace when checked to.

Some other observations use our understanding from the 3-card game:

• Both players will have strong hands value betting and weak hands bluffing when they make a bet. The numbers of value bets and bluff bets will be related.
• When players face a bet with a medium-strength hand, they will sometimes call and sometimes fold.
• If 1 bets his strong hands up to L (e.g. up to a King), 2 bets his strong hands at least up to L − 1 (a Queen) when checked to. Thus, if 1 bets all strong hands it can't be in equilibrium, as checking becomes best against 2's best response. We therefore expect some mixing on 1's strong hands.

Consider the implications of the bet size of 1 unit into 2 units:

• A medium-strength hand, a bluffcatcher, needs to win 25% of the time when facing a bet to call, as it stands to win 3 if best and lose 1 if worst, compared to folding. As such, we expect the value bet frequency to be 75%.
• A weak hand, a bluff, needs to achieve at least 33% folds to make a bluff worthwhile, as it risks 1 if unsuccessful and wins 2 if successful. Therefore we expect the calling frequency of hands that can beat a bluff to be at most 67%.
• If at most 67% of hands are calling, a hand that has the option to check and see a showdown can only be a value bet if it is at least in the top 33% of hands of the opponent at that point.

If, e.g. in the 12-card case D = {A, ..., 3}, 1 were to check all hands, we expect 2 to bet A-Q for value and 3 as a bluff, and 1 to call A-8, or something very similar.
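The break-even frequencies quoted above follow from the pot and bet sizes alone. The sketch below generalizes them to an arbitrary pot and bet; the function names are illustrative.

```python
from fractions import Fraction


def bluffcatcher_required_equity(pot, bet):
    """Minimum win frequency for a call to break even compared to folding.

    The caller risks `bet` to win `pot + bet`.
    """
    return Fraction(bet, pot + 2 * bet)


def bluff_required_folds(pot, bet):
    """Minimum fold frequency for a bluff to break even compared to giving up.

    The bluffer risks `bet` to win `pot`.
    """
    return Fraction(bet, pot + bet)


call_threshold = bluffcatcher_required_equity(pot=2, bet=1)
fold_threshold = bluff_required_folds(pot=2, bet=1)
max_calling_frequency = 1 - fold_threshold
value_share = 1 - call_threshold
```

For a bet of 1 into 2 this reproduces the 25% and 33% thresholds, and the 12-card example with three value hands and one bluff sits exactly at the 75% value share.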


Solving this game by hand is possible, but at this point computer help is potentially faster. Geoffrey J. Gordon published a tool on his website²³ which finds approximate equilibria for exactly this game. He represents the game in sequence form, which makes the size very manageable, and then runs the corresponding LP through an interior point algorithm. In this representation the constraints (Ax = b) of the LP are that each participant chooses an admissible action with every card. 1 can choose between bet, check-fold and check-call; 2 between call and fold, or bet and check. Through clever modeling, exactly one of the admissible choices for each card satisfies the constraints, and therefore only linear combinations of admissible choices are in the feasible set. Since there are only up to n cards and four different actions per player per card, the constraint matrix is rather small.²⁴ The objective function (max cᵀx), naturally, is the payoff of the actions represented by x being played by 1 and 2.

We present the (numerically approximate) solutions for the 3-card, 13-card and also the 52-card version. The solutions found are in equilibrium; they may, however, contain dominated strategies.

3-card

1 action %    bet    check    call²⁵    fold
A             48     52       100       0
K             0      100      50        50
Q             17     83       0         100

2 action %    call   fold     bet       check
A             100    0        100       0
K             33     67       0         100
Q             0      100      33        67

We see a few familiar patterns. When betting, 1 is bluffing 25% of the time. When checking, 1 is calling with 67% of the hands that beat a bluff from the Queen. For 2, the patterns are even clearer. Against a bet, 2 calls with 67% of the hands that beat a bluff. When betting, 2 is bluffing 25% of the time.

The open question is: why does 1 split his Aces in the fashion he does? Interestingly enough, according to [1, p164] this 3-card game actually flips at exactly blinds of 1 unit. For blinds of less than 1 unit, while keeping the bet size at 1 unit, 1 checks all Aces. For blinds of more than 1 unit, 1 bets all Aces. In this game then, betting no Aces, some Aces, or all Aces should not make a difference. The game value for the above solution is −1/18, an advantage for 2, who acts last. This is exactly the game value of our initial 3-card game at b = 1, here for the person acting last. The cited result above appears to be correct then: 1 gains nothing by being able to bet his Aces and some Queens himself at the given bet sizing of 1 into 2. The fact that the solver finds a bet for 48% of Aces is solely due to the deterministic implementation and starting point. The (exact) solution $(s^*, t^*)$ in our strategy notation is (no bets with the Ace for 1; any number thereof accompanied by a third as many Queens would be co-optimal):

$$
\begin{aligned}
s_1(A,K,Q) &= (0, 0, 0, 1, 1, 0) \\
s_2(A,K,Q) &= (0, 0, 0, 1, 0, 0) \\
s^* &= \tfrac{1}{3}\,s_1 + \tfrac{2}{3}\,s_2 \\
t_1(A,K,Q) &= (11, 10, 00) \\
t_2(A,K,Q) &= (11, 00, 00) \\
t^* &= \tfrac{2}{3}\,t_1 + \tfrac{1}{3}\,t_2
\end{aligned}
$$

²³ http://www.cs.cmu.edu/~ggordon/poker For Matlab source files append /source.
²⁴ In the given implementation Gordon uses four entries per card in x, which leads to a sparse A of size (8n × 6n). Comparing this to the strategy space, which grows exponentially in n (O(4ⁿ) as given above), shows how useful this modeling method is.
²⁵ Call and fold are the actions check-call and check-fold given a check was the first action. Not normalizing by the checking frequency makes the results more difficult to interpret. The frequency of the check-call with an Ace is the product of check and call in this notation.

13-card

1 action %    bet    check    call    fold
A             64     36       100     0
K             63     37       100     0
Q             62     38       100     0
J             60     40       100     0
T             55     45       100     0
9             42     58       100     0
8             0      100      76      24
7             0      100      61      39
6             0      100      43      57
5             0      100      27      73
4             25     75       17      83
3             44     56       0       100
2             45     55       0       100

2 action %    call   fold     bet     check
A             100    0        100     0
K             100    0        100     0
Q             100    0        100     0
J             100    0        100     0
T             100    0        100     0
9             100    0        100     0
8             76     24       0       100
7             58     42       0       100
6             40     60       0       100
5             25     75       0       100
4             0      100      0       100
3             0      100      100     0
2             0      100      100     0

When betting, 1 is bluffing 25% of the time again. The value bets are a mix of Ace-Nine with weight gradually decreasing with hand strength. The bluffs are a mix of Four-Two. If any of the bluffs check, they always lose, as 2 bluffs with Three-Two against a check. As such, the weight on the bluffs of 1 can be distributed arbitrarily; however, bluffing weaker hands dominates bluffing stronger hands. Against a weak opponent who sometimes misses a bluff with a Three, checking a Four and bluffing a Two performs relatively better than checking a Two and bluffing a Four. 1 can improve his bluffing game against non-optimal opponents:

1 action %    bet    check
4             0      100
3             14     86
2             100    0

When checking, 1 is calling with 67% of the hands that beat a bluff from the Three or Two, that is 67% of the hands Four+ that have checked. These calls consist of all hands that beat some value bets of 2, starting at the Ten. They also include all Nines, as those block the value-betting Nines of 2. This means that if 1 checks the Nine or Eight-Four and faces a bet, the Nine is losing against five cards (Ten to Ace), but Eight-Four are losing to six cards (Nine to Ace), all while the bluffs they face are equal. For the calls from Eight to Four there is no difference; these are non-blocking bluffcatchers. Since calling with the Eight and folding the Seven dominates playing the other way around, 1 can improve his calling game against non-optimal opponents by shifting calling weight in this fashion:

1 action %    call   fold
8             100    0
7             100    0
6             24     76
5             0      100
4             0      100

When facing a bet, 2 is calling 67% of the time from the perspective of 1's bluff hands Four-Two, and 61% on average. Only at this frequency do the bluffs of 1 have the same payoff of 0 for bluffing as for checking. The selection of hands which call is similar to 1's hands that call after checking: all strong hands and a random mix of bluffcatcher hands. This again can be improved co-optimally:

2 action %    call   fold
8             100    0
7             100    0
6             0      100
5             0      100

When 2 is betting, it's 25% bluffs again. The value hands are Ace-Nine, the bluffs Three-Two. Against 1's checking range, 2's Nine is the best hand over 77% of the time; against 1's checking range of bluffcatchers and better, the Nine is the best hand 71% of the time, just good enough to be a value bet. Since 67% of 1's bluffcatchers and better hands are calling, the Nine is ahead, and the Eight is not.

With eight mixed strategic options being played, translating the solution into our strategy notation is omitted here.

52-card

For a larger deck such as 52 cards we plot the strategies. In figure 8 we can see how 1 plays. The structure of the solution is the same as for the 13-card game. We also observe the same dominated strategies for calling and bluffing as discussed in the 13-card game, which can be easily fixed. Figure 9 depicts how 2 plays. The structure of the solution is also equivalent to that of the 13-card game, and the same dominated strategy for calling as before pops up, which we can easily swap for a more robust equilibrium solution.

Expanding from 3-card to 13-card to 52-card lets us make the following observations.

• When a bet is made, value and bluff frequency are correlated. The blinds and the bet size determine the risk-to-reward ratio for the calling player and thereby the frequencies for the person betting.
• Value frequency depends on the number of value hands, which depends on the calling hands of the opponent.
• When there is more than one decision point, strong hands are potentially split at a decision point before the last, here the first. This is called slow-playing: acting weak with a strong hand. This can be interpreted as protecting other hands that act weak (check as 1), or as denying the other player too many value bets.

Fig. 8 Player one bets around 70% of the time with hands 1-25 and 44-52, for a bluff frequency of 25%. After checking, she always calls with the strong hands 1-26, never with the weak hands 44-52, and sometimes with the hands in between.


Fig. 9 When facing a bet, Player two always calls with the strong hands 1-25 and never with the weak hands 44-52; this works in tandem with Player one's bets. With the bluffcatcher hands in the middle she calls sometimes. When facing a check, the strong hands 1-25 and half of 26 value bet and the weak hands 45-52 and half of 44 bluff bet; the rest checks for showdown.


5.3 A Case Study: Turn Play with Check-Raising


The board reads 2h3h4hQd with a pot size of $100 and an infinite (arbitrarily large) stack size. Hero has a fixed checking range of {AhKh (1), AK no h (6)}. Villain has a range of {QsJs}.
If the hand is checked down, Hero has an EV of $33.77 ($100 · 26/77). This can be easily verified with Equilab or a similar program:

33.77% AK
66.23% QsJs

Consider what happens if Villain checks behind on the turn 100% of the time. Hero will always win the pot on an A, K or 5 river. On all other rivers he will bet a very large size with his single flush combo and close to one bluff combo. To find his total EV we consider all possible river cards and calculate the EV Hero achieves on each.

river card      frequency (/44)   # value combos   # bluff combos   EV
A, K, 5         10                all              none             $100
2, 3, 4, 6-Q    34                1                1                2/7 · $100

\[
\mathrm{EV}(\text{Hero}) = \sum_{\text{card } C} (\text{frequency of } C)\,(\text{EV on } C)
= \frac{1}{44}\left(10 \cdot \$100 + 34 \cdot \frac{\$200}{7}\right) = \$\frac{3450}{77}
\]

Hero's EV if the turn is checked through is about $44.81.

If Villain bets the turn for an infinitesimal amount, Hero would raise to a very large amount with AhKh and just shy of two bluff combos (11/6), say AsKs and 83.3% of AcKc. When Hero raises, his EV is $100. When Hero just calls there is no betting on the river and Hero's EV is just $22.73 ($250/11). His total EV against an infinitesimal bet is then
\[
\left(\frac{17}{6} \cdot \$100 + \frac{25}{6} \cdot \$\frac{250}{11}\right) / 7 = \$\frac{12475}{231}
\]
or about $54.00. (If Hero only called the infinitesimal bet, he would achieve the EV of a checked turn, which is lower.)

If Villain bets the turn for an amount b, Hero would raise to a very large amount with the same range as above (flush and 11/6 bluff combos), again making QJ an indifferent call. If b < $125/3 then the remaining AK will call, else they will fold. If they can call, their EV is ($250 - 6b)/11. Hero's total EV is then
\[
\left(\frac{17}{6}(\$100 + b) + 1_{\{b < \$125/3\}} \cdot \frac{25}{6} \cdot \frac{\$250 - 6b}{11}\right) / 7.
\]
At b = $0 we get the same value as above of about $54.00. At b = $125/3 this is about $57.34.

To summarize, Hero achieves the following EVs:

$33.77  Checked down
$44.81  Checked turn
$54.00  Check-raised turn vs infinitesimal bet (= Hero bets turn himself)
$57.34  Check-raised turn vs $125/3 (AKo indifferent to calling + checking river)
$40.48  No check-raise allowed (Villain bets turn for $125/3)


Since the option is on Villain, he will choose the option that maximizes his EV, minimizing Hero's EV. Checking behind is his best option at $44.81 EV for Hero.
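The EVs of this case study can be re-derived with exact fractions. The sketch below is not part of the original analysis; the names `ev_checked_down`, `ev_checked_turn` and `ev_checkraise` are introduced here for illustration, and the code only encodes the combo counts and river outs stated above.

```python
from fractions import Fraction as F

POT = 100
COMBOS = 7              # AhKh plus six non-heart AK combos
OUTS, RIVERS = 10, 44   # A, K or 5 rivers out of 44 unseen cards

# Checked down: the flush always wins, the other six combos need an out.
ev_checked_down = F(POT) * (1 + 6 * F(OUTS, RIVERS)) / COMBOS

# Turn checked through: Hero wins outright on 10 rivers; on the other 34 he
# bets one value and one bluff combo very large, worth 2/7 of the pot.
ev_checked_turn = F(1, RIVERS) * (OUTS * POT + (RIVERS - OUTS) * F(2, COMBOS) * POT)

def ev_checkraise(b):
    """Hero's range EV when Villain bets b and Hero check-raises 17/6 combos."""
    b = F(b)
    raising = F(17, 6)                # flush plus 11/6 bluff combos
    calling = COMBOS - raising        # the remaining 25/6 AK combos
    call_ev = (250 - 6 * b) / 11      # EV of calling b and checking the river
    if b >= F(125, 3):                # calling is no longer profitable: fold
        call_ev = F(0)
    return (raising * (POT + b) + calling * call_ev) / COMBOS

for label, ev in [
    ("checked down", ev_checked_down),
    ("checked turn", ev_checked_turn),
    ("check-raise vs tiny bet", ev_checkraise(0)),
    ("check-raise vs 125/3", ev_checkraise(F(125, 3))),
]:
    print(f"{label:26s} {ev}  ~ {float(ev):.2f}")
```

Working with `Fraction` keeps the values comparable to the fractions in the text (2600/77, 3450/77, 12475/231) before rounding to dollars.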

5.4 A Case Study: Turn Play without Check-Raising


But what happens if Hero cannot check-raise the turn? Would it make sense for Villain to make a small bet, designed to generate value from non-flush AK or to force a fold, while avoiding paying off the flush for too much? Hero would be forced to call with the flush but can also call with some non-flush floats, looking to donk bet the river. How many non-flushes can float the turn against a bet size b? How often does Villain call the river donk bet?

Against bets below $125/3 non-flushes can float based on the immediate odds received. We consider this case first and then go over to b > $125/3 for the interesting float case.

Against a small bet all hands call. On the river, as considered in the turn check-through scenario in the previous section, Hero always wins the pot on 10 cards and wins the pot 2/7 of the time on the other 34 cards (with his flush and one bluff combo out of seven). His total EV check-calling a turn bet b < $125/3 is therefore
\[
\frac{1}{44}\left(10(\$100 + b) + 34 \cdot \frac{2}{7}(\$100 + b) + 34 \cdot \frac{5}{7}(-b)\right).
\]
At b = $0 this is equal to the turn being checked through, $44.81. At b = $125/3 this is $850/21 or about $40.48. Among bets up to $125/3 Villain prefers $125/3, as that minimizes Hero's EV.

Now for b > $125/3. We need to find out how often Hero calls the turn with his non-flush hands. If calling the turn and bluffing the river were neutral EV, as it is in single street analysis, floating the turn would lose the money invested on the turn, meaning AKo would not be a good float. Therefore the river bluffs must be showing positive EV here! In fact, their river EV must be equal to the turn investment. For the bluffs to show positive EV, Villain must fold to river bets more often than in a single street analysis. We start by assessing how many floats Hero can play to make Villain indifferent between calling and folding to a river bet. We then assess how often Villain needs to call the river to prevent floats. (Villain will still be made indifferent between calling and folding on the river, but he folds a lot more than in single street analysis.)
Hero will bet a non A, K, 5 river with AhKh and one bluff combo to make Villain indifferent between calling and folding on the river, assuming a very large bet. He can therefore float the turn with one combo, say AsKs. To understand how often Villain calls the river bet we need to figure out the EV of floating AsKs against a strategy that calls with frequency c ∈ [0, 1]. The river bet size is r, which is very large.
\[
\mathrm{EV}(\text{Hero}) = \frac{1}{44}\Big(10(\$100 + b) + 34(1 - c)(\$100 + b) + 34c(-b - r)\Big) = 0
\]
This is satisfied when
\[
c = \frac{44(\$100 + b)}{68b + \$3400 + 34r}.
\]
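The two formulas of this section can be checked numerically. The following is a minimal sketch with function names invented here; the turn bet b = 50 and river bet r = 1000 are arbitrary assumed sizes, used only to confirm that the calling frequency c makes the float worth exactly zero.

```python
from fractions import Fraction as F

POT = 100

def ev_check_call(b):
    """Hero's range EV when every combo calls a turn bet b < 125/3."""
    b = F(b)
    return F(1, 44) * (10 * (POT + b)
                       + 34 * F(2, 7) * (POT + b)
                       - 34 * F(5, 7) * b)

def river_call_frequency(b, r):
    """Villain's calling frequency c that makes floating AsKs worth zero."""
    b, r = F(b), F(r)
    return 44 * (POT + b) / (68 * b + 34 * POT + 34 * r)

def float_ev(b, r, c):
    """EV of calling the turn bet b and bluffing r on the 34 blank rivers."""
    b, r, c = F(b), F(r), F(c)
    return F(1, 44) * (10 * (POT + b)
                       + 34 * (1 - c) * (POT + b)
                       - 34 * c * (b + r))

print(ev_check_call(0), ev_check_call(F(125, 3)))   # 3450/77 850/21
c = river_call_frequency(50, 1000)                   # assumed sizes, b > 125/3
print(c, float_ev(50, 1000, c))                      # 11/68 0
```

As r grows, `river_call_frequency` shrinks towards zero, which matches the remark that Villain folds far more often than in a single street analysis.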


6 Continuous Toy Games: [0,1] Poker


In the limited card, discrete toy games, we examined how poker is played when knowledge of one's own hole cards significantly affects the distribution of the opponent's hole cards. In this section we will model the hand strength distribution with a simple [0, 1] interval. This notion of continuous hand strength has several advantages. It simplifies the consideration of blocker effects by removing them completely. It opens the possibility of thinking in terms of regions and thresholds between those regions, taking away the need for non-pure strategies. If a strategy would play a hand h in two ways s1, s2 with probabilities p1, 1 − p1, the strategy can be changed in this way: for some ε > 0, play s1 on [h, h + εp1] and s2 on [h + εp1, h + ε]. The difference in payoff between the original and modified strategy goes to zero as ε goes to zero. In this manner, any chaotic strategy can be smoothed out to exclusively pure strategies of width ε. For real Texas Hold'em with 1326 distinct starting hands we are between the discrete and continuous hole card space descriptions, where both blocking effects and threshold ideas are important.

In a [0, 1] game there is only one move by player 0 at the start, drawing a number from the interval at random for each participant. The number represents the hand strength. If a showdown is reached the lowest number wins. Betting and the number of streets will be restricted and varied in the following parts to obtain useful results.

Since [0, 1] is not a finite set the game does not canonically work with our (P, S, Y) description for games. For simple enough infinite strategy spaces, the Minimax Theorem [section 9] still holds. If instead of solving on [0, 1] we solve for some ε on the set {[0, ε], [ε, 2ε], ..., [1 − ε, 1]} and can show that this solution converges for ε → 0 to the solution on the infinite space, we have effectively shown that a single equilibrium exists.

6.1 [0,1] Fixed Bet Size


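In this first [0, 1] game we aim to understand how players wager in relation to their hand strengths for a fixed bet size. We restrict the game to a single street and only one player, player 1, having the right to increase the bet to some fixed amount. The fixed blinds are 1 unit per person. Formally (footnote 26),

\[
P = \{1, 2\}, \qquad S = \{\text{bet } b, \text{check}\}^{[0,1]}, \qquad T = \{\text{call}, \text{fold}\}^{[0,1]}
\]
\[
\begin{aligned}
y(s, t) = \int_{(h_1,h_2) \in [0,1]^2} \Big( & (b + 1)\,\mathrm{sgn}(h_2 - h_1)\, 1_{\{(s(h_1),t(h_2)) = (\text{bet } b,\, \text{call})\}}(h_1, h_2) \\
& + (+1)\,\mathrm{sgn}(h_2 - h_1)\, 1_{\{s(h_1) = \text{check}\}}(h_1, h_2) \\
& + (+1)\, 1_{\{(s(h_1),t(h_2)) = (\text{bet } b,\, \text{fold})\}}(h_1, h_2) \Big)\, dh_1\, dh_2
\end{aligned}
\]

Here sgn(h2 − h1) is +1 when player 1 holds the lower, and therefore winning, number.

Footnote 26: \( G = (P, \mathcal{S} = (S, T), Y = (y, -y)) \) with \( y : S \times T \to \mathbb{R} \).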

The payoff for the leaves of the game tree is given by z, with Z = (z, −z):
\[
z((h_1, b), (h_2, a)) =
\begin{cases}
+1 & b = \text{bet } b,\ a = \text{fold} \quad \text{(Bet - Fold)} \\
+1 & b = \text{check},\ h_1 < h_2 \quad \text{(Check, 1 wins)} \\
-1 & b = \text{check},\ h_1 > h_2 \quad \text{(Check, 2 wins)} \\
b + 1 & b = \text{bet } b,\ a = \text{call},\ h_1 < h_2 \quad \text{(Bet - Call, 1 wins)} \\
-b - 1 & b = \text{bet } b,\ a = \text{call},\ h_1 > h_2 \quad \text{(Bet - Call, 2 wins)}
\end{cases}
\]
where leaf ((h1, b), (h2, a)) is reached when player 1 chose action b and player 2 chose action a with hands h1, h2 dealt by player 0, respectively. We can make the following observations:

- Both player one and player two have only a single decision point: player 1 when she has the option to bet, player 2 when she faces a bet. At both those points, both have their full distribution [0, 1] of hands available, and the other player can have no information about where one is in that set.
- If player 2 folds, it makes no difference which hand she was dealt; the payoff is −1 to her (z((h1, bet b), (h2, fold)) = 1 for all h2).
- If player 2 calls, a stronger hand cannot perform worse than a weaker hand. For a given play of player 1, z(·, (j, call)) ≤ z(·, (k, call)) whenever j < k (recall that z is player 1's payoff). Thus, if player 2 correctly calls with some hand k, she also calls with all j < k. As such, there is a unique value k* for each b splitting player 2's distribution into a calling (< k*) and a folding (> k*) region.
- If player 1 bets, she will encounter a response of the type characterized by k*(b), meaning calls from [0, k*) and folds from [k*, 1]. The payoff for betting h1 can be better than for checking if h1 profits b frequently when called, that is h1 << k*, or if h1 profits from folds often enough to make betting better than checking, that is h1 >> k*. A hand h1 close to k* will perform worse by betting than by checking, since checking and betting have the same payoff against hands worse than k*, but betting has a worse payoff than checking against hands better than k*.
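The leaf payoffs z listed above can be written as a small function. This is a sketch only: the name `leaf_payoff` and the string encoding of actions are choices made here, and ties h1 = h2 (which have probability zero) are arbitrarily scored as a loss for player 1.

```python
def leaf_payoff(h1, action1, h2, action2, b):
    """Payoff z to player 1 at the leaf reached by the given hands and actions.

    h1, h2   hand strengths in [0, 1]; the lower number wins a showdown
    action1  "bet" or "check" (player 1)
    action2  "call" or "fold" (player 2, only relevant after a bet)
    b        the fixed bet size
    """
    if action1 == "check":
        return 1 if h1 < h2 else -1        # showdown for the blinds only
    if action2 == "fold":
        return 1                           # player 1 wins player 2's blind
    return b + 1 if h1 < h2 else -(b + 1)  # showdown for blind plus bet

# Player 1 holds 0.2 and bets 2, player 2 holds 0.4 and calls.
print(leaf_payoff(0.2, "bet", 0.4, "call", b=2))   # 3
```

Player 2's payoff at the same leaf is simply the negative of the returned value, in line with Z = (z, −z).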

Fig. 10 An example path in the game tree of section 6.1. Player 0 deals the starting hands. Player one receives a strong hand, 0.2, and decides to bet b = 2. Player two receives a medium hand, 0.4, and decides to call against b = 2. They reach the payoff z((0.2, 2), (0.4, call)) = 3. Player 1 made a successful value bet, player 2 made an unsuccessful bluffcatch call.

With this reasoning we propose three values k*, v* and u*, which mark the thresholds at which calling performs better for player 2 than folding and where betting performs better for player 1 than checking, with v* < k* < u* (footnote 27). The corresponding strategies are
\[
s^*(h_1) =
\begin{cases}
\text{bet } b & \text{if } h_1 \in [0, v^*) \\
\text{check} & \text{if } h_1 \in [v^*, u^*) \\
\text{bet } b & \text{if } h_1 \in [u^*, 1]
\end{cases}
\tag{1}
\]
\[
t^*(h_2) =
\begin{cases}
\text{call} & \text{if } h_2 \in [0, k^*) \\
\text{fold} & \text{if } h_2 \in [k^*, 1]
\end{cases}
\tag{2}
\]

The payoff of the proposed v < k < u structure for strategies s and t is
\[
\begin{aligned}
y(s, t) ={}& \big((0)v + (b + 1)(k - v) + (+1)(1 - k)\big)\,v \\
&+ \big((-1)(v) + (0)(u - v) + (+1)(1 - u)\big)(u - v) \\
&+ \big((-b - 1)(k) + (+1)(1 - k)\big)(1 - u) \\
={}& bkv + bku - bv^2 + 2ku - bk - 2k - u^2 + 1.
\end{aligned}
\]
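As a sanity check on the expansion of the payoff, the region-by-region sum and the expanded polynomial can be compared numerically. The function names and the sample thresholds below are illustrative assumptions; any values with v < k < u would do.

```python
from fractions import Fraction as F

def payoff_by_regions(v, k, u, b):
    """Payoff y summed over player 1's three regions [0,v), [v,u), [u,1]."""
    value_bets = (0 * v + (b + 1) * (k - v) + 1 * (1 - k)) * v
    checks = (-1 * v + 0 * (u - v) + 1 * (1 - u)) * (u - v)
    bluffs = (-(b + 1) * k + 1 * (1 - k)) * (1 - u)
    return value_bets + checks + bluffs

def payoff_expanded(v, k, u, b):
    """The same payoff written as a polynomial in v, k, u and b."""
    return (b * k * v + b * k * u - b * v**2 + 2 * k * u
            - b * k - 2 * k - u**2 + 1)

# Arbitrary thresholds with v < k < u, chosen only for illustration.
v, k, u, b = F(1, 5), F(1, 2), F(9, 10), F(3, 2)
print(payoff_by_regions(v, k, u, b) == payoff_expanded(v, k, u, b))   # True
```

Exact fractions avoid rounding noise, so the two forms can be compared with `==`.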

Footnote 27: v for value bet threshold, u for bluff threshold.

Fig. 11 Visualization of the payoff for a strategy match-up characterized by v < k < u for a fixed b. The horizontal axis represents hands [0, 1] for player 1, the vertical axis those of player 2. The areas of intersection are marked as areas of uniform payoffs for their leaves Z. E.g. the area marked y = b + 1 is the intersection of the strong hands [0, v] of player 1 with the weaker calling hands [v, k] of player 2.

In the first equation the first summand is the payoff when player 1 holds v and better: on average 0 against hands of player 2 of v and better, (b + 1) against hands of player 2 between v and k, and +1 against hands of player 2 of k and worse. The second and third summands are the payoffs of player 1's hands between v and u and worse than u, respectively. See Fig. 11 for a visualization.

We find player 1's problem of finding v* for given k* as
\[
\begin{aligned}
v^* &= \arg\max_{v \in [0, k^*]} y(k^*, v, u) \\
&= \arg\max_{v \in [0, k^*]} \; bk^*v + bk^*u - bv^2 + 2k^*u - bk^* - 2k^* - u^2 + 1 \\
&= \arg\max_{v \in [0, k^*]} \; v^2(-b) + v(bk^*)
\end{aligned}
\]
With the second order coefficient being negative there exists exactly one maximum, where \( \frac{\partial}{\partial v}\left(v^2(-b) + v(bk^*)\right) = 0 \):
\[
v^* = \frac{k^*}{2} \in [0, k^*]
\]
The same problem of finding u* for a given k*:
\[
\begin{aligned}
u^* &= \arg\max_{u \in [k^*, 1]} y(k^*, v, u) \\
&= \arg\max_{u \in [k^*, 1]} \; bk^*v + bk^*u - bv^2 + 2k^*u - bk^* - 2k^* - u^2 + 1 \\
&= \arg\max_{u \in [k^*, 1]} \; u^2(-1) + u(bk^* + 2k^*)
\end{aligned}
\]
Again, with the second order coefficient being negative there exists exactly one maximum, where \( \frac{\partial}{\partial u}\left(u^2(-1) + u(bk^* + 2k^*)\right) = 0 \):
\[
u^* = \frac{b + 2}{2}\, k^* \in [k^*, 1]
\]
We find player 2's problem of finding k* for given v*, u* as
\[
\begin{aligned}
k^* &= \arg\min_{k \in [v^*, u^*]} y(k, v^*, u^*) \\
&= \arg\min_{k \in [v^*, u^*]} \; bkv^* + bku^* - b(v^*)^2 + 2ku^* - bk - 2k - (u^*)^2 + 1 \\
&= \arg\min_{k \in [v^*, u^*]} \; k\,(bv^* + bu^* + 2u^* - b - 2) \\
&= v^*\, 1_{\{bv^* + bu^* + 2u^* - b - 2 > 0\}}(v^*, u^*) + u^*\, 1_{\{bv^* + bu^* + 2u^* - b - 2 < 0\}}(v^*, u^*) + \tilde{k}\, 1_{\{bv^* + bu^* + 2u^* - b - 2 = 0\}}(v^*, u^*)
\end{aligned}
\]
where \( \tilde{k} \) can be any arbitrary value in the interval. We see that the interesting case is bv* + bu* + 2u* − b − 2 = 0. If v*, u* and b do not satisfy this condition, player 2 will either call with as many hands as possible (player 1 has too many bluffs) or with as few hands as possible (player 1 has too few bluffs). If the condition is violated, k* will actually fall outside the [v*, u*] range; for instance for v* > 0, u* = 1 (player 1 has no bluffs) we find \( k^* = \left(1 - \frac{b}{2b+2}\right) v^* < v^* \) (footnote 28), and for v* = 0, u* < 1 (player 1 has only bluffs) we get \( k^* = \frac{b+2}{2b+2} + \frac{b}{2b+2}\, u^* > u^* \).
Footnote 28: Poker players call a term of the kind \( \frac{b}{2b+2} \) pot odds. If a hand matches a bet b for a total win of 2b + pot size, the hand must be winning at showdown at least at the frequency of the fraction \( \frac{b}{2b + \text{pot size}} \).

We conclude that if our v < k < u structure does indeed characterize a solution, bv* + bu* + 2u* − b − 2 = 0 must be satisfied. In other terms, \( v^* = (1 - u^*)\,\frac{b+2}{b} \). The size of the bluffing region (1 − u*) is linearly related in terms of b to the size of the value region v*. This multiplier effect was first given its own character in [1, p 113]. The system
\[
v^* = \frac{k^*}{2}, \qquad u^* = \frac{b + 2}{2}\, k^*, \qquad 0 = bv^* + bu^* + 2u^* - b - 2
\]
solves to
\[
v^* = \frac{b + 2}{b^2 + 5b + 4} \in (0, 0.5], \qquad
u^* = \frac{b^2 + 4b + 4}{b^2 + 5b + 4} \in \left[\tfrac{8}{9}, 1\right], \qquad
k^* = \frac{2b + 4}{b^2 + 5b + 4} \in [v^*, u^*] \subset [0, 1]
\]
for b ≥ 0.

The last step is to show that s*, t* described by v*, u*, k* as in equations 1 and 2 are in fact a solution. We have to show that neither player 1 nor player 2 can unilaterally improve their strategy to achieve a higher payoff. If that is true, the strategies are in equilibrium, and are therefore a solution.

- Player 2 tries to improve over t* when facing s*: player 2 should maximize her payoff over a ∈ {call, fold} for all h2 ∈ [0, 1]. As can be readily verified, calling performs best for h2 ∈ [0, v*], folding performs best for h2 ∈ [u*, 1], and both options perform exactly the same for h2 ∈ [v*, u*]. Therefore any k ∈ [v*, u*] performs best possible, and k* is one such k. Player 2 cannot unilaterally improve t*, although all strategies with arbitrary play on the middle interval offer the same payoff against s*.
- Player 1 tries to improve over s* when facing t*: player 1 should maximize her payoff over b ∈ {bet b, check} for all h1 ∈ [0, 1]. As shown in the derivation, betting performs best for h1 ∈ [0, v*] ∪ [u*, 1]. For h1 ∈ [v*, u*] it can be easily verified that checking is indeed best. s* is best possible against t*, and all deviations (with weight > 0 of hands) perform worse. Thus player 1 cannot unilaterally improve s*.
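The closed-form thresholds can be checked against the three conditions they were derived from. The sketch below is an illustration with names chosen here (`thresholds`, `call_ev_in_middle`); it verifies v* = k*/2, u* = (b + 2)k*/2 and the indifference condition for a few assumed bet sizes.

```python
from fractions import Fraction as F

def thresholds(b):
    """Equilibrium thresholds (v*, k*, u*) for a fixed bet size b >= 0."""
    b = F(b)
    c = b**2 + 5 * b + 4
    return (b + 2) / c, (2 * b + 4) / c, (b**2 + 4 * b + 4) / c

def call_ev_in_middle(v, u, b):
    """Player 2's payoff for calling with a hand in [v*, u*] against s*.

    Such a hand loses b + 1 to every value bet in [0, v*) and wins b + 1
    from every bluff in [u*, 1]; folding always costs her 1.
    """
    value, bluffs = v, 1 - u
    return (b + 1) * (bluffs - value) / (value + bluffs)

for b in (F(1, 2), F(1), F(2), F(5)):
    v, k, u = thresholds(b)
    assert v == k / 2 and u == (b + 2) / 2 * k          # first order conditions
    assert b * v + b * u + 2 * u - b - 2 == 0           # indifference of player 2
    assert call_ev_in_middle(v, u, b) == -1             # calling equals folding
    print(f"b={b}: v*={v}, k*={k}, u*={u}")
```

For b = 2 this gives v* = 2/9, k* = 4/9 and u* = 8/9.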

6.2 [0,1] Chosen Bet Size


In this second [0, 1] game we aim to understand how to choose a bet size b. We keep the game the same as in section 6.1 with one small change: one player can choose b. We already know what the solution to the game looks like for any fixed b, so we must only find the b ∈ [0, ∞) with the best payoff for player 1 and for player 2, respectively: the payoff for s*, t* is maximized by player 1 and minimized by player 2 over b. We recall the payoff for strategies s, t of the v, k, u structure as
\[
y(s, t) = bkv + bku - bv^2 + 2ku - bk - 2k - u^2 + 1.
\]

Fig. 12 Optimal strategy bounds v*, k*, u* plotted against b. u* reaches its minimum of 8/9 at b = 2. v*, k* → 0, u* → 1 for b → ∞.

Let c(b) = b² + 5b + 4. We find player 1's problem of finding b1* as
\[
\begin{aligned}
b_1^* &= \arg\max_{b_1 \in [0, \infty)} y(s^*(b_1), t^*(b_1)) \\
&= \arg\max_{b_1 \in [0, \infty)} \Big( \big( b_1(2b_1 + 4)(b_1 + 2) + b_1(2b_1 + 4)(b_1^2 + 4b_1 + 4) - b_1(b_1 + 2)^2 \\
&\qquad\qquad + 2(2b_1 + 4)(b_1^2 + 4b_1 + 4) - (b_1^2 + 4b_1 + 4)^2 \big) / c(b_1)^2 \\
&\qquad\qquad + \big( -b_1(2b_1 + 4) - 2(2b_1 + 4) \big) / c(b_1) + 1 \Big)
\end{aligned}
\]
This fourth order fraction has exactly one maximum for b > 0, at
\[
b_1^* = 2.
\]

Fig. 13 Payoff y for s* against t* for bet sizes b. y is maximal at b = 2 and minimal at b = 0 and when b → ∞.

Incidentally, b1* = 2 is the point at which u* is minimal, that is, where the bluffing region 1 − u* is largest. This is not a coincidence: we can solve many toy poker games by using the shorthand of finding maximal bluffing ranges.

The same problem of finding b2* for player 2:
\[
b_2^* = \arg\max_{b_2 \in [0, \infty)} \; -y(s^*(b_2), t^*(b_2)) = 0
\]
Player 2 prefers that there be no betting, which is the case at b = 0 or as b → ∞. This seems intuitively logical, as only player 1 gains by putting in extra wagers with strong hands.


If player one can choose a single possible bet size, the game plays as
\[
b_1^* = 2, \qquad v^* = \frac{2}{9}, \qquad k^* = \frac{4}{9}, \qquad u^* = \frac{8}{9}, \qquad y = \frac{1}{9}.
\]

If player two can choose a single possible bet size, it is b2* = 0 and the game plays arbitrarily: the players effectively have no options, with y = 0.
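The choice of bet size can be reproduced by scanning the equilibrium payoff over a grid of bet sizes. This is a numerical sketch, not the analytic argument; the grid spacing of 0.01 and the upper limit of 10 are arbitrary assumptions, and `equilibrium_payoff` is a name introduced here.

```python
from fractions import Fraction as F

def equilibrium_payoff(b):
    """Payoff y(s*(b), t*(b)) to player 1 for the fixed-bet-size solution."""
    b = F(b)
    c = b**2 + 5 * b + 4
    v, k, u = (b + 2) / c, (2 * b + 4) / c, (b**2 + 4 * b + 4) / c
    return (b * k * v + b * k * u - b * v**2 + 2 * k * u
            - b * k - 2 * k - u**2 + 1)

# Scan bet sizes 0, 0.01, ..., 10 (in units of one blind) for the best one.
grid = [F(i, 100) for i in range(0, 1001)]
best = max(grid, key=equilibrium_payoff)
print(best, equilibrium_payoff(best))   # 2 1/9
print(equilibrium_payoff(0))            # 0
```

On this grid the payoff rises up to b = 2 and falls afterwards, consistent with Fig. 13.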

6.3 [0,1] Free Bet Size


For this third [0, 1] game we aim to understand how big hands want to bet in relation to their likelihood of being the best hand. We keep the game the same as in section 6.1 with one important change: player one can choose a bet size b for every holding. Since b is not fixed we need to restrict it by the stack sizes, which we set to N, where N is very large.

\[
P = \{1, 2\}, \qquad S = [0, N]^{[0,1]}, \qquad T = \{\text{call}, \text{fold}\}^{[0,N] \times [0,1]}
\]
\[
\begin{aligned}
y(s, t) = \int_{(h_1,h_2) \in [0,1]^2} \Big( & (s(h_1) + 1)\,\mathrm{sgn}(h_2 - h_1)\, 1_{\{t(h_2, s(h_1)) = \text{call}\}}(h_1, h_2) \\
& + (+1)\, 1_{\{t(h_2, s(h_1)) = \text{fold}\}}(h_1, h_2) \Big)\, dh_1\, dh_2
\end{aligned}
\]

The payoff for the leaves of the game tree is essentially unchanged from section 6.1. Checks are now treated as bets of b = 0. Action a is based on the bet size b it faces, but as the arguments of z simply have to mark a path through the game tree, this need not be represented.
\[
z((h_1, b), (h_2, a)) =
\begin{cases}
+1 & a = \text{fold} \quad \text{(Bet - Fold)} \\
b + 1 & a = \text{call},\ h_1 < h_2 \quad \text{(Bet - Call, 1 wins)} \\
-b - 1 & a = \text{call},\ h_1 > h_2 \quad \text{(Bet - Call, 2 wins)}
\end{cases}
\]
We again start with some observations:

- If player 2 calls against b, a stronger hand cannot perform worse than a weaker hand. Thus for player 2 a strategy of calling with [0, k_b*) and folding [k_b*, 1] against a size b weakly dominates all other strategies, and therefore there exists an optimal strategy that is part of the set of strategies of this type.
- If player 1 bets b, she will encounter a response of the type characterized by k_b*. For each hand h1 she picks the b for which the average payoff of z((h1, b), ·) is maximal over all b.

We express the value that h has in betting b as follows:
\[
EV(h, b) = +b \cdot \max\left(0, \frac{1}{b + 1} - h\right) - b \cdot \min\left(\frac{1}{b + 1}, h\right)
\]
Now it would not make sense to value bet a hand worse than c(b) = 1/(b + 1), so using h < 1/(b + 1) we get
\[
EV(h, b) = +b\left(\frac{1}{b + 1} - h\right) - b\,(h) = b\left(\frac{1}{b + 1} - 2h\right)
\]
Let us take a quick look at what this function looks like for the example h = 0.2, so a top 20% hand. Apparently the best bet size is somewhere around 0.5, a half pot bet. Let's solve for it. To find the maximum of the function EV we take the derivative with respect to b:
\[
\frac{d}{db} EV(h, b) =: EV'(h, b) = \frac{1}{(b + 1)^2} - 2h
\]
Then we find its root, that is the value of b where EV'(h, b) = 0:
\[
0 = \frac{1}{(b + 1)^2} - 2h, \qquad \sqrt{2h} = \frac{1}{b + 1}, \qquad b = \frac{1}{\sqrt{2h}} - 1
\]

c) Result, Application, Chart

So our optimal bet size with respect to our hand h looks like this. A few key points to notice:

- At h = 0.5 we should bet zero, that means checking behind.
- For hands worse than 0.5 we should not bet (you can try to interpret negative bet sizes).
- At h close to zero we should bet extremely large.

To get a connection to a real poker situation: h = 0.5 means we have the best hand half the time versus bluffcatchers and better. h = 0.1 means we have the best hand 90% of the time. One way to apply this bet sizing theory is to measure how often our hand is good, and translate that into h. A simple way to do it: if our hand has equity x versus a range of bluffcatchers and better, we have a (1 − x) hand h in [0, 1] notation. We call equity vs bluffcatchers and better lead. A chart then maps lead to the best bet size,
\[
b = \frac{1}{\sqrt{2(1 - x)}} - 1
\]
and, also useful, the other way around, mapping a bet size to the required lead.

Example: http://weaktight.com/4716760 vs his bluffcatch+ range:

86.5% JsTs
13.5% KK+, 55, AQs, Ac8c, KQs, Kc8c, QJs, Tc8c, 98s, AQo, KQo, QJo

opt size(86.5%) = sqrt(1/(2(1 − 0.865))) − 1 = 1/sqrt(2 · 13.5%) − 1 = 1.924 − 1 = 0.92 = 92%

Almost a pot sized bet! Lower it a little bit since he can also raise his nuts and a few bluffs.
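The sizing rule derived above is easy to turn into a small lookup helper. The following is a sketch with names invented here (`bet_ev`, `optimal_bet`, `optimal_bet_from_lead`); bet sizes are measured in pots as in the text, and the functions are only meaningful for hands h in (0, 0.5].

```python
import math

def bet_ev(h, b):
    """EV of betting b (in pots) with hand h, valid while h < 1 / (b + 1)."""
    return b * (1 / (b + 1) - 2 * h)

def optimal_bet(h):
    """Bet size that maximizes bet_ev for a hand h in (0, 0.5]."""
    return 1 / math.sqrt(2 * h) - 1

def optimal_bet_from_lead(x):
    """The same sizing expressed through the lead x = 1 - h."""
    return optimal_bet(1 - x)

print(round(optimal_bet(0.2), 2))               # 0.58
print(round(optimal_bet_from_lead(0.865), 2))   # 0.92
```

The first line refines the rough "around half pot" reading for a top 20% hand, and the second reproduces the 92% pot result of the worked example.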

7 Botting
When one has the option to check, checking dominates folding. In this text the option of open folding or folding behind will therefore never be considered, even though they are usually part of the strategy space.

Game 1: Description:
board: KQJT9
stacks: infinite
betting: one bet
no blocking
OOP: {A(a), split(b)}
IP: {A(c), split(d)}, a, b, c, d ∈ N

Solution space (OOP, IP, OOP):


If either player is faced with a bet at any point while holding an Ace, folding is dominated. If IP is faced with a check by OOP, betting an A dominates checking it behind. All other decision points need to be solved for.

OOP can bet himself or check. If he is betting he may select a number of different sizes. If two different sizes s1 and s2 are played, they must yield the same EV, making the use of a single size, say s1, co-optimal. As such, considering only a single bet size s1 is sufficient. All hands that don't bet s1 will check and may then face a bet by IP, which they can call or fold to. If IP is checked to he may also select a number of different bet sizes for his bets. However, by the same argument, IP can play optimally with only one bet size when checked to: s2. This yields the following parametrization for the game:

OOP bets s1 with {A(a1), split(b1)} where a1 ∈ [0, a], b1 ∈ [0, b], s1 ∈ ]0, ∞]
OOP checks with {A(a − a1), split(b − b1)}
IP calls s1 with {A(c), split(d1)} where d1 ∈ [0, d]
IP folds to s1 with {A(0), split(d − d1)}
IP bets s2 with {A(c), split(d2)} where d2 ∈ [0, d]
IP checks with {A(0), split(d − d2)}
OOP calls s2 with {A(a − a1), split(b2)} where b2 ∈ [0, b − b1]
OOP folds to s2 with {A(0), split(b − b1 − b2)}

Bet, Call: ({A(a1), split(b1)}, {A(c), split(d1)})
Bet, Fold: ({A(a1), split(b1)}, {A(0), split(d − d1)})
Check, Check: ({A(a − a1), split(b − b1)}, {A(0), split(d − d2)})
Check, Bet, Fold: ({A(a − a1), split(b − b1)}, {A(c), split(d2)})
Check, Bet, Call: ({A(a − a1), split(b − b1)}, {A(c), split(d2)})

a1 a a2

8 Summary
8.1 Further development

9 Appendix
Theorem (Minimax). In every finite two-participant zero-sum game there exist a value v ∈ R and a strategy pair (s, t) such that (s, t) has payoffs v and −v respectively and no unilateral improvement in payoff is possible for either player. We call v the value of the game. The strategies (s, t) are said to form an equilibrium pair.

The proof is based on [4, p 408-419].

Proof. Two players, P = {1, 2}, each having a finite number of pure strategies available, S = {s, t} = {{s1, ..., sn}, {t1, ..., tm}}, lead to a finite number of possible payouts \( Y_1 = \{y_{ij} \mid i \in \{1, ..., n\}, j \in \{1, ..., m\}\} \in \mathbb{R}^{n \times m} \) to player one. Player two's payoffs are the negative of player one's payoffs due to the zero-sum nature of the game.

Since the game is symmetric in description, this discussion simultaneously applies to players 1 and 2 with perspectives interchanged. Let p1 be a probability distribution over player 1's pure strategies,
\[
p_1 : s \to [0, 1] \quad \text{such that} \quad \sum_{i=1}^{n} p_1(s_i) = 1.
\]

p1 describes a (mixed) strategy sufficiently and we shall refer to p1 as a strategy. Let w1 be the lowest possible payoff for strategy p1, that is the payoff against a perfect response of player 2:
\[
w_1(p_1) = \min_{j \in \{1, ..., m\}} \sum_{i=1}^{n} p_1(s_i)\, y_{ij}
\]

Payoff w1 is always achievable for player 2 by playing the best pure strategy \( t_{j^*} \) with
\[
j^* = \arg\min_{j \in \{1, ..., m\}} \sum_{i=1}^{n} p_1(s_i)\, y_{ij}
\]

If w1 was achievable only by mixing two or more pure strategies t_k, k ∈ K, K ⊆ {1, ..., m}, then player 2 could improve by increasing the weight on the best option, \( j^* = \arg\min_{k \in K} \sum_{i=1}^{n} p_1(s_i)\, y_{ik} \). If all options are equal, any j ∈ K is a best pure strategy response to p1.

Player one is looking to find a strategy p1* such that w1(p1*) is maximal. No matter how well player 2 plays against p1*, player 1 gains w1(p1*) or more.
\[
p_1^* = \arg\max_{p_1 : s \to [0,1],\ \sum_{i=1}^{n} p_1(s_i) = 1} w_1(p_1)
\]

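If player 1 can achieve at least some w1*, then there is a p1 with
\[
\sum_{i=1}^{n} p_1(s_i)\, y_{ij} \ge w_1^* \quad \forall j \in \{1, ..., m\}
\]
Let the payoffs be restricted to positive values, y_ij > 0 for all i, j. If they are not, they can be shifted by a constant exceeding the absolute value of the lowest payoff, for example \( y_{ij}^{new} = y_{ij}^{old} + |\min_{ij} y_{ij}^{old}| + 1 \), without changing the solution. Let \( q_i = p_1(s_i) / w_1^* \). Then
\[
\sum_{i=1}^{n} q_i\, y_{ij} \ge 1 \quad \forall j \in \{1, ..., m\}
\]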

We can now view player 1's goal expressed in canonical LP form:
\[
\begin{array}{ll}
\text{maximize} & (-1)^T q \\
\text{subject to} & Aq \ge 1 \\
\text{and} & q \ge 0
\end{array}
\]
A feasible point q = (q1, ..., qn) satisfies \( \sum_{i=1}^{n} q_i = \sum_{i=1}^{n} p_1(s_i)/w_1 = 1/w_1 \) and thus corresponds to a strategy p1 with w1(p1) ≥ w1, where p1(s_i) = w1 q_i. Maximizing \( (-1)^T q = -\sum_{i=1}^{n} q_i \) is maximizing \( 1/(\sum_{i=1}^{n} q_i) \) and therefore w1. A is simply the matrix of payoffs, arranged so that the rows of Aq run over player two's pure strategies: A_ji = y_ij. Since LP ∈ P, this problem can be solved efficiently, e.g. by the simplex algorithm or proprietary software.

Turning to player two, it may not be surprising that the maximum guaranteed payoff problem is the dual to the above LP. Following the same reasoning as above for player two, the definition of w1(p1) changes, as player two's payoff is −y:
\[
w_2(p_2) = \max_{i \in \{1, ..., n\}} \sum_{j=1}^{m} p_2(t_j)\, y_{ij}
\]
The remainder of the argument is the same. This leads to an LP formulation of
\[
\begin{array}{ll}
\text{minimize} & (-1)^T r \\
\text{subject to} & A^T r \le 1 \\
\text{and} & r \ge 0
\end{array}
\]

If player one can get at least payoff \( 1/\sum_i q_i \) and player two can hold her to at most payoff \( 1/\sum_j r_j \), then \( 1/\sum_i q_i \le 1/\sum_j r_j \) due to the zero-sum nature of the game. We must now show that they are in fact equal, and that \( v = 1/\sum_i q_i = 1/\sum_j r_j \) is the value of the game. TODO show there exists feasible point which is at v
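The guaranteed payoffs w1 and w2 used in the proof can be illustrated on a tiny matrix game. The sketch below does not solve the LP; the 2x2 payoff matrix and the mixed strategies are assumptions picked by hand so that both guarantees meet at the same value, and the function names are introduced here.

```python
from fractions import Fraction as F

def guaranteed_payoff_1(p1, y):
    """w1(p1): player 1's payoff against player 2's best pure response."""
    n, m = len(y), len(y[0])
    return min(sum(p1[i] * y[i][j] for i in range(n)) for j in range(m))

def guaranteed_payoff_2(p2, y):
    """w2(p2): the most player 1 can win against player 2's mix p2."""
    n, m = len(y), len(y[0])
    return max(sum(p2[j] * y[i][j] for j in range(m)) for i in range(n))

# Illustrative payoff matrix y[i][j] for player 1 (not taken from the text).
y = [[3, 1],
     [0, 2]]

# Pure strategies leave a gap between the two guarantees ...
print(guaranteed_payoff_1([1, 0], y), guaranteed_payoff_2([1, 0], y))   # 1 3

# ... while these hand-picked mixed strategies close it at the value 3/2.
p1 = [F(1, 2), F(1, 2)]
p2 = [F(1, 4), F(3, 4)]
print(guaranteed_payoff_1(p1, y), guaranteed_payoff_2(p2, y))           # 3/2 3/2

# The LP variables q_i = p1(s_i) / w1 then sum to 1 / w1.
w1 = guaranteed_payoff_1(p1, y)
q = [p / w1 for p in p1]
print(sum(q) == 1 / w1)                                                 # True
```

For any pair of strategies w1(p1) ≤ w2(p2) holds; the theorem states that some pair attains equality.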

Bibliography
[1] Bill Chen and Jerrod Ankenman. The Mathematics of Poker. ConJelCo, Pittsburgh, PA, 2006.
[2] Chudacoff H. Children at Play: An American History. NYU Press, New York, NY, 2007.
[3] Michael Johanson. Measuring the size of large no-limit poker games. http://poker.cs.ualberta.ca/publications/2013-techreport-nl-size.pdf. 2013.
[4] R. Duncan Luce and Howard Raiffa. Games and Decisions. Dover, New York, 1957.
[5] J. F. Nash. The bargaining problem. Econometrica (18), 1950.
[6] J. F. Nash. Non-cooperative games. Annals of Mathematics (54), 1951.
[7] Regina M. Milteer, Kenneth R. Ginsburg, Deborah Ann Mulligan. The importance of play in promoting healthy child development and maintaining strong parent-child bond: Focus on children in poverty. American Academy of Pediatrics, 2011. http://pediatrics.aappublications.org/content/129/1/e204.full.html.
[8] David Sklansky. The Theory of Poker. Two Plus Two, Las Vegas, 1987.
[9] von Neumann and Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Oxfordshire, 4 edition, 1944.
