The first improvement is a refinement of the network architecture BPP uses to represent the dependence between variables in the game. Modifying the simple network used by the original BPP, and in particular refining the way hands are classified within the system, is shown to yield significant improvements. This change eradicated one of the more serious problems evident in the original version, where BPP would bet inappropriately strongly or weakly given the strength of its hand. Also presented are the effects of modeling the dependence between opposing players' hands (due to hands being dealt from the same deck), which was omitted from the previous version of BPP.
Opponent modeling has been found to be an essential element of successful poker play. The weak opponent modeling implemented in the original BPP has been improved in a number of areas to allow more detailed inferences about the strength of one's hand to be made, and hence to provide better performance. In particular, specific opponent modeling is introduced as a way to model every player, whether weak or strong. BPP observes and maintains information about the actions of each opponent and uses this information to build a simple model of their play. This model is used to predict the strength of each opponent's hidden cards. The program adapts to the style of each opponent and attempts to exploit any predictable actions. This technique replaces the generic opponent modeling method employed previously, which maintained a description of ``typical'' play.
A more effective and intuitive bluffing technique is also introduced in this thesis. BPP not only over-represents the strength of its hand indiscriminately in a small percentage of games; when deciding whether to bluff, it also takes into consideration such factors as the potential of its visible cards and the likelihood that its opponent will fall for the bluff.
The final and most significant improvement presented in this thesis is the use of decision networks to compute the expected winnings for each possible action and to choose an action based upon those values. The original version of BPP used a betting strategy based upon expert knowledge, with inefficiency introduced through coarse approximations. Much of the expert knowledge has either been refined or removed from the betting strategy, allowing BPP to select an action based upon a more accurate expectation of future winnings. This improved betting strategy utilises a decision node, representing the available decisions (actions) that BPP can perform, and a utility node representing potential winnings. The network is used to determine the expected winnings that each action is likely to return given the current belief in winning. The new betting strategy provides a significant improvement in the accuracy of the evaluation function used to select an action and, as a result, the performance of BPP has improved greatly.
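As a hedged illustration of the standard decision-network calculation, and not a statement of BPP's exact implementation, the selection of an action can be written as follows. The notation is assumed here purely for illustration: $A$ is the set of available actions, $e$ is the evidence observed so far, $o$ ranges over the outcomes of the hand (for example, winning or losing), and $U(a, o)$ is the value the utility node assigns to action $a$ under outcome $o$.
\[
  \mathit{EU}(a \mid e) = \sum_{o} P(o \mid a, e)\, U(a, o),
  \qquad
  a^{*} = \arg\max_{a \in A} \mathit{EU}(a \mid e).
\]
Under these assumptions, the expected winnings of each action are the utilities weighted by the current belief in each outcome, and the action with the largest expected winnings is the one selected.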
Experimental results will be presented to demonstrate the notable improvements in BPP's playing ability due to the new features implemented.
This thesis is organised as follows. Chapter 2 introduces poker terminology, describes the game of Five-Card Stud (the poker variation played by BPP) and discusses other important work in computer poker. Chapter 3 describes the original architecture of BPP in detail. Chapter 4 discusses the new network structure used to improve BPP's representation of the world. It also discusses the refinement of the hand classifications used throughout BPP. Chapter 5 describes the improved opponent modeling techniques used by BPP. Chapter 6 presents the new betting strategy implemented using decision networks. It also looks at the ways in which BPP's bluffing behaviour has been improved. Chapter 7 gives an overview of the experimental results obtained during the course of the thesis and describes the methods used to obtain them. Conclusions and suggestions for future research are presented in Chapter 8.