One goal of this research project was to construct a series of self-play poker tournament experiments that yield statistically significant results showing how each enhancement improves BPP's performance. This section describes the experimental design used to accomplish this goal.
Each self-play tournament consists of playing one version of BPP against another. After each game, the seat order is swapped so that both players take turns being seated immediately to the left of the ``dealer''. This ensures that neither player is disadvantaged by being the first to act in most games. A tournament consists of 25,000 different games. The number of games per tournament was chosen to balance the time available for running the experiments against the need for statistically significant results.
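The tournament structure described above can be sketched in a few lines of Python. This is only an illustrative outline, not BPP's actual harness: the function names `play_game` and `run_tournament`, the player labels, and the random placeholder outcome are all assumptions introduced here. The point it shows is the alternation of seat order after every game, so that each player acts first in half of the games.

```python
import random


def play_game(seats, rng):
    """Hypothetical stand-in for one two-player poker game.

    `seats` lists the players in seat order; seats[0] acts first.
    Returns the net betting units won by each player (zero-sum).
    A real harness would deal cards and query each program here.
    """
    amount = rng.randint(-10, 10)  # placeholder outcome, not real poker
    return {seats[0]: amount, seats[1]: -amount}


def run_tournament(player_a, player_b, num_games=25_000, seed=0):
    """Play `num_games` games, swapping seat order after every game."""
    rng = random.Random(seed)
    totals = {player_a: 0, player_b: 0}
    first_to_act = {player_a: 0, player_b: 0}
    seats = [player_a, player_b]
    for _ in range(num_games):
        result = play_game(seats, rng)
        for name, units in result.items():
            totals[name] += units
        first_to_act[seats[0]] += 1
        seats.reverse()  # swap seats so the other player acts first next
    return totals, first_to_act
```

With an even number of games, each player is first to act in exactly half of them, which is the property the seat swap is meant to guarantee.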
To test an enhancement, one particular version of the program is first played against an identical program with the new feature added. For example, a version of BPP based upon the original design is pitted against an identical version with the addition of an arc representing hand dependence. Second, the enhancement is tested in combination with other changes. Finally, the modification is tested against opponents that have different playing styles, such as a player based purely upon expert knowledge and humans with different levels of experience.
To measure the impact of each new enhancement on the program's performance, we use the average number of betting units won per game. This is an easy metric for comparison and an obvious choice as we are trying to maximise the size of winnings.
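As a rough illustration of this metric, the sketch below computes the average betting units won per game together with an approximate 95% confidence interval. The text does not state which significance test was used, so the normal-approximation interval here is an assumption, as are the function name `mean_winnings` and its parameters.

```python
import math
import statistics


def mean_winnings(per_game_units, z=1.96):
    """Average betting units won per game, with an approximate interval.

    `per_game_units` holds one player's net winnings for each game.
    The interval uses a normal approximation (z = 1.96 for roughly 95%),
    which is an assumption and reasonable only for large samples.
    """
    n = len(per_game_units)
    mean = statistics.fmean(per_game_units)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(per_game_units) / math.sqrt(n)
    return mean, (mean - z * sem, mean + z * sem)
```

Under this approximation, an enhancement would be judged to have a significant effect in a self-play tournament when the interval for the enhanced version's mean winnings excludes zero.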
One must be cautious when interpreting the results of these self-play experiments, since any feature could perform worse (or better) when playing against human opposition. The main function of these experiments is to weed out bad ideas. Ultimately, the only performance metric that matters is how BPP plays against humans. Since such data are difficult (and expensive) to obtain, most of the experimentation must be done with self-play first.