next up previous contents
Next: Assessment of reliance on Up: Inference of Search Strategy Previous: Complexity

Inference Results

It can be seen quite clearly that the inference is fairly accurate and that the differences in the search strategies are enough for the Maximum Likelihood measure to differentiate between them. This is excellent, but it is reliant on the evaluation function to be known in advance. The key benefit is that the speculation over how deep a chessmaster actually searches can, to some extent, be established with this technique. The $\lambda $ values chosen were for the two attribute trials Material = 5 and Mobility = 0.5 and for the four attribute trials Material = 7, Mobility = 0.2, Centre Control = 0.5 and the Attack attribute was set at 3.0. These attribute values were chosen because the Attack and Material attributes are both very important, due to them increasing the likelihood the program will eliminate a quiescent position by capturing a piece. This was used so that a board game would have a lower difference in the overall strategy as the quiescent situations would be avoided or quickly eliminated by the program and thus inference between this strategy and another would become more difficult. If the inference was successful then the process should succeed in easier situations. The result was successful as shown below and thus is used in less error prone areas.

Where the Strategies are as follows:


 
Table 7: Inference results from inferring search strategies with one game and 100 trials per strategy.
  Actual Strategy
Inferred min1max1 min1max2 min1max3 min1max4 min2max2
min1max1 100% (30660.4) 0% (33383.0) 0% (30485.0) 0% (28191.3) 0% (54230.1)
min1max2 0% (41877.5) 88%(23435.8) 3% (24985.0) 3% (22787.1) 3% (42536.1)
min1max3 0% (54491.9) 4% (34823.7) 59% (22770.0) 39% (20124.4) 1% (81268.3)
min1max4 0% (54481.6) 4% (35368.4) 38% (22828.5) 58% (19892.4) 0% (81664.8)
min2max2 0% (48771.9) 4% (26674.3) 0% (28091.8) 0% (26961.1) 96% (35705.0)
$\hat\mu$ 306.6 234.4 227.7 198.9 357.1
$\hat\sigma$ 226.3 129.3 103.2 98.3 370.2
Min 14.7 8.9 11.03 14.0 16.3
Max 2182.8 970.6 606.1 499.3 3458.6

The tables are organised in the following format. Each row indicates the inferred strategy, and each column shows the actual strategy. The percentage of inferences for a particular strategy can be read down the column. The percentage indicates, for that strategy, how many times it was inferred. The value that is next to this is the total message length after 100 trials. The value for the mean shows the average message length that was inferred for that actual strategy. The $\hat\sigma$ is the standard deviation of the actual strategies message length and the Min and Max values are that of the actual message length. All message lengths are measured in bits.

The results for Table 7 show that for only one game generally all strategies are inferred correctly. The simple one ply search is quite different from that of the other search strategies and this fact is represented in that this strategy is found 100% of the time. The other strategies of one to two ply and two ply are also found with quite a high degree of accuracy and this can be attributed to slight differences to the others, while still having enough information to differentiate between them. The strategies of one to three ply and one to four ply are a lot closer, due to the lack of information and as a result, the inference of these strategies is not as accurate. The message lengths are relatively small and even more so when considering that the standard deviation is so high. The minimum and maximum values give some idea as to why the standard deviation is so large. The minimum is very small with one game only 8.9 bits. This small size is probably the main cause for the inaccuracy in the prediction.


 
Table 8: Inference results from inferring search strategies with two games and 100 trials per strategy.
  Actual Strategy
Inferred min1max1 min1max2 min1max3 min1max4 min2max2
min1max1 100% (64144.9) 0% (67386.9) 0% (59882.6) 0% (59583.9) 0% (105067.1)
min1max2 0% (91940.7) 99%(47236.3) 0% (48905.5) 2% (49528.0) 0% (82479.5)
min1max3 0% (116077.8) 0% (72508.1) 72% (44147.0) 28% (44113.3) 0% (159510.5)
min1max4 0% (115865.8) 0% (73428.9) 28% (44362.6) 70% (43699.4) 0% (160093.7)
min2max2 0% (110670.1) 1% (53851.5) 0% (55628.5) 0% (57216.3) 100% (69134.9)
$\hat\mu$ 641.4 472.4 441.5 437.0 691.3
$\hat\sigma$ 299.9 206.3 149.8 169.0 399.9
Min 234.2 167.9 152.9 93.2 160.7
Max 2407.8 1603.5 937.4 997.7 3730.7

The results of Table 8 are very interesting; they show that with a very small increase in the number of moves played, the inference can be enhanced markedly. The inference for the third and fourth strategies improved by over $10\%$. This is quite significant and is encouraging, because it means, as the number of moves used in the inference increases, the prediction rate increases. The values for the average message length show that the message lengths have doubled, as expected, but that they are still rather small. A minimum value of 93.2 bits is still quite small and probably the reason for the inaccuracy. The standard deviation has dropped in proportion to the average message length. It is mostly around half that of the average and while this is still high, it is reduced compared to Table 7, where it was greater than the average on the fifth strategy. This reduction means that the predictions can be more stable and therefore more accurate thus less prone to random factors.


 
Table 9: Inference results from inferring search strategies with five games and 100 trials per strategy.
  Actual Strategy
Inferred min1max1 min1max2 min1max3 min1max4 min2max2
min1max1 100% (153360.7) 0% (168470.9) 0% (151850.3) 0% (152942.6) 0% (267026.0)
min1max2 0% (214609.7) 100%(117244.3) 0% (124255.7) 0% (127918.7) 0% (211271.1)
min1max3 0% (274712.1) 0% (184740.6) 84% (112318.8) 13% (115051.5) 0% (415458.8)
min1max4 0% (274523.6) 0% (186709.1) 16% (112803.1) 87% (114061.4) 0% (416330.2)
min2max2 0% (252801.7) 0% (132944.6) 0% (141298.2) 0% (145715.0) 100% (176313.6)
$\hat\mu$ 1533.6 1172.4 1123.2 1140.6 1763.1
$\hat\sigma$ 393.4 298.9 228.5 259.3 531.0
Min 903.3 626.3 681.3 587.9 685.7
Max 3331.1 2482.5 2031.6 1866.4 4520.6

It is quite easy to see by looking at Table 9 that the predictive accuracy has improved from the two game to the five game inference by nearly another $15-20\%$. This is very impressive and shows that the predictive accuracy is related to the amount of data. The first, second and fifth strategies having the most information separating them, had 100% accuracy. The two strategies having more problems are the deeper depth quiescent strategies that require non-quiescent positions enabling them to be separated. When a prediction is wrong, it is usually a strategy that is very similar to the correct strategy. It should be noted that the minimum values at this point, are now, a lot higher than they were before. The increased predictive ability is because of the higher values and the standard deviation reducing in size.

 
Table 10: Inference results from inferring search strategies with ten games and 100 trials per strategy.
  Actual Strategy
Inferred min1max1 min1max2 min1max3 min1max4 min2max2
min1max1 100% (306740.8) 0% (336941.8) 0% (303335.4) 0% (305343.7) 0% (534052.0)
min1max2 0% (430046.4) 100%(234488.7) 0% (248289.2) 0% (255104.0) 0% (422542.1)
min1max3 0% (550513) 0% (369481.1) 100% (224461.6) 0% (229348.4) 0% (830917.7)
min1max4 0% (550085.1) 0% (373418.1) 0% (225415.3) 100% (227412) 0% (832660.6)
min2max2 0% (507215.0) 0% (265889.3) 0% (282193.8) 0% (290473.3) 100% (352627.3)
$\hat\mu$ 3067.4 2344.9 2244.6 2274.1 3526.3
$\hat\sigma$ 551.3 394.1 304.4 393.9 753.0
Min 2296.0 1607.5 1633.0 1365.6 2384.1
Max 4932.4 3518.7 3051.8 3025.0 6310.1

For the ten game inference Table 10 the results show that there is a perfect prediction rate. This is not too surprising when considering the increase in prediction rates from the previous game sizes. The thing to note is, the difference in size between each of the strategies is now very large when considering the total message length. The minimum values are high and the standard deviation is rather small, comparatively, all these things contribute to the high prediction rate.


 
Table 11: Inference results from inferring search strategies with ten games and using four attributes and 100 trials per strategy.
  Actual Strategy
Inferred min1max1 min1max2 min1max3 min1max4 min2max2
min1max1 100% (206772) 0% (302284.4) 0% (300692.6) 0% (297502.7) 0% (385977.7)
min1max2 0% (310853.5) 100%(232008.2) 0% (284756.5) 0% (281363.8) 0% (354066.6)
min1max3 0% (474911) 0% (578539.3) 100% (220678.8) 0% (223280.2) 0% (1004405.7)
min1max4 0% (477676.4) 0% (580077.2) 0% (221286.8) 100% (222507.8) 0% (998919.8)
min2max2 0% (349045.8) 0% (263577.2) 0% (314318.6) 0% (311945.7) 100% (240843)
$\hat\mu$ 2067.7 2320.1 2206.8 2225.1 2408.4
$\hat\sigma$ 515.6 432.5 354.8 307.7 489.2
Min 1385.9 1349.9 1806.2 1600.3 1469.0
Max 3197.7 3011.5 3142.8 2830.3 3288.4

All the values of the total message length, inference prediction, $\hat\mu$, $\hat\sigma$, Min and Max are similar to that of the two attributes used in Table 10. The four attributes had little effect on this methods ability to infer the strategy used. Table 11 clearly demonstrates the value that comes from the evaluation function, although affecting the choice of move, does not affect the search style when known. This is because, regardless of how many attributes are used, only a single value is outputted from the evaluation function and thus, if this value is known, it would then not matter how many attributes are used to generate it, the value is still known. The results for the 1 game, 2 game and 5 game trials were not included here as they were so similar to that of the two attribute trials that it was felt they were irrelevant. It can be seen quite clearly from the Table 11 that the results in the same area as that before.

This method predicts, in nearly all cases, the strategy that was used, or falsely predicts the next closest strategy, no matter the sample size. The results show that for similar strategies, larger sample sizes are needed to differentiate between them. Examining one game the average difference between the third strategy and the fourth is about $0.1\%$ difference, this makes the inference of these strategies difficult. This is due to an insufficient number of moves having no quiescence at the third ply and the need to look at the fourth. With the higher number of moves and games, it improves considerably. At five games, this problem area is not as serious with very few mistakes occurring and then only when the games are short. At ten games, there are no mistakes and the difference between the strategies is quite high. It is very useful to notice that the probability a certain strategy was used, will always be greater among similar types of strategies. This is advantageous, as it demonstrates that, even if the correct strategy is not chosen then the next closest to the original will be. For the inference of strategies that are unknown, this is very useful. This method allows for the probability of a strategy to be calculated, by having the current message length for a strategy divided by the sum of all the message lengths a probability of it being the correct strategy can be found. This would also allow for a determination of how close the strategies are to each other. This analysis however was not done here, but could be a possible future direction.


next up previous contents
Next: Assessment of reliance on Up: Inference of Search Strategy Previous: Complexity
Richard A O Wallbrink
2000-11-07