Other than the use of compression and message lengths that form the basis of this and the previous paper by Jansen et al. [17][18] there are many other forms of inference of game players' strategies. One of these techniques is that formulated by Jonathan Baxter [3] which involves the use of temporal differences. A temporal difference is the technique for "approximating the expected long term future cost (or cost-to-go) of a stochastic dynamical system as a function of the current state" [3]. This technique is quite different in that it is not determined simply through the data of past games; it relies on the playing of games with opponents to find the best possible moves.
Another area that attempts to find the strategy is through perceptual chunks, a technique Walczak utilized [34][33], and studied by others such as Schaeffer [15]. The definition of a chunk, given by Walczak [33] is: "any group of objects in the relevant domain that are perceptually (e.g., visually) grouped together". These papers take an entirely different approach to the previous studies; they concentrate on finding repeated patterns in the chessboard to predict a player's next move. The idea of using patterns for the prediction of a players move, relies on the studies by de Groot, Chase and Simon [8] [5] [6]. These studies have found that grandmasters are able to memorise thousands of patterns; that being the big difference between them and novices. The chunking method can then learn a pattern and the next probable move, given that pattern. This system closely reassembles the idea in this Thesis, as it works from game data, but obviously, it looks at pattern structure rather than attribute and search structure. Another form of this pattern matching was developed by Eduardo Morales [24]. However, this method then uses decision rules such as those used in other forms of artificial intelligence like 'Go' [20].
Another scheme is by Rajiv Sarin and Farshid Vahid, [29][28] that stems from the economists philosophy that all actions come from a person's prior beliefs. This thought was first tested in 1997 and found to have fairly high prediction rates. This method however does not learn from a database and does not use evaluation functions and search strategies in the same form as used in this paper.
David Carmel and Shaul Markovitch [4] have shown that if you can simplify a few conditions under which game playing takes place, it is possible to infer the original game playing strategy. They have shown that if an opponent's evaluation function is a linear combination of attributes and that if the program, trying to learn this, knows the complete set of these attributes, plus that of the opponent and keeps this same evaluation function throughout play, it is then possible to learn it. There is also the assumption that the opponent uses a basic minimax search to a fixed uniform depth. This means that quiescent search or singular extensions are not used and the opponent does not use any explicit model of a player.
The algorithm has a weight vector that holds the weights of each attribute. This weight vector is then found by simply performing a hill-climbing search at each depth to improve the weight vector, until no more significant improvement can be found. The move chosen should then have a heuristic value higher than that of any of the other possible moves. This is made into an inequality which states; that the weight times the heuristic value of the move chosen, should be greater than or equal to that of the weight vector times any other heuristic value. These inequalities are then accumulated for each ply and then the constraints (inequalities) are then solved to find a weight vector. This weight vector is then compared with the current best weight vector to find which is more accurate. If this new vector is no better than the old are, then the old vector is thought to be the best representation of the weights. This is then repeated with the next depth down the search tree, with the best vector now being the current vector for the next depth. The depth that the opponent searches to, is learnt by finding the depth which has highest accuracy in predicting the opponent's moves.
This method has been tested on three different strategies and was able to achieve prediction rates of 100% on the first two strategies and 93% on the third. In all cases, the actual depth of the search was found. This is very encouraging as it means that in an ideal world, the strategy of an opponent could be learnt almost perfectly, but as the assumptions do not hold, especially in relation to humans and world champions, this method is only of limited value. A further problem in trying to use this method in the real world, is that the technique is not robust; for instance a small error in calculating the evaluation function can increase the error rate of the depth finding part of the algorithm considerably. This method is significant however, as it forms the basis of the search for a player's strategy used in section 5. This paper uses a similar idea but is implemented in a different form.
Additional learning evaluation functions were created by Ferrer and Martin [13] and Lorenz and Markovitch [22] who used genetic programming to find evaluation functions. This approach used the same idea of weights but little else in common. Other areas related are those of using Neural Networks with NeuroChess [32], Metagame playing (Metagame is where one program can play a few similar types of board games)[25], and feature discovery [12].