A Preliminary MML Linear Classifier using Principal Components for Multiple Classes

L Kornienko, D W Albrecht and D L Dowe


In this paper we improve on the supervised classification method developed in Kornienko et al. (2002) by introducing Principal Component Analysis into the inference process. We also extend the classifier from binomial (two-class) problems to multinomial (multi-class) problems.

The classifier uses the Minimum Message Length (MML) criterion as its objective function. MML is a Bayesian technique that represents functional complexity via the length of a two-part message: the first part of the message describes the function as a model, while the second part describes the data given that model. The MML estimator has been shown to possess desirable statistical properties and, owing to the principle's intuitive and flexible nature, has already been used successfully in many different applications.
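In symbols (standard MML notation, not quoted from the paper itself), the two-part message length for a hypothesis $H$ and data $D$ can be written as

```latex
\mathrm{MsgLen}(H, D)
  \;=\; \underbrace{-\log \Pr(H)}_{\text{part 1: the model}}
  \;+\; \underbrace{-\log \Pr(D \mid H)}_{\text{part 2: the data given the model}}
```

and the inference selects the hypothesis that minimises this total length, trading model complexity against fit to the data.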

In this paper the MML criterion is applied to the classification of objects by a linear hyperplane, where the objects may come from any multi-class distribution. The inclusion of Principal Component Analysis in the original inference scheme reduces the bias present in the classifier's search technique. This improvement leads to a method which, when compared against three commercial Support Vector Machine (SVM) classifiers on binary data, was found to be as good as the most successful SVM tested. Furthermore, the new scheme can classify objects from a multi-class distribution with a single hyperplane, whereas SVMs require several hyperplanes.
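As a rough illustration of the idea of combining PCA with a single linear hyperplane, the sketch below rotates the data into its principal-component basis and then fits a hyperplane by least squares. This is a hypothetical stand-in, not the paper's MML-guided search: the function names and the least-squares fitting step are assumptions made for the example.

```python
import numpy as np

def principal_components(X):
    # Centre the data and recover the principal axes via SVD;
    # rows of Vt are the components, ordered by explained variance.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt

def fit_linear_hyperplane(X, y):
    # Least-squares fit of weights w and bias b so that
    # sign(X @ w + b) predicts the +1 / -1 class labels.
    # (A placeholder for the paper's MML objective.)
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def classify(X_train, y_train, X_test):
    # Rotate both sets into the principal-component basis before
    # fitting, mirroring the idea of adding PCA to the inference.
    V = principal_components(X_train)
    w, b = fit_linear_hyperplane(X_train @ V.T, y_train)
    return np.sign(X_test @ V.T @ w + b)
```

On two well-separated clusters this recovers the labels exactly; the paper's contribution is to choose the hyperplane by minimising the MML two-part message length rather than squared error, and to handle more than two classes with the one hyperplane.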