SCHOOL OF COMPUTER SCIENCE AND SOFTWARE ENGINEERING
MONASH UNIVERSITY


TECHNICAL REPORT 2003/142


MML Inference of Single-Layer Neural Networks

E Makalic, L Allison and D L Dowe

ABSTRACT

Inference of the optimal neural network architecture for a specific dataset is a long-standing and difficult problem. Although a number of researchers have proposed various model selection procedures, the problem remains largely unsolved. The architecture of a neural network (the number of hidden layers, hidden neurons, inputs, etc.) directly affects its performance. A network that is too simple will not learn the problem sufficiently well, resulting in poor performance. Conversely, a network that is too complex can overfit and generalise poorly. This paper introduces a novel model selection criterion, based on Minimum Message Length (MML), for inference of single-hidden-layer, fully connected, feedforward neural networks. The criterion's performance is demonstrated on several artificial and real datasets. Furthermore, the MML criterion is compared against an MDL-based criterion and variations of Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC). In all tests considered, the MML criterion never overfitted and performed as well as, and often better than, the other model selection criteria.