Snob page

Welcome to David Dowe's Snob page (for MML finite mixture models).
(See the page re-formatted and perhaps out of date here.)

Papers on theory behind Snob and papers on applications of Snob.

Snob uses the Minimum Message Length (MML) principle to perform mixture modelling, that is, to infer finite mixture models from data.
Mixture modelling (or mixture modeling) concerns modelling a statistical distribution by a mixture (or weighted sum) of other distributions. Mixture modelling is also known as
  • unsupervised concept learning or unsupervised learning (in Artificial Intelligence)
  • intrinsic classification (in Philosophy), or simply classification
  • clustering
  • numerical taxonomy
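
To make the "weighted sum" in the definition above concrete, here is a minimal Python sketch of a finite mixture density. It is not taken from Snob: the function names (`normal_pdf`, `mixture_pdf`) and the two-class example parameters are hypothetical, and Normal components are used only because they are the simplest to write down.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    """Weighted sum of component densities; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_pdf(x, m, s) for w, (m, s) in zip(weights, components))

# Hypothetical two-class mixture: 30% Normal(0, 1) and 70% Normal(5, 2).
weights = [0.3, 0.7]
components = [(0.0, 1.0), (5.0, 2.0)]
density_at_zero = mixture_pdf(0.0, weights, components)
```

Mixture modelling in the sense of this page is the inverse problem: given only the data, infer the number of components, their weights and their parameters.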

    Name of program:
    As per sec. 0.2.4, p535, footnote 113 in D. L. Dowe (2008), "Foreword re C. S. Wallace", Computer Journal, Vol. 51, No. 5 (Sept. 2008) [Christopher Stewart WALLACE (1933-2004) memorial special issue], pp523-560 (and here), the program was given the name "Snob" because of Chris Wallace's self-effacing joke that it (supposedly) makes "arbitrary class distinctions".

    Minimum Message Length (MML) is a method of machine learning, statistical inference, inductive learning, "knowledge discovery" and "data mining" very much in line with the notions of Kolmogorov complexity and algorithmic information theory pioneered by R. J. Solomonoff, A. N. Kolmogorov and Greg Chaitin. See also Wallace & Dowe (1999a), "Minimum Message Length and Kolmogorov complexity", Comp. J., Vol 42, No. 4, 270-283 [which is the Computer Journal's most downloaded "full text as .pdf" article - see, e.g., here].

    [Possibly see also Ray Solomonoff (1926-2009) 85th memorial conference (Wed 30 Nov - Fri 2 Dec 2011), 1st Call for Papers and conference proceedings.]

    The original Snob paper was: Wallace, C. S. and Boulton, D.M. (1968), `An Information Measure for Classification', Computer Journal, Vol. 11, No. 2, 1968, pp. 185-194. This is the same paper in which MML was developed. (See also more recent Snob theory and application papers.)
    See also D. L. Dowe (2008), "Foreword re C. S. Wallace", Computer Journal, Vol. 51, No. 5 (Sept. 2008) [Christopher Stewart WALLACE (1933-2004) memorial special issue], pp523-560 (and here) for a survey of all of Wallace's works - including all his MML work and his Snob MML mixture modelling work.

    Snob currently deals with finite mixture models (or a finite mixture model) of
  • Normal (or Gaussian) distributions
  • discrete multi-state (also called categorical or multinomial; Bernoulli in the two-state case) distributions
  • Poisson distributions
  • von Mises circular distributions
    and it also deals with missing data.
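
    As an illustration of how the attribute types listed above can sit side by side in one class, and of one simple way to treat missing values, the following Python sketch computes the log-likelihood of a single record under a single class. It is only a sketch under stated assumptions, not Snob's own code: it assumes attributes are independent within a class, represents a missing value as `None` and simply skips it, and all names (`bessel_i0`, `log_density`, `class_log_likelihood`, `class_model`) are hypothetical. Snob itself works with message lengths rather than bare likelihoods.

```python
import math

def bessel_i0(kappa, terms=60):
    """Modified Bessel function I0 via its power series (fine for moderate kappa)."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (kappa / 2.0) ** 2 / (m * m)
        total += term
    return total

def log_density(kind, params, value):
    """Log density (or log probability) of one attribute value."""
    if kind == "normal":
        mean, sd = params
        z = (value - mean) / sd
        return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)
    if kind == "multistate":
        return math.log(params[value])  # params: state probabilities, value: state index
    if kind == "poisson":
        return value * math.log(params) - params - math.lgamma(value + 1)  # params: rate
    if kind == "vonmises":
        mu, kappa = params
        return kappa * math.cos(value - mu) - math.log(2.0 * math.pi * bessel_i0(kappa))
    raise ValueError("unknown attribute kind: " + kind)

def class_log_likelihood(record, class_model):
    """Sum the log densities of the observed attributes; None marks a missing value."""
    return sum(
        log_density(kind, params, value)
        for (kind, params), value in zip(class_model, record)
        if value is not None
    )

# Hypothetical class with one attribute of each supported type.
class_model = [
    ("normal", (170.0, 10.0)),
    ("multistate", [0.2, 0.5, 0.3]),
    ("poisson", 2.0),
    ("vonmises", (math.pi, 1.5)),
]
complete = [165.0, 1, 3, 3.0]
partial = [165.0, None, 3, None]  # second and fourth attributes are missing
```

    Under these assumptions a missing attribute contributes nothing to the sum, so `class_log_likelihood(partial, class_model)` uses only the Normal and Poisson terms.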

    Chris Wallace extended Snob in 1998 to deal with spatial correlation (and Markov fields), as occurs in images.
    Gerhard Visser and David Dowe endeavoured to extend Wallace (1998, above) in G. Visser and D. L. Dowe (2007), "Minimum Message Length Clustering Of Spatially-Correlated Data with Varying Inter-Class Penalties", and in G. Visser, D. L. Dowe and J. P. Uotila (2009), "Enhancing MML Clustering using Context Data with Climate Applications".

    Russell Edwards and David Dowe created a version of Snob (also in 1998) which deals with single Gaussian factor analysis in sequentially and spatially uncorrelated data. It uses total assignment. (See publications.) Yudi Agusta and David Dowe have also developed MML mixture modelling software for other correlated Gaussians, t distributions (2002) and Gamma distributions (2003, .pdf); and Jon Oliver and David Dowe published a note (1996) on MML mixture modelling of von Mises-Fisher spherical distributions.
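
    For readers unfamiliar with the term, "total assignment" places each datum wholly in one class, in contrast to partial (fractional) assignment, where a datum is shared among classes in proportion to its class membership probabilities. The Python sketch below contrasts the two for a one-dimensional mixture of Normal components. It is an illustration only, not the Edwards and Dowe program; the function names and the example parameters are hypothetical.

```python
import math

def memberships(x, weights, components):
    """Posterior class membership probabilities of datum x (partial assignment)."""
    joint = [
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, (m, s) in zip(weights, components)
    ]
    total = sum(joint)
    return [j / total for j in joint]

def total_assignment(x, weights, components):
    """Index of the single most probable class for datum x (total assignment)."""
    probs = memberships(x, weights, components)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical two-class model: equal weights, Normal(0, 1) and Normal(5, 1).
weights = [0.5, 0.5]
components = [(0.0, 1.0), (5.0, 1.0)]
fractional = memberships(2.0, weights, components)   # shared between both classes
hard = total_assignment(2.0, weights, components)    # wholly in one class
```

    A datum close to the boundary between two classes is where the two schemes differ most: partial assignment keeps its uncertainty, total assignment discards it.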

    A useful set of links on Snob is given in the next few lines immediately below:
    Snob ReadMe, documentation and (data) sd1.raw files, .ps of Wallace and Dowe (1997) and .pdf of more recent, 2000, paper:
    Wallace, C.S. and D. L. Dowe (2000). MML clustering of multi-state, Poisson, von Mises circular
    and Gaussian distributions, Statistics and Computing, Vol. 10, No. 1, Jan. 2000, pp73-83.
    http://www.wkap.nl/issuetoc.htm/0960-3174+10+1+2000
    http://www.wkap.nl/sampletoc.htm?0960-3174+10+1+2000
    See also Wallace and Dowe (1994) (reference) in C.S. Wallace publications and/or D.L. Dowe publications.

    The Snob software is available subject to conditions.
    Snob Method: Bayesian, Minimum Message Length (MML). Features: Deals with missing data.

    Chris Wallace MML publications, 1968-1991 and Chris Wallace MML publications, 1990- .
    Chris Wallace MML applications, 1968-1996 and Chris Wallace MML applications, 1990- .

    Fortran compiler
    If you would like to download a Linux Fortran compiler, go to http://www.rpmfind.org/RPM/EByName.html and look for "egcs-g77...".

    C, C++, Java version(s)
    A C version is currently under construction.

    Link to Random number generation software
    (Pseudo-)Random number generation software in Fortran :
    uniform (for multinomial), Gaussian (Normal), von Mises (circular) and Poisson.
    http://www.csse.monash.edu.au/research/mdmc/software/random.
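
    The software linked above is in Fortran and is not reproduced here. As a rough illustration of the same four kinds of generator, the Python sketch below draws one synthetic record with a Normal, a multi-state, a Poisson and a von Mises attribute, using only the standard library. The multi-state and Poisson samplers are built on uniform variates (inverse-CDF lookup and Knuth's multiplication method, respectively); these are textbook methods chosen for the sketch, not necessarily those used by the linked software, and the function names and parameter values are hypothetical.

```python
import math
import random

def sample_multistate(probs, rng):
    """Draw a state index from a discrete distribution using one uniform variate."""
    u = rng.random()
    cumulative = 0.0
    for state, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return state
    return len(probs) - 1  # guard against rounding in the cumulative sum

def sample_poisson(rate, rng):
    """Knuth's multiplication method; practical only for small rates."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_record(rng):
    """One synthetic record: (Normal, multi-state, Poisson, von Mises angle)."""
    return (
        rng.gauss(170.0, 10.0),
        sample_multistate([0.2, 0.5, 0.3], rng),
        sample_poisson(2.0, rng),
        rng.vonmisesvariate(math.pi, 1.5),  # angle in radians, in [0, 2*pi]
    )

rng = random.Random(1997)  # fixed seed so that runs are repeatable
records = [sample_record(rng) for _ in range(5)]
```

    Data generated this way, from several such classes, is a convenient test input for a mixture modelling program because the true classes are known.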

    Other links
    Link to Lloyd Allison's Short note on Snob.
    Link to K D Mine's S*i*ftware Snob notes, based on material supplied by D. Dowe and L. Allison.
    Minimum Description Length (MDL) and comparisons with MML (on pp270-283 and elsewhere) in Comp. J., Vol 42, No. 4, 1999.

    Bayesian networks using MML,
    clustering, mixture modelling and finite mixture models,
    comparisons between MML and the subsequent Minimum Description Length principle,
    data repositories,
    decision trees and decision graphs using MML,
    medical research,
    "Minimum Message Length, MDL and Generalised Bayesian Networks with Asymmetric Languages", by J. W. Comley and D.L. Dowe; Chapter 11 (pp265-294) in P. Grunwald, M. A. Pitt and I. J. Myung (eds.), Advances in Minimum Description Length: Theory and Applications, M.I.T. Press, April 2005, ISBN 0-262-07262-9. {This is about Generalised Bayesian nets, generalising MML Bayesian nets or MML Bayesian networks or MML Bayes nets (or Generalised directed graphical models, generalising MML directed graphical models); and it deals with a mix of both continuous and discrete variables. (See also Comley and Dowe (2003), .pdf.)}
    Minimum Message Length (MML),
    Occam's razor (Ockham's razor),
    a probabilistic sports prediction competition (and further reading on probabilistic scoring),
    Snob theory publications and applications publications,
    (econometric) time series using MML,
    Chris Wallace (1933-2004) (developer of MML in 1968) and his (2005) [posthumous] Book: Statistical and Inductive Inference by Minimum Message Length, Springer (Series: Information Science and Statistics), 2005, XVI, 432 pp., 22 illus., Hardcover, ISBN: 0-387-23795-X. (Link to table of contents, chapter headings and more.)
    Tribute to Chris Wallace: D. L. Dowe (2008), "Foreword re C. S. Wallace", Computer Journal, Vol. 51, No. 5 (Sept. 2008) [Christopher Stewart WALLACE (1933-2004) memorial special issue], pp523-560 (and here).

    chess and game theory research,
    Feeding the world (TheHungerSite), TheRainforestSite, and "do-goody stuff, improving the world and saving the planet".

    This Snob page was put together by
    Dr David Dowe, Dept. of Computer Science, Monash University, Clayton, Vic. 3168, Australia
    e-mail: d l d at csse dot monash.edu.au (Fax: +61 3 9905-5146)
    (and was started on Sat 8th Mar. 1997) and was last updated no earlier than Mon 3rd Mar. 1998.
    Copyright David L. Dowe, Monash University, Australia, 8 Mar 1997, 3 Mar 1998, 7 May 1998, etc.
    Copying is not permitted without express permission from David L. Dowe.