Mixture modelling (or mixture modeling, or finite mixture modelling, or finite mixture modeling) concerns modelling a statistical distribution by a mixture (or weighted sum) of other distributions. Mixture modelling is also known as

Also, an e-mail list exists for "Classification, clustering, and phylogeny estimation", namely CLASS-L@CCVM.SUNYSB.EDU (or owner-class-l@CCVM.SUNYSB.EDU), as do

a WWW site for the International Federation of Classification Societies (IFCS),

a WWW site for the Classification Society of North America (CSNA),

a WWW site for the Societe Francophone de Classification (SFC),

a WWW site for the (Polish) Sekcja Klasyfikacji i Analizy Danych PTS (SKAD) and

a WWW site for the (Dutch) Vereniging voor Ordinatie en Classificatie (VOC).

In 2001, there was: Mixtures 2001, Recent Developments in Mixture Modelling, 23 - 28 July 2001, Universität der Bundeswehr, Hamburg, Germany.

[See also Ray Solomonoff (1926-2009) 85th memorial conference (Wed 30 Nov - Fri 2 Dec 2011), 1st Call for Papers.]

Most mixture modelling is done for mixtures of

However, other distributions for which mixture modelling has been done include (e.g.) :

Chris Fraley's Classification Bibliography.

Peter Macdonald's mixture distribution bibliography.

Fionn Murtagh (and CSNA)'s Classification Bibliographies.

Warren S. Sarle's selected Bibliography on Cluster Analysis.

Luis Talavera's Bibliography of Conceptual Clustering.

John Uebersax's Latent Class Analysis bibliography. (Also here.)

Below we give lists of some available mixture modellers of various distributions:

On-Line Software for Clustering and Multivariate Analysis listed by the CSNA.

Fionn Murtagh's list of Multivariate Data Analysis Software and

Fionn Murtagh's pointers to, and addresses of, lots of multivariate data analysis code.

S*i*ftware's links to clustering software.

See "Mixture modellers of Multinomial (or Bernoulli or multi-category) distributions" below.

"MIX". Commercial (see below).

Yudi Agusta and David Dowe, using MML. See, e.g., publications (2003, .pdf).

AutoClass (and Peter Cheeseman).

Clustan: www.clustan.com.

COBWEB, by Doug H. Fisher.

ECOBWEB concept formation program.

John Wolfe's

"MIX" Software Home Page (and About MIX) and mixture distribution bibliography. Commercial.

Snob (software, ReadMe and documentation files), and latest paper [pp73-83 (Jan. 2000)]; by C. Wallace and D. Dowe - finite mixture model(s) by MML.


S. Akaho's EM algorithm (was here) with link to paper.

S. Akaho's program also does "line mixing".

Mike Alder (from CIIPS, U.W.A.)'s book (including some examples of the EM algorithm used for Gaussian mixture modelling).

C. Ambroise et al.'s Constrained clustering and the EM algorithm software for spatial clustering.

S. Aylward's Mixture Modeling for Medical Image Segmentation.

Kaye Basford (co-author of mixture modelling book with (below) Geoff McLachlan)'s home page and The Biometrics Unit (University of Queensland)'s publications.

R. A. Baxter and J. J. Oliver, Finding overlapping components with MML - see also earlier related work on doing finite mixture models using MML by Wallace and Dowe (1994) and Wallace and Dowe (1997) and contemporaneous work by Wallace and Dowe (2000).

Hamparsum Bozdogan's home page.

Dr Carroll's Quasilikelihood estimation in measurement error models with correlated replicates paper.

Complex Systems Computation group (CoSCo), U. of Helsinki. Home page and research projects.

D. Dacunha-Castelle and E. Gassiat's work, papers nos. 25 and 44.

Petros Dellaportas (and Dimitris Karlis)'s (mixture modelling) papers.

David Dowe (and publications): See Snob by C. Wallace and D. Dowe. Has published on mixture modelling of Gaussians with factor analysis (with R. Edwards, 1998), and (with Y. Agusta) other correlated Gaussians (2003, .pdf), t distributions (2002) and Gamma distributions (2003).

Russell Edwards and David Dowe have extended Snob to deal with single Gaussian factor analysis (assuming total assignment) using MML.

Peter Green.

Cem Hocaoglu.

HTK Book (and links to chapters). Commercial. Entropic Cambridge Research Laboratory Ltd.

Michael Jordan's projects.

Murray Jorgensen's home page (link to MULTIMIX).

Geoff McLachlan is the author of several articles and a joint book on mixture modelling (with (above) Kaye Basford) and is currently completing EMMIX (MIXFIT) software, suitable for maximum likelihood fitting of Gaussians in discriminant and cluster analyses and many experimental situations. It permits re-sampling-based tests and bootstrap-based standard error assessment. Some of G. McLachlan and David Peel's data sets.

Boris Mirkin's publications and current projects.

Radford Neal's Bayesian Mixture Modeling by Monte Carlo Simulation and Markov Chain Sampling Methods for Dirichlet Process Mixture Models.

Adrian Raftery's and Chris Fraley's Model-Based Clustering Software (MCLUST).

Christian Robert's ftp site.

Arthur C. Sanderson.

Rob Tibshirani's research, and T. Hastie & R. Tibshirani's Gaussian mixture paper.

Gerhard Visser's and David Dowe's (2007) "Minimum Message Length Clustering Of Spatially-Correlated Data with Varying Inter-Class Penalties" (and here), and (with J. P. Uotila, 2009) "Enhancing MML Clustering using Context Data with Climate Applications" (and here).

Chris Wallace and David Dowe's Snob work (and software and ReadMe), and latest paper - see Snob above. Uses Minimum Message Length (MML). [See also Wallace (1998) and Visser & Dowe (2007) on spatial correlation.]

Mike West's publications.

Dr Carroll's A nonparametric mixture approach to case-control studies with errors in covariables paper.

Dr Carroll's Segmented regression with errors in predictors paper.

See "Mixture modellers of Gaussian distributions" above.

Snob, by Chris Wallace and David Dowe - see Snob above, under "Gaussian". Uses Minimum Message Length (MML).

Murray Jorgensen's home page (see above, or link to MULTIMIX).

Martin Puterman's home page, with several of his papers, data and codes. M. Puterman has worked on mixture models for discrete data.

John Uebersax's Latent Class Analysis page has FAQs, bibliographies, software links, examples, and some of his papers and programs (including MIXBIN, which estimates a mixture of binomials).

See "Mixture modellers of Gaussian distributions" above.

Snob, by Chris Wallace and David Dowe - see Snob above, under "Gaussian". Uses Minimum Message Length (MML).

Petros Dellaportas's home page.

Yudi Agusta and David Dowe, using MML. See, e.g., publications.

Snob, by Chris Wallace and David Dowe - see Snob above, under "Gaussian". Uses Minimum Message Length (MML).

Oliver, J.J. and D.L. Dowe, 1996. Uses Minimum Message Length (MML).

"MIX". Commercial (see above).

Shotaro Akaho's EM algorithm (with link to paper) for "line mixing" (see above).

M. Black and A. Jepson's Mixture Models for Optical Flow Computation. Explores use of mixture models to represent optical flow in image regions containing multiple motions due to occlusion and transparency.

Vincent Garcia and Frank Nielsen's jMEF ("A Java library to create, process and manage mixtures of exponential families").

Sara van de Geer's Home page.

IBM's CViz.

D. Laidlaw, K. Fleischer and A. Barr's Classification of MRI Data for Geometric Modeling and Visualization.

Christian Lenart's (fuzzy) clustering page and description of software.

MEME software for finding patterns in DNA and protein sequences.

MIT (Germany)'s DataEngine Product Family page.


NSWC Advanced Computational Technology Group's pattern recognition and classification, including work on mixtures based density estimation applied to statistical pattern recognition and image processing, e.g. J. Solka and W. Poston's Visualization of Finite and Adaptive Mixtures Models - Univariate Examples.

Adrian Raftery's clustering and spatial point pattern research and group on clustering and Bayesian model selection.

SPIDER is a large image processing system for electron microscopy, including multivariate statistical classification and cluster analysis. Commercial.

SUBDUE, by Diane J. Cook and Lawrence B. Holder.

M. Afzal Upal's publications on comparison(s) of non-hierarchical unsupervised classification algorithms.

Some data links (and some medical data links); and Geoff McLachlan and David Peel's "Finite Mixture Models" and data sets.

Statistical Society of Canada Case Studies in Data Analysis for 2000 and Mixtures Plus - Case Studies.

StatLib Index (from the Carnegie Mellon University Statistics Department).

Tjen-Sien Lim's "Tree-Structured & Rules Induction Programs Homepage"

Kevin Murphy's list of free Bayes net software.

"Minimum Message Length, MDL and Generalised Bayesian Networks with Asymmetric Languages", by J. W. Comley and D.L. Dowe; Chapter 11 (pp265-294) in P. Grunwald, M. A. Pitt and I. J. Myung (eds.), Advances in Minimum Description Length: Theory and Applications, M.I.T. Press, April 2005, ISBN 0-262-07262-9. {This is about Generalised Bayesian nets, generalising MML Bayesian nets or MML Bayesian networks or MML Bayes nets (or Generalised directed graphical models, generalising MML directed graphical models); and it deals with a mix of both continuous and discrete variables. (See also Comley and Dowe (2003), .pdf.)}

Data Mining Information, maintained by Graham Williams.

Online Machine Learning Resources, maintained by the ML Group at the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria.

Artificial Intelligence Resources, maintained by NRC-CNRC Institute for Information Technology.

A Guide to the Web for Statisticians, maintained by Gordon Smyth.

Autonomous Agents '97 Related Sites.

AI Intelligence (and here)'s AI Information Bank. Commercial.

International Rough Set Society, U. Regina's Electronic Bulletin of the Rough Set Community pages.

Bayesian Knowledge Discoverer (BKD), by Marco Ramoni and Paola Sebastiani: A program for model selection with missing data using directed graphical models and discrete variables.

iec ProGAMMA (http://www.gamma.rug.nl).

Slides on Market segmentation with mixture models (http://www.eco.rug.nl/medewerk/WEDEL/slides/segmenta/sld001.htm).

NASA Data Archive and Distribution Service.

Michael Carley's (acoustics and acoustic mixing) home page.

Minimum message length (MML),

Chris Wallace (1933-2004) (developer of MML in 1968),

Bayesian Nets using Minimum message length (MML),

data repositories,

decision trees and decision graphs using MML,

Occam's razor (Ockham's razor),

Snob (program for MML clustering and mixture modelling, MML finite mixture models),

(econometric) time series using MML,

medical research,

a probabilistic sports prediction competition (and further reading on probabilistic scoring),

chess and game theory research;

Feeding the world (TheHungerSite), TheRainforestSite, "do-goody" stuff, improving the world and saving the planet.

Dr David Dowe, Dept. of Computer Science, Monash University, Clayton, Vic. 3168, Australia

(This page was started on Sun 26th Jan. 1997 and was last updated no earlier than Fri 5th Feb. 1999.)

Copyright David L. Dowe, Monash University, Australia, 26 Jan 1997, 3 Mar 1998, 7 May 1998, etc.

Copying is not permitted without express permission from David L. Dowe.