filler
^CSE454^ [01] >>

Mixture Modelling

CSE454 2003 : This document is online at   http://www.csse.monash.edu.au/~lloyd/tilde/CSC4/CSE454/   and contains hyper-links to other resources - Lloyd Allison ©.

  See L. Allison. Types and Classes of Machine Learning and Data Mining, Twenty-Sixth Australasian Computer Science Conference (ACSC2003) pp207-215, Adelaide, Australia, 4-7 February 2003. Conferences in Research and Practice in Information Technology, Vol.16. Michael Oudshoorn, Ed.

estMixture ests dataSet =
let
  -- [estimator]->[dataSpace] -> model of dataSpace
  -- i.e. [estimator] -> estimator

Takes a list of estimators, one per component of the mixture.


filler
<< [02] >>
memberships (Mix mixer components) =
let           -- memberships|Mixture
 doAll (d:ds) = prepend (doOne d) (doAll ds) -- all data
 doAll  []    = map (\x -> []) components
 doOne  datum = normalise(                   -- one datum
   map (\(c, m) ->
     (pr mixer c)*(pr m datum)) (zip [0..] components))
   -- pr(c) * pr(datum|c)  for class #c = m
in doAll dataSet

Given components, find (fit) the fractional memberships of things (data) to the components.


filler
<< [03] >>
randomMemberships =
let
 doAll seed  []    = map (\_ -> []) ests
 doAll seed (_:ds) =             -- all data
  let
   doOne seed  []      ans = (seed, normalise ans)
   doOne seed (_:ests) ans =     -- one datum
   doOne (prng seed) ests
	 ((fromIntegral(1+ seed `mod` 10)) : ans)
  in let (seed2, forDatum) = doOne seed ests []
     in prepend forDatum (doAll seed2 ds)
in doAll 4321 dataSet

Allocate initial pseudo-random (prng) fractional memberships to things (data), not very interesting.


filler
<< [04] >>
fit [] [] = []               -- Models|memberships
fit (est:ests) (mem:mems) =
  (est dataSet mem) : (fit ests mems)

fitMixture mems =
 Mix (freqs2model (map (foldl (+) 0) mems)) -- weights
     (fit ests mems)         -- components

Calculate mixture-weights of the components, and fit components (use the given estimators) to their weighted members.


filler
<< [05] >>
cycle    mx = fitMixture (memberships mx) -- EM step
cycles 0 mx = mx
cycles n mx = cycles (n-1) (cycle mx)     -- n x cycle

in mixture( cycles ?? (fitMixture randomMemberships) )
-- --------------9/2002--L.Allison--CSSE--Monash--.au--

Fit memberships to components; fit components to the memberships. Iterate some number of times, or until convergence, or... etc..


filler
<< [06]

Summary

What are the estimators?
e.g. Multistate -- add up frequencies (+0.5 for MML) and normalise.
e.g. Normal -- usual mean and std dev (latter uses  /(n-1) for MML)
e.g. Multivariate -- ``product'' estimator over attribute estimators, or
e.g. a factor model estimator if you have one, etc..
 
Mixture complexity: Search for 1, 2, 3, ... components for the optimal message length.
 
L. Allison. Types and Classes of Machine Learning and Data Mining, 26th Australasian Computer Science Conference (ACSC2003) pp207-215, Adelaide, Australia, 4-7 February 2003. Conferences in Research and Practice in Information Technology, Vol.16. Michael Oudshoorn, Ed.

© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3168.
Created with "vi (IRIX)",   charset=iso-8859-1