^II/Working/^ [01] >>

Mixture Modelling

See L. Allison, (i) TR 2003/148, and (ii) Types and Classes of Machine Learning and Data Mining, Twenty-Sixth Australasian Computer Science Conference (ACSC2003) pp.207-215, Adelaide, Australia, 4-7 February 2003, Conferences in Research and Practice in Information Technology, Vol.16. Michael Oudshoorn, Ed.


estMixture ests dataSet =
let
  -- [estimator]->[dataSpace] -> model of dataSpace
  -- i.e. [estimator] -> estimator
  ...

Takes a list of estimators, one per component of the mixture.

^II/Working/^ << [02] >>
memberships (Mix mixer components) =
let           -- memberships|Mixture
 doAll (d:ds) = prepend (doOne d) (doAll ds) -- all data
 doAll  []    = map (\x -> []) components

 doOne  datum = normalise(                   -- one datum
   zipWith (\c -> \m ->
     (pr mixer c)*(pr m datum)) [0..] components)
   -- pr(c) * pr(datum|c)  for class #c = m

in doAll dataSet

Given components of the mixture, find (fit) the fractional memberships of things (data) in (to) the components.

^II/Working/^ << [03] >>
randomMemberships =
let
 doAll seed  []    = map (\_ -> []) ests
 doAll seed (_:ds) =             -- all data
  let
   doOne seed  []      ans = (seed, normalise ans)
   doOne seed (_:ests) ans =     -- one datum
   doOne (prng seed) ests
         ((fromIntegral(1+ seed `mod` 10)) : ans)
  in let (seed2, forDatum) = doOne seed ests []
     in prepend forDatum (doAll seed2 ds)

in doAll 4321 dataSet

Allocate initial pseudo-random (prng) fractional memberships to things (data), not very interesting.

^II/Working/^ << [04] >>
fit [] [] = []               -- Models|memberships
fit (est:ests) (mem:mems) =
  (est dataSet mem) : (fit ests mems)

fitMixture mems =
 Mix (freqs2model (map (foldl (+) 0) mems)) -- weights
     (fit ests mems)         -- components

Calculate mixture-weights of the components, and fit components (use the given estimators) to their weighted members.

^II/Working/^ << [05] >>
cycle    mx = fitMixture (memberships mx) -- EM step

cycles 0 mx = mx
cycles n mx = cycles (n-1) (cycle mx)     -- n x cycle

in mixture( cycles ?? (fitMixture randomMemberships) )
-- -----9/2002--9/2003--L.Allison--CSSE--Monash--.au--

Fit memberships to components; fit components to the memberships. Iterate some number of times, or until convergence, or... etc..

^II/Working/^ << [06] >>

Summary

What are the estimators?
e.g. Multistate -- add up frequencies (+0.5 for MML) and normalise.
e.g. Normal -- usual mean and std dev (latter uses  /(n-1) for MML)
e.g. Multivariate -- ``product'' estimator over attribute estimators, or
e.g. a factor model estimator if you have one, etc..
 
Mixture complexity: Can search for 1, 2, 3, ... components for the optimal message length.
 
L. Allison. Types and Classes of Machine Learning and Data Mining, 26th Australasian Computer Science Conference (ACSC2003) pp207-215, Adelaide, Australia, 4-7 February 2003. Conferences in Research and Practice in Information Technology, Vol.16. Michael Oudshoorn, Ed.

© 2004 L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3168.
Created with "vi (IRIX)",   charset=iso-8859-1