filler
[01] >>
^MML^

Clustering + a Markov Model

On work by Tim(e) Edgoose

Other debts (alphabetic order): Rohan Baxter, David Dowe, Chris Wallace.

© 2001 Lloyd Allison, Computer Science and Software Engineering, Monash University, Victoria, Australia 3800



This document is online at  www.csse.monash.edu.au/~lloyd/Seminars/Mixture-MM/index.shtml   and contains hyper-links to more detailed notes on certain topics and to some of the references.

filler
<< [02] >>

The Problem: Clustering, (unsupervised-) Classification, Mixture Modelling, Numerical Taxonomy, in a Time-Series.

Data: A sequence of multi-variate Observations.


filler
<< [03] >>

The Model

(a) The number of classes (N)

(b) The relative abundance of each class

(c) For each class
  • Distribution parameters for each significant attribute  (e.g. mu, sigma)
  • Relative abundance of next class conditional on current class

(d) For each Observation
  • An assigned class (but see twist later)
  • Attribute values given this class

filler
<< [04] >>
Rev T B

Model Complexity:


filler
<< [05] >>
Hidden Markov Model transitions for 2 Classes

Model Transitions for 2 Classes

filler
<< [06] >>

Estimation (1)

But this gives biased estimates of model parameters!

filler
<< [07] >>

Estimation (2) - the twist

more...

filler
<< [08] >>

Forward:

Consider all paths leading to   ok | cj:

F(o1 in ci) = P(ci).P(o1|ci)

F(ok in ci) = SUMj=1..N F(ok-1 in cj). P(ci|cj). P(ok|ci)
1 < k <= K

P(D|H) = SUMi=1..N F(oK in ci)

filler
<< [09] >>

Backwards

Consider all paths leading from   ok | cj:

B(oK in ci) = 1

B(ok = SUMj=1..N P(cj|ci). P(ok+1|cj). B(ok+1 in cj)
1 <= k < K

Now
P(D|H, ok in ci) = F(ok in ci). B(ok in ci).
and
P(D|H) = SUMi=1..N P(D|H, ok in ci)
for 1 <= k <= K

filler
<< [10] >>

. . . this gives the probability that ok is in cj. i.e. We can deal with fractional membership of classes.
Results in unbiased estimates of class parameters and transition probabilities.

Hidden Markov Models path conditional in ok in c1

filler
<< [11] >>

How Many Classes?

Can split class ci into ci' and ci'', in ratio w:1-w, e.g. w=0.5. New transition prob's initialised:
next>
v.
curr'
. . .ci'. . .ci''. . .cj. . .
.
.
.
       
ci' P(ci|ci).w P(ci|ci).(1-w) P(cj|ci) 
.
.
.
       
ci'' P(ci|ci).w P(ci|ci).(1-w) P(cj|ci) 
.
.
.
       
cj P(ci|cj).w P(ci|cj).(1-w)   
.
.
.
       

filler
<< [12] >>

Can also merge two classes.

Splitting or merging takes place if there is a nett reduction in message length.

Good heuristic, fast.

Must converge.

Could get stuck in local optimum, but unlikely.

filler
<< [13] >>
Tests, e.g. 2-classes:
varying class auto transition from 0.0 to 1.0
Hidden Markov Model Msg Len v. Auto-Correlation parameter

filler
<< [14] >>
varying data set size,   t=0.8
Hidden Markov Model Msg Len v. data set size, auto-correlation=0.8, s=2
1C: Snob, 1-class; 2C: Snob, 2-classes; M2C: Markov Model 2-classes.
M2C wins if log10(K)>2.3

filler
<< [15] >>
e.g. protein dihedral angles:
class structure by Hidden Markov Model on Protein secondary structure dihedral angles
Left: 17 classes

[big pic']
[paper]
[Bioinformatics]

filler
<< [16]

Conclusion


© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.
Created with "vi (Linux & IRIX)",   charset=iso-8859-1