Multi-state

Consider a discrete sample space of M unordered values, e.g.

throw = {head, tail}, M = 2
base = {A, C, G, T}, M = 4
roll = {1, 2, 3, 4, 5, 6}, M = 6 (NB. treated as unordered)
amino acid = {Glycine, Alanine, Valine, Isoleucine, Leucine, Phenylalanine, Proline, Methionine, Serine, Threonine, Tyrosine, Tryptophan, Asparagine, Glutamine, Cysteine, Aspartic acid, Glutamic acid, Lysine, Arginine, Histidine}, M = 20

and sequences of these.


The distribution has M-1 free parameters T₁, T₂, ..., T_{M-1}, where T_i is the probability of the i-th value, i.e. M-1 degrees of freedom.

Also define T_M = 1 - T₁ - T₂ - ... - T_{M-1}, so that the T_i sum to 1.


Estimators

From data, the observed frequencies (counts) are n₁, ..., n_M; let N = SUM_{i=1..M} n_i.

Maximum likelihood: T_{i,ML} = n_i/N. But what if n_i = 0? The estimate is then 0, which declares value i impossible.

Minimum Message Length: T_{i,MML} = (n_i + 1/2)/(N + M/2)

MinEKL estimator (minimum expected Kullback-Leibler divergence): T_{i,MinEKL} = (n_i + 1)/(N + M)
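
As a rough illustration, the three estimators above can be compared on a small data set. This is a minimal sketch, not a reference implementation: the function name `multistate_estimates` and the example counts are made up for illustration, and N > 0 is assumed. Exact fractions are used so the results are easy to check by hand.

```python
from fractions import Fraction

def multistate_estimates(counts):
    """Return (ML, MML, MinEKL) estimates for counts n_1..n_M, assuming N > 0."""
    M = len(counts)   # number of values in the sample space
    N = sum(counts)   # total number of observations
    ml = [Fraction(n, N) for n in counts]
    mml = [(n + Fraction(1, 2)) / (N + Fraction(M, 2)) for n in counts]
    minekl = [Fraction(n + 1, N + M) for n in counts]
    return ml, mml, minekl

# Hypothetical counts of A, C, G, T in a short sequence; T was never observed.
counts = [5, 3, 2, 0]
ml, mml, minekl = multistate_estimates(counts)

print("ML    ", ml)      # last entry is 0
print("MML   ", mml)     # last entry is 1/24
print("MinEKL", minekl)  # last entry is 1/14
```

With N = 10 and M = 4, the unseen value T gets probability 0 under maximum likelihood, but 1/24 under MML and 1/14 under MinEKL, which is the practical answer to the n_i = 0 question.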


The multi-state distribution is used for discrete sample spaces (as seen) and also as:
the model of the "class" attribute in supervised classification
a sub-model of a 1st-order Markov model
the proportions of the classes in a mixture model (unsupervised classification)
the frequency of transitions out of a state in a Probabilistic Finite State Automaton (PFSA, hidden Markov model, HMM) ...
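
To make the Markov-model case concrete, here is a minimal sketch in which each "previous symbol" gets its own multi-state distribution over the next symbol, estimated with the (n_i + 1/2)/(N + M/2) formula. The function name `markov_transition_estimates` and the example sequence are hypothetical, chosen only for illustration.

```python
def markov_transition_estimates(sequence, alphabet):
    """Estimate P(next | previous) for a 1st-order Markov model.

    One multi-state distribution is fitted per previous symbol,
    using the estimate (n_i + 1/2) / (N + M/2).
    """
    M = len(alphabet)
    # counts[prev][nxt] = number of times nxt immediately follows prev
    counts = {prev: {nxt: 0 for nxt in alphabet} for prev in alphabet}
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1

    estimates = {}
    for prev, row in counts.items():
        N = sum(row.values())  # transitions observed out of prev
        estimates[prev] = {nxt: (n + 0.5) / (N + M / 2) for nxt, n in row.items()}
    return estimates

# Hypothetical DNA sequence over the 4-state alphabet {A, C, G, T}.
probs = markov_transition_estimates("ACGTACGGA", "ACGT")
print(probs["A"])  # A is followed by C twice, so C gets (2 + 0.5) / (2 + 2)
```

A symbol with no observed outgoing transitions (N = 0) falls back to the uniform distribution 1/M under this estimator, which would not be defined under maximum likelihood.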


Note: there is a finite number of transitions out of each state of the automaton, so a multi-state distribution can model the transitions out of each state.
