Lloyd Allison,
School of Computer Science & Software Engineering,
To: AIG,
Seminar area: Inductive inference,
Research problem: What are statistical models?
I.e. What do they do?
What can be done to them?
How can you combine two or more of them?
And what do you get?
This talk describes an approach and gives practical examples.
www.csse.monash.edu.au/~lloyd/Seminars/200410II/index.shtml

Given a new problem in general computing,
the probability of having an existing solution is ~0, ...
programming languages (e.g. Haskell)
exist to ease writing new solutions.
Given a new inductive inference problem,
the probability of having an existing solution is ~0, ...
<?what?> exists to ease writing new solutions?
repeat {
new postgrad starts in machine learning;
devises new model (or variation);
does some maths (not just a hacker);
implements search & estimators (practical too!);
does comparative tests v. other methods (beats them);
writes thesis, gets degree & leaves.
}
We are often left with programs that are hard to (re)use.

@0  continuous
@1  Boolean
@2  Boolean
@3  Boolean
@4  continuous.
Bayesian networks can "learn & explain" relationships between attributes (variables).
N. Friedman and M. Goldszmidt suggested using classification trees in the nodes of a Bayesian network.
Oh good, we just happen [1] to have:
the toy example (5 attributes);
details of the network's nodes next...
@0: {CTleaf N(1.0,0.41)(+0.1),_,_,_,_},
@1: {CTleaf _,mState[0.5,0.5],_,_,_},
@2 (parents @0,@1):
    {CTfork @0<>=1.4[
       {CTleaf _,_,mState[0.99,0.01],_,_},
       {CTfork @1=FalseTrue[
          {CTleaf _,_,mState[0.98,0.02],_,_},
          {CTleaf _,_,mState[0.02,0.98],_,_}]}]},
@3: {CTleaf _,_,_,mState[0.5,0.5],_},
@4 (parents @0,@2):
    {CTfork @2=FalseTrue[
       {CTfork @0<>=1.0[
          {CTleaf _,_,_,_,N(0.55,0.2)(+0.1)},
          {CTfork @0<>=1.4[
             {CTleaf _,_,_,_,N(1.0,0.2)(+0.1)},
             {CTleaf _,_,_,_,N(1.45,0.2)(+0.1)}]}]},
       {CTleaf _,_,_,_,N(3.45,0.2)(+0.1)}]}

N.B. Don't get the nodes of the network and the nodes of a tree confused!
mState[p0,p1,...]  multistate distribution[*].
N(m,s)  normal (Gaussian) distribution[*].
"classification" tree[*],
CTleaf  model[*] an attribute,
CTfork  test[*] an attribute.
([*] message lengths, i.e. complexities, direct search & prevent overfitting.)
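The CTleaf/CTfork vocabulary above can be captured by a small Haskell data type. This is a hypothetical sketch for illustration; the constructor shape and the `leaves` helper are assumptions, not the talk's actual library:

```haskell
-- Hypothetical sketch of a classification tree over a data space:
-- a leaf holds a model of the target attribute; a fork tests an
-- attribute and has one subtree per possible outcome of the test.
data CTree model
  = CTleaf model              -- model an attribute
  | CTfork Int [CTree model]  -- test attribute number n; one branch per outcome

-- Count the leaves, i.e. the number of local models in the tree.
leaves :: CTree model -> Int
leaves (CTleaf _)    = 1
leaves (CTfork _ ts) = sum (map leaves ts)
```

The message-length machinery would then charge for the tree's structure as well as for its leaf models, which is what directs the search and prevents overfitting.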
data Tipe = Alzheimers | Child | Despondent | Hiker | ...
type Age = Double
data Race = White | Black deriving (Eq, Enum, ...
data Gender = Male | Female deriving (Eq, Enum, ...
data Topography = Mountains | Piedmont | Tidewater deriving ...
data Urban = Rural | Suburban | Urban deriving (Eq, Ord, ...
type HrsNt = Double      -- hours notified
type DistIPP = Double    -- distance
...
type MissingPerson = (Maybe Tipe, Maybe Age, ...)
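A cut-down, self-contained version of these declarations shows the idea: every attribute of a record is wrapped in Maybe, so a missing value is simply Nothing. (Only a few constructors are reproduced here; the three-field tuple is an assumption for brevity.)

```haskell
-- Abbreviated versions of the slide's types.
data Tipe   = Alzheimers | Child | Despondent | Hiker
              deriving (Eq, Enum, Bounded, Show)
type Age    = Double
data Gender = Male | Female deriving (Eq, Enum, Bounded, Show)

-- A missing-person record: every field may be absent.
type MissingPerson = (Maybe Tipe, Maybe Age, Maybe Gender)

p1 :: MissingPerson
p1 = (Just Hiker, Nothing, Just Male)   -- age unknown
```

An estimator for such a type must therefore cope with Nothing in any field, which is what estModelMaybe (below) is for.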
e0 = estModelMaybe estMultiState                  -- Tipe
e1 = estModelMaybe (estNormal 0 90 1 70 0.5)      -- Age
...
e7 = estModelMaybe (estNormal 0 50 0.5 30 0.2)    -- DistIPP
estimator = estVariate15 e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14
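The combination step can be illustrated with a two-dimensional analogue of estVariate15. This is a minimal sketch under assumed names: each attribute has its own estimator, and the combined estimator runs each one on its own column of the data.

```haskell
-- An estimator turns a data set into a model (here left abstract).
type Estimator d m = [d] -> m

-- Hypothetical 2-D analogue of estVariate15: apply each component
-- estimator to its own projection of the data.
estVariate2 :: Estimator a ma -> Estimator b mb -> Estimator (a, b) (ma, mb)
estVariate2 eA eB xs = (eA (map fst xs), eB (map snd xs))
```

For example, `estVariate2 sum product [(1,2),(3,4)]` applies `sum` to the first column and `product` to the second, giving `(4, 8)` (using plain list functions to stand in for real estimators).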
N.B. There is a lot of missing data.
Net:
[ @1, Age:   {CTleaf _,(Maybe 50:50,N(40.6,27.5))...},
  ... etc. ...
  @6, HrsNt: {CTfork @1(<>=62.0?)[
                 {CTleaf ...,(Maybe 50:50,N( 8.7, 7.6))...},
                 {CTleaf ...,(Maybe 50:50,N(21.4,26.3))...},
                 {CTleaf ...,(Maybe 50:50,N(20.0,.. 1 case))..} ]},
  ...
]
network: 115.1 nits + data: 5396.6 nits; null: 5935.6 nits (@0..@7)
estModelMaybe estModel dataSet =
  let present (Just _) = True
      present Nothing  = False
      m1 = uniformModelOfBool                                   -- [*]
      m2 = estModel (map (\(Just x) -> x) (filter present dataSet))
  in modelMaybe m1 m2

[*] or, if modelling missingness:
      m1 = estMultiState (map present dataSet)
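The filter-and-unwrap step at the heart of estModelMaybe can be shown in isolation. A minimal sketch, assuming nothing beyond the standard Data.Maybe library (`presentValues` and `presentFraction` are illustrative names, not the talk's library):

```haskell
import Data.Maybe (catMaybes)

-- The `present` filter plus (\(Just x) -> x) unwrap, in one step:
presentValues :: [Maybe a] -> [a]
presentValues = catMaybes

-- The fraction of values present, i.e. what estMultiState would be
-- fitted to when modelling missingness.
presentFraction :: [Maybe a] -> Double
presentFraction xs =
  fromIntegral (length (presentValues xs)) / fromIntegral (length xs)
```

The modelMaybe combination then pairs a model of presence/absence (m1) with a model of the values that are present (m2).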
class Project t where            -- as in `projection'
  select :: [Bool] -> t -> t
  selAll :: t -> [Bool]          -- all True flags
A type, t, in class Project
can be projected onto a
selected subspace.
E.g. a 15-dimensional estimator, such as estVariate15 e0...e14.
Also a related class for selective partitioning of data within trees.
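To make the idea concrete, here is a toy instance of the Project class over pairs of Maybe values, where deselecting a dimension sets it to Nothing. This is an assumption for illustration only; the talk's library projects estimators and models, not raw data.

```haskell
{-# LANGUAGE FlexibleInstances #-}

class Project t where            -- as in `projection'
  select :: [Bool] -> t -> t     -- keep only the flagged dimensions
  selAll :: t -> [Bool]          -- all True flags, one per dimension

-- Toy 2-D instance: a deselected component becomes Nothing.
instance Project (Maybe a, Maybe b) where
  select [f0, f1] (x, y) = (if f0 then x else Nothing,
                            if f1 then y else Nothing)
  select _        _      = error "select: wrong number of flags"
  selAll _               = [True, True]
```

Projecting a 15-dimensional estimator onto, say, attributes @0 and @2 follows the same pattern, with the tree-building search choosing which subspace to model at each node.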
A Network Inferred for all 15 Attributes of `lost persons' 
We have types and type classes for Models, FunctionModels and TimeSeries models. Polymorphic types, type classes, higher-order functions, lazy evaluation -- all useful.
It is easy to write a new model. Other case studies include mixture models (clustering), time-series, Markov models, mixtures of Markov models, function models (regressions), etc.
[1] L. Allison. Types and classes of machine learning and data mining. 26th Australasian Computer Science Conference (ACSC), Adelaide, pp. 207-215, 2003.