[01]
1: Introduction
CSE454, 2005
This document is online at
http://www.csse.monash.edu.au/~lloyd/tilde/CSC4/CSE454/
and contains hyper-links to other resources.
[02]
NB. The term data space is often used in machine learning.
[04]
Inference
People often distinguish between . . .
[05]
Bayes
If B1, B2, ..., Bk is a partition of a set B (of causes) then

                        P(A|Bi).P(Bi)
  P(Bi|A) = -------------------------------------- ,   i = 1, 2, ..., k
             P(A|B1).P(B1) + ... + P(A|Bk).P(Bk)
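As a quick check of the formula, a minimal Python sketch; the name bayes_posterior and the example numbers are illustrative, not from these notes:

  # Posterior over a partition B1, ..., Bk of causes, by Bayes' theorem.
  # likelihoods[i] holds P(A|Bi), priors[i] holds P(Bi); returns the list of P(Bi|A).
  def bayes_posterior(likelihoods, priors):
      joint = [l * p for l, p in zip(likelihoods, priors)]   # P(A|Bi).P(Bi)
      total = sum(joint)                                      # P(A), summed over the partition
      return [j / total for j in joint]

  # Illustrative three-part partition (made-up numbers):
  print(bayes_posterior([0.9, 0.5, 0.1], [0.2, 0.3, 0.5]))   # ~[0.47, 0.39, 0.13]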
[06]
. . . applied to data D and hypotheses Hi:
  P(D) = P(D|H1).P(H1) + ... + P(D|Hk).P(Hk)

  P(Hi|D) = P(D|Hi).P(Hi) / P(D)                 -- the posterior

  P(Hi|D)     P(D|Hi).P(Hi)
  -------  =  -------------                      -- the posterior odds-ratio
  P(Hj|D)     P(D|Hj).P(Hj)
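The odds-ratio form is convenient in code because P(D) cancels; a small sketch with names of my own choosing:

  # Posterior odds-ratio P(Hi|D) / P(Hj|D); P(D) cancels, so it never needs computing.
  def posterior_odds(lik_i, prior_i, lik_j, prior_j):
      return (lik_i * prior_i) / (lik_j * prior_j)

  print(posterior_odds(0.5, 0.5, 0.25, 0.5))   # 2.0, with made-up numbers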
[07]
NB. The prior terms can be ignored in the posterior odds-ratio
if, and only if, P(Hi) = P(Hj).
Maximum likelihood can cause problems when the priors are unequal.
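A small illustration of that warning, with invented numbers: the hypothesis with the larger likelihood need not have the larger posterior when the priors differ.

  # Invented numbers: H1 wins on likelihood alone, but has a much smaller prior.
  lik   = {'H1': 0.6, 'H2': 0.4}   # P(D|Hi)
  prior = {'H1': 0.1, 'H2': 0.9}   # P(Hi)
  odds = (lik['H1'] * prior['H1']) / (lik['H2'] * prior['H2'])
  print(odds)   # 0.1666..., so the posterior favours H2 although maximum likelihood picks H1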
[08]
Example
C1, a fair coin, P(H) = P(T) = 0.5.
C2, a biased coin, P(H) = 2/3, P(T) = 1/3.
One of the coins is thrown 4 times, giving H, T, T, H. Which coin was thrown?
[09]
Prior, P(C1) = P(C2) = 0.5.
Likelihood, P(HTTH | C1) = 1/16 and P(HTTH | C2) = 4/9 . 1/9 = 4/81.
Posterior odds-ratio,
  P(C1|HTTH) / P(C2|HTTH) = (1/16) / (4/81) = 81/64.
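These values can be checked exactly with Python's fractions module; a sketch, not part of the original notes:

  from fractions import Fraction

  # Likelihood of the sequence H, T, T, H under each coin.
  lik_C1 = Fraction(1, 2) ** 4                                                 # (1/2)^4 = 1/16
  lik_C2 = Fraction(2, 3) * Fraction(1, 3) * Fraction(1, 3) * Fraction(2, 3)   # = 4/81
  print(lik_C1, lik_C2, lik_C1 / lik_C2)                                       # 1/16  4/81  81/64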
[10]
Now, P(C1|HTTH) + P(C2|HTTH) = 1, and if x/(1-x) = 81/64 where x = P(C1|HTTH), then
P(C1|HTTH) = 81/145. This case is simple because the model space is discrete, in fact finite (just two models).
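Continuing the sketch, the posterior follows by normalising the odds, i.e. solving x/(1-x) = 81/64:

  from fractions import Fraction

  odds = Fraction(81, 64)            # P(C1|HTTH) / P(C2|HTTH)
  post_C1 = odds / (1 + odds)        # x such that x/(1-x) = odds
  post_C2 = 1 - post_C1
  print(post_C1, post_C2)            # 81/145  64/145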
[11]
e.g. prediction
Know P(C1|HTTH) = 81/145, P(C2|HTTH) = 64/145. The more likely coin is C1.
If we assumed the coin really was C1, we would predict P(H) = 1/2.
But the coin might be C2. We should predict
  P(H) = (81/145).(1/2) + (64/145).(2/3) = 499/870 ~ 0.57,
i.e. use a weighted average of the hypotheses' predictions.
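The weighted average can be checked the same way (again only a sketch):

  from fractions import Fraction

  post   = {'C1': Fraction(81, 145), 'C2': Fraction(64, 145)}   # posterior weights
  p_head = {'C1': Fraction(1, 2),    'C2': Fraction(2, 3)}      # each coin's P(H)

  # Predictive P(H): each hypothesis's prediction, weighted by its posterior probability.
  p_h = sum(post[c] * p_head[c] for c in post)
  print(p_h, float(p_h))   # 499/870, about 0.574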
[12]
Conclusion
We have looked at Bayes' theorem, its application to data and hypotheses, the posterior odds-ratio, a simple two-coin example, and prediction by taking a weighted average over the hypotheses.
© 2005 L. Allison, School of Computer Science and Software Engineering, Monash University.