CSE454 prac notes 2005
These are general comments;
you should be able to identify which, if any, apply to you.
- Do start the pracs early.
- Use correct scientific language and style.
- Prac 1: As discussed in lectures,
Snob likely finds more than 3 classes
because of correlations in the data:
- Length and width are both measures of size and are very likely to be correlated
(note those rather diagonal scatter plots that some of you drew).
- A crude way to remove some of this effect is to use a transformation
such as
<length, width> ---> <length, aspectRatio>
where aspectRatio = width/length, or variations on similar themes.
- E.g. RGB-colour values have a similar "problem" and are often transformed
<red, green, blue> ---> <brightness, chr1, chr2>
for the same reason.
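The effect of the <length, width> ---> <length, aspectRatio> transformation can be sketched on made-up data. This is only an illustration, not the prac data: the sample values, the 0.3 shape factor, and the helper names `pearson` and `to_length_aspect` are all assumptions chosen for the example.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def to_length_aspect(length, width):
    """<length, width> ---> <length, aspectRatio>, aspectRatio = width/length."""
    return length, width / length

# Hypothetical specimens: width is roughly proportional to length,
# so both attributes mostly measure overall size.
rng = random.Random(0)
lengths = [rng.uniform(1.0, 7.0) for _ in range(200)]
widths = [0.3 * l * rng.uniform(0.9, 1.1) for l in lengths]
aspects = [to_length_aspect(l, w)[1] for l, w in zip(lengths, widths)]

r_raw = pearson(lengths, widths)    # strongly correlated pair
r_new = pearson(lengths, aspects)   # much weaker correlation
print(f"corr(length, width)       = {r_raw:.3f}")
print(f"corr(length, aspectRatio) = {r_new:.3f}")
```

On data like this the first correlation is close to 1 and the second is close to 0. Real measurements will not decorrelate so cleanly, which is why the transformation is described as crude.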
- A table of species v. Snob classes shows that the program
is something like 95+% "accurate", which is very good indeed.
If a machine learning method gets results half-way close to such a
figure on real-world data it is doing very well indeed!
And this is a very tough test because we have not even told the program
that there are 3 species nor which species a specimen belongs to.
The program is playing the botanist given some new, unknown specimens.
And it only has four numbers per specimen to go on!
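One way such a species v. classes table and an "accuracy" figure might be computed is sketched here. The labels are invented placeholders, not the prac data, and the function names are assumptions. Because Snob's class numbers are arbitrary, each class is credited with its most common species; this is one reasonable convention, not necessarily the one used in lectures.

```python
from collections import Counter

def cross_table(species, classes):
    """Count how many specimens fall in each (species, class) cell."""
    return Counter(zip(species, classes))

def majority_agreement(species, classes):
    """Fraction of specimens whose species is the commonest in their class."""
    table = cross_table(species, classes)
    best = {}
    for (sp, cl), n in table.items():
        best[cl] = max(best.get(cl, 0), n)
    return sum(best.values()) / len(species)

# Hypothetical labels: 3 species, 3 inferred classes, one specimen "misplaced".
species = ["sp1"] * 4 + ["sp2"] * 4 + ["sp3"] * 4
classes = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2]

table = cross_table(species, classes)
for sp in sorted(set(species)):
    print(sp, [table[(sp, cl)] for cl in sorted(set(classes))])
print("agreement:", majority_agreement(species, classes))
```

Note that this figure rises automatically as the number of classes grows (one class per specimen scores 1.0), so it only means something alongside the number of classes found.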
- Bites:
Finding just one class does not imply that there is no effect,
i.e. that the data are random.
That might follow, given more details,
but it does not necessarily follow.
'One class' is not the "null hypothesis".
- What is the null hypothesis?
(You might get some ideas from a completely
different problem.)
- E.g. There could be a mean close to the full-moon and
a small std. dev.;
you should state the std. dev. corresponding to kappa (κ), and
the angles corresponding to at least full- and new-moons.
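A minimal sketch of how κ might be turned into a standard deviation, assuming the angular model is a von Mises distribution. It uses the usual circular standard deviation sqrt(-2 ln(I1(κ)/I0(κ))), which approaches 1/sqrt(κ) for large κ. The angle convention (new moon at 0, full moon at π) and all function names are assumptions for illustration; check them against whatever convention your data files use.

```python
from math import sqrt, log, pi

SYNODIC_MONTH_DAYS = 29.53  # mean length of the lunar cycle, in days

def bessel_i(order, x, terms=200):
    """Modified Bessel function I_order(x) by its power series
    (adequate for moderate x, say x < 50)."""
    term = (x / 2.0) ** order
    for j in range(1, order + 1):
        term /= j
    total = term
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * (k + order))
        total += term
    return total

def circular_sd(kappa):
    """Circular standard deviation (radians) of a von Mises with this kappa."""
    r = bessel_i(1, kappa) / bessel_i(0, kappa)  # mean resultant length
    return sqrt(-2.0 * log(r))

def phase_angle(days_since_new_moon):
    """Assumed convention: new moon at angle 0, full moon at pi."""
    frac = (days_since_new_moon % SYNODIC_MONTH_DAYS) / SYNODIC_MONTH_DAYS
    return 2.0 * pi * frac

kappa = 10.0                      # hypothetical value, not a prac result
sd_rad = circular_sd(kappa)
sd_days = sd_rad * SYNODIC_MONTH_DAYS / (2.0 * pi)
print(f"kappa={kappa}: sd = {sd_rad:.3f} rad, about {sd_days:.2f} days")
print("new moon angle :", phase_angle(0.0))
print("full moon angle:", phase_angle(SYNODIC_MONTH_DAYS / 2.0))
```

Quoting the spread in days as well as radians makes a statement such as "mean close to the full-moon, small std. dev." much easier for a reader to interpret.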
- Prac 2: ..., OK, you'll start this one early?
2005 © L. Allison,
School of Computer Science and Software Engineering,
Faculty of Information Technology,
Monash University, Australia 3800.