Prac' 1 CSE454 CSSE Monash Semester 1, 2002

Due 4pm Thursday, week 4, 28 March 2002, at the CSSE general office.

Just before Christmas 2000, two papers appeared in the British Medical Journal, on the topic of dog bites and the full moon. Bhattacharjee et al, Do animals bite more during a full moon? Retrospective observational analysis, BMJ 2000; V321 [pp1559-1561] (23 December) suggested that bites were more frequent near the full moon. The other (from Australia), Chapman and Morrell, Barking mad? Another lunatic hypothesis bites the dust, BMJ 2000; V321, [pp1561-1563] (23 December), held the opposite view that there was no such link.

The BMJ is on-line and I have (copies of the papers and) Chapman and Morrell's data at [..../dog/].

  1. Read the two BMJ papers.
  2. In at most 2 sides of A4 in total, comment on the strengths and weaknesses of each paper. Which is the more convincing? Why? (Don't give me "we need more data"; we can't get any more!-)
    [5%]
  3. Use the [Snob] mixture-modelling program to see if there are any clusters in Chapman and Morrell's data. There is no single right answer to this question, but carry out some (sensible) investigations, e.g. as to whether including different attribute(s), such as the sex or the age of the victim, makes any difference.
  4. Write a 1 to 2-side report on your conclusions including some representation of any classes that Snob finds.
    [7%]
  5. In [..../cgi/] are two log files (200K - 300K) recording the use of two cgi-bin programs, `lambda' and `prolog.toy' throughout one year. I would like to know if there are "typical behaviours" (classes, clusters) in "sessions" of usage.
    It is difficult to define a session, but a sequence of uses of one of the programs from one client-computer with a gap of no more than m minutes (e.g. m=60) could be deemed a session.
    Attributes of a session might be (i) the duration and (ii) the number of uses in the session. Maybe you can think of some others.
    What distributions should be used for the attributes?
  6. Write a 2 to 3-side report on your conclusions including some representation of any classes that Snob finds.
    [8%]
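The session definition in item 5 can be sketched as follows. This is only a minimal illustration, assuming the log lines have already been parsed into (client, timestamp) pairs; the log format itself is not described here, so the parsing step is omitted, and the names `sessionize` and `_summary` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def _summary(client, start, end, uses):
    """Describe one session by the two attributes suggested in item 5."""
    return {
        "client": client,
        "start": start,
        "duration_min": (end - start).total_seconds() / 60.0,
        "uses": uses,
    }


def sessionize(events, gap_minutes=60):
    """Group (client, timestamp) pairs into sessions.

    A client's session continues while consecutive uses are no more
    than gap_minutes apart; a larger gap starts a new session.
    """
    by_client = defaultdict(list)
    for client, when in events:
        by_client[client].append(when)

    gap = timedelta(minutes=gap_minutes)
    sessions = []
    for client, times in by_client.items():
        times.sort()
        start = prev = times[0]
        uses = 1
        for t in times[1:]:
            if t - prev > gap:
                # Gap too long: close the current session, open a new one.
                sessions.append(_summary(client, start, prev, uses))
                start, uses = t, 0
            uses += 1
            prev = t
        sessions.append(_summary(client, start, prev, uses))
    return sessions


if __name__ == "__main__":
    # Made-up example events, not taken from the real log files.
    demo = [
        ("hostA", datetime(2001, 1, 1, 10, 0)),
        ("hostA", datetime(2001, 1, 1, 10, 30)),
        ("hostA", datetime(2001, 1, 1, 12, 0)),
    ]
    for s in sessionize(demo, gap_minutes=60):
        print(s)
```

Note that a session with a single use has duration zero under this definition, which may matter when choosing a distribution for the duration attribute.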

    E.g. "The von Mises distribution M(mu,kappa) has a mean direction mu and concentration parameter kappa. For small kappa it tends to a uniform distribution and for large kappa it tends to a Normal Distribution with variance 1/kappa."
    - T. Edgoose, L. Allison & D. L. Dowe. An MML Classification of Protein Sequences that knows about angles and sequences. Pacific Symp. Biocomputing 98, pp585-596, Jan' 1998.
    f(x | mu, kappa) = (1 / (2.pi.I0(kappa))) . exp( kappa.cos(x-mu) )
    where I0(kappa) is the modified Bessel function of order zero, so that 2.pi.I0(kappa) is the normalization constant.
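
The von Mises density above can be evaluated numerically with a few lines of code. The sketch below uses only the standard library and computes I0 as the modified Bessel function of the first kind of order zero, via its power series; the function names `bessel_i0` and `von_mises_pdf` are hypothetical, and the truncated series is only meant for small to moderate kappa.

```python
import math


def bessel_i0(kappa, terms=60):
    """I0(kappa) by its power series: sum over k of (kappa/2)^(2k) / (k!)^2."""
    total = 0.0
    term = 1.0  # the k = 0 term
    for k in range(1, terms + 1):
        total += term
        term *= (kappa / 2.0) ** 2 / (k * k)  # ratio of term k to term k-1
    return total


def von_mises_pdf(x, mu, kappa):
    """Density of M(mu, kappa) at angle x (radians)."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))


if __name__ == "__main__":
    # Numerical check that the density integrates to about 1 over a full circle.
    n = 1000
    step = 2.0 * math.pi / n
    area = sum(von_mises_pdf(i * step, mu=1.0, kappa=2.0) for i in range(n)) * step
    print(area)
```

With kappa = 0 the function returns 1/(2.pi) for every x, which is the uniform case mentioned in the quotation.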


    © L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.