
CSE423 Learning and Prediction, prac2, 2001

http:// ... /~lloyd/tilde/CSC4/CSC423/2001/prac2.html

We know that three ["obvious"] ways of transmitting a sequence of values sampled from a multi-state distribution give nearly equal two-part message lengths. The two methods that do not state an inference are shortest, and the method that states an inferred estimate (to optimum precision) takes a fraction of a bit more per parameter.

This exercise is to make an experimental investigation of the corresponding situation for the [normal (Gaussian)] distribution.

[DLD] can give you a good normal, pseudo-random number generator.

Write a C program to compare two methods of transmitting a sequence of data values sampled from a normal distribution:
Method 1: The MML method as described at the link above.
Method 2: An adaptive code, based on the following idea.
Transmit the first two values from the sequence using a code based on a mean and variance from the prior.
Transmit the third and subsequent values using a code based on a distribution N(m,s) where m and s are calculated from the previously transmitted values, as available to both transmitter and receiver.
Your program must be able to do at least the following:
You should investigate some variations, e.g. the effect of using different methods for calculating m and s for the adaptive code, such as the MML estimator versus the traditional estimator for the variance.
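The candidate variance estimators typically differ only in the divisor. The sketch below assumes that "traditional" means the maximum-likelihood estimate (divide by N) and that the MML estimate, under the priors usually adopted for the normal, divides by N-1; check both assumptions against the MML notes before relying on them. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of squared deviations of x[0..n-1] from their sample mean. */
static double sum_sq_dev(const double *x, size_t n)
{
    double mean = 0.0, ssd = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        mean += x[i];
    mean /= (double)n;
    for (i = 0; i < n; i++)
        ssd += (x[i] - mean) * (x[i] - mean);
    return ssd;
}

/* Assumed "traditional" (maximum-likelihood) estimate: divide by N. */
double var_div_n(const double *x, size_t n)
{
    assert(n >= 1);
    return sum_sq_dev(x, n) / (double)n;
}

/* Assumed MML-style estimate: divide by N - 1. */
double var_div_n_minus_1(const double *x, size_t n)
{
    assert(n >= 2);
    return sum_sq_dev(x, n) / (double)(n - 1);
}
```

The two estimates differ most for small N, which is exactly where the adaptive code is still learning, so the early part of the sequence is the place to look for a difference in message length.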
Systems: ISO C (not C++), Linux.
Write a short, two to three page report describing your investigations and results. Append your program(s) to it.
Deadline (decided in class): Monday 23 April 2001; marks: 15%.
Place: CSSE general office.


© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3168.