The technique used by Wallace and Freeman [#!Wallace.Freeman:1987!#] is to
define a coding region about the parameter $\theta$,
so that the continuous model-space is quantised into discrete sets.
Indeed, the fundamental problem in MML inference theory is how to optimally
divide the model-space into discrete coding regions. Herein lies MML's first
approximation to SMML. For continuous distributions, MML's optimal spacing
parameter is a function of the model $\theta$.
SMML, on the other hand, divides
the model-space depending upon the data $x$ [#!Dowe:private!#].
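As a result of combining similar models into a set, the length of the first
part of the message is no longer determined simply by the probability of the
single model $\theta$.
Instead, it is determined by the sum of the probabilities of all
alternative theories within the set or coding region. Therefore, the message
length (now in nits) of the model is
\begin{equation}
-\log \int_{R} h(\theta) \, d\theta ,
\end{equation}
where $h(\theta)$ is the prior density over models and $R$
is the coding region.
To encode the data optimally given the model, it would take
$-\log f(x|\theta)$ nits, where $f(x|\theta)$ is the likelihood and $\theta$
is the model stated in the
first part of the message. However, $\theta$
can be any value within the
coding region $R$.
Therefore, the average
or expectation value of $-\log f(x|\theta)$
is taken over the coding
region.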
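The two parts described above can be sketched numerically. The example below is an illustrative assumption, not the author's setup: it uses a Bernoulli model with `k` successes in `n` trials, a uniform prior on $(0, 1)$, and a hypothetical coding region `[lo, hi]`. With a uniform prior the prior mass of the region is just its width, and the expectation over the region reduces to a plain average, computed here by the midpoint rule. The function names are invented for this sketch.

```python
import math


def neg_log_likelihood(theta, k, n):
    """-log f(x|theta) in nits for k successes in n Bernoulli trials."""
    return -(k * math.log(theta) + (n - k) * math.log(1.0 - theta))


def message_parts(lo, hi, k, n, steps=10_000):
    """Return (first_part, expected_second_part) in nits for region [lo, hi].

    Assumption: uniform prior h(theta) = 1 on (0, 1), so the prior mass of
    the region equals its width and the expectation is an unweighted average.
    """
    width = hi - lo
    first = -math.log(width)  # -log of the prior mass inside the region
    dt = width / steps
    # Midpoint rule: average -log f(x|theta) over the region.
    total = sum(neg_log_likelihood(lo + (i + 0.5) * dt, k, n)
                for i in range(steps))
    second = total / steps
    return first, second


first, second = message_parts(0.6, 0.8, k=7, n=10)
print(f"first part: {first:.4f} nits, expected second part: {second:.4f} nits")
```

Because $-\log f(x|\theta)$ is convex in $\theta$ for this model, the averaged second part is never shorter than the value at the centre of the region. That excess is the price paid for stating the model only to the precision of the coding region.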
(Note that nits (natural bits) have been used as the unit of information,
purely for mathematical convenience. The total message length
can always be multiplied by the constant $\log_2 e$ to convert
nits to bits.)
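As a small worked example of the unit conversion, a length in nits is multiplied by $\log_2 e \approx 1.4427$ to give bits. The helper name `nits_to_bits` is hypothetical.

```python
import math


def nits_to_bits(nits):
    """Convert a message length from nits (base e) to bits (base 2)."""
    return nits * math.log2(math.e)


# A message of log(2) nits is exactly one bit long.
print(nits_to_bits(math.log(2)))
```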
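Therefore, the total message length is the sum of the length of the first
part and the expected length of the second part:
\begin{equation}
-\log \int_{R} h(\theta) \, d\theta
+ E_{\theta \in R} \left[ -\log f(x|\theta) \right] .
\end{equation}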
This equation is the basis of the MML technique. The above equation
(
) does not have the
property of invariance [#!Dowe:private!#]. However, an invariant version
[#!Dowe:private!#] is used for the MMLD approximation technique (section
).
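To illustrate why the choice of coding region matters, the following sketch searches for the region width that minimises the total two-part length. Everything specific here is an assumption made for illustration: a Bernoulli model, a uniform prior on $(0, 1)$, a region centred on the maximum likelihood estimate `k / n`, and a prior-weighted average for the second part. It is not the Wallace and Freeman derivation, only a brute-force picture of the trade-off.

```python
import math


def total_length(lo, hi, k, n, steps=2000):
    """Total two-part message length in nits for the coding region [lo, hi].

    Assumptions: uniform prior on (0, 1); the second part is the average of
    -log f(x|theta) over the region, computed by the midpoint rule.
    """
    width = hi - lo
    dt = width / steps
    second = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * dt
        second -= k * math.log(theta) + (n - k) * math.log(1.0 - theta)
    return -math.log(width) + second / steps


def best_half_width(k, n, candidates):
    """Pick the half-width around k/n that minimises the total length."""
    centre = k / n
    return min(candidates,
               key=lambda w: total_length(centre - w, centre + w, k, n))


k, n = 7, 10
candidates = [0.01 * i for i in range(1, 30)]  # half-widths 0.01 .. 0.29
w = best_half_width(k, n, candidates)
print(w, total_length(k / n - w, k / n + w, k, n))
```

A very narrow region makes the first part long, since little prior mass is covered. A very wide region makes the expected second part long, since poorly fitting models are averaged in. The minimum lies between the two extremes.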