There are three difficulties in implementing an experiment for the Poisson
distribution compared with the binomial distribution. Firstly, the problem as
stated so far is incomplete. Poisson-distributed random variables normally
produce a list of scalars, not a list of 2-tuples. If we consider the problem
as producing a list of scalars ci, then it is legitimate to ask what the
distribution of the scalars ti is. (The process can also be performed in reverse
by assuming that ti is a variable and ci comes from some hidden
distribution.) This does not affect the equations, as the likelihood function has
already factored in that ti is a variable. However, when implementing a numerical
solution, we must know what the data ``looks'' like - otherwise, sample data
cannot be generated. In this appendix, I've assumed that
ti = 1 for all i.
That is, ti has a constant distribution. Just as legitimately, ti could
have a uniform distribution, Poisson distribution or some other distribution.
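
To make the point about what the data "looks" like concrete, the following is a minimal sketch of how sample data might be generated. It assumes that each count ci is Poisson-distributed with mean r*ti (an assumption about the model, suggested by the notation cmax(r, ti)), and it takes the constant case to be ti = 1, consistent with the cut-off cmax(r, 1) used later in this appendix. The names sample_poisson, generate_data, constant_t and uniform_t are hypothetical, and the uniform range 0.5 to 1.5 is chosen purely for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate using Knuth's multiplication method.

    Suitable for small means only: exp(-mean) underflows for very large means.
    """
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_data(r, n, t_sampler, rng):
    """Return n pairs (c_i, t_i), assuming c_i ~ Poisson(r * t_i)."""
    data = []
    for _ in range(n):
        t = t_sampler(rng)
        data.append((sample_poisson(r * t, rng), t))
    return data


def constant_t(rng):
    """Constant case: every t_i equals 1 (assumed value)."""
    return 1.0


def uniform_t(rng):
    """Hypothetical alternative: t_i uniform on [0.5, 1.5]."""
    return rng.uniform(0.5, 1.5)


rng = random.Random(12345)  # arbitrary fixed seed for reproducibility
data_constant = generate_data(r=3.0, n=5, t_sampler=constant_t, rng=rng)
data_uniform = generate_data(r=3.0, n=5, t_sampler=uniform_t, rng=rng)
print(data_constant)
print(data_uniform)
```

Swapping the t_sampler argument is the only change needed to move between the constant, uniform or any other distribution for ti; the likelihood equations themselves are untouched.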
The second problem is the domain of ci. The binomial problem produces a
number between 0 and N. Therefore, provided N is relatively low, it is
a realistic solution to integrate (for any
)
over the domain of m.
Indeed, that was the technique used to evaluate the Expected Kullback-Leibler
and Expected Root-Mean-Square distances. The Poisson distribution produces
numbers (for any r) from zero to infinity. Obviously,
integration over the whole domain is not a realistic solution. However, the
probability of large values of ci is relatively small. Furthermore, the
probability decreases at a super-exponential rate (factorially, in fact). As a
result, integration is still a possibility, although care must be taken to
choose a suitable cut-off point. This can be implemented by ignoring any data-point
ci higher than a certain number
cmax(r, ti). In this appendix, it is
assumed that
cmax(r, ti) = cmax(r, 1) = 10r.
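
As a rough check on how much probability such a cut-off discards, the sketch below sums the Poisson probabilities above cmax = 10r directly. It assumes ci is Poisson-distributed with mean r*ti and ti = 1; the function names and the choice of summing 200 terms beyond the cut-off are illustrative assumptions, not part of the method described here. The tail is summed directly, rather than computed as one minus the retained mass, to avoid floating-point cancellation.

```python
import math


def poisson_pmf(c, mean):
    """P(C = c) for C ~ Poisson(mean), evaluated in log space for stability."""
    if mean == 0:
        return 1.0 if c == 0 else 0.0
    return math.exp(c * math.log(mean) - mean - math.lgamma(c + 1))


def neglected_mass(r, t=1.0, factor=10, n_terms=200):
    """Approximate P(C > c_max) with c_max = floor(factor * r).

    Sums n_terms probabilities just above the cut-off. The terms shrink
    factorially, so the remainder is negligible when c_max is well above
    the mean r * t.
    """
    c_max = math.floor(factor * r)
    return sum(poisson_pmf(c, r * t)
               for c in range(c_max + 1, c_max + 1 + n_terms))


for r in (0.5, 1.0, 2.0, 5.0):
    print(f"r = {r}: probability above c_max = {neglected_mass(r):.3e}")
```

The discarded mass falls quickly as r grows, since the cut-off sits at ten times the mean; for small r the cut-off is only a handful of counts, so the neglected probability is largest there.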
The third problem is the fundamental difference between the binomial problem posed and the Poisson problem. With the binomial problem, the data-point consisted of a single number. With the Poisson problem, a list of N numbers (each between 0 and cmax(r, ti)) is produced. In general, there are (cmax(r, ti)+1)^N possible sequences to integrate over, which is simply unrealistic. Therefore, some form of Monte Carlo simulation is required. This introduces uncertainty, as an adequate sample size must be chosen and the random number generator must be of a sufficiently high quality.
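
The size of the sequence space, and the kind of uncertainty a Monte Carlo estimate carries, can be illustrated with a short sketch. It is not the procedure used in this appendix: it assumes ci is Poisson-distributed with mean r*ti and ti = 1, it uses the simple estimator sum(ci)/sum(ti) as a hypothetical stand-in for the estimators under study, it measures squared error rather than the distances used elsewhere, and it does not apply the cut-off. All function names are illustrative.

```python
import math
import random


def sample_poisson(mean, rng):
    """One Poisson(mean) draw via Knuth's method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def n_sequences(r, n, factor=10):
    """Number of length-n sequences with entries in 0..c_max, c_max = factor*r."""
    c_max = math.floor(factor * r)
    return (c_max + 1) ** n


def monte_carlo_mse(r, n, n_trials, rng, t=1.0):
    """Estimate E[(r_hat - r)^2] for r_hat = sum(c_i) / sum(t_i).

    Returns the Monte Carlo estimate and its standard error.
    """
    errors = []
    for _ in range(n_trials):
        counts = [sample_poisson(r * t, rng) for _ in range(n)]
        r_hat = sum(counts) / (n * t)
        errors.append((r_hat - r) ** 2)
    mean = sum(errors) / n_trials
    var = sum((e - mean) ** 2 for e in errors) / (n_trials - 1)
    return mean, math.sqrt(var / n_trials)


# Exhaustive enumeration: 21 possible values per data-point, 20 data-points.
print(n_sequences(r=2.0, n=20))

rng = random.Random(2024)  # arbitrary fixed seed
estimate, std_err = monte_carlo_mse(r=2.0, n=20, n_trials=5000, rng=rng)
print(f"estimate = {estimate:.4f}, standard error = {std_err:.4f}")
print(f"analytic value r/n = {2.0 / 20:.4f}")
```

For this particular estimator the expected squared error is known analytically (r/n when ti = 1), which gives a way to check that the sample size is adequate: the estimate should lie within a few standard errors of it. The standard error shrinks only as the square root of the number of trials. Python's random module is based on the Mersenne Twister; whether that is of sufficient quality for a given experiment is a separate judgment.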