Document Segmentation for Image Compression

by Emma Jonasson

Supervisor: Dr. Peter Tischer

Abstract

This thesis explores how to identify structure within binary document images, the benefits of knowing this structure, and the issues that arise in the process. Document images tend to contain large amounts of whitespace interleaved with textual and graphical elements. Each type of element has different statistical properties from the others. Taking these properties into account when encoding the image leads to better compression than treating the entire image as if it had uniform statistical properties.
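The compression argument above can be illustrated with a small sketch. It is not the encoder used in this thesis: it assumes a memoryless (order-0) model of pixel values, and the two toy regions, along with the names `entropy_bits` and `ideal_cost`, are hypothetical and chosen only for illustration. It compares the ideal code length of an image coded with one pooled model against the same image coded as two regions, each with its own statistics.

```python
import math

def entropy_bits(p):
    """Binary entropy, in bits, of a source that emits 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def ideal_cost(pixels):
    """Ideal code length in bits for a list of 0/1 pixels under a
    memoryless model fitted to those pixels (an assumed, simplified model)."""
    if not pixels:
        return 0.0
    p = sum(pixels) / len(pixels)
    return len(pixels) * entropy_bits(p)

# Hypothetical toy regions: 1 = black pixel, 0 = white pixel.
whitespace = [0] * 990 + [1] * 10   # almost entirely white
text = [0] * 600 + [1] * 400        # much denser in black pixels
whole = whitespace + text

pooled = ideal_cost(whole)                            # one model for everything
segmented = ideal_cost(whitespace) + ideal_cost(text)  # one model per region

print(f"pooled:    {pooled:.1f} bits")
print(f"segmented: {segmented:.1f} bits")
```

Under this simplified model the segmented cost is never higher than the pooled cost, because entropy is concave. The sketch ignores the overhead of transmitting the segmentation itself, which a real encoder has to pay for.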

The international standard for bi-level image compression, JBIG2, provides the option of using knowledge of the visual structure to improve compression through segmentation. The standard allows each element to be compressed with a method suited to its type, with three types of segment: text, halftone, and generic. However, JBIG2 does not specify how this segmentation should be performed, nor what similarity measure should be used.

This project argues that information content is a valid homogeneity property on which to base the segmentation. Simple pre-processing, including the novel "whitening transform" method, captures the "rough" information content and prepares the image for segmentation. The pre-processing phase also reduces the size of the image, allowing for quicker segmentation. Various types of segmentation, at different levels of granularity, are examined to see which leads to the best overall compression. A segmentation-based encoder is implemented to approximate the impact that each segmentation has on the compression rate. The trade-off between how specific the segmentation is and how much overhead is required to transmit it is also explored.

Experiments in segmentation-based encoding, using multiple segmentation techniques and granularities, show that segmentation is useful in enhancing compression. However, the tests suggest that no single segmentation technique produces good results on all types of images. The best-performing granularity also depends on the type of image, although on average a finer granularity performs better.



© 2005 Emma Jonasson