Segmentation of Dermatological Images using
Mixture Models and Markov Random Fields
© 2005 Ji Tran




Results

The segmented skin lesion images were generated using two clusters, with both the default settings and several alternative settings. For the Fuzzy C-means approach, the fuzzy factor was set to 2.0 and the stopping factor to 0.003. For the MRF-based segmentation models, the settings were: initial temperature = 15, final temperature = 0.1, \beta = {0.1, 0.01}, c1 = 80, c2 = 1/2, EM steps = 25, SA steps = 100, and MRF neighbourhood sizes of 4 and 8. Both the single-variance and full-covariance models were used, and all images were processed in the RGB colour space.

The set of segmented skin lesion images used in this research is located here.
You can download the results here [38192KB].

[Figures: "Lenna: Segmentation borders around the original image"; "Lenna: Segmentation borders around the segmented image"]

The above is an MRF-based segmentation result for the Lenna image. The settings were: initial temperature = 15, final temperature = 0.1, \beta = 0.01, c1 = 80, c2 = 1/3, EM steps = 10, SA steps = 25, an MRF neighbourhood size of 8, the full-covariance model, the RGB colour space, and 3 clusters.

An animation of the segmentation of the above image is also available [1272KB].


Methodology & Implementation

I have implemented several clustering algorithms that are variants of the classical K-means clustering method. The first is plain K-means, which uses the Euclidean distance to assign each data point to the closest cluster centre. A variant of this method uses the Mahalanobis distance, which takes into account the spread and orientation of the data points in each cluster.
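The difference between the two assignment rules can be sketched as follows. This is a minimal illustration using NumPy, not the project's actual code; the function names `assign_euclidean` and `assign_mahalanobis` are hypothetical, and the per-cluster covariance matrices are assumed to be given (for example, estimated from the previous iteration's clusters).

```python
import numpy as np

def assign_euclidean(points, centres):
    """Label each point with the index of the nearest centre (Euclidean)."""
    # diff has shape (n_points, n_centres, n_dims)
    diff = points[:, None, :] - centres[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    return np.argmin(d2, axis=1)

def assign_mahalanobis(points, centres, covariances):
    """Label each point with the nearest centre under the Mahalanobis
    distance, using one covariance matrix per cluster."""
    n, k = points.shape[0], centres.shape[0]
    d2 = np.empty((n, k))
    for j in range(k):
        diff = points - centres[j]
        inv_cov = np.linalg.inv(covariances[j])
        # squared Mahalanobis distance: diff^T * inv_cov * diff, per point
        d2[:, j] = np.einsum("ni,ij,nj->n", diff, inv_cov, diff)
    return np.argmin(d2, axis=1)
```

With a cluster that is elongated along one axis, a point lying along that axis can be closer to the elongated cluster in the Mahalanobis sense even when another centre is nearer in Euclidean terms, which is the behaviour the orientation-sensitive variant relies on.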

Following the K-means algorithm is the Fuzzy C-means implementation. K-means is considered a hard clustering method, as each data point can belong to only one cluster. In the Fuzzy C-means approach, a data point can belong to more than one cluster, which gives each point a fuzzy (partial) membership in every cluster. I have implemented two versions of the Fuzzy C-means algorithm. The first is the classical Fuzzy C-means using the Euclidean distance measure, and the other is the orientation-sensitive version by Philippe Schmid, which uses the Mahalanobis distance measure.
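As a rough sketch of the classical (Euclidean) Fuzzy C-means iteration, the code below alternates the standard membership and centre updates. The function name `fcm` is hypothetical, and the stopping rule (largest change in any membership below `eps`) is an assumption about how the stopping factor is applied; the defaults `m=2.0` and `eps=0.003` mirror the fuzzy factor and stopping factor quoted in the Results.

```python
import numpy as np

def fcm(points, centres, m=2.0, eps=0.003, max_iter=100):
    """Classical Fuzzy C-means with Euclidean distances.

    Returns the (n_points, n_clusters) membership matrix and the centres.
    """
    centres = np.asarray(centres, dtype=float).copy()
    u_prev = None
    for _ in range(max_iter):
        # distances from every point to every centre, shape (n, k)
        d = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero at a centre
        # u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # centres are means of the points weighted by u^m
        w = u ** m
        centres = (w.T @ points) / w.sum(axis=0)[:, None]
        # assumed stopping rule: memberships have stopped changing
        if u_prev is not None and np.max(np.abs(u - u_prev)) < eps:
            break
        u_prev = u
    return u, centres
```

Each row of the returned membership matrix sums to one, so a hard segmentation can be recovered by taking the cluster with the largest membership for each point.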

Moving along, Gaussian mixture models (GMM) fitted with EM are considered an improved approach, since each cluster is modelled as a Gaussian through its probability density function. The drawback of this approach is that it relies purely on the Gaussians: the estimated variances tend to become too small, which leads to inconsistent results.
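To make the variance problem concrete, here is a minimal one-dimensional EM update for a Gaussian mixture. It is a sketch rather than the implementation used for the results; the function name `em_step` is hypothetical, and the `var_floor` parameter is a common safeguard added here for illustration, not something described in this work.

```python
import numpy as np

def em_step(x, weights, means, variances, var_floor=1e-6):
    """One EM iteration for a 1-D Gaussian mixture.

    x: (n,) data; weights, means, variances: (k,) mixture parameters.
    """
    x = x[:, None]  # shape (n, 1) so it broadcasts against (k,)
    # E-step: responsibilities, computed in the log domain for stability
    log_pdf = -0.5 * (np.log(2 * np.pi * variances)
                      + (x - means) ** 2 / variances)
    log_r = np.log(weights) + log_pdf
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means and variances
    nk = r.sum(axis=0)
    new_weights = nk / x.shape[0]
    new_means = (r * x).sum(axis=0) / nk
    new_vars = (r * (x - new_means) ** 2).sum(axis=0) / nk
    # illustrative safeguard (an assumption): stop variances reaching zero
    new_vars = np.maximum(new_vars, var_floor)
    return new_weights, new_means, new_vars
```

Without the floor, a component that ends up explaining only a handful of nearly identical points drives its variance towards zero, which is one way the small-variance behaviour described above shows up.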

The methods above do not model the spatial relationships between pixels in the image. Because of this, they tend to be sensitive to noise. The MRF-based segmentation model proposed by Deng and Clausi (2004) models the data using two components: the region component and the feature component. The region component is modelled using the MRF, where a high cost is associated with a pixel if its label differs from those of its neighbours. Similarly, the feature component is modelled using the Gaussian probability density function, where a high cost is associated with a pixel if its features are far from the Gaussian that its label corresponds to. A variable weighting factor combines the two components to produce an accurate and smooth segmented image.
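The two-component cost can be sketched for a single pixel as below. This is an illustration under stated assumptions, not the published formulation: the region term is taken to be a simple Potts-style penalty of `beta` per disagreeing neighbour, the feature term is the negative log of an isotropic (single-variance) Gaussian, and the weighting schedule `alpha(t) = c1 * 0.9**t + c2` is an assumed form chosen to match the c1 and c2 settings quoted in the Results; it should be checked against Deng and Clausi (2004). All function names are hypothetical.

```python
import numpy as np

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # N, S, W, E
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # plus diagonals

def alpha(t, c1=80.0, c2=0.5, decay=0.9):
    """Assumed weighting schedule: large early on, settling at c2."""
    return c1 * decay ** t + c2

def region_energy(labels, r, c, label, beta, offsets):
    """Assumed Potts-style cost: beta for each neighbour whose label differs."""
    h, w = labels.shape
    cost = 0.0
    for dr, dc in offsets:
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] != label:
            cost += beta
    return cost

def feature_energy(x, mean, var):
    """Negative log-density of an isotropic Gaussian with a single variance."""
    d = len(x)
    return np.sum((x - mean) ** 2) / (2.0 * var) + 0.5 * d * np.log(2 * np.pi * var)

def local_energy(labels, features, r, c, label, means, variances,
                 beta, a, offsets):
    """Region cost plus the feature cost weighted by a."""
    return (region_energy(labels, r, c, label, beta, offsets)
            + a * feature_energy(features[r, c], means[label], variances[label]))
```

Under this assumed schedule the feature term dominates during the first iterations, and the region term gains relative influence as the weight decays towards c2.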

The method proposed by Deng and Clausi (2004) uses a single variance to model the feature component of each pixel's multidimensional features. They also use an MRF neighbourhood of size 4 (NSEW). However, these two properties alone were not strong enough to remove the effects of noise, especially the hair structures in a skin lesion image. We extended their model by introducing the full covariance matrix to calculate the feature component, and an MRF neighbourhood of size 8 (NSEW + diagonals) to produce smoother segmentation results. Both the original and the extended models are based on an EM approach: in the Expectation step, the class means and variances are estimated, while in the Maximisation step, the feature data are sampled using the Gibbs simulated annealing scheme to find the optimal class label for each data point.
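The labelling step with the full covariance matrix and the 8-neighbourhood can be sketched as a Gibbs sampler driven by a cooling schedule. This is a simplified illustration, not the code used to produce the results: the geometric cooling in `temperatures` is an assumption (only the initial temperature, final temperature and number of steps are stated above), the region cost is assumed to be `beta` per disagreeing neighbour, the weighting factor `alpha` is held fixed for brevity, and all function names are hypothetical.

```python
import numpy as np

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # N, S, W, E
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # plus diagonals

def feature_energy_full(x, mean, cov):
    """Negative log-density of a Gaussian with a full covariance matrix."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return 0.5 * (maha + logdet + len(x) * np.log(2 * np.pi))

def temperatures(t_start=15.0, t_end=0.1, steps=100):
    """Assumed geometric cooling from t_start to t_end (steps >= 2)."""
    return t_start * (t_end / t_start) ** (np.arange(steps) / (steps - 1))

def gibbs_sweep(image, labels, means, covs, beta, alpha, temp, offsets, rng):
    """Resample every pixel's label once at temperature temp (in place)."""
    h, w = labels.shape
    k = len(means)
    for r in range(h):
        for c in range(w):
            energies = np.empty(k)
            for j in range(k):
                # region cost: count in-bounds neighbours that disagree with j
                disagree = sum(
                    1 for dr, dc in offsets
                    if 0 <= r + dr < h and 0 <= c + dc < w
                    and labels[r + dr, c + dc] != j
                )
                energies[j] = (beta * disagree
                               + alpha * feature_energy_full(image[r, c],
                                                             means[j], covs[j]))
            # Gibbs distribution over labels; subtract the minimum for stability
            p = np.exp(-(energies - energies.min()) / temp)
            labels[r, c] = rng.choice(k, p=p / p.sum())
    return labels
```

A full run would alternate parameter estimation with sweeps such as `for t in temperatures(15.0, 0.1, 100): gibbs_sweep(..., temp=t, offsets=N8, rng=np.random.default_rng(0))`. At high temperatures labels change almost freely, and as the temperature falls the sampler settles on low-energy labellings.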

[Figure panels: K-means (E), K-means (M), Fuzzy C-means (E), Fuzzy C-means (M), GMM with EM, MRF with \alpha]

The above are scatter plots of four Gaussians segmented using the various segmentation methods.