VIDEO COMPRESSION via MORPHING

ABSTRACT :

Morphing is an animation technique used to create a realistic "shape changing" effect between two images. It is proposed to be used in a video encoding system to reconstruct intermediate frames, assuming only small changes between each key frame.

For the morph to look realistic, we need to define correspondences between initial and final images. Once the relationship is defined, we can create a smooth morphing sequence. In this project, seeds are used to map the most significant features between the initial and final images. I have implemented a technique to automatically locate seeds at the eyes, nose and mouth in facial images because they are the most recognisable features in faces.

1.0 INTRODUCTION :

This project is an extension of the projects completed by Edward Au and Daniel Alisauskas. The existing system provides two ways of defining nets: using points, implemented by Edward Au [Au, 1992], and using seed sites, implemented by Daniel Alisauskas [Alisauskas, 1993]. The nets formulate the spatial transformation of the images. The transformation is then linearly approximated between the source and destination nets to form a sequence of intermediate nets.

Mapping points required to assist the deformation between the initial and final images will be called "seeds" in this project. The animation sequences are much smoother if seeds are located at the significant features in both images. For facial images, seeds should be located at the eyes, nose and mouth because they are the most noticeable features of the human face.

The technique implemented by Alisauskas is more efficient and easier to use. It uses seed sites to assist the deformation. However, the user has to manually locate the seeds in both the initial and final images to get good morphs. The goal of this project is to automatically produce the seed sites in facial images to assist the deformation.

The goal was achieved and tested on sixty different images. The new system can automatically locate seeds at the eyes, nose and mouth for the front view images of faces.

2.0 MORPHING :

2.1 What is Morphing ?

The word "morphing" is derived from "metamorphosis" [Sorenson, 1992] which means a change of form or transformation. It is an animation technique in which a topological deformation is specified between source and destination images.

Morphing is a combination of image warping and cross-dissolving [Sorenson, 1992]. Image warping, also called image distortion, involves transforming the geometry of digital images. It stretches and deforms the initial image to the shape of the final image. At the same time, the texture of the initial image is cross-dissolved to result in the texture in the final image. Cross-dissolving is the process of mixing the colours of the initial and final images to form a new colour in the intermediate images.
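The cross-dissolving step can be illustrated with a minimal C sketch. It assumes 8-bit colour components stored in flat arrays and a blend factor t running from 0 (initial image) to 1 (final image); the function name cross_dissolve and the linear blend are illustrative assumptions, not code from the existing system.

```c
#include <assert.h>
#include <stddef.h>

/* Blend two 8-bit colour buffers of equal length n.
 * t = 0.0 reproduces the initial image, t = 1.0 the final image.
 * (Hypothetical helper; a linear blend is assumed.) */
void cross_dissolve(const unsigned char *initial, const unsigned char *final_img,
                    unsigned char *out, size_t n, double t)
{
    assert(t >= 0.0 && t <= 1.0);
    for (size_t i = 0; i < n; i++) {
        double v = (1.0 - t) * initial[i] + t * final_img[i];
        out[i] = (unsigned char)(v + 0.5);   /* round to nearest */
    }
}
```

In a full morph, this blend would be applied to the two warped images for each intermediate frame, with t increasing from frame to frame.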

Morphing techniques can be separated into 2-Dimensional morphing and 3-Dimensional morphing. 2-D morphing techniques involve warping and blending images. To implement 3-D morphing, the system needs to store the 3-D model description. This is more complicated to implement and is not appropriate for this project.

2.2 Where has morphing been used and seen ?

Morphing is widely used to create realistic "shape changing" effects, especially in :

Movies :

"Terminator 2 : Judgment Day" : Every scene where the T-1000 changed its shape into another form.

Music videos :

"Black or White" by Michael Jackson : A series of dancers are morphed into each other whilst dancing.

Television commercials :

"Exxon" : a car was morphed into a tiger.

3.0 HOW IS MORPHING RELATED TO VIDEO COMPRESSION ?

Real-time digital video transmission involves large amounts of data and therefore requires extremely high bandwidth, which favours the use of compression techniques.

In normal video transmission, all the frames are sampled and sent from the transmitter to the receiver. Standard transmission rates range from 15 to 25 frames per second. Figure 1 shows an example of a standard video frame transmission system.

By using morphing techniques, we can sample a video sequence well below the standard frame rates. The sampled (key) frames are then compressed and transmitted in the normal way to the receiver. At the receiving end, the lost intermediate frames are reconstructed by morphing between successive key frames. An example of a compressed video frame transmission system using morphing is shown in Figure 2.

For example, suppose that in a 25 frames per second transmission system only 5 key frames per second are sampled and transmitted, and the other 20 frames are reconstructed using morphing techniques at the receiving end. A compression rate of 80% is achieved (20 of every 25 frames are not sent), and transmission is sped up by five times.
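The arithmetic behind the 80% figure can be checked with a small C sketch. The function name frame_saving_percent is hypothetical and not part of the existing system; it only counts frames and ignores any further compression applied to the key frames themselves.

```c
#include <assert.h>

/* Percentage of frames that need not be transmitted when only
 * key_fps out of every full_fps frames per second are sent. */
double frame_saving_percent(int key_fps, int full_fps)
{
    assert(full_fps > 0 && key_fps >= 0 && key_fps <= full_fps);
    return 100.0 * (full_fps - key_fps) / full_fps;
}
```

With 5 key frames in a 25 frames per second system, the function returns 80, and the ratio 25 / 5 gives the fivefold reduction in frames sent.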

Figure 1 : Standard Video Frame Transmission.

Figure 2 : Compressed Video Frame Transmission.

4.0 PROJECT DESCRIPTION :

4.1 Assumptions :

Assumptions are made in this project to overcome some important problems. These assumptions are outlined below :

4.1.1 Assume images are front views of faces :

Localising seeds for any general object is beyond the scope of this project. Generally, we do not know what the significant features of the input images are, nor how to map them between frames.

Images of faces facing the camera were assumed. In these images, the mouth can usually be found at the bottom area of the images, the nose will be on top of the mouth and the eyes can be detected above the nose.

4.1.2 Eyes, nose and mouth are important features of faces :

Since this project concentrates on facial feature detection, I have studied feature recognition in some depth. In facial feature recognition, researchers are most interested in ways of locating the eyes, nose and mouth. Some of the methods used by other researchers for locating these features will be discussed in section 4.2.1 .

Since morphing requires the definition of correspondence between two images, searching for the significant features of faces is the area of interest in this project. Blurred images will appear in the intermediate frames if these features are not correctly mapped. Please refer to Figure 3 for an example of a morphing sequence where no correspondence between initial and final images is defined.

4.1.3 High speed machines are used :

This morphing system runs on high speed Silicon Graphics Iris machines. The speed of these machines greatly helps in searching for the eyes ( the methods will be discussed in section 5.4.3 ).

Figure 3 : Morphing errors if the eyes, nose and mouth are not correctly mapped.

4.2 The Problem :

The morphing process is accomplished using a set of mapping points in images. It involves matching the significant features in source and destination images. If no correspondence is defined between the two images, the result will be very poor. For example, if morphing is done between two faces without defining any mapping points, the intermediate images may show four eyes and two mouths, and the sequence will look blurred and confused. Please refer to Figure 3 for an idea of the type of morphs that can be produced if no seeds are located for correct mapping.

4.2.1 Methods investigated and rejected :

There are a few methods that have been studied to detect the location of the eyes, nose and mouth.

1). Use of deformable templates to extract features :

The use of deformable templates is a possible way of detecting features in faces. The templates are specified by a set of parameters that enable a priori knowledge about the expected shape of the features to guide the detection process [Yuille, 1992]. The interaction of the template with the image is defined in terms of a potential energy function. This method was rejected because it involves very complex calculations and long computational times for the energy function to find the best fit.

2). Train neural networks to locate features :

There has been a project where neural networks were trained to locate the eyes, nose and mouth in facial images [Debevec, 1992]. The networks were trained by repeatedly showing them positive and negative examples of these features from the training set. This method was rejected as it requires a lot of images for training the network and is very time consuming.

4.2.2 Method used for locating facial features :

The detection technique used in this project was my own idea. At the start of the project, I was looking for methods of facial feature detection. The methods that were studied (discussed in the previous section) were rejected mostly because they needed too much time, either for computation or for training the system. I required a method which would give me good results and use very little time in locating the facial features.

I closely observed the faces of people around me and examined a lot of facial images. I came up with the idea of using edge detection techniques to extract the facial features. I applied both horizontal and vertical edge detection techniques to ten facial images. The results showed that the mouth, the nose and the eyes were extracted very well, especially after the horizontal edge detection. So, I continued with the edge detection techniques to locate facial features.

The edge detection methods gave acceptable results quickly. They were able to locate the approximate location for facial features adequately for this project. They also used less time to search for the features on faces compared to the methods discussed in the previous section.

There are two approaches to feature detection :

1. From the top to bottom, (i.e. the eyes, the nose, followed by the mouth).

The eyes will be the first to be found.

2. From the bottom to top (i.e. the mouth, the nose, followed by the eyes)

The mouth will be the first to be found.

The distance from the chin to the mouth is shorter than the distance from the top of the head to the eyes. Hence, in the second approach, less computational time is needed to find the first feature. Once the first feature is found, the area for searching for the next one will be restricted and seed allocation will also be more accurate. Therefore, I chose to start searching from the bottom to the top.

Another reason for choosing the second method was that the location of the mouth was more accurately found compared with the eyes in the edge detection technique that I used. Horizontal edge detection works extremely well in finding the mouth and nose. Once the nose is found, the positions of the eyes would be determined more accurately.

4.3 Achievements :

I used sixty front view facial images for testing in this project. Among these faces, some wear spectacles and some have beards. More than half of them have tilted their heads or are not facing fully forward.

The seeds of the mouth were correctly located in 93.3% (56/60) of the test images. In all cases, the nose was located correctly once the mouth was found. In 85.0% (51/60) of the test images, the eyes were located correctly. Once the seeds are located at the mouth, nose and eyes in both initial and final images, we can start morphing. Figure 4 shows an example of a morphing sequence created by mapping the mouth, the nose and the eyes in the initial and final images.

This detection technique is going to be incorporated into the morphing system. Only very few modifications have to be made to complete the job. One or two readjustments of the seeds of the eyes may be necessary if the results are not satisfactory.

4.4 Other Considerations :

The morphing system was implemented on Silicon Graphics Iris machines. All the programs were written in the C language. The form library in Iris was used as a tool to define the user interface [Au, 1992; Alisauskas, 1993].

Figure 4 : Morphing sequence created by mapping the eyes, nose and mouth in the initial and final images.

5.0 IMPLEMENTATION :

5.1 Stages for feature detection :

All the rules and estimations for locating facial features were obtained by examining the testing images. Many different estimations and figures were used throughout the implementation stage. The figures for estimating the search area used in sections 5.2 to 5.4 gave the best results for locating features in the test images.

The system has several stages of feature detection :

1. Changes input images to grey scale.

2. Locates the top of the head and the chin using horizontal edge detection.

3. Locates the left and right sides of the face using vertical edge detection.

4. Locates seeds for the mouth.

5. Locates seeds for the nose.

6. Locates seeds for the eyes.

The system will first transform the input images to grey scale (refer to Figure 5(a) and 5(b)), then apply Sobel operators for horizontal and vertical edge detection. Sobel operators were used because they provide a differencing and a smoothing effect [Gonzalez, 1992].
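A minimal C sketch of horizontal edge detection with a Sobel operator is given below. It assumes a row-major 8-bit grey scale image, uses the standard 3x3 Sobel kernel that responds to horizontal edges (intensity changes in the vertical direction), sets border pixels to 0 and clamps the response to 255. The function name and these conventions are assumptions for illustration, not the code of the existing system.

```c
#include <assert.h>
#include <stdlib.h>

/* Sobel response to horizontal edges of a row-major grey scale image.
 * Kernel (rows top to bottom):  -1 -2 -1 /  0  0  0 /  1  2  1
 * Border pixels are set to 0; the magnitude is clamped to 0..255. */
void sobel_horizontal_edges(const unsigned char *grey, unsigned char *out,
                            int width, int height)
{
    assert(width >= 3 && height >= 3);
    for (int i = 0; i < width * height; i++)
        out[i] = 0;
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            const unsigned char *up = grey + (y - 1) * width + x;
            const unsigned char *dn = grey + (y + 1) * width + x;
            int g = (dn[-1] + 2 * dn[0] + dn[1])
                  - (up[-1] + 2 * up[0] + up[1]);
            g = abs(g);
            out[y * width + x] = (unsigned char)(g > 255 ? 255 : g);
        }
    }
}
```

Vertical edge detection would use the transposed kernel, differencing the left and right columns instead of the rows above and below. The 1-2-1 weights give the smoothing effect and the subtraction gives the differencing effect mentioned above.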

Two new grey scale images were produced, one after horizontal edge detection (refer to Figure 5(c)) and another after vertical edge detection (refer to Figure 5(e)). The separate images were then binarized with a threshold of 256/3 (refer to Figure 5(d) and 5(f)). This was done to enable feature detection. A few thresholds between 256/4 and 256/2 were tested to get clear edges in the images. I decided to use the value of 256/3 as a binarizing threshold because it gave the best result for highlighting the edges.
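The binarization step can be sketched in C as follows. The sketch assumes that pixels strictly above the threshold become white (255) and all others black (0), and that 256/3 is taken as the integer 85; neither detail is stated in this report, and the function name binarize is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Turn an edge-strength image into a binary image:
 * values above the threshold become 255 (edge), others 0. */
void binarize(const unsigned char *edges, unsigned char *out,
              size_t n, int threshold)
{
    assert(threshold >= 0 && threshold <= 255);
    for (size_t i = 0; i < n; i++)
        out[i] = (edges[i] > threshold) ? 255 : 0;
}
```

A call such as binarize(edges, out, n, 256 / 3) would then correspond to the threshold chosen above.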

Horizontal edge detection was used to find the approximate location of the top of the head and the chin. The sides of the face were estimated using vertical edge detection. Feature detection was restricted to the area between the left and right sides of the face, the top of the head and the chin. Once the system found these locations, it would search for the mouth, then the nose and finally the eyes.

Figure 5 : (a) 24 bit RGB image.

(b) Grey scale image.

(c) Image after horizontal edge detection.

(d) Image binarized from image (c).

(e) Image after vertical edge detection.

(f) Image binarized from image (e).

5.2 Mouth Detection :

A face model was drawn on a graph with grid lines. Most of the ratios used in the search areas were obtained from this face model. The width of the face model was 23 squares and the height was 28 squares. The face model used in this project is shown in Figure 6.

Figure 6 : Face model.

5.2.1 The area for locating the mouth :

Selecting the search area for the mouth is the most important step, because a well-chosen area makes the search for the mouth more efficient. If the mouth is not correctly found, the search for the nose and the eyes is also affected. A lot of different ratios were tried for the borders of the search area. The estimated area for searching for the mouth is shown in Figure 7.

The area is restricted by :

* 1/4 of the face width from the left side of face as left border.

* 1/4 of the face width from the right side of face as right border.

* 3/28 of the face length from the chin as bottom border.

* 12/28 of the face length from the chin as top border.

Figure 7 : The area where the mouth can normally be found.
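The four borders listed above can be computed as in the following C sketch. It assumes image coordinates in which row numbers increase downwards (so the chin has a larger row value than the top of the head) and uses integer arithmetic; the Rect type and the function name mouth_search_area are hypothetical.

```c
#include <assert.h>

typedef struct { int left, right, top, bottom; } Rect;

/* Search area for the mouth, derived from the face borders.
 * Rows are assumed to increase downwards. */
Rect mouth_search_area(int face_left, int face_right, int head_top, int chin)
{
    assert(face_right > face_left && chin > head_top);
    int w = face_right - face_left;   /* face width  */
    int h = chin - head_top;          /* face length */
    Rect r;
    r.left   = face_left  + w / 4;        /* 1/4 width from the left side  */
    r.right  = face_right - w / 4;        /* 1/4 width from the right side */
    r.bottom = chin - (3 * h) / 28;       /* 3/28 length above the chin    */
    r.top    = chin - (12 * h) / 28;      /* 12/28 length above the chin   */
    return r;
}
```

For a face spanning columns 0 to 92 and rows 0 to 112, this gives a search area from column 23 to 69 and from row 64 to 100.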

About 10 different ratios between 1/7 and 1/3 were tried to find suitable ratios for the left and right borders. 1/4 of the face width from the left and right sides of the face gave the best results. Larger ratios might waste search time, and smaller ones might cause the detected mouth to be narrower than the actual mouth.

The bottom border of 3/28 of the face length from the chin was successful in the first trial. When I decreased the ratio, the chin was mistaken for the mouth, because the edges of the chin were also clearly detected by the horizontal edge detection. More than 20 different ratios between 1/28 and 14/28 were tried to obtain the right ratio for the top border. The ratio 12/28 was the best ratio found for the top border. This border was the most important: if it was too low, the mouth might not be located correctly, and if it was too high, the edges of the spectacles might be incorrectly assigned to the mouth.

Reasons for restricting the search area :

The mouth is centred at the bottom part of a face.

The width of the mouth would not be larger than half a face width.

The search was started a bit higher than the chin, to avoid locating seeds at the chin instead of the mouth.

5.2.2 The method used to locate seeds at the mouth :

The binarized image after horizontal edge detection was reused at this stage. Several horizontal edges would appear white on a black background in the image. The system would start by searching for the horizontal scan-line with the most edge pixels set, i.e. the row which contains the maximum number of white pixels.

When the system has determined both ends of the horizontal line which has the most white pixels, it will then follow any adjoining line segments (not necessarily horizontal) outwards and use the ends of these as seeds for the mouth. The system is implemented such that if the mouth is slanted, it will reorientate to accommodate this.
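The first part of this search, finding the scan-line with the most white pixels inside the search area, can be sketched in C as below. The sketch assumes a row-major binary image in which any non-zero value is an edge pixel, and it returns the first such row when several rows tie. It does not cover the later step of following adjoining line segments outwards. The function name is hypothetical.

```c
#include <assert.h>

/* Row in [top, bottom] with the most edge (non-zero) pixels in
 * columns [left, right]. The first such row wins on ties. */
int row_with_most_edges(const unsigned char *bin, int width,
                        int left, int right, int top, int bottom)
{
    assert(left >= 0 && left <= right && right < width);
    assert(top >= 0 && top <= bottom);
    int best_row = top, best_count = -1;
    for (int y = top; y <= bottom; y++) {
        int count = 0;
        for (int x = left; x <= right; x++)
            if (bin[y * width + x] != 0)
                count++;
        if (count > best_count) {
            best_count = count;
            best_row = y;
        }
    }
    return best_row;
}
```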

An image with the line segments within the search area of the mouth, which was produced from horizontal edge detection, and then binarized with a threshold of 256/3, is shown in Figure 8(a). From this figure, one can easily see that the search area for the mouth is a lot smaller compared with the size of the original image.

Figure 8 : (a) Line segments within the search area for the mouth obtained from image in Figure 5(d).

(b) Line segments within the search area for the nose obtained from image in Figure 5(d).

5.3 Nose Detection :

5.3.1 The area for locating the nose :

The area is restricted by :

* 1/15 of the face length from the mouth as bottom border.

* 1/5 of the face length from the mouth as top border.

* the left seed of the mouth as left border.

* the right seed of the mouth as right border.

Figure 9 shows the estimated area for determining the location of the nose.

Figure 9 : The area where the nose is usually found.

Reasons for restricting the search area :

The nose will be on top of the mouth.

The nose will not be bigger than the width of the mouth found.

5.3.2 The method used to locate seeds at the nose:

I used the same image used for locating the top and bottom of the head to search for the nose. A few horizontal line segments would appear within the search area of the nose. The system would search for the horizontal scan-line with the longest horizontal line segment. The seeds of the nose would then be located at the outermost ends of this horizontal line segment.
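The search for the longest horizontal line segment can be sketched in C as follows. The sketch assumes a row-major binary image with non-zero edge pixels and treats a line segment as an unbroken run of edge pixels in one row; the Run type and the function name are hypothetical. The start and end columns of the returned run would serve as the two nose seeds.

```c
#include <assert.h>

typedef struct { int row, start, end, length; } Run;

/* Longest unbroken run of edge pixels in any row of the search area.
 * Returns length 0 and row -1 if the area contains no edge pixel. */
Run longest_horizontal_run(const unsigned char *bin, int width,
                           int left, int right, int top, int bottom)
{
    Run best = { -1, -1, -1, 0 };
    assert(left >= 0 && left <= right && right < width);
    assert(top >= 0 && top <= bottom);
    for (int y = top; y <= bottom; y++) {
        int run_start = -1;
        /* x runs one past the right border so that a run touching
         * the border is closed as well. */
        for (int x = left; x <= right + 1; x++) {
            int white = (x <= right) && bin[y * width + x] != 0;
            if (white) {
                if (run_start < 0)
                    run_start = x;
            } else if (run_start >= 0) {
                int len = x - run_start;
                if (len > best.length) {
                    best.row = y;
                    best.start = run_start;
                    best.end = x - 1;
                    best.length = len;
                }
                run_start = -1;
            }
        }
    }
    return best;
}
```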

Please refer to Figure 8(b) for an example of the image, with horizontal line segments within the search area, produced from horizontal edge detection, followed by binarization with a threshold of 256/3. Once again one should notice the small search area compared with the size of the whole image.

5.4 Eyes Detection :

Detection was done separately for each eye. M is defined to be the distance from the mouth to the nose.

5.4.1 The areas for locating the eyes :

Area for searching for the left eye is restricted by :

* M / 2 from the nose as the bottom border.

* 3 * M from the bottom border as the top border. ( The top of the head was used as the top border if 3*M was higher than the top of head)

* 1/5 of the face width from the left seed of nose as the left border.

* the left seed of the nose as the right border.

Area for searching for the right eye is restricted by :

* M / 2 from the nose as the bottom border.

* 3 * M from the bottom border as the top border. ( The top of the head was used as the top border if 3*M was higher than the top of head)

* 1/5 of the face width from the right seed of the nose as right border.

* the right seed of the nose as left border.

The search areas for detecting the left and right eyes are shown in Figure 10.

Figure 10 : The area where the eyes are normally found.
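The borders of the two eye search areas can be computed as in the C sketch below. It assumes that row numbers increase downwards, that "left" and "right" refer to image columns, and that the nose and mouth are each represented by a single row value; the Rect type and both function names are hypothetical.

```c
#include <assert.h>

typedef struct { int left, right, top, bottom; } Rect;

/* Rows shared by both eye areas; M is the mouth to nose distance. */
static void eye_rows(Rect *r, int nose_y, int mouth_y, int head_top)
{
    int m = mouth_y - nose_y;
    assert(m > 0);
    r->bottom = nose_y - m / 2;        /* M / 2 above the nose        */
    r->top = r->bottom - 3 * m;        /* 3 * M above the bottom      */
    if (r->top < head_top)             /* do not go above the head    */
        r->top = head_top;
}

Rect left_eye_search_area(int nose_left_x, int nose_y, int mouth_y,
                          int face_width, int head_top)
{
    Rect r;
    eye_rows(&r, nose_y, mouth_y, head_top);
    r.right = nose_left_x;
    r.left = nose_left_x - face_width / 5;
    return r;
}

Rect right_eye_search_area(int nose_right_x, int nose_y, int mouth_y,
                           int face_width, int head_top)
{
    Rect r;
    eye_rows(&r, nose_y, mouth_y, head_top);
    r.left = nose_right_x;
    r.right = nose_right_x + face_width / 5;
    return r;
}
```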

Reasons for restricting the search areas :

Bottom border : the eyes will not be located too near to the nose.

Top border : the eyes will not be located too far from the nose or above the head.

The eyes will not be found in the center above the nose, so searching starts from the nose seeds.

The eyes will not be located at the cheeks, so searching ends once the cheeks are reached.

5.4.2 Methods attempted to locate seeds at the eyes but rejected :

1. Binarization method :

A threshold was applied to the grey scale image in this method. I created a histogram for each search area. A cumulative sum of the number of pixels for each grey level (starting with zero) was made. The threshold grey level was chosen by selecting the grey level for which the cumulative sum was 10% to 25% of the total number of pixels.
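The threshold selection described here can be sketched in C as follows. The sketch assumes 8-bit grey levels and returns the first grey level at which the cumulative pixel count reaches the requested fraction of the area; the function name cumulative_threshold is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Grey level at which the cumulative histogram first reaches
 * the given fraction (for example 0.10 to 0.25) of all pixels. */
int cumulative_threshold(const unsigned char *grey, size_t n, double fraction)
{
    size_t hist[256] = {0};
    assert(n > 0 && fraction > 0.0 && fraction <= 1.0);
    for (size_t i = 0; i < n; i++)
        hist[grey[i]]++;
    size_t target = (size_t)(fraction * (double)n);
    size_t sum = 0;
    for (int level = 0; level < 256; level++) {
        sum += hist[level];
        if (sum >= target)
            return level;
    }
    return 255;
}
```

Because dark pixels are counted first, a search area that contains a lot of hair reaches the target fraction before any eye pixels are included, which is consistent with the failures described below.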

This method worked very well for only some of the images. The top borders of the search areas were too high for some images and included much of the hair. This caused a very high failure rate: the grey values for hair were usually low, so with a percentage threshold only the hair showed up, and the eyes were missing from some faces after binarization.

In some images the top border of the search area fell within the hair. However, I did not lower the top border of the search areas, as in some images this limit only just covered the whole of the eyes. I first tried using 4 times the mouth to nose distance from the bottom border as the top border. This gave poorer results than the value I used (3*M from the bottom border). I also tried two times and two and a half times the mouth to nose distance as the top border. These limits covered only part of the eyes, or were just below the eyes, in some images.

2. Multi-binarization method :

In this method, a few thresholds were applied to the grey scale image, and the results of the edge sets were combined to get a new binary image. The new image would have edges extracted clearly if the combination was suitable for the image. I tried a few sets of threshold combinations, some of which are listed below.

The thresholds 25, 50 and 75 as a set.

The thresholds 50, 60 and 70 as a set.

The thresholds 50, 75 and 100 as a set.

The thresholds 60, 80 and 100 as a set.

The thresholds 70, 85 and 100 as a set.

The thresholds 80, 100 and 120 as a set.

This method worked better compared with the previous one. However, this method did not give satisfactory results for some images. The top borders of the search areas were still too high for some images. The system also left out the eyes in several test images.

3. Binarization followed by horizontal edge detection method :

As in the first method, I created a histogram and obtained the 20% threshold of the histogram to create a new binarized image. I combined this new image with the horizontal edge detected images that I used for locating the top and bottom of the head.

This gave very clear edges for the eyes. Circular shapes appeared in the region of the eyes. I then tried to create several masks to extract the circles in these eye regions. All the centers of these circles were kept as possible seed locations. The seed sites for the eyes were then obtained by averaging all the possible seeds found in the eye areas. These gave the approximate locations of the eyes. Figure 11(b) shows an example of the resulting image from binarization and edge detection. Note the circular edges shown in this image. The edges may not be perfectly round.

This method was rejected because there were edges extracted near the eye brows or the sides of the eyes in about 40% of the test images used. Once I averaged the centers of these edges, the seeds were located at points outside the eyes. Some were located between the eye brows and the eyes, and some were located at the sides of the eyes. This method also took quite some time to locate the seed sites for the eyes due to the numerous masks which had to be applied.

Figure 11 : (a) 24 bit RGB image.

(b) Circular edges appeared in the search area of the image created from binarization followed by edge detection.

5.4.3 Methods used to locate seeds at the eyes :

1. Looking for the exact eye windows, using edge detection followed by binarization method :

All the values used in searching for the eyes were the results of trial and error. In the grey scale image, I applied horizontal edge detection to the search areas for the eyes. I then binarized these areas by a threshold of 256/2.5.

I tried a few thresholds, for example 256/3.5, 256/3, 256/2.5 and 256/2, to determine which would give better results. Most of the results showed that 256/2.5 was the best threshold for extracting the edges. Refer to Figure 12(b) for an example of the resulting image. The edges extracted in the search areas were clearly defined and sufficient for detecting the eyes.

Figure 12 : (a) 24 bit RGB image.

(b) Line segments appeared in the search areas for the eyes after the horizontal edge detection followed by binarization with a threshold of 256/2.5.

Figure 13 : (a) Edges extracted in the search area of the left eye of image in Figure 12(b).

(b) Edges covered in the left eye window in Figure 13(a), used to determine the location of the seed site for the left eye.

The detection was done separately for each eye. In the new image produced from the above method, I tried to find the exact windows for the eyes by eliminating the eye brows and the hair in the search areas. I calculated the width of the search area as the eye width. Then for each eye, I started searching for the bottom border for the eye window.

I added up the number of edge pixels in each row, starting from the bottom of the search area and stopping when the sum exceeded 1/5 of the eye width. This row was then used as the bottom border for the eye window. This avoided stray marks that would give a false border for the eye window.

Above the base of the eye window, I counted the number of edge pixels in each row. Searching towards the top of the search area, if the number of edge pixels in a row was less than 40% of that in the preceding row, the preceding row was used as the top border of the eye window. Please refer to Figure 13 for an example of a search area and an eye window, which were taken from the left eye of the image in Figure 12(b). The search area of the left eye is shown in Figure 13(a) and the left eye window is shown in Figure 13(b).
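The two border searches can be sketched in C as below. The sketch makes several assumptions that this report does not state: the binary search area is passed as its own row-major image with non-zero edge pixels, the width of that image is the eye width, the 1/5 test is applied to the edge count of each single row, and the 40% test compares each row with the row directly beneath it. The EyeWindow type and function names are hypothetical.

```c
#include <assert.h>

typedef struct { int top, bottom; } EyeWindow;

/* Number of edge (non-zero) pixels in row y. */
static int count_row(const unsigned char *bin, int width, int y)
{
    int count = 0;
    for (int x = 0; x < width; x++)
        if (bin[y * width + x] != 0)
            count++;
    return count;
}

/* Find the rows bounding the eye inside a binary search area.
 * Returns {-1, -1} if no row has enough edge pixels. */
EyeWindow find_eye_window(const unsigned char *bin, int width, int height)
{
    EyeWindow w = { -1, -1 };
    assert(width > 0 && height > 0);

    /* Bottom border: first row from the bottom whose edge count
     * exceeds 1/5 of the eye width (isolated marks are skipped). */
    int y = height - 1;
    while (y >= 0 && 5 * count_row(bin, width, y) <= width)
        y--;
    if (y < 0)
        return w;
    w.bottom = y;
    w.top = y;

    /* Top border: move upwards until a row has fewer than 40%
     * of the edge pixels of the row beneath it. */
    int prev = count_row(bin, width, y);
    for (y = w.bottom - 1; y >= 0; y--) {
        int cur = count_row(bin, width, y);
        if (10 * cur < 4 * prev)
            break;
        w.top = y;
        prev = cur;
    }
    return w;
}
```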

I attempted two methods to compute the seed locations for the eyes within the eye windows :

i). To average the positions of each pixel that appeared as edges within the eye window. For example in Figure 13(b), the seed for the eye would be located at the average positions of all the black pixels within the eye window.

ii). To average the positions of the four outermost black pixels within the eye window. For instance, for the eye in Figure 13(b), the seed would be at the (X, Y) location of the eye, where X is the horizontal direction and Y is the vertical direction.

X1 : Column value of the outermost left black pixel in the eye window.

X2 : Column value of the outermost right black pixel in the eye window.

The value of X is computed as below :

X = ( X1 + X2 ) / 2

Y3 : Row value of the highest black pixel in the eye window.

Y4 : Row value of the lowest black pixel in the eye window.

The value of Y is computed as below :

Y = ( Y3 + Y4 ) / 2

The results of the first method were not satisfactory. More edges were extracted near the top part of the eyes. When I averaged the positions of the black pixels, the seed would be located nearer the top area of the eye. Then, I tried the second method. In this method, the seed was located more towards the center of the eye in the horizontal direction. However, when I compared the sites in the vertical direction, most of them showed that the first method actually gave better results, so I combined both methods to compute the seed locations for the eyes. I used the first method to compute the seed sites in the vertical direction (Y) and the second method to calculate the seed sites in the horizontal direction (X).
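The combined method can be sketched in C as follows. The sketch assumes a row-major binary image in which edge pixels are non-zero (the figures show them as black), takes the vertical position as the mean row of all edge pixels in the eye window and the horizontal position as the midpoint of the outermost left and right edge pixels. The Seed type and function name are hypothetical, and integer division is used for simplicity.

```c
#include <assert.h>

typedef struct { int x, y, found; } Seed;

/* Seed for one eye from the edge pixels in rows [top, bottom].
 * x: midpoint of the outermost left and right edge pixels.
 * y: average row of all edge pixels.
 * found is 0 if the window contains no edge pixel. */
Seed eye_seed(const unsigned char *bin, int width, int top, int bottom)
{
    Seed s = { 0, 0, 0 };
    long sum_y = 0, n = 0;
    int min_x = width, max_x = -1;
    assert(width > 0 && top >= 0 && top <= bottom);
    for (int y = top; y <= bottom; y++) {
        for (int x = 0; x < width; x++) {
            if (bin[y * width + x] != 0) {
                sum_y += y;
                n++;
                if (x < min_x) min_x = x;
                if (x > max_x) max_x = x;
            }
        }
    }
    if (n == 0)
        return s;
    s.x = (min_x + max_x) / 2;
    s.y = (int)(sum_y / n);
    s.found = 1;
    return s;
}
```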

The system will check the distance of the seed sites between the left and the right eyes in the horizontal direction. If the distance is large ( greater than the distance from mouth to nose ), the system will then use the method outlined below in 5.4.3-2. to search for the eyes.

Using this combination method, I successfully separated the eyes from the whole search areas (search areas may include eyes, eye brows and the hair ) in 51 out of 60 test images used in this project.

2. Looking for the centers of circular edges within the search area :

In some faces with spectacles, there were cases where one eye would be incorrectly found, using the previous method 5.4.3-1. As in Figure 14(b), the edges introduced by the spectacles in the eye on the left were unclear and the system avoided these edges. However, the edges introduced by the spectacles in the eye on the right were very clear. The system then allocated the wrong position for the right eye due to the false eye window found. This would result in a large distance between the left and the right eyes in the horizontal direction (see Figure 14(c)). If this distance is too large (greater than the distance from the mouth to the nose), the system would use the following method to redetermine the locations of the eyes.

The original search areas were used instead of the eye windows. In the image produced by horizontal edge detection followed by binarization, some circular edges would also appear in the search area. For each eye, several masks would be created to extract the circles. The centers of the circles would be kept, and the seed for the eye will be the average of the centers of the circles.

Figure 14 : (a) Grey Scale image.

(b) Line segments within the search areas for the eyes of image (a).

(c) Line segments within the eye windows for the eyes of image (b).

6.0 TESTING :

Testing was done throughout the implementation stage and the appropriate corrections were made after each test. I used sixty different front view facial images for testing. They were of different sizes, colour depth, textures and brightness. Among these human faces, seven wore spectacles and four had beards. More than half of them have tilted their heads, and some have turned their faces slightly ( not facing forward fully).

All rules and estimations for searching for the features were obtained from examination of the test images. For example mouth to nose distances and search areas for each feature were determined by examination of the test sets as well as by trial and error.

6.1 General comments on the input images :

The following gives some general ideas on the robustness of the system as well as its limitations.

* The system can still process faces which are slightly tilted or are not fully facing forward.

* If the input images are clear, the system can automatically place seeds at the correct locations at the eyes, nose and mouth.

* The system may incorrectly allocate seeds if the input images are not bright enough as the edges in these images are not sufficiently clear for edge detection.

* The system may have trouble locating seeds for the mouth if the faces have beards. The detected mouth size may be wider compared with the original image due to edges introduced by the beard.

* For faces wearing spectacles, the seeds for the eyes may be incorrectly located at points below the eyes or slightly above the eyes.

6.2 The results of the final test :

The following outlines how well the system located seeds at the eyes, nose and mouth, and the time taken for locating seeds in 60 images.

* In 56 out of 60 faces (93.3%), the location of the mouth was correctly found and the seeds were located at both ends of the mouth.

* In the remaining four images, the system failed to locate the seeds at the mouth correctly. Two of these images were very dark and their edges were not clear: one had a seed located at the centre of the mouth, while the other had a seed located outside the mouth. The other two images were of faces with beards, where the two seeds were located too far apart, at points outside the mouth.

* In the four images mentioned above, the noses and eyes were still correctly found. This was because the search areas for the rest of the features were still within the correct range.

* In a video frame transmission system using morphing techniques, the seeds would be located in the key frames for correct mapping. For faces with beards, the mouth seeds might be placed slightly further apart, because the detected mouth is wider in the horizontal direction.

The system would still be capable of creating all the intermediate frames between the key frames by morphing. If the results are not satisfactory, the user can manually readjust the seed sites for the mouth, and run the morphing engine again to get better results.

* Seeds for the nose were located within acceptable ranges in all test images, even where the mouth had been incorrectly located.

* In 51 out of 60 images (85.0%), the seeds were located at acceptable points within the pupils of the eyes.

* In the other nine images, the system located seeds at positions near the eyes but not within the pupils. Five of these were dark images and four were faces wearing spectacles. In three of the nine, the seeds were located just below the eyes, and in the other six they were slightly above the eyes.

* In summary, problems were encountered mainly when the faces had spectacles or beards. The worst problems occurred when there were both spectacles and beards. The best pictures were those of studio photographs as these gave the clearest edges for detection.

Seeds will be allocated to the eyes, nose and mouth of the key frames in the video frame transmission system, even though some of the seeds may be located at sites slightly away from the features being searched for. The system would still create the intermediate frames between key frames. If the results are not satisfactory, the seed sites can be reallocated by the user to get better results.

* In the current system, there are two methods (previously discussed in section 5.4.3) for locating the eyes. In 56 out of 60 test images, the system used only the first method to find the eyes. In the other four test images, the first method found incorrect locations for the eyes, so the system then attempted the second method. The seeds were located correctly in two of these four images. In the other two, the seeds were only located near the eyes; the system could not place seeds at the correct eye locations because these faces were wearing spectacles.
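
The success rates and failure counts reported above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are taken from the results listed in this section, while the variable and function names are assumptions introduced here and are not part of the system.

```python
# Counts quoted in section 6.2. The names below are illustrative only.
TOTAL_IMAGES = 60
correct = {"mouth": 56, "nose": 60, "eyes": 51}


def success_rate(hits, total=TOTAL_IMAGES):
    """Return the success rate as a percentage, rounded to one decimal."""
    return round(100.0 * hits / total, 1)


rates = {feature: success_rate(count) for feature, count in correct.items()}

# Breakdown of the nine eye failures, by cause and by seed position.
eye_failures_by_cause = {"dark image": 5, "spectacles": 4}
eye_failures_by_position = {"below the eyes": 3, "above the eyes": 6}

for feature, rate in rates.items():
    print(f"{feature}: {rate}%")
```

Both breakdowns of the eye failures add up to the nine images in which the seeds fell outside the pupils.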

All the test images were 24-bit RGB images. The smallest test image was 99x98 pixels, and the system took 2.5 seconds to locate seeds at the mouth, nose and eyes for this image. The largest image was 270x300 pixels, and the system required 19.6 seconds to locate seeds at the eyes, nose and mouth. All timings were measured with the UNIX time command.

One should note that the system will only use the second method to locate seeds at the eyes if the first method fails. The system took longer to locate seeds in such cases. The times taken to locate seeds in these images are listed below :

- An image of size 145x139 pixels took 18.1 seconds.

- An image of size 160x203 pixels took 34.4 seconds.

- Two images of size 199x220 pixels took an average of 49.9 seconds.
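
To compare these timings across image sizes, it can help to normalise them to pixels processed per second. The sketch below does this for the figures quoted above. It assumes that the smallest and largest images were handled by the first eye-location method alone, which the text implies but does not state; the function and variable names are introduced here for illustration.

```python
# (width, height, seconds, eye method) as quoted in the text above.
# Labelling the first two rows "first" is an assumption of this sketch.
timings = [
    (99, 98, 2.5, "first"),
    (270, 300, 19.6, "first"),
    (145, 139, 18.1, "second"),
    (160, 203, 34.4, "second"),
    (199, 220, 49.9, "second"),  # average over two images
]


def pixels_per_second(width, height, seconds):
    """Throughput of seed location, in pixels processed per second."""
    return width * height / seconds


throughput = [
    (w, h, method, pixels_per_second(w, h, s)) for w, h, s, method in timings
]

for w, h, method, rate in throughput:
    print(f"{w}x{h} ({method} method): {rate:.0f} pixels/s")
```

Under that assumption, the first method alone runs at roughly 3,900 to 4,100 pixels per second, while images that fall back to the second method run at roughly 900 to 1,100 pixels per second, about four times slower.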

7.0 FUTURE DIRECTIONS :

* Previous students have implemented two different methods for defining nets in the system. This may confuse the user as to which one to use. These methods are also very hard to maintain because they require too many program structures. The future system should use only one method to increase speed and efficiency.

* Non-linear interpolation (e.g. spline interpolation) can be implemented to create intermediate frames. The system can be extended to implement an "acceleration-deceleration" model of the muscles in faces to get smoother seed-point motion between frames.

* Automatic seed allocation can be implemented for faces with different views, i.e. without assuming a front view of the face. Deformable templates may be needed to extract the features. This may also require the use of artificial intelligence to detect the orientation of the faces.

* Automatic seed allocation can be implemented for a general object, i.e. without assuming images of faces. The system could also be extended to remove the background of the images. This would aid the detection of the significant features of the initial and final images. This may also involve the use of artificial intelligence.

* Seed information from one frame can be used to help detection and seed allocation in the next frame, assuming only small changes occur between frames.
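
As an illustration of the "acceleration-deceleration" idea above, the following sketch moves seed points between two key frames along a cubic ease-in/ease-out curve (often called smoothstep) instead of at constant speed. This is not part of the existing system and is only one possible non-linear scheme, chosen here for brevity in place of a full spline or muscle model. All function names and seed coordinates are hypothetical.

```python
def smoothstep(t):
    """Cubic ease-in/ease-out: 0 at t=0, 1 at t=1, zero slope at both ends."""
    return t * t * (3.0 - 2.0 * t)


def interpolate_seeds(src, dst, t, ease=smoothstep):
    """Blend two equal-length lists of (x, y) seeds at time t in [0, 1]."""
    if len(src) != len(dst):
        raise ValueError("source and destination need the same number of seeds")
    s = ease(t)
    return [
        (x0 + s * (x1 - x0), y0 + s * (y1 - y0))
        for (x0, y0), (x1, y1) in zip(src, dst)
    ]


def intermediate_frames(src, dst, count, ease=smoothstep):
    """Seed positions for `count` frames strictly between two key frames."""
    return [
        interpolate_seeds(src, dst, i / (count + 1), ease)
        for i in range(1, count + 1)
    ]


# Hypothetical seeds: two eye seeds that shift slightly between key frames.
key_a = [(40.0, 50.0), (80.0, 50.0)]
key_b = [(44.0, 54.0), (84.0, 54.0)]
frames = intermediate_frames(key_a, key_b, 3)
```

Passing `ease=lambda t: t` gives plain linear interpolation, so the two schemes can be compared on the same key frames. With the easing curve, the seeds move slowly near each key frame and fastest at the midpoint.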

8.0 CONCLUSION :

Movies and commercials utilise morphing to create special shape-changing effects. Morphing techniques can also be used in video transmission systems. In a compressed video transmission system of this kind, only compressed key frames are sent. Morphing techniques can then be used at the receiving end to reconstruct the intermediate frames between the key frames, assuming only small changes between each key frame.

At the receiving end, the system has to define correspondences at the most significant features in the key frames for correct mapping. For facial images, the eyes, nose and mouth have to be mapped correctly because these are the most noticeable features of the face. In this project, I have developed a technique for automatically locating the seed sites for the eyes, nose and mouth in front view images of faces.

9.0 REFERENCES :

1. Au, Edward. "A Topological Morphing System - Final Report", csc400, Monash University, 1992.

2. Au, Edward. "TopoMorph - User Manual Rel. 1.0", csc400, Monash University, 1992.

3. Alisauskas, Daniel. "Automated Morphing - Final Report", csc400, Monash University, 1993.

4. Alisauskas, Daniel. "autoMorph - User Manual Version 1.0", csc400, Monash University, 1993.

5. Debevec, Paul. "A Neural Network for Facial Feature Location", CS 283 course project, UC Berkeley, Fall 1992.

6. Foley, J., van Dam, A., Feiner, S. and Hughes, J. "Computer Graphics : Principles and Practice", 2nd edition, Addison-Wesley, Reading, MA, 1987.

7. Gonzalez, Rafael C. "Digital Image Processing", Addison-Wesley, 1992, pp. 416-429.

8. Mason, David K. "Morphing on Your PC", Waite Group Inc, CA, 1994.

9. Roos, T. and Noltemeier, H. "Dynamic Voronoi Diagrams in Motion Planning", Computational Geometry - Methods, Algorithms and Applications, Springer-Verlag, 1991, pp. 228-236.

10. Russell, Michael John. "Methodologies of Morphing Transformations - Thesis Project", Computer Systems Engineering, Monash University, 1993.

11. Shackleton, M. A. and Welsh, W. J. "Classification of Facial Features for Recognition", Proc. Computer Vision and Pattern Recognition, Hawaii, 1991, pp. 573-579.

12. Sorenson, P. "Morphing Magic", Computer Graphics World, January 1992, pp. 36-42.

13. Yang, Guangzheng and Huang, Thomas S. "Human Face Detection in a Complex Background", Pattern Recognition, Vol. 27, No. 1, 1994, pp. 53-63.

14. Yuille, Allan L., Hallinan, Peter W. and Cohen, David S. "Feature Extraction from Faces Using Deformable Templates", International Journal of Computer Vision, Vol. 8, 1992, pp. 99-111.