When deriving a model to represent possible facial expressions, one can take two approaches. The first is to base the model on how the face is perceived. The elements of such a model are perceptible actions, such as ``eyebrow raised'' or ``lips pursed''. This is a top-down approach of sorts; I shall refer to models of this kind as perceptual models.
The second approach is to derive a model from the actual physical structure of the face; that is, to model a substrate of immovable bones with a hinged jaw, a movable or elastic skin over them and, between the two, the various muscles which pull against the bones and thereby distort the skin. In contrast to the first approach, this is a bottom-up approach; I shall term models of this sort anatomical models.
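As a rough illustration of the distinction, the two kinds of model might be sketched as data structures in Python. This is only a sketch under stated assumptions: the class names, field names, the 0-to-1 intensity scale, the jaw angle, and the particular muscles chosen are all hypothetical and are not taken from the project itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PerceptualExpression:
    """Top-down: an expression is a set of visible actions.

    Each action maps to an intensity in [0, 1] (an assumed scale).
    """
    actions: Dict[str, float] = field(default_factory=dict)


@dataclass
class Muscle:
    """A single facial muscle and how far it is contracted."""
    name: str
    contraction: float = 0.0  # assumed scale: 0 = relaxed, 1 = fully contracted


@dataclass
class AnatomicalExpression:
    """Bottom-up: an expression is the state of the physical structure.

    Only the hinged jaw and the muscles vary; the skin shape would be
    derived from them by a simulation that is not shown here.
    """
    jaw_angle_deg: float = 0.0
    muscles: List[Muscle] = field(default_factory=list)


# The same rough expression described in both styles (illustrative values).
perceptual = PerceptualExpression(
    actions={"eyebrow raised": 0.8, "lips pursed": 0.3}
)
anatomical = AnatomicalExpression(
    jaw_angle_deg=0.0,
    muscles=[
        Muscle("frontalis", 0.8),         # raises the eyebrows
        Muscle("orbicularis oris", 0.3),  # purses the lips
    ],
)
```

The perceptual record names what an observer sees, whereas the anatomical record names what causes it; a perceptual action such as ``eyebrow raised'' has no field of its own in the anatomical form and would have to be inferred from the muscle states.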
The term ``facial model'' could be applied to any method of notation which describes a facial expression. In the extreme case, one could therefore consider natural-language descriptions and digitally encoded photographs to be facial models, but these are clearly not useful to us. An English-language description of a facial expression may be easily understood and visualised by a human being, but it is of little use when one wishes to process the facial information using a computer. (However, one of the aims of my project was to create a system that would allow someone who has such an intuitive description in mind to find the appropriate expression in the facial model easily.) Images such as photographs are not useful in this application because they are not concise: a photograph contains a great deal of information in an unstructured form, and there is no one-to-one correspondence between, say, a set of pixels in the image and a facial parameter.