Introduction
This website presents a research study on sentence classification. Sentence classification
is the automatic categorization of sentences into a predefined set of sentence types. A few
examples of sentence types are given below. The XML tag that surrounds each sentence indicates its sentence type.
- <instruction> Download the latest version from the link below </instruction>
- <url> http://www.hp.com </url>
- <specification> Compaq Insight Manager 7 </specification>
- <request> Please email us in case of any enquiries </request>
The project aims to develop a sentence classifier that assigns each sentence to one of these categories.
The domain of the study is helpdesk emails. This is motivated by the high volume of email enquiries received
by helpdesk operators. Since many email enquiries are repetitive, it would be helpful if those emails
could be answered automatically. One research study is investigating the automatic generation of
responses to helpdesk emails (Marom and Zukerman, 2005),
and the sentence classifier developed in this project could be useful for that work.
The use of a sentence classifier is not restricted to the application mentioned above. Recently, more
research studies have been conducted at the sentence level. Sentences have been used to improve the accuracy
of classifying documents (Ko et al., 2002), to summarize a
multi-document biography (Zhou et al., 2004), and to determine the
intention of the sender of an email (Cohen et al., 2004),
among other uses. There have also been a number of studies on sentence classification itself. Previous studies mostly looked
at two aspects of the classification: the first is the use of simple types of features, such as
only the words in the sentences, and the second is the choice of classification method.
Specifically, this project focuses on two aspects of sentence classification that have not been examined in
the past. The first is feature selection. Feature selection has been commonly used in text
classification, which is the automatic categorization of documents. Text classification is thus similar to
sentence classification, but the two use different units of analysis. However, no sentence
classification study has applied feature selection so far. This project examines what effect feature
selection has on classifying sentences. The other aspect that the project focuses on is the use of context in classification.
The context of a sentence refers to its surrounding sentences. We study whether the information provided by the
context can help improve the accuracy of classifying the sentences. This is a new approach that we introduce
to this research area.
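
As a rough illustration of the two ideas above, the following Python sketch builds word features for a sentence, adds words from its neighbouring sentences as context features, and then applies a simple feature selection step. It is only a sketch under stated assumptions: the function names, the "prev:" and "next:" prefixes, the window size, and the use of a document-frequency threshold are all illustrative choices, not the methods actually used in the project. The first three sentences are taken from the examples above (their ordering here is assumed), and the fourth is made up for the example.

```python
from collections import Counter


def tokenize(sentence):
    # Deliberately naive tokenizer (assumption): lowercase, split on whitespace.
    return sentence.lower().split()


def context_features(sentences, index, window=1):
    """Word counts for sentences[index], plus prefixed word counts taken
    from up to `window` surrounding sentences on each side."""
    features = Counter(tokenize(sentences[index]))
    for offset in range(1, window + 1):
        if index - offset >= 0:
            for word in tokenize(sentences[index - offset]):
                features["prev:" + word] += 1
        if index + offset < len(sentences):
            for word in tokenize(sentences[index + offset]):
                features["next:" + word] += 1
    return features


def select_by_document_frequency(feature_sets, min_df=2):
    """Keep only features that occur in at least `min_df` sentences.
    This threshold is one simple feature selection method, chosen here
    purely for illustration."""
    df = Counter()
    for features in feature_sets:
        df.update(features.keys())
    return {name for name, count in df.items() if count >= min_df}


# Toy "email": three sentences from the examples above, one hypothetical.
email = [
    "Please email us in case of any enquiries",
    "Download the latest version from the link below",
    "http://www.hp.com",
    "Please download the driver",  # hypothetical sentence
]

feature_sets = [context_features(email, i) for i in range(len(email))]
kept = select_by_document_frequency(feature_sets, min_df=2)
print(sorted(kept))
```

Prefixing the context words keeps them distinct from the words of the sentence itself, so a classifier could weigh "the previous sentence mentions a download" differently from "this sentence mentions a download".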