We are developing a corpus-based approach for predicting help-desk responses from features in customers' emails, where responses are represented at two levels of granularity: document and sentence. We present an automatic and a human-based evaluation of our system's responses. The automatic evaluation involves textual comparisons between generated responses and the responses composed by help-desk operators. The results show that both levels of granularity produce good responses, addressing inquiries of different kinds. The human-based evaluation measures response informativeness, and confirms that both levels of granularity produce useful responses.
The paper is available as gzipped postscript (53 kB) and pdf (82 kB).
Alternatively, you can request a copy by emailing me.