Document Overlap Detection System for Distributed Digital Libraries
Krisztián Monostori, Arkady Zaslavsky, Heinz Schmidt
Tel: +61-3-9903-{1410,2479,2332}
E-mail: {krisztian.monostori, arkady.zaslavsky, heinz.schmidt}@infotech.monash.edu.au
In this paper we introduce the MatchDetectReveal (MDR) system, which is
capable of identifying overlapping and plagiarised documents. Each component
of the system is briefly described. The matching-engine component uses a
modified suffix tree representation, which is able to identify the exact
overlapping chunks; the performance of this component is also presented.
KEYWORDS: overlap detection, string-matching, suffix tree, distributed system