Document Overlap Detection System for Distributed Digital Libraries

Krisztián Monostori, Arkady Zaslavsky, Heinz Schmidt

School of Computer Science and Software Engineering

Monash University, Melbourne

900 Dandenong Road, Caulfield East, 3145

Australia

Tel: +61-3-9903-{1410,2479,2332}

E-mail: {krisztian.monostori, arkady.zaslavsky, heinz.schmidt}@infotech.monash.edu.au

Abstract

In this paper we introduce the MatchDetectReveal (MDR) system, which is capable of identifying overlapping and plagiarised documents. Each component of the system is briefly described. The matching-engine component uses a modified suffix tree representation that identifies the exact overlapping chunks; its performance is also presented.

KEYWORDS: overlap detection, string-matching, suffix tree, distributed system


Disclaimer