NEW APPROACH FOR PLAGIARISM DETECTION
Abstract. The paper proposes a new approach to intrinsic plagiarism detection, based on a method that identifies style changes in a text using novel chronology-based similarity measures. A model for finding significant deviations in style across a given document is constructed in order to indicate text parts that are suspected to have been written by co-authors, to be devoted to a different topic, or to be plagiarized. We consider each segment as the "result of the text evolution" provided by its predecessors in the text. Based on this evolutionary standpoint, a metric evaluating the dissimilarity between two given segments is introduced, and the text is clustered using this measure in order to reveal its heterogeneity. We also propose a new clustering procedure that embeds the data in a Euclidean space and then clusters it using the K-means approach. The obtained results demonstrate the method's high ability to detect such style changes.
AMS Subject Classification: 94A13, 62H30, 68T10
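
The embed-then-cluster idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the embedding technique or the chronology-based metric, so everything here is an assumption made for illustration: classical multidimensional scaling stands in for the embedding, the dissimilarity matrix `toy` is hypothetical data, and the function names `classical_mds`, `farthest_first` and `kmeans` are not taken from the paper.

```python
import numpy as np


def classical_mds(dissim, dim=2):
    """Embed items in R^dim from a symmetric dissimilarity matrix.

    Classical MDS is an assumed stand-in; the paper's embedding may differ.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:dim]           # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))  # negative eigenvalues -> 0
    return eigvecs[:, top] * scale


def farthest_first(points, k):
    """Deterministic initial centres: start at point 0, then repeatedly add
    the point farthest from the centres chosen so far."""
    chosen = [0]
    for _ in range(k - 1):
        diff = points[:, None, :] - points[chosen][None, :, :]
        nearest = np.linalg.norm(diff, axis=2).min(axis=1)
        chosen.append(int(nearest.argmax()))
    return points[chosen].copy()


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's K-means; returns one integer label per point."""
    centers = farthest_first(points, k)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        diff = points[:, None, :] - centers[None, :, :]
        labels = np.linalg.norm(diff, axis=2).argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical dissimilarities between six text segments: segments 0-3 are
# mutually close in style, segments 4-5 form a second, distant group.
toy = np.full((6, 6), 1.0)
toy[:4, :4] = 0.1
toy[4:, 4:] = 0.1
np.fill_diagonal(toy, 0.0)

labels = kmeans(classical_mds(toy, dim=2), k=2)
print(labels)
```

In this sketch, segments that receive a label different from the majority would be the ones flagged for closer inspection. A real pipeline would replace `toy` with dissimilarities computed between actual text segments.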



DOI: 10.12732/ijam.v29i3.7

Volume: 29
Issue: 3
Year: 2016