Cross-language text alignment: A proposed two-level matching scheme for plagiarism detection

Authors:

Highlights:

• Candidate pairs can be detected just by choosing the best translation of each word.

• A dynamic expansion technique improves candidate fragment identification.

• Analysis of various types of plagiarism cases shows the superiority of the proposed model.

• Considering word dependencies along with word meanings improves performance.
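
The first highlight, choosing the best translation of each word, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that "best translation" means the nearest target-language word by cosine similarity in a shared multilingual embedding space (the keywords mention multilingual word embeddings, but the listing does not give the selection rule). The toy vectors and the names `SOURCE_VECS`, `TARGET_VECS`, `best_translation`, and `translate_tokens` are hypothetical.

```python
import math

# Toy 3-dimensional "multilingual" embeddings. The values are made up for
# illustration; a real system would load pretrained aligned embeddings.
SOURCE_VECS = {
    "gato": [0.9, 0.1, 0.0],
    "casa": [0.0, 0.2, 0.9],
}
TARGET_VECS = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.7, 0.7, 0.0],
    "house": [0.0, 0.1, 1.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_translation(word, source_vecs, target_vecs):
    """Return the target word most similar to `word`, or None if it is out of vocabulary."""
    vec = source_vecs.get(word)
    if vec is None:
        return None
    return max(target_vecs, key=lambda t: cosine(vec, target_vecs[t]))


def translate_tokens(tokens, source_vecs, target_vecs):
    """Map each source token to its single best target-language word."""
    return [best_translation(t, source_vecs, target_vecs) for t in tokens]
```

Calling `translate_tokens(["gato", "casa"], SOURCE_VECS, TARGET_VECS)` maps each token to one target word, after which candidate pairs could be compared monolingually. Keeping only the top-scoring word per token is the assumption taken from the highlight; how ties and out-of-vocabulary words are handled here is a choice of this sketch.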

Keywords: Plagiarism detection, Cross-language plagiarism, Text alignment, Graph-of-words representation, Multilingual word embeddings

Review history: Received 13 November 2019, Revised 21 May 2020, Accepted 1 July 2020, Available online 11 July 2020, Version of Record 23 July 2020.

Official URL: https://doi.org/10.1016/j.eswa.2020.113718