Text plagiarism classification using syntax based linguistic features

Authors:

Highlights:

• An approach that classifies plagiarism using a minimal yet effective set of syntax-based linguistic features, extracted with shallow natural language processing techniques.

• A two-phase feature selection approach that identifies the smallest, best-performing set of features for plagiarism classification.

• A detailed analysis of how plagiarism types and complexities affect, and depend on, the extracted features.

Keywords: Plagiarism classification, Syntactic features, Linguistic features, POS tags, Chunks

Review history: Received 23 February 2017, Revised 6 July 2017, Accepted 7 July 2017, Available online 8 July 2017, Version of Record 28 July 2017.

Official paper URL: https://doi.org/10.1016/j.eswa.2017.07.006