High quality error-tolerant phrase mining on text corpus

Authors:

Highlights:

• Mining high-quality phrases on text with errors.

• Error-tolerant phrase model to maximize global probability of extracted phrases.

• Accelerating phrase mining process using dynamic programming.

• Trie-structure-based optimization to memorize temporary information.

Keywords: Phrase mining, Error tolerant, Scalability

Review history: Received 30 March 2019, Revised 14 February 2020, Accepted 30 December 2020, Available online 7 January 2021, Version of Record 25 January 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2020.114557