Pronounce differently, mean differently: A multi-tagging-scheme learning method for Chinese NER integrated with lexicon and phonetic features

Authors:

Highlights:

• The pronunciation of Chinese characters contains important semantic information.

• The phonetic features can solve the ambiguity problem in Chinese entity boundary identification.

• The proposed multi-tagging-scheme method can alleviate the data sparsity and error propagation problems for Chinese NER.
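
To make the phrase "multiple tagging schemes" concrete, the sketch below converts one labelled sentence from the BIO scheme to the BIOES scheme. It is an illustration only: this listing does not say which schemes the paper combines or how, so the choice of BIO and BIOES, the function name `bio_to_bioes`, and the example sentence are all assumptions. The sentence 南京市长江大桥 is a well-known segmentation ambiguity, and the character 长 is read cháng in 长江 (Yangtze River) but zhǎng in 市长 (mayor), which shows the kind of boundary cue that pronunciation can carry.

```python
def bio_to_bioes(tags):
    """Convert a BIO tag sequence to BIOES (hypothetical helper, not from the paper)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # The entity continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        else:  # prefix == "I"
            bioes.append(f"I-{label}" if continues else f"E-{label}")
    return bioes


# Illustrative data (an assumption, not taken from the paper):
# 南京市 / 长江大桥, both labelled as locations.
chars = list("南京市长江大桥")
bio = ["B-LOC", "I-LOC", "I-LOC", "B-LOC", "I-LOC", "I-LOC", "I-LOC"]
bioes = bio_to_bioes(bio)

# Two readings of the polyphonic character 长; the reading depends on the word.
readings_of_chang = {"长江": "cháng", "市长": "zhǎng"}

for ch, old, new in zip(chars, bio, bioes):
    print(ch, old, new)
```

Both sequences describe the same entities; BIOES additionally marks entity ends and single-character entities, which is one reason a model might benefit from seeing more than one scheme.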

Keywords: Named entity recognition, Phonetic feature, Lexicon feature, Multiple tagging schemes, Natural language processing, Information extraction

Review history: Received 24 November 2021, Revised 18 July 2022, Accepted 20 July 2022, Available online 4 August 2022, Version of Record 4 August 2022.

Official URL: https://doi.org/10.1016/j.ipm.2022.103041