Beyond MeSH: Fine-grained semantic indexing of biomedical literature based on weak supervision

Authors:

Highlights:

• Semantic indexing with MeSH descriptors may aggregate several distinct concepts.

• Concept-occurrence is a good heuristic for fine-grained semantic indexing.

• Models trained with concept-occurrence as weak supervision can achieve good accuracy.

• Lexical and semantic features combined can lead to improved predictive performance.

• Under-sampling the majority class in the training data can lead to further improvement.
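
As a rough illustration of two of the highlights above (concept-occurrence as a weak label, and under-sampling the majority class), the following minimal Python sketch may help. It is not the authors' implementation: the function names `weak_label` and `undersample_majority`, the `ratio` parameter, and the toy texts are all hypothetical, and plain substring matching stands in for a real concept-recognition tool.

```python
import random


def weak_label(text, concept_terms):
    """Weak label: 1 if any term of the concept occurs in the text, else 0.

    Assumption: case-insensitive substring matching replaces a proper
    concept-recognition step.
    """
    lowered = text.lower()
    return int(any(term.lower() in lowered for term in concept_terms))


def undersample_majority(examples, labels, ratio=1.0, seed=0):
    """Keep every minority-class example and at most
    ratio * (minority size) randomly chosen majority-class examples."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep_n = min(len(majority), int(ratio * len(minority)))
    kept = sorted(minority + rng.sample(majority, keep_n))
    return [examples[i] for i in kept], [labels[i] for i in kept]


# Toy, made-up texts used only to exercise the functions.
docs = [
    "Study of early-onset Alzheimer disease in families.",
    "Alzheimer disease biomarkers in plasma.",
    "Late-onset cases and risk genes.",
    "Cognitive decline in aging.",
    "Amyloid imaging in dementia.",
]
terms = ["early-onset Alzheimer disease"]  # hypothetical fine-grained concept

labels = [weak_label(d, terms) for d in docs]
balanced_docs, balanced_labels = undersample_majority(docs, labels, ratio=2.0)
```

The weakly labeled, rebalanced pairs in `balanced_docs` and `balanced_labels` could then be fed to any standard text classifier. The choice of `ratio` is a tuning decision and is not taken from the paper.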

Keywords: Semantic indexing, MeSH, Biomedical literature, Weak supervision

Article history: Received 27 November 2019, Revised 6 March 2020, Accepted 24 April 2020, Available online 23 May 2020, Version of Record 23 May 2020.

Official paper URL: https://doi.org/10.1016/j.ipm.2020.102282