Automatic discovery of adverse reactions through Chinese social media

Authors: Mengxue Zhang, Meizhuo Zhang, Chen Ge, Quanyang Liu, Jiemin Wang, Jia Wei, Kenny Q. Zhu

Abstract

Despite the tremendous effort made before the release of every drug, some adverse drug reactions (ADRs) may go undetected and thus cause harm to both users and pharmaceutical companies. One plausible venue for collecting evidence of such ADRs is online social media, where patients and doctors discuss medical conditions and their treatments. There is substantial previous research on ADR extraction from English online forums, but very little has been done on Chinese data. In this paper, we use posts from two popular Chinese social media platforms as the original dataset. We propose a semi-supervised learning framework that detects mentions of medications and colloquial ADR terms and extracts lexico-syntactic features from natural language text to recognize positive associations between drug use and ADRs. The key contribution is an automatic label generation algorithm that requires very little manual annotation. This bootstrapping algorithm could also be applied to English data. Experimental results indicate that our algorithm outperforms the hidden Markov model and conditional random fields. With this approach, we discovered a large number of side effects for a variety of popular medicines in real-world scenarios.

Keywords: Adverse drug reaction, Chinese social media, Natural language processing

Official paper URL: https://doi.org/10.1007/s10618-018-00610-2