Feature assembly method for extracting relations in Chinese

Authors:

Abstract

Relation extraction aims to detect relations between two entities in free text. Within a sentence, a relation instance usually comprises only a few words, which yields a sparse feature representation. To make better use of the limited information in a relation instance, parse trees and combined features are widely employed to capture its local dependencies. However, the performance of parse-tree-based systems is often degraded by chunking or parsing errors, and although combined features are widely used, few studies have addressed how features should be combined to achieve optimal performance. In this study, we therefore propose a feature assembly method for relation extraction. Six types of candidate features (head noun, POS tag, n-gram, omni-word, etc.) are employed as atomic features, and six constraint conditions (singleton, position, syntax, etc.) are used to combine these features in different settings. Depending on which candidate features are used, different constraint conditions can be explored to achieve the best extraction performance. The method is effective at capturing local dependencies and reduces the errors caused by inaccurate parsing. We tested the proposed method on the ACE 2005 Chinese and English corpora, where it achieved state-of-the-art performance and was significantly superior to existing methods.
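The abstract does not define the constraint conditions, so the following is only a minimal sketch of the general idea of assembling atomic features under constraints. The function names, the feature string format, the `arg1`/`arg2` slots, and the readings of "singleton" and "position" are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the paper's actual constraint definitions are not
# given in the abstract, so the semantics below are assumed.

def singleton(atoms):
    # Assumed reading of "singleton": each atomic feature is emitted on its own.
    return [f"{kind}={value}" for kind, value, _slot in atoms]


def with_position(atoms):
    # Assumed reading of "position": each atomic feature is tagged with the
    # slot in which it occurs (here, hypothetically, which entity it belongs to).
    return [f"{kind}@{slot}={value}" for kind, value, slot in atoms]


def assemble(atoms, constraints):
    # Apply every selected constraint to the atomic features and concatenate
    # the resulting feature strings into one feature list.
    features = []
    for constraint in constraints:
        features.extend(constraint(atoms))
    return features


# Hypothetical atomic features: (feature type, value, slot).
atoms = [
    ("head", "university", "arg1"),
    ("head", "Beijing", "arg2"),
    ("pos", "NN", "arg1"),
]

features = assemble(atoms, [singleton, with_position])
for feature in features:
    print(feature)
```

Changing the list passed to `assemble` changes which combined features a classifier would see, which is the kind of setting the abstract describes exploring.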

Keywords: Constraint condition, Feature assembly, Relation extraction

Article history: Received 5 February 2015, Revised 5 July 2015, Accepted 7 July 2015, Available online 14 July 2015, Version of Record 13 August 2015.

Official paper URL: https://doi.org/10.1016/j.artint.2015.07.003