Exploring syntactic structured features over parse trees for relation extraction using kernel methods

Abstract

Extracting semantic relations between entities from text is a challenging task in information extraction and an important one for deep information processing and management. This paper proposes using a convolution kernel over parse trees, together with support vector machines, to model syntactic structured information for relation extraction. Compared with linear kernels, tree kernels can implicitly explore the huge number of syntactic structured features embedded in a parse tree. Our study reveals that these features are very effective for relation extraction and are well captured by the convolution tree kernel. Evaluation on the ACE benchmark corpora shows that the convolution tree kernel alone achieves performance comparable to the previous best-reported feature-based methods, and that our method significantly outperforms the two previous dependency tree kernels for relation extraction. Moreover, this paper proposes a composite kernel for relation extraction that combines the convolution tree kernel with a simple linear kernel. Our study reveals that the composite kernel effectively captures both flat and structured features without extensive feature engineering and scales easily to include more features. Evaluation on the ACE benchmark corpora shows that the composite kernel outperforms the previous best-reported methods in relation extraction.
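
To make the two kernels named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the convolution tree kernel is the standard subtree-counting kernel with a decay factor, and that the composite kernel is a convex combination of a normalized linear kernel and a normalized tree kernel. The tuple-based tree encoding, the function names, the feature names, and the values of `decay` and `alpha` are all hypothetical choices made for illustration.

```python
from math import sqrt

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a leaf word is a plain string.

def nodes(tree):
    """Return all internal nodes (tuples) of a tree."""
    if isinstance(tree, str):
        return []
    out = [tree]
    for child in tree[1:]:
        out.extend(nodes(child))
    return out

def production(node):
    """Grammar production at a node: its label plus its children.
    Words stay strings; nonterminal children become 1-tuples so they never
    collide with a word that happens to spell the same label."""
    return (node[0],) + tuple(c if isinstance(c, str) else (c[0],) for c in node[1:])

def common_subtrees(n1, n2, decay):
    """Decay-weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # c2 is also internal, since productions match
            score *= 1.0 + common_subtrees(c1, c2, decay)
    return score

def tree_kernel(t1, t2, decay=0.5):
    """Sum the common-subtree counts over all pairs of nodes."""
    return sum(common_subtrees(a, b, decay) for a in nodes(t1) for b in nodes(t2))

def linear_kernel(f1, f2):
    """Dot product of two sparse feature dictionaries (flat features)."""
    return sum(v * f2.get(k, 0.0) for k, v in f1.items())

def normalized(kernel, x, y, **kw):
    """Cosine-style normalization so both kernels lie on a comparable scale."""
    denom = sqrt(kernel(x, x, **kw) * kernel(y, y, **kw))
    return kernel(x, y, **kw) / denom if denom else 0.0

def composite_kernel(inst1, inst2, alpha=0.4, decay=0.5):
    """Convex combination of the flat and structured kernels.
    Each instance is a pair (feature_dict, parse_tree); alpha is an assumed weight."""
    (f1, t1), (f2, t2) = inst1, inst2
    flat = normalized(linear_kernel, f1, f2)
    structured = normalized(tree_kernel, t1, t2, decay=decay)
    return alpha * flat + (1.0 - alpha) * structured

# Toy usage with made-up trees and features.
t1 = ("NP", ("DT", "the"), ("NN", "dog"))
t2 = ("NP", ("DT", "the"), ("NN", "cat"))
f1 = {"head=dog": 1.0, "type=PER": 1.0}
f2 = {"head=cat": 1.0, "type=PER": 1.0}
print(tree_kernel(t1, t2, decay=1.0))
print(composite_kernel((f1, t1), (f2, t2), alpha=0.4, decay=1.0))
```

A Gram matrix built from `composite_kernel` over all training instances could then be passed to any SVM implementation that accepts a precomputed kernel. The naive double loop over node pairs is quadratic in tree size, which is adequate for a sketch but not tuned for a full corpus.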

Keywords: Information extraction, Relation extraction, Syntactic structured features, Convolution tree kernel, Composite kernel

Review history: Received 28 November 2006, Revised 28 July 2007, Accepted 30 July 2007, Available online 11 September 2007.

Official paper URL: https://doi.org/10.1016/j.ipm.2007.07.013