A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text

Authors: Mei Kuan Wong, Syed Sibte Raza Abidi, Ian D. Jonsen

Abstract

Over the last decade, ontology engineering has been pursued by “learning” the ontology from domain-specific electronic documents. Most of this research focuses on the extraction of concepts and taxonomic relations; the extraction of non-taxonomic relations is often neglected and not well researched. In this paper, we present a multi-phase correlation search framework to extract non-taxonomic relations from unstructured text. Our framework addresses the two main problems in non-taxonomic relation extraction: (a) the discovery of non-taxonomic relations and (b) the labelling of non-taxonomic relations. First, our framework is capable of extracting correlated concepts beyond the ordinary search window of a single sentence. Interesting correlations are then filtered using association rule mining with the lift interestingness measure. Next, our framework distinguishes non-taxonomic concept pairs from taxonomic concept pairs based on an existing domain ontology. Finally, our framework uses domain-related verbs as labels for the non-taxonomic relations. Our proposed framework has been tested in the marine biology domain. Results validated by domain experts show that the framework is reliable and demonstrate a significant improvement over the traditional association rule approach in the search for non-taxonomic relations in unstructured text.
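
The lift-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition lift(A, B) = supp(A, B) / (supp(A) × supp(B)) and treats each search window as a set of concepts. The function name `lift_scores`, the sample windows, and the threshold of 1.0 are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def lift_scores(windows):
    """Compute lift for every concept pair that co-occurs in at least one window.

    `windows` is a list of concept lists; each list stands for one search
    window (hypothetical input format, not the paper's).
    """
    n = len(windows)
    single = Counter()  # number of windows containing each concept
    pair = Counter()    # number of windows containing each concept pair
    for concepts in windows:
        unique = sorted(set(concepts))  # count each concept once per window
        single.update(unique)
        pair.update(combinations(unique, 2))

    scores = {}
    for (a, b), count in pair.items():
        support_ab = count / n
        support_a = single[a] / n
        support_b = single[b] / n
        # lift > 1: the pair co-occurs more often than independence predicts
        scores[(a, b)] = support_ab / (support_a * support_b)
    return scores


# Hypothetical marine-biology-style windows, invented for this example.
windows = [
    ["seal", "fish"],
    ["seal", "fish"],
    ["seal", "ocean"],
    ["whale", "ocean"],
]
scores = lift_scores(windows)

# Keep only positively correlated pairs (threshold 1.0 is an assumption).
interesting = {p: s for p, s in scores.items() if s > 1.0}
```

Pairs are keyed in alphabetical order, so the pair of "seal" and "fish" is stored as ("fish", "seal"). A lift above 1 indicates positive correlation, a lift near 1 indicates independence, and a lift below 1 indicates that the concepts tend not to appear together.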

Keywords: Correlation search, Non-taxonomic relation, Relation labelling, Association rule mining, Lift interestingness measure, Ontology learning

Paper URL: https://doi.org/10.1007/s10115-012-0593-7