Semantic enrichment of documents: a classification perspective for ontology-based imbalanced semantic descriptions

Authors: Georgios Stratogiannis, Panagiotis Kouris, Georgios Alexandridis, Georgios Siolas, Giorgos Stamou, Andreas Stafylopatis

Abstract

This article presents a novel framework for the semantic enrichment of documents that exploits the hierarchical ontological knowledge of a domain in conjunction with classification techniques. The main contributions of this work are fourfold: (a) a theoretical model for the semantic representation and enrichment of documents is defined, (b) a method for dealing with class imbalance is outlined, based on transforming the document representations into more balanced ones, (c) a methodology is proposed for assigning semantic labels in cases where it is hard to decide which label fits best, and (d) a set of novel metrics for evaluating the performance of the suggested framework is introduced. Extensive experiments on two popular datasets show promising results and provide evidence for the robustness of the overall approach.

Keywords: Semantic enrichment, Semantic labeling, Semantic annotation, Semi-structured documents, Ontologies, Classification, Class imbalance

Paper URL: https://doi.org/10.1007/s10115-021-01615-y