Transforming arbitrary tables into logical form with TARTAR

Abstract

The tremendous success of the World Wide Web is countervailed by efforts needed to search and find relevant information. For tabular structures embedded in HTML documents, typical keyword or link-analysis based search fails. The Semantic Web relies on annotating resources such as documents by means of ontologies and aims to overcome the bottleneck of finding relevant information. Turning the current Web into a Semantic Web requires automatic approaches for annotation since manual approaches will not scale in general. Most efforts have been devoted to automatic generation of ontologies from text, but with quite limited success. However, tabular structures require additional efforts, mainly because understanding of table contents requires the comprehension of the logical structure of the table on the one hand, as well as its semantic interpretation on the other. The focus of this paper is on the automatic transformation and generation of semantic (F-Logic) frames from table-like structures. The presented work consists of a methodology, an accompanying implementation (called TARTAR) and a thorough evaluation. It is based on a grounded cognitive table model which is stepwise instantiated by the methodology. A typical application scenario is the automatic population of ontologies to enable query answering over arbitrary tables (e.g. HTML tables).

Keywords: Table structure, Table modeling, Knowledge frame, Ontology learning, Semantic Web, Query answering

Review history: Received 10 April 2006, Accepted 13 April 2006, Available online 11 May 2006.

Official paper URL: https://doi.org/10.1016/j.datak.2006.04.002