Semantic search in the World News domain using automatically extracted metadata files

Abstract

The Semantic Web can have great influence on various domains of information, among them World News. Semantic Web technologies aim to organize the vast amount of knowledge scattered across the Web in a machine-understandable way, which would make searching and data retrieval much easier. This would be particularly helpful in the World News domain, where the wide variety of news sources calls for an efficient method of organizing them automatically. In this paper, we describe World News Finder, a system that performs semantic search in the World News domain. The system relies on metadata files that are created automatically for every World News HTML webpage. Given a user query, the system searches these metadata files instead of performing a keyword search. To achieve this, we developed the World News Ontology and a large set of domain-specific heuristic rules.

Keywords: Semantic search, Automatic metadata extraction, World News, Text categorization, Natural language understanding

Review history: Received 24 July 2009, Revised 9 December 2011, Accepted 11 December 2011, Available online 16 December 2011.

Paper URL: https://doi.org/10.1016/j.knosys.2011.12.007