Geographic knowledge extraction and semantic similarity in OpenStreetMap

作者:Andrea Ballatore, Michela Bertolotto, David C. Wilson

摘要

In recent years, a web phenomenon known as Volunteered Geographic Information (VGI) has produced large crowdsourced geographic data sets. OpenStreetMap (OSM), the leading VGI project, aims at building an open-content world map through user contributions. OSM semantics consists of a set of properties (called ‘tags’) describing geographic classes, whose usage is defined by project contributors on a dedicated Wiki website. Because of its simple and open semantic structure, the OSM approach often results in noisy and ambiguous data, limiting its usability for analysis in information retrieval, recommender systems and data mining. Devising a mechanism for computing the semantic similarity of the OSM geographic classes can help alleviate this semantic gap. The contribution of this paper is twofold. It consists of (1) the development of the OSM Semantic Network by means of a web crawler tailored to the OSM Wiki website; this semantic network can be used to compute semantic similarity through co-citation measures, providing a novel semantic tool for OSM and GIS communities; (2) a study of the cognitive plausibility (i.e. the ability to replicate human judgement) of co-citation algorithms when applied to the computation of semantic similarity of geographic concepts. Empirical evidence supports the usage of co-citation algorithms—SimRank showing the highest plausibility—to compute concept similarity in a crowdsourced semantic network.

论文关键词:Semantic similarity, OpenStreetMap, Volunteered Geographic Information, OSM Semantic Network, SimRank, P-Rank, Co-citation, Crowdsourcing

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10115-012-0571-0