Hashing and canonicalizing Notation 3 graphs

Abstract

This paper presents a hash algorithm and a canonicalization algorithm for Notation 3 (N3) and Resource Description Framework (RDF) graphs. Given a graph, the hash algorithm produces a hash value that would also be obtained from any other equivalent graph. Unlike previous related work, it is well suited to graphs with blank nodes, variables and subgraphs. The canonicalization algorithm outputs a canonical serialization of a given graph (i.e. a canonical representative of the set of all graphs that are equivalent to it). Potential applications of these algorithms include, among others, checking graphs for identity, computing differences between graphs, and graph synchronization. The first of these could be especially useful for crawlers that gather RDF/N3 data from the Web, since it lets them avoid processing equivalent graphs more than once. Both algorithms have been evaluated on a large dataset with more than 29 million triples and several million subgraphs and variables.
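The invariance property described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes ground graphs only (no blank nodes, variables or subgraphs, which are exactly the cases the paper addresses), and the names `triple_digest` and `ground_graph_hash` are hypothetical. Each triple is hashed independently and the digests are combined with a commutative operation, so the result does not depend on statement order or on repeated statements.

```python
import hashlib

# Assumption: digests are combined by addition modulo 2**256, which is
# commutative and associative, so triple order cannot affect the result.
MODULUS = 2 ** 256


def triple_digest(triple):
    """Hash a single (subject, predicate, object) triple of strings."""
    # The unit separator keeps ("ab", "c", "d") distinct from ("a", "bc", "d").
    encoded = "\x1f".join(triple).encode("utf-8")
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big")


def ground_graph_hash(triples):
    """Order-independent hash of a ground graph given as an iterable of triples."""
    total = 0
    # A graph is a set of triples, so duplicates are removed first.
    for triple in set(triples):
        total = (total + triple_digest(triple)) % MODULUS
    return format(total, "064x")


# Two serializations of the same ground graph: different order, one duplicate.
g1 = [("ex:a", "ex:knows", "ex:b"), ("ex:b", "ex:knows", "ex:c")]
g2 = [
    ("ex:b", "ex:knows", "ex:c"),
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:a", "ex:knows", "ex:b"),
]

same = ground_graph_hash(g1) == ground_graph_hash(g2)
```

A crawler could store such a value per fetched graph and skip any graph whose hash it has already seen. The sketch breaks down as soon as blank nodes appear, because two equivalent graphs may then label the same node differently, which is the gap the paper's algorithm is designed to close.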

Keywords: Semantic Web, Notation 3, RDF, Hash

Article history: Received 26 May 2009, Revised 9 November 2009, Available online 28 January 2010.

Official article page: https://doi.org/10.1016/j.jcss.2010.01.003