Frameworks for entity matching: A comparison

Authors:

Abstract

Entity matching is a crucial and difficult task for data integration. Entity matching frameworks provide several methods, and ways of combining them, to solve different match tasks effectively. In this paper, we comparatively analyze 11 proposed frameworks for entity matching. Our study covers frameworks that use training data to semi-automatically find an entity matching strategy for a given match task, as well as frameworks that do not. Moreover, we consider support for blocking and for the combination of different match algorithms. We further study how the different frameworks have been evaluated. The study aims to explore the current state of the art in research prototypes of entity matching frameworks and their evaluations. The proposed criteria should help to identify promising framework approaches, and should make it possible to categorize and comparatively assess additional entity matching frameworks and their evaluations.

Keywords: Entity resolution, Entity matching, Matcher combination, Match optimization, Training selection

Article history: Received 3 December 2008, Revised 2 October 2009, Accepted 3 October 2009, Available online 14 October 2009.

Paper URL: https://doi.org/10.1016/j.datak.2009.10.003