BLOSS: Effective meta-blocking with almost no effort

Authors:

Highlights:

• We present a new solution for configuring meta-blocking with reduced user effort.

• Sampling is combined with an active learning approach to select informative pairs.

• A strategy is proposed to remove outliers, improving recall.

• The experiments show a reduction in labeling effort and an improvement in precision.

Keywords: Data integration, Deduplication, Blocking, Meta-blocking

Review history: Received 20 September 2017, Revised 8 February 2018, Accepted 14 February 2018, Available online 16 February 2018, Version of Record 9 March 2018.

Official paper URL: https://doi.org/10.1016/j.is.2018.02.005