Box clustering segmentation: A new method for vision-based web page preprocessing

Authors:

Highlights:

• A new, purely vision-based segmentation technique is formally described.

• Only a few simple visual cues are used to assess similarity of the rectangles.

• Its performance is better by an order of magnitude when compared with the competition.

• Rectangle clustering is a viable way to perform web page segmentation.


Keywords: Clustering, Segmentation, Vision-based page segmentation, VIPS

Review history: Received 5 May 2016, Revised 1 December 2016, Accepted 2 February 2017, Available online 16 February 2017, Version of Record 16 February 2017.

Official paper URL: https://doi.org/10.1016/j.ipm.2017.02.002