Focused crawling enhanced by CBP–SLC
Authors:
Highlights:
• A heuristic-based approach, CBP–SLC, is presented for enhancing focused crawling.
• A weighted voting classifier using the TFIPNDF feature-weighting approach is built.
• 1-DNFC identifies more reliable negative documents from the set of unlabeled examples.
Keywords: Focused crawling, DOM tree, TFIPNDF, CBP–SLC, WVC, Tunneling
Review history: Received 1 January 2013, Revised 24 May 2013, Accepted 13 June 2013, Available online 11 July 2013.
Paper URL: https://doi.org/10.1016/j.knosys.2013.06.008