Building intelligent Web applications using lightweight wrappers

Abstract

The Web has so far been remarkably successful at delivering information to human users. So successful, in fact, that there is now an urgent need to go beyond the browsing human. Unfortunately, the Web is not yet a well-organized repository of nicely structured documents, but rather a conglomerate of volatile HTML pages. To address this problem, we present the World Wide Web Wrapper Factory (W4F), a toolkit for generating wrappers for Web sources that offers: (1) an expressive language to specify the extraction of complex structures from HTML pages; (2) a declarative mapping to various data formats such as XML; (3) visual tools that make the engineering of wrappers faster and easier.
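To make the idea of a wrapper concrete, the following is a minimal sketch of the two steps the abstract names: extracting structure from an HTML page and mapping the result to XML. It is not W4F code and does not use W4F's extraction language. It relies only on the Python standard library, and the sample page, the `ItemExtractor` class, and the `extract`, `to_xml`, and `wrap` functions are all hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample page (an assumption for illustration, not from the paper).
SAMPLE_HTML = """
<html><body>
<h1>Books</h1>
<ul>
  <li><b>Title A</b> <i>12.50</i></li>
  <li><b>Title B</b> <i>8.00</i></li>
</ul>
</body></html>
"""


class ItemExtractor(HTMLParser):
    """Extraction step: collect one record per <li>, reading <b> as title and <i> as price."""

    def __init__(self):
        super().__init__()
        self.items = []       # completed records
        self._current = None  # record being built while inside an <li>
        self._field = None    # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._current = {}
        elif self._current is not None and tag == "b":
            self._field = "title"
        elif self._current is not None and tag == "i":
            self._field = "price"

    def handle_data(self, data):
        # Text outside <b>/<i> (for example whitespace between tags) is ignored.
        if self._current is not None and self._field:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()

    def handle_endtag(self, tag):
        if tag in ("b", "i"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.items.append(self._current)
            self._current = None


def extract(html):
    """Parse the page and return a list of {'title': ..., 'price': ...} dicts."""
    parser = ItemExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


def to_xml(items):
    """Mapping step: serialize the extracted records as an XML string."""
    root = ET.Element("books")
    for item in items:
        book = ET.SubElement(root, "book")
        ET.SubElement(book, "title").text = item.get("title", "")
        ET.SubElement(book, "price").text = item.get("price", "")
    return ET.tostring(root, encoding="unicode")


def wrap(html):
    """The wrapper itself: extraction followed by mapping."""
    return to_xml(extract(html))
```

Calling `wrap(SAMPLE_HTML)` returns a single XML string with one `book` element per list item. The extraction rules here are hard-coded in Python, whereas the abstract describes specifying them declaratively, so this sketch illustrates only the overall shape of a wrapper.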

Keywords: Web, XML, Information extraction, Wrappers

Publication history: Available online 22 December 2000.

Paper URL: https://doi.org/10.1016/S0169-023X(00)00051-3