A tool for producing structured interoperable data from product features on the web

Authors:

Highlights:

• A tool producing structured data from product features on the web is introduced.

• This is the first Protégé plug-in that extracts product features from web pages.

• Extracting information from complex, data-intensive web sites is partially handled.

• The user creates a template manually using a domain-specific language.

• The output consists of GoodRelations snippets containing product features in RDFa/Microdata.
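To make the last highlight concrete, the following is a minimal sketch of what turning extracted product features into a GoodRelations snippet in RDFa might look like. It is not the plug-in's actual code or output format: the function name `offering_to_rdfa`, its parameters, and the sample product values are hypothetical, and the markup is deliberately simplified. Only the GoodRelations namespace and the terms `gr:Offering`, `gr:name`, `gr:description`, `gr:hasPriceSpecification`, `gr:UnitPriceSpecification`, `gr:hasCurrency`, and `gr:hasCurrencyValue` come from the published vocabulary.

```python
from html import escape

# Namespace of the GoodRelations vocabulary.
GR_NS = "http://purl.org/goodrelations/v1#"


def offering_to_rdfa(name, description, price, currency):
    """Render one extracted product as a GoodRelations offering in RDFa.

    Hypothetical helper for illustration; the field set is an assumption.
    """
    lines = [
        f'<div prefix="gr: {GR_NS}" typeof="gr:Offering">',
        # Text values are HTML-escaped so scraped content cannot break the markup.
        f'  <span property="gr:name">{escape(name)}</span>',
        f'  <span property="gr:description">{escape(description)}</span>',
        '  <div rel="gr:hasPriceSpecification" typeof="gr:UnitPriceSpecification">',
        f'    <span property="gr:hasCurrency" content="{escape(currency)}"></span>',
        f'    <span property="gr:hasCurrencyValue" content="{price:.2f}"></span>',
        "  </div>",
        "</div>",
    ]
    return "\n".join(lines)


# Sample values are invented for the example.
snippet = offering_to_rdfa("Laptop X & Co", "15-inch notebook", 799.0, "USD")
print(snippet)
```

A snippet of this kind could be pasted into a product page so that search engines and other consumers can read the offer as structured data. A Microdata variant would follow the same shape with `itemscope`, `itemtype`, and `itemprop` attributes instead.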


Keywords: Information extraction, GoodRelations, Protégé, Web scraping, Ontology, Rich snippets

Review history: Received 11 March 2015, Accepted 7 September 2015, Available online 25 September 2015, Version of Record 14 October 2015.

Official paper URL: https://doi.org/10.1016/j.is.2015.09.002