A general framework for subjective information extraction from unstructured English text

Abstract

In this paper, we present an information extraction (IE) strategy for handling subjective information in unstructured text. The methodology is general: it applies to many real-life applications that could benefit from an automatic IE system that makes human-like decisions. We test it on the evaluation of company news with respect to the potential effect of the news on the company’s stock price. The framework comprises four sequential processing steps: part-of-speech tagging, syntactic parsing, relation generation, and criteria evaluation. The first two steps perform generic NLP tasks, while the last two are application-specific and require a thorough understanding of the application domain. We describe each step and illustrate how the steps fit together, using the company news evaluation example throughout the paper. Because the problem is inherently subjective, the results cannot be categorically justified. However, when we compared the system’s evaluations of company news with our own, the results were very encouraging.
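The four-step flow described in the abstract can be pictured with a minimal Python sketch. This is a toy stand-in under stated assumptions, not the paper's implementation: the lexicon lookup replaces a real part-of-speech tagger, the split around the first verb replaces a real syntactic parser, and every name here (`LEXICON`, `CRITERIA`, `pos_tag`, `parse`, `generate_relation`, `evaluate`, `evaluate_news`) and every word score is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon; a real system would use a trained POS tagger.
LEXICON: Dict[str, str] = {
    "acme": "NOUN", "profits": "NOUN", "losses": "NOUN", "lawsuit": "NOUN",
    "reports": "VERB", "faces": "VERB", "announces": "VERB",
    "record": "ADJ", "heavy": "ADJ", "a": "DET", "the": "DET",
}

# Hypothetical domain criteria: object head word -> assumed effect on stock price.
CRITERIA: Dict[str, int] = {"profits": 1, "losses": -1, "lawsuit": -1}


def pos_tag(sentence: str) -> List[Tuple[str, str]]:
    """Step 1 (generic): tag tokens by lexicon lookup; unknown words become NOUN."""
    tokens = [t.strip(".,;:!?") for t in sentence.split()]
    return [(t, LEXICON.get(t.lower(), "NOUN")) for t in tokens if t]


def parse(tagged: List[Tuple[str, str]]) -> Optional[Dict[str, List[str]]]:
    """Step 2 (generic): shallow parse that splits the sentence at the first verb."""
    for i, (token, tag) in enumerate(tagged):
        if tag == "VERB":
            return {
                "subject": [t for t, _ in tagged[:i]],
                "verb": [token],
                "object": [t for t, _ in tagged[i + 1:]],
            }
    return None  # no verb found, so no clause structure


def generate_relation(
    parsed: Optional[Dict[str, List[str]]]
) -> Optional[Tuple[str, str, str]]:
    """Step 3 (application-specific): build a (subject, verb, object) triple.

    The head of each phrase is assumed to be its last word.
    """
    if parsed is None or not parsed["subject"] or not parsed["object"]:
        return None
    return (
        parsed["subject"][-1].lower(),
        parsed["verb"][0].lower(),
        parsed["object"][-1].lower(),
    )


def evaluate(relation: Optional[Tuple[str, str, str]]) -> str:
    """Step 4 (application-specific): score the relation against the criteria."""
    if relation is None:
        return "neutral"
    score = CRITERIA.get(relation[2], 0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def evaluate_news(sentence: str) -> str:
    """Run the four steps in sequence on one news sentence."""
    return evaluate(generate_relation(parse(pos_tag(sentence))))


print(evaluate_news("Acme reports record profits"))  # positive
print(evaluate_news("Acme faces a lawsuit"))         # negative
```

The split mirrors the abstract's distinction: `pos_tag` and `parse` know nothing about finance, while `generate_relation` and `evaluate` carry the domain knowledge and would be the parts to replace for a different application.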

Keywords: Information extraction, Natural language processing, Text evaluation, Intelligent systems, Financial analysis

Article history: Received 21 March 2006, Revised 4 September 2006, Accepted 5 October 2006, Available online 7 November 2006.

Official URL: https://doi.org/10.1016/j.datak.2006.10.001