Scanning World Wide Web documents with the vector space model

Authors:

Abstract

The vector space model used in information retrieval is combined with discriminant analysis to build an automated system that scans the WWW environment for signals of interest to an organization. The vector space model converts text-based information into numerical vectors, which then serve as inputs to the discriminant analysis. We illustrate the methodology with news articles about a predefined, randomly selected set of stocks, testing whether the articles provide predictive signals of two kinds: whether a stock's return will rise or fall relative to the market in the target period following the report, and whether the stock's trading volume will increase or decrease.
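
The two-step pipeline described in the abstract (text to vectors, then vectors to a discriminant rule) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the headlines and labels are invented toy data, the TF-IDF weighting with smoothed IDF is one common choice the abstract does not specify, and the classifier is a diagonal-covariance linear discriminant with equal priors, a simplification of full discriminant analysis. All function names (`fit_tfidf`, `vectorize`, `fit_discriminant`, `predict`) are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; a real system would also strip punctuation and stop words.
    return text.lower().split()

def fit_tfidf(docs):
    # Learn the vocabulary and a smoothed inverse document frequency per term.
    n = len(docs)
    df = Counter(w for d in docs for w in set(tokenize(d)))
    vocab = sorted(df)
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    return vocab, idf

def vectorize(text, vocab, idf):
    # Map a document to an L2-normalized TF-IDF vector; unseen words are ignored.
    counts = Counter(tokenize(text))
    vec = [counts[w] * idf[w] for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def fit_discriminant(X, y, ridge=0.01):
    # Two-class linear discriminant assuming a diagonal pooled covariance
    # and equal priors (an assumption made here to keep the sketch short).
    dims = range(len(X[0]))
    groups = {c: [x for x, label in zip(X, y) if label == c] for c in (0, 1)}
    mean = {c: [sum(x[j] for x in rows) / len(rows) for j in dims]
            for c, rows in groups.items()}
    dof = max(len(X) - 2, 1)
    var = [sum((x[j] - mean[c][j]) ** 2 for c, rows in groups.items() for x in rows)
           / dof + ridge for j in dims]
    weights = [(mean[1][j] - mean[0][j]) / var[j] for j in dims]
    bias = -0.5 * sum(w * (mean[1][j] + mean[0][j]) for j, w in enumerate(weights))
    return weights, bias

def predict(text, vocab, idf, weights, bias):
    # Positive discriminant score -> class 1, otherwise class 0.
    x = vectorize(text, vocab, idf)
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0

# Invented toy headlines; label 1 = return rises relative to market, 0 = falls.
DOCS = [
    "profit surges on strong demand",
    "record profit and strong sales growth",
    "loss widens on weak demand",
    "weak sales and profit warning issued",
]
LABELS = [1, 1, 0, 0]

vocab, idf = fit_tfidf(DOCS)
X = [vectorize(d, vocab, idf) for d in DOCS]
weights, bias = fit_discriminant(X, LABELS)

print(predict("strong profit growth", vocab, idf, weights, bias))
print(predict("weak demand loss", vocab, idf, weights, bias))
```

The same structure would apply to the trading-volume signal by swapping in a different set of labels. With realistic vocabulary sizes, a full pooled covariance matrix and a dimensionality-reduction step would likely be needed before the discriminant analysis.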

Keywords: Multivariate statistics, Discriminant analysis, Environmental scanning, Decision support systems, Vector space model, Text classification

Publication history: Available online 26 April 2005.

Paper URL: https://doi.org/10.1016/j.dss.2005.03.002