Long-term stock index forecasting based on text mining of regulatory disclosures

Authors:

Highlights:

• Financial news entails a prognostic capacity for stock market movements.

• We propose text mining for long-term predictions of stock indices.

• We reveal that financial disclosures have high predictive power in the long run.

• The text-based approach outperforms common benchmarks based on time series methodology.

Abstract

Share valuations are known to adjust to new information entering the market, such as regulatory disclosures. We study whether the language of such news items can improve short-term and especially long-term (24 months) forecasts of stock indices. For this purpose, this work utilizes predictive models suited to high-dimensional data and specifically compares techniques for data-driven and knowledge-driven dimensionality reduction in order to avoid overfitting. Our experiments, based on 75,927 ad hoc announcements from 1996–2016, show that, in the long run, text-based models reduce forecast errors below those of baseline predictions from historic lags at a statistically significant level. Our research has implications for business applications of decision support in financial markets, especially given the growing prevalence of index ETFs (exchange-traded funds).

Keywords: Text mining, Natural language processing, Financial news, Financial forecasting, Stock index, Predictive analytics

Review history: Received 31 December 2017, Revised 20 June 2018, Accepted 21 June 2018, Available online 30 June 2018, Version of Record 14 July 2018.

Official URL: https://doi.org/10.1016/j.dss.2018.06.008