Stock portfolio selection balancing variance and tail risk via stock vector representation acquired from price data and texts

Authors:

Highlights:

Abstract

Recent work on portfolio selection has reported ways to incorporate textual data in addition to price movements. Prices, texts, and the events underlying them take heterogeneous data forms and have therefore been processed without any consistent mathematical formulation. In this article, we propose to generalize portfolio selection by representing all related objects (stocks, news, events) in an embedding vector space, which we call a NEws-STock space with Event Distribution (NESTED). A NESTED forms an inner product vector space (Hilbert space), in which texts and stocks are represented as vectors (embeddings) acquired through a distribution of events. We first theoretically reformulate Markowitz’s portfolio optimization problem on NESTED and show how the new formulation has the potential to better incorporate tail risk, which is better represented in textual data. One typical method to acquire such embeddings is neural computing. Our experimental results, obtained by applying it to 24 news-price datasets across three markets, showed that the Pareto exponent of the negative tail of the generated portfolios increased in all markets, which is evidence that the stock embeddings captured the tail risks. Our method showed a large improvement in balancing tail risk and non-tail risk, with up to 45.5% larger gain and a 59.4% larger information ratio.
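The two quantities the abstract relies on, a Markowitz-style minimum-variance portfolio computed from stock vectors and the Pareto exponent of the negative return tail, can be illustrated with a minimal sketch. This is not the NESTED formulation itself, which the abstract does not spell out. It assumes that the Gram matrix of the stock embeddings is used in place of the covariance matrix, and it estimates the tail exponent with the standard Hill estimator. The function names, the `ridge` parameter, and the cutoff `k` are hypothetical choices made for illustration.

```python
import numpy as np


def min_variance_weights(embeddings, ridge=1e-6):
    """Closed-form minimum-variance weights.

    Assumption: the Gram matrix of the stock embeddings stands in for the
    covariance matrix of returns (an illustrative choice, not the paper's).
    """
    E = np.asarray(embeddings, dtype=float)
    gram = E @ E.T                       # inner products between stock vectors
    gram += ridge * np.eye(len(E))       # small ridge keeps the matrix invertible
    ones = np.ones(len(E))
    raw = np.linalg.solve(gram, ones)    # proportional to gram^{-1} 1
    return raw / raw.sum()               # normalise so the weights sum to 1


def hill_tail_exponent(returns, k):
    """Hill estimator of the Pareto exponent of the negative (loss) tail,
    computed from the k largest losses."""
    losses = np.sort(-np.asarray(returns, dtype=float))[::-1]  # largest loss first
    top = losses[: k + 1]
    if len(top) < k + 1 or top[-1] <= 0:
        raise ValueError("need at least k+1 strictly positive losses")
    return k / np.sum(np.log(top[:k] / top[k]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_embeddings = rng.normal(size=(5, 16))   # 5 hypothetical stocks, 16 dimensions
    print(min_variance_weights(toy_embeddings))

    toy_returns = rng.standard_t(df=3, size=5000) * 0.01  # heavy-tailed toy returns
    print(hill_tail_exponent(toy_returns, k=250))
```

A larger estimated exponent corresponds to a thinner loss tail, which is the direction of change the abstract reports for the generated portfolios. In practice the estimate is sensitive to `k`, so it is usually inspected over a range of cutoffs.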

Keywords: Portfolio optimization, Mean–variance minimization, News text, Stock embedding, Neural network, Tail risk

Article history: Received 9 December 2020, Revised 7 October 2021, Accepted 24 April 2022, Available online 30 April 2022, Version of Record 14 May 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2022.108917