Deep learning for detecting financial statement fraud

Authors:

Highlights:

• Combining financial and text data enhances the detection of fraudulent financial statements.

• HAN, GPT-2, ANN and XGB detect financial misstatements based on textual cues.

• Novel NLP techniques capture the content and context of MD&As.

• Interpretability offered with “red-flag” sentences in the MD&As of annual reports.

• The proposed models provide decision support for stakeholders.

Abstract

Financial statement fraud is an area of significant consternation for potential investors, auditing companies, and state regulators. The paper proposes an approach for detecting statement fraud through the combination of information from financial ratios and managerial comments within corporate annual reports. We employ a hierarchical attention network (HAN) to extract text features from the Management Discussion and Analysis (MD&A) section of annual reports. The model is designed to offer two distinct features. First, it reflects the structured hierarchy of documents, which previous approaches were unable to capture. Second, the model embodies two different attention mechanisms at the word and sentence level, which allows content to be differentiated in terms of its importance in the process of constructing the document representation. As a result of its architecture, the model captures both content and context of managerial comments, which serve as supplementary predictors to financial ratios in the detection of fraudulent reporting. Additionally, the model provides interpretable indicators denoted as “red-flag” sentences, which assist stakeholders in their process of determining whether further investigation of a specific annual report is required. Empirical results demonstrate that textual features of MD&A sections extracted by HAN yield promising classification results and substantially reinforce financial ratios.
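The two-level attention idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes word vectors are already given, omits the recurrent encoders and learned projections that a full HAN uses, and replaces the trained context vectors with fixed toy values. All names (`attention_pool`, `encode_document`, `toy_document`) are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Score each vector against a context vector, then return the
    attention-weighted sum together with the attention weights."""
    scores = [
        sum(math.tanh(v_k) * c_k for v_k, c_k in zip(v, context))
        for v in vectors
    ]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """Word-level attention builds sentence vectors; sentence-level
    attention builds the document vector."""
    sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
    return attention_pool(sentence_vectors, sentence_context)


# Toy input (assumed values): 2 sentences, each a list of 2-dimensional word vectors.
toy_document = [
    [[0.1, 0.2], [0.0, 0.1]],
    [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9]],
]
word_context = [1.0, 1.0]      # stands in for a learned word-level context vector
sentence_context = [1.0, 1.0]  # stands in for a learned sentence-level context vector

doc_vector, sentence_weights = encode_document(
    toy_document, word_context, sentence_context
)

# The sentence with the largest attention weight plays the role of a "red-flag" candidate.
red_flag_index = max(range(len(sentence_weights)), key=sentence_weights.__getitem__)
print(red_flag_index, sentence_weights)
```

In this sketch the sentence-level weights are the interpretable part: ranking sentences by weight gives the kind of "red-flag" list the abstract mentions, while `doc_vector` is the text feature that would be combined with financial ratios in a downstream classifier.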

Keywords: Fraud detection, Financial statements, Deep learning, Text analytics

Article history: Received 5 May 2020, Revised 3 September 2020, Accepted 30 September 2020, Available online 10 October 2020, Version of Record 6 November 2020.

Official paper URL: https://doi.org/10.1016/j.dss.2020.113421