Data misrepresentation detection for insurance underwriting fraud prevention

Authors:

Highlights:

• We introduce a flexible approach for detecting insurance premium fraud based on validated self-reported information.

• Our approach estimates premium fraud based on the expected value of the self-reported information and its reported value.

• We present evidence of the effectiveness of the proposed approach for detecting premium fraud on real motor insurance data.

• We show how our model based on an orthonormal basis transformation of the self-reported variable(s) can be made explainable.

Abstract

Premium fraud concerns data misrepresentation committed by an insurance customer with the intent to benefit from an unduly low premium at the underwriting of a policy. In this paper, we propose a novel approach for evaluating the risk of underwriting premium fraud at the time of application in the presence of potentially misrepresented self-reported information. The aim of the approach is to support insurance companies in identifying fraudulent applications and in deciding whether to underwrite insurance contract propositions. Likewise, it can be used to make straight-through processing (i.e., automated) underwriting systems more fraud-proof, e.g., by triggering a validation on applications prone to misrepresentation. Our approach is based on conditional density estimates for a set of validated contracts. The proposed approach does not require historical fraud labels and can adapt to changes in pricing policy. Moreover, the approach can be used to detect outliers in addition to predicting underwriting fraud, and it is extended to multivariate self-reported data. We further demonstrate a link between Shapley values in common conditional expectation problems and conditional density estimation to make our approach explainable. We report a case study involving motor insurance underwriting, in which a driver's identity and driving record can be misrepresented to benefit from an unduly low premium; the results indicate the effectiveness of the proposed approach for detecting and preventing underwriting fraud.
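
As a rough illustration of scoring a reported value against a conditional density fitted on validated contracts, the sketch below may help. It is not the paper's model: it swaps in a simple Gaussian linear model for the conditional density, and the data, variable names, and helper functions (`fit_gaussian_conditional`, `upper_tail_probability`) are hypothetical. The idea shown is only that a reported value far in the favorable tail of the fitted density, given verified covariates, is a candidate for validation.

```python
import math
from statistics import mean


def fit_gaussian_conditional(xs, ys):
    """Fit y | x ~ Normal(a + b*x, sigma^2) by ordinary least squares.

    Assumption: a Gaussian linear model stands in for the conditional
    density estimator; the paper's own estimator is different.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Unbiased residual variance for a two-parameter linear fit.
    sigma = math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    return a, b, sigma


def upper_tail_probability(model, x, y_reported):
    """P(Y >= y_reported | x) under the fitted Gaussian conditional density."""
    a, b, sigma = model
    z = (y_reported - (a + b * x)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical validated contracts: (driver age, verified claim-free years).
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60]
claim_free_years = [1, 3, 6, 8, 12, 14, 17, 19, 23]

model = fit_gaussian_conditional(ages, claim_free_years)

# A reported value close to what validated contracts suggest for this age.
plausible = upper_tail_probability(model, x=30, y_reported=6)

# A reported value far more favorable than validated contracts suggest.
suspicious = upper_tail_probability(model, x=22, y_reported=20)

# Hypothetical decision rule: send very unlikely reports to manual validation.
needs_validation = suspicious < 0.01
```

A small tail probability does not by itself establish fraud; in this sketch it only marks an application whose self-reported value is unusual relative to validated contracts, which is also how the same score could flag outliers.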

Keywords: Insurance underwriting fraud, Premium fraud, Data misrepresentation, Machine learning, Nonlife insurance

Review history: Received 19 October 2021, Revised 12 April 2022, Accepted 13 April 2022, Available online 5 May 2022, Version of Record 10 June 2022.

Official paper URL: https://doi.org/10.1016/j.dss.2022.113798