Isolation-based conditional anomaly detection on mixed-attribute data to uncover workers' compensation fraud

Authors:

Highlights:

• We propose iForestCAD for isolation-based conditional anomaly detection.

• Our method can handle mixed-type attributes.

• It thereby integrates valuable expert knowledge, producing meaningful anomaly scores.

• It enables the detection of “hidden anomalies,” which are of interest to domain experts.

• We apply iForestCAD for fraud detection on real-world workers' compensation claims.

Abstract

The development of new data analytical methods remains a crucial factor in the fight against insurance fraud. Methods rooted in the research field of anomaly detection are considered promising candidates for this purpose. A fraud data set commonly contains both numeric and nominal attributes; because nominal attributes are easy to express, they often encode valuable expert knowledge. For this reason, an anomaly detection method should be able to handle a mixture of data types and return an anomaly score that is meaningful in the context of the business application.

Keywords: Workers' compensation insurance fraud, Fraud detection, Conditional anomaly detection, Isolation forest

Article history: Received 12 November 2017, Revised 17 April 2018, Accepted 17 April 2018, Available online 22 April 2018, Version of Record 14 June 2018.

Official paper URL: https://doi.org/10.1016/j.dss.2018.04.001