Applying the CRISP-DM data mining process in the financial services industry: Elicitation of adaptation requirements

Authors:

Highlights:

Abstract

Data mining techniques have gained widespread adoption over the past decades, particularly in the financial services domain. To achieve sustained benefits from these techniques, organizations have adopted standardized processes for managing data mining projects, most notably CRISP-DM. Research has shown that these standardized processes are often not used as prescribed; instead, they are extended and adapted to address a variety of requirements. To improve the understanding of how standardized data mining processes are extended and adapted in practice, this paper reports on a case study in a financial services organization, aimed at identifying perceived gaps in the CRISP-DM process and characterizing how CRISP-DM is adapted to address these gaps. The case study was conducted based on documentation from a portfolio of data mining projects, complemented by semi-structured interviews with project participants. The results reveal 18 perceived gaps in CRISP-DM, alongside their perceived impact and the mechanisms employed to address them. The identified gaps are grouped into six categories. They were then triangulated and augmented with gaps discovered in other studies. Finally, the requirements for adapting CRISP-DM to address the gaps were derived, and directions for potential adaptations were outlined. The study presents a two-fold contribution. First, it provides practitioners with a structured set of gaps to be considered when applying CRISP-DM, or similar processes, in the financial services sector. Second, it elicits the requirements and sketches potential solutions to address these gaps. In addition, a number of the identified gaps are generic and applicable to other sectors with similar concerns (e.g., privacy), such as telecom or e-commerce.

Keywords: Data mining, CRISP-DM, Case study, Requirements

Article history: Received 14 October 2021, Revised 30 January 2022, Accepted 26 March 2022, Available online 1 April 2022, Version of Record 23 April 2022.

Official paper URL: https://doi.org/10.1016/j.datak.2022.102013