An approach to mining the multi-relational imbalanced database

Abstract

Class imbalance is an important issue in classification, one of the core tasks of data mining. In applications such as fraudulent telephone call detection, telecommunications management, and rare diagnoses, users are more interested in the minority class than in the majority class. Although many algorithms have been proposed to address the imbalance problem, they cannot be applied directly to a multi-relational database. Yet much of today's data, such as financial transactions and medical histories, is stored in multi-relational databases rather than in a single table. Conversely, the widely used multi-relational classification approaches, such as TILDE, FOIL, and CrossMine, do not account for class imbalance. In this paper, we propose a multi-relational g-mean decision tree algorithm to address the imbalance problem in a multi-relational database. Our experiments show that the approach mines a multi-relational imbalanced database more accurately.
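
The abstract does not define the g-mean, so the following is only a minimal sketch, assuming the definition commonly used in the imbalanced-learning literature: the geometric mean of sensitivity (true positive rate) and specificity (true negative rate). It illustrates why plain accuracy can be misleading when the minority class is rare. The function names and toy labels are hypothetical and are not taken from the paper.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    # Guard against classes that are absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data: 5 minority (positive) examples and 95 majority (negative) examples.
y_true = [1] * 5 + [0] * 95

# A classifier that always predicts the majority class.
y_all_negative = [0] * 100

# A classifier that finds 4 of the 5 positives at the cost of 10 false alarms.
y_balanced = [1, 1, 1, 1, 0] + [1] * 10 + [0] * 85

acc_all_negative = accuracy(y_true, y_all_negative)
gm_all_negative = g_mean(y_true, y_all_negative)
acc_balanced = accuracy(y_true, y_balanced)
gm_balanced = g_mean(y_true, y_balanced)
```

In this toy setting the always-negative classifier has the higher accuracy but a g-mean of zero, because it never identifies a minority example. The second classifier has lower accuracy but a much higher g-mean, which is the kind of behavior a g-mean-driven criterion is meant to reward.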

Keywords: Data mining, Classification, Imbalance, Relational database

Publication history: Available online 30 June 2007.

Official URL: https://doi.org/10.1016/j.eswa.2007.05.048