Integration of an imbalance framework with novel high-generalizable classifiers for radiomics-based distant metastases prediction of advanced nasopharyngeal carcinoma

Authors:

Highlights:

Abstract

Model overfitting and data imbalance are two main challenges in radiomics studies. In this study, we develop a highly generalizable classifier, MERGE (Multi-kErnel Regression with Graph Embedding), and an imbalance framework, SUS (Sensitivity-based Under-Sampling), to address these two challenges. First, we integrate a class compactness graph into multi-kernel regression so that samples from the same class stay close together when they are transformed to the label space. In the class compactness graph, each pair of samples from the same class is linked by an undirected weighted edge that captures the relationship between the two samples. Keeping same-class samples as close as possible in this way substantially reduces model overfitting. Second, to make use of potentially informative data, we propose SUS, a sensitivity-based under-sampling ensemble framework for imbalanced data. In each ensemble procedure, the majority class is organized into blocks by clustering the samples according to their sensitivity, as computed by MERGE, and each block is then randomly under-sampled in a concordant and self-paced manner. We collect data from 100 patients with advanced nasopharyngeal carcinoma from Hong Kong Queen Elizabeth Hospital and use the proposed SUS-MERGE framework to predict distant metastasis from radiomics features extracted from tumor subregions of different image modalities. Experimental results show promising performance compared with benchmark methods.
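The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian edge weight, the `gamma` parameter, and both function names are assumptions made for illustration. The block step also swaps the paper's clustering for simple score-ordered equal-size blocks, takes the sensitivity scores as a given array rather than computing them with MERGE, and samples uniformly from each block instead of modelling the concordant, self-paced schedule.

```python
import numpy as np


def class_compactness_laplacian(X, y, gamma=1.0):
    """Return the Laplacian L = D - W of a graph that links same-class samples.

    The Gaussian edge weight is an assumption; the abstract only says that
    edges are undirected and weighted.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-gamma * sq_dists)
    # Keep only edges between samples that share a label.
    W *= (y[:, None] == y[None, :])
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W


def blockwise_undersample(scores, n_keep, n_blocks=3, seed=0):
    """Pick about n_keep majority-class indices, spread evenly over blocks.

    Blocks are formed by sorting the scores and splitting them into
    n_blocks groups (a stand-in for clustering). Assumes there are at
    least n_blocks samples.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(scores)
    blocks = np.array_split(order, n_blocks)
    per_block = n_keep // n_blocks
    picked = [
        rng.choice(block, size=min(per_block, len(block)), replace=False)
        for block in blocks
    ]
    return np.sort(np.concatenate(picked))


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
    y = np.array([0, 0, 1, 1])
    print(class_compactness_laplacian(X, y))

    # Hypothetical sensitivity scores for 12 majority-class samples.
    scores = np.linspace(0.0, 1.0, 12)
    print(blockwise_undersample(scores, n_keep=6))
```

For a label-space projection F, a penalty of the form trace(Fᵀ L F) would be small when same-class samples map close together, which is one common way a graph of this kind enters a regression objective. Whether MERGE uses exactly this form is not stated in the abstract.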

Keywords: Overfitting, Imbalance classification, Class compactness graph, Sensitivity-based under-sampling, Radiomics-based prediction

Article history: Received 20 December 2020, Revised 24 August 2021, Accepted 30 September 2021, Available online 29 October 2021, Version of Record 2 November 2021.

Official article URL: https://doi.org/10.1016/j.knosys.2021.107649