How an epileptic EEG segment, used as reference, can influence a cross-correlation classifier?

Abstract

Several neurological disorders, such as epilepsy, can be diagnosed with the electroencephalogram (EEG). Data mining supported by machine learning (ML) techniques can be used to find patterns in EEG data and to build classifiers from them. To make this possible, the data must be represented in an appropriate format, e.g. an attribute-value table, which can be built by feature extraction approaches such as the cross-correlation (CC) method, which takes one signal as reference and correlates it with the other signals. However, the reference is commonly selected at random and, to the best of our knowledge, no studies have evaluated whether this choice affects the performance of ML methods. This work therefore aims to verify whether the choice of an epileptic EEG segment as reference affects the performance of classifiers built from the data. In addition, a CC with artificial reference (CCAR) method is proposed to reduce the possible consequences of randomly selecting a signal as reference. Two experimental evaluations were conducted on a set of 200 EEG segments to induce classifiers using ML algorithms such as J48, 1NN, naive Bayes, BP-MLP, and SMO. In the first study, each epileptic EEG segment was selected in turn as reference for applying the CC and ML methods. The evaluation found an extremely significant difference, evidence that the choice of an EEG segment as reference can influence the performance of ML methods. In the second study, the CCAR method was applied; statistical tests showed less favorable results only in comparisons involving the SMO classifier.
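To make the CC step concrete, the following is a minimal sketch of how an attribute-value table might be built by correlating one reference segment with every other segment. It is an illustration under stated assumptions, not the procedure used in this work: the function names `cc_features` and `build_table` are hypothetical, and the three summary attributes (peak value, lag of the peak, and energy of the cross-correlation sequence) are chosen only for the example.

```python
import numpy as np


def cc_features(reference, signal):
    """Cross-correlate `signal` with `reference` and summarise the result.

    The three attributes returned here are illustrative assumptions,
    not the feature set of the original study.
    """
    ref = np.asarray(reference, dtype=float)
    sig = np.asarray(signal, dtype=float)

    # Full cross-correlation: one value per possible lag.
    cc = np.correlate(sig, ref, mode="full")
    # Lag axis matching np.correlate's "full" output order.
    lags = np.arange(-(len(ref) - 1), len(sig))

    peak_idx = int(np.argmax(np.abs(cc)))
    peak_value = float(cc[peak_idx])   # strongest correlation
    peak_lag = int(lags[peak_idx])     # lag at which it occurs
    energy = float(np.sum(cc ** 2))    # overall correlation energy
    return peak_value, peak_lag, energy


def build_table(reference, signals):
    """Return one row of CC attributes per signal (attribute-value table)."""
    return np.array([cc_features(reference, s) for s in signals])


if __name__ == "__main__":
    # Synthetic stand-ins for EEG segments (assumption: equal length).
    rng = np.random.default_rng(0)
    segments = rng.standard_normal((5, 256))
    reference = segments[0]  # a different choice yields a different table
    table = build_table(reference, segments[1:])
    print(table.shape)
```

Because every row of the table depends on `reference`, choosing a different segment changes all attribute values at once, which is why the choice of reference can plausibly affect the classifiers induced from the table.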