Ambiguity-driven fuzzy C-means clustering: how to detect uncertain clustered records

Authors: Meysam Ghaffari, Nasser Ghadiri

Abstract

Fuzzy C-Means (FCM) is a well-known clustering algorithm that allows each input sample to belong to more than one cluster, which makes it more flexible than non-fuzzy clustering methods. However, the accuracy of FCM suffers from false detections caused by noisy records, weak feature selection, and, in some cases, low certainty of the algorithm itself. False detections matter most in decision-making domains such as network security and medical diagnosis, where weak decisions based on them may lead to catastrophic outcomes. They mainly arise when a decision is made about a subset of records that do not provide sufficient evidence for a good decision. In this paper, we propose a method for detecting such ambiguous records in FCM by introducing a certainty factor that reduces invalid detections. The detected ambiguous records can then be sent to another discrimination method for deeper investigation, which increases accuracy by lowering the error rate. Most records are still processed quickly and with a low error rate, which avoids the performance loss common in similar hybrid methods. Experimental results on several datasets from different domains show a significant decrease in error rate as well as improved sensitivity of the algorithm.
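The idea of routing low-certainty records to a second method can be illustrated with a minimal sketch. The abstract does not define the certainty factor, so the score used here (the gap between the two largest FCM memberships) and the names `certainty`, `split_records`, and `threshold` are assumptions chosen for illustration, not the authors' formulation. Only the membership formula is the standard FCM update for fixed cluster centers.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership of one point, given fixed cluster centers."""
    # Clamp distances so a point lying exactly on a center does not divide by zero.
    dists = [max(math.dist(point, c), 1e-12) for c in centers]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def certainty(memberships):
    """Hypothetical certainty score (an assumption, not the paper's definition):
    the gap between the two largest membership degrees."""
    top = sorted(memberships, reverse=True)
    return top[0] - top[1]

def split_records(points, centers, threshold=0.3):
    """Assign confident records to their best cluster; set aside ambiguous ones."""
    confident, ambiguous = [], []
    for p in points:
        u = fcm_memberships(p, centers)
        if certainty(u) < threshold:
            ambiguous.append(p)  # would be forwarded to a second discrimination method
        else:
            confident.append((p, u.index(max(u))))
    return confident, ambiguous

# Toy data: two centers and one point exactly halfway between them.
centers = [(0.0, 0.0), (4.0, 0.0)]
points = [(0.1, 0.0), (3.9, 0.1), (2.0, 0.0)]
confident, ambiguous = split_records(points, centers)
```

In this toy example the midpoint `(2.0, 0.0)` has equal membership in both clusters, so it is the only record set aside. The threshold value is arbitrary and would need tuning per dataset.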

Keywords: FCM clustering, Intrusion detection, Classification with ambiguity, Certainty factor, Location privacy, Fuzzy image segmentation

Paper URL: https://doi.org/10.1007/s10489-016-0759-1