Fuzzy entropy clustering by searching local border points for the analysis of gene expression data

Abstract

Clustering data by identifying a subset of representative border points is essential for detecting dataset structures and processing complicated messages in data. Such border points can be found by exactly choosing an initial subset of data points and then iteratively refining it with fuzzy membership values; however, this works well only if the fuzzy rule choice is close to a good solution. We propose a clustering algorithm, known as affine fuzzy neuron (AFN), which measures the relations of linear memberships between border points and the cluster centers of points within a subset. Fuzzy memberships of border points are calculated until their corresponding clusters gradually emerge. AFN exploits a typical feature of real-life biological gene clusters, such as gene expression measurements from microarrays or single-cell RNA-Seq: genes sharing the same expression profile belong to different clusters with diverse functions. We hypothesize that these genes appear only at cluster boundaries and that a subset of genes comprises points with similar expression profiles but entirely different functions. We use AFN to cluster several test cases and analyze various gene expression datasets. The proposed clustering algorithm is a flexible modeling framework that increases accuracy and combines border regions across multiple clusters while remaining computationally feasible.

Keywords: Fuzzy clustering, Fuzzy entropy, Border points, Local density peaks, Gene expression data

Article history: Received 14 March 2019, Revised 26 November 2019, Accepted 28 November 2019, Available online 4 December 2019, Version of Record 7 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105309