Unsupervised pattern recognition of mixed data structures with numerical and categorical features using a mixture regression modelling framework

Authors:

Highlights:

• Cluster analysis of mixed-feature data poses challenges for mixture modelling.

• Comorbid-condition groups inform potential shared biologic processes among diseases.

• Individuals with heterogeneous comorbidity patterns show different risk features.

• Regression models improve clustering results by adjustment of relevant risk factors.

• This method is applicable to more general mixed data, via consensus clustering.

Keywords: Mixture model, Mixed feature, Cluster analysis, Comorbidity, Generalised Bernoulli distribution

Review history: Received 2 March 2018, Revised 18 July 2018, Accepted 17 November 2018, Available online 20 November 2018, Version of Record 27 November 2018.

Official paper URL: https://doi.org/10.1016/j.patcog.2018.11.022