Structure discovery in medical databases: a conceptual clustering approach

Authors:

Abstract

Clustering is an important data analysis tool for discovering structure in data sets. Although research on conceptual clustering has produced algorithms with significant advantages over earlier numerical ones, existing methods remain limited in their applicability to biomedical domains. In this paper we describe ADAGIO, a conceptual clustering algorithm that combines a low-cost preordering process with a breadth-first incremental control strategy incorporating merging and splitting operators. Experimental evaluation indicated that the algorithm achieves a good balance between structure discovery performance and computational efficiency, and demonstrated the comparative effectiveness of its process for handling missing information. ADAGIO can handle qualitative, quantitative and mixed-type data. An example application to a cancer domain is given, in which the algorithm suggested interesting epidemiological interpretations.

Keywords: Unsupervised learning from databases, Conceptual clustering, Order bias in incremental clustering, Missing values handling, Clustering algorithms' evaluation

Review history: Received 2 November 1995, Accepted 1 April 1996, Available online 23 March 1999.

Official paper URL: https://doi.org/10.1016/S0933-3657(96)00353-3