Reinterpreting the Category Utility Function

Author: Boris Mirkin

Abstract

The category utility function is a partition quality scoring function used in some machine-learning clustering programs. We reinterpret this function in terms of the data variance explained by a clustering or, equivalently, in terms of the classical square-error clustering criterion that underlies the K-Means and Ward methods. This analysis suggests extensions of the scoring function to situations with differently standardized and mixed-scale data.
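The link between category utility and explained variance can be checked numerically on a toy example. The sketch below is illustrative only and rests on two assumptions not stated in the abstract: category utility is taken in its commonly cited form, (1/K) Σ_k P(C_k) Σ_i Σ_j [P(A_i = v_ij | C_k)² − P(A_i = v_ij)²], and each nominal feature is coded as unstandardized 0/1 columns, one per category. The function names, rows, and labels are hypothetical. Under these assumptions, the between-cluster part of the square-error criterion, divided by the number of objects, should equal K times the category utility.

```python
from collections import Counter


def category_utility(rows, labels):
    """Category utility of a partition over nominal features (assumed common form)."""
    n = len(rows)
    n_features = len(rows[0])
    clusters = sorted(set(labels))
    # Sum of squared marginal category probabilities over all features.
    base = 0.0
    for f in range(n_features):
        counts = Counter(row[f] for row in rows)
        base += sum((c / n) ** 2 for c in counts.values())
    total = 0.0
    for k in clusters:
        members = [row for row, lab in zip(rows, labels) if lab == k]
        n_k = len(members)
        # Sum of squared within-cluster category probabilities.
        within = 0.0
        for f in range(n_features):
            counts = Counter(row[f] for row in members)
            within += sum((c / n_k) ** 2 for c in counts.values())
        total += (n_k / n) * (within - base)
    return total / len(clusters)


def explained_scatter(rows, labels):
    """Between-cluster scatter of 0/1-coded categories, per object."""
    n = len(rows)
    # One unstandardized 0/1 column per (feature index, category) pair.
    columns = sorted({(f, v) for row in rows for f, v in enumerate(row)})

    def mean_vector(members):
        return [
            sum(1 for row in members if row[f] == v) / len(members)
            for f, v in columns
        ]

    grand = mean_vector(rows)
    between = 0.0
    for k in set(labels):
        members = [row for row, lab in zip(rows, labels) if lab == k]
        centroid = mean_vector(members)
        between += len(members) * sum(
            (c - g) ** 2 for c, g in zip(centroid, grand)
        )
    return between / n


# Hypothetical toy data: two nominal features, two clusters.
rows = [("red", "small"), ("red", "small"), ("blue", "large"), ("blue", "small")]
labels = [0, 0, 1, 1]

k = len(set(labels))
cu = category_utility(rows, labels)
explained = explained_scatter(rows, labels)
print(k * cu, explained)
```

On this toy data the two printed values coincide. The sketch covers only the unstandardized nominal case; the differently standardized and mixed-scale extensions mentioned in the abstract are not reproduced here.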

Keywords: clustering, data standardization, contingency coefficient, correlation ratio, weighting features, mixed-scale data

Paper URL: https://doi.org/10.1023/A:1010924920739