On data mining, compression, and Kolmogorov complexity

Authors: Christos Faloutsos, Vasileios Megalooikonomou

Abstract

Will we ever have a theory of data mining analogous to the relational algebra in databases? Why do we have so many clearly different clustering algorithms? Could data mining be automated? We show that the answer to all these questions is negative, because data mining is closely related to compression and Kolmogorov complexity; and the latter is undecidable. Therefore, data mining will always be an art, where our goal will be to find better models (patterns) that fit our datasets as well as possible.
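
The link between patterns and compression can be illustrated with a small sketch. The length of a losslessly compressed dataset is a computable upper bound on its Kolmogorov complexity (up to a constant for the decompressor), while the exact complexity cannot be computed in general. The example below uses zlib as a stand-in compressor; the helper name `compressed_size` and both toy datasets are assumptions made for illustration and do not come from the paper.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed form of data.

    This is only an upper bound on the description length: a better
    model (compressor) of the data could always yield a shorter one.
    """
    return len(zlib.compress(data, 9))


# A highly regular dataset: one short pattern repeated many times.
regular = b"ab" * 5000

# A dataset with no structure that zlib can exploit: random bytes.
random_like = os.urandom(10000)

regular_size = compressed_size(regular)
random_size = compressed_size(random_like)

print(f"regular: {len(regular)} bytes -> {regular_size} bytes")
print(f"random:  {len(random_like)} bytes -> {random_size} bytes")
```

Both inputs are 10,000 bytes long, but the regular one shrinks to a small fraction of its size while the random one barely shrinks at all. In this reading, a pattern found by a mining algorithm is whatever lets the data be described more briefly, and no single compressor is guaranteed to find the shortest description.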

Keywords: Data mining, Compression, Kolmogorov complexity, Clustering, Classification, Forecasting, Outliers

Paper URL: https://doi.org/10.1007/s10618-006-0057-3