An efficient set estimator in high dimensions: consistency and applications to fast data visualization

Authors:

Highlights:

Abstract

Visualizing data from a point set by estimating the underlying region is a problem of considerable practical interest and is closely related to set estimation. The most important issue in set estimation is consistency: a set estimator is consistent if it converges, in an appropriate sense, to the original set as the sample size increases. Only a few existing point pattern shape descriptors that estimate the underlying region are consistent set estimators. To be used as a shape descriptor, however, a set estimator should also satisfy several important criteria, such as correct identification of the number of components, robustness in the presence of noise, and computational efficiency. Here we propose a class of set estimators that meets these criteria, called s-shapes, which remain consistent in finite dimensions when the data are generated from any continuous distribution. These set estimators can be computed easily and used effectively for fast data visualization. Detailed studies of their performance, including error rates, robustness in the presence of noise, and run-time analysis, are also performed.

Keywords:

Review history: Received 4 September 2002, Accepted 17 October 2003, Available online 3 December 2003.

Paper URL: https://doi.org/10.1016/j.cviu.2003.10.002