The Computational Complexity of Densest Region Detection

作者:

Highlights:

摘要

We investigate the computational complexity of the task of detecting dense regions of an unknown distribution from unlabeled samples of this distribution. We introduce a formal learning model for this task that uses a hypothesis class as it “anti-overfitting” mechanism. The learning task in our model can be reduced to a combinatorial optimization problem. We can show that for some constants, depending on the hypothesis class, these problems are NP-hard to approximate to within these constant factors. We go on and introduce a new criterion for the success of approximate optimization geometric problems. The new criterion requires that the algorithm competes with hypotheses only on the points that are separated by some margin μ from their boundaries. Quite surprisingly, we discover that for each of the two hypothesis classes that we investigate, there is a “critical value” of the margin parameter μ. For any value below the critical value the problems are NP-hard to approximate, while, once this value is exceeded, the problems become poly-time solvable.

论文关键词:

论文评审过程:Received 28 September 2000, Revised 6 March 2001, Available online 25 May 2002.

论文官网地址:https://doi.org/10.1006/jcss.2001.1797