Stochastic local search for the Feature Set problem, with applications to microarray data

Authors:

Abstract

We prove a $(m/\delta)^{O(\kappa)} \cdot n^{a}$ time bound for finding minimum solutions $S_{\min}$ of Feature Set problems, where $n$ is the total size of a given Feature Set problem, $\kappa \leq |S_{\min}|$, $m$ equals the number of non-target features, $a$ is a (relatively small) constant, and $1 - \delta$ is the confidence that the solution is of minimum length. In terms of parameterized complexity of NP-complete problems, our time bound differs from an FPT-type bound by the factor $m^{O(\kappa)}$ for fixed $\delta$. The algorithm is applied to a prominent microarray dataset: The classification of gene-expression data related to acute myeloid leukaemia (AML) and acute lymphoblastic leukaemia (ALL). From the set of potentially significant features calculated by the algorithm we can identify three genes (D88422, M92287, L09209) that produce zero errors on the test set by using a simple, straightforward evaluation procedure (performing the test on the single gene M84526 produces only one error).
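
The comparison with an FPT-type bound can be made explicit by splitting the first factor of the time bound. The step below is a short worked sketch, assuming the usual form of an FPT bound, $f(\kappa) \cdot n^{O(1)}$ for some function $f$ of the parameter alone (the symbol $f$ is introduced here for illustration and does not appear in the abstract):

```latex
\[
\left(\frac{m}{\delta}\right)^{O(\kappa)} \cdot n^{a}
  \;=\;
  m^{O(\kappa)} \cdot
  \underbrace{\left(\frac{1}{\delta}\right)^{O(\kappa)} \cdot n^{a}}_{\text{of the form } f(\kappa)\, n^{O(1)} \text{ for fixed } \delta}
\]
```

Since $m$ grows with the input, the factor $m^{O(\kappa)}$ cannot be absorbed into $f(\kappa)$, which is why the bound falls short of an FPT-type bound by exactly that factor.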

Keywords: Feature Set problem, Parameterized complexity, Stochastic local search, Simulated annealing, Gene-expression analysis, Microarrays

Review history: Available online 8 August 2006.

Paper URL: https://doi.org/10.1016/j.amc.2006.05.128