Approximability of constrained LCS

作者:

Highlights:

摘要

The problem Constrained Longest Common Subsequence is a natural extension to the classical problem Longest Common Subsequence, and has important applications to bioinformatics. Given k input sequences A1,…,Ak and l constraint sequences B1,…,Bl, C-LCS(k,l) is the problem of finding a longest common subsequence of A1,…,Ak that is also a common supersequence of B1,…,Bl. Gotthilf et al. gave a polynomial-time algorithm that approximates C-LCS(k,1) within a factor mˆ|Σ|, where mˆ is the length of the shortest input sequence and |Σ| is the alphabet size. They asked whether there are better approximation algorithms and whether there exists a lower bound. In this paper, we answer their questions by showing that their approximation factor mˆ|Σ| is in fact already very close to optimal although a small improvement is still possible:1.For any computable function f and any ϵ>0, there is no polynomial-time algorithm that approximates C-LCS(k,1) within a factor f(|Σ|)⋅mˆ1/2−ϵ unless NP=P. Moreover, this holds even if the constraint sequence is unary.2.There is a polynomial-time randomized algorithm that approximates C-LCS(k,1) within a factor |Σ|⋅O(OPT⋅loglogOPT/logOPT) with high probability, where OPT is the length of the optimal solution, OPT⩽mˆ. For the problem over an alphabet of arbitrary size, we show that3.For any ϵ>0, there is no polynomial-time algorithm that approximates C-LCS(k,1) within a factor mˆ1−ϵ unless NP=P.4.There is a polynomial-time algorithm that approximates C-LCS(k,1) within a factor O(mˆ/logmˆ). We also present some complementary results on exact and parameterized algorithms for C-LCS(k,1).

论文关键词:Longest common subsequence,Computational biology,Computational complexity,Approximation algorithms

论文评审过程:Received 17 January 2011, Revised 25 September 2011, Accepted 10 October 2011, Available online 20 October 2011.

论文官网地址:https://doi.org/10.1016/j.jcss.2011.10.002