Maximal repetitions in strings
作者:
Highlights:
•
摘要
The cornerstone of any algorithm computing all repetitions in strings of length n in O(n) time is the fact that the number of maximal repetitions (runs) is linear. Therefore, the most important part of the analysis of the running time of such algorithms is counting the number of runs. Kolpakov and Kucherov [R. Kolpakov, G. Kucherov, Finding maximal repetitions in a word in linear time, in: Proc. of FOCS'99, IEEE Computer Society Press, 1999, pp. 596–604] proved it to be cn but could not provide any value for c. Recently, Rytter [W. Rytter, The number of runs in a string: Improved analysis of the linear upper bound, in: B. Durand, W. Thomas (Eds.), Proc. of STACS'06, in: Lecture Notes in Comput. Sci., vol. 3884, Springer, Berlin, Heidelberg, 2006, pp. 184–195] proved that c⩽5. His analysis has been improved by Puglisi et al. to obtain 3.48 and by Rytter to 3.44 (both submitted). The conjecture of Kolpakov and Kucherov, supported by computations, is that c=1. Here we improve dramatically the previous results by proving that c⩽1.6 and show how it could be improved by computer verification down to 1.18 or less. While the conjecture may be very difficult to prove, we believe that our work provides a good approximation for all practical purposes.For the stronger result concerning the linearity of the sum of exponents, we give the first explicit bound: 5.6n. Kolpakov and Kucherov did not have any and Rytter considered “unsatisfactory” the bound that could be deduced from his proof. Our bound could be as well improved by computer verification down to 2.9n or less.
论文关键词:Combinatorics on words,Repetitions in strings,Runs,Maximal repetitions,Maximal periodicities,Sum of exponents
论文评审过程:Received 1 December 2006, Revised 14 September 2007, Available online 28 September 2007.
论文官网地址:https://doi.org/10.1016/j.jcss.2007.09.003