A two-step string-matching procedure

作者:

Highlights:

摘要

String-matching algorithms which will be used as part of a knowledge-based retrieval system for multimedia documents, consisting of the graphical images of corporate trademarks, trademark words or phrases, and representations of companies and products, are discussed. Let LLCS(A, B) denote the length of the longest common subsequence of two strings A and B. We describe an improvement to the Wagner-Fischer Algorithm called WF+ for finding the LLCS which runs in linear space and quadratic time. An approximate algorithm for finding the LLCS, called the sorted string approximate algorithm (SSAA), is then presented, which runs in linear time but works on strings in sorted order only. The SSAA and WF+ are then combined into a two-step process which retrieves strings similar to a given string in near-linear time. Empirical data are presented in support of the estimates of average running time for the algorithms.

论文关键词:String-matching algorithms,Information retrieval

论文评审过程:Received 3 January 1991, Revised 19 December 1991, Available online 19 May 2003.

论文官网地址:https://doi.org/10.1016/0031-3203(91)90038-7