On processing continuous frequent K-N-match queries for dynamic data over networked data sources

作者:Shih-Chuan Chiu, Jiun-Long Huang, Jen-He Huang

摘要

Similarity search is one of the critical issues in many applications. When using all attributes of objects to determine their similarity, most prior similarity search algorithms are easily influenced by a few attributes with high dissimilarity. The frequent k-n-match query is proposed to overcome the above problem. However, the prior algorithm to process frequent k-n-match queries is designed for static data, whose attributes are fixed, and is not suitable for dynamic data. Thus, we propose in this paper two schemes to process continuous frequent k-n-match queries over dynamic data. First, the concept of safe region is proposed and four formulae are devised to compute safe regions. Then, scheme CFKNMatchAD-C is developed to speed up the process of continuous frequent k-n-match queries by utilizing safe regions to avoid unnecessary query re-evaluations. To reduce the amount of data transmitted by networked data sources, scheme CFKNMatchAD-C also uses safe regions to eliminate transmissions of unnecessary data updates which will not affect the results of queries. Moreover, for large-scale environments, we further propose scheme CFKNMatchAD-D by extending scheme CFKMatchAD-C to employ multiple servers to process continuous frequent k-n-match queries. Experimental results show that scheme CFKNMatchAD-C and scheme CFKNMatchAD-D outperform the prior algorithm in terms of average response time and the amount of produced network traffic.

论文关键词:Similarity search, k-n-match problem, Continuous query, Dynamic data

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10115-011-0413-5