Similarity join on time series by utilizing a dynamic segmentation index

作者:Jinhua Wang, Qiuhong Li, Zhongsheng Li, Peng Wang, Yang Wang, Wei Wang, Ningting Pan, Mingmin Chi

摘要

Similarity join on time series databases is an essential operation for data analysis applications. Due to the curse of dimensionality, it is not suitable to use traditional index techniques, such as R-tree and kd-tree. In the paper, a dynamic segment index (i.e., DSTree) is utilized to reduce the huge comparison cost on the similarity join on time series databases. However, the DSTree is designed for similarity search and only supports bound estimations between a time series and a batch of time series in a DSTree node. To make the DSTree suitable for the similarity join on time series databases, it is necessary to have tight bounds for the nodes to achieve a better pruning power, where the biggest challenge is that the DSTree nodes may have different segmentations. To solve the problem aforementioned, a segmentation alignment and synopsis evaluation method is proposed to support the estimation of DSTree nodes to significantly reduce the time cost by pruning unnecessary comparisons. Moreover, to make our approach I/O efficient, a caching strategy is proposed by taking advantage of both graph partitioning and the locality of the DSTree index. The efficiency and effectiveness of the proposed approaches are verified by experiments on real-life datasets.

论文关键词:Time series, Similarity join, Lower bound

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10115-018-1317-4