Generalized regression model for sequence matching and clustering

Authors: Hansheng Lei, Venu Govindaraju

Abstract

Linear relations have proven valuable for rule discovery in stock data, for example: if stock X goes up by a, stock Y goes down by b. Traditional linear regression models the linear relation between two sequences faithfully. However, if a user needs to cluster stocks into groups whose sequences have high linearity or similarity with one another, comparing the sequences pair by pair is prohibitively expensive. In this paper, we present the generalized regression model (GRM), which matches the linearity of multiple sequences at a time. GRM also gives strong heuristic support for graceful and efficient clustering. In experiments on stocks in the NASDAQ market, it efficiently mined interesting clusters of stock trends.
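
To make the contrast between pairwise and multi-sequence matching concrete, here is a minimal Python sketch. It is not the paper's GRM formulation, which the abstract does not spell out. It assumes, based only on the keywords "eigenvalue and eigenvector", that joint linearity of several sequences can be scored by how much of their total variance lies along the leading eigenvector of their covariance matrix. The function names `pairwise_r2` and `multi_sequence_linearity` are hypothetical.

```python
import numpy as np


def pairwise_r2(x, y):
    """R^2 of the simple linear regression of y on x (squared Pearson correlation)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    return r * r


def multi_sequence_linearity(seqs):
    """Illustrative joint-linearity score for k sequences of equal length.

    Assumption (not taken from the paper): the score is the share of total
    variance carried by the largest eigenvalue of the k x k covariance matrix.
    It equals 1 when every sequence is an exact linear function of the others.
    """
    data = np.asarray(seqs, dtype=float)   # shape (k, n): one row per sequence
    cov = np.cov(data)                     # rows are treated as variables
    eigvals = np.linalg.eigvalsh(cov)      # symmetric matrix, ascending order
    total = eigvals.sum()
    if total <= 0:
        raise ValueError("sequences have no variance")
    return eigvals[-1] / total


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = -2.0 * x + 10.0                        # "X goes up by a, Y goes down by b"
z = 0.5 * x + 1.0
w = np.array([1.0, -1.0, 1.0, -1.0, 1.0])  # unrelated zig-zag

print(pairwise_r2(x, y))                    # one pair at a time
print(multi_sequence_linearity([x, y, z]))  # three sequences in one pass
print(multi_sequence_linearity([x, w]))     # noticeably lower
```

A single eigen-decomposition scores the whole group at once, whereas the pairwise approach needs one regression per pair. Note that this covariance-based score depends on the scale of each sequence, so sequences would typically be normalized first when they are not exactly linear.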

Keywords: Similarity measure, Sequence matching, Sequence clustering, Generalized regression model, Eigenvalue and eigenvector

Paper URL: https://doi.org/10.1007/s10115-006-0008-8