Fast anytime retrieval with confidence in large-scale temporal case bases

作者:

Highlights:

摘要

This work is about speeding up retrieval in Case-Based Reasoning (CBR) for large-scale case bases (CBs) comprised of temporally related cases in metric spaces. A typical example is a CB of electronic health records where consecutive sessions of a patient forms a sequence of related cases. k-Nearest Neighbors (kNN) search is a widely used algorithm in CBR retrieval. However, brute-force kNN is impossible for large CBs. As a contribution to efforts for speeding up kNN search, we introduce an anytime kNN search methodology and algorithm. Anytime Lazy kNN finds exact kNNs when allowed to run to completion with remarkable gain in execution time by avoiding unnecessary neighbor assessments. For applications where the gain in exact kNN search may not suffice, it can be interrupted earlier and it returns best-so-far kNNs together with a confidence value attached to each neighbor. We describe the algorithm and methodology to construct a probabilistic model that we use both to estimate confidence upon interruption and to automatize the interruption at desired confidence thresholds. We present the results of experiments conducted with publicly available datasets. The results show superior gains compared to brute-force search. We reach to an average gain of 87.18% with 0.98 confidence and to 96.84% with 0.70 confidence.

论文关键词:Large-scale case-based reasoning,Exact and approximate k-nearest neighbor search,Anytime algorithms

论文评审过程:Received 30 January 2020, Revised 15 May 2020, Accepted 4 August 2020, Available online 12 August 2020, Version of Record 17 August 2020.

论文官网地址:https://doi.org/10.1016/j.knosys.2020.106374