Iterative Boolean combination of classifiers in the ROC space: An application to anomaly detection with HMMs

Authors:

Highlights:

Abstract

Hidden Markov models (HMMs) have been shown to provide a high level of performance for detecting anomalies in sequences of system calls to the operating system kernel. Using Boolean conjunction and disjunction functions to combine the responses of multiple HMMs in the ROC space may significantly improve performance over a “single best” HMM. However, these techniques assume that the classifiers are conditionally independent and that their ROC curves are convex. These assumptions are violated in most real-world applications, especially when classifiers are designed using limited and imbalanced training data. In this paper, the iterative Boolean combination (IBC) technique is proposed for efficient fusion of the responses from multiple classifiers in the ROC space. It applies all Boolean functions to combine the ROC curves corresponding to multiple classifiers, requires no prior assumptions, and has a time complexity that is linear in the number of classifiers. The results of computer simulations conducted on both synthetic and real-world host-based intrusion detection data indicate that the IBC of responses from multiple HMMs can achieve a significantly higher level of performance than the Boolean conjunction and disjunction combinations, especially when training data are limited and imbalanced. The proposed IBC is general in that it can be employed to combine diverse responses of any crisp or soft one- or two-class classifiers, and for a wide range of application domains.
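The idea of combining detector responses with Boolean functions in the ROC space can be illustrated with a small sketch. The Python code below is not the authors' IBC algorithm: it is a brute-force, single-pass illustration for two detectors only, and every name in it (`BOOLEAN_FUNCS`, `roc_point`, `combine_pair`, `non_dominated`) is hypothetical. It assumes that higher scores mean "anomalous", that labels are binary with at least one example of each class, and it uses only a subset of the two-input Boolean functions.

```python
import numpy as np

# Hypothetical subset of two-input Boolean functions (not the full set used by IBC).
BOOLEAN_FUNCS = {
    "a AND b": lambda a, b: a & b,
    "a OR b": lambda a, b: a | b,
    "a XOR b": lambda a, b: a ^ b,
    "a AND NOT b": lambda a, b: a & ~b,
    "NOT a AND b": lambda a, b: ~a & b,
}


def roc_point(decisions, labels):
    """Return (fpr, tpr) for boolean decisions against boolean labels."""
    tpr = (decisions & labels).sum() / labels.sum()
    fpr = (decisions & ~labels).sum() / (~labels).sum()
    return float(fpr), float(tpr)


def combine_pair(scores_a, scores_b, labels):
    """Try every threshold pair and Boolean function; return all ROC points."""
    scores_a = np.asarray(scores_a)
    scores_b = np.asarray(scores_b)
    labels = np.asarray(labels, dtype=bool)
    points = []
    for ta in np.unique(scores_a):
        da = scores_a >= ta  # crisp decisions of detector A at threshold ta
        for tb in np.unique(scores_b):
            db = scores_b >= tb  # crisp decisions of detector B at threshold tb
            for name, fn in BOOLEAN_FUNCS.items():
                fpr, tpr = roc_point(fn(da, db), labels)
                points.append((fpr, tpr, name, float(ta), float(tb)))
    return points


def non_dominated(points):
    """Keep (fpr, tpr) pairs that no other point beats on both axes."""
    kept = set()
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] >= p[1] and (q[0] < p[0] or q[1] > p[1])
            for q in points
        )
        if not dominated:
            kept.add((p[0], p[1]))
    return sorted(kept)


if __name__ == "__main__":
    # Toy data: each detector catches a different anomaly (label 1).
    labels = [1, 1, 0, 0]
    scores_a = [1, 0, 0, 0]
    scores_b = [0, 1, 0, 0]
    print(non_dominated(combine_pair(scores_a, scores_b, labels)))
```

In this toy case neither detector alone reaches a true positive rate of 1 without false alarms, while the disjunction of their thresholded decisions does. The sketch enumerates all threshold pairs, so its cost grows quadratically with the number of thresholds; it is meant only to show the kind of operating points that Boolean fusion can add to the ROC space.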

Keywords: Receiver operating characteristics, Combination of classifiers, Limited and imbalanced data, Hidden Markov models, Anomaly detection, Computer and network security

Article history: Received 5 September 2009, Revised 8 January 2010, Accepted 7 March 2010, Available online 12 March 2010.

Official paper URL: https://doi.org/10.1016/j.patcog.2010.03.006