A hybrid intrusion detection system (HIDS) based on prioritized k-nearest neighbors and optimized SVM classifiers

摘要

Intrusion Detection System (IDS) is an effective security tool that helps preventing unauthorized access to network resources through analyzing the network traffic. However, due to the large amount of data flowing over the network, effective real time intrusion detection is almost impossible. The goal of this paper is to design a Hybrid IDS (HIDS) that can be successfully employed in a real time manner and suitable for resolving the multi-class classification problem. HIDS relies on a Naïve Base feature selection (NBFS) technique, which is used to reduce the dimensionality of sample data. Moreover, HIDS has another pioneering issue that other techniques do not have, which is the outlier rejection. Outliers are noisy input samples that can lead to high rate of misclassification if they are applied for model training. Rejecting outliers has been accomplished through applying a distance based methodology to choose the most informative training examples, which are then used to train an Optimized Support Vector Machines (OSVM). Afterward, OSVM is employed for rejecting outliers. Finally, after outlier rejection, HIDS can successfully detect attacks through applying a Prioritized K-Nearest Neighbors (PKNN) classifier. Hence, HIDS is a triple edged strategy as it has three main contributions, which are: (i) NBFS, which has been employed for dimensionality reduction, (ii) OSVM, which is applied for outlier rejection, and (iii) PKNN, which is used for detecting input attacks. HIDS has been compared against recent techniques using three well-known intrusion detection datasets: KDD Cup ’99, NSL-KDD and Kyoto 2006+ datasets. HIDS has the ability to quickly detect attacks and accordingly can be employed for real time intrusion detection. Thanks to OSVM and PKNN, HIDS performed high detection rates specifically for the attacks which are rare such as R2L and U2R. PKNN is also suitable for resolving the multi-label classification problem.