Integration of Expectation Maximization using Gaussian Mixture Models and Naïve Bayes for Intrusion Detection

Abstract
Intrusion detection is the process of examining information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) implement the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naïve use of the mean data value as the cluster center presents a major drawback. Two circular clusters with different radii may be centered at the same mean, and K-means cannot separate them because the mean values of the clusters are nearly identical. K-means also fails when the clusters are not spherical. To overcome these issues, a new hybrid model that integrates expectation maximization (EM) clustering using a Gaussian mixture model (GMM) with a naïve Bayes classifier is proposed. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. Because a GMM uses probability functions and soft clustering, a single data point can belong to multiple clusters, each with its own probability. The form of each cluster in a GMM is defined by two parameters: the mean and the standard deviation (the covariance matrix in the multivariate case). With these two parameters, a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data, based on data activity, into the corresponding category.
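The case of two clusters that share a mean but differ in spread can be illustrated with a minimal sketch of EM for a one-dimensional, two-component GMM. This is not the implementation used in this work. The synthetic data, the initial parameters, the iteration count, and the helper names (`gaussian_pdf`, `e_step`, `m_step`, `fit_gmm`) are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def e_step(data, weights, means, stds):
    """E-step: soft assignment (responsibility) of each component for each point."""
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    return resp


def m_step(data, resp, min_std=1e-3):
    """M-step: re-estimate weights, means and standard deviations."""
    n, k = len(data), len(resp[0])
    weights, means, stds = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        weights.append(nj / n)
        means.append(mean)
        # Floor the standard deviation so a component cannot collapse to a point.
        stds.append(max(math.sqrt(var), min_std))
    return weights, means, stds


def fit_gmm(data, weights, means, stds, iterations=200):
    """Alternate E- and M-steps for a fixed number of iterations."""
    for _ in range(iterations):
        resp = e_step(data, weights, means, stds)
        weights, means, stds = m_step(data, resp)
    return weights, means, stds


# Synthetic data (assumed for illustration): two groups with the same mean (0)
# but different spreads, the situation that a mean-only cluster center
# cannot distinguish.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(500)]
        + [rng.gauss(0.0, 6.0) for _ in range(500)])

# Both components start at the same mean; only the initial spreads differ.
weights, means, stds = fit_gmm(data, [0.5, 0.5], [0.0, 0.0], [0.5, 3.0])

# Soft clustering: each point receives a probability for every cluster.
resp = e_step([0.2, 10.0], weights, means, stds)
print("weights:", weights)
print("means:", means)
print("stds:", stds)
print("responsibilities for x=0.2 and x=10.0:", resp)
```

In this sketch the two fitted components keep nearly the same mean, so a distance-to-center rule would treat them as one cluster. The fitted standard deviations differ, so a point near the center is mostly attributed to the narrow component and a distant point to the wide one. Each point still carries a nonzero probability for both components, which is the soft clustering behavior described above.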