Keywords: cyberattacks, cybersecurity, cyberattack prediction, expectation maximization, deep learning, PCA
Cyberattacks are among the most damaging security threats on the Internet today. As new computing paradigms emerge, new vulnerabilities and flaws are discovered daily. Malicious users consistently exploit these vulnerabilities to stage cyberattacks that erode the confidentiality, integrity and availability of critical data and other computing resources. Recent research has focused on signature-based and anomaly detection approaches. However, reliance on known attack signatures and profiles has made attack prediction elusive and cumbersome. Task-specific algorithms have caused further setbacks in cyberattack prediction, hence the need for new approaches that exploit the learning of data representations. This paper therefore presents a hybrid model that combines Principal Component Analysis (PCA) and Expectation Maximization (EM) for intelligent clustering with a supervised Deep Neural Network (DNN) that is trained to make predictions on attack data. In the hybrid model, PCA and the EM algorithm perform dimensionality reduction and clustering of the attack data during the unsupervised pre-training stage of model building. The output of the unsupervised pre-training is fed into the DNN for supervised training, in which rectified linear units (ReLUs) in the hidden layers are used to generate a cascade of concepts for making accurate predictions on the modeled dataset. For experimentation, we use a Python test bed to assess the performance of the model and report its accuracy, false positive rate, precision, recall, F-measure and entropy. The results show 99.8% accuracy in predicting the modeled attack types.
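The reported evaluation measures can be illustrated with a minimal sketch in plain Python. It assumes the standard confusion-matrix definitions for a single attack class treated as the positive class, and it assumes that entropy means the Shannon entropy of the class labels within a cluster; the abstract does not state which definitions were used. The function names `binary_metrics` and `label_entropy` and the example counts are hypothetical and are not results from the experiments.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures for one positive class.

    Assumption: textbook definitions; the paper's exact formulas are not given.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / total,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "precision": precision,
        "recall": recall,
        # F-measure is the harmonic mean of precision and recall.
        "f_measure": 2 * precision * recall / denom if denom else 0.0,
    }


def label_entropy(class_counts):
    """Shannon entropy (in bits) of the class labels inside one cluster.

    Assumption: entropy is used as a cluster-purity measure; 0 means the
    cluster contains a single class.
    """
    total = sum(class_counts)
    return sum(-(c / total) * math.log2(c / total) for c in class_counts if c > 0)


# Hypothetical counts, chosen only to exercise the functions.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
mixed_cluster = label_entropy([50, 50])
pure_cluster = label_entropy([100, 0])
```

With these made-up counts, accuracy is 185/200 and the false positive rate is 5/100; a cluster split evenly between two classes has an entropy of 1 bit, while a cluster holding a single class has an entropy of 0.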