Random CapsNet forest model for imbalanced malware type classification task
Malware behavior varies by malware type, which in turn affects the strategies of system protection software. Many malware classification models, powered by machine learning and/or deep learning, achieve high accuracy in predicting malware types. Machine learning-based models require heavy feature engineering, which greatly affects their performance. Deep learning-based models, on the other hand, require less feature engineering effort. However, components of traditional deep learning architectures, such as max and average pooling, make the architecture more complex and the models more sensitive to data. Capsule network architectures reduce these complexities by eliminating the pooling components. Additionally, models based on capsule network architectures are less sensitive to data than classical convolutional neural network architectures. This paper proposes an ensemble capsule network model based on the bootstrap aggregating technique. The proposed method is tested on two widely used, highly imbalanced datasets (Malimg and BIG2015), for which state-of-the-art results are well known and can be used for comparison. The proposed model achieves the highest F-Score, 0.9820, on the BIG2015 dataset and an F-Score of 0.9661 on the Malimg dataset. Our model also reaches the state of the art while using 99.7% fewer trainable parameters than the best model in the literature.
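The bootstrap aggregating (bagging) idea behind the ensemble can be illustrated with a minimal sketch. It is not the implementation used in this work: the base learner `CentroidClassifier` is a hypothetical nearest-centroid stand-in for a capsule network, and the names `fit_bagging` and `predict_bagging`, the number of estimators, and the rule of averaging class probabilities are all assumptions made for illustration.

```python
import numpy as np


class CentroidClassifier:
    """Hypothetical stand-in for a CapsNet base learner (nearest class centroid)."""

    def fit(self, X, y, n_classes):
        # Classes absent from a bootstrap sample keep an infinite centroid,
        # so they receive zero probability at prediction time.
        self.centroids = np.full((n_classes, X.shape[1]), np.inf)
        for c in np.unique(y):
            self.centroids[c] = X[y == c].mean(axis=0)
        return self

    def predict_proba(self, X):
        # Euclidean distance from every sample to every class centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        # Softmax over negative distances, shifted for numerical stability.
        scores = np.exp(-(d - d.min(axis=1, keepdims=True)))
        return scores / scores.sum(axis=1, keepdims=True)


def fit_bagging(X, y, n_estimators=5, seed=0):
    """Train each base learner on a bootstrap resample of the training set."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    models = []
    for _ in range(n_estimators):
        # Draw len(X) indices with replacement (one bootstrap sample).
        idx = rng.integers(0, len(X), size=len(X))
        models.append(CentroidClassifier().fit(X[idx], y[idx], n_classes))
    return models


def predict_bagging(models, X):
    """Aggregate by averaging class probabilities, then take the argmax."""
    probs = np.mean([m.predict_proba(X) for m in models], axis=0)
    return probs.argmax(axis=1)


# Toy imbalanced data: 20 samples of class 0, 4 samples of class 1.
noise = np.random.default_rng(1).normal(scale=0.5, size=(24, 2))
X_train = noise + np.vstack([np.zeros((20, 2)), np.full((4, 2), 10.0)])
y_train = np.array([0] * 20 + [1] * 4)

models = fit_bagging(X_train, y_train)
predictions = predict_bagging(models, np.array([[0.0, 0.0], [10.0, 10.0]]))
```

Because each learner sees a different resample, a minority class may be missing from some bootstrap samples. Averaging over the ensemble reduces the effect of any single unlucky resample, which is one reason bagging is a natural fit for imbalanced data.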