USING TRANSFER LEARNING FOR MALWARE CLASSIFICATION

Abstract
In this paper, we propose a malware classification framework using transfer learning based on existing Deep Learning models that have been pre-trained on massive image datasets. In recent years there has been a significant increase in the number and variety of malwares, which amplifies the need to improve automatic detection and classification of the malwares. Nowadays, neural network methodology has reached a level that may exceed the limits of previous machine learning methods, such as Hidden Markov Models and Support Vector Machines (SVM). As a result, convolutional neural networks (CNNs) have shown superior performance compared to traditional learning techniques, specifically in tasks such as image classification. Motivated by this success, we propose a CNN-based architecture for malware classification. The malicious binary files are represented as grayscale images and a deep neural network is trained by freezing the pre-trained VGG16 layers on the ImageNet dataset and adapting the last fully connected layer to the malware family classification. Our evaluation results show that our approach is able to achieve an average of 98% accuracy for the MALIMG dataset.