Malware classification with LSTM and GRU language models and a character-level CNN

1 March 2017

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 2482-2486
https://doi.org/10.1109/icassp.2017.7952603

Abstract

Malicious software, or malware, continues to be a problem for computer users, corporations, and governments. Previous research [1] has explored training file-based malware classifiers using a two-stage approach. In the first stage, a malware language model learns the feature representation, which is then input to a second-stage malware classifier. In Pascanu et al. [1], the language model is either a standard recurrent neural network (RNN) or an echo state network (ESN). In this work, we propose several new malware classification architectures, which include a long short-term memory (LSTM) language model and a gated recurrent unit (GRU) language model. We also propose using an attention mechanism similar to [12] from the machine translation literature, in addition to the temporal max pooling used in [1], as an alternative way to construct the file representation from neural features. Finally, we propose a new single-stage malware classifier based on a character-level convolutional neural network (CNN). Results show that the LSTM with temporal max pooling and logistic regression offers a 31.3% improvement in the true positive rate compared to the best system in [1] at a false positive rate of 1%.
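
The two ways of building a file representation named in the abstract can be illustrated with a minimal sketch. It assumes the language model has already produced one hidden-state vector per time step. The function names, the toy values, the dot-product attention scoring with a single query vector, and the logistic regression weights are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math

def temporal_max_pool(hidden_states):
    """Element-wise max over time: T vectors of size H -> one vector of size H."""
    return [max(column) for column in zip(*hidden_states)]

def attention_pool(hidden_states, query):
    """Softmax-weighted average of the hidden states over time.

    Scoring each step by a dot product with one query vector is an
    assumption; the paper only says the mechanism is similar to [12].
    """
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    size = len(hidden_states[0])
    return [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(size)]

def logistic_score(features, weights, bias):
    """Second-stage classifier: logistic regression on the pooled features."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy stand-in for language-model hidden states (3 time steps, 3 units).
hidden_states = [
    [0.1, -0.5, 0.3],
    [0.9, 0.2, -0.4],
    [-0.2, 0.7, 0.0],
]

pooled_max = temporal_max_pool(hidden_states)
# A zero query gives uniform attention weights, so the result is the mean over time.
pooled_att = attention_pool(hidden_states, query=[0.0, 0.0, 0.0])
# Hypothetical classifier weights; in practice these would be learned.
probability = logistic_score(pooled_max, weights=[1.0, 1.0, 1.0], bias=0.0)
```

Max pooling keeps the strongest activation of each unit anywhere in the sequence, while attention pooling lets a learned scoring function decide how much each time step contributes.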

This publication has 12 references indexed in Scilit:

MtNet: A Multi-Task Neural Network for Dynamic Malware Classification
Lecture Notes in Computer Science, 2016
Visualized Malware Classification Based-on Convolutional Neural Network
Journal of the Korea Institute of Information Security and Cryptology, 2016
Deep neural network based malware detection using two dimensional binary program features
Published by Institute of Electrical and Electronics Engineers (IEEE), 2015
Malware classification with recurrent networks
Published by Institute of Electrical and Electronics Engineers (IEEE), 2015
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
Published by Association for Computational Linguistics (ACL), 2014
Large-scale malware classification using random projections and neural networks
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Advances in optimizing recurrent networks
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication
Science, 2004
Data mining methods for detection of new malicious executables
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
Long Short-Term Memory
Neural Computation, 1997

Cited by 137 articles