Abstract
Deep neural networks perform well in image recognition, speech recognition, and pattern analysis. However, they are vulnerable to backdoor attacks. A backdoor attack additionally trains the target model on backdoor samples containing a specific trigger, so that the model correctly classifies normal data without the trigger but misclassifies samples that contain it. Various studies on such backdoor attacks have been conducted, but existing attacks cause misclassification in only a single classifier. In some situations, it may be necessary to carry out a selective backdoor attack against a specific model in an environment with multiple models. In this paper, we propose a multi-model selective backdoor attack method that causes each model to misclassify samples into a different class depending on the position of the trigger. The experiments used MNIST and Fashion-MNIST as datasets and TensorFlow as the machine learning library. The results show that the proposed scheme achieves a 100% average attack success rate against each model while maintaining 97.1% and 90.9% accuracy on the original samples for MNIST and Fashion-MNIST, respectively.
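To illustrate the idea, the sketch below poisons MNIST training data with a position-dependent trigger: each model is assigned its own trigger position and target class, so a trigger stamped at that position flips that model's prediction while clean inputs remain correctly classified. This is a minimal illustration, not the paper's implementation; the helper names (add_trigger, build_model), the trigger positions and target classes in configs, the poisoning rate, and the network architecture are all assumptions made for the example.

```python
import numpy as np
import tensorflow as tf

def add_trigger(images, position, size=3):
    """Stamp a small white square trigger at (row, col) = position."""
    poisoned = images.copy()
    r, c = position
    poisoned[:, r:r + size, c:c + size] = 1.0
    return poisoned

def build_model():
    """A small classifier; the paper's exact architecture is not given here."""
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0

# Hypothetical per-model assignment: each model gets its own trigger
# position and the class it should misclassify triggered samples into.
configs = {
    "model_A": {"position": (0, 0), "target": 7},    # top-left trigger
    "model_B": {"position": (25, 25), "target": 3},  # bottom-right trigger
}

poison_fraction = 0.1  # assumed poisoning rate; not stated in the abstract
n_poison = int(len(x_train) * poison_fraction)

for name, cfg in configs.items():
    idx = np.random.choice(len(x_train), n_poison, replace=False)
    x_bd = add_trigger(x_train[idx], cfg["position"])
    y_bd = np.full(n_poison, cfg["target"], dtype=y_train.dtype)

    # Train on clean data plus this model's backdoor samples, so clean
    # inputs stay correctly classified while triggered inputs flip class.
    x_mix = np.concatenate([x_train, x_bd])
    y_mix = np.concatenate([y_train, y_bd])

    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_mix, y_mix, epochs=5, verbose=0)
```

Because each model only ever sees backdoor samples with its own trigger position, a sample triggered at model A's position leaves model B's behavior unchanged, which is what makes the attack selective across models.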
