Seeds of SEED: NMT-Stroke: Diverting Neural Machine Translation through Hardware-based Faults

Abstract
The rapid development of deep learning has significantly bolstered the performance of natural language processing (NLP) in the form of language modeling. Recent advances in hardware security studies have demonstrated that hardware-based threats can severely jeopardize the integrity of computing systems (e.g., fault attacks for data at rest). Internal adversaries exploiting such hardware vulnerabilities are becoming a major security concern. Yet the impact of hardware faults on systems running NLP models has not been fully understood.In this paper, we perform the first investigation of hardware-based fault injections in modern neural machine translation (NMT) models. We find that compared to neural network classifiers (e.g., CNNs), fault attacks on NMT models present unique challenges. We propose a novel attack framework–NMT-Stroke–that can maliciously divert the translation of a victim NMT model by modeling memory fault injections with the rowhammer attack vector. We design a fault injection strategy to minimize bit flips needed, which would mislead the translation to an arbitrary natural output sentence. Our evaluation on state-of-the-art Transformer-based NMT models shows that NMT-Stroke can effectively induce the attacker-desired and linguistically sound translation by faulting minimal parameter bits. Our work highlights the significance of understanding the robustness of emerging NLP models with the presence of hardware vulnerabilities, which could lead to future new research directions.

This publication has 26 references indexed in Scilit: