Membership Inference Attacks Against Machine Learning Models

1 May 2017

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 3-18
https://doi.org/10.1109/sp.2017.41

Abstract

We quantitatively investigate how machine learning models leak information about the individual data records on which they were trained. We focus on the basic membership inference attack: given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on. We empirically evaluate our inference techniques on classification models trained by commercial "machine learning as a service" providers such as Google and Amazon. Using realistic datasets and classification tasks, including a hospital discharge dataset whose membership is sensitive from the privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies.
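The attack described in the abstract can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration rather than the paper's actual setup: the data are synthetic 2-D points, `MemorizingModel` is a hypothetical toy classifier chosen because it overfits, and the "inference model" is reduced to a single learned threshold on the confidence the model assigns to a record's own label. The attacker's labelled training data comes from a stand-in model whose training set the attacker controls, which is one plausible way to obtain "in" and "out" examples.

```python
import math
import random


def sample_records(n, rng):
    """Draw n labelled 2-D points from two heavily overlapping classes."""
    records = []
    for _ in range(n):
        label = rng.randrange(2)
        center = 0.5 if label == 1 else -0.5
        point = (rng.gauss(center, 1.5), rng.gauss(center, 1.5))
        records.append((point, label))
    return records


class MemorizingModel:
    """Hypothetical toy classifier that overfits: a narrow-kernel vote over its training set."""

    def __init__(self, train, bandwidth=0.1):
        self.train = train
        self.bandwidth = bandwidth

    def predict_proba(self, point):
        votes = [1e-9, 1e-9]  # tiny prior so the output is always defined
        for (px, py), label in self.train:
            dist2 = (point[0] - px) ** 2 + (point[1] - py) ** 2
            votes[label] += math.exp(-dist2 / (2 * self.bandwidth ** 2))
        total = votes[0] + votes[1]
        return [votes[0] / total, votes[1] / total]


def true_label_confidence(model, record):
    """Black-box query: confidence the model assigns to the record's own label."""
    point, label = record
    return model.predict_proba(point)[label]


def fit_threshold(scores_in, scores_out):
    """Train the (one-parameter) attack: pick the cut-off that best separates in from out."""
    best_t, best_correct = 0.0, -1
    for t in sorted(set(scores_in + scores_out)):
        correct = sum(s >= t for s in scores_in) + sum(s < t for s in scores_out)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


rng = random.Random(0)

# Stand-in model: the attacker knows exactly which records it was trained on.
shadow_in, shadow_out = sample_records(200, rng), sample_records(200, rng)
shadow = MemorizingModel(shadow_in)
threshold = fit_threshold(
    [true_label_confidence(shadow, r) for r in shadow_in],
    [true_label_confidence(shadow, r) for r in shadow_out],
)

# Target model: the attacker only sees its prediction vectors.
target_in, target_out = sample_records(200, rng), sample_records(200, rng)
target = MemorizingModel(target_in)
hits = sum(true_label_confidence(target, r) >= threshold for r in target_in)
hits += sum(true_label_confidence(target, r) < threshold for r in target_out)
attack_accuracy = hits / (len(target_in) + len(target_out))
print(f"attack accuracy: {attack_accuracy:.2f} (0.50 is random guessing)")
```

The threshold is fitted only on the stand-in model's outputs and then applied unchanged to the target, so any accuracy above 0.50 reflects leakage through the target's predictions alone. A real inference model would typically consume the whole prediction vector rather than one confidence value; the threshold is used here only to keep the sketch short.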

Other Versions

Version 2, 2016-10-19, preprints

Cited by 1669 articles