A Random Decision Tree Framework for Privacy-Preserving Data Mining

1 October 2013

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Dependable and Secure Computing

Vol. 11 (5), 399-411
https://doi.org/10.1109/tdsc.2013.43

Abstract

Distributed data is ubiquitous in modern information-driven applications. With multiple sources of data, the natural challenge is to determine how to collaborate effectively across proprietary organizational boundaries while maximizing the utility of collected information. Since using only local data gives suboptimal utility, techniques for privacy-preserving collaborative knowledge discovery must be developed. Existing cryptography-based work on privacy-preserving data mining is still too slow to be effective on the large-scale data sets that today's big data challenge involves. Previous work on random decision trees (RDTs) shows that it is possible to generate equivalent and accurate models at much lower cost. We exploit the fact that RDTs fit naturally into a parallel and fully distributed architecture, and develop protocols to implement privacy-preserving RDTs that enable general and efficient distributed privacy-preserving knowledge discovery.

Cited by 84 articles