Identifying Personal Genomes by Surname Inference
- 18 January 2013
- journal article
- research article
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 339 (6117), 321-324
- https://doi.org/10.1126/science.1229566
Abstract
Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.