Association test using Copy Number Profile Curves (CONCUR) enhances power in rare copy number variant analysis
Open Access
- 4 May 2020
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 16 (5), e1007797
- https://doi.org/10.1371/journal.pcbi.1007797
Abstract
Copy number variants (CNVs) are the gain or loss of DNA segments in the genome that can vary in dosage and length. CNVs comprise a large proportion of variation in human genomes and impact health conditions. To detect rare CNV associations, kernel-based methods have been shown to be a powerful tool due to their flexibility in modeling the aggregate CNV effects, their ability to capture effects from different CNV features, and their accommodation of effect heterogeneity. To perform a kernel association test, a CNV locus needs to be defined so that locus-specific effects can be retained during aggregation. However, CNV loci are arbitrarily defined and different locus definitions can lead to different performance depending on the underlying effect patterns. In this work, we develop a new kernel-based test called CONCUR (i.e., copy number profile curve-based association test) that is free from a definition of locus and evaluates CNV-phenotype associations by comparing individuals’ copy number profiles across the genomic regions. CONCUR is built on the proposed concepts of “copy number profile curves” to describe the CNV profile of an individual, and the “common area under the curve (cAUC) kernel” to model the multi-feature CNV effects. The proposed method captures the effects of CNV dosage and length, accounts for the numerical nature of copy numbers, and accommodates between- and within-locus etiological heterogeneity without the need to define artificial CNV loci as required in current kernel methods. In a variety of simulation settings, CONCUR shows comparable or improved power over existing approaches. Real data analyses suggest that CONCUR is well powered to detect CNV effects in the Swedish Schizophrenia Study and the Taiwan Biobank.
Author summary
Copy number variants comprise a large proportion of variation in human genomes. Large rare CNVs, especially those disrupting genes or changing the dosages of genes, can carry relatively strong risks for neurodevelopmental and neuropsychiatric disorders. Kernel-based association methods have been developed for the analysis of rare CNVs and shown to be a valuable tool. Kernel methods model the collective effect of rare CNVs using flexible kernel functions that capture the characteristics of CNVs and measure CNV similarity of individual pairs. Typically kernels are created by summarizing similarity within an artificially defined “CNV locus” and then collapsing across all loci. In this work, we propose a new kernel-based test, CONCUR, that is based on the CNV location information contained in standard processing of the variants and which obviates the need for arbitrarily defined CNV loci. CONCUR quantifies similarity between individual pairs as the common area under their copy number profile curves and is designed to detect CNV dosage, length and dosage-length interaction effects. In simulation studies and real data analysis, we demonstrate the ability of the CONCUR test to detect CNV effects under diverse CNV architectures with power and robustness over existing methods.
Funding Information
- National Institutes of Health (P01CA142538)
- National Institutes of Health (P01CA142538-01)
- Ministry of Science and Technology, Taiwan (MOST-106-2314-B-002-134-MY2)
- Ministry of Science and Technology, Taiwan (MOST-104-2314-B-002-107-MY2)