Mixture Modeling for Genome‐Wide Localization of Transcription Factors
- 1 March 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 63 (1), 10-21
- https://doi.org/10.1111/j.1541-0420.2005.00659.x
Abstract
Chromatin immunoprecipitation followed by DNA microarray analysis (ChIP-chip methodology) is an efficient way of mapping genome-wide protein–DNA interactions. Data from tiling arrays encompass protein–DNA interaction measurements on thousands or millions of short oligonucleotides (probes) tiling a whole chromosome or genome. We propose a new model-based method for analyzing ChIP-chip data. The proposed model is motivated by the widely used two-component multinomial mixture model of de novo motif finding. It utilizes a hierarchical gamma mixture model of binding intensities while incorporating the inherent spatial structure of the data. In this model, each genomic region belongs to one of two general groups: regions with a local protein–DNA interaction (peak) and regions lacking this interaction. Individual probes within a genomic region are allowed to have different localization rates, accommodating different binding affinities. A novel feature of this model is the incorporation of a distribution for the peak size derived from the experimental design and parameters. This relaxes the fixed peak size assumption that is commonly employed when computing a test statistic for these types of spatial data. Simulation studies and a real data application demonstrate good operating characteristics of the method, including high sensitivity with small sample sizes when compared to available alternative methods.
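To illustrate the two-group idea in the abstract, the following is a minimal sketch of a flat two-component gamma mixture, which computes the posterior probability that a single probe intensity comes from the bound (peak) component. It is not the authors' hierarchical model: it ignores the spatial structure, the probe-specific localization rates, and the peak size distribution. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def gamma_pdf(x, shape, rate):
    """Density of a Gamma(shape, rate) distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    log_pdf = (shape * math.log(rate) - math.lgamma(shape)
               + (shape - 1) * math.log(x) - rate * x)
    return math.exp(log_pdf)


def posterior_bound(x, pi_bound, bound, unbound):
    """Posterior probability that intensity x comes from the bound component.

    pi_bound is the prior mixing proportion of the bound group;
    bound and unbound are (shape, rate) pairs for the two gamma components.
    """
    num = pi_bound * gamma_pdf(x, *bound)
    den = num + (1 - pi_bound) * gamma_pdf(x, *unbound)
    return num / den if den > 0 else 0.0


# Hypothetical parameters (assumptions, not estimates from the paper).
PI_BOUND = 0.05        # assumed fraction of probes in bound regions
BOUND = (8.0, 1.0)     # assumed gamma (shape, rate) for bound probes, mean 8
UNBOUND = (2.0, 1.0)   # assumed gamma (shape, rate) for unbound probes, mean 2

# Made-up probe intensities on an arbitrary positive scale.
intensities = [1.0, 2.5, 6.0, 12.0]
posteriors = [posterior_bound(x, PI_BOUND, BOUND, UNBOUND) for x in intensities]

for x, p in zip(intensities, posteriors):
    print(f"intensity={x:5.1f}  P(bound | x)={p:.4f}")
```

With these assumed parameters, higher intensities receive higher posterior probabilities of belonging to the bound group. In practice the mixture parameters would be estimated from the data rather than fixed by hand.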