Genome-wide prediction of imprinted murine genes

Abstract
Imprinted genes are epigenetically modified genes whose expression is determined by their parent of origin. They are involved in embryonic development, and imprinting dysregulation is linked to cancer, obesity, diabetes, and behavioral disorders such as autism and bipolar disorder. Herein, we train a statistical model based on DNA sequence characteristics that not only identifies potentially imprinted genes, but also predicts the parental allele from which they are expressed. Of 23,788 annotated autosomal mouse genes, our model identifies 600 (2.5%) as potentially imprinted, 64% of which are predicted to exhibit maternal expression. These predictions allowed us to identify candidate genes for complex conditions in which parent-of-origin effects are involved, including Alzheimer disease, autism, bipolar disorder, diabetes, male sexual orientation, obesity, and schizophrenia. We observe that the number, type, and relative orientation of repeated elements flanking a gene are particularly important in predicting whether a gene is imprinted.