The pattern of insertion/deletion polymorphism in Arabidopsis thaliana

Abstract
Little is known about variation in nucleotide insertions/deletions (indels) within species. In Arabidopsis thaliana, we investigated indel polymorphism patterns between two genome sequences and among 96 accessions at 1215 loci. Our study identified patterns in the variation of indel density, size, GC content and distribution, and a correlation between indels and substitutions. We found that GC content was lower in indel sequences than in non-indel sequences and that indels typically occur in regions with lower GC content. Patterns of indel frequency distribution among populations were more consistent with neutral expectation than substitution patterns were. We also found that the local level of substitutions is positively correlated with indel density and negatively correlated with distance to the closest indel, suggesting that indels play an important role in nucleotide variation.