Functionally relevant microsatellites in sugarcane unigenes

Abstract
Background: Unigene sequences constitute a rich source of functionally relevant microsatellites. The present study was undertaken to mine the microsatellites in the available unigene sequences of sugarcane for understanding their constitution in the expressed genic component of its complex polyploid/aneuploid genome, assessing their functional significance in silico, determining the extent of allelic diversity at the microsatellite loci and for evaluating their utility in large-scale genotyping applications in sugarcane. Results: The average frequency of perfect microsatellite was 1/10.9 kb, while it was 1/44.3 kb for the long and hypervariable class I repeats. GC-rich trinucleotides coding for alanine and the GA-rich dinucleotides were the most abundant microsatellite classes. Out of 15,594 unigenes mined in the study, 767 contained microsatellite repeats and for 672 of these putative functions were determined in silico. The microsatellite repeats were found in the functional domains of proteins encoded by 364 unigenes. Its significance was assessed by establishing the structure-function relationship for the beta-amylase and protein kinase encoding unigenes having repeats in the catalytic domains. A total of 726 allelic variants (7.42 alleles per locus) with different repeat lengths were captured precisely for a set of 47 fluorescent dye labeled primers in 36 sugarcane genotypes and five cereal species using the automated fragment analysis system, which suggested the utility of designed primers for rapid, large-scale and high-throughput genotyping applications in sugarcane. Pair-wise similarity ranging from 0.33 to 0.84 with an average of 0.40 revealed a broad genetic base of the Indian varieties in respect of functionally relevant regions of the large and complex sugarcane genome. Conclusion: Microsatellite repeats were present in 4.92% of sugarcane unigenes, for most (87.6%) of which functions were determined in silico. High level of allelic diversity in repeats including those present in the functional domains of proteins encoded by the unigenes demonstrated their use in assay of useful variation in the genic component of complex polyploid sugarcane genome.