The Complete Genome Sequence and Comparative Genome Analysis of the High Pathogenicity Yersinia enterocolitica Strain 8081

Abstract
The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype O:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost, or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus.
Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common themes in the genome evolution of other human enteropathogens.

The goal of this study was to catalogue all the genes encoded within the Y. enterocolitica genome to help us better understand how this bacterium and related bacteria cause different diseases. There are currently genome sequences (complete gene catalogues) available for two other members of this bacterial lineage, which cause dramatically different diseases. Y. pseudotuberculosis, like Y. enterocolitica, is a gut pathogen (enteropathogen) causing gastroenteritis in humans and animals. Yersinia pestis mostly resides within blood (circulating or in fleas following blood meals) and lymph tissue. It causes bubonic plague in humans and animals, and is historically known as “The Black Death.” A three-way comparison of these genomes revealed a patchwork of genes we have defined as being species- or disease-specific and genes that are common to all three Yersinia species. This has provided us with important information on the shared gene functions that define the two enteropathogenic yersinias and on those that differentiate them. This will help us to connect what we know about the Y. enterocolitica lifestyle within the gut to the disease it causes and its genetic makeup. We have also provided further evidence of gene loss by Y. pestis as it has evolved from Y. pseudotuberculosis into a more acute systemic pathogen. Similar patterns of gene loss are seen in other important pathogens such as Salmonella enterica serovar Typhi.