Phylogenomics of 8,839 Clostridioides difficile genomes reveals recombination-driven evolution and diversification of toxin A and B

Author summary
Clostridioides difficile is a major worldwide cause of antibiotic-associated gastrointestinal infection. Two toxins (TcdA and TcdB) are responsible for C. difficile pathogenicity, but genetic variants within these toxins complicate the development of broad-spectrum diagnostics, therapeutics and vaccines. Here we provide a global classification and analysis of available C. difficile toxin sequences and introduce a new open online database (diffbase.uwaterloo.ca) to serve the unmet needs of the clinical and research community. Our analysis partitions TcdA and TcdB genes into 7 and 12 distinct groups, respectively, which provides a new method for sequence-based C. difficile toxin subtyping. It also reveals that recombination has driven extensive diversification of TcdB in particular, resulting in TcdB subtypes with distinct antigenic, functional, and phenotypic properties. As validation of our method, we rapidly subtyped a new dataset of 351 clinical strains from Brigham and Women's Hospital and predicted their phenotypic and clinical features. Lastly, based on sequence analysis, we identified conserved regions in TcdB that represent ideal targets for the development of universal C. difficile therapeutics and diagnostics.

Abstract
Clostridioides difficile is the major worldwide cause of antibiotic-associated gastrointestinal infection. A pathogenicity locus (PaLoc) encoding one or two homologous toxins, toxin A (TcdA) and toxin B (TcdB), is essential for C. difficile pathogenicity. However, toxin sequence variation poses major challenges for the development of diagnostic assays, therapeutics, and vaccines. Here, we present a comprehensive phylogenomic analysis of 8,839 C. difficile strains and their toxins, including 6,492 genomes that we assembled from the NCBI short read archive. A total of 5,175 tcdA and 8,022 tcdB genes clustered into 7 (A1-A7) and 12 (B1-B12) distinct subtypes, respectively, which form the basis of a new method for toxin-based subtyping of C. difficile.
We developed a haplotype coloring algorithm to visualize amino acid variation across all toxin sequences, which revealed that TcdB has diversified through extensive homologous recombination throughout its entire sequence and has formed new subtypes through distinct recombination events. In contrast, TcdA varies mainly in the number of repeats in its C-terminal repetitive region, suggesting that recombination-mediated diversification of TcdB provides a selective advantage in C. difficile evolution. We validated toxin subtyping by classifying 351 C. difficile clinical isolates from Brigham and Women's Hospital in Boston, demonstrating its clinical utility. Subtyping partitions TcdB into binary functional and antigenic groups generated by intragenic recombination: which of two distinct cell-rounding phenotypes the toxin induces, whether it recognizes frizzled proteins as receptors, and whether it can be efficiently neutralized by the monoclonal antibody bezlotoxumab, the only FDA-approved therapeutic antibody. Our analysis also identifies eight universally conserved surface patches across the TcdB structure, representing ideal targets for developing broad-spectrum therapeutics. Finally, we established an open online database (DiffBase) as a central hub for the collection and classification of C. difficile toxins, which will help clinicians decide on therapeutic strategies targeting specific toxin variants and allow researchers to monitor the ongoing evolution and diversification of C. difficile.
Funding Information
  • Natural Sciences and Engineering Research Council of Canada (RGPIN-2019-04266)
  • Natural Sciences and Engineering Research Council of Canada (RGPAS-2019-00004)
  • Government of Ontario
  • National Institute of Allergy and Infectious Diseases (R01AI132387)
  • National Institute of Allergy and Infectious Diseases (R01AI139087)
  • National Institutes of Health (P30 DK034854)
  • Boston Children’s Hospital Intellectual and Developmental Disabilities Research Center (P30HD18655)
  • Burroughs Wellcome Fund - Investigator in the Pathogenesis of Infectious Disease award
  • Intramural Research Program of the National Library of Medicine, NIH
  • Hatch Family Foundation
  • Brigham and Women’s Hospital Precision Medicine Institute