Investigating Multidrug Resistance in Escherichia coli with Phylogenetics and Machine Learning

Doctoral Candidate Name: 
David C. Brown
Program: 
Bioinformatics and Computational Biology
Abstract: 

The next pandemic is already underway in the proliferation of antimicrobial resistance (AMR) genes. Evolutionary principles guide this "silent pandemic", resulting in multidrug-resistant (MDR) bacteria that resist three or more classes of antimicrobial compounds. One hypothesis for the development of MDR Escherichia coli (E. coli) holds that resistance results from the elevated mutation rate of bacteria with a deficient Mutator S (mutS) gene.
First, I used phylogenetic comparative analyses on the mutS genes from 817 high-quality E. coli isolates. Although I observed 271 MDR isolates in this data set, I found no evidence for a deficient mutS gene. Additionally, when I modeled the coevolution of MDR and variant residues in the MutS protein, the evidence supported independent evolution of the two traits.
To understand this confounding result, I trained five random forest estimators to predict AMR, achieving a mean ROC AUC of 0.87 ± 0.04 on 66 features engineered from 5511 annotated genes in the pangenome. The top-performing predictors did not include mutS; instead, they were genes associated with horizontal gene transfer. This result supports the role of accessory genes in spreading MDR. My work demonstrates the combined usefulness of phylogenetic methods and machine learning for generating hypotheses about polygenic traits.

Defense Date and Time: 
Thursday, April 6, 2023 - 2:30pm
Defense Location: 
Bioinformatics Building, 4th Floor, Seminar Room
Committee Chair's Name: 
Dr. Daniel Janies
Committee Members: 
Dr. Jun-tao Guo, Dr. Alex Dornburg, Dr. Adam Reitzel