Phylogenetic analysis of bacterial genomes for outbreak detection is a mature field, but most methods cannot fully resolve plasmid transmission. Methods for deciding whether two plasmids are closely related are also lacking, because plasmids evolve as much by rearrangement and gene gain/loss as by mutation.
Here we present RoundHound, a tool designed to detect plasmid transmission directly from short-read sequencing data. We first construct a plasmid reference database from all plasmids in PLSDB (~31,872 plasmids). A plasmid pan-genome is then created and stored as pan-genome reference graphs (PRGs). A plasmid network is also constructed using a combination of gene Jaccard and rearrangement/indel distances, and is then divided into communities using the asynchronous label propagation algorithm. To query a sample, its short reads are mapped to the plasmid pan-genome (using the tool Pandora) to identify gene presence/absence and variants, which are then used to determine which plasmids are present and the community each belongs to.
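As an illustration of the network step, the following is a minimal sketch, not RoundHound's implementation. It assumes each plasmid is reduced to a set of gene names, links plasmids by gene Jaccard distance alone (the rearrangement/indel component is omitted), and applies a simplified asynchronous label propagation. The plasmid names, gene sets, the `max_distance` threshold of 0.5, and all function names are hypothetical.

```python
import random
from itertools import combinations


def jaccard_distance(genes_a, genes_b):
    """Gene Jaccard distance: 1 - |A & B| / |A | B|."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def build_network(plasmid_genes, max_distance=0.5):
    """Link plasmids whose distance is at most max_distance (assumed cutoff)."""
    graph = {name: set() for name in plasmid_genes}
    for p, q in combinations(plasmid_genes, 2):
        if jaccard_distance(plasmid_genes[p], plasmid_genes[q]) <= max_distance:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def label_propagation(graph, seed=0, max_sweeps=100):
    """Simplified asynchronous label propagation on an unweighted graph.

    Nodes are visited in random order and each adopts the most common label
    among its neighbours, using labels already updated in the same sweep.
    """
    rng = random.Random(seed)
    labels = {node: node for node in graph}  # every node starts alone
    nodes = sorted(graph)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for node in nodes:
            if not graph[node]:
                continue  # isolated plasmid keeps its own label
            counts = {}
            for neighbour in graph[node]:
                label = labels[neighbour]
                counts[label] = counts.get(label, 0) + 1
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[node] in candidates:
                continue  # current label is already a majority label
            labels[node] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:
            break
    communities = {}
    for node, label in labels.items():
        communities.setdefault(label, set()).add(node)
    return sorted(sorted(members) for members in communities.values())


# Hypothetical toy gene content, for illustration only.
plasmid_genes = {
    "pA": {"repA", "traI", "blaKPC", "tnpA"},
    "pB": {"repA", "traI", "blaKPC"},
    "pC": {"repA", "traI", "blaKPC", "merR"},
    "pD": {"repB", "mobA", "qnrS"},
    "pE": {"repB", "mobA", "qnrS", "sul1"},
    "pF": {"repC"},
}
network = build_network(plasmid_genes)
communities = label_propagation(network)
print(communities)
```

In this toy input the first three plasmids share a backbone and fall into one community, the next two form a second, and the last remains a singleton. A real network would need edge weights and a distance that also reflects rearrangements and indels, as described above.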
We tested RoundHound on two short-read datasets in which plasmid transmission had been validated with long-read sequencing (Roberts et al., 2022; Hawkey et al., 2022). In both datasets, RoundHound detected plasmid transmission between bacterial samples with recall >96% and precision >75%. RoundHound is a powerful tool for enhancing surveillance of plasmid transmission in vulnerable settings (such as healthcare) and could be used to strategically select samples for long-read sequencing and further plasmid analysis.
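For readers unfamiliar with the metrics, the sketch below shows one way recall and precision could be computed for transmission calls, assuming each call is an unordered pair of samples and that the long-read results supply the true pairs. The function name and the sample pairs are hypothetical toy values, not data from either study.

```python
def precision_recall(predicted_pairs, true_pairs):
    """Precision and recall over unordered sample pairs."""
    predicted = {frozenset(pair) for pair in predicted_pairs}
    truth = {frozenset(pair) for pair in true_pairs}
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy calls, for illustration only.
true_pairs = [("s1", "s2"), ("s2", "s3"), ("s4", "s5"), ("s6", "s7")]
predicted_pairs = [("s2", "s1"), ("s2", "s3"), ("s4", "s5"),
                   ("s1", "s5"), ("s3", "s7")]

precision, recall = precision_recall(predicted_pairs, true_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five predicted pairs are true and three of the four true pairs are recovered. Treating pairs as unordered means ("s2", "s1") matches ("s1", "s2").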