While the number of SARS-CoV-2 genome sequences grew to over 15 million, the Ultrafast Sample placement on Existing tRees (UShER) tool suite maintained a comprehensive phylogenetic tree in near real time. This experience, and critical performance improvements throughout the pandemic, provide valuable lessons for rapidly scaling analyses.
Data Availability
SARS-CoV-2 genome sequences and metadata used to build the tree were aggregated from multiple sources. Sequences in member databases of the International Nucleotide Sequence Database Collaboration (INSDC; GenBank, European Nucleotide Archive (ENA) and DNA Data Bank of Japan (DDBJ)) are freely available and were retrieved using National Center for Biotechnology Information (NCBI) datasets (https://www.ncbi.nlm.nih.gov/datasets/taxonomy/2697049/). COVID-19 Genomics UK Consortium (COG-UK) sequences may be freely downloaded from https://cog-uk.s3.climb.ac.uk/phylogenetics/latest/ and most have been submitted to ENA and are available from INSDC. China National Center for Bioinformation (CNCB) makes additional sequences available in GenBase (https://ngdc.cncb.ac.cn/genbase/). Global Initiative on Sharing All Influenza Data (GISAID) data are available after registration and acceptance of the GISAID user agreement (https://gisaid.org/terms-of-use/). Sequence accessions from all sources for 15,831,377 genomes used in the 2023-08-01 tree are available from https://doi.org/10.5281/zenodo.10076358.
Acknowledgements
We gratefully acknowledge the authors and their originating laboratories responsible for obtaining the specimens, and their submitting laboratories for generating the genetic sequence and metadata and sharing via public repositories and/or the GISAID Initiative, on which this research is based. This work was funded by the Centers for Disease Control and Prevention (CDC) grant BAA 200-2021-11554.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Art Poon and the other, anonymous, reviewer(s) for their contribution to the peer review of the work.
About this article
Cite this article
Hinrichs, A., Ye, C., Turakhia, Y. et al. The ongoing evolution of UShER during the SARS-CoV-2 pandemic. Nat Genet 56, 4–7 (2024). https://doi.org/10.1038/s41588-023-01622-5
DOI: https://doi.org/10.1038/s41588-023-01622-5