

Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning

An Author Correction to this article was published on 18 October 2023

This article has been updated

Abstract

Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE–single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.
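The abstract's notions of purity, yield and choosing the best CGBE for a target can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' method: the function names, the editor names and the example numbers are hypothetical, indels are ignored, yield is taken as the fraction of all reads carrying G at the target cytosine, and purity as the fraction of edited reads that carry G. The 0.9 purity floor mirrors the ">90% precision" figure quoted above.

```python
from typing import Dict, Optional, Tuple


def purity_and_yield(read_counts: Dict[str, int]) -> Tuple[float, float]:
    """Return (purity, yield) for one target cytosine.

    read_counts maps the base observed at the target position to a read count.
    Assumed definitions (illustrative, not taken from the article text):
      yield  = G reads / all reads
      purity = G reads / reads in which the C was changed to anything else
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads at this position")
    g_reads = read_counts.get("G", 0)
    edited_reads = total - read_counts.get("C", 0)
    purity = g_reads / edited_reads if edited_reads else 0.0
    return purity, g_reads / total


def pick_best_editor(
    predictions: Dict[str, Tuple[float, float]], min_purity: float = 0.9
) -> Optional[str]:
    """Pick the editor with the highest yield among those meeting the purity floor.

    predictions maps a (hypothetical) editor name to a (purity, yield) pair.
    Returns None when no editor reaches min_purity.
    """
    eligible = {name: py for name, py in predictions.items() if py[0] >= min_purity}
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name][1])


if __name__ == "__main__":
    print(purity_and_yield({"C": 600, "G": 300, "T": 60, "A": 40}))
    print(pick_best_editor({
        "editor_A": (0.95, 0.10),
        "editor_B": (0.80, 0.40),
        "editor_C": (0.92, 0.25),
    }))
```

Filtering on purity first and then ranking by yield is one simple policy. In practice the (purity, yield) pairs would come from a trained predictor such as the BE-Hive models rather than from hand-entered numbers.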


Fig. 1: Development of prototype CGBEs.
Fig. 2: CRISPRi knockdown screen across 476 genes enriched for those with roles in DNA repair identifies candidate regulators of C•G-to-G•C editing.
Fig. 3: Effect of varying the cytidine deaminase and Cas9 components of CGBEs on C•G-to-G•C editing outcomes in HEK293T cells.
Fig. 4: New engineered CGBEs with various DNA repair proteins, deaminases, Cas proteins and architectures offer diverse editing performance on different target sites.
Fig. 5: Target library characterization and machine learning modeling of ten CGBE variants.
Fig. 6: Target library characterization and machine learning modeling of CGBE variants.


Data availability

The target library sequencing data generated during this study are available at the NCBI Sequence Read Archive database under PRJNA631290. Data from the Repair-seq screens are available under PRJNA721212. Processed target library data used for training machine learning models have been deposited under the following DOIs: https://doi.org/10.6084/m9.figshare.12275645 and https://doi.org/10.6084/m9.figshare.12275654.

Code availability

Code used for analysis of CRISPRi screens is available at https://github.com/jeffhussmann/repair-seq. Code used for target library data processing and analysis is available at https://github.com/maxwshen/lib-dataprocessing and https://github.com/maxwshen/lib-analysis, respectively. The machine learning models for CGBEs trained on target library data are available as part of the BE-Hive interactive web application at https://crisprbehive.design and the BE-Hive Python package at https://github.com/maxwshen/be_predict_efficiency and https://github.com/maxwshen/be_predict_bystander.

Change history


Acknowledgements

This work was supported by US NIH (nos. U01AI142756, UG3AI150551, RM1HG009490, R35GM118062, R35GM138167 and P30CA072720), HHMI and Princeton University. B.A. acknowledges a Searle Scholars award. The authors acknowledge NSF Graduate Research Fellowships to L.W.K., M.W.S. and T.A.S.; a NWO Rubicon Fellowship to M.A.; a Jane Coffin Childs postdoctoral fellowship to A.V.A.; fellowship support from the NSF and Hertz Foundation to J.L.D.; a Helen Hay Whitney postdoctoral fellowship to G.A.N.; a Damon Runyon Postdoctoral Fellowship to D.Y.; a Singapore A*STAR NSS fellowship to B.M.; and NIH Ruth L. Kirschstein National Research Service Award no. F31NS115380 to J.M.R. J.A.H. was the Rebecca Ridley Kry Fellow of the Damon Runyon Cancer Research Foundation.

Author information


Contributions

L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.S.W., B.A. and D.R.L. designed the research. L.W.K., M.A., M.W.S., J.A.H., A.V.A., J.L.D., G.A.N., D.Y., B.M., J.M.R., A.X., T.A.S. and B.A. performed experiments. J.S.W., B.A. and D.R.L. supervised the project. L.W.K. and D.R.L. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Jonathan S. Weissman, Britt Adamson or David R. Liu.

Ethics declarations

Competing interests

J.A.H. is a consultant for Tessera Therapeutics. J.M.R. is a consultant for Maze Therapeutics. J.S.W. is a consultant for, and holds equity in, Maze Therapeutics, Chroma Medicine and KSQ Therapeutics. B.A. was a member of a ThinkLab Advisory Board for, and holds equity in, Celsius Therapeutics. D.R.L. is a consultant for, and holds equity in, Beam Therapeutics, Prime Medicine, Pairwise Plants and Chroma Medicine. The remaining authors declare no competing interests.

Additional information

Peer review information Nature Biotechnology thanks Jia Chen, Leopold Parts and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–15, Discussion 1–6, Sequences and References.

Reporting Summary

41587_2021_938_MOESM3_ESM.xlsx

Supplementary Table 1. CRISPRi sgRNA library. Supplementary Table 2. Changes in base editing outcomes for all genes in CRISPRi screens. Supplementary Table 3. Base editing outcomes in a library of disease-related alleles correctable by editing C•G to G•C or to A•T. Supplementary Table 4. CGBE targets, amplicons and oligos used for this study.

Supplementary Data 1

All C•G-to-G•C editing yield, purity and indel outcomes for all experiments in this manuscript. T-tests can be generated for any pairwise comparison in this file.
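A pairwise comparison of the kind mentioned for Supplementary Data 1 could be sketched as follows. The source does not say which t-test variant is intended, so Welch's unequal-variance t-test is an assumption here, and the function name and replicate values are hypothetical. The sketch returns only the t statistic and the Welch-Satterthwaite degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's t-test.

    a and b are replicate measurements (for example, editing yields)
    for the two conditions being compared.
    """
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each group needs at least two replicates")
    # Squared standard error contributed by each group.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    se2 = se2_a + se2_b
    if se2 == 0:
        raise ValueError("both groups have zero variance")
    t_stat = (mean(a) - mean(b)) / sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = se2 ** 2 / (se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1))
    return t_stat, dof


if __name__ == "__main__":
    # Hypothetical triplicate yields for two editors at one site.
    print(welch_t([0.30, 0.32, 0.34], [0.20, 0.22, 0.24]))
```

To obtain a p-value as well, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test if SciPy is available.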

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Koblan, L.W., Arbab, M., Shen, M.W. et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat Biotechnol 39, 1414–1425 (2021). https://doi.org/10.1038/s41587-021-00938-z


  • DOI: https://doi.org/10.1038/s41587-021-00938-z
