Repeat expansions confer WRN dependence in microsatellite-unstable cancers

Abstract

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair1,2,3,4. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides5. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.

Fig. 1: WRN depletion in MSI cells induces recurrent DSBs at (TA)n dinucleotide repeats.
Fig. 2: TA breaks are dependent on structure-specific endonucleases MUS81–EME1 and SLX4.
Fig. 3: Replication stalling and collapse at (TA)n repeats in MSI cell lines.
Fig. 4: (TA)n repeats undergo large-scale expansion in MSI cell lines.

Data availability

END-seq, ChIP–seq, whole-genome sequencing and PacBio CLR data have been deposited in the Gene Expression Omnibus (GEO) database under the accession number GSE149709. Source data are provided with this paper.

References

  1. Behan, F. M. et al. Prioritization of cancer therapeutic targets using CRISPR–Cas9 screens. Nature 568, 511–516 (2019).

  2. Chan, E. M. et al. WRN helicase is a synthetic lethal target in microsatellite unstable cancers. Nature 568, 551–556 (2019).

  3. Kategaya, L., Perumal, S. K., Hager, J. H. & Belmont, L. D. Werner syndrome helicase is required for the survival of cancer cells with microsatellite instability. iScience 13, 488–497 (2019).

  4. Lieb, S. et al. Werner syndrome helicase is a selective vulnerability of microsatellite instability-high tumor cells. eLife 8, e43333 (2019).

  5. Fujimoto, A. et al. Comprehensive analysis of indels in whole-genome microsatellite regions and microsatellite instability across 21 cancer types. Genome Res. (2020).

  6. Dudley, J. C., Lin, M. T., Le, D. T. & Eshleman, J. R. Microsatellite instability as a biomarker for PD-1 blockade. Clin. Cancer Res. 22, 813–820 (2016).

  7. Chu, W. K. & Hickson, I. D. RecQ helicases: multifunctional genome caretakers. Nat. Rev. Cancer 9, 644–654 (2009).

  8. Toledo, L. I. et al. ATR prohibits replication catastrophe by preventing global exhaustion of RPA. Cell 155, 1088–1103 (2013).

  9. Canela, A. et al. DNA breaks and end resection measured genome-wide by end sequencing. Mol. Cell 63, 898–911 (2016).

  10. Paiano, J. et al. ATM and PRDM9 regulate SPO11-bound recombination intermediates during meiosis. Nat. Commun. 11, 857 (2020).

  11. Khil, P. P., Smagulova, F., Brick, K. M., Camerini-Otero, R. D. & Petukhova, G. V. Sensitive mapping of recombination hotspots using sequencing-based detection of ssDNA. Genome Res. 22, 957–965 (2012).

  12. Tubbs, A. et al. Dual roles of Poly(dA:dT) tracts in replication initiation and fork collapse. Cell 174, 1127–1142 (2018).

  13. Bowater, R., Aboul-ela, F. & Lilley, D. M. Large-scale stable opening of supercoiled DNA in response to temperature and supercoiling in (A + T)-rich regions that promote low-salt cruciform extrusion. Biochemistry 30, 11495–11506 (1991).

  14. Dayn, A. et al. Formation of (dA-dT)n cruciforms in Escherichia coli cells under different environmental conditions. J. Bacteriol. 173, 2658–2664 (1991).

  15. McClellan, J. A., Boublíková, P., Palecek, E. & Lilley, D. M. Superhelical torsion in cellular DNA responds directly to environmental and genetic factors. Proc. Natl Acad. Sci. USA 87, 8373–8377 (1990).

  16. Zlotorynski, E. et al. Molecular basis for expression of common and rare fragile sites. Mol. Cell. Biol. 23, 7143–7151 (2003).

  17. Kaushal, S. et al. Sequence and nuclease requirements for breakage and healing of a structure-forming (AT)n sequence within fragile site FRA16D. Cell Rep. 27, 1151–1164 (2019).

  18. Wang, H. et al. CtIP maintains stability at common fragile sites and inverted repeats by end resection-independent endonuclease activity. Mol. Cell 54, 1012–1021 (2014).

  19. Shastri, N. et al. Genome-wide identification of structure-forming repeats as principal sites of fork collapse upon ATR inhibition. Mol. Cell 72, 222–238 (2018).

  20. Inagaki, H. et al. Chromosomal instability mediated by non-B DNA: cruciform conformation and not DNA sequence is responsible for recurrent translocation in humans. Genome Res. 19, 191–198 (2009).

  21. Minocherhomji, S. & Hickson, I. D. Structure-specific endonucleases: guardians of fragile site stability. Trends Cell Biol. 24, 321–327 (2014).

  22. Wyatt, H. D., Laister, R. C., Martin, S. R., Arrowsmith, C. H. & West, S. C. The SMX DNA repair tri-nuclease. Mol. Cell 65, 848–860 (2017).

  23. Ammazzalorso, F., Pirzio, L. M., Bignami, M., Franchitto, A. & Pichierri, P. ATR and ATM differently regulate WRN to prevent DSBs at stalled replication forks and promote replication fork recovery. EMBO J. 29, 3156–3169 (2010).

  24. Cortes-Ciriano, I., Lee, S., Park, W. Y., Kim, T. M. & Park, P. J. A molecular portrait of microsatellite instability across multiple cancers. Nat. Commun. 8, 15180 (2017).

  25. Dolzhenko, E. et al. Detection of long repeat expansions from PCR-free whole-genome sequence data. Genome Res. 27, 1895–1903 (2017).

  26. Tankard, R. M. et al. Detecting expansions of tandem repeats in cohorts sequenced with short-read sequencing data. Am. J. Hum. Genet. 103, 858–873 (2018).

  27. Mitsuhashi, S. & Matsumoto, N. Long-read sequencing for rare human genetic diseases. J. Hum. Genet. 65, 11–19 (2020).

  28. Khristich, A. N. & Mirkin, S. M. On the wrong DNA track: molecular mechanisms of repeat-mediated genome instability. J. Biol. Chem. 295, 4134–4170 (2020).

  29. Glover, T. W., Wilson, T. E. & Arlt, M. F. Fragile sites in cancer: more than meets the eye. Nat. Rev. Cancer 17, 489–501 (2017).

  30. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).

  31. The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).

  32. Fearon, E. R. et al. Identification of a chromosome 18q gene that is altered in colorectal cancers. Science 247, 49–56 (1990).

  33. Ding, L. & Chen, F. Predicting tumor response to PD-1 blockade. N. Engl. J. Med. 381, 477–479 (2019).

  34. Mandal, R. et al. Genetic diversity of tumors with mismatch repair deficiency influences anti-PD-1 immunotherapy response. Science 364, 485–491 (2019).

  35. Feng, X. et al. ATR inhibition potentiates ionizing radiation-induced interferon response via cytosolic nucleic acid-sensing pathways. EMBO J. 39, e104036 (2020).

  36. Harding, S. M. et al. Mitotic progression following DNA damage enables pattern recognition within micronuclei. Nature 548, 466–470 (2017).

  37. Callen, E. et al. 53BP1 enforces distinct pre- and post-resection blocks on homologous recombination. Mol. Cell 77, 26–38 (2020).

  38. Palermo, V. et al. CDK1 phosphorylates WRN at collapsed replication forks. Nat. Commun. 7, 12880 (2016).

  39. Chang, J. H., Kim, J. J., Choi, J. M., Lee, J. H. & Cho, Y. Crystal structure of the Mus81-Eme1 complex. Genes Dev. 22, 1093–1106 (2008).

  40. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  41. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).

  42. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  43. Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).

  44. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  45. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).

  46. Kent, W. J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).

  47. Dolzhenko, E. et al. ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions. Bioinformatics 35, 4754–4756 (2019).

  48. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  49. Pedersen, B. S. & Quinlan, A. R. Mosdepth: quick coverage calculation for genomes and exomes. Bioinformatics 34, 867–868 (2018).

  50. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  51. Ghandi, M. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).

Acknowledgements

We thank R. Awasthi for assistance with Southern blotting; D. Goldstein, B. Tran and the CCR Genomics core for sequencing support; M. Lawrence for computational assistance; and F. Alt, B. Vogelstein and J. Haber for helpful discussions. Work in the S.C.W. laboratory is supported by the Francis Crick Institute (FC10212) and the European Research Council (ERC-ADG-666400). The Francis Crick Institute receives core funding from Cancer Research UK, the Medical Research Council, and the Wellcome Trust. K. Fugger is the recipient of fellowships from the Benzon Foundation and the Lundbeck Foundation. The P.J.M. laboratory is funded by the MRC MR/R009368/1; A.C.-M. is the recipient of a fellowship from AstraZeneca; E.M.C. is supported by the Damon Runyon Cancer Research Foundation, and E.M.C. and A.J.B. are supported by a pilot grant from the Dana-Farber Department of Medical Oncology. The A.N. laboratory is supported by the Intramural Research Program of the NIH, an Ellison Medical Foundation Senior Scholar in Aging Award (AG-SS-2633-11), the Department of Defense Idea Expansion (W81XWH-15-2-006) and Breakthrough (W81XWH-16-1-599) Awards, the Alex’s Lemonade Stand Foundation Award, and an NIH Intramural FLEX Award.

Author information

Contributions

N.v.W. set up the project, performed END-seq and flow cytometry experiments upon WRN, MUS81 and SLX4 depletion, and performed preliminary analysis of END-seq data; W.J.N. performed MUS81–EME1 in situ END-seq and PCR; A.T. performed END-seq, Southern blotting and designed ATR-mutant WRN cDNA; E.M.C. generated the inducible WRN shRNA in KM12 and HCT116 cells, performed and analysed the HSEC western blot and viability experiments, long-read sequencing, and analysed the CCLE and WRN dependency data; E.C. performed ATRi END-seq experiments, western blotting, and metaphase analysis; V.T. performed RPA ChIP–seq; K. Foster performed the HSEC and long-read sequencing experiments; N.W. performed western blotting and helped to generate WRN(3A) and WRN(6A) cells; J.N. and J.K. analysed the CCLE and WRN dependency data; S.S. analysed END-seq and RPA ChIP–seq experiments; W.W. analysed WGS, PacBio coverage across repeats, deletion breakpoints in MSI cancers, and performed quantitative modeling; F.B. analysed nucleotide composition of broken versus non-broken repeats and replication timing; E.D. performed ExpansionHunter and exSTRa bioinformatic analysis; M.A.E. supervised computational work; K.G., Y.H., A.A.B., J.T.S. and N.K. analysed the data and designed bioinformatic pipelines; R.L.W. prepared WGS libraries; A.C.-M. and K. Fugger provided recombinant MUS81–EME1; J.A.S. provided recombinant WRN; B.E.H. provided advice about PCR across repeats; K.U. provided advice about repeat expansion biology; C.H.F. provided advice about secondary structure biology; R.M.B. provided advice about WRN helicase; S.C.W. provided advice about structure-specific nucleases and recombination intermediates; P.J.M. helped design in situ experiments with recombinant proteins; P.S.M. provided advice on WGS experiments and analyses; A.J.B. and A.N. supervised the project; N.v.W., W.J.N., A.T., E.M.C., A.J.B. and A.N. wrote the manuscript with comments from the other authors. N.v.W., S.S., W.J.N., A.T. and E.M.C. contributed equally; E.C. and W.W. contributed equally as second authors.

Corresponding author

Correspondence to André Nussenzweig.

Ethics declarations

Competing interests

A.J.B. has research support from Bayer, Merck and Novartis.

Additional information

Peer review information Nature thanks Sergei Mirkin and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 WRN depletion induces DNA damage in different MSI cell lines.

a, Western blot analysis of MLH1, MSH2, and GAPDH protein levels in HSECs after CRISPR–Cas9 knockout. sgLuc, control single-guide RNA (sgRNA) targeting luciferase; sgMLH1 and sgMSH2, sgRNAs targeting MLH1 and MSH2, respectively. For gel source data, see Supplementary Fig. 1. b, Relative viability 7 days after sgRNA transduction in HSECs. sgCh2.2 and sgCh2.4 denote negative controls targeting chromosome 2 intergenic sites; sgPolR2D denotes a pan-essential control. sgWRN2 and sgWRN3 denote experimental sgRNA targeting WRN. Data are mean and s.d. P values were determined using two-tailed Student’s t-test (n = 3). c, Example of flow cytometry gating strategy used in d and Extended Data Fig. 4c. d, Flow cytometry profiles for exponentially growing KM12-shWRN cells treated with DMSO (NT) or doxycycline (shWRN) for 72 h. EdU was added during the last 30 min before collecting cells. Percentage of cells in the gates is indicated. Data are representative of three independent experiments. e, Western blot analysis of WRN protein levels in KM12-shWRN and SW837-shWRN treated with DMSO or doxycycline for 72 h. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1. f, Western blot analysis of WRN and pKAP1 protein levels in KM12-shWRN and KM12-shWRN.C911 (non-targeting shRNA) treated with DMSO or doxycycline for 72 h. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1.

Source data

Extended Data Fig. 2 WRN depletion induces recurrent and overlapping DSBs in MSI cells.

a, Genome browser screenshot displaying END-seq profiles as normalized read density (RPM) for HCT116-shWRN and KM12-shWRN cells treated with DMSO (NT) or doxycycline (shWRN), or transfected with non-targeting siRNAs (siCTRL) or WRN siRNAs (siWRN) for 72 h. b, Scatterplots of END-seq peak intensity between biological replicates of KM12-shWRN and HCT116-shWRN cells treated with doxycycline for 72 h. Pearson correlation coefficients are indicated. c, Venn diagrams showing overlap between peaks detected in HCT116-shWRN and KM12-shWRN cells treated with either doxycycline (shWRN) or WRN siRNAs (siWRN) for 72 h. n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test for both Venn diagrams (P < 2.2 × 10⁻¹⁶ for both comparisons). d, Quantification of END-seq peak intensity for KM12-shWRN and SW837-shWRN cells treated with doxycycline for 72 h. n = 5,424 peaks were examined for statistical significance using one-sided Wilcoxon rank sum test. Box plots are as in Fig. 2a, b. ***P < 2.2 × 10⁻¹⁶. e, Venn diagram showing overlap between peaks identified from END-seq and RPA-bound ssDNA ChIP–seq for KM12-shWRN cells treated with doxycycline for 72 h. n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test (P < 2.2 × 10⁻¹⁶). f, Composite plot of END-seq (black: positive-strand reads, grey: negative-strand reads) and RPA-bound ssDNA ChIP–seq (blue: positive-strand reads, red: negative-strand reads) signal around DSB sites in KM12-shWRN cells treated with doxycycline for 72 h. g, Heat map displaying intensity of END-seq signal in KM12-shWRN cells treated with doxycycline for 72 h, relative to the centre of the gap between positive- and negative-strand peaks. Sites are ordered by the size of the gap, from smallest to largest. h, Calculated size distribution from the reference genome of (TA)n repeats either located in gaps between positive and negative END-seq peaks (black, broken sites) or located elsewhere in the genome (grey, non-broken sites), determined from KM12-shWRN cells treated with doxycycline for 72 h.
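The one-sided Fisher's exact test named in panels c and e can be illustrated with a minimal sketch. The legend does not describe how the 2×2 table was built from the 1,000 random datasets, so the table layout below (peaks in both sets, in only one set, or in neither) and the function name `fisher_exact_greater` are assumptions made for illustration, not the authors' pipeline.

```python
from math import comb


def fisher_exact_greater(both: int, only_a: int, only_b: int, neither: int) -> float:
    """One-sided Fisher's exact test for enrichment of overlap.

    Hypothetical 2x2 table: sites in both peak sets, in set A only,
    in set B only, and in neither set.
    """
    row_a = both + only_a                    # all sites in set A
    col_b = both + only_b                    # all sites in set B
    total = both + only_a + only_b + neither
    denom = comb(total, col_b)
    max_both = min(row_a, col_b)
    # Sum hypergeometric probabilities of overlaps at least as large as observed.
    tail = sum(
        comb(row_a, k) * comb(total - row_a, col_b - k)
        for k in range(both, max_both + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Toy numbers only; these are not values from the paper.
    print(fisher_exact_greater(both=3, only_a=0, only_b=0, neither=3))
```

Exact integer arithmetic keeps the tail sum accurate, although very small P values will still underflow to 0.0 when converted to a float.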

Source data

Extended Data Fig. 3 WRN depletion induces DNA breakage in common fragile sites and palindromic TA-rich repeats in MSI cells.

a, Genome browser screenshot displaying END-seq profiles of common fragile sites FRA16D, FRA3B, FRA10B and FRA7I as normalized read density (RPM) for KM12-shWRN cells treated with DMSO (NT) or doxycycline (shWRN) for 72 h. The number of uninterrupted (TA)n repeat units in the hg19 reference genome at DSB sites is indicated. b, Genome browser screenshot displaying END-seq profiles of PATRRs on chromosomes 11 and 22 as normalized read density (RPM) for KM12-shWRN cells treated with DMSO (NT) or doxycycline (shWRN) for 72 h.

Extended Data Fig. 4 (TA)n repeat-forming repeats in MSI cell lines are substrates for MUS81–EME1.

a, Quantitative PCR with reverse transcription (qRT–PCR) analysis quantification (n = 1) of MUS81 and SLX4 mRNA levels in KM12-shWRN cells transfected with non-targeting siRNAs (siCTRL), MUS81 siRNAs (siMUS81), or SLX4 siRNAs (siSLX4). b, Representative images of metaphase spreads from KM12-shWRN cells treated with doxycycline (shWRN) and non-targeting siRNAs (siCTRL), MUS81 siRNAs (siMUS81), or SLX4 siRNAs (siSLX4) for 48 h. Data are representative of three independent experiments, n = 100 metaphases for each condition. c, Flow cytometric profiles for KAP1 phosphorylation in exponentially growing KM12-shWRN cells treated with doxycycline (shWRN), plus non-targeting siRNAs (siCTRL), MUS81 siRNAs (siMUS81), or SLX4 siRNAs (siSLX4) for 72 h. Data are representative of three independent experiments. d, Genome browser screenshot displaying END-seq profiles as normalized read density (RPM) for KM12-shWRN cells treated with doxycycline (shWRN), plus non-targeting siRNAs (siCTRL), MUS81 siRNAs (siMUS81), or SLX4 siRNAs (siSLX4) for 72 h. e, Schematic representation of DNA cruciform cleavage by MUS81–EME1 structure-specific endonuclease. f, Venn diagram displaying overlap of END-seq TA breaks between two biological replicates of DMSO-treated KM12-shWRN cells processed with purified recombinant MUS81–EME1 enzyme in situ (MUS81–EME1). n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test (P < 2.2 × 10⁻¹⁶). g, Venn diagram showing overlap in TA breaks between KM12-shWRN cells treated with doxycycline (shWRN) for 72 h, and DMSO-treated cells processed with MUS81–EME1 enzyme in situ (MUS81–EME1). n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test (P < 2.2 × 10⁻¹⁶). h, Venn diagram displaying overlap between TA breaks from KM12-shWRN and HCT116-shWRN genomic DNA processed in situ with MUS81–EME1 (n = 1 for HCT116). n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test (P < 2.2 × 10⁻¹⁶). i, Genome-wide aggregate analysis of END-seq signal around TA breaks from KM12-shWRN cells treated with doxycycline for 72 h (shWRN) (black denotes positive-strand reads, grey denotes negative-strand reads), or DMSO-treated KM12-shWRN cells processed with purified recombinant MUS81–EME1 enzyme in situ (blue denotes positive-strand reads, red denotes negative-strand reads). j, Genome browser screenshot displaying END-seq profiles for DMSO-treated KM12-shWRN cells (WRN proficient) processed in situ with either purified recombinant WRN, MUS81–EME1, or WRN followed by MUS81–EME1. For the latter, proteinase K digestion was performed between the two enzymatic treatments.

Source data

Extended Data Fig. 5 Structure-forming repeats in MSI cells activate ATR.

a, Genome browser screenshot displaying END-seq profiles for DMSO-treated KM12, HCT116, SW837 and RPE-1 cells containing an inducible shWRN cassette processed in situ with purified recombinant MUS81–EME1. Cells are indicated as MSI (red) or MSS (blue, n = 1). b, Quantification of END-seq peak intensity for libraries displayed in a. Box plots are as in Fig. 2a, b. c, Western blot analysis of WRN and pKAP1 levels in HCT116 cells expressing wild-type WRN, or ATR phosphorylation mutants WRN(3A) or WRN(6A). Endogenous WRN was depleted using an siRNA targeting the WRN 5′ UTR. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1. d, Genome browser screenshot displaying END-seq profiles within FRA3B on chromosome 3 as normalized read density (RPM) for KM12-shWRN, HCT116-shWRN, RPE-1-shWRN, and eHAP-shWRN cells treated with doxycycline (shWRN) for 72 h or APH plus ATRi for 8 h. e, Venn diagrams displaying overlap of DSBs detected after WRN depletion or APH plus ATRi treatment in KM12 and HCT116 cells. n = 1,000 random datasets were generated to test significance of overlap using one-sided Fisher’s exact test for both Venn diagrams (P < 2.2 × 10⁻¹⁶).

Source data

Extended Data Fig. 6 (TA)n repeat sequences are underrepresented in whole-genome sequencing data from MSI cells.

a, Bar plots indicating the percentage of recurrent mutations in different classes of repeats (left; mono, di, tri and tetra) and a bar plot (right) showing the number of various dinucleotide repeats in the 1,000 altered loci. The plots were based on sequencing analysis from ref. 24, which considered microsatellites smaller than 40 bp. b, Agarose gels showing PCR fragments (or lack thereof) of sites of different (TA)n repeats in one MSS and four MSI cell lines. Broken sites B1–B8 were chosen based on the presence of END-seq peaks after WRN depletion in KM12 cells. Sites NB1–NB3 were chosen with similar (TA)n repeat lengths as broken sites, but were not broken after WRN depletion in KM12 cells. Fragment sizes (in bp) are displayed. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1. c, Genome browser screenshots of short-read PCR-free whole-genome sequencing reads, indicating coverage, in KM12 and HCT116 cell lines (n = 1). Shown are two regions containing (TA)n repeats, one that displays END-seq peaks after WRN depletion in KM12 (site B2), and one that does not (site NB3). Regions correspond to equivalent PCR sites in Fig. 4a and Extended Data Fig. 6b. d, Box plots displaying coverage at different classes of mono- and di-nucleotide repeats in PCR-free whole-genome sequencing libraries made from HCT116 cells. (TA)n repeats are split into those that overlap END-seq peaks after shWRN induction, and those that do not contain DSBs. Dotted red lines indicate the average coverage over the genome.

Source data

Extended Data Fig. 7 (TA)n repeats undergo large-scale expansions in MSI cells.

a, Cumulative fraction of expanded (TA)n repeats in KM12 and HCT116, based on ExpansionHunter analysis of PCR-free whole genome sequencing data. (TA)n repeats were split into broken (red) and non-broken (black) based on presence or absence of END-seq peaks after WRN depletion in KM12 cells. b, Graphical representation of a (TA)n repeat expansion in HCT116. This site has 33 (TA)n repeat units in the reference genome; ExpansionHunter identified an expansion to 86–87 repeat units based on PCR-free whole-genome sequencing of HCT116. c, Empirical cumulative distribution function based on the length by which each read overlaps the (TA)n repeat shown in b as identified by exSTRa. d, Southern blots for two different genomic regions containing non-broken (TA)n repeats corresponding to the same sites in Fig. 4a and Extended Data Fig. 6b. Red markers and dotted lines represent expected fragment sizes. For gel source data, see Supplementary Fig. 1. e, Southern blots for broken (TA)n repeat B2 (top) and non-broken (TA)n repeat NB3 (bottom) in MSS (blue) and MSI (red) cell lines, confirming expansion of broken (TA)n repeats in MSI cell lines. Red markers and dotted lines represent expected fragment sizes based on the reference genome. For gel source data, see Supplementary Fig. 1. f, Box plots displaying coverage at different classes of repeats in long-read sequencing libraries made from MSI (red) and MSS (blue) cells (n = 1). g, Motif analysis for sequence enrichment at broken (TA)n in the KM12 cell line from long-read sequencing data.
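The empirical cumulative distribution function in panel c can be sketched generically as follows. This is not exSTRa's implementation; the function name `ecdf` and the example overlap lengths are hypothetical, and the only assumption is that each read contributes one number (the length by which it overlaps the repeat).

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def ecdf(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (value, fraction of observations <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return []
    # bisect_right counts how many observations are <= v, so ties are handled.
    return [(v, bisect_right(ordered, v) / n) for v in sorted(set(ordered))]


if __name__ == "__main__":
    # Hypothetical per-read overlap lengths (bp) with a (TA)n repeat.
    for length, fraction in ecdf([10, 20, 20, 40]):
        print(length, fraction)
```

A curve shifted towards longer overlaps in one sample relative to others would be consistent with an expansion at that locus, which is the kind of comparison the panel displays.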

Source data

Extended Data Fig. 8 Large-scale expansions occur at long, uninterrupted (TA)n repeat sequences.

a, Box plot showing, in the hg19 reference genome, the proportion of (TA)n repeat units found within the full annotated sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 59,729 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. b, Box plot showing, in the hg19 reference genome, the proportion of the longest run of uninterrupted (TA)n within the full annotated sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 59,729 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. c, Box plot showing, in the hg19 reference genome, the length (bp) of the longest uninterrupted (TA)n dinucleotide repeats within the full annotated sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 59,729 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. d, Box plot showing, in long-read sequencing data, the proportion of (TA)n repeat units found within the full sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 61,244 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. e, Box plot showing, in long-read sequencing data, the proportion of the longest run of uninterrupted (TA)n within the full sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 61,244 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. f, Box plot showing, in long-read sequencing data, the length (bp) of the longest uninterrupted (TA)n dinucleotide repeat within the full sequence at broken or non-broken (TA)n repeats in KM12 cells. n = 5,400 (broken) and n = 61,244 (non-broken) sites were examined for statistical significance using one-sided Wilcoxon rank sum test. ***P < 2.2 × 10⁻¹⁶. g, Multiple linear regression model predicting END-seq peak intensity of KM12-shWRN cells treated with doxycycline (shWRN) for 72 h derived from END-seq intensity of MUS81–EME1 cleavage in situ, replication timing, and expanded length of broken (TA)n. The Pearson correlation coefficient is indicated (see i). h, END-seq intensity of broken (TA)n repeats in KM12-shWRN cells treated with doxycycline for 72 h grouped by replication timing values from late replicating to early replicating. i, Multiple linear regression was performed to predict END-seq peak intensity of KM12-shWRN cells treated with doxycycline for 72 h based on the following parameters: END-seq intensity of MUS81–EME1 cleavage in situ, replication timing, and expanded length of broken (TA)n. END-seq intensity upon shWRN induction and MUS81–EME1 cleavage were calculated using RPKM in a ±1 kb window around broken (TA)n. The mean value was used for replication timing quantification. Expanded lengths were identified from long-read sequencing data. Estimates of the standardized regression coefficients (β) are shown, along with t-statistics and P values based on the standardized coefficients. j, Model for MSI cell dependence on WRN. Large-scale expansions of (TA)n repeats are associated with MSI in MMR-deficient cells. When (TA)n repeats exceed a critical length, they extrude into cruciform-like structures, which stall replication forks and activate ATR kinase, which in turn phosphorylates WRN and other substrates to complete DNA replication. In the absence of WRN, MUS81–EME1 or SLX4 cleaves secondary structures at (TA)n repeats, thereby shattering the chromosomes. All box plots are as in Fig. 2a, b.
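The three repeat-purity measures plotted in panels a–f (proportion of TA units, proportion covered by the longest uninterrupted run, and length of that run) could be computed along the following lines. This is a minimal sketch under stated assumptions: the exact definitions used by the authors are not given in the legend, the function name `ta_repeat_metrics` is hypothetical, and runs are counted only in the TA phase, so a leading A or trailing T flanking a run is not included.

```python
import re


def ta_repeat_metrics(seq: str) -> dict:
    """Summarize (TA)n content of a sequence (assumed definitions, see lead-in)."""
    seq = seq.upper()
    # Each match is one uninterrupted run of TA dinucleotide units.
    runs = [m.group(0) for m in re.finditer(r"(?:TA)+", seq)]
    ta_units = sum(len(run) // 2 for run in runs)
    longest_bp = max((len(run) for run in runs), default=0)
    n = len(seq)
    return {
        # Fraction of bases that lie inside TA units.
        "ta_unit_fraction": (2 * ta_units / n) if n else 0.0,
        # Length in bp of the longest uninterrupted (TA)n run.
        "longest_run_bp": longest_bp,
        # Fraction of the sequence covered by that longest run.
        "longest_run_fraction": (longest_bp / n) if n else 0.0,
    }


if __name__ == "__main__":
    # Toy sequence: a (TA)3 run, a GG interruption, then a (TA)2 run.
    print(ta_repeat_metrics("TATATAGGTATA"))
```

In the toy sequence the interruption lowers the longest-run measures while leaving most bases inside TA units, which is the distinction between panels a and b.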

Source data

Extended Data Fig. 9 Deletion breakpoints in MSI cancers are enriched at (TA)n repeats.

a, Genome browser screenshot of a broken (TA)n (defined from KM12), MSI deletion (derived from a patient sample), and END-seq profile (in WRN-depleted KM12 cells). The sequences around the breakpoints are shown in the inset. b, Junctions associated with six different MSI deletions from patients. Seq1 represents the sequence from −50 bp to left breakpoint and Seq2 represents the sequence from right breakpoint to +50 bp. c, Enrichment of simple repeats, broken and non-broken (TA)n, and long interspersed nuclear element (LINE), short interspersed nuclear element (SINE) and long terminal repeat (LTR) elements at patient deletion breakpoints relative to their overlap with random deletion breakpoints of the same size (enrichment value = 1). B(TA)n–B(TA)n represents cases in which both breakpoints overlap with broken (TA)n repeats; B(TA)n− represents cases in which only one breakpoint overlaps with a broken (TA)n repeat. B(TA)n, broken TA repeat; NB(TA)n, non-broken TA repeat.
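The enrichment value in panel c (overlap at patient breakpoints relative to overlap at random breakpoints, with 1 meaning no enrichment) can be sketched as a simple ratio. The half-open interval convention, the function names, and the toy coordinates below are assumptions for illustration; how the random breakpoints were actually generated is not specified beyond matching deletion size.

```python
from statistics import mean
from typing import Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome (assumed)


def overlaps(breakpoint: int, intervals: Sequence[Interval]) -> bool:
    """True if the breakpoint falls inside any annotated interval."""
    return any(start <= breakpoint < end for start, end in intervals)


def enrichment(
    observed: Sequence[int],
    random_sets: Sequence[Sequence[int]],
    intervals: Sequence[Interval],
) -> float:
    """Observed overlap count divided by the mean overlap count of random sets."""
    observed_hits = sum(overlaps(b, intervals) for b in observed)
    expected_hits = mean(
        sum(overlaps(b, intervals) for b in random_set) for random_set in random_sets
    )
    if expected_hits == 0:
        raise ValueError("no random breakpoint overlaps; enrichment is undefined")
    return observed_hits / expected_hits


if __name__ == "__main__":
    repeats = [(10, 20)]                          # toy broken (TA)n interval
    patient = [12, 15, 30, 40]                    # toy breakpoints
    shuffled = [[11, 50, 60, 70], [13, 80, 90, 100]]
    print(enrichment(patient, shuffled, repeats))
```

A linear scan over intervals is adequate for a toy example; genome-scale data would normally use an interval index or a tool such as BEDTools (ref. 42).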

Extended Data Fig. 10 DNA breaks within DCC gene body.

a, Genome browser screenshots within the DCC gene displaying END-seq profiles as normalized read density (RPM) for KM12-shWRN cells treated with DMSO (NT), doxycycline (shWRN) for 72 h, or MUS81–EME1 in situ. b, Zoom-in view of the region including exons 6 and 7 of the DCC gene, containing two (TA)n repeats displaying END-seq peaks. The highlighted sequences below were extracted from long-read sequencing reads in KM12 cells. The (TA)n repeat in intron 7 is where Vogelstein and colleagues previously detected an insertion.

Supplementary information

Supplementary Data

This file contains Supplementary Figure 1: Source Data Gels and Supplementary Table 1: Genomic locations of PCR products.

Reporting Summary

Source data

About this article

Cite this article

van Wietmarschen, N., Sridharan, S., Nathan, W.J. et al. Repeat expansions confer WRN dependence in microsatellite-unstable cancers. Nature 586, 292–298 (2020). https://doi.org/10.1038/s41586-020-2769-8

  • DOI: https://doi.org/10.1038/s41586-020-2769-8
