  • Letter
Landscape of G-quadruplex DNA structural regions in breast cancer

Abstract

Response and resistance to anticancer therapies vary due to intertumor and intratumor heterogeneity (ref. 1). Here, we map differentially enriched G-quadruplex (G4) DNA structure-forming regions (∆G4Rs) in 22 breast cancer patient-derived tumor xenograft (PDTX) models. ∆G4Rs are associated with the promoters of highly amplified genes showing high expression, and with somatic single-nucleotide variants. Differences in ∆G4R landscapes reveal seven transcription factor programs across PDTXs. ∆G4R abundance and locations stratify PDTXs into at least three G4-based subtypes. ∆G4Rs in most PDTXs (14 of 22) were found to associate with more than one breast cancer subtype, which we also call an integrative cluster (IC; ref. 2). This suggests the frequent coexistence of multiple breast cancer states within a PDTX model, the majority of which display aggressive triple-negative IC10 gene activity. Short-term cultures of PDTX models with increased ∆G4R levels are more sensitive to small molecules targeting G4 DNA. Thus, G4 landscapes reveal additional IC-related intratumor heterogeneity in PDTX biopsies, improving breast cancer stratification and potentially identifying new treatment strategies.

Fig. 1: Quantitative G4-ChIP–seq of PDTX reveals differentially enriched G4 DNA regions.
Fig. 2: G4 DNA prevalence in the genomic and transcriptomic architecture of PDTX breast cancer models.
Fig. 3: G4 DNA regions reveal the activity of distinct transcription factor programs.
Fig. 4: G4 DNA levels predict response to G4-ligands.

Data availability

The qG4-ChIP–seq data reported in this paper are available at the Gene Expression Omnibus (National Center for Biotechnology Information repository) under accession number GSE152216. Gene-expression (RNA-seq) data of the PDTX models are available at the European Genome-phenome Archive under accession number EGAS00001001913. Source data are provided with this paper.

Code availability

Sample sheets describing the detailed experimental design are available at https://github.com/sblab-bioinformatics/qG4-ChIP-seq-of-breast-cancer-PDTX. Details of data analysis have been deposited at the same link. An overview of all software tools for the processing of sequencing data is provided in Supplementary Table 6. Source data are provided with this paper.

References

  1. Flavahan, W. A., Gaskell, E. & Bernstein, B. E. Epigenetic plasticity and the hallmarks of cancer. Science 357, eaal2380 (2017).

  2. Curtis, C. et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature 486, 346–352 (2012).

  3. Rhodes, D. & Lipps, H. J. G-quadruplexes and their regulatory roles in biology. Nucleic Acids Res. 43, 8627–8637 (2015).

  4. Varshney, D., Spiegel, J., Zyner, K., Tannahill, D. & Balasubramanian, S. The regulation and functions of DNA and RNA G-quadruplexes. Nat. Rev. Mol. Cell Biol. https://doi.org/10.1038/s41580-020-0236-x (2020).

  5. Marsico, G. et al. Whole genome experimental maps of DNA G-quadruplexes in multiple species. Nucleic Acids Res. 47, 3862–3874 (2019).

  6. Hänsel-Hertsch, R. et al. G-quadruplex structures mark human regulatory chromatin. Nat. Genet. 48, 1267–1272 (2016).

  7. Hänsel-Hertsch, R., Spiegel, J., Marsico, G., Tannahill, D. & Balasubramanian, S. Genome-wide mapping of endogenous G-quadruplex DNA structures by chromatin immunoprecipitation and high-throughput sequencing. Nat. Protoc. 13, 551–564 (2018).

  8. Hänsel-Hertsch, R., Di Antonio, M. & Balasubramanian, S. DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat. Rev. Mol. Cell Biol. 18, 279–284 (2017).

  9. Paeschke, K., Capra, J. A. & Zakian, V. A. DNA replication through G-quadruplex motifs is promoted by the Saccharomyces cerevisiae Pif1 DNA helicase. Cell 145, 678–691 (2011).

  10. Cheung, I., Schertzer, M., Rose, A. & Lansdorp, P. M. Disruption of dog-1 in Caenorhabditis elegans triggers deletions upstream of guanine-rich DNA. Nat. Genet. 31, 405–409 (2002).

  11. Georgakopoulos-Soares, I., Morganella, S., Jain, N., Hemberg, M. & Nik-Zainal, S. Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis. Genome Res. 28, 1264–1271 (2018).

  12. Lensing, S. V. et al. DSBCapture: in situ capture and sequencing of DNA breaks. Nat. Methods 13, 855–857 (2016).

  13. Bouwman, B. A. M. & Crosetto, N. Endogenous DNA double-strand breaks during DNA transactions: emerging insights and methods for genome-wide profiling. Genes (Basel) 9, 632 (2018).

  14. De, S. & Michor, F. DNA secondary structures and epigenetic determinants of cancer genome evolution. Nat. Struct. Mol. Biol. 18, 950–955 (2011).

  15. Chambers, V. S. et al. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. 33, 1–7 (2015).

  16. Pereira, B. et al. The somatic mutation profiles of 2,433 breast cancers refine their genomic and transcriptomic landscapes. Nat. Commun. 7, 11479 (2016).

  17. Bruna, A. et al. A biobank of breast cancer explants with preserved intra-tumor heterogeneity to screen anticancer compounds. Cell 167, 260–274.e22 (2016).

  18. Rueda, O. M. et al. Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature 567, 399–404 (2019).

  19. Orlando, D. A. et al. Quantitative ChIP–Seq normalization reveals global modulation of the epigenome. Cell Rep. 9, 1163–1170 (2014).

  20. Scheinin, I. et al. DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly. Genome Res. 24, 2022–2032 (2014).

  21. Mao, S.-Q. et al. DNA G-quadruplex structures mold the DNA methylome. Nat. Struct. Mol. Biol. 25, 951–957 (2018).

  22. Zaret, K. S. & Carroll, J. S. Pioneer transcription factors: establishing competence for gene expression. Genes Dev. 25, 2227–2241 (2011).

  23. Gertz, J. et al. Distinct properties of cell-type-specific and shared transcription factor binding sites. Mol. Cell 52, 25–36 (2013).

  24. Oki, S. et al. ChIP‐Atlas: a data‐mining suite powered by full integration of public ChIP–seq data. EMBO Rep. 19, e46255 (2018).

  25. Rodriguez, R. et al. A novel small molecule that alters shelterin integrity and triggers a DNA-damage response at telomeres. J. Am. Chem. Soc. 130, 15758–15759 (2008).

  26. Xu, H. et al. CX-5461 is a DNA G-quadruplex stabilizer with selective lethality in BRCA1/2 deficient tumours. Nat. Commun. 8, 14432 (2017).

  27. Biffi, G., Tannahill, D., Miller, J., Howat, W. J. & Balasubramanian, S. Elevated levels of G-quadruplex formation in human stomach and liver cancer tissues. PLoS One 9, e102711 (2014).

  28. McLuckie, K. I. E. et al. G-quadruplex DNA as a molecular target for induced synthetic lethality in cancer cells. J. Am. Chem. Soc. 135, 9640–9643 (2013).

  29. Zimmer, J. et al. Targeting BRCA1 and BRCA2 deficiencies with G-quadruplex-interacting compounds. Mol. Cell 61, 449–460 (2016).

  30. Schmidt, D. et al. ChIP–seq: using high-throughput sequencing to discover protein–DNA interactions. Methods 48, 240–248 (2009).

  31. Biffi, G., Tannahill, D., McCafferty, J. & Balasubramanian, S. Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 5, 182–186 (2013).

  32. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).

  33. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  34. Khan, A. & Mathelier, A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics 18, 287 (2017).

  35. Chin, S.-F. et al. Shallow whole genome sequencing for robust copy number profiling of formalin-fixed paraffin-embedded breast cancers. Exp. Mol. Pathol. 104, 161–169 (2018).

  36. Huang, W., Loganantharaj, R., Schroeder, B., Fargo, D. & Li, L. PAVIS: a tool for peak annotation and visualization. Bioinformatics 29, 3097–3099 (2013).

  37. Le, D. D., Di Antonio, M., Chan, L. K. M. & Balasubramanian, S. G-quadruplex ligands exhibit differential G-tetrad selectivity. Chem. Commun. (Camb.) 51, 8048–8050 (2015).

Acknowledgements

The authors would like to thank the staff in the Genomics and the Compliance and Biobanking Core Facilities at the Cancer Research UK Cambridge Institute. We acknowledge support from the University of Cambridge and the Cancer Research UK program. We kindly thank Ben Czech of the Hannon laboratory for providing S2 D. melanogaster cells. The Caldas and Balasubramanian laboratories are supported by core funding from Cancer Research UK (C14303/A17197). The Balasubramanian laboratory is supported by program grant funding from Cancer Research UK (C9681/A18618 and C9681/A29214) and a Wellcome Trust Investigator Award (209441/z/17/z). We acknowledge Marco Di Antonio for conceptualizing the design of i-PDS. Prior to the revision of this study, work by Robert Hänsel-Hertsch was supported by the Balasubramanian group; afterwards, this work was additionally supported by core funding from the Center for Molecular Medicine Cologne.

Author information

Contributions

R.H.-H., C.C. and S.B. conceived this study. R.H.-H. developed quantitative G4-ChIP–seq. R.H.-H., A.B., O.M.R. and C.C. designed the PDTX model panel for this study. R.H.-H. processed all the PDTX tissues and prepared the chromatin samples for G4-ChIP–seq. R.H.-H., W.W.I.H. and K.G.Z. optimized and performed G4-ChIP–seq. A.M., A.B., O.M.R. and C.C. performed and interpreted genomic and transcriptomic characterization of all PDTX models. A. Shea performed the G4-ligand treatment assay, which was analyzed by O.M.R. R.H.-H., G.M. and A. Simeone developed and implemented a computational pipeline to measure normalization performance. X.Z. synthesized i-PDS with the support of S.A., and performed G4-ligand in vitro experiments and analysis. R.H.-H. and A. Simeone performed all the G4-ChIP–seq-related computational analysis. R.H.-H., D.T., C.C. and S.B. interpreted the results with input from all authors. R.H.-H. prepared the figures. R.H.-H., D.T., C.C. and S.B. wrote the manuscript with contributions from all authors.

Corresponding author

Correspondence to Shankar Balasubramanian.

Ethics declarations

Competing interests

S.B. is an advisor and shareholder of Cambridge Epigenetix Ltd.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Quantitative G4-ChIP-seq (qG4-ChIP-seq).

a, Hierarchical clustering based on Euclidean distance for all 108 PDTX qG4-ChIP-seq libraries. Top: input subtraction (counts per million); middle: input subtraction plus reference normalization using all recovered reads (Total recovery); bottom: input subtraction plus reference normalization using G4 regions to calculate normalization factors. Blue and red brackets indicate the sorted position of the labels in the dendrogram. Label naming convention: PDTX.model_replicate.number. b, Fold-enrichment (left bar chart) and absolute abundance (right bar chart) of ~26,000 qG4-ChIP-seq regions with different G4 motif classes relative to random occurrence in the human genome. Dashed line indicates the fold-enrichment threshold (1). Mean values calculated across 10 randomizations are shown. c, Genome browser (IGV) snapshots of a CG4R (constantly enriched region, bottom 19 kb window) and a ∆G4R (top 10 kb window).
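
The fold-enrichment in panel b can be illustrated with a minimal sketch. It assumes one plausible reading of the caption: the observed number of regions overlapping a motif class is divided by the mean overlap count obtained from the randomizations. The function names, the interval representation and the toy numbers are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import mean


def count_overlaps(regions, motifs):
    """Count regions that overlap at least one motif interval.

    Intervals are (chrom, start, end) tuples with half-open coordinates.
    """
    hits = 0
    for chrom, start, end in regions:
        if any(c == chrom and s < end and start < e for c, s, e in motifs):
            hits += 1
    return hits


def fold_enrichment(observed, randomized_counts):
    """Observed overlap count divided by the mean count over randomizations."""
    expected = mean(randomized_counts)
    if expected == 0:
        raise ValueError("expected overlap is zero; fold enrichment is undefined")
    return observed / expected


# Toy data (hypothetical): three regions, three motif occurrences.
regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
motifs = [("chr1", 150, 160), ("chr2", 10, 20), ("chr2", 70, 90)]
observed = count_overlaps(regions, motifs)

# Overlap counts from 10 shuffled region sets (hypothetical values).
random_counts = [1, 0, 1, 1, 0, 1, 1, 0, 1, 2]
enrichment = fold_enrichment(observed, random_counts)
```

A value above 1 corresponds to the dashed threshold line in the caption, that is, more overlap than expected from random placement.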

Extended Data Fig. 2 Differential G4 DNA prevalence in the genomic and transcriptomic architecture of breast cancer.

a, Left: ∆G4R levels in each PDTX model; right: mean ∆G4R levels in PDTX models classified by IC or ER status. Respectively, n = 11 PDTX models are ER-pos or ER-neg, n = 7 PDTX models are IC8/1 and n = 13 are IC10/9. Data are presented as mean values ± SEM. b, Distribution of the level of ∆G4Rs found in amplified (AMP + GAIN), unchanged (neutral, NEUT) and deleted (HETD + HOMD) CNA regions. AMP = highly amplified regions, GAIN = amplified, NEUT = unchanged or neutral regions, HETD = heterozygous deletions, HOMD = homozygous deletions. Significance was calculated using an unpaired nonparametric Mann-Whitney test; **** = P < 0.0001 (p-values are exact, two-tailed). c, Scatter plots of ∆G4R fold-enrichment (right) or level (left) in highly amplified regions (AMPs) vs AMP frequency. Pearson correlations (r) and corresponding p-values (p; exact two-tailed p-value for parametric correlation) are shown. For b and c, n = 22 PDTX models were considered. d, Fisher test of ∆G4R overlap with promoters (±1 kbp of the TSS) of upregulated (log2FC > 0.6, p < 0.05) gene sets that define IC membership as defined by Curtis et al. (Nature 2012; ref. 2). The mean of 22 ∆G4R associations is shown. Error bars = standard error of the sample mean; significance from paired t test, *** = p < 0.001 (p-value is exact, two-tailed). e, Scatter plots of individual PDTXs. Y-axis: percentage of gene promoters, for the distinct gene signatures of the 10 different integrative clusters as defined by Curtis et al. (Nature 2012; ref. 2), that overlap with the ∆G4Rs of each PDTX model. X-axis: the significance (-log10(P-value)) of the overlap by chance (Fisher test). The expected IC classification for each PDTX model is highlighted in red. Box plot elements: center line, median; box limits, lower and upper quartiles; whiskers, lowest and highest value.
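
The overlap test in panels d and e can be sketched as a one-sided Fisher's exact test on a 2 × 2 table, computed from the hypergeometric distribution. The caption does not state whether the test was one- or two-sided or which gene universe was used, so the one-sided form, the function name and the toy numbers below are assumptions for illustration only.

```python
from math import comb


def fisher_exact_greater(overlap, set_a, set_b, universe):
    """One-sided Fisher's exact test for enrichment of an overlap.

    overlap  : number of genes present in both sets
    set_a    : size of the first set (e.g. promoters overlapping a region set)
    set_b    : size of the second set (e.g. genes in a signature)
    universe : total number of genes considered
    Returns P(X >= overlap) under the hypergeometric null.
    """
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example (hypothetical numbers): 3 shared genes between two sets of 4,
# drawn from a universe of 8 genes.
p_value = fisher_exact_greater(overlap=3, set_a=4, set_b=4, universe=8)
```

Taking -log10 of such a p-value gives a quantity of the kind plotted on the x-axis of panel e.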

Extended Data Fig. 3 Differentially enriched G4 DNA regions reveal distinct transcription factor programs.

a, Example scatter plots for two PDTXs showing ∆G4R fold-enrichments in different breast cancer-related transcription factor binding sites (TFBS, derived from ChIP-Atlas, see Methods) relative to random overlap (N = 134 transcription factors were considered). For -/10/AB521M, the ∆G4R-TFBS fold-enrichments are plotted against -/10/VHIO179 ∆G4R-TFBS fold-enrichments (top), and for +/8/STG143 the ∆G4R-TFBS fold-enrichments are plotted against those of +/1/HCI005 (bottom). Spearman correlations (r) and corresponding p-values (exact two-tailed p-values for nonparametric correlation) are shown. b, Same as a, but showing Spearman correlations among PDTX models as a hierarchically clustered heatmap. Color intensity and the size of the circle are proportional to the correlation coefficients. c, Left: Heatmap of ∆G4R-TFBS fold-enrichments belonging to the different TF programs as presented in Fig. 3. Heatmaps are sorted by PDTX and TF hierarchical clustering. Missing fold-enrichment values are shown in white. Right: Heatmap of normalized TF expression values (transcripts per million), sorted as in the left heatmap.
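
The Spearman correlation used in panels a and b is the Pearson correlation of rank-transformed values. The following standard-library sketch shows that computation for two vectors of fold-enrichments; the function names and toy values are hypothetical, the p-value is not computed, and in practice an established statistics package would normally be used.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Toy fold-enrichments for four transcription factors in two models (hypothetical).
model_a = [1.2, 3.4, 0.5, 2.0]
model_b = [2.0, 9.0, 1.0, 4.0]
rho = spearman(model_a, model_b)
```

Because only ranks enter the calculation, the coefficient is insensitive to the scale of the fold-enrichments; a constant input vector has zero variance and would raise a division error in this sketch.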

Extended Data Fig. 4 Reduced G4 affinity of isomer-PDS relative to PDS.

a, Left: data for 50 drug responses (area under the curve = AUC) for 4 different PDTX models from IC10 are shown, sorted by increasing response of the -/10/STG201 model (red). Right: median drug response (AUC) values of 4 PDTX models; Wilcoxon matched-pairs signed rank test indicates significant differences. **** = p < 0.0001, ** = p < 0.01 (p-values are exact, two-tailed). b, Fluorescence quench equilibrium dissociation binding assay for PDS and i-PDS. Apparent equilibrium dissociation constants (Kdapp) of PDS and i-PDS with Cy5-labeled H-Telo, c-Myc, Kit-1 and ds-DNA. c, Scatter plots of PDTX AMP, GAIN and NEUT levels (x-axis) against PDTC response (AUC, y-axis) to G4-ligands with enhanced (PDS, CX-5461) and reduced (i-PDS) G4 affinities; see also Methods. Error bars reflect mean, upper and lower limit AUCs. N = 9 PDTC samples. Additionally, N = 3 PDTC samples were independently investigated. Spearman correlation (r) and significance (exact two-tailed p-value for nonparametric correlation) are shown. Box plot elements: center line, median; box limits, lower and upper quartiles; whiskers, lowest and highest value.
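
Drug response is summarized above as an area under the curve (AUC). The caption does not say how the AUC was computed, so the sketch below simply applies the trapezoidal rule to a dose-response series as one common way to obtain such a summary; the function name, the dose scale and the toy values are assumptions.

```python
def trapezoid_auc(doses, responses):
    """Area under a response curve by the trapezoidal rule.

    Points are sorted by dose first, so input order does not matter.
    """
    if len(doses) != len(responses) or len(doses) < 2:
        raise ValueError("need at least two (dose, response) pairs")
    points = sorted(zip(doses, responses))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy dose-response series (hypothetical): response falls linearly with dose.
auc = trapezoid_auc([0, 1, 2], [1.0, 0.5, 0.0])
```

Per-model AUC values of this kind are what panel c plots against copy-number levels before the Spearman correlation is computed.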

Supplementary information

Supplementary Information

Supplementary Data 1

Reporting Summary

Supplementary Tables

Supplementary Tables 1–6 and Supplementary Data 2

Source data

  • Source Data Fig. 1: statistical source data
  • Source Data Fig. 2: statistical source data
  • Source Data Fig. 3: statistical source data
  • Source Data Fig. 4: statistical source data
  • Source Data Extended Data Fig. 1: statistical source data
  • Source Data Extended Data Fig. 2: statistical source data
  • Source Data Extended Data Fig. 3: statistical source data
  • Source Data Extended Data Fig. 4: statistical source data

Cite this article

Hänsel-Hertsch, R., Simeone, A., Shea, A. et al. Landscape of G-quadruplex DNA structural regions in breast cancer. Nat Genet 52, 878–883 (2020). https://doi.org/10.1038/s41588-020-0672-8

  • DOI: https://doi.org/10.1038/s41588-020-0672-8
