Abstract
U1 snRNP (U1) functions in splicing introns and telescripting, which suppresses premature cleavage and polyadenylation (PCPA). Using U1 inhibition in human cells, we show that U1 telescripting is selectively required for sustaining long-distance transcription elongation in introns of large genes (median 39 kb). Evidence of widespread PCPA in the same locations in normal tissues reveals that large genes incur natural transcription attrition. Underscoring the importance of U1 telescripting as a gene-size-based mRNA-regulation mechanism, small genes were not sensitive to PCPA, and the spliced-mRNA productivity of ∼1,000 small genes (median 6.8 kb) increased upon U1 inhibition. Notably, these small, upregulated genes were enriched in functions related to acute stimuli and cell-survival response, whereas genes subject to PCPA were enriched in cell-cycle progression and developmental functions. This gene size–function polarization increased in metazoan evolution by enormous intron expansion. We propose that telescripting adds an overarching layer of regulation to size–function-stratified genomes, leveraged by selective intron expansion to rapidly shift gene expression priorities.
References
Kaida, D. et al. U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature 468, 664–668 (2010).
Berg, M.G. et al. U1 snRNP determines mRNA length and regulates isoform expression. Cell 150, 53–64 (2012).
Shi, Y. & Manley, J.L. The end of the message: multiple protein-RNA interactions define the mRNA polyadenylation site. Genes Dev. 29, 889–897 (2015).
Lerner, M.R., Boyle, J.A., Mount, S.M., Wolin, S.L. & Steitz, J.A. Are snRNPs involved in splicing? Nature 283, 220–224 (1980).
Engreitz, J.M. et al. RNA-RNA interactions enable specific targeting of noncoding RNAs to nascent Pre-mRNAs and chromatin sites. Cell 159, 188–199 (2014).
Almada, A.E., Wu, X., Kriz, A.J., Burge, C.B. & Sharp, P.A. Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499, 360–363 (2013).
Ntini, E. et al. Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality. Nat. Struct. Mol. Biol. 20, 923–928 (2013).
Younis, I. et al. Minor introns are embedded molecular switches regulated by highly unstable U6atac snRNA. eLife 2, e00780 (2013).
Dölken, L. et al. High-resolution gene expression profiling for simultaneous kinetic parameter analysis of RNA synthesis and decay. RNA 14, 1959–1972 (2008).
Rabani, M. et al. Metabolic labeling of RNA uncovers principles of RNA production and degradation dynamics in mammalian cells. Nat. Biotechnol. 29, 436–442 (2011).
Yao, C. et al. Transcriptome-wide analyses of CstF64-RNA interactions in global regulation of mRNA alternative polyadenylation. Proc. Natl. Acad. Sci. USA 109, 18773–18778 (2012).
Derti, A. et al. A quantitative atlas of polyadenylation in five mammals. Genome Res. 22, 1173–1183 (2012).
Bradnam, K.R. & Korf, I. Longer first introns are a general property of eukaryotic gene structure. PLoS One 3, e3093 (2008).
Fong, N. et al. Effects of transcription elongation rate and Xrn2 exonuclease activity on RNA polymerase II termination suggest widespread kinetic competition. Mol. Cell 60, 256–267 (2015).
Proudfoot, N.J. Transcriptional termination in mammals: Stopping the RNA polymerase II juggernaut. Science 352, aad9926 (2016).
Connelly, S. & Manley, J.L. A functional mRNA polyadenylation signal is required for transcription termination by RNA polymerase II. Genes Dev. 2, 440–452 (1988).
Vorlová, S. et al. Induction of antagonistic soluble decoy receptor tyrosine kinases by intronic polyA activation. Mol. Cell 43, 927–939 (2011).
Fang, H., Knezevic, B., Burnham, K.L. & Knight, J.C. XGR software for enhanced interpretation of genomic summary data, illustrated by application to immunological traits. Genome Med. 8, 129 (2016).
Supek, F., Bošnjak, M., Škunca, N. & Šmuc, T. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS One 6, e21800 (2011).
Bertagnolli, N.M., Drake, J.A., Tennessen, J.M. & Alter, O. SVD identifies transcript length distribution functions from DNA microarray data and reveals evolutionary forces globally affecting GBM metabolism. PLoS One 8, e78913 (2013).
Gabel, H.W. et al. Disruption of DNA-methylation-dependent long gene repression in Rett syndrome. Nature 522, 89–93 (2015).
Crispino, J.D., Blencowe, B.J. & Sharp, P.A. Complementation by SR proteins of pre-mRNA splicing reactions depleted of U1 snRNP. Science 265, 1866–1869 (1994).
Tarn, W.Y. & Steitz, J.A. SR proteins can compensate for the loss of U1 snRNP functions in vitro. Genes Dev. 8, 2704–2717 (1994).
Fukumura, K., Taniguchi, I., Sakamoto, H., Ohno, M. & Inoue, K. U1-independent pre-mRNA splicing contributes to the regulation of alternative splicing. Nucleic Acids Res. 37, 1907–1914 (2009).
Munding, E.M., Shiue, L., Katzman, S., Donohue, J.P. & Ares, M. Jr. Competition between pre-mRNAs for the splicing machinery drives global regulation of splicing. Mol. Cell 51, 338–348 (2013).
Cooper, T.A., Wan, L. & Dreyfuss, G. RNA and disease. Cell 136, 777–793 (2009).
Miller, J.W. et al. Recruitment of human muscleblind proteins to (CUG)(n) expansions associated with myotonic dystrophy. EMBO J. 19, 4439–4448 (2000).
Timchenko, L.T. et al. Identification of a (CUG)n triplet repeat RNA-binding protein and its expression in myotonic dystrophy. Nucleic Acids Res. 24, 4407–4414 (1996).
Elkon, R., Ugalde, A.P. & Agami, R. Alternative cleavage and polyadenylation: extent, regulation and function. Nat. Rev. Genet. 14, 496–506 (2013).
Catania, F. & Lynch, M. Where do introns come from? PLoS Biol. 6, e283 (2008).
Gelfman, S. et al. Changes in exon-intron structure during vertebrate evolution affect the splicing pattern of exons. Genome Res. 22, 35–50 (2012).
Rogozin, I.B., Carmel, L., Csuros, M. & Koonin, E.V. Origin and evolution of spliceosomal introns. Biol. Direct 7, 11 (2012).
Yates, A. et al. Ensembl 2016. Nucleic Acids Res. 44, D710–D716 (2016).
Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).
Feng, J. et al. GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics 28, 2782–2788 (2012).
Kent, W.J. et al. The human genome browser at UCSC. Genome Res. 12, 996–1006 (2002).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).
Katz, Y., Wang, E.T., Airoldi, E.M. & Burge, C.B. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B 57, 289–300 (1995).
Sims, D. et al. CGAT: computational genomics analysis toolkit. Bioinformatics 30, 1290–1291 (2014).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Kataoka, N., Diem, M.D., Kim, V.N., Yong, J. & Dreyfuss, G. Magoh, a human homolog of Drosophila mago nashi protein, is a component of the splicing-dependent exon-exon junction complex. EMBO J. 20, 6424–6433 (2001).
Acknowledgements
We thank members of our laboratory for helpful discussions and comments on the manuscript. This work was supported by the US National Institutes of Health (R01GM112923 to G.D.). G.D. is supported by the Howard Hughes Medical Institute.
Author information
Contributions
J.-M.O., I.Y., A.M.P., L.W. and G.D. conceived and designed the study. J.-M.O., C.C.V., C.A., I.Y., B.R.S. and Z.Z. performed the experiments. C.D. and C.C.V. performed the bioinformatics analysis. All authors contributed to data analysis. J.-M.O., C.C.V., C.D., J.G., I.Y. and G.D. wrote the manuscript with input from all authors. G.D. is responsible for the project's planning and experimental design.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 U1 inhibition causes a shift of RNA-seq reads from exons to introns.
(a) Evidence for the high purity of 4-shU-labeled RNAs used for RNA-seq. HeLa cells transfected with control or U1 AMO were either left unlabeled or metabolically labeled with 4-shU, and RNA was purified as described in Methods. RT-qPCR was used to quantify two regions, in PELO (exon 2-3 junction) and PRPF38B (exon 1-2 junction). After 2 cycles of 4-shU-labeled RNA selection, significant enrichment of these regions was detected in the elution, in contrast to separate, unlabeled RNA samples, in which they were detected exclusively in the flow-through (wash) (b). Data are represented as mean ± standard deviation (n=3, independent cell cultures). The high-stringency procedure achieved very strong enrichment (80-200 fold) of 4-shU RNAs over unlabeled RNAs. (c) Pie charts showing the mapping distribution of 4-shU-labeled RNA-seq reads among CDS, 5’ UTR, 3’ UTR and introns in control and U1 AMO-treated HeLa cells, as indicated. Histograms showing (d) the total number of junction reads as a percentage of the total sequencing depth and (e) the percentage of these junction reads spanning canonical (previously annotated) exon-exon splice junctions (blue) versus aberrant (non-canonical or de novo) splice junctions (red) in control and U1 AMO samples. Both 4 h and 8 h 4-shU-labeled RNA-seq data are shown.
Supplementary Figure 2 U1 inhibition causes multiple, moderate PCPAs.
(a) PCPA validation by 3’ RACE. Cells transfected with control or U1 AMO for 8 hours and metabolically labeled with 4-shU were used for analysis. After 2 cycles of 4-shU-labeled RNA selection, RNA was converted into cDNA and 3’ RACE was performed as described previously (ref. 1). The blue arrow indicates the location of the forward primer used for 3’ RACE. (b) Genome browser views of RNA-seq for representative genes showing multiple moderate PCPAs in several introns. Venn diagrams showing the overlap between PCPAed genes detected at 4 and 8 hours post transfection (c), and between PCPAed genes and down-regulated genes in the 8 h post-transfection sample (d). (e) Gene size correlates strongly with intron size. Scatterplot showing the Spearman correlation between total gene size and total intron size for all expressed genes (RPKM ≥ 1) in HeLa cells.
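The expression filter and rank correlation named in panel (e) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the RPKM formula is the standard reads-per-kilobase-per-million definition, while the function names and the gene sizes in the example are hypothetical.

```python
from statistics import mean

def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical genes: (total gene size, total intron size) in bp.
gene_sizes = [6_800, 22_800, 39_000, 150_000]
intron_sizes = [4_100, 19_500, 35_200, 143_000]
print(spearman(gene_sizes, intron_sizes))
```

In practice one would first keep only genes with `rpkm(...) >= 1`, matching the threshold in the legend, before computing the correlation.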
Supplementary Figure 3 PCPAed genes are more down-regulated than non-PCPAed genes.
(a) Boxplots showing the gene expression changes in non-PCPAed genes (n=5,052) and PCPAed genes (n=3,590) in the 8 h 4-shU-labeled RNA-seq from cells treated with U1 AMO. For boxplots: center line, median; box limits, first and third quartiles; whiskers, 1.5x IQR; points, outliers. Statistical tests used are described in Methods. (b) Genome browser view of non-PCPAed and small genes with no expression change. (c-e) RT-qPCR confirms the gene expression changes observed by RNA-seq. HeLa cells transfected with control or U1 AMO and metabolically labeled with 4-shU were used for RT-qPCR analysis. ERCC RNA spike-in controls were added to each sample before the rRNA depletion process and used for normalization. Data are represented as mean ± standard deviation (n=3, independent cell cultures). P values were calculated with a two-tailed Student’s t-test. A Poisson test measuring RNA-seq reads in exons normalized to the total mapped reads (P value < 0.01) confirmed the RT-qPCR results. (f) Intronless genes are PCPA resistant upon U1 AMO treatment. Histogram showing the 3’-poly(A) reads in the gene body (internal) and 3’ end region in intronless genes (n = 143), non-PCPAed genes (n = 3,254) and PCPAed genes (n = 2,692). For all genes expressed in HeLa cells (RPKM ≥ 1), only those with 3’-poly(A) reads in either their gene body or 3’ end were selected for each group in this analysis.
Supplementary Figure 4 PCPAed genes lose more exon-exon junctions near the TES than non-PCPAed genes.
Metagene plots showing the ratio of exon-exon junction reads (U1 AMO/control, grey line) binned along the gene body, 5' to 3', in the PCPAed genes (a) and the non-PCPAed genes (b). All genes used for this analysis were scaled, from TSS to TES, to the same length (3 kb). The thick red line represents a smoothed fit to the data points.
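The scaling and binning behind such a metagene ratio can be sketched as below. This is a rough illustration under stated assumptions: the bin count, the pseudocount and the data layout (dicts with `tss`, `tes` and junction position lists) are hypothetical, and the legend does not say how the authors normalized or smoothed the data.

```python
def scaled_bin(position, tss, tes, n_bins):
    """Map a genomic position to a bin along the TSS-to-TES axis.

    Works for both strands: on the minus strand tss > tes, so numerator
    and denominator are both negative and the fraction stays in [0, 1].
    """
    fraction = (position - tss) / (tes - tss)
    return min(int(fraction * n_bins), n_bins - 1)

def metagene_ratio(genes, n_bins=30, pseudocount=1.0):
    """Per-bin ratio of treated to control junction reads across genes.

    The pseudocount (an assumption) avoids division by zero in empty bins.
    """
    control = [0] * n_bins
    treated = [0] * n_bins
    for gene in genes:
        for pos in gene["control_junctions"]:
            control[scaled_bin(pos, gene["tss"], gene["tes"], n_bins)] += 1
        for pos in gene["treated_junctions"]:
            treated[scaled_bin(pos, gene["tss"], gene["tes"], n_bins)] += 1
    return [(t + pseudocount) / (c + pseudocount)
            for t, c in zip(treated, control)]

# Hypothetical gene that loses its 3'-most junction read after treatment.
example = [{
    "tss": 0,
    "tes": 100,
    "control_junctions": [5, 55, 95],
    "treated_junctions": [5, 55],
}]
print(metagene_ratio(example, n_bins=2))
```

Because every gene is mapped to the same fractional axis, genes of very different lengths contribute to the same bins, which is what allows a drop in the ratio near the TES to be compared between gene groups.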
Supplementary Figure 5 Pol II metagenes of all expressed genes and non-PCPAed genes.
(a) Effect of U1 inhibition on upstream antisense transcription. Genome browser view of each gene and its upstream antisense transcript showing PCPA in both directions upon U1 AMO treatment. (b) Metagene plot of pol II ChIP-seq reads for all expressed genes (n = 9,744) and highly up-regulated, non-PCPAed genes (n = 115) in control (black) or U1 AMO-treated (red) cells, relative to TSS regions (TSS -1000 bp to +500 bp) and TES regions (500 bp upstream of the annotated mRNA 3’ ends to 1000 bp downstream). Each gene’s body, between TSS + 500 bp and TES - 500 bp, was scaled to 2 kb.
Supplementary Figure 6 U1 AMO increases transcription attrition in large genes.
(a) Transcription attrition occurs naturally in large genes. Genome browser views of RNA-seq of representative genes with transcription attrition. RefSeq gene structures along with any additional isoforms from AceView are shown underneath the panels (RefSeq is the top track). (b) Distribution of internal and last-exon 3’-poly(A) reads versus gene size. Scatter plots of each gene’s 3’-poly(A) reads in either the gene body (from TSS up to, but excluding, the last exon) or only the last exon in several human tissues (ref. 12). The regression lines show that internal poly(A) reads depend on gene size (left panel, R² = 0.15, P value < 0.05), whereas last-exon poly(A) reads do not (right panel, R² = 5e-4, P value > 0.05). The arrow on the x-axis marks the median gene length of all expressed genes, 22.8 kb. (c) Scatter plots of the ratio of the total number of 3’-poly(A) reads in the last exon to those in the gene body (from TSS up to, but excluding, the last exon) in control (left panel) and U1 AMO (right panel) RNA-seq. The arrow and vertical blue dashed line on the x-axis mark the median gene length of all expressed genes, 22.8 kb. Fewer genes in the upper-right, blue-colored zone indicate increased transcription attrition (less full-length mRNA) in large genes upon U1 AMO treatment.
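The comparison in panel (c) amounts to a per-gene ratio plus a count of large genes that remain above some ratio level. The sketch below illustrates that idea only: the 22.8 kb median comes from the legend, whereas the pseudocount, the ratio cutoff of 1.0 that stands in for the shaded zone, and all read counts are hypothetical.

```python
MEDIAN_GENE_LENGTH_BP = 22_800  # median expressed-gene length from the legend

def last_exon_to_body_ratio(last_exon_reads, body_reads, pseudocount=1.0):
    """Ratio of last-exon to gene-body 3'-poly(A) reads.

    The pseudocount is an assumption that keeps the ratio defined
    for genes with no internal poly(A) reads.
    """
    return (last_exon_reads + pseudocount) / (body_reads + pseudocount)

def count_large_full_length(genes, ratio_cutoff=1.0):
    """Count genes longer than the median whose ratio exceeds the cutoff."""
    count = 0
    for gene in genes:
        ratio = last_exon_to_body_ratio(gene["last_exon_reads"],
                                        gene["body_reads"])
        if gene["length_bp"] > MEDIAN_GENE_LENGTH_BP and ratio > ratio_cutoff:
            count += 1
    return count

# Hypothetical read counts for the same three genes in two conditions.
control = [
    {"length_bp": 120_000, "last_exon_reads": 40, "body_reads": 10},
    {"length_bp": 60_000, "last_exon_reads": 30, "body_reads": 12},
    {"length_bp": 7_000, "last_exon_reads": 50, "body_reads": 0},
]
u1_amo = [
    {"length_bp": 120_000, "last_exon_reads": 8, "body_reads": 45},
    {"length_bp": 60_000, "last_exon_reads": 25, "body_reads": 14},
    {"length_bp": 7_000, "last_exon_reads": 55, "body_reads": 0},
]
print(count_large_full_length(control), count_large_full_length(u1_amo))
```

A lower count in the treated sample than in the control corresponds to the legend's reading of fewer large genes producing predominantly full-length transcripts.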
Supplementary Figure 7 U2 AMO induces splicing inhibition.
Cells transfected with control, U1 or U2 AMO for 8 h and metabolically labeled with 4-shU were used for analysis. The decreases in spliced products with U2 AMOs show that splicing is dependent on the U2 snRNP.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–7 and Supplementary Tables 1, 3, 6, 7. (PDF 1356 kb)
Supplementary Table 2
The list of PCPAed genes. (XLSX 2496 kb)
Supplementary Table 4
The ratio of 3′-poly(A) reads in the last exon vs. gene body. (XLSX 394 kb)
Supplementary Table 5
Gene ontology enrichment analysis of non-PCPAed, upregulated or PCPAed, down-regulated genes. (XLSX 83 kb)
Supplementary Data Set 1
Original western blot images shown in Figure 3c. (a) Uncropped immunoblot of Cyr61 and Magoh. (b) Uncropped immunoblot of Myc. Arrows in a and b indicate the corresponding full-length proteins. (PDF 246 kb)
About this article
Cite this article
Oh, JM., Di, C., Venters, C. et al. U1 snRNP telescripting regulates a size–function-stratified human genome. Nat Struct Mol Biol 24, 993–999 (2017). https://doi.org/10.1038/nsmb.3473
DOI: https://doi.org/10.1038/nsmb.3473