Abstract
Recent methods for spatial imaging of tissue samples can identify up to ~100 individual proteins1,2,3 or RNAs4,5,6,7,8,9,10 at single-cell resolution. However, the number of proteins or genes that can be studied in these approaches is limited by long imaging times. Here we introduce Composite In Situ Imaging (CISI), a method that leverages structure in gene expression across both cells and tissues to limit the number of imaging cycles needed to obtain spatially resolved gene expression maps. CISI defines gene modules that can be detected using composite measurements from imaging probes for subsets of genes. The data are then decompressed to recover expression values for individual genes. CISI further reduces imaging time by not relying on spot-level resolution, enabling lower magnification acquisition, and is overall about 500-fold more efficient than current methods. Applying CISI to 12 mouse brain sections, we accurately recovered the spatial abundance of 37 individual genes from 11 composite measurements covering 180 mm² and 476,276 cells.
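The composite-measurement and decompression idea described above can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation (their code is in the CISI repository listed under Code availability). It assumes expression factors as X = U W with a hypothetical module dictionary `U` and sparse module activities `W`, simulates composite measurements Y = A X with a random binary composition matrix `A`, and uses a plain orthogonal matching pursuit (`omp`, a stand-in technique) for the decompression step. Only the 37 genes and 11 measurements come from the text; every other size and name is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_measurements = 37, 11             # values from the abstract
n_modules, n_cells = 20, 200                 # hypothetical sizes
max_per_gene, sparsity = 3, 2                # hypothetical design choices

# Hypothetical module dictionary U (genes x modules): small nonnegative gene sets.
U = np.zeros((n_genes, n_modules))
for j in range(n_modules):
    members = rng.choice(n_genes, size=4, replace=False)
    U[members, j] = rng.uniform(0.5, 1.5, size=4)

# Sparse module activities W (modules x cells): each cell uses `sparsity` modules.
W = np.zeros((n_modules, n_cells))
for c in range(n_cells):
    active = rng.choice(n_modules, size=sparsity, replace=False)
    W[active, c] = rng.uniform(1.0, 5.0, size=sparsity)
X = U @ W                                    # simulated gene expression

# Binary composition matrix A (measurements x genes):
# each gene is included in 1..max_per_gene composite measurements.
A = np.zeros((n_measurements, n_genes))
for g in range(n_genes):
    k = rng.integers(1, max_per_gene + 1)
    A[rng.choice(n_measurements, size=k, replace=False), g] = 1.0

Y = A @ X                                    # noiseless composite measurements


def omp(D, y, k):
    """Greedy orthogonal matching pursuit: explain y with at most k columns of D."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    residual, support = y.copy(), []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        scores = np.abs(D.T @ residual) / norms
        if support:
            scores[support] = -np.inf        # never pick the same column twice
        support.append(int(np.argmax(scores)))
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef[support] = sol
    return coef


# Decompress: infer sparse module activities per cell, then map back to genes.
D = A @ U
W_hat = np.column_stack([omp(D, Y[:, c], sparsity) for c in range(n_cells)])
X_hat = U @ W_hat

# Per-cell Pearson correlation between original and recovered expression.
corr = [np.corrcoef(X[:, c], X_hat[:, c])[0, 1] for c in range(n_cells)]
print("mean per-cell correlation:", np.nanmean(corr))
```

Recovery quality in this toy setting depends on the random dictionary and composition matrix, so the printed correlation should be read as a property of the simulation rather than as a reproduction of the reported results.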
Data availability
We used publicly available snRNA-seq datasets released by BICCN (U19 Huang generated by the Regev lab; http://data.nemoarchive.org/biccn/grant/huang/macosko_regev/transcriptome/sncell/) and full-length scRNA-seq (the Allen Institute Mouse Whole Cortex and Hippocampus SMART-seq (RRID:SCR_019013)). Raw image data from the large validation study are available for download at the Brain Image Library: https://download.brainimagelibrary.org/49/77/49777378713bb584/.
Code availability
An online repository of code used in this study can be found at https://github.com/cleary-lab/CISI. Please see the accompanying Life Sciences Reporting Summary for additional information.
References
Angelo, M. et al. Multiplexed ion beam imaging of human breast tumors. Nat. Med. 20, 436–442 (2014).
Keren, L. et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell 174, 1373–1387 (2018).
Goltsev, Y. et al. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell 174, 968–981 (2018).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
Shah, S., Lubeck, E., Zhou, W. & Cai, L. In situ transcription profiling of single cells reveals spatial organization of cells in the mouse hippocampus. Neuron 92, 342–357 (2016).
Shah, S., Lubeck, E., Zhou, W. & Cai, L. seqFISH accurately detects transcripts in single cells and reveals robust spatial organization in the hippocampus. Neuron 94, 752–758 (2017).
Wang, X. et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 361, eaat5691 (2018).
Wang, G., Moffitt, J. R. & Zhuang, X. Multiplexed imaging of high-density libraries of RNAs with MERFISH and expansion microscopy. Sci. Rep. 8, 4847 (2018).
Codeluppi, S. et al. Spatial organization of the somatosensory cortex revealed by osmFISH. Nat. Methods 15, 932–935 (2018).
Choi, H. M. T. et al. Third-generation in situ hybridization chain reaction: multiplexed, quantitative, sensitive, versatile, robust. Development 145, dev165753 (2018).
Raj, A., van den Bogaard, P., Rifkin, S. A., van Oudenaarden, A. & Tyagi, S. Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5, 877–879 (2008).
Eng, C. H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).
Cleary, B., Cong, L., Cheung, A., Lander, E. S. & Regev, A. Efficient generation of transcriptomic profiles by random composite measurements. Cell 171, 1424–1436 (2017).
Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018).
Cleary, B. & Regev, A. The necessity and power of random, under-sampled experiments in biology. Preprint at https://arxiv.org/abs/2012.12961 (2020).
Abràmoff, M. D., Magalhães, P. J. & Ram, S. J. Image processing with ImageJ. Biophotonics International https://imagescience.org/meijering/publications/download/bio2004.pdf (2004).
Hörl, D. et al. BigStitcher: reconstructing high-resolution image datasets of cleared and expanded samples. Nat. Methods 16, 870–874 (2019).
Axelrod, S. et al. Starfish: open source image based transcriptomics and proteomics tools. http://github.com/spacetx/starfish (2020).
McQuin, C. et al. CellProfiler 3.0: next-generation image processing for biology. PLoS Biol. 16, e2005970 (2018).
Acknowledgements
We thank A. Hupalowska and L. Gaffney for help with figures; S. Farhi, Y. Eldar and members of the Cleary, Chen, Regev and Lander labs for helpful discussions; and the National Institutes of Health’s (NIH) BICCN for open sharing of data before publication. This work was supported by BICCN (1RF1MH12128901) (B.C., A.R. and F.C.) and NIH 1U19MH114821 (A.R.), the Merkin Institute Fellowship at the Broad Institute (B.C.), the Klarman Cell Observatory, the Howard Hughes Medical Institute, the National Human Genome Research Institute Center of Excellence in Genome Science (RM1HG006193) (A.R.) and the Eric and Wendy Schmidt Fellows Program at the Broad Institute (F.C.).
Author information
Contributions
B.C., F.C. and A.R. conceived the study. B.S., J.B., B.C., E.M. and A.S. performed experiments, with assistance and feedback from J.M. B.C., S.A. and E.H. performed snRNA-seq data analysis and developed the image processing pipeline. B.C. developed and implemented the decompression algorithms. B.C., A.R., F.C., E.S.L. and B.S. wrote the manuscript, with input from all authors.
Ethics declarations
Competing interests
A.R. is a founder and equity holder of Celsius Therapeutics, an equity holder in Immunitas Therapeutics and, until August 31, 2020, was a Scientific Advisory Board member of Syros Pharmaceuticals, Neogene Therapeutics, Asimov and Thermo Fisher Scientific. From August 1, 2020, A.R. is an employee of Genentech, a member of the Roche Group. E.S.L. serves on the Board of Directors for Codiak BioSciences and Neon Therapeutics and serves on the Scientific Advisory Board of F-Prime Capital Partners and Third Rock Ventures. E.S.L. also serves on the Board of Directors of the Innocence Project, Count Me In and the Biden Cancer Initiative and on the Board of Trustees for the Parker Institute for Cancer Immunotherapy.
Additional information
Peer review information Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Marker gene expression in snRNA-seq clusters.
For each of 37 genes, shown is the distribution of expression (individual violin plots; y-axis) in each of 23 snRNA-Seq clusters (x axis). Marker genes for similar cell types are grouped together with the cell type labeled on top.
Extended Data Fig. 2 Analysis of modular factorization based on gene and module diversity.
Pearson correlation (y-axis) between the original expression levels of 37 genes in each cell and those approximated in those cells by Sparse Module Activity Factorization (SMAF). Contour plots depict the density of cells at each level of correlation with either a given number of genes expressed (a; x-axis) or a given number of gene modules by SMAF decomposition (b; x-axis).
Extended Data Fig. 3 Evaluation of performance of simulated compositions.
Distribution of Pearson correlation between the original and recovered expression levels of 37 genes in each cell (y axis) across simulation trials for different numbers of composite measurements (a), or for different measurement densities, set by the maximum number of measurements in which each gene was included (b). In (a) the maximum number of compositions per gene is 3, and in (b) the number of compositions is 10. Mini boxplots depict median (center dots), inner quartiles (upper and lower bounds of box for 25th and 75th percentile), and 1.5× the interquartile range (minima and maxima of whiskers).
Extended Data Fig. 4 Autoencoder-based decompression successfully recovers accurate spatial patterns of individual genes compared to direct measurement on the same section.
RNA images recovered by decompression with the segmentation-free algorithm (magenta) and directly measured (green) in the same tissue section. White: images overlap exactly. Genes are grouped based on the section in which their direct measurements were made. Insets for all genes in a section show the same region, or an adjacent region if no cells for a given gene were present. Scale bar: 500 μm. Representative fields of view in each tissue section were chosen such that every gene validated in a tissue section could be visualized in the same region, while quantification of overlap (correlation) was calculated using all cells in a given tissue section, or using randomly selected testing cells (where indicated).
Extended Data Fig. 5 Comparison of autoencoding and segmentation-based decompression.
Individual gene images recovered (magenta) using the autoencoding algorithm (left) or the segmentation-based algorithm (right) are overlaid with direct measurement (green) of the genes in the same tissue sections (white: direct overlap). For segmentation-based decompression, the decompressed signal for each gene is projected uniformly over each segmentation mask. Scale bar: 500 μm. Representative fields of view were selected to highlight expression of indicated genes, while quantification of overlap (correlation) was calculated using all cells in a given tissue section, or using randomly selected testing cells (where indicated).
Extended Data Fig. 6 Evaluation of recovered signals before and after co-measurement adjustment.
a,b, Adjustment improves recovered signals. Integrated signal intensity for each gene in each cell (individual dots) from direct measurements (x axis) and from estimates recovered from the autoencoder-decompressed images (y axis) either before (a) or after (b) co-measurement correction. c, Example correction. Segmented cell intensities before (left) and after (right) correction for two co-measured genes (Hmha1 and Slc17a7) that were not correlated in snRNA-Seq.
Extended Data Fig. 7 Evaluation based on genes per cell and cell clusters.
a, Distribution of expression diversity (effective number of genes expressed per cell out of 37 total; y axis) in snRNA-Seq, or based on recovered expression levels using autoencoding or segmentation-based decompression (x axis). Mini boxplots depict median (center dots), inner quartiles (upper and lower bounds of box for 25th and 75th percentile), and 1.5× the interquartile range (minima and maxima of whiskers). b, Correspondence (Pearson’s correlation of mean gene expression; color bar) between cell clusters from snRNA-Seq (rows) and those found from post hoc segmentation of images recovered using the autoencoding algorithm (columns). One marker gene for each cluster is indicated.
Extended Data Fig. 8 CISI recapitulates clusters and conditional probabilities from scRNA-Seq.
a,b, Consistent identification of cell-type-specific gene programs in scRNA-Seq and CISI. The correlation coefficient (color bar) between pairs of genes (row and column labels) in scRNA-Seq (a) and decompressed CISI measurements (b). Rows and columns are clustered. Gene clusters of cell-type-specific markers are labeled by the respective cell type. c,d, Consistent cell type expression patterns for IEGs in scRNA-Seq and CISI. Conditional probability (color bars) of IEGs (columns) in cells that express a given gene (rows) in scRNA-Seq (c) and decompressed CISI (d) data.
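The conditional probabilities in panels c,d can be read as P(IEG expressed | given gene expressed), estimated over cells. The sketch below shows one way such a matrix could be computed. It is an illustration under assumptions, not the authors' code: the function name, the toy matrix, and the rule that a gene counts as "expressed" when its value exceeds `threshold` are all hypothetical.

```python
import numpy as np


def conditional_expression_probability(expr, conditions, targets, threshold=0.0):
    """Estimate P(target expressed | condition expressed) across cells.

    expr: cells x genes array of expression values.
    conditions, targets: lists of gene column indices (rows and columns of the result).
    threshold: hypothetical cutoff above which a gene counts as expressed.
    """
    on = expr > threshold                                  # cells x genes, boolean
    cond_on = on[:, conditions].astype(float)
    targ_on = on[:, targets].astype(float)
    n_cond = cond_on.sum(axis=0)                           # cells expressing each condition gene
    joint = cond_on.T @ targ_on                            # co-expression counts
    # Rows whose conditioning gene is never expressed are undefined (NaN).
    return np.where(n_cond[:, None] > 0,
                    joint / np.maximum(n_cond[:, None], 1.0),
                    np.nan)


# Toy example: 4 cells x 3 genes; gene 2 plays the role of an IEG.
expr = np.array([[1.0, 0.0, 2.0],
                 [3.0, 0.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [2.0, 2.0, 0.0]])
probs = conditional_expression_probability(expr, conditions=[0, 1], targets=[2])
print(probs)
```

Applying the same function to an scRNA-Seq matrix and to a decompressed expression matrix would give two arrays that can be compared entry by entry, which is the kind of comparison the panels depict.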
Extended Data Fig. 9 Gene-level correlation with validation measurements.
For each gene (individual dots) validated in each tissue (colors) the correlation across all segmented cells (y-axis) between values recovered by CISI and directly measured values is plotted vs the average expression level (TPM) in cells expressing the gene in scRNA-Seq (x-axis, left), or the percentage of cells expressing the gene (x-axis, right). Individual data points labeled by gene are provided in Supplementary Table 10.
Supplementary information
Supplementary Tables
Excel workbook with Supplementary Tables 1–10
About this article
Cite this article
Cleary, B., Simonton, B., Bezney, J. et al. Compressed sensing for highly efficient imaging transcriptomics. Nat Biotechnol 39, 936–942 (2021). https://doi.org/10.1038/s41587-021-00883-x
DOI: https://doi.org/10.1038/s41587-021-00883-x