  • Analysis

Quantifying the effect of experimental perturbations at single-cell resolution

Abstract

Current methods for comparing single-cell RNA sequencing datasets collected in multiple conditions focus on discrete regions of the transcriptional state space, such as clusters of cells. Here we quantify the effects of perturbations at the single-cell level using a continuous measure of the effect of a perturbation across the transcriptomic space. We describe this space as a manifold and develop a relative likelihood estimate of observing each cell in each of the experimental conditions using graph signal processing. This likelihood estimate can be used to identify cell populations specifically affected by a perturbation. We also develop vertex frequency clustering to extract populations of affected cells at the level of granularity that matches the perturbation response. The accuracy of our algorithm at identifying clusters of cells that are enriched or depleted in each condition is, on average, 57% higher than the next-best-performing algorithm tested. Gene signatures derived from these clusters are more accurate than those of six alternative algorithms in ground truth comparisons.
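
The idea of a per-cell relative likelihood can be illustrated with a small, self-contained sketch. This is not the MELD implementation: it assumes a toy one-dimensional "manifold", an unweighted k-nearest-neighbour graph, and a simple Laplacian-regularized low-pass filter, y = (I + βL)⁻¹x, solved by Jacobi iteration. The function names (`knn_graph`, `smooth`, `relative_likelihood`) and the parameters `k`, `beta` and `n_iter` are hypothetical and chosen only for illustration.

```python
def knn_graph(positions, k):
    """Build a symmetric, unweighted k-nearest-neighbour graph over 1-D positions."""
    n = len(positions)
    adj = [set() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: abs(positions[i] - positions[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrize: an edge exists if either cell picks the other
    return adj


def smooth(adj, signal, beta=1.0, n_iter=200):
    """Low-pass filter a graph signal by solving (I + beta * L) y = signal.

    L = D - A is the combinatorial graph Laplacian. The system is strictly
    diagonally dominant, so plain Jacobi iteration converges.
    """
    y = list(signal)
    for _ in range(n_iter):
        y = [(signal[i] + beta * sum(y[j] for j in adj[i]))
             / (1.0 + beta * len(adj[i]))
             for i in range(len(adj))]
    return y


def relative_likelihood(adj, labels, conditions, beta=1.0):
    """Smooth one indicator signal per condition, then normalize per cell."""
    densities = {}
    for cond in conditions:
        indicator = [1.0 if lab == cond else 0.0 for lab in labels]
        total = sum(indicator)
        # Give each condition unit total mass so unequal sample sizes do not bias the result.
        indicator = [v / total for v in indicator]
        densities[cond] = smooth(adj, indicator, beta)
    result = []
    for i in range(len(labels)):
        row_sum = sum(densities[cond][i] for cond in conditions)
        result.append({cond: densities[cond][i] / row_sum for cond in conditions})
    return result


# Toy data: 12 cells along a line, first half control, second half treated.
positions = list(range(12))
labels = ["control"] * 6 + ["treated"] * 6
adj = knn_graph(positions, k=2)
likelihoods = relative_likelihood(adj, labels, ["control", "treated"], beta=1.0)

for pos, lab, lik in zip(positions, labels, likelihoods):
    print(pos, lab, round(lik["treated"], 3))
```

In this toy setting, cells far from the boundary between conditions receive a likelihood close to 0 or 1 for "treated", while cells near the boundary receive intermediate values, which is the kind of continuous, per-cell measure the abstract describes.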

Fig. 1: Illustrative description of perturbation analysis using MELD and VFC.
Fig. 2: Vertex frequency analysis using the sample-associated indicator signals and relative likelihood.
Fig. 3: Quantitative comparison of the sample-associated relative likelihood and VFC.
Fig. 4: MELD recovers signature of TCR activation.
Fig. 5: Characterizing chordin Cas9 mutagenesis with MELD.
Fig. 6: MELD characterizes the response to IFN-γ in pancreatic islet cells.

Data availability

Gene expression counts matrices prepared in ref. 13 were accessed from NCBI GEO database accession GSE92872. Gene expression counts matrices prepared in ref. 15 were downloaded from NCBI GEO accession GSE112294. The pancreatic islets datasets are available on NCBI GEO at accession GSE161465.

Code availability

Code for the MELD and VFC algorithms implemented in Python is available as part of the MELD package on GitHub (https://github.com/KrishnaswamyLab/MELD) and on the Python Package Index. The GitHub repository also contains tutorials, code to reproduce the analysis of the zebrafish dataset and code associated with several of the quantitative comparisons.

References

  1. Luecken, M. D. & Theis, F. J. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol. Syst. Biol. 15, e8746 (2019).

  2. Weinreb, C., Wolock, S., Klein, A. M. & Berger, B. SPRING: a kinetic interface for visualizing high dimensional single-cell expression data. Bioinformatics 34, 1246–1248 (2018).

  3. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat. Biotechnol. 37, 1482–1492 (2019).

  4. van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716–729 (2018).

  5. Shekhar, K. et al. Comprehensive classification of retinal bipolar neurons by single-cell transcriptomics. Cell 166, 1308–1323 (2016).

  6. Levine, J. H. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015).

  7. Xu, C. & Su, Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980 (2015).

  8. Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 37, 38–44 (2019).

  9. Patel, A. P. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014).

  10. Finak, G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).

  11. Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).

  12. Jaitin, D. A. et al. Dissecting immune circuits by linking CRISPR-pooled screens with single-cell RNA-seq. Cell 167, 1883–1896 (2016).

  13. Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297–301 (2017).

  14. Gao, X., Hu, D., Gogol, M. & Li, H. ClusterMap: comparing analyses across multiple single cell RNA-seq profiles. Bioinformatics 35, 3038–3045 (2018).

  15. Wagner, D. E. et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987 (2018).

  16. Farrell, J. A. et al. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 360, eaar3131 (2018).

  17. Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Milo: differential abundance testing on single-cell data using k-NN graphs. Preprint at bioRxiv https://doi.org/10.1101/2020.11.23.393769 (2020).

  18. Büttner, M., Ostner, J., Müller, C., Theis, F. & Schubert, B. scCODA: a Bayesian model for compositional single-cell data analysis. Preprint at bioRxiv https://doi.org/10.1101/2020.12.14.422688 (2020).

  19. Moon, K. R. et al. Manifold learning-based methods for analyzing single-cell RNA-sequencing data. Curr. Opin. Syst. Biol. 7, 36–46 (2018).

  20. Shuman, D. I., Narang, S. K., Frossard, P., Ortega, A. & Vandergheynst, P. The emerging field of signal processing on graphs: extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Process. Mag. 30, 83–98 (2013).

  21. Botev, Z. I., Grotowski, J. F. & Kroese, D. P. Kernel density estimation via diffusion. Ann. Stat. 38, 2916–2957 (2010).

  22. Shuman, D. I., Vandergheynst, P. & Frossard, P. Chebyshev polynomial approximation for distributed signal processing. In: Distributed Computing in Sensor Systems and Workshops (DCOSS). 2011 International Conference on Distributed Computing in Sensor Systems, 1–8 (IEEE, 2011).

  23. Shuman, D. I., Ricaud, B. & Vandergheynst, P. Vertex-frequency analysis on graphs. Applied Comput. Harmon. Anal. 40, 260–291 (2016).

  24. Zappia, L., Phipson, B. & Oshlack, A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 18, 174 (2017).

  25. Traag, V. A., Waltman, L. & van Eck, N. J. From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9, 5233 (2019).

  26. DePasquale, E. A. K. et al. CellHarmony: cell-level matching and holistic comparison of single-cell transcriptomes. Nucleic Acids Res. 47, e138–e138 (2019).

  27. Fischer, D. Theislab/diffxpy. Theis Lab https://github.com/theislab/diffxpy (2020).

  28. Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).

  29. Yen, S.-T. et al. Somatic mosaicism and allele complexity induced by CRISPR/Cas9 RNA injections in mouse zygotes. Dev. Biol. 393, 3–9 (2014).

  30. Hammerschmidt, M. et al. Dino and mercedes, two genes regulating dorsal development in the zebrafish embryo. Development 123, 95–102 (1996).

  31. Schulte-Merker, S., Lee, K. J., McMahon, A. P. & Hammerschmidt, M. The zebrafish organizer requires chordino. Nature 387, 862–863 (1997).

  32. Fisher, S. & Halpern, M. E. Patterning the zebrafish axial skeleton requires early chordin function. Nat. Genet. 23, 442–446 (1999).

  33. Ablamunits, V., Elias, D., Reshef, T. & Cohen, I. R. Islet T cells secreting IFN-γ in NOD mouse diabetes: arrest by p277 peptide treatment. J. Autoimmun. 11, 73–81 (1998).

  34. Lopes, M. et al. Temporal profiling of cytokine-induced genes in pancreatic β-cells by meta-analysis and network inference. Genomics 103, 264–275 (2014).

  35. Muraro, M. J. et al. A single-cell transcriptome atlas of the human pancreas. Cell Syst. 3, 385–394 (2016).

  36. Xin, Y. et al. Pseudotime ordering of single human β-cells reveals states of insulin production and unfolded protein response. Diabetes 67, 1783–1794 (2018).

  37. Farack, L. et al. Transcriptional heterogeneity of beta cells in the intact pancreas. Dev. Cell 48, 115–125 (2019).

  38. Ramana, C. V., Gil, M. P., Schreiber, R. D. & Stark, G. R. Stat1-dependent and -independent pathways in IFN-γ-dependent signaling. Trends Immunol. 23, 96–101 (2002).

  39. Sadler, A. J. & Williams, B. R. G. Interferon-inducible antiviral effectors. Nat. Rev. Immunol. 8, 559–568 (2008).

  40. Fitzgerald, K. A. The interferon inducible gene: viperin. J. Interferon Cytokine Res. 31, 131–135 (2011).

  41. Zheng, Z., Wang, L. & Pan, J. Interferon-stimulated gene 20-kDa protein (ISG20) in infection and disease: review and outlook. Intractable Rare Dis. Res. 6, 35–40 (2017).

  42. Hultcrantz, M. et al. Interferons induce an antiviral state in human pancreatic islet cells. Virology 367, 92–101 (2007).

  43. Stewart, A. F. et al. Human β-cell proliferation and intracellular signaling: part 3. Diabetes 64, 1872–1885 (2015).

  44. Chen, X. et al. MLL-AF9 initiates transformation from fast-proliferating myeloid progenitors. Nat. Commun. 10, 5767 (2019).

  45. Dutrow, E. V. et al. The human accelerated region HACNS1 modifies developmental gene expression in humanized mice. Preprint at https://www.biorxiv.org/content/10.1101/2019.12.11.873075v1 (2019).

  46. Savell, K. E. et al. A dopamine-induced gene expression signature regulates neuronal function and cocaine response. Sci. Adv. 6, eaba4221 (2020).

  47. Chung, K. M. et al. Endocrine–exocrine signaling drives obesity-associated pancreatic ductal adenocarcinoma. Cell 181, 832–847 (2020).

  48. Ravindra, N. G. et al. Single-cell longitudinal analysis of SARS-CoV-2 infection in human airway epithelium. Preprint at https://www.biorxiv.org/content/10.1101/2020.05.06.081695v2 (2020).

  49. Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).

  50. Coifman, R. R. & Lafon, S. Diffusion maps. Applied Comput. Harmon. Anal. 21, 5–30 (2006).

  51. Mack, Y. P. & Rosenblatt, M. Multivariate k-nearest neighbor density estimates. J. Multivar. Anal. 9, 1–15 (1979).

  52. Biau, G., Chazal, F., Cohen-Steiner, D., Devroye, L. & Rodríguez, C. A weighted k-nearest neighbor density estimate for geometric inference. Electron. J. Stat. 5, 204–237 (2011).

  53. Kung, Y.-H., Lin, P.-S. & Kao, C.-H. An optimal k-nearest neighbor for density estimation. Stat. Probabil. Lett. 82, 1786–1791 (2012).

  54. Von Luxburg, U. & Alamgir, M. Density estimation from unweighted k-nearest neighbor graphs: a roadmap. In: Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z. & Weinberger, K. Q. (eds.) Advances in Neural Information Processing Systems 26, 225–233 (Curran Associates, 2013).

  55. Silverman, B. W. Density Estimation for Statistics and Data Analysis (Routledge, 2018).

  56. Hammond, D. K., Vandergheynst, P. & Gribonval, R. Wavelets on graphs via spectral graph theory. Applied Comput. Harmon. Anal. 30, 129–150 (2011).

  57. Perraudin, N., Ricaud, B., Shuman, D. & Vandergheynst, P. Global and local uncertainty principles for signals on graphs. APSIPA Trans. Signal Inform. Process. 7, E3 (2018); https://doi.org/10.1017/ATSIP.2018.2

  58. Mallat, S. A Wavelet Tour of Signal Processing: The Sparse Way (Academic Press, 2008).

  59. Zhou, D. & Schölkopf, B. A regularization framework for learning from graph data. In: ICML Workshop on Statistical Relational Learning and Its Connections to Other Fields 15, 67–68 (2004).

  60. Ham, J., Lee, D. D. & Saul, L. K. Semisupervised alignment of manifolds. Proc. Annu. Conf. Uncertainty in Artificial Intelligence (eds Ghahramani, Z. & Cowell, R.) (AUAI Press, 2005).

  61. Belkin, M., Matveeva, I. & Niyogi, P. Regularization and semi-supervised learning on large graphs. In: International Conference on Computational Learning Theory, 624–638 (Springer, 2004).

  62. Ando, R. K. & Zhang, T. Learning on graph with Laplacian regularization. In: Schölkopf, B., Platt, J. C. & Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19, 25–32 (MIT Press, 2007).

  63. Weinberger, K. Q., Sha, F., Zhu, Q. & Saul, L. K. Graph Laplacian regularization for large-scale semidefinite programming. In: Schölkopf, B., Platt, J. C. & Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19, 1489–1496 (MIT Press, 2007).

  64. He, X., Ji, M., Zhang, C. & Bao, H. A variance minimization criterion to feature selection using Laplacian regularization. IEEE Trans. Pattern Anal. Mach. Intell. 33, 2013–2025 (2011).

  65. Liu, X., Zhai, D., Zhao, D., Zhai, G. & Gao, W. Progressive image denoising through hybrid graph Laplacian regularization: a unified framework. IEEE Trans. Image Process. 23, 1491–1503 (2014).

  66. Pang, J., Cheung, G., Ortega, A. & Au, O. C. Optimal graph Laplacian regularization for natural image denoising. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2294–2298 (IEEE, 2015).

  67. Pang, J. & Cheung, G. Graph Laplacian regularization for image denoising: analysis in the continuous domain. IEEE Trans. Image Process. 26, 1770–1785 (2017).

  68. Perraudin, N. et al. GSPBOX: a toolbox for signal processing on graphs. Preprint at https://arxiv.org/abs/1408.5781 (2016).

  69. Barron, M. & Li, J. Identifying and removing the cell-cycle effect from single-cell RNA-sequencing data. Sci. Rep. 6, 33892 (2016).

  70. Belkin, M. & Niyogi, P. Convergence of Laplacian eigenmaps. In: Schölkopf, B., Platt, J. C. & Hoffman, T. (eds.) Advances in Neural Information Processing Systems 19, 129–136 (MIT Press, 2006).

  71. Coifman, R. R. & Maggioni, M. Diffusion wavelets. Applied Comput. Harmon. Anal. 21, 53–94 (2006).

  72. Chaudhuri, P. & Marron, J. S. Scale space view of curve estimation. Ann. Stat. 28, 408–428 (2000).

  73. Perraudin, N., Holighaus, N., Søndergaard, P. L. & Balazs, P. Designing Gabor windows using convex optimization. Appl. Math. Comput. 330, 266–287 (2018).

  74. Ng, A. Y., Jordan, M. I. & Weiss, Y. On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems 849–856 (NIPS, 2001).

Acknowledgements

The authors would like to thank C. Vejnar, R. Coifman, J. Noonan, V. Tornini and C. Kontur for fruitful discussions. We would also like to thank G. Wang of the Yale Center for Genome Analysis for help in preparing the pancreatic islet data. This research was supported, in part, by the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (NIH) (award no. F31HD097958) (to D.B.); the Gruber Foundation (to S.G.); IVADO Professor startup and operational funds, IVADO Fundamental Research Project grant PRF-2019-3583139727 (to G.W.); NIH grants R01GM135929 and R01GM130847 (to G.W. and S.K.); and Chan-Zuckerberg Initiative grants 182702 and CZF2019-002440 (to S.K.). The content provided here is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.

Author information

Contributions

D.B.B., S.K., G.W., D.v.D. and A.J.G. envisioned the project. D.B.B., J.S., A.T., S.K. and G.W. developed the mathematical formulation of the problem and related numerical analysis. D.B.B., J.S. and S.G. implemented the code. D.B.B. and S.K. performed the analysis of biological and simulated data. A.L.P. and K.C.H. generated and assisted with the analysis of the pancreatic islet dataset. A.J.G. assisted with the analysis of the zebrafish data and related writing. D.B.B., J.S., A.T., S.K. and G.W. wrote the paper. S.G. assisted with the writing.

Corresponding authors

Correspondence to David van Dijk or Smita Krishnaswamy.

Ethics declarations

Competing interests

The authors declare the following competing interest: S.K. is a paid scientific advisor to AI Therapeutics.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–14, Tables 1 and 2 and Notes 1–3

Reporting Summary

About this article

Cite this article

Burkhardt, D.B., Stanley, J.S., Tong, A. et al. Quantifying the effect of experimental perturbations at single-cell resolution. Nat Biotechnol 39, 619–629 (2021). https://doi.org/10.1038/s41587-020-00803-5

  • DOI: https://doi.org/10.1038/s41587-020-00803-5
