Abstract
More than 800 million people suffer from kidney disease, yet the mechanism of kidney dysfunction is poorly understood. In the present study, we define the genetic association with kidney function in 1.5 million individuals and identify 878 (126 new) loci. We map the genotype effect on the methylome in 443 kidneys, transcriptome in 686 samples and single-cell open chromatin in 57,229 kidney cells. Heritability analysis reveals that methylation variation explains a larger fraction of heritability than gene expression. We present a multi-stage prioritization strategy and prioritize target genes for 87% of kidney function loci. We highlight key roles of proximal tubules and metabolism in kidney function regulation. Furthermore, the causal role of SLC47A1 in kidney disease is defined in mice with genetic loss of Slc47a1 and in human individuals carrying loss-of-function variants. Our findings emphasize the key role of bulk and single-cell epigenomic information in translating genome-wide association studies into identifying causal genes, cellular origins and mechanisms of complex traits.
Data availability
The data of eGFRcrea GWAS, kidney meQTLs and kidney eQTLs produced in the present study are publicly available online at the Susztaklab Kidney Biobank (https://susztaklab.com/GWAS; https://susztaklab.com/Kidney_meQTL; https://susztaklab.com/Kidney_eQTL) and figshare (https://doi.org/10.6084/m9.figshare.15183495)91. The GWAS summary statistics are also available at the GWAS Catalog (accession no. GCST90100220). The RNA-seq and human kidney snATAC-seq data have been deposited with the Gene Expression Omnibus (GEO) under accession nos. GSE115098, GSE173343, GSE172008 and GSE200547 and the Common Metabolic Diseases Genome Atlas (https://cmdga.org/search/?type=Experiment&searchTerm=FNIH0000000). The Integrative Genomics Viewer visualization of human kidney snATAC-seq is publicly available at https://susztaklab.com/Human_snATAC. The summary statistics of five eGFRcrea GWAS datasets used for GWAS meta-analysis were obtained from consortium websites (download links provided in Supplementary Table 1). No consent was obtained to share individual-level genotype data for kidney samples. There is no mechanism to obtain consent because kidney tissue was collected as medical discard and the samples were permanently deidentified. Summary statistics for GWAS heritability analysis were obtained from the Alkes Price lab (https://alkesgroup.broadinstitute.org/LDSCORE/independent_sumstats)37. Mouse kidney snATAC-seq data were obtained from the GEO (accession no. GSE157079)60 and mouse kidney single-cell RNA-seq data from the GEO (accession no. GSE107585)56. Drug–gene interactions were identified using the Drug Gene Interaction Database (DGIdb v.4.2.0, https://www.dgidb.org)45. Source data are provided with this paper.
Code availability
Custom code used in the present study is available on GitHub (https://github.com/hbliu/Kidney_Epi_Pri) and Zenodo (https://doi.org/10.5281/zenodo.6392494)92.
References
GBD Chronic Kidney Disease Collaboration. Global, regional, and national burden of chronic kidney disease, 1990–2017: a systematic analysis for the Global Burden of Disease Study 2017. Lancet 395, 709–733 (2020).
Kottgen, A. et al. Multiple loci associated with indices of renal function and chronic kidney disease. Nat. Genet. 41, 712–717 (2009).
Pattaro, C. et al. Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat. Commun. 7, 10023 (2016).
Wuttke, M. et al. A catalog of genetic loci associated with kidney function from analyses of a million individuals. Nat. Genet. 51, 957–972 (2019).
Hellwege, J. N. et al. Mapping eGFR loci to the renal transcriptome and phenome in the VA Million Veteran Program. Nat. Commun. 10, 3842 (2019).
Sullivan, K. M. & Susztak, K. Unravelling the complex genetics of common kidney diseases: from variants to mechanisms. Nat. Rev. Nephrol. 16, 628–640 (2020).
Qiu, C. et al. Renal compartment-specific genetic variation analyses identify new pathways in chronic kidney disease. Nat. Med. 24, 1721–1731 (2018).
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
Ko, Y. A. et al. Genetic-variation-driven gene-expression changes highlight genes with important functions for kidney disease. Am. J. Hum. Genet. 100, 940–953 (2017).
Gillies, C. E. et al. An eQTL landscape of kidney tissue in human nephrotic syndrome. Am. J. Hum. Genet. 103, 232–244 (2018).
Sheng, X. et al. Mapping the genetic architecture of human traits to cell types in the kidney identifies mechanisms of disease and potential treatments. Nat. Genet. 53, 1322–1333 (2021).
Yao, D. W., O’Connor, L. J., Price, A. L. & Gusev, A. Quantifying genetic effects on disease mediated by assayed gene expression levels. Nat. Genet. 52, 626–633 (2020).
Reik, W. Stability and flexibility of epigenetic gene regulation in mammalian development. Nature 447, 425–432 (2007).
Boix, C. A., James, B. T., Park, Y. P., Meuleman, W. & Kellis, M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature 590, 300–307 (2021).
Jones, P. A. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484–492 (2012).
Ziller, M. J. et al. Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481 (2013).
Hannon, E. et al. Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nat. Neurosci. 19, 48–54 (2016).
Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414 e24 (2016).
Taylor, D. L. et al. Integrative analysis of gene expression, DNA methylation, physiological traits, and genetic variation in human skeletal muscle. Proc. Natl Acad. Sci. USA 116, 10883–10888 (2019).
Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
van Zuydam, N. R. et al. A genome-wide association study of diabetic kidney disease in subjects with type 2 diabetes. Diabetes 67, 1414–1427 (2018).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Sinnott-Armstrong, N. et al. Genetics of 35 blood and urine biomarkers in the UK Biobank. Nat. Genet. 53, 185–194 (2021).
Stanzick, K. J. et al. Discovery and prioritization of variants and genes for kidney function in >1.2 million individuals. Nat. Commun. 12, 4350 (2021).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Barton, A. R., Sherman, M. A., Mukamel, R. E. & Loh, P. R. Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses. Nat. Genet. 53, 1260–1269 (2021).
Kaushal, G. P., Haun, R. S., Herzog, C. & Shah, S. V. Meprin A metalloproteinase and its role in acute kidney injury. Am. J. Physiol. Renal Physiol. 304, F1150–F1158 (2013).
Wen, X. et al. Transgenic expression of the human MRP2 transporter reduces cisplatin accumulation and nephrotoxicity in Mrp2-null mice. Am. J. Pathol. 184, 1299–1308 (2014).
Lu, W. et al. NFIA haploinsufficiency is associated with a CNS malformation syndrome and urinary tract defects. PLoS Genet. 3, e80 (2007).
Eales, J. M. et al. Uncovering genetic mechanisms of hypertension through multi-omic analysis of the kidney. Nat. Genet. 53, 630–637 (2021).
Chambers, B. E. et al. Tfap2a is a novel gatekeeper of nephron differentiation during kidney development. Development 146, dev172387 (2019).
Jonker, J. W., Wagenaar, E., Van Eijl, S. & Schinkel, A. H. Deficiency in the organic cation transporters 1 and 2 (Oct1/Oct2 [Slc22a1/Slc22a2]) in mice abolishes renal secretion of organic cations. Mol. Cell Biol. 23, 7902–7908 (2003).
Sheng, X. et al. Systematic integrated analysis of genetic and epigenetic variation in diabetic kidney disease. Proc. Natl Acad. Sci. USA 117, 29013–29024 (2020).
Delahaye, F. et al. Genetic variants influence on the placenta regulatory landscape. PLoS Genet. 14, e1007785 (2018).
Husquin, L. T. et al. Exploring the genetic basis of human population differences in DNA methylation and their causal impact on immune gene regulation. Genome Biol. 19, 222 (2018).
Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
Loh, P. R., Kichaev, G., Gazal, S., Schoech, A. P. & Price, A. L. Mixed-model association for biobank-scale datasets. Nat. Genet. 50, 906–908 (2018).
Hekselman, I. & Yeger-Lotem, E. Mechanisms of tissue and cell-type specificity in heritable traits and diseases. Nat. Rev. Genet. 21, 137–150 (2020).
Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Ulirsch, J. C. et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683–693 (2019).
Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871 e8 (2018).
Groopman, E. E. et al. Diagnostic utility of exome sequencing for kidney disease. N. Engl. J. Med. 380, 142–151 (2019).
Wu, Y. et al. Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat. Commun. 9, 918 (2018).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Freshour, S. L. et al. Integration of the drug-gene interaction database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 49, D1144–D1151 (2021).
Guo, D. et al. Selective inhibition on organic cation transporters by carvedilol protects mice from cisplatin-induced nephrotoxicity. Pharm. Res. 35, 204 (2018).
Sarhan, M., von Mässenhausen, A., Hugo, C., Oberbauer, R. & Linkermann, A. Immunological consequences of kidney cell death. Cell Death Dis. 9, 114 (2018).
Miao, N. et al. The cleavage of gasdermin D by caspase-11 promotes tubular epithelial cell pyroptosis and urinary IL-18 excretion in acute kidney injury. Kidney Int. 96, 1105–1120 (2019).
Tsuda, M. et al. Targeted disruption of the multidrug and toxin extrusion 1 (mate1) gene in mice reduces renal secretion of metformin. Mol. Pharmacol. 75, 1280–1286 (2009).
Vilaysane, A. et al. The NLRP3 inflammasome promotes renal inflammation and contributes to CKD. J. Am. Soc. Nephrol. 21, 1732–1744 (2010).
Xu, Y. et al. A role for tubular necroptosis in cisplatin-induced AKI. J. Am. Soc. Nephrol. 26, 2647–2658 (2015).
Mulay, S. R., Linkermann, A. & Anders, H. J. Necroinflammation in kidney disease. J. Am. Soc. Nephrol. 27, 27–39 (2016).
Gamazon, E. R. et al. Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation. Nat. Genet. 50, 956–967 (2018).
Li, Y. I. et al. RNA splicing is a primary link between genetic variation and disease. Science 352, 600–604 (2016).
Zhang, Z. et al. Genetic analyses support the contribution of mRNA N6-methyladenosine (m6A) modification to human disease heritability. Nat. Genet. 52, 939–949 (2020).
Park, J. et al. Single-cell transcriptomics of the mouse kidney reveals potential cellular targets of kidney disease. Science 360, 758–763 (2018).
Li, Y. et al. Integration of GWAS summary statistics and gene expression reveals target cell types underlying kidney function traits. J. Am. Soc. Nephrol. 31, 2326–2340 (2020).
Fairfax, B. P. et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science 343, 1246949 (2014).
Guan, Y. et al. Dnmt3a- and Dnmt3b-decommissioned fetal enhancers are linked to kidney disease. J. Am. Soc. Nephrol. 31, 765–782 (2020).
Miao, Z. et al. Single cell regulatory landscape of the mouse kidney highlights cellular differentiation programs and disease targets. Nat. Commun. 12, 2277 (2021).
Sveinbjornsson, G. et al. Rare mutations associating with serum creatinine and chronic kidney disease. Hum. Mol. Genet. 23, 6935–6943 (2014).
Levey, A. S. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612 (2009).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Delaneau, O., Zagury, J.-F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5 (2013).
Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
Howie, B., Marchini, J. & Stephens, M. Genotype imputation with thousands of genomes. G3 1, 457–470 (2011).
Zhou, W., Triche, T. J. Jr., Laird, P. W. & Shen, H. SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions. Nucleic Acids Res. 46, e123 (2018).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinf. 12, 323 (2011).
Fang, R. et al. Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat. Commun. 12, 1337 (2021).
Stouffer, S. A., Suchman, E. A., Devinney, L. C., Star, S. A. & Williams Jr, R. M. The American Soldier: Adjustment during army life (Studies in Social Psychology in World War II) Vol. 1 (Princeton Univ. Press, 1949).
Chu, A. Y. et al. Multiethnic genome-wide meta-analysis of ectopic fat depots identifies loci associated with adipocyte development and differentiation. Nat. Genet. 49, 125–130 (2017).
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
Yang, J. et al. FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272 (2012).
Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).
Stegle, O., Parts, L., Durbin, R. & Winn, J. A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLoS Comput. Biol. 6, e1000770 (2010).
Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
Han, B. & Eskin, E. Interpreting meta-analyses of genome-wide association studies. PLoS Genet. 8, e1002555 (2012).
Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
Giambartolomei, C. et al. A Bayesian framework for multiple trait colocalization from summary association statistics. Bioinformatics 34, 2538–2545 (2018).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Huang da, W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
Park, J. et al. Exome-wide evaluation of rare coding variants using electronic health records identifies new gene-phenotype associations. Nat. Med. 27, 66–72 (2021).
Bramer, G. R. International statistical classification of diseases and related health problems. Tenth revision. World Health Stat. Q. 41, 32–36 (1988).
Carroll, R. J., Bastarache, L. & Denny, J. C. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 30, 2375–2376 (2014).
Li, Q., Peng, X., Yang, H., Wang, H. & Shu, Y. Deficiency of multidrug and toxin extrusion 1 enhances renal accumulation of paraquat and deteriorates kidney injury in mice. Mol. Pharm. 8, 2476–2483 (2011).
Liu, H. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease (Data Set). figshare https://doi.org/10.6084/m9.figshare.15183495 (2022).
Liu, H. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease (Code). Zenodo https://doi.org/10.5281/zenodo.6392494 (2022).
Acknowledgements
We thank the Molecular Pathology and Imaging Core (grant no. P30-DK050306 to K.S.) and Diabetes Research Center (grant no. P30-DK19525 to K.S.) at the University of Pennsylvania for their services. The work in K.S.’s laboratory has been supported by the NIH (grant nos. R01DK087635, R01DK076077 and R01DK105821 to K.S.).
Author information
Contributions
K.S. and H.L. conceived, planned and oversaw the present study and wrote the manuscript. H.L. analyzed the data. T.D. performed the wet lab experiments. Z.Y.M., X.S., A.A., Z.M., B.F.V., H.Z.L. and C.B. assisted with data generation and analysis. J.P., M.D.R., H.M.T.V. and G.N.N. performed PheWAS analysis. M.P. performed histopathological descriptor measurement. G.D. and S.Y. provided Slc47a1 KO mice and helped with the animal experiments.
Ethics declarations
Competing interests
The laboratory of K.S. receives funding from GSK, Regeneron, Gilead, Merck, Boehringer Ingelheim, Bayer, Novartis, Maze, Jnana, Ventus and Novo Nordisk. The funders had no influence on the data analysis. K.S. serves on the scientific advisory board (SAB) of Jnana pharmaceuticals and receives equity. M.D.R. serves on the SAB for Goldfinch Bio and Cipherome. The other authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Cristian Pattaro, Pascal Schlosser and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Meta-analysis of eGFRcrea GWAS and validation using eGFRcys and BUN GWAS.
a. Manhattan plots of the eGFRcrea GWAS meta-analysis (N = 1,508,659 individuals) and of the source eGFRcrea GWAS datasets from CKDGen, UKBB, MVP, PAGE and SUMMIT. For each panel, the x-axis is the chromosomal location of the SNP and the y-axis is the strength of association, -log10(p). The two-sided p value was obtained from the GWAS studies. b. Scatter plot of effect size correlation between the meta-analysis eGFRcrea GWAS (x-axis) and the five source eGFRcrea GWAS datasets from CKDGen, UKBB, MVP, PAGE and SUMMIT (y-axis). The density of dots from low to high is shown from yellow to red. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. c. Scatter plot of effect sizes between eGFRcrea GWAS (N = 1,508,659 individuals) and eGFRcys GWAS (N = 421,714 individuals). Significant eGFRcrea GWAS variants passing two-sided p < 5 × 10−8 in this study were used for the plot. Red dots represent validated variants showing nominally significant (two-sided GWAS p < 0.05) association with eGFRcys in the same effect direction. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. d. Scatter plot of effect sizes between eGFRcrea GWAS (N = 1,508,659 individuals) and BUN GWAS (N = 852,678 individuals). Significant eGFRcrea GWAS variants passing two-sided p < 5 × 10−8 in this study were used for the plot. Blue dots represent validated variants showing nominally significant (two-sided GWAS p < 0.05) association with BUN in the opposite effect direction. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. e. Venn diagram of eGFRcrea GWAS significant variants validated by eGFRcys GWAS or BUN GWAS. Y-axis is the strength of GWAS association, -log10(p value based on the z statistic).
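The validation rule described in panels c and d can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Assoc` container and the `is_validated` helper are hypothetical names, and the only inputs assumed are an effect size and a two-sided p value per variant in each GWAS.

```python
from dataclasses import dataclass


@dataclass
class Assoc:
    """Summary statistics for one variant in one GWAS (hypothetical container)."""
    beta: float  # effect size
    p: float     # two-sided p value


def is_validated(discovery: Assoc, replication: Assoc, *,
                 same_direction: bool, alpha: float = 0.05) -> bool:
    """Return True if the replication GWAS supports the discovery variant.

    same_direction=True  mirrors the eGFRcys rule (concordant effect direction).
    same_direction=False mirrors the BUN rule (opposite effect direction).
    """
    if replication.p >= alpha:  # must be nominally significant
        return False
    product = discovery.beta * replication.beta
    return product > 0 if same_direction else product < 0


# Made-up numbers for illustration only.
egfrcrea = Assoc(beta=-0.004, p=1e-12)
egfrcys = Assoc(beta=-0.003, p=0.01)
bun = Assoc(beta=0.002, p=0.03)

validated_by_cys = is_validated(egfrcrea, egfrcys, same_direction=True)
validated_by_bun = is_validated(egfrcrea, bun, same_direction=False)
```

The direction flag is flipped for BUN because higher BUN, unlike higher eGFR, indicates worse kidney function.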
Extended Data Fig. 2 Identification and function annotation of independent eGFRcrea GWAS loci.
a. The strategy to identify independent loci and novel eGFRcrea GWAS loci. b. Pie chart of the number of independent loci categorized into different groups by comparing previously reported sentinel variants tagging independent loci. c. Pie chart of the number of novel independent loci validated by eGFRcys GWAS and/or BUN GWAS. d. Functional enrichment analysis of 126 novel independent loci annotated by GREAT. The positions of lead SNPs were inputted into GREAT (http://great.stanford.edu/public/html/), and the two nearest genes within 1 Mb were used for functional enrichment in the mouse phenotype catalogue. The two-sided uncorrected p value was calculated by binomial test over inputted loci, and the false discovery rate q value was calculated for multiple test correction. e. Literature-based gene function of the closest genes to the 126 novel kidney disease loci. f. Expression of the mouse orthologues of 42 kidney disease genes (of the 126 newly identified GWAS genes) in adult mouse kidney samples (GSE107585). The mean expression was calculated for each cell type and z-scores were plotted. g. LocusZoom view of three novel independent loci, MEP1A, ABCC2 and NFIA. Y-axis is the strength of association, -log10(two-sided p value from the GWAS meta-analysis z-statistic).
Extended Data Fig. 3 Meta-analysis of the kidney cis-eQTL data.
a. Manhattan plot of eQTL meta-analysis by integrating four eQTL datasets consisting of a total of 686 kidney samples. X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value based on the z-statistic from eQTL meta-analysis). b. Manhattan plot of eQTLs by Sheng et al. (n = 356 human kidney tubule samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). c. Manhattan plot of eQTLs by Ko et al. (n = 91 human kidney cortex samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). d. Manhattan plot of eQTLs by GTEx (v8) (n = 73 human kidney cortex samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). e. Manhattan plot of eQTLs by NephQTL (n = 166 human kidney tubule samples). X-axis is the chromosomal location of the SNP, and y-axis is the strength of association, -log10(two-sided p value from linear regression eQTL model). f. Scatter plots of effect size correlation between the eQTL meta-analysis and each individual eQTL dataset. The common variant-gene pairs passing eQTL p < 0.00001 in either of the two datasets were used for each plot. The density of dots from low to high is represented by yellow to red. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation.
Extended Data Fig. 4 Functional annotation of kidney-specific meQTLs and mCpGs.
a. Tissue-specific and shared meQTLs across kidney, blood and skeletal muscle tissue. M value > 0.9 was used to define meQTLs for each set. b. Fraction of meQTL CpGs annotated by ChromHMM chromatin states in kidney, blood (CD3+) cells and skeletal muscle. c. Transcription factor motif enrichment (HOMER) of tissue-specific mCpGs. The p value was calculated by binomial test. d. Enrichment of kidney-specific meQTL CpGs in cell type-specific open chromatin regions determined by snATAC-seq in human kidney. X-axis is the odds ratio and y-axis is the strength of enrichment, -log10(two-sided chi-square test p). Size of the dot represents the number of kidney-specific meQTL CpG sites. e. Enrichment of kidney-specific meQTL SNPs in GWAS traits. X-axis is the odds ratio and y-axis is the strength of enrichment, -log10(two-sided chi-square test p). Size of the dot represents the number of SNPs and colors represent the type of GWAS trait.
Extended Data Fig. 5 Human kidney expression quantitative trait methylation (eQTM).
a. Schematic representation of the eQTM analysis. b. eQTM discovery rate estimated by the number of identified CpG~Gene pairs using different numbers of PEER factors as covariates. c. Volcano plot of eQTMs. The x-axis is the beta value and the y-axis is the strength of association (-log10(p)). Negative and positive eQTMs are colored in blue and red, respectively. d. The fraction of meQTL CpGs identified by eQTM analysis. The red line is the global FDR, the dark blue line is the CpG-level FDR and the light blue line is the nominal significance threshold. The x-axis is the eQTM significance and the y-axis is the cumulative fraction of meQTL CpGs. The vertical line represents the significance cutoff of 0.05. e. Validation of the eQTMs in publicly available eQTM studies. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. f. Scatter plot of CpG methylation (x-axis) and gene expression of PM20D1 and CYP4F11 (y-axis) in 414 kidney samples. Each dot represents one kidney sample. Correlation coefficient was calculated using Spearman’s rho (R) statistic and two-sided p value was calculated using asymptotic t approximation. g. IGV visualization of eQTM association at the PM20D1, CYP4F11 and TBX5 loci. h. Number and fraction of negative and positive eQTM CpGs associated with the expression of nearest or distal genes. The nearest gene was defined based on the TSS (transcription start site) to eQTM CpG distance. A gene was defined as distal if its TSS was not the closest TSS to the eQTM CpG. Two-sided p value was calculated by chi-square test. i. Relative fraction of negative and positive eQTM CpGs localized to regulatory regions in the kidney. j. Profile plot of H3K4me3, H3K4me1, H3K27ac and H3K27me3 histone modifications across negative and positive eQTM CpGs and 5 kb flanking regions.
Extended Data Fig. 6 Estimated proportion of heritability mediated by kidney methylation and expression.
a. Estimation of heritability (\(h_{med}^2/h_g^2\)) mediated by kidney meQTL, kidney eQTL and the eQTL of the best non-kidney GTEx tissue for three kidney function traits based on three different biomarkers (eGFRcrea, eGFRcys and BUN). Here, best non-kidney GTEx tissue refers to the non-kidney tissue whose eQTL resulted in the highest estimate of \(h_{med}^2/h_g^2\) compared to all other non-kidney tissues. The x-axis represents the different QTL groups and the y-axis represents \(h_{med}^2/h_g^2\) estimated for the three kidney function traits. Data are presented as mean ± SD. P values were calculated by one-tailed paired t test. b, c. Estimation of eGFRcrea GWAS heritability (\(h_{med}^2/h_g^2\)) mediated by methylation and expression for different numbers of human kidneys using multi-ancestry datasets (b) and European-ancestry datasets (c). The x-axis represents the sample sizes used for the meQTL and eQTL, and the y-axis represents \(h_{med}^2/h_g^2\) estimated for eGFRcrea GWAS. d. Estimation of eGFRcrea and eGFRcys GWAS heritability mediated by meQTL and eQTL from different tissues. The x-axis represents \(h_{med}^2/h_g^2\), while the y-axis represents eQTL or meQTL data obtained from different tissues. meQTL data are shown in red and eQTL data in blue. e. Estimation of heritability mediated by kidney eQTL and non-kidney eQTL for six kidney function traits and 28 independent non-kidney GWAS traits. The x-axis represents \(h_{med}^2/h_g^2\), while the y-axis represents different GWAS traits. For each trait, kidney eQTL data are shown in blue and the best non-kidney GTEx tissue in gray. Here, best non-kidney GTEx tissue refers to the non-kidney tissue whose eQTL resulted in the highest estimate of \(h_{med}^2/h_g^2\) compared to all other non-kidney tissues. (b-e) For each bar plot, the centre of the error bar represents the value of \(h_{med}^2/h_g^2\), and the error bars represent the jackknife standard error estimated for \(h_{med}^2/h_g^2\).
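For readers unfamiliar with the notation, the quantity plotted in these panels can be written out as follows. This is a sketch under stated assumptions: the decomposition follows the mediated expression score regression framework of Yao et al. cited in the reference list, \(h_{nonmed}^2\) is notation introduced here for the non-mediated component, and the jackknife expression is the generic delete-one-block form over \(B\) equally sized blocks, which may differ in detail from the software used in the study.

\[
h_g^2 = h_{med}^2 + h_{nonmed}^2, \qquad \hat{\theta} = \frac{\hat{h}_{med}^2}{\hat{h}_g^2}
\]

\[
\widehat{\mathrm{SE}}_{jack}(\hat{\theta}) = \sqrt{\frac{B-1}{B} \sum_{b=1}^{B} \left( \hat{\theta}_{(-b)} - \bar{\theta} \right)^2 }, \qquad \bar{\theta} = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}_{(-b)}
\]

Here \(\hat{\theta}_{(-b)}\) denotes the estimate of \(h_{med}^2/h_g^2\) recomputed with block \(b\) of SNPs left out.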
Extended Data Fig. 7 Enrichment of GWAS trait heritability mediated by enhancer methylation in 128 tissues/cell types.
a. GWAS heritability mediated by kidney methylation categorized as enhancers in 128 tissues/cell types. The x-axis shows the GWAS traits, while the y-axis shows tissue enhancers in kidney and 127 other tissue samples from the Roadmap project ChromHMM data. Gray, non-significant, while white to red indicates significant enrichment (nominal two-sided p < 0.05 calculated by MESC). Asterisk indicates h2med enrichment passing FDR q < 0.05 (accounting for 4,352 tests for 128 enhancer CpG sets and 34 GWAS traits). b. GWAS heritability mediated by blood methylation categorized as enhancers in 128 tissues/cell types. The x-axis shows the GWAS traits, while the y-axis shows tissue enhancers in kidney and 127 other tissue samples from the Roadmap project ChromHMM data. Gray, non-significant, while white to red indicates significant enrichment (nominal two-sided p < 0.05 calculated by MESC). Asterisk indicates h2med enrichment passing FDR q < 0.05 (accounting for 4,352 tests for 128 enhancer CpG sets and 34 GWAS traits).
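The multiple-testing burden quoted in this legend is 128 enhancer CpG sets × 34 GWAS traits = 4,352 tests. The legend does not state which FDR procedure produced the q values, so the sketch below assumes the Benjamini-Hochberg step-up procedure purely for illustration; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q values in the same order as the input p values."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Number of tests stated in the legend.
N_ENHANCER_SETS = 128
N_TRAITS = 34
N_TESTS = N_ENHANCER_SETS * N_TRAITS

# Toy example with four made-up p values.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice the full vector of 4,352 enrichment p values would be passed in at once, and cells with q < 0.05 would receive an asterisk.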
Extended Data Fig. 8 Gene prioritization for eGFRcrea GWAS variants and functional annotation.
a. Schematic representation of the gene prioritization strategy based on eight prioritization datasets and methods. b. Number of eGFRcrea GWAS variants prioritized using different priority score thresholds. c. eGFRcrea GWAS independent loci prioritized by this study (priority score ≥ 1) and previous studies. The number represents the number of independent loci overlapping with independent signals prioritized (GPS score ≥ 1) by Stanzick et al. and/or creatinine-associated exome rare variants by Backman et al. or Barton et al. d. Features of the top variants prioritized for the 328 loci with priority score ≥ 3. Each row shows the top variant for each locus. Loci were ordered from top to bottom based on priority scores from 8 to 3. Loci with the same priority score were ordered by GWAS significance from strongest (dark blue) to weakest (light blue). Each column represents a feature overlapped with the variant. For each feature, the fraction of overlapping variants is shown in the upper panel. The 22 top prioritized genes supported by all eight datasets and methods are listed. e. Tissue specificity of 566 prioritized genes (priority score ≥ 3) in 54 tissue types (GTEx v8) using GENE2FUNC of FUMA. The x-axis is the 54 tissue types ordered according to significance of enrichment in up-regulated differentially expressed gene sets. Y-axis represents enrichment significance, -log10(p value calculated by hypergeometric test). Tissues with Bonferroni p value < 0.05 are shown in red. f. Heatmap of the expression of 417 mouse orthologues of prioritized genes in the adult mouse kidney single-cell dataset. The mean expression was calculated for each cell type and z-scores were plotted. Right panel shows 87 genes with the highest level of expression in proximal tubule cells.
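One way to read the priority score in this legend is as a count of how many of the eight datasets and methods support a variant-gene pair, which is consistent with scores ranging up to 8 and with genes "supported by all eight". The sketch below implements that reading as an assumption; the evidence labels, gene names and helper functions are hypothetical placeholders, not the study's actual categories or code.

```python
# Hypothetical placeholder labels for the eight lines of evidence.
EVIDENCE_SOURCES = tuple(f"evidence_{i}" for i in range(1, 9))


def priority_score(support):
    """Count how many of the eight evidence sources support a gene."""
    return sum(1 for name in EVIDENCE_SOURCES if support.get(name, False))


def prioritized(scores, threshold):
    """Return genes with score >= threshold, highest score first, ties by name."""
    hits = [gene for gene, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda gene: (-scores[gene], gene))


# Made-up support tables for three hypothetical genes.
support_by_gene = {
    "GENE_A": {name: True for name in EVIDENCE_SOURCES},        # all eight
    "GENE_B": {"evidence_1": True, "evidence_4": True, "evidence_7": True},
    "GENE_C": {"evidence_2": True},
}
scores = {gene: priority_score(s) for gene, s in support_by_gene.items()}

score_at_least_3 = prioritized(scores, threshold=3)
score_at_least_1 = prioritized(scores, threshold=1)
```

Raising the threshold trades coverage for confidence, which is the trade-off panel b visualizes.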
Extended Data Fig. 9 PheWAS analysis of rs111653425 SLC47A1 variants in UKBB and BioMe Biobanks.
a. Single variant (rs111653425) PheWAS analysis of SLC47A1 in the UKBB dataset. The x-axis is the strength of association, -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype. b. SLC47A1 pLOF burden PheWAS analysis in the BioMe dataset. The x-axis is the strength of association, -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype. c. Single variant (rs111653425) PheWAS analysis of SLC47A1 in the BioMe dataset. The x-axis is the strength of association, -log10(p value calculated by linear regression PheWAS model). Blue line is p = 0.05 and red line is Bonferroni adjusted p = 0.05. The y-axis is the analyzed phenotype.
Extended Data Fig. 10 Slc47a1 loss confers kidney disease risk in mice.
a. The relative expression of fibrosis markers Collagen3 (Col3a1), Collagen4 (Col4a1), Fibronectin (Fn1) and Connective tissue growth factor (Ctgf) in kidneys of control or cisplatin-treated Slc47a1+/+ and Slc47a1−/− mice. Data are presented as mean ± SD. P values were calculated by one-way ANOVA with post hoc Tukey test. n.s., not significant. n = 4 biologically independent Slc47a1+/+ cisplatin mice examined over n = 3 independent Slc47a1+/+ control; n = 5 biologically independent Slc47a1−/− cisplatin mice examined over n = 4 independent Slc47a1+/+ cisplatin mice. b. Relative expression of markers of inflammation Adhesion G protein-coupled receptor E1 (Adgre1), Tumor necrosis factor ligand (Tnfsf12) and Interleukin 1beta (Il1b) in kidneys of control or cisplatin-treated Slc47a1+/+ and Slc47a1−/− mice. Data are presented as mean ± SD. P values were calculated by one-way ANOVA with post hoc Tukey test. n.s., not significant. n = 4 biologically independent Slc47a1+/+ cisplatin mice examined over n = 3 independent Slc47a1+/+ control; n = 5 biologically independent Slc47a1−/− cisplatin mice examined over n = 4 independent Slc47a1+/+ cisplatin mice.
Supplementary information
Supplementary Information
Supplementary Note and Figs. 1–15.
Supplementary Tables
Supplementary Tables 1–28.
Source data
Source Data Fig. 8
Unprocessed scan of gel image for Fig. 8i.
About this article
Cite this article
Liu, H., Doke, T., Guo, D. et al. Epigenomic and transcriptomic analyses define core cell types, genes and targetable mechanisms for kidney disease. Nat Genet 54, 950–962 (2022). https://doi.org/10.1038/s41588-022-01097-w
DOI: https://doi.org/10.1038/s41588-022-01097-w