Functionally informed fine-mapping and polygenic localization of complex trait heritability

Abstract

Fine-mapping aims to identify causal variants impacting complex traits. We propose PolyFun, a computationally scalable framework to improve fine-mapping accuracy by leveraging functional annotations across the entire genome (not just genome-wide-significant loci) to specify prior probabilities for fine-mapping methods such as SuSiE or FINEMAP. In simulations, PolyFun + SuSiE and PolyFun + FINEMAP were well calibrated and identified >20% more variants with a posterior causal probability >0.95 than their nonfunctionally informed counterparts. In analyses of 49 UK Biobank traits (average n = 318,000), PolyFun + SuSiE identified 3,025 fine-mapped variant–trait pairs with posterior causal probability >0.95, a >32% improvement versus SuSiE. We used posterior mean per-SNP heritabilities from PolyFun + SuSiE to perform polygenic localization, constructing minimal sets of common SNPs causally explaining 50% of common SNP heritability; these sets ranged in size from 28 (hair color) to 3,400 (height) to 2 million (number of children). In conclusion, PolyFun prioritizes variants for functional follow-up and provides insights into complex trait architectures.
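The idea of a minimal SNP set explaining a fixed share of heritability can be illustrated with a short Python sketch. It assumes a hypothetical array of posterior mean per-SNP heritabilities and simply takes the largest values until the requested fraction of their total is reached. The function name `minimal_snp_set` and the toy values are illustrative assumptions, not part of the PolyFun or PolyLoc software, and the published procedure may involve additional estimation steps that this sketch omits.

```python
import numpy as np


def minimal_snp_set(per_snp_h2, fraction=0.5):
    """Return indices of the smallest SNP set whose summed per-SNP
    heritability reaches `fraction` of the total (hypothetical helper)."""
    h2 = np.asarray(per_snp_h2, dtype=float)
    if h2.ndim != 1 or h2.size == 0:
        raise ValueError("per_snp_h2 must be a non-empty 1-D array")
    if np.any(h2 < 0):
        raise ValueError("per-SNP heritabilities must be non-negative")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = h2.sum()
    if total <= 0:
        raise ValueError("total heritability must be positive")

    # Sort SNPs from largest to smallest per-SNP heritability.
    order = np.argsort(-h2, kind="stable")
    cumulative = np.cumsum(h2[order])

    # First position at which the running sum reaches the target.
    k = int(np.searchsorted(cumulative, fraction * total, side="left")) + 1
    k = min(k, h2.size)  # guard against floating-point round-off
    return order[:k]


# Toy values (made up for illustration only).
toy_h2 = [0.05, 0.30, 0.10, 0.40, 0.15]
print(minimal_snp_set(toy_h2, fraction=0.5))
```

Taking the largest values first gives the smallest possible set for a given target, which is why a trait with a few large-effect SNPs yields a small set while a highly polygenic trait yields a very large one.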

Fig. 1: Calibration, power and computational cost of fine-mapping methods in main simulations.
Fig. 2: Summary of fine-mapping results for UK Biobank traits.
Fig. 3: Examples of the advantages of functionally informed fine-mapping for UK Biobank traits.
Fig. 4: Functional enrichment of SuSiE fine-mapped common SNPs for UK Biobank traits.
Fig. 5: Polygenic localization results for UK Biobank traits.

Data availability

PolyFun fine-mapping results generated in the present study are available for public download at http://data.broadinstitute.org/alkesgroup/polyfun_results. Summary LD information generated in the present study is available for public download at https://data.broadinstitute.org/alkesgroup/UKBB_LD. Baseline-LF v2.2.UKB annotations and LD scores for UK Biobank SNPs are available at https://data.broadinstitute.org/alkesgroup/LDSCORE/baselineLF_v2.2.UKB.tar.gz. Access to the UK Biobank resource is available via application (http://www.ukbiobank.ac.uk).

Code availability

PolyFun and PolyLoc software is available at https://github.com/omerwe/polyfun. SuSiE software is available at https://github.com/stephenslab/susieR. FINEMAP software is available at http://www.christianbenner.com/#.

Acknowledgements

We thank B. Pasaniuc, G. Kichaev, M. Stephens, G. Wang, M. Kanai, B. M. Schilder and T. Raj for helpful discussions. This research was conducted using the UK Biobank Resource under application no. 16549 and was funded by National Institutes of Health grants (nos. U01 HG009379, R37 MH107649, R01 MH101244 and R01 HG006399) and the Academy of Finland grants (nos. 288509 and 312076). H.K.F. is supported by E. and W. Schmidt. Computational analyses were performed on the O2 High-Performance Compute Cluster at Harvard Medical School.

Author information

Contributions

O.W. and A.L.P. designed the study. O.W. and S.G. analyzed the data. C.B. extended the FINEMAP software. O.W. and A.L.P. wrote the manuscript with assistance from F.H., C.B., R.C., J.U., S.G., A.P.S., B.v.d.G., Y.R., C.M.L., L.O., M.P. and H.K.F.

Corresponding authors

Correspondence to Omer Weissbrod or Alkes L. Price.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Assessing the individual impact of step 1 of PolyFun (estimating functional enrichment) via perturbation analysis, by randomly shuffling different proportions of annotation coefficient estimates.

For each evaluated proportion of shuffled annotation coefficient estimates, we report the number of experiments (out of 1,000) that obtained each FDR level > 0 (left panel) and each power level > 0 (right panel). FDR and power are reported with respect to identifying PIP ≥ 0.95 SNPs. Experiments with FDR = 0 (respectively, power = 0) are omitted from the left (respectively, right) panel to improve clarity. Numerical reports are provided in Supplementary Table 6.
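As a rough illustration of how FDR and power at a PIP threshold can be computed in a single simulated experiment, consider the Python sketch below. It assumes that the true causal status of every SNP is known, as it would be in a simulation. The function name `fdr_and_power`, the toy inputs, and the convention of reporting FDR = 0 when no SNP passes the threshold are assumptions made for this example.

```python
import numpy as np


def fdr_and_power(pip, is_causal, threshold=0.95):
    """FDR and power for SNPs called at PIP >= threshold (hypothetical helper)."""
    pip = np.asarray(pip, dtype=float)
    is_causal = np.asarray(is_causal, dtype=bool)
    if pip.shape != is_causal.shape:
        raise ValueError("pip and is_causal must have the same shape")

    called = pip >= threshold
    n_called = int(called.sum())
    n_causal = int(is_causal.sum())
    true_pos = int((called & is_causal).sum())

    # Assumed convention: FDR is 0 when nothing is called,
    # and power is 0 when there are no causal SNPs.
    fdr = (n_called - true_pos) / n_called if n_called else 0.0
    power = true_pos / n_causal if n_causal else 0.0
    return fdr, power


# Toy experiment (made-up values): three SNPs pass the threshold, two are causal.
toy_pip = [0.99, 0.96, 0.50, 0.97, 0.10]
toy_causal = [True, False, True, True, False]
print(fdr_and_power(toy_pip, toy_causal))
```

Repeating this over many simulated experiments and tallying the resulting values would give histograms of the kind the caption describes.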

Extended Data Fig. 2 Assessing the individual impact of step 2 of PolyFun (estimating per-SNP heritabilities on odd/even chromosomes) via perturbation analysis, by using both odd and even chromosomes to estimate functional enrichment.

The figure is similar to Extended Data Figure 1 but applies a different perturbation (using both odd and even chromosomes to estimate functional enrichment). Numerical reports are provided in Supplementary Table 6.

Extended Data Fig. 3 Assessing the individual impact of step 3 of PolyFun (partitioning all SNPs into 20 bins of similar per-SNP heritability) via perturbation analysis, by varying the number of per-SNP heritability bins.

The figure is similar to Extended Data Figure 1 but applies a different perturbation (changing the number of per-SNP heritability bins). Numerical reports are provided in Supplementary Table 6.
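To make the notion of per-SNP heritability bins concrete, the Python sketch below assigns SNPs to a configurable number of bins. It uses plain equal-count (quantile) binning as a stand-in, which is not necessarily how PolyFun forms its bins; the function name `bin_by_h2` and the toy input are assumptions for illustration.

```python
import numpy as np


def bin_by_h2(per_snp_h2, n_bins=20):
    """Assign each SNP to one of n_bins equal-count bins, ordered by
    per-SNP heritability (stand-in technique, hypothetical helper)."""
    h2 = np.asarray(per_snp_h2, dtype=float)
    if h2.ndim != 1:
        raise ValueError("per_snp_h2 must be a 1-D array")
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")

    # Indices of SNPs from smallest to largest per-SNP heritability.
    order = np.argsort(h2, kind="stable")

    bins = np.empty(h2.size, dtype=int)
    # Split the sorted order into n_bins nearly equal chunks; bin 0 holds
    # the smallest values and bin n_bins - 1 the largest.
    for b, chunk in enumerate(np.array_split(order, n_bins)):
        bins[chunk] = b
    return bins


# Toy input (made up): 100 SNPs with distinct per-SNP heritabilities.
toy_h2 = np.linspace(1e-6, 1e-4, 100)
toy_bins = bin_by_h2(toy_h2, n_bins=20)
print(np.bincount(toy_bins))
```

Changing `n_bins` in such a sketch mirrors the perturbation described in the caption, where the number of bins is varied.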

Extended Data Fig. 4 Assessing the individual impact of step 4 of PolyFun (re-estimating per-SNP heritabilities within each bin excluding the target chromosome) via perturbation analysis, by not excluding the target chromosome from the re-estimation procedure.

The figure is similar to Extended Data Figure 1 but applies a different perturbation (disables the exclusion of the target chromosome, either when using the default sample size N = 320 K or when using a smaller sample size of N = 10 K). Numerical reports are provided in Supplementary Table 6.

Extended Data Fig. 5 Assessing the individual impact of step 5 of PolyFun (specifying prior causal probabilities in proportion to the re-estimated per-SNP heritabilities) via perturbation analysis, by randomly permuting estimated prior causal probabilities.

The figure is similar to Extended Data Figure 1 but applies a different perturbation (randomly permuting estimated prior causal probabilities). Numerical reports are provided in Supplementary Table 6.
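A minimal Python sketch of priors set proportional to per-SNP heritability, together with the permutation perturbation, is given below. It assumes the priors are normalized to sum to 1 within a locus and that negative heritability estimates are clipped to zero; both choices, along with the function names `priors_from_h2` and `permuted_priors`, are assumptions for this example rather than a description of the PolyFun implementation.

```python
import numpy as np


def priors_from_h2(per_snp_h2):
    """Prior causal probabilities proportional to per-SNP heritability,
    normalized to sum to 1 within a locus (assumed convention)."""
    h2 = np.clip(np.asarray(per_snp_h2, dtype=float), 0.0, None)
    total = h2.sum()
    if total <= 0:
        # Fall back to a uniform prior when no heritability is estimated.
        return np.full(h2.size, 1.0 / h2.size)
    return h2 / total


def permuted_priors(priors, seed=0):
    """Randomly reassign the same prior values to different SNPs."""
    rng = np.random.default_rng(seed)
    return rng.permutation(np.asarray(priors, dtype=float))


# Toy locus with four SNPs (made-up values).
toy_priors = priors_from_h2([1.0, 3.0, 0.0, 4.0])
print(toy_priors)
print(permuted_priors(toy_priors, seed=1))
```

Permuting keeps the set of prior values intact but breaks their link to the SNPs they were estimated for, so any loss in accuracy reflects the value of assigning the right prior to the right SNP.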

Extended Data Fig. 6 Visualization of fine-mapping results for UK Biobank traits.

We display an ideogram of all 2,225 PIP > 0.95 fine-mapped SNPs identified by PolyFun + SuSiE across 49 UK Biobank traits. Traits are color-coded into groups (see legend and Supplementary Table 8). White circles indicate SNPs that are pleiotropic for ≥2 genetically uncorrelated traits, with circles to the right of a white circle denoting the genetically uncorrelated traits (max of 5 colored circles due to space limitations). Numerical results are reported in Supplementary Table 10.

Extended Data Fig. 7 Functional enrichment of PolyFun + SuSiE fine-mapped common SNPs for UK Biobank traits.

The figure is analogous to Fig. 4 but uses PIPs computed by PolyFun + SuSiE instead of SuSiE. Numerical results are reported in Supplementary Table 26.

Extended Data Fig. 8 Functional enrichment of SuSiE fine-mapped MAF > 0.001 SNPs for UK Biobank traits.

The figure is analogous to Fig. 4 but uses MAF > 0.001 SNPs instead of common (MAF > 0.05) SNPs. Numerical results are reported in Supplementary Table 27.

Extended Data Fig. 9 Functional enrichment of SuSiE fine-mapped low-frequency and rare SNPs for UK Biobank traits.

The figure is analogous to Fig. 4 but uses only low-frequency and rare SNPs (0.001 < MAF < 0.05) instead of common (MAF > 0.05) SNPs. Numerical results are reported in Supplementary Table 28.

Supplementary information

Supplementary Information

Supplementary Note

Reporting summary

Supplementary Tables

Supplementary Tables 1–33

About this article

Cite this article

Weissbrod, O., Hormozdiari, F., Benner, C. et al. Functionally informed fine-mapping and polygenic localization of complex trait heritability. Nat Genet 52, 1355–1363 (2020). https://doi.org/10.1038/s41588-020-00735-5

  • DOI: https://doi.org/10.1038/s41588-020-00735-5
