
  • Article

Whole-genome sequencing of 175 Mongolians uncovers population-specific genetic architecture and gene flow throughout North and East Asia

Abstract

The genetic variation in Northern Asian populations is currently undersampled. To address this, we generated a new genetic variation reference panel by whole-genome sequencing of 175 ethnic Mongolians, representing six tribes. The cataloged variation in the panel shows strong population stratification among these tribes, which correlates with the diverse demographic histories in the region. Incorporating our results with the 1000 Genomes Project panel identifies derived alleles shared between Finns and Mongolians/Siberians, suggesting that substantial gene flow between northern Eurasian populations has occurred in the past. Furthermore, we highlight that North, East, and Southeast Asian populations are more aligned with each other than these groups are with South Asian and Oceanian populations.


Fig. 1: Sampling, variants, and imputation.
Fig. 2: Population genetic structure.
Fig. 3: Inference of population demographic history.
Fig. 4: Gene flow between Mongolians and global human populations of 1000G.
Fig. 5: Phylogenetic relatedness of East Asian groups with other people.


Data availability

Raw sequencing data and variant sets have been deposited in the CNGB (China National GeneBank) Nucleotide Sequence Archive (CNSA) under accession CNP0000063 (https://db.cngb.org/cnsa/).


Acknowledgements

We sincerely thank the Mongolian volunteers who agreed to contribute blood samples and participate in this study. We thank D. Reich for sharing genotype data on populations from Siberia and South Asia, and J. Fekecs for graphical assistance. We acknowledge F.S. Collins and C.D. Bustamante for their helpful discussions and comments on the manuscript, as well as Shuangshan Shuangshan, Y. Bao, and S. Ba for contributing to the sample collection process. This study was supported by Shenzhen Municipal Government of China (CXB201108250094A), Inner Mongolia University for Nationalities Scientific Research Project (MD2012038), the National Science Foundation of China (81560176, 81511130050), China National GeneBank, Foundation of the Inner Mongolia Department of Science and Technology (2015MS0875, 201502103), Science and Technology Planning Project of Inner Mongolia, China (20120409), and the Guangdong Provincial Key Laboratory of Genome Read and Write (2017B030301010). C.R.G. is supported by the US National Institutes of Health (4U01HG007419-04) and National Science Foundation (1201234). N.N., S.R.B., and L.C.B. are supported by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health.

Author information


Contributions

Y.Y., H.Z., B.B., and H.B. initiated and supervised the project. H.B., Q.W., Y.X., Z.P., J.J., X.Y., M.M., B.G., D.W., Y.G., H.H., S.S., Y.C., YanruZ., L.Z., YiyiL., C.L., F.M., K.W., L.L., and YingchunL. surveyed and collected the samples. Y.X., YanruZ., DongZ., J.C., S.W., X.Li, and T.Li extracted the genomic DNA. H.B., X.G., Q.W., M.J., and B.W. performed the genome sequencing. YongZ., L.F., H.W., and T.Lan performed the read mapping and variant calling. T.Lan, X.G., H.L., W.L., Z.W., and B.W. performed experimental validation. X.G., T.Lan, and B.D. constructed the haplotype reference panel. X.G., T.Lan, DandanZ., H.X., N.D., X.Luo, W.X., and L.Y. analyzed population diversity and genetic structure. T.Lan, X.G., N.N., B.D., and X.N. inferred the population demographic history. N.N., S.R.B., K.L., and C.R.G. analyzed the phylogeny of East Asians. X.G., N.N., T.Lan, and S.R.B. wrote the manuscript. X.G., C.Y., X.Luo, and T.Li were in charge of data submission. N.N., X.G., T.Lan, S.R.B., N.D., C.R.G., X.X., X.Liu, H.Y., L.C.B., J.W., and K.K. revised the manuscript.

Corresponding authors

Correspondence to Burenbatu Burenbatu, Huanmin Zhou or Ye Yin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–17 and Supplementary Tables 1–8

Reporting Summary


About this article


Cite this article

Bai, H., Guo, X., Narisu, N. et al. Whole-genome sequencing of 175 Mongolians uncovers population-specific genetic architecture and gene flow throughout North and East Asia. Nat Genet 50, 1696–1704 (2018). https://doi.org/10.1038/s41588-018-0250-5


  • DOI: https://doi.org/10.1038/s41588-018-0250-5
