A new study by Jody Hey [1], published in PLoS Biology, sets new standards in the analysis of human genetic data. Using new statistical methods and a combined analysis of nine genes, Hey provides a detailed picture of the events associated with the first migration of Asians into the Americas.
Explorations into the use of DNA sequence data for human demographic inference began in the late 1980s and early 1990s [2, 3]. The research focused on testing the out-of-Africa hypothesis, and the main inferential tool was the estimation of gene trees. However, it soon became apparent that demographic inferences cannot easily be made on the basis of an estimated gene tree, mainly because the relationship between particular demographic models and gene trees is very complex: the same gene tree may arise under several different demographic models. A method for connecting gene trees with demographic models was needed, and coalescent theory [4] turned out to provide this link. Using coalescent theory, it is possible to calculate how likely a particular gene tree is under a particular demographic model. The coalescent framework was used to estimate population growth rates, and methods for inferring migration rates and other parameters were developed [5, 6, 7, 8]. Unfortunately, most of the models were so demographically naïve that they were hardly applicable to real human data. The fundamental problem has been that the effects of various factors, such as changes in population size, gene flow between populations (migration), and divergence of populations from a shared ancestral population, are intertwined, making it impossible to determine the effect of one factor without taking the others into account. The only solution to this problem is to construct complex models that take all (or as many as possible) of the relevant factors into account.
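To make the link between a gene tree and a demographic model concrete, here is a minimal sketch of the simplest case: a single population of constant size, under the standard coalescent. It is not Hey's model, which also includes divergence, gene flow, and size change. The function name, the interval encoding, and the example tree are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood_constant_size(intervals, n_e):
    """Log probability density of coalescent waiting times under a
    constant-size population with diploid effective size n_e.

    intervals: list of (k, t) pairs, where k is the number of lineages
    present and t is the time in generations until two of them coalesce.
    """
    total = 0.0
    for k, t in intervals:
        # Each of the k(k-1)/2 pairs of lineages coalesces at rate
        # 1/(2*n_e) per generation, so the total rate is k(k-1)/(4*n_e).
        rate = k * (k - 1) / (4.0 * n_e)
        # Exponential waiting-time density: rate * exp(-rate * t).
        total += math.log(rate) - rate * t
    return total


# Hypothetical gene tree for four sampled sequences (times in generations).
tree_intervals = [(4, 1500.0), (3, 3000.0), (2, 9000.0)]

# The same tree is more or less probable depending on the assumed model.
for n_e in (1000, 5000, 20000):
    print(n_e, round(log_likelihood_constant_size(tree_intervals, n_e), 2))
```

Of the three candidate sizes, the middle one gives this tree the highest log-likelihood. Real analyses do not treat the tree as known; they average over many plausible trees.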
The study by Hey, Rutgers University, sets the bar for such studies. His model incorporates changes in population size, gene flow, and divergence, allowing new explorations into human genetic demography. Inferences are made in a coalescent-based statistical framework that takes into account the uncertainty in the data regarding the gene tree (no gene tree can be estimated with 100% accuracy) and can combine the information from many different loci. He applied this method to data from nine loci sampled from East Asian and Amerind-speaking Native American populations. The major objectives were to determine the timing of the earliest migrations into the Americas from Asia and to estimate the effective sizes of past and present populations. The results suggest that the first wave of migration occurred relatively recently and that the effective number of migrants was about 90.
Although much emphasis has been put on the exact number of migrants populating the Americas, it should be noted that the estimates obtained in genetic studies are of the effective population size at the time of migration. The actual number of people could be substantially higher than the effective number. For example, Hey estimated the effective size of the ancestral Asian population to be approximately 9000, implying that the number of individuals peopling the Americas in the first wave corresponds to as much as 1% of the entire East Asian population. The results also show that there could have been substantial levels of migration between Asians and Amerindians in the years after the first wave of migration. Nonetheless, the study paints a clear picture of demographic events that include strong growth in population size after the first wave of migration and a very recent migration event.
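The arithmetic behind the 1% figure, and the gap between effective and actual numbers, can be sketched as follows. The two effective sizes are the values quoted above. The ratio of effective to census size is a made-up illustrative value, not an estimate from the study, and both function names are hypothetical.

```python
def founder_fraction(founder_ne, ancestral_ne):
    """Founding effective size as a fraction of the ancestral effective size."""
    return founder_ne / ancestral_ne


def census_size(effective_size, ne_to_census_ratio):
    """Head count implied by an effective size, given an assumed ratio of
    effective to census size. This sketch assumes the effective size does
    not exceed the census size, so the ratio must lie in (0, 1]."""
    if not 0 < ne_to_census_ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return effective_size / ne_to_census_ratio


# Effective sizes quoted in the text.
FOUNDER_NE = 90
ANCESTRAL_NE = 9000

print(f"{founder_fraction(FOUNDER_NE, ANCESTRAL_NE):.1%}")  # prints 1.0%

# ILLUSTRATIVE_RATIO is an assumption for demonstration only.
ILLUSTRATIVE_RATIO = 0.25
print(census_size(FOUNDER_NE, ILLUSTRATIVE_RATIO))  # prints 360.0
```

Under that assumed ratio, an effective size of 90 would correspond to several hundred people, which is why the effective number should not be read as a head count.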
Some of the parameters of interest could not be estimated with great certainty. For example, the date of the first migration event was associated with considerable statistical uncertainty, and the relative importance of migration after the first migration event could not be determined. Although this could be seen as a weakness of the study, it really points to the strength of the methodology. The approach rests on a statistical framework that takes all the relevant information in the genetic data into account. So when some of the parameters are difficult to estimate, it implies that the data do not contain enough information about these parameters. In this way, the methodology helps to quantify the uncertainty in the data. It also raises serious concerns about previous studies that, based on much less data and without the use of rigorous statistical methods, have made strong claims about human demography using genetic data.
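How a likelihood-based analysis reports the information content of the data can be sketched with a toy example. It uses a constant-size coalescent model rather than Hey's full model, and artificial data in which one hypothetical locus is simply copied nine times. The cut-off of 2 log-likelihood units is a common rule of thumb, used here as an assumption; all names are illustrative.

```python
import math


def log_likelihood(loci, n_e):
    """Joint log-likelihood of coalescent intervals from independent loci
    under a constant-size model; per-locus log-likelihoods simply add."""
    total = 0.0
    for intervals in loci:
        for k, t in intervals:
            # Total coalescence rate with k lineages, per generation.
            rate = k * (k - 1) / (4.0 * n_e)
            total += math.log(rate) - rate * t
    return total


def support_interval(loci, grid, drop=2.0):
    """Smallest and largest grid values whose log-likelihood lies within
    `drop` units of the best value found on the grid."""
    scores = [(log_likelihood(loci, n_e), n_e) for n_e in grid]
    best = max(scores)[0]
    inside = [n_e for score, n_e in scores if score >= best - drop]
    return min(inside), max(inside)


# Hypothetical locus: (number of lineages, generations until coalescence).
one_locus = [(4, 1500.0), (3, 3000.0), (2, 9000.0)]
grid = range(500, 100001, 500)

few = support_interval([one_locus], grid)
many = support_interval([one_locus] * 9, grid)
print("1 locus:", few)
print("9 loci :", many)
```

With one locus the interval of supported population sizes is wide; with nine it is much narrower around the same best value. A wide interval is therefore a statement about the data, not a failure of the method.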
What sets Hey's study apart from other similar studies is the use of complex and more realistic models. While no model can be exactly true, the approach by Hey can help distinguish good models from bad ones. Genetic data in human demographic studies have often been analyzed by interpreting an estimated gene tree or network. As Hey points out, such verbal interpretations are themselves models, and often very simplistic ones. The method presented by Hey is an important step forward in the field of human genetic demography, replacing ad hoc storytelling with rigorous model testing and statistical inference.
References
1. Hey J: On the number of new world founders: a population genetic portrait of the peopling of the Americas. PLoS Biol 2005; 3: e193.
2. Vigilant L, Pennington R, Harpending H, Kocher TD, Wilson AC: Mitochondrial DNA sequences in single hairs from a southern African population. Proc Natl Acad Sci USA 1989; 86: 9350–9354.
3. Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson AC: African populations and the evolution of human mitochondrial DNA. Science 1991; 253: 1503–1507.
4. Hudson RR: Gene genealogies and the coalescent process; in Futuyma D, Antonovics J (eds): Oxford Surveys in Evolutionary Biology. New York: Oxford University Press, 1990, pp 1–44.
5. Hudson RR, Slatkin M, Maddison WP: Estimation of levels of gene flow from DNA sequence data. Genetics 1992; 132: 583–589.
6. Tajima F: The effect of change in population size on DNA polymorphism. Genetics 1989; 123: 597–601.
7. Griffiths RC, Tavaré S: Sampling theory for neutral alleles in a varying environment. Philos Trans Roy Soc Lond B Biol Sci 1994; 344: 403–410.
8. Beerli P, Felsenstein J: Maximum-likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach. Genetics 1999; 152: 763–773.
Nielsen, R. Demography: Peopling the Americas. Eur J Hum Genet 13, 1100–1101 (2005). https://doi.org/10.1038/sj.ejhg.5201481