Aymara mtDNA variation and demographic history in the Central Andes

Batai, K., V. J. Vitzthum, et al. (2011). “Aymara mtDNA variation and demographic history in the Central Andes.” American Journal of Physical Anthropology 144(S52): 83.

This is the paper I presented at the American Association of Physical Anthropologists (AAPA) meeting in Minneapolis, MN. My main question was whether increased female gene flow was a stronger force shaping mtDNA variation in the Central Andes than the large female effective population size that resulted from the demographic expansion following the introduction of intensive agriculture. My data suggest that both increased female effective population size and gene flow were important factors. Now I have to prepare it for publication.

In the Central Andes, exploitation of marine resources and intensive agriculture led to population increase early in prehistory. The region is also characterized by constant population movement. These events undoubtedly affected regional genetic variation, but the exact nature of these effects remains uncertain. In this study, mtDNA HVRI sequence variation in 61 Aymara individuals from La Paz, Bolivia, was analyzed and compared to that of other Latin American populations to examine how increased female effective population size and gene flow influenced the mtDNA variation among Central Andean and other western South American populations.

The Aymara and Quechua were genetically diverse showing evidence of population expansion and large effective population size when estimated using maximum likelihood methods that account for gene flow between subdivided populations.  Spatial expansion models generally fit the mtDNA variation observed in Latin America well, especially among genetically less diverse populations, but a demographic expansion model fits the mtDNA variation found among Central Andean populations well.

These results suggest that female effective population size had a greater impact on mtDNA variation than female gene flow among Central Andeans. However, migration rates and the results of AMOVA and multidimensional scaling analysis suggest that female gene flow was also an important factor, influencing genetic variation among Central Andeans as well as lowland populations from western South America. The interaction sphere may have extended to the transitional zones between the Andes and the Amazon, making populations from these areas more genetically diverse and more similar to Central Andeans.

Sex-biased demography in humans: differences in global and regional scale analyses

Wilkins, J. F. and F. W. Marlowe (2006). “Sex-biased migration in humans: what should we expect from genetic data?” BioEssays 28(3): 290-300.

There has been a debate on whether the asymmetrical demographic history of the two sexes results from small male effective population size due to polygyny (e.g., here, here, and here) or from increased female gene flow (e.g., here). Many analytical methods used to address this question assume equilibrium (e.g., that migration rates were constant and the direction of gene flow stayed the same), but in reality migration rates and the direction of gene flow change over time, and the authors argue that a model accounting for these changes is necessary. Because very recent demographic changes affect genetic variation at the regional level but not at the global level, the sampling of the populations analyzed affects the results and their interpretation.

First, they (Marlowe) reviewed the anthropological literature and suggested that the female migration rate increased after the introduction of agriculture. Foragers (hunter-gatherers) tend to have a bilateral kinship system and a flexible post-marital residence pattern, so in forager societies both males and females moved. In pastoralist and agriculturalist societies, on the other hand, a patrilocal post-marital residence pattern is more common, so females tend to move more than males.

Second, they (Wilkins) conducted a series of computer simulations to show that statistics like FST, and therefore estimates of the male-to-female effective population size ratio based on FST, are influenced by the timing of the migration rate change and by whether researchers use local or dense global sampling rather than geographically sparse sampling. So, when regional population samples or dense global population samples are analyzed, genetic data should show how recent female migration after the introduction of agriculture affected genetic variation. When geographically sparse population samples are analyzed, on the other hand, genetic data should reflect the demography of the ancient foragers who lived before the introduction of agriculture.

As the title of the article says, this is what we should expect based on the theory, and this should be tested using empirical data.

Do global mtDNA and Y chromosome data show similar male and female gene flow patterns?

Wilder, J. A., S. B. Kingan, et al. (2004). “Global patterns of human mitochondrial DNA and Y-chromosome structure are not influenced by higher migration rates of females versus males.” Nature Genetics 36: 1122-1125.

In the two publications that I reviewed in the last post, Hammer and his colleagues did not really consider the effects of gene flow on genetic variation. In another article published in 2004, Wilder et al. show that there is no significant difference between male and female gene flow patterns, and they conclude that the asymmetrical demographic pattern observed results from the difference between male and female effective population sizes.

Their results are very different from those of Seielstad et al. (1998). Their Y chromosome data do not show more genetic differentiation than their mtDNA data, suggesting that male and female migration rates were not different. First, the within-population variance estimated for mtDNA was much higher than that of Seielstad et al., and mtDNA and the Y chromosome show similar within-population and among-population variance. Second, the Mantel test shows a significant correlation between pairwise mtDNA and Y chromosome genetic distances (ΦST).

Therefore, they argue that the shorter coalescence time and lower genetic diversity of the Y chromosome compared to other genetic markers are most likely due to polygyny. Polygyny increases the variance in male reproductive success, which reduces male effective population size.

However, we have to note that they used a unique global data set: Africa (Bakola, Dogon, South African Bantus, Khoisan), Europe (Dutch and Italian), Asia (Mongolian Khalks and Sri Lankans), and Oceania (Papua New Guineans and Baining from New Britain). Some of these are reproductively isolated populations that have experienced genetic drift.

I suspect that they got these results because they used only 10 population samples, some of which are reproductively isolated. Would including more population samples change the results of their analyses and their interpretation?

Do males have a smaller effective population size than females?

Wilder, J., Z. Mobasher, et al. (2004). “Genetic evidence for unequal effective population sizes of human females and males.” Molecular Biology and Evolution 21: 2047-2057.

Pilkington, M., J. Wilder, et al. (2007). “Contrasting signature of population growth for mitochondrial DNA and Y chromosomes among human populations in Africa.” Molecular Biology and Evolution 25(3): 517-525.

While recognizing the importance of female gene flow (Cox et al., 2008), Hammer and his colleagues (e.g., here) have been major critics of Seielstad et al. (1998) and others who have argued that increased female gene flow is the more important factor causing the sex-biased demographic pattern. Instead, they have argued that reduced male effective population size due to polygyny and other factors is more important than increased female gene flow.

The reasons they think reduced male effective population size, not increased female gene flow, is the major cause of the asymmetrical demography are 1) a shorter time to the most recent common ancestor (TMRCA) for the Y chromosome than for mtDNA, 2) clear evidence of demographic expansion for mtDNA but not for the Y chromosome, and 3) no significant difference in population structure and differentiation between mtDNA and Y chromosome data.

To demonstrate this, Hammer and colleagues published a series of articles based on their analyses of mtDNA cytochrome c oxidase 3 and Y chromosome sequences, instead of mtDNA hypervariable region sequences and Y chromosome SNPs or STRs. Comparing genetic markers with different mutation rates is problematic, so they chose markers that have similar mutation rates.

First, Wilder et al. (2004) demonstrated that the TMRCA for the Y chromosome is much shorter than for mtDNA. They used a coalescent-based program, GENETREE, to estimate the TMRCA. This program allows users to estimate the TMRCA while accounting for population growth, so users can compare the TMRCA of mtDNA and the Y chromosome while accounting for differences in genetic diversity and demographic history.

Second, Pilkington et al. (2007) focused on African populations and used several different methods to examine whether mtDNA and Y chromosome variation show evidence of population expansion. The mtDNA variation of agriculturalists clearly shows evidence of population expansion, while their Y chromosome variation does not.

While these publications clearly show differences in mtDNA and Y chromosome variation resulting from differences in male and female demographic history, they did not really consider the effects of gene flow on genetic variation.

Genetic evidence for a higher female migration rate in humans: Does a patrilocal post-marital residence pattern explain the higher female migration rate?

Seielstad, M., E. Minch, et al. (1998). “Genetic evidence for a higher female migration rate in humans.” Nature Genetics 20: 278-280.

Seielstad and colleagues compared Y chromosome variation to autosomal and mtDNA variation and found skewed Nν (≈ Nm, where N is effective population size and m is migration rate) ratios between different markers, with Y chromosome data showing more population differentiation than other markers. They believe that male gene flow, restricted relative to female gene flow by patrilocal residence, is the cause of the skewed Nν and the population differentiation.

They obtained Nν ratios of 7.96 (mtDNA:Y chromosome), 2.94 (mtDNA:autosomes), and 2.71 (autosomes:Y chromosome) using published global data. They also obtained an Nν ratio of 2.20 (autosomes:Y chromosome) based on their analysis of Y chromosome and autosomal microsatellite data from African populations. Either or both of two factors, increased female gene flow and reduced male effective population size, can cause this skew, but there are three reasons why they think increased female gene flow is the cause.

  1. For many genetic markers, 80-90% of genetic variation is found within populations (e.g., here), but Y chromosome variation does not follow this pattern. Between-group (or between-continent) and within-group (or within-continent) variance explain more of the Y chromosome variation than of the mtDNA and autosomal variation, while within-population variance explains less of the Y chromosome variation than of the mtDNA and autosomal variation.
  2. When the correlation between genetic distance (FST) and geographic distance is compared across markers, genetic distance for the Y chromosome increases much faster with geographic distance than it does for other markers.
  3. They believe that even a very high level of polygyny should not cause a difference in Nν as large as the one they observed.
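
To make the Nν comparison concrete, here is a minimal sketch of how such a ratio can be derived from FST. It assumes Wright's island-model approximation FST ≈ 1 / (1 + c·Nν), which may not be the exact estimator Seielstad et al. used. The function name, the scaling constant `scale`, and the FST values are hypothetical and chosen only for illustration. The last lines simply check that the three ratios reported from the published global data are consistent with one another.

```python
def n_nu_from_fst(fst, scale=1.0):
    """Invert the island-model approximation FST ~ 1 / (1 + scale * N*nu).

    `scale` is an assumption: it depends on ploidy and mode of inheritance
    (4 is the usual value for autosomes; a haploid, uniparentally inherited
    marker uses a smaller constant).
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / scale


# Hypothetical FST values, NOT taken from Seielstad et al. (1998).
fst_y = 0.60   # strong differentiation among populations for the Y chromosome
fst_mt = 0.20  # weaker differentiation for mtDNA

# Both markers are haploid and uniparental, so the same scale cancels out.
ratio_mt_to_y = n_nu_from_fst(fst_mt) / n_nu_from_fst(fst_y)
print(f"N*nu ratio (mtDNA:Y) for the hypothetical FST values: {ratio_mt_to_y:.2f}")

# Consistency check on the ratios reported in the text:
# (mtDNA:autosomes) x (autosomes:Y) should be close to (mtDNA:Y).
implied_mt_to_y = 2.94 * 2.71
print(f"Implied mtDNA:Y ratio: {implied_mt_to_y:.2f} (reported: 7.96)")
```

The point of the sketch is only that a higher FST for one marker translates into a lower Nν, so a ratio well above 1 means the Y chromosome is far more differentiated among populations than mtDNA.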

This is one of the very first articles to discuss the asymmetrical demographic history of males and females in humans, and whether increased female gene flow or reduced male effective population size is the major cause of this asymmetry is still debated (e.g., here and here). I agree that there was a great deal of female gene flow in the past, but whether post-marital residence pattern was the major cause of the difference in migration rates between the two sexes has to be examined further. I also wonder whether more recent studies of global Y chromosome variation show a similar pattern of population differentiation with small within-population variance, and how increasing the number of markers, population samples, and sample size from each population affects the results and their interpretation.

MIGRATE: Maximum-likelihood estimation of migration rates and effective population numbers

Beerli, P. and J. Felsenstein (1999). “Maximum-likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach.” Genetics 152: 763-773. 

Beerli, P. and J. Felsenstein (2001). “Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach.” Proceedings of the National Academy of Sciences of the United States of America 98(8): 4563-4568.

Beerli, P. (2004). “Effect of unsampled populations on the estimation of population sizes and migration rates between sampled populations.” Molecular Ecology 13(4): 827-836.

Beerli and Felsenstein (1999) developed a maximum-likelihood method based on coalescent theory to estimate migration rates and effective population sizes. The parameters [Θ1, Θ2, Μ1, Μ2], where Θ is 4Neμ and Μ is m/μ, are estimated by exploring genealogical trees, including their topology, branch lengths, and migration events. Rather than exploring all possible genealogical trees, the method concentrates on trees with high likelihood using a Markov chain Monte Carlo approach. After each chain, the parameters are recalculated and the likelihood is reevaluated. To obtain more accurate estimates, this process is repeated many times.

In 2001, Beerli and Felsenstein (2001) developed a program called MIGRATE. MIGRATE uses an n-island model (n is the number of subpopulations) and estimates asymmetrical migration rates (Μ) between demes and the effective population size of each deme (Θ) for more than two sampled populations at the same time. Because Θ is 4Neμ and Μ is m/μ, multiplying Μ and Θ cancels the mutation rate and gives 4Nem, from which Nem follows. Because gene flow influences within-population genetic diversity and MIGRATE estimates Θ while accounting for gene flow (Ray et al., 2003), MIGRATE should provide more accurate estimates of effective population size than traditional methods (see also here).
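
As a small illustration of how the two MIGRATE parameters combine, the sketch below works through the arithmetic under the definitions given above (Θ = 4Neμ and Μ = m/μ). The function name and the numerical values are hypothetical, and the multiplier is left as a parameter because it depends on the ploidy and inheritance of the marker; 4 is simply the value implied by Θ = 4Neμ.

```python
def effective_migrants(theta, big_m, inheritance_factor=4.0):
    """Return Ne*m from Theta and M.

    With Theta = x * Ne * mu and M = m / mu, the product Theta * M equals
    x * Ne * m, so dividing by x leaves Ne * m. The default x = 4 follows
    the definition Theta = 4 * Ne * mu used in the text; other markers
    would need a different factor (an assumption to set per data type).
    """
    if inheritance_factor <= 0:
        raise ValueError("inheritance_factor must be positive")
    return theta * big_m / inheritance_factor


# Hypothetical estimates for one deme, for illustration only.
theta_hat = 0.02   # Theta estimate
m_hat = 100.0      # M estimate (immigration rate scaled by mutation rate)

nem = effective_migrants(theta_hat, m_hat)
print(f"Theta * M = {theta_hat * m_hat:.2f}, Ne*m = {nem:.2f}")
```

Because the mutation rate cancels in the product, Nem can be compared across demes without knowing μ, whereas converting Θ alone into Ne requires an external estimate of μ.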

Unfortunately, human populations usually violate important assumptions of the method, so the results have to be interpreted with caution. First, MIGRATE assumes that there is no unsampled population exchanging genes with the sampled populations. Beerli (2004) examined the effects of unsampled populations on these methods. He found that the migration rate, Μ = m/μ, is not seriously affected, but effective population size is biased upward and Nem tends to be overestimated. Beerli suggests that more accurate estimates can be obtained by running the program with many populations at the same time. However, human populations tend to interact with many different populations, and many of them are not sampled.

Second, MIGRATE assumes that population size and migration rate did not change over time. However, many human populations experienced demographic expansions or bottlenecks in the past. In addition, the development of better transportation technologies, recent state expansion, European contact, globalization, and industrialization have increased the movement of people over time.

mtDNA variation in South America and evolutionary history of native South Americans

In the Central Andes, exploitation of marine resources and the introduction of intensive agriculture caused population size to increase early in prehistory. Also, since prehistory, there have been constant movements of people in the Andes due to vertical use of the ecosystem, a series of state expansions, forced migration under the Inca, and migration to larger cities after European contact. These events probably had great effects on genetic variation, but to what extent they influenced it is still uncertain. In this post, I review recent mtDNA data from the Central Andes to address these questions.

Following Tarazona-Santos et al. (2001), Fuselli et al. (2003) argue that mtDNA data support the Y chromosome data and that the western part of South America, mainly the Andes, and the eastern part, mainly the Amazon, had separate evolutionary histories. The Andean populations have high within-population genetic diversity and are genetically similar to each other. They argue that both large effective population size and gene flow contributed to the high within-population genetic diversity. The Amazonian populations, on the other hand, have low within-population genetic diversity and are genetically differentiated from each other because, without gene flow, genetic drift had a great effect on small, geographically isolated populations.

Lewis et al. (2005, 2007) analyzed mtDNA sequence variation in five highland Peruvian populations and compared them to two of the three highland Peruvian populations that Fuselli et al. (2003) analyzed and to lowland populations. Supporting the argument that Fuselli et al. put forth, Lewis et al. found high within-population genetic diversity among highland Peruvian populations, and Andean populations were genetically homogeneous compared to Amazonian populations. However, their AMOVA results suggest that there is no significant genetic difference between the Andean group and the Amazonian group. Lewis and Long (2008) found more mtDNA variation in eastern South America and less in the Andean region than previously reported when regional variation was accounted for.

Analyses of mtDNA variation among populations in the transitional zone between the Andes and the Amazon provide a more complicated perspective on South American evolutionary history. Bert et al. (2004) and Corella et al. (2007) analyzed mtDNA variation in lowland Bolivians from the Department of Beni, and Cabana et al. (2006) analyzed that of Gran Chaco populations. In general, these populations have genetic diversity values intermediate between those of Andean and Amazonian populations, and they are genetically differentiated from each other. These patterns are expected for smaller, relatively isolated populations.

However, these authors found evidence suggesting that these populations were not reproductively isolated and that there was gene flow among them, as well as between Andeans and populations in the transitional zone, in response to state expansion from the Andean highlands or the reorganization of indigenous societies during the colonial era. First, some of these small populations from lowland Bolivia have unexpectedly high within-population genetic diversity. Second, when the Ayoreo are excluded from the analyses, Gran Chaco populations are very homogeneous. Finally, some of these populations are genetically very similar to Andeans.

The population size in the Central Andes may have been large enough for its populations to be genetically more diverse than other populations in South America, but the constant movements, or interactions, of people made populations in the Central Andes genetically homogeneous and potentially genetically more diverse. This interaction sphere may have extended into the transitional zones, making populations there genetically diverse and similar to Central Andeans. My review of articles on Central Andean mtDNA variation, however, shows that no one has examined whether female effective population size or gene flow contributed more to the mtDNA variation of these populations.

Bert, F., A. Corella, et al. (2004). “Mitochondrial DNA diversity in the Llanos de Moxos: Moxo, Movima and Yuracare Amerindian populations from Bolivia lowland.” Annals of Human Biology 31: 9-28.

Cabana, G. S., D. A. Merriwether, et al. (2006). “Is the genetic structure of Gran Chaco populations unique? Interregional perspectives on native South American mitochondrial DNA variation.” American Journal of Physical Anthropology 131(1): 108-119.

Corella, A., F. Bert, et al. (2007). “Mitochondrial DNA diversity of the Amerindian populations living in the Andean Piedmont of Bolivia: Chimane, Moseten, Aymara and Quechua.” Annals of Human Biology 34(1): 34-55.

Fuselli, S., E. Tarazona-Santos, et al. (2003). “Mitochondrial DNA diversity in South America and the genetic history of Andean highlanders.” Molecular Biology and Evolution 20(10): 1682-1691.

Lewis, C. M., Jr., B. Lizárraga, et al. (2007). “Mitochondrial DNA and the peopling of South America.” Human Biology 79: 159-178.

Lewis, C. M., Jr. and J. C. Long (2008). “Native South American genetic structure and prehistory inferred from hierarchical modeling of mtDNA.” Molecular Biology and Evolution 25(3): 478-486.

Lewis, C. M., Jr., R. Y. Tito, et al. (2005). “Land, language, and loci: mtDNA in Native Americans and the genetic history of Peru.” American Journal of Physical Anthropology 127: 351-360.

Tarazona-Santos, E., D. R. Carvalho-Silva, et al. (2001). “Genetic differentiation in South Amerindians is related to environmental and cultural diversity: Evidence from the Y chromosome.” American Journal of Human Genetics 68(6): 1485-1496.

Intra-deme molecular diversity in spatially expanding populations

Ray, N., M. Currat, et al. (2003). “Intra-deme molecular diversity in spatially expanding populations.” Molecular Biology and Evolution 20: 76-86.

Using computer simulations, Ray and his colleagues analyzed the effects of spatial (range) expansion and gene flow on within-population genetic diversity and demonstrated that when the migration rate (Nm) is large, demes carry a genetic signature of expansion similar to that of demographic expansion (N is deme size, or the effective population size of a deme, and m is the proportion of individuals in a deme that emigrate and are replaced by incoming immigrants in each generation).

In the spatial expansion model, the center of the expansion sends migrants to previously unoccupied areas, and as deme size increases, migrants are sent on to new areas. When migrants are sent to previously occupied demes, gene flow takes place. They show that spatially expanding populations with large Nm have high genetic diversity and large negative values in two neutrality tests. They argue that the spatial expansion model explains the observed genetic pattern better than a pure demographic expansion model, which assumes that populations are not subdivided.

Up until the 1990s, many population genetic methods for reconstructing demographic history used an unrealistic model, which assumes that populations are unsubdivided and that individuals mate randomly. In reality, human populations are subdivided in very complex ways, and many cultural factors regulate mating patterns.

More recently, population geneticists and anthropological geneticists have tried to understand how migration and gene flow between demes in subdivided populations affect population subdivision and demographic history. Today, we are starting to understand that gene flow between different ethnolinguistic and geographic groups was common and that it can affect the genetic variation of populations in various ways.

Based on the findings of Ray and his colleagues and this new perspective on gene flow and genetic variation, we need to reevaluate models of human colonization and expansion, such as the Neolithic expansions of European farmers, the Bantu, and others. One extreme version of the Neolithic expansion model suggests that Neolithic farmers spread after a pure demographic expansion in the core area, without any contribution from preexisting foragers and without much gene flow after the expansion. The results of their simulations suggest that populations that experienced Neolithic expansion carry genetic evidence of expansion due either to demographic expansion or to increased gene flow.

Targeted Retrieval and Analysis of Five Neandertal mtDNA Genomes

Briggs, A. W., J. M. Good, et al. (2009). “Targeted Retrieval and Analysis of Five Neandertal mtDNA Genomes.” Science 325(5938): 318-321.

Briggs and his colleagues analyzed whole mtDNA genome sequence variation in five Neanderthal samples that they sequenced and one Neanderthal sample that had previously been sequenced. They found that Neanderthal sequence variation was much lower than that of modern humans, even lower than that of modern Europeans. Of the six Neanderthal individuals analyzed, two had exactly the same sequence. This is interesting because only two out of 30 modern Europeans had the same mtDNA genome sequence. The effective population size was very small and did not exceed about 3,500 females (mean Ne = 1476; 95% confidence interval 268 to 3510). If we take the mean, the total Neanderthal population size is about 8,856 (1476 multiplied by 2, assuming male and female effective population sizes were the same, and then multiplied by 3, because effective population size is about 1/3 of the actual population size. Note that effective population size is a long-term average of the number of individuals who contributed genes to the next generation).
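
The back-of-the-envelope conversion above can be written out as a short sketch. The two multipliers are the assumptions stated in the text (equal male and female effective sizes, and an effective size of about one third of the census size); the function and constant names are mine, and applying the same conversion to the confidence limits is my own extension rather than something reported by Briggs et al.

```python
SEXES = 2                  # assumes male Ne equals female Ne
CENSUS_PER_EFFECTIVE = 3   # assumes Ne is about 1/3 of the census size


def census_from_female_ne(female_ne):
    """Convert a female effective size into a rough total census size."""
    return female_ne * SEXES * CENSUS_PER_EFFECTIVE


mean_total = census_from_female_ne(1476)
lower_total = census_from_female_ne(268)
upper_total = census_from_female_ne(3510)

print(f"Mean estimate: {mean_total}")
print(f"Range implied by the 95% interval: {lower_total} to {upper_total}")
```

Both multipliers are rough rules of thumb, so the resulting totals should be read as orders of magnitude rather than precise head counts.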

The authors say that Neanderthals had low mtDNA diversity because they had a small effective population size for a long time, but their population size could also have been reduced significantly when anatomically modern humans expanded into Europe.

This low mtDNA diversity and small effective population size among Neanderthals are very interesting. First, the estimated Neanderthal population size is smaller than that of many modern forager populations that are becoming extinct. Second, the lack of clear evidence of phylogeographic structure suggests that sparsely distributed Neanderthals were highly mobile, and high mobility in robust Neanderthals would have required high energy consumption. Third, I wonder whether skeletal morphology shows a similar pattern. Do Neanderthals have less skeletal morphological variation than modern humans?

More information on the Neanderthal genome can be found here and here.

Female-to-male breeding ratio in modern humans – an analysis based on historical recombination

Labuda, D., J.-F. Lefebvre, et al. (2010). “Female-to-male breeding ratio in modern humans – an analysis based on historical recombination.” American Journal of Human Genetics 86: 1-11.

Following Keinan et al., Labuda et al. also challenge Hammer’s argument and argue that polygyny was not a major factor shaping the asymmetrical demographic pattern between males and females. Hammer et al. and Keinan et al. obtained X chromosome-to-autosome ratios of variation or effective population size from mutational diversity, whereas Labuda et al. based their analysis on the population recombination rate (ρ) to estimate the female-to-male breeding ratio (β).

If males have large reproductive variance, that is, if some males are reproductively more successful and have more children than others through polygyny or similar mating practices, male effective population size should be smaller than female effective population size, so the female-to-male breeding ratio is larger than 1. If polygyny was not a major factor, on the other hand, male effective population size should be similar to female effective population size, so the female-to-male breeding ratio is expected to be close to 1.
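
To show why a breeding ratio above 1 leaves a trace in X chromosome versus autosome comparisons, here is a minimal sketch based on the standard textbook formulas for effective population size with unequal numbers of breeding females and males. This is not the recombination-based estimator that Labuda et al. used; the function names and the β values are chosen only for illustration.

```python
def autosomal_ne(n_f, n_m):
    """Autosomal effective size with n_f breeding females and n_m breeding males."""
    return 4.0 * n_f * n_m / (n_f + n_m)


def x_linked_ne(n_f, n_m):
    """X-linked effective size with n_f breeding females and n_m breeding males."""
    return 9.0 * n_f * n_m / (4.0 * n_m + 2.0 * n_f)


def x_to_autosome_ratio(beta):
    """Ratio of X-linked to autosomal Ne as a function of beta = n_f / n_m.

    Follows from dividing the two formulas above:
    9 * (beta + 1) / (8 * (beta + 2)).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    return 9.0 * (beta + 1.0) / (8.0 * (beta + 2.0))


# With equal numbers of breeding females and males the ratio is 3/4.
print(x_to_autosome_ratio(1.0))  # 0.75

# Illustrative breeding ratios above 1 push the ratio above 3/4.
for beta in (1.0, 1.5, 2.0):
    print(f"beta = {beta:.1f}: Ne(X) / Ne(autosomes) = {x_to_autosome_ratio(beta):.3f}")
```

Under these formulas the X-to-autosome ratio rises only slowly with β, which is one reason why modest departures from a breeding ratio of 1 are hard to distinguish from other demographic effects.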

They obtained breeding ratios of 1.4 among the Yoruba from West Africa, 1.3 among the Europeans, and 1.1 among the East Asians. They believe that there is a slight excess of breeding females per male (10-40%) in the samples analyzed, and they concluded that polygyny was not a major factor.

“Our estimates of the breeding ratio are close to but greater than 1, suggesting some polygyny in the history of human populations…Excessive manifestation of polygyny are documented in the recent history of Asian populations, but this may be the exception rather than the rule.  Human beings are usually characterized as monogamous with polygamous tendencies.”

As alternatives to polygyny, they suggest that serial monogamy and the longer generation time of males can also increase the breeding ratio.

I agree with Labuda et al. I doubt polygyny had that big an impact on European and Asian (Chinese and Japanese) genetic variation, but polygyny is more common in Africa, and a recent shift from a flexible to a more male-centric social structure may have had some impact on Yoruba genetic variation.

So, why is Hammer’s argument challenged by Keinan et al. and Labuda et al.? Both Keinan et al. and Labuda et al. used HapMap data, and the findings from the mutational diversity-based analysis conducted by Keinan et al. were somewhat supported by the recombination-based analysis conducted by Labuda et al. It might be safe to say that polygyny was not a major factor affecting the genetic variation of the European and Asian populations analyzed by these two groups of researchers. I believe that Hammer’s data do not agree with these findings not only because Hammer and his colleagues had smaller genome coverage, but also because they used different sample populations. Analyses at the same level as those conducted by Keinan et al. and Labuda et al., but using more population samples, are probably necessary to address what role polygyny played in shaping genetic variation.