Accelerated genetic drift on chromosome X during the human dispersal out of Africa

Keinan, A., J. C. Mullikin, et al. (2009). “Accelerated genetic drift on chromosome X during the human dispersal out of Africa.” Nat Genet 41(1): 66-70.

Keinan and his colleagues provide data that challenge Hammer’s argument.  Hammer and his colleagues have argued that female effective population size is larger than male effective population size, largely because of polygynous practices.  By comparing X chromosome variation to autosomal variation, Keinan and his colleagues instead show that female effective population size was reduced outside of Africa (note that females carry two X chromosomes and males carry one, so X chromosome variation reflects female demographic history more than male demographic history).

Compared to Hammer et al., Keinan et al. used a larger genomic data set.  They analyzed 130,000 SNPs from a subset of the HapMap data, 1,087 additional SNPs that they discovered in two West African copies of the X chromosome, and sequence data consisting of over a billion base pairs of DNA from five North Europeans, four East Asians, and five Africans.

First, using the SNP data, they obtained the ratio of X chromosome to autosomal allele frequency differentiation between two populations (FST) to estimate the amount of genetic drift.  The ratios obtained between North Europeans and East Asians were not significantly different from the expected ratio (3/4 = 0.75), but the ratios between Africans and non-Africans were reduced.
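To make the expected value of 0.75 concrete, here is a minimal sketch of how two FST values could be turned into an X-to-autosome ratio.  It assumes the textbook pure-drift relationship F = 1 - exp(-t / (2Ne)), so that -ln(1 - F) is proportional to 1/Ne.  The function names and the numbers are hypothetical, and this is not necessarily the exact estimator that Keinan et al. used.

```python
import math


def expected_fst(generations, ne):
    """Differentiation expected after pure drift (assumed model)."""
    return 1.0 - math.exp(-generations / (2.0 * ne))


def drift_from_fst(fst):
    """Convert F into drift units t / (2Ne) under the same assumed model."""
    return -math.log(1.0 - fst)


def x_to_autosome_ratio(fst_auto, fst_x):
    """Autosomal drift divided by X drift, which estimates Ne_X / Ne_autosome."""
    return drift_from_fst(fst_auto) / drift_from_fst(fst_x)


# Hypothetical example: 2,000 generations of drift, autosomal Ne of 10,000,
# and an X chromosome Ne that is three quarters of the autosomal Ne.
fst_auto = expected_fst(2000, 10000)
fst_x = expected_fst(2000, 7500)
print(fst_auto, fst_x, x_to_autosome_ratio(fst_auto, fst_x))
```

The X chromosome shows the larger FST because it drifts faster, so a ratio below 0.75 means that the X chromosome drifted even more than its smaller copy number alone would predict.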

Second, they compared the X chromosome and autosomal SNP allele frequency distributions within each population.  The shapes of the X chromosome and autosomal allele frequency distributions were significantly different in non-Africans.  Non-Africans have more high-frequency derived alleles on the X chromosome than expected, and their X chromosome allele frequency distribution does not fit the expected distribution.

Third, they obtained the X-to-autosome sequence divergence ratios for each population.  West Africans have a ratio close to the expected value, but non-Africans have significantly smaller ratios than expected (0.635 for North Europeans and 0.690 for East Asians).

They think that the X chromosome experienced accelerated genetic drift, and that sex-biased demographic processes rather than natural selection are the likely explanation.  However, the data do not support polygyny as one of these processes, because polygyny increases the ratio, whereas they observed decreased ratios.  Alternatively, they suggest that non-Africans received long-range male migration from Africa, or that females have a longer generation time than males.  Another possibility is that some females were reproductively more successful than others during the out-of-Africa dispersal.
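The direction of the polygyny effect can be checked with the standard formulas for effective population size when the numbers of breeding females and males differ: Ne for autosomes is 4·Nf·Nm / (Nf + Nm) and Ne for the X chromosome is 9·Nf·Nm / (4·Nm + 2·Nf).  The sketch below is only an illustration under those idealized formulas; the function names and population sizes are hypothetical and are not taken from either paper.

```python
def ne_autosome(n_females, n_males):
    """Autosomal effective size with unequal numbers of breeding females and males."""
    return 4.0 * n_females * n_males / (n_females + n_males)


def ne_x(n_females, n_males):
    """X chromosome effective size with unequal numbers of breeding females and males."""
    return 9.0 * n_females * n_males / (4.0 * n_males + 2.0 * n_females)


def x_to_autosome_ne_ratio(n_females, n_males):
    """Expected X-to-autosome ratio of effective population sizes."""
    return ne_x(n_females, n_males) / ne_autosome(n_females, n_males)


# Hypothetical scenarios (breeding individuals per generation).
scenarios = {
    "equal sexes": (1000, 1000),
    "polygyny (fewer breeding males)": (1000, 250),
    "fewer breeding females": (250, 1000),
}
for label, (n_f, n_m) in scenarios.items():
    print(label, x_to_autosome_ne_ratio(n_f, n_m))
```

With equal numbers the ratio is 0.75.  Reducing the number of breeding males pushes it above 0.75, and reducing the number of breeding females pushes it below 0.75, which is why ratios below 0.75 are hard to explain by polygyny alone.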

The two groups of researchers reached different conclusions, possibly for several reasons.  First, the samples used by the two groups were different: Hammer et al. included sample populations that Keinan et al. did not use.  Second, although Hammer et al. have sequence data from more individuals than Keinan et al., they used a much smaller amount of genomic data.  Third, the two groups used very different analytical methods.

Race: a social destruction of a biological concept

Sesardic, N. (2010). “Race: a social destruction of a biological concept.” Biology and Philosophy 25(2): 143-162.

Anthropologists, other social scientists, philosophers, and human population geneticists have argued that there is no genetic basis for racial classification.  In this article, Sesardic (2010) argues that these arguments against a genetic basis for human race are not supported by recent multilocus genetic data.  His point is not that human biological races exist; rather, he questions the scientific basis of the arguments for the non-existence of biological race.

Like Pigliucci and Kaplan, Sesardic starts with the problem of defining race.  However, he mainly focuses on how philosophers and others who argue that there are no genetic differences among human groups define race, in order to show that their definitions are not supported by recent genetic data showing genetic differences among human groups.

Sesardic argues that if allele frequencies at a single locus are used for racial classification, individuals cannot be reliably assigned to the correct racial categories.  However, if multilocus genetic data are used, as demonstrated by Risch and his colleagues and by Rosenberg et al. (2002), many individuals can be correctly classified into racial or geographical categories.  Similarly, forensic anthropologists who look at many skeletal traits can accurately infer the racial identity of individuals.
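The statistical point about single-locus versus multilocus classification can be illustrated with a toy simulation.  This is only a sketch under strong assumptions: two hypothetical populations, every locus independent, and the same modest allele frequency difference (0.4 versus 0.6) at each locus.  None of the names or numbers come from Sesardic, Risch, or Rosenberg et al.

```python
import math
import random

FREQ_A = 0.4  # hypothetical allele frequency in population A (assumption)
FREQ_B = 0.6  # hypothetical allele frequency in population B (assumption)


def log_likelihood(allele_count, freq):
    """Log-likelihood of a diploid genotype (0, 1 or 2 copies); constant terms dropped."""
    return allele_count * math.log(freq) + (2 - allele_count) * math.log(1.0 - freq)


def classification_accuracy(n_loci, n_individuals=400, seed=1):
    """Simulate individuals and assign each to the population with the higher likelihood."""
    rng = random.Random(seed)
    correct = 0
    for i in range(n_individuals):
        true_pop = "A" if i % 2 == 0 else "B"
        freq = FREQ_A if true_pop == "A" else FREQ_B
        ll_a = 0.0
        ll_b = 0.0
        for _ in range(n_loci):
            # Draw two allele copies independently at this locus.
            count = int(rng.random() < freq) + int(rng.random() < freq)
            ll_a += log_likelihood(count, FREQ_A)
            ll_b += log_likelihood(count, FREQ_B)
        if ll_a > ll_b:
            guess = "A"
        elif ll_b > ll_a:
            guess = "B"
        else:
            guess = rng.choice("AB")  # break exact ties at random
        correct += guess == true_pop
    return correct / n_individuals


for n_loci in (1, 10, 100):
    print(n_loci, classification_accuracy(n_loci))
```

With one locus the assignment is only somewhat better than a coin flip, while with many loci the small per-locus differences add up and most simulated individuals are assigned correctly.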

As Sesardic suggests, the argument that there are no genetic differences is not supported by many genetic and osteological studies.  However, we should avoid a naïve conclusion.  Multilocus genetic data showing genetic differences among human groups should not be used to argue for the existence of human biological races (note that Sesardic is not arguing this).  We have to consider the evolutionary and historical processes, as well as the sampling and statistical effects, that produce the clustering of human groups and the apparent genetic differences.

Human Genome Diversity Cell Line Panel samples

Cann, H. M., C. d. Toma, et al. (2002). “A Human Genome Diversity Cell Line Panel.” Science 296(5566): 261-262.

Cavalli-Sforza, L. L. (2005). “The Human Genome Diversity Project: past, present and future.” Nature Reviews: Genetics 6: 333-340.

Human Polymorphism Study Center (CEPH) website   

Cultured cell lines from 1050 individuals in 51 populations are stored at the Center for the Study of Human Polymorphism (CEPH), Foundation Jean Dausset, in Paris to facilitate anthropological and medical genetic research.  Samples were collected with full consent.  The sampled populations are of anthropological interest and come from five continents.  Cavalli-Sforza says that these sampled populations are potentially non-admixed with Europeans.

However, because the sampled populations were not randomly drawn from the world, the sampled population set could be inadequate for understanding human evolution and population structure.  For example, despite the great genetic variation that exists in sub-Saharan Africa, only six sub-Saharan African populations were included, and three of the six are forager populations.  One of the sub-Saharan African populations sampled, the Bantus, is a collection of samples from six different Bantu ethnic groups plus 12 Bantu individuals from Kenya, even though the Bantus are linguistically, culturally, and genetically diverse.  Similarly, the Yoruba are a culturally and potentially genetically diverse group.

In contrast to the sparse sampling in Africa, samples from Asia are concentrated around Pakistan (8 ethnic groups) and China (Han Chinese and 14 ethnic minorities).  The gaps in the geographical distribution of sampled populations fall in areas occupied by more admixed populations, such as North Africa, the Middle East, India, and Central Asia.  These areas have experienced a series of prehistoric and historic migrations, expansions of states and empires, and long-distance trade.  There are also many areas of the world besides Africa that require additional sampling.

Cavalli-Sforza addresses the potential inadequacy of the collected samples for the intended research purposes.  Although there are large geographical gaps in the sampled population set and more samples need to be collected in the future, he believes that this initial collection is essential for determining how samples should be collected later.

Research projects that used CEPH samples are reviewed here.

Mobile elements reveal small population size in the ancient ancestors of Homo sapiens

Huff, C. D., J. Xing, et al. (2010). “Mobile elements reveal small population size in the ancient ancestors of Homo sapiens.” Proceedings of the National Academy of Sciences 107(5): 2147-2152

Huff et al. (2010) analyzed genome variation in two samples, focusing on the SNPs around mobile element insertion sites.  The theory behind this project is that mobile element insertions (Alu and LINE1) are much rarer, so the regions around them have deep genealogies (ancient coalescence times).

Their results basically support this theoretical point.  First, the TMRCA estimated from 9,609 SNPs in the 10 kb around insertions was 462,000 years, which is older than the TMRCA estimated from other genomic regions.  Second, and more interestingly, they estimated a significantly larger ancient effective population size than modern effective population size.  They used a coalescent-based maximum likelihood method to estimate three demographic parameters.

Modern effective population size = 8,500

Ancient effective population size = 18,500 (C.I. 14,500-26,000)

Time of population size change = 1.2 M years ago

This means that the effective population size before 1.2 M years ago was 18,500.  The small effective population size of modern humans is consistent with many previous genetic studies, but it is interesting that modern human genomes carry evidence that the ancestors of modern humans, such as Homo erectus, had a much larger effective population size and were much more genetically diverse than anatomically modern humans.  Because the effective population size of modern humans is much smaller than that of chimpanzees, it has been suggested that our ancestors experienced a series of bottlenecks, and these data show that the significant reduction in population size occurred after 1.2 M years ago.  Jorde actually said in the NIH Genome Center Lecture series that our ancestors almost became extinct.
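To get a feel for what these three parameters imply, here is a minimal sketch of the expected coalescence time of two randomly chosen autosomal lineages under a simple two-epoch model: one constant size from the present back to the time of change, and another constant size before that.  Under the standard continuous-time coalescent approximation this expectation is 2·N_modern + 2·(N_ancient - N_modern)·exp(-T / (2·N_modern)), with T in generations.  The 25-year generation time and the function name are my assumptions, and this is not the likelihood method that Huff et al. used, nor does it model the regions around insertions specifically.

```python
import math


def expected_pairwise_tmrca(n_modern, n_ancient, t_change_generations):
    """Expected coalescence time (in generations) of two lineages, two-epoch model."""
    # Probability that the two lineages have not coalesced by the time of change.
    p_no_coalescence = math.exp(-t_change_generations / (2.0 * n_modern))
    return 2.0 * n_modern + 2.0 * (n_ancient - n_modern) * p_no_coalescence


GENERATION_TIME_YEARS = 25.0  # assumption, not taken from the text above

t_change = 1.2e6 / GENERATION_TIME_YEARS  # time of size change in generations
tmrca_generations = expected_pairwise_tmrca(8500, 18500, t_change)
print(tmrca_generations, tmrca_generations * GENERATION_TIME_YEARS)
```

Two sanity checks on the formula: when the two sizes are equal it reduces to the familiar 2N generations, and when the time of change is zero it reduces to twice the ancient size.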

The Khoisan and Bantu genomes: Evidence of recent gene flow between two groups?

Schuster, S. C., W. Miller, et al. (2010). “Complete Khoisan and Bantu genomes from southern Africa.” Nature 463(7283): 943-947.

This research project by Schuster et al. (2010) is the first genetic study to analyze genome sequence variation in Khoisan hunter-gatherers.  Although, as the authors of this article as well as Rasmussen et al. (2010) noted, the association between genetic variants and phenotype is not simple, they found several genetic variants that might be beneficial for a hunter-gatherer lifestyle.  They suggest that, in the future, comparing the genome variation of hunter-gatherer populations with that of agricultural populations will help us understand the genetic changes that occurred as agriculturalists adopted a new lifestyle.

More interestingly, in their principal component analysis (PCA), they found distinctive population clusters.  First, when the Europeans were included, the Africans scattered widely because greater genetic variation exists in Africa.  Also, the Bantus and Yorubans formed a cluster distinct from the Khoisan, showing the uniqueness of Khoisan genetic variation.  Second, when only Africans were analyzed, the Africans formed three different clusters (Khoisan, Bantus, and Yorubans).  The genetic variation among the Khoisan was large, possibly because of recent admixture with their Bantu neighbors.

Interestingly, they did not explain why they observed different clusters in their PCA; instead, they focused on possible gene flow between the Bantus and the Khoisan.  Do they think that if they included more samples from different populations, there would no longer be clear clusters?  Is the genetic distinctiveness between the Bantus and Yorubans observed in the second PCA significant?

Can we celebrate human genetic diversity?

Lahn, B. T. and L. Ebenstein (2009). “Let’s celebrate human genetic diversity.” Nature 461(7265): 726-728.

Some may argue that there is no significant genetic difference between groups of humans and fear that genetic research could provide evidence to support racists’ arguments.  Lahn and Ebenstein (2009), on the other hand, argue that this kind of view is dangerous if genetic research projects show that there are significant genetic differences among human groups.  Instead of arguing for biological egalitarianism, they suggest approaching human genetic diversity with a positive attitude and celebrating it.

We have to note that the authors did not use the term ‘race’, probably because they did not want to mislead readers into believing that genetic studies will provide evidence to support racists’ arguments.  Genetic differences among human groups should not be equated with the existence of biological race.  Clearly, the trend is a shift from ‘race’ to genetic research focused on human genome variation (Royal and Dunston, 2004).  But I wonder how big the genetic differences have to be before we can say that human groups are genetically different.  Are the genetic differences biologically and socially meaningful?  How will the public interpret the genetic data?

Relationship between genetic variation and racial/ethnic identity (instrumentalist paradigm)

Eriksen, T. H. (2001). Ethnic identity, national identity, and intergroup conflict. Social identity, intergroup conflict, and conflict reduction. R. D. Ashmore, L. Jussim and D. Wilder. Oxford, Oxford University Press: 42-68.

Eriksen is an instrumentalist theorist, and he believes that culture and ethnic identity are fluid.  According to Eriksen, in the instrumentalist theoretical framework, ethnicity and identity can be explained as follows (direct quote from Eriksen, 2001):

1)      Although ethnicity is widely believed to express cultural differences, there is variable and complex relationship between ethnicity and culture; and there is certainly no one-to-one relationship between ethnic differences and cultural ones.

2)      Ethnicity is a property of a relationship between two or several groups, not a property of a group; it exists between and not within groups.

3)      Ethnicity is the enduring and systematic communication of cultural differences between groups considering themselves to be distinct.  It appears whenever cultural differences are made relevant in social interaction, and it should thus be studied at level of social life, not at the level of symbolic culture.

4)      Ethnicity is thus relational and also situational: ethnic character of a social encounter is contingent on the situation.  It is not, in other words, inherent.

These explanations should apply to racial identity as well. 

Then, how can we apply this concept of race/ethnicity and identity to human genetic and anthropological genetic research?  I replaced the word ‘ethnicity’ with ‘identity’ and ‘culture’ with ‘gene’ to explain the nature of the relationship between identity and genetic variation.

1)      Although many people believe that there is some kind of relationship between gene/biology and racial/ethnic identity, there is a variable and complex relationship between genetic variation and identity, and there is no one-to-one relationship between gene and identity.

2)      Although applying the concept of a deme to human subgroups is difficult, a deme is more similar to actual human groups or societies than a Mendelian population is.  By thinking of human groups as demes, we can study a property of a relationship (e.g., gene flow) between different groups.

3)      Identity is developed through the enduring and systematic communication of genetic and phenotypic differences between groups considering themselves to be distinct.  Aside from genetic variants in disease-causing genes, genetic differences are made relevant in social interaction, and the relationship between genetic variation and race/ethnicity should be studied at the level of social life.

4)      Identity is relational and situational: the relationship between genetic variation and racial/ethnic identity is contingent on the situation.  Identity is not inherited; it is socially constructed.

I think this is pretty good.  Does anyone have comments?

Paradigm shift: from ‘race’ to human genome variation

Royal, C. D. M. and G. M. Dunston (2004). “Changing the paradigm from ‘race’ to human genome variation.” Nature Genetics 36(11): S5-7.

Royal and Dunston call for a paradigm shift from the use of ‘race’ in biomedical and genetic research to research focused on human genome variation.  Human genome data do not support the traditional concept of ‘race’ as a biological construct of people’s identity.  Recognizing the discordance between human genome variation and race is necessary for understanding the relationship between human genome variation and ethnic health disparities, because in addition to genetic variation, socio-cultural factors are important risk factors for common diseases.  The authors also argue that human genome knowledge can be used to eliminate ethnic health disparities, if the benefits are shared and the messages are understood by the public.

The study of race and ethnic identity is a big area in social science, and there was a paradigm shift in the social sciences, maybe during the 80s and 90s.  In the old paradigm, which still exists today, people held that there is a natural relationship between gene/biology and ethnic/racial identity.  Today, social scientists think that culture and ethnic/racial identity are very fluid and that there is no simple correlation between gene/biology and ethnic/racial identity.  Basically, ethnic and racial identity is socially constructed.

This article shows that there is a big gap between biomedical researchers and social scientists, and that the two groups should collaborate more so that genome knowledge and its message are understood by the public.