Human genetic variation and population structure: statistical construct or real?

Rosenberg, N. A., J. K. Pritchard, et al. (2002). “Genetic Structure of Human Populations.” Science 298(5602): 2381-2385.

Rosenberg, N. A., S. Mahajan, et al. (2005). “Clines, Clusters, and the Effect of Study Design on the Inference of Human Population Structure.” PLoS Genet 1(6): e70.

Li, J. Z., D. M. Absher, et al. (2008). “Worldwide Human Relationships Inferred from Genome-Wide Patterns of Variation.” Science 319(5866): 1100-1104. (The article is also covered by Anthropology.net and Mathilda’s anthropology Blog)

To examine how worldwide human population structure is identified by Structure-like programs using the HGDP-CEPH samples, I compared the results of three studies.

Number of individuals and genetic markers used

Rosenberg et al., 2002 (1056 individuals; 377 autosomal microsatellite loci)

Rosenberg et al., 2005 (1048 individuals; 783 autosomal microsatellites and 210 insertion/deletion polymorphisms)

Li et al., 2008 (938 individuals; 650,000 single nucleotide polymorphisms)

Types of Structure-like Programs

Rosenberg et al., 2002 (STRUCTURE version 1)

Rosenberg et al., 2005 (STRUCTURE version 2)

Li et al., 2008 (frappe)

Results

K=2 to K=5 show similar patterns in all three studies.  K=2 shows a gradual change from Africa to the New World.  K=3 has three clusters (Africa; Europe/Middle East/South Asia; and Asia/Oceania/Americas).  K=4 separates Amerindians from the Asian cluster, and K=5 separates Oceania from the Asian cluster.

Figure 1 from Rosenberg et al. 2002

The three studies show different patterns at K=6.  Rosenberg et al. (2002) separate the Kalash (Pakistan) from the European/Middle Eastern/South Asian cluster, whereas Li et al. (2008) find a separate cluster for South Asians.  In contrast, Rosenberg et al. (2005) find two clusters for Amerindians (North Americans and Latin Americans).  Moreover, Li et al. (2008) report a seventh cluster, in which Middle Easterners are separated from Europeans, although the Middle Eastern populations show high levels of admixture.

Figure 2 from Li et al. 2008

Interpretations

Rosenberg et al. (2002) and Li et al. (2008) say that the five clusters identified correspond to major geographic groups.  They also recognize the importance of recent gene flow and admixture.  However, all of the authors argue that there are clear patterns of human population clustering, or population structure, because geographic boundaries and socio-cultural practices isolated populations, and that understanding human population structure is important for mapping disease-causing genes.

My comments

I do believe that there are substantial genetic differences among the major geographic groups and that understanding them is very important for mapping disease-causing genes.  However, we should not use the population clusters identified in these studies as evidence of a biological basis for racial categories, for many reasons (three are discussed here).  First, these authors probably think in racial/typological terms and over-emphasize the differences among human groups, without carefully considering the important roles that admixture and gene flow have played.  It seems that Rosenberg et al. (2002, 2005) arranged the populations in the STRUCTURE output to illustrate the genetic differences.  As human population geneticists, their major concerns seem to be the technical aspects, not the social, cultural, and humanistic aspects needed to understand the role of gene flow.  Second, the problems of the HGDP-CEPH collection should have been mentioned, but only Li et al. (2008) mention its limitations, and only very briefly.  We have to think about how including more population samples, especially potentially more admixed populations, would affect the results of the analyses and their interpretation.  Third, the number and type of genetic markers and the Structure-like program used may also affect the results.  The three studies discussed here have slightly different results, and the results are interpreted slightly differently as well.
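To make the sampling concern concrete, here is a toy sketch of my own. It is not STRUCTURE or frappe, and it uses no real HGDP-CEPH data. It assumes hypothetical individuals placed along a one-dimensional cline, splits them into two groups by minimizing within-group squared deviation, and then compares the gap between the groups with the typical spacing inside them. The function names and the positions are illustrative assumptions only.

```python
def best_two_way_split(values):
    """Split sorted values into two groups with minimal within-group squared deviation."""
    xs = sorted(values)

    def sse(group):
        mean = sum(group) / len(group)
        return sum((x - mean) ** 2 for x in group)

    # Try every cut point between neighbouring values and keep the best one.
    best_i = min(range(1, len(xs)), key=lambda i: sse(xs[:i]) + sse(xs[i:]))
    return xs[:best_i], xs[best_i:]


def separation(values):
    """Gap between the two groups divided by the mean spacing inside the groups."""
    left, right = best_two_way_split(values)
    gap = right[0] - left[-1]
    within = [b - a for g in (left, right) for a, b in zip(g, g[1:])]
    return gap / (sum(within) / len(within))


# Hypothetical positions along a cline (arbitrary units, not real data).
even_sampling = list(range(10))                          # sampled all along the cline
clumped_sampling = list(range(5)) + list(range(15, 20))  # middle of the cline unsampled

print(separation(even_sampling))     # gap is no larger than ordinary spacing
print(separation(clumped_sampling))  # gap is much larger than ordinary spacing
```

In this toy case the same underlying gradient yields two groups either way, but the groups only look clearly separated when the middle of the cline is left unsampled. Real Structure-like programs work on multilocus genotypes and are far more complex, so this is only an intuition aid for why adding intermediate or admixed populations could change the picture.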
