Tang et al. (2005) analyzed 326 microsatellite markers in 3,636 individuals from several racial/ethnic groups at 15 different locations to examine the correlation between self-identified race/ethnicity and genetic ancestry. The racial/ethnic groups included in the analysis were white, African American, Hispanic, and Asian (Chinese, including Taiwanese from Taiwan, and Japanese).
Their analyses show very high correspondence between racial/ethnic identity and genetic ancestry. Based on genetic ancestry, more than 99% of individuals were correctly assigned to their self-identified racial category. The racial/ethnic groups formed distinct clusters both on the multidimensional plots based on the calculated genetic distances and in the STRUCTURE analysis.
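To make the "correctly categorized" figure concrete, here is a minimal sketch of how such a concordance rate could be computed from cluster assignments. This is my own illustration, not the authors' procedure: the function name `concordance`, the majority-label rule for naming clusters, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict


def concordance(self_identified, cluster_ids):
    """Fraction of individuals whose cluster's majority label equals their own label."""
    if len(self_identified) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    # Count self-identified labels inside each genetic cluster.
    by_cluster = defaultdict(Counter)
    for label, cluster in zip(self_identified, cluster_ids):
        by_cluster[cluster][label] += 1
    # Assumption: name each cluster after its most common self-identified label.
    majority = {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}
    matches = sum(majority[c] == lab for lab, c in zip(self_identified, cluster_ids))
    return matches / len(self_identified)


# Toy, made-up data (not from the paper): 8 individuals, 1 discordant.
labels = ["A", "A", "A", "B", "B", "B", "B", "A"]
clusters = [0, 0, 0, 1, 1, 1, 1, 1]
rate = concordance(labels, clusters)
print(rate)
```

Note that a rate computed this way depends on who was sampled: if intermediate populations are left out, the clusters are easier to separate and the rate goes up.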
Their main concern is the application to association studies aimed at finding disease-causing genes, not whether there is a biological basis for racial classification, although Neil Risch, one of the lead investigators of this project, shared his thoughts on the problem of defining race in an interview with Jane Gitschier, a human geneticist and PLOS editor.
However, the clusters identified in this study could be statistical constructs resulting from a poor sampling strategy (Weiss and Long, 2009). Why didn’t they include Native Americans, Asian Indians, Central Asians, and Middle Easterners? And why did they include Taiwanese from Taiwan? Also, I am not sure whether they chose the right model for the cluster analysis. They used STRUCTURE (Pritchard et al., 2000) with the NOADMIX option, “so that the entire genome of each individual was assumed to have been derived from a single homogeneous population.” There should be some level of admixture between the racial/ethnic groups, except for Taiwanese, so is each individual in every racial/ethnic group really derived from a single homogeneous population? Later they explain:
“We note that this analysis was not based on determination of individuals’ “racial” ancestry (e.g., estimating individual European, African, and Native American ancestry for the African American and Hispanic subjects). To do so would require inclusion of the nonadmixed ancestral groups (such as Africans and Native Americans) and the use of the “ADMIX” option of structure. What our results do show is that the (admixed) groups included have approximated within-group random mating sufficiently long enough to give rise to distinct genetic clusters.” Are they saying that individuals within the admixed groups (African Americans and Hispanics) are mating randomly? I wish they had explained a little further how they used STRUCTURE and how they interpreted the results.
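To make the difference between the two ancestry models concrete, here is a toy sketch of my own. It is not the authors' code and not STRUCTURE itself: STRUCTURE estimates allele frequencies and assignments jointly by MCMC, whereas here the allele frequencies are assumed known, the values are made up, and the prior over populations is assumed uniform. Under the no-admixture model, every allele copy of an individual comes from a single population; under the admixture model, each allele copy comes from population k with probability q_k.

```python
import math

# Hypothetical allele frequencies, freqs[pop][locus][allele]; values are made up.
freqs = {
    "pop1": [{"a": 0.9, "b": 0.1}, {"a": 0.8, "b": 0.2}],
    "pop2": [{"a": 0.2, "b": 0.8}, {"a": 0.3, "b": 0.7}],
}
# One diploid individual: two allele copies per locus.
genotype = [("a", "b"), ("a", "a")]


def loglik_no_admixture(genotype, pop_freqs):
    """Log-likelihood if ALL allele copies come from one population."""
    return sum(
        math.log(pop_freqs[locus][allele])
        for locus, pair in enumerate(genotype)
        for allele in pair
    )


def posterior_no_admixture(genotype, freqs):
    """Posterior probability of each population, assuming a uniform prior."""
    logs = {pop: loglik_no_admixture(genotype, f) for pop, f in freqs.items()}
    top = max(logs.values())  # subtract the max for numerical stability
    weights = {pop: math.exp(v - top) for pop, v in logs.items()}
    total = sum(weights.values())
    return {pop: w / total for pop, w in weights.items()}


def loglik_admixture(genotype, freqs, q):
    """Log-likelihood if each allele copy comes from pop k with probability q[k]."""
    total = 0.0
    for locus, pair in enumerate(genotype):
        for allele in pair:
            total += math.log(sum(q[pop] * freqs[pop][locus][allele] for pop in freqs))
    return total


post = posterior_no_admixture(genotype, freqs)
mixed = loglik_admixture(genotype, freqs, {"pop1": 0.7, "pop2": 0.3})
print(post, mixed)
```

The no-admixture model is the special case of the admixture model in which q puts all its weight on one population, so it can only answer "which single cluster?" and never "what proportions of ancestry?". That is why I doubt it is the right choice for groups such as African Americans and Hispanics.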
Moreover, they did not consider the social processes by which genetic differences are maintained between ethnic groups, which shows a lack of collaboration between biomedical geneticists and social scientists.