Dear Editor.

We read the recent article by Magalhães da Silva et al1 reporting the correlation between biogeographic ancestry, estimated using 30 ancestry informative markers (AIMs), and self-reported skin color in two Northeastern Brazilian populations (Fortaleza and Salvador, capitals of the states of Ceará and Bahia, respectively). The authors observed that African ancestry is more strongly correlated with self-reported skin color in the sample from Salvador than in the one from Fortaleza, and that the use of different African populations as proxies for Brazilians' African ancestors may influence the results.

One unusual aspect of this study was the use of Han Chinese from Beijing (CHB) as pseudo-ancestors for Amerindians, on the grounds that no Native American population is included in the HapMap Project (http://hapmap.ncbi.nlm.nih.gov/) and that the CHB population has been shown to have allele frequencies similar to those of Amerindians.1

It is well known that most Latin American countries are inhabited by tri-hybrid populations derived from African, Amerindian and European roots, whose relative proportions vary considerably.2 As the vast majority of published studies inferring the admixture proportions of Latin American populations have not used the CHB population as a proxy for Amerindian ancestors, we compared the admixture estimates obtained for some Latin American populations using either Chinese or Native American populations as proxies for Amerindian ancestors.

To estimate admixture proportions, we used the following Latin American populations from the 1000 Genomes Project Phase III3: Colombians from Medellin, Colombia (CLM); Mexican Ancestry from Los Angeles, USA (MXL); Peruvians from Lima, Peru (PEL); and Puerto Ricans from Puerto Rico (PUR). Besides the CHB population, we used the Utah Residents with Northern and Western European ancestry (CEU) and the Yoruba in Ibadan, Nigeria (YRI) as proxies for European and African ancestors, respectively, both also obtained from the 1000 Genomes Project Phase III database. For Amerindian ancestors, a combination of Mayans, Quechuans and Nahua Natives (we call this group AMI) was used as pseudo-ancestors, as described by Kosoy et al.4

One hundred and twenty-seven SNPs were used as AIMs from a set of 128 SNPs validated by Kosoy et al4 (the rs10954737 SNP is not present in the 1000 Genomes database). After merging the 1000 Genomes and AMI data using PLINK version 1.90,5 we ran STRUCTURE v. 2.3.4 (Pritchard et al6) through the ParallelStructure R package.7, 8 We performed 10 independent runs, each with 100 000 burn-in steps and 100 000 Markov chain Monte Carlo replicates, assuming three ancestral populations (K=3) under the admixture model with correlated allele frequencies and the parameter USEPOPINFO=1. The CLUMPAK server was used to generate bar plots of individual and population ancestry proportions.9 We performed two separate analyses: one using CEU, YRI and AMI as pseudo-ancestors and another using CEU, YRI and CHB.
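The run settings listed above can be written down as STRUCTURE-style parameter entries. The following is a minimal sketch, not the configuration files actually used in the analysis: the keyword names are assumed to follow the mainparams/extraparams conventions of STRUCTURE 2.3.x and should be checked against the software manual, and the helpers `render_params` and `job_list` are hypothetical names introduced only for illustration.

```python
# Settings as reported in the text. Keyword names are ASSUMED to match the
# STRUCTURE 2.3.x parameter files; verify them against the manual before use.
MAINPARAMS = {
    "MAXPOPS": 3,        # K = 3 ancestral populations
    "BURNIN": 100000,    # burn-in steps
    "NUMREPS": 100000,   # MCMC replicates after burn-in
}
EXTRAPARAMS = {
    "NOADMIX": 0,        # 0 = admixture model
    "FREQSCORR": 1,      # 1 = correlated allele frequencies
    "USEPOPINFO": 1,     # use reference-population labels
}
N_RUNS = 10              # independent replicates, scheduled outside STRUCTURE


def render_params(params):
    """Render a dict as '#define KEY VALUE' lines (hypothetical helper)."""
    return "\n".join(f"#define {key} {value}" for key, value in params.items())


def job_list(n_runs, k):
    """One (job id, K) pair per independent replicate (hypothetical helper)."""
    return [(f"run{i}", k) for i in range(1, n_runs + 1)]


if __name__ == "__main__":
    print(render_params(MAINPARAMS))
    print(render_params(EXTRAPARAMS))
    print(job_list(N_RUNS, MAINPARAMS["MAXPOPS"]))
```

The same settings would be used twice, once with AMI and once with CHB as the third reference population, so that only the Amerindian proxy differs between the two analyses.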

The bar plots with the admixture estimates are shown in Figure 1, and the ancestry proportions for the CLM, MXL, PEL and PUR populations are displayed in Table 1. Amerindian ancestry tended to increase, and European ancestry to decrease, when the CHB population was used as the proxy for Native American ancestors.

Figure 1

Bar plot with the admixture estimate of Latin American populations using AMI (top) and CHB (bottom) populations as proxies for the Amerindian ancestors. CEU=Utah Residents with Northern and Western European ancestry; YRI=Yoruba in Ibadan, Nigeria; AMI=combination of Mayans, Quechuans Amerindians and Nahua Native Americans; CHB=Han Chinese from Beijing; CLM=Colombians from Medellin, Colombia; MXL=Mexican Ancestry from Los Angeles, USA; PEL=Peruvians from Lima, Peru; PUR=Puerto Ricans from Puerto Rico.

Table 1 Proportions of European (EUR), African (AFR) and Native American (AMR) ancestors in Latin American populations

In general, the results obtained using AMI pseudo-ancestors are more similar to those reported in the literature, including studies of populations from the same cities as CLM, MXL, PEL and PUR (Table 1). Magalhães da Silva et al1 estimated contributions of 54.7, 12.3 and 33% for EUR, AFR and CHB, respectively, in self-declared ‘white’ individuals from Fortaleza, whereas Pena et al,10 using 40 validated AIMs, found 75.8, 13.3 and 10.9% in a group of ‘white’ individuals from the same city. The same tendency toward an increased AMR contribution when CHB is used as pseudo-ancestors was seen in the analysis of ‘brown’ individuals. Curiously, this tendency was not confirmed when the results for the population from Salvador were compared with the findings of Pena et al.10
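The size of the shift for ‘white’ individuals from Fortaleza can be made explicit by subtracting the two sets of published estimates. The snippet below is only an illustrative calculation on the figures quoted above; the variable and function names are hypothetical, and the CHB-derived component of Magalhães da Silva et al is labelled AMR here for comparability.

```python
# Published ancestry estimates (%) for self-declared 'white' individuals
# from Fortaleza, as quoted in the text.
magalhaes_chb_proxy = {"EUR": 54.7, "AFR": 12.3, "AMR": 33.0}  # CHB as proxy
pena_et_al = {"EUR": 75.8, "AFR": 13.3, "AMR": 10.9}


def difference_in_points(a, b):
    """Per-component difference a - b in percentage points (one decimal)."""
    return {component: round(a[component] - b[component], 1) for component in a}


if __name__ == "__main__":
    for component, delta in difference_in_points(
        magalhaes_chb_proxy, pena_et_al
    ).items():
        print(f"{component}: {delta:+.1f} percentage points")
```

The Amerindian component is 22.1 percentage points higher with the CHB proxy, almost entirely at the expense of the European component (21.1 points lower), while the African component differs by only 1.0 point.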

We also evaluated whether the allele frequencies of the ancestry markers differed between the AMI and CHB populations: in our set of 127 AIMs, 100 showed significant differences in allele frequency (P-value from χ² test <0.05). Moreover, we observed only a weak correlation between the allele frequencies of the AMI and CHB populations (r=0.33). Finally, average pairwise FST values were higher when comparing AMI with CEU or YRI (0.30 and 0.35, respectively) than when comparing CHB with CEU or YRI (0.16 and 0.25, respectively). As higher divergence corresponds to better informativeness,11 we hypothesize that AMI individuals should be preferred to CHB individuals as pseudo-ancestors of Native Americans.
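The three comparisons described above (per-marker χ² test, correlation of allele frequencies and pairwise FST) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the text does not specify the FST estimator, so a simple heterozygosity-based Wright's FST for two equally weighted populations is assumed, the χ² test is a Pearson test on a 2×2 table of allele counts without continuity correction, and all allele counts are made-up toy values.

```python
import math


def chi2_allele_test(count1, n1, count2, n2):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    count*: copies of the reference allele; n*: chromosomes sampled.
    Returns (chi2, p). The p-value uses the identity that the 1-df
    chi-square survival function equals erfc(sqrt(x / 2)).
    """
    table = [[count1, n1 - count1], [count2, n2 - count2]]
    rows = [n1, n2]
    cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def wright_fst(p1, p2):
    """Per-locus FST = (H_T - H_S) / H_T for a biallelic marker (ASSUMED estimator)."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # marker monomorphic in both populations
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Toy data: (reference-allele count, chromosomes) per marker. NOT real data.
    pop_a = [(90, 100), (20, 100), (55, 100)]
    pop_b = [(40, 100), (25, 100), (10, 100)]
    freq_a = [c / n for c, n in pop_a]
    freq_b = [c / n for c, n in pop_b]
    n_sig = sum(
        chi2_allele_test(ca, na, cb, nb)[1] < 0.05
        for (ca, na), (cb, nb) in zip(pop_a, pop_b)
    )
    mean_fst = sum(wright_fst(a, b) for a, b in zip(freq_a, freq_b)) / len(freq_a)
    print("markers with P < 0.05:", n_sig)
    print("Pearson r:", pearson_r(freq_a, freq_b))
    print("mean per-locus FST:", mean_fst)
```

Averaging per-locus values is only one way to summarise FST across markers; estimators that correct for sample size or that take a ratio of averages can give somewhat different values.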

In light of this, we believe that the results of Magalhães da Silva et al1 may be biased owing to the use of the CHB population instead of AMI as reference samples for Amerindian ancestors. Genotyping data from AMI populations, such as those from Kosoy et al4 and from the Human Genome Diversity Project (http://www.hagsc.org/hgdp/files.html), are publicly available and may serve as a source of AIMs, as a subset of the panel reported by Galanter et al12 can be downloaded from them.

In future studies of Latin American tri-hybrid admixture, we suggest considering genotyping data from AMI populations as the first choice, possibly comparing the results with those obtained using CHB as a proxy, as carried out by Magalhães da Silva et al.1 Such a comparison will allow a definitive choice of the better representatives of Amerindian ancestry.