Importance of pigmentation

The prevalence of several common diseases differs among populations. Although such health disparities may largely be the result of differential access to health care and differences in socioeconomic factors and environmental exposures, there may also be risk alleles that differ in frequency among populations. Therefore, tests of correlations between genetic ancestry and phenotype can be a good starting point for research on the causes of health disparities1. Even if the ultimate cause for the difference is primarily environmental, individual genetic ancestry analysis can provide a biological reference point against which to study cultural and environmental factors. This line of investigation is in the very early stages of development and has benefited tremendously from studies of skin pigmentation.

Skin pigmentation is one of the most variable phenotypes in humans, but very little is known about the genetic basis and evolutionary history of this polygenic trait. Understanding the genetics of human pigmentation and the distribution of pigmentation alleles within and among populations is important. To a large extent in the US, skin pigmentation is a proxy for 'race', on the basis of which racism has been manifested. Skin color can affect the level of discrimination a person experiences or the quality of medical care he or she receives. Differences in skin color among populations are commonly (and incorrectly) understood as an indication of deeper biological differences among populations. Pigmentation genes are unlike the bulk of the genetic variation among populations and can, and should, be used to help educate the general public about the distribution of genetic variation across the populations of the world and the lack of meaning of 'race' as a system of biological classification. Skin pigmentation is also of interest from the biological point of view. Melanin has a key physiological role, mediating the amount of ultraviolet (UV) radiation reaching the dermal capillaries and therefore influencing the rate of UV-induced photolysis and photoactivation of compounds that function in many physiological pathways. Finally, skin pigmentation is useful as a model phenotype for ancestry and admixture mapping studies and can and should be used to evaluate these methods for assessing the biology and sociology of health disparities.

Constitutive pigmentation, the melanin content of unexposed areas of the skin, is a polygenic trait that is relatively unaffected by environmental factors. Pigmentation varies markedly both within and between geographic regions (Fig. 1). Both dark- and light-skinned populations are found on most continents, and a strong correlation between latitude and incident UV is evident2,3. The differences in pigmentation observed among human populations are probably due to the action of natural selection, but drift and sexual selection may have been important as well. UV radiation levels, and their effects on vitamin D synthesis, photolysis of folate, sunburn and skin cancer, are believed to be important factors in driving the evolution of skin pigmentation genes3,4. Despite advances in our knowledge of the genes involved in the pigmentation pathway5,6,7, little is known about the genetic basis of normal variation in pigmentation within and among human populations. Because of the key role that selection (natural and sexual) has had in the evolution of constitutive pigmentation and other superficial traits, the variation of these phenotypes among human populations is typically higher than the average variation across the genome. In humans, most genetic markers and traits show relatively small differences between populations. The percentage of the genetic variance explained by differences between continental groups for an 'average' marker, measured as FST, is typically only 5–15% of the total variance. In contrast, Relethford8 estimated that 88% of the total variance in skin pigmentation is explained by differences between geographic groups.
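
As a rough illustration of what an FST value measures, the sketch below computes FST for a single biallelic marker as the proportional reduction in expected heterozygosity within populations relative to the pooled total. The function name, the equal weighting of populations and the allele frequencies are assumptions chosen for illustration; they are not data or procedures from the studies cited here.

```python
def fst_biallelic(freqs):
    """F_ST for one biallelic marker from per-population allele frequencies.

    Assumes equally weighted populations (an illustrative simplification).
    """
    if not freqs:
        raise ValueError("need at least one population frequency")
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity if all populations were pooled.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # marker is monomorphic everywhere
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical frequencies: a 'typical' marker and a strongly differentiated one.
print(fst_biallelic([0.5, 0.6, 0.4]))  # small differences, low F_ST
print(fst_biallelic([0.1, 0.9]))       # large differences, about 0.64
```

Under this definition, a marker whose allele frequencies are nearly the same everywhere yields a value close to zero, whereas a marker with very different frequencies between groups yields a value approaching one.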

Figure 1: Global map of skin pigmentation levels.

This map, based on the work of the geographer R. Biasutti, depicts average pigmentation levels across the world. Higher numbers represent darker skin color. Source: D. O'Neil (Behavioral Sciences Department, Palomar College, San Marcos, California, USA).

The distribution of FST values in humans must be examined carefully to avoid two common misinterpretations. First, we should not extrapolate findings based on superficial traits, strongly subject to selection, to the rest of the genome. Phenotypes such as skin pigmentation do not follow the pattern observed for most traits in humans. Second, we should not disregard the genetic differences between human populations as negligible based on the 'average' picture of the genome. FST values at individual loci show a wide dispersion around the mean. For most traits the differences between geographic regions are small, but there is a subset of traits for which diversity between regions is high. This subset may include traits that have had a role in the adaptation of humans to different environments and traits related to differences in drug metabolism or disease risk between populations. The genetic basis underlying these phenotypic differences can be studied using admixture mapping1,9,10,11,12,13,14.

As part of our research on pigmentation variation in human populations and on the application of admixture mapping to elucidate the genetic basis of common pigmentation variation, we studied the relationship between pigmentation and ancestry in five geographically and culturally defined populations of mixed ancestry with a wide range of pigmentation and ancestral proportions. The main goal of this study was to evaluate whether constitutive pigmentation and ancestry are correlated in admixed populations and, if so, what the reasons for the correlation are and what its potential impact on biomedical research is. The study included 232 African Americans living in Washington, DC, USA; 173 African Caribbeans living in the UK; 64 Puerto Rican women living in New York, USA; 156 individuals from the city of Tlapa in the state of Guerrero, Mexico; and 444 Hispanics from San Luis Valley, Colorado, USA. Details about the African American12, African Caribbean12, Puerto Rican15, Mexican16 and Hispanic17 individuals were previously published.

Materials and methods

We measured constitutive skin pigmentation in the African American, African Caribbean, Puerto Rican and Mexican individuals on the upper inner side of both arms of each person using a DermaSpectrometer (Cortex Technology), and we report the melanin content as the melanin index. We measured constitutive skin pigmentation in the Hispanic individuals on the upper inner arm using a Photovolt model 575 spectrophotometer (Photovolt Instruments), and we report the melanin content as the lightness index (see ref. 18 for more information about melanin and lightness indices).

To estimate the relative West African, European and Indigenous American ancestry in each individual and in the population samples, we genotyped genetic markers showing large differences in frequency between parental populations (West African, European and Indigenous Americans) and small frequency differences within regions. These markers are called ancestry-informative markers (AIMs)19. We characterized 34 AIMs in the African American and African Caribbean individuals, 36 AIMs in the Puerto Rican individuals, 24 AIMs in the Mexican individuals and 21 AIMs in the Hispanic individuals. We genotyped AIMs by either McSNP20,21 or conventional agarose gel electrophoresis. More information about the AIMs used in this study was previously published12,15,22. We also deposited information about primer sequences, polymorphic sites and population data for a set of AIMs in the National Center for Biotechnology Information (NCBI) dbSNP database, under the submitter handle PSU-ANTH.

We calculated group admixture levels with the program ADMIX23, which implements a weighted least-squares method. We obtained estimates of individual ancestry using a maximum-likelihood method24 or using the program STRUCTURE 2.0 (ref. 25). We evaluated the correlation between constitutive skin pigmentation and individual ancestry using the Spearman's rho nonparametric test, because in most of the samples the individual ancestry values do not follow a normal distribution. The tests were carried out using the program SPSS (version 10).
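
For readers unfamiliar with the rank-based test, the following is a minimal sketch of Spearman's rho in plain Python: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. It is not the SPSS procedure used in the study, it does not compute a P value, and the ancestry and melanin index values in the example are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical individual ancestry proportions and melanin index values.
ancestry = [0.1, 0.5, 0.3, 0.9, 0.7]
melanin = [30.0, 42.0, 45.0, 60.0, 50.0]
print(spearman_rho(ancestry, melanin))  # about 0.9
```

Because only the ordering of the values enters the calculation, the statistic does not require the ancestry estimates to be normally distributed, which is the reason given above for choosing it.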

Skin color and genetic ancestry in five admixed samples

The five populations sampled in this study show a wide range of both skin pigmentation and admixture levels. The African Caribbean individuals had the highest average melanin index (57.8 ± 0.74), followed by the African American individuals (53.4 ± 0.63) and the Mexican individuals (46.1 ± 0.37), whereas the Puerto Rican individuals had the lowest average constitutive pigmentation (36.8 ± 0.75; Fig. 2). The African Caribbean and African American individuals had a much wider range of skin pigmentation than did the Mexican and Puerto Rican individuals. The average lightness index in the Hispanic individuals was 30.4 ± 0.18. Because we used different technology to measure skin reflectance in the Hispanic individuals, it is not possible to compare directly the distribution of constitutive skin pigmentation in these individuals with that of the other four populations.

Figure 2: Distribution of melanin index values.

(a) African Americans. (b) African Caribbeans. (c) Mexicans. (d) Puerto Ricans.

Figure 3 shows a triangular representation of the average West African, European and Indigenous American ancestry levels estimated using the WLS method23. The African Caribbean and African American individuals were of primarily West African ancestry (87.9 ± 1.1% and 78.7 ± 1.2%, respectively). The second main component of ancestry in these populations was European, which was relatively smaller in African Caribbean than in African American individuals (10.2 ± 1.4% versus 18.6 ± 1.5%). The average Indigenous American contribution was substantially smaller in these two populations (1.9 ± 1.3% and 2.7 ± 1.4%, respectively). The Mexican population was characterized by a large proportion of Indigenous American ancestry (94.5 ± 1.0%), with small proportions of European (4.2 ± 0.9%) and West African (1.3 ± 0.4%) ancestry. The Puerto Rican individuals were of primarily European ancestry (53.3 ± 2.8%) but also had relatively large proportions of West African (29.1 ± 2.3%) and Indigenous American (17.6 ± 2.4%) ancestry. The Hispanic individuals had both European (62.7 ± 2.1%) and Indigenous American ancestry (34.1 ± 1.5%), with a relatively small proportion of West African ancestry (3.2 ± 1.5%).

Figure 3

Triangular plot showing average admixture proportions in African Americans (inverted filled triangle), African Caribbeans (filled circle), Mexicans (filled square), Puerto Ricans (asterisk) and Hispanics (open diamond).

We also estimated the ancestry of each individual using a maximum-likelihood estimation method24. A triangular plot of individual ancestry values is shown in Figure 4. There was a wide range of individual ancestry values in each of the populations, reflecting admixture stratification. The individual admixture values obtained using the program STRUCTURE 2.0 were highly correlated with the maximum-likelihood estimation values (Spearman's rho values ranged between 0.837 and 0.997; P < 0.001).
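
The idea behind a maximum-likelihood estimate of individual ancestry can be sketched for a simplified case with only two parental populations (the study itself considered three). Assuming unlinked biallelic AIMs and Hardy-Weinberg proportions, an individual with ancestry proportion m from population A carries the reference allele at each marker with probability m·pA + (1 − m)·pB, and m is chosen to maximize the likelihood of the observed genotypes. The function names, the grid search and the allele frequencies are illustrative assumptions, not the implementation of the cited method.

```python
import math


def log_likelihood(m, genotypes, freqs_a, freqs_b):
    """Log-likelihood of genotypes (0, 1 or 2 reference alleles per marker)
    for an individual with proportion m of ancestry from population A."""
    eps = 1e-9  # keeps log() finite when a frequency reaches 0 or 1
    total = 0.0
    for g, p_a, p_b in zip(genotypes, freqs_a, freqs_b):
        q = m * p_a + (1.0 - m) * p_b  # expected allele frequency given m
        q = min(max(q, eps), 1.0 - eps)
        total += (math.log(math.comb(2, g))
                  + g * math.log(q)
                  + (2 - g) * math.log(1.0 - q))
    return total


def ml_ancestry(genotypes, freqs_a, freqs_b, steps=1000):
    """Grid-search estimate of m on [0, 1] (two-population simplification)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: log_likelihood(m, genotypes, freqs_a, freqs_b))


# Hypothetical AIM frequencies in the two parental populations.
freqs_a = [0.9, 0.8, 0.95, 0.85]
freqs_b = [0.1, 0.2, 0.05, 0.10]
genotypes = [2, 1, 2, 1]  # one individual's genotypes at the four AIMs
print(ml_ancestry(genotypes, freqs_a, freqs_b))
```

With only a few dozen markers the likelihood surface is fairly flat, so individual estimates carry substantial sampling error; this is one reason to cross-check them against a second method.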

Figure 4

Triangular plot showing individual ancestry proportions for African Americans (inverted filled triangles), African Caribbeans (filled circles), Mexicans (filled squares), Puerto Ricans (asterisks) and Hispanics (open diamonds).

Figure 5 presents a three-dimensional representation of the relationship between individual ancestry and constitutive pigmentation values (melanin index) for the African American, Puerto Rican and Mexican individuals. Individual ancestry is plotted in a triangular format, and constitutive pigmentation values are shown on the vertical axis. This plot shows a trend for individuals with larger proportions of European ancestry to have lighter skin pigmentation.

Figure 5

Triangular, three-dimensional representation of individual ancestry proportions and constitutive pigmentation values (melanin index; on the vertical axis) in African Americans (open circles), Puerto Ricans (closed circles) and Mexicans (triangles).

We further explored the relationship between constitutive skin pigmentation and individual ancestry using bivariate correlation analysis (Table 1). In the samples that were measured with the DermaSpectrometer, we observed a significant positive correlation between melanin index and Indigenous American or West African ancestry, but the strength of the relationship was quite variable (Puerto Rico, ρ = 0.633; African Americans, ρ = 0.440; African Caribbeans, ρ = 0.375; Mexico, ρ = 0.212).

Table 1 Relationship of melanin content and individual ancestry

We observed a significant correlation between constitutive pigmentation and individual ancestry in each of the five admixed samples. This correlation can be explained as a result of population structure due to admixture stratification in the samples. Here, we broadly define 'population structure' as the presence of associations between unlinked markers. The process of admixture between two populations will create linkage disequilibrium (LD) or nonrandom associations between both linked and unlinked markers, but LD between unlinked markers will disappear in just a few generations. LD decays as a function of the recombination rate (θ) between the two markers and the number of generations (n) since the admixture event. This decay in LD can be represented as $D_n = (1 - \theta)^n D_0$, where $D_n$ is the LD $n$ generations after admixture and $D_0$ is the initial LD26. For unlinked loci, θ = 0.5, so after ten generations of random mating the LD will be reduced to $(0.5)^{10} \approx 0.1\%$ of the initial level. Thus, we would not expect to find a significant correlation between pigmentation and individual ancestry if the admixture event happened some generations ago and there is random mating in the admixed population.
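
The arithmetic behind the decay formula in the preceding paragraph can be checked with a few lines of Python. The function name is an assumption made for illustration, and the value θ = 0.01 for a pair of linked markers is a hypothetical example rather than a figure from the study.

```python
def ld_fraction_remaining(theta, generations):
    """Fraction of the initial admixture LD left after the given number of
    generations of random mating, i.e. (1 - theta) ** generations."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination rate must lie between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - theta) ** generations


# Unlinked loci recombine freely (theta = 0.5).
print(ld_fraction_remaining(0.5, 10))   # 0.0009765625, i.e. about 0.1%

# Hypothetical tightly linked pair (theta = 0.01) for comparison.
print(ld_fraction_remaining(0.01, 10))  # about 0.904
```

The contrast between the two cases is what makes admixture mapping possible: after a modest number of generations of random mating, LD persists between linked markers but has essentially vanished between unlinked ones.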

We explored this issue by simulation, comparing the relationship between skin pigmentation and ancestry in an admixed sample with no variation in admixture proportions among the individuals of the sample (no 'admixture stratification') with that relationship in an admixed sample with admixture stratification. We estimated ancestry using 35 or 100 AIMs. We simulated constitutive pigmentation using a simple polygenic additive model in which one allele at each pigmentation locus specifies twice the amount of melanin as the alternative allele. We considered four alternative scenarios in which pigmentation was determined by the action of 2, 5, 10 or 15 loci, respectively. We also evaluated the effect of pigmentation differences between the parental populations. In one model, the pigmentation differences between the parental populations were large, owing to differences in allele frequencies in pigmentation genes (differences between 30% and 63%). In an alternative model, the pigmentation differences between the parental populations were small (differences in frequency not exceeding 10%). As expected, the correlation between pigmentation and ancestry was not significant in the sample without admixture stratification, irrespective of the number of markers used to estimate ancestry, the number of genes determining melanin content or the pigmentation differences between the parental populations. Correlations between pigmentation and ancestry were significant in the admixed sample showing admixture stratification only when there were large pigmentation differences between the parental populations. No correlations were observed when the frequencies of the alleles determining melanin content and, consequently, pigmentation levels were similar in the parental populations. Further details on the simulations are given in Supplementary Note online.
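
A much-reduced sketch of this kind of simulation is given below. It is not the code described in the Supplementary Note: it assumes two parental populations, identical hypothetical allele frequencies at every AIM (0.9 versus 0.1) and at every pigmentation locus (0.8 versus 0.2), a simple moment estimator of ancestry, and the Pearson rather than the Spearman correlation. Admixture stratification is represented by drawing each individual's ancestry proportion uniformly from 0 to 1; the unstratified sample gives every individual a proportion of 0.5.

```python
import random


def allele_copies(rng, m, p_a, p_b):
    """Copies (0-2) of the reference allele at one unlinked locus for an
    individual with ancestry proportion m from population A."""
    q = m * p_a + (1.0 - m) * p_b
    return sum(1 for _ in range(2) if rng.random() < q)


def simulate(rng, n_ind, stratified, n_aims=35, n_pig=5,
             aim_freqs=(0.9, 0.1), pig_freqs=(0.8, 0.2)):
    """Return (estimated ancestry, pigmentation) lists for n_ind individuals."""
    est_ancestry, pigmentation = [], []
    for _ in range(n_ind):
        m = rng.random() if stratified else 0.5
        # Estimate ancestry from the mean AIM allele frequency.
        copies = sum(allele_copies(rng, m, *aim_freqs) for _ in range(n_aims))
        freq = copies / (2.0 * n_aims)
        est = (freq - aim_freqs[1]) / (aim_freqs[0] - aim_freqs[1])
        est_ancestry.append(min(max(est, 0.0), 1.0))
        # Additive model: one allele contributes 2 units of melanin, the other 1.
        dark = sum(allele_copies(rng, m, *pig_freqs) for _ in range(n_pig))
        pigmentation.append(2 * dark + (2 * n_pig - dark))
    return est_ancestry, pigmentation


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
print("stratified:  ", pearson(*simulate(rng, 500, stratified=True)))
print("unstratified:", pearson(*simulate(rng, 500, stratified=False)))
```

Under these assumptions the stratified sample shows a clear positive correlation, whereas the unstratified sample shows a correlation near zero, even though no AIM is linked to any pigmentation locus in either case.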

In an admixed population, two factors can cause population structure: continuous gene flow and assortative mating. Continuous gene flow refers to an admixture model in which there has been an ongoing contribution from one or more parental populations to the admixed population over a period of time extending into the recent past. This creates hidden population structure due to variation in admixture proportions among individuals. Assortative mating can be defined as nonrandom mating according to phenotypic characteristics. If there is assortative mating based on a phenotype (e.g., skin color) or any other factor (e.g., socioeconomic status or education) that is correlated with ancestry, any population structure originally present in the admixed population will be maintained through the generations. Although continuous gene flow and assortative mating are very different processes, the end result is similar, in that there will be variation of individual ancestry in the admixed population. This will be reflected in significant associations between unlinked markers and, potentially, correlations between specific phenotypes and individual ancestry. The results of our simulations show that variation in admixture proportions between individuals has a profound effect on the measured correlations between pigmentation and individual ancestry. A significant correlation is observed in spite of the fact that all the AIMs used to estimate individual ancestry and all the markers determining pigmentation are unlinked.

Given that constitutive pigmentation is relatively independent of environmental factors, significant correlations with individual ancestry are easily explained as the result of genetic factors. For many complex traits and diseases, environmental factors have an important role, and controlling for these factors is a critical aspect to consider when carrying out this type of analysis27. It is possible to envision certain situations in which a positive association between a phenotype and individual ancestry could be misinterpreted as evidence of the presence of genetic factors influencing the phenotype, when in fact the phenotypic differences could be due to environmental factors that covary with ancestry1,28.

Implications of the pigmentation-ancestry correlations

It is important to consider the implications of the wide variation in the strength of the correlation between constitutive pigmentation and individual ancestry in these admixed samples. This variability is presumably a reflection of differences in the degree of population structure present in each population or of the levels of pigmentation differences between the parental populations and the number of genes involved. For example, the strong correlation observed in Puerto Ricans seems to indicate that continuous gene flow, assortative mating or both factors are important in this population. In contrast, the correlation between melanin content and ancestry in Hispanics, though significant, is weak. This is consistent with historical data indicating that this population appeared as a result of a relatively old admixture event and that independent assortment has greatly decreased the association between unlinked markers created by the admixture process (for more information about the history and admixture dynamics in this population, see ref. 17). Alternatively, the differences in the extent of the correlation between constitutive pigmentation and ancestry may be due in part to admixture histories involving populations with widely different pigmentation levels. The Puerto Rican individuals have substantial contributions from three parental groups (Europeans, West Africans and Indigenous Americans). The African American and African Caribbean individuals have contributions mainly from West Africans and Europeans, and the Mexican and Hispanic individuals have primarily European and Indigenous American ancestry.

The results of this study have implications for the use of pigmentation as a 'marker' of ancestry in admixed populations. Depending on the degree of population structure present in the admixed population, the correlation between constitutive pigmentation and individual ancestry may be strong, weak or even absent. Parra et al.29 found no correlation between 'color' (described as a multivariate evaluation based on skin pigmentation, hair color and texture, and the shape of nose and lips) and African ancestry (based on 10 AIMs) in an admixed sample from Brazil. There are important differences in research design and phenotype measurement between our study and that of Parra et al., precluding direct comparison. For example, we objectively measured pigmentation on a continuous scale using reflectometry, whereas in the paper by Parra et al.29, 'color' was subjectively assessed by one of several observers and then compressed into three categories (white, intermediate and black). Owing to differences in the study design and the populations analyzed, the results of the two studies are not necessarily contradictory. In admixed populations where there is substantial admixture stratification, constitutive pigmentation will be correlated with individual ancestry, whereas in other admixed populations where truly random mating has occurred for a number of generations since the last significant admixture event, both variables may show complete independence. These results emphasize the need to be cautious when using pigmentation as a marker of ancestry or when extrapolating the results observed in one admixed sample to samples from other admixed populations.

The variation in the degree of population structure in these admixed populations from the Americas also has implications for the application of association approaches to mapping genes associated with complex diseases. In admixed populations with strong admixture stratification, such as the group from Puerto Rico, admixture LD is expected to extend over long genomic regions, but there is an inflated risk of false positive results. We previously showed how marked this effect can be in our study of skin pigmentation variation in two of the samples included in this study, the African Americans and the African Caribbeans12. Approximately one-half of the AIMs that we analyzed in those samples had a significant effect on pigmentation, even though most of them are located in genomic regions with no pigmentation gene candidates. In these samples, many markers give a positive result not because they have a functional effect on pigmentation, but because they are informative for ancestry, and pigmentation and ancestry show a strong correlation due to the presence of admixture stratification. When individual ancestry was used in the ANOVA test as a conditioning variable, most of the significant signals disappeared, except the effect of one AIM located in the candidate pigmentation gene TYR12.
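
The way in which conditioning on individual ancestry removes such spurious signals can be illustrated with a partial correlation, used here as a simple stand-in for the ANOVA with a conditioning variable described above. In this sketch a single hypothetical AIM has no effect on the phenotype, the phenotype depends only on ancestry plus Gaussian noise, and the true ancestry proportion is treated as known; all names, frequencies and effect sizes are assumptions for illustration.

```python
import random


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2)) ** 0.5


def simulate_null_marker(rng, n_ind, p_a=0.9, p_b=0.1):
    """Stratified admixed sample in which the marker is informative for
    ancestry but has no functional effect on the phenotype."""
    ancestry, genotype, phenotype = [], [], []
    for _ in range(n_ind):
        m = rng.random()                      # individual ancestry proportion
        q = m * p_a + (1.0 - m) * p_b         # expected allele frequency
        g = sum(1 for _ in range(2) if rng.random() < q)
        ancestry.append(m)
        genotype.append(g)
        phenotype.append(10.0 * m + rng.gauss(0.0, 1.0))
    return ancestry, genotype, phenotype


rng = random.Random(7)
ancestry, genotype, phenotype = simulate_null_marker(rng, 2000)
print("unadjusted:           ", pearson(genotype, phenotype))
print("adjusted for ancestry:", partial_corr(genotype, phenotype, ancestry))
```

In this setting the unadjusted correlation between marker and phenotype is clearly positive, while the ancestry-adjusted value is close to zero. In practice ancestry is estimated with error, so the adjustment would be less complete than in this idealized example.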

The gene CYP3A4 provides another interesting example. This gene has been associated in several studies with prostate cancer risk in African Americans. Kittles and collaborators30 recently reported that a marker located in CYP3A4 showed a significant effect on prostate cancer, but the effect disappeared after controlling for admixture. The CYP3A4 marker has a large frequency difference between European and African populations, and so the positive association observed in previous studies in African Americans could be due to the presence of population structure. Thus, when carrying out association studies in admixed populations, it is necessary to consider that a substantial number of markers unrelated functionally to the phenotype being analyzed could show up as false positive results, particularly those markers showing high frequency differences between the populations involved in the admixture process. Fortunately, there are several statistical methods available to control for the confounding effects of population structure in association studies31,32, including a method specifically designed for admixed samples10,11.

In populations with a reduced level of population structure, such as the Hispanic group in our study, admixture LD will extend over shorter genomic regions, but these samples will be less prone to false positive results. In general, for admixed populations, admixture mapping is an ideal approach to map genes underlying population differences in complex traits and diseases11,14,33,34. Admixture mapping relies on testing for association of the trait with locus ancestry inferred from marker data. This approach requires a genome-wide panel of AIMs to infer ancestry. Until recently, the applicability of admixture mapping was restricted by the limited availability of informative markers. A high-density admixture map for gene discovery in African Americans is now available13, and admixture maps for other populations, such as Hispanics, will be at hand in the near future.

Skin pigmentation–ancestry correlations provide the most compelling evidence to date that admixture mapping will be successful. Skin pigmentation is simple in terms of gene-environment interactions, especially in contrast to hypertension, obesity and type 2 diabetes, diseases that differ across populations for both genetic and environmental reasons. This relative simplicity may be key in the utility of pigmentation as a model phenotype in these types of studies. Additionally, important questions remain regarding the potential direct effects of skin pigmentation levels on disease risk that are mediated either through the UV filtering properties of melanin or its effect on color-based discrimination. Indeed, because pigmentation is genetically simple, it may be possible to identify the genes underlying constitutive pigmentation, predict the pigmentation level of subjects in a study, and measure and make adjustments for some of the effects of pigmentation in samples of individuals where skin reflectance data is not available. But there will be sources of variation in personal reactions and experiences that would limit this approach from measuring many of the social effects of constitutive pigmentation. As such, concerted research programs directed specifically at understanding pigmentation and its interactions with other physiological, sociological and psychological systems are needed to understand the manifold effects of this phenotype.

Note: Supplementary information is available on the Nature Genetics website.