Integrative functional genomic analyses identify genetic variants influencing skin pigmentation in Africans

Abstract

Skin color is highly variable in Africans, yet little is known about the underlying molecular mechanism. Here we applied massively parallel reporter assays to screen 1,157 candidate variants influencing skin pigmentation in Africans and identified 165 single-nucleotide polymorphisms showing differential regulatory activities between alleles. We combined Hi-C, genome editing and melanin assays to identify regulatory elements for MFSD12, HMG20B, OCA2, MITF, LEF1, TRPS1, BLOC1S6 and CYB561A3 that impact melanin levels in vitro and modulate human skin color. We found that independent mutations in an OCA2 enhancer contribute to the evolution of human skin color diversity and detected signals of local adaptation at enhancers of MITF, LEF1 and TRPS1, which may contribute to the light skin color of Khoesan-speaking populations from Southern Africa. Additionally, we identified CYB561A3 as a novel pigmentation regulator that impacts genes involved in oxidative phosphorylation and melanogenesis. These results provide insights into the mechanisms underlying human skin color diversity and adaptive evolution.

Fig. 1: Massively parallel screening of genetic variants associated with African skin pigmentation.
Fig. 2: Regulatory variant rs6497271 impacts OCA2 expression and contributes to human skin color variations.
Fig. 3: Continuing evolution of OCA2 enhancer E2 contributes to African skin pigmentation diversity.
Fig. 4: Regulatory variant rs111969762 near MITF contributes to the light skin color of the San.
Fig. 5: Regulatory SNPs of LEF1 and TRPS1 contribute to the light skin color of the San.
Fig. 6: CYB561A3 and TMEM138 are the primary target genes of GWAS-SNP rs7948623.
Fig. 7: CYB561A3 affects melanin levels in MNT-1.

Data availability

The epigenomic, Hi-C and HiChIP data were uploaded to the UCSC Genome Browser and are available at https://genome.ucsc.edu/s/fengyq/Tishkoff_Lab%2Dhg38%2DMPRA%2DHiC_Pigmentation. All RNA-seq and epigenomic data generated in this study are available at GEO under accession GSE240717. Genotype data for GWAS are available at dbGaP under accession phs001396.v1.p1. Source data are provided with this paper.

Code availability

Public software and packages were used following the developers’ manuals. The custom code used for data analysis has been deposited at GitHub (https://github.com/fengyq/nature_gentics_codes) and Zenodo (https://zenodo.org/records/10198223; ref. 101).

References

  1. Jablonski, N. G. & Chaplin, G. Colloquium paper: human skin pigmentation as an adaptation to UV radiation. Proc. Natl Acad. Sci. USA 107, 8962–8968 (2010).

  2. Barsh, G. S. What controls variation in human skin color? PLoS Biol. 1, E27 (2003).

  3. Beleza, S. et al. Genetic architecture of skin and eye color in an African–European admixed population. PLoS Genet. 9, e1003372 (2013).

  4. Liu, F. et al. Genetics of skin color variation in Europeans: genome-wide association studies with functional follow-up. Hum. Genet. 134, 823–835 (2015).

  5. Martin, A. R. et al. An unexpectedly complex architecture for skin pigmentation in Africans. Cell 171, 1340–1353 (2017).

  6. Galván-Femenía, I. et al. Multitrait genome association analysis identifies new susceptibility genes for human anthropometric variation in the GCAT cohort. J. Med. Genet. 55, 765–778 (2018).

  7. Neale lab UK-Biobank GWAS result. Neale Lab http://www.nealelab.is/uk-biobank/ (2018).

  8. Adhikari, K. et al. A GWAS in Latin Americans highlights the convergent evolution of lighter skin pigmentation in Eurasia. Nat. Commun. 10, 358 (2019).

  9. Lona-Durazo, F. et al. Meta-analysis of GWA studies provides new insights on the genetic architecture of skin pigmentation in recently admixed populations. BMC Genet. 20, 59 (2019).

  10. Jiang, L., Zheng, Z., Fang, H. & Yang, J. A generalized linear mixed model association tool for biobank-scale data. Nat. Genet. 53, 1616–1621 (2021).

  11. Batai, K. et al. Genetic loci associated with skin pigmentation in African Americans and their effects on vitamin D deficiency. PLoS Genet. 17, e1009319 (2021).

  12. Pairo-Castineira, E. et al. Expanded analysis of pigmentation genetics in UK Biobank. Preprint at bioRxiv https://doi.org/10.1101/2022.01.30.478418 (2022).

  13. Crawford, N. G. et al. Loci associated with skin pigmentation identified in African populations. Science 358, eaan8433 (2017).

  14. Miller, C. T. et al. cis-Regulatory changes in KIT ligand expression and parallel evolution of pigmentation in sticklebacks and humans. Cell 131, 1179–1189 (2007).

  15. Tsetskhladze, Z. R. et al. Functional assessment of human coding mutations affecting skin pigmentation using zebrafish. PLoS ONE 7, e47398 (2012).

  16. Visser, M., Kayser, M. & Palstra, R.-J. HERC2 rs12913832 modulates human pigmentation by attenuating chromatin-loop formation between a long-range enhancer and the OCA2 promoter. Genome Res. 22, 446–455 (2012).

  17. Praetorius, C. et al. A polymorphism in IRF4 affects human pigmentation through a tyrosinase-dependent MITF/TFAP2A pathway. Cell 155, 1022–1033 (2013).

  18. Fan, S. et al. Whole-genome sequencing reveals a complex African population demographic history and signatures of local adaptation. Cell 186, 923–939.e14 (2023).

  19. Gordon, M. G. et al. lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements. Nat. Protoc. 15, 2387–2412 (2020).

  20. Akey, J. M. et al. Tracking footprints of artificial selection in the dog genome. Proc. Natl Acad. Sci. USA 107, 1160–1165 (2010).

  21. Myint, L., Avramopoulos, D. G., Goff, L. A. & Hansen, K. D. Linear models enable powerful differential activity analysis in massively parallel reporter assays. BMC Genomics 20, 209 (2019).

  22. Adelmann, C. H. et al. MFSD12 mediates the import of cysteine into melanosomes and lysosomes. Nature 588, 699–704 (2020).

  23. Luecke, S. et al. The aryl hydrocarbon receptor (AHR), a novel regulator of human melanogenesis. Pigment Cell Melanoma Res. 23, 828–833 (2010).

  24. Kayser, M. et al. Three genome-wide association studies and a linkage analysis identify HERC2 as a human iris color gene. Am. J. Hum. Genet. 82, 411–423 (2008).

  25. Lona-Durazo, F. et al. A large Canadian cohort provides insights into the genetic architecture of human hair colour. Commun. Biol. 4, 1253 (2021).

  26. Simcoe, M. et al. Genome-wide association study in almost 195,000 individuals identifies 50 previously unidentified genetic loci for eye color. Sci. Adv. 7, eabd1239 (2021).

  27. Liang, Z. et al. BL-Hi-C is an efficient and sensitive approach for capturing structural and regulatory chromatin interactions. Nat. Commun. 8, 1622 (2017).

  28. Mumbach, M. R. et al. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat. Methods 13, 919–922 (2016).

  29. Bhattacharyya, S., Chandra, V., Vijayanand, P. & Ay, F. Identification of significant chromatin contacts from HiChIP data by FitHiChIP. Nat. Commun. 10, 4221 (2019).

  30. Ochoa, D. et al. Open Targets Platform: supporting systematic drug-target identification and prioritisation. Nucleic Acids Res. 49, D1302–D1310 (2021).

  31. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  32. Albers, P. K. & McVean, G. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol. 18, e3000586 (2020).

  33. Levy, C., Khaled, M. & Fisher, D. E. MITF: master regulator of melanocyte development and melanoma oncogene. Trends Mol. Med. 12, 406–414 (2006).

  34. Klein, J. C. et al. A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nat. Methods 17, 1083–1091 (2020).

  35. Tan, B. et al. FOXP3 over-expression inhibits melanoma tumorigenesis via effects on proliferation and apoptosis. Oncotarget 5, 264–276 (2014).

  36. Cao, Y. et al. Accurate loop calling for 3D genomic data with cLoops. Bioinformatics 36, 666–675 (2020).

  37. Takeda, K. et al. Induction of melanocyte-specific microphthalmia-associated transcription factor by Wnt-3a. J. Biol. Chem. 275, 14013–14016 (2000).

  38. Bondurand, N. et al. Interaction among SOX10, PAX3 and MITF, three genes altered in Waardenburg syndrome. Hum. Mol. Genet. 9, 1907–1917 (2000).

  39. Kichaev, G. et al. Leveraging polygenic functional enrichment to improve GWAS power. Am. J. Hum. Genet. 104, 65–75 (2019).

  40. Morgan, M. D. et al. Genome-wide study of hair colour in UK Biobank explains most of the SNP heritability. Nat. Commun. 9, 5271 (2018).

  41. Visconti, A. et al. Genome-wide association study in 176,678 Europeans reveals genetic loci for tanning response to sun exposure. Nat. Commun. 9, 1684 (2018).

  42. Larimore, J. et al. Mutations in the BLOC-1 subunits dysbindin and muted generate divergent and dosage-dependent phenotypes. J. Biol. Chem. 289, 14291–14300 (2014).

  43. Saito, H. et al. Melanocyte-specific microphthalmia-associated transcription factor isoform activates its own gene promoter through physical interaction with lymphoid-enhancing factor 1. J. Biol. Chem. 277, 28787–28794 (2002).

  44. Wang, X. et al. LEF-1 regulates tyrosinase gene transcription in vitro. PLoS ONE 10, e0143142 (2015).

  45. Ishitani, T. et al. The TAK1–NLK–MAPK-related pathway antagonizes signalling between β-catenin and transcription factor TCF. Nature 399, 798–802 (1999).

  46. Ishitani, T., Ninomiya-Tsuji, J. & Matsumoto, K. Regulation of lymphoid enhancer factor 1/T-cell factor by mitogen-activated protein kinase-related Nemo-like kinase-dependent phosphorylation in Wnt/β-catenin signaling. Mol. Cell. Biol. 23, 1379–1389 (2003).

  47. Gai, Z., Gui, T. & Muragaki, Y. The function of TRPS1 in the development and differentiation of bone, kidney, and hair follicles. Histol. Histopathol. 26, 915–921 (2011).

  48. Swoboda, A. et al. STAT3 promotes melanoma metastasis by CEBP-induced repression of the MITF pathway. Oncogene 40, 1091–1105 (2021).

  49. Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).

  50. Sitaram, A. & Marks, M. S. Mechanisms of protein delivery to melanosomes in pigment cells. Physiology 27, 85–99 (2012).

  51. Wang, Z. et al. CYB561A3 is the key lysosomal iron reductase required for Burkitt B-cell growth and survival. Blood 138, 2216–2230 (2021).

  52. Karlsson, M. et al. A single-cell type transcriptomics map of human tissues. Sci. Adv. 7, eabh2169 (2021).

  53. Lee, J. H. et al. Evolutionarily assembled cis-regulatory module at a human ciliopathy locus. Science 335, 966–969 (2012).

  54. Lamason, R. L. et al. SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science 310, 1782–1786 (2005).

  55. Lavado, A., Olivares, C., García-Borrón, J. C. & Montoliu, L. Molecular basis of the extreme dilution mottled mouse mutation: a combination of coding and noncoding genomic alterations. J. Biol. Chem. 280, 4817–4824 (2005).

  56. Seruggia, D., Fernández, A., Cantero, M., Pelczar, P. & Montoliu, L. Functional validation of mouse tyrosinase non-coding regulatory DNA elements by CRISPR–Cas9-mediated mutagenesis. Nucleic Acids Res. 43, 4855–4867 (2015).

  57. Ambrosio, A. L., Boyle, J. A., Aradi, A. E., Christian, K. A. & Di Pietro, S. M. TPC2 controls pigmentation by regulating melanosome pH and size. Proc. Natl Acad. Sci. USA 113, 5622–5627 (2016).

  58. Ploper, D. et al. MITF drives endolysosomal biogenesis and potentiates Wnt signaling in melanoma cells. Proc. Natl Acad. Sci. USA 112, E420–E429 (2015).

  59. Zhang, Y. et al. Lef1 contributes to the differentiation of bulge stem cells by nuclear translocation and cross-talk with the Notch signaling pathway. Int. J. Med. Sci. 10, 738–746 (2013).

  60. Fantauzzo, K. A., Kurban, M., Levy, B. & Christiano, A. M. Trps1 and its target gene Sox9 regulate epithelial proliferation in the developing hair follicle and are associated with hypertrichosis. PLoS Genet. 8, e1003002 (2012).

  61. Fantauzzo, K. A. & Christiano, A. M. Trps1 activates a network of secreted Wnt inhibitors and transcription factors crucial to vibrissa follicle morphogenesis. Development 139, 203–214 (2012).

  62. Yamada, T. et al. Wnt/β-catenin and kit signaling sequentially regulate melanocyte stem cell differentiation in UVB-induced epidermal pigmentation. J. Invest. Dermatol. 133, 2753–2762 (2013).

  63. Andl, T., Reddy, S. T., Gaddapara, T. & Millar, S. E. WNT signals are required for the initiation of hair follicle development. Dev. Cell 2, 643–653 (2002).

  64. Tobias, P. V. & Biesele, M. The Bushmen: San Hunters and Herders of Southern Africa (Human & Rousseau, 1978).

  65. Feng, Y., McQuillan, M. A. & Tishkoff, S. A. Evolutionary genetics of skin pigmentation in African populations. Hum. Mol. Genet. 30, R88–R97 (2021).

  66. Rawofi, L. et al. Genome-wide association study of pigmentary traits (skin and iris color) in individuals of East Asian ancestry. PeerJ 5, e3951 (2017).

  67. Stokowski, R. P. et al. A genomewide association study of skin pigmentation in a South Asian population. Am. J. Hum. Genet. 81, 1119–1132 (2007).

  68. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  69. Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).

  70. Kang, H. M. et al. Efficient and parallelizable association container toolbox, EPACTS v3.3.0. EPACTS http://genome.sph.umich.edu/wiki/EPACTS (2013).

  71. Bhatia, G., Patterson, N., Sankararaman, S. & Price, A. L. Estimating and interpreting FST: the impact of rare variants. Genome Res. 23, 1514–1521 (2013).

  72. McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).

  73. Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).

  74. Barrett, T. et al. NCBI GEO: mining millions of expression profiles—database and tools. Nucleic Acids Res. 33, D562–D566 (2005).

  75. Phenotype: pigmentation phenotype. International Mouse Phenotyping Consortium https://www.mousephenotype.org/data/phenotypes/MP:0001186 (2023).

  76. Dickinson, M. E. et al. High-throughput discovery of novel developmental phenotypes. Nature 537, 508–514 (2016).

  77. Baxter, L. L., Watkins-Chow, D. E., Pavan, W. J. & Loftus, S. K. A curated gene list for expanding the horizons of pigmentation biology. Pigment Cell Melanoma Res. 32, 348–358 (2019).

  78. Uhlen, M. et al. A pathology atlas of the human cancer transcriptome. Science 357, eaan2507 (2017).

  79. Custom Alt-R™ CRISPR–Cas9 guide RNA. Integrated DNA Technologies https://www.idtdna.com/site/order/designtool/index/CRISPR_CUSTOM (2023).

  80. RNA sequencing frequently asked questions. GENEWIZ https://web.genewiz.com/rna-seq-faq (2023).

  81. Magoč, T. & Salzberg, S. L. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963 (2011).

  82. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  83. FastQC. GitHub https://github.com/s-andrews/FastQC (2020).

  84. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  85. Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 4, 1521 (2015).

  86. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  87. Kucukural, A., Yukselen, O., Ozata, D. M., Moore, M. J. & Garber, M. DEBrowser: interactive differential expression analysis and visualization tool for count data. BMC Genomics 20, 6 (2019).

  88. Blighe, K., Rana, S. & Lewis, M. EnhancedVolcano: publication-ready volcano plots with enhanced colouring and labeling. R package version 1.14.0. EnhancedVolcano https://github.com/kevinblighe/EnhancedVolcano (2023).

  89. Luo, W. & Brouwer, C. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics 29, 1830–1831 (2013).

  90. Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).

  91. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  92. McKenna, A. et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  93. Li, H. et al. The sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  94. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).

  95. Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).

  96. Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).

  97. Meers, M. P., Tenenbaum, D. & Henikoff, S. Peak calling by Sparse Enrichment Analysis for CUT&RUN chromatin profiling. Epigenet. Chromatin 12, 42 (2019).

  98. Coetzee, S. G., Coetzee, G. A. & Hazelett, D. J. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 31, 3847–3849 (2015).

  99. Kulakovskiy, I. V. et al. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP–seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).

  100. An, L. et al. OnTAD: hierarchical domain structure reveals the divergence of activity among TADs and boundaries. Genome Biol. 20, 282 (2019).

  101. Feng, Y. Codes for skin pigmentation paper. Zenodo https://doi.org/10.5281/zenodo.10198223 (2023).

  102. Shin, J. H., Blay, S., Graham, J. & McNeney, B. LDheatmap: an R function for graphical display of pairwise linkage disequilibria between single nucleotide polymorphisms. J. Stat. Softw. 16, 1–9 (2006).

  103. Liu, T. et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol. 12, R83 (2011).

Acknowledgements

This research was supported by NIH grants R35 GM134957-01, 3UM1HG009408-02S1, 1R01GM113657-01 and 5R01AR076241-02. We thank the Skin Biology and Disease Resource-based Center (SBDRC, NIH P30-AR069589) at the University of Pennsylvania for funding and providing human primary melanocytes. The sequencing of MPRA was carried out by the DNA Technologies and Expression Analysis Core at the University of California Davis Genome Center, supported by the NIH Shared Instrumentation Grant 1S10OD010786-01. We thank E. Burton for assistance with part of the plasmid cloning. We thank Z. (J.) Zhou from the Department of Genetics at the University of Pennsylvania for sharing their tissue culture room. We thank J. Phillips-Cremins from the Department of Genetics at the University of Pennsylvania for constructive suggestions on Hi-C. We thank H. Wong and H. Wu at the University of Pennsylvania for sharing their experimental equipment. We thank the African participants for their contributions to this study.

Author information

Contributions

Y.F. and S.A.T. designed the study and wrote the original draft. Y.F. performed the Hi-C, H3K27ac HiChIP, CRISPR, RNA-seq, ATAC-seq and CUT&RUN experiments and related data analysis. Y.F. and F.I. conducted the MPRA under the supervision of N.A. Y.F. and C.Z. analyzed the MPRA data. S.F. and M.E.B.H. played a role in quality control and analysis of WGS and SNP array data. Y.F. and S.F. conducted the GWAS and Di analysis. Y.F. and N.X. performed CRISPR editing and related assays in MNT-1. S.A.T., T.N., S.W.M., G.G.M., A.K.N., C.F. and G.B. played a role in collecting data from Africa. J.S. and E.O. performed the CYB561A3 immunofluorescence imaging and analyses. E.O. and M.S.M. provided resources and additional experimental insights. All authors assisted with manuscript review and editing. S.A.T. supervised the project.

Corresponding author

Correspondence to Sarah A. Tishkoff.

Ethics declarations

Competing interests

N.A. is an equity holder of Encoded Therapeutics, a gene regulation therapeutics company, and is a cofounder and scientific advisor of Regel Therapeutics and Neomer Diagnostics. The remaining authors declare no competing financial interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Quality statistics of the MPRA experiments.

(a) Statistics for FLASH-merged reads in the association library. The plot shows that 46.1% are 200 bp fragments as designed. (b) Statistics of BWA-mapped reads in the association library. The plot shows that 44.1% are 200 bp fragments as designed. (c) Statistics of barcode types per oligo in the association library. On average, each oligo is linked with 126 different barcodes. (d) Statistics of barcode types per oligo in reference (n = 1102), alternative (n = 1103), negative control (n = 153), and positive control (n = 30) oligos. Data are from the association library. (e) Statistics of barcode counts per oligo in reference (n = 1102), alternative (n = 1103), negative control (n = 153), and positive control (n = 30) oligos. Data are from the association library. (f) Barcode types for reference and alternative alleles are comparable. Pearson’s r = 0.91, p < 2 × 10^−16. (g) Principal component analysis of DNA and RNA libraries from MNT-1 and WM88 cells. Three replicates. (h) Summary of enhancer activities estimated by MPRA. Enhancer activities were defined as the barcode counts per million in the RNA library divided by the barcode counts per million in the DNA library. Alt, oligos containing alternative alleles (n = 1103). Ref, oligos containing reference alleles (n = 1102). Negative, negative control oligos (n = 148). Positive, positive control oligos (n = 30). For boxplots, central lines are medians, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box.
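
The enhancer activity definition in panel h can be illustrated with a minimal Python sketch. It assumes that barcode counts are first summed per oligo and then scaled to counts per million within each library; the caption does not state the aggregation order, and the function names, barcodes and counts below are hypothetical toy values, not the authors' pipeline or data.

```python
from collections import defaultdict


def counts_per_million(barcode_counts, barcode_to_oligo):
    """Sum barcode counts per oligo, then scale by library size (counts per million)."""
    library_size = sum(barcode_counts.values())
    per_oligo = defaultdict(float)
    for barcode, count in barcode_counts.items():
        per_oligo[barcode_to_oligo[barcode]] += count
    return {oligo: total * 1e6 / library_size for oligo, total in per_oligo.items()}


def enhancer_activity(rna_counts, dna_counts, barcode_to_oligo):
    """Activity per oligo = RNA counts per million / DNA counts per million."""
    rna_cpm = counts_per_million(rna_counts, barcode_to_oligo)
    dna_cpm = counts_per_million(dna_counts, barcode_to_oligo)
    # Oligos without DNA counts are skipped to avoid division by zero.
    return {
        oligo: rna_cpm.get(oligo, 0.0) / cpm
        for oligo, cpm in dna_cpm.items()
        if cpm > 0
    }


# Hypothetical toy data: two barcodes per oligo, one oligo per allele.
barcode_to_oligo = {"AAAC": "rs_ref", "AAAG": "rs_ref", "CCCA": "rs_alt", "CCCT": "rs_alt"}
dna = {"AAAC": 100, "AAAG": 100, "CCCA": 100, "CCCT": 100}
rna = {"AAAC": 300, "AAAG": 300, "CCCA": 100, "CCCT": 100}

activity = enhancer_activity(rna, dna, barcode_to_oligo)
print(activity)  # {'rs_ref': 1.5, 'rs_alt': 0.5}
```

In this toy case the reference oligo is three times as active as the alternative oligo, which is the kind of allelic difference the screen is designed to detect.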

Extended Data Fig. 2 MPRA identifies six allelic skewed variants near MFSD12.

(a) Plot showing allelic skewed variants in regulatory regions near MFSD12. Blue tracks indicate DNase-Seq, ATAC-Seq, and ChIP-Seq from melanocytes; orange tracks indicate ChIP-Seq from melanoma (501-mel) cells; green tracks indicate DNase-Seq from ENCODE cell lines. E1-E4, enhancers. P, promoter. (b-g) Relative enhancer activities of the two alleles at rs142317543, rs6510759, rs734454, rs10416746, rs6510760 and rs7246261 estimated by MPRA (n = 3). For b, c, d, f and g, p-values were estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments; e was without multiple testing adjustments. (h-k) Relative enhancer activities estimated by LRA. Two-tailed paired t-tests (For LRA in MNT-1, n = 6. For LRA in WM88, 2 h n = 8; others n = 9). Data are presented as mean ± SEM. ns, p > 0.05. (l) rs6510760 and rs7246261 disrupt the binding motifs of AHR and TFAP2, respectively, as predicted by motifbreakR (ref. 98). (m) The LD pattern of candidate functional variants near MFSD12. LD was calculated from the 180G data (ref. 18) with the LDheatmap package (ref. 102). For boxplots, central lines are medians, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box.
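
As an illustration of the summary statistics named in this legend, the sketch below computes mean ± SEM and the paired t statistic using only the Python standard library. The replicate values are hypothetical, and the two-tailed p-value is left out because it needs Student's t distribution (available, for example, through `scipy.stats.ttest_rel`).

```python
import math
import statistics


def mean_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def paired_t_statistic(a, b):
    """Paired t statistic: mean of the pairwise differences divided by their SEM."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d, sem_d = mean_sem(diffs)
    return mean_d / sem_d, len(diffs) - 1  # statistic, degrees of freedom


# Hypothetical relative luciferase activities for six paired replicates.
ref = [1.00, 1.10, 0.90, 1.05, 0.95, 1.00]
alt = [1.30, 1.45, 1.15, 1.40, 1.20, 1.30]

t_stat, dof = paired_t_statistic(alt, ref)
alt_mean, alt_sem = mean_sem(alt)
print(f"alt = {alt_mean:.2f} ± {alt_sem:.2f}, t = {t_stat:.2f}, df = {dof}")
```

Pairing matters here because the two alleles are measured within the same replicate, so the test is run on the within-pair differences rather than on the two groups independently.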

Extended Data Fig. 3 The enhancer E4 interacts with the promoter of MFSD12 and affects the expression of MFSD12.

(a, b) Chromatin interactions near MFSD12 identified by Hi-C and H3K27ac HiChIP with HaeIII digestion. The upper matrix is from MNT-1 Hi-C data, and the lower matrix is from MNT-1 H3K27ac HiChIP data. TADs were called by OnTAD (ref. 100) and colored by nested TAD levels. The solid arc is a loop defined using FitHiChIP (ref. 29); the dashed arc is a potential loop based on the observed interaction matrix. The interaction matrix between MFSD12 and HMG20B is highlighted with orange angles. The DNase track of melanocytes was downloaded from ENCODE (ref. 68). rs6510760 and rs7246261 in E4 are colored in red. The plotted region is chr19:3519998-3589998 (hg19). (c) Schematic showing the location of the two sgRNAs targeting the enhancer E4 of MFSD12. (d) PCR results showing efficient knockout of the enhancer by the two sgRNAs. Three independent experiments. (e) qPCR shows that CRISPRi of E4 reduces the expression of MFSD12 and HMG20B in MNT-1 cells. Two-sided Dunnett’s test with adjustments for multiple comparisons (n = 3). (f) CRISPRi of E4 slightly increases melanin levels in MNT-1 cells. Two-tailed unpaired t-tests (n = 19). (g) qPCR shows that CRISPR knockout of E4 decreases the expression of MFSD12 and HMG20B in MNT-1 cells. Two-tailed unpaired t-tests without multiple testing adjustments (n = 6). Data are presented as mean ± SEM.

Extended Data Fig. 4 Identification of functional variants associated with skin pigmentation near OCA2.

(a) SNP rs6497271 is in a melanocyte-specific enhancer. Blue tracks indicate DNase-Seq, ATAC-Seq, and ChIP-Seq data from melanocytes; orange tracks indicate ChIP-Seq data from melanoma (501-mel) cells; green tracks indicate DNase-Seq data from ENCODE cell lines. E1-E4, enhancers. The plotted region is chr15:28,335,146-28,385,146 (hg19). (b) MPRA and LRA reveal that rs4778242 significantly affects the enhancer activity of E1 in MNT-1 and WM88 cells. MPRA (n = 3), LRA (n = 9). (c) MPRA shows that rs6497271 affects the enhancer activity of E2 in MNT-1 and WM88 cells (n = 3). (d) MPRA shows that rs7495989 affects the enhancer activity of E3 in MNT-1 and WM88 cells (n = 3). (e) MPRA and LRA reveal that rs4778141 affects the enhancer activity of E4 in MNT-1 and WM88 cells. MPRA (n = 3), LRA (MNT-1, n = 9; WM88, n = 6). (f) rs6497271 overlaps transcription factor binding sites. Left panel shows that rs6497271 disrupts the binding motifs of LEF1 and SOX10. Right panel shows that rs6497271 overlaps ChIP-seq peaks from the Cistrome database (ref. 103). LRA data are presented as mean ± SEM, tested with two-tailed paired t-tests. MPRA p-values are estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments. For MPRA boxplots, central lines are medians, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box.
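
The boxplot convention used in these legends (median line, box from the 25th to the 75th percentile, whiskers reaching 1.5 times the interquartile range beyond the box) can be sketched as follows. This is an illustration with hypothetical values, assuming the common Tukey convention in which whiskers stop at the most extreme data point inside the fences. The quartile interpolation method is also an assumption and may differ from the plotting software the authors used.

```python
import statistics


def boxplot_summary(values):
    """Quartiles, whisker ends and outliers under the 1.5 x IQR convention."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie within the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }


# Hypothetical activity values with one extreme observation.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

With these toy values the box spans 3.25 to 7.75, the whiskers end at 1 and 9, and the value 30 is drawn as an outlier.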

Extended Data Fig. 5 Identification of functional variants near MITF related to skin pigmentation in the San.

(a) Plot showing that the functional Di-SNP rs111969762 is in a melanocyte-specific regulatory region. Blue tracks indicate DNase-Seq, ATAC-Seq, and ChIP-Seq from melanocytes; orange tracks indicate ChIP-Seq from melanoma (501-mel) cells; green tracks indicate DNase-Seq from ENCODE cell lines. E1-E2, enhancers. (b) MPRA shows that rs111969762 affects enhancer activity in WM88 cells (n = 3). (c) MPRA shows that rs7430957 impacts enhancer activity in WM88 cells (n = 3). (d) LRA shows that rs7430957 does not significantly alter the activity of the E2 enhancer near MITF. P values were estimated by two-tailed paired t-tests, MNT-1 (n = 6), WM88 (n = 11). Data are presented as mean ± SEM. ns p > 0.05. (e) rs111969762 overlaps transcription factor binding sites. Left panel shows that rs6497271 disrupts the binding motif of FOXP3. Right panel shows that rs111969762 overlaps ChIP-seq peaks from the Cistrome database (ref. 103). MPRA p-values were estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments. For MPRA boxplots, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box.

Extended Data Fig. 6 Functional testing of Di-SNPs near LEF1.

(a) MFVs and regulatory elements near LEF1. rs17038630 and rs11939273 are Di-SNPs from the San population. (b) Plot showing allelic skews at rs17038630 in MNT-1 and WM88 cells estimated by MPRA (n = 3). (c) rs17038630 overlaps SOX10 and LEF1 binding sites. Left panel shows that rs17038630 disrupts the binding motif of SOX10 and LEF1. Right panel shows that rs11939273 overlaps ChIP-seq peaks from the Cistrome database (ref. 103). (d) MPRA and LRA results showing allelic skews at Di-SNP rs11939273 in MNT-1 and WM88 cells; the allele frequency data are from the 180G (ref. 18) and 1000G (ref. 31) datasets. MPRA (n = 3), LRA (MNT-1, n = 6; WM88, n = 9). LRA data are presented as mean ± SEM, tested with two-tailed paired t-tests. (e) CRISPR-KO of the enhancer E1 of LEF1 does not affect LEF1 expression or melanin levels in MNT-1 cells. Left panel shows genotyping results of CRISPR-KO of the enhancer E1 of LEF1, three independent experiments. Middle panel shows the RT-qPCR results of CRISPR-KO of the enhancer E1 of LEF1 (n = 9). Right panel shows the melanin levels of CRISPR-KO of the enhancer E1 of LEF1 (n = 9). Two-tailed unpaired t-tests. For MPRA boxplots in b and d, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box. MPRA p-values were estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments.

Extended Data Fig. 7 MPRA and LRA identified three functional Di-SNPs near NLK.

(a) MFVs and regulatory elements near NLK. rs75827647, rs10468581 and rs113940275 are Di-SNPs from the San population. (b) LRA and MPRA results showing allelic skews at rs75827647 in MNT-1 and WM88 cells. MPRA (n = 3), LRA (MNT-1, n = 6; WM88, n = 9). (c) LRA and MPRA results showing allelic skews at rs10468581 in MNT-1 and WM88 cells. MPRA (n = 3), LRA (MNT-1, n = 6; WM88, n = 9). (d) LRA and MPRA results showing allelic skews at rs113940275 in MNT-1 and WM88 cells. MPRA (n = 3), LRA (MNT-1, n = 6; WM88, n = 11). From b to d, the barplots are results of LRA, tested with two-tailed paired t-tests without adjustments for multiple comparisons; data are presented as mean ± SEM. ns p > 0.05. The boxplots are results from MPRA; p-values were estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments. The right panels are allele frequency maps constructed using the 180G (ref. 18) and 1000G (ref. 31) datasets. For boxplots, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box.
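
The paired t-tests applied to the LRA barplots compare the two alleles measured in the same replicate. A minimal sketch of the test statistic follows, under stated assumptions: `paired_t_statistic` is a hypothetical helper, the inputs are assumed to be matched per-replicate reporter activities for the alternative and reference alleles, and the two-tailed p-value would come from a t distribution with n − 1 degrees of freedom, which this standard-library sketch does not evaluate.

```python
import math
import statistics


def paired_t_statistic(alt, ref):
    """Return (t, degrees of freedom) for matched measurements.

    alt and ref hold reporter activities from the same replicates,
    in the same order (an assumption about the data layout).
    """
    if len(alt) != len(ref):
        raise ValueError("paired samples must have equal length")
    if len(alt) < 2:
        raise ValueError("need at least two pairs")
    # Work on the per-replicate differences between alleles.
    diffs = [a - r for a, r in zip(alt, ref)]
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean difference (raises ZeroDivisionError
    # below if all differences are identical).
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / se_diff, len(diffs) - 1
```

Pairing removes replicate-to-replicate variation in transfection efficiency from the comparison, which is why it suits allele-versus-allele reporter assays run side by side.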

Extended Data Fig. 8 Functional testing of Di-SNPs near TRPS1.

(a) SNP rs11985280 overlaps a regulatory element of TRPS1. Blue tracks show ATAC-Seq and ChIP-Seq data from melanocytes; orange tracks indicate ChIP-Seq data from melanoma (501-mel) cells; green tracks indicate ATAC-Seq and DNase-Seq data from ENCODE cell lines. (b) MPRA results showing allelic skews at rs11985280 in MNT-1 and WM88 cells (n = 3). p-values were estimated with a random effects model for mpralm and paired t-tests with multiple testing adjustments. For boxplots, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box. (c) rs11985280 disrupts the binding motif of CEBPA and CEBPB. Right panel shows that rs11985280 overlaps ChIP-seq peaks from the Cistrome database (ref. 103). (d) CRISPR-KO of the enhancer E1 of TRPS1 affects TRPS1 expression but not melanin levels in MNT-1 cells. Left panel shows genotyping results of CRISPR-KO of the enhancer E1 of TRPS1, three independent experiments. Middle panel shows the RT-qPCR results of CRISPR-KO of the enhancer E1 of TRPS1 (n = 9). Right panel shows the melanin levels of CRISPR-KO of the enhancer E1 of TRPS1 in MNT-1 cells (n = 9). Two-tailed unpaired t-tests. Data are presented as mean ± SEM and p values are listed above the bars.

Extended Data Fig. 9 Identification of functional regulatory variants near the BLOC1S6 locus.

(a) rs72713175 overlaps a regulatory region in melanocytes. Green tracks indicate ATAC-seq for MNT-1 and WM88 cells; blue tracks indicate ATAC-seq and ChIP-Seq from NHM; orange tracks indicate CUT&RUN from MNT-1 cells. (b) MPRA results showing allelic skews at rs11985280 in WM88 cells but not in MNT-1 cells (n = 3). P values were estimated with a random effects model for mpralm and paired t-tests without multiple testing adjustments. For boxplots, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box. (c) Allele frequencies at rs72713175 in global populations; data are from the 180G (ref. 18) and 1000G (ref. 31) datasets. (d) LRA results showing that rs72713175 did not affect enhancer activity in WM88 and MNT-1 cells. Two-tailed paired t-tests (n = 6). (e) CRISPRi of the enhancer containing rs72713175 significantly reduced the expression of BLOC1S6 (control, n = 8; others, n = 6; two-sided Dunnett’s test with adjustments for multiple comparisons). (f) CRISPRi of the enhancer containing rs72713175 significantly reduced melanin levels in MNT-1 cells (control, n = 18; others, n = 9; two-sided Dunnett’s test with adjustments for multiple comparisons). Data are presented as mean ± SEM and p values are listed above the bars.

Extended Data Fig. 10 Identification of functional regulatory variants near the DDB1 locus.

(a) Plots showing allelic skewed variants in regulatory elements near the DDB1 locus. rs7948623 overlaps an open chromatin region in melanocytes and many other cell types. rs2277285 and rs2943806 are located within CTCF binding sites and TAD boundaries. Blue tracks indicate DNase-seq, ATAC-seq, and ChIP-Seq data from melanocytes; orange tracks indicate ChIP-Seq data from melanoma (501-mel) cells; gray tracks indicate CTCF ChIP-Seq data from three cell lines; and green tracks indicate DNase-Seq data from ENCODE (ref. 68). (b-d) Allelic skews at rs7948623, rs2277285 and rs2943806 as estimated by MPRA (n = 3). P values were estimated with a random effects model for mpralm and paired t-tests without multiple testing adjustments. For boxplots, central lines are median, with boxes extending from the 25th to the 75th percentiles. Whiskers further extend by ±1.5 times the interquartile range from the limits of each box. (e) rs7948623 disrupts a MITF binding motif and overlaps ChIP-seq peaks from the Cistrome database (ref. 103). (f, g) LD pattern between the MFVs at the DDB1 locus. LD was calculated using the 180G (ref. 18) dataset.
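
The legend does not state which LD measure is plotted in panels f and g. As an illustration only, the sketch below computes the widely used r² statistic between two biallelic SNPs from phased haplotypes. The function name `ld_r2` and the 0/1 allele coding are assumptions, and real analyses typically rely on dedicated population-genetics software rather than hand-written code.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (a, b) pairs, where a and b are the alleles
    (coded 0 or 1) carried at SNP A and SNP B on the same chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    # Frequencies of the allele coded 1 at each SNP.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D: departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom
```

An r² of 1 means the two SNPs always occur on the same haplotypes in the sample, and 0 means their alleles are statistically independent.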

Supplementary information

Supplementary Information

Supplementary Notes 1–5 and Figs. 1–30.

Reporting Summary

Supplementary Tables

Supplementary Table 1. GWAS statistics of significant GWAS-All SNPs. Supplementary Table 2. GWAS statistics of significant GWAS-Bots SNPs. Supplementary Table 3. MPRA oligo sequences. Supplementary Table 4. MPRA results. Supplementary Table 5. LRA results. Supplementary Table 6. SNP–gene pairs determined by Hi-C and H3K27ac HiChIP. Supplementary Table 7. Allele frequencies of MPRA significant SNPs in global populations. Supplementary Table 8. Primer sequences.

Source data

Source Data Fig. 1

Statistical source data.

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 6

Statistical source data.

Source Data Fig. 7

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

Source Data Extended Data Fig. 9

Statistical source data.

Source Data Extended Data Fig. 10

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Feng, Y., Xie, N., Inoue, F. et al. Integrative functional genomic analyses identify genetic variants influencing skin pigmentation in Africans. Nat Genet 56, 258–272 (2024). https://doi.org/10.1038/s41588-023-01626-1

  • DOI: https://doi.org/10.1038/s41588-023-01626-1
