Abstract
Facial morphology—a conspicuous feature of human appearance—is highly heritable. Previous studies on the genetic basis of facial morphology were performed mainly in European-ancestry cohorts (EUR). Applying a data-driven phenotyping and multivariate genome-wide scanning protocol to a large collection of three-dimensional facial images of individuals with East Asian ancestry (EAS), we identified 244 variants in 166 loci (62 new) associated with typical-range facial variation. A newly proposed polygenic shape analysis indicates that the effects of the variants on facial shape in EAS can be generalized to EUR. Based on this, we further identified 13 variants related to differences between facial shape in EUR and EAS populations. Evolutionary analyses suggest that the difference in nose shape between EUR and EAS populations is caused by a directional selection, due mainly to a local adaptation in Europeans. Our results illustrate the underlying genetic basis for facial differences across populations.
Data availability
The meta-analysis GWAS summary statistics are available on the National Omics Data Encyclopedia (NODE) under accession OEP002283. The data from participants in the NSPT, NHC and TZL datasets were not collected with broad data-sharing consent. Given the highly identifiable nature of both facial and genomic information and unresolved issues regarding risk to participants, we opted for a more conservative approach to participant recruitment. Broad data sharing of the raw data from these collections would thus be in legal and ethical violation of the informed consent obtained from the participants. This restriction is not because of any personal or commercial interests. Additional details can be requested from L.J. for the NSPT dataset, and S. Wang for the NHC and TZL datasets. Data usage shall be in full compliance with the Regulations on Management of Human Genetic Resources in China. Publicly available data used were: the 1000GP Phase 3 data (https://www.internationalgenome.org/category/phase-3/; ref. 23), The Roadmap Epigenomics Project (http://www.roadmapepigenomics.org; ref. 26), NCBI dbSNP database (http://www.ncbi.nlm.nih.gov/SNP; ref. 62), UCSC genome browser (http://genome.ucsc.edu; ref. 63), HaploReg v.4.1 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php; ref. 64), Ensembl genome browser (http://asia.ensembl.org/Homo_sapiens/Info/Index; ref. 65), GTEx v.8 (https://gtexportal.org/home/; refs. 66,67,68), Human genome dating (https://human.genome.dating/; ref. 49) and the transcriptome resource from separated ectoderm and mesenchyme of the developing mouse face (GSE62214).
Code availability
The statistical analyses in this work were based on functions of the statistical toolbox in MeshMonk (https://github.com/TheWebMonks/meshmonk, v.0.0.6; ref. 56), MATLAB 2018a, R (v.3.6.1), ggplot2 (v.3.1.0), Python (v.3.5.0), PLINK v.1.9, SHAPEIT2 (v.2.17), IMPUTE2 (v.2.3.2), SNPLIB (https://github.com/jiarui-li/SNPLIB), GCTA-GREML, FUMA (v.1.3.6), GREAT (v.4.0.4), GREGOR (v.1.4.0), Metascape (https://metascape.org), LocusZoom (https://genome.sph.umich.edu/wiki/LocusZoom) and REHH2 (v.3.2.0) as mentioned throughout the Methods.
References
Liu, F. et al. A genome-wide association study identifies five loci influencing facial morphology in Europeans. PLoS Genet. 8, e1002932 (2012).
Paternoster, L. et al. Genome-wide association study of three-dimensional facial morphology identifies a variant in PAX3 associated with nasion position. Am. J. Hum. Genet. 90, 478–485 (2012).
Adhikari, K. et al. A genome-wide association study identifies multiple loci for variation in human ear morphology. Nat. Commun. 6, 7500 (2015).
Adhikari, K. et al. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat. Commun. 7, 11616 (2016).
Cole, J. B. et al. Genomewide association study of African children identifies association of SCHIP1 and PDE8A with facial size and shape. PLoS Genet. 12, e1006174 (2016).
Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
Shaffer, J. R. et al. Genome-wide association study reveals multiple loci influencing normal human facial morphology. PLoS Genet. 12, e1006149 (2016).
Lee, M. K. et al. Genome-wide association study of facial morphology reveals novel associations with FREM1 and PARK2. PLoS One 12, e0176566 (2017).
Cha, S. et al. Identification of five novel genetic loci related to facial morphology by genome-wide association studies. BMC Genomics 19, 481 (2018).
Claes, P. et al. Genome-wide mapping of global-to-local genetic effects on human facial shape. Nat. Genet. 50, 414–423 (2018).
Crouch, D. J. M. et al. Genetics of the human face: identification of large-effect single gene variants. Proc. Natl Acad. Sci. USA 115, E676–E685 (2018).
Qiao, L. et al. Genome-wide variants of Eurasian facial shape differentiation and a prospective model of DNA based face prediction. J. Genet. Genomics 45, 419–432 (2018).
Li, Y. et al. EDAR, LYPLAL1, PRDM16, PAX3, DKK1, TNFSF12, CACNA2D3, and SUPT3H gene variants influence facial morphology in a Eurasian population. Hum. Genet. 138, 681–689 (2019).
Wu, W. et al. Whole-exome sequencing identified four loci influencing craniofacial morphology in northern Han Chinese. Hum. Genet. 138, 601–611 (2019).
Xiong, Z. et al. Novel genetic loci affecting facial shape variation in humans. eLife 8, e49898 (2019).
Huang, Y. et al. A genome-wide association study of facial morphology identifies novel genetic loci in Han Chinese. J. Genet. Genomics 48, 198–207 (2021).
Bonfante, B. et al. A GWAS in Latin Americans identifies novel face shape loci, implicating VPS13B and a Denisovan introgressed region in facial variation. Sci. Adv. 7, eabc6160 (2021).
White, J. D. et al. Insights into the genetic architecture of the human face. Nat. Genet. 53, 45–53 (2021).
Guo, J. et al. Variation and signatures of selection on the human face. J. Hum. Evol. 75, 143–152 (2014).
Li, J. & Ji, L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity (Edinb.) 95, 221–227 (2005).
Kanai, M., Tanaka, T. & Okada, Y. Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set. J. Hum. Genet. 61, 861–866 (2016).
Stouffer, S. A., Suchman, E. A., De Vinney, L. C., Star, S. A. & Williams, R. M. Jr. The American Soldier: Adjustment During Army Life (Wiley, 1965).
The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Bernstein, B. E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Schmidt, E. M. et al. GREGOR: evaluating global enrichment of trait-associated variants in epigenomic features using a systematic, data-driven approach. Bioinformatics 31, 2601–2606 (2015).
Wilderman, A., VanOudenhove, J., Kron, J., Noonan, J. P. & Cotney, J. High-resolution epigenomic atlas of human embryonic craniofacial development. Cell Rep. 23, 1581–1597 (2018).
Som, P. M. & Naidich, T. P. Illustrated review of the embryology and development of the facial region, part 2: late development of the fetal face and changes in the face from the newborn to adulthood. AJNR Am. J. Neuroradiol. 35, 10–18 (2014).
Staley, J. R. et al. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32, 3207–3209 (2016).
Zhou, Y. et al. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 10, 1523 (2019).
Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).
Robinson, M. R. et al. Population genetic differentiation of height and body mass index across Europe. Nat. Genet. 47, 1357–1362 (2015).
Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
Gautier, M., Klassmann, A. & Vitalis, R. rehh 2.0: a reimplementation of the R package rehh to detect positive selection from haplotype structure. Mol. Ecol. Resour. 17, 78–90 (2017).
Choi, S. W., Mak, T. S. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010).
Singh, M. K. et al. The T-box transcription factor Tbx15 is required for skeletal development. Mech. Dev. 122, 131–144 (2005).
Lausch, E. et al. TBX15 mutations cause craniofacial dysmorphism, hypoplasia of scapula and pelvis, and short stature in Cousin syndrome. Am. J. Hum. Genet. 83, 649–655 (2008).
Ding, H. L., Clouthier, D. E. & Artinger, K. B. Redundant roles of PRDM family members in zebrafish craniofacial development. Dev. Dyn. 242, 67–79 (2013).
Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013).
Tan, J. et al. The adaptive variant EDARV370A is associated with straight hair in East Asians. Hum. Genet. 132, 1187–1191 (2013).
Shaffer, J. R. et al. Multiethnic GWAS reveals polygenic architecture of earlobe attachment. Am. J. Hum. Genet. 101, 913–924 (2017).
Gaudet, P., Livstone, M. S., Lewis, S. E. & Thomas, P. D. Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium. Brief. Bioinform. 12, 449–462 (2011).
Zhao, Y. et al. Isolated cleft palate in mice with a targeted mutation of the LIM homeobox gene lhx8. Proc. Natl Acad. Sci. USA 96, 15002–15006 (1999).
Haenig, C. et al. Interactome mapping provides a network of neurodegenerative disease proteins and uncovers widespread protein aggregation in affected brains. Cell Rep. 32, 108050 (2020).
Astuti, D. et al. Germline mutations in DIS3L2 cause the Perlman syndrome of overgrowth and Wilms tumor susceptibility. Nat. Genet. 44, 277–284 (2012).
Albers, P. K. & McVean, G. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol. 18, e3000586 (2020).
Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
Guo, J. et al. Global genetic differentiation of complex traits shaped by natural selection in humans. Nat. Commun. 9, 1865 (2018).
He, Y. et al. A probabilistic method for testing and estimating selection differences between populations. Genome Res. 25, 1903–1909 (2015).
Zaidi, A. A. et al. Investigating the case of human nose shape and climate adaptation. PLoS Genet. 13, e1006616 (2017).
Delaneau, O., Marchini, J. & Zagury, J. F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2011).
Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
White, J. D. et al. MeshMonk: Open-source large-scale intensive 3D phenotyping. Sci. Rep. 9, 6085 (2019).
Li, J. et al. Robust genome-wide ancestry inference for heterogeneous datasets: illustrated using the 1,000 genome project with 3D facial images. Sci. Rep. 10, 11850 (2020).
Rencher, A. C. & Christensen, W. F. Methods of Multivariate Analysis. (John Wiley & Sons, Inc., 2012). https://doi.org/10.1002/9781118391686
Olson, C. L. On choosing a test statistic in multivariate analysis of variance. Psychol. Bull. 83, 579–586 (1976).
Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
Navarro Gonzalez, J. et al. The UCSC Genome Browser database: 2021 update. Nucleic Acids Res. 49, D1046–D1057 (2021).
Ward, L. D. & Kellis, M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 40, D930–D934 (2012).
Aken, B. L. The Ensembl gene annotation system. Database (Oxford) 2016, baw093 (2016).
Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Carithers, L. J. & Moore, H. M. The Genotype-Tissue Expression (GTEx) Project. Biopreserv. Biobank. 13, 307–308 (2015).
Keen, J. C. & Moore, H. M. The Genotype-Tissue Expression (GTEx) project: linking clinical data with molecular analysis to advance personalized medicine. J. Pers. Med. 5, 22–29 (2015).
Acknowledgements
We thank the participants of the NSPT, NHC and TZL cohorts who consented to participate in research, and the related teams, including interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. This project was funded by the following grants and contracts: Strategic Priority Research Program of the Chinese Academy of Sciences (XDB38020400 to S. Wang); Shanghai Municipal Science and Technology Major Project (2017SHZDZX01 to L.J. and S. Wang); National Key Research and Development Project (2018YFC0910403 to S. Wang); CAS Interdisciplinary Innovation Team Project (to S. Wang); Max Planck-CAS Paul Gerson Unna Independent Research Group Leadership Award (to S. Wang); National Natural Science Foundation of China (31521003 to L.J., 31900408 to M.Z.); National Science and Technology Basic Research Project (2015FY111700 to L.J.); CAMS Innovation Fund for Medical Sciences (2019-I2M-5-066 to L.J.); The 111 Project (B13016 to L.J.); China Postdoctoral Science Foundation (2019M651352 to M.Z., 2020M670984 to W.Q.). We are grateful for all suggestions collected during the poster exhibition of ASHG 2019 Annual Meeting and thank all the participants in these studies.
Author information
Contributions
S. Wang, L.J., J.L. and M.Z. conceptualized the study (ideas; formulation or evolution of overarching research goals and aims). M.Z., S. Wu, S.D., W.Q. and J.L. carried out data curation (management activities to annotate, scrub data and maintain research data for initial use and later re-use). M.Z., S. Wu, S.D., W.Q., J.L. and J.C. carried out the formal analysis (application of statistical, mathematical, computational or other formal techniques to analyze or synthesize study data). M.Z., S. Wu, S.D., W.Q. and J.L. did the visualization (preparation, creation and/or presentation of the published work, specifically visualization/data presentation). K.T., L.Q., Y.Y. and J.T. collected the 3D facial scans of the TZL cohort. J.W., Z.Y., J.T., K.T. and L.Q. collected the 3D facial scans of the NSPT cohort. S. Wu, Y.L. and Q.P. contributed to generating the SNP array data. S. Wu, S.D. and J.L. registered the 3D facial scans of the Northern Han Chinese cohort and conducted the PCA of the discovery cohort. N.N. and A.R.-L. performed the analysis of MPRS22/Mprs22 and human/mouse craniofacial shape. M.Z., S. Wu, S.D., W.Q. and J.L. wrote the original draft. S. Wang, L.J., P.C., J.L., M.Z., S. Wu, S.D. and W.Q. reviewed and edited the final manuscript. All authors participated in preparing the manuscript by reading and commenting on drafts before submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Hongtu Zhu and Fan Liu for their contribution to the peer review of this work. Peer reviewer reports are available with this article.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Study design.
We first applied a face segmentation procedure to obtain 63 face segments, from which we defined 10 anatomical face regions. Then, using a CCA-based GWAS, we identified 244 variants with a P value lower than 5 × 10⁻⁸, of which 151 also had a P value lower than 9.8 × 10⁻¹⁰. To investigate what affects the similarity of an EAS face, we used polygenic population shape (PPS) analyses to fit EUR and EAS faces and identified 13 variants mainly contributing to EUR-EAS facial differences. To investigate selection on facial variation, we used FST and XP-EHH to find which parts of the face are under selection. We further compared these results with random drift and random PPS to determine which of the two populations, EUR or EAS, experienced selection.
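The two significance cut-offs quoted in this caption can be applied to a list of P values as in the minimal sketch below. The constant names, the function name and the example P values are illustrative assumptions and are not taken from the study.

```python
# Thresholds as quoted in the caption; the names are hypothetical.
FIRST_THRESHOLD = 5e-8       # 5 x 10^-8
STRICTER_THRESHOLD = 9.8e-10  # 9.8 x 10^-10


def count_significant(p_values):
    """Count P values below each of the two thresholds."""
    n_first = sum(p < FIRST_THRESHOLD for p in p_values)
    n_stricter = sum(p < STRICTER_THRESHOLD for p in p_values)
    return n_first, n_stricter


# Made-up P values for illustration only.
example_p = [3e-8, 1e-9, 5e-10, 2e-12, 0.04]
counts = count_significant(example_p)
print(counts)
```

Every variant passing the stricter cut-off also passes the first one, which is why the caption reports the 151 variants as a subset of the 244.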
Extended Data Fig. 2 Enrichment analysis of leading variants.
(a) Gene Ontology enrichment for genes annotated from leading variants by GREAT (ref. 24). (b) Heatmap indicating the global enrichment of trait-associated variants in different chromatin states (y axis) and in different tissues (x axis). The fold change was calculated by GREGOR (ref. 28). The embryonic craniofacial tissue data were previously published in an epigenomic atlas, while the other tissue data were previously published by the Roadmap Epigenomics Project (ref. 27). The description of the 25-state chromatin model can be found at: https://egg2.wustl.edu/roadmap/web_portal/imputed.html#chr_imp. (c) Expression levels of the candidate genes in craniofacial tissues. Each point (n=3 biologically independent replicates for each condition) represents an estimated fold change compared to control genes at different times (E10.5, E11.5, E12.5), in different prominences (Frontonasal, FNP: circle; Maxillary, MxP: square; Mandibular, MnP: triangle), and tissue layer (Ectoderm, Ect: red; Mesenchyme, Mes: blue). Data are presented as mean values ± 95% confidence intervals (1.96×SEM).
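The interval described for panel (c), the mean plus or minus 1.96 times the standard error of the mean, can be computed as in the sketch below. The function name and the replicate values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the caption does not state which estimator was used.

```python
import math
import statistics


def mean_ci95(values):
    """Return (mean, half_width), where half_width = 1.96 * SEM."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate the SEM")
    mean = statistics.mean(values)
    # SEM = sample standard deviation / sqrt(n); assumed estimator.
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, 1.96 * sem


# Hypothetical fold-change values for three replicates (not from the study).
replicates = [1.8, 2.1, 2.4]
mean, half_width = mean_ci95(replicates)
print(f"{mean:.2f} +/- {half_width:.2f}")
```

With only three replicates per condition, the 1.96 multiplier is a normal approximation; a t-based multiplier would give a wider interval.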
Extended Data Fig. 3 XP-EHH and FST enrichment analysis for shared and differentiated variants.
XP-EHH and FST enrichment analysis for (a, d) EUR differentiated variants, (b, e) EAS differentiated variants, and (c, f) shared variants in the EAS study. The null distribution is shown in blue. The red line is the mean XP-EHH or FST score of the shared or differentiated variants. The black line is the 95% quantile of the null distribution.
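The comparison drawn in these panels, an observed mean score set against the 95% quantile of a null distribution, can be sketched as follows. How the null was constructed is not stated in this caption, so the sketch assumes it is built from the mean scores of random variant sets of matching size; the function name, the simulated background scores and the target scores are all hypothetical.

```python
import random
import statistics


def enrichment_test(target_scores, background_scores, n_sim=1000, seed=0):
    """Compare the mean of target_scores with means of random sets.

    Returns (observed_mean, null_95_quantile, empirical_p).
    """
    rng = random.Random(seed)
    k = len(target_scores)
    observed = statistics.mean(target_scores)
    # Null: mean score of k variants drawn at random, without replacement.
    null_means = sorted(
        statistics.mean(rng.sample(background_scores, k))
        for _ in range(n_sim)
    )
    # Simple order-statistic estimate of the 95% quantile.
    q95 = null_means[max(0, (95 * n_sim) // 100 - 1)]
    # One-sided empirical P value with the usual +1 correction.
    n_ge = sum(m >= observed for m in null_means)
    p_value = (1 + n_ge) / (n_sim + 1)
    return observed, q95, p_value


# Simulated scores for illustration only (not XP-EHH or FST values).
rng = random.Random(1)
background = [rng.gauss(0.0, 1.0) for _ in range(5000)]
targets = [1.0 + 0.01 * i for i in range(50)]
observed, q95, p_value = enrichment_test(targets, background)
print(observed > q95, p_value)
```

An observed mean to the right of the 95% quantile corresponds to the red line falling beyond the black line in the figure.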
Extended Data Fig. 4 Validation of PPS in 10 anatomical segments.
(a) The null distribution (blue) of Euclidean distance and cosine similarity with the EUR average face and the EAS average face, using 1,000 simulations derived from random variants on the 10 anatomical regions. The red line indicates the statistic of the leading variants associated with the corresponding region; the black line indicates the 95% quantile of the distribution from random variants for the corresponding region. (b) The genetic effects of rs12632544 and (c) rs12473319, weighted by the difference in effect allele number between EUR and EAS (visualized using the local surface normal displacement).
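The two statistics named in panel (a) can be computed for a pair of shape vectors as in the sketch below. It assumes each face (or face region) is represented as a flat list of coordinates of equal length; the function names and the toy vectors are hypothetical and do not reproduce the study's data or pipeline.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length coordinate vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Toy vectors for illustration only.
polygenic_shape = [1.0, 0.0, 0.0]
average_face = [1.0, 1.0, 0.0]
dist = euclidean_distance(polygenic_shape, average_face)
cos = cosine_similarity(polygenic_shape, average_face)
print(dist, cos)
```

The distance is sensitive to the magnitude of the shape difference, whereas the cosine similarity captures only its direction, which is presumably why both are reported.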
Extended Data Fig. 5 The EAS-FA of polygenic shapes in 10 anatomical regions for EAS and EUR individuals in 1000GP.
The EAS-FA of polygenic shapes in a) mandible, b) forehead, c) lower mouth, d) upper mouth, e) nose, f) maxillary, g) glabella, h) eye, i) tempora, and j) zygoma for EAS and EUR individuals in 1000GP. The squares represent the mean EAS-FA score in the 10 anatomical regions and the horizontal lines represent the 1st and 3rd quartiles.
Extended Data Fig. 6 EAS-FA of the 244 leading variants on the EUR-EAS difference.
The distributions (blue) of EAS-FA derived from the 244 leading variants associated with a) whole face and b) - k) 10 anatomical segments. The black dotted line is the EAS-FA threshold of each region (mean + 3×SD). The red arrow marks the variant exceeding the threshold.
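The thresholding rule stated in this caption, mean + 3×SD, can be sketched as below. The function name, the placeholder variant identifiers and the scores are hypothetical, and the use of the sample standard deviation is an assumption because the caption does not specify the estimator.

```python
import statistics


def variants_over_threshold(scores, n_sd=3.0):
    """Return (threshold, ids) for scores above mean + n_sd * SD.

    scores maps a variant identifier to its score.
    """
    values = list(scores.values())
    # Sample standard deviation (n - 1 denominator); assumed estimator.
    threshold = statistics.mean(values) + n_sd * statistics.stdev(values)
    flagged = [vid for vid, s in scores.items() if s > threshold]
    return threshold, flagged


# Made-up scores: 29 small values and one clear outlier.
example_scores = {f"variant_{i}": 0.01 * (i % 5) for i in range(29)}
example_scores["variant_29"] = 5.0
threshold, flagged = variants_over_threshold(example_scores)
print(round(threshold, 3), flagged)
```

Because the outlier itself inflates both the mean and the SD, this rule is conservative in small samples; with a few hundred variants per region, as here, the effect is modest.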
Extended Data Fig. 7 Multi peak in 17q24.3 region.
(a) Associated variants in the SOX9 locus and genomic environment surrounding SOX9 across a 2-Mb window. Four independent variants, represented by (1) rs34476511 (blue), (2) rs9900242 (green), (3) rs8068343 (red), and (4) rs2193052 (purple) are observed; (b) Allele frequency in AMR, SAS, AFR, EUR and EAS populations of the four variants from 1000GP; (c) The effects of the four variants in the nose region.
Supplementary information
Supplementary Information
Supplementary Notes 1 and 2, Methods, Figs. 1–6 and Tables 1–9.
Supplementary Tables
Supplementary Tables 1–9.
About this article
Cite this article
Zhang, M., Wu, S., Du, S. et al. Genetic variants underlying differences in facial morphology in East Asian and European populations. Nat Genet 54, 403–411 (2022). https://doi.org/10.1038/s41588-022-01038-7
DOI: https://doi.org/10.1038/s41588-022-01038-7