Genome-wide association scans of complex multipartite traits like the human face typically use preselected phenotypic measures. Here we report a data-driven approach to phenotyping facial shape at multiple levels of organization, allowing for an open-ended description of facial variation while preserving statistical power. In a sample of 2,329 persons of European ancestry, we identified 38 loci, 15 of which replicated in an independent European sample (n = 1,719). Four loci were completely new. For the others, additional support (n = 9) or pleiotropic effects (n = 2) were found in the literature, but the results reported here were further refined. All 15 replicated loci highlighted distinctive patterns of global-to-local genetic effects on facial shape and showed enrichment for active chromatin elements in human cranial neural crest cells, suggesting an early developmental origin of the facial variation captured. These results have implications for studies of facial genetics and other complex morphological traits.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This investigation was supported by KU Leuven, BOF funds GOA, CREA and C1. The collaborators at the University of Pittsburgh were supported by the National Institute for Dental and Craniofacial Research (see URLs) through the following grants: U01-DE020078, U01-DE020057, R01-DE016148, K99-DE02560 and 1-R01-DE027023. Funding for genotyping was provided by the National Human Genome Research Institute (see URLs): X01-HG007821 and X01-HG007485. Funding for initial genomic data cleaning by the University of Washington was provided by contract HHSN268201200008I from the National Institute for Dental and Craniofacial Research (see URLs) awarded to the Center for Inherited Disease Research (CIDR). The collaborators at Penn State University were supported in part by grants from the Center for Human Evolution and Development at Penn State, the Science Foundation of Ireland Walton Fellowship (04.W4/B643), the US National Institute of Justice (see URLs; 2008-DN-BX-K125) and the US Department of Defense (see URLs). The collaborators at the Stanford University School of Medicine were supported by the Howard Hughes Medical Institute, NIH U01 DE024430 and the March of Dimes Foundation 1-FY15-312 (J.W.). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Integrated supplementary information
A global-to-local facial segmentation of the PSU cohort obtained using hierarchical spectral clustering. Note that the order of quadrants, or facial segments at each level, is not necessarily the same as in Fig. 1, which shows the PITT cohort. The clustering involves randomness that does not preserve this order, which is why the normalized mutual information, a measure that does not depend on how segments are ordered or labeled, is used to quantify overlap.
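The point about ordering can be illustrated with a minimal sketch. It assumes one common definition of normalized mutual information (mutual information divided by the mean of the two label entropies); the caption does not state which normalization was used, and the function name and toy labels below are hypothetical.

```python
import numpy as np

def normalized_mutual_information(labels_a, labels_b):
    """NMI of two labelings of the same points (assumed normalization: mean entropy).

    Assumes at least one labeling has two or more clusters, so the
    denominator is non-zero.
    """
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    # Map arbitrary labels to 0..k-1 and count joint occurrences.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    table = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(table, (a_idx, b_idx), 1)
    joint = table / table.sum()
    pa = joint.sum(axis=1)
    pb = joint.sum(axis=0)
    nonzero = joint > 0
    mutual_info = np.sum(
        joint[nonzero] * np.log(joint[nonzero] / np.outer(pa, pb)[nonzero])
    )
    entropy_a = -np.sum(pa * np.log(pa))
    entropy_b = -np.sum(pb * np.log(pb))
    return 2.0 * mutual_info / (entropy_a + entropy_b)

# Toy example: the same four segments, labeled in a different order.
segments_cohort_1 = [0, 0, 1, 1, 2, 2, 3, 3]
segments_cohort_2 = [2, 2, 0, 0, 3, 3, 1, 1]   # identical partition, permuted labels
segments_shifted = [0, 1, 1, 2, 2, 3, 3, 0]    # a genuinely different partition

nmi_permuted = normalized_mutual_information(segments_cohort_1, segments_cohort_2)
nmi_shifted = normalized_mutual_information(segments_cohort_1, segments_shifted)
```

Relabeling the segments leaves the score at its maximum, whereas moving points between segments lowers it, which is the property that makes the measure suitable when cluster order is arbitrary.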
Left, the number of principal components retained after parallel analysis for each facial segment. Right, the amount of variation explained by the principal components expressed as percentage for each facial segment.
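As a rough illustration of how parallel analysis can decide the number of principal components to retain, the sketch below compares observed eigenvalues with those of random data of the same size. It is a generic version under stated assumptions (correlation-matrix eigenvalues, Gaussian null data, a 95th-percentile threshold, stop at the first component that fails), not the procedure used for the figure; the function name, parameters and toy data are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_iter=200, percentile=95, seed=0):
    """Return (number of retained components, % variance they explain)."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Observed eigenvalues, sorted from largest to smallest.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    # Eigenvalues of uncorrelated Gaussian data with the same shape.
    null = np.empty((n_iter, p))
    for i in range(n_iter):
        noise = rng.standard_normal((n, p))
        null[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    threshold = np.percentile(null, percentile, axis=0)
    exceeds = observed > threshold
    # Keep leading components up to the first one that does not beat the null.
    n_retained = p if exceeds.all() else int(np.argmin(exceeds))
    explained = 100.0 * observed[:n_retained].sum() / observed.sum()
    return n_retained, explained

# Toy data: 9 variables driven by 3 independent latent factors plus small noise.
toy_rng = np.random.default_rng(1)
factors = toy_rng.standard_normal((500, 3))
loadings = np.kron(np.eye(3), np.ones((1, 3)))  # each factor loads on 3 variables
toy_data = factors @ loadings + 0.1 * toy_rng.standard_normal((500, 9))

n_components, percent_explained = parallel_analysis(toy_data)
```

The two returned values correspond to the two quantities plotted per facial segment: a component count and the percentage of variation those components explain.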
Supplementary Figure 3: GREAT analysis. GREAT gene ontology (GO) analysis results for the 15 top replicated SNPs in Table 1. Plotted are the binomial test FDR (cyan) and binomial enrichment (magenta) for the indicated top associated biological processes, phenotypes and expression pattern categories.
a, −log₁₀(P value) of the canonical correlation per facial segment, ranging from 0 to −log₁₀(8.01 × 10⁻⁵), where 8.01 × 10⁻⁵ is the Bonferroni-corrected P value threshold for literature replication. Black-encircled facial segments have reached nominal replication (P = 0.05). b, The canonical correlation, on a scale from 0 to 1. c, The normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of facial segment 45, going from the major to the minor allele SNP variant. Blue, inward depression; red, outward protrusion.
a, −log₁₀(P value) of the canonical correlation per facial segment, ranging from 0 to −log₁₀(1.28 × 10⁻⁹), where 1.28 × 10⁻⁹ is the Bonferroni-corrected P value threshold for discovery. Black-encircled facial segments have reached nominal genome-wide significance (P ≤ 5 × 10⁻⁸). b, The canonical correlation, on a scale from 0 to 1. c, The normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of facial segment 11, going from the major to the minor allele SNP variant. Blue, inward depression; red, outward protrusion.
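To make the quantity in panel b concrete, the following minimal sketch computes the first canonical correlation between a single SNP genotype vector and a set of shape principal components for one facial segment. It uses a standard QR-plus-SVD formulation and assumes full-column-rank inputs; the function name and the simulated genotypes and components are hypothetical, and no significance test is included.

```python
import numpy as np

def first_canonical_correlation(x, y):
    """Largest canonical correlation between column-centered x and y."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Orthonormal bases for the centered column spaces.
    qx, _ = np.linalg.qr(x - x.mean(axis=0))
    qy, _ = np.linalg.qr(y - y.mean(axis=0))
    # Canonical correlations are the singular values of qx^T qy.
    singular_values = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(np.clip(singular_values[0], 0.0, 1.0))

# Simulated example: 300 individuals, additive genotype coding 0/1/2,
# and 5 shape components, the first of which depends on genotype.
sim_rng = np.random.default_rng(0)
genotype = sim_rng.integers(0, 3, size=300).astype(float)
shape_pcs = sim_rng.standard_normal((300, 5))
shape_pcs[:, 0] += 0.5 * genotype

canonical_r = first_canonical_correlation(genotype, shape_pcs)

# With a single predictor, the canonical correlation equals the multiple
# correlation from regressing the genotype on the shape components.
g_centered = genotype - genotype.mean()
pcs_centered = shape_pcs - shape_pcs.mean(axis=0)
beta, *_ = np.linalg.lstsq(pcs_centered, g_centered, rcond=None)
fitted = pcs_centered @ beta
multiple_r = float(np.sqrt(np.sum(fitted ** 2) / np.sum(g_centered ** 2)))
```

Because the SNP is a single variable, there is only one canonical correlation per facial segment, which is why each segment can be summarized by one value between 0 and 1.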
a, 6p21.1 locus with peak SNP rs227833 and candidate gene SUPT3H. b, 19q13.11 locus with peak SNP rs287104 and candidate gene KCTD15. The locus in a primarily affects the nasal bridge and ridge, leaving the nose tip unaffected. The locus in b affects the nose tip only, which could indicate different underlying soft tissue regulation. Top, −log₁₀(P value) of the canonical correlation per facial segment, ranging from 0 to −log₁₀(1.28 × 10⁻⁹), where 1.28 × 10⁻⁹ is the Bonferroni-corrected P value threshold for discovery. Bottom, the normal displacement (displacement in the direction locally normal to the facial surface) in each quasi-landmark of a representative facial segment per locus, going from the major to the minor allele SNP variant. Blue, inward depression; red, outward protrusion.