Introduction

The human face is an important interface of social interaction; communication, sensory input and expression in humans are to a large extent based on facial characteristics and traits1. Normal facial variation is associated with emotional expression2, attractiveness3 and even lifetime reproductive success4. Recent evidence suggest that evolution has contributed to increased diversity and complexity in human facial morphology, presumably due to the role of the face as a primary medium of individual identification and recognition5. The influence that facial features have in our life has spurred a long and ongoing interest in unraveling the roles that genes and environment play in the morphological characteristics of the human face.

Since the mid-twentieth century, anthropometric scientific research on parent-offspring resemblance and twin concordance has confirmed that variation in human face morphology is driven by genetics6,7,8,9,10,11,12,13,14. Heritability studies were carried out to quantify the extent of phenotypic variation that can be explained by genetic variability using, for instance, facial features extracted from cranial measurements. Moderate heritability, varying approximately between 0.35 and 0.65, was found for traits such as nasion-basion and nasion-sella distances, as well as the position of the lower jaw and the nasal height15,16,17,18,19,20,21,22,23. More recent studies used facial photographs instead, due to the simplicity in which the images can be obtained. However, common traits such as the upper lip height, as well as nasal breadth and vertical eye distance, extracted from standard photographs, were only found to be moderately heritable, with estimates between 0.4 and 0.5324,25,26,27,28,29. Given the almost perfect resemblance of identical twins, such heritability values appear surprising low. Attempts to replicate these findings across independent studies generated inconsistent evidence. A comparison of eight heritability studies reported low correlation (<0.4) between heritability estimates for commonly examined traits such as head circumference, facial height and nose width8. Further examples include the heritability of facial width, which was reported to range from as low as 0.257 to 0.62914,21,30, nasal breadth heritability, varying between 0.352 and 0.63914,31,32, and cheek length heritability, ranging from 0.154 to 0.47521,33.

Possible explanations for such unexpectedly low estimates and inconsistency across findings can be found in a number of experimental factors. First of all, in the way in which the traits have historically been measured. Radiographs and photographs are both flat, two-dimensional images. Their use to measure inherently three-dimensional (3D) objects, such as facial surfaces, limits the extent of shape variability that can be captured and constrains the range of facial morphological descriptors that can be extracted. Other issues can be identified in the process of constructing and measuring facial traits. Quantifying face variability is heavily dependent on establishing landmarks across faces. The number of points that can be manually annotated on a face is affected by the type of imaging modality used, and by the ability of a person to establish landmark locations in an unambiguous manner across samples. Consequently, it is common for studies to annotate as landmarks only a few prominent facial markers, such as eye and mouth corners, nose tip and zygomatic bones15,16,17,18,19,20,21,22,23,24,25,26,27,28,29, ranging in number between 10 up to, in exceptional cases, 40 landmarks. It becomes clear thus that manual landmarking poses an important constraint limiting the extend of facial variability that can be captured. Furthermore, most classical studies adopted facial traits derived from two-dimensional distances between landmark pairs16,17,18,19,20,21,22,23,24,25,28,29. More rarely, angles between connected landmark pairs have also been considered15,19,26,27. The widespread adoption of such facial phenotypes could be justified by the small number of annotated landmarks, the relative simplicity in which these measurements can be acquired, as well as their ease of interpretation. On the other hand, they offer a perhaps oversimplified characterization of face morphology and fail to take into full account the geometric variability that can be observed across faces at a more granular level.

A separate limiting factor that has affected twin heritability studies is related to sample sizes employed and associated statistical modeling implications. Twin studies predominantly employ statistical methods that estimate heritability as the percentage of phenotypic variation that is due to variation of genetic factors34. The statistical power of such studies is defined as the probability of correctly rejecting the null hypothesis of zero heritability35. A multitude of factors affect power, including the combination of variance components used in the model, heritability effect size (ranging between 0 and 1), sample size and the proportion of monozygotic (MZ) to dizygotic (DZ) twins in the dataset. How to best optimize the experimental design in terms of sample size is still largely debatable35. By assuming the existence of only additive genetic effects in the twin model, a previous simulation study reported minimum required sample sizes ranging between 75 and 2,000 twin pairs in order to achieve 95% statistical power, depending on heritability effect sizes ranging from 0.8 to 0.2 respectively, and keeping the MZ to DZ ratio close to one36. In practice, the large majority of twin studies to date have relied on sample sizes of 20 to 100 twin pairs6,10,11,13,19,20,24,25,26,27,28, possibly due to difficulties in recruiting large cohorts of twins. The use of small to medium sized cohorts may have thus resulted in under-powered studies, especially for traits with low to moderate heritability.

In this work we present a large-scale heritability study of face geometry that departs from previous related investigations in various aspects. First, we acquired 3D facial models. A system for high-resolution 3D photographic scanning, the 3dMD face imaging system, was used to generate anatomically precise three-dimensional polyhedral surfaces of the faces. To capitalize on these representations, we developed a novel automated landmarking procedure, GESSA (Geodesic Ensemble Surface Sampling Algorithm). The algorithm automatically places landmark points throughout the facial surfaces and establishes point-wise correspondence across subjects. GESSA enables the annotation of thousands of landmarks, resulting in the ability to capture morphological variation across subjects at a much finer level of granularity, whilst removing human measurement errors and enabling scalability to large cohorts. The position of each landmark is automatically determined by the algorithm, which attempts to distribute landmark locations uniformly on individual surfaces whilst establishing a precise correspondence across all faces. GESSA was validated on a publicly available dataset of 3D facial surfaces, Morphface37. The software and validation data are available to download from https://github.com/dimostsag/gessa. The availability of densely sampled landmark positions on each face enabled a wider range of facial traits to be defined, each capturing a specific aspect of face-shape variability. In this study, we demonstrate that local curvature traits, computed at each three-dimensional position across the facial surface, provide highly informative quantitative measurements of facial geometry, and explore for the first time their heritability.

A face heritability study was performed on 952 British twins recruited from the TwinsUK adult twin registry38. All subjects were females and unselected for anolyhedral mesh comprising of approximately 4,500 points. Using GESSA, we identified 4,096y disease, of which 197 were MZ and 279 DZ pairs. To the best of our knowledge, this is the largest twin heritability study of the human face. Each face was represented in the dataset as a 3D p landmarks on each face, each one contributing a local curvature value whose heritability was independently assessed. Curvature-based heritability estimates at the individual landmark level were combined into face heritability maps highlighting in great detail, for the very first time, which facial parts are under high and low genetic control. A multivariate analysis involving thousands of closely sampled landmarks further identified extended and well-defined facial regions sharing similar patterns of variability, with heritability estimates reaching or exceeding 0.7, including the chin, nasal regions, nasolabial folds, upper lips and zygomatic bones. In addition, using a smaller set of these landmarks, we explored the heritability of more traditional distance-based facial traits. A number of facial lengths, including bizygomatic and nose width, had heritability estimates close to or greater than 0.7, values that are significantly higher than the ones previously encountered in the respective literature. This is the first time that such a detailed and comprehensive evaluation of facial shape heritability has been investigated using a large cohort and 3D data capture technology. Our heritability findings are likely to support future genome-wide studies on facial geometry, while dense representations of facial surfaces through curvature indices may find further use in face recognition and reconstruction techniques.

Results

We present here the phenotyping and heritability results from our sample of 952 TwinsUK twins. Point correspondences for 4,096 landmarks were automatically established using GESSA. Shape-related phenotypes were constructed using four different curvature indices on the landmark sets. Visualization of heritability estimates associated to landmark-wise curvature traits produced high-definition heritability maps. A multivariate analysis of these landmark-wise measurements, based on sparse PCA (sPCA), indicated the presence of spatially coherent traits extending over larger areas of the face, whose heritability was also estimated. Finally, a subset of 17 landmarks was selected and the heritability of 20 traits based on Euclidean and Geodesic distances was calculated.

Curvature-based Morphological Traits in the TwinksUK dataset

The set of 952 three-dimensional facial surfaces from the TwinsUK cohort was processed with GESSA and 4,096 landmarks were identified. Each landmark contributed four shape-related phenotypes, associated with how bent the surface is around that point. These traits were computed using local curvature indices, namely Mean Curvature (MC), Gaussian Curvature (GC), Curvedness (CU) and Shape Index (SI), resulting in 16,384 quantitative traits per face. A detailed description of the curvature indices, and the rationale for using different types, can be found in the Methods section. For each one of these four measures, curvature maps of the average TwinsUK face - constructed by averaging landmark positions of all faces - were obtained by color-coding all facial landmarks according to their curvature values. Figure 1 shows the resulting maps, which provide easily interpretable representations of facial morphology and underline the different attributes of each curvature index. It must be noted that, due to ethical reasons, individual faces cannot be shown. We only visualize the average face, computed by averaging landmark coordinates of the 952 faces. Due to the large sample size of our dataset, the average face appears somewhat smoothed, in particular around the mouth and eye regions. Supplementary Fig. 10 shows average faces obtained using only 10 and 200 individuals. It can be clearly noticed that the increasing sample size has a smoothing effect on the average face.

Figure 1: Face curvature maps.
figure 1

The figure shows the curvature maps of the average face from the TwinksUK dataset. The maps were created by computing values of curvature indices along all landmarks on the face. Most facial parts have a convex topology, either cylindrical or spherical, with few saddle-like or flat areas that transition to concave regions localized in the eye, nose and mouth corners. Each map further highlighted the unique characteristics of its corresponding index. Mean Curvature Map: Positive values corresponded to convex areas (nose, eyebrows, lips, chin), while negative values to concave ones (inner eye corners, subnasal region). Gaussian Curvature Map: Many facial regions had very small values, due to being either flat or curved along only one direction. Exceptions were the spherical (nose tip, inner eye corners) and saddle-like (nasion, subnasal region) areas. Curvedness Map: Consisting only of non-negative values, the map highlighted how curved the surface is without distinguishing between different shape morphologies. Large values were concentrated on the central part of the face. Shape Index Map: Most facial areas had positive values, indicating the overall cylindrical structure of the human face.

The MC index provides a balanced measure between shape morphology, i.e. flat vs. cylindrical vs saddle structure, and curvature magnitude, i.e. how bent the surface is irrespective of shape. In the MC curvature map, points belonging to protruding concave regions like the nose, chin and eyebrows had positive MC values, with higher measurements observed in the nasal surface. Flatter areas, such as cheeks and forehead, had MC values close to zero, while inner eye corners, ala of the nose, and to a lesser extent, the corner areas between lips and chin were comprised of negative-valued curvatures. GC describes well variation between and within cylindrical and saddle-like structures, while being less sensitive to other shape characteristics. In the GC map, facial features associated with positive values were the cylindrically structured inner eye corners and nose tip, while saddle-like regions such as nasion and base of the nose showed negative GC values. CU is affected by changes in the magnitude of the curvature but not by shape morphology. The CU curvature map highlighted facial parts with large overall curvature, for example nose and eyebrows, which confirms the intuitive observation that highly curved facial parts are mainly centrally located in the face. Contrary to CU, the SI index primarily distinguishes between different shape morphologies, but is less sensitive to curvature magnitude. The SI curvature map showed that that most facial areas have positive values corresponding to generally cylindrical structures.

The sample variability of each curvature trait was also investigated. Curvature variance maps can be found in the Supplementary Fig. 4. Depending on the curvature index, various facial areas showed increased variability. High variance of the MC measurements was observed in areas such as eye sockets, ala of the nose and mouth. GC and CU variance was located mainly in the nose, eye and philtrum regions, while the SI captured increased variation in the zygomatic and mouth areas. It is important to notice that particular facial areas, namely the mouth and eye regions, showed consistently high variability, which could relate to the increased motional ability of the specific structures. Irrespective of the curvature index used, average phenotypic variability was always higher in the subset of DZ, compared to the MZ pairs. As heritability is based on differences in similarity between MZ and DZ pairs39, we also computed the means and variances of absolute trait differences between pairs. The results were combined into facial maps (Supplementary Fig. 5) and showed clearly higher values in the DZ subset, compared to the MZ one.

Univariate Heritability Analyses

We performed univariate analyses of all 16,384 curvature traits - 4 traits per landmark - with the aim to combine local heritability results and produce global maps of heritability for the human face. The heritability of each trait was independently estimated using Structural Equation Modeling (SEM)40. The method evaluates which combination of additive (A) genetic, common (C) environmental and unique (E) environmental variance components can best explain the observed phenotypic variance and covariance of MZ and DZ twin data; see Methods for a detailed description of the model approach. Different combinations of A, C and E component models were considered. The Akaike Information Criterion (AIC)41 was used to guide model selection. AE models were the best-fitting ones according to AIC. Summary statistics for all fitted models can be found in Supplementary Table 1. SEM also assessed the ability of the model to fit the observed data. Model Goodness-of-fit was examined using a log-likelihood ratio test between the structured model and a fitted saturated model where no structure was imposed on the covariances. Test p-values above 0.05 translated to the structured model providing a better fit than the saturated one. Details regarding goodness-of-fit can be found in the Supplementary Text. From the sets of 4,096 traits per index, 86.02%, 67.9%, 86.8% and 87.06% AE models for MC, GC, CU and SI respectively had goodness-of-fit p-values above 0.05.

For each curvature type, the 4,096 heritability estimates were visualized in a single heritability map plot. Each map provides a graphical representation of the extent by which the geometry of facial regions is controlled by genetic variability. Frontal and side facial views of these maps are shown in Fig. 2. For interactive viewing of the heritability maps, a website was created at http://heritabilitymaps.info/.

Figure 2: Heritability maps of the human face.
figure 2

Each heritability map consists of 4,096 landmark heritability estimates. Facial areas with high heritability across all four maps were the mental region, philtrum, nasal tip, nasion, inner eye corners, nasolabial folds and frontal process of maxilla. Mean Curvature Heritability Map: In addition to aforementioned areas, we observed genetic association along the lower parts of the eyes (zygomatic bones), the sides of mouth/chin complex (mental foramen), the complete upper lip region and the frontal eminences. Gaussian Curvature Heritability Map: Compared to the previous map, zygomatic bones were not clearly distinguished and only the philtrum had high heritability in the upper lip region. A moderate genetic effect was observed around the whole nasion area. Curvedness Heritability Map: The map included clear heritable lines in the upper and lower circumference of the facial surface. A further unique feature observed here was a flat heritable area around the ramus of the mandible. Shape Index Heritability Map: Zygomatic lines were clearly visible in this map, along with the upper lip and the frontal eminences. The map highlighted primarily the mostly concave regions of transition between the lip and ala of the nose, the frontal process of maxilla and the lower more protruding parts of the nasal bones.

Several landmarks with high (>0.65) heritability estimates were localized on well defined facial areas. Irrespective of the curvature index employed, landmarks belonging to the mental region, philtrum, nasal tip, nasion, inner eye corners, nasolabial folds and frontal process of maxilla gave consistently high estimates. Amongst all curvature types, MC yielded the highest heritability estimates over extended facial regions. The MC heritability map highlighted further highly heritable areas, including the zygomatic lines around the eye sockets, side areas of the mental foramen, the upper lip and frontal eminences. The results of the GC traits showed high heritability for a number of saddle-like facial structures, namely the whole nasion region, the philtral ridge, as well as the lower nasal bone. In the CU heritability map, a strong genetic influence is observed in the angular transition from the nasal bone and glabella towards the frontal eminences, as well as in the overall roundness of the facial circumference, highlighted by heritable lines across the upper part of the forehead and the lower part of the ramus of the mandible. The SI heritability map showed strong genetic control of the softly spherical flat regions of frontal eminences and upper lip, the cylindrical structure of the upper zygomatic bones and sides of the nose and finally the saddle-like areas of nasion, nasal bone and ala of the nose. The results indicated that local curvature is strongly determined by genes for large parts of the face.

Multivariate Heritability Analyses

We aimed to identify larger facial areas showing common patterns of shape variation and produce single-valued heritability estimates for them. For each face and curvature index, we decomposed the aggregated landmark phenotypes using sparse Principal Component Analysis (sPCA)42,43 (see Supplementary Text for further details on sPCA). sPCA automatically identified traits comprised of linear combinations of landmark phenotypes with similar curvature variability. We refer to these as regional traits. Sparsity affected the number of landmark traits comprising each regional phenotype, and was imposed in order to acquire measurements corresponding to extended but spatially consistent facial regions. The amount of sparsity was controlled through a single parameter. Different values were tested and results showed that the parameter had little effect in the heritability estimation process (Supplementary Fig. 3). The parameter was set to 7.5 for GC, 12.5 for SI and 15 for MC and CU. We estimated the heritability of 100 regional traits for each curvature index. Each set of regional traits captured approximately 90% of its corresponding curvature’s sample variability. The cumulative amounts of explained variances for all regional traits are reported in the Supplementary Files 1–4, for MC, GC, CU and SI indices respectively.

Average heritability statistics for the regional trait analyses are included in Supplementary Table 1. We found again that the best models were the AE as assessed by AIC. Goodness-of-fit p-values above 0.05 were acquired for 81, 75, 87 and 73 regional traits of MC, GC, CU and SI respectively.

The facial areas associated to regional traits were visualized by color-mapping the coefficients of the linear combinations - weights by which landmark traits contribute to the regional phenotypes - on the facial surface. We refer to these maps as Eigenface maps, since they correspond to the eigenvectors of the multivariate decomposition. Eigenface maps of the 10 highest variance explaining regional traits for each curvature index are shown in Supplementary Fig. 2. Figure 3 shows the Eigenface maps of the top 5 heritable regional traits for each curvature index, along with their heritability values. Corresponding phenotypic correlations for MZ and DZ subsets are reported in Supplementary Table 2. The maps of the most heritable traits were mostly comprised of landmarks closely located to each other, indicating patterns of common shape variation in the respective areas.

Figure 3: Eigenface maps of the top heritable regional traits.
figure 3

The maps depict the weights by which landmark phenotypes contribute to the regional traits. sPCA was utilized in order to achieve good spatial consistency of the Eigenface maps. Mean Curvature Eigenface Maps: Distinctive facial characteristics that emerged as heritable were the nasolabial folds, transitions to the ala of the nose, frontal eminences, zygomatic areas between and below the eye sockets and the condyloid process of the mandible, with the latter been more clearly portrayed in the multivariate rather than the univariate heritability analysis. Gaussian Curvature Eigenface Maps: The GC Eigenface maps were highly localized, compared to the rest of the curvature indices. Heritable regions included for the philtrum, ala transitions, inner eye corners and chin facial regions. Curvedness Eigenface Maps: The top heritable traits highlighted mainly the zygomatic areas around the eye sockets, as well as the nasion, chin and upper forehead areas. Shape Index Eigenface Maps: Highly heritable regional traits were located in the nasolabial folds, zygomatic areas,chin, condyloid process of the mandible and mental foramen regions.

Heritability results for the regional and univariate traits were in good agreement. From Fig. 3 it is obvious that the morphologies of regions such as nasolabial folds, zygomatic bones, inner eye corners, mental region, frontal eminences and ala of the nose were highly heritable (>0.65). Due to the good segmentation of the facial surface into clearly identifiable regional traits, we also identified heritable areas that were not as easily noticeable in the heritability maps. In particular, the mental foramen showed up as the second top heritable phenotype in the SI results, while the condyloid process of the mandible was highlighted in the fifth and third top heritable trait in MC and SI analyses, respectively. In the discussion, we identify regional traits by their curvature index and their variance-explaining order, as shown in Fig. 3.

Heritability of Distance-based Traits

An analysis of traditional distance-based facial traits was performed to gain insights about the relative merits of our phenotypes and also enable direct comparisons to previous heritability studies. Out of the 4,096 landmarks, we located 17 corresponding to prominent fiducial points, by visual inspection of the average TwinsUK face. Figure 4 shows the selected landmarks. Using the computed correspondence, we were able to automatically locate the 17 landmarks on all 952 faces. Ten facial traits derived from Euclidean distances (EDTs) between selected landmark pairs were subsequently considered. The phenotypes are summarized in Table 1. In addition, we constructed ten equivalent distance traits measured as lengths of connecting paths between the same landmark pairs, where the paths were restricted to lie only on the facial surfaces. Such distances, defined on non-flat surfaces, are called Geodesic distances. An illustration of the difference between the two types of distances can be seen in Supplementary Fig. 6. We examined whether the use of facial traits derived from Geodesic distances (GDTs), only possible on 3D data, yielded any advantages compared to Euclidean traits.

Figure 4: Distance-based analysis - Selected landmarks depicted on the average TwinsUK facial surface.
figure 4

17 facial points corresponding to prominent fiducial markers were selected from the automatically computed set of 4,096 landmark points. Landmark pairs were subsequently used to construct distance-based phenotypes. See Table 1 for naming conventions.

Table 1 Abbreviations and related descriptions of selected landmarks and distance-based traits.

Heritability estimation for the 10 EDTs and their equivalent GDTs was carried out using ACE, AE and E structural equation models, as before. AE models provided on average the best fits according to AIC. Supplementary Table 3 provides detailed statistics for the fitted models. Table 2 shows the resulting heritability estimates as well as SEM Goodness-Of-Fit test p-values. In the remainder, we concentrate on traits whose models’ Goodness-Of-Fit test p-values were greater than 0.05, thus providing good fits of the observed data. Four EDTs corresponding to horizontal facial measurements, namely nose, zygomatic, mandible and mouth widths were found to be highly heritable. Of these, estimates greater than 0.7 were acquired for the two upper/middle face EDTs, zygomatic and nose widths, while the heritabilities of the mandible and mouth widths were found to be slightly lower at approximately 0.62 and 0.67, respectively. The EDT corresponding to nasal protrusion was also found to be moderately heritable at approximately 0.55. Heritable GDTs included the mandible and mouth widths, nasal protrusion, lower face height and biocular width (i.e. the distance between inner eye corners). The first three GDTs had slightly lower heritability estimates compared to the corresponding EDTs. Mandible and mouth GDTs traverse the lip region, which had low heritability estimates in our previous analyses. On the other hand, the biocular GDT, which traverses a surface area with consistently high heritability, i.e the nasion, provided the highest estimate (0.789) among all distance phenotypes.

Table 2 Heritability estimates h2 for 10 traits based on Euclidean Distances (EDTs) and 10 traits based on Geodesic Distances (GDTs).

Discussion

We presented a novel landmarking and phenotyping methodology for 3D surfaces and performed a large-scale twin heritability study of the human face. A salient aspect of our analysis is the automated dense landmarking procedure. Dense landmarking approaches have been recently adopted in face modeling and candidate association analyses in order to study genetic syndromes involving facial dysmorphisms and asymmetries44,45 and recognize genetic variants that explain normal face variation46,47. Existing methods, though, are either still heavily dependent on some form of manual landmarking, which can be a tedious and error-prone process, or not suited for the analysis of polyhedral surfaces. Further details on related work can be found in the Supplementary Text. Here, we propose a new methodology, GESSA, which makes use of appropriate mathematical structures, such as distances and paths, directly defined on the surfaces, in order to provide uniform and dense landmarking of 3D polyhedral models in an accurate and efficient manner. The GESSA software and validation data can be freely downloaded from https://github.com/dimostsag/gessa. Non sensitive data, such as curvature data matrices, are available upon request to the authors. Furthermore, this is the first time that dense facial landmarking has been used in a twin heritability study.

A second novel aspect of our methodology, facilitated by the three-dimensional face models, is the use of curvature-based phenotypes. For each landmark point in the surface, four different types of univariate measurements describing curvature for local patches centered around the landmarks were considered. Each curvature type is able to highlight varying morphological structures. The use of these traits enabled us to characterize local variability in facial shape and identify its genetic content. Heritability estimates for individual landmark traits were combined to provide detailed global maps of heritability for the human face. Furthermore, regional traits, defined as linear combinations of the single landmark traits, were computed by a multivariate decomposition of the previous traits. Heritability analyses of the latter phenotypes allowed us to report on accurate heritability values for well-defined facial regions. In the literature, similar curvature traits have been successfully used for other applications such as face detection48, recognition49,50, segmentation51,52 and affinity estimation52,53.

Our landmarking and phenotyping pipeline was employed for the analysis of 456 female twin pairs from the TwinsUK cohort38. Previous work has suggested that the TwinsUK sample is representative of the general British population, where the sample were ascertained from54. For facial traits in particular, it may be expected that gender also plays a role in shaping up the facial characteristics we study, such as curvature measures. For this reason, caution is warranted before any of these results are generalized and extended to male subjects.

To our knowledge, this is the largest face heritability study ever done. Compared to most commonly encountered sample sizes of between 20 and 100 pairs6,10,11,13,19,20,24,25,26,27,28, the increased number of subjects improves the statistical power of our study to identify effects of heritability. Based on a previously published simulation study on the power of twin studies36, our sample size surpasses the minimum requirements for having 95% power of rejection of the false - zero heritability - hypothesis at the 5% significance level even when the true heritability effect is as low as 0.3.

We were able to identify curvature-based facial traits that were highly heritable (>0.65) in both of our curvature-based analyses. A direct comparison with previously reported heritable facial lengths and angles is not straightforward, due to the different nature of the measurements. Certain connections though were made between our heritability and Eigenface maps and related published findings. A heritability estimate of 0.53 was previously reported for nose width in a pedigree analysis of 229 Korean individuals29. The MC heritability map (Fig. 2) showed high heritability for the line between the left and right corners of the ala through the base of the nose. In the same study, inner eye corner distance had a heritability estimate of 0.61. The GC, CU and SI heritability maps all indicated that the shape of the nasion region, including the inner eye corners, was highly heritable. This inference was further supported by our multivariate analysis. The 3rd and 5th most heritable GC regional traits, as well as the 3rd top heritable CU trait (Fig. 3) had Eigenface maps concentrated on the nasion area (Fig. 3) and their respective heritability estimates ranged between 0.699 and 0.737. Moderate to high heritability, ranging between 0.4 and 0.7, was identified in twin and family studies for the facial width22,23,26,27. Our MC and SI heritability maps (Fig. 2) highlighted highly heritable lines following the curve of the zygomatic bones. Another phenotype that was reported with heritability estimates of 0.59 and 0.66 is head circumference22,55. This result can be connected to the elevated heritability estimates regarding curvature magnitude in the periphery of the face, that we observed in the CU heritability map (Fig. 2). Finally, a number of twin and family studies identified moderate to high heritability estimates for various phenotypic traits relating to the position, length and angular structure of the jaw bone. Mandible ramus and body length heritabilities were reported to be 0.72 and 0.77 respectively in a study of 363 children and their parents15, while the angle between the two lines was found to have moderate heritability 0.47 and 0.453 in the same analysis and in a different study of 77 twins25. Chin width was estimated to have heritability of 0.42 in a family study29. In our multivariate results, we observed high estimates in the chin area – 2nd SI, 8th CU and 15th GC regional traits, with respective trait heritabilities ranging from 0.685 to 0.711 (Fig. 3). All above comparisons are summarized in Supplementary Table 4.

We also explored the heritability of facial traits based on Euclidean distances (EDTs), as well as equivalent phenotypes measured using Geodesic distances on the facial surfaces (GDTs). 10 EDTs and GDTs yielded reliable heritability estimates, ranging from 0.505 to 0.789. Evidence supporting our results were found in a number of family-based heritability analyses. A previously mentioned study reported its highest heritability estimates of 0.42, 0.44, 0.53 and 0.61 for mandible width, nasal protrusion, nose width and inner eye corner distance respectively29. Here, the nose width EDT and biocular width GDT yielded two of the three highest estimates −0.718 and 0.789 respectively-, while nasal protrusion and eye distance EDTs and GDTs were moderately to highly heritable, with heritability estimates ranging from 0.505 to 0.677. Nose width was also reported to be moderately to highly heritable in two studies of 125 Belgian and 342 Indian families with corresponding estimates of 0.639 and 0.49812,32. The same studies reported heritabilities of 0.606 and 0.605 for the bizygomatic width, while three further analyses, including large population samples from Russia and India and Europe, gave estimates of 0.52, 0.71 and 0.629 for the same trait22,26,30. Our most heritable EDT, with h2 = 0.734 corresponded to that width. In comparison to previously reported values, our heritability estimates were either similar or significantly higher.

Important inferences were made by exploring results from the various types of phenotypes used in this study. In the distance-based study, EDTs corresponding to mandible and mouth widths showed slightly higher heritability estimates in comparison to the respective GDTs. The latter traits had geodesic paths passing through the lip region, which, in the curvature analyses (Fig. 2), had consistently low heritability estimates. On the other hand, the GDT corresponding to inner eye corner distance gave the highest heritability estimate of 0.789. The geodesic path of that trait traverses the nasion region, which is a well known highly heritable area of the human face24,29,56. The results could indicate that GDTs are more sensitive than EDTs to the facial morphology between the considered landmarks. A comparison of summary statistics from the curvature and distance heritability analyses revealed that Structural Equation Models provided good data fits for approximately 80% of the curvature traits, but only 50% of the distance phenotypes, as assessed by Goodness-Of-Fit log-likelihood ratio tests. This observation could be a strong indicator that curvature phenotypes are more suitable for the study of facial morphology.

Conclusion

In this work we proposed a new method for the automated and dense landmarking of 3D surfaces, GESSA, and applied it in a novel large-scale twin heritability study of the human face morphology. Heritability estimates were computed for local curvature phenotypes corresponding to single landmarks and the results were combined to generate face heritability maps. Furthermore, regional curvature traits, corresponding to larger facial areas, were extracted through a multivariate analysis and their heritability was also estimated, yielding a number of highly heritable facial features. Heritability estimation was also performed for a number of traditional facial length traits, with estimates being equivalent or higher than the ones found in the existing literature. In conclusion, we provided a fresh perceptive in facial phenotyping and heritability analysis that could potentially inform future genome-wide studies and be useful in a variety of applications, ranging from population genetics and gene-mapping studies, to face modeling and reconstruction applications.

Methods

Sample Description

Imaging data were retrieved for 1,547 participants of British origin from the TwinsUK Cohort38, for which 3D facial models and age information was available. All subjects provided written and informed consent for academic use of the data. Experiments were approved by the Guy’s and St. Thomas’ (GSTT) Ethics Committee. The research was done in accordance with the tenets of the Declaration of Helsinki. The sample consists predominately of female twins unselected for any disease. Detailed sample characteristics of the TwinsUK Cohort have been previously described38. 314 subjects were dropped due to artifacts in the images or lack of zygosity information. 228 unrelated subjects were excluded from further analysis. An additional 56 subjects were removed due to mesh reconstruction errors during our phenotyping process. After preprocessing and quality control, we were left with 952 female subjects (59.31 mean age, standard deviation 9.85). Of these, 197 pairs were monozygotic and 279 were dizygotic.

Image Acquisition and Preprocessing

The raw 3D facial images were acquired using a 3D photographic scanning system manufactured by 3dMD. Participants were asked to keep their mouths closed and a neutral expression during the acquisition of the 3D scans. Each image was comprised of a 3D triangular mesh, with approximately 4,500 points representing the frontal facial surface, and the corresponding texture map. Texture maps were not used in our analysis. Due to large variation in the original pose and position of the original meshes, we manually located outer eye corners and nasion in all faces. The three landmarks were used to impose a common orientation of the faces under the same coordinate space. High landmark accuracy was not important for this purpose and these landmarks were discarded from any subsequent analysis step. Following that, the surfaces were cropped and trimmed to remove non-facial areas, such as neck and chest regions, hair and ears. Finally, the Iterative Closest Point (ICP) algorithm57 was applied to align the cropped images. The preprocessing pipeline was performed using the Meshlab software suite.

The Geodesic Ensemble Surface Sampling Algorithm (GESSA)

For the streamlined identification and alignment of landmark points across 3D faces, we propose and use our novel landmark sampling procedure, Geodesic Ensemble Surface Sampling Algorithm (GESSA). GESSA automatically samples large sets of corresponding landmark points from sets of similar polyhedral surfaces. We formulate the problem of finding corresponding landmarks as a minimization of an objective function comprised of two entropy-based terms. The first term is the entropy of the data probability distribution. Minimization of this term is performed by moving landmark positions and leads to improvement of landmark correspondences across surfaces. The second term is the sum of entropies for landmark distributions on individual surfaces. By maximizing this term the algorithm achieves a uniform distribution of points on all related surfaces.

The above formulation is adopted from the work by Cates et al.58, which was, in turn, based upon previous work on statistical shape modeling with information-based optimization functions. The first such work that we are aware of was presented by Kotcheff and Taylor59, and subsequent articles by Davies et al.60,61. Further details on these methods can be found in the Supplementary Text. Here we built upon that framework and use geodesic distances between landmarks, directly defined on the polyhedral facial surfaces, in order to increase precision during uniform landmark sampling. Furthermore, a suitable gradient descent optimization technique was developed to optimize landmark locations by operating only on the surface structures. By incorporating these two key features, we were able to to deal with highly curved surfaces, improve upon computational space requirements and enhance the correspondence results.

General Methodology

Let us consider an ensemble of N polyhedral surfaces. Each surface , j = 1 … N, is represented, without loss of generality, as a a triangular mesh in R3. Our objective is to sample M points , with , uniformly from each surface, and establish one-to-one correspondence among points on all surfaces. As such, the overall correspondence problem can be broken down to two major components; correspondence optimization of landmarks across the ensemble and uniform sampling on individual surfaces.

The coordinates of M points on a surface can be aggregated into a vector zj, with

zjR3M. The vectors zj can be thought of as point representations of surfaces distributed in R3M. As such, R3M is taken to be the space of all surfaces, when each one is sampled in M locations, referred hereafter as shape space. Consider ZR3M to be a random variable in shape space with probability density function p(Z), and zj, j = 1 … N, realizations of that random variable. The differential entropy of Z is given by . As such, the sample differential entropy is given by

One-to-one correspondence of landmarks can be optimized by minimizing the above sample entropy. This minimization increases the compactness of the surfaces’ distribution in shape space, which equates to bringing surface landmarks closer to each other. A potential risk though is that landmarks can be collapsed to the same surface locations. The solution is to balance the correspondence accuracy with uniform distributions of points on individual surfaces.

Assuming that , have been sampled in some way from the surface , their positions can be manipulated in order to make them uniformly distributed on the surface. This is done by maximizing the sample entropy for the distribution of landmarks in , since, in bounded domains, as are our surfaces, the uniform distribution has maximum entropy. Let be a random variable on surface with probability density function , and , realizations of . The differential entropy of is , and the sample differential entropy is given by

The combined optimization cost used in the correspondence algorithm balances the sample entropy in shape space with the sum of point distribution entropies and is given by

Q must be minimized under a set of constraints imposing that each point lies in its corresponding surface. We propose the use of a geodesic gradient descent algorithm which directly follows straightest paths on the surfaces in order to update landmark locations without violating these constraints. We first introduce some notions related to manifold geometry that are required to develop our methodology. We then describe the details regarding uniform distribution of points on individual surfaces, followed by the correspondence optimization.

Geometric Structures on Manifolds

The structure of a 3D object is commonly described by its boundary surface in R3. Such surfaces can be mathematically described as 2D manifolds, curved topological spaces that locally, around each point, can be considered similar to a 2D Euclidean space62. We restrict our interest to continuous and differentiable manifolds where distances and shortest paths can be defined.

Between any two manifold points, there exists a unique shortest curve in that connects these two points. This curve is called a geodesic curve and is equivalent to a straight line on the Euclidean space62. The length of the geodesic curve defines the distance between the two points in . Furthermore, for each point α on a manifold , we can define a plane, passing from that point, which can be understood as a local linearization of the manifold around α. This space is called the tangent space of at point α and is denoted . has equal dimensionality to the manifold .

A tangent vector can be uniquely associated to the geodesic curve from point α to point , using the exponential map function , with . The inverse of the exponential map is termed logarithmic map. It accepts two points on the manifold and returns the tangent vector that corresponds to the geodesic curve connecting the two points63, i.e. .

An important inference is that is the smallest tangent vector in norm such that 64. As such, the norm of the logarithmic map provides the length of the geodesic and is used as the distance metric on the manifold:

The gradient of the squared distance function is directly related to the logarithmic map65:

Uniform Distributions of Landmarks in Individual Surfaces

Here we assume that M points have already been positioned on a surface and describe the methodology for distributing these points uniformly on that surface. By maximizing the sample differential entropy w.r.t. , we in essence manipulate point positions to achieve the required uniformity. The optimization problem can be written as

In order to minimize the cost, which is equal to the negative sample entropy, we will employ a gradient descent technique, where points are iteratively moved proportionally to the negative gradient of the cost, until no significant improvement in cost can be achieved. The gradient of w.r.t the landmark point is

The last approximation is based on the assumption that within one iteration of the gradient descent optimization cycle, the probability density at one point is not affected by the rest. As such, , when . The assumption is adopted in order to reduce significantly the computational burden.

To proceed, estimates of are needed. Kernel Density Estimation is suitable for this purpose, but relies on the calculation of distances between surface points. Polyhedral surfaces can be considered piecewise planar approximations of 2D manifolds. An appropriate formulation of geodesic distances for polyhedral surfaces has been given in the seminal work of Mitchell et al.66. They are computed as overall lengths of piece-wise linear segments on the surface triangles that form straight lines when two adjacent faces are unfolded across their common edge.

Based on the ability to compute geodesic distances on the surface , we propose the following geodesic kernel density estimator for :

with the isotropic covariance kernel function given by

where is a standard deviation parameter.

Having formulated a suitable density estimator, we proceed to solve the optimization problem (7) using gradient descent. Reminding that points are constrained to lie on , we need to provide suitable updates for the gradient descent algorithm. Utilizing equation (6), we can write the gradient of the objective function w.r.t. as an average vector on the tangent space of the landmark. In particular:

Employing the definition of the exponential map on manifolds, our point updates are given by

where γ is the gradient descent’s time step parameter. Exponential maps can be computed on polyhedral surfaces using straightest geodesic lines67.

Correspondence Optimization

Having laid out the optimization procedure required to guarantee a uniform distribution of points on the individual surfaces, we now need to solve the correspondence optimization problem, which is written as follows:

Let again be a random variable in shape space with probability density function . If the density function is assumed to be a multivariate normal distribution , the differential entropy can then be written as 58. Using the sample covariance estimator , with the centered observations, , and the matrices of sample vectors and respectively as their rows, the sample entropy is written as

Gradient descent can also be employed here. To simplify computations, the mean estimate is considered fixed during each iteration. With this assumption, the matrix of partial derivatives of can be written as58:

A regularization term α is added above since in practice, will not have full rank.

In order to accommodate these correspondence updates into our geodesic gradient descent method, we project each update on the plane of the point’s corresponding mesh triangle. Let be the perpendicular unit normal vector of the ’s current triangle. The tangent vector that maximizes the gradient update on the triangle plane is given by

By adding up the gradients from equations (12) and (16), we finally acquire the gradient descent updates for the optimization of the overall correspondence cost 4:

Initialization and Iterative Landmark Splitting

The number of landmarks, M, to be sampled from each surface is provided as a parameter by the user. The algorithm initially samples randomly one point from each surface and performs correspondence optimization. Following that, the landmark is split into two and the overall optimization process runs again on the new point sets. This procedure is repeated iteratively until the required number of sampled landmarks is reached and their positions are optimized.

Validation of GESSA using the MorphFace dataset

For this validation, we used the publicly and freely available Morphface dataset of 3D facial surfaces37. The dataset is distributed freely from the University of Basel for internal, non-commercial research, evaluation or testing purposes, and can be found at http://faces.cs.unibas.ch/bfm/main.php?nav=1-1-1&id=scans. It comprises of 11 individual subjects’ faces with neutral pose. All the faces were registered to the Basel Face Model (BFM) facial template. The template mesh consisted of 53,490 landmarks. The registration process identified the locations of these landmarks on the 11 validation faces. For the purposes of this validation study, 19 of these landmarks, corresponding to prominent facial points, were manually selected to constitute the groundtruth landmarks (GTLs). Supplementary Fig. 8 depicts these 19 landmarks on the facial surface. The Morphface surfaces were subjected to cropping in order to remove non relevant areas, such as neck and ears. Subsequently, the facial meshes were subsampled, yielding approximately 4,700 points per face, a number that was similar to that of the facial meshes in our TwinsUK cohort.

Using GESSA, we computed 4,096 GESSA Generated Landmarks (GGLs) in the validation surfaces. Supplementary Fig. 7(A) depicts these landmarks on three example faces, while Supplementary Fig. 9 depicts the average face of the validation dataset.

The goals of our validation study were threefold. First, we aimed to evaluate how uniform the distribution of the sampled landmark points on the surfaces would be. Second, we aimed to demonstrate that sampling approximately 4000 facial landmarks produces a dense landmark coverage of facial surfaces. Third, we aimed to show that GESSA is consistently and accurately aligning landmarks to achieve one-to-one correspondence across all surfaces.

First, we focus on the evaluation of the landmarks’ surface distributions. Supplementary Fig. 11 shows the estimated surface density function of an example validation face. It can be observed that the density is almost uniform everywhere. In addition, examination of Supplementary Fig. 7(A) also shows clearly that the landmarks are uniformly distributed across the complete facial surfaces.

Second, we needed to examine if the chosen number of sampled GGLs is sufficient to densely cover the facial surfaces. A low number of sampled landmarks would lead to extended facial areas having no or very few landmarks. As a consequence, extracted traits would not be able to capture well the morphological information contained in these areas. On the other hand, very dense sampling would incur unnecessary computational costs.

To address this objective, we worked under the premise that, by sampling surfaces densely, we should be able to locate GGLs sufficiently close to the preselected GTLs. We measured the average distance, over all faces in the validation set, from each GTL to its closest GGL. Supplementary Fig. 7(B) shows the GTLs and respective closest GGLs for three example validation faces. Table 3 reports the 19 average distances between GTLs and GGLs over the set of 11 faces. Due to the fact that dimensional units on the Morphface dataset were unknown, we further divided each of the above values by the mean facial width - distance between the two zygomatic landmarks- in order to compute normalized distances, which are also included in Table 3.

Table 3 Average distances and standard deviations (Std) between the 19 groundtruth landmarks and their respective nearest GESSA landmark points.

The average distances per landmark were always less than 3% of the mean facial width. By further averaging over the 19 landmarks considered, the overall average distance was 1.77% of the mean facial width. The results indicated that the number of landmarks was sufficient to cover densely the surfaces and that landmarks extended over the complete surfaces.

Third, we concentrated on evaluating the ability of GESSA to place landmarks in equivalent positions consistently across the complete set of faces. The standard deviations of the above GTL-GGL distances can be used as measures of consistent placement, since small values indicate that GTL-GGL distances are similar across all faces and, consequently, landmarks are positioned in equivalent positions across the dataset. The details are reported in Table 3. The standard deviations of the GTL-GGL distances for all landmarks were always less than 1% of the mean facial width. By further averaging again over the 19 landmarks considered, the average standard deviation was 0.42%. The results indicated that the automatic landmarking produced by the algorithm was sufficiently accurate to be deployed in our heritability study.

Curvature-based Facial Traits

The shape around a point on a surface can be characterized using curvature descriptors68. Curvature is a directional property and describes how bent the surface is around each point69. The curvature magnitude of a point in some direction is given by the reciprocal of the radius of the circle that best approximates the slice of surface in that direction68. Normal curvatures are defined on orthogonal planes to a surface point and for each such point, there exists a single normal curvature that has the largest absolute curvature magnitude. This is called the maximum curvature Kmax. The curvature perpendicular to Kmax is the minimum curvature Kmin. These two surface attributes are collectively called principal curvatures and any normal curvature at a point on a surface can be derived as a combination of Kmax and Kmin69. Supplementary Fig. 1 shows a general characterization of shapes based on the signs of Kmax and Kmin. Univariate curvature indices derived from these measures have been proposed. Four such measures were used in this study to compute phenotypic traits:

Mean Curvature:

Gaussian Curvature:

Curvedness:

Shape Index:

In order to explain the differences between the various indices, we first present two intuitive features that can be used so as to describe the shape of a surface patch. The first feature is the general shape morphology and is governed by the various sign combinations of Kmin and Kmax, i.e. flat (), vs. convex cylindrical (), vs. concave cylindrical (), vs saddle structure (), vs. convex cylindrical (), vs. concave cylindrical (). The second feature is curvature magnitude, i.e. how bent the surface is irrespective of shape morphology. Any univariate descriptor of curvature needs to be a compromise, since it cannot include all information provided by these two features68,69. Individually, each principal curvature does not provide a useful interpretation of local surface shape68, as can be seen in Supplementary Fig. 1. In contrast, the four curvature indices yield more meaningful quantitative shape measures by grouping together or differentiating particular classes of basic shape structures. An illustration of the various indices’ characteristics, including their domains and how shapes are differentiated are included in Fig. 5. The MC index provides a balanced measure between shape morphology and curvature magnitude. It is strongly affected by directional shape differences (convex vs. concave shapes) but is also sensitive to curvature magnitude. GC distinguishes primarily between shape morphologies of the same or opposite principal curvature signs. Finally, CU and SI indices are the most accurate quantitative measures of curvature magnitude and shape morphology respectively.

Figure 5: Topological characteristics of curvature indices.
figure 5

Each descriptor highlights different attributes of the surface’s underlying topology. MC differentiates significantly areas of high and low curvature, as well as convex and concave shapes. GC discriminates well between spherical and saddle-like areas. CU is less representative of a particular morphology and reflects the absolute curvature magnitude in each point, irrespective of its specific shape. Finally, SI is scale-independent and able to differentiate between pure shape characteristics, e.g domes, ridges and saddles, regardless of their high or low CU.

The calculation of local curvature on the 3D meshes was performed using a finite-differences algorithm70. Normal vectors perpendicular to the surface were computed in each landmark. Gradients across the surface were then approximated using finite normal differences of neighbor points. Principal curvatures were found through an eigenvalue decomposition of the normal gradients and curvature index values calculated from the respective formulas.

Distance-based Facial Traits

Two different types of distance-based phenotypes were considered in this study. Traits derived from Euclidean distances between landmark pairs (EDTs), which represent the main type of examined phenotypes in the literature, and traits derived from Geodesic distances between landmark pairs (GDTs). An illustration of the difference between the two types of distances can be seen in Supplementary Fig. 6. Geodesic computations were performed using an implementation of the exact discrete geodesic algorithm66,71.

Heritability Estimation

The heritability analyses were performed using Structural Equation Modeling (SEM)40. SEM evaluates which combination of additive (A) genetic, common (C) environmental and unique (E) environmental variance components can best explain the observed phenotypic variance and covariance of MZ and DZ twin data. The importance of individual variance components is assessed by dropping parameters sequentially from the set of nested models ACE→AE→E. In choosing between models, variance components are excluded from the selection process if there is no significant deterioration in model fit after the component is dropped, as assessed by the Akaike Information Criterion (AIC)41. The E component represents random error and must be retained in all models40. Heritability estimates for the AE models are calculated as , where a and e are the path coefficients of the A and E variance components in the SEM model. A detailed description of SEM can be found in the Supplementary Text. In this work, ACE, AE and E structural equation models were fitted using the OpenMx software72. Regarding the univariate heritability studies, the log files from OpenMx for MC, GC, CU and SI traits respectively are provided in: Supplementary Files 5–8 for the ACE models, Supplementary Files 9–12 for the AE models and Supplementary Files 13–16 for the E models. Regarding the multivariate heritability studies, the log files (ACE, AE and E models) from OpenMx for MC, GC, CU and SI traits respectively are provided in Supplementary Files 17–20. The log files include detailed fit statistics, as well as the estimated path coefficients for the latent factors, along with their standard errors. Furthermore, age was included in the SEM models as a covariate. In contrast, for the multivariate study, the age was regressed from the data before the application of SEM. While the outcome is identical for both approaches, in the second case, no age variable is shown in the models, and ACE, AE and E logs are all included in single files.

Additional Information

How to cite this article: Tsagkrasoulis, D. et al. Heritability maps of human face morphology through large-scale automated three-dimensional phenotyping. Sci. Rep. 7, 45885; doi: 10.1038/srep45885 (2017).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.