Introduction

“Human facial diversity is substantial, complex, and largely scientifically unexplained”1. The human face is an important source of information for social interactions and for scientists alike. A face advertises, among other things, a person’s sex, age, hormonal status, previous environmental exposure, health, interpersonal attitudes, and emotions. The study of faces and what they communicate in this way integrates genomics, human behavioral biology and life history, evolutionary psychology, and biological anthropology. Ultimately the theory of these relationships is an evolutionary one: that the human body and face have been shaped by selective forces throughout our evolutionary history in response to natural and social environments. Facial morphology thereby occupies the middle of a causal chain whereby biological factors such as age, sex, and body composition are reflected in facial and bodily characteristics that then serve as cues in person perception and the consequent behaviors.

Correlational studies have identified some links between physical characteristics and social inference, but usually fail to identify the specific morphological pathways underlying the inferences. Morphometric face analysis, however, has demonstrated that quantification of the morphological cues is crucial2,3,4. For studying correlates of facial shape variation, researchers are now turning to geometric morphometric (GMM) methods, which can combine biological factors, shape information, and trait inference in the same data space. In 2005, Schaefer and colleagues were the first to make use of this possibility in the analysis of faces5,6,7. In Schaefer et al. 2009, the approach was made explicit in a review article and given the name “psychomorphospace”8. Since then GMM has been applied in face research by several research groups, e.g., refs 9, 10, 11, 12.

Instead of using distances, angles, or ratios, GMM is based on a complete multivariate analysis of the locations (that is, the Cartesian coordinates) of a designed set of landmark and semilandmark points taken all together. The most important advantage of GMM is that it preserves the relative spatial relationships of the landmarks and semilandmarks throughout the analysis. The first step is standardizing for position, size and orientation of the faces using a least squares criterion (Procrustes distance). Thereafter, linear regressions of the shape coordinates on the variable of interest quantify and depict the association of this variable with facial shape. This paper exploits a GMM technique only a couple of years old that decomposes the result of such a shape regression into variation at both large and small scales, in order to localize and visualize the relative predominance of its large-scale versus small-scale features. The publication introducing this method13 focused on providing modern paleobiology with a tool to differentiate among integration, dis-integration, and self-similarity. But the concepts and equations entailed directly transfer to the fundamental questions in face research. Integration implies a large contribution from large-scale variation (global patterns), whereas dis-integration can be interpreted as a higher amplitude of small-scale variation (local patterns) in facial signals. Thus, this morphometric notion of integration is based as much in the geometry of landmark placement as in their correlations13. Even though small-scale features may be correlated, they cannot be “integrated” in our morphometric sense unless the deformations of the spaces around them are correlated as well, meaning the integration must be at large scale.
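The standardization step can be illustrated for a single pair of two-dimensional configurations. The following is a minimal sketch, not the generalized superimposition with sliding and symmetrization used in this study: it fits one configuration to a fixed reference, ignores reflection, and treats each landmark as a complex number. The function names and the example coordinates are hypothetical.

```python
import math

def center_and_scale(points):
    """Move the centroid to the origin and scale to unit centroid size."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in z))  # centroid size
    return [p / size for p in z]

def procrustes_fit(points, reference):
    """Least-squares rotation of `points` onto `reference`.

    Returns the fitted coordinates and the residual (partial Procrustes)
    distance between the two centered, unit-size configurations.
    """
    a = center_and_scale(points)
    b = center_and_scale(reference)
    # The rotation minimizing the summed squared distances is the
    # argument of sum(conj(a_i) * b_i).
    s = sum(p.conjugate() * q for p, q in zip(a, b))
    rotation = s / abs(s)
    fitted = [p * rotation for p in a]
    distance = math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(fitted, b)))
    return [(p.real, p.imag) for p in fitted], distance

# Hypothetical four-landmark configuration and a rotated, enlarged,
# shifted copy of it: the two differ in position, size and orientation only.
reference = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
angle, scale, shift = 0.7, 3.0, complex(5.0, -2.0)
moved = []
for x, y in reference:
    w = complex(x, y) * scale * complex(math.cos(angle), math.sin(angle)) + shift
    moved.append((w.real, w.imag))

_, same_shape_distance = procrustes_fit(moved, reference)
```

Because the second configuration differs from the first only by position, size and orientation, the residual distance is zero up to rounding error; any remaining distance between two real faces is, in this sense, a difference in shape.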

Our article exemplifies the new analysis using regressions of male faces on three traits typically associated with facial shape: a physical trait (body mass index, BMI), an endocrinological measure (salivary cortisol), and a rating (perceived health). This short list is not intended as an exhaustive directory of traits, only as a preliminary survey of the range of scaling dimensions that might be exploited in today’s range of studies of facial shape and trait attribution.

We are learning steadily more about the facial correlates of body composition and endocrinological status, on the one hand, and ascribed personal characteristics in social perception, on the other. Studies of facial masculinity or femininity and facial attractiveness have a longer history, while the topics of facial cues to body mass index or health are receiving increasing attention. Current studies in these areas exploit a variety of methods, including linear distances and angles, techniques of computer vision, and geometric morphometrics. Yet one cannot say whether it is the individual features (eyebrows, eyes, nose, mouth – all aspects of local variation) or instead general aspects of shape such as the overall shape of the facial outline or facial width-to-height ratio (global variation) that carry most of the signal. Early approaches to this puzzle included the dissection of the face into single features and their isolation and systematic variation via line drawings or identi-kits (e.g., ref. 14), along with single- and multiple-feature variation15. Although these approaches proved productive, information about natural variation and covariation of the features could not be included and, in spite of modern software and feature manipulations (e.g., refs 16,17) cannot be retrieved. By distinguishing between local and global variation, the GMM approach presented in this paper supersedes such testing of isolated single features.

It is difficult to derive hypotheses about the relative contributions of large- and small-scale variation from the existing literature. Since faces are biological systems operating under functional constraints, a certain degree of integration is to be expected. This expectation is consistent with the observation that so far no biological data set has been analyzed in which the slope of the regressions we will be highlighting was closer to zero (indicating no integration) than −0.56 (ref. 13). One also might expect that facial shape changes consequent to biological processes (body fat storage, water retention) would be more global than psychological signals would be. That is, even given the compartmentalization of fat or extracellular fluids (e.g., ref. 18), these are likely to be more uniformly distributed around the face than are social signals. For example, as Keating15, p. 68, concludes, “variations in eye size or lip thickness alone [are the] reliable dominance cues.” Another hint that local effects dominate social perception is the finding that even neutral facial expressions convey emotional meaning because certain purely histological traits, such as downturned corners of the mouth due to fatty pads or water retention, mimic emotional expressions19, while certain ambiguous emotional displays, such as lowering the eyebrows and upturning the corners of the mouth, bear emotional valence (e.g., refs 20,21). Highly transient states like these that owe to single muscle units likely represent the most dis-integrated facial features. Other forms of social inference might be intermediate, relying on both local and global features. Taken as a whole, our new approach may not rewrite these intuitive understandings, but it will quantify them for the first time.

Material and Methods

Participants

Frontal photographs (procedure below), body height via anthropometer, body composition (Tanita TBF 300), and saliva samples (see below) were collected from 34 ethnically Central European men from the Viennese student population. They were recruited at the Centre for Organismal Systems Biology of the University of Vienna. Each participant was informed about the measurement procedure, subsequent data use, and the right to withdraw from the study at any time; all gave their informed written consent. All protocols were in accordance with the Declaration of Helsinki.

Subjects’ age ranged from 19 to 27 years; body mass index [body weight (kg)/body height (m)²; BMI], from 17.7 to 34.9 (22.9 ± 4.1). BMI and body fat proportion were highly correlated (r_s = 0.881); we chose to present BMI because it is the more common choice of previous researchers into facial adiposity.

Hormone sampling

Each participant provided six saliva samples, three per session at intervals of about 20 minutes. Each session started between 08:00 and 09:30 a.m. Participants were advised not to eat or drink for at least one hour before the data collection, not to brush their teeth that morning (in order to avoid the risk of small bleeds), and not to be involved in sports or sexual activities, to drink alcohol or caffeine, or to take drugs over the 12 hours preceding the measurement session. Salivary samples were frozen at −20 °C and analyzed jointly in the endocrine lab at the Department of Behavioural Biology of the University of Vienna. Cortisol concentration was quantified by a microtiter plate enzyme immunoassay (EIA) using procedures developed by Palme and Möstl22. Repeated measurements of duplicate pool samples revealed a mean inter-assay coefficient of variation of 11.6%; the mean intra-assay coefficient of variation was 14.5%, which is the usual variation for analyses using group-specific enzyme immuno-assays23,24. Individual samples with high discrepancies between duplicate samples were excluded before averaging. Mean cortisol values per subject ranged between 14.8 and 52.3 ng/ml (29.9 ± 8.9 ng/ml).

Rating study

Each rater (39 male, 62 female; ethnically Central European; 20–45 years, 33 ± 6.7 years) rated each of the 34 male faces (grey-scaled and masked by a blurred ellipse, Fig. 1) in pseudo-random order on a computer screen using sliders with a hidden range from 0 (unwell) to 100 (healthy appearance). Participation in the rating study was wholly voluntary; all participants completed the procedure. To account for individual variation in rating ranges, data were rank-ordered within each rater; then the median of the 101 scores for each photograph was taken as the measure for perceived health for the corresponding photograph. Median ranks ranged from 5 to 26.75 (17.2 ± 5.9) out of 34, a gratifyingly wide span that sustained the further analysis we are about to report.
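The aggregation of the ratings can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used in the study: tied slider scores are assumed to receive their mean rank, and the function names and the toy data (three raters, three faces) are hypothetical.

```python
from statistics import median

def average_ranks(scores):
    """Rank scores from 1 (lowest) to n; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of 1-based positions in the block
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def perceived_health(ratings):
    """ratings[r][f] is the slider score that rater r gave to face f.

    Scores are rank-ordered within each rater, then the median rank
    over raters is returned for each face.
    """
    ranked = [average_ranks(row) for row in ratings]
    n_faces = len(ratings[0])
    return [median(r[f] for r in ranked) for f in range(n_faces)]

# Toy data: three raters who use the 0-100 slider very differently.
toy_ratings = [
    [10, 50, 90],
    [0, 100, 100],
    [30, 20, 80],
]
toy_health = perceived_health(toy_ratings)
```

Ranking within rater removes differences in how much of the slider range each rater uses, so that only the ordering of the faces enters the median.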

Figure 1

(a) Landmark scheme. Thirty-seven point landmarks and thirty-four semilandmarks (–) were digitized on each facial portrait. Subsequently, their x- and y-coordinates were subjected to a generalized Procrustes superimposition with additional steps for sliding and symmetrization26. (b) Greyscaled version of the same portrait on standardized background as vignetted by a blurred ellipse. The face in this figure is the actual average of all the sample faces after each was unwarped to the sample average configuration.

Facial photographs and landmark data

Frontal photographs were taken at 350 cm with the head adjusted according to the Frankfort horizontal and a neutral facial expression. We used a digital reflex camera (Canon EOS 40D) with a 200 mm lens positioned at eye height.

A total of 71 landmarks and semilandmarks were digitized to capture facial shape (Fig. 1). Landmark definitions basically match the earlier operationalization of Windhager and colleagues25.

Shape analysis

The initial steps in the present research dataflow were those that have become standard in Procrustes studies of facial form8,27. Landmark and semilandmark locations from Fig. 1, after Procrustes superimposition, were regressed on correlates of facial form of three different types: physiological measurements (here, BMI), hormonal transients (here, salivary cortisol), and perceptions by others (here, a health rating). The regression vectors that prove conventionally significant by the usual permutation tests may be visualized as thin-plate spline grid deformations from the mean form to the predicted forms that lie three standard deviations from the mean in either direction (Fig. 2).
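The logic of a shape regression with a permutation test can be sketched in a few lines. This is an illustrative sketch, not the tpsRegr implementation: it assumes ordinary least squares on each shape coordinate separately, uses the fraction of explained shape variance as the test statistic, and counts the observed arrangement among the permutations. All names and the toy data are hypothetical.

```python
import random

def shape_regression(shapes, predictor):
    """Regress every shape coordinate on a single predictor.

    shapes: one flat list of Procrustes shape coordinates per specimen.
    Returns the regression vector (one slope per coordinate) and the
    fraction of total shape variance explained by the predictor.
    """
    n = len(shapes)
    n_coords = len(shapes[0])
    x_mean = sum(predictor) / n
    xc = [x - x_mean for x in predictor]
    sxx = sum(x * x for x in xc)
    means = [sum(s[j] for s in shapes) / n for j in range(n_coords)]
    slopes = [
        sum(xc[i] * (shapes[i][j] - means[j]) for i in range(n)) / sxx
        for j in range(n_coords)
    ]
    total = sum(
        (shapes[i][j] - means[j]) ** 2 for i in range(n) for j in range(n_coords)
    )
    fitted = sxx * sum(b * b for b in slopes)  # sum of squares of predicted values
    return slopes, fitted / total

def permutation_p(shapes, predictor, n_perm=1000, seed=0):
    """Share of random reassignments of the predictor that explain at
    least as much shape variance as the observed assignment."""
    rng = random.Random(seed)
    observed = shape_regression(shapes, predictor)[1]
    shuffled = list(predictor)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if shape_regression(shapes, shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: eight "specimens" whose three coordinates depend linearly
# on the predictor, so the predictor explains all of the variance.
toy_x = [1, 2, 4, 7, 11, 16, 22, 29]
toy_shapes = [[0.1 * x, -0.2 * x, 0.05 * x] for x in toy_x]
toy_slopes, toy_explained = shape_regression(toy_shapes, toy_x)
toy_p = permutation_p(toy_shapes, toy_x, n_perm=199)
```

The regression vector returned here is what a thin-plate spline grid visualizes when it is added to, or subtracted from, the mean form.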

Figure 2: Visualization of symmetrized shape regressions upon BMI, cortisol and health rating by thin-plate spline (TPS) deformation grids.

The average landmark configuration corresponds to the undeformed grid. The complete symmetrized scatter of all shape coordinates that generate the grand mean is presented to its left. The deformations correspond to a decrease (left) or an increase (right) of 3 standard deviations of the predictor variable: BMI, top right pair; cortisol, bottom left pair; health rating, bottom right pair.

At this point we invoked the novel procedure just introduced to the community of disciplines concerned with evolution: the formal construction of a dimension of spatial scaling corresponding to any shape phenomenon of interest (here, any of these regressions). For a detailed mathematical explanation of this procedure, see ref. 13. The approach is an extension to our shape morphometrics of a formalism already somewhat familiar from studies of Brownian motion. Mandelbrot’s notion of fractal dimension28 is based on early work by Perrin29 and others confirming Einstein’s self-similar model of diffusion. In Brownian motion, as observed in the laboratory, the statistical properties of any segment of the process are independent of the duration of that segment except for one single parameter, the diffusion coefficient (or, for a random walk, the step variance). A diffusion four times as long as another looks exactly the same except for a scaling of amplitude by a factor of 2 (the square root of 4).
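The square-root scaling of diffusion can be checked with a small simulation. The sketch below assumes simple random walks with steps of +1 or −1; the walk lengths, the number of walks, and the seeds are arbitrary choices made for illustration.

```python
import random
import statistics

def endpoint_sd(n_steps, n_walks=3000, seed=1):
    """Standard deviation of the end position over many +1/-1 random walks."""
    rng = random.Random(seed)
    ends = [
        sum(rng.choice((-1, 1)) for _ in range(n_steps)) for _ in range(n_walks)
    ]
    return statistics.pstdev(ends)

# A walk four times as long should spread out about twice as far.
sd_short = endpoint_sd(50, seed=1)
sd_long = endpoint_sd(200, seed=2)
ratio = sd_long / sd_short  # close to 2, up to sampling error
```

The two sets of walks differ in nothing but duration, and their amplitudes differ by a factor close to the square root of the ratio of durations, which is the self-similarity that the spatial analysis transfers from time to geometric scale.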

Bookstein13 shows how this same notion of scaling can be converted from time comparisons to space comparisons by use of the machinery of principal and partial warps that is already part of the standard thin-plate-spline morphometric toolkit27. This machinery has been part of GMM since the beginning (cf. refs 26, 27 and 30), but these tools are not applied as often as the other parts of this useful technological praxis for shape analysis. Briefly, any individual shape of some landmark configuration (here, the shape of a face) can be represented as the deformation of the sample average shape. The thin-plate spline diagram that GMM typically uses to convey one of these deformations has a specific bending energy, a net quantity of what would be actual physical energy if the situation were that of a metal plate bending perpendicular to the picture plane. Bending energy turns out to be a quadratic form (in effect, a sum of squares) in the coordinates of the landmark points themselves. And, just as sines and cosines are a conveniently simple representation of the way a musical sound can be expressed in terms of pure tones, so the principal warps are a conveniently simple representation of the ways that any single shape change can be re-expressed as a superposition of these rhetorically useful forms of “pure bending at some particular scale.” Principal warps are geometrically orthogonal components corresponding to deformations at different geometric scales (analogous to different powers in polynomial curve fitting31). A partial warp is just the combination of two copies of the same principal warp, once for the horizontal coordinate of a facial shape, once for the vertical coordinate. Finally, the uniform component is the part of the change that comes from patterns that are free of bending – the so-called affine transformations that leave parallel straight lines parallel.
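The construction of principal and partial warps can be sketched with NumPy. This is a simplified illustration rather than the S-Plus code used here: it assumes the two-dimensional kernel U(r) = r² log r², takes the bending energy matrix as the upper-left block of the inverse of the usual thin-plate spline matrix, and omits any rescaling of the scores by powers of the eigenvalues. Function names and the random reference form are hypothetical.

```python
import numpy as np

def bending_energy_matrix(ref):
    """Bending energy matrix of the 2D thin-plate spline for a k x 2 reference form."""
    ref = np.asarray(ref, dtype=float)
    k = ref.shape[0]
    diff = ref[:, None, :] - ref[None, :, :]
    r2 = np.sum(diff ** 2, axis=-1)
    # Kernel U(r) = r^2 log r^2, with U(0) = 0 on the diagonal.
    K = r2 * np.log(np.where(r2 > 0, r2, 1.0))
    Q = np.hstack([np.ones((k, 1)), ref])  # constant and linear (affine) terms
    L = np.zeros((k + 3, k + 3))
    L[:k, :k] = K
    L[:k, k:] = Q
    L[k:, :k] = Q.T
    B = np.linalg.inv(L)[:k, :k]
    return (B + B.T) / 2  # remove rounding asymmetry

def principal_warps(ref):
    """Eigenvalues (bending energies) and eigenvectors (principal warps),
    ordered from largest to smallest geometric scale."""
    evals, evecs = np.linalg.eigh(bending_energy_matrix(ref))
    keep = evals > 1e-8 * evals.max()  # drop the three affine (zero-energy) directions
    return evals[keep], evecs[:, keep]

def partial_warp_amplitudes(ref, displacement):
    """Squared amplitude of a k x 2 landmark displacement on each partial warp.

    Each principal warp is applied once to the x- and once to the
    y-displacements; the two scores together form one partial warp.
    """
    evals, warps = principal_warps(ref)
    scores = warps.T @ np.asarray(displacement, dtype=float)  # (k - 3) x 2
    return evals, np.sum(scores ** 2, axis=1)

# Hypothetical reference form with ten landmarks.
rng = np.random.default_rng(0)
ref = rng.random((10, 2))
energies, _ = principal_warps(ref)

# An affine (uniform) change bends nothing, so every partial warp amplitude is zero.
affine = ref @ np.array([[0.10, 0.02], [0.03, -0.05]]) + np.array([0.3, -0.1])
_, affine_amplitudes = partial_warp_amplitudes(ref, affine)
```

With k landmarks there are k − 3 principal warps, and a purely affine displacement loads on none of them, which is why the uniform component has to be handled separately.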

Once each observed shape is represented in terms of this new set of descriptors, the uniform component together with all the partial warps, the analysis of integration just introduced into the evolutionary literature13 is launched, as follows. One begins by removing all the shape variance that corresponds to the uniform shape changes (here, changes in height/width ratio of these faces). In a context of two-dimensional data (such as our facial photos), if the variance of every partial warp is exactly proportional to the reciprocal of its bending energy, then the nonuniform shape variance of every small square of landmarks and semilandmarks, regardless of size, position, or orientation, will be the same. If the variance of partial warps drops faster than their bending energy rises, the transformation can be said to be more integrated, with greater variability at the larger scales of shape features. Conversely, if partial warp variance drops more slowly than bending energy rises, the transformation is more dis-integrated, with more variability of the smaller-scale structures than would be predicted by the large-scale variation. (Our standard Procrustes null distribution lies in an extreme position on this scale, with the variance of every partial warp exactly the same a priori. This is one reason it is an inadvisable choice for applications in biological morphometrics32).

One gets from a landmark-based data representation to an estimate of this scaling dimension for any particular shape phenomenon by carrying out one additional regression (see examples in the penultimate figure below). The new regression fits a line to a plot of log partial warp variance against log bending energy for all the partial warps representing the transformation under study. In plots like this one, the first partial warp is the pattern of nonuniform shape change with the least bending per unit Procrustes length; this is usually a bending of the long axis of the form under study, and can be in the x-direction, the y-direction, or any combination. The second partial warp typically complements the first one by some version of a cubic (S-shaped) bend, likewise in any combination of x- and y-directions, and so on until the last partial warp, which is usually the relative displacement of the pair of landmarks at closest spacing to one another. Whatever the reference form, the partial warps provide an ordination of all its possible shape changes along the single dimension of steadily greater and greater bending per unit Procrustes length (of the deformation). The fitted regression slope is a summary measure of the steepness of fall of this ordination. Slopes steeper than −1 correspond to integrated processes (such as growth) that affect all regions of the form by a small number of quite powerful 1-factors. Slopes shallower than −1 represent patterns of the opposite connotation, patterns that are much less correlated from locus to locus across the form. In-between are the strictly self-similar processes, of slope exactly −1. These are the analogues of Brownian motion for this domain of shape features: transformations that have the same nonuniform variance (transformation of squares into trapezoids or kites), as a proportion of starting scale, regardless of that scale.
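The additional regression and the reading of its slope can be sketched as follows. This is a minimal illustration: the least-squares fit is standard, but the tolerance band around −1 used to label a slope as self-similar is an assumption made here for the example, not a criterion taken from the method, and the toy bending energies are hypothetical.

```python
import math

def loglog_slope(bending_energy, warp_variance):
    """Least-squares slope of log partial warp variance on log bending energy."""
    lx = [math.log(e) for e in bending_energy]
    ly = [math.log(v) for v in warp_variance]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

def scaling_regime(slope, tol=0.05):
    """Label a slope relative to the self-similar value of -1.

    `tol` is an illustrative tolerance, not part of the published method.
    """
    if slope < -1 - tol:
        return "integrated"       # variance falls faster than energy rises
    if slope > -1 + tol:
        return "dis-integrated"   # variance falls more slowly than energy rises
    return "self-similar"

# Toy example: variance exactly proportional to the reciprocal of
# bending energy, the self-similar case.
toy_energy = [1.0, 2.0, 4.0, 8.0, 16.0]
toy_variance = [3.0 / e for e in toy_energy]
toy_slope = loglog_slope(toy_energy, toy_variance)
```

Applied to slopes of −1.12, −0.99 and −0.76, this labelling returns "integrated", "self-similar" and "dis-integrated", respectively.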

Of the regressions of form on its correlates that are considered in this paper, one is an integrated pattern, one is a dis-integrated pattern, and one is a self-similar pattern. We show the regressions of variance on bending energy responsible for this classification and, back on the picture of the face, the evident variations of predicted landmark shifts that correspond to this taxonomy of scaling regimes. We also interpret the difference, which is substantial, in terms of the different origins of these shape regressions in development versus perceptual processes.

We used F. James Rohlf’s computer programs tpsUtil and tpsDig2 for landmark digitization, tpsRelw for sliding of the semilandmarks, tpsRegr for the shape regressions, and tpsSuper for the image unwarping and averaging33. The analysis of spatial scaling was carried out in S-Plus.

Results

The principal scalar measurements of this study were nearly uncorrelated among themselves. Rank-correlations were as follows: BMI and cortisol, 0.089; BMI and health rating, 0.157; cortisol and health rating, 0.279 (n.s. for our N = 34). Facial shape variation was strongly predicted by each of BMI, cortisol, and health rating separately. Each shape regression was significant (all p’s ≤ 0.05 over 1000 permutations). With symmetrized faces, the fraction of variance explained by BMI was 18.5%, by cortisol, 10.7%, and by health rating, 6.0%.

Before proceeding to the detailed spatial analysis, we guide the reader through the typical verbal interpretation of opposite pairs of grids (Fig. 2) and averaged unwarped images (GM morphs, Fig. 3). The male facial shape associated with low BMI in our data is mainly characterized by an elongated facial outline with the sensory organs comparatively larger and more widely spread out over the area of the face. This general pattern is emphasized by higher eyebrows, a relatively larger visible part of the sclera and iris, a longer nose, fuller lips, upturned corners of the mouth, and a more pointed chin. In contrast, men with a high BMI tend to have a rounder face with more centrally situated and comparatively smaller sensory organs. Likewise, the sclera and the iris are less visible, the mouth has more downturned corners, and the chin appears to be wider and rounder. The general facial correlates of low salivary cortisol somewhat resemble those for low BMI except for the shape of the eye region (eyes that are more almond-shaped). In contrast, high salivary cortisol is related to eyes that are more slit-like, with upper lid regions that look almost swollen. Generally, the facial outline widens with increasing salivary cortisol as with increasing BMI. Morphs visualizing the different health ratings hardly differ in overall size and location of the sensory organs in relation to the whole face (as they did for comparisons over the range of BMI or salivary cortisol). Still, the shapes of sensory organs are not the same along the attributed health gradient. For lower health ratings, the eyes are relatively rounder, the nose thinner and longer, and the mouth narrower but framed by thicker lips. In contrast, higher health ratings are characterized by relatively more elliptical eyes with lower and straighter eyebrows, as well as a shorter and wider nose. The lips are relatively thinner and the mouth is wider. 
The overall face is less oval and more square than one with lower attributed health.

Figure 3

Computed morphs of the averaged unwarped image (GM morphs) depicting the same shape regressions and configurations as the thin-plate splines (Fig. 2): the sample average as well as the facial shapes corresponding to low (minus three standard deviations) and high (plus three standard deviations) of BMI, cortisol, and health rating.

In order to quantify spatial scaling for each of these regressions, shape variance is first split into uniform and non-uniform shape changes. The relative contribution of the uniform component varied by predictor variable: 31.5% for BMI, 26.8% for cortisol, 25.2% for health rating (Table 1). The uniform component is depicted as black lines in the penultimate figure. Thereafter, the scaling dimension is estimated via the regression of log partial warp variance on log bending energy for all the partial warps of the transformation (Fig. 4; Fig. 5, third column). By definition, the first partial warps have the least bending per unit Procrustes length. Table 1 shows that the sum of variances over the uniform component, partial warp 1 and partial warp 2 (the largest scale contributions) is highest for BMI (86%), followed by cortisol (73%) and health rating (54%). The variance at these highest three scales is 0.95 × 10⁻⁴ for BMI, 0.49 × 10⁻⁴ for cortisol, and 0.24 × 10⁻⁴ for health rating.

Table 1 Contribution of large-scale variation as a function of the predictor variable.
Figure 4: Color key for the shape changes per partial warp depicted in Fig. 5.

The first two partial warps (in red) correspond to the non-uniform components with the least bending energy per unit Procrustes length. The subsequent three partial warp contributions are coded in cyan, and the others – representing small-scale, local variations – in purple. Note the log-scale along the vertical axis.

Figure 5: Graphical representation of spatial scaling.

The left column depicts the contribution of the uniform component (in black) as well as the vectors for partial warp 1 and partial warp 2 (both in red). This stands for large-scale variation. In the middle column, the other partial warps, representing small-scale variation, are added (the next larger three in cyan, the others in purple). The uniform component together with all partial warps gives the deformations depicted in Fig. 2. The right column gives the log partial warp amplitude squared together with the log bending energy for each partial warp. The frame at upper right is not missing a red dot; rather, there are two red dots, which overlap nearly perfectly on the scale of this vertical axis. Lines are the regressions whose slopes indicate the level of (dis-)integration pattern by pattern.

For BMI, the partial warp variance drops faster than the bending energy rises (Fig. 5, third column). The corresponding slope of −1.12 stands for an integrated pattern. Cortisol shows an intermediate pattern between integration and dis-integration with a slope of −0.99 (which Bookstein characterized as “self-similarity”13). Although there is a major effect of components at larger scale, there are substantial local effects as well (Fig. 5, second row, middle column). These are found mainly in the eye region, the relative distance between the nose and the mouth, and the chin. In interpreting these displacement diagrams, the reader should emphasize the visual extent of the colors per se – the net lengths of our three selected subdomains of partial warps, irrespective of their direction – and should not be concerned with the appearance of “outliers,” as the partial warps are precomputed patterns correlated over all 71 of the (semi-)landmarks of the design. The visual impact of the red segments decreases down the figure, while those of the cyan and purple segments increase. The most dis-integrated pattern was found for the health rating. This means that the corresponding shape changes are much less correlated from locus to locus across the male face. The slope of −0.76 here is significantly shallower than the regression slope of −1.12 for BMI (p = 0.002). Small-scale effects predominate (Fig. 5, bottom row).

Discussion

It has become customary to analyze correlates of facial form by regressing Procrustes shape representations of that variation on their hypothesized causes or effects. Both the causes of that variation (here, BMI) and the effects of that variation (here, perceived “health”) can be detected and described by strong regressions of this sort. Our results have shown how regressions like these can sometimes be differentiated by their apparent geometric scale. We might summarize the findings and their interpretation in the form of the oversimplified diagram in Fig. 6. Physiological effects upon form appear more integrated than hormonal correlates of form, which are, in turn, more integrated than the apparently multifocal perceptual effects of form that our brains invoke implicitly in the course of rating studies. The finding suggests a polarity within psychomorphospace studies: contrasting global versus focal patterns of morphology.

Figure 6: Spatial wavelengths: the “colors” of the face as viewed frontally.

This schema visualizes the different structures of correlation across shape regressions by dissecting the scale of variation involved. While biological variables are encoded in rather large scale (global patterns), perceptual outcomes tend to be of smaller scale (local patterns). Brackets indicate hypothetical scenarios.

Figure 6 incorporates a subtle color-coding. You are familiar with color as wavelengths of light: red has the longest wavelength in the visible spectrum, blue the shortest. Figure 6 exploits this color spectrum by “coloring” the wavelengths of bending as if all these regressors, causes and ratings alike, were filters on the same unchanging data set of shapes. The diagram copies the top right and bottom right regression lines from Fig. 5, “BMI” and “Health rating”, and adds three others corresponding to hypothetical processes that go beyond the data of the present paper. All these lines are to the same axes as in the right column of Fig. 5. The line labeled “Growth allometry” has slope −1.5, the estimated slope for allometry from an analysis of growing rodent skulls13. The line labeled “Emotion rating” expresses the conjecture that a rating of an emotional state, such as anger (think of the role of the eyebrows in conventional cartoon renderings of this emotion), will focus even more sharply on local features and less on global gradients at the largest scales. Finally, the line labeled “no integration” is the biologically impossible situation modeled by the Mardia-Dryden distribution26 where all landmarks vary independently by the same circular Gaussian. In terms of the more conventional language of filtering, we color “BMI” in red because in comparison with self-similarity it is like a red filter, emphasizing long spatial wavelengths, and the Health rating is drawn in blue because, like a blue filter, it comparatively emphasizes shorter wavelengths – the partial warps of higher energy at the right on the horizontal scale. Growth allometry should be drawn in the “infrared” on this diagram, even stronger at larger scales and weaker at small scales. Conversely, emotional ratings should be drawn in ultraviolet, with even more amplitude than the health rating at the smaller scales. 
Finally, the parody of “no integration” is shown in black, as it is incompatible with life. In the metaphor of the filter, this distribution lets no meaningful biological signal through at all.

The sort of large-scale variation represented by the red line in Fig. 6 is conceptually analogous to the n-dimensional feature space proposed by Grammer and colleagues34, in which correlated features compose a single ornament no matter the spatial extent over which they are correlated. In contrast, single feature approaches such as identi-kits might be regarded as analogous to the small-scale patterns here indicated by the colors of blue or violet.

Certainly, systemic effects are reflected in much more global spatial patterns than single muscle movements are. In this context fat and water distribution seem to be important issues. In young adults (as in our sample), fat in the face looks fairly evenly distributed because of smooth transitions between subcutaneous fat compartments, while ageing leads to abrupt contour changes between these regions18. This is the straightforward explanation of why our shape regressions on BMI reveal mainly large-scale changes in facial morphological covariation: all facial regions studied are highly integrated in respect of fat deposit processes. In contrast, increased salivary cortisol concentrations not only influence the overall shape but also have an impact on specific features, in particular on the area around the eyes. Physiologically, circadian secretion rates of cortisol in relation to other hormones or life-style factors can dramatically influence the water exchange between cells and the extracellular space by ion movements along the cell membranes35. Also, cortisol administration in healthy men leads to an expansion of extracellular plasma volume36. So it would be reasonable to assume that the inter-individual differences in cortisol in our sample might be associated with differences in water retention. And these effects seem quantifiable not only in a more rounded facial outline with somewhat centrally located sensory organs but also locally around the eyes. “Swollen eyes” can safely be added to this phenomenon, since periorbital puffiness is typically caused by fluid buildup around the eyes.

Further differentiation comes from the nature of ratings. Someone who is asked to attribute a certain trait to a face will search for the cues of that trait. In our example, the raters likely pick shape features that in their experience systematically vary with health status. For instance, faces with “apple cheeks” would consistently be assigned a more highly ranked health status than those hollower in the cheek and eye areas (a pattern coherent with our shape estimates for perceived health status, Figs 2 and 3). In our sample of young male faces, health raters might also pay attention to features such as testosterone markers (pointing to a good immune system37), body size markers, and physical strength markers, adding up to a “patchwork face” with a masculine and robust appearance (Fig. 3, bottom right morph). For all of these reasons, the spatial scale for a rating is generally less integrated than for a physiological condition. Dis-integration probably peaks for ratings of emotional expressions. We now have a metric to quantify the different spatial scalings of shape changes associated with the various predictor variables.

In our sample, BMI, cortisol, and health rating are hardly correlated at all. In light of this near-independence of causes and effects combined with the visual similarity of some features in the graphs of Fig. 5, we briefly look into the shape regressions themselves and relate our results to other studies on facial correlates of BMI, cortisol, and perceived health.

The facial shape changes associated with increasing BMI parallel those that others have found, e.g., ref. 9. The overall pattern, which seems robust against choice of morphometric method, is predominantly a global effect, an enlarged lower face. In their meta-analysis of two “Caucasian” and two “African” male samples, Coetzee et al. note a low but significant positive correlation of BMI with facial width-to-height ratio and a low negative correlation with perimeter-to-area and with cheek-to-jaw-width ratios38. Cheek depth and relative lower face width seem to be most affected by nutritional condition (see ref. 39 for a review). The pattern in Fig. 5 also closely resembles the deformation induced by rising percentage of body fat in men25 and in female adolescents40. In a sample of children and adolescents, three normalized distances representing the lower face area were enough to train a machine to predict body weight from facial portraits41. Our approach does not require preselection of subsamples or preselection of specific local features as in Henderson and colleagues42. When analyzed by shape regressions, such features and contrasts are implicitly captured by a single pooled analysis of all the landmarks and semilandmarks on all the faces.
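To make the idea of a single pooled analysis concrete, the following is a minimal sketch of a shape regression: every shape coordinate is regressed on one predictor at the same time. It assumes that the landmark configurations have already been Procrustes-aligned, and the function names (`shape_regression`, `predicted_shape`) and the synthetic data are hypothetical illustrations, not the code used in this study.

```python
import numpy as np

def shape_regression(shapes, predictor):
    """Regress every shape coordinate on a single predictor.

    shapes:    array (n_faces, n_landmarks, 2), ASSUMED already
               Procrustes-aligned (position, size, orientation removed).
    predictor: array (n_faces,), e.g. BMI.
    Returns (mean_shape, slopes), each of shape (n_landmarks, 2).
    """
    shapes = np.asarray(shapes, dtype=float)
    x = np.asarray(predictor, dtype=float)
    n, k, d = shapes.shape
    xc = x - x.mean()                    # centered predictor
    Y = shapes.reshape(n, k * d)         # one row of coordinates per face
    Yc = Y - Y.mean(axis=0)              # centered shape coordinates
    slopes = xc @ Yc / (xc @ xc)         # least-squares slope per coordinate
    return Y.mean(axis=0).reshape(k, d), slopes.reshape(k, d)

def predicted_shape(mean_shape, slopes, delta):
    """Shape expected when the predictor lies `delta` units above its mean."""
    return mean_shape + delta * slopes

# Synthetic illustration only: 12 faces, 5 landmarks, no noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(5, 2))
effect = rng.normal(size=(5, 2))
bmi = np.linspace(18.0, 30.0, 12)
faces = base[None] + (bmi - bmi.mean())[:, None, None] * effect[None]

mean_shape, slopes = shape_regression(faces, bmi)
high_bmi_face = predicted_shape(mean_shape, slopes, delta=5.0)
```

Because all coordinates enter the same regression, ratios such as width-to-height or cheek-to-jaw width do not have to be chosen in advance; they can be read off the predicted configurations afterwards.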

This study is one of the first to quantify the facial shape changes that covary with cortisol in young adult males. Moore and colleagues produced composite faces, each an average over five to eight men, to represent the four combinations of high/low cortisol with high/low testosterone43. It appears that their results parallel ours: both of their high cortisol conditions are characterized by a rounded facial outline with eyes, nose, and mouth relatively close together. Due to variation in their head positioning we could not compare aspects of the eyes. Our results also resemble to some extent the ones obtained by Gonzalez-Santoyo and colleagues for young adult women44. They averaged ten faces of women with low salivary cortisol concentrations and ten faces of women with high cortisol levels. As in men, more salivary cortisol was associated with a higher facial width-to-height ratio. In contrast, their composite for high cortisol had rounder eyes than the one for low cortisol levels, which is the opposite of the trend that we observed. Our pattern of high cortisol effects is also consistent with the facial appearance of Cushing’s disease (which involves, among other symptoms, chronic overproduction of cortisol). Common signs and symptoms of Cushing’s are a round “moon face” along with weight gain/central obesity, hypertension, thin skin and stretch marks, and muscle weakness45,46. Cortisol administration in healthy men has been related to an expansion of extracellular plasma volume36. Such water retention might also explain the “swollen eyes” or drooping eyelids that we observed as a local effect with increasing cortisol concentration.

The percentage of large-scale variation (Table 1) dropped to just over half when the regression was on a rating instead of a physiological measurement. Raters overweighted small-scale variation in face shape when judging the health status of another individual, in contrast to the global pattern associated with BMI. One explanation could be that people attend to small-scale variations because of the variety of facial expressions and their importance in interpersonal encounters. Perrett and colleagues (2001, as cited in ref. 47) presented composite images combining the 20% healthiest-looking male students and separately the 20% least healthy-looking. A separate version of their study amplified shape, color and texture differences (all images reprinted in ref. 47). The shape characteristics associated with perceived health paralleled our results. In women, perceived health is associated with upward mouth curvature, but not with eyelid openness42.

According to Vernon and colleagues48, p. E3353, “despite enormous variation in ambient images of faces, a substantial proportion of the variance in first impressions can be accounted for through linear changes in objectively defined features.” We have shown that those “linear changes” arise at a range of geometric scales. It is a logical next step to examine which ratings use which features and how their weights might vary in social perception of other qualities or with the type of person being rated (a child, a woman, a person of a different ethnicity [in which respect see Blais et al.49 or Tan et al.50]). In line with the analogy of filtering wavelengths, this approach might ramify into models of neural processing patterns that can then be systematically tested by functional brain imaging or other neurometric laboratory methods.

In conclusion, the methods we presented here augment the information obtained from a shape regression with a profile of its spatial scaling. Not only can we calculate the percentage of variance that is explained by large-scale features, to be compared across predictors and contexts (e.g., biological processes vs. zero-acquaintance guesses), but the slope of the additional regression also serves as a continuous metric, the “color” of the spatial filter. Future studies could extend this paradigm by incorporating ratings of emotions, which are presumed to correspond to the least integrated shape patterns. The new method renders the framework for classification of the profiles of psychomorphospace considerably more robust. For example, the dominance of small-scale features in the production of ratings could well help to explain the overgeneralization biases that notoriously afflict socially salient ratings of the kind that lead to prejudice and ethnic conflicts.
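As a rough illustration of how a slope can act as a continuous scaling metric, the sketch below fits a straight line on log-log axes. It rests on assumptions not spelled out in this section: that the shape regression has already been decomposed into components, and that each component comes with a positive measure of how local it is (here `scale_measure`, for instance a bending energy, where larger values mean smaller spatial scale) and a positive squared amplitude (`amplitude`). Both array names and the function `scaling_slope` are hypothetical and are not taken from the published analysis.

```python
import numpy as np

def scaling_slope(scale_measure, amplitude):
    """Fit log(amplitude) = intercept + slope * log(scale_measure).

    scale_measure: positive values, ASSUMED to grow as features become
                   more local (e.g. a bending energy per component).
    amplitude:     positive squared regression amplitude per component.
    Returns (slope, intercept) of the least-squares line.
    """
    log_s = np.log(np.asarray(scale_measure, dtype=float))
    log_a = np.log(np.asarray(amplitude, dtype=float))
    slope, intercept = np.polyfit(log_s, log_a, 1)
    return slope, intercept

# Synthetic illustration only: amplitudes that fall off as a power law.
scale_measure = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
amplitude = 3.0 * scale_measure ** -1.5

slope, intercept = scaling_slope(scale_measure, amplitude)
```

Under these assumptions, a more steeply negative slope means that amplitude is concentrated in the large-scale components (a more integrated pattern), whereas a slope closer to zero means that small-scale components contribute relatively more.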

Additional Information

How to cite this article: Windhager, S. et al. Patterns of correlation of facial shape with physiological measurements are more integrated than patterns of correlation with ratings. Sci. Rep. 7, 45340; doi: 10.1038/srep45340 (2017).
