Introduction

The estimation of human stature from long bone length measurements is a common task in forensics and biological anthropology, and it can also be used to assess body mass index in hospitalized and bedridden patients. Because body form varies between human populations, it is essential to base the inference of stature on formulae derived for the population of interest. Indeed, several authors have demonstrated that estimation is affected by differences between human populations1,2,3. Regression formulae for stature estimation are available in the literature for a number of human populations, based on data measured either from autopsied corpses4,5 or from skeletal remains1,6. However, most existing studies are based on rather small samples, and most of the data were collected from deceased bodies; very few studies used data from living humans7.

Adoption of algorithms based on Artificial Intelligence (AI) has become increasingly widespread in various fields of medical and biological research. The benefit of these new methods lies in handling large amounts of data in a fully automated way. Accurate measurement of bone length is traditionally done manually, from standing long leg radiographs (LLRs) for living humans or from skeletal material and cadavers for deceased persons. This manual measurement process is time-consuming and sometimes poorly reproducible because different software applications and measurement techniques are used8,9. Manual landmark placement may also lead to high inter-observer variability. A recently published AI-based algorithm automates length and angle measurements on LLRs, which enables the use of much larger datasets and produces standardized outputs10,11,12. In general, the application of AI technology in medicine facilitates the analysis of large imaging datasets such as radiographs, computed tomography (CT) scans or magnetic resonance images (MRIs). However, to date, no study has been published on the use of AI technology for height estimation based on radiographic measurements in a large human sample.

The frequently used regression formulae derived by Trotter and Gleser in 1952 and 1958 are considered to be suitable for persons of European ancestry13,14,15 and are applied to estimate stature from skeletal remains. Trotter and Gleser advised against estimating stature by averaging the estimates obtained from several equations, each of which is based on a different bone or on a combination of bones1. Although the formulae derived by Trotter and Gleser1,15 are considered to be quite reliable, these regressions were derived more than half a century ago, and average human stature has increased markedly since then, especially in high-income countries16,17. It is therefore reasonable to suggest that these formulae may need to be adapted to account for the secular change in stature18.

In this study, we aimed to derive updated regression formulae to infer stature for humans of European ancestry based on long bone measurements from living patients. For this purpose, we used measurements from LLRs taken between 2015 and 2020. We based the regressions on a large sample of more than 4000 adults and applied an AI-based algorithm to acquire tibial length, femoral length and total leg length for this patient sample. We then compared these newly derived regression formulae to existing ones in the literature.

Results

Patient demographics

Of the 4200 LLRs included in the final analysis, 2526 (60.1%) were from female patients and 1674 (39.9%) were from male patients. All included patients were between 18 and 95 years old and born between 1923 and 2002 with a median age of 66 years (Fig. 1). The mean BMI was 29.44 kg/m2 (± 5.8 kg/m2 SD) and the mean height was 168.9 cm (± 9.6 cm SD). Summary statistics of patient demographic variables and total numbers of left, right and bilateral radiographs, which were used in this study, are presented in Table 1. Mean BMI was similar across age groups (Fig. 2). Scatterplots for femur length and stature, separately for males and females, are shown in Fig. 3.

Figure 1

Distributions of (a) age at the time of image acquisition and (b) year of birth for the female and male patient samples.

Table 1 Summary statistics for the demographic variables.
Figure 2

Distribution of mean BMI per age group, separately for the male and female patient samples.

Figure 3

Scatterplot of femur and stature measurements for the male (a) and female (b) sample and the estimated regression lines based on these data. For comparison, the regression lines of the current study (Femur Simon) are shown together with regression lines from the literature [Trotter and Gleser Femur for females (1952) and males (1958), Trotter and Gleser Femur only for males (1952)].

Regression results

The linear regression equations for the estimation of stature in our sample based on either one or two bone lengths are presented in Table 2.

Table 2 Stature estimation equations based on femur, tibia, leg length and femur + tibia length, respectively.

Correlations between stature and long bone lengths were consistently larger than 0.7 for all considered long bones in males and females. For the male sample, this correlation was r = 0.82 (95% CI 0.80–0.83) for the femur, r = 0.80 (0.79–0.82) for the tibia, r = 0.84 (0.82–0.86) for total leg length and r = 0.84 (0.83–0.86) for tibia + femur. The slopes and intercepts are the averages of left and right. The intercept for the femoral regression was 77.49 in our equation for males, compared to an intercept of 65.53 in Trotter and Gleser (1958), and an intercept of 61.41 in Trotter and Gleser (1952). The slope was 2.08 in our equation for males, compared to a slope of 2.32 in Trotter and Gleser (1958) and a slope of 2.38 in Trotter and Gleser (1952), see Fig. 3a.

The correlations between long bone length and stature for the female sample were r = 0.77 (95% CI 0.76–0.79) for the femur, r = 0.76 (0.75–0.78) for the tibia, r = 0.80 (0.78–0.81) for leg length and r = 0.80 (0.78–0.81) for tibia + femur. The intercept for the femoral regression in females was 79.81 in our sample, compared to an intercept of 54.13 in Trotter and Gleser (1952). The slope of the femoral regression for females was 1.90 in our equation, compared to a slope of 2.47 in Trotter and Gleser (1952), see Fig. 3b.
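As a worked illustration of the femoral comparison above, the following minimal sketch evaluates the quoted intercepts and slopes for a given femur length. It assumes that femur length and stature are both expressed in centimetres, as the cm-scale intercepts suggest, and it uses only the coefficients quoted in the text. The dictionary and function names are hypothetical and chosen for this example.

```python
# Femoral regression coefficients (intercept, slope) as quoted in the text.
# Assumption: femur length and stature are both expressed in cm.
FEMUR_EQUATIONS = {
    ("male", "present study"): (77.49, 2.08),
    ("male", "Trotter and Gleser 1958"): (65.53, 2.32),
    ("male", "Trotter and Gleser 1952"): (61.41, 2.38),
    ("female", "present study"): (79.81, 1.90),
    ("female", "Trotter and Gleser 1952"): (54.13, 2.47),
}


def estimate_stature(femur_cm, sex, source):
    """Return stature (cm) predicted by a linear femoral equation."""
    intercept, slope = FEMUR_EQUATIONS[(sex, source)]
    return intercept + slope * femur_cm


def crossover_length(eq_a, eq_b):
    """Femur length (cm) at which two linear equations predict the same stature."""
    (a0, a1), (b0, b1) = eq_a, eq_b
    return (a0 - b0) / (b1 - a1)


if __name__ == "__main__":
    # Compare all quoted equations for a hypothetical femur length of 48 cm.
    for (sex, source) in FEMUR_EQUATIONS:
        print(sex, source, round(estimate_stature(48.0, sex, source), 1))
```

With these coefficients, the male equation of the present study and that of Trotter and Gleser (1958) cross at a femur length of roughly 50 cm. Below that length the present equation yields the larger estimate, above it the smaller one, which is what a larger intercept combined with a shallower slope implies.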

Average stature of the male subsample was 177 cm (± 7.4 cm SD), with a range from 148 to 202 cm. For females, average stature was 163 cm (± 6.5 cm SD), ranging from 140 to 189 cm. Detailed tables summarizing the stature distributions as well as the corresponding averages of the femur, tibia, leg and tibia + femur measurements for the male (Table 3) and female (Table 4) samples are included below.

Table 3 Distribution of stature and means of long bone measurements for male (femur, tibia + femur, leg length, tibia length) for each stature level (each cm).
Table 4 Distribution of stature and means of long bone measurements for female (femur, tibia + femur, leg length, tibia length) for each stature level (each cm).

We calculated differences between predicted stature and the mean of the clinically measured stature values for each stature category (each cm) independently, to assess the goodness of fit of the linear regression equations for the different stature categories. We found that the linear equations tended to slightly overestimate stature in short persons, and underestimate stature in tall persons, on average. For example, for male individuals who were two standard deviations (14.8 cm) shorter than the male mean (177 cm), stature was overestimated, on average, by 2.5–3.3% depending on the regression equation (4.7 cm based on the femur equation, 4.4 cm based on the leg length and the tibia + femur equations and 5.9 cm based on the tibia equation). Male individuals who were two standard deviations taller than the male mean were underestimated, on average, by 2.3–2.8% (4.9 cm based on the femur equation, 4.5 cm based on the tibia equation, 4.2 cm based on the leg length equation and 4.0 cm based on the tibia + femur equation, Fig. 4a).

Figure 4

Distribution of stature for males (a) and females (b) (black Gaussian curve) in our sample. The right vertical axes in (a) and (b) describe the number of patients with a specific stature value (rounded to cm). The left vertical axes depict the difference between predicted and mean stature value for the four regression formulae derived here (sigmoidal four parameter logistic curve): femur (grey), leg length (black), tibia + femur (black dotted) and tibia (grey dotted). For very short persons (left tail of the distributions), predicted stature was larger than the mean stature and for tall persons (right tail of the distribution) predicted stature was smaller than the mean stature for all four regression formulae. Dotted lines depict the mean as well as ± 2 standard deviations of the stature distributions.

Female individuals who were two standard deviations (13 cm) shorter than the female mean (163 cm) were overestimated, on average, by 3.1–3.7% (5.6 cm based on the femur equation, 5.2 cm based on the leg length equation, 5.1 cm using the tibia + femur equation and 6.1 cm using the tibia equation). Females who were two standard deviations taller than the female mean were underestimated, on average, by 2.6–3.0% (4.9 cm using the femur equation, 4.8 cm using the tibia equation, 4.2 cm using the leg length equation and 4.6 cm using the tibia + femur equation, Fig. 4b). The distribution of stature for males and females is visualized in the Supplementary Material (see Supplementary Fig. S1 online).
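The direction and approximate size of these deviations are what regression towards the mean would produce. The sketch below is an illustration under an added assumption, not an analysis reported in the study: if stature and bone length were bivariate normal with correlation r, the expected prediction for a person of true stature y would be μ + r²(y − μ), so the expected error at k standard deviations from the mean would be (1 − r²) · k · SD. The function name is hypothetical; the inputs are the correlations and standard deviations quoted in the text.

```python
def expected_shrinkage_cm(r, sd_cm, k=2.0):
    """Expected absolute prediction error (cm) at k SDs from the mean.

    Assumes a bivariate normal relation between stature and bone length,
    under which the mean prediction is pulled towards the sample mean
    by a factor of r**2.
    """
    return (1.0 - r ** 2) * k * sd_cm


if __name__ == "__main__":
    # Correlations and SDs quoted in the Results
    # (males: SD 7.4 cm; females: SD 6.5 cm).
    cases = [
        ("male femur", 0.82, 7.4),
        ("male tibia", 0.80, 7.4),
        ("male leg length", 0.84, 7.4),
        ("female femur", 0.77, 6.5),
        ("female tibia", 0.76, 6.5),
        ("female leg length", 0.80, 6.5),
    ]
    for label, r, sd in cases:
        print(label, round(expected_shrinkage_cm(r, sd), 1))
```

Under this idealization the expected errors at two standard deviations lie between roughly 4 and 6 cm, the same range as the observed deviations. The observed values are not symmetric between short and tall persons, which the simple model does not capture.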

Discussion

In this study, we derived new regression formulae for stature estimation based on long bone measurements, which were collected from long leg radiographs of 4200 living Austrians. Measurement was automated with the LAMA™ software19, an algorithm that places landmarks automatically using artificial intelligence.

As expected, our findings confirm that long bone lengths show a high correlation (r ≥ 0.76) with stature. Using tibia + femur or leg length resulted in a higher correlation with stature (r = 0.84 in males, r = 0.80 in females) and hence also in a better predictive capacity of the regression formula compared to formulae using femoral or tibial length alone (r = 0.80–0.82 in males, r = 0.76–0.77 in females).

Different stature estimation formulae have been described in the literature for different human populations and geographical areas, such as for Japanese20, Thai21, Portuguese5, Mexicans22, White US-Americans15 and Native North Americans23. The formulae by Trotter and Gleser (1952, 1958) are considered to be most suitable for persons of European ancestry13,24, but these formulae were established more than half a century ago. As the secular increase in stature has since led to an absolute increase in average stature in most human populations25,26,27, a review is warranted to assess whether these formulae require adjustment.

Our results show that the regression lines of the present study, which we derived from a sample of more than 4000 living Austrians, have a shallower slope and a larger intercept than the formulae derived by Trotter and Gleser (1952, 1958). We suggest that the differences in slopes and intercepts are a consequence of the ongoing secular increase in stature in Europe, where maturation occurs at increasingly younger ages and a larger absolute adult height is reached. The exploitation of the full growth potential during childhood and adolescence is likely a consequence of reduced poverty, better nutrition and better general health27. This phenomenon shifts the population distribution of stature towards higher mean values. At the same time, human bodies, and especially most of our long bones, do not generally grow isometrically18,28,29,30, which implies that the secular increase in stature likely affects the association between stature and the long bones18,29,31. In particular, the femur shows positive allometric growth18. Consequently, the secular increase in body size could be the reason for the larger intercept and shallower slope in the femoral regression formula derived in this study compared to the estimates by Trotter and Gleser (1952, 1958). An alternative explanation could be that the observed differences in intercepts and slopes are a consequence of genetic differences between samples, or of non-random sampling in earlier work. Trotter and Gleser (1952, 1958) used samples of military personnel, which might have been truncated, as those too short would not have been accepted into the military. Their female sample (Trotter and Gleser 1952) from the Terry Collection had uncommonly low stature by today's standards.

This study aimed at updating the existing linear regression formulae for stature estimation. Our results indicate that a linear formula is limited in predicting stature accurately for very short and very tall persons. A further limitation of our study is that the exact measurement method and the anatomical landmarks used differ between radiographic measurements as collected here, which are the standard in radiology, and dry bone measurements, as collected in the studies by Trotter and Gleser (1952, 1958) and as usually done in forensics. In the present study, the length measurement methods described by Waldt et al.32 were used, as this is the standard in radiological long bone measurements33,34. We believe that despite the different measurement methods for long bone length in clinical medicine vs. forensics, these formulae have the potential to be applicable in anthropology and forensics. Dry bone length will likely deviate marginally from bone length measured on radiographs because bones shrink slightly when drying (ca. 2 mm difference in long bone length between fresh and dry bone15). In addition, the position of the long bone on an osteometric board will differ marginally from the position of the femur of a person undergoing a radiograph. However, we expect the resulting measurement differences to be small. Future work could estimate the measurement error when assessing long bone length based on dry vs. wet bone vs. radiographs according to the clinical vs. forensic standard for the same person.

To conclude, we found that the regressions derived here have shallower slopes and larger intercepts compared to formulae from the literature (Trotter and Gleser 1952, 1958). We interpret these differences as a possible consequence of the secular increase in stature. Our study illustrates that AI algorithms are a promising new tool enabling large-scale measurements of bones based on radiographs.

Methods

The study was approved by the institutional ethics review board (Ethics-Committee of the Vinzenz Group EK: 46/2020) and individual informed consent was waived. All data analysed were collected as part of routine diagnosis and treatment. All experiments were performed in accordance with relevant guidelines and regulations.

Study population

Between 2015 and 2020, we performed 17,099 standing antero-posterior LLRs in the Michael Ogon Laboratory for Orthopaedic Research, Orthopaedic Hospital Speising in Vienna, Austria. LLRs and demographic patient information were collected from the institutional arthroplasty registry.

We excluded patients with artificial joints, implants, other kinds of metalwork, posttraumatic or pathologic deformities, metabolic bone diseases, LLRs with no presence or visibility of the calibration ball, patients under 18 years of age, LLRs where the algorithm was unable to identify necessary landmarks and patients where stature was not recorded. In total, 4200 LLRs were measured and included in the final analyses.

Image acquisition

LLRs were taken as part of the clinical routine, as they are a standard procedure for preoperative planning and for diagnostic purposes. All LLRs were taken on the same device (DigitalDiagnost X-Ray-System 2.1.3, Philips Healthcare Inc., Andover, MA, USA) and each included a 25 mm calibration ball marker, which was placed medially or laterally of the knee joint.

Automated measurements

Leg-Angle-Measurement-Assistant (LAMA™) software (ImageBiopsy Lab, Vienna, Austria), which automates angle and length measurements on LLRs and annotates the original DICOM images, was used in this study. This program generates numerical outputs for the three linear distance measurements tibial length, femoral length and total leg length. LAMA™ automatically localizes anatomical features of the femur and tibia as well as the calibration ball to assess the landmarks needed to estimate the measurements. The software was designed to suppress the output if landmarks cannot be placed appropriately. Length calibration was performed by segmenting the calibration ball and calculating a magnification factor based on the size of the calibration ball and the diameter of the segmentation (pixel units).
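The calibration step described above can be sketched as follows. This is a minimal illustration and not the LAMA™ implementation, whose internals are not described here. It assumes isotropic pixels and a calibration ball lying in roughly the same plane as the bone, and all function names are hypothetical. Only the 25 mm ball diameter is taken from the text.

```python
import math

# Physical diameter of the calibration ball marker (from the text).
CALIBRATION_BALL_MM = 25.0


def mm_per_pixel(ball_diameter_px, ball_diameter_mm=CALIBRATION_BALL_MM):
    """Scale factor (mm per pixel) from the segmented ball diameter in pixels."""
    if ball_diameter_px <= 0:
        raise ValueError("ball diameter in pixels must be positive")
    return ball_diameter_mm / ball_diameter_px


def landmark_distance_mm(p, q, scale):
    """Euclidean distance between two (x, y) pixel landmarks, converted to mm."""
    return math.hypot(q[0] - p[0], q[1] - p[1]) * scale


if __name__ == "__main__":
    # Hypothetical values: the ball spans 100 px and two landmarks lie 1800 px apart.
    scale = mm_per_pixel(100.0)
    print(landmark_distance_mm((0.0, 0.0), (0.0, 1800.0), scale))
```

In this hypothetical case each pixel corresponds to 0.25 mm, so a landmark distance of 1800 pixels converts to 450 mm.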

For all LLRs, the following linear distance measurements were computed (Fig. 5)32: leg length (linear distance from the top of the femoral head to the midpoint of the tibial plafond), maximum femoral length (from the top of the femoral head to the bottom of the medial femoral condyle), and tibial length (from the midpoint of the proximal tibial joint line to the midpoint of the tibial plafond).

Figure 5

Biometric linear distance measurements taken on the long leg radiographs. Leg length (orange line): distance between the top of the femoral head and the midpoint of the tibial plafond; maximum femoral length (green line): distance between the top of the femoral head and the distal portion of the medial femoral condyle; tibial length (blue line): distance between the distal portion of the medial femoral condyle and the midpoint of the tibial plafond.

Validation

The AI algorithm applied in this study was validated in a previous study on a smaller dataset of 289 LLRs and showed excellent intraclass correlation between manually and automatically measured lengths19.

Comparison to existing formulae

The formulae derived in the present study were subsequently compared to existing formulae published by Trotter and Gleser in the 1950s (Trotter and Gleser 1952, 1958). Trotter and Gleser measured samples of US military personnel from the Korean War and from World War II. Stature measurements were recorded at the time of induction into military service. Long bone measurements were conducted before final burial. Formulae for females were derived by Trotter and Gleser (1952)15 based on corrected equations from the Terry Collection samples (Smithsonian Institution, Washington D.C.).

Statistical analysis

Four ordinary least squares linear regression equations were estimated for stature as dependent variable and for femur, tibia, femur + tibia and total leg length, respectively, as predictor variable. Regressions were estimated separately for males and females. Correlation coefficients between stature and the three variables, leg length, femur length, and tibia length, respectively, were calculated, separately for males and females.
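The regression and correlation steps can be sketched in a few lines. The analyses in this study were run in SPSS; the code below is only a dependency-free illustration of the same technique on synthetic data. The function names are hypothetical, and the Fisher z-transformation used for the confidence interval is an assumption of this example, since the text does not state how the reported intervals were computed.

```python
import math
import random


def fit_stature_regression(bone_cm, stature_cm):
    """OLS fit of stature on one predictor; returns (intercept, slope, r)."""
    n = len(bone_cm)
    mean_x = sum(bone_cm) / n
    mean_y = sum(stature_cm) / n
    sxx = sum((x - mean_x) ** 2 for x in bone_cm)
    syy = sum((y - mean_y) ** 2 for y in stature_cm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bone_cm, stature_cm))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for Pearson's r via the Fisher z-transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    # Synthetic data only: femur lengths (cm) and noisy stature values (cm).
    rng = random.Random(0)
    femur = [rng.gauss(47.0, 2.5) for _ in range(500)]
    stature = [77.49 + 2.08 * f + rng.gauss(0.0, 4.0) for f in femur]
    intercept, slope, r = fit_stature_regression(femur, stature)
    print(intercept, slope, r, pearson_ci(r, len(femur)))
```

Running the fit separately on the male and female subsets, once per predictor (femur, tibia, femur + tibia, total leg length), would reproduce the structure of the analysis described above.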

To assess the goodness of fit of the linear regression equations, we computed differences between the predicted value and the mean of the clinically measured stature values for each stature category (for each cm). The resulting differences capture how well the linear model approximates the mean of the clinical measurements for each stature category. To plot the resulting differences, they were approximated by a logistic sigmoidal function (4 parameter logistic regression).
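The per-category comparison can be sketched as follows. This is a minimal illustration with hypothetical function and parameter names, not the procedure as run in GraphPad Prism. It bins patients by measured stature rounded to the nearest cm and averages the prediction error within each bin. The second function gives one common parameterization of a four-parameter logistic curve; estimating its parameters would require a nonlinear least squares routine, which is not shown.

```python
import math
from collections import defaultdict


def mean_error_by_stature_cm(stature_cm, predicted_cm):
    """Mean of (predicted - measured) stature within each 1 cm stature category."""
    errors = defaultdict(list)
    for measured, predicted in zip(stature_cm, predicted_cm):
        category = math.floor(measured + 0.5)  # round half up to the nearest cm
        errors[category].append(predicted - measured)
    return {cm: sum(v) / len(v) for cm, v in sorted(errors.items())}


def four_parameter_logistic(x, bottom, top, midpoint, hill):
    """4PL curve running from `bottom` to `top`, halfway between them at `midpoint`."""
    return bottom + (top - bottom) / (1.0 + (x / midpoint) ** (-hill))


if __name__ == "__main__":
    # Hypothetical measured and predicted statures (cm) for three patients.
    measured = [160.2, 160.4, 175.0]
    predicted = [163.2, 162.4, 174.0]
    print(mean_error_by_stature_cm(measured, predicted))
```

A positive mean error in a category indicates that the linear equation overestimates stature there, and a negative value indicates underestimation.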

P values < 0.05 were considered statistically significant throughout the study. All analyses were performed using IBM-SPSS® version 25 and GraphPad Prism® version 8.