Introduction

Anthropometric measures are routinely assessed in sports and rehabilitative medicine because of their usefulness for performance optimization and (re-)injury prevention1. In fact, measurements such as the limb circumferences and skinfold thicknesses are widely used in young and adult athletes to assess the body composition (e.g., fat mass percentage, lean mass, muscle mass)2,3,4,5,6 and its training-related changes7,8,9,10,11. Because tape-based measurements of body circumferences and caliper-based measurements of skinfold thicknesses may not be culturally or socially acceptable and also exhibit poor reliability, especially in persons with overweight and obesity, there was the need for the development and validation of reproducible, valid, and cost-effective technologies suitable to perform non-invasive anthropometric and body composition assessments12,13. Digital anthropometry by three-dimensional optical imaging systems has recently been shown to provide non-invasive, precise, and accurate measurements12,13,14,15,16. However, three-dimensional imaging systems are still unavailable in most clinical and sports settings, whereas digital consumer cameras and smartphones (with their mobile applications using high-resolution imaging) became pervasive and offer new tools that can be used by physicians to evaluate patients and by exercise scientists to evaluate athletes. Tian et al.17 were the first to propose and validate an algorithm for predicting three-dimensional body shape and composition from a single frontal bi-dimensional image acquired with a digital consumer camera. 
Recent studies performed in healthy adults showed that smartphone-based assessment of body circumferences18,19,20 and different body composition variables (i.e., body fat percentage, visceral adipose tissue, fat free mass, appendicular lean mass)18,21,22,23,24,25 was feasible (through two or four photographs) and produced reliable and valid measurements with respect to dual-energy X-ray absorptiometry (DXA)18,21,22,25 or a rapid four-compartment model23,24. These measurements can be particularly useful in young athletes to optimize nutritional programs and to help identify underlying medical problems (e.g., eating disorders)26. However, to our knowledge, no previous study has performed smartphone-based digital anthropometric assessments in young athletes. Therefore, the aim of this study was to investigate the reproducibility and validity of smartphone-based estimation of clinically relevant anthropometric and body composition parameters in a large group of youth soccer players.

Methods

Participants and protocol

The study setting was a sports medicine and rehabilitation center where a convenience sample of 124 male soccer players [median age (1st–3rd quartile): 16.2 (15.0–18.4) years; body mass index: 21.4 (20.1–22.8) kg/m2] and 69 female soccer players [age: 15.5 (14.4–16.9) years; body mass index: 20.4 (19.6–21.4) kg/m2] were recruited to participate. Most players (N = 181) were Caucasian and the remaining players (N = 12) were African. The single study visit was performed as part of the preseason investigations and included measurements of body weight and height, a whole-body DXA scan, and the acquisition of optical images.

All subjects (or their parents in case of underage subjects) gave their written consent for study participation and publication of identifying information/images after receiving a detailed explanation of the protocol.

Measurements

Body weight and height were measured (to the nearest 0.1 kg and 0.5 cm, respectively) while the subject was barefoot and dressed in undergarments, using a standard scale with stadiometer (model Seca 799, Seca GmbH & Co. KG, Hamburg, Germany).

One whole-body DXA scan was performed on a Lunar iDXA system (GE Healthcare, Chicago, IL, USA) according to a standardized protocol. Duplicate DXA scans were not acquired, in order to limit the radiation dose. The output from the DXA scan included the following body composition measurements: (i) total fat mass and its percentage, (ii) total lean mass (i.e., whole-body soft lean mass plus bone mineral content), (iii) arms lean mass (i.e., the soft lean mass of the upper limbs), (iv) legs lean mass (i.e., the soft lean mass of the lower limbs), (v) appendicular lean mass (i.e., the sum of the soft lean mass of the upper and lower limbs).

Optical images were taken with the Mobile Fit app (version 3.0, Size Stream LLC, Cary, NC, USA) using a standardized positioning protocol. Voice commands from the app guided each player into position for the self-scan: as shown in Fig. 1A,D, the player was asked to assume a “front A-pose” (and to maintain the pose without movements of the trunk or limbs) to capture the frontal image. Next, the player was asked to assume a “side pose” (Fig. 1B,E) to capture the lateral image. After the image capture, the app software generated a de-identified 3D humanoid avatar (Fig. 1C,F) with associated anthropometric measurements and body composition estimates. The acquisition of the frontal and lateral images was performed in duplicate to obtain two avatars for each player. The image acquisition was further repeated if the experimenters noticed either movements of the trunk or limbs during the frontal image capture or the improper placement of the upper limbs and hands (as shown in a representative example in Fig. 1E) during the lateral image acquisition. In fact, body movements or an improper placement of the upper limbs can produce changes in the shape of the avatar (for example, the arm-blocking-back artefact and the hand-on-thigh artefact) that ultimately result in biased estimation of different body circumferences.

Figure 1

(A–C) Representative example of the acquisition of the frontal (A) and lateral (B) images in a male soccer player and the resulting avatar (C) with the following measurements: average of the right and left arm circumference: 32.3 cm, waist circumference: 87.9 cm, hip circumference: 100.8 cm, average of the right and left thigh circumference: 54.8 cm. (D–F) The improper placement of the upper limbs during the acquisition of the lateral image (E) in the same player shown in (A,B) changed the shape of the avatar (F) and biased the estimation of different body circumferences, as follows: average of the right and left arm circumference: 30.7 cm, waist circumference: 81.5 cm (i.e., the waist circumference difference between the two avatars was ~ 6 cm), hip circumference: 99.3 cm, average of the right and left thigh circumference: 53.9 cm.

The Mobile Fit app report includes 243 whole-body and segmental circumferences, lengths, surface areas, and volumes. Body composition estimates can be derived from some of these measures using the previously published equations for estimation of the body fat percentage27 and appendicular lean mass25 (Table 1).

Table 1 Prediction equations for the body fat percentage and appendicular lean mass.

The following avatar-derived anthropometric and body composition parameters were considered for reproducibility and validity assessments (see below): body surface area, waist circumference, hip circumference, averages of the two sides for the arm, thigh, and calf circumferences, fat mass and its percentage, total lean mass, and appendicular lean mass.

Statistical analysis

Prior to statistical analyses, outliers of DXA-derived and Mobile Fit app-derived measurements were removed using the Grubbs’ outlier test (alpha = 0.05)28 as a part of preprocessing quality control: 2 male players were identified as outliers (1 for DXA-derived measurements, 1 for app-derived measurements) and their measurements removed.

Normality of the data distributions was assessed with the Shapiro–Wilk test and parametric statistical tests (paired sample T test, Pearson correlation analysis) were used.

Reproducibility assessment

Changes in anthropometric parameters and body composition measurements between the two avatars were analyzed with the paired sample T test to assess the presence of systematic bias. Measurement reproducibility (i.e., the extent to which scores on repeated measurements, avatar 1 vs avatar 2, are close to each other) was evaluated using: (i) root mean square error (RMSE); (ii) root mean square coefficient of variation (RMS-%CV); (iii) standard error of measurement (SEM), calculated as the square root of the mean square error term from the analysis of variance (ANOVA) and expressed relative to the median value between the two avatars [√(mean square error)/median]: a value < 10% was considered low (i.e., high agreement between different measurements)29; (iv) smallest detectable change (SDC: the smallest individual change in a score that can be interpreted as a real change), calculated as 1.96 × √2 × SEM29.
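The four reproducibility indices can be illustrated with a minimal Python sketch for one variable measured on two consecutive avatars. It is not the authors' code and rests on stated assumptions: RMSE is taken as the root mean square of the between-avatar differences, the per-player coefficient of variation uses the duplicate standard deviation |d|/√2, the ANOVA error term for two trials is computed as var(d)/2, and the median is taken over the pooled values of both avatars. The function name and the example circumferences are hypothetical.

```python
import math
import statistics


def reproducibility(first, second):
    """Return RMSE, RMS-%CV, SEM, SEM% and SDC for paired duplicate measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)

    # Assumption: RMSE is the root mean square of the between-avatar differences.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Per-player SD of a duplicate is |d| / sqrt(2); CV is that SD over the pair mean.
    cvs = [
        100.0 * (abs(d) / math.sqrt(2)) / ((a + b) / 2)
        for a, b, d in zip(first, second, diffs)
    ]
    rms_cv = math.sqrt(sum(cv * cv for cv in cvs) / n)

    # With two trials, the repeated-measures ANOVA residual mean square is var(d) / 2.
    sem = statistics.stdev(diffs) / math.sqrt(2)
    # Assumption: the reference median is taken over the pooled values of both avatars.
    sem_pct = 100.0 * sem / statistics.median(list(first) + list(second))

    # Smallest detectable change at the individual level.
    sdc = 1.96 * math.sqrt(2) * sem
    return {"rmse": rmse, "rms_cv": rms_cv, "sem": sem, "sem_pct": sem_pct, "sdc": sdc}


# Hypothetical waist circumferences (cm) from two consecutive avatars.
avatar_1 = [78.2, 81.5, 74.9, 85.1, 79.8]
avatar_2 = [77.6, 82.3, 75.4, 84.2, 80.1]
result = reproducibility(avatar_1, avatar_2)
print({name: round(value, 2) for name, value in result.items()})
```

In this sketch the SDC is returned in the units of the measurement, so an individual change between two visits would be read as real only when it exceeds that value.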

Validity assessment

The following comparisons between the DXA-derived and Mobile Fit app-derived measurements (the average of the two avatars was considered) were performed: paired sample T test, absolute average differences obtained from the Bland–Altman plots, Pearson correlation analysis, Passing-Bablok regression analysis. The latter analysis was specifically designed to compare two data sets, given by two different methods, both affected by experimental errors30,31. If zero is not part of the 95% confidence interval of the intercept of the regression line, it may be concluded that systematic differences exist between the data acquired by the two methods. If 1 is not part of the 95% confidence interval of the slope of the regression line, it may be concluded that proportional differences exist between the compared data sets. For these conclusions to be valid, Passing and Bablok recommended a sample size of at least 50 subjects30,31. The Passing-Bablok regression can only be applied if the data are well-fitted by a linear model: therefore, the Cusum test for linearity was also performed.
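The decision rules above can be sketched in Python. This is an illustrative implementation of the shifted-median estimator described by Passing and Bablok, not the MedCalc routine used in the study; it omits the Cusum linearity test, and the function names and the example values are hypothetical.

```python
import math
import statistics


def passing_bablok(x, y, z=1.96):
    """Return slope, intercept and their approximate 95% confidence intervals."""
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            slope = math.copysign(math.inf, dy) if dx == 0 else dy / dx
            if slope != -1:  # slopes of exactly -1 are discarded by the method
                slopes.append(slope)
    slopes.sort()
    count = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median

    # Rank bounds of the confidence interval for the slope.
    c = z * math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    m1 = round((count - c) / 2)
    m2 = count - m1 + 1
    if m1 < 1 or m2 + k > count:
        raise ValueError("too few usable points for a confidence interval")

    if count % 2:
        slope = slopes[(count - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[count // 2 - 1 + k] + slopes[count // 2 + k])
    slope_ci = (slopes[m1 - 1 + k], slopes[m2 - 1 + k])

    def intercept_for(b):
        return statistics.median(yi - b * xi for xi, yi in zip(x, y))

    intercept = intercept_for(slope)
    intercept_ci = (intercept_for(slope_ci[1]), intercept_for(slope_ci[0]))
    return slope, intercept, slope_ci, intercept_ci


def interpret(slope_ci, intercept_ci):
    """Apply the rules from the text: 0 outside the intercept CI, 1 outside the slope CI."""
    systematic = not (intercept_ci[0] <= 0 <= intercept_ci[1])
    proportional = not (slope_ci[0] <= 1 <= slope_ci[1])
    return systematic, proportional


# Hypothetical appendicular lean mass values (kg): DXA as x, app as y.
dxa = [22.1, 24.5, 25.3, 26.8, 27.4, 28.9, 29.6, 30.2, 31.7, 32.5, 33.8, 35.0]
app = [20.5, 22.0, 23.9, 24.1, 25.6, 26.0, 27.7, 27.5, 29.0, 29.4, 30.1, 31.2]
b, a, b_ci, a_ci = passing_bablok(dxa, app)
print("slope", b, b_ci, "intercept", a, a_ci, interpret(b_ci, a_ci))
```

The example uses only 12 pairs to stay short; as noted above, at least 50 subjects are recommended for the conclusions to be valid.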

Equation reparameterization and predictors reduction

The fit linear regression model (“fitlm” command in Matlab) was used to develop population-specific equations for the body fat percentage estimation (we adopted the same predictors previously identified by Harthy et al.27) and the appendicular lean mass estimation (we adopted the same predictors recently identified by McCarthy et al.25). Moreover, the stepwise linear regression was also adopted to identify the minimum number of statistically significant appendicular lean mass predictors (among those selected by McCarthy et al.25) in order to avoid model overfitting32,33.
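Reparameterization here means keeping the published predictors and refitting their coefficients on the new sample. The following Python sketch shows the idea with ordinary least squares solved through the normal equations; it is an assumption-laden illustration rather than the Matlab analysis itself, and the predictors (two circumferences), the target values, and the function names are hypothetical. One property worth noting is that a least-squares fit with an intercept always has residuals that average to zero on the sample used for fitting, so an in-sample average difference of 0.0 does not by itself demonstrate validity.

```python
def fit_linear_model(rows, target):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    p = len(design[0])
    # Build the augmented system (X'X | X'y).
    system = [
        [sum(r[a] * r[b] for r in design) for b in range(p)]
        + [sum(r[a] * t for r, t in zip(design, target))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(p):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [v - factor * w for v, w in zip(system[r], system[col])]
    return [system[i][p] / system[i][i] for i in range(p)]


def predict(coefficients, row):
    """Evaluate intercept + sum(coefficient * predictor) for one player."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Hypothetical predictors (waist, thigh circumference in cm) and DXA target (kg).
predictors = [(78.0, 52.1), (81.5, 54.8), (74.9, 50.3),
              (85.1, 56.0), (79.8, 53.5), (83.0, 55.9)]
dxa_alm = [24.1, 26.8, 22.5, 27.9, 25.2, 27.5]

coefficients = fit_linear_model(predictors, dxa_alm)
residuals = [t - predict(coefficients, row) for row, t in zip(predictors, dxa_alm)]
# The in-sample mean residual is zero up to rounding error.
print("coefficients:", coefficients)
print("mean residual: %.3f" % (sum(residuals) / len(residuals)))
```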

Data were expressed as median and 1st–3rd quartile and were represented with violin plots showing the probability density functions of the data sets. The threshold for statistical significance was set to P = 0.05. Statistical tests were performed with the Matlab (MathWorks, Inc., Natick, MA, USA), MedCalc v. 20.218 (MedCalc Software Ltd, Ostend, Belgium), and SPSS v. 20.0 (SPSS Inc., Chicago, IL, USA) software packages.

Ethical approval

The study conformed to the guidelines of the Declaration of Helsinki and was approved by the ethics committee of the University of Turin (protocol n. 0574321).

Results

The reproducibility analysis was performed in a total sample of 186 players (image processing errors led to the rejection of the first acquisition for one female player and of the second acquisition for three male players and one female player).

No significant differences were obtained between the two avatars for all anthropometric and body composition measurements (paired T test: P > 0.05 for all comparisons—Fig. 2).

Figure 2

Violin plots of the anthropometric and body composition measurements obtained for the two avatars in the whole group of male and female soccer players. Error bars indicate the median values and the interquartile ranges.

As shown in Table 2, the RMSE values were in the range 0.7–3.4 cm for the circumference measurements (the value obtained for the body surface area was 0.03 m2) and in the range 0.3–1.0 kg for the body composition estimates (the value obtained for the body fat percentage was 1.7%). The RMS-%CV values were in the range 1.2–2.6% for the anthropometric measurements and in the range 1.1–5.2% for the body composition estimates. The SEM values were < 10% for all anthropometric and body composition measurements. The SDC values were in the range 1.2–6.6 cm for the circumference measurements (the value obtained for the body surface area was 0.06 m2) and in the range 0.6–1.9 kg for the body composition estimates (the value obtained for the body fat percentage was 3.3%).

Table 2 Reproducibility analysis results.

The agreement between DXA and Mobile Fit app was analyzed in a total sample of 122 male players and 69 female players (after exclusion of 2 outliers).

As shown in Fig. 3, the paired T test showed significant differences (P < 0.0001 for all comparisons) between DXA and Mobile Fit app (data obtained for the two avatars were averaged) for all body composition measurements. The Bland–Altman plots (Fig. 4) showed that Mobile Fit app overestimated the body fat percentage with respect to DXA (the average overestimation was + 3.7% in males and + 4.6% in females), while it underestimated the total lean mass (− 2.6 kg in males and − 2.5 kg in females) and the appendicular lean mass (− 10.5 kg in males and − 5.5 kg in females) with respect to DXA.

Figure 3

Violin plots of the DXA-derived and Mobile Fit app-derived body composition measurements (data for the two avatars were averaged) obtained in the whole group of male and female soccer players. Error bars indicate the median values and the interquartile ranges.

Figure 4

Bland–Altman plots of differences vs. means of the body fat percentage, total lean mass, and appendicular lean mass in male players (left column) and in female players (right column). In each plot, the solid horizontal line depicts the mean of the differences, whereas dashed horizontal lines represent the upper and lower limit of agreement (SD: standard deviation of the differences). The error bar displayed on each horizontal line represents the 95% confidence interval of the corresponding quantity. The dashed-dotted linear regression line (sandwiched between its 95% confidence interval curves) is indicative of proportional bias (whenever its slope is different from zero).
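The quantities drawn in a Bland–Altman plot can be computed with a short Python sketch. It assumes that differences are taken as app minus DXA (so that a positive bias means overestimation by the app) and that the limits of agreement use the conventional multiplier of 1.96; the function name and the example values are hypothetical.

```python
import statistics


def bland_altman(reference, test):
    """Return bias, limits of agreement and the slope of differences vs. means."""
    if len(reference) != len(test) or len(reference) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    # Assumption: difference = test method (app) minus reference method (DXA).
    diffs = [t - r for r, t in zip(reference, test)]
    means = [(t + r) / 2 for r, t in zip(reference, test)]

    bias = statistics.fmean(diffs)   # solid horizontal line
    sd = statistics.stdev(diffs)     # SD of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)  # dashed horizontal lines

    # Least-squares slope of differences on means: non-zero suggests proportional bias.
    centre = statistics.fmean(means)
    sxx = sum((m - centre) ** 2 for m in means)
    sxy = sum((m - centre) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return {"bias": bias, "sd": sd, "limits": limits, "slope": slope}


# Hypothetical body fat percentages: DXA as reference, app as test method.
dxa_fat = [14.2, 16.8, 18.1, 20.5, 22.9, 25.3]
app_fat = [19.0, 21.1, 21.9, 23.8, 25.7, 27.6]
print(bland_altman(dxa_fat, app_fat))
```

In the hypothetical example the overestimation shrinks as body fat increases, which gives a negative slope under this sign convention; the sign of the slope therefore depends on how the differences are defined.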

As shown in Fig. 5, we found significant positive correlations between DXA and Mobile Fit app for all body composition measurements: the R2 values ranged between 0.16 and 0.89 in male players and between 0.25 and 0.74 in female players. As shown in Table 3, the P values obtained by the Cusum test for linearity (range of P values between 0.18 and 0.65) indicated that the Passing-Bablok regression was applicable. The regression analyses in male players showed no systematic differences between DXA and Mobile Fit app for any body composition measurement (Table 3: the 95% confidence intervals of the regression intercepts included the 0 value for all measurements). However, proportional differences between DXA and Mobile Fit app were observed for the body fat percentage and appendicular lean mass (Fig. 5A,E and Table 3: the 95% confidence intervals of the regression slopes did not include the 1 value). The regression analyses in female players showed systematic differences between DXA and Mobile Fit app for the body fat percentage and the appendicular lean mass (Fig. 5B,F and Table 3: the 95% confidence intervals of the regression intercepts did not include the 0 value). Moreover, proportional differences between DXA and Mobile Fit app were also observed for the appendicular lean mass (Fig. 5F and Table 3: the 95% confidence interval of the regression slope did not include the 1 value). The Bland–Altman plots (Fig. 4) confirmed the presence of proportional biases between DXA and Mobile Fit app in both male and female players: in fact, significant (P < 0.0001) positive correlations were obtained between the differences and means for the body fat percentage in male players (i.e., the higher the body fat percentage, the lower the Mobile Fit app overestimation with respect to DXA) and for the appendicular lean mass in both groups of players (i.e., the higher the appendicular lean mass, the higher the Mobile Fit app underestimation with respect to DXA).

Figure 5

Relations between DXA-derived and Mobile Fit app-derived measurements of body fat percentage, total lean mass, and appendicular lean mass investigated through the Passing-Bablok regression analysis in male players (left column) and in female players (right column). The regression plots include the line of identity and the generated regression line along with the R2 and P value (obtained through the Pearson correlation).

Table 3 Results of the Passing-Bablok regression analyses.

Using the data of the soccer players, we reparameterized the equation previously proposed by Harthy et al.27 and obtained the population-specific equation reported in Table 1 (equation #1). The reparameterization minimized the differences between the DXA-derived and Mobile Fit app-derived estimations of body fat percentage (male players: average difference of 0.0% with min–max differences of − 6.5% and 10.6%; female players: average difference of 0.0% with min–max differences of − 6.2% and 6.7%). However, the Passing-Bablok regression analysis (Cusum tests for linearity: P value of 0.18 in male players and 0.29 in female players) showed systematic and proportional differences in both male players (Fig. 6A: 95% confidence interval of the regression intercept: − 66.2 to − 25.8; 95% confidence interval of the regression slope: 3.0 to 6.1) and female players (Fig. 6B: 95% confidence interval of the regression intercept: − 90.0 to − 42.4; 95% confidence interval of the regression slope: 2.7 to 4.7).

Figure 6

Relations between DXA-derived and Mobile Fit app-derived (after equation reparameterization in top and middle panels and after equation reparameterization and predictors reduction in bottom panels) measurements of the body fat percentage and appendicular lean mass investigated through the Passing-Bablok regression analysis in male players (left column) and female players (right column). The regression plots include the line of identity and the generated regression line along with the R2 and P value (obtained through the Pearson correlation).

Using the data of the soccer players, we reparameterized the equations recently proposed by McCarthy et al.25 and obtained the population-specific equations reported in Table 1 (equation #2 for males and equation #4 for females). The reparameterization minimized the differences between the DXA-derived and Mobile Fit app-derived estimations of the appendicular lean mass (male players: average difference of 0.0 kg with min–max differences of − 4.4 kg and 3.1 kg; female players: average difference of 0.0 kg with min–max differences of − 3.3 kg and 2.1 kg). As shown in Fig. 6C,D, significant positive correlations were obtained between DXA-derived and Mobile Fit app-derived estimations: the R2 value was 0.91 for male players (higher than R2 of 0.86 of Fig. 5E) and 0.78 for female players (higher than R2 of 0.62 of Fig. 5F). The Passing-Bablok regression analysis (Cusum tests for linearity: P value of 0.66 in male players and 0.45 in female players) showed no systematic and no proportional differences in male players (Fig. 6C: 95% confidence interval of the regression intercept: − 3.7 to 0.1; 95% confidence interval of the regression slope: 1.0 to 1.1), while systematic and proportional differences were still observed in female players (Fig. 6D: 95% confidence interval of the regression intercept: − 7.4 to − 1.0; 95% confidence interval of the regression slope: 1.1 to 1.3).

Stepwise linear regression made it possible to identify a limited number of appendicular lean mass predictors (shown in Table 1: equation #3 with 6 predictors for males and equation #5 with 4 predictors for females) among those selected by McCarthy et al. (13 predictors for males and 12 predictors for females)25. The differences between the DXA-derived and Mobile Fit app-derived estimations of the appendicular lean mass were minimized (male players: average difference of 0.0 kg with min–max differences of − 4.4 kg and 3.1 kg; female players: average difference of 0.0 kg with min–max differences of − 3.7 kg and 2.0 kg). As shown in Fig. 6E,F, significant positive correlations were obtained between DXA-derived and Mobile Fit app-derived estimations: the R2 values were 0.91 for male players (equal to R2 value of panel C) and 0.77 for female players (comparable to R2 value of panel D). The Passing-Bablok regression analysis (Cusum tests for linearity: P value of 0.66 in male players and 0.65 in female players) showed no systematic and no proportional differences in male players (Fig. 6E: 95% confidence interval of the regression intercept: − 3.7 to 0.1; 95% confidence interval of the regression slope: 1.0 to 1.1), while systematic and proportional differences were still observed in female players (Fig. 6F: 95% confidence interval of the regression intercept: − 6.9 to − 1.2; 95% confidence interval of the regression slope: 1.1 to 1.3).

Discussion

In the present study we investigated the reproducibility and validity of smartphone-based estimation of clinically relevant anthropometric and body composition parameters in a large group of youth soccer players. The main results of this study can be summarized as follows: (i) Mobile Fit app provided precise measurements of body size and composition; (ii) Mobile Fit app overestimated the body fat percentage with respect to DXA, while it underestimated the total and the appendicular lean masses in both male and female players; (iii) reparameterization of the equations previously proposed to estimate the body fat percentage27 and the appendicular lean mass25 minimized the differences between the DXA-derived and Mobile Fit app-derived estimations.

The demonstration of high agreement between the measurements obtained for consecutive avatars confirms previous studies on the clinical application of smartphone-based digital imaging analysis that were performed in healthy adults through different commercially available tools19,21,22,23. To our knowledge, this study is the first to investigate youth soccer players and it is also the first to perform smartphone-based analysis in a clinical setting, outside well-controlled laboratory settings. The smartphone app evaluated in this study is similar to the MeThreeSixty app evaluated in previous studies19,20,24: both apps capture a series of bidimensional photographic silhouettes that are then extracted and linked to a tridimensional template mesh using artificial intelligence and machine learning algorithms. Although these apps are easy to use and consecutive scans can easily be acquired, a practical implication of the observed findings is that duplicate (or triplicate) acquisition of front and side images is not mandatory in a clinical setting: if the measurement conditions (e.g., lighting) and the participant attire and pose are appropriate, a single acquisition can provide robust body size measurements. Another implication of the observed high reproducibility of measurements is that the app we evaluated can be used to monitor the body size changes in response to interventions (e.g., training, diet) and the SDC values reported in Table 2 will help clinicians and trainers interpret the clinical meaning of the body circumference and composition changes over time at the individual level.

The smartphone app evaluated in this study uses the prediction equation for body fat percentage previously developed by Harthy et al.27 through a gold standard four-compartment body composition model. Previous studies demonstrated its accuracy in healthy adults and its underestimation of the body fat percentage in individuals with higher degrees of adiposity (i.e., percentages above 30%)24,27. Conversely, we found that the app overestimated the body fat percentage with respect to DXA in both male and female players.

The smartphone app evaluated in this study uses the prediction equation for appendicular lean mass recently proposed by McCarthy et al.25 who demonstrated its accuracy with respect to DXA in small groups of adult men and women. Conversely, we found that the app underestimated the appendicular lean mass in both male and female players.

Differences in age and variability in body size and composition between the previously investigated populations and our group of athletes are possible explanations for the discrepancies between the previous24,25,27 and the present results. A methodological implication of these findings is that the anthropometrics-based prediction models obtained in healthy adults or in persons with obesity cannot be applied to assess the body composition in young athletes. Consistently, other body composition assessment approaches (e.g., bioimpedance analysis) require the selection of “normal” or “athletic” settings as different prediction models were developed and validated for “normal” and “athletic” subjects.

The reparameterization of the two equations and the predictors reduction for the McCarthy’s equation25 were performed as a preliminary effort to develop population-specific equations for the body fat percentage and the appendicular lean mass estimations. Although the reparameterization minimized the differences between the DXA-derived and Mobile Fit app-derived estimations, systematic and proportional differences between the DXA-derived and the app-derived estimations were still observed in male and female players for the body fat percentage and in female players for the appendicular lean mass. Further studies in large groups of soccer players are required to validate in independent data sets the population-specific models presented in this manuscript.

Further studies are also required to develop new models for body composition prediction in young athletes. It is well known that they undergo changes not only in body size but also in body shape that are driven by both skeletal growth and sport-specific muscle size adaptations. For instance, soccer players tend to have an hourglass body shape with a larger proportion of their appendicular lean mass in the proximal region of the lower extremities (i.e., anterior and posterior thigh). It can therefore be hypothesized that body shape-based models could be required in young athletes to capture information about body composition beyond conventional anthropometric measurements, as has already been observed in healthy adults34,35,36.

This study has several limitations. First, only one smartphone app was adopted for digital anthropometric assessment of soccer players: therefore, our results (high reproducibility of the app-derived measurements, app-derived overestimation of the body fat percentage and underestimation of the appendicular lean mass with respect to DXA) and the anthropometrics-based prediction models presented in Table 1 are device-specific and are not generalizable beyond the MeThreeSixty and Mobile Fit apps. Second, we assessed the agreement between the measurements obtained for consecutive avatars, but the short-term and long-term reproducibility of the measurements was not investigated. Third, we used DXA instead of gold-standard approaches (i.e., four-compartment model for the body fat percentage estimation and whole-body magnetic resonance imaging for appendicular lean mass estimation) to derive the body composition estimations used to perform the validity assessments. Fourth, we did not control for factors that might influence body composition (such as menstrual cycle phase in female players, hydration status in both male and female players) and might thus add variability to measurements and predictions.

In conclusion, this study showed that the digital anthropometric assessment with a smartphone app in youth soccer players provided precise measurements of body size and composition. Previously proposed anthropometrics-based prediction models obtained in adults cannot be applied to assess body composition in young athletes: population-specific models are therefore required given their body composition and shape features.