Introduction

High body fat percentage (adipose tissue mass relative to total body weight) is associated with mortality1,2. Recently, a large cohort study in adult individuals with a follow-up of 14 years reported that low baseline body mass index (BMI, weight in kilograms divided by the square of the height in meters) and high body fat percentage are independently associated with increased mortality3. Thus, accurate estimation of body fat percentage is highly relevant from a clinical and public health perspective, an aspect that has been endorsed by the American Heart Association Obesity Committee4.

Obesity, a state of excessive accumulation of body fat, is an important risk factor for multiple chronic pathologies including diabetes, coronary artery disease, hypertension and certain types of cancer5,6,7. Interestingly, the definition of obesity has changed over the last century. For example, early reports have defined obesity as the 20% to 40% excess of weight over the normal of 300 grams per centimeter of height8. Others have arbitrarily proposed body fat-defined obesity as a body fat percentage >35% for women and >25% for men9. To date, there is no consensus for the definition of obesity based on body fat percentage10,11. A BMI ≥30 is currently used to define obesity12. In fact, BMI is widely used to assess body fatness12,13, despite its limited accuracy to estimate body fat percentage9,14,15. An inherent problem of BMI due to its limited accuracy to estimate body fat percentage is misclassification of body fat-defined obesity. For example, a BMI ≥30 would overlook nearly 50% of women who had a body fat percentage higher than 35%9. Among the participants of the Third National Health and Nutrition Examination Survey, the diagnostic accuracy of BMI for body fat-defined obesity was estimated at 94% among women compared with 82% among men9. Thus, simple and low-cost alternatives to BMI with better diagnostic accuracy for obesity in both sexes would be of considerable importance.

Although several sophisticated techniques are available to obtain accurate estimates of whole-body fat percentage16, these methods are unsuitable for routine clinical purposes and large population studies. Consequently, numerous equations based on anthropometrics have been proposed as alternatives to BMI to better estimate whole-body fat percentage17,18,19,20,21,22,23,24,25. Some published equations require more than 10 different anthropometric measurements19, others require up to four different skinfold measurements21, and some are relatively complex equations with numerous terms20,25. Thus, one common problem among existing equations is their lack of simplicity, which limits their potential for use in routine clinical practice or public health.

In the present study, we systematically explored more than 350 anthropometric indices aiming to identify a simple anthropometric linear equation that is more accurate than the BMI as a potential alternative tool for clinical and epidemiological purposes to estimate whole-body fat percentage among female and male adult individuals. The second aim of the study was to evaluate its clinical utility.

Results

Study population

We included for analysis data from adult individuals 20 years of age and older who participated in the National Health and Nutrition Examination Survey (NHANES) 1999–2006. NHANES 1999–2004 data (n = 12,581) were used for model development and NHANES 2005–2006 data (n = 3,456) were used for model validation. Participant selection for the development and validation datasets is shown in Fig. 1. Characteristics of the participants studied are described in Table 1. Mean values of whole-body fat percentage measured by dual energy X-ray absorptiometry (DXA) in the development and validation datasets were 39.9% and 39.4% in women, and 28.0% and 27.8% in men, respectively. The frequencies of DXA multiply imputed data in the development and validation datasets are described in Supplementary Tables 1 and 2, respectively.

Figure 1

Flow diagram of participant selection for the development and validation datasets. DXA, dual energy X-ray absorptiometry.

Table 1 Characteristics of adult individuals (≥20 years old) included in the study*.

Model development, performance, and selection

Supplementary Table 3 shows the correlation matrix among the commonly used anthropometrics, including body weight, height, BMI, triceps and subscapular skinfolds, arm and leg lengths, and waist, calf, arm and thigh circumferences. Since arm and leg lengths showed poor correlation with body fat percentage, they were excluded from further analysis. In total, 365 anthropometric indices were empirically generated and tested for correlation with body fat percentage (see Supplementary Table 4 for a full list of all indices generated).

Equations were derived using linear regression. Our selected regression models included those based on the simplest indices with the highest correlation with body fat percentage among women and among men. Among the 365 generated indices, height³/(waist × weight) showed the highest correlation with whole-body fat percentage among women (r = −0.81; P < 0.001). The (√height)/waist index showed the highest correlation with whole-body fat percentage among men (r = −0.85; P < 0.001). Among women, height³/(waist × weight) showed a slightly stronger correlation than the simpler 1/BMI (r = −0.79; P < 0.001). Among men, (√height)/waist showed a slightly stronger correlation than the simpler index height/waist (r = −0.84; P < 0.001). Height²/(waist × √weight) showed high correlations among both women and men. Thus, we finally selected the five aforementioned indices to evaluate model performance.

Given that height/waist is the reciprocal of the widely used waist-to-height ratio, we also examined the predicting ability of the waist/height index. Height/waist better predicted whole-body fat percentage and showed lower root mean squared error (RMSE) than waist/height among men and women, and across ethnic groups (Supplementary Table 5) and age categories (Supplementary Table 6). Thus, we dropped waist/height from further analysis. Supplementary Fig. 1 shows the improved linear relationship between whole-body fat percentage and height/waist by sex and ethnicity. All selected models showed lower prediction of body fat percentage in older individuals (Supplementary Table 6). We found a progressive decline in body weight, height and fat-free mass after 50 years of age, and a steeper decline in fat mass and waist circumference after 70 years of age among women and men (Supplementary Fig. 2), which coincided with the lower predicting ability of all models in older individuals.

For practical reasons, performance analyses of all selected models presented here were tested using their rounded and simplest expression (details are provided in the Supplementary material). Raw equations are shown in Supplementary Table 7. Concordance coefficients between DXA-measured whole-body fat percentage and final selected models are shown in Supplementary Table 8.

All selected models showed higher accuracy than BMI among women, whereas precision was improved only in models based on three anthropometrics and 1/BMI (Supplementary Table 9). Among men, the height/waist equation showed the highest accuracy and was also superior to BMI. Models based on three anthropometrics, but not 1/BMI, were also more accurate than BMI. All models except 1/BMI were more precise than BMI among men (Supplementary Table 9).

The height/waist equation, named the relative fat mass (RFM), was selected as the final model for three reasons: it is simple (it requires only two common anthropometrics), it was superior to BMI in predicting body fat percentage among men while showing similar predicting ability relative to BMI among women, and it had overall better performance than BMI among women and men, independently.

Final equations are as follows:

$${\rm{Equation}}\,{\rm{for}}\,{\rm{women}}:\,76-(20\times ({\rm{height}}/{\rm{waist}}))\tag{1}$$
$${\rm{Equation}}\,{\rm{for}}\,{\rm{men}}:\,64-(20\times ({\rm{height}}/{\rm{waist}}))\tag{2}$$

or

$${\rm{RFM}}:\,64-(20\times ({\rm{height}}/{\rm{waist}}))+(12\times {\rm{sex}})\tag{3}$$

In equations (1–3), height and waist (circumference) are expressed in meters. In (3), sex = 0 for male and 1 for female. The coefficients for equations (1) and (2) were rounded for practical purposes.
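As an illustration, equation (3) can be sketched in a few lines of Python. The function name `relative_fat_mass`, its parameter names, and the two example individuals are hypothetical and are not part of the study; only the coefficients come from the equations above.

```python
def relative_fat_mass(height_m: float, waist_m: float, sex: int) -> float:
    """Equation (3): 64 - 20 * (height / waist) + 12 * sex.

    height_m and waist_m are in meters; sex is 0 for male and 1 for female.
    """
    if height_m <= 0 or waist_m <= 0:
        raise ValueError("height and waist must be positive")
    if sex not in (0, 1):
        raise ValueError("sex must be 0 (male) or 1 (female)")
    return 64.0 - 20.0 * (height_m / waist_m) + 12.0 * sex


# Hypothetical individuals (illustrative values, not study data).
rfm_woman = relative_fat_mass(height_m=1.60, waist_m=0.80, sex=1)  # 76 - 20 * 2
rfm_man = relative_fat_mass(height_m=1.80, waist_m=0.90, sex=0)    # 64 - 20 * 2
print(f"woman: {rfm_woman:.1f}, man: {rfm_man:.1f}")
```

With sex = 1 the intercept becomes 64 + 12 = 76, so equation (3) reduces to equation (1); with sex = 0 it reduces to equation (2).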

Supplementary Fig. 3 shows good agreement between RFM and DXA.

Although we found a significant interaction between age and RFM among women (P < 0.001), that was not the case among men (P = 0.088). However, inclusion of age in the final model did not improve R2 among women (RFM model: R2 = 0.66; RFM and age model: R2 = 0.66) or among men (RFM model: R2 = 0.75; RFM and age model: R2 = 0.75). Likewise, inclusion of ethnicity in the final model did not substantially increase R2 among men (RFM and ethnicity model: R2 = 0.76). Among women, inclusion of ethnicity in the model did not improve body fat prediction (R2 = 0.66). Thus, age and ethnicity were not included in the final model selected.

Model validation and performance

In the validation dataset, compared with BMI, RFM had a more linear relationship with DXA whole-body fat percentage among women (adjusted coefficient of determination, R2: 0.69; 95% CI, 0.67–0.72; vs. 0.65; 95% CI, 0.63–0.67) and men (R2: 0.75; 95% CI, 0.72–0.77 vs. 0.61; 95% CI, 0.59–0.63) (Fig. 2 and Supplementary Table 10). RFM had less bias than BMI among women (0.9%; 95% CI, 0.6% to 1.1% vs. −10.9%; 95% CI, −11.2% to −10.5%) and a similar low bias among men (RFM: 0.5%; BMI: 0.7%) (Table 2). Among women, RFM showed higher accuracy than BMI (91.5% vs. 21.6%; P < 0.001). RFM was also more precise than BMI (4.9%; 95% CI, 4.6–5.2% vs. 5.8%; 95% CI, 5.5–6.2%). Among men, RFM showed higher accuracy than BMI (88.9% vs. 81.9%; P < 0.001) and better precision (RFM: 4.2%; 95% CI, 3.9–4.6% vs. BMI: 5.1%; 95% CI, 4.9–5.4%) (Table 2 and Supplementary Fig. 4). Among women, RFM was more accurate than BMI across ethnic groups (P < 0.001 for all comparisons). Among men, RFM was also more accurate among European-Americans (P < 0.001) and African-Americans (P < 0.001) (Table 2). RFM also showed better performance than BMI across age categories (Supplementary Fig. 5) and across body fat quintiles (Supplementary Fig. 6). Among men, RFM also showed better performance than CUN-BAE (Clinica Universidad de Navarra-body adiposity estimator), Gallagher, Deurenberg and Kagawa equations, including across ethnic groups. Among women, RFM was superior to Deurenberg and Kagawa equations (Table 2).

Figure 2

Prediction of whole-body fat percentage by RFM using linear regression in NHANES 2005–2006 (validation dataset). RFM, relative fat mass, which is based on height/waist. R2, coefficient of determination; RMSE, root mean squared error. Data plots correspond to DXA imputation 1.

Table 2 Comparison of performance between RFM and published equations based on BMI or waist-to-height ratio for prediction of body fat percentage among adult participants (n = 3,456) in the validation dataset (NHANES 2005–2006)*.

Internal validation with bootstrapping confirmed that RFM was a better predictor of body fat percentage than BMI among women and men (Supplementary Table 11). The predicting ability of RFM decreased with age (Supplementary Table 12). RFM was more accurate and more precise than BMI (Supplementary Table 13) and had higher accuracy than BMI across age categories (Supplementary Fig. 7 and Supplementary Table 14) and body fat ranges; however, accuracy was lower in leaner individuals (Supplementary Fig. 8).

RFM was a better predictor of trunk fat percentage than of whole-body fat percentage or whole-body fat mass (Supplementary Table 15).

Obesity misclassification

To compare the rates of obesity misclassification between BMI and our final model, we arbitrarily defined obesity as DXA-measured body fat percentage ≥33.9% for women and ≥22.8% for men, based on the corresponding cut-points between the first and second quintiles for each sex. These cut-points were calculated using the combined datasets (NHANES 1999–2006). In the validation dataset, when using the same DXA cut-points for obesity diagnosis (≥33.9% for women and ≥22.8% for men), RFM had higher sensitivity than BMI. Table 3 shows total positive and negative cases of obesity identified using either BMI or RFM. RFM resulted in fewer false negatives among women (5.0%; 95% CI, 3.1–6.8% vs. 72.0%; 95% CI, 67.3–76.6%; P < 0.001) and men (3.8%; 95% CI, 1.8–5.8% vs. 4.1%; 95% CI, 2.1–6.1%; P < 0.001). There were fewer false positives with RFM among men (32.3%; 95% CI, 25.8–38.8% vs. 49.7%; 95% CI, 44.2–55.3%; P < 0.001) but more false positives among women (41.0%; 95% CI, 32.2–49.9% vs. 0%; P < 0.001).

Table 3 Positive and negative cases of DXA-diagnosed obesity* identified using either BMI or RFM among adult participants (n = 3,456) in the validation dataset (NHANES 2005–2006).

Obesity total misclassification was also lower with RFM than with BMI among all women (12.7% vs. 56.5%; P < 0.001) and all men (9.4% vs. 13.0%; P < 0.001) (Fig. 3), and among all Mexican-Americans (8.2% vs. 35.4%; P < 0.001), all European-Americans (11.3% vs. 35.2%; P < 0.001) and all African-Americans (9.9% vs. 37.2%; P < 0.001).

Figure 3

Obesity total misclassification error in NHANES 2005–2006. Bars show comparison of total misclassification of obesity diagnosed by DXA-whole-body fat percentage (≥33.9% for women and ≥22.8% for men, based on the corresponding cut-points between the first and second quintiles for each sex) when using RFM and BMI at same DXA cut-points and a BMI of 30. Error bars are standard error.

In the internal validation dataset, compared with BMI, obesity total misclassification was lower with RFM among women (P < 0.001) and men (P < 0.001), among all Mexican-Americans, all European-Americans and all African-Americans (P < 0.001 for all three ethnic groups), and across age categories (P < 0.001 for all comparisons). Although we found a lower total misclassification rate with RFM among other ethnicities (Non-Hispanic Asians, Native Americans, and those who self-reported multiple ethnicity) (RFM: 12.9%, BMI: 41.9%; P < 0.001), these findings should be interpreted with caution as NHANES 1999–2006 did not oversample to get reliable estimates on these minority American ethnic groups.

Diagnostic accuracy for obesity and diabetes

In the validation dataset, compared with BMI, RFM showed better diagnostic accuracy for body fat-defined obesity among men (area under curve [AUC]: 0.94 vs. 0.91; P < 0.001) and similar diagnostic accuracy among women (AUC: 0.929 vs. 0.933; P = 0.52). RFM was also better than BMI in identifying diabetes cases among women (AUC: 0.79 vs. 0.73; P = 0.002) and men (AUC: 0.80 vs. 0.76; P = 0.001).

Sensitivity analysis of the combined datasets showed that RFM had better diagnostic accuracy than BMI for high body fat percentage among men (P < 0.001) regardless of the DXA cut-point used to define obesity (Supplementary Fig. 9). RFM also showed a significant improvement over BMI and the Gallagher, CUN-BAE and Deurenberg equations among men (Supplementary Table 16).

RFM was superior to DXA-measured trunk fat percentage in discriminating diabetes among women (P < 0.001) but not among men (P = 0.548) (Supplementary Fig. 10).

Discussion

In the present study, we identified the relative fat mass (RFM), which is a simple linear equation based on height-to-waist ratio, as a potential alternative tool to estimate whole-body fat percentage in women and men 20 years of age and older. Our analyses were performed using nationally representative samples of the US adult population, which allowed us to evaluate the performance of RFM among Mexican-Americans, European-Americans, and African-Americans.

In the validation dataset, the performance of RFM to estimate DXA-measured body fat percentage was overall more consistent than that of BMI among women and men, across ethnic groups, young, middle-age and older adults, and across quintiles of body fat percentage, although the accuracy of RFM was lower among individuals with lower body fatness. RFM also showed overall better performance (accuracy and precision) than the CUN-BAE, Gallagher, Deurenberg and Kagawa equations to estimate whole-body fat percentage among women and men.

The selection of our final model deserves some comment. The main aim of the present study was to identify a simple anthropometric equation that could potentially be used for clinical and epidemiological purposes as an alternative to BMI to better assess body fatness among adult individuals. No attempt was made to generate non-linear equations or complex linear equations based on a high number of anthropometrics. Previous studies have addressed this point19,22. Although our selected models based on three anthropometrics showed higher adjusted R-squared than those based on two anthropometrics among women, we believe they are unlikely to represent a practical alternative to BMI. Although the equations based on 1/BMI and height/waist showed a good predicting value among women and men, respectively, a different index for each sex would also result in low practicality when compared with BMI. Thus, we finally selected the height/waist equation (RFM) because it was the simplest equation among all selected models that better estimated whole-body fat percentage than BMI among women and men, independently. Although waist-to-height ratio is widely used in epidemiology as a predictor of cardiovascular risk factors26,27, our results from the development dataset showed a better linear relationship between whole-body fat percentage and height-to-waist ratio (the foundation of RFM) than waist-to-height ratio among women and men, and across ethnic groups and age categories (Supplementary Tables 5 and 6). It should be noted that, for body fat estimation purposes, the otherwise useful waist-to-height ratio is not an intuitive surrogate of whole-body fat percentage.

In our validation dataset, we found a high rate of false negative cases (low sensitivity) of body fat-defined obesity when using BMI at the cut-points arbitrarily chosen, both among women and men. These findings are consistent with those from previous studies, regardless of the DXA body fat cut-points used to define obesity9,28,29. An RFM ≥33.9 for women and ≥22.8 for men showed a high sensitivity to identify individuals with obesity, 95.0% and 96.2%, respectively. Likewise, using the same cut-points, RFM had lower rates of obesity total misclassification than BMI among all women and all men and among Mexican-Americans (8.2%), European-Americans (11.3%) and African-Americans (9.9%), indicating a consistent and relatively low rate of obesity misclassification with RFM across these ethnic groups.

The lower rates of obesity misclassification with RFM compared with BMI (among women: ~13% vs. ~57%, respectively; among men: ~9% vs. ~13%) support the clinical utility of RFM to identify individuals with high body fat percentage, a condition that has been associated with increased mortality1,2,3. Overall, our data show that the lower rates of obesity total misclassification with RFM are largely due to the higher sensitivity (lower false negatives) of RFM for body fat-defined obesity among women and men, supporting the potential of RFM as a screening tool for obesity. Compelling evidence indicates that lifestyle intervention in adult individuals with overweight or obesity may reduce morbidity and all-cause mortality29,30. Thus, one important aspect will be to evaluate whether early lifestyle intervention in individuals with high body fat percentage assessed by RFM could offer clinical benefits to reduce morbidity and mortality in the short and long term.

One limitation of previous studies proposing prediction equations for body fat percentage is the lack of information on the diagnostic accuracy for high body fatness19,20,25. In the present study, RFM showed better diagnostic accuracy for body fat-defined obesity among men compared with BMI and the CUN-BAE, Gallagher and Deurenberg equations. Among women, RFM had diagnostic accuracy for obesity similar to that of BMI and the CUN-BAE, Gallagher, Deurenberg and Kagawa equations. Thus, one benefit of using RFM over BMI is its relatively high diagnostic accuracy for obesity in both sexes (AUC ≥ 0.93). An additional advantage of RFM over BMI was its superior diagnostic accuracy for diabetes, a well-established cardiovascular risk factor31. RFM also showed superior diagnostic accuracy for diabetes relative to the CUN-BAE and Gallagher equations among women. Our findings are consistent with meta-analyses of numerous cross-sectional studies, which concluded that waist-to-height ratio is superior to BMI to identify cardiovascular risk factors, including diabetes26,27.

Measurement of waist circumference is unstandardized and subject to variability. However, measurement error due to the anatomical placement of the measuring tape appears to have little effect on the association between waist circumference and cardiovascular risk factors, including diabetes32. Moreover, the reproducibility between measurements is very high33. Nevertheless, if waist circumference measurements become part of routine clinical evaluation, they should be implemented with adequate tools and professional training.

The present study has some limitations. (1) We used DXA as the reference method. Compared with the four-compartment method, DXA underestimates fat percentage in the lower ranges and in men, and overestimates fat percentage in the higher ranges and in women34,35. Thus, the performance of RFM could well be slightly superior or inferior to the actual estimates depending on the relative fat mass and sex. (2) NHANES data analysis by ethnicity was limited to Mexican-American, European-American, and African-American adult individuals. Therefore, our results cannot be extrapolated to other ethnic groups. Future studies will be required to evaluate the performance of RFM in other ethnicities (e.g. Asian and Native American populations) as well as in children, athletes, and individuals with specific diseases. (3) Our study was cross-sectional and used a single-point measurement of each anthropometric. Thus, our study was not designed to propose RFM cut-points for the diagnosis of obesity. We defined obesity using arbitrary cut-points of DXA-measured body fat percentage to compare obesity misclassification by RFM and BMI. Sensitivity analysis showed that RFM had better diagnostic accuracy for obesity than BMI among men regardless of the cut-point used to define obesity. (4) RFM validation was limited to a nationally representative sample of the US population. External validation of RFM performance and of obesity misclassification with RFM in populations from other countries is warranted.

Our findings showed that the RFM equation, which is based on height/waist, had superior performance (accuracy and precision) to BMI and the CUN-BAE, Gallagher, Deurenberg and Kagawa equations to estimate whole-body fat percentage in women and men. Overall, total misclassification of body fat-defined obesity with RFM was lower than with BMI among women and men, and across ethnic groups, including Mexican-Americans, European-Americans and African-Americans. We conclude that, in the population studied, RFM was more accurate than BMI to estimate whole-body fat percentage among women and men and improved body fat-defined obesity misclassification among American adult individuals of Mexican, European or African ethnicity.

Methods

Study population

NHANES is a program designed to study the health and nutritional status of the non-institutionalized population of the United States. NHANES is conducted annually and released in two-year cycles using a nationally representative sample across the country, selected using a multistage, probability sampling design. NHANES 1999–2004 and NHANES 2005–2006 oversampled Mexican-American and African-American populations to obtain representative samples of these ethnic groups for reliable estimates36. Thus, analyses by ethnic group were limited to Mexican-American, European-American (White) and African-American individuals.

The present study did not require approval or exemption from the Cedars-Sinai Medical Center Institutional Review Board as it involved the analysis of publicly available de-identified data only.

Data

An advantage of using NHANES for the present study is that it constitutes the largest database containing information on whole-body composition for the US population, which was collected between 1999 and 2006 using the well-accepted DXA method37,38. Thus, DXA was used as the reference method to measure whole-body fat percentage.

NHANES 1999–2004 was used as the development dataset. Multiple imputation was applied to replace missing DXA data39. Details are provided in the Supplementary Material. Model development included individuals 20 to 85 years of age. In total, 12,581 observations were included for model development (Fig. 1).

NHANES 2005–2006 was used as the validation dataset. Multiple imputation was also used to account for missing data (see Supplementary Material). Model validation included individuals 20 to 69 years of age, as DXA was performed only on individuals 69 years old and younger in this sample. In total, 3,456 observations were included for model validation (Fig. 1).

Anthropometric measurements

Waist circumference was measured by placing the measuring tape around the trunk (unclothed waist) in a horizontal plane at the level of the uppermost lateral border of the right ilium, with the examinee standing, at the end of expiration. The measurement was recorded to the nearest 0.1 cm. Body weight was measured with an electronic scale (examinee wearing underwear only). Height was measured with an electronic stadiometer40. Other anthropometrics were measured using standard procedures40.

DXA scans

DXA scans were acquired using a Hologic QDR 4500A fan-beam densitometer (Hologic, Inc., Bedford, Massachusetts) and Hologic DOS software version 8.26:a3*. Scans were reviewed and analyzed by the University of California, San Francisco, using Hologic Discovery software, version 12.1 for NHANES 1999–2004 and version 12.4 for NHANES 2005–200639. Body fat percentage was calculated as the ratio of DXA whole-body fat mass (g) to DXA whole-body total mass (g), multiplied by 100.

Model development and selection

Common anthropometrics including body weight, height, triceps and subscapular skinfolds, arm and leg lengths, and waist, calf, arm and thigh circumferences were tested for correlation with DXA-measured whole-body fat percentage in men and women, independently. Simple and combined anthropometrics that had the highest correlation with body fat percentage among women and men, independently, were the foundation for our model development using linear regression for survey data. We also explored the effect of adding age and ethnicity in the regression models. Two- and three-way anthropometric indices were generated, including combination of integer powers, square root, and reciprocal transformations. Model selection was based on the ability to predict whole-body fat percentage (R2) in both women and men and sex-ethnicity subgroups, the lowest RMSE, the lowest Akaike information criterion41, the overall performance in terms of accuracy and precision, and the simplicity to estimate body fat percentage in both women and men. Further details are provided in the Supplementary Material.
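The index-generation step can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the exponent set, the restriction to two-variable products, the helper names, and the five toy records are all assumptions, and the correlations are unweighted.

```python
import itertools
import math

# Assumed exponent set: reciprocal, square root, and integer powers.
EXPONENTS = (-1.0, -0.5, 0.5, 1.0, 2.0, 3.0)


def pearson_r(x, y):
    """Unweighted Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def candidate_indices(variables):
    """Yield (label, {variable: exponent}) for every two-variable power product."""
    for a, b in itertools.combinations(variables, 2):
        for pa, pb in itertools.product(EXPONENTS, repeat=2):
            yield f"{a}^{pa:g} * {b}^{pb:g}", {a: pa, b: pb}


def index_value(record, powers):
    """Evaluate one index, e.g. height^1 * waist^-1, for one individual."""
    return math.prod(record[name] ** p for name, p in powers.items())


# Hypothetical toy records (NOT NHANES data): meters, meters, kg, percent.
records = [
    {"height": 1.60, "waist": 0.75, "weight": 58.0, "fat": 33.0},
    {"height": 1.65, "waist": 0.90, "weight": 72.0, "fat": 39.0},
    {"height": 1.70, "waist": 0.82, "weight": 68.0, "fat": 30.0},
    {"height": 1.75, "waist": 0.95, "weight": 85.0, "fat": 29.0},
    {"height": 1.80, "waist": 1.05, "weight": 98.0, "fat": 32.0},
]
fat = [r["fat"] for r in records]

# Rank all candidate indices by the absolute value of their correlation.
ranked = sorted(
    (
        (label, pearson_r([index_value(r, powers) for r in records], fat))
        for label, powers in candidate_indices(["height", "waist", "weight"])
    ),
    key=lambda item: abs(item[1]),
    reverse=True,
)
print(len(ranked), "candidate indices; strongest:", ranked[0][0])
```

The ranking on toy data carries no meaning; the point is only that height/waist (`height^1 * waist^-1`) appears as one candidate among many and is compared with the others on the same criterion.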

Model validation

The final model was validated in the participants of NHANES 2005–2006, a large, nationally representative sample of the US adult population that is distinct from the development dataset. The development and validation datasets were then combined into one dataset (NHANES 1999–2006, n = 16,037 adult individuals) to perform internal validation using bootstrapping, the most widely accepted technique, to obtain bootstrapped standard errors and verify the statistical differences between selected models and BMI42.
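To make the bootstrapping idea concrete, the sketch below resamples individuals with replacement and derives a percentile 95% confidence interval for RMSE. It is a simplified, unweighted illustration: it ignores the NHANES survey design and multiple imputation, treats RMSE as the root mean squared prediction error, and uses hypothetical function names and toy values.

```python
import math
import random


def rmse(estimated, measured):
    """Root mean squared error between paired estimates and reference values."""
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def bootstrap_rmse_ci(estimated, measured, replicates=1000, seed=0):
    """Percentile 95% CI for RMSE, resampling individuals with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(measured)
    stats = []
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([estimated[i] for i in idx], [measured[i] for i in idx]))
    stats.sort()
    lower = stats[round(0.025 * replicates)]
    upper = stats[round(0.975 * replicates) - 1]
    return lower, upper


# Hypothetical body fat percentages (illustrative values, not NHANES data).
measured_fat = [22.0, 27.5, 31.0, 35.5, 40.0, 44.5, 29.0, 38.0]
estimated_fat = [24.0, 26.0, 33.5, 34.0, 43.0, 41.0, 30.5, 36.0]

point_rmse = rmse(estimated_fat, measured_fat)
ci_low, ci_high = bootstrap_rmse_ci(estimated_fat, measured_fat)
print(f"RMSE {point_rmse:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```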

Model performance

We used concordance correlation coefficient and Bland-Altman plots to examine the agreement between estimated and DXA-measured body fat percentage43. Bias was calculated as the median difference between estimated and measured body fat percentage. For the purpose of the present study, accuracy (how closely an individual estimate agrees with the “true” or reference value) was calculated as the proportion of cases with <20% difference between estimated and DXA-measured whole-body fat percentage44. Precision was calculated as the interquartile range of the difference between estimated and measured body fat percentage44. The performance of our final model was compared with four published equations that are based on age and BMI or waist-to-height ratio reported to have a high prediction for body fat percentage: Gallagher25, CUN-BAE20, Deurenberg45 and Kagawa equations46.
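The three performance measures defined above can be sketched as follows. Two details are assumptions made for illustration: the <20% criterion is interpreted as a difference relative to the DXA-measured value, and the interquartile range uses the inclusive quantile method of Python's `statistics` module. The function name and toy values are hypothetical.

```python
import statistics


def performance(estimated, measured, tolerance=0.20):
    """Return (bias, accuracy in percent, precision) for paired values.

    bias:      median of (estimated - measured)
    accuracy:  percent of cases with |estimated - measured| / measured < tolerance
    precision: interquartile range of (estimated - measured)
    """
    diffs = [e - m for e, m in zip(estimated, measured)]
    bias = statistics.median(diffs)
    within = sum(abs(e - m) / m < tolerance for e, m in zip(estimated, measured))
    accuracy = 100.0 * within / len(measured)
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    return bias, accuracy, q3 - q1


# Hypothetical body fat percentages (illustrative values, not study data).
measured_fat = [20.0, 25.0, 30.0, 35.0, 40.0]
estimated_fat = [21.0, 24.0, 33.0, 35.0, 50.0]

bias, accuracy, precision = performance(estimated_fat, measured_fat)
print(f"bias {bias:.1f}, accuracy {accuracy:.0f}%, precision {precision:.1f}")
```

In this toy example, only the last individual (estimate 50.0 against a measured 40.0, a 25% relative difference) falls outside the 20% band, so accuracy is 80%.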

Obesity misclassification

To date, there is no consensus on the diagnosis of obesity based on body fat percentage. Thus, to define obesity based on body fat percentage we used arbitrary cut-points of DXA-measured body fat percentage: ≥33.9% for women and ≥22.8% for men (corresponding cut-points between the first and second quintiles for each sex). Misclassification of body fat-defined obesity was expressed as false negative rate (1–sensitivity), false positive rate (1–specificity), and total misclassification error (the proportion of false positives and false negatives together among all women, all men, and among both sexes combined).
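The misclassification measures can be sketched as below, applying the same cut-point to the reference (DXA) and to the index under test, as described above. The function name and toy values are hypothetical, and the sketch is unweighted.

```python
def misclassification(index_values, dxa_values, cut_point):
    """Return (false negative rate, false positive rate, total error).

    An individual is classified as obese when the value is >= cut_point;
    DXA-measured body fat percentage is treated as the reference.
    """
    tp = fp = tn = fn = 0
    for predicted, reference in zip(index_values, dxa_values):
        predicted_obese = predicted >= cut_point
        truly_obese = reference >= cut_point
        if truly_obese and predicted_obese:
            tp += 1
        elif truly_obese:
            fn += 1
        elif predicted_obese:
            fp += 1
        else:
            tn += 1
    false_negative_rate = fn / (tp + fn)      # 1 - sensitivity
    false_positive_rate = fp / (fp + tn)      # 1 - specificity
    total_error = (fp + fn) / (tp + fp + tn + fn)
    return false_negative_rate, false_positive_rate, total_error


# Hypothetical women (illustrative values, not study data); cut-point 33.9%.
dxa_fat = [36.0, 40.0, 30.0, 34.5, 28.0, 33.0]
rfm_est = [35.0, 41.0, 34.5, 33.0, 27.0, 31.0]

fnr, fpr, total = misclassification(rfm_est, dxa_fat, cut_point=33.9)
print(f"FNR {fnr:.2f}, FPR {fpr:.2f}, total {total:.2f}")
```

Because the total error pools false positives and false negatives over all individuals, it depends on the prevalence of body fat-defined obesity as well as on the two rates.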

Diagnostic accuracy for obesity and diabetes

Diabetes was defined as a measured glycated hemoglobin ≥6.5%, a fasting plasma glucose ≥126 mg/dL, or self-reported diagnosed diabetes47. Diagnostic accuracy for obesity and diabetes was estimated using receiver-operating-characteristic curve analysis, expressed as the AUC48.
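A minimal sketch of the diabetes definition and of an unweighted AUC follows. The AUC is computed here through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half), which is one standard way to obtain it and is not necessarily how the statistical software used in the study computes it. Function names, parameter names, and values are hypothetical.

```python
def has_diabetes(hba1c_pct=None, fasting_glucose_mg_dl=None, self_reported=False):
    """Apply the three criteria above; a missing measurement is passed as None."""
    return bool(
        (hba1c_pct is not None and hba1c_pct >= 6.5)
        or (fasting_glucose_mg_dl is not None and fasting_glucose_mg_dl >= 126)
        or self_reported
    )


def auc(scores, labels):
    """Unweighted area under the ROC curve via pairwise comparisons."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical scores (e.g. an adiposity index) and diabetes status.
example_scores = [0.10, 0.40, 0.35, 0.80]
example_labels = [False, False, True, True]
example_auc = auc(example_scores, example_labels)
print(f"AUC = {example_auc:.2f}")
```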

Statistical analysis

We used cluster and strata information and probability weights for all analyses to account for the complex design of NHANES49. Estimates of the Akaike information criterion and concordance correlation coefficient were adjusted for probability weights only. Initial examination of the association between body fat percentage and anthropometrics, including those generated in the present study, was performed using unweighted data. Listwise deletion was used to handle missing data for correlation analyses. Pooled data estimates (and their 95% confidence intervals) were obtained using Rubin’s equations50 implemented in Stata for analysis of multiple imputation in complex survey data. Variance estimates for development and validation datasets were obtained using Taylor series linearization. Bootstrapping with 1000 replicates was used to obtain confidence intervals for adjusted R-squared and RMSE in the development and validation datasets and to perform internal validation. The Wald test was used to test for interaction of ethnicity and age category with selected indices on the prediction of body fat and to calculate P values for comparisons of accuracy and diagnostic accuracy (AUC) between models51. Bonferroni correction was applied for multiple comparisons. All analyses were performed using Stata 14 for Windows (StataCorp LP, College Station, TX). Statistical significance was set at a two-tailed alpha level of 0.05.
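For readers unfamiliar with Rubin's equations, the pooling step can be sketched as below: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance, where m is the number of imputations. The function name and the three hypothetical imputations are assumptions for illustration; the study's analyses were run in Stata, not with this code.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total, math.sqrt(total)


# Hypothetical estimates from three imputed datasets (not study results).
imputed_estimates = [10.0, 12.0, 14.0]
imputed_variances = [1.0, 1.0, 1.0]

pooled, total_variance, pooled_se = pool_rubin(imputed_estimates, imputed_variances)
print(f"pooled {pooled:.1f}, SE {pooled_se:.2f}")
```

The (1 + 1/m) factor inflates the between-imputation component to reflect that only a finite number of imputations is available.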