General adiposity, usually defined by a high body mass index (BMI), is an established risk factor for several malignancies, including cancers of the oesophagus (adenocarcinoma), pancreas, colon, rectum, breast (postmenopausal), corpus uteri, kidney, gallbladder, stomach (cardia), liver, ovary, prostate (advanced stage), mouth, and pharynx, as well as multiple myeloma and meningioma [1, 2]. In regions with high obesity prevalence, 4–9% of the cancer burden is attributable to a BMI ≥ 25 kg/m² [3].

However, the obesity-attributable cancer burden is likely underestimated [4], because BMI neither differentiates between muscle and fat mass nor does it capture body fat distribution [3]. Waist or hip circumferences (WC or HC) and waist-to-hip ratio (WHR) represent surrogate markers of body fat distribution and have shown associations with cancer similar to those seen with BMI [5, 6]. Likely explanations are that these indicators track with general adiposity and are also highly inter-correlated [3], rendering them relatively non-specific regarding cancer risk. Adult attained height is an established risk factor for at least eight different types of cancer including pancreas, colorectum, endometrium, ovary, prostate, kidney, skin, and breast (pre- and postmenopausal) [7,8,9,10,11,12]. Although increased cancer risk due to height is largely independent of obesity and could be explained by an increased cell number in taller individuals [13], some overlap with obesity due to shared mechanistic pathways (e.g., elevated insulin-like growth factor 1 [IGF-1] levels) [14] cannot entirely be excluded. Taken together, classical anthropometric traits may fail to fully represent the complex relations of relative weight, adiposity, and height to cancer risk [15].

In a meta-analysis of 65 studies, Ried et al. [16] combined six anthropometric traits (i.e. weight, height, BMI, WC, HC, and WHR) using principal component analysis (PCA) and derived four principal components (PCs) for body shape phenotypes, which together explained over 99% of the total variation in these anthropometric traits [16]. The PCs showed large agreement across studies and between men and women [16]. The findings of Ried et al. suggest that the body shape phenotypes represent information that is not fully captured by individual anthropometric traits [16]. The body shape phenotypes showed differential associations with various indicators of metabolic health, such as elevated blood lipids, blood glucose, and insulin sensitivity [16], which are candidate mediators underlying the association between obesity and carcinogenesis [17]. Whether different body shapes are associated with cancer risk is unknown.

We used data from the European Prospective Investigation into Cancer and Nutrition (EPIC) study, applied the approach of Ried et al. [16] to derive four distinct body shape phenotypes, and investigated associations of these body shape phenotypes with overall and site-specific cancer risk.

Materials and methods

Study population

The EPIC study is a prospective multicenter cohort investigating the association between lifestyle factors and cancer and other chronic diseases [18]. Between 1992 and 2000, approximately 520,000 men and women mostly aged between 35 and 65 years from 22 study centres in 9 different European countries (Denmark, France, Germany, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom) were recruited. Participants were selected from the general population, with few exceptions: in France, female employees in state schools were recruited; in Utrecht (Netherlands) and Florence (Italy), women who had participated in breast cancer screening were included; and in some centres in Spain and Italy, registered members from blood donor registries were selected. The Oxford (UK) cohort recruited half of the participants from groups of vegetarians and vegans [18,19,20]. Data from Greece were unavailable for this analysis.

All participants provided written informed consent, and approval for the study was obtained from the International Agency for Research on Cancer (IARC) ethics review panel (No. 20-34) and from all recruiting institutions. At recruitment, information on socioeconomic and lifestyle factors and medical history was obtained using questionnaires.

After exclusions, the current analysis comprised 340,152 participants (118,218 men and 221,934 women: Supplementary Fig. 1).

Assessment of anthropometric measures

Anthropometric measurements followed standard protocols, except in France and Oxford (UK), where data on body weight were based on self-report [21]. The accuracy of self-reported anthropometric measures was improved by using prediction equations derived from participants with both measured and self-reported data at baseline. These recalibrated self-reported anthropometric measures are valid for identifying associations in epidemiologic studies [22].

Body weight was measured without shoes to the nearest 0.1 kg and height to the nearest 0.1 cm or 0.5 cm. BMI was calculated as body weight (kilograms, kg) divided by height in metres squared (m²). WC was measured at the narrowest circumference of the torso or midway between the lowest ribcage and the highest point of the iliac crest. HC was determined horizontally at the level of the greatest lateral extent of the hips or above the buttocks. Body circumferences were rounded to the nearest centimetre. WHR was calculated as WC (cm) divided by HC (cm). To reduce heterogeneity due to protocol differences between centres, body weight, WC and HC of each participant were corrected for clothing worn during measurement [22]. Furthermore, centre-, age-, and sex-specific mean values for weight, height, WC, and HC were imputed for individuals with neither self-reported nor measured anthropometric data.
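The two derived traits described above can be illustrated with a minimal Python sketch. The function names and the example participant are hypothetical and are not taken from the study; the formulas are the ones stated in the text (weight over squared height in metres, and WC over HC).

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """BMI in kg/m^2 from weight in kilograms and height in centimetres."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm: float, hip_cm: float) -> float:
    """WHR as waist circumference divided by hip circumference (same units)."""
    return waist_cm / hip_cm


# Hypothetical participant, not taken from the study data.
bmi = body_mass_index(weight_kg=81.0, height_cm=180.0)
whr = waist_to_hip_ratio(waist_cm=90.0, hip_cm=100.0)
print(f"BMI = {bmi:.1f} kg/m^2, WHR = {whr:.2f}")
```

Because BMI and WHR are functions of the other four traits, the six traits are strongly inter-correlated, which is one reason a joint decomposition is of interest.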

Ascertainment of cancer cases

Cancer cases were mainly identified through population-based cancer registries. In Germany and France, cancers were identified using health insurance records, cancer and pathology registries, and active follow-up of participants and next of kin [18]. Complete follow-up occurred between December 2009 and December 2013, depending on the centre.

Incident cancer cases were coded using the International Classification of Diseases and the third revision of the International Classification of Diseases for Oncology (malignant primary site) [23]. Detailed information on tumour topography is provided in Supplementary Table 1. The present analyses focused on the first primary cancer diagnosis. Participants who later developed a subsequent cancer were considered a case at the time of their first cancer. Endpoints were defined as all cancers combined and individual cancer types (bladder, brain and central nervous system (CNS), breast (postmenopausal and premenopausal), cervix, colon, corpus uteri, oesophagus (adeno and squamous cell carcinomas (SCC)), gallbladder, kidney, larynx, lips, oral cavity and pharynx, liver, lung, malignant melanoma, myeloma, ovary, pancreas, prostate, rectum, stomach (cardia and non-cardia), and thyroid). Cancer types with fewer than 100 cases were not considered for analyses.

Statistical analysis

We performed PCA on the standardised residuals of height, weight, BMI, WC, HC, and WHR. Residuals were computed in separate linear regression models of the six anthropometric traits on age, sex, and study centre. The age, sex, and centre adjustment should facilitate comparability across study populations by removing the extraneous variability introduced by these variables. This resulted in a set of six PCs representing orthogonal linear combinations of the six traits [24]; i.e., each component represented a weighted sum of the six transformed traits and was independent of the other components. For better characterisation of each body shape, the mean values of each trait among participants in the top and bottom 5% of each PC are presented alongside their visualisations. Pearson’s correlation coefficients between all anthropometric measures and the PCs were also calculated. To minimise the possible influence of outliers, PC data were winsorized at the 1st and 99th percentiles [25].
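The decomposition and the winsorization step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' R code: the input is assumed to be already residualised on age, sex, and centre; the eigen-decomposition uses power iteration with deflation as a stand-in for a library PCA routine; and all function names and toy values are hypothetical.

```python
import math
import statistics


def standardise(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def correlation_matrix(columns):
    """Correlation matrix of columns that are already standardised."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
            for ci in columns]


def principal_components(matrix, n_components, iterations=1000):
    """Eigen-decompose a symmetric matrix by power iteration with deflation.

    Returns (explained_variance_share, loadings) tuples, largest first.
    """
    k = len(matrix)
    total_variance = sum(matrix[i][i] for i in range(k))
    work = [row[:] for row in matrix]
    components = []
    for _ in range(n_components):
        # Asymmetric start vector, unlikely to be orthogonal to the target.
        vec = [float(i + 1) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(k)) for i in range(k)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-15:
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient of the converged unit vector gives the eigenvalue.
        eigenvalue = sum(vec[i] * work[i][j] * vec[j]
                         for i in range(k) for j in range(k))
        components.append((eigenvalue / total_variance, vec))
        # Deflation removes the component just found before the next pass.
        for i in range(k):
            for j in range(k):
                work[i][j] -= eigenvalue * vec[i] * vec[j]
    return components


def winsorise(values, lower_pct=1, upper_pct=99):
    """Clamp values to the given lower and upper percentiles."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Toy residuals for two hypothetical traits (not study data).
trait_a = standardise([1.0, 2.0, 3.0, 4.0, 5.0])
trait_b = standardise([2.0, 1.0, 4.0, 3.0, 5.0])
pcs = principal_components(correlation_matrix([trait_a, trait_b]), n_components=2)
for share, loadings in pcs:
    print(f"explained {share:.2f}, loadings {[round(x, 3) for x in loadings]}")
```

The sign of each loading vector is arbitrary in any PCA, so reversing the direction of a component changes only how it is read, not the variance it explains.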

We used Cox proportional hazards models with age as the underlying time metric to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Age at recruitment was the entry time, and age at the first primary cancer diagnosis, age at end of follow-up, age at loss to follow-up, or age at death, whichever came first, was the exit time. HRs and 95% CIs per 1 standard deviation (SD) increment of each PC were calculated to allow body shapes to be compared. Models were stratified by age at recruitment (in 5-year groups), sex, and study centre. To avoid a duplication of tests for this novel exposure across 24 types of cancers and because the loadings for the anthropometric traits across the four PCs were very similar among men and women, we provided results for men and women combined.
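The entry and exit rules on the age time scale can be made concrete with a small Python sketch. The `Participant` record, its field names, and the example ages are hypothetical; the sketch only prepares the risk interval and the stratum label and does not fit a Cox model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Participant:
    """Ages in years; None means the event was not observed."""
    age_recruitment: float
    age_end_followup: float
    age_first_cancer: Optional[float] = None
    age_death: Optional[float] = None
    age_lost: Optional[float] = None


def risk_interval(p: Participant) -> Tuple[float, float, bool]:
    """Return (entry_age, exit_age, is_case) on the age time scale."""
    # The cancer diagnosis is listed first so that it wins ties with censoring.
    candidates = [
        (p.age_first_cancer, True),
        (p.age_death, False),
        (p.age_lost, False),
        (p.age_end_followup, False),
    ]
    observed = [(age, is_case) for age, is_case in candidates if age is not None]
    exit_age, is_case = min(observed, key=lambda item: item[0])
    return p.age_recruitment, exit_age, is_case


def recruitment_age_stratum(age_recruitment: float, width: int = 5) -> int:
    """Lower bound of the recruitment-age stratum (5-year groups by default)."""
    return int(age_recruitment // width) * width


# Hypothetical participants, not study data.
case = risk_interval(Participant(50.0, 65.0, age_first_cancer=58.2, age_death=61.0))
censored = risk_interval(Participant(47.5, 62.5, age_death=60.0))
print(case, censored, recruitment_age_stratum(52.3))
```

Only the first primary cancer counts as the event here, so a death or loss to follow-up that precedes any diagnosis ends the interval as a censored observation.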

We fitted a crude model that included the four body shape PCs. Potential confounding variables for the multivariable models were identified a priori using Directed Acyclic Graphs (Supplementary Fig. 2) [26, 27]. Model selection was based on Akaike’s Information Criterion. Proportional hazards assumptions were tested using scaled Schoenfeld residuals [28]. Departure from linearity for all continuous exposure variables was assessed by log-likelihood ratio tests and if necessary, restricted cubic splines were used with three knots placed at the 10th, 50th, and 90th percentiles [24].
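For readers unfamiliar with restricted cubic splines, the following Python sketch builds the basis for three knots at the 10th, 50th, and 90th percentiles. It assumes the common truncated-power parameterisation that is linear beyond the outer knots; the text does not state which parameterisation or software routine was used, and the function names are hypothetical.

```python
import statistics


def knots_at_percentiles(values):
    """Knots at the 10th, 50th, and 90th percentiles of the observed values."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return deciles[0], deciles[4], deciles[8]


def rcs_three_knots(x, knots):
    """Return [linear term, nonlinear term] of a three-knot restricted cubic spline."""
    t1, t2, t3 = knots

    def cube_plus(u):
        # Truncated cubic: zero to the left of the knot.
        return max(u, 0.0) ** 3

    nonlinear = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    )
    # Dividing by (t3 - t1)^2 keeps the nonlinear term on the scale of x.
    return [x, nonlinear / (t3 - t1) ** 2]


# Hypothetical exposure values 0..100, not study data.
knots = knots_at_percentiles(list(range(101)))
print(knots, rcs_three_knots(100.0, knots))
```

With three knots the spline adds a single nonlinear term to the linear one, so a test of that term's coefficient is a one-degree-of-freedom test of departure from linearity.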

Analyses were repeated among never and current smokers to assess potential residual confounding by smoking and were stratified by median age (52.3 years) to address potential changes of anthropometric measures. In addition, to control for potential reverse causation, sensitivity analyses were performed excluding the first 2 years of follow-up.
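One plausible way to implement the exclusion of early follow-up is to delay each participant's entry by 2 years and drop anyone whose follow-up ended before that point. The Python sketch below shows this under that assumption; the text does not specify the exact implementation, and the tuple layout and function name are hypothetical.

```python
def exclude_early_followup(intervals, lag_years=2.0):
    """Delay entry by lag_years and drop intervals that end before the new entry.

    intervals: list of (entry_age, exit_age, is_case) tuples on the age scale.
    """
    kept = []
    for entry_age, exit_age, is_case in intervals:
        lagged_entry = entry_age + lag_years
        if exit_age > lagged_entry:
            kept.append((lagged_entry, exit_age, is_case))
    return kept


# Hypothetical intervals: the first case is diagnosed within 2 years and is dropped.
cohort = [(50.0, 51.0, True), (50.0, 60.0, True), (60.0, 62.5, False)]
lagged = exclude_early_followup(cohort)
print(lagged)
```

Cases diagnosed shortly after recruitment are the ones most likely to have had weight changes caused by preclinical disease, which is why removing them addresses reverse causation.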

All statistical tests were two-sided and Bonferroni-corrected P values ≤0.001 (~0.05/96 tests) were considered statistically significant. Analyses were performed using R version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria, 2020).
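The multiple-testing threshold can be reproduced with simple arithmetic. The sketch below assumes that the 96 tests correspond to four body shapes times 24 cancer types, which the text implies but does not state explicitly; the variable names are hypothetical.

```python
n_body_shapes = 4      # PC1 to PC4
n_cancer_types = 24    # cancer types analysed
alpha = 0.05           # family-wise significance level

n_tests = n_body_shapes * n_cancer_types
threshold = alpha / n_tests
print(f"{n_tests} tests, per-test threshold = {threshold:.5f}")
```

The exact quotient is about 0.0005, which the text rounds to the reported cut-off of 0.001.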


Results

The first four PCs together explained 99.8% of the total variation of the six anthropometric variables. Thus, all analyses were restricted to these PCs (Table 1). Each PC described a distinct body morphology (Fig. 1). The loadings for each anthropometric trait (men and women combined) are presented in Table 1; they were very similar among men and women (Supplementary Tables 2 and 3). For better comparability with Ried et al. [16], the directionality of PC4 was reversed.

Table 1 Loadings and explained variance of the principal components (PCs) for the analytic study population in EPIC (n = 340,152) and average loadings and average explained variance derived by Ried et al. [16] labelled “avPC”.
Fig. 1: Loadings for the four different body shape phenotypes.

PC1: blue; PC2: magenta; PC3: green; PC4: orange.

PC1 explained 63.0% of the total variation, with high loadings for all anthropometric measures except height, describing individuals characterised by general obesity (Supplementary Fig. 3). PC2 (19.6% of total variation) was characterised by opposite loadings for height and WHR (Supplementary Fig. 4), mainly discriminating between tall individuals with low WHR and short individuals with high WHR. PC3 (14.4% of the total variation) was characterised by loadings for height and WHR in the same direction, HC loadings in the opposite direction, and low loadings for BMI (Supplementary Fig. 5), distinguishing between tall individuals with high WHR but low HC and short individuals with low WHR and high HC. PC4 represents a rare phenotype explaining only 2.8% of the total variation and was characterised by high loadings for body weight and BMI and low loadings for WC and HC (Supplementary Fig. 6). Pearson’s correlation coefficients for the six anthropometric variables were consistent with the loadings for the individual PCs (Supplementary Fig. 7).

Baseline characteristics

Baseline characteristics are presented for sex-specific quintiles of loadings for PC1 (Table 2). Anthropometric measures were notably larger in quintile 5 than in quintile 1, except for WHR and height. Compared with the top 20% of the study population, participants in the lowest two PC1 quintiles had a healthier diet and higher educational attainment, were more physically active, and smoked more frequently. Additional baseline characteristics are provided in Supplementary Table 4.
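Sex-specific quintiles can be formed by computing the cut points separately within each sex, as in the following Python sketch. The function name, the quantile interpolation method, and the toy scores are assumptions made for illustration and are not taken from the study.

```python
import bisect
import statistics


def sex_specific_quintiles(scores, sexes):
    """Assign each participant a quintile (1 to 5) using cut points from their own sex."""
    cut_points = {}
    for sex in set(sexes):
        values = [s for s, g in zip(scores, sexes) if g == sex]
        cut_points[sex] = statistics.quantiles(values, n=5, method="inclusive")
    # bisect_left counts the cut points strictly below the score.
    return [bisect.bisect_left(cut_points[g], s) + 1 for s, g in zip(scores, sexes)]


# Hypothetical PC1 scores; men and women deliberately sit on different scales.
scores = list(range(1, 11)) + list(range(101, 111))
sexes = ["M"] * 10 + ["F"] * 10
quintiles = sex_specific_quintiles(scores, sexes)
print(quintiles)
```

Using separate cut points keeps each quintile balanced by sex even when the score distributions of men and women differ.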

Table 2 Characteristics of participants according to sex-specific quintiles of loadings of principal component 1 (overall adiposity) in EPIC.

Body shapes and cancer risk

After a median follow-up of 15.3 years (interquartile range = 12.8–16.8 years) and 4,841,860 person-years, 47,110 incident cancer cases were diagnosed. Among participants, 65% were women; the mean age at recruitment was 50.9 years (SD = ±10.5 years) for women and 52.7 years (SD = ±9.6 years) for men.

Results for PC1 (overall adiposity)

The HR for overall cancer risk per 1 SD increment in PC1 was 1.07 (95% CI = 1.05–1.08) (Fig. 2). In cancer type-specific analyses, a 1 SD increment in PC1 was associated with increased risks for malignant tumours of the corpus uteri, oesophagus (adeno), liver, kidney, gallbladder, colon, pancreas, myeloma, breast (postmenopausal), and rectum. An inverse relationship was observed between PC1 and cancers of the prostate and oesophagus (SCC). All these associations passed the Bonferroni-corrected P ≤ 0.001. Among never smokers, these estimates remained largely unchanged, except for smoking-related cancers, where the point estimate increased (lips, oral cavity, pharynx: 1.13; 0.98–1.30) or showed a tendency towards the null (lung, larynx, oesophagus (SCC)).

Fig. 2: Hazard ratios (HRs) for total cancer and 24 cancer subtypes per 1 SD increment in the first principal component (PC1; overall adiposity).

HRs with corresponding 95% confidence intervals (95% CIs) from Cox proportional hazards regressions in the total population (n = 340,152) and in never smokers (n = 160,111); n number of cancer incidence cases, CNS central nervous system, SCC squamous cell carcinomas.

Results for PC2 (tall stature; low WHR)

The association between PC2 and overall cancer showed a slightly increased risk per 1 SD increment (HR = 1.03; 95% CI = 1.02–1.04) (Fig. 3). Positive associations were observed for cancers of the thyroid, breast (post- and premenopausal) and malignant melanoma. All these associations passed the Bonferroni-corrected P ≤ 0.001, except thyroid cancer (P = 0.003). An inverse relationship was observed for tumours of the rectum and lips, oral cavity, pharynx. Inverse associations were also observed for cancers of the stomach (non-cardia), liver, and oesophagus (adeno), but these associations did not pass the Bonferroni-corrected P ≤ 0.001. When these analyses were repeated among never smokers, the point estimates remained largely unchanged, except for a stronger positive association for cancers of the brain and CNS, no association for cancers of the lips, oral cavity, pharynx, and a stronger inverse association for non-cardia stomach cancer.

Fig. 3: Hazard ratios (HRs) for total cancer and 24 cancer subtypes per 1 SD increment in the second principal component (PC2; tall stature, low waist-to-hip ratio).

HRs with corresponding 95% confidence intervals (95% CIs) from Cox proportional hazards regressions in the total population (n = 340,152) and in never smokers (n = 160,111); n number of cancer incidence cases, CNS central nervous system, SCC squamous cell carcinomas.

Results for PC3 (tall stature; high WHR)

PC3 was positively associated with overall cancer risk, with an HR of 1.04 (95% CI = 1.03–1.05) per 1 SD increment (Fig. 4). Positive associations were observed for 12 of 24 different cancers, of which 8 also passed the Bonferroni-corrected P ≤ 0.001 (Supplementary Table 7). However, among never smokers, associations with five of these cancer types were substantially attenuated (larynx; oesophageal SCC; stomach cardia; lips, oral cavity, pharynx; and lung). An inverse association was found for cancer of the corpus uteri (P < 0.001), with a more pronounced inverse association among never smokers.

Fig. 4: Hazard ratios (HRs) for total cancer and 24 cancer subtypes per 1 SD increment in the third principal component (PC3; tall stature, high waist-to-hip ratio).

HRs with corresponding 95% confidence intervals (95% CIs) from Cox proportional hazards regressions in the total population (n = 340,152) and in never smokers (n = 160,111); n number of cancer incidence cases, CNS central nervous system, SCC squamous cell carcinomas.

Results for PC4 (high BMI and weight; low WC and HC)

There was no association between PC4 and overall cancer risk (HR = 1.00; 95% CI = 0.99–1.01) (Supplementary Fig. 8). A relatively robust positive association was observed with thyroid cancer risk (HR = 1.10; 95% CI = 1.00–1.21), which, however, did not pass the Bonferroni-corrected P ≤ 0.001.

Sensitivity analyses

After excluding the first 2 years of follow-up, the point estimates for PC1 remained largely unchanged, except for cervical cancer, for which the HR decreased (Supplementary Table 5). There was also little change for PC2, except for an even lower risk for laryngeal cancer (Supplementary Table 6). For PC3 and PC4, no sizeable changes in the associations were observed (Supplementary Tables 7 and 8).

In analyses stratified by age, most HRs were largely consistent across the two age groups (<52.3 vs. ≥52.3 years). Exceptions were as follows. For PC1, HRs for cancers of the pancreas and thyroid were stronger in the younger as compared to the older age group (Supplementary Table 5). For PC2, the HR for gallbladder cancer was stronger in the younger as compared to older age group, whereas HRs were less strong for cancers of the brain and CNS, breast (premenopausal), and thyroid (Supplementary Table 6). For PC3, HRs for oesophageal (adeno and SCC), laryngeal, stomach (cardia), and thyroid cancers were stronger in the younger as compared to the older age group (Supplementary Table 7). For PC4, a positive association was observed with liver cancer in the younger age group, while this association was inverse in the older age group (Supplementary Table 8).

In a further analysis, BMI (per 5 kg/m² increment = 1 SD) was positively associated with risk of cancers of the corpus uteri, oesophagus (adeno), kidney, thyroid, gallbladder, breast (postmenopausal), colon, pancreas, rectum, and multiple myeloma (Supplementary Fig. 9). Positive associations were also seen for tumours of the stomach (cardia) and ovary, but confidence intervals included the null. An inverse relationship was found for seven cancer types: oesophagus (SCC); cervix; lung; lips, oral cavity, pharynx; stomach (non-cardia); larynx; and malignant melanoma. After restricting the analyses to never smokers, BMI remained inversely associated only with malignant melanoma.


Discussion

PCs of six anthropometric traits that capture four distinct body shape phenotypes were differentially associated with the risk of overall cancer and 17 site-specific cancers. Some novel associations were identified, including a positive relation of PC3 with lung cancer, oesophageal SCC, and malignant melanoma. Furthermore, PC1 and/or PC3 were positively associated with hepatobiliary cancers, malignant melanoma, and total prostate cancer, while BMI was unrelated to those cancers (Supplementary Fig. 9). These findings suggest that the current cancer burden associated with adiposity and body size based on classical anthropometric traits is likely underestimated. Leveraging information from multiple anthropometric traits may better capture the heterogeneous expression of adiposity and its health consequences than BMI.

We showed that these body shape phenotypes are congruent with those of Ried et al. [16] and thus stable across study populations (Fig. 1) and between men and women (Supplementary Tables 2 and 3). Because the PCs combine information from several anthropometric traits, they may represent body shape phenotypes more holistically than single traits. However, their interpretation is less straightforward. To circumvent this difficulty, we provided the arithmetic means of each anthropometric trait among participants in the top and bottom 5% of each of the four body shape phenotypes, together with the population variation (1 SD) of these traits, and integrated this information in Supplementary Figs. 3–6 (example in men). For PC1, the difference in height between the top and bottom 5% was 7 cm, which corresponded to 1 SD for height in the study population. In contrast, the difference in BMI was 13.1 kg/m², which was equal to 3.6 SD (13.1 kg/m² divided by the population SD for BMI of 3.6 kg/m²); similarly pronounced differences were observed for weight, WC, and HC, but not WHR. This means that with increments in PC1, BMI, weight, WC, and HC increased by much more than height and WHR. With increments in PC2, we observed a 3.3 SD increment in height and a ~1 SD increment in weight and HC, while WHR, WC, and BMI decreased by ≤1 SD. With increments in PC3, we observed a 2 SD increment in height and ~1 SD increments in WC, WHR, and weight, while BMI and HC remained similar. PC3 could thus indicate a tall and centrally obese phenotype. With increments in PC4, BMI, weight, and WHR increased by ≥1 SD while height, WC, and HC remained similar.
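The contrast between extreme groups, expressed in SD units, can be sketched in Python as follows. The function name and its handling of group size are hypothetical; the final lines simply redo the BMI arithmetic quoted above (13.1 kg/m² over a population SD of 3.6 kg/m²).

```python
import statistics


def extreme_group_contrast(pc_scores, trait_values, fraction=0.05):
    """Mean trait difference between top and bottom PC-score groups, in SD units."""
    order = sorted(range(len(pc_scores)), key=lambda i: pc_scores[i])
    n_tail = max(1, round(len(order) * fraction))
    bottom = [trait_values[i] for i in order[:n_tail]]
    top = [trait_values[i] for i in order[-n_tail:]]
    difference = statistics.fmean(top) - statistics.fmean(bottom)
    return difference / statistics.stdev(trait_values)


# Arithmetic quoted in the text for BMI across PC1 (example in men).
bmi_difference = 13.1     # kg/m^2, top 5% minus bottom 5%
bmi_population_sd = 3.6   # kg/m^2
print(f"{bmi_difference / bmi_population_sd:.1f} SD")
```

Expressing each difference in SD units puts traits measured in centimetres, kilograms, and ratios on a common scale, which is what allows the comparison of height with BMI in the paragraph above.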

In a post hoc analysis, we calculated median loadings for the body shape phenotypes by smoking status. A pronounced difference was observed for PC3, where current smokers, as compared to never smokers, had higher loadings on PC3 indicating a propensity towards central adiposity for the same level of BMI (Supplementary Table 9). Tobacco smoking, and lifestyle behaviours in general, may play an important role in shaping these phenotypes. Differences in body composition, especially different proportions of muscle mass and visceral adipose tissue across body shape phenotypes, and how these are influenced by lifestyle factors should be investigated in future studies.

PC1 and cancer

The results for PC1, a body shape characterised by general obesity, confirm previous findings on the association between excess body fat and cancer risk and are also in line with previous studies that have considered BMI as a risk factor [2, 29]. We observed positive associations for all established obesity-related cancers. Inverse relationships for cancers of the prostate, larynx, and oesophagus (SCC) are also consistent with findings from a large Spanish cohort [29]. In contrast to previous studies, we found no association between BMI and liver cancer, whereas PC1 showed a strong positive association. The lack of association with BMI suggests that PC1 captures phenotype information beyond that provided by BMI.

PC2 and cancer

Our results for PC2 (tall with low WHR vs. short with high WHR) are not directly comparable with previous studies given the specificity of this body shape. There is nevertheless some overlap between our findings and the literature. There is strong evidence that adult attained height increases the risk of cancers of the premenopausal and postmenopausal breast, skin, colorectum, endometrium, prostate, ovaries, pancreas, kidney, and possibly liver [7,8,9,10,11,12]. This is congruent with our findings except for liver cancer, where we found a relatively robust inverse association, although this association (P = 0.007) did not pass the more stringent multiple-testing corrected P-value of 0.001. Height may represent a surrogate measure for cancer risk factors early in life [30]. Potential aetiologic mechanisms linking taller height to an increased cancer risk include more stem cells with an increased number of mutations during cell division, and insulin-like growth factor 1, which is a major determinant of height and organ size and of cancer risk [13, 14].

In addition to liver cancer, we also found robust inverse associations between PC2 and cancers of the rectum, stomach (non-cardia), and oesophagus (adeno). Non-cardia gastric cancer is caused mainly by Helicobacter pylori infections, which are associated with growth retardation in children [31]. Our finding of an inverse relation with non-cardia gastric cancer is consistent with the observation that this cancer is more likely to occur in individuals of short stature. A Mendelian randomisation analysis of adult height found an inverse association with oesophageal cancer but a weak positive association with liver cancer [32].

PC3 and cancer

PC3 (tall height; high WHR) represents abdominal obesity in combination with height. Abdominal obesity poses a risk for oesophageal adenocarcinoma [33], consistent with our findings. Mechanistically, this could be explained by gastroesophageal reflux disease (GERD) predisposing to Barrett’s oesophagus [34]. Robust positive associations were also observed for cancers of the thyroid, kidney, pancreas, colon, and prostate, for which abdominal adiposity and attained height have been implicated as risk factors [7]. A positive association was observed between PC3 and malignant melanoma, indicating attained height as a key anthropometric risk factor. This is supported by a similarly strong positive association between PC2 (tall height, low WHR) and malignant melanoma.

PC3 also showed strong positive associations with smoking-related cancers, including cancers of the larynx, oesophagus (SCC), oral cavity and pharynx, and lung. Notably, among never smokers these associations were completely attenuated for laryngeal cancer and for oral and pharyngeal cancers, suggesting residual confounding by smoking. Associations with the risk of oesophageal SCC and lung cancer were also attenuated among never smokers but remained positive, albeit imprecisely estimated. There is some evidence that height is positively associated with the risk of oesophageal SCC and lung cancer [9]. However, further studies are required to corroborate these findings. The observed inverse association between PC3 and corpus uteri cancer is striking. One hypothesis is that HC tracks with gluteofemoral fat accumulation, which is associated with a more favourable adipokine profile and increased lipoprotein lipase activity [35], profiles that have been linked to lower endometrial cancer risk [36].

PC4 and cancer

PC4 explained only a small proportion (3%) of the overall variation in anthropometric traits. Nevertheless, it may represent a rare phenotype of potential relevance in the aetiology of certain cancers. Ried et al. identified two genetic loci for PC4 that have not previously been captured by single-trait anthropometric GWAS [16]. One of these two single-nucleotide polymorphisms has previously been associated with increased levels of circulating adiponectin [16], which has been implicated in cancer development [37]. In our analysis, we found a relatively robust positive association with thyroid cancer (Supplementary Fig. 8 and Supplementary Table 8). Whether altered levels of adiponectin play a functional role in thyroid cancer in addition to or independently of excess weight is currently unclear [38].

Strengths and limitations

The primary strength of our study is its novel results on body shape phenotypes in relation to cancer incidence. Further assets are the large number of cases, extensive follow-up time, and inclusion of participants from different European countries. In addition, we conducted various informative sub-analyses to rule out the influence of residual confounding and reverse causality. Limitations are potential selection bias, with health-conscious individuals being over-represented, and the restriction of the study to Caucasian ethnicities. Furthermore, a one-time measure of body shape assumes that participants do not change their exposure profile during follow-up, which is a strong assumption. However, at least for BMI, we show in as yet unpublished work that baseline BMI compared to cumulative BMI yielded comparable risk estimates across 26 cancer types (Recalde et al., Nat Commun, provisionally accepted). We also note that there was no interaction between the body shape phenotypes and the time scale associated with cancer risk in our analysis. Taken together, baseline body shape phenotypes very likely provide a good approximation of long-term exposure.


Conclusion

In this multi-national study, distinct body shape phenotypes were positively associated with risks of 17 different cancers. Several novel relationships were identified that had remained undetected in previous studies using classical anthropometric traits. Derived body shapes may reveal underlying biological pathways, thereby providing new insights into cancer development. Such knowledge could help inform cancer prevention strategies.


Where authors are identified as personnel of the International Agency for Research on Cancer/World Health Organisation, the authors alone are responsible for the views expressed in this article and they do not necessarily represent the decisions, policy or views of the International Agency for Research on Cancer/World Health Organisation.