Introduction

The prevalence of cardiometabolic risk factors (CMRFs) varies geographically1,2. Previous research has reported higher prevalence of CMRFs in certain localities: typically in areas of higher socioeconomic disadvantage3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23. However, frequently these studies have been based on measures of association or geographical variance rather than reporting clustering or the share of the total variance that is at the area-level3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23. Quantifying the clustering and the proportion of geographic variation in CMRFs contributed by area-level socioeconomic disadvantage can aid in designing appropriate area-level approaches to help prevent CMRFs. Chronic and uncontrolled CMRFs predispose individuals to the development of cardiovascular disease (CVD), which continues to be the leading cause of health care expenditure and premature mortality worldwide24.

In Australia, a social gradient is observed in the prevalence of many chronic conditions including various CMRFs (e.g.diabetes and chronic kidney disease)25. Generally, Australians enjoy better health than people in many other countries in the world. However, within Australia this better health is not equally distributed26. It is often-reported that socioeconomically disadvantaged individuals in Australia, on average, experience a greater disease burden than their less disadvantaged counterparts25,26,27,28. This tendency is also evident at a contextual level when studies have investigated association of CMRFs with area-level socioeconomic disadvantage in Australia 5,7 and globally 4,9,10,12,14,16,17,18,20,23,29,30,31,32,33.

Consistent with this, men from highly urbanised environments have been reported to have higher incidence of coronary heart disease with increasing residential area socioeconomic disadvantage, after adjusting for individual characteristics18. Also, lower area-level disadvantage has been reported as being associated with lower prevalence of some behavioural cardiac risk factors such as smoking, physical inactivity and obesity etc. in some studies9,10,34. Most of the reported associations of CMRFs with area-level socioeconomic disadvantage were independent of individual-level characteristics such as age and educational attainment. Even though the area-level associations of CMRFs were significant in these studies, the results were often dependent on the CMRF analysed, the measures of area-level socioeconomic disadvantage and the geographic scale at which associations were examined35.

Multilevel analyses of CMRFs based on the average measures of association or variation alone are insufficient to report the geographical variance as similar associations were possible with very different scenarios of area variance36. Multilevel findings extending on the general contextual effects and reporting the proportion of the total area-level variance along with the measures of clustering and the average measures of association or variation are appropriate and informative in reporting area-level influences, but less common23,36,37,38. To differentiate the relative importance of individual versus area-level interventions for the prevention and control of CMRFs, the geographical component of the total individual risk variance has to be identified in a multilevel approach.

Therefore, the aims of this study are to (1) quantify the general contextual or geographic effect of areas on CMRFs, over and above their individual-level compositions; and to (2) quantify the geographic variation across multiple CMRFs specifically explained by area-level socioeconomic disadvantage, within the Illawarra-Shoalhaven region of NSW Australia. Quantification of the general contextual effect and the variation specifically explained by area-level socioeconomic disadvantage will assist our understanding of the socioeconomic context of CMRFs in the study region and provide guidance for health service commissioning more generally nationally.

Methods

A cross-sectional multilevel design was adopted to account for the hierarchical nature of the data and analyses. No informed consent were obtained for the individual-level data used in this study, as the study used existing data which were already de-identified. The study was approved by the University of Wollongong and Illawarra and Shoalhaven Local Health District Health and Medical Human Research Ethics Committee (HREC protocol No: 2017/124). All the methods and analyses were performed meeting the relevent ethical guidelines and regulations of the committee.

Study area and data

The study was conducted in the Illawarra-Shoalhaven region of the New South Wales (NSW) state in Australia. The Illawarra-Shoalhaven region is a coastal plain along the south-east border of NSW; situates at the immediate south of the metropolitan boundaries of Sydney; and encompasses multiple regional cities, towns and rural areas. This region covers a land area of 5,615 km2, and had an estimated residential population of 369,469 at the time of the 2011 Australian Census of Population and Housing conducted by the Australian Bureau of Statistics (ABS)39. Statistical Area level 1 (SA1), the smallest geographical unit of the 2011 census data release, was the area-level unit of analysis in this study39. SA1s typically have a population size of 200 to 800 persons (average 400), and the Illawarra-Shoalhaven region covers a total of 980 conterminous SA1s39.

The CMRF test data in this study were extracted from the Southern IML Research (SIMLR) Study database, which is comprised of de-identified and internally linked pathology results from a major network of pathology services in the study region. The individual-level data in SIMLR database are geocoded to their corresponding SA1 areas, but not to their residential address, for privacy and confidentiality concerns. More details on this data source, procurement and access are published elsewhere7. The CMRF test data were extracted for non-pregnant individuals aged 18 years or older presenting for testing between 01 January 2012 and December 2017. Only the most recent test result was included if an individual had undergone the same test multiple times in this data period. Test data with missing details on the individual and area-level factors analysed in this study were excluded from the analyses.

Variables

Outcome variable

Results of the CMRF tests were the individual-level outcome variables. Data on the seven CMRF tests analysed in this study included: fasting blood sugar level (FBSL); glycated haemoglobin (HbA1c); total cholesterol (TC); high density lipoprotein (HDL); urinary albumin creatinine ratio (ACR); estimated glomerular filtration rate (eGFR); and objectively-measured body mass index (BMI). These CMRF test results were dichotomised into higher risk and lower risk values based on the current national and international guidelines on risk definitions (Table 1).

Table 1 Definitions of higher risk CMRFs test results.

Study variable

The 2011 ABS census based Index of Relative Socioeconomic Disadvantage (IRSD) of the SA1s was the study variable. IRSD summarises a range of measures of relative socioeconomic disadvantage of people and households within SA1s and includes: level of income; education; employment; family structure; disability; housing; transportation; and internet connection45. This study uses IRSD reported as quintiles; the lowest quintile (Q1) indicating the most disadvantaged SA1s and the highest quintile (Q5) the least disadvantaged SA1s45. The IRSD quintiles in the study were derived by ABS from the distribution of IRSD scores for the Illawarra-Shoalhaven region based on the 2011 census. The study region has a diverse IRSD profile with representation across IRSD scores in comparison with Australia as a whole, making the region useful for population-level studies46.

Covariates

Analyses were adjusted for sex (male and female) and age group (18–29 years, 30–39 years, 40–49 years, 50–59 years, 60–69 years, 70–79 years, 80+ years) of each individual at the time of the pathology collection of the CMRFs tests analysed in his study.

Statistical analyses

Initially, descriptive statistics of all individual and area-level variables were performed. Thereafter, single level and multilevel logistic regression models were fitted for the CMRF test data of individuals (Level 1), nested within SA1s (Level 2). For each of the seven CMRFs analysed in this study, a hierarchy of four multilevel models at SA1 level were fit that included fixed effects for age, sex and IRSD and random effect (intercept) for SA1. Model 0 was a single level model adjusted for age and sex; Model 1 (M1) was null model at level 2; Model 2 (M2) adjusted for age and sex at level 2; Model 3 (M3) adjusted for the area-level study variable (IRSD) only at level 2; and the final model Model 4 (M4) included both M2 and M3 covariates (age, sex and IRSD) at level 2. The estimated regression coefficients of the derived models were exponentiated to calculate odds ratios (ORs).The goodness of fit of the models were identified using Likelihood Ratio Tests (LRT) at p < 0.05 level of significance. The general equation of the fully adjusted model is:

$${\text{y}}_{{{\text{ij}} }} \sim {\text{Binomial }}\left( {{1}, \, \pi_{{{\text{ij}}}} } \right)$$
(1)
$${\text{logit}}\left( {\pi_{{{\text{ij}}}} } \right) = \beta_{0} + \beta_{{1}} {\text{Age}}_{{{\text{ij}}}} + \beta_{{2}} {\text{Sex}}_{{{\text{ij}} }} + \beta_{{3}} {\text{IRSD}}_{{{\text{ij}}}} {\text{ } + \text{ u}}_{{\text{j}}}$$
(2)
$${\text{u}}_{{{\text{j}} }} \sim {\text{N}}\left( {0, \, \tau^{{2}}_{{\text{u}}} } \right)$$
(3)

where yij denote the binary response of CMRF test outcome (as ‘higher risk’ or ‘lower risk’, based on the adopted definitions) for individual i in the area (SA1) j; πij denotes the probability that individual i in area (SA1) j has a ‘higher risk’ CMRF test outcome given their individual-level ageij and sexij; and their area-level IRSD index. The β1, β2, β3 are the regression coefficients which measure the associations between the log-odds of the CMRF outcome and each covariate all else equal, and when exponentiated these are translated to ORs36. uj is the random effect for the area (SA1) j and τ2u is the area level variance, which has to be estimated.

Model comparison

The Akaike Information Criterion (AIC) was used to evaluate model fit. The derived multilevel models were compared for: area-level variance (\({\tau }^{2})\) at SA1 (level 2) level; proportional change in variance (PCV); Intra-cluster Correlation Coefficients (ICC); Median Odds Ratios (MORs); area under the receiver operating characteristic (AUC) curve; and the change in AUC.

The \({\tau }^{2}s\) of the multilevel models were initially identified from each models. PCVs were calculated for models M2s to M4s relative to M1s. The ICCs of the fitted models were calculated using the latent variable approach47. This approach assumes that a latent continuous outcome underlies the observed dichotomous outcomes and it is this latent outcome for which the ICC is calculated and interpreted. The ICC measured the expected correlation in CMRF outcomes between two individuals from the same SA1. The higher the ICC, the more relevant area-level context is for understanding individual latent outcome variation36. The MOR is calculated as an alternative way of interpreting the magnitude of area-level variance. The MOR translated the area-level variance which were estimated on the log-odds scale to the commonly used OR scale. The MOR result value is interpreted as the median increased odds of identifying the outcome if an individual move to another SA1 with higher risk. Thus, the higher the MOR the greater the general area-level effect and it will equal to 1 in the absence of area-level variance36. The general contextual effect of the geographic areas over and above their individual-level composition of the higher risk CMRFs, is obtained through the measure of clustering (ICC) in M2s. The geographic variance and ICC in the null models (M1s) of higher risk CMRFs may depend on both the contextual and individual-level variables. Therefore, M2s of the higher risk CMRFs which adjusted for individual-level attributes is better to provide information on the ‘general contextual effect’ of the areas. The unique contribution of the area-level study variable (IRSD) to the area-level variance of higher risk CMRFs were assessed through the PCVs between M2s and M4s.

The receiver operating characteristic (ROC) curves are created by plotting the true positive rates (TPR) i.e. sensitivity, against the false positive rates (FPR), i.e. 1 specificity for different binary classification thresholds of the predicted probabilities in all the models48. Post-estimation, predicted probabilities (πij) are calculated for each individual and are used to calculate the AUC for the model. The AUCs of the models measure the capacity of the models to correctly classify individuals with or without the outcome of a higher risk CMRFs analysed in this study, as a function of their predicted probabilities36. The AUC values range from 1 and 0.5, where 1 is the perfect predictive discrimination and 0.5 have no predictive power49. The AUCs also indicate the general contextual effects and can be compared it to the ICC and the MOR values36. The added value of knowing an individual’s area of residence besides individual-level information (age and sex) can be obtained through the AUC change in Model 2 in reference to Model 0, where a higher AUC change would indicate higher relevance of areas in relation to CMRFs.

Statistical package

All analyses were performed using R version 3.4.4. (R Foundation for Statistical Computing, Vienna, Austria)50. Multi-level models were fit using the glmer function in the lme4 package51,likelihood ratio tests were calculated using the lrtest function in the lmtest package52,and ROC curves using the roc function in the pROC package53.

Results

A total of 1,132,029 CMRFs test data which belong to 256,525 individuals were extracted for the analyses. Figure 1 provides a flow chart of the individual tests in CMRF test data. The mean number of tests per person was 4.4. After removing 1,162 (1.0%) test results data with missing details, a total of 1,130,894 tests were included in the analytic data set.

Figure 1
figure 1

Flow chart of the included/excluded tests in the CMRFs test data.

Table 2 provides details of the missing data and test data distribution of each CMRF tests. Most frequently missing data were the IRSD indices from SA1s in the study area for which an IRSD index was not available from ABS 2011 census either due to low populations or poor data quality54.

Table 2 Table of excluded test data which had missing details.

Tables 3 and 4 shows the frequencies and relative frequencies of CMRF tests results. Overall, the higher risk frequencies of all CMRFs increased with increasing area-level socioeconomic disadvantage, except for TC which demonstrated an inverse trend.

Table 3 Cross-tabulation of individual CMRFs (FBSL, HbA1c, TC and HDL) with the variables in study.
Table 4 Cross-tabulation of individual CMRFs (ACR, eGFR, and Obesity) with the variables in study.
Table 5 Single and multilevel logistic regression model summaries for high FBSL (FBSL ≥ 7.0 mmol/L).
Table 6 Single and multilevel logistic regression model summaries for high HbA1c (HbA1c > 7.5%).
Table 7 Single and multilevel logistic regression model summaries for high TC (TC ≥ 5.5 mmol/L).
Table 8 Single and multilevel logistic regression model summaries for low HDL (HDL < 1 mmol/L).
Table 9 Single and multilevel logistic regression model summaries for high ACR (ACR ≥ 30 mcg/L to mg/L).
Table 10 Single and multilevel logistic regression model summaries for low eGFR (eGFR < 60 mL/min/1.73 m2).
Table 11 Single and multilevel logistic regression model summaries for obesity (BMI ≥ 30 kg/m2).

Single and multilevel models for each of the CMRFs analysed in this study are presented in Tables 5, 6, 7, 8, 9, 10 and 11. After adjusting for the covariates, all seven CMRFs were found to be significantly associated with area-level IRSD in the study region. For all but one variable the associations were positive (i.e. increased with area-level disadvantage). TC was the exception; being inversely associated with area-level disadvantage, with the most disadvantaged quintile (Q1) displaying the lowest odds for higher risk test results. Among the covariates, there was no significant association between gender and higher risk test results of eGFR or BMI. It was also noted that the odds of higher risk eGFR tests results accelerated with increasing age group, and the 80+ age group demonstrated a very high odds of being identified with a higher risk eGFR tests result in the study region.

The overall comparisons of model random effects are presented in Table 12. Reductions in the AIC values were observed among all CMRFs from the null model (M1) to the final model (M4) indicating a better fit for the final models. In the unadjusted null models, higher risk test results of eGFR demonstrated the most area-level variance (0.189) and TC the least (0.026). Adjusting the CMRFs for age and sex initially increased the τ2 of M2 for FBSL (PCV =  + 1.88%), HbA1c (PCV =  + 3.02%), HDL (PCV =  + 15.25%) and BMI (PCV =  + 1.48%). The τ2 was reduced in the final model among all CMRFs compared with the null models.

Table 12 Summary model fit values and comparison of the multilevel models.

The Akaike Information Criterion (AIC) was used to evaluate model fit. The derived multilevel models were compared for: area-level variance (\({\tau }^{2})\) at SA1 (level 2) level; proportional change in variance (PCV); Intra-cluster Correlation Coefficients (ICC); Median Odds Ratios (MORs); Area under the receiver operating characteristic (AUC) curve; and the change in AUC.

The ICCs of the unadjusted models ranged between 0.8% in high TC to 5.4% in low eGFR. Inclusion of IRSD after adjusting for age and sex had reduced the ICCs of all CMRFs in the final models, which ranged between 0.4% in low eGFR to 2.0% in obesity test results. The ICCs of the final models were low and suggest very limited area-level contextual effects. The AUC changes in model 2 and MORs of the final model support these findings.

Figure 2 provides a comparison of the ROC curves of the fitted models. Model 4 s (age + sex + IRSD adjusted models) and models 3 s (IRSD adjusted models) were chosen for the ROC curve plotting for comparative purpose. The predicted outcomes in the CMRFs plots are for the reference individual, i.e., individuals residing in the least disadvantaged areas (model 3) + female + age group 18–29 years (Model 4). A model curve closer to the top left corner of the subfigures indicate a better predictive accuracy of the model. The single measure summary of the ROC curves, AUCs of the final models ranged 0.62–0.88. The highest AUC value was observed for the final model of low eGFR. The AUC changes of model 2 s in relation to M0s ranged 0.01–0.08, which reconfirm the contextual findings of ICCs that the general contextual effects observed in the models were minimal.

Figure 2
figure 2

ROC curves of the fitted models (Model 3s and Model 4s) of CMRFs for comparison: (a) FBSL models; (b) HbA1c models; (c) TC models; (d) HDL models; (e) ACR models; (f) FBSL models; (g) obesity models; Model 3s—CMRFs adjusted only for IRSD quintiles of areas; Model 4s—Final models of CMRFs adjusted for age + sex + IRSD quintiles of area.

The proportions of the geographic variance in CMRFs contributed by IRSD were estimated through the PCV between M2 and M4. Adjusting the models for IRSD and individual-level variables explained a maximum 92.79% of the variance expressed by the null model of eGFR, reducing the ICC from 5.4 to 0.4%. The changes were least among the adjusted models of TC, with a marginal reduction of ICC from 0.8% to 0.5%. Thus, in the final models, the proportional reduction in variance was the largest for eGFR (PCV = 92.79%) and the least for TC (PCV = 33.27%).

The identified specific contribution of IRSD in the geographic variation of CMRF was the highest among the geographic variance of higher risk findings of HDL tests (57.8%), which was closely followed by FBSL (57.14%); HbA1c (53.31%); and ACR (51.17%) test results. The contribution of IRSD was comparatively lower among the geographic variance of the higher risk findings of eGFR (41.75%); BMI (41.06%); and TC (14.71%) test results, though not the least. Even though these specific proportions are large, it should be noted that it actually explained a lot of very little (i.e., variance of 0.01–0.07).

Discussion

The study reports on the influence of areas on higher risk CMRF distribution and quantifies the specific proportion of geographic variance explained by IRSD. The work adds to the very few studies which consider multiple CMRF variables within the same region, or which are based on population derived data over extended years16,17,20,29,31,32,and reports on both single and multilevel analyses38,55. The results present both the measures of association and area-level variance based on multilevel logistic regression analyses36. The findings of the study add to the existing evidence and discussion regarding the relevance of individual versus area-level interventions for the prevention and control of CMRFs.

We found consistent evidence for the association between area-level disadvantage and seven CMRFs among adult health service using residents of the Illawarra-Shoalhaven region in NSW Australia. In adjusted models, the odds of a higher risk finding increased with increase in area-level disadvantage among all CMRFs excepting TC, which showed an inverse pattern of association with increase in area-level disadvantage. Thus, in the final models we observed that, over and above individual age and sex, living in a disadvantaged neighbourhood proportionally increased the individual-level probability of being identified with a higher risk CMRF. The findings highlight the importance of including of area-level variables into health risk analyses.

The ICCs of CMRFs in all the models were comparatively small (Table 12) in all the models. In the fully adjusted models, the ICCs were further reduced and ranged between 0.4% and 2.0% in low eGFR and BMI respectively. As per the interpretation framework proposed by Merlo et al., an ICC value less than 10% is indicative of very little geographic difference56. The AUC change of the model 2 s in relation to the single level models (range 0.01–0.08) reconfirm on these findings. However, this has to be interpreted along with the traditional geographic comparisons such as the proportion of the individuals who are affected with higher risk CMRF outcomes. Therefore, a small geographic difference with uniformly higher, medium, or lower proportion of affected individuals indicates homogeneity of the higher risk CMRF findings within their geographic units56. Such a situation would call for balanced universal approaches to prevent and control the higher risk CMRFs, with a proportional focus to the need and disadvantage level of affected populations57,58. However, it is also worth noting that when the exposure to an agent is homogenic in a community, the traditional epidemiological methods are not very helpful in identifying their markers of susceptibility59.

Our results confirm, and are comparable with, associations between area-level disadvantage and CMRFs reported in previous studies3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23, and extends their findings. The results primarily confirm the geographic variation of CMRFs and associations with area level disadvantage, as reported in previous studies. Further, the study provides means to compare this association which were observed consistently with a range of multiple CMRFs analysed in this study. The study extends on previous reports by differentiating the individual and area-level contributors to the exhibited geographic variance of CMRFs. And most importantly, the general contextual effect and the specific contributions of IRSD on the geographic variance of multiple CMRFs were identified, which is unique in the literature and highly informative for health care service commissioning.

The TC test results often stood apart from the major findings of this study, demonstrating inverse associations with IRSD. However, this was not reflected in the HDL findings, even though both are components of the lipid profile in an individual. This raises the possibility of a medication effect on TC in these areas, where the lipid lowering drugs have a less consistent effect in raising HDL than in lowering TC60. Other factors associated with the higher risk HDL test results may include uncontrolled diabetes61, smoking62, sedentary life style63,64, obesity65, and poor diet quality66,67. However the reason for the inverse association demonstrated by TC test results are not clearly established within the current study results and requires further research to explore possible individual and area-level contributions.

The study has to be considered within its limitations. Primarily, the cross sectional nature of analyses adopted in this study do not yield support for any causal relationships. In addition, the non-linear and time varying effects of covariates analysed in this study restrict generalisability of their findings though very informative for regional health care service commissioning. Secondly, the IRSD quintiles included as the key explanatory variable represent relative disadvantage in an area and have limitations intrinsic to aggregate measures. Thirdly, it should be noted that the data used in this study are extracted from people already utilising the health care service facilities in the area. Fourthly , the readers should be mindful that the variance reported in this study are attributable to (1) individual level factors (age, sex) analysed at the area-level, (2) area-level contextual influences (IRSD), and (3) other individual and area-level characteristics not considered in this study. However, further individual-level data extractions or collections are not possible with this study’s dataset as the de-identification process precludes the inclusion of any further individual level data. Other individual and area-level factors not considered in this study could include: individual-level SES68, type of neighbourhood food outlets6972, poor physical activity resources73,74, residential density and service availability75. Finally, the assumptions of the standard multilevel logistic regression modelling methods adopted in this study would not be able to account for the autocorrelation of the area-level residuals (if any) of the models. Expected shortcomings due to this could be an overestimation of random effects in our models76. However, any such effects were observed to be very marginal in our results as the random effect estimates are already at their lower limits. While acknowledging this limitation, we believe the effects of this are not critical in our results. Hybrid models which provide more precise estimates of random effects are becoming increasingly available with advances in computational technologies77. However, they would not be directly applicable to our data sets, mainly due to the non-availability of location specific data at individual-level in our study data.

Notwithstanding these limitations, the study is unique in that it analysed a range of CMRFs across a widely dispersed population and included both rural and urban residents. In addition, the study used six years (year 2012–2017) of CMRF tests data from the region in the hierarchical multilevel analyses. The findings of the study indicate that those residing in the most disadvantaged areas are more likely to be identified with higher risk CMRFs than those in lower disadvantage areas. However, the low ICC, AUC change and MOR values of the area-level models do not support for contextual approaches. Rather, the findings of the study support a proportionate universalism approach in which health resources are made universally available but proportional to the need and disadvantage level of the affected population57,58.

Conclusion

The study demonstrates that in the Illawarra Shoalhaven region of Australia, people residing in socioeconomically disadvantaged areas have a higher probability of being identified with higher risk CMRFs across a range of factors. The low general contextual effects of the areas suggest for universal intervention for the prevention and control of CMRFs in this study region, but proportional to the need and disadvantage level. The patterns were consistent across the six CMRFs analysed in this study; and comparable with similar studies reported nationally and globally. Based on our findings, we recommend further area-level research to discern the role of other contextual factors not analysed in this study especially the area-level access to health care services to determine its existing role and adequacy78, and evidence based universal interventions for the prevention and control of CMRFs but proportionate to the priority level of the populations based on area-level disadvantage.