Introduction

Visceral fat plays an important role in the development of metabolic complications such as diabetes and cardiovascular disease independently of the effect of overall abdominal fat.1, 2, 3, 4 Accurate measurement of visceral fat is necessary in order to assess its contribution to the pathophysiology of metabolic disease. In epidemiological studies, anthropometric measures such as waist circumference and waist-to-hip ratio are often used to estimate visceral fat, but they do not capture how abdominal fat is distributed as visceral fat and subcutaneous fat, which is also reported to be associated with metabolic risk.5

Magnetic resonance imaging (MRI) and computed tomography (CT) are the reference methods for estimating visceral and subcutaneous fat quantities and distribution. However, the use of these methods in large-scale studies is often limited by their costs, accessibility and, in the case of CT, constraints due to radiation exposure. Abdominal ultrasound can be used to obtain an indirect one-dimensional estimate of the visceral and subcutaneous fat components by measurement of the thickness of the subcutaneous fat layer and the ventro-dorsal diameter of the abdominal cavity. This method has been validated against MRI and CT as a way of estimating the visceral and subcutaneous fat distribution in large-scale studies, where MRI and CT may not be feasible.6, 7, 8 These ultrasound measurements have been proposed as a component of risk assessment for metabolic disease,7, 8, 9, 10 and the method is now in use.11, 12, 13 Reproducibility of these ultrasound measurements in the assessment of abdominal fat distribution is reported to be good, with measurement errors of less than 3%.6, 7 However, several aspects of the reproducibility of ultrasound have not been described in a real-life setting, nor has reproducibility been described in a population at high risk of diabetes.

We aimed to extend the validation of abdominal fat distribution measurement with ultrasonography according to the method developed by Stolk et al.,5, 6, 7 by investigating three sources of variability often encountered in epidemiological studies. We aimed (i) to investigate intra- and interobserver variation of visceral and subcutaneous fat assessments obtained using ultrasound in individuals at high risk of diabetes; (ii) to assess short-term variation in the visceral and subcutaneous measurements; and (iii) to establish the reproducibility of the measurements performed in the fasting and non-fasting state, in order to investigate whether the method is suitable for studies under conditions where it is not possible for participants to fast. An understanding of these aspects of variation is important if the assessment of abdominal fat distribution using ultrasound is to be used more widely in epidemiological research and is to be considered as an individual risk stratification tool in clinical practice.

Materials and methods

Study population

Following participation in a stepwise diabetes screening program in Danish general practice (2001–2006),14 a subsample of individuals identified at high risk were invited to attend for a follow-up health assessment as part of the ADDITION-PRO study between 2009 and 2011.15 The ADDITION-PRO health assessment included detailed measurements of anthropometry, biochemistry, central hemodynamics and physical activity, alongside completion of validated questionnaires. On random days during the conduct of the ADDITION-PRO study, all participants were invited to participate in one of three reproducibility substudies, and the first to accept were enrolled. Participants with known previous extensive abdominal surgery were not asked. Eighty-six ADDITION-PRO participants were studied. Intra- and interobserver variation in ultrasound measurement was assessed in a convenience sample of 30 participants (substudy 1), short-term variation in ultrasound measurement in 33 participants (substudy 2) and fasting versus non-fasting variation in 23 participants (substudy 3).

The ADDITION-PRO study was approved by the scientific ethics committee of Central Denmark Region and performed in accordance with the Declaration of Helsinki. All participants gave written informed consent.

Ultrasonography

A Logiq 9 machine (GE Healthcare, Waukesha, WI, USA) was used to estimate abdominal fat distribution. Visceral fat was assessed with a 4C (1.5–4.5 MHz) abdominal convex transducer and subcutaneous fat with a 9L (2.5–8.0 MHz) small parts linear transducer. The measurements were performed according to a strict protocol6 by trained sonographers: two medical doctors and one nurse. No bowel preparation was performed before the measurements. First, the waist was identified as the midpoint between the iliac crest and the lower rib with the participant standing. Next, with the participant lying down, the transducer was placed on the abdomen where the xiphoid line crosses the waist line. Measurements were performed at the end of a quiet expiration using minimal pressure on the transducer. Subcutaneous fat was recorded as the vertical distance from the skin to the linea alba, with the transducer in the transverse position. Visceral fat was recorded at the same location, with the transducer placed longitudinally, as the vertical distance from the peritoneum to the front edge of the vertebra. Both distances were assessed twice with the participant lying in the same position. We defined the average of these two assessments as a single measurement in all three parts of this study. Distances were derived from electronic calipers placed on still ultrasound images by the sonographers without seeing the resulting distances.

Intra- and interobserver variation

Both visceral and subcutaneous fat measurements were performed four times on each participant by two trained sonographers, twice by each sonographer. Between each set of subcutaneous and visceral measurements the participants were asked to stand up (Figure 1a), thus allowing the abdominal content to move between measurements.

Figure 1

(a) Examination program for the assessment of intra- and inter-observer variation. (b) Examination program for the assessment of short-term variation. (c) Examination program for the assessment of variation between the fasting and non-fasting state. *VAT, visceral adipose tissue; SAT, subcutaneous adipose tissue.

Short-term variation

A single measurement was performed on the same person on two different days (Figure 1b). Three sonographers performed the measurements. Owing to practical constraints, measurements on a given participant were typically not performed by the same sonographer on the two days.

Fasting versus non-fasting ultrasound measurements

Ultrasound measurements estimating subcutaneous and visceral fat were performed with the participants fasting and were repeated 1 and 2 h after the intake of a standard breakfast. At each of the three time points, a measurement was performed (Figure 1c). All measurements were completed by the same sonographer. After the fasting measurements, the participants were given 15 min to eat. They were asked to eat and drink as much as possible from a meal consisting of three sandwiches, a pear, and tea, coffee, water and juice. The sandwiches consisted of 14% protein, 34% fat and 52% carbohydrate, reflecting an average Danish diet. Amounts consumed were recorded.

Statistics

For all three substudies, Bland–Altman plots were drawn.16 For the calculation of intra- and interobserver variation, the sources of variation were derived from a variance components model with fixed effects of the participant and sonographer and random interaction between the sonographer and participant.17 Formally, the model for a measurement by observer o on participant p, replicate r is:

$$y_{opr} = \mu_p + \alpha_o + c_{op} + e_{opr}, \qquad c_{op} \sim N(0, \tau^2), \qquad e_{opr} \sim N(0, \sigma_o^2)$$

Here $\mu_p$ is the ‘true’ value of the measurement, $\alpha_o$ is the bias from observer o, so $\alpha_1 - \alpha_2$ represents the mean bias between observer 1 and 2; $\tau$ is the s.d. of the between-observer variation and $\sigma_o$ is the within observer s.d. for observer o. The multiplier for the corresponding prediction limits was computed as $\exp\left(2\sqrt{2\tau^2 + \sigma_1^2 + \sigma_2^2}\right)$ from the following:

$$\alpha_1 - \alpha_2 \pm 2\sqrt{2\tau^2 + \sigma_1^2 + \sigma_2^2} \quad \text{(ref 17)}$$

For the analysis of short-term variation and of variation before and after a meal, we computed limits of agreement, as no replicate measurements were available at each time point. The Bland–Altman plots were drawn16 to examine the underlying assumptions of constant bias and variation across the range of measurements. We also regressed the differences of the ultrasound measurements on the means and the absolute residuals from this regression on the means in order to perform a more formal check of the assumptions.
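The formal check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MethComp code: the function names `least_squares_line` and `check_agreement_assumptions` are hypothetical, and the sketch returns only the two slopes, whereas a formal test would also need their standard errors.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def check_agreement_assumptions(first, second):
    """Regress paired differences on paired means, then absolute residuals on means.

    Returns (bias_slope, spread_slope). Slopes near zero are consistent with
    constant bias and constant variation across the measurement range.
    """
    means = [(a + b) / 2 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    bias_slope, bias_intercept = least_squares_line(means, diffs)
    abs_residuals = [
        abs(d - (bias_intercept + bias_slope * m)) for m, d in zip(means, diffs)
    ]
    spread_slope, _ = least_squares_line(means, abs_residuals)
    return bias_slope, spread_slope
```

A clearly non-zero first slope suggests that the bias depends on the size of the measurement; a clearly non-zero second slope suggests that the variation does.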

All measurements were log-transformed, making s.d.s interpretable as coefficients of variation (CV),18 allowing us to compare the relative precision of measurements estimating subcutaneous and visceral fat. Limits of agreement are therefore reported as a relative measure, that is, the mean ratio of the two measurements (different sonographers, different days and different meal status) multiplied/divided by an error factor to give the prediction limits for the ratio. By the same token, the Bland–Altman plots for the differences between log-transformed values are shown using a log-scale for the ratios.
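As an illustration of how limits of agreement on the log scale become a mean ratio and an error factor, the following Python sketch may help. It is not the authors' MethComp code. The function name `ratio_limits_of_agreement` is hypothetical, and the factor of 2 s.d. is an assumption.

```python
import math
from statistics import mean, stdev


def ratio_limits_of_agreement(first, second, z=2.0):
    """Limits of agreement for paired positive measurements, on the ratio scale.

    Differences are taken between log-transformed values, so the mean
    difference back-transforms to a mean ratio and z * s.d. back-transforms
    to an error factor that multiplies/divides the mean ratio.
    """
    log_diffs = [math.log(a) - math.log(b) for a, b in zip(first, second)]
    mean_ratio = math.exp(mean(log_diffs))
    error_factor = math.exp(z * stdev(log_diffs))
    lower = mean_ratio / error_factor
    upper = mean_ratio * error_factor
    return mean_ratio, error_factor, lower, upper
```

With this convention a mean ratio of 1 means no systematic difference between the two measurement conditions. An error factor of 1.2, for example, means that the ratio for a new pair of measurements is expected to fall between the mean ratio divided by 1.2 and the mean ratio multiplied by 1.2.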

All calculations including the fitting of the variance components model were performed using the MethComp package19 in the statistical software R.20 A full documentation of the statistical analysis can be obtained from the authors. This material is now available online at http://bendixcarstensen.com/SDC/Ultrason/repro.pdf.

Results

Study samples

Table 1 shows selected characteristics of the study samples. The mean age of all participants was 67.7 years (s.d. 6.0) and mean body mass index was 26.9 kg/m² (s.d. 4.3). One participant was African, the others were Caucasian.

Table 1 Characteristics of the study samples

Intra- and interobserver variation

For visceral fat, the within sonographer CVs were 3.4 and 4.0% and the between sonographers CV was 4.0%. For subcutaneous fat, the CVs within sonographers were 4.2 and 6.1% and the CV between sonographers was 9.5% (Table 2). Figure 2 illustrates the interobserver variation with Bland–Altman plots. The prediction limits are based on a multiplier of 1.36 for the subcutaneous fat and 1.17 for the visceral fat. For the log-transformed data on visceral fat, the key assumptions were met, but for the subcutaneous fat there was a small but statistically significant decrease in the s.d. with increasing subcutaneous distance. This decreasing variation was associated with the very small values of these measurements, which drive down relative precision, as can be seen in the left panel of Figure 2.
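The reported multipliers can be checked against the reported CVs with a short Python calculation. This is a sketch under stated assumptions: the formula exp(2·sqrt(2τ² + σ1² + σ2²)) is the usual form for a model with a common between-observer s.d. τ and observer-specific within-observer s.d.s σ1 and σ2, the CVs are treated as s.d.s on the log scale, and the function name `prediction_limit_multiplier` is hypothetical.

```python
import math


def prediction_limit_multiplier(within_sd_1, within_sd_2, between_sd, z=2.0):
    """Error factor for the ratio of single measurements by two observers.

    Assumes log-scale s.d.s: one between-observer component per observer
    (hence the factor 2) plus each observer's within-observer component.
    """
    total_sd = math.sqrt(2 * between_sd**2 + within_sd_1**2 + within_sd_2**2)
    return math.exp(z * total_sd)


# CVs as reported in the text, expressed as fractions
visceral = prediction_limit_multiplier(0.034, 0.040, 0.040)
subcutaneous = prediction_limit_multiplier(0.042, 0.061, 0.095)
print(round(visceral, 2), round(subcutaneous, 2))  # 1.17 1.36
```

Under these assumptions the calculation reproduces the multipliers of 1.17 and 1.36 given above. The larger multiplier for subcutaneous fat comes mainly from its between-sonographer CV of 9.5%.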

Table 2 Intra- and interobserver variation
Figure 2

Bland–Altman plots of subcutaneous and visceral adipose tissue measurements, using log-transformed data corresponding to relative differences between sonographers. Points are means over the replicate measurements before and after standing. x axis, measurements in cm.

Short-term variation

Median weight change between the two study dates was –0.2 kg (interquartile range: −0.7 to 0.7 kg, P=0.53). Median waist circumference change was –1.1 cm (interquartile range: −3.0 to 0.5 cm, P=0.06). Time lapse between the two examination days ranged from 14 to 105 days with a median of 35 days. Short-term variation is illustrated with Bland–Altman plots (Figure 3). The assumptions for the plots were met. Limits of agreement are based on a multiplier of 1.34 for both subcutaneous and visceral fat. The coefficient of variation between measurement days was 15%.
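The link between the multiplier of 1.34 and the CV of 15% can be made explicit with a small Python calculation. This is a sketch that assumes the limits of agreement span 2 s.d. of the log-scale differences on either side of the mean; the function name `cv_from_multiplier` is hypothetical.

```python
import math


def cv_from_multiplier(multiplier, z=2.0):
    """Back out the log-scale s.d. (approximately the CV) from an error factor."""
    return math.log(multiplier) / z


cv = cv_from_multiplier(1.34)
print(f"{100 * cv:.0f}%")  # 15%
```

Read the other way, a CV of about 15% between days means that a second measurement on the same person can be expected to lie within a factor of about 1.34 of the first, before allowing for any systematic shift.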

Figure 3

Bland–Altman plots of short-term variation in the subcutaneous and visceral adipose tissue measurements (log-transformed data). x axis, measurements in cm.

Validation of ultrasound measurements before and after a meal

On average, participants consumed 297 g of sandwiches (range 135–395 g), 71 g of fruit (range 0–193 g) and 457 ml of drink (range 280–980 ml). The mean waist circumference increase 1 h after the meal was 0.48 cm (s.d. 1.2). Figure 4 illustrates variation in ultrasound measurements with Bland–Altman plots. For subcutaneous fat, the limits of agreement are largely centered around one for both time point comparisons, that is, non-fasting values are nearly the same as fasting values. The widths of the limits of agreement are based on multipliers of 1.20 (2 h post meal) and 1.24 (1 h post meal), that is, slightly larger for the fasting versus 1 h measurement. Two-hour post-meal values were 1.8% larger than 1 h values. For visceral fat, the widths of the limits of agreement are based on a multiplier of 1.12. Values measured 1 h post meal are 13.7% larger than the fasting values, but already 2 h after a meal the measurements approach the fasting values, being only 9.4% larger than these.

Figure 4

Bland–Altman plots of variation in the subcutaneous and visceral adipose tissue measurements performed (i) fasting and 1 h after a meal, and (ii) fasting and 2 h after a meal (log-transformed data). x axis, measurements in cm.

Discussion

We validated three important aspects of ultrasound assessment of visceral and subcutaneous fat distribution among a group of individuals at high diabetes risk: intra- and interobserver variability, short-term variability and the effect of a meal. We found levels of intra- and interobserver variation of around 5%, with the exception of interobserver variation for subcutaneous fat, which was nearly twice as large. The values are similar to or slightly larger than CVs mentioned but not elaborated on by others using the same ultrasound method.6, 7, 8, 12 In this regard, we can consider the method as having acceptable reproducibility for use in a research setting. This adds to the evidence from other researchers, who report good intra- and interobserver reproducibility in the assessment of abdominal fat distribution by radiologists using other ultrasound protocols.21 Short-term variation, as we chose to investigate it, had a CV of 15%. Estimates of visceral fat measured non-fasting were larger than those performed in the fasting state, with 2 h post meal values closer to the fasting values than the values measured 1 h post meal. The estimates of subcutaneous fat were similar in the fasting and non-fasting state.

The key elements of validity that need to be mapped out before a measurement method can be used routinely in research or clinical practice are reproducibility, with a quantification of the main preventable sources of variability, and the longer-term stability of the underlying entity. Insight into the relative importance of the main sources of measurement variability will help researchers in the field to reduce these. The quantification of the degree of longer-term stability of visceral and subcutaneous distances estimated using ultrasound will help in the interpretation of evidence based on a single measurement, by showing to which degree a single value can be regarded as a representative of an individual’s exposure during a longer follow-up period. It will also help to set minimum meaningful differences for future studies aiming to evaluate changes in the amount and distribution of abdominal fat by this method.

Our study has added a systematic investigation of the reproducibility of a measurement method, focusing on the role of the observer, of time and of the fasting state. To our knowledge, this is the first time that temporal variation and variation in fasting states for ultrasound assessment of the abdominal distribution have been quantified. The study was carried out using strict protocols, trained personnel and modern ultrasound equipment with probes appropriate for the desired types of measurements. Our study design has the further strength of evaluating intra- and interobserver variation simultaneously. This allows assessment of how much of the variation between two measurements by different observers is due to interobserver variation and how much is due to intraobserver variation, while other circumstances remain equal.

This study was performed in the context of an ongoing clinical examination of a larger cohort. Some practical constraints imposed by this setting have influenced our study setup. We limited the number of participants for each of the three substudies to a feasible number, which gave us sufficient power to evaluate the main reproducibility questions. This means that we do not have adequate power to analyze further sources of variability within our study samples or to perform stratified analyses. Participants were unselected, but are of an age and have a body mass index that is representative of the presumed target age and risk strata for this measure.

The training of the sonographers was limited. None of the sonographers had prior experience in ultrasonography. The two sonographers who are junior medical doctors attended a 3-day course on ultrasound technique and a day of training in this particular ultrasound technique by the research team who developed it. They furthermore spent a day with a consultant in radiology training on how to perform an abdominal ultrasound. The third sonographer, a nurse, was subsequently trained and supervised by the two doctors over the course of a week. We believe that this level of training is common and sufficient in a population-based research setting, making our results widely applicable in pragmatic settings. More training might possibly have reduced the variation found. On the other hand, if a measurement method is to become widely applicable, it needs to be easily learnt with limited training.

Despite these limitations, we find that intra- and interobserver variation shows adequate reproducibility. We found that, as in most assessments, intraobserver variability is lower than interobserver variability. The variation in the visceral estimates is much larger than in the subcutaneous estimates when measured on the absolute scale, clearly because of the different range of the absolute measurements (see Table 1). However, the relative precision in the visceral fat estimates is better than for the subcutaneous ones, as is seen by the CV. Of note, the CVs found for the visceral estimates are similar to the CVs for intraobserver reproducibility for the abdominal visceral fat assessments by gold standard MRI, which are in the range of 5.2–13.4%.22 We also calculated the variation for the ratio of subcutaneous and visceral fat in all three substudies and found that it followed the pattern of variation for the subcutaneous fat (data not shown). Given the greater reproducibility of visceral estimates, perhaps these are most relevant for risk assessment purposes. However, there are race-related differences not only in the abdominal fat distribution but also in the differential contribution of the visceral and subcutaneous fat compartments to metabolic risk, making the estimation of subcutaneous fat depots important in, for example, African populations.23

Short-term variation is important, as it reflects the degree to which a single measurement can be regarded as being representative of the longer period of time that is typically of interest in epidemiological settings. In our study, short-term variation is relatively large but not influenced by the size of the measurements. It is composed of a combination of random variation, intra- and interobserver variability and true changes in the underlying physiology. The average time between the two measurement days is long enough for the fat depots to have changed size. There was an indication of a change in waist circumference, but neither the change in median waist circumference nor that in median weight was significant (P=0.06 and 0.53, respectively). We used the distance from the peritoneum to the front of the vertebra as a proxy for visceral fat. There is likely to be a natural physiological variation of the measurements performed due to the movement and gas content of the intestines. The relatively large prediction intervals found nevertheless speak against using these measurements to track the changes in the abdominal fat in short-term trials with few participants and against using thresholds for risk evaluation. It would be interesting to know what the short-term variation is in the measurement of abdominal fat distribution with MRI or CT. One study reports a CV of 10% for short-term variation in the assessment of visceral- and subcutaneous-fat areas by MRI.24

In substudy 3, all participants ate enough to be considered non-fasting. We showed that the effect of a recent meal is primarily on the visceral measurements, and that the effect does not extend beyond the first hour after the meal. For both visceral and subcutaneous fat, the size of the prediction limits for measurements between fasting and non-fasting states was slightly less than the prediction limits for interobserver variation, as is seen in Figure 2 and Figure 4. Our study shows that in practice ultrasound assessments of visceral fat can be performed in non-fasting participants, provided that the last meal was more than 1 h ago. If the participants are not fasted, controlling for this in analysis should be considered.

In conclusion, both visceral and subcutaneous fat can be estimated using ultrasound with adequate intra- and interobserver reproducibility by non-radiologists with limited training, making it a feasible method of assessing abdominal fat distribution in epidemiological studies. Short-term variation found in our study was non-negligible, due in part to interobserver variation (4%) and possibly true change in the size of the fat depots. Estimates of visceral fat in the non-fasting state were larger in the first postprandial hour compared with fasting measurements. These estimates of reproducibility should guide use of ultrasound in the assessment of abdominal fat distribution in future studies and clinical practice.