Introduction

An accurate assessment of liver size and volume plays an important role in the clinical routine. The actual size of the liver correlates very well with body size and lifestyle factors like alcohol consumption1. An increase in liver size constitutes a decisive indicator for different liver pathologies like fatty liver disease or steatohepatitis. In addition, monitoring the course of liver size is used for the evaluation of treatment success2,3. This may in particular play a role in the evaluation of the course of disease of infections such as mononucleosis4 or in the evaluation of changes in the context of dietary food conversions5. Estimating liver volume is also crucial for effective surgical planning6,7, necessary for interventional therapy, e.g. selective internal radiation therapy (SIRT), as well as for the evaluation of liver hypertrophy after portal vein embolization or split liver approach. In addition, the liver size and its change can be used for preoperative monitoring to reduce the risk of liver failure and postoperative complications8.

There are several methods available for assessing liver size. Although in clinical routine examinations liver size is approximated by percussion and auscultation, these techniques do not provide reliable information about the actual liver volume9,10 with nearly half of normal sized livers classified as enlarged and vice versa11. Apart from these unreliable clinical investigations, the size of the liver can be estimated by imaging techniques like ultrasound (US), computed tomography (CT) and magnetic resonance imaging (MRI). Their availability, costs and investigation times vary notably.

To date accuracy of volumetry of the liver using cross-sectional imaging has been validated in many studies12,13 and it is seen as gold standard for liver volume assessment. However, aside from the investigation itself, the volumetric post-processing with often manually performed organ segmentation is time-consuming and the necessary technical equipment is frequently not available14.

As an alternative to liver volumetry, volume indices based on simple maximum diameter measurements in all three planes on cross-sectional imaging can estimate the liver size15. Previous investigations demonstrated an excellent agreement between VI determined by MRI and true liver volume15. Likewise, ultrasound, as a fast and inexpensive method with real-time imaging and high availability in clinical settings, may also be suited for evaluation of liver diameters, especially if frequent controls at short intervals are necessary to monitor the development of liver changes.

Therefore, the purpose of this study was to investigate the suitability of volume indices based on either US or CT to determine liver volume in clinical routine.

Material and methods

Study population

Between June 2017 and August 2018, 66 patients (38 men and 22 women, mean age 66.8 ± 12.6 years, range 19–88 years, 6 exclusions) underwent a clinically indicated CT scan of the liver as well as ultrasonography.

The Ethics Committee of the University of Dresden approved the prospective study and it conforms to the Declaration of Helsinki. Written informed consent was obtained from all study participants.

The decisive inclusion criterion was a complete image of the liver within an examination phase of the CT scan. Patients were not selected at baseline for morphological or anatomical liver changes in order to achieve the greatest possible variation in liver size in the study group. One exclusion criterion for patients was poor conditions for ultrasound examination, such as inadequate inspiration compliance, which resulted in the edges of the liver not being fully visible (6 patients). Other exclusion criteria were the infeasibility of an ultrasound examination because of medical reasons or because an ultrasound examination would have led to an unacceptable delay in therapy.

Patients’ age, height, weight and Body Mass Index (BMI), as well as liver-specific laboratory data (alanine aminotransferase (ALAT), aspartate aminotransferase (ASAT), gamma-glutamyl transferase (GGT), bilirubin and platelet count) were collected from the electronic database and correlated with the volumetric results. The thickness of subcutaneous fat tissue in CT was measured in each patient, correlating to the region in which the ultrasound was performed. Fibrosis-4 score (FIB-4) was determined in order to draw conclusions about the degree of fibrosis of the liver16 as follows17:

$${\text{FIB}} - {\text{4 Score }} = \, \left( {{\text{Age}}*{\text{AST}}} \right) \, / \, \left( {{\text{Platelets}}*\surd \left( {{\text{ALT}}} \right)} \right).$$

Image acquisition

The ultrasound examination was carried out directly before CT scan and thus before application of oral contrast agent. Ultrasound examinations were performed with a convex transducer (5 GHz) of the ultrasonic device (Phillips Affiniti 50G, Philips Healthcare, Germany) in supine position by a trained radiology resident with 4 years experience. Additionally, 10 patients were examined by a trained specialist for internal medicine with 2 years of experience in ultrasonography. Both examiners were blinded to each other's measurements and to the results of the CT examination. Image acquisition was performed using subcostal and intercostal positions of the ultrasound probe. The measurements were standardized in strictly sagittal, coronal and transverse orientation to ensure the best possible comparability with the planes reconstructed in CT. Within the selected plane, the liver was fanned out in a structured manner by tilting and sliding while avoiding rotation of the ultrasound probe. Several measurements were taken within each orientation to ensure that the longest diameter could be determined at the level of the maximum transverse (mediolateral, USmaxML), sagittal (dorsoventral, USmaxDV) and coronal (craniocaudal, USmaxCC) diameter in all patients (Fig. 1).

Figure 1
figure 1

Example of a representative measurement of diameters in ultrasonography. Images were taken during the measurement of maximum diameters in (A) mediolateral (USmaxML), (B) craniocaudal (USmaxCC) and (C) dorsoventral direction (USmaxDV).

In cases where a diameter could not be measured in a single ultrasound window, the measurements within one plane were combined using anatomical landmarks (portal vein or hepatic vein) and the measured diameters were added together. In all three dimensions, the borders of the liver had to be clearly visible. All measurements were carried out during apnea at the end of deep inspiration of the patient. This simplified reproducible detection of anatomic structures for the investigator and maintained the position for the patient during the measurement.

Volume indices were calculated following the description above, using the formula:

$${\text{Volume index }}\left( {{\text{US}}} \right) \, = \, \left( {{\text{USmaxML }} \times {\text{ USmaxDV }} \times {\text{ USmaxCC}}} \right) \, /{ 3}.{6}$$

as described by Roloff et al.15.

CT examinations were performed with a 128-slice CT scanner (Somatom Definition AS + , Siemens Healthcare, Germany) with a collimation of 128 × 0.6 mm and a gantry rotation time of 0.28 s. CT data were acquired in the caudocranial direction and within one breath hold. In all examinations, the liver was completely covered, regardless of whether intravenous contrast medium administration was necessary or in which contrast medium phase the liver was examined.. Image reconstruction was done in transverse slices with a thickness of 3 mm. Measurements of CT diameters were performed using a picture archiving and communication system (PACS, IMPAX EE R 20, Agfa Healthcare, Mortsel, Belgium). In CT axial slices and coronal reconstructions were used to measure the maximum diameter in the mediolateral (CTmaxML), dorsoventral (CTmaxDV) and craniocaudal (CTmaxCC) direction by two trained observers, one radiologist with more than 4 years of experience in abdominal imaging and a trained medical student (Fig. 2), respectively. Both evaluators performed the reading independently and were blinded to each other`s results as well as to the results from the ultrasound. Thereafter, we measured the volume index based on CT diameters as follows:

$${\text{Volume index }}\left( {{\text{CT}}} \right) \, = \, \left( {{\text{CTmaxML }} \times {\text{ CTmaxDV }} \times {\text{ CTmaxCC}}} \right) \, /{ 3}.{6}$$
Figure 2
figure 2

Representative measurement of maximum diameters in CT. Measurements were taken in the three standard dimensions (A) mediolateral (CTmaxML) + craniocaudal (CTmaxCC) and (B) dorsoventral (CTmaxDV).

as described by Roloff et al.15.

Standard of reference

Liver segmentation following the assessment of the total liver volume was defined as gold standard. It was performed in transverse CT slices with a thickness of 3 mm. Quantitative analysis was carried out with a semiautomatic volumetric program (Siemens Syngo.Via Multimodality Workplace; Version VB30A_HF01, Siemens, Germany) to segment the liver from the surrounding tissue by using HU-based thresholding. After pre-processing for complete liver segmentation, manual corrections for the determination of the contour of the liver were done if necessary and big vessels including the portal and hepatic vein were excluded (Fig. 3). The segmentation was performed by the same readers that measured liver diameters. They were independent of each other and blinded to the results of diameter measurement, ultrasound examination and clinical data.

Figure 3
figure 3

Refined contour of the liver after semi-automatic extraction of the liver parenchyma from abdominal CT-image in Syngo.Via and excluding the hepatic and portal vein (A) 3D model of the liver after post-processed liver shape (B).

Statistical analysis

Maximum diameters measured in US and CT were presented as means and standard deviations. Quality management was performed via double reading by two observers of diameters in US in 10, and CT in all patients. For inter-rater reliability mean differences in percentages and standard deviations were calculated. Intraclass correlation for inter-rater variability was performed to compare VI and the diameter determinations in the different planes both in CT and in US.

The maximum diameters measured in US and CT were compared using Bland–Altman analysis referenced to the CT volumetry as gold standard. Mean bias and corresponding 95% confidence intervals (CI) were calculated18.

The volume indices from US and CT were analyzed in comparison to liver volume from segmentation with linear regression, and coefficient of determination (r2) was calculated. The correlation coefficient was interpreted according to Evans' classification as very weak (r2 = 0.00–0.19), weak (r2 = 0.20–0.39), moderate (r2 = 0.40–0.59), strong (r2 = 0.60–0.79) and very strong (r2 = 0.80–1.00)19. In addition, we performed Bland–Altman analysis to compare volume index of US, respectively CT with the gold standard.

Multivariable linear regression was applied for comparison of the calculated and the true liver volume in 59 patients. Therefore, we used the difference between the calculated and the true liver volume as outcome and age, sex, BMI, thickness of the subcutaneous fat layer, laboratory data and the Fib-4 score as potential predictor20. A backward selection algorithm was applied keeping only predictors with a p < 0.1. as potential predictors.

All evaluations were done with IBM SPSS Statistics 25 (IBM Corporation, Armonk, NY).

Results

The inter-rater reliability of VI based on diameter measurements using US showed a mean difference of − 34.7% (CI − 90.1, 20.7%). The highest agreement between the two raters was found in the measurements of the diameter in the mediolateral direction (− 2.7%; CI − 29.8, 24.4%), whereas the dorsoventral (− 13.3%; CI − 32.0, 5.4%) and craniocaudal (− 19.5%; CI − 45.7, 6.8%) orientation showed a much higher deviation. In contrast, there was an excellent inter-rater reliability of VI based on CT diameters showing a low mean difference of 1.1% (CI − 16.1, 18.2%). CT diameters revealed the smallest deviations between the raters for the mediolateral direction (2.3%; CI − 16.5, 21.1%) and a higher deviation in the dorsoventral (3.2%; CI − 9.2, 15.7%) and craniocaudal (− 5.3%; CI − 25.2, 14.6%) orientation.

Bland–Altman analysis showed a moderate agreement between the maximum diameters measured in ultrasound and CT (Fig. 4). The highest correlation was found in mediolateral orientation and the lowest in craniocaudal direction. Mean values and standard deviation are shown in Table 1.

Figure 4
figure 4

Bland–Altman correlation between the diameters measured in maximum extension in ultrasound and CT (A) in mediolateral, (B) dorsoventral and (C) craniocaudal orientation with showing the smallest difference in mediolateral extension.

Table 1 Mean and standard deviations of maximum diameters measured in ultrasound, computed tomography and true liver volume extracted from manual liver segmentation.

The mean difference was low for the mediolateral direction (–0.6 cm; CI − 6.7, 5.6 cm), and larger (underestimated) for the dorsoventral (–4.1 cm; CI − 10.3, 2.1 cm) and craniocaudal direction (–3.6 cm; CI − 10.5, 3.3 cm). The results of the intraclass correlation are summarized in Table 2.

Table 2 Intraclass correlation coefficients for inter-rater variability.

Using CT volumetry, the average total liver volume was measured as 1.500 ± 347 cm3, showing a strong correlation between the volume indices calculated from diameter measurement and the true liver volume (r2 = 0.751, Fig. 5). With 4.4 ± 28.3% (Fig. 5) (1.509 ± 432cm3 vs. 1.500 ± 347cm3) the mean bias was small. The largest differences between the results of volumetry and those of VI were found in patients with significantly enlarged right lobe or pronounced liver parenchyma changes.

Figure 5
figure 5

Linear correlation and Bland–Altman correlation between calculated VI from CT and true liver volume.

There is a strong correlation (r2 = 0.751) between the calculated volume indices from maximum diameters measured in computed tomography (LV CT) and the true liver volume extracted from manual segmentation. There is only a small bias between the calculated volume indices from maximum diameters measured in CT (LV CT) and the true liver volume extracted from manual segmentation.

In contrast, US revealed an average total volume of the liver of 863 ± 371 cm3 with a weak correlation between the calculated liver volume from US and the true liver volume (r2 = 0.247, Fig. 6). In comparison to true liver volume, measurement of liver volume using US demonstrated a mean difference of 58.29 ± 66.91% (Fig. 6). Thus, the total volume from ultrasound underestimates the true liver volume severely.

Figure 6
figure 6

Linear correlation and Bland–Altman correlation between calculated VI from US and true liver volume.

There is only weak correlation between the calculated volume indices from maximum diameters measured in ultrasound (LV US) and the true liver volume extracted from manual segmentation. A high bias was seen between the calculated volume indices from maximum diameters measured in CT (LV CT) compared to the true liver volume extracted from manual segmentation.

Correlation analysis did not show any dependency between the actual liver size and body mass index, the measured thickness of subcutaneous fat or the different laboratory values. However, there was a correlation between an increased Fib-4 Score and a lowered inter-rater agreement of the measured diameters in the CT (β = 59.1, 39.8–78.4; p < 0.001). The patients’ characteristics and laboratory data are summarized in Table 3.

Table 3 Patients characteristics and laboratory data.

Discussion

To the best of our knowledge, our study is the first, which investigated the accuracy of a simple technique to estimate liver volume by the calculation of VI on the basis of routine US and CT examination data. The results of our study showed that only the calculated volume indices based on diameter measurements derived from CT are a valid approach for the estimation of liver volume, which underlines previous research15.

Determination of liver size using simple and reliable techniques is clinically warranted. Different approaches to asses liver volume in ultrasound have been presented based on the measurement of liver diameters in one21,22,23 and more than one dimensions24. In our opinion, the determination of the liver volume based on the measurement in one plane is questionable due to the high susceptibility to errors caused by possible anatomical variations or potential measurement inaccuracies. In a clinical setting, it is accepted that liver volume is enlarged if the craniocaudal distance measured by US in the midclavicular line exceeds 16 cm21. This observation must be critically questioned since no results for reliability were reported and our data clearly show a poor inter-rater agreement for diameter measurement using ultrasound. However, our own data demonstrated high inter-rater repeatability of CT diameters. Therefore, the estimation of liver volume using one dimension is possible if the diameter is reliably assessed, for example using cross-sectional imaging. This fact needs to be further explored.

The calculation of the volume index to predict liver volume is another simple approach based on the measurement of liver diameters in more than one dimension. Various formulas for estimating the liver volume using volume indices have been presented so far. An approach by Boscaini et al. used the product of measuring three diameters (length, width, height) and divided this by 2725. This approach saw the liver in the form of a cube, which explains the lack of accuracy. A further development of this approach was provided by Marchesini et al.26 and Zoli et al.27 by comparing volume calculations with CT-based volume determinations. However, the authors described difficulties in comparability and therefore proposed a calibration to eliminate them. Muggli et al.28 followed an approach to determine liver volume on the basis of diametric measurements, however, patients with changes in liver parenchyma were excluded. For this reason, the results appear to show clear limitations for clinical practice. These could also be the reason why the results of the calculated VI appear slightly better in comparison to the actual volume than in our study.

This assumption is supported by the fact that our results showed a decreased interobserver agreement in the calculated VI in patients with an increased fib-4 score, which explains a lower accuracy of VI in presents of liver parenchyma changes.

In previous research, a calibration of volume index determined by the three diameters in their maximum orientation divided by the factor 3.6 was introduced. Using this factor, the volume index and the true liver volume are comparable. However, an accurate estimation of liver volume using volume indices requires also a reliable and robust assessment of liver diameters. As shown in our study results, we found excellent inter-rater reliability if liver diameters are determined in CT images, but not for US. This is also in line with the results from Verma et al., which showed good inter-rater reliability for measurements of diameters using cross-sectional imaging like MRI and a good correlation to hepatic volume29. In addition, the result seems plausible, since in CT the measurements are based on an identical data set, whereas in sonography the measurements of the two raters are based on different images that they acquire themselves.

It must be critically noted that the variability of the differences between VI from CT and volumetry is relatively high, which can be attributed to the fact that volumetry based on segmentation is more adaptable to anatomical variabilities of the liver than the method of diameter measurement. This has to be considered in particular against the background that the highest deviations between volumetry and VI were found in patients who showed a significant enlargement of the right hepatic lobe (riedel’s lobe30) or pronounced changes of the liver parenchyma as in liver cirrhosis.

A further clear advantage of volumetry by segmentation is that it can also be performed for only partial areas of the liver, which plays an important role in preoperative planning prior to liver resection. Good results in the agreement of the volumetrically determined volume both in MRI and CT with the actual postoperative liver volume could be shown in the work of Karlo et al.31. These partial volume measurements are clearly limited when determining the liver volume on the basis of VI.

Our study results further demonstrated a worse agreement of liver diameters if they were assessed using US. We could exclude possible objective reasons for that, such as patient´s body constitution (investigated by body mass index and size of the subcutaneous fat layer) as well as variations of the liver itself such as parenchymal liver diseases (investigated by liver volume, Fib4-score, laboratory data). Using our correlation, we cannot clarify for sure the reason for the lower inter-rater reliability for measurements of diameters using US with the exception of the methodological influences such as the cooperation of the patient during inspiration or the measurement inaccuracies resulting from the different posture of the ultrasound probe.

This is supported in particular by the fact that a significantly poorer correlation can be seen in measurements within the craniocaudal plane. In addition to the fact that the ultrasound conditions were most clearly impaired in this plane due to anatomical features such as the costal arch and the resulting restricted freedom of movement of the ultrasound transducer, this can be explained in particular by the respiration-dependent changes, particularly in this plane, and the associated changes in the position of the liver, lungs and diaphragm. In the other two dimensions, there is a much better correlation between the raters, indicating that the patient's influence on the measurements is less pronounced. In addition, a good and almost good correlation between the raters in dorsoventral and mediolateral suggests a proper training.

A further potential source of error can be suspected in cases where the diameters were not determined within one position of the ultrasonic probe but had to be calculated on the basis of two positions. In this case, despite the greatest care, it cannot be guaranteed that the second setting will exactly match the plane of the first setting, which is an additional source of error.

A possible improvement approach in the future could be the use of 3D ultrasound. This method, which is widely used in gynecological imaging32, does not yet play a clinical role in imaging the liver. However, there are already first diagnostic approaches to use it for imaging the liver surface in parenchymatous diseases33 and for monitoring during interventional procedures34,35. Regarding correlation of liver measurements with clinical parameters, we found only a correlation between the FIB-4 and the inter-rater reliability in the diameter measurement in CT. Among other things, this could be due to the change of the liver surface, which can lead to deviations in the measurements due to the difficulty in determining the exact maximum diameter in case of irregularities of the liver surface in cirrhosis.

Consistently, our study shows a significant poorer inter-rater reliability in US compared to CT and results further suggest that the volume of the liver using US-based calculated VI does not lead to a valid result. In our opinion, calculation of volume index based on measurement of liver diameters using CT is an excellent, simple approach to predict liver volume. This simple technique is ready for clinical setting.

There are some limitations in our study. Study patients underwent CT exclusively for clinical reasons. As such, a clinically healthy control group was missing for comparison. However, in previous research, the excellent accuracy of calibrated volume indices to estimate liver volume in a cohort of volunteers15 was confirmed. In this study, we intentionally focus on a patient cohort to demonstrate that our approach is applicable in clinical practice. A further limitation of the study is that among the subjects included, no one had previously undergone liver surgery. A statement regarding the value of VI, for example after hemihepatectomy, is still to be investigated. In addition, it is a known obstacle in ultrasound to obtain exact orientations within a plane. Nevertheless, despite rigorous efforts (multiple control of the measurement level) and extensive previous training, there may have been measurement inaccuracies by distortions, which may have influenced the results. Furthermore, the level of experience of the examiners, especially in such an examiner-dependent procedure as ultrasound, must be considered as a possible cause for the low inter-rater agreement.

In conclusion, the volume index is an excellent approach to estimate liver volume very fast using a routinely available data set. However, an accurate estimation of liver volume requires a robust and reliable assessment of liver diameters as provided by cross-sectional imaging, such as CT. US cannot reliably measure maximum liver diameters, wherefore the usage of US-based volume indices should not be used due to its low accuracy.