Introduction

Diffusion-weighted imaging (DWI) is a sensitive technique for assessing the Brownian motion of water molecules, and it provides not only anatomic and structural information but also qualitative and quantitative data on the kidney. Furthermore, it is non-invasive and easy to perform without a contrast medium, which is essential for patients with renal dysfunction, who must avoid the risk of nephrogenic systemic fibrosis1,2.

At present, several DWI sequences are widely used to diagnose renal diseases and evaluate renal function in a large number of acute and/or chronic kidney diseases and various renal tumors3,4,5,6,7, including breath-hold DWI (BH-DWI)8, free-breathing DWI (FB-DWI)9, navigator-triggered DWI (NT-DWI)10, respiratory-triggered DWI (RT-DWI)11, readout-segmented DWI (RS-DWI)12, and zoomit-DWI (Z-DWI)13. Zoomit-DWI is a reduced field-of-view (rFOV) single-shot DWI sequence, also known as field-of-view optimized and constrained undistorted single-shot DWI (FOCUS DWI) on GE systems and zonal oblique multislice (ZOOM) on Philips systems14,15. Compared with conventional single-shot DWI (BH-DWI, FB-DWI, NT-DWI and RT-DWI), Z-DWI can reduce the interference of gastrointestinal motility and gas artifacts on the kidney through small field-of-view local excitation13, and RS-DWI can reduce distortion artifacts through staggered acquisition in the frequency- or phase-encoding direction12. In DWI, the apparent diffusion coefficient (ADC), which represents the mobility of water molecules within the tissue, is a vital quantitative imaging parameter and a diagnostic and therapeutic biomarker for patients with renal diseases. However, many factors, including gastrointestinal peristalsis, breathing and cardiac pulsations, can affect the accuracy of ADC measurements, the cortico-medullary contrast-to-noise ratio (c-mCNR) and even the image quality16,17,18.

Thus, reliable measurement of ADCs, sufficient c-mCNR and good image quality are essential in assessing renal function and monitoring therapeutic effects using DWI. Currently, most studies on ADC reliability have focused on the liver, with only a few focusing on the kidney. Friedli et al. suggested that the cortico-medullary ADC difference measured with RS-DWI had a better correlation with fibrosis than that measured with conventional DWI in patients with chronic kidney disease19. He et al. considered Z-DWI to have better image quality and less distortion and fewer susceptibility artifacts than conventional DWI13. Tavakoli et al. found that simultaneous multislice RT-DWI of the kidney reduces scan acquisition time by 30% and yields substantially improved image quality to enable better lesion characterization than FB-DWI20. However, each of these studies compared only two different DWI sequences and did not systematically compare the ADC reliability, c-mCNR and image quality of the commonly used renal DWI sequences. Furthermore, most studies drew regions of interest (ROIs) in only one part of the kidney and did not separate ROIs for the renal cortex and medulla when taking the ADC measurements.

Therefore, the aim of our study is to systematically compare the reliability of ADC measurement in the renal cortex and medulla, c-mCNR and image quality among BH-DWI, FB-DWI, NT-DWI, RT-DWI, RS-DWI and Z-DWI, and then to identify the optimal renal DWI sequence that can be recommended for clinical application.

Results

Intra- and interobserver agreement of ADC measurements with the six techniques in kidney

The average ADC values of the two representative anatomic sites (cortex and medulla) were obtained for both kidneys, with the lowest values obtained using Z-DWI (Table 1) for reader 1 and reader 2. The ICCs of ADC measurements in the cortex were higher than those in the medulla for the six DWI sequences. For example, the range of ADC values was 1.385–2.116 × 10−3 mm2/s in the cortex and 1.174–1.817 × 10−3 mm2/s in the medulla for the first measurement of reader 1 (Table 1). Z-DWI had superior inter-observer agreement of ADC measurements in the cortex and medulla of each kidney (LKC: 0.856, LKM: 0.798, RKC: 0.855, RKM: 0.808, all P > 0.05) compared with BH-DWI, FB-DWI, NT-DWI, RS-DWI, and RT-DWI (some P < 0.05). For example, in the first measurements of the two readers, the average ADC values of Z-DWI for the LKM ((1.240 ± 0.026) × 10−3 mm2/s vs. (1.264 ± 0.021) × 10−3 mm2/s; P = 0.238) varied less than those of RS-DWI ((1.749 ± 0.034) × 10−3 mm2/s vs. (1.679 ± 0.028) × 10−3 mm2/s; P < 0.003). In addition, Z-DWI yielded the highest intra-observer ICCs (0.876–0.944, all P > 0.05) among the six DWI sequences (Table 1). For example, in reader 1’s two measurements with Z-DWI, the average ADC values for the LKC ((1.463 ± 0.027) × 10−3 mm2/s vs. (1.472 ± 0.025) × 10−3 mm2/s; P = 0.512) varied less than those with RT-DWI ((1.896 ± 0.020) × 10−3 mm2/s vs. (1.964 ± 0.019) × 10−3 mm2/s; P < 0.001).

Table 1 ADC measurements with the six DWI techniques and their intra- and interobserver agreement.

ADC reproducibility in the 12 anatomic locations with each technique

All coefficients of variation (CVs) fell between 0.9 and 2.1% for reader 1 and reader 2. The repeatability of ADC measurements in the 12 anatomic locations varied for each technique. The mean absolute difference of ADC (bias) with Z-DWI was 0.070–0.111 × 10−3 mm2/s, which was lower than that with BH-DWI (0.083–0.181 × 10−3 mm2/s), FB-DWI (0.087–0.186 × 10−3 mm2/s), NT-DWI (0.076–0.150 × 10−3 mm2/s), RS-DWI (0.125–0.203 × 10−3 mm2/s) and RT-DWI (0.096–0.148 × 10−3 mm2/s) for the 12 representative locations. Furthermore, Z-DWI had the highest ADC measurement repeatability, with a lower LOA (0.031–0.056 × 10−3 mm2/s) than all other sequences (Table 2, Figs. 1, 2 and Supplementary Table 1).

Table 2 The mean absolute differences of ADC measurements and their 95% confidence intervals in twelve anatomic locations with six DWI techniques.
Figure 1

Comparison of ADC measurement repeatability of the six different DWI sequences in the right kidney. The Bland–Altman plots of ADC measurements showed that Z-DWI had a lower LOA (0.033–0.056 × 10−3 mm2/s) (near zero) than all other sequences at almost all measurement points (6 anatomic locations of the right kidney), which indicates that Z-DWI has the best ADC measurement repeatability.

Figure 2

Comparison of ADC measurement repeatability of the six different DWI sequences in the left kidney. The Bland–Altman plots of ADC measurements showed that Z-DWI had a lower LOA (0.031–0.051 × 10−3 mm2/s) (near zero) than all other sequences at almost all measurement points (6 anatomic locations of the left kidney), which indicates that Z-DWI has the best ADC measurement repeatability.

Measurement of cortico-medullary contrast-to-noise ratio (c-mCNR)

For the measurement of c-mCNR, good agreement between reader 1 and reader 2 was found in the upper pole (RK: r = 0.779; LK: r = 0.891), middle pole (RK: r = 0.775; LK: r = 0.818) and lower pole (RK: r = 0.72; LK: r = 0.854) with Z-DWI (Tables 3 and 4). Furthermore, Z-DWI had a slightly higher c-mCNR than the other DWIs in most representative locations (P > 0.05). Notably, it was significantly higher than that of BH-DWI and FB-DWI in the middle pole of both kidneys and the upper pole of the left kidney (P < 0.05). For example, in the middle pole of the right kidney, the c-mCNR was 12.62 ± 3.02 (95% CI: 11.29–13.96) with Z-DWI measured by reader 1, which was slightly higher than that with RS-DWI (9.70 ± 6.00 (95% CI: 7.04–12.36), P > 0.05), NT-DWI (9.64 ± 3.48 (95% CI: 8.10–11.19), P > 0.05), and RT-DWI (9.70 ± 6.00 (95% CI: 7.04–12.36), P > 0.05), but significantly higher than that with FB-DWI (7.14 ± 2.94 (95% CI: 5.84–8.44), P < 0.001) and BH-DWI (8.62 ± 6.14 (95% CI: 5.89–11.34), P = 0.004) (Tables 3 and 4).

Table 3 Cortico-medullary contrast-to-noise ratio (c-mCNR) of the right kidney.
Table 4 Cortico-medullary contrast-to-noise ratio (c-mCNR) of the left kidney.

Image quality analysis

The two readers had excellent agreement in evaluating the five aspects (K1–K5) of image quality (Kappa value 0.945–0.989). Z-DWI had high scores in terms of image blurring (5 points), severity of artifacts (4 points), sharpness of boundaries (5 points), clarity of the renal cortex and medulla (5 points), and overall image quality (5 points), which were similar to those of RT-DWI and NT-DWI (P > 0.05). However, Z-DWI had better image quality than BH-DWI in K4 (ADC map) (P < 0.05), FB-DWI in K2 (all P < 0.05), K4 and K5 (ADC map) (all P < 0.05), and RS-DWI in all image quality aspects except for K2 and K4 (ADC map) (all P < 0.05) (Table 5, Figs. 3 and 4).

Table 5 The evaluation of image quality in volunteers by reader 3 (R3) and reader 4 (R4).
Figure 3

Comparisons of image quality of BH-DWI, FB-DWI, NT-DWI, RT-DWI, RS-DWI and Z-DWI. Diffusion-weighted trace images at three different b-values (800, 400, 50 s/mm2) with the corresponding ADC maps (right) are arrayed. Z-DWI had better image quality than BH-DWI and FB-DWI in clarity of the renal cortex and medulla (K4) (ADC map) (all P < 0.05) and than RS-DWI in sharpness of boundaries (K1), clarity of the renal cortex and medulla (K4) and overall image quality (K5) (all P < 0.05). Z-DWI was slightly superior to RT-DWI and NT-DWI in sharpness of boundaries (K1, ADC map); however, the difference in image quality between the three was not significant (all P > 0.05).

Figure 4

Comparisons of image quality of BH-DWI, FB-DWI, NT-DWI, RT-DWI, RS-DWI and Z-DWI at b = 800 s/mm2 and the corresponding ADC map. The image quality of Z-DWI was significantly different from that of BH-DWI, FB-DWI and RS-DWI in the three representative sections (n-2 slice, n slice and n + 2 slice) in clarity of the renal cortex and medulla (ADC map) (all P < 0.05). Z-DWI was slightly superior to RT-DWI and NT-DWI in the three representative sections (n-2 slice, n slice and n + 2 slice) in sharpness of boundaries (ADC map); however, the difference in image quality between the three was also not significant (all P > 0.05).

Discussion

Currently, BH-DWI, FB-DWI, NT-DWI, RS-DWI, RT-DWI and Z-DWI are widely used for the diagnosis of renal diseases and the evaluation of renal function3,6,8,12,13. For these DWIs, reliable ADC values and good image quality are vital in detecting renal disease and assessing renal function accurately. To our knowledge, this is the first MRI study to compare these DWIs systematically by evaluating the intra- and inter-observer agreement in ADC measurements, the reproducibility of ADC values and the image quality, in order to establish the most reliable clinically applicable renal DWI sequence.

ADC values derived from coronal renal DWI exhibited moderate-to-good agreement with those from axial DWI11. In our study, coronal renal DWI was performed because it can provide full coverage of the kidney shape. The mean ADC value fell between 1.429 and 2.082 × 10−3 mm2/s in the renal cortex and between 1.211 and 1.749 × 10−3 mm2/s in the medulla in 12 representative sections (the upper, middle, and lower pole of both kidneys), which were near the lower limit of the values reported in the literature ((1.78 ± 0.11) × 10−3 mm2/s for the renal cortex and (1.48 ± 0.13) × 10−3 mm2/s for the renal medulla in healthy volunteers)21. Furthermore, our results showed that the 95% CI of ADC measurements in the cortex was higher than that in the medulla using the six DWI sequences, consistent with Sulkowska et al.22. Previously reported ADC values obtained with NT-DWI10 and RS-DWI23 were similar to our findings, whereas those obtained with BH-DWI10 and Z-DWI13 were slightly higher than ours. In our results, Z-DWI yielded lower ADC values in the cortex and the medulla than the other five DWIs, which is consistent with the findings of Cai et al., who showed that the mean tumor ADC values of rFOV-DWI were significantly lower than those of fFOV-DWI (1.237 ± 0.228 × 10−3 mm2/s vs 1.683 ± 0.322 × 10−3 mm2/s, P < 0.001) in patients with gastric cancer24,25. The possible reason is that DWI with a reduced FOV produces images with sharper margins and better visualization of anatomic structures26, which is helpful in drawing ROIs in the renal cortex and medulla, yielding a stable and low ADC value. This suggests that a lower ADC value should be used in clinical work when using Z-DWI.

In addition, Z-DWI has the best intra-observer agreement (intra-observer ICCs: 0.906–0.944) and inter-observer agreement (inter-observer ICCs: 0.798–0.856) among the six sequences, indicating that Z-DWI is sufficiently reliable and repeatable when assessing ADC measurements. The possible reason for this result is that a reduced (“zoomed”) FOV in the phase-encoding direction decreases the influence of gastrointestinal peristalsis and respiratory motion artifacts on kidney images. In addition, RT-DWI and NT-DWI can reduce the influence of motion artifacts by respiratory- and navigator-triggered techniques. However, this comes at the cost of rather long and uncertain scan times (more than 120 s in both sequences), which can markedly increase patients' discomfort and sensitivity to motion27. Consequently, the intra- and inter-observer agreements with RT-DWI and NT-DWI were lower than with Z-DWI. Previous studies have shown that Z-DWI has obvious advantages in cervical cancer28, thyroid micronodules29, cervical spinal cord30, etc. It enables clearer identification of lesions and reduction of image artifacts. Our research has further verified its value in kidney applications. Moreover, we found that the CV was less than 3% in all measurements, suggesting that the ADC measurements were reliable and consistent in all DWIs.

Our results indicate that Z-DWI has the best ADC repeatability because it yielded the smallest mean absolute differences of ADCs and the smallest LOAs in all the anatomical sections. This finding may be related to the “zoomed” technique in the phase-encoding direction, which, when combined with dynamic, spatially selective RF pulses, improved image quality in renal imaging considerably more than other DWIs15,31. According to previous studies of abdominal organs, different breathing schemes affect the absolute ADC value32. The study of Yıldırım İO et al.33 found that, compared with conventional DWI sequences, Z-DWI may be more effective in the diagnosis and monitoring of treatment and postoperative responses in patients with varicocele. Therefore, the good repeatability of Z-DWI helps us to evaluate the ADC value of renal disease quantitatively. Our study verifies that Z-DWI has the best consistency and reproducibility, which is of great significance to the future clinical applications of renal DWI sequences. Our results also showed that all the LOAs were around 20–30% of the mean ADC values. This is in line with previous studies that recommended at least a 30% change in ADC values when evaluating a lesion's response to treatment with the same DWI technique6,23.

Renal cortico-medullary ADC difference is an important marker for differentiating renal diseases. A good agreement was found with Z-DWI for assessing c-mCNR (ICC > 0.70) in all representative locations, indicating the reliability of Z-DWI in assessing ADC measurements of renal lesions. Furthermore, Z-DWI has a slightly higher c-mCNR than other DWIs in most representative locations (P > 0.05), and significantly higher c-mCNR than BH-DWI and FB-DWI in the middle pole of both kidneys and the upper pole of the left kidney (P < 0.05), which is consistent with previous reports34,35. This suggests that the Z-DWI may be a good sequence for depicting and differentiating renal diseases.

DWI sequences with a long scan time (such as Z-DWI, RS-DWI, RT-DWI and NT-DWI) can reduce artifacts, but the long scan time can in turn markedly increase patient discomfort and decrease image quality. In our study, Z-DWI yielded high scores in terms of image blurring, sharpness of boundaries, clarity of the renal cortex and medulla, and overall image quality, with image quality similar to that of RT-DWI and NT-DWI (P > 0.05) and superior to that of RS-DWI (P < 0.05). The possible reason is that the “zoomed” technique in the phase-encoding direction, combined with dynamic, spatially selective RF pulses, markedly reduced susceptibility artifacts and considerably improved image quality in renal imaging15,31. Although RS-DWI can reduce T2 blurring and susceptibility effects12, its long acquisition time (226–379 s in our study) makes it prone to motion artifacts, reducing the image quality, especially for the mobile kidney. Furthermore, Z-DWI had a higher score than BH-DWI in clarity of the renal cortex and medulla (ADC map, P < 0.05) and than RS-DWI in clarity of the renal cortex and medulla (all P < 0.05). This indicates that imaging with Z-DWI provides a clearer margin between the renal cortex and medulla and helps to locate renal lesions and precisely measure the ADC value in the cortex and medulla. In addition, Z-DWI was better than FB-DWI and RS-DWI in severity of artifacts (P < 0.05), which is similar to previous studies in which FB-DWI and RS-DWI had more artifacts than Z-DWI13,36.

This study has some limitations. First, the volunteers included in this study were all young with good breathing coordination, which is somewhat different from the clinical situation of patients with kidney disease. Second, this study was performed in normal kidneys without any lesions, to keep the condition of the kidneys uniform and to avoid bias in ADC measurements due to the inhomogeneity that lesions might cause. Finally, in order to evaluate the image quality of different kidney regions, this study used a coronal scan, which increases the impact of respiratory motion artifacts on the image.

In summary, Z-DWI had excellent intra-observer agreement and good inter-observer agreement among the six sequences. Furthermore, Z-DWI had the highest ADC repeatability and c-mCNR in most of the 12 locations of the kidneys observed. In addition, Z-DWI had image quality similar to that of RT-DWI and NT-DWI and better than that of BH-DWI, FB-DWI and RS-DWI (P < 0.05). Therefore, Z-DWI is the optimal renal DWI sequence, providing a reliable quantitative parameter and therapeutic biomarker for evaluating renal function in patients with renal disease. Thus, it is recommended as the DWI sequence for clinical examination of the kidney due to its good image quality and reliable diagnostic confidence.

Materials and methods

Ethics statement and participants’ enrollment

This prospective study was approved by the research ethics committee of our institution (Xiangya Hospital, Central South University, China). The authors confirm that written informed consent was obtained from each participant. All methods were performed in accordance with the relevant guidelines and regulations and strictly abided by the Declaration of Helsinki. Twenty-two healthy young volunteers of similar age (juniors in a medical college, mean: 21 years, range: 20–22 years) were enrolled (12 males, 10 females).

In our study, the inclusion criteria were: (a) no history of albuminuria, hematuria or weight loss; (b) no history of any kidney surgery; (c) ability of the subject to hold his or her breath for up to 20 seconds. The exclusion criteria were: (a) contraindications to MR imaging; (b) history of any kidney disease or surgery.

MR imaging protocol

Magnetic resonance examinations were performed on a 3.0 T system (MAGNETOM Prisma, Siemens Healthcare, Erlangen, Germany) with an 18-channel anterior surface body coil combined with 12 elements of a 32-channel spine coil. Each subject was scanned twice in the DWI series. The DWI series included end-expiratory breath-hold DWI (BH-DWI) (one breath-hold), free-breathing DWI (FB-DWI), navigator-triggered DWI (NT-DWI), readout-segmented DWI (RS-DWI) (with respiratory-triggering), respiratory-triggered DWI (RT-DWI) and Zoomit DWI (Z-DWI) (with respiratory-triggering). Three b values of 50, 400, and 800 s/mm2 were sampled in three orthogonal diffusion directions (three-scan trace) for all DWIs. A 5 min rest was allowed between two identical sessions. The scan parameters were kept as close as possible, and the detailed parameters of all sequences are summarized in Table 6. The imaging parameters of the two scans were consistent. Each participant had 12 scans (six scans using the 6 techniques in each session). Fat suppression was achieved with spectral adiabatic inversion recovery in all DWI sequences, and the acceleration factor was 2 in all sequences. A k-space-based parallel imaging technique was used. The scan time was recorded.

Table 6 The summarized parameters of all DWI sequences in our study.

Image analysis

All the DWI image data were saved to the workstation. Four readers evaluated (1) ADC values, (2) the repeatability of ADC measurements, and (3) subjective image quality. The ADC values were calculated separately using the post-processing software (Syngo.via VB10, Siemens Healthcare). The ADC measurements were done by two radiologists (WG.L. and H.L., readers 1 and 2, with 5 and 10 years of clinical imaging diagnosis experience, respectively), and two other radiologists assessed the subjective image quality (YG.P. and WZ.L., readers 3 and 4, with 15 and 20 years of clinical imaging diagnosis experience, respectively). The four readers worked independently and were blinded throughout the measurement and evaluation process.

ADC value measurement and repeatability evaluation

Twelve ROIs were drawn on the b = 50 s/mm2 images, including the upper, middle and lower poles of the cortex and medulla of both kidneys. ROIs 20–24 mm2 in size37,38 were positioned on the b = 50 s/mm2 image (Fig. 5A,B), and then copied to the ADC map for ADC measurements (Fig. 5C) and to the b = 800 s/mm2 images for c-mCNR measurements (Fig. 5D). Then, ROIs were drawn in the second scan and in the repeated series of the other five sequences in a similar manner. The ADC measurements were repeated one week after the first measurement to avoid recall bias. The second radiologist repeated the same measurement. The ADC values were obtained by log-linear fitting of the following formula with the three b values (b = 50, 400, 800 s/mm2):

$$I_{Trace} = I_{0} e^{ - b*ADC} = I_{0} e^{{ - b*\frac{{\left( {D1 + D2 + D3} \right)}}{3}}} = \sqrt[3]{{I_{1} *I_{2} *I_{3} }}$$
(1)
Figure 5

Schematic diagram of typical ROI placement in the renal cortex and medulla. Raw diffusion-weighted imaging (DWI) at b = 50 s/mm2 (A), six representative ROIs (3 ROIs each for the renal cortex and medulla in superior, middle and inferior zones, respectively) on DWI at b = 50 s/mm2 (B), corresponding ADC map (C) and b = 800 s/mm2 image (D) for c-mCNR measurements. First, the DWI slice (using b = 50 s/mm2) with the largest renal section was chosen, a straight line was drawn along the upper and lower poles of the kidney (white dashed line), and a perpendicular bisector was drawn (white dashed line). Second, the medullary zone adjacent to the white dashed line, which has a clearly lower signal intensity, was identified and the ROIs representing the superior, middle and inferior zones were drawn manually. Subsequently, three similar ROIs were drawn in the renal cortex based on the representative medulla positions. These ROIs were copied to the corresponding ADC map for ADC measurement. Moreover, the background signal standard deviation (SD) for c-mCNR was measured using an equally sized ROI placed in nearby background (air) in the corresponding section, close to the site of the kidney ROI, and avoiding any prominent artifacts.

I1, I2, and I3 are the measured diffusion-weighted images in three orthogonal gradient directions, and D1, D2 and D3 are the corresponding diffusion coefficients.
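
As an illustration of Eq. (1), the following minimal Python sketch computes a trace signal as the geometric mean of three directional signals and estimates the ADC by a log-linear least-squares fit over the b values. It is not the vendor post-processing used in this study; the function names `trace_signal` and `fit_adc` and the synthetic signal values are assumptions made only for the example.

```python
import math


def trace_signal(i1, i2, i3):
    """Geometric mean of the three directional signals (right-hand side of Eq. 1)."""
    return (i1 * i2 * i3) ** (1.0 / 3.0)


def fit_adc(b_values, signals):
    """Least-squares fit of ln(S) = ln(S0) - b * ADC.

    Returns (ADC, S0). ADC is in mm2/s when b is given in s/mm2.
    """
    n = len(b_values)
    logs = [math.log(s) for s in signals]
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_b
    return -slope, math.exp(intercept)


# Synthetic, noise-free trace signals (hypothetical values, not study data).
b_values = [50.0, 400.0, 800.0]
true_adc = 1.5e-3  # mm2/s, assumed for the illustration
signals = [1000.0 * math.exp(-b * true_adc) for b in b_values]

adc, s0 = fit_adc(b_values, signals)
print(f"Fitted ADC = {adc * 1e3:.3f} x 10^-3 mm2/s, S0 = {s0:.1f}")
```

With noise-free input the fit recovers the assumed ADC; with measured signals, the three b values give one degree of freedom for averaging out noise.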

In addition, the CV of the ADC values was used to assess the relative degree of dispersion between ADC measurements, and was calculated using the following formula:

$$CV_{ADC} = \frac{SD}{{ADC_{Mean} }} \times 100\%$$
(2)

Here, SD was the standard deviation of the ADC values and ADCMean was the mean ADC value at each representative point39. The ADC measurement was considered reliable when the CV was less than 0.15 (15%).
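
Equation (2) can be sketched in a few lines of Python. This is an illustrative example only: the function name `cv_percent`, the use of the sample standard deviation (n - 1 denominator) and the ADC values below are assumptions, since the text does not state which SD estimator was used.

```python
import statistics


def cv_percent(adc_values):
    """Coefficient of variation of ADC values in percent (Eq. 2).

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    mean_adc = statistics.mean(adc_values)
    sd = statistics.stdev(adc_values)
    return sd / mean_adc * 100.0


# Hypothetical repeated ADC measurements at one location (x 10^-3 mm2/s).
measurements = [1.46, 1.47, 1.48]
cv = cv_percent(measurements)

# The reliability threshold of 0.15 corresponds to 15% on this scale.
print(f"CV = {cv:.2f}%, reliable = {cv < 15.0}")
```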

Cortico-medullary contrast-to-noise ratio (c-mCNR)

The signal intensity (SI) was measured in different anatomical regions on the b = 800 s/mm2 images, including the upper, middle and lower poles of the cortex and medulla of both kidneys. Moreover, the background signal standard deviation (SD) was measured using an equally sized ROI placed in nearby background (air) in the corresponding section, close to the site of the kidney ROI, and avoiding any prominent artifacts (Fig. 5D). The following formula was used to calculate the corresponding c-mCNR of the different DWI sequences:

$$c - mCNR = \frac{{\left| {SI_{cortex} - SI_{medulla} } \right|}}{{SD_{background} }}$$
(3)

where SIcortex and SImedulla were the signal intensities of the cortical and medullary ROIs at the specific position (for instance, the upper pole of the right kidney), and SDbackground was the standard deviation of the chosen artifact-free ROI positioned on the background (air) of the corresponding slice. In all volunteers, c-mCNR was measured once in 1 week by reader 1 and reader 2. Mean values of c-mCNR with the standard deviation and 95% confidence interval were recorded.
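
As a worked instance of Eq. (3), the short Python sketch below computes the c-mCNR from two ROI mean signal intensities and a background SD. The function name `c_mcnr` and the numeric inputs are hypothetical and serve only to show the arithmetic.

```python
def c_mcnr(si_cortex, si_medulla, sd_background):
    """Cortico-medullary contrast-to-noise ratio (Eq. 3)."""
    if sd_background <= 0:
        raise ValueError("Background SD must be positive.")
    return abs(si_cortex - si_medulla) / sd_background


# Hypothetical ROI readings on a b = 800 s/mm2 image (arbitrary units).
value = c_mcnr(si_cortex=120.0, si_medulla=95.0, sd_background=2.0)
print(f"c-mCNR = {value:.2f}")
```

Because the numerator is an absolute difference, the result does not depend on whether the cortex or the medulla is brighter.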

The evaluation of image quality

The image quality of the six DWIs on the ADC map and on the DWI images at b = 50, 400 and 800 s/mm2 was evaluated independently by two radiologists (reader 3 and reader 4). The scoring criteria of image quality for each DWI are shown in Table 7.

Table 7 The criteria of the image quality scores for coronal kidney DWI in our study.

Statistical analysis

The mean value and standard deviation (SD) of the ADC values of the 12 ROIs in the cortex and medulla of both kidneys were used to estimate the consistency of ADC measurement. The t-test was used to compare the difference between the first and second readers’ measurements (inter-observer agreement) and the difference between repeated measurements (intra-observer agreement). Intraclass correlation coefficients (ICCs) with 95% confidence intervals were used to evaluate the intra- and inter-observer agreement. An ICC value greater than 0.70 indicated good consistency.
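
The text does not state which ICC model was used. As a hedged illustration, the Python sketch below computes the two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1), from a subjects-by-raters table. The choice of this model, the function name `icc_2_1` and the example readings are all assumptions.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per subject, each holding k ratings.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subject mean square
    ms_cols = ss_cols / (k - 1)               # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical ADC readings (x 10^-3 mm2/s): one row per subject,
# columns are reader 1 and reader 2.
readings = [
    [1.46, 1.47],
    [1.52, 1.50],
    [1.39, 1.41],
    [1.60, 1.58],
    [1.44, 1.46],
]
icc = icc_2_1(readings)
print(f"ICC(2,1) = {icc:.3f}, good consistency = {icc > 0.70}")
```

The same function applies to intra-observer agreement if the two columns hold the first and second measurements of one reader.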

To evaluate the repeatability of ADC measurements, we used the Bland–Altman method, which yields the 95% limits of agreement (LOAs) and the mean absolute difference of ADC values between the first and second sets of DWI sequences. The median of the image quality evaluation was obtained from the 3 b-value DWI images and the ADC map. The inter-observer agreement for image quality was analyzed by calculating weighted kappa coefficients (quadratic weighting), with kappa values of 0.01–0.25 representing slight agreement, 0.25–0.45 fair, 0.45–0.65 moderate, 0.65–0.85 substantial, and 0.85–1.00 almost perfect agreement. The Friedman test was used to compare the differences between the six methods, and the Dunn-Bonferroni post-hoc test was used to adjust for all significant pairwise comparisons. Statistical analysis was performed using SPSS (version 19.0, Chicago, IL) software. A P-value of less than 0.05 was considered significant.
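
The Bland–Altman quantities can be sketched as follows. This is a minimal illustration assuming the conventional definition of the limits of agreement (mean difference ± 1.96 × SD of the paired differences); the function name `bland_altman` and the paired values are hypothetical, and the exact computation used in SPSS for this study may differ.

```python
import statistics


def bland_altman(first, second):
    """Bias, 95% limits of agreement and mean absolute difference
    for paired measurements from two scans."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "mean_abs_diff": statistics.mean([abs(d) for d in diffs]),
    }


# Hypothetical ADC values (x 10^-3 mm2/s) from the first and second scans.
scan1 = [1.46, 1.52, 1.39, 1.60, 1.44]
scan2 = [1.47, 1.50, 1.41, 1.58, 1.46]

result = bland_altman(scan1, scan2)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

A narrower interval between `loa_lower` and `loa_upper` corresponds to better repeatability between the two scans.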