Introduction

The noradrenaline analogue 123I-labeled meta-iodobenzylguanidine (MIBG) allows the visualization of cardiac sympathetic nerve activity1,2, and cardiac scintigraphy with 123I-MIBG has played important roles in the diagnostic evaluation of heart failure3,4,5,6 and neurodegenerative diseases7,8,9,10,11,12,13. The heart-to-mediastinum ratio (HMR), calculated as 123I-MIBG accumulation in the heart divided by that in the mediastinum14,15, has been quantitatively applied to evaluate cardiac sympathetic nerve activity in 123I-MIBG images.

The HMR is considerably influenced by the location and size of cardiac and mediastinal regions of interest (ROI) on 123I-MIBG planar images. This is because 123I-MIBG planar image processing has not been standardized16,17,18,19. We developed a semi-automated method for standardizing the size and position of myocardial and mediastinal ROI to overcome this issue20. However, HMR variation persisted due to the characteristics of scintigraphic imaging systems such as gamma cameras and collimators, as well as the thin septa of collimators that can be easily penetrated by the high-energy 529-keV photons emitted by the 123I radioisotope21,22. This degrades the quality of 123I-MIBG images and decreases the HMR21,22,23,24.

Consequently, we also developed a method of cross-calibrating HMR based on the performance of various collimators25,26,27,28 that can translate all HMR derived from various collimators and unify them as though they were derived from a single collimator. We refer to this process as a method for standardizing HMR. The method is based on an acrylic chest phantom that was designed for 123I-MIBG planar imaging23. It can calibrate collimator performance differences in clinical HMR calculations, which leads to standardized HMR. We validated this method in multicenter phantom studies in Japan and Europe28,29,30,31,32,33. The calibration phantom has been imaged using 225 and 210 collimators in Japan26 and Europe33, respectively. The findings of these studies validated the feasibility of HMR standardization using the phantom-based method. However, some minor differences in HMR have persisted28. Relationships between HMR and imaging conditions, and between HMR and the characteristics of gamma cameras with collimators also remain obscure.

Here, we present an improved standardization method for HMR based on combinations of gamma cameras and collimators. A multicenter phantom imaging database was created to identify the cause of HMR variations in imaging conditions, and it revealed that the energy-window setting for 123I is indispensable for robust HMR values. Moreover, this database allows the determination of mean calibration factors in combinations of gamma cameras and collimators. A clinical study showed that the standardization method with mean calibration factors is valid for patients with normal 123I-MIBG uptake. Our results indicate a vital role of HMR calculations in cardiac 123I-MIBG examinations.

Results

Monte Carlo simulation

We conducted a Monte Carlo simulation of 123I-MIBG phantom imaging (Fig. 1). A digital phantom was generated from an acrylic 123I-MIBG phantom. Density and source maps for the simulation were created from phantom images acquired by X-ray computed tomography (CT) (Fig. 1a). The energy spectra depended on whether low-energy (LE), low-medium-energy (LME), or medium-energy (ME) collimators were used during planar imaging (Fig. 1b). When the simulated and experimental 123I-MIBG phantom images were compared for the LE, LME, and ME collimators, image blurring due to 529-keV high-energy photons was visualized under both conditions with the LE collimator (Fig. 1c). The HMR in the three collimators did not significantly differ between the simulation and experimental conditions (Fig. 1d). Septal thickness, collimator length, and diameter of the collimator hole were major components of HMR variation (Fig. 1e).

Figure 1

Monte Carlo simulation of 123I-MIBG phantom images. (a) Acrylic 123I-MIBG phantom (left), density (middle) and radioactive source (right) maps of simulation materials. (b) Energy spectra of 123I-MIBG phantom imaging in low-, low-medium, and medium-energy collimators. (c) Simulated (upper) and experimental (lower) phantom images generated with low- (left), low-medium (middle), and medium- (right) energy collimators. (d) HMR calculated from simulated and experimental phantom images using three types of collimator. Error bars represent SD of means (Student t tests). Simulated and experimental HMR do not significantly differ in the three collimators. (e) Heat maps of HMR according to collimator design. Hole diameter, septal thickness, and length of collimators ranged from 1.0 to 4.0, 0.10–1.50, and 20–60 mm, respectively. HMR heart-to-mediastinum count ratio.

HMR variation under acquisition conditions

We confirmed the HMR in 123I-MIBG planar images obtained over periods of 1–10 min. Although mean HMR did not significantly differ among acquisition periods of 1, 2, 3, 4, 5, 7, and 10 min (2.32 ± 0.063, 2.32 ± 0.035, 2.34 ± 0.040, 2.35 ± 0.029, 2.33 ± 0.017, 2.34 ± 0.024, and 2.35 ± 0.010, respectively), the standard deviations (SD) of the HMR gradually decreased over longer acquisition periods (Fig. 2a). The quality of phantom images seemed most stable during acquisition for 5 and 7 min. The HMR were also stable at any gamma camera position from the phantom surface (Fig. 2b). Image quality was degraded in both LE high-resolution (LEHR) and LME general-purpose (LMEGP) collimators when the distance from the phantom surface was increased. The HMR was higher when images were acquired at the energy windows of 159 keV ± 7.5% than at 159 keV ± 10%, indicating that the setting of primary energy window of 123I affected the HMR (Fig. 2d). The HMR values with 256 and 512 matrices did not significantly differ except under two imaging conditions (Fig. 2e).

Figure 2

123I-MIBG imaging characteristics in multicenter phantom study. (a) HMR does not significantly differ among 1-, 2-, 3-, 4-, 5-, 7-, and 10-min image acquisitions. Images were acquired from the phantom over 1, 3, 5, and 7 min. (b) Relationships between HMR and gamma cameras located 10, 30, 50, 70, 90, and 180 mm from phantom surface. Phantom images derived using LEHR and LMEGP collimators located 10 and 180 mm above phantom surface. (c) Eligible phantom image datasets based on selection criteria in multicenter phantom imaging study. (d) Comparison of HMR at energy windows of 159 keV ± 10% and 159 keV ± 7.5% in equipment from two vendors. (e) Comparison of HMR with matrices of 256 and 512 in equipment from four vendors. (f) Gamma cameras manufactured by six vendors (left) and seven types of collimators (right) were included in multicenter phantom image datasets. Error bars are SD of mean. *P < 0.05, ***P < 0.001, and ****P < 0.0001. Paired t-test for each comparison in (a), Wilcoxon signed-rank test in (d), and Student t test in (e). CHR cardiac high-resolution, ELEGP extended low-energy general-purpose, HMR heart-to-mediastinum count ratio, LEAP low-energy all-purpose, LEGAP low-energy general-all-purpose, LEGP low-energy general-purpose, LEHR low-energy high-resolution, LMEGP low-medium-energy general-purpose, ME medium-energy, MEGAP medium-energy general-all-purpose, MEGP medium-energy general-purpose, MELP ME low-penetration.

Multicenter 123I-MIBG phantom image database

Among 1648 phantom image sets from 600 institutions accumulated in Japan between February 2009 and April 2017, 705 were eligible as multicenter phantom data (Fig. 2c). The imaging conditions of the 123I-MIBG phantom database were as follows: imaging matrices, 256 and 512; median pixel size, 2.21 (IQR 1.47–2.26) mm; and median acquisition time, 300 (60–900) s. Figure 2f shows the numbers and ratios (%) of gamma cameras and collimators included in the multicenter 123I-MIBG phantom image database.

Conversion coefficient of gamma camera-collimator combinations

We examined conversion coefficients in the following collimator categories (Fig. 3a): cardiac high-resolution (CHR), low-energy high-resolution (LEHR), low-energy general-purpose (LEGP), low-energy all-purpose (LEAP), low-energy general-all-purpose (LEGAP), extended low-energy general-purpose (ELEGP), low-medium-energy general-purpose (LMEGP), medium-energy (ME), medium-energy general-purpose (MEGP), medium-energy general-all-purpose (MEGAP), and medium-energy low-penetration (MELP). The mean conversion coefficients were 0.545 ± 0.0268 for CHR (n = 21); 0.545 ± 0.0414 for LEHR (n = 167); 0.631 ± 0.0455 for LEGP, LEAP, and LEGAP (n = 57); 0.745 ± 0.0268 for ELEGP (n = 149); 0.823 ± 0.0437 for LMEGP (n = 102); 0.879 ± 0.0429 for ME, MEGP, and MEGAP (n = 179), and 0.894 ± 0.0349 for MELP (n = 29). These conversion coefficients evaluated in individual LEGP, MEGP, and MELP collimators did not significantly differ among vendors (Fig. 3b). Conversion coefficients were independent of manufacturers, being 0.519 ± 0.0296 for Siemens (n = 42), 0.527 ± 0.0326 for ADAC (n = 6), 0.546 ± 0.0433 for GE (n = 67), 0.552 ± 0.0347 for Toshiba (n = 24), and 0.583 ± 0.0315 for Picker (n = 25) in LEHR collimators; and 0.808 ± 0.0388 for Toshiba (n = 49) and 0.836 ± 0.0440 for Siemens (n = 53) in LMEGP collimators. Count statistics widely varied in the mediastinum determined from 123I-MIBG phantom images derived using LEHR collimators (Fig. 3c). When GE gamma cameras were combined with LEHR, ELEGP, and MEGP collimators, conversion coefficients significantly differed among the Millennium VG (0.506 ± 0.0318, n = 11), Millennium MG (0.567 ± 0.0480, n = 10) and Infinia (0.566 ± 0.0342, n = 26) with LEHR collimator; and between Discovery/Optima (0.740 ± 0.0262, n = 70) and Infinia (0.750 ± 0.0268, n = 79) with ELEGP collimator (Fig. 3d).
Mean conversion coefficients of combinations of gamma cameras with collimators were determined using the multicenter 123I-MIBG phantom image database (Table 1).

Figure 3

Conversion coefficients in multicenter 123I-MIBG phantom study. (a) Mean conversion coefficients in CHR, LEHR, LEGP/LEAP/LEGAP, ELEGP, LMEGP, ME/MEGP/MEGAP, and MELP collimator categories. (b) Conversion coefficients obtained using LEHR, LEGP, LMEGP, MEGP, and MELP collimators from 5, 3, 2, 5, and 2 vendors, respectively. (c) 123I-MIBG phantom image quality obtained using LEHR collimator from 5 vendors. (d) Conversion coefficients obtained from 4, 2, and 4 GE gamma cameras with LEHR, ELEGP, and MEGP collimators, respectively. Error bars are SD of means. *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001. Tukey–Kramer and Student t-tests in a, b and d. CHR cardiac high-resolution, ELEGP extended low-energy general-purpose, LEGP low-energy general-purpose, LEHR low-energy high-resolution, LMEGP low-medium-energy general-purpose, MEGP medium-energy general-purpose, MELP ME low-penetration.

Table 1 Average multicenter conversion coefficients for combinations of gamma cameras and collimators obtained from 705 image sets.

To confirm the flexibility of conversion coefficients to account for the variation in image acquisitions, we calculated average multicenter conversion coefficients from the phantom image database consisting of 1459 phantom image sets and compared these conversion coefficients with those from the database consisting of 705 image sets (Supplementary Table 1). The average conversion coefficients were statistically equivalent between the two databases for the CHR, LEHR, LEGP, ELEGP, LMEGP, and MEGP collimators, but not for the MELP collimator. Consequently, we additionally determined the average multicenter conversion coefficients of combinations of gamma cameras with collimators in 1459 phantom image sets (Supplementary Table 2).

Clinical application of the method for standardizing 123I-MIBG scintigraphy

Normal HMR values of two hospitals were evaluated with or without standardization (Fig. 4a). The HMR for early and delayed 123I-MIBG scintigraphic images were corrected with institutional and multicenter conversion coefficients. The institutional conversion coefficients were 0.631 and 0.840 for hospitals A and B, respectively. The multicenter conversion coefficients were 0.621 and 0.838 for hospitals A and B, respectively. Although the averaged normal values in hospitals A and B significantly differed before standardization, the standardized normal values did not significantly differ between these hospitals in both early and delayed 123I-MIBG images (Fig. 4b,c). Furthermore, HMR corrected with institutional and multicenter conversion coefficients did not significantly differ.

Figure 4

Clinical implementation of HMR standardization using conversion coefficients. (a) HMR standardization using institutional and multicenter conversion coefficients. (b) Uncorrected and corrected HMR in early 123I-MIBG images. (c) Uncorrected and corrected HMR in delayed 123I-MIBG images. Error bars are SD of means. *P < 0.05. Student t-test and paired t-test in (b) and (c). HMR heart-to-mediastinum count ratio, LEGP low-energy general-purpose, LMEGP low-medium-energy general-purpose.

To evaluate the accuracy of the calibration factor, we calculated net reclassification improvement34 (NRI) in normal subjects and patients with heart failure. Of the 12 patients (24 image sets) who were diagnosed with heart failure, classification was improved in four images when the calibration factor was applied to the HMR. Of the 21 normal subjects (42 image sets), classification was worsened in only one image. The NRI in all subjects was 14.3% (p = 0.099). Uncorrected and corrected HMR values for individual heart failure patients are shown in Supplementary Table 3.
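The reported NRI can be reproduced from the counts given above. The following is a minimal sketch, assuming the usual two-category definition of NRI (net proportion of event images reclassified correctly plus net proportion of non-event images reclassified correctly). The function name is illustrative, and the zero counts for worsened heart-failure images and improved normal images are assumptions, because the text reports only the four improved and the one worsened image.

```python
def net_reclassification_improvement(event_improved, event_worsened, n_events,
                                     nonevent_improved, nonevent_worsened,
                                     n_nonevents):
    """Two-category NRI: net correct reclassification in events plus non-events."""
    event_term = (event_improved - event_worsened) / n_events
    nonevent_term = (nonevent_improved - nonevent_worsened) / n_nonevents
    return event_term + nonevent_term


# Counts from the text: 24 heart-failure images (4 improved) and
# 42 normal images (1 worsened). The two zeros are assumptions.
nri = net_reclassification_improvement(
    event_improved=4, event_worsened=0, n_events=24,
    nonevent_improved=0, nonevent_worsened=1, n_nonevents=42,
)
print(f"NRI = {nri:.1%}")  # NRI = 14.3%
```

Under these assumptions, 4/24 − 1/42 = 1/7, which matches the 14.3% stated in the text.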

Discussion

The major finding of the present simulation study was that collimator design, including collimator length, hole diameter, and septal thickness, affects 123I-MIBG image quality and HMR. The phantom studies revealed that the energy-window setting for 123I is an important factor for reducing HMR variation. However, HMR variations due to acquisition time, matrix size, and distance between gamma camera and phantom surfaces were limited. The conversion coefficients represented the characteristics of gamma cameras and collimators in the multicenter phantom study, and differed among collimator manufacturers, even for collimators with the same name. Moreover, conversion coefficients significantly differed in some combinations of gamma cameras with collimators, even within individual manufacturers. In the clinical validation study, the standardization methodology yielded equivalent normal HMR values corrected with both institutional and multicenter conversion coefficients.

Monte Carlo simulation provided reasonable 123I-MIBG phantom planar images even considering the effects of the 529 keV photons added to the 159 keV photons. Although the fractions of 529 and 159 keV photons of 123I were 1.39% and 83.3%, respectively, the high-energy photons hampered quantitative analysis of HMR and degraded 123I-MIBG planar image quality. Since these 529 keV photons easily penetrated thin collimator septa, a peak appeared in the energy spectrum with the LEHR collimator. Photons that penetrated the septum or were scattered also degraded 123I-MIBG planar image quality and reduced the HMR. In addition to septal thickness, collimator length and hole diameter are also important components that determine both 123I-MIBG image quality and HMR. Thick collimator septa, small hole diameters, and long collimators are most appropriate. Considering these effects of 529 keV photons, the MEGP collimator is adequate for 123I-MIBG imaging.

We previously determined conversion coefficients for several collimator groups in 225 experiments at 84 institutions26. A comparison of mean conversion coefficients between the present and previous phantom studies revealed the following: 0.55 ± 0.027 (n = 21) vs. 0.55 ± 0.02 (n = 9) for CHR (p = n.s.); 0.55 ± 0.041 (n = 167) vs. 0.55 ± 0.05 (n = 73) for LEHR (p = n.s.); 0.63 ± 0.046 (n = 57) vs. 0.65 ± 0.04 (n = 25) for LEGP, LEAP, and LEGAP (p = n.s.); 0.75 ± 0.027 (n = 149) vs. 0.75 ± 0.03 (n = 14) for ELEGP (p = n.s.); 0.82 ± 0.044 (n = 102) vs. 0.83 ± 0.05 (n = 46) for LMEGP (p = n.s.); 0.88 ± 0.043 (n = 179) vs. 0.88 ± 0.05 (n = 40) for ME, MEGP, and MEGAP (p = n.s.), and 0.89 ± 0.035 (n = 29) vs. 0.95 ± 0.04 (n = 14) for MELP (p < 0.0001), respectively. These results showed that only the MELP collimators significantly differed. The MELP collimators were manufactured by Toshiba Medical Systems Corporation and Siemens Healthineers. The present findings showed that although the conversion coefficients were equivalent between the two vendors (Fig. 3b), they were affected by the energy window setting of 123I (Fig. 2d). We applied a single energy window setting in the present study, whereas windows were set at 159 keV ± 10% and 159 keV ± 7.5% in the previous study. When we compared the two imaging databases acquired with single and various energy window settings (Supplementary Table 1), the average conversion coefficients were significantly different for the MELP collimator.

Imaging conditions need standardization in addition to HMR for 123I-MIBG image acquisition. A tremendous amount of data regarding imaging protocols has been accumulated in the multicenter phantom image database with respect to the imaging matrix, energy window setting of 123I, and acquisition time. Moreover, 145 image datasets scatter-corrected using 123I dual-35,36 and triple-energy23,37 windows were included in the phantom database. Since the clinical usage of 123I-MIBG was approved in 1992 in Japan, many studies have investigated the HMR quantitation23,38,39,40,41, which has led to a wide variety of imaging conditions and correction methods. In addition, 123I-MIBG phantom experiments conducted in the Netherlands, Belgium, the UK, Austria, and Italy28,29,30,32,33 have also generated a considerable amount of data. Our phantom-based standardization methodology allows international comparisons of HMR.

Our study has several limitations. We used an institutional 123I-MIBG imaging procedure for the phantom scans. Therefore, the imaging procedure was not unified in the multicenter phantom study. However, eligible phantom image sets were selected according to the selection criteria of the multicenter 123I-MIBG phantom image database (Fig. 2c). Although we provided multicenter conversion coefficients to standardize HMR, they could only be used at the energy-window setting of 159 keV ± 10%. Since the number of conversion coefficients for the energy-window setting of 159 keV ± 7.5% was limited (Supplementary Table 4), additional multicenter 123I-MIBG phantom imaging studies are needed to accumulate conversion coefficients for this setting. The clinical validation study confirmed the feasibility of our method only for patients with normal 123I-MIBG distribution. A multicenter clinical trial should be conducted using institutional and multicenter conversion coefficients.

In conclusion, our standardization methodology for 123I-MIBG scintigraphy allowed determination of the characteristics of gamma cameras and collimator combinations in the multicenter phantom study. The clinical validation study showed that normal HMR derived from two different institutions did not significantly differ after standardization.

Material and methods

Quantitative analysis in 123I-MIBG imaging

The HMR was used to calculate cardiac 123I-MIBG accumulation in planar images as cardiac 123I-MIBG uptake divided by background of 123I-MIBG distribution using ROI positioned over the heart and over the upper mediastinum14. Fully and semi-automated ROI setting algorithms were applied to the phantom and clinical studies20, respectively. The HMR were automatically calculated using both algorithms.

Calibration phantom for planar 123I-MIBG imaging

A flat, polymethyl methacrylate phantom (Taisei Medical, Co. Ltd, Osaka, Japan) was developed to calibrate HMR under various imaging conditions with collimators23,33 (Fig. 1a). The volume (width × depth × height) of this phantom is 380 × 380 × 50 mm3, and it can mimic planar 123I-MIBG distribution in the heart, mediastinum, liver, lungs, and thyroid gland. Anterior and posterior planar 123I-MIBG images were acquired from both sides of the phantom. The designated HMR of the anterior and posterior views were 2.60 and 3.50, respectively. Details of the phantom design have been published elsewhere23.

Calibration factor for gamma camera and collimator system

The calibration factor was calculated from the HMR derived from anterior (HMRAnt) and posterior (HMRPost) planar 123I-MIBG phantom images using dedicated software and is defined as a conversion coefficient calculated as:

$$ {\text{Conversion}}\;{\text{coefficient}} = \frac{\left( {\text{HMR}}_{\text{Ant}} + {\text{HMR}}_{\text{Post}} \right)/2 - 1}{\left( 2.60 + 3.50 \right)/2 - 1}, $$

where 2.60 and 3.50 are the respective designated HMR in anterior and posterior views of the calibration phantom. An institutional conversion coefficient (CCi) was derived from the anterior and posterior phantom images after image acquisition under institutional 123I-MIBG planar imaging conditions.
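As an illustration of the definition above, the conversion coefficient can be computed from a pair of measured phantom HMR values. This is a minimal sketch, not the dedicated software used in the study; the function name and the measured values in the second example (2.40 and 3.20) are hypothetical.

```python
# Designated HMR of the calibration phantom, as stated in the text.
DESIGNATED_HMR_ANT = 2.60
DESIGNATED_HMR_POST = 3.50


def conversion_coefficient(hmr_ant, hmr_post):
    """Ratio of (mean measured HMR - 1) to (mean designated HMR - 1)."""
    measured = (hmr_ant + hmr_post) / 2 - 1
    reference = (DESIGNATED_HMR_ANT + DESIGNATED_HMR_POST) / 2 - 1
    return measured / reference


# A system that reproduces the designated values exactly has a coefficient of 1.
print(conversion_coefficient(2.60, 3.50))  # 1.0

# Hypothetical measured values that fall below the designated HMR.
cc_example = conversion_coefficient(2.40, 3.20)
print(round(cc_example, 3))  # 0.878
```

Subtracting 1 from both means expresses the coefficient in terms of net cardiac uptake above background, so a coefficient below 1 indicates that the system underestimates the designated HMR.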

Conversion to standardized HMR using the calibration factor

Since the European Association of Nuclear Medicine and the European Council of Nuclear Cardiology have proposed using MEGP collimators for 123I-MIBG imaging16, all HMR were converted into that for a MEGP collimator. A standardized conversion coefficient (CCstd) has already been defined as 0.88 (ref. 26). The CCi and CCstd allow for the conversion of all institutional HMR (HMRi) into standardized HMR (HMRstd) using the equation26:

$$ {\text{HMR}}_{{{\text{std}}}} = {\text{CC}}_{{{\text{std}}}} /{\text{CC}}_{{\text{i}}} \times \left( {{\text{HMR}}_{{\text{i}}} - {1}} \right) + {1}. $$
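The conversion can be sketched in a few lines. The function name is illustrative, the institutional coefficient 0.631 is the value reported for hospital A in the Results, and the institutional HMR of 2.00 is a hypothetical input chosen only to show the arithmetic.

```python
CC_STD = 0.88  # standardized conversion coefficient for the MEGP collimator (from the text)


def standardize_hmr(hmr_i, cc_i, cc_std=CC_STD):
    """Convert an institutional HMR to the MEGP-equivalent standardized HMR."""
    if cc_i <= 0:
        raise ValueError("institutional conversion coefficient must be positive")
    return cc_std / cc_i * (hmr_i - 1) + 1


# Hypothetical institutional HMR of 2.00 at a site whose coefficient is 0.631.
hmr_std = standardize_hmr(2.00, cc_i=0.631)
print(f"{hmr_std:.2f}")  # 2.39

# An HMR of 1 (no cardiac uptake above background) is left unchanged.
print(standardize_hmr(1.0, cc_i=0.631))  # 1.0
```

Because only the excess over 1 is rescaled, sites whose coefficient is below 0.88 have their HMR raised, and sites whose coefficient equals 0.88 are unaffected.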

Monte Carlo simulation for 123I-MIBG imaging

A digital phantom image was created from the acrylic calibration phantom image acquired using X-ray CT. Density and source maps of the phantom were generated for the following simulations. The simulation of imaging nuclear detectors (SIMIND; Lund University, Lund, Sweden) Monte Carlo program42 allowed 123I-MIBG planar imaging simulations using various types of collimators. Combinations of the following collimator conditions were examined: collimator hole diameters of 1, 2, 3, 4, and 5 mm; septal thicknesses of 0.10, 0.45, 0.80, 1.15, and 1.50 mm, and collimator lengths of 20, 30, 40, 50, and 60 mm. We generated 123I-MIBG planar images using a total of 8.65 × 10^8 photons. The number of detected photons ranged from 418 to 15,321 per second. Planar MIBG imaging was simulated with 256 × 256 matrices, and the energy window of 123I was set at 159 keV ± 7.5%.
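The collimator conditions listed above can be enumerated as a parameter grid. This sketch assumes a full factorial design (every hole diameter with every septal thickness and every length), which the text does not state explicitly. It only builds the list of designs and does not call SIMIND.

```python
from itertools import product

# Parameter values as listed in the text (all in mm).
hole_diameters_mm = [1, 2, 3, 4, 5]
septal_thicknesses_mm = [0.10, 0.45, 0.80, 1.15, 1.50]
collimator_lengths_mm = [20, 30, 40, 50, 60]

# Assumption: every combination was simulated (full factorial grid).
designs = [
    {"hole_diameter_mm": d, "septal_thickness_mm": s, "length_mm": length}
    for d, s, length in product(
        hole_diameters_mm, septal_thicknesses_mm, collimator_lengths_mm
    )
]

print(len(designs))  # 125
print(designs[0])
```

Each dictionary could then be passed to whatever routine configures one simulation run.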

HMR variations during various acquisition periods

Anterior MIBG planar images were acquired using a dual-head gamma camera (e.cam; Toshiba Medical Systems, Tokyo, Japan) and an LMEGP collimator over periods of 1, 2, 3, 4, 5, 7, and 10 min from the phantom containing 55.5 MBq of 123I-MIBG. Five image datasets were acquired during each period. Planar imaging was conducted with a 256 × 256 matrix, and a pixel size of 1.65 mm. A photopeak window of 123I was centered at 159 keV with a 15% energy window. This study proceeded at Narita Memorial Hospital, Aichi, Japan.

HMR variations according to distance between gamma camera location and phantom surface

Anterior planar images were acquired from the phantom containing 55.5 MBq of 123I-MIBG. The phantom was imaged using a Symbia T6 dual-head gamma camera (Siemens Healthineers, Erlangen, Germany) with LEHR and LMEGP collimators. The gamma camera positions were set at 10, 30, 50, 70, 90, and 180 mm from the phantom surface for both LEHR and LMEGP collimators. The number of acquired counts was consistently 1.0 × 10^6. Planar imaging proceeded with a 256 × 256 matrix, and 2.40-mm pixels. The photopeak window of 123I was centered at 159 keV with a 20% energy window. This study proceeded at Kanazawa University Hospital, Kanazawa, Japan.
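The energy windows in these two experiments are given as total widths (15% and 20%), whereas other sections use the ± notation. Assuming the usual convention that a total-width window is symmetric about the photopeak, so that 15% corresponds to ± 7.5% and 20% to ± 10%, the window bounds can be sketched as follows. The function name is illustrative.

```python
def energy_window_kev(center_kev, total_width_percent):
    """Lower and upper bounds of a symmetric window given its total width in percent."""
    half_width = center_kev * total_width_percent / 200
    return center_kev - half_width, center_kev + half_width


# Assumption: "15% window" means 159 keV +/- 7.5%, "20% window" means 159 keV +/- 10%.
low_15, high_15 = energy_window_kev(159, 15)
low_20, high_20 = energy_window_kev(159, 20)
print(f"15% window: {low_15:.1f} to {high_15:.1f} keV")  # 147.1 to 170.9 keV
print(f"20% window: {low_20:.1f} to {high_20:.1f} keV")  # 143.1 to 174.9 keV
```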

Multicenter 123I-MIBG phantom image database

We accumulated 1648 phantom image sets from 600 institutions in Japan between February 2009 and April 2017. The six gamma camera manufacturers selected for this database were ADAC Laboratories (Milpitas, CA, USA), GE Healthcare (Waukesha, WI, USA), Philips Medical Systems (Milpitas, CA, USA), Picker Corporation (Cleveland, OH, USA), Toshiba Medical Systems Corporation, and Siemens Healthineers. We excluded 145 phantom image datasets for scatter correction of the 123I dual- and triple-energy windows and 79 others acquired with 64 and 128 matrices. We excluded 297 minor conditions of the energy window setting (keV) for 123I-MIBG image acquisition as follows: 154 ± 10% (n = 4), 155 ± 10% (n = 4), 156 ± 7.5% (n = 1), 156 ± 10% (n = 32), 157 ± 10% (n = 28), 158 ± 10% (n = 101), 158 ± 10.5% (n = 1), 158 ± 12% (n = 2), 158 ± 7% (n = 1), 158 ± 7.5% (n = 23), 159 ± 10.5% (n = 3), 159 ± 12% (n = 1), 159 ± 6.3% (n = 2), 159 ± 8% (n = 1), 159 ± 9% (n = 1), 160 ± 10% (n = 80), 160 ± 7.5% (n = 8), and missing data (n = 4). We excluded the following 14 minor collimators and gamma cameras: 123I (n = 1), Cardiac (n = 2), LELP (n = 1), LPHR (n = 1), and MEDIUM (n = 2) manufactured by Siemens; high energy (HE) GP (n = 1) by GE; Cardio (n = 1), and MEHR (n = 3) by Toshiba; LE ultra-high resolution (n = 1) by Picker; and RC-1500I gamma camera with LEGP collimator (n = 1) by Hitachi Medico Corporation, Chiba, Japan. Twenty-two failed phantom experiments were excluded. Mean conversion coefficients were computed for phantom images classified according to collimator groups as CHR; LEHR; LEGP, LEAP, and LEGAP; ELEGP; LMEGP; ME, MEGP, and MEGAP; and MELP. Twenty phantom image datasets were excluded due to outliers of the mean conversion coefficients based on box and whisker plots. We finally excluded 366 images acquired with the energy set at 159 keV ± 7.5%.
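The selection steps above can be tallied to confirm that they lead to the 705 eligible image sets. This is a bookkeeping sketch only; it assumes that the exclusions were applied sequentially and did not overlap, and the step labels are paraphrased from the text.

```python
total_image_sets = 1648

# (reason, number excluded), in the order given in the text.
exclusions = [
    ("scatter correction with dual- or triple-energy windows", 145),
    ("64 and 128 matrices", 79),
    ("minor energy-window settings", 297),
    ("minor collimators and gamma cameras", 14),
    ("failed phantom experiments", 22),
    ("outliers of mean conversion coefficients", 20),
    ("energy window of 159 keV +/- 7.5%", 366),
]

# Itemized counts for the minor energy-window settings, including missing data.
minor_window_counts = [4, 4, 1, 32, 28, 101, 1, 2, 1, 23, 3, 1, 2, 1, 1, 80, 8, 4]

remaining = total_image_sets
for reason, n_excluded in exclusions:
    remaining -= n_excluded
    print(f"{remaining:5d} remain after excluding {n_excluded:3d}: {reason}")

print(sum(minor_window_counts))  # 297
print(remaining)  # 705
```

The count before the final step is 1071, which is the size of the database used for the energy-window comparison.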

HMR variation according to energy window

The HMR values obtained with the energy window setting of 159 keV ± 10% and 159 keV ± 7.5% were compared in the multicenter phantom image database comprising 1071 image datasets from 482 institutions. Since the number of image datasets was insufficient for comparisons of energy window settings of 159 keV ± 10% and 159 keV ± 7.5% in ADAC (12 vs. 0, respectively), GE (351 vs. 0, respectively), Philips (42 vs. 2, respectively), and Picker (80 vs. 3, respectively), we compared the datasets from Siemens (132 vs. 227, respectively) and Toshiba (88 vs. 134, respectively).
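The vendor counts above can be tabulated to show why only two vendors were compared. In this sketch the minimum of 10 image sets per window is a hypothetical threshold introduced for illustration; the text states only that the numbers for the other vendors were insufficient.

```python
# vendor: (image sets at 159 keV +/- 10%, image sets at 159 keV +/- 7.5%), from the text
window_counts = {
    "ADAC": (12, 0),
    "GE": (351, 0),
    "Philips": (42, 2),
    "Picker": (80, 3),
    "Siemens": (132, 227),
    "Toshiba": (88, 134),
}

MIN_SETS_PER_WINDOW = 10  # hypothetical cut-off, not taken from the study

comparable_vendors = [
    vendor
    for vendor, (n_wide, n_narrow) in window_counts.items()
    if min(n_wide, n_narrow) >= MIN_SETS_PER_WINDOW
]

total_wide = sum(n_wide for n_wide, _ in window_counts.values())
total_narrow = sum(n_narrow for _, n_narrow in window_counts.values())

print(comparable_vendors)  # ['Siemens', 'Toshiba']
print(total_wide, total_narrow, total_wide + total_narrow)  # 705 366 1071
```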

HMR variation according to imaging matrix

The HMR from 256 and 512 matrices were compared in the multicenter phantom image database that comprised 705 image datasets from 309 institutions obtained with an energy window setting of 159 keV ± 10%. The image datasets from GE (n = 351), Picker (n = 80), Siemens (n = 132) and Toshiba (n = 88) were compared.

Conversion coefficient for combinations of gamma cameras and collimators

Based on the multicenter 123I-MIBG phantom image database with the image selection criteria, mean conversion coefficients for combinations of gamma cameras and collimators were determined using 705 image sets from 309 institutions. The 9 types of gamma cameras were Discovery/Optima (n = 121), Infinia (n = 151), Millennium MG (n = 33), and Millennium VG (n = 33) manufactured by GE; BrightView (n = 36) by Philips; PRISM (n = 73) by Picker; e.cam/Symbia (n = 110) and EvoExcel/IntevoExcel (n = 12) by Siemens, and e.cam/Symbia (n = 79) by Toshiba. Additional multicenter 123I-MIBG phantom image datasets for the EvoExcel/IntevoExcel system were accumulated due to the absence of these image datasets in the phantom database. The number of additional image datasets was 27 from 14 institutions.

When the multicenter 123I-MIBG phantom images were selected based on two imaging conditions, namely all energy-window settings and the imaging matrix, mean conversion coefficients for combinations of gamma cameras and collimators were determined using 1,459 image sets from 593 institutions (Supplementary Fig. 1). The 12 types of gamma cameras were Forte (n = 40) manufactured by ADAC; Discovery/Optima (n = 128), Infinia (n = 173), Millennium MG (n = 34), and Millennium VG (n = 51) manufactured by GE; BrightView (n = 95) by Philips; PRISM (n = 92) by Picker; e.cam/Symbia (n = 425) and EvoExcel/IntevoExcel (n = 34) by Siemens, and e.cam/Symbia (n = 298), GCA 7100/7200 (n = 26), and GCA 9300 (n = 23) by Toshiba.

Clinical validation image dataset

We applied the calibration method of HMR to an anonymized clinical image dataset. The Japanese Society of Nuclear Medicine working group (JSNM-WG) collected planar images from patients who were determined to have normal cardiac 123I-MIBG uptake in 2007 and 2015 (refs. 43,44,45). All personal information was removed from the 123I-MIBG images, and anonymized 123I-MIBG images in Digital Imaging and Communications in Medicine format were provided as a research database. We obtained permission for secondary use of the databases for research purposes in accordance with JSNM-WG regulations. Details of the patient characteristics have been published elsewhere43. In the anonymized clinical image dataset, male and female (n = 8 each) 123I-MIBG images were collected from hospital A, and eight male and six female images were collected from hospital B (Fig. 4a). The gamma cameras and collimators were PRISM IRIX and LEGP manufactured by Shimadzu (Picker Corp., Cleveland, Ohio, USA/Shimadzu Corp., Kyoto, Japan) at hospital A, respectively, and e.cam and LMEGP manufactured by Siemens at hospital B, respectively. The acquisition time, imaging matrix, and energy-window setting for 123I were 5 min, 256, and 159 keV ± 10%, respectively, at both hospitals. Early and delayed planar images were acquired at 15 min and 4 h, respectively, after injecting 123I-MIBG in both hospitals. The institutional conversion coefficients were obtained from 123I-MIBG phantom scans at each hospital. Mean multicenter conversion coefficients matching the combinations of gamma cameras and collimators at hospitals A and B were calculated from the 123I-MIBG phantom image database.

For the calculation of net reclassification improvement with the standardization procedure in clinical subjects, we used clinical image datasets collected from hospital A (16 subjects, 32 images for early and delayed conditions) and Kanazawa University Hospital, Kanazawa, Japan (17 subjects, 34 images). Regarding the clinical 123I-MIBG imaging conditions at Kanazawa University Hospital, the acquisition time, imaging matrix, and energy-window setting for 123I were 5 min, 256, and 159 keV ± 10%, respectively. Early and delayed planar images using an LMEGP collimator were acquired at 20 min and 3 h after injecting 123I-MIBG. Of the 33 subjects, 12 were diagnosed with heart failure, and 21 were diagnosed with a normal heart. The reclassification table was generated to compare standardized HMR values using multicenter conversion coefficients with uncorrected HMR values. These HMR values were classified into two patient groups using the thresholds of 2.17 and 2.49 that were determined by receiver operating characteristic analysis in unstandardized and standardized conditions, respectively.

Statistical analysis

All continuous values are expressed as means ± SD. The normality of continuous data was evaluated using the Shapiro–Wilk test. Differences in continuous variables were analyzed using Student t-tests and Wilcoxon signed-rank tests. Multiple comparisons of continuous variables were assessed using Tukey–Kramer tests. Differences in paired continuous data were analyzed using paired t-tests. All statistical tests were two-tailed, and values with p < 0.05 were considered significant. All data were statistically analyzed using JMP version 11.2.1 (SAS Institute Inc., Cary, NC, USA).