Measurement and Evaluation of Quantitative Performance of PET/CT Images before a Multicenter Clinical Trial

To ensure the reliability of a planned multicenter clinical trial, we assessed the consistency and comparability of the quantitative parameters of the eight PET/CT units that will be used in the trial. A NEMA image quality phantom (Biodex) was scanned on the eight Discovery PET/CT 690 units (GE Healthcare). The scanning parameters were the same as those to be used in the planned trial. The 18F-NaF concentration in the background was 5.3 kBq/ml; the spheres of diameter 37 mm, 22 mm, 17 mm and 10 mm were filled at a sphere-to-background ratio of 8:1, and the spheres of diameter 28 mm and 13 mm contained no activity (0 kBq/ml). The consistency of the hot sphere recovery coefficient (HRC), cold sphere recovery coefficient (CRC), hot sphere contrast (QH) and cold sphere contrast (Qc) among the eight PET/CT units was analyzed. The variation of the main quantitative parameters across the eight PET/CT systems was within 10%, which is acceptable for the clinical trial.


Results
Recovery coefficient. The HRC (hot sphere recovery coefficient) and CRC (cold sphere recovery coefficient) of the eight PET/CT systems, together with their mean value, standard deviation (SD) and variability (variability = SD/mean × 100%, commonly called the coefficient of variation, CV), are presented in Table 1. The HRC and CRC distributions for the different sphere diameters are shown in Figs 1-3. The variation of HRCmean (HRC of mean uptake) and of CRCmean (CRC of mean uptake) was within 10% in both cases. The variation of HRCmax (HRC of maximum uptake) was less than 10% in hot lesions with diameter ≥17 mm, but more than 10% in hot lesions with a diameter of 10 mm. As the sphere diameter decreased, HRCmean decreased, while CRCmean increased. HRCmax increased with increasing diameter up to 17 mm, and reached a plateau at 1 thereafter.
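The variability defined above (SD/mean × 100%) can be sketched in a few lines of Python. This is only an illustration: the function name and the example values are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not state which SD was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return SD/mean x 100 (%) for a list of per-scanner values.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return stdev(values) / mean(values) * 100.0


# Hypothetical recovery coefficients for eight scanners (not data from this study).
example_hrc = [0.80, 0.82, 0.78, 0.81, 0.79, 0.83, 0.80, 0.77]
cv = coefficient_of_variation(example_hrc)

# Compare against the 10% acceptance threshold used in the text.
within_limit = cv <= 10.0
print(f"CV = {cv:.2f}%, within 10%: {within_limit}")
```

A parameter would be judged consistent across the units when this value stays at or below 10%.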
Contrast. QH (hot sphere contrast) and Qc (cold sphere contrast) were calculated using Equations 4-5.
The QH and Qc of the eight PET/CT systems, together with their mean value, SD and variability, are presented in Table 2.
The QH and Qc distributions for the different sphere diameters are shown in Figs 4 and 5. The variation of QH was less than 10% in hot lesions with diameter ≥17 mm, but more than 10% in hot lesions with a diameter of 10 mm. The variation of Qc in the 13 mm diameter sphere was also more than 10% in the eight PET/CT systems.
Background variability. The background variability (N) of the eight PET/CT systems is presented in Table 3. The N distributions for the different sphere diameters are shown in Fig. 6. The variation of N was less than 10% for hot lesions with diameter ≥17 mm, but more than 10% for the hot lesion with a diameter of 10 mm and the cold lesion with a diameter of 13 mm.

Discussion
The NEMA NU2 image quality phantom simulates clinical hot and cold lesions and reflects the clinical image quality of a PET/CT system. NEMA NU2 specifies the two largest spheres (diameters of 37 mm and 28 mm) as cold lesions and the other spheres (diameters of 22 mm, 17 mm, 13 mm and 10 mm) as hot lesions. Considering the features of our planned clinical trial, and to simulate hot tumor lesions over a range of sizes, we instead used the two spheres with diameters of 28 mm and 13 mm as cold lesions and the others as hot lesions. The recovery coefficient is a simple and intuitive indicator of the partial volume effect. Owing to inherent physical and detection limitations, the spatial resolution of PET is worse than that of CT (Computed Tomography) or MRI (Magnetic Resonance Imaging). This results in a severe partial volume effect on PET images [2,3]. Ideally, the recovery coefficient is 1 for hot lesions (most malignant tumors in PET) and 0 for cold lesions. However, the measured activity of a hot lesion is usually lower than the actual value because of the partial volume effect, whereas the measured activity of a cold lesion is higher than the actual value. The smaller the lesion, the greater the partial volume effect [2,3].
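The dependence of the partial volume effect on lesion size can be illustrated with a simple textbook model. The sketch below assumes a noise-free, uniform hot sphere with no background activity, blurred by an isotropic 3D Gaussian point spread function; under those assumptions the value at the sphere center has a closed form. The function name is hypothetical, and this is not necessarily the model used in the studies cited in this Discussion.

```python
import math

# Conversion between FWHM and the Gaussian standard deviation.
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def center_recovery(diameter_over_fwhm):
    """Recovery at the center of a uniform hot sphere (ideal value 1).

    Assumptions: isotropic 3D Gaussian point spread function, no noise,
    no background activity. The result is the fraction of a 3D Gaussian
    that lies inside a sphere of radius R centered on it.
    """
    x = (diameter_over_fwhm / 2.0) / FWHM_TO_SIGMA  # R / sigma
    return (math.erf(x / math.sqrt(2.0))
            - math.sqrt(2.0 / math.pi) * x * math.exp(-x * x / 2.0))


# Sphere diameters expressed as multiples of the system FWHM.
for ratio in (3.0, 2.0, 1.0):
    print(f"D = {ratio:.0f} x FWHM: center recovery = {center_recovery(ratio):.3f}")
```

Under these assumptions the model gives a center recovery of roughly 0.994 for a sphere diameter of three times the FWHM and roughly 0.29 for a diameter equal to the FWHM, which agrees with the literature values quoted in this Discussion.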
One study showed that when the sphere diameter equals three times the spatial resolution of the system, expressed as the Full Width at Half Maximum (FWHM), the maximum pixel value of the image is theoretically 99.4% of the true value [2]. However, when noise is superimposed on the image, the measured activity can be higher than the actual one [3,4]. For a hot lesion whose size is close to the FWHM of the imaging system, the maximum activity concentration measured on the image can be reduced to 29% of the actual one [2]. This is consistent with our results. As the data in Table 1 show, HRCmax was larger than 1 in most cases, except in the sphere with a diameter of 10 mm. In addition, the variation of HRCmax was much larger than that of HRCmean. Therefore, quantitative data based on an average value in the region of interest, such as SUVmean or SUVpeak [5], will be preferred over SUVmax in the following clinical trial [6][7][8][9]. In our study, HRCmax increased with increasing diameter up to 17 mm, probably because of the decreasing partial volume effect. When the diameter became larger than 17 mm, HRCmax reached a plateau. Theoretically, there can be a plateau region in the center of large lesions with no "activity loss". In this case, the partial volume effect acts mainly on the edge of the lesion, where it causes "activity loss" [2,3]. Our results showed that the HRCmean of all hot spheres (diameters of 37 mm, 22 mm, 17 mm and 10 mm) was smaller than 1. In addition, the smaller the hot lesion, the lower the HRCmean.
Table 3. Background variability in eight PET/CT systems.
The partial volume effect, caused by the finite spatial resolution, is the main cause of error in PET quantification. As Figs 1 and 2 show, the lines of the eight units are almost parallel. This indicates that the variation among them did not arise from random error or statistical fluctuation of counts. It was probably caused by differences in the accuracy of cross-calibration between the PET/CT systems and the accessories used, as well as by differences in spatial resolution among the units. Overall, the variation of the main quantitative data was within 10%, which is acceptable for the following multicenter clinical trial.
Our results also showed that recovery in small lesions was lower than in large lesions. In addition, the variation among the eight units was larger in small lesions. This will be an important source of error in the following clinical trial, so partial volume effect correction is necessary [10,11].
With an ideal value of 100%, the contrast of a lesion in PET images reflects the relationship between the lesion and the surrounding background. It represents the overall image quality and the ability of a PET system to detect small lesions. As Table 2 and Figs 4 and 5 show, the contrast of hot and cold lesions decreased as the lesions became smaller. The contrast of the largest hot lesion (diameter 37 mm) and of the smallest hot lesion (diameter 10 mm) was only 79.08% and 32.31%, respectively. The smaller the lesion, the worse the detection ability of the device. This is consistent with other studies [12]. The contrast differed among the eight devices. For hot spheres with a diameter of 10 mm, the variation was up to 26.3%. This indicates that small lesion detectability differs among these eight units. Therefore, more attention should be paid to data analyses related to small lesion detectability in the following clinical trial.
Background variability reflects the noise level of the image. The greater the variability, the higher the image noise. Our results showed that the background variability increased as the ROIs became smaller. The background variability is the ratio of the SD to the mean value in the background ROI and, for Poisson-distributed counts, the SD equals the square root of the mean value. In other words, the background variability equals the reciprocal of the square root of the mean value in the ROI [13]. When the ROI diameter was 13 mm or less, the variation of the background variability among the eight devices was greater than 10%, suggesting that, to ensure consistency in the following clinical trial, the background ROI should be as large as possible (diameter of at least 17 mm) when target-to-background ratio related parameters are investigated.
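The relation between background variability and the mean value can be made concrete with a short sketch. It assumes the SD equals the square root of the mean, as stated above, so the variability reduces to 1/sqrt(mean). The function name and the count values are hypothetical and chosen only to show the trend.

```python
import math


def poisson_variability_percent(mean_counts):
    """SD/mean x 100 (%) when SD = sqrt(mean), i.e. 100 / sqrt(mean)."""
    if mean_counts <= 0:
        raise ValueError("mean_counts must be positive")
    return 100.0 / math.sqrt(mean_counts)


# Hypothetical mean counts (not data from this study): more counts, less noise.
for counts in (100, 400, 10000):
    print(f"mean = {counts}: variability = {poisson_variability_percent(counts):.1f}%")
```

In this idealized picture, quadrupling the mean counts halves the variability.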
In summary, the variation of the major quantitative performance parameters was within 10% among the eight PET/CT units, which is acceptable for the following multicenter clinical trial. However, several aspects need more attention when setting up the QC protocol before the clinical trial. Firstly, quantitative data based on an average voxel value, such as SUVmean or SUVpeak, will be preferred over SUVmax. Secondly, for lesions with diameter ≤13 mm, quantitative data need to be analyzed with care. Thirdly, more attention should be paid to data analyses related to small lesion detectability. Lastly, the background ROI should be drawn as large as possible when target-to-background ratio related parameters are investigated.

Methods
In the center of the phantom, a cylindrical insert with an outer diameter of 50 mm and an inner length of 180 mm was placed. A low-density (0.3 g/ml) substance was filled into the cylindrical insert to simulate lung tissue. Six hollow spheres with a wall thickness of 1 mm were evenly distributed around the lung insert. The inner diameters of these spheres were 37 mm, 28 mm, 22 mm, 17 mm, 13 mm and 10 mm, respectively.
To reduce operator-dependent discrepancies, one operator performed all phantom preparation, imaging and data analysis procedures. The 18F-NaF concentration in the background was 5.3 kBq/ml; the spheres of diameter 37 mm, 22 mm, 17 mm and 10 mm were filled at a sphere-to-background ratio of 8:1 to simulate hot lesions, and the spheres of diameter 28 mm and 13 mm contained no activity (0 kBq/ml) to simulate cold lesions.
The image acquisition and reconstruction protocols of the planned clinical trial will be the same as in this investigation. PET images were acquired (2 bed positions, 2 min/bed) and reconstructed with the Ordered Subset Expectation Maximization (OSEM) algorithm (24 subsets, 2 iterations) with a Gaussian post-filter of 6.4 mm full width at half maximum, using Time-of-Flight (TOF) and Point Spread Function (PSF) technologies. A z-axis 1:2:1 filter was used. The slice thickness of the reconstructed PET images was 3.27 mm. CT images were acquired with a tube voltage of 140 kVp, automatic tube current (15-180 mA), a rotation time of 0.5 s and a pitch of 0.984. The Adaptive Statistical Iterative Reconstruction (ASiR, GE Healthcare) algorithm was used to reconstruct the CT data. The CT images were used for attenuation correction of the PET images and for image fusion with PET.
Image analysis. The image analyses were performed on a Xeleris Workstation (GE Healthcare). For recovery coefficient analysis, the sphere Volumes-of-Interest (VOIs) were delineated in the center of each sphere on the PET images. The radioactive uptake concentration (kBq/ml), the maximum pixel counts in hot lesions (Cmax,H,j), the average pixel counts in hot lesions (Cmean,H,j) and in cold lesions (Cmean,C,j), and their SDs were assessed, where j denotes the sphere number.
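As an illustration of how recovery coefficients might be derived from such VOI measurements, the sketch below uses one common definition: measured divided by true concentration for hot spheres, and measured cold-sphere value divided by the background value for cold spheres. These definitions are an assumption, since the study's own equations are not reproduced in this text and may differ. The function names are hypothetical; only the 5.3 kBq/ml background and the 8:1 ratio come from the Methods.

```python
BACKGROUND_KBQ_ML = 5.3      # background concentration stated in the Methods
SPHERE_TO_BACKGROUND = 8.0   # 8:1 hot sphere to background ratio

# True hot sphere concentration implied by the two values above (42.4 kBq/ml).
TRUE_HOT_KBQ_ML = BACKGROUND_KBQ_ML * SPHERE_TO_BACKGROUND


def hot_recovery(measured_kbq_ml, true_kbq_ml=TRUE_HOT_KBQ_ML):
    """Assumed definition: measured / true concentration (ideal value 1).

    Pass the VOI maximum for an HRCmax-like value or the VOI mean for an
    HRCmean-like value.
    """
    return measured_kbq_ml / true_kbq_ml


def cold_recovery(measured_cold, measured_background):
    """Assumed definition: cold sphere mean / background mean (ideal value 0)."""
    return measured_cold / measured_background


# Hypothetical measurements (not data from this study).
print(hot_recovery(33.9))        # hot sphere VOI mean in kBq/ml
print(cold_recovery(1.1, 5.3))   # cold sphere and background means in kBq/ml
```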
For contrast and background variability analyses, we used the method of the NEMA NU2 standard [1]. The Regions-of-Interest (ROIs) were delineated on the transaxial CT images in the center of the spheres and then projected onto the PET images. The mean counts (CB,j) and the standard deviation (SDB,j) of the ROIs with the same diameter were recorded.
HRC and CRC were calculated using Equations 1-3 [14,15], while QH, Qc and background variability (N) were calculated using Equations 4-6 [1].
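For orientation, the percent contrast and background variability quantities can be sketched as follows. The formulas follow the definitions generally given for the NEMA NU 2 image quality test; they are stated here as an assumption about the form of the equations referenced above, and the function and parameter names are hypothetical.

```python
def hot_contrast(c_hot, c_bkg, activity_ratio=8.0):
    """Percent contrast of a hot sphere.

    c_hot: mean counts in the hot sphere ROI
    c_bkg: mean counts in the background ROIs of the same diameter
    activity_ratio: true sphere-to-background ratio (8:1 in this study)
    """
    return (c_hot / c_bkg - 1.0) / (activity_ratio - 1.0) * 100.0


def cold_contrast(c_cold, c_bkg):
    """Percent contrast of a cold sphere (ideal value 100%)."""
    return (1.0 - c_cold / c_bkg) * 100.0


def background_variability(sd_bkg, c_bkg):
    """Percent background variability: SD of background ROI counts / mean."""
    return sd_bkg / c_bkg * 100.0


# Hypothetical ROI values (not data from this study).
print(hot_contrast(c_hot=6.5, c_bkg=1.0))
print(cold_contrast(c_cold=0.2, c_bkg=1.0))
print(background_variability(sd_bkg=0.05, c_bkg=1.0))
```

With these definitions, a perfectly recovered hot sphere (measured ratio equal to the true 8:1 ratio) and a cold sphere with zero measured counts both give a contrast of 100%.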