Convolutional neural network-based reconstruction for positronium annihilation localization

Jegal, Jin; Jeong, Dongwoo; Seo, Eun-Suk; Park, HyeoungWoo; Kim, Hongjoo

doi:10.1038/s41598-022-11972-5

Article
Open access
Published: 20 May 2022

Jin Jegal¹, Dongwoo Jeong¹, Eun-Suk Seo², HyeoungWoo Park¹ & Hongjoo Kim¹

Scientific Reports volume 12, Article number: 8531 (2022)

Abstract

A novel hermetic detector composed of 200 bismuth germanium oxide crystal scintillators and 393 channels of silicon photomultipliers has been developed for positronium (Ps) annihilation studies. This compact 4π detector can simultaneously detect γ-rays emitted in all directions, enabling not only the study of visible and invisible exotic decay processes but also tumor localization in positron emission tomography for small animals. In this study, we investigate the use of a convolutional neural network (CNN) for the localization of Ps annihilation, which is equivalent to tumor localization. Two-γ decay systems of the Ps annihilation from ²²Na and ¹⁸F radioactive sources are simulated using GEANT4. The simulated datasets are preprocessed by applying energy cutoffs. The spatial error in the XY plane from the CNN is compared to that from the classical weighted k-means centroiding algorithm, and the feasibility of CNN-based Ps annihilation reconstruction with tumor localization is discussed.

Introduction

Positronium (Ps) is a quasi-stable bound system of an electron and its antiparticle, a positron. The two particles eventually annihilate one another, converting into γ-rays. Ps annihilation is used not only in fundamental physics research, such as standard model verification¹ and searches for new physics², but also in applied research, such as Ps annihilation lifetime spectroscopy (PALS)³,⁴ and positron emission tomography (PET)⁵. In these studies, Ps annihilation localization is an important factor: since Ps annihilation cannot be directly measured, in most cases the annihilation position can only be reconstructed from the measured energy of the subsequent γ-rays. Therefore, great effort is required either to control the annihilation position or to reconstruct it from the subsequent measurement.

Recently, convolutional neural network (CNN)-based data reconstruction has performed especially well in charged particle tracking with good precision in accelerator and calorimeter experiments⁶,⁷,⁸. Additionally, CNNs have effectively overcome the spatial resolution limitations of PET, which are essentially set by the size of the detector array elements, such as the crystal scintillators and readout pixels used in medical imaging⁹. These high-performance results can be obtained through Monte Carlo (MC) simulations, which can generate sufficient training data and represent the geometry of the detector well. Consequently, CNNs are expected to be suitable for the reconstruction of Ps annihilation and background noise cutoff at the Kyungpook National University Advanced Positronium Annihilation Experiment (KAPAE) detector.

The KAPAE detector, which is a 4π detector, comprises 200 Bi₄Ge₃O₁₂ (BGO) crystal scintillators and 393 channels of silicon photomultipliers (SiPMs). In the KAPAE detector, a ²²Na radioactive source is used to generate positrons from β+ decay. The instrument configuration is optimized to trigger on positrons by varying the polyethylene naphthalate (PEN) film plastic scintillator thickness¹⁰. The KAPAE detector aims to study CPT violation in Ps annihilation physics¹⁰,¹¹. Based on the relative spin orientations, the Ps ground state has two possible configurations: the triplet state (³S₁), or ortho positronium (o-Ps), and the singlet state (¹S₀), or para positronium (p-Ps). Due to C-parity conservation, p-Ps and o-Ps decay to even and odd numbers of photons, respectively. Since these processes possess different C-parity values, the precise distinction of p-Ps and o-Ps is important to test discrete symmetries of C, CP, and CPT in the lepton sector¹².

In this study, CNN-based data reconstruction focusing on a back-to-back 2-γ decay system is conducted. The energies of the two γ-rays are deposited in the surrounding BGO scintillators, and this process is simulated using the GEANT4 simulation toolkit¹³. The simulation data are used to produce datasets for reconstructions based on the CNN and the weighted k-means algorithm¹⁴. The k-means clustering algorithm is a conventional method for determining cluster centroids in unlabeled datasets, and it has been utilized in the abovementioned fields of CNN applications. Through this 2-γ decay system data reconstruction, we can distinguish the p-Ps signal from the o-Ps signal for the background noise cutoff and detect o-Ps events more accurately and with high efficiency. Additionally, the KAPAE detector is compact (150 × 150 × 150 mm³) and can simultaneously detect γ-rays in all directions. This feature makes it possible to utilize the KAPAE detector with an ¹⁸F radioactive source in PET applications for small animal tumor localization¹⁵.

Results

Energy cutoff criterion

Four datasets corresponding to various energy cuts centered at 511 keV are used to train the model. In Table 1, “1σ” and “2σ” denote energy cuts of ± 1σ and ± 2σ centered at 511 keV, respectively, and “ > 2σ” denotes an energy cutoff range wider than 2σ (0.6 MeV). The number of events varies with the energy cutoff range, as shown in Fig. 1. However, because the batch size is kept at 3% of the dataset, giving the same number of iterations in each case, these four sets can be compared to determine how different energy cut ranges affect the reconstruction.
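The σ windows described above can be sketched as follows. This is a minimal illustration, assuming that σ is derived from the 24% FWHM at 511 keV reported in the Methods through the Gaussian relation FWHM ≈ 2.355σ; the function name `energy_window_mask` and the sample energies are hypothetical, and whether the cut acts on per-crystal or per-event energies is not specified in the text.

```python
import numpy as np

E_PEAK_KEV = 511.0      # annihilation photon energy
FWHM_FRACTION = 0.24    # relative FWHM at 511 keV quoted in the Methods
FWHM_TO_SIGMA = 2.0 * np.sqrt(2.0 * np.log(2.0))  # ~2.355 for a Gaussian

# Assumed sigma of the 511 keV peak, in keV
SIGMA_KEV = FWHM_FRACTION * E_PEAK_KEV / FWHM_TO_SIGMA


def energy_window_mask(energies_kev, n_sigma):
    """Return True where an energy lies within +/- n_sigma of the 511 keV peak."""
    energies_kev = np.asarray(energies_kev, dtype=float)
    return np.abs(energies_kev - E_PEAK_KEV) <= n_sigma * SIGMA_KEV


# Hypothetical deposited energies (keV), for illustration only
sample = np.array([340.0, 430.0, 511.0, 560.0, 600.0, 1275.0])
mask_1sigma = energy_window_mask(sample, 1)
mask_2sigma = energy_window_mask(sample, 2)
```

Under this assumption σ is about 52 keV, so the upper edge of the 2σ window falls near 615 keV, which is consistent with the 0.6 MeV quoted above.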

Table 1 XY RMSE [mm] of Ps annihilation localization using CNN-based reconstruction depending on the energy cutoff criteria.

The root mean square error (RMSE) of Ps annihilation localization using the CNN, depending on the energy cut criterion and radioactive source, is summarized in Table 1. The RMSE is calculated by comparing the XY coordinates of Ps in each event from the GEANT4 simulation (initial) with the position reconstructed by the CNN (predicted). In the test data, 100,000 events are used for calculating the RMSE. Since the X and Y coordinates are not correlated, a single RMSE is calculated over both coordinates. As shown in Table 1, when σ is increased to include the low energy contribution from Compton scattering, the spatial error in the XY plane decreases. In addition, when 1.28 MeV γ-rays from the ²²Na source are included, the RMSE decreases further. However, the reconstructed position is closer to the initial value for the smaller σ cases where 1.28 MeV γ-ray events are not included, as shown in Fig. 2. This means that the inclusion of the correlated information in the 2-γ decay is helpful for training the CNN model. For the 1σ and 2σ cases, there are peaks between ± 5.0 mm and ± 7.5 mm, resulting in lower accuracy than in the other cases. Therefore, the optimized energy cutoff criterion for the back-to-back 2-γ decay system discrimination modeling is determined as “ > 2σ”, where the spatial error in XY is 4.19 mm for ²²Na and 3.93 mm for ¹⁸F.
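One way to read the RMSE calculation described above is to pool the squared X and Y residuals of all events into a single mean. The sketch below follows that reading. The pooling convention is an assumption, since the text does not give the formula, and the function name `xy_rmse` and the toy coordinates are hypothetical.

```python
import numpy as np


def xy_rmse(true_xy, pred_xy):
    """RMSE pooled over the X and Y residuals of all events (assumed convention)."""
    true_xy = np.asarray(true_xy, dtype=float)   # shape (n_events, 2)
    pred_xy = np.asarray(pred_xy, dtype=float)   # shape (n_events, 2)
    return float(np.sqrt(np.mean((pred_xy - true_xy) ** 2)))


# Toy positions in mm, for illustration only
true_xy = np.array([[0.0, 0.0], [1.0, -2.0], [3.0, 4.0]])
pred_xy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 4.0]])
rmse = xy_rmse(true_xy, pred_xy)
```

If the radial distance per event were used instead, the result would be larger by a factor of √2 for the same residuals, so the convention matters when comparing numbers across studies.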

The CNN performance

The Ps annihilation localization performance of the CNN is compared with that of the classical weighted k-means centroiding algorithm. The “ > 2σ” energy cutoff is used to achieve the best accuracy in the XY plane. The RMSE between the simulated XY position of Ps annihilation and that predicted by the CNN is calculated and compared with the RMSE for the weighted k-means clustering algorithm in Table 2. The RMSE of the CNN with our proposed architecture is 2.2 times smaller than that of the weighted k-means clustering algorithm.

Table 2 The RMSE [mm] of Ps localization based on the weighted k-means algorithm and the CNN-based reconstruction corresponding to the radioactive source.

Figure 3 shows the Ps annihilation localization of the CNN and weighted k-means clustering; they are compared with the initial position from the GEANT4 simulation data in the case of ¹⁸F. In the GEANT4 simulation data, most positrons are generated from the β+ decay of a radioactive source. Since the radioactive source is located at the center of the trigger system, (0,0) mm in the XY plane, most of the Ps annihilate in the trigger space within ± 7.5 mm (XY) (Fig. 4). In addition, an increase in the number of Ps annihilations at the edge of the trigger space near ± 7.5 mm (XY) is due to positron diffraction in the solid BGO crystal¹⁶. The weighted k-means algorithm iterates to determine the clustering centroid until the distance between the centroid and its adjacent components is minimized. In contrast, the CNN is optimized to directly minimize the error by comparing its prediction with the initial position data until the results converge to a minimum loss. The CNN can adapt to various circumstances, even with irregular dataset information, by updating its weights. Consequently, the CNN results (Fig. 3) show that the reconstructed Ps annihilation localization is more accurate than the weighted k-means clustering-based localization.

Discussion

The classical method of reconstructing Ps annihilation sites has reached its limits in PET medical imaging. In recent years, PET/CT fusion equipment has been developed¹⁷, and depth-of-interaction encoding technology has been applied to time-of-flight PET¹⁸. To overcome these technical limitations, applying deep learning is a natural next step, and a CNN, which is optimized for images, is a strong candidate method. Previous studies taking this approach have had problems with data preprocessing, overfitting, and errors in the interpretation of the learning results. Such misinterpretation may arise when the operating principle of the detector is not well understood. We therefore conducted this study using the self-developed KAPAE detector as a model, which allowed the learning results to be interpreted with a good understanding of the detector.

Finally, this study performed CNN-based Ps annihilation reconstruction using the KAPAE detector and compared the results with a conventional weighted k-means algorithm. The detector geometry and the back-to-back 2-γ decay system produced through p-Ps annihilation are simulated using GEANT4, which generates enough data for training. ²²Na and ¹⁸F radioactive sources are used for the CPT violation study and the PET application, respectively. The energy cutoff criterion of over 2σ is determined using the proposed CNN architecture and by comparing the RMSEs for each radioactive source. For ²²Na, Ps annihilation is reconstructed with RMSEs of 9.42 mm and 4.19 mm using the weighted k-means algorithm and the CNN, respectively. For ¹⁸F, the corresponding RMSEs are 8.49 mm and 3.93 mm. ²²Na has a relatively long half-life compared to ¹⁸F and is used for rare decay or CPT tests, which require several months to several years of measurement time. ¹⁸F has a short half-life, emits γ-rays in large amounts in several minutes, and is useful for diagnostic imaging techniques. For both the weighted k-means algorithm and the CNN, the RMSE of ¹⁸F is smaller. This is due to the absence in ¹⁸F decay of the 1.28 MeV γ-rays emitted in ²²Na decay, which shows that ¹⁸F data can be trained well without preprocessing by energy cutoffs and is advantageous over ²²Na for PET applications when reconstructing Ps annihilation localization. In conclusion, the proposed CNN architecture achieved approximately two times better spatial resolution in the XY plane than the weighted k-means algorithm. Thus, the proposed CNN architecture can be applied to distinguish p-Ps from o-Ps for CPT violation studies using the subsequent γ energy deposited in the BGO scintillators, as well as to localize the tumor position in PET for small animals.
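The improvement factor can be checked directly from the RMSE values quoted in this section. The short sketch below only repeats that arithmetic; the dictionary layout and variable names are illustrative.

```python
# RMSE values in mm, as quoted in the Discussion
rmse_mm = {
    "22Na": {"kmeans": 9.42, "cnn": 4.19},
    "18F": {"kmeans": 8.49, "cnn": 3.93},
}

# Ratio of the weighted k-means RMSE to the CNN RMSE for each source
improvement = {src: v["kmeans"] / v["cnn"] for src, v in rmse_mm.items()}
```

The ratios are about 2.25 for ²²Na and 2.16 for ¹⁸F, which averages to roughly 2.2.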

Methods

Monte Carlo simulation

Figure 4 shows the GEANT4 simulation of the radioactive decay from a ²²Na or ¹⁸F point source in the KAPAE detector. The simulated detector comprises 192 BGO scintillators with dimensions of 7.5 × 7.5 × 150 mm³ and 8 endcap BGO scintillators with dimensions of 7.5 × 7.5 × 50 mm³ surrounding the trigger system. Each scintillator is covered by a VM2000 reflector of 75 μm thickness. A point source is placed at the center of the PEN film plastic scintillator, represented by the yellow box in the middle of Fig. 4. The KAPAE detector is filled with nitrogen gas, and a silica aerogel is used to generate o-Ps while minimizing the pick-off effect¹⁰. We simulate only the 2-γ system of p-Ps signals, because we can discriminate o-Ps in real data by developing a p-Ps data reconstruction model. The real data for the γ energy spectrum from the BGO scintillators are collected using SiPMs attached to both ends of the KAPAE detector. Instead of replicating all SiPMs in the simulation, only the SiPM photon detection efficiency from the experiment is considered in the simulation.

One million events of the 2-γ decay system of p-Ps are initially generated for discrimination with o-Ps in real data, and only events passing through the trigger system are selected. The pseudo data are processed using the energy resolution of the BGO scintillators measured in a preliminary experiment with the SiPMs. The total statistical noise fluctuation (σ/E) is proportional to $1/\sqrt{n}$, as given by the Poisson distribution:

$$\frac{\sigma}{E} = k\frac{1}{\sqrt{n}} \tag{1}$$

where k is the proportional constant, n = L × E × PDE is the number of photons, L is the absolute light yield of the scintillator (photons/MeV), E is the energy (MeV), and PDE is the scintillation light detection efficiency of the SiPM. The pseudo data of ²²Na and ¹⁸F are processed using Eq. (1). The energy spectra of 511 keV γ-rays from both sources are shown in Fig. 5. The full width at half maximum (FWHM) is 24% for each source.
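The smearing implied by Eq. (1) can be sketched without knowing k, L, or PDE individually. Because n = L × E × PDE is proportional to E, the relative resolution scales as $1/\sqrt{E}$, and the constant can be fixed by the 24% FWHM at 511 keV. The sketch below makes that calibration explicit. The Gaussian approximation to the Poisson fluctuation, the function names, and the sample size are assumptions, not the authors' implementation.

```python
import numpy as np

FWHM_TO_SIGMA = 2.0 * np.sqrt(2.0 * np.log(2.0))  # ~2.355 for a Gaussian
E_REF_MEV = 0.511   # reference energy where the resolution is quoted
FWHM_REF = 0.24     # relative FWHM at the reference energy


def relative_sigma(energy_mev):
    """sigma/E from Eq. (1); with n proportional to E it scales as 1/sqrt(E)."""
    energy_mev = np.asarray(energy_mev, dtype=float)
    return (FWHM_REF / FWHM_TO_SIGMA) * np.sqrt(E_REF_MEV / energy_mev)


def smear(energy_mev, rng):
    """Apply Gaussian smearing to positive deposited energies (MeV)."""
    energy_mev = np.asarray(energy_mev, dtype=float)
    sigma = relative_sigma(energy_mev) * energy_mev
    return rng.normal(energy_mev, sigma)


rng = np.random.default_rng(0)
smeared = smear(np.full(200_000, E_REF_MEV), rng)
measured_fwhm = FWHM_TO_SIGMA * smeared.std() / E_REF_MEV
```

Zero deposits would need to be masked out before calling `smear`, since the relative resolution diverges as E approaches zero.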

The pseudo data are processed in Python for the machine learning framework and converted to a matrix of pixels corresponding to the BGO scintillator array position mapping. The data are restructured to a 14 × 14 matrix for the top view of the detector. The central 4 × 4 data are from the deposited energy of the 8 short endcap BGO scintillators. The 4 endcap scintillators above the trigger system are paired with the 4 endcap scintillators below the trigger system. The energy deposit sum of each pair forms the central 4 × 4 data. Four datasets corresponding to various energy cuts centered at 511 keV are used to train the model (Table 3).
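The restructuring into a top-view matrix can be sketched as an accumulation of per-crystal deposits into a 14 × 14 grid. In this minimal illustration, the hit format `(row, col, energy)` and the function name `build_event_matrix` are hypothetical. Deposits that share the same grid cell, such as an endcap pair above and below the trigger system, are simply summed.

```python
import numpy as np

GRID = 14  # crystals per side in the top view


def build_event_matrix(hits):
    """Accumulate (row, col, energy_mev) hits of one event into a 14 x 14 image."""
    matrix = np.zeros((GRID, GRID), dtype=np.float32)
    for row, col, energy in hits:
        # Deposits that map to the same cell are summed
        matrix[row, col] += energy
    return matrix


# Hypothetical hits for one event: (row, col, deposited energy in MeV)
hits = [(2, 9, 0.511), (11, 4, 0.30), (11, 5, 0.21), (6, 6, 0.05), (6, 6, 0.02)]
event = build_event_matrix(hits)
is_empty = not event.any()  # all-zero matrices can be filtered out before training
```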

Table 3 The number of events corresponding to the energy cutoff ranges.

Weighted K-means algorithm

To evaluate the CNN performance, the conventional weighted k-means algorithm¹⁹ is also utilized. The k-means algorithm employs an iterative approach to group the data into k predetermined clusters by minimizing the sum of squared errors (SSE). The SSE is obtained as follows:

$$\mathrm{SSE} = \sum_{i=1}^{n} \sum_{j=1}^{k} w^{(i,j)} \left\| x^{(i)} - \mu^{(j)} \right\|^{q} \tag{2}$$

where $\mu^{(j)}$ is the centroid of the jth cluster, $x^{(i)}$ is the data sample, k is the number of clusters, n is the number of elements in the dataset, and q is an integer that defines the nature of the distance function (q is 2 for the Euclidean distance). Furthermore, $w^{(i,j)}$ is 1 if the data sample $x^{(i)}$ belongs to the jth cluster and 0 otherwise. The weighted k-means algorithm is implemented using the scikit-learn package. Unlike the CNN data, the center XY position of each BGO scintillator in which γ energy is deposited is used as the data. Each position has a weight corresponding to the γ energy deposited in that BGO scintillator. Since the radioactive source is centered in the detector, k is set to 1, and the Ps annihilation position will be biased by the γ energy.
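With k = 1 and q = 2, minimizing the energy-weighted form of Eq. (2) has a closed-form solution: the energy-weighted mean of the hit crystal positions. The sketch below computes that centroid with NumPy. The crystal positions and energies are hypothetical, and the function name `weighted_centroid` is illustrative rather than the authors' code.

```python
import numpy as np


def weighted_centroid(positions_mm, energies):
    """Energy-weighted mean position, the k = 1 minimizer of the weighted SSE."""
    positions_mm = np.asarray(positions_mm, dtype=float)  # shape (n_hits, 2)
    energies = np.asarray(energies, dtype=float)          # shape (n_hits,)
    return (energies[:, None] * positions_mm).sum(axis=0) / energies.sum()


# Hypothetical crystal centres (mm) and deposits (MeV) for one back-to-back event
positions = np.array([[30.0, 0.0], [-30.0, 0.0], [-22.5, 7.5]])
energies = np.array([0.511, 0.311, 0.200])
centroid = weighted_centroid(positions, energies)
```

Fitting scikit-learn's `KMeans(n_clusters=1)` with `sample_weight=energies` should yield the same point in `cluster_centers_`, since a single cluster centre is the weighted mean of its members.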

CNN

CNN architecture

The proposed CNN architecture is implemented using Keras 2.1.6 with TensorFlow 2.4.0 as a backend in Python 3.8.5²⁰. The network architecture (Fig. 6) comprises two convolutional layers for feature extraction and the output dense layer. Various architectures are set up and tested to determine the number of convolutional, pooling, and fully connected layers and the number of filters in each layer. Starting with a 14 × 14 input shape, rectified linear unit (ReLU) activation, defined as f(x) = max(x, 0)²¹, is applied in each functional layer, including the dense layer. For regression, a linear activation is applied at the end of the dense layer to output the 2D coordinates (XY) of the Ps annihilation position as a 1 × 2 vector.
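A network with the overall shape described above might be written in Keras as follows. This is only a sketch under stated assumptions: the filter counts, kernel size, padding, dense-layer width, and the trailing channel axis on the 14 × 14 input are illustrative guesses, since the text does not give them, and `build_model` is a hypothetical helper name.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(n_filters=(32, 64), kernel_size=3, dense_units=64):
    """Two ReLU conv layers, a ReLU dense layer, and a linear (X, Y) output.

    Filter counts, kernel size, and dense width are assumed values.
    """
    return keras.Sequential([
        keras.Input(shape=(14, 14, 1)),  # 14 x 14 energy map with one channel
        layers.Conv2D(n_filters[0], kernel_size, padding="same", activation="relu"),
        layers.Conv2D(n_filters[1], kernel_size, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dense(dense_units, activation="relu"),
        layers.Dense(2, activation="linear"),  # regression output: X and Y in mm
    ])


model = build_model()
```

The final two-unit layer with linear activation matches the 1 × 2 output vector described in the text; everything upstream of it is a placeholder for the tuned architecture in Fig. 6.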

Model training and testing

MC simulation data of the 2-γ decay of the ²²Na and ¹⁸F radioactive sources are employed for the training. By utilizing the GEANT4 simulation and excluding the empty matrix data where all components are 0, sufficient events are generated to train the CNN. The CNN training is performed using 70% of the dataset as a training set, of which 30% is held out for validation. The training set is used for CNN architecture and hyper-parameter optimization. As an optimizer, Nesterov-accelerated adaptive moment estimation (Nadam)²² is used with an initial learning rate of 0.001. Nadam is an advanced optimizer that adds a Nesterov-accelerated gradient to adaptive moment estimation (Adam)²³. It can converge more quickly and accurately than Adam because it evaluates the gradient after applying the momentum step, rather than combining the gradient and momentum evaluated at the current position²⁴. The mean squared error (MSE) is used as the loss metric. The batch size is set to 3% of the dataset, and the number of epochs is chosen as the epoch just before overfitting occurs. The model then converges toward the minimum of the MSE and the mean absolute error as a function of the number of epochs, and these loss curves approach approximately equal minima asymptotically. The remaining 30% of the dataset is used for testing the model.
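The data split and batch-size rule described above can be sketched with NumPy. In this illustration the helper names are hypothetical, the shuffling seed is arbitrary, and taking the 3% relative to the 70% training portion is an assumption, because the text does not say which dataset the fraction refers to.

```python
import numpy as np


def split_indices(n_events, rng, train_frac=0.70, val_frac_of_train=0.30):
    """Shuffle event indices and split them into train, validation, and test."""
    order = rng.permutation(n_events)
    n_train_total = int(round(train_frac * n_events))
    n_val = int(round(val_frac_of_train * n_train_total))
    val = order[:n_val]                  # 30% of the training portion
    train = order[n_val:n_train_total]   # rest of the training portion
    test = order[n_train_total:]         # remaining 30% of the dataset
    return train, val, test


def batch_size_for(n_events, fraction=0.03):
    """Batch size as a fixed fraction of the given dataset size."""
    return max(1, int(round(fraction * n_events)))


rng = np.random.default_rng(42)
train_idx, val_idx, test_idx = split_indices(100_000, rng)
batch_size = batch_size_for(len(train_idx) + len(val_idx))
```

In Keras, the optimizer and loss settings named in the text would correspond to a call such as `model.compile(optimizer=keras.optimizers.Nadam(learning_rate=0.001), loss="mse", metrics=["mae"])`, with `batch_size` passed to `model.fit`.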

References

1. Moskal, P. et al. Testing CPT symmetry in ortho-positronium decays with positronium annihilation tomography. Nat. Commun. 12, 1–9 (2021).
2. Rubbia, A. Positronium as a probe for new physics beyond the standard model. arXiv preprint hep-ph/0402151 (2004).
3. Bigg, D. A review of positron annihilation lifetime spectroscopy as applied to the physical aging of polymers. Polym. Eng. Sci. 36, 737–743 (1996).
4. Dong, A. W. et al. Positron annihilation lifetime spectroscopy (PALS) as a characterization technique for nanostructured self-assembled amphiphile systems. J. Phys. Chem. B 113, 84–91 (2009).
5. Moskal, P., Jasińska, B., Stępień, E. & Bass, S. D. Positronium in medicine and biology. Nat. Rev. Phys. 1, 527–529 (2019).
6. Sharma, R. K. & Gabrani, G. Exploring Deep Learning Methods for Particle Track Reconstruction. in 2019 19th International Conference on Computational Science and Its Applications (ICCSA). 120–125 (IEEE).
7. Belayneh, D. et al. Calorimetry with deep learning: particle simulation and reconstruction for collider physics. Eur. Phys. J. C 80, 1–31 (2020).
8. Erdmann, M., Glombitza, J. & Walz, D. A deep learning-based reconstruction of cosmic ray-induced air showers. Astropart. Phys. 97, 46–53 (2018).
9. LaBella, A., Vaska, P., Zhao, W. & Goldan, A. H. Convolutional neural network for crystal identification and gamma ray localization in PET. IEEE Trans. Radiat. Plasma Med. Sci. 4, 461–469 (2020).
10. Jeong, D. W., Khan, A., Park, H. W., Lee, J. & Kim, H. Optimization and characterization of detector and trigger system for a KAPAE design. Nucl. Instrum. Methods Phys. Res., Sect. A 989, 164941 (2021).
11. Park, H., Jung, D., Hwang, S. & Kim, H. Design of novel compact detector based on the bismuth germanate scintillator and silicon photomultiplier for ortho-positronium physics. Acta Phys. Pol., B 51, 143 (2020).
12. Bass, S. D. QED and fundamental symmetries in positronium decays. arXiv:1902.01355 (2019).
13. Agostinelli, S. et al. GEANT4—a simulation toolkit. Nucl. Instrum. Methods Phys. Res. Sect. A 506, 250–303 (2003).
14. MacQueen, J. Some methods for classification and analysis of multivariate observations. in Proceedings of the fifth Berkeley symposium on mathematical statistics and probability. 281–297 (Oakland, CA, USA).
15. Endo, K. et al. PET and PET/CT using ¹⁸F-FDG in the diagnosis and management of cancer patients. Int. J. Clin. Oncol. 11, 286–296 (2006).
16. Hugenschmidt, C. Positrons in surface physics. Surf. Sci. Rep. 71, 547–594 (2016).
17. Beyer, T. et al. A combined PET/CT scanner for clinical oncology. J. Nucl. Med. 41, 1369–1379 (2000).
18. Ito, M., Hong, S. J. & Lee, J. S. Positron emission tomography (PET) detectors with depth-of-interaction (DOI) capability. Biomed. Eng. Lett. 1, 70–81 (2011).
19. Ahmad, A. & Dey, L. A k-mean clustering algorithm for mixed numeric and categorical data. Data Knowl. Eng. 63, 503–527 (2007).
20. Keras-team, K. D. The python deep learning library. Available (accessed 5 May 2019). io.
21. Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. in 27th International Conference on Machine Learning (ICML). (2010).
22. Dozat, T. Incorporating Nesterov momentum into Adam. in ICLR Workshop (2016).
23. Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 (2014).
24. Ruder, S. An overview of gradient descent optimization algorithms. arXiv:1609.04747 (2016).

Acknowledgements

This research was supported by the MSIT (Ministry of Science, ICT), Korea, under the High-Potential Individuals Global Training Program (2021-0-01544) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation) and the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Korea (no. 2020R1A6A3A01099805 and 2021R1A2B5B03002006).

Author information

Authors and Affiliations

Department of Physics, Kyungpook National University, Daegu, 41566, Korea
Jin Jegal, Dongwoo Jeong, HyeoungWoo Park & Hongjoo Kim
Department of Physics, University of Maryland, College Park, MD, 20742, USA
Eun-Suk Seo

Contributions

J.J. performed the deep learning and data analysis and contributed to writing the manuscript. D.W.J. performed the simulation and data generation. E.S.S. contributed to data interpretation and revising the manuscript. H.W.P. and H.J.K. contributed to the conception of this study and data interpretation and secured financial support. All authors reviewed the manuscript.

Corresponding authors

Correspondence to HyeoungWoo Park or Hongjoo Kim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Video 1.

Supplementary Information 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jegal, J., Jeong, D., Seo, ES. et al. Convolutional neural network-based reconstruction for positronium annihilation localization. Sci Rep 12, 8531 (2022). https://doi.org/10.1038/s41598-022-11972-5

Received: 30 December 2021
Accepted: 27 April 2022
Published: 20 May 2022
DOI: https://doi.org/10.1038/s41598-022-11972-5
