Introduction

With the remarkable advancement of artificial intelligence (AI), deep learning technology is being increasingly used in medical imaging, including magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography1. The image-to-image translation is a popular topic of deep learning2. For the central nervous system, several studies have been conducted to generate synthetic images, for example, generating contrast-enhanced MRI of brain tumors from non-contrast-enhanced MRI3, generating CT imaging from MRI4, and removing motion artifacts from cerebral angiography5. These techniques are expected to reduce the examination time, omit minor invasive procedures such as contrast administration, and reduce or prevent radiation exposure.

Previous studies have focused on generating morphological images; however, few studies have been conducted on synthesizing metabolic or physiological images. Recently, methionine PET was developed from contrast-enhanced T1-weighted imaging of subjects with brain tumors6. This study estimated the amino acid metabolism of brain tumors and categorized high- and low-grade gliomas. Hence, the image-to-image translation technique could be applied to metabolic images. In addition to PET, metabolic and physiological information can be estimated from MRI using diffusion tensor images (DTI). DTI is an advanced sequence of diffusion weighted images (DWI), requiring equal or more than six directions of the diffusion gradient7. DTI has the advantage of being able to predict physiological information, including tissue density, directional strength, continuity of nerve fibers7, and brain connectivity8,9. However, DTI examinations take longer than conventional DWI. Therefore, the DTI sequence has not been included in routine MR examinations.

If DTI can be obtained as a synthetic image from conventional DWI using an AI technique, detailed physiological information about the brain can be easily assessed within a short inspection time. However, to date, no studies have generated metabolic or physiological MR images, such as DTI, using image-to-image translation techniques. Therefore, this study aimed to create an image-to-image translation AI model that synthesizes DTI from conventional DWI. Further, we aimed to validate the similarities between the original and synthetic DTI.

Materials and methods

Subjects

The Ethical Committee of Osaka Metropolitan University Graduate School of Medicine approved this study (IRB:#2021–109), and all participants provided written informed consent before participation. This study complies with the Declaration of Helsinki. Thirty-two healthy volunteers (males, 16; mean age, 30 years; range, 20–38 years) without a history of brain disease or intracranial surgery were prospectively recruited at our institution between July 2021 and January 2022. No subjects were excluded.

MRI acquisition

MRI was performed using a 3.0-T scanner (Magnetom Vida; Siemens, Erlangen, Germany). The DTI included images with the b-value of 0 and 1000 s/mm2. Further details are provided in Table 1 and Online Appendix-S1. Three-dimensional T1-magnetization prepared rapid acquisition with gradient-echo images were obtained as anatomical images. Whole-brain DTI with 76 sections were acquired using a 2-dimensional single-shot echo planar imaging sequence with six direction of motion probing gradient (MPG). DTI was denoised and corrected for Gibbs ringing artifacts, motion and eddy currents, susceptibility-induced distortions, and bias field inhomogeneities using the MRtrix3 software (http://www.mrtrix.org/). DWI with three directions of the MPG was also performed using the same parameters and imaging planes. The b1000 images of DWI were registered to the corresponding DTI with a 6-degree of freedom rigid transformation using the FLIRT function of the FMRIB Software Library version 6.0 (FSL; Oxford, UK; www.fmrib.ox.ac.uk/fsl) since slight misregistration may have occurred between the DTI and DWI sequences, which affects the precision of the image-to-image translation. In contrast, the b0 images of DWI were not registered to the DTI; hence, registered b1000 images of DWI and original b0 image of DTI were utilized for generating maps of synthetic DTI.

Table 1 MRI acquisition parameters.

Data partition

Thirty-two subjects were randomly divided into training, validation, and test datasets of 26, 2, and 4 subjects, respectively.

AI model overview

DTI data require equal or more than six MPG directions7. Thus, this study synthesized one MPG direction of DTI from DWI (three MPG directions of the x-, y-, and z-axes) and repeated this six times according to each MPG direction. To develop the AI model, DWI (signal intensities of the x-, y-, and z-axis MPG were used as three channels) and one MPG direction of DTI in the same plane (identical signal intensities were input to the three channels) were paired as a png file. Since the order and direction of the MPG were same relative to the gantry for all subjects using the MR machine provided by Siemens, the DTI data could be classified according to the MPG direction. Since signal intensities above 255 on DWI were anticipated to be noise, a signal intensity threshold of 255 was set for DWI to scale the data, to simplify it for the AI model to process. The signal intensities were not normalized since the raw values were important for predicting physiological information.

A deep-learning model was developed based on pix2pix, a generative adversarial network, which is an image-to-image translation model that uses paired images in the training and validation datasets2. In this model, the generator adopts a U-Net-based architecture10 and the discriminator adopts a convolutional PatchGAN classifier11. The model was trained using the training dataset, adjusted to the validation dataset, and evaluated on the test dataset.

An overview of the AI model is presented in Fig. 1. The learning process was performed six times, because one AI model was needed for one MPG direction. Thus, six AI models were created for each MPG direction. The details are presented in Online Appendices-S2 and S3.

Figure 1
figure 1

Explanation and overview of the deep learning-based model. Step 1 is the image-generation phase. The generator generates one slice of synthetic DTI of one MPG direction from one slice of DWI (three channels: x-, y-, and z-axis MPG). The original DWI is then concatenated with the synthetic DTI. Step 2 is the learning phase of the discriminator. The concatenated image of one slice of original DWI and synthetic DTI from step 1 or a concatenated image of one slice of original DWI and original DTI from training data are input to the discriminator. The discriminator correctly distinguishes the synthetic and original DTI; therefore, the loss value is set to be small if the discriminator is correct, whereas it is set to be large if the discriminator is wrong. The resulting loss value is backpropagated to the discriminator and the parameters are updated. Step 3 is the learning phase of the generator. The generator generates synthetic DTI with such a high similarity to the original DTI that they can be mistakenly recognized by the discriminator. Hence, the loss value is set to be large if the discriminator is correct, whereas it is set to be small if the discriminator is wrong. In addition, the L1 loss value from the original and synthetic DTI are obtained. These loss values are combined to form the loss value of the generator, and the parameters of the generator are updated. Steps 1 through 3 are repeated as the learning progresses for one MPG direction with the epoch number of 100. DTI: diffusion tensor imaging, MPG: motion probing gradient, DWI: diffusion weighted imaging.

Visualization of DTI and region of interest (ROI) placement

The separated six MPG directions and 76 slices of the synthetic DTI of the test dataset were 4-dimensionally concatenated as synthetic DTI data. From the original and synthetic DTI, fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), and color-coded maps were generated using the DTIFIT function of FSL software. In this process, the same b vector information and b0 images were used in the original and synthetic DTI.

The ROIs of the lentiform nucleus (LN), thalamus, posterior limb of the internal capsule (PLIC), posterior thalamic radiation (PTR), and splenium of the corpus callosum (SCC) were automatically segmented for each subject using the ICBM DTI-81 atlas. Then, the ROIs were applied to the FA, MD, AD, and RD maps derived from the original and synthetic DTI to calculate the mean values and signal-to-noise ratio (SNR) within each ROI (Fig. 2). The ROI creation method is shown in Online Appendix-S4.

Figure 2
figure 2

The separated six MPG directions and 76 slices of the synthetic DTI of the test dataset are 4-dimensionally concatenated as one synthetic DTI data. From the original and synthetic DTI, original and synthetic FA map, MD map, AD map, RD map, and color-coded map are generated. ROIs of the LN, thalamus, PLIC, PTR, and SCC are applied to the maps. MPG: motion probing gradient, DTI: diffusion tensor imaging, FA: fractional anisotropy, MD: mean diffusivity, AD: axial diffusivity, RD: radial diffusivity, ROI: region of interest, LN: lentiform nucleus, PLIC: posterior limb of the internal capsule, PTR: posterior thalamic radiation, SCC: splenium of the corpus callosum.

Statistical analyses

The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the separated DTI data were calculated2,12. PSNR > 40 indicated a low degree of degradation, whereas SSIM > 0.8 indicated a high degree of similarity12,13. For the test dataset, the mean values of the original and synthetic FA/MD/AD/RD maps within each ROI were calculated. The SNR within each ROI was calculated using the following formula: SNR = mean signal intensity/standard deviation (SD). These values were compared between the original and synthetic DTI using a paired-t test (P < 0.05, indicating statistical significance). A visual inspection of noise quality was conducted by three radiologists, who were blinded to the original and synthetic DTI; they reviewed the color-coded, FA, MD, AD, and RD maps and selected the noisier images for each map. The Bland–Altman plot between the mean values of the FA/MD/AD/RD maps derived from the original and synthetic DTI was evaluated. The analyses were performed using R (version 4.0.0, 2020; R Foundation for Statistical Computing) or GraphPad Prism version 9.4.1 software (GraphPad Software, San Diego, CA, USA; https://www.graphpad.com/scientific-software/prism/).

Results

The training dataset included 1976 image pairs of one direction of MPG image and DWI from 26 subjects (male, 13; mean age ± SD, 29 ± 5.3 years). The validation dataset included 152 image pairs from two subjects (one male [32 years old] and one female [24 years old]). The test dataset included 304 image pairs from four subjects (male, 2; mean age ± SD, 30 ± 5.8 years).

PSNR and SSIM of the separated DTI data were 32.7 and 0.92, respectively.

The results of the visual inspection for noise quality, as detailed in Online Appendix-S5, indicated that all synthetic DTI maps exhibited more noise compared to their original counterparts.

Table 2 presents the mean values and SNR of the original and synthetic DTI within each ROI. Only the AD of the PTR and SCC showed significant differences between the original and synthetic maps (P = 0.047 and 0.017, respectively), whereas the SD of all analyses was larger in the synthetic maps than in the original maps. The SNR of the FA map within the LN was significantly higher in the synthetic data (P = 0.012), whereas the SNR in the other ROIs tended to be lower in the synthetic data than in the original data. Figure 3 shows the Bland–Altman plot of the FA, MD, AD, and RD values within the LN, thalamus, PLIC, PTR, and SCC. Except for the plot corresponding to subject 3, each plot showed a similar distribution. In the evaluation of MD and RD, most values were distributed under zero, indicating that the synthetic MD and RD tended to show higher values than the original data. In the evaluation of AD, values of the white matter were localized above zero, indicating that synthetic AD tended to have lower values than the original data.

Table 2 Each value and SNR of original and synthetic maps within the gray and white matter.
Figure 3
figure 3

Bland–Altman plots between mean values of the original and synthetic FA/MD/AD/RD maps within each ROI. Except for subject 3, each value shows a similar distribution. In the evaluation of MD and RD, most plots localize under zero, indicating that the synthetic data tend to show higher values than the original data. In the evaluation of AD, plots of the white matter localize above zero, indicating that the synthetic data tend to show lower values than the original data. FA: fractional anisotropy, MD: mean diffusivity, AD: axial diffusivity, RD: radial diffusivity, ROI: region of interest.

Discussion

In this study, DTI was synthesized from conventional DWI using an image-to-image translation model, and the imaging similarity of physiological parameters between the synthetic and original images were evaluated. The variation in all values was larger, and the SNR tended to be lower in synthetic DTI than in original DTI. Except for one subject, the Bland–Altman plots showed that each plot had a similar distribution. The synthetic data of MD and RD showed higher values than the original data, whereas the synthetic data of AD in the white matter showed lower values than the original data.

Synthetic DTI generated using conventional DWI has two main advantages: reduced scan time and reduced motion artifacts. Acquiring synthetic DTI can reduce the scan time compared with acquiring ordinal DTI. Reduced scan time would enable acquiring several MPG directions (64 or more], once the AI model has been established14. In fact, using the MR scanner at our institution, the scan times for acquiring DTI with six and 64 MPG directions were 55 s and 5 min 16 s, respectively, whereas that of conventional DWI was shorter compared with DTI. This is particularly beneficial for patients who have difficulty in remaining still during lengthy MR examinations15. Shorter scan times can lead to improved imaging quality since there is a decreased likelihood of motion artifacts resulting from movement during the scan.

PSNR and SSIM are often used to evaluate the imaging similarity generated from an image-to-image translation technique. According to previous studies12,13, the PSNR and SSIM of synthetic DTI data appear to indicate moderate similarities. Since this study had a small sample size, recruiting more subjects may improve the PSNR and SSIM. However, it is unclear whether the PSNR and SSIM are adequate for evaluating medical imaging, and a gold standard for evaluating physiological imaging has not yet been established. Therefore, a new index for evaluating synthetic medical imaging is required for future studies.

Consistent with the current study, FA is known to be higher in the white matter than in the gray matter, whereas MD shows similar values in the gray and white matter16. However, in synthetic DTI, the variation in all values was larger, and the SNR tended to be lower than that of the original DTI. When we reviewed DTI maps visually, the synthetic DTI maps were noisier than the original DTI maps. To further investigate, we conducted the same analysis using denoised and corrected DWI. Nevertheless, the SSIM and PSNR values of the synthetic images remained virtually unchanged (detailed data not shown), indicating that denoising and artifact correction of the input DWI did not notably influence the quality of the synthetic images. Our image-to-image translation model separately learned to generate images in each MPG direction and then concatenated the synthetic images of the six MPG directions. Thus, this model cannot learn how to reduce the noise that occurs during the creation phase of each map. Due to the increased noise, we could not evaluate the differences between synthetic and original DTI using other methods, such as tractography or connectome analyses17. Therefore, additional ingenuity may be required to reduce the noise in maps produced by DTI.

For the subject-wise evaluation, the values of one subject (subject 3) seemed to be clustered among the test dataset in the Bland–Altman plot, indicating that the maps of synthetic DTI may have been affected by conventional DWI. Despite thorough visual comparative review of subject 3's original DWI against others, specific causative factors, such as noise or artifacts, remained unidentified. Hence, when using synthetic DTI, a subject-wise coefficient or reference region may be required for imaging harmonization. When we evaluated the distribution of the plots of the MD, AD, and RD values, the synthetic data in the MD and RD maps tended to show higher values and those in the AD maps of the white matter showed lower values than the original data. The number of MPG directions may have affected the map values18. A previous study suggested that at least 20 and 30 MPG are necessary for robust estimations of FA and MD, respectively19. Since this study focused on a new technique that created synthetic DTI from conventional DWI, a minimum number of MPG directions was selected. Increasing the number of MPG directions may improve the similarities between the values derived from original and synthetic DTI.

This study has several limitations. Firstly, the sample size and the number of MPG directions were small. Increasing the number of participants or obtaining more MPG directions may improve the quality of the synthetic DTI. Secondly, synthetic physiological MR images have not been previously reported; thus, a method for evaluating synthetic medical imaging has not been established. Further validation of the results of this study is required. Conversely, since image-to-image translation techniques have recently been developed and are expected to show further development in various fields, including the medical field, the current results could serve as a reference for future studies. Thirdly, while a 2D GAN model, which utilizes versatile 2D RGB images, was used in this study due to the small dataset, developing a 3D model might enhance the synthesis of DTI and is worth exploring in future research. Models like StarGAN, which can handle multiple domains, may also improve image quality. However, it is expected that the use of a 3D AI model or a specific image generation model would require a larger dataset.

Conclusion

Although further improvements and validation are required, generating synthetic DTI from conventional DWI using an image-to-image translation model is possible. Overall, our study would promote efficient diagnosis and prevent adverse effects associated with the radiological modalities.