Deep robust residual network for super-resolution of 2D fetal brain MRI

Spatial resolution is a key factor in quantitatively evaluating the quality of magnetic resonance imaging (MRI). Super-resolution (SR) approaches can improve spatial resolution by reconstructing high-resolution (HR) images from low-resolution (LR) ones to meet clinical and scientific requirements. To increase the quality of brain MRI, we study a robust residual-learning SR network (RRLSRN) that generates a sharp HR brain image from an LR input. Because the Charbonnier loss handles outliers well and the Gradient Difference Loss (GDL) sharpens images, we combined the two losses to improve the robustness of the model and enhance the texture of the SR results. Two adult-brain MRI datasets, Kirby 21 and NAMIC, were used to train the model and verify its effectiveness. To further assess the generalizability and robustness of the proposed model, we collected eight clinical 2D fetal brain MRI scans for evaluation. The experimental results show that the proposed deep residual-learning network achieves superior performance and higher efficiency than the compared methods.

www.nature.com/scientificreports/

an efficient sub-pixel CNN. When the redundant nearest-neighbor interpolation was replaced with this interpolation, the deconvolutional layer simplified into a sub-pixel convolution, which is more efficient than nearest-neighbor interpolation. Although these models demonstrated promising results, they all required input images upscaled to the desired spatial resolution via bicubic interpolation before applying the network, and they did not use low-level feature information. To cope with these limitations, some SR algorithms have adopted residual learning5,7-9,13,21,22, showing effective improvements.
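As an illustration of the sub-pixel idea (our own sketch, not code from the paper), the depth-to-space rearrangement at the heart of a sub-pixel convolution can be written in a few lines of NumPy; the function name `pixel_shuffle` and the toy shapes are ours:

```python
import numpy as np

def pixel_shuffle(feature_maps, scale):
    """Rearrange (C*r^2, H, W) feature maps into (C, H*r, W*r).

    This is the depth-to-space step of a sub-pixel convolution layer:
    the convolution itself runs at low resolution, and upscaling is a
    cheap reshuffle of its output channels.
    """
    c_r2, h, w = feature_maps.shape
    r = scale
    c = c_r2 // (r * r)
    x = feature_maps.reshape(c, r, r, h, w)  # split channels into sub-pixel offsets
    x = x.transpose(0, 3, 1, 4, 2)           # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

# 4 channels at 2x2 become 1 channel at 4x4 for scale 2
lr_features = np.arange(16, dtype=float).reshape(4, 2, 2)
hr = pixel_shuffle(lr_features, 2)
print(hr.shape)  # (1, 4, 4)
```

Because the reshuffle is a pure memory rearrangement, it avoids the redundant multiplications that a deconvolution over a nearest-neighbor-upscaled input would perform.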
In this work, our contributions have three aspects: (1) To address the computational cost and avoid generating fake features, we adopted a deep residual network that learns residuals in a coarse-to-fine fashion. (2) To sharpen the SR image, we combined the Gradient Difference Loss (GDL)23 with the robust Charbonnier loss function, which copes with outliers and improves reconstruction accuracy. (3) We collected eight clinical fetal-brain MRIs to further evaluate the generalizability and robustness of the proposed model. Figure 2 shows example HR slices produced by the different algorithms (cubic spline interpolation, non-local means up-sampling (NMU)24, low-rank total variation (LRTV)25, and SRCNN26) for visual inspection against the ground-truth and LR MR images on Kirby 21, NAMIC1, and the clinical fetal MR images, respectively. All figures in our paper were drawn with Microsoft Office PowerPoint 2016 (https://www.office.com/). It can be seen that our approach recovered fine details and preserved edges.

Experimental results
The SR deep-learning technique is not strongly limited by MRI acquisition parameters and can therefore be transferred to the fetal brain. Thus, we applied our model to fetal MRIs provided by the First Affiliated Hospital of Xi'an Jiaotong University. We labeled and extracted the fetal brain on each MRI. The MRI of each fetus was cut into 10-20 slices, and we tested all slices of each fetus. Figure 2c shows example SR slices from the different algorithms on one subject. The MR images reconstructed by our network provide more details than those of the other algorithms. The error maps in Fig. 3 make it easier to identify the differences between the methods.
For a quantitative comparison, the average peak signal-to-noise ratio (PSNR) and structural similarity (SSIM)27 were used to evaluate the performance of each algorithm. Tables 1, 2 and 3 summarize the quantitative evaluation at a scale factor of two, including the mean, standard deviation (SD), and 95% confidence interval (CI) of PSNR and SSIM. The results show that CNN-based approaches (e.g., SRCNN and our RRLSRN model) achieved better performance than cubic spline, NMU, and LRTV. Our experiments also showed that residual-learning approaches were more effective than SRCNN.
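For reference, PSNR and the table statistics (mean, SD, 95% CI) can be computed as below. This is our own minimal sketch, not the paper's evaluation code; the `data_range` of 255 and the z-value of 1.96 are assumptions:

```python
import numpy as np

def psnr(reference, estimate, data_range=255.0):
    """Peak signal-to-noise ratio in dB between a ground-truth slice
    and a reconstructed slice."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def summarize(values, z=1.96):
    """Mean, sample SD, and 95% confidence interval for a list of
    per-slice scores, as reported in the tables."""
    v = np.asarray(values, dtype=float)
    mean, sd = v.mean(), v.std(ddof=1)
    half = z * sd / np.sqrt(len(v))
    return mean, sd, (mean - half, mean + half)

gt = np.full((8, 8), 128.0)
rec = gt + 1.0                   # uniform error of one grey level
print(round(psnr(gt, rec), 2))   # 48.13
```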
In our model, we combined the Charbonnier loss and GDL for training. To verify the effect of the GDL on the SR results, we compared the PSNR of the model trained without GDL on the 8 clinical fetal brain MR images; the results are shown in Table 4. For all 8 fetal MR images, the PSNR with GDL is higher than without it. The results demonstrate that the GDL helps improve image quality.
Our experiments have shown that the proposed model with GDL can enhance the brain edges in MRI. Figure 4 shows the visual difference between our model with and without GDL on the clinical fetal brain MRI dataset. As indicated by the yellow arrow, the reconstruction produced with GDL has sharper edges and is closer to the HR image than that of the model without GDL. To demonstrate the effect of the transposed convolution, we also trained the model without the transposed convolution at the bottom of the network and compared the PSNR on the 8 clinical fetal brain MR images; the results are shown in Table 5. The experimental results show that the transposed convolution at the bottom helps improve the accuracy of the results, and that residual learning is beneficial to the model.
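To illustrate why a transposed convolution performs learned upsampling (our own sketch, not the network's actual layer), it can be viewed as zero insertion followed by an ordinary convolution; the fixed bilinear-like kernel below stands in for learned weights:

```python
import numpy as np

def conv2d_same(x, k):
    """Plain 'same' 2D correlation with zero padding."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def transposed_upsample(x, k, stride=2):
    """Transposed convolution seen as zero insertion + convolution:
    place LR samples on a sparse HR grid, then smooth with a kernel."""
    h, w = x.shape
    z = np.zeros((h * stride, w * stride))
    z[::stride, ::stride] = x
    return conv2d_same(z, k)

# a fixed bilinear-like kernel standing in for learned weights
k = np.array([[0.25, 0.5, 0.25],
              [0.5,  1.0, 0.5],
              [0.25, 0.5, 0.25]])
up = transposed_upsample(np.ones((4, 4)), k)
print(up.shape)  # (8, 8)
```

Unlike a fixed interpolation kernel, the network learns the kernel weights, so the upsampling step can adapt to the data.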
To verify the efficiency of our algorithm, we measured the test time of each method on Kirby 21, NAMIC, and the fetal MR images, and compared the runtimes; the results are shown in Table 6. The average speed of our model was faster than those of NMU, LRTV, and SRCNN (faster version)19 on all three datasets.

Discussion
In this work, we proposed a network-based algorithm to learn the residual information between upsampled MR images and HR MR images. Our approach adopted the robust Charbonnier loss function and the GDL, both of which help train the model. To demonstrate the potential of SR methods for enhancing the quality of LR images, we presented an experiment transferring image quality from an HR experimental dataset to LR images. The results on two brain MR image datasets show that our algorithm outperforms cubic spline, NMU, LRTV, and SRCNN. The RRLSRN effectively learned the residual information between upsampled LR MRI and HR MRI; the model not only improves the accuracy of the SR results but also greatly reduces the computational cost. We then applied the model to the clinical fetal MR images, where the SR results of the proposed RRLSRN were better than those of the methods listed above and the texture of the SR results was more detailed.
In terms of processing speed, we observed that at the ×2 scale our method was faster than NMU, LRTV, and SRCNN on both the Kirby 21 and NAMIC datasets. Overall, our algorithm performed well in terms of speed.
Our SR method has shown clear improvement over the other listed methods in terms of visualization, quantitative evaluation, and computational efficiency. Our model currently performs SR at a scale of ×2 on 2D MR slices; it can also be extended to ×4 or ×8 SR reconstruction by cascading. In future work, we will improve our residual-learning SR framework to obtain better accuracy while reducing computational complexity. In addition, we will further apply the SR technology, combined with the imaging equipment, to improve the accuracy and validity of clinical diagnosis.

Methods
The residual is defined as

(1) r = y − u(κBy) = y − u(x) = y − z

where x and y represent the LR and HR images, respectively, B is the blurring operator, κ is the down-sampling operator, u is the up-sampling operator, z = u(x) is the up-sampled LR image, and r is the residual information between the HR MRI and the bicubic-interpolated MRI. The model can learn the residual features and up-sampling features with normal and transposed convolutional layers. The network architecture used in this study is illustrated in Fig. 1b. For the fetal data, we segmented and extracted the fetal brains as shown in Fig. 1a.
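To make the residual formulation r = y − u(κBy) concrete, here is a small NumPy sketch (our own illustration, not the paper's code), with 2×2 block averaging standing in for the blur-and-decimate operator κB and nearest-neighbour repetition standing in for u:

```python
import numpy as np

def downsample(y, scale=2):
    """Stand-in for the blur + decimation operator kB: block averaging."""
    h, w = y.shape
    return y.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def upsample(x, scale=2):
    """Stand-in for the up-sampling operator u: nearest-neighbour repeat."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

y = np.random.default_rng(0).random((8, 8))  # an "HR" slice
x = downsample(y)                            # simulated LR input, x = kBy
r = y - upsample(x)                          # residual target, r = y - u(x)
print(np.allclose(upsample(x) + r, y))       # True: y = u(x) + r
```

The network only has to predict r, the detail lost by interpolation, which is an easier target than regressing the full HR image.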
The main architecture of the network for feature extraction consists of 13 convolutional layers and two transposed convolutional layers that up-sample the extracted features by a scale of two. Because the fetal MRI slice sequences do not enable a 3D representation, we designed our model with 2D convolutions. The convolution kernel size was 3 × 3 × 64, and the transposed convolutions were 4 × 4 × 1. Our model performs feature extraction at a coarse resolution and generates feature maps with finer details using the transposed convolutional layers. Compared with the listed networks, our network reduces computational complexity significantly.

Loss function. This approach can learn the information lost by interpolation, and it can also reduce computational complexity. We optimized the network with a Charbonnier loss4. Let x be the input; we denote the ground-truth HR MRI slice by y, the generated HR MRI slice by ŷ, and the residual information of the MRI by r. The overall Charbonnier loss function is:

(2) L_Charbonnier(ŷ, y) = (1/s) Σ_{i=1}^{s} √((ŷ_i − y_i)² + ε²)

where s represents the number of training samples and ε is a very small constant, empirically set to 1e−3. We trained our model with the Charbonnier loss function instead of the L2 loss because it is robust, coping with outliers and improving the accuracy of the MRI SR results. We also combined the GDL, which directly penalizes differences of image gradients to sharpen the SR result. The GDL for one image pair is defined as:

L_GDL(ŷ, y) = Σ_{i,j} ( | |y_{i,j} − y_{i−1,j}| − |ŷ_{i,j} − ŷ_{i−1,j}| | + | |y_{i,j} − y_{i,j−1}| − |ŷ_{i,j} − ŷ_{i,j−1}| | )

and the overall GDL loss is its average over the s training samples, where |·| denotes the absolute value function.
Then the final combined loss is:

L = L_Charbonnier + λ L_GDL

where λ is a weight balancing the two terms.
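A minimal NumPy sketch of the two loss terms and their combination (our own illustration, not the training code; the weighting `lam` is an assumption, since the paper does not state how the terms are balanced):

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Robust Charbonnier penalty, a differentiable variant of L1."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

def gdl(pred, target):
    """Gradient Difference Loss: penalize mismatched image gradients
    along both axes so edges stay sharp."""
    dy_p, dy_t = np.abs(np.diff(pred, axis=0)), np.abs(np.diff(target, axis=0))
    dx_p, dx_t = np.abs(np.diff(pred, axis=1)), np.abs(np.diff(target, axis=1))
    return np.mean(np.abs(dy_p - dy_t)) + np.mean(np.abs(dx_p - dx_t))

def total_loss(pred, target, lam=1.0):
    """Combined objective: Charbonnier + lam * GDL (lam is assumed)."""
    return charbonnier(pred, target) + lam * gdl(pred, target)
```

Note that for a perfect prediction the Charbonnier term floors at ε rather than zero, while the GDL term vanishes exactly.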

Dataset and training details
To verify the ability of our method to reconstruct HR MRI slices of the brain, we applied it to two adult-brain datasets (Kirby 21 and NAMIC) and eight clinical fetal MRIs.
Dataset. Kirby 21 and NAMIC were used as the adult-brain datasets. For NAMIC, we tested the model on case01011 through case01034; the remaining images were used for training. All eight fetal brain MRIs were used for testing. LR images were generated using a scale factor of two.

Training details. We initialized the network using the model of Lai4. The slope of the leaky rectified linear units was 0.2. We padded with zeros so that the feature-map size of each layer matched the input, and we trained the model by randomly sampling 64 patches of size 128 × 128. We set the momentum parameter to 0.9 and the weight decay to 1e−4. The learning rate was initialized to 1e−5 and decreased by a factor of two every 50 epochs. We ran the original code of the compared methods to measure runtime on the same computer, with an Intel i7 processor (64 GB RAM) and an Nvidia Tesla V100 graphics processor (16 GB memory).
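The step learning-rate schedule described above can be sketched as follows (our own helper; `learning_rate` is not from the paper's code):

```python
def learning_rate(epoch, base_lr=1e-5, drop_every=50, factor=0.5):
    """Step schedule from the training setup: start at 1e-5 and
    halve the rate every 50 epochs."""
    return base_lr * factor ** (epoch // drop_every)

print(learning_rate(0))    # 1e-05
print(learning_rate(100))  # 2.5e-06
```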