Low-field (LF) magnetic resonance imaging (MRI) has recently regained attention, with multiple initiatives aiming not only to democratize MRI worldwide, but also to complement conventional high-field MRI. Mostly driven by simpler and scalable magnet construction, LF MRI offers increased accessibility through reduced purchasing and maintenance costs, cryogen-free infrastructure, reduced susceptibility artifacts, higher T1 contrast, and potential for smaller-footprint designs1,2,3,4. Yet, LF MRI suffers from lower sensitivity due to an intrinsically lower nuclear spin polarization. Consequently, the associated lower signal-to-noise ratio (SNR) per unit time and volume almost invariably requires averaging, and hence prolonged acquisition times that may restrict its clinical relevance. Technical progress achieved over the past decades in diverse areas of MRI, e.g. power electronics, radio frequency detection, sequence programming, and image processing, has contributed to bringing LF back to the surface1,2, but further acceleration remains paramount to push the achievable SNR and voxel resolution per unit time, aiming at broad deployment in radiology departments and beyond.

Regardless of magnetic field strength, MRI requires encoding the signal originating from 1H nuclei to form an image, which may result in long imaging times. k-space undersampling strategies have been widely used to accelerate acquisitions, but typically lead to information loss or reconstruction artifacts after Fourier Transform (FT). Restoring the unaltered image then becomes an ill-posed inverse problem5 where regularization methods involving prior knowledge, such as the sparsity of MR data in a certain domain (e.g. the Wavelet domain), can be leveraged, as done in Compressed Sensing (CS)6. Yet, the application of CS in clinical routine is still limited as it requires complex parameter tuning, sometimes very specific to an application or image type, and suffers from long online reconstruction times. An alternative approach to accelerating MR acquisitions is parallel imaging (PI)7,8. As opposed to CS, PI is widespread and employed in clinical routine. In PI methods, the undersampled k-space is acquired using multiple receiver coils, and the spatial dependence of their \({B}_{1}^{-}\) field is leveraged to turn the initially ill-posed problem into a well-posed inverse problem that can be solved by a direct matrix inversion8. As noise is severely amplified at high undersampling rates though (characterized by the g-factor), a limited speed-up of fourfold (maximum) is generally observed. Because it relies on body-noise dominance to exploit the spatial dependency of coil elements and spatially varying SNR, PI is however not indicated to accelerate MRI below a certain magnetic field strength. Indeed, below 5 MHz, sample noise may no longer dominate the acquisition chain9, and such a reconstruction paradigm no longer holds.

Deep Learning (DL) is an emerging field of research that has shown promising results in a variety of domains, including MRI. In fact, DL has provided a new paradigm to solve ill-posed inverse problems, and particularly the reconstruction of undersampled MR acquisitions, where superior performance to widespread CS and PI acceleration methods has been shown10,11,12,13,14,15. Here, we hypothesize that DL could be a serious means to accelerate LF MRI.

In DL approaches, neural networks made of numerous layers (hence referred to as “deep”) are used to learn the complex functions that directly map input data to a corresponding output. DL models, and especially data-driven approaches with a large number of free parameters, require large training sets to avoid what is called overfitting5. Overfitting occurs when a model learns from a training set whose variety is limited, hence failing to generalize to images too different from the training set. Considering such limitations, initiatives were proposed to provide large-scale, open-access databases of processed MR images16,17,18,19 or raw k-space data20 to researchers in the MRI community. Interestingly, two aspects of undersampled DL-MR works often seem overlooked, namely prospective validation and phase image reconstruction. Prospective validation is nevertheless a key step to properly assess the performance of proposed methods on undersampled data acquired in real conditions (i.e. not subsampled a posteriori). The integration of the complex-valued data inherent to MRI could additionally open new perspectives for the application of DL to phase-contrast MR techniques such as magnetic susceptibility mapping, or further flow and motion encoding. It may also provide more accurate outcomes, as MRI data are not limited to the most commonly used magnitude information. Yet only little research has reported on phase data reconstruction, none of it validated prospectively21,22. Meanwhile, it should be noted that several studies were inherently limited by model architectures that cannot handle complex input data, or by training sets constituted solely of magnitude images.

At LF, few attempts leveraging DL for undersampled MRI data reconstruction have yet been reported23. Schlemper et al. developed a non-uniform variational model to simultaneously reconstruct undersampled data (3.5-fold) and correct hardware-related artifacts for a point-of-care scanner (64 mT)23. In a different work, Koonjoo et al. exploited a model named AUTOMAP to improve SNR at low field, at 6.5 mT and 47 mT, respectively24.

In general, DL approaches face a major challenge at LF: there is no large-scale LF database acquired at the same magnetic field strength. Indeed, the number of LF scanners and clinical exams is still very limited and there is no consensus as to what field regime constitutes LF MRI. Consequently, LF databases remain private and rather small on both global (worldwide) and laboratory scales. The above-mentioned LF studies relied on an open-access conventional (i.e., high-field) MR database from the human connectome project (HCP)19 to train and test their models. The latter was pre-processed ad hoc by adding noise23,24 and artifacts that would result from patient motion to simulate LF brain data, and to account for the limited gradient linearity specific to the system used23. Despite such processing, peculiarities such as the noise regime and image contrast at LF can be quite different from what is expected at conventional fields (1.5–3 T). Typically, the magnetic properties of biological tissue change at low field, leading to a stronger T1 dispersion that implies different soft tissue contrast, and noise in the reception chain may no longer be dominated by the sample but instead by thermal noise from the electronics. Therefore, training with preprocessed conventional-field data might very well have a negative impact on reconstruction outcomes. First, it can translate into decreased model performance because of the discrepancy between training and test sets25. Second, it might lead to an undesired contrast transfer from conventional-field to LF DL-reconstructed MR images. Finally, we note that for both LF DL approaches referenced above, the phase information was not considered, as the HCP database contains only magnitude images.

In the proposed work, we explore the capability of a data-driven DL approach to accelerate LF MR acquisitions (here, at 0.1 T) while maintaining both magnitude and phase information. Particular emphasis was placed on tackling the challenges associated with the small datasets typically encountered at low field. Here, a relatively small dataset (n = 10) of human wrist images was collected and data augmentation was employed to mitigate data scarcity while limiting the deviation between training and testing sets. With data augmentation, the size of a small dataset is artificially expanded based on basic image manipulations (e.g. rotation, translation, cropping, noise injection, etc.) or on deep learning (e.g. generative adversarial networks)26. We used a 2-channel U-net27, one of the state-of-the-art deep learning architectures11,12,20,28,29,30, as an image-domain learning model. Its performance was first evaluated on magnitude and phase reconstructed images for three different acceleration rates (R = 3, 4 and 5). Then, in an attempt to preserve high frequencies at the highest acceleration rate (R = 5), the model performance was investigated for undersampling schemes with different k-space coverage. Finally, the proposed approach was validated on both retrospective and prospectively acquired 3D LF data, keeping a keen eye towards clinical transfer.

Materials and methods

Low-field data

A total of 10 fully sampled, 3D spoiled gradient echo (GRE) in vivo MR images of the human hand and wrist were acquired at 0.1 T, on a compact, biplanar system using a custom-built transmit/receive coil tuned at F0 = 4.256 MHz31,32. All data were collected with the following imaging parameters: matrix size = 128 × 115 × 9, voxel size = [1.2 × 1.2 × 6.3] mm3, TE/TR = 7.2/31 ms, bandwidth = 17 kHz, flip angle (FA) = 70°, number of averages (NA) = 28 (acquisition time = 14 min 56 s). The dataset was split by randomly choosing 80% (8 sets) for training and 20% (2 sets) for validation. Retrospective and prospective test data each consisted of 6 additional sets of 3D images acquired with the same protocol described above. All MRI experiments were conducted following the local ethics regulations and informed consent was obtained from all subjects. The study was approved by the Ethikkommission Nordwest- und Zentralschweiz (EKNZ) (project-ID 2022-00348).

Data preprocessing

Each 3D k-space was zero-filled to reach a 128 × 128 × 9 matrix dimension and fit the square input/output dimensions required by U-net. Even though training with non-square input dimensions (128 × 115) is feasible, we chose to avoid the potential problems of stride and padding size adjustments that could be encountered in this case. Each k-space was then inverse Fourier transformed to generate a complex 3D image. To avoid potential overfitting caused by the rather small size of our dataset and to improve the model performance, data augmentation was employed. The Keras library33 provides a 2D ImageDataGenerator class to facilitate data augmentation; to handle 3D image augmentation, a modified version of this class was used34. More specifically, for a given 3D complex image, two data augmentation functions were applied jointly to the real and imaginary parts, randomly selecting one of the following transformations: horizontal and vertical shift (range [0, ±50 pixels]), horizontal and vertical flip, shear (range [0°, 45°]), zoom (range [75%, 125%]) and rotation (range [0°, 360°]).
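The key constraint of this augmentation is that the real and imaginary parts must receive the identical randomly drawn transform, otherwise the phase is corrupted. A minimal NumPy sketch of that idea (the actual implementation uses a 3D-adapted Keras ImageDataGenerator34; only a subset of the transforms above, flips and shifts, is shown here):

```python
import numpy as np

def augment_complex(volume, rng):
    """Apply one randomly chosen geometric transform jointly to the
    real and imaginary parts of a 3D complex image (y, x, slices).
    Sketch only: horizontal/vertical flip and shift from the list above."""
    real, imag = volume.real, volume.imag
    choice = rng.integers(4)
    if choice == 0:                      # horizontal flip
        op = lambda a: a[:, ::-1, :]
    elif choice == 1:                    # vertical flip
        op = lambda a: a[::-1, :, :]
    elif choice == 2:                    # vertical shift, range [0, +/-50 px]
        s = int(rng.integers(-50, 51))
        op = lambda a: np.roll(a, s, axis=0)
    else:                                # horizontal shift, same range
        s = int(rng.integers(-50, 51))
        op = lambda a: np.roll(a, s, axis=1)
    # identical transform on both channels preserves the phase structure
    return op(real) + 1j * op(imag)

rng = np.random.default_rng(0)
vol = np.zeros((128, 128, 9), dtype=complex)
aug = augment_complex(vol, rng)
```

Because each transform is a pure permutation of pixels, the magnitude histogram of the volume is unchanged, only its geometry varies between epochs.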

The ‘augmented’ 3D complex image set was then Fourier transformed back to the k-space domain and retrospectively undersampled along both the first and second phase-encode directions (ky and kz). Afterwards, data normalization was performed by dividing each 3D k-space by its corresponding standard deviation. Pairs of input/output (undersampled and fully sampled) complex images were finally obtained following an inverse Fourier transform back to the image domain.
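The preprocessing chain above can be sketched as follows, assuming (as an illustrative convention not stated in the text) an axis order of (kx, ky, kz) so that a binary (ky, kz) pattern is broadcast over the readout direction:

```python
import numpy as np

def make_pair(img, mask):
    """Build one (undersampled, fully sampled) training pair from a
    3D complex image. `mask` is a binary (ky, kz) sampling pattern;
    axis order (x, ky, kz) is an assumption for this sketch."""
    # forward FT to k-space, centered
    k = np.fft.fftshift(np.fft.fftn(np.fft.ifftshift(img)))
    k = k / k.std()                          # normalize by the k-space std
    k_under = k * mask[np.newaxis, :, :]     # zero out unsampled (ky, kz) points
    # back to image domain: undersampled input x, fully sampled target y
    x = np.fft.fftshift(np.fft.ifftn(np.fft.ifftshift(k_under)))
    y = np.fft.fftshift(np.fft.ifftn(np.fft.ifftshift(k)))
    return x, y
```

With an all-ones mask the pair degenerates to two identical (normalized) images, which is a convenient sanity check of the pipeline.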

Model architecture and training details

In undersampled MRI, the aim is to find an optimal reconstruction function \(f:x\to y\), which maps an undersampled image \(x\) to a fully sampled image \(y\). \(f\) can be formulated as follows:

$$f=\underset{f}{\operatorname{argmin}}\,\mathcal{L}\left[f\left(x\right)-y\right]$$

with \(\mathcal{L}\) being the loss function. \(x\) is generated retrospectively using the acquired fully sampled k-space: \(x= iFT(U(k))\), where \(U\) denotes the sampling pattern and \(k\) the fully acquired k-space. To train our model, the mean squared error (MSE) was used as the loss function. The latter formula can thus be expressed as follows:

$$f=\underset{f}{\operatorname{argmin}}\sum_{i=1}^{N}{\Vert f\left({x}^{i}\right)-{y}^{i}\Vert }^{2}$$

where \(i\) indicates the ith sample in the set and \(N\) is the total number of samples used to compute the loss. In this work, the deep convolutional neural network architecture U-Net was used as our reconstruction function \(f\). U-net consists of an encoding path providing low-level features followed by a decoding path that enables precise localization of these features. Feature localization is further improved thanks to concatenation operations that bind the encoding path to the decoding path (cf. Fig. 1). It is a multiscale model with a large receptive field that can capture globally distributed artifacts29.

Figure 1
figure 1

The chosen Residual U-net used to reconstruct LF undersampled data.

In detail, U-net input/output dimensions were set to 128 × 128 × 2. The two channels correspond to the real and imaginary parts of the input/output image, hence preserving the complex nature of the data. The encoding path involves a sequence of two convolution operations (size: 3 × 3, stride 1), each followed by a ReLU activation function, and a max-pooling operation (size: 2 × 2, stride 2). This last step halves the spatial dimensions. This sequence is repeated four times and the number of filters doubles after each sequence. Similar to the encoding path, the decoding path consists of four sequences, where each sequence involves an up-sampling operation to restore image size followed by two convolution operations. The decoding path additionally involves a concatenation operation, where a higher-resolution feature map from the encoding path is concatenated to its corresponding feature map in the decoding path to recover details lost in the encoding path. The last layer is a convolution layer with 2 filters (size 1 × 1) used to map the 64-channel layer into real and imaginary outputs.

Lee et al. showed that a residual block can improve the reconstruction performance of a U-net model35. The residual block introduced in reference36 consists of a shortcut connection that performs an identity mapping of the U-net input and adds it to the output, resulting in a ‘residual U-net’. Learning this residual instead of the original alias-free full image simplifies the topology structure, which translates into easier and more accurate learning35. This method was later used in many accelerated MR acquisition studies10,11,12,14,37 and is similarly employed in our work.
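The architecture described in the two paragraphs above can be condensed into a short Keras sketch. This is an illustrative reconstruction, not the authors' code: the base filter count (16 here) and exact widths are assumptions made for brevity, while the depth, kernel sizes, 2-channel input/output and residual shortcut follow the text:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # two 3x3 convolutions (stride 1), each with ReLU, as described above
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def residual_unet(size=128, base=16, depth=4):
    inp = layers.Input((size, size, 2))        # real + imaginary channels
    x, skips = inp, []
    for d in range(depth):                     # encoding path
        x = conv_block(x, base * 2 ** d)
        skips.append(x)                        # kept for concatenation
        x = layers.MaxPooling2D(2)(x)          # halves spatial dimensions
    x = conv_block(x, base * 2 ** depth)       # bottleneck
    for d in reversed(range(depth)):           # decoding path
        x = layers.UpSampling2D(2)(x)          # restores image size
        x = layers.concatenate([x, skips[d]])  # skip connection
        x = conv_block(x, base * 2 ** d)
    out = layers.Conv2D(2, 1)(x)               # 1x1 conv to real/imag outputs
    out = layers.add([out, inp])               # residual shortcut (identity map)
    return Model(inp, out)

model = residual_unet()
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
              loss="mse")
```

The residual `add` at the end means the network only has to learn the correction to the zero-filled input, which is the simplification argued for above.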

RMSProp was adopted as the optimizer with an adaptive learning rate starting from \(10^{-3}\). The number of epochs was arbitrarily set to 2000; however, the model converged before reaching 2000 epochs without overfitting. Validation was performed every epoch to avoid overlooking overfitting phenomena. Training and validation loss curves are shown in the supplementary material (cf. supplementary Fig. S1). Additionally, repeatability experiments were carried out (cf. supplementary material). The pipeline was implemented in Python3 using the Keras library with TensorFlow38 as a backend. The learning computation was carried out on an Intel Xeon workstation with a GeForce GTX 1080 Ti Graphics Processing Unit (GPU), and took a total of 1.5 h.

Conducted studies

The phase encoding directions (ky and kz) were undersampled following 2D Gaussian sampling distributions, while a fixed number of k-space center lines (CL) was fully sampled to preserve low spatial frequency information and SNR. The study was divided into two parts: a retrospective analysis based on the results of two different experiments, and a prospective analysis performed on data acquired according to the retrospective outcomes.
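One way to realize such a pattern (the exact generation procedure is not given in the text, so the rejection-style point drawing below is an assumption) is to fully sample CL center lines and then draw the remaining points from a 2D Gaussian density until the target acceleration rate is met:

```python
import numpy as np

def gaussian_mask(ny, nz, R, n_cl, sigma_y, sigma_z, rng):
    """Binary (ky, kz) sampling mask: n_cl fully sampled center ky lines,
    plus points drawn from a 2D Gaussian density until ny*nz/R points
    are sampled. sigma_* are fractions of the matrix size, as in the text.
    Note: the degenerate sigma_y = 0 case (pattern (a)) is not handled."""
    mask = np.zeros((ny, nz), dtype=bool)
    c = ny // 2
    mask[c - n_cl // 2 : c + (n_cl + 1) // 2, :] = True   # center lines
    target = int(ny * nz / R)
    while mask.sum() < target:
        y = int(round(rng.normal(ny / 2, sigma_y * ny)))
        z = int(round(rng.normal(nz / 2, sigma_z * nz)))
        if 0 <= y < ny and 0 <= z < nz:
            mask[y, z] = True
    return mask

# example: fivefold acceleration, CL = 7, sigma_y = 0.10, sigma_z = 0.20
mask = gaussian_mask(128, 9, 5, 7, 0.10, 0.20, np.random.default_rng(42))
```

Smaller sigmas concentrate the sampled points near the k-space center (low-pass behavior), larger sigmas spread them toward high spatial frequencies, which is exactly the trade-off explored in the second retrospective experiment.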

Retrospective study

In a first experiment, the performance of the DL model was assessed for different acceleration rates. Fully sampled k-spaces were retrospectively undersampled using three Gaussian sampling patterns with the same number of center lines (CL = 7) and the same variances \({\sigma }_{y}\) and \({\sigma }_{z}\) along the two phase-encode directions (\({\sigma }_{y}\) = 0.10 and \({\sigma }_{z}\) = 0.20), but different acceleration rates R = 3, 4 and 5 (cf. Fig. 2, top row).

Figure 2
figure 2

Model performance for different acceleration rates: (a) threefold, (b) fourfold and (c) fivefold. The red circles point out the missed details with higher acceleration rates in the magnitude images.

In a second experiment, the performance of residual U-net was investigated for the highest acceleration rate (R = 5) while changing the variance of the Gaussian sampling patterns, hence leading to various artifacts in the reconstructed images. Indeed, when a Gaussian sampling pattern is used, the lower \({\sigma }_{y}\) and \({\sigma }_{z}\), the closer the sampling pattern is to a low-pass filter, and the blurrier the resulting zero-filled images. Similar to noise artifacts, blurriness is globally distributed over the image. If \({\sigma }_{y}\) and \({\sigma }_{z}\) are high, high frequencies are better preserved, but at the cost of less-distributed, localized artifacts that we term ‘local’ artifacts. Here, four different sampling patterns were compared with fixed R = 5 and \({\sigma }_{z}\) = 0.2: (a) CL = 23/\({\sigma }_{y}\) = 0, (b) CL = 7/\({\sigma }_{y}\) = 0.10, (c) CL = 7/\({\sigma }_{y}\) = 0.15, and (d) CL = 7/\({\sigma }_{y}\) = 0.20 (cf. Fig. 4, top row).

Prospective study

A set of test data was acquired using a fivefold undersampled k-space acquisition (acquisition time ~ 3 min). An optimal sampling pattern was chosen according to the results of the previous experiment.

Evaluation metrics
The most commonly accepted metrics in the field of image reconstruction were used to evaluate the model performance, namely peak SNR (PSNR), the structural similarity index (SSIM)39, and the normalized root MSE (NRMSE).

  • PSNR represents the ratio between the maximum intensity of a reference image \(f\) and the root mean squared error of the DL reconstructed image \(\widehat{f}\):

    $$PSNR=20{\log}_{10}\frac{max(f)}{\sqrt{\frac{1}{M}\sum_{i=1}^{M}{(f\left(i\right)- \widehat{f}(i))}^{2}}}$$

    where M is the number of pixels in the image.

  • The SSIM index attempts to quantify the perceptual differences between two images. The comparison is based on three features: luminance, contrast and structure. These features are evaluated at different image locations by using a sliding window. The resulting SSIM between two image patches A and B is given by:

    $$SSIM\left(A, B\right)= \frac{(2{\mu }_{A}{\mu }_{B}+{C}_{1})(2{\sigma }_{AB}+{C}_{2})}{({\mu }_{A}^{2}+{\mu }_{B}^{2}+{C}_{1})({\sigma }_{A}^{2}+{\sigma }_{B}^{2}+{C}_{2})}$$

    \(\mu\) and \({\sigma }^{2}\) denote the mean and variance of an image patch, and \({\sigma }_{AB}\) the covariance between the two patches. \({C}_{1}\) and \({C}_{2}\) are two constants that stabilize the division. The SSIM reported in the rest of the document is the mean SSIM calculated as \(\frac{1}{P}\sum_{i=1}^{P}SSIM\), where \(P\) is the total number of patches.

  • NRMSE is the normalized root mean squared error. When normalized by the mean value of the reference image \(f\), NRMSE is given by:

    $$NRMSE= \frac{\sqrt{\frac{1}{M}\sum_{i=1}^{M}{(f\left(i\right)- \widehat{f}(i))}^{2}}}{\mathit{mean}\left(f\right)}$$
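The two pixelwise metrics above translate directly into NumPy (a windowed SSIM implementation such as `skimage.metrics.structural_similarity` would typically be used for the third):

```python
import numpy as np

def psnr(f, f_hat):
    """Peak SNR in dB against the reference f, per the formula above."""
    mse = np.mean((f - f_hat) ** 2)
    return 20 * np.log10(f.max() / np.sqrt(mse))

def nrmse(f, f_hat):
    """Root mean squared error normalized by the reference mean."""
    return np.sqrt(np.mean((f - f_hat) ** 2)) / f.mean()
```

For instance, a reconstruction offset from a constant reference of 2 by 1 everywhere gives PSNR = 20 log10(2) ≈ 6.02 dB and NRMSE = 0.5.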

Additionally, to assess the sharpness of the reconstructed images, 2D kernels for edge detection were employed in the vertical and horizontal directions (\({G}_{x}\) and \({G}_{y}\)). The gradient magnitude \(G\) is given by (Eq. 6); as opposed to the above metrics, \(G\) is a reference-free metric:

$$G=\sqrt{{G}_{x}^{2}+{G}_{y}^{2}}$$

This metric was only considered on magnitude images: in phase-contrast images, the contribution of phase wraps dominates that of image details in the gradient calculation, making the data very challenging to interpret. In general, considering that the phase is encoded between \(-\pi\) and \(\pi\), we believe that only NRMSE and SSIM are appropriate metrics for phase images; they were chosen as such in all further analyses.
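A plain-NumPy sketch of this reference-free sharpness score follows; the exact edge-detection kernels are not specified in the text, so the 3×3 Sobel kernels used here are an assumption (the sign flip between correlation and convolution is irrelevant since only the gradient magnitude enters \(G\)):

```python
import numpy as np

def conv2(img, k):
    """Valid-mode 2D cross-correlation with a 3x3 kernel, plain NumPy."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def sharpness(img):
    """Mean gradient magnitude G = sqrt(Gx^2 + Gy^2), assuming
    Sobel kernels for the horizontal/vertical edge detectors."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = conv2(img, kx)
    gy = conv2(img, ky)
    return np.mean(np.sqrt(gx ** 2 + gy ** 2))
```

A flat image scores 0, and any intensity edge raises the score, which is why the metric tracks the blurriness removed by the DL reconstruction.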

A binary mask defining a Region-of-Interest (ROI) for each image was generated. More specifically, a preliminary mask was first created based on pixel thresholding of the magnitude of the reference image. Then, a closing operation followed by an opening operation40, which are mathematical morphological operations, were performed to extract the ROI and remove isolated noisy pixels. The resulting mask was used to exclude the background contribution (i.e., image noise) when computing all reconstruction metrics. Indeed, U-net generally leads to reconstructed images with a noticeably lower background noise than the reference images, which naturally leads to an underestimation of the quality of the reconstructed images when the entire field-of-view is considered.
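The ROI extraction can be sketched with SciPy's morphological operators; the threshold fraction and the 3×3 structuring element are illustrative choices not given in the text:

```python
import numpy as np
from scipy import ndimage

def roi_mask(magnitude, thresh_frac=0.1):
    """Signal ROI from a reference magnitude image: threshold, then a
    morphological closing (fills small holes) followed by an opening
    (drops isolated noisy pixels). thresh_frac is an assumed parameter."""
    mask = magnitude > thresh_frac * magnitude.max()
    mask = ndimage.binary_closing(mask, structure=np.ones((3, 3)))
    mask = ndimage.binary_opening(mask, structure=np.ones((3, 3)))
    return mask
```

Applied to an image containing a solid signal region plus a single hot background pixel, the opening removes the isolated pixel while the signal region survives.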

Slices of the 3D datasets that did not contain signal (i.e., noise only) were not included in the calculation of our reconstruction metrics.

Finally, to determine differences between groups or methods, non-parametric tests were carried out for each metric (Kruskal–Wallis and Wilcoxon), where a p-value < 0.05 was in general considered statistically significant (when relevant, p-values were adjusted using the Bonferroni correction to account for multiple comparisons).
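With SciPy, the statistical workflow reads as below; the metric values are random placeholders standing in for, e.g., per-slice PSNR at the three acceleration rates:

```python
# Kruskal-Wallis across groups, then pairwise Wilcoxon signed-rank tests
# with Bonferroni correction. Data here are synthetic placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
psnr_r3, psnr_r4, psnr_r5 = (rng.normal(m, 1.0, 12) for m in (30, 28, 26))

# omnibus test across the three acceleration rates
h_stat, p_global = stats.kruskal(psnr_r3, psnr_r4, psnr_r5)

# paired post-hoc comparisons, Bonferroni-adjusted (multiply by #tests)
pairs = [(psnr_r3, psnr_r4), (psnr_r3, psnr_r5), (psnr_r4, psnr_r5)]
p_adj = [min(1.0, stats.wilcoxon(a, b).pvalue * len(pairs))
         for a, b in pairs]
```

With three pairwise tests, the Bonferroni-adjusted significance threshold on raw p-values is 0.05/3 ≈ 0.01667, matching the threshold quoted in the results.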

Results
Retrospective study

The model performance was first evaluated at different acceleration rates (3, 4 and 5) with a Gaussian sampling pattern acting as a low-pass filter (CL = 7, \({ \sigma }_{y}\) = 0.10 and \({\sigma }_{z}\) = 0.20). An example of the corresponding reconstructions is shown in Fig. 2. Residual U-net was able to reduce the blurriness in zero-filled Fourier-transformed magnitude images for the three acceleration rates and provided smooth magnitude reconstructed images with sharper boundaries, as illustrated by the gradient intensity profile shown in Fig. 3. In particular, the gradient metric indicated that residual U-net could better recover image sharpness and preserve structure edges, especially at high acceleration rates (see Table 1). For instance, the sharpness of the magnitude images improved on average over the test set by almost 13% at R = 5. However, higher acceleration rates progressively resulted in lower image quality, where high spatial frequencies, and hence small details, were more likely to be missed in the reconstructed magnitude images (cf. red circles in Fig. 2). Regarding phase images, interestingly, residual U-net was able to preserve not only the image sharpness but also most edges that were smoothed out in zero-filled phase images, especially at high acceleration rates (cf. Fig. 2). As for magnitude images, the model performance was lower at higher acceleration rates. Our observations are supported by the quantitative results calculated over the test data and summarized in Table 1. The statistical analysis of phase and magnitude metrics showed that DL reconstruction is significantly better than FT reconstruction for all acceleration rates (p-values < 0.01667) (cf. Tables S3 and S4 in supplementary material).

Figure 3
figure 3

Example illustrating the model performance in preserving edges. Gradient profiles along the red line in reference (green), FT (red) and DL (blue) magnitude images with a fivefold acceleration rate (CL = 7, \({\sigma }_{y}= 0.10\)).

Table 1 Summary of performance obtained on the magnitude and phase images for different acceleration rates.

The impact of different sampling patterns on the reconstruction performance at the highest acceleration rate R = 5 was also explored, and the results are summarized in Table 2. Figures 4 and 5 show respectively magnitude and phase reconstruction examples corresponding to the four sampling patterns. Sampling pattern (a) led to highly blurred images after FT with local aliasing-type artifacts in both the background and the ROIs. The proposed model enhanced edge sharpness and corrected the artifacts in the background, yet could neither recover the lost higher frequencies nor correct for the local artifacts inside the ROI (cf. red arrows in Fig. 4). Sampling pattern (b) led to blurred Fourier-transformed images with subtle local artifacts; the model consequently reconstructed sharp and nearly artifact-free magnitude images. The non-centered sampling patterns (c) and (d), as expected, led to sharper zero-filled images. However, local artifacts were still visible in the ROIs, depending on the MR images. Regarding phase images, larger \({\sigma }_{y}\) led to better detail recovery. The local artifacts seen in magnitude images are less obvious in phase images.

Table 2 Summary of performance obtained on magnitude and phase images for different sampling patterns.
Figure 4
figure 4

Impact of sampling patterns on the reconstruction performance of the magnitude images. (a) CL = 23 \({\sigma }_{y}\) = 0, (b) CL = 7 \({\sigma }_{y}\) = 0.10, (c) CL = 7 \({\sigma }_{y}\) = 0.15 and (d) CL = 7 \({\sigma }_{y}\) = 0.20. Acceleration rate = 5. Although the model succeeded in fully removing local artifacts in the background, red arrows show examples of local artifacts still present inside the ROIs.

Figure 5
figure 5

Impact of the sampling pattern on the reconstruction performance of the phase images. (a) CL = 23 \({\sigma }_{y}\) = 0, (b) CL = 7 \({\sigma }_{y}\) = 0.10, (c) CL = 7 \({\sigma }_{y}\) = 0.15 and (d) CL = 7 \({\sigma }_{y}\) = 0.20. Acceleration rate = 5.

Apart from sampling pattern (a) (\({\sigma }_{y}=0\)), which translated into the lowest performance, the remaining sampling patterns show overall similar quantitative results. Nevertheless, sampling pattern (b) (\({\sigma }_{y}=0.10\)) demonstrates improved PSNR, SSIM and NRMSE on magnitude images (cf. Tables S5 and S6 in supplementary material) and led to nearly artifact-free images of the 3D LF wrist/hand test data. Therefore, this pattern was used to evaluate the performance of the model on prospectively acquired data.

Prospective study

Figure 6 shows three examples of fivefold undersampled MR data. Behavior similar to retrospective undersampling was observed: edges were sharper, and edges lost in the Fourier-transformed phase images were recovered. Table 3 shows an increase in gradient values of 9.4% for magnitude images.

Figure 6
figure 6

Reconstruction results of 3 different prospectively undersampled MR data with the Gaussian sampling pattern: CL = 7, \({\sigma }_{y}=0.10\). Acceleration rate = 5.

Table 3 Results of prospectively acquired MR images.

Discussion
In this work, we investigated the potential of residual U-net, a data-driven model, to reconstruct 3D LF undersampled MR images in combination with small databases. Data augmentation with basic image transformations was adopted to mitigate the small size of the training dataset. The performance of this approach for three different acceleration rates was first assessed on retrospectively undersampled LF data. Quantitative results showed a statistically significant improvement when U-net reconstruction was used instead of conventional iFT. More specifically, the proposed DL approach was able to recover image sharpness, even at high acceleration rates. We believe that sharpness improved as a consequence of reducing the globally distributed artifact (blurriness in the present case) generated by the sampling pattern, rather than through an explicit learning of anatomical edges or details. The global artifact correction was most likely the result of data augmentation coupled with the nature of the U-net architecture. Indeed, at each iteration the model saw newly generated undersampled real/imaginary data exhibiting the same global artifact, and learned to correct for it without overfitting image details. Moreover, considering its large receptive field, U-net has often been used to remove incoherent, globally distributed artifacts such as noise or streaking artifacts11,12,29,41,42. Thus, the large receptive field of U-net proves once more to be suitable for correcting a globally distributed artifact. Phase images were reconstructed with equal performance to magnitude images, thus opening new perspectives for MR techniques relying on phase-contrast imaging. The edges appearing smoothed in zero-filled FT phase-contrast images were sharper using DL reconstruction. One might think that this is caused by explicit anatomical learning.
However, since this information was already present in the DL model inputs (i.e., the real and imaginary components of the Fourier-transformed image, as illustrated in Fig. S4 in the supplementary material), we hypothesize that it originates from a reduced global blurriness of the real and imaginary components inherited from the k-space sampling pattern. Accordingly, in the context of pathological cases, we would assume that, as long as the pathology information is present in a “blurred” form in the DL model input, it can be restored without artifact. However, if a pathology were to take the shape of very small structural changes lost through the undersampling scheme used, there could indeed be a risk that the model forces the reconstruction toward a healthy wrist appearance. That said, a solution could be to settle on a minimum resolution loss that limits such a risk, so as to recover structures of a minimum size. Then of course, as for any imaging modality and regardless of DL reconstruction pipelines, a general lack of resolution may prevent the recognition of small pathological structures lost to partial volume effects. In the latter case though, DL is not the limiting factor.

One limitation of our DL reconstruction comes from some remaining smoothness in both magnitude and phase images. This could be due to (1) the undersampling patterns inevitably missing high spatial frequencies in k-space, and/or (2) the MSE loss function, which tends to reduce noise over flat intensity regions, translating into noiseless textures in the reconstructed image43. For the employed sampling pattern (CL = 7, \({\sigma }_{y}\) = 0.10), this loss of higher spatial frequencies is further exacerbated by the higher acceleration rate (fivefold), both in the U-net reconstructed magnitude and phase images, as illustrated by the lower metrics shown in Table 1.

In line with this hypothesis, an attempt to maintain the high spatial frequencies in the reconstructed images was conducted by investigating different sampling patterns (\({\sigma }_{y}\) values 0, 0.10, 0.15 and 0.20) for R = 5. As expected, higher \({\sigma }_{y}\) led to better detail preservation in both reconstructed magnitude and phase images. However, U-net was not able to correct for the local artifacts generated by the different sampling patterns, especially inside the ROIs of the magnitude images. These artifacts are highly dependent on the sampling pattern when dealing with a small learning dataset, which in turn means that the proposed approach cannot be easily generalized. Additional concerns about model generalization arise from the homogeneous sequence parameters that we deliberately used so far for the training set. Indeed, as knowledge in this specific field of research is still in its infancy, we believe it is important to limit the degrees of freedom that might influence the model performance. In future work, we will explore different solutions for enhanced flexibility by using sampling patterns less prone to local artifacts, such as variable-density Poisson-disc. And, for clinical relevance, we will study the generalization ability of our model with respect to different sequence parameters.

Additional improvement can be obtained using other types of loss functions. Although we employed the MSE, one of the most commonly used loss functions, future work includes exploring perceptual losses to improve reconstruction performance44,45.

Finally, as opposed to most undersampled DL-MR studies, we prospectively tested our trained model on undersampled wrist MR images using the highest acceleration rate (fivefold) and \({\sigma }_{y}\) = 0.10. The results on magnitude and phase images confirmed that our model performs similarly on simulated and acquired data. This key aspect has two major advantages. First, our method could be directly used in concrete applications to reconstruct full MR complex data without further optimization steps. Second, it could also be used to fine-tune the sampling pattern with confidence on simulated data, without the need to collect extra data.

As future steps, further investigations will be carried out to compare the effectiveness of simulated data against data augmentation, as well as to assess the potential added value of transfer learning approaches46 based on natural or MR datasets compared to the present work. Ultimately, different promising network architectures will be considered in future work, such as model-driven methods5 (unrolled optimization algorithms) and transformers47.

Conclusion
In this work, we demonstrated the potential of residual U-net for accelerating LF 3D MRI acquisitions. Promising results were obtained not only on magnitude but also on phase reconstructed images for three- to fivefold acceleration rates. Although a relatively small (n = 10) dataset was used for training, data augmentation based on geometric image manipulations successfully mitigated data scarcity and allowed complex-valued MR data recovery. The global structure and image sharpness were preserved in the reconstructed magnitude and phase images. Additionally, our results indicated that the model performance is tied to the adopted sampling pattern. Nevertheless, promising results were obtained not only on simulated data but also on acquired fivefold-accelerated MR data. This highlights the substantial potential of emerging DL approaches to significantly speed up LF MR acquisitions and ultimately to bring LF MR closer to clinical applications.