Image reconstruction is essential for imaging applications across the physical and life sciences, including optical and radar systems, magnetic resonance imaging, X-ray computed tomography, positron emission tomography, ultrasound imaging and radio astronomy [1,2,3]. During image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the exact inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain [4,5], the composition of which depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. Here we present a unified framework for image reconstruction, automated transform by manifold approximation (AUTOMAP), which recasts image reconstruction as a data-driven supervised learning task that allows a mapping between the sensor and the image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for various magnetic resonance imaging acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate that manifold learning during training results in sparse representations of domain transforms along low-dimensional data manifolds, and observe superior immunity to noise and a reduction in reconstruction artefacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate that AUTOMAP and other learned reconstruction approaches will accelerate the development of new acquisition strategies across imaging modalities.
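The idea of letting a sensor-to-image mapping emerge from training pairs can be illustrated with a deliberately simplified sketch. It is not the AUTOMAP network: it assumes noiseless Cartesian Fourier encoding and swaps the deep neural network for a single linear map fitted by least squares, and the names `encode`, `W` and the 8 × 8 image size are hypothetical choices made only for illustration. Under those assumptions, the fitted map should reconstruct an image it never saw during training.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # image side length (illustrative assumption)

def encode(images):
    """Cartesian Fourier encoding: unitary 2D FFT of each image, one row per image."""
    return np.fft.fft2(images, norm="ortho").reshape(len(images), -1)

# Training corpus: random images (image domain) and their k-space encodings (sensor domain).
train_images = rng.standard_normal((400, n, n))
X = encode(train_images)                                          # shape (400, 64), complex
Y = train_images.reshape(len(train_images), -1).astype(complex)   # shape (400, 64)

# Learn a linear sensor-to-image map W by least squares: minimise ||X W - Y||^2.
# This linear fit is a stand-in for the deep network, not the published method.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Reconstruct an image that was not part of the training corpus.
test_image = rng.standard_normal((n, n))
recon = (encode(test_image[None]) @ W).real.reshape(n, n)
max_error = float(np.max(np.abs(recon - test_image)))
print(f"max reconstruction error on unseen image: {max_error:.2e}")
```

In this noiseless linear setting the fitted map is simply the inverse discrete Fourier transform recovered from data; a nonlinear network is what would allow the same supervised recipe to absorb noise, undersampling and sensor non-idealities.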
Grangeat, P. Tomography (John Wiley & Sons, 2013)
Gull, S. F. & Daniell, G. J. Image reconstruction from incomplete and noisy data. Nature 272, 686–690 (1978)
Zeng, G. L. Medical Image Reconstruction (Springer, 2010)
Yu, Z., Thibault, J.-B., Bouman, C. A., Sauer, K. D. & Hsieh, J. Fast model-based X-ray CT reconstruction using spatially nonhomogeneous ICD optimization. IEEE Trans. Image Process. 20, 161–175 (2011)
Pruessmann, K. P., Weiger, M., Börnert, P. & Boesiger, P. Advances in sensitivity encoding with arbitrary k-space trajectories. Magn. Reson. Med. 46, 638–651 (2001)
Hinton, G. et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29, 82–97 (2012)
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 1097–1105 (2012)
Gilbert, C. D., Sigman, M. & Crist, R. E. The neural basis of perceptual learning. Neuron 31, 681–697 (2001)
Lu, Z.-L., Hua, T., Huang, C.-B., Zhou, Y. & Dosher, B. A. Visual perceptual learning. Neurobiol. Learn. Mem. 95, 145–151 (2011)
Vincent, P., Larochelle, H., Bengio, Y. & Manzagol, P.-A. Extracting and composing robust features with denoising autoencoders. In Proc. 25th Int. Conf. on ‘Machine Learning’ 1096–1103, http://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf (2008)
Ogawa, T., Kosugi, Y. & Kanada, H. Neural network based solution to inverse problems. In IEEE World Congr. on ‘Computational Intelligence’ Vol. 3, 2471–2476, http://ieeexplore.ieee.org/document/687250/ (1998)
Schiller, H. & Doerffer, R. Neural network for emulation of an inverse model operational derivation of Case II water properties from MERIS data. Int. J. Remote Sens. 20, 1735–1746 (1999)
Hoole, S. R. H. Artificial neural networks in the solution of inverse electromagnetic field problems. IEEE Trans. Magn. 29, 1931–1934 (1993)
Floyd, C. E. An artificial neural network for SPECT image reconstruction. IEEE Trans. Med. Imaging 10, 485–487 (1991)
Pelt, D. M. & Batenburg, K. J. Fast tomographic reconstruction from limited data using artificial neural networks. IEEE Trans. Image Process. 22, 5238–5251 (2013)
Jin, K. H., McCann, M. T., Froustey, E. & Unser, M. Deep convolutional neural network for inverse problems in imaging. IEEE Trans. Image Process. 26, 4509–4522 (2017)
Hammernik, K. et al. Learning a variational network for reconstruction of accelerated MRI data. Magn. Reson. Med. 79, 3055–3071 (2017)
Lustig, M., Donoho, D. & Pauly, J. M. Sparse MRI: the application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 58, 1182–1195 (2007)
Fan, Q. et al. MGH–USC Human Connectome Project datasets with ultra-high b-value diffusion MRI. Neuroimage 124, 1108–1114 (2016)
Deng, J. et al. ImageNet: a large-scale hierarchical image database. In IEEE Conf. on ‘Computer Vision and Pattern Recognition’ 248–255, http://www.image-net.org/papers/imagenet_cvpr09.pdf (2009)
Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989)
Di Carli, M. F. & Lipton, M. J. Cardiac PET and PET/CT Imaging (Springer, 2007)
Yang, Z. & Jacob, M. Mean square optimal NUFFT approximation for efficient non-Cartesian MRI reconstruction. J. Magn. Reson. 242, 126–135 (2014)
Virtue, P. & Lustig, M. On the empirical effect of Gaussian noise in under-sampled MRI reconstruction. Preprint at https://arxiv.org/abs/1610.00410 (2016)
Brown, R. W., Cheng, Y. C. N., Haacke, E. M., Thompson, M. R. & Venkatesan, R. Magnetic Resonance Imaging: Physical Principles and Sequence Design 2nd edn (Wiley, 2014)
Gold, J., Bennett, P. J. & Sekuler, A. B. Signal but not noise changes with perceptual learning. Nature 402, 176–178 (1999)
Wright, J. et al. Sparse representation for computer vision and pattern recognition. Proc. IEEE 98, 1031–1044 (2010)
Maaten, L. V. D. & Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)
Getis, A. in Handbook of Applied Spatial Analysis (eds Fisher, M. M. & Getis, A.) 255–278 (Springer, 2010)
Kubo, T. et al. Radiation dose reduction in chest CT: a review. Am. J. Roentgenol. 190, 335–343 (2008)
Daigle, O., Djazovski, O., Laurin, D., Doyon, R. & Artigau, É. Characterization results of EMCCDs for extreme low-light imaging. In Proc. SPIE on ‘High Energy, Optical, and Infrared Detectors for Astronomy V’ Vol. 8453, 845303, https://doi.org/10.1117/12.926385 (2012)
Girard, J. N. et al. Sparse representations and convex optimization as tools for LOFAR radio interferometric imaging. J. Instrum. 10, C08013 (2015)
Lebed, E., Sarunic, M. V., Beg, M. F. & Mackenzie, P. J. Rapid volumetric OCT image acquisition using compressive sampling. Opt. Exp. 18, 21003–21012 (2010)
Fessler, J. A. & Sutton, B. P. Nonuniform fast Fourier transforms using min-max interpolation. IEEE Trans. Signal Process. 51, 560–574 (2003)
Kim, D. H., Adalsteinsson, E. & Spielman, D. M. Simple analytic variable density spiral design. Magn. Reson. Med. 50, 214–219 (2003)
Uecker, M., Ong, F., Tamir, J. I. & Bahri, D. Berkeley advanced reconstruction toolbox. Proc. Int. Soc. Magnetic Resonance in Medicine 2486 (2015)
Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous distributed systems. Preprint at https://arxiv.org/abs/1603.04467 (2016)
Nair, V. & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Proc. 27th Int. Conf. on ‘Machine Learning’ 807–814 (ACM, 2010)
Makhzani, A. & Frey, B. J. Winner-take-all autoencoders. Adv. Neural Inf. Process. Syst. 28, 2791–2799 (2015)
Reeder, S. B. et al. Practical approaches to the evaluation of signal-to-noise ratio performance with parallel imaging: application with cardiac imaging and a 32-channel cardiac coil. Magn. Reson. Med. 54, 748–754 (2005)
Duyn, J. H., Yang, Y., Frank, J. A. & van der Veen, J. W. Simple correction method for k-space trajectory deviations in MRI. J. Magn. Reson. 132, 150–153 (1998)
Pruessmann, K. P., Weiger, M., Scheidegger, M. B. & Boesiger, P. SENSE: sensitivity encoding for fast MRI. Magn. Reson. Med. 42, 952–962 (1999)
Saad, Y. & Schultz, M. H. GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist. Comput. 7, 856–869 (1986)
Comtat, C. et al. OSEM-3D Reconstruction Strategies for the ECAT HRRT. IEEE Symp. Conf. Record Nuclear Science 6, 3492–3496 (2004)
Izquierdo-Garcia, D. et al. An SPM8-based approach for attenuation correction combining segmentation and nonrigid template formation: application to simultaneous PET/MR brain imaging. J. Nucl. Med. 55, 1825–1830 (2014)
Yu, K. & Zhang, T. Improved local coordinate coding using local tangents. In Proc. 27th Int. Conf. on ‘Machine Learning’ 1215–1222 (ACM, 2010)
Anderes, E. & Coram, M. A general spline representation for nonparametric and semiparametric density estimates using diffeomorphisms. Preprint at https://arxiv.org/abs/1205.5314 (2012)
Zhang, M., Singh, N. & Fletcher, P. T. Bayesian estimation of regularization and atlas building in diffeomorphic image registration. Int. Conf. Inf. Process. Med. Imaging 37–48 (Springer, 2013)
Fishbaugh, J., Prastawa, M., Gerig, G. & Durrleman, S. Geodesic image regression with a sparse parameterization of diffeomorphisms. In 1st Int. Conf. on ‘Geometric Science of Information’ GSI 2013 (eds Nielsen, F. & Barbaresco, F.) Vol. 8085, 95–102, https://link.springer.com/chapter/10.1007/978-3-642-40020-9_9 (2013)
Bernstein, A., Kuleshov, A. & Yanovich, Y. Manifold Learning in Regression Tasks. Statistical Learning and Data Sciences 414–423 (Springer, 2015)
Hornik, K. Approximation capabilities of multilayer feedforward networks. Neural Netw. 4, 251–257 (1991)
Irie, B. & Miyake, S. Capabilities of three-layered perceptrons. In IEEE Int. Conf. on ‘Neural Networks’ Vol. 1, 641–648 (1988)
Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Contr. Signals Syst. 2, 303–314 (1989)
Barron, A. R. Approximation and estimation bounds for artificial neural networks. Mach. Learn. 14, 115–133 (1994)
Mordvintsev, A., Olah, C. & Tyka, M. DeepDream—a code example for visualizing neural networks. https://research.googleblog.com/2015/07/deepdream-code-example-for-visualizing.html (Google Res, 2015)
Addy, N. O., Wu, H. H. & Nishimura, D. G. Simple method for MR gradient system characterization and k-space trajectory estimation. Magn. Reson. Med. 68, 120–129 (2012)
Han, H., Ouriadov, A. V., Fordham, E. & Balcom, B. J. Direct measurement of magnetic field gradient waveforms. Concepts Magn. Reson. 36A, 349–360 (2010)
Goodfellow, I., Pouget-Abadie, J. & Mirza, M. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2672–2680 (2014)
Haskell, M., Cauley, S. F. & Wald, L. L. TArgeted Motion Estimation and Reduction (TAMER): data consistency based motion mitigation for MRI using a reduced model joint optimization. IEEE Trans. Med. Imaging PP, 99, https://doi.org/10.1109/TMI.2018.2791482 (2018)
Fessler, J. A., Lee, S., Olafsson, V. T., Shi, H. R. & Noll, D. C. Toeplitz-based iterative image reconstruction for MRI with correction for magnetic field inhomogeneity. IEEE Trans. Signal Process. 53, 3393–3402 (2005)
Cauley, S. F. et al. Fast reconstruction for multichannel compressed sensing using a hierarchically semiseparable solver. Magn. Reson. Med. 73, 1034–1040 (2015)
Xi, Y., Xia, J., Cauley, S. & Balakrishnan, V. Superfast and stable structured solvers for Toeplitz least squares via randomized sampling. SIAM J. Matrix Anal. Appl. 35, 44–72 (2014)
Xia, J., Chandrasekaran, S., Gu, M. & Li, X. S. Fast algorithms for hierarchically semiseparable matrices. Numer. Linear Algebra Appl. 17, 953–976 (2010)
Weller, D. S., Ramani, S. & Fessler, J. A. Augmented Lagrangian with variable splitting for faster non-Cartesian L1-SPIRiT MR image reconstruction. IEEE Trans. Med. Imaging 33, 351–361 (2014)
Zhao, B., Setsompop, K., Ye, H., Cauley, S. F. & Wald, L. L. Maximum likelihood reconstruction for magnetic resonance fingerprinting. IEEE Trans. Med. Imaging 35, 1812–1823 (2016)
Han, S., Mao, H. & Dally, W. J. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. Preprint at https://arxiv.org/abs/1510.00149 (2015)
Hu, H., Peng, R., Tai, Y.-W. & Tang, C.-K. Network trimming: a data-driven neuron pruning approach towards efficient deep architectures. Preprint at https://arxiv.org/abs/1607.03250 (2016)
We acknowledge M. Michalski and the computational resources and assistance provided by the Massachusetts General Hospital (MGH) and the Brigham and Women’s Hospital (BWH) Center for Clinical Data Science (CCDS). The CCDS is supported by MGH, BWH, the MGH Department of Radiology, the BWH Department of Radiology, and through industry partnership with NVIDIA. We also acknowledge the Center for Machine Learning at Martinos. We also thank J. Stockmann, J. Polimeni, D. E. J. Waddington and R. L. Walsworth for their comments on this manuscript, and B. Bilgic and C. Liao for their assistance in human subject data acquisition. We acknowledge C. Catana for providing raw PET data and for filtered back projection and OSEM reconstructions. We also thank M. Haskell for providing the MRI motion encoding model. B.Z. was supported by the National Institutes of Health/National Institute of Biomedical Imaging and Bioengineering F32 Fellowship (EB022390). Data were provided in part by the HCP, MGH-USC Consortium (Principal Investigators: Bruce R. Rosen, Arthur W. Toga and Van Wedeen; U01MH093765), which was funded by the NIH Blueprint Initiative for Neuroscience Research grant; the National Institutes of Health grant P41EB015896; and the Instrumentation Grants S10RR023043, 1S10RR023401, 1S10RR019307.
The authors declare no competing financial interests.
Extended data figures and tables
Reference brain images were encoded into sensor-domain sampling strategies with high levels of additive white Gaussian noise and reconstructed using both AUTOMAP and conventional approaches: a–e, spiral k-space encoding compared with conjugate-gradient SENSE reconstruction with NUFFT regridding; f–j, Radon projection encoding compared with model-based iterative reconstruction. Image magnitude signal-to-noise ratios (SNRs) and error maps (with root mean squared error calculations) with respect to reference ground truth images are also shown. For each encoding experiment, both error maps are windowed to the same level.
a–c, AUTOMAP was trained using sensor-image pairs of Cartesian Fourier encoded corpora derived from either ImageNet, HCP brain images, or random-valued Gaussian noise without any real-world image structure. Each trained network was then used to reconstruct a noise-corrupted Cartesian k-space brain dataset. The signal-to-noise ratio (SNR) of the reconstructed images is shown. The apparent intensity discontinuity in the region above the eyes is due to the masking process used to de-identify the data in the HCP protocol (see Methods for more details).
Mean squared error (MSE) loss was minimized with stochastic gradient descent using the RMSProp algorithm and plotted here against training epoch count for: a, Cartesian Fourier encoding on the ImageNet corpus; b, spiral Fourier encoding on the ImageNet corpus; and c, Cartesian undersampled Fourier encoding on the HCP brain corpus. The validation error tracks the training error without upward divergence, demonstrating a stable training regime with good bias-variance tradeoff.
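As a rough illustration of this training procedure, the sketch below minimizes an MSE loss with mini-batch RMSProp and records training and validation error after every epoch. It is a toy under stated assumptions: a single linear layer replaces the network, the data are synthetic, and the learning rate, decay factor, batch size and epoch count are illustrative values rather than the settings used for the curves described here. The function name `rmsprop_mse` is hypothetical.

```python
import numpy as np

def rmsprop_mse(X_tr, Y_tr, X_val, Y_val, lr=0.01, decay=0.9, eps=1e-8,
                epochs=200, batch=32, seed=0):
    """Fit Y ~ X @ W by mini-batch RMSProp on MSE; return W and per-epoch errors."""
    rng = np.random.default_rng(seed)
    W = np.zeros((X_tr.shape[1], Y_tr.shape[1]))
    v = np.zeros_like(W)                      # running average of squared gradients
    train_mse, val_mse = [], []
    for _ in range(epochs):
        order = rng.permutation(len(X_tr))    # reshuffle the corpus every epoch
        for start in range(0, len(X_tr), batch):
            idx = order[start:start + batch]
            err = X_tr[idx] @ W - Y_tr[idx]
            grad = 2.0 * X_tr[idx].T @ err / err.size   # gradient of the batch MSE
            v = decay * v + (1.0 - decay) * grad ** 2
            W -= lr * grad / (np.sqrt(v) + eps)         # RMSProp-scaled step
        train_mse.append(float(np.mean((X_tr @ W - Y_tr) ** 2)))
        val_mse.append(float(np.mean((X_val @ W - Y_val) ** 2)))
    return W, train_mse, val_mse

# Synthetic regression problem (assumed for illustration only).
rng = np.random.default_rng(1)
W_true = rng.standard_normal((16, 4))
X_tr, X_val = rng.standard_normal((256, 16)), rng.standard_normal((64, 16))
W, train_mse, val_mse = rmsprop_mse(X_tr, X_tr @ W_true, X_val, X_val @ W_true)
print(f"train MSE: {train_mse[0]:.3f} -> {train_mse[-1]:.5f}")
print(f"val   MSE: {val_mse[0]:.3f} -> {val_mse[-1]:.5f}")
```

Plotting `train_mse` and `val_mse` against epoch gives the kind of curve pair described in the caption; a validation curve that turns upward while the training curve keeps falling would indicate overfitting.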
a, T2-weighted reference image acquired at 3 T with a turbo spin-echo sequence. b, Three-dimensional motion trajectories measured during an Alzheimer’s patient study. c, d, These motion trajectories were used to corrupt the k-space of this reference image, and it was reconstructed without motion compensation using inverse Fourier transform (c) and AUTOMAP (d). Both images show comparable artefact level and structure, demonstrating the stability of AUTOMAP reconstruction in the presence of unanticipated subject motion. A/P refers to anterior and posterior translational motion, L/R refers to left and right translational motion.
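One simple way to picture how a motion trajectory corrupts k-space is the Fourier shift theorem: a translation of the object multiplies its k-space by a linear phase ramp. The sketch below is an assumption-laden toy, not the motion encoding model used for this figure: it handles in-plane translation only, assumes one rigid position per phase-encode line (one row of k-space), and uses the hypothetical function name `corrupt_kspace`.

```python
import numpy as np

def corrupt_kspace(image, shifts):
    """Apply a per-line translation to k-space via the Fourier shift theorem.

    shifts[r] = (dy, dx) in pixels is the assumed object position while
    k-space row r was acquired (translation only, no rotation).
    """
    ny, nx = image.shape
    k = np.fft.fft2(image)
    ky = np.fft.fftfreq(ny)[:, None]          # cycles per pixel along rows
    kx = np.fft.fftfreq(nx)[None, :]          # cycles per pixel along columns
    dy = shifts[:, 0][:, None]
    dx = shifts[:, 1][:, None]
    return k * np.exp(-2j * np.pi * (ky * dy + kx * dx))

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))

# No motion: the inverse FFT returns the original image.
still = np.zeros((16, 2))
recon_still = np.fft.ifft2(corrupt_kspace(image, still)).real

# Same shift for every line: the reconstruction is just a translated image.
rigid = np.tile([1.0, 2.0], (16, 1))
recon_rigid = np.fft.ifft2(corrupt_kspace(image, rigid)).real

# Position changing from line to line: inconsistent k-space, hence artefacts.
drift = rng.normal(scale=0.5, size=(16, 2))
recon_drift = np.abs(np.fft.ifft2(corrupt_kspace(image, drift)))
```

Only the last case produces ghosting or blurring in an uncompensated inverse Fourier reconstruction, because the k-space lines no longer describe a single consistent object position.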
a–d, Human FDG PET sinogram data (a) were reconstructed using filtered back projection (FBP) (b), OP-OSEM (c) and AUTOMAP (d).
About this article
Cite this article
Zhu, B., Liu, J., Cauley, S. et al. Image reconstruction by domain-transform manifold learning. Nature 555, 487–492 (2018). https://doi.org/10.1038/nature25988