Transmission of natural scene images through a multimode fibre

The optical transport of images through a multimode fibre remains an outstanding challenge, with applications ranging from optical communications to neuro-imaging. State-of-the-art approaches either involve measurement and control of the full complex field transmitted through the fibre or, more recently, training of artificial neural networks which, however, are typically limited to images belonging to the same class as the training data set. Here we implement a method that statistically reconstructs the inverse transformation matrix for the fibre. We demonstrate imaging at high frame rates, high resolutions and in full colour of natural scenes, thus demonstrating general-purpose imaging capability. Real-time imaging over long fibre lengths opens alternative routes to exploitation, for example in secure communication systems, novel remote imaging devices, quantum state control and processing, and endoscopy.

structure as opposed to a more random speckle structure indicates that only a few of the higher-order modes have been excited. Conversely, the full speckle pattern, distributed across the full fibre output, is obtained only by exciting many modes. This condition was observed by displacing the beam slightly to either side of the central position. This is the desired configuration, as the objective here is to image with the highest possible resolution: the fine features of any image are carried by the higher spatial frequencies, which in turn correspond to the higher-order modes of the fibre. The effect of changing the input focusing condition can be clearly seen in Fig. 1. When focusing with the smallest effective numerical aperture (NA_eff) at the fibre input facet, the speckles at the output are largest, corresponding to fewer modes, and the final retrieval is significantly worse than that obtained with the largest NA_eff. The intermediate NA_eff gives slightly worse results than the largest NA_eff. A consistent trend is therefore found between the effective NA, the size and number of speckles at the fibre output, and the final image quality.

SUPPLEMENTARY NOTE 2: METHODS FOR IMAGE COMPARISON
Two methods have been considered in order to quantify the quality of our predicted images: the structural similarity index (SSIM) [2] and the Pearson correlation coefficient (PCC). In both cases, a perfect match corresponds to the maximum value of 1. Considering two images X and Y, we use the definitions:

SSIM(X, Y) = \frac{(2\mu_X \mu_Y + C_1)(2\sigma_{XY} + C_2)}{(\mu_X^2 + \mu_Y^2 + C_1)(\sigma_X^2 + \sigma_Y^2 + C_2)},

where \mu_X denotes the average of X, \mu_Y the average of Y, \sigma_{XY} the covariance of X and Y, \sigma_X^2 the variance of X and \sigma_Y^2 the variance of Y. C_1 and C_2 are two parameters defined as C_1 = (K_1 L)^2 and C_2 = (K_2 L)^2, where K_1 was set to 0.01, K_2 to 0.03 and L is the dynamic range of the image pixels. The Pearson correlation coefficient is instead defined as:

PCC(X, Y) = \frac{\sum_i (x_i - \bar{X})(y_i - \bar{Y})}{\sqrt{\sum_i (x_i - \bar{X})^2} \sqrt{\sum_i (y_i - \bar{Y})^2}},

where x_i and y_i denote the pixels with index i of the images X and Y respectively, and \bar{X} (similarly \bar{Y}) denotes the average of X.
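As a concrete illustration, both metrics can be computed with a few lines of NumPy. This is a sketch with our own function names, computing a single global SSIM over the whole image (rather than the windowed average some libraries use), with the default L = 255 as an assumption:

```python
import numpy as np

def ssim(X, Y, K1=0.01, K2=0.03, L=255):
    """Global (single-window) SSIM between two images X and Y."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x, mu_y = X.mean(), Y.mean()
    var_x, var_y = X.var(), Y.var()
    cov_xy = ((X - mu_x) * (Y - mu_y)).mean()
    return ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) / (
        (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))

def pcc(X, Y):
    """Pearson correlation coefficient between two images X and Y."""
    x = X.ravel() - X.mean()
    y = Y.ravel() - Y.mean()
    return (x @ y) / np.sqrt((x @ x) * (y @ y))
```

For identical images both functions return 1, and the PCC of an image and its negative is -1, matching the definitions above.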

SUPPLEMENTARY NOTE 3: DATA
For optimisation and testing of the model parameters, we use 50,000 images from the ImageNet collection [3]. The training and experimental datasets are supplied as additional material [4]. Validation examples are images and videos from the Muybridge collection, such as a running horse, a jumping cat and a flying parrot. As explained in the main text, the behaviour of the image retrieval seems to be largely independent of the actual test images that were chosen. We focused mainly on the Muybridge images in the main text, but here we show other examples in Fig. 2. Figure 2(d) is a satellite image of the Earth imaged through a 10 m fibre at two different times after the initial training and inversion process was completed, showing that the retrieved inversion matrix and setup can still be used at a later time, as would be expected (yet still needed to be verified) for a robust inversion system. In (e) we report the speckle patterns relative to (d) at different times (1 hour, 16 hours, 40 hours and 52 hours) for a single channel of the RGB image. In order to allow a quantitative comparison, we used the SSIM defined in the previous section. We note that the setup was not placed in a specifically engineered or stabilised environment and was thus subject to standard night-day temperature fluctuations (2-3 degrees) and environmental vibrations. Nevertheless, as can be seen in Fig. 2(e), the correlation between the speckle patterns over time remains high, allowing a good reconstruction. We judged it outside the scope of the present work to introduce models able to deal with deep changes in the system transmission matrix, such as in the presence of significant bending or temperature variations. Indeed, in future work it would be interesting to explore the possibilities offered by a physics-inspired artificial neural network also with respect to these challenges. In Fig. 3 we show a collection of images taken from the ImageNet database together with their respective output speckle patterns (at the output of a 1 m long fibre) and final reconstructions. These images provide further evidence for the image variability and the robustness of the imaging reconstruction.

SUPPLEMENTARY NOTE 4: SOFTWARE
The code was developed in Python 3.6.5 with a standard Anaconda (http://www.anaconda.com/download) configuration, including Keras [5] and TensorFlow [6]. The code is supplied as additional supplementary material, which can be downloaded together with all of the training and experimental data/images [4].

A. Model specification
The model is implemented as a single complex-valued, densely connected layer. The individual weights are regularised with an L2-minimising term, weighted by λ = 0.03. Weights are initialised randomly and uniformly between ±0.002. Recorded images are collected into a training set of N = 45,000 and a validation set of 5,000. The ComplexDense layer is a custom layer we developed for Keras. It is a straightforward Dense layer, but with complex-valued weights, represented with 64-bit complex64 data types. Its only task is to implement the complex-valued multiplication.
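The complex multiplication at the heart of the layer can be sketched in plain NumPy. This is an illustration under our own naming and toy sizes, not the released Keras layer; it shows only the forward pass (a single complex matrix multiplication) with weights initialised as described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (illustrative): n_in speckle pixels -> n_out image pixels.
n_in, n_out = 16, 9

# Complex-valued weight matrix, real and imaginary parts initialised
# uniformly in [-0.002, 0.002], stored as complex64 (64-bit complex).
W = (rng.uniform(-0.002, 0.002, (n_in, n_out))
     + 1j * rng.uniform(-0.002, 0.002, (n_in, n_out))).astype(np.complex64)

def complex_dense(x, W):
    """Forward pass of a ComplexDense-like layer: one complex matmul."""
    return x.astype(np.complex64) @ W

x = rng.uniform(0.0, 1.0, (4, n_in))   # batch of speckle amplitudes
y = np.abs(complex_dense(x, W))        # predicted field amplitudes
```

The layer is linear in its input, which is what allows it to represent the (inverse) transmission matrix of the fibre directly in its weights.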

B. Parameter optimisation
The model fitting uses the standard Keras routines, with stochastic gradient descent as the optimiser and the mean square error as the cost function. Here the data variables x_train, y_train, x_validation, y_validation, x_test and y_test refer to the amplitudes of, respectively, the speckle patterns (x) and the original images (y). Once the network parameters have converged (we ran the network for 850 iterations, which takes ca. 2 days on a PC with an Nvidia Titan Xp GPU card), predictions of the outputs can be generated using pred_test = model.predict(x_test)**2.
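The optimisation itself can be illustrated without Keras. The sketch below, with purely illustrative sizes and synthetic data, runs plain gradient descent on the mean-square error for a toy complex linear model, using the Wirtinger gradient of the MSE with respect to the complex weights; it is an analogy for the fitting step, not the released training code:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_out, n_train = 8, 4, 256

# Synthetic "transmission" matrix and training data (illustrative only).
W_true = rng.normal(size=(n_in, n_out)) + 1j * rng.normal(size=(n_in, n_out))
x_train = rng.normal(size=(n_train, n_in)) + 1j * rng.normal(size=(n_train, n_in))
y_train = x_train @ W_true

# Gradient descent on the mean-square error |xW - y|^2.
W = np.zeros((n_in, n_out), dtype=complex)
lr = 0.05 / n_train
for _ in range(500):
    err = x_train @ W - y_train
    W -= lr * (x_train.conj().T @ err)   # Wirtinger gradient of the MSE

mse = np.mean(np.abs(x_train @ W - y_train) ** 2)
```

Because the model is linear and the loss quadratic, the iteration converges to the true matrix; in the real experiment the same principle applies, with minibatched SGD over the recorded speckle/image pairs.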