Efficient dispersion modeling in optical multimode fiber

Dispersion remains an enduring challenge for the characterization of wavelength-dependent transmission through optical multimode fiber (MMF). Beyond a small spectral correlation width, a change in wavelength elicits a seemingly independent distribution of the transmitted field. Here we report on a parametric dispersion model that describes mode mixing in MMF as an exponential map and extends the concept of principal modes to describe the fiber’s spectrally resolved transmission matrix (TM). We present computational methods to fit the model to measurements at only a few, judiciously selected, discrete wavelengths. We validate the model in various MMF and demonstrate an accurate estimation of the full TM across a broad spectral bandwidth, approaching the bandwidth of the best-performing principal modes, and exceeding the original spectral correlation width by more than two orders of magnitude. The model allows us to conveniently study the spectral behavior of principal modes, and obviates the need for dense spectral measurements, enabling highly efficient reconstruction of the multispectral TM of MMF.


Introduction
This supplementary information contains 8 sections. Section 1 presents a mathematical justification of how mode cross-talk results in higher-order dispersion terms in MMF. Section 2 shows the experimental setup of measuring transmission through MMF. Section 3 elaborates on the numerical correction necessary for multi-wavelength off-axis holography. Section 4 shows the DOFs of the MMF based on singular value decomposition and the original spectral correlation. Section 5 describes and illustrates the processing details of modeling dispersion. Section 6 explains the spectral sampling criterion of modeling dispersion. Section 7 shows the generalization of our parametric dispersion model to various MMF and the corresponding performance. Section 8 shows the applicability of our model to different MMF calibration systems using a public data set.

Theory
To qualitatively study the dispersion effect in a randomly bent MMF with distributed mode mixing, we modeled the fiber as concatenation of N ideal short segments with individual TMs, M n , where each of them has wavelengthindependent eigenvectors, i.e., propagation-invariant modes, and accordingly linear dispersion. The overall TM of the MMF is then the product of matrix series where the arrow indicates the order of the non-commutative matrix products. Each M n has a constant linear dispersion and time delay operator at varying frequency such that we can express the TM of each segment as where we use series development of the matrix exponential. Since M n has wavelength-independent eigenvectors, m n and M n (ω 0 ) share the same set of eigenvectors. We can now express the time delay operator of the overall TM ∂M(ω) ∂ω We next observe that the first of the serial product terms in Eq. S4 can be expressed using the series development of Eq. S3 as which is a polynomial of ∆ω with coefficient matrices g k,θ . Similarly, Plugging Eqs. S5 and S6 in to Eq. S4, we have As a result, the non-commutativity between m n , which essentially models the mode cross-talk, leads to the spectral variance in m(ω) and hence higher order dispersion upon a ∆ω spectral shift. When the m n commute with each other, the terms involving commutators in the right-hand side (RHS) in Eq. S7 disappear, leaving the sum over the m n , and the PMs of the MMF remain wavelength-independent. However, we did not find a compact expression for the solution to Eq. S7.

Experimental setup
We developed an automated msTM measurement system illustrated in Fig. S1.

Correction to spatial channel misalignment
In the off-axis holographic imager setup, the modulation frequency of fringe images is dependent on the effective in-plane momentum component and thus on the operation wavelength [1]. Since we recorded TMs at fixed pixels in the Fourier domain, the spatial channels were misaligned at different wavelengths. To measure a msTM through identical channels, we need to gauge the varying modulation frequency and co-register the pixels. As shown in Fig. S2(a), we obtained multiple modulated MMF images corresponding to different input realizations at each wavelength, averaged the intensity of the images in the Fourier domain, and isolated the corresponding interference lobes. We repeated the process at decreasing wavelength and tracked the center of the lobes across the laser spectrum. As shown in Fig. S2(b), we then fitted the center trajectory with linear regression, which informs on the required pixel shift in the Fourier domain for a given wavelength. For each TM measured at a frequency ω, we imposed corresponding spatial channel registration, and the center position at varying frequency after the correction is shown in Fig. S2(c). While only correction to H polarized channels is shown, we conducted individual corrections to both polarization channels.

Characterization of multispectral transmission matrix
The number of guided modes (or DOFs) Q within the MMF can be quantified by performing singular value decomposition (SVD) on each measured TM, counting the singular values (SVs) above a threshold defined as 5% of the largest SV. As shown in Fig. S3(a), for the SI-MMF with 50µm core size and NA 0.22 operated at around 1550 nm, there are Q∼200 DOFs, or populated modes, in each monochromatic TM of ω within the laser tunable range, consistent with the theoretical values from a typical SI-MMF model, and the number of DOFs gradually increases with ω.
We then calculated the spectral correlation of TMs with one at a reference frequency, C(M(ω), M(ω 0 )), where ω 0 = 191 THz. As plotted in the solid curves in Fig. S3(b), the original TM spectral correlation without processing manifests a fast decay with a half width at half maximum (HWHM) of ∼15.21 GHz (thus the FWHM is δν = 30.43 GHz = ∼0.26 nm) due to modal dispersion, consistent with previously reported results with SI-MMFs in weakly scattering regime [2]. As benchmark, we found a 98.7 ± 0.14% correlation between two repeatedly measured TMs at an identical optical frequency. The low resemblance between different MMF outputs upon even just sub-nanometer spectral perturbation hence led to the plausible conclusion of independent transmission through MMFs at a spectral shift beyond a coherence length.

Data processing framework
After acquiring msTMs, correcting the spatial channel misalignment, and projecting all TMs into the subspace spanned by the singular vectors of the TM at ω 0 , we loaded the data into a sequential computational framework to reconstruct the dispersion model. As illustrated in Fig. S4(a), the processing begins by estimating linear dispersion,D 1 , with analytical operations from a msTM across a small spectrum, with aligned phase offsets.D 1 is then projected into the subspace of unitary matrices to obtain D 1 . D 1 can then be used for either the initialization of higher-order dispersion optimization, or fast construction of an improved linear dispersion model by combining it with a second msTM covering a wider spectrum. Figure S4(b) shows the pipeline of deriving higher-order dispersion given an estimated D 1 . We first computed for X 1i from D 1 as an initial guess of the overall dispersion. To refine X 1i and find high order dispersion X k (k > 1) up to K order, we included a msTM across a large spectrum in the fitting set, and minimized the corresponding complementary correlation Here, N ω is the number of total pairs of matrices (M n , M 0 ), ∆ω n is the frequency shift between (M n , M 0 ), tr is the matrix trace, and K is the truncation index of the X k series. The challenge is to obtain an analytical gradient of f (D K ) with respect to the individual orders X k of the exponential map X = K k=0 X k ∆ω k , for efficient steepest gradient descent optimization under the unitary constraint of D K (i.e., skew-Hermitian constraint on all X k ). The gradient with respect to D K for the complementary correlation of a single pair n is Following Lezcano et al. [3], we can use this gradient to express the derivate with respect to the exponential map With respect to a single order of the exponential polynomial, the derivative becomes Summing over all matrix pairs to obtain the full gradient then gives The differential of the exponential of matrices (d exp) X can be approximated to machine-precision by computing the exponential of a matrix with twice the width and height of X [4]. For the gradient descent optimization, we have the iterative update rule where the superscript i is the iteration index and α is the step size. We developed an adaptive update step size for iterations based on backtracking line search with a shrinking/expanding rate of 0.9 [5]. It typically took few hundreds of iterations to reach convergence, and the computation time, depending on the number of matrices and their dimensions (or the DOFs of MMF), ranged from several seconds to tens of minutes. Figure S4(b) shows examples of the spectral perturbation matrix and the first two orders of X k series. Figure S4(c) shows the pipeline of fast construction of linear dispersion model. We computed the eigenvectors ofD 1 (δω large ) from a msTM with larger step size, and used the eigenvectors to diagonalize and collected the main diagonal of theD 1 (δω small ) from a msTM with smaller step size. The eigenvectors and the main diagonal elements allow us to construct a more accurate linear dispersion model D 1 within few seconds. We computed X 1i from estimated D 1 and initialized the optimization process for finding higher-order dispersion, X k , with a msTM across large spectrum. With fitted X k series, we can predict the TM at an arbitrary optical frequency. (c) By using two msTMs of different step sizes, we can construct linear dispersion model with high accuracy and time efficiency.

Phase wrapping issue
In experiments, we measured TMs at discrete optical frequencies.
To construct a dispersion model in Eq. 3 over continuous bandwidth, we need D(ω, ω 0 ) with correct eigenvalues without 2π phase wrapping ambiguity. With a measured msTM of small enough step size δω such that the maximal phase difference of eigenvalues ofD 1 (δω) does not exceed 2π, we can derive the correct changing rate of individual eigenvalues (equivalent to the group delays of PMs). In reverse, the temporal spread due to modal dispersion of signals transmitted through MMF imposes the required spectral sampling to correctly resolve all delays.
To show the phase wrapping issue, we constructed two linear dispersion models of the same 1m-long 50-µmcore SI-MMF geometry with two sets of msTM measurements, respectively. The first set has two msTMs of (ω s , δω, Ω) = (190.9612, 0.0122, 0.1216) and (192.3077, 0.3033, 3.033) THz, and the second set has two msTMs of (ω s , δω, Ω) = (190.4762, 0.0607, 0.6066) and (192.3077, 0.3033, 3.033) THz. Note that the two sets of measurements only have difference in the spectral sampling rate of the first msTM, where the sampling rate is respectively above or below the Nyquist criterion, i.e., the minimum spectral sampling rate that avoids aliasing of the relative phase delays inD 1 eigenvalues. Fig. S5(a) shows the eigenvalues ofD 1 (δω) of the first msTM of each set in the complex plane: When above Nyquist sampling rate (top), the angular distribution of the meaningful eigenvalues does not cover the entire plane, and we can retrieve the relative phase offsets of the eigenvalues without ambiguity; However, when below the Nyquist sampling rate (bottom), the eigenvalues spread all over the plane, and we cannot determine the phase offsets due to the 2π ambiguity. We then tested the two models with separately measured TMs at uncalibrated frequencies, which is associated with fractional matrix power ofD 1 (δω) by linearly scaling the eigenvalue phases by a real value. Fig. S5(b) shows the computational spectral correlation after dispersion compensation to the first order. When below the Nyquist sampling rate (bottom), the TM prediction at a frequency off the grid of δω is inaccurate due to the incorrect phases of eigenvalues. Mathematically, if we develop the fractional matrix power in closed form by diagonalization, since we can add any integer of 2π to the phases of complex eigenvalues without changing the values, finding numerical roots based on commonly used De Moivre's Theorem may lead to a false solution when the Nyquist condition is not satisfied. Generally speaking, the Nyquist rate increases with MMF length and NA, and the required narrower spectral resolution also has implications for the coherence length of the laser source. For high order dispersion optimization, we need spectral sampling over a large spectrum to measure dispersion nonlinearity. However, similar to the linear dispersion, an insufficient sampling rate may also result in incorrect X k , k ≥ 2 convergence. To show this, in a separate experiment with the same 1m-long 50-µmcore SI-MMF, we first computed the linear dispersion D 1 centered at ω 0 = 191 THz by using two msTMs of (ω s , δω, Ω) = (190.955, 0.015, 0.105) and (190.6, 0.08, 0.8) THz. We then computed the eigenvalues of the linear dispersion D 1c 1 (δω) of linear-dispersion-compensated M 1c (ω) = D −1 1 (∆ω = ω − ω 0 )M(ω) at two different spectral sampling (ω s , δω, Ω) = (189.134., 0.933, 3.732) and (187.28, 1.86, 7.44) THz, respectively. As shown in Fig. S6(a), D 1c 1 (δω = 0.933) has eigenvalues spread only half of the complex plane, and D 1c 1 (δω = 1.86) almost covers the entire plane. This informs that δω = 0.933 THz is a sufficient sampling rate for second order dispersion. Resuming back to regular higher order dispersion compensation, we then used the two msTMs with δω = 0.933 or 1.86 THz for optimizing second order dispersion of M(ω) and tested the dispersion models. As plotted in Fig. S6(b), the spectral correlation with dispersion compensation to the second order is smooth over ∼72 nm when using the msTM at δω = 0.933 THz. However, the spectral correlation is oscillating when using the msTM at δω = 1.86 Thz due to incorrect X 2 .

Generalization of dispersion model to various MMF
We verified the generalizability of our parametric dispersion model by testing it on different SI-and GI-MMF, as specified in Table S1. For each MMF, we constructed the corresponding dispersion model by using msTMs centered at 191 THz (1570 nm). We truncated the model to the k th order dispersion if no improvement was observed to the k + 1 th order. We then evaluated the dispersion model of order K with separately measured msTMs, and calculated the spectral correlation before and after dispersion compensation. The best attainable FWHM of computational spectral correlation bandwidth, δν e , is recorded in Table S1. Therefore, we concluded that the model is valid for the various MMF regardless of refractive index profile, geometry, and length. Increased coupling, due to additional fiber perturbation or longer fiber length limits the spectral width of the model, although insufficiently dense spectral sampling for longer fibers due to experimental limitations may contribute to the observed performance.
With the corresponding constructed dispersion model of different MMF, we computed and studied the characteristics of their spectral-variant PMs. Here we show two representative results of the 1m-long few-mode fiber (FMF, indexed as "a" in Table S1) and the 1m-long GI-MMF (indexed as "h"). For the FMF, as shown in Fig.  S7(a), we achieved a correlation above 0.8 over the full 115 nm spectrum after the dispersion compensation to K = 3, whereas the original correlation has multiple lobes and a coherence length of δν = ∼0.06 THz. We identified 6 PMs in the FMF, with the corresponding permanence and mode profile shown in Fig. S7(b) Fig. S8(a), the correlation with estimated linear dispersion (orange curve) has a pyramid shape, different to that in SI-MMFs. This is observed in other GI-MMFs and may be due to the parabolic refractive index profile and propagation properties. Nevertheless, we achieved 110 nm correlation bandwidth after the dispersion compensation to the third order. The permanence of each spectrally-dependent PM is shown in Fig. S8(b), where the average permanence is much reduced compared to the 1m-long 200-mode SI-MMF (indexed as "d") due to strong mode mixing and degeneracy within each mode group. We visualized the PM profiles in Fig. S8(c), which are scrambled speckle patterns instead of theoretical Hermite-Gaussian modes. The group delays are plotted in Fig. S8(d) and span across ∼5 ps. The results are not only consistent with theory and literature [6,7], but the computational methods here avoid extensive experimental measurements. We also describe here the computational spectral correlation in other MMF after dispersion compensation: The tightly coiled MMF (indexed as "e") has 7 loops with 2.5 cm radius of curvature and stronger modal scrambling compared to the MMF (indexed as "d") in the article. The spectral correlation after the first order dispersion compensation has a reduced bandwidth ∼2.5 THz (∼20 nm) compared to the same fiber in a loosely coiled condition. Nevertheless, the second order correction can further improve the correlation bandwidth to ∼58 nm. This indicates that the strong modal scrambling induces higher order dispersion; The 2-m-long MMF (indexed as "f"), compared to the identical MMF with 1 m length, has a shorter original δν = ∼20 GHz (∼0.16 nm) and a reduced bandwidth of ∼18 nm after the first order dispersion compensation. This implies that the longer fiber length may result in more mode mixing. After correction to the second order, we can compensate additional dispersion and increase the correlation bandwidth to ∼40 nm; In the 3m-long OM1 GI-MMF (indexed as "i"), the modes within each group have similar propagation constants and strong mode mixing, and the MMF has an original coherence length of δν = ∼20 GHz (∼0.16 nm). The mode mixing effect creates higher-order dispersion and compromises the reconstruction of dispersion model, where the longer fiber length exacerbates the effect. Nevertheless, with dispersion correction to the second order, we can improve the bandwidth to δν e = 8 nm (∼1.5 THz); Finally, the 10-m-long OM4 MMF (indexed as "j") has a strong modal scrambling effect within individual mode groups. While the dispersion compensation to the first order slightly improves the correlation, the FWHM bandwidth remains the same. For the longer MMF "g," "i," and "j," where the first order model provides improvement to spectral correlation over a much shorter spectral range, the individual PMs computed at the center frequency similarly manifest fast correlation collapse over a small spectral range. This is in contradiction with previous reports of significant bandwidths in MMF in fibers as long as 100 m [7]. Also, the fact that the model fits well to shorter GI-MMF, where the mixing within individual mode groups is already in the fully coupled regime, suggests that our model, in principle, is able to describe dispersion in this regime. The observation that even the PMs only exhibit poor bandwidths in these longer fibers suggest that our measurement and phase stabilization strategy may be inadequate for the characterization of longer fibres with a larger pathlength offset.

Applicability of dispersion model to different measurement system
To test the generalization of our model to measurements from different hardware setups, we used a public msTM dataset of (ω s , δω, Ω) = (191.5, 0.0102, 5.22) THz (N ω = 512) acquired from loosely coiled OM1 MMF (gradedindex, 62.5 µm core, 0.275 NA, 2 m fiber length, 420 modes) [8], and the TMs are represented in mode basis. We Original Amp. Figure S7: Computational spectral correlation in the 1m-long few-mode MMF ("a" in Table S1). (a) The spectral correlation before and after dispersion compensation to different orders: The orange curve is with initial linear dispersion estimation; The yellow curve is with first order dispersion optimized; The purple curve is with correction to the second order. The gray curves are the correlation of conventional PMs evaluated at the center frequency  Figure S9 shows the spectral correlation before and after dispersion compensation to different orders. The model to the third order captures dispersion across the total 41 nm spectrum with excellent accuracy. As a result, the architecture is robust and can be applied to different MMF calibration system with different TM representations. Original Amp. Figure S8: Computational spectral correlation in the 1m-long GI-MMF ("h" in Table S1). (a) The spectral correlation before and after dispersion compensation to different orders: The orange curve is with initial linear dispersion estimation; The yellow curve is with first order dispersion optimized; The purple curve is with correction to the second order; The green curve is with correction to the third order. The gray curves are the correlation of conventional PMs evaluated at the center frequency ω 0 . (b) The permanence of the 300 spectral-variant PMs.  191. 5 192 192.5 193 193.5 194 194.5 195 195.5 196 196 Figure S9: Computational spectral correlation in 2m-long GI-MMF using the public dataset.