Long tailed trions in monolayer MoS2: Temperature dependent asymmetry and resulting red-shift of trion photoluminescence spectra

Monolayer molybdenum disulfide (MoS2) has emerged as a model system for studying many-body physics because the low dimensionality reduces screening leading to tightly bound states stable at room temperature. Further, the many-body states possess a pseudo-spin degree of freedom that corresponds with the two direct-gap valleys of the band structure, which can be optically manipulated. Here we focus on one bound state, the negatively charged trion. Unlike excitons, trions can radiatively decay with non-zero momentum by kicking out an electron, resulting in an asymmetric trion photoluminescence (PL) peak with a long low-energy tail and peak position that differs from the zero momentum trion energy. The asymmetry of the trion PL peak and resulting peak red-shift depends both on the trion size and a temperature-dependent contribution. Ignoring the trion asymmetry will result in over estimating the trion binding energy by nearly 20 meV at room temperature. We analyze the temperature-dependent PL to reveal the effective trion size, consistent with the literature, and the temperature dependence of the band gap and spin-orbit splitting of the valence band. This is the first time the temperature-dependence of the trion PL has been analyzed with such detail in any system.

M (p) ∝ d 2 ρ ψ tr (ρ ρ ρ 1 = 0, ρ ρ ρ 2 = ρ ρ ρ) exp −i p · ρ ρ ρ h m X m tr . (S1) where ψ tr is the trion wave function, and ρ ρ ρ 1 and ρ ρ ρ 2 are the locations of the electrons relative to the hole. The matrix element is computed with one of the electrons having a relative coordinate of zero, which makes intuitive sense as one of the electrons should be recombining with the hole at the origin. Hence, the optical matrix element is just the Fourier transform of the trion wave function's second electron position with the first electron position set to zero. If we make the substitution into equation (S1) that the trion wave function is a Gaussian packet with standard deviation a, ψ 1P tr (0, ρ ρ ρ) ∝ exp −ρ 2 4a 2 , then M 1P (p) ∝ exp − m X m tr ā h 2 p 2 and we can interpret a as the effective trion size as discussed in the text.
We have emphasized in the previous paragraph that the Gaussian wave packet corresponds with a one parameter trion wave function by using the superscript 1P, the single parameter being the effective trion size, a. However, the absorption spectroscopy work 4 and theoretical calculations 5 both used more realistic two parameter trion wave functions as follows. It is expected, based on observations in GaAs quantum wells, that the two electrons will form a singlet state, in which case ψ tr must be symmetric when swapping the positions of the two electrons. A standard way to incorporate this symmetry is to form the symmetric product of two single particle wave functions. In the absorption spectroscopy work 4 and theoretical calculations 5 the Figure S1. Waterfall plot of data before removing the trapped exciton peak, marked with D, and an exponential tail from lower energy defect states.  Table S1. Table of two parameter, b and c, wave function sizes along with adapted single parameter wave function size, a.
single particle wave function is chosen to be the zero angular momentum 2D hydrogen atom wave function ψ X (ρ ρ ρ; a) = 2 where the X denotes exciton since this is expected to be the lowest energy exciton wave function, and a is a free parameter for the size of the orbital. The trion singlet state wave function formed from this single particle wave function is where the superscript 2P denotes that this is a two parameter wave function with parameters b and c that describe the size of the two electron orbits about the hole. To compare results from Berkelbach et. al. 5 and Zhang et. al. 4 with ours we have adapted their values for b and c by finding the value of a that minimizes the sum square error (SSE) between the optical matrix elements of their two parameter wave function and our single parameter wave function. The SSE is given by where M 1P is the matrix element given by our one parameter, Gaussian wave function, and M 2P is the matrix element given by the more complicated two parameter, symmetrized hydrogen atom wave function. The values of a that minimized the SSE for different values of b and c found in the literature are shown in Table S1.

S4 Absorption Spectroscopy Trion Peak Shift
As discussed in the main text, the trion peak is red-shifted relative to zero momentum trion energy because of the convolution of the phenomenological broadening with the asymmetric intrinsic spectrum. The Feynman diagram for trion decay can be run in the reverse time direction to create the diagram for optical absorption via trion formation, so the absorption spectra will also have a low energy tail just like equation (1a). However, because the absorption will be proportional to the occupation number of electrons (which are being captured by the photo excited excitons to create trions) equation (1b) will be modified slightly to which makes the temperature dependent first term roughly 3 times larger than in the PL case. This means that the absorption tail length at temperature T is approximately equal to the PL tail length at temperature T /3. The peak red-shift is determined by the trion tail length and lifetime, so if we neglect thermal changes to the trion lifetime, then Fig. 4b & c can be used as a mapping from tail length to peak red-shift. Under this approximation, the red-shift of the absorption peak at temperature T is simply the red-shift of the PL peak at temperature T /3. Since the red-shift monotonically increases with temperature, the peak shift observed in absorption spectroscopy will always be smaller than the shift observed in PL spectroscopy. For example, at room temperature the PL peak is red-shifted by ∼15 meV, but the absorption peak is expected to red-shifted by only ∼10 meV (the red-shift in Fig. 4b at 100 K). In short, the red-shift of the absorption trion peak will be less than the PL peak.

S5 Fitting Method and Statistical Analysis
We fit our measured spectra to two Lorentzian peaks for the A and B excitons, and a long-tailed peak for the trion using equation (1a) convolved with a Lorentzian to account for the finite trion lifetime. Additionally, we accounted for inhomogeneous broadening 6-8 by convolving the fit function with a Gaussian. This broadening is relatively small at the lower temperatures where it is less than half the FWHM of any of the peaks, but is increasingly important at the higher temperatures where it becomes the same size as the Trion FWHM, the smallest FWHM of any of the peaks. In total, these peaks amount to 11 fit 3/6 Figure S2. a) χ 2 ν cumulative distribution for PL measured at 123 K. The dashed red line denotes the 68th percentile, which indicates the threshold χ 2 ν value that sets the 1σ confidence interval for the joint probability distribution of all variables. b) Distribution of trion tail length, ε, for the PL measured at 123 K. The cyan vertical line denotes the best-fit value to the data, and the green vertical lines denote the upper and lower cut offs which contain 68% of the distribution, establishing the 1σ confidence interval for the single variable probability distribution for ε.
parameters: amplitude, position and width of all three peaks, the trion tail length, and the inhomogeneous broadening parameter. This is only one additional parameter, the trion tail length, relative to the number of parameters that are typically used, and all the parameters are well motivated by the physics. A small trapped exciton peak 9, 10 was observed in the spectra below 198 K, which we were easily able to subtract from the spectra due to the large energy difference between the trapped state and the trion (see Supplementary Information S1).
The addition of the trion tail length to the set of typical fit parameters does complicate the fit procedure because the tail length and peak widths can compensate for each other creating unphysical, local χ 2 minima. We use a multi-step fit procedure to steer the fitting parameters away from these spurious, local minima. First, we take advantage of the separation between the trapped exciton peak and trion peak to fit the trapped exciton peak and background parameters independently 11 . These parameters are then held constant while performing an initial fit to the trion and exciton peaks. For some spectra, the peak widths and inhomogeneous broadening extracted at this stage are unphysically small, in which case we replace them with realistic values which are held constant during a refitting of the remaining parameters. Then a final fit is performed in which all parameters are free to minimize χ 2 , yielding the best-fit overall parameters. This procedure was successful at yielding physical fit parameters for all but one spectrum, measured at 148 K, so we have removed this spectrum from all further analysis. Finally, we quantify the confidence intervals of our fit parameters by bootstrapping our data to perform a Monte Carlo simulation of our experiment 12 . Bootstrapping also estimates the distribution of each fit parameter for each spectrum. In all cases, the distribution contained a single peak clustered around the best-fit value, indicating the robustness of our approach (see below). Fit errors were determined by bootstrapping the data to create 100 new data sets for each measured spectra. Each of the 100 new data sets was then fit to create a distribution of fit parameters with different χ 2 values. Confidence intervals were calculated by ranking fit results by their χ 2 values and increasing the cutoff χ 2 value until 68% of the fits or more fell between the upper and lower bounds set by the cutoff. Note that this was necessary as the Jacobian fit matrix was very flat at the bottom of the χ 2 potential, so simple first order propagation of error yielded erroneously large fit errors. Fig. S2a shows the cumulative distribution of reduced chi squared, χ 2 ν , for 100 fits to the bootstrapped data from the PL measurements made at 123 K. The distribution appears like an error function suggesting that 100 samples is a good representative sample. Fig. S2b shows the distribution of trion tail lengths. The best-fit to the original data is shown in cyan, which the distribution is centered on. That there is no other value of the tail length around which the distribution is clustered is indicative of a robust fit. The green bars indicate the low and high cutoffs that surround 68% of the distribution and set the 1σ confidence interval for the trion tail length for the data measured at 123 K. All parameters for each of the PL measurements similarly show distributions clustered around a single value of the fit parameter, indicating a robust fit.

4/6 S6 Peak Red-Shift Temperature Dependence
The PL spectrum of a trion is the convolution of a Lorentzian with equation 1a in the main text where in the second line we temporarily defined X as E 0 tr − E /ε and γ as Γ/ε, and in the third line we have Taylor expanded the non-exponential part of the integrand about zero, a standard method for approximating integrals with exponential factors 13 . The peak position is where the derivative with respect to energy is zero, which gives the following implicit equation for the peak position as a function of temperature The temperature dependence is hidden in this equation in E 0 tr , ε, and Γ. However, we are not interested in the complete shift, as much of the shift will be due to band gap and binding energy renormalization, but the shift relative to E 0 tr . For convenience we will denote the shift we are interested in as ∆E = E 0 tr − E. Keeping only the terms we have explicitly calculated this gives a third order polynomial equation for ∆E, which we can Taylor expand in temperature about some temperature T 0 .
where primes denote derivatives with respect to temperature. We could attempt to gain new information from this formula by using it to compare our peak fit parameters to our linear fit to peak shift versus temperature. However, the degree of approximation in the calculation is large and any new insights gained would be very speculative. Further, this formula is too complicated to gain physical intuition from, so we find it doesn't assist in understanding the data.

S7 Tail Length Low Temperature Expansion
The first three terms of the low temperature expansion of the tail length, equation (1b) Values of T c for the values of a used in Fig. 4c of the main text are shown in Table S2. Since T c is lower when a is larger, higher order terms need to be included when a is larger, which explains the increasingly non-linear behavior observed for larger values of a. T c is on the order of a few hundred Kelvin for most values of a in Fig. 4c, which necessitates the use of higher order terms to describe the tail length curves at even the lowest temperature, < 83 K, in Fig. 4c Table S2. Convergence temperature scale for each value of a used in Fig. 4c of the main text.