Introduction

Angle-resolved photoemission spectroscopy (ARPES) and its variants have developed in recent years to be established among the pre-eminent experimental methods in solid-state physics. With an intimate connection to the one-electron removal spectral function, ARPES is unique among the suite of techniques available to condensed matter physicists in its direct correspondence to the electronic structure of crystalline materials, providing access to the one electron removal spectral function within its native momentum space.1,2,3

In the framework of Fermi’s Golden Rule, the photoemission intensity is described as:

$$I(k,\omega ) \propto \mathop {\sum}\limits_{i,f} {A_{f,i}} (k,\omega )\left| {\left\langle {\psi _f} \right|{\hat{\mathbf \Delta }}\left| {\psi _i} \right\rangle } \right|^2,$$
(1)

where \(\left| {\left\langle {\psi _f} \right|{\hat{\mathbf \Delta }}\left| {\psi _i} \right\rangle } \right|^2\) is the photoemission matrix element, and Af,i is the one-electron removal spectral function, given by:

$$A_{f,\,i}(k,\omega ) = \left\langle {{\mathrm{\Psi }}_f^{N - 1}} \right|c_k\left| {{\mathrm{\Psi }}_i^N} \right\rangle \delta (\omega - E_f^{N - 1} + E_i^N),$$
(2)

which reflects the overlap between the initial N-particle many-body wavefunction upon removal of an electron and the ensemble of (N − 1)-particle final state wavefunctions, while preserving energy conservation. Written as the imaginary part of the retarded Green’s function, the spectral function becomes:

$$A(k,\omega ) = \frac{1}{\pi }\frac{{ - \Sigma ^{\prime\prime} (k,\omega )}}{{(\epsilon _k^0 - \omega - \Sigma {^\prime}(k,\omega ))^2 + \Sigma ^{\prime\prime} (k,\omega )^2}}.$$
(3)

The spectral function is seen to carry details of both the underlying bare dispersion associated with the electronic structure of the material \(\epsilon _k^0\), as well as correlations via the self energy \(\Sigma (k,\omega ) = \Sigma {^\prime}(k,\omega ) + i\Sigma ^{\prime\prime} (k,\omega )\). In the opposing limits of vanishing and strong interactions, ARPES is described as an ideal probe of the bandstructure and correlation effects respectively.

In practice, the photoemission can be strongly modulated by the \(\left| {\left\langle {\psi _f} \right|{\hat{\mathbf \Delta }}\left| {\psi _i} \right\rangle } \right|^2\) term, altering the spectral intensity through the dependence of the initial and final states on energy, momentum, and band index. At worst, this suppresses all intensity from certain bands, precluding their study by ARPES entirely. From a different perspective however, this modulation can be viewed as an additional experimental signature in the ARPES intensity which encodes a description of the photoemitted electron’s wavefunction. This term can be simulated to allow for quantitative descriptions and insights regarding the experimental signal. While such an approach has been made at some level for a number of ARPES experiments, this requires substantial effort in developing a specific model for each study.4,5,6,7,8,9 The development of a standard numerical framework would allow for a much larger set of experiments to be analysed at this level, providing the opportunity to understand and leverage the matrix element effects in a broad class of materials. We have pursued this objective through the development of an open source software package, chinook, implemented in Python to enable a broad audience to perform quick and easy simulation of photoemission-related phenomena, thereby improving both the interpretation and analysis of experimental data.

In the following, we will outline the primary workflow of our numerical approach, and the various ways in which this package can be applied to the study of the electronic structure of solids via ARPES.

Results

Matrix element effects

In the design of an ARPES experiment, a cursory understanding or prediction of the matrix elements relevant to a given system can dramatically improve one’s ability to study aspects of the electronic structure of interest. Before proceeding to explicit description of our software, it is instructive to consider a motivating example, taken here to be the iron-based superconductor FeSe. In Fig. 1, we plot experimental data along the ΓM direction taken with two, orthogonal linear polarizations of light at  = 37 eV. Near the Brillouin zone centre, three hole-bands disperse away from the Fermi level (c.f. Fig. 2). For light polarized along the momentum axis in Fig. 1a, only a single state is observed clearly, whereas the perpendicular polarization in Fig. 1b. illuminates this and several other states. The third hole band is almost imperceptible, for any choice of polarization. These observations can be explained through an understanding of the orbital structure of the underlying electronic states, indicated by the insets of Fig. 1, in combination with the orbital-mixing effects of spin-orbit coupling (SOC), as will be discussed in more detail below.10,11 Recent experiments designed with these effects in mind have leveraged the dipole matrix elements to perform targeted spin- and angle- resolved photoemission from states of particular interest, extracting fundamental information pertaining to a broad variety of orbital-related phenomena.11,12,13,14,15,16

Fig. 1
figure 1

Experimental ARPES on FeSe. Both panels display ARPES intensity from valence states taken at hν = 37 eV and 120 K, directed along the ΓM direction. Polarization is set to linear vertical (a) and horizontal (b), allowing for photoemission from states of different orbital character, as indicated by the insets. Adapted from ref. 11 with permission from the authors

Fig. 2
figure 2

Tight-binding model of FeSe. Built using modified model from ref. 60 for tetragonal FeSe. In (a) we plot the bandstructure along a high symmetry path, with the colourscale indicating the expectation value of \(\langle \vec{L}\cdot \vec{S} \rangle\). The Fe-3d density of states is shown in (b). In (c) we plot the crystal structure projected into 2D, with Fe in red and the Se above (large) and below (small) in grey. In plane hopping terms are indicated, in addition to the primitive unit cell. The Fermi surface is plotted in (d)

In many cases, orbital symmetry can be extracted from the polarization dependence alone. The information available from these arguments is limited, in particular for high-orbital (l > 1) states, where the dimension of the vector field provides insufficient means to identify all orbitals uniquely. This is further complicated in multi-atom bases, where now relative phases between different sites can differ from symmetric combinations in a momentum-dependent fashion. In these situations, the photoemission intensity pattern is found to depend sensitively on the relative phases within the initial state wavefunction, producing so-called photoelectron interference patterns. In this way however, the matrix elements encode further information regarding the initial state beyond orbital symmetry alone. These effects have been seen in for example graphene7,17 and topological insulators,8,9 demonstrating the full depth of information regarding the initial state wavefunction which is contained in the ARPES matrix element. To further leverage the information available from ARPES experiments, it is advantageous to be able to simulate the full ARPES experimental intensity, while maintaining physical transparency. By preserving access to the relevant model parameters, one can then establish a more fundamental, and conceptual understanding of the electronic structure.

Model Hamiltonian

There are various levels at which the ARPES matrix element can be modelled.1,4,6,18,19,20 While the most sophisticated approaches account for the possibility of scattering processes subsequent to the photoemission event such as those which make use of Korringa-Kohn-Rostoker final states,21,22,23 we make two important simplifying assumptions here. First, the final states are taken to be free-electron plane waves:

$$\left| {e^{i\vec k \cdot \vec r}} \right\rangle = \mathop {\sum}\limits_{l,m} {i^l} j_l(kr)Y_l^m(\theta ,\phi )Y_l^{m \ast }(\theta _k,\phi _k).$$
(4)

At high photon energies, the assumption of the plane wave final state is particularly well justified, as the crystal potential can be treated as a perturbation and sensitivity to the momentum structure of the exact final states becomes negligible.3 The validity of this assumption is ultimately material dependent, however similar logic as that applied to the domain of suitability for the Born approximation can be made: such an assumption is reasonable when either the crystal potential Voħ2/mea2 or in the high-energy limit, Vo/(ħ2/mea2) ka, where a is the range of the potential. At present it is possible within chinook to relax this assumption only in the restricted sense of ref. 18, as one can include phase shifts to the final state expansion. While beyond the scope of chinook in its current form, it would be possible to write the final states in the form of more sophisticated scattering final states, where the radial and orbital components of the ket in Eq. (4) are modified appropriately to reflect the presence of a finite crystal potential. This can be done through modification of the radial integrals \(B_b^{l^\prime }\) defined below.

Secondly, we work within a tight-binding framework wherein the initial states can be described by localized atomic-like orbitals, centred on the sites of the lattice basis. In materials where the spin degree of freedom is relevant, the orbital basis can be doubled to define a complete spinor basis, represented here by χ±. Formally, the tight-binding basis set is expressed typically as:

$$\phi _a = R_{n,l}^a(r)K_l^a(\Omega )\chi _ \pm ,$$
(5)

where a represents a basis index and n, l the principal and orbital quantum numbers respectively. \(R_{n,l}^a(r)\) is a radial wavefunction, \(K_l^a(\Omega )\) a cubic harmonic, and χ± the spinor projection. Alternatives such as distorted and rotated basis states can also be accommodated, so long as a unitary transformation into the basis of spherical harmonics can be made for the purpose of photoemission calculations. While these simplifications are in some cases unable to capture the full structure of the experimental photoemission intensity, we trade this level of universality for the substantial gains in transparency and physical insight which can be extracted from this approach.

Regarding the definition of the tight-binding model, there are various formalisms which are found in the literature, including Slater-Koster,24 tab, and Wannier Hamiltonians: we have made an effort to accommodate all possible variations without loss of functionality. We require only that the Hamiltonian matrix elements can be written as a Fourier series of bilinear terms in the orbital Hilbert space:

$$H_o^{ab}(\vec k) = \mathop {\sum}\limits_{\{ \vec r_{ab}\} } {t_{ab}} e^{i\vec k\cdot \vec r_{ab}}c_{k,a}^\dagger c_{k,b}.$$
(6)

In this expression rab denotes the full connecting vector between basis states ϕa and ϕb, as opposed to the equivalent form where one refers to the connecting lattice vector alone. In addition to Ho(k), any other bilinear functions of momentum can also be added to the full Hamiltonian at this stage, including spin-orbit coupling, and orbital or spin order. Adherence to Bloch’s theorem can be further relaxed in the context of the low-energy effective models which describe a narrow region of momentum space: in this scenario, a more general function of momentum which satisfies the point-group symmetry can be employed.25

With the basis and Hamiltonian so defined, the eigenvalue problem can be solved, and the initial state wavefunctions then defined as a superposition of the basis states described by Eq. (5):

$$\psi _i = \mathop {\sum}\limits_b {\eta _b} \phi _b.$$
(7)

With this information, a full characterization of the model for a specific material can be performed, followed by subsequent simulation of ARPES matrix elements. This includes density of states, bandstructure, 3D Fermi surface, orbital projections of the eigenstates, as well as the expectation values of various operators of interest, such as for example, 〈LS〉 as in Fig. 2 for tetragonal FeSe. By defining an N × N Hermitian matrix, the expectation value of any observable operator can be computed in this way.

Computation of matrix elements

The workflow of chinook is sketched in Fig. 3. Once a satisfactory material model is established, one can proceed to the simulation of ARPES intensity maps. A suitable region of interest in momentum and energy space must be defined, and the eigenvalue problem is then solved over this domain.

Fig. 3
figure 3

chinook workflow. Essential steps are denoted by solid dark lines, with dashed arrows indicating optional iterative methods. Informed by experiment, DFT, or literature, the user defines a model Hamiltonian, including lattice geometry, orbital basis, and kinetic terms. This model can be characterized through consideration of the (orbitally-projected) bandstructure, total and projected density of states (DOS), Fermi surface, and the expectation value of other relevant operators. The experimental configuration is then defined, and the matrix element integrals can be computed. The resulting intensity is plotted, and compared against experiment. The model can be further refined, or the results exported for additional analysis

We model the matrix elements of the dipole operator as:

$$M_\alpha (\vec k,\omega ) \propto \langle {e^{i\vec k \cdot \vec r}} |{\hat{\bf \epsilon }} \cdot \vec r| {\psi _i^\alpha } \rangle ,$$
(8)

where we have made use of the commutation relations to express the dipole operator in the position representation.

In explicit evaluation of the ARPES matrix element, we expand both the initial and final states as prescribed by Eqs. (4) and (5), which allows us to express Eq. (8) as:

$$\begin{array}{l}M_\alpha (\vec k,\omega ) \propto \langle {e^{i\vec k \cdot \vec r}}|{\hat{\bf \epsilon }} \cdot \vec r| {\psi _i^\alpha }\rangle \\\quad = \mathop {\sum}\limits_b {c_\alpha ^b} (\vec k,\omega )\eta _b{\displaystyle\int} {d^3} re^{i\vec k \cdot \vec r - z_b/2\xi }{\hat{\bf \epsilon }} \cdot \vec rR_b(r)Y_{l_b}^{m_b}({\mathrm{\Omega }})\\\quad = \mathop {\sum}\limits_{b,l^{\prime}} {c_\alpha ^b} \eta _bY_{l^{\prime}}^{m^{\prime} }({\mathrm{\Omega }}_k)(i)^{l^{\prime}}{\displaystyle\int} \mathrm{d} rj_{l^{\prime}}(kr)r^3R_b(r)\\\qquad \times \mathop {\sum}\limits_\mu {\epsilon _\mu } {\displaystyle\int} \mathrm{d} {\mathrm{\Omega }}Y_{l^{\prime}}^{m^{\prime}}({\mathrm{\Omega }})Y_1^\mu ({\mathrm{\Omega }})Y_{l_b}^{m_b}({\mathrm{\Omega }}).\end{array}$$
(9)

This sum over integrals can be expressed in compact form as:

$$M_\alpha (\vec k,\omega ) \propto \mathop {\sum}\limits_{\mu ,b,l^{\prime} ,m} {\epsilon _\mu } c_\alpha ^b(\vec k,\omega )\eta _bY_{l^{\prime}}^m({\mathrm{\Omega }}_k)B_b^{l^\prime }(k)G_{l^{\prime},m}^{b,\mu },$$
(10)

where \(c_\alpha ^b(\vec k,\omega ) \equiv \left\langle {\phi _b|\psi _i^\alpha } \right\rangle\), and \(\epsilon _\mu\) the components of the polarization vector. In the third line, we have absorbed an extinction factor \(e^{ - z_b/2\xi }\) into ηb, where ξ represents the mean-free path of a photoemitted electron, and zb the spatial extent of the basis orbital below the surface.8,26 The radial and angular integrals are contained in the following terms, respectively:

$$B_b^{l^\prime }(k) = (i)^{l^\prime }{\int} \mathrm{d} rj_{l^\prime }(kr)r^3R_b(r),$$
(11)

and:

$$G_{l^{\prime} ,m^{\prime} }^{b,\mu } = {\int} \mathrm{d} {\mathrm{\Omega }}Y_{l^{\prime }}^{m^{\prime} }({\mathrm{\Omega }})Y_1^\mu (\Omega )Y_{l_b}^{m_b}({\mathrm{\Omega }}),$$
(12)

Here, \(G_{l^\prime ,m^\prime }^{b,\mu }\) is equivalent to a small subset of Gaunt coefficients, allowing for efficient and exact evaluation of this term.

Meanwhile, the radial integrals cannot necessarily be expressed in an analytical form and must be computed numerically, as the radial wavefunction is loosely constrained in most tight-binding models.27 Whether a hydrogenic, Slater, or more complex object should be employed to describe the radial wavefunction is left to the discretion of the user, as the best choice is somewhat dependent on the nature of the material and states of interest. The user is given the opportunity to select from a variety of initial state wavefunctions in addition to importing their own functions or radial integrals at the start of the calculation. This could be the Wannier function grid as generated by for example Wannier90.28

We note that elsewhere it is common to take advantage of the plane-wave final state to recast the matrix element as a polarization modulated Fourier transform of the initial state.19,29 Specifically, one can write

$$M_{\mathrm{FT}} \propto \langle {e^{i\vec k \cdot \vec r}}|\hat \nabla \cdot {\hat{\bf \epsilon }}| {\psi _i}\rangle = i\vec k \cdot {\hat{\bf \epsilon }}\langle {e^{i\vec k \cdot \vec r}|\psi _i}\rangle .$$
(13)

MFT can be expanded as in Eq. (10), establishing some formal equivalence to Eq. (8). However, one commonly observes qualitative deviation from experiment within this description, due to the form of the radial integrals \(B_b^{l^\prime }\) introduced above. In the Fourier representation, these are written as:

$$B_b^{l^{\prime}}(k) = (i)^l{\int} \mathrm{d} rj_l(kr)r^2R_b(r).$$
(14)

One can contrast this with Eq. (11) which we employ in chinook. It is made explicit in Eq. (14) that the Fourier representation of the dipole operator imposes radial integrals which are independent of final state angular momentum. The implications for multi-orbital systems, and for those where l > 0, are important as final state interference becomes relevant. This is ultimately why for a plane-wave final state, the position, rather than momentum representation of the dipole operator yields a better description of experiment. We emphasize that although the limited constraints of tight-binding imply that the integrals \(B_b^{l\prime }\) are to some extent parameters of the calculation, support for distinct final state angular momentum cross-sections is essential to the success of the position representation used here. Furthermore, it offers a natural extension of our framework to scattering-final states, wherein the commutation relations required to establish Eq. (8) are more rigourously justified (Further discussion of these approximations can be found in the Supplementary Materials).

Returning to the calculations executed in chinook, the central object of importance is the coherent matrix element factor:

$$M_\alpha (\vec k) = \left[ {\begin{array}{*{20}{c}} {M_{\mu = - 1}^ \downarrow } & {M_{\mu = 0}^ \downarrow } & {M_{\mu = 1}^ \downarrow } \\ {M_{\mu = - 1}^ \uparrow } & {M_{\mu = 0}^ \uparrow } & {M_{\mu = 1}^ \uparrow } \end{array}} \right]_\alpha (\vec k).$$
(15)

Evaluation of this object proceeds following Eq. (9). Each column corresponds to the projection of the polarization vector in the basis of spherical harmonics, (i.e. μ ≡ Δm = ±1, 0), and the rows indicate the spinor projections. By retaining the matrix element in this coherent form, the ARPES intensity for arbitrary polarization and spin projection can be recalculated at run-time with minimal computational overhead. For each band and k-point in the region of interest, a spectral function as defined in Eq. (3) is added to the total intensity map, with its amplitude multiplied by:

$$|M_\alpha |^2 = |\mathop {\sum}\limits_\mu {\epsilon _\mu } M_{\alpha ,\mu }^ \downarrow (k)|^2 + |\mathop {\sum}\limits_\mu {\epsilon _\mu } M_{\alpha ,\mu }^ \uparrow (k)|^2.$$
(16)

The photoemission intensity is then computed as described in Eq. (1), with \(\left| {\left\langle {\psi _f} \right|{\hat{\mathbf \Delta }}\left| {\psi _i} \right\rangle } \right|^2 \to |M_\alpha |^2\). Spin projection, polarization, resolution, temperature and self-energy can all be updated with little overhead at run-time. With ARPES intensity maps then calculated for different experimental configurations, these results can be exported for further analysis, or combined to define quantities such as spin-polarization and circular/linear dichroism. In this sense, the output of the standard chinook calculation is a three-dimensional array of intensity in coordinates of momentum and energy which can be explored and analyzed in the same way as an experimental ARPES measurement.

Discussion

Bulk electronic structure and orbital texture

Returning to the motivating case of the Fe-based superconductor FeSe, we can implement the model characterized above in Fig. 2 and compare the simulated ARPES intensity against the low-energy region of Fig. 1. As with the experiments, the calculations were done at  = 37 eV and T = 120 K. A Fermi-liquid type self-energy has been applied to the spectral features, resulting in an energy-dependent broadening of the photoemission linewidth. In the present case, the tight-binding model has already been renormalized to match the experimental spectra, such that the dispersion is more appropriately defined as \(\epsilon _k^\prime = \epsilon _k^0 - \Sigma ^\prime (k,\omega )\). Consequently, the self-energy used in the ARPES simulation is purely imaginary, \(\Sigma (\vec k,\omega ) = i\Sigma^{\prime\prime} (\omega ) = - i(0.005 + 1.0\omega ^2)\), which is plotted in Fig. 4e. As ARPES matrix-elements can confound the evaluation of the spectral function and correlation effects in experimental data,30,31,32 the ability to model both components in the same environment can facilitate the disentanglement of these two objects of interest.

Fig. 4
figure 4

Calculated ARPES spectra for FeSe. Performed with  = 37 eV, along ΓM direction. The balance of SOC and the crystal-field can be observed with the orbital projections plotted in (c) for several points along the dispersion, as indicated by the cursors in (b). In (b, c), polarization vectors are indicated by arrows, corresponding to s-polarized \(\hat \epsilon = [0,1,0]\) in (b) and p-polarized \(\hat \epsilon = \sqrt {\frac{1}{2}} [ - 1,0,1]\) light in (c). The sample is aligned with the Fe-Fe bond direction oriented along the Cartesian basis. In (d), calculated Az as from circularly polarized spin-ARPES (CPS-ARPES) provides a more direct perspective on SOC, with an explicit connection to the \(\langle {\vec L \cdot \vec S}\rangle\), projected along the \({\hat{\mathbf z}}\) axis. Finite linewidth of the spectra in (bd) reflect the convolution of experimental resolution ΔE = 10 meV, Δk = 0.01 Å−1 and Im[Σ(ω)] plotted in (e)

The simulation in Fig. 4 captures the relative intensity ratio between the three hole bands, with the heaviest (largest effective mass) band visible only through the SOC-induced hybridization gaps near EB = 50 meV. This latter state, composed primarily of dxy orbitals, has vanishing photoemission intensity along the normal emission (\(\lim _{k_{||} \to 0}\)) direction due to the selection rules associated with its definition in terms of spherical harmonics \(Y_2^{ \pm 2}\): all possible final states have a node along the normal emission direction. While conventional interpretation of the remaining states assumes dxz/yz-like wavefunctions, SOC allows for finite intensity from both states, as observed both experimentally and in the simulation near k|| = 0 Å−1. This orbital character mixing of the initial states is supported by projection of the tight-binding eigenstates onto the basis of spherical harmonics, as done at select k-points in Fig. 4a using built-in diagnostic tools from chinook.

A more direct measure of the influence of SOC can be achieved through combining circularly polarized light with spin resolution to gain explicit access to both spin and orbital degrees of freedom. One can define the polarization asymmetry as:

$$A_z = \frac{{\sqrt {I_ + ^ \downarrow I_ - ^ \uparrow } - \sqrt {I_ + ^ \uparrow I_ - ^ \downarrow } }}{{\sqrt {I_ + ^ \downarrow I_ - ^ \uparrow } + \sqrt {I_ + ^ \uparrow I_ - ^ \downarrow } }},$$
(17)

where subscripts indicate the helicity of light polarization, and superscripts the spin-projection of the photoelectrons. This quantity is closely related to the projection of \(\langle \vec{L}\cdot\vec{S}\rangle\) along the quantization axis of the experiment, allowing for a connection between Figs. 2c and 4d to be made. This technique has been applied to both ruthenates13,33 and Fe-based superconductors,11 and utilizes the dipole selection rules encoded within the matrix element factor to provide the most direct measure of spin-orbital entanglement in solid state.

Supercell impurity model

Consideration of a supercell model illustrates the information encoded in the ARPES matrix element beyond orbital symmetry alone. Regarding the electronic structure of periodic systems, one can choose an arbitrarily large unit cell in exchange for a reduced Brillouin zone and additional backfolded bands. By contrast, impurities or other symmetry-breaking potentials (SBP) explicitly require such an expanded unit cell. While one can numerically perform an unfolding of these bands in an attempt to recover the spectrum within the extended Brillouin zone,34 such an unfolding is carried out naturally in the photoemission experiment. In the absence of the SBP, Bloch’s theorem would impose that the original band becomes a symmetric superposition over the neighbouring lattice sites. The additional bands, which must be orthogonal to the original state will destructively interfere in evaluation of the ARPES matrix element, preventing observation of many of the folded states. Ultimately, an SBP can mix these states; when the SBP is an essential feature of the potential landscape, as in graphene and the Fe-based superconductors,7,35 the folded bands can be observed with strong intensity over a range of momenta. When the SBP is weak or disordered, intensity from these folded bands vanishes away from the avoided crossings.

To demonstrate these effects, we consider the artificial example of a square lattice of Li 2s orbitals, into which we substitute some number of Na 3s orbitals. Allowing for nearest neighbour hopping alone, and imposing a \(\epsilon _{Na}\) = −0.35 eV impurity potential for the Na sites, we simulate the effect of local defects in this lattice and the resulting ARPES spectra. Kinetic terms and onsite potentials have been adapted from the phenomenological rules set out in ref. 36. In an attempt to consider the impurity problem realistically, we populate a 30 × 30 supercell of Li with various concentrations of randomly distributed Na impurities. For each particular distribution, the density of states is integrated to fix the Fermi-level at half-filling, consistent with electron counting. We then compute the photoemission intensity at  = 21.2 eV over the ΓM direction of the extended Brillouin zone. For clarity, we assume a constant intrinsic linewidth of 10 meV along the entire dispersion. Energy and momentum resolution are set to 10 meV and 0.005 Å−1. The results, plotted in Fig. 5c have been averaged over 80 such configurations, corresponding to a nominal doping of Li0.9Na0.1. This can be compared against the pure Li-supercell in Fig. 5b. The full spectrum of the latter is displayed in Fig. 5a.

Fig. 5
figure 5

Na impurity-substituted Li supercell. In (ac), we plot the ARPES intensity at 21.2 eV with s-polarized (along momentum axis) light over the ΓM direction of the extended Brillouin zone. The spectra have been averaged over 80 random configurations of a 30 × 30 square lattice (as for example in (d)). Panels (a, b) represent the pristine lattice of Li: the dispersion follows precisely that of the 1-Li unit cell, with all other states destructively interfering to produce zero intensity. Representative tight-binding bandstructures are plotted over the spectra. In (c) however, 90 Na atoms (\(\epsilon _{Na} = - 0.35\) eV) have been substituted for Li. In (e), the effect of impurity potential \(\epsilon _{Na}\) is demonstrated for the series of EDCs at the Fermi momentum kF, at a fixed concentration of 10% Na. Similarly in (f), fixing \(\epsilon _{Na} = - 0.35\) eV, the same is done for different concentrations

In each case, we compute the photoemission intensity from all states. However intensity from all folded bands is vanishing in the absense of the SBP; the full bandstructure is plotted in white over the spectra to demonstrate the large suppression of photoemission intensity. At the bottom of the band, the dipole selection rules suppress photoemission intensity from even the main band. In the disorder-averaged supercell, a substantial broadening of the spectral lineshape is observed.37,38 As indicated by the overlain bandstructures of Fig. 5a–c, the impurity potential introduces a high density of avoided crossings, where the eigenvector supports finite photoemission intensity. In this sense, the broadening can be associated with the relative phases within the tight-binding eigenvector to which the ARPES matrix element is sensitive.

One can demonstrate that the linewidth broadening is dependent on both concentration and strength of impurities. In Fig. 5e, f we plot energy distribution curves (EDCs) at kF = 0.44 Å−1 for fixed concentration (Li0.9Na0.1) with variable attractive (negative) \(\epsilon _{Na}\), and fixed \(\epsilon _{Na} = - 0.35\) eV with variable concentration. Each spectrum has been averaged over 80 similar configurations, and normalized to its peak intensity. The linewidth is observed to increase monotonically with both concentration and impurity potential, indicating the similar role these degrees of freedom play in modifying the spectra of disordered systems. Despite this lineshape broadening, the low-energy dispersion is resilient against a high level of disorder, as illustrated by Fig. 5c. By applying an out-of-plane polarization sensitive to states near the bottom of the band, we also confirm an increase of the bandwidth for this attractive impurity potential, which grows quadratically with impurity potential for the modest \(\frac{{|\epsilon _{\mathrm{Na}}|}}{W} \;<\; 0.15\) considered here: at 10% Na and \(\epsilon _{\mathrm{Na}} = - 0.35\) eV, the band bottom is extended 30 meV. Such detailed study of the impurity-substituted ARPES spectra is not possible without consideration of the matrix elements, which allow for a straightforward disentanglement of the supercell bandstructure and an opportunity to achieve meaningful insights from disordered materials.

Surface vs bulk, and emergent k z dispersion

To this point we have considered the bulk-electronic structure, but it is important to appreciate the surface-sensitivity of the ARPES experiment: the high scattering cross-section in the ultraviolet regime results in penetration depths of the order of 5–10 Å.26 This corresponds to the top few unit cells of the lattice, depending on experimental details. In many cases, the surface introduces modest corrections to the local electronic structure, facilitating a direct connection between the measured photoemission intensity and the bulk electronic structure.39 In others, details of the surface preparation result in reconstructions of the ARPES spectra which deviate profoundly from the bulk electronic structure.39,40,41,42

This surface sensitivity becomes rather important in the context of three- or even quasi-two- dimensional materials, where the photoelectron escape depth and kz information are intimately connected. For intermediate energies in the ultraviolet regime, where the penetration depth is of the order of 5 Å, the Δkz required by the uncertainty principle becomes comparable to the size of the Brillouin zone. In the presence of finite kz dispersion, this can result in anomalously broad linewidths, as the spectrum effectively integrates over the third dimension of momentum space. This is visualized well in Fig. 6, where we have projected our FeSe model onto a 20-layer slab model along the (001) direction. While the slab bandstructure is by construction independent of kz, signatures closely related to the bulk kz dispersion are observed in photon energy-dependent matrix element calculations, as seen in Fig. 6b. We estimate the attenuation factor \(e^{ - \xi /2z_i}\) of the escaping photoelectrons using the universal escape depth curve from ref. 26. The kz value probed is calculated using an inner potential of V0 = 12.2 eV.10 While at both low and high photon energies the penetration depth is sufficiently large that that Δkz should be less than 0.05 Å−1, at  = 71 eV Δkz = 0.11 Å−1, and linewidth broadening is observed as a result (note here π/c = 0.57 Å−1). Conversely, for larger ξ values, Δkz becomes negligible, and something akin to the bulk electronic structure is recovered. Note that Δkz is not explicitly included in these calculations, but emerges naturally from the combination of slab geometry with variable penetration depth. From these results, it becomes evident that the surface sensitivity can complicate successful estimation of the bulk electronic structure. This emphasizes the need for proper characterization of the kz dispersion, accessible via photon-energy-dependent measurements, as in Fig. 6b.

Fig. 6
figure 6

Surface-projected photon energy dependence in FeSe. In (a), we build a 20 layer slab tight-binding model from the bulk model in Fig. 2. We calculate the ARPES intensity along the ΓM direction, at several photon energies; the results are summarized by a cut at constant binding energy of EB = 25 meV in panel (b). Spectra at each photon energy have been renormalized to their maximum intensity. In panels (ce), spectra at select photon energies are plotted, chosen to correspond to the same kz = 0 Å−1 point in successive Brillouin zones to enable direct comparison. Photon energy dependence of the measured linewidth is observed due to effective kz integration

Photoelectron interference, and spin-ARPES

Despite these challenges associated with surface sensitivity, the surface can also precipitate new states localized to the interface region which are not possible in bulk systems. Such is the case for example in the Shockley surface states observed along the (111)-termination of noble metals,23,43 Fermi arcs on Weyl-semimetals,44 and conductive surface states observed in topological insulators such as Bi2Se3.45

To model the ARPES spectra from these surface states, an extended lattice basis is required, with the unit cell projected onto a slab-geometry. Our implementation of the slab generation is inspired by the algorithm in ref. 46, allowing for nearly total automation of the slab Hamiltonian initialization. Given a surface Miller index, new lattice vectors can be defined which projects the new unit cell along the desired surface direction to the desired thickness. The bulk Hamiltonian can be propagated over this slab supercell. While we formally maintain periodic boundary conditions, rather than preserve the full translational symmetry of the bulk crystal, a vacuum buffer is defined, with a thickness sufficiently large to suppress hopping elements between neighbouring slab unit cells. The precise location of the crystal-vacuum interface is tuned by the user to achieve the desired surface termination of the crystal. In the case of Bi2Se3, this termination must occur between the van der Waals-bonded layers of two adjacent quintuple layers (QL) to preserve the topological surface states. The procedure is illustrated in Fig. 7a–c.

Fig. 7
figure 7

Topological surface states in Bi2Se3. The progression from the rhombohedral unit cell of the bulk lattice (a) to the (111)-surface projected (b) hexagonal unit cell, and finally a Se1-terminated slab (c) is plotted in the top row. In the bottom row, the bulk (d) and an 11-QL slab (e) bandstructure are compared. The colourscale of panel (e) reflects the expectation value of the surface-projected spin, directed orthogonal to the momentum axis (〈Sy〉 along kx, and 〈Sx〉 along ky)

Expansion of the basis set to a suitably large slab carries the caveat of a significant memory overhead, which can be to an extent mitigated in the calculation of ARPES intensity: as the finite penetration depth of the probe and photoemitted electrons limit the volume of the unit cell to which we are actually sensitive, the eigenvectors are truncated beyond a modest multiple of the mean-free path, allowing for both efficient and high-fidelity surface-projected ARPES maps to be computed, as done for the 400-orbital basis used for the simulation in Fig. 8a. As a result, it is the mean-free path more than the size of the basis which limits the ability to treat very large slab unit cells. As an example of this functionality, simulated and experimental ARPES intensity from Bi2Se3 are plotted in Fig. 8. Many of the central tenets of a model strong TI have been confirmed in this material, such as the anticipated chiral spin texture, observed directly via spin-resolved ARPES.45,47,48,49 Such spin-resolved experiments50,51,52 can also be simulated within the chinook software, as shown in Fig. 8b, where we present the simulated spin polarization:

$$P_y = \frac{{I_{ \uparrow _y} - I_{ \downarrow _y}}}{{I_{ \uparrow _y} + I_{ \downarrow _y}}}.$$
(18)

This result is in agreement with experiment,45 and can be compared favourably with the surface-projected expectation value of the spin \({\hat{\mathbf S}}\) operator, \(e^{ - \frac{{|{\hat{\mathbf z}}|}}{\xi }}{\hat{\mathbf S}}\), plotted in Fig. 7e. The bulk states, which lack any discernible spin-polarization (Fig. 7d), vanish from the calculation of Py and so do not appear in Fig. 8b.

Fig. 8
figure 8

ARPES spectra from topological surface states on Bi2Se3. In (a), simulated photoemission spectra along the kx direction around the surface Brillouin zone Γ point, as observed with p-polarized light (\(\hat \epsilon = \sqrt {\frac{1}{2}} [ - 1,0,1]\)). Calculated spin-polarization, projected out of the plane of the page (c.f. Eq. (18)) for the same region is plotted in (b). Constant energy contours give evidence for bulk-hybridization and interlayer interference, as seen in both the simulation (c) and experiment (d). Data in panel (d) reproduced from8 with permission from the authors

While the topological surface states ΨTSS are primarily composed of pz orbitals at the surface, a pronounced modulation of the photoemission intensity around the Dirac cone is observed as a consequence of the finite extension of ΨTSS into the crystal bulk. Hybridization with bulk states, in addition to interlayer photoelectron interference can be understood as the progenitor of this modulation, as explored in depth in ref. 8 The interpretation of this angular intensity pattern in ARPES measurements presents an essential experimental verification and explanation of the limitations of applying a simple \(\vec{k}\cdot\vec{p}\) model to the description of real topological insulators such as Bi2Se3. While localized within a finite region near the vacuum interface, the full three-dimensionality of the surface state becomes apparent through consideration of this spectroscopic evidence. Convenient extension to a slab-geometry is then a critical functionality offered by the chinook package.

Variable experimental geometry

In practice, ARPES experiments rotate either the sample normal or spectrometer in order to access a broad set of emission angles. While some modern techniques such as photoemission electron microscopy (PEEM),53 angle-resolved time-of-flight (ARTOF),47,54 and deflector-based ARPES55,56,57 apparatus avoid this complication, the assumption of a constant experimental geometry is not always possible. Furthermore, it is often advantageous to rotate the sample orientation in order to for example explore large regions of momentum space, or to achieve better momentum resolution available at higher emission angles.3 While the most direct complication is associated with variable photon polarization, in the case of S-ARPES, the relative orientation of the detector with respect to the sample is essential to interpret data correctly.

To exemplify the practical considerations associated with such experiments wherein the geometry is variable, one may consider an exploration of Rashba-split spin-polarized surface states, as in for example PtCoO2.58 Using the model presented in ref., 58 in Fig. 9, the Fermi surface is surveyed over several neighbouring Brillouin zones, accessed by rotation of the sample about the horizontal (i.e. kx) axis. The ARPES intensity calculated with and without consideration of the rotated polarization vector can be compared in Fig. 9a, b. While the intensity in the first Brillouin zone is fairly homogeneous in either case, the higher order zones reflect more substantial variation. These rotations complicate the extraction of orbital character from the photoemission intensity, requiring explicit consideration of the polarization rotation.

Fig. 9
figure 9

Experimental geometry considerations. In panels (a, b), we plot the Fermi surface of PtCoO2 over several neighbouring Brillouin zones, neglecting (a) and including (b) the rotation of the polarization vector associated with the sample orientation. The range of polarization vectors are illustrated by arrows in panel (b): for the horizontal (kx) analyzer entrance slit, the polarization is fixed along lines of constant ky. Moving from bottom to top, the sample rotates by ≈90°, and the polarization goes from entirely in-plane, projected almost along the \({\hat{\mathbf y}}\) direction, to almost entirely out of plane along the \({\hat{\mathbf z}}\) direction. The geometry is drawn schematically in (c): the sample is rotated about the red (\({\hat{\mathbf x}}\)) axis to access the full domain of ky, with the polarization as drawn: \(\hat \epsilon = \sqrt {\frac{1}{2}} [0,1,1]\) in the laboratory frame. Finite \(\epsilon _x\) at all angles results from a rotation of 7.2° about the grey axis in (c) to select the desired kx window. In (df) we compare the measured spin polarization in the first (d) and second Brillouin zones (e, f). Axis labels indicate distance from Γ1,2. While a chiral Rashba spin texture is observed near normal emission, contamination between the different spin-channels is manifest as a substantial and artificial out-of plane spin projection in the second zone. Pz is zero near Γ1 and not shown here. All related colourscales are represented on the same scale to facilitate direct comparison

In the context of spin-resolved measurements, the spin-projection is measured within the laboratory frame-of-reference, which remains fixed for all sample orientations. As the sample is rotated, contamination of orthogonal spin-channels is inevitable. While such effects are minimal near Γ1, a significant out-of-plane spin-polarization arises near Γ2, with Pz over 36% of Px, as demonstrated in Fig. 9d–f. It is important to note that the intrinsic spin-polarization is confined entirely to the plane; this apparent out-of-plane polarization exists only in the coordinate frame of the laboratory apparatus. Accounting for the rotation of the measurement coordinate frame for the given experiment, one can then redistribute this information into the channels associated with the sample’s intrinsic spin texture:

$$\vec P_i = \vec P_{\mathrm{exp}} \cdot \vec S(\theta ),$$
(19)

where here \(\vec P_{\mathrm{exp}}\) is the measured spin polarization from Eq. (18) and \(\vec S(\theta )\) the spin-projection axis measured at each emission angle. Although Fig. 9b indicates higher photoemission intensity is available in the second Brillouin zone, Fig. 9e, f illustrate the practical challenges associated with resolving the spin-texture near Γ2. By affording the user with an ability to encode a realistic experimental configuration in the simulation, such effects can be accounted for in detail, circumventing a significant experimental limitation which may otherwise restrict more general application of the techniques detailed here.

We have presented here a simple and powerful numerical framework implemented in Python for the simulation and interpretation of ARPES spectra for a broad variety of materials of interest. Designed with this specific purpose in mind, the open-source structure of the chinook software package is engineered to accommodate further extension beyond this application, as we have done recently for the study of resonant optical excitations in pump-probe spectroscopy experiments.59 Through the development of these tools, we hope to motivate and facilitate the consideration of the great depth of information encoded in the matrix element of ARPES towards a better understanding of these experiments and the electronic structure of the materials under consideration.

Methods

All work presented here was performed in Python using the chinook package.