Efficient calculation of carrier scattering rates from first principles

The electronic transport behaviour of materials determines their suitability for technological applications. We develop a computationally efficient method for calculating carrier scattering rates of solid-state semiconductors and insulators from first principles inputs. The present method extends existing polar and non-polar electron-phonon coupling, ionized impurity, and piezoelectric scattering mechanisms formulated for isotropic band structures to support highly anisotropic materials. We test the formalism by calculating the electronic transport properties of 23 semiconductors, including the large 48 atom CH3NH3PbI3 hybrid perovskite, and comparing the results against experimental measurements and more detailed scattering simulations. The Spearman rank coefficient of mobility against experiment (rs = 0.93) improves significantly on results obtained using a constant relaxation time approximation (rs = 0.52). We find our approach offers similar accuracy to state-of-the art methods at approximately 1/500th the computational cost, thus enabling its use in high-throughput computational workflows for the accurate screening of carrier mobilities, lifetimes, and thermoelectric power.

Electron mobility, µ e , can be computed through the linearized Boltzmann transport equation (BTE) [1][2][3][4], given for electrons as where α and β denote Cartesian coordinates, n e is the electron concentration, Ω and Ω BZ are the volumes of the unit cell and first Brillouin zone, respectively, v nk,α is the group velocity of band index n and wave vector k, "cb" stands for conduction bands, and ∂ E β f nk is the perturbation to the Fermi-Dirac distribution by an electric field E. The Fermi-Dirac distribution is given by where ε nk is the energy of state nk, ε F is the Fermi level, k B is the Boltzmann constant, and T is temperature. The perturbation to the equilibrium Fermi-Dirac distribution is given by the self-consistent solution of where τ nk is the electron lifetime, δ is the Dirac delta function, ∆ε nm k,q = ε nk − ε mk+q , is the reduced Planck constant, and n q is the Bose-Einstein occupation. The matrix elements g nm (k, q) give the probability of scattering from an initial state nk to final state mk + q via a phonon with wave vector q and frequency ω q .
The primary complexity in the Boltzmann transport equation results from the dependence of the linear response coefficients ∂ E β f nk of state nk on all other states mk + q. Accordingly, there are several common approximations to the BTE that can significantly reduce the computational cost. The momentum relaxation time approximation (MRTA) makes two simplifications: (i) Firstly, the linear response coefficients are presumed to only act in the direction of the band velocity, such that the electron lifetimes will be scalar quantities [2,4]. (ii) Secondly, the probability of scattering from state nk to mk + q is assumed to be the same as scattering from state mk + q to nk. The result is that the effects of back scattering are accounted for by a geometrical factor resulting from the electronic group velocities. The resulting expression for τ −1 nk can be written where τ −1 nk→mk+q is the partial decay rate for scattering from initial state nk to final state mk + q. In this approximation, Supplementary Eq. (1) can be rewritten A further simplification can be made by ignoring the effcts of scattering back into the state nk entirely. This corresponds to neglecting the second term on the righthand side of Supplementary Eq. (3) or setting the geometric factor in the square bracket of Supplementary Eq. (4) to 1. In this approach, termed the self-energy relaxation time approximation (SERTA) [3], the electron lifetimes can be obtained according to and the mobility calculated in the same manner as Supplementary Eq. (5). The partial decay rates of Supplementary Eqs. (4) and (6) can be obtained through Fermi's golden rule. In the present work, we implement two classes of scattering: (i) inelastic scattering which occurs via emission or absorption of a phonon and (ii) perfectly elastic scattering in which no energy is gained or lost. In the case of inelastic scattering, the partial decay rate can be written [5,6] τ −1 nk→mk+q = 2π |g nm (k, q)| 2 × [(n q + 1 − f 0 mk+q )δ(∆ε nm k,q − ω q ) + (n q + f 0 mk+q )δ(∆ε nm k,q + ω q )], Supplementary  [7,8] Acoustic deformation potential Deformation potential, elastic constant Elastic [9][10][11][12] Piezoelectric acoustic Piezoelectric constant Elastic [13][14][15] Polar optical phonon Static and high-frequency dielectric, phonon frequency Inelastic [16] where the − ω q and + ω q terms correspond to scattering by emission and absorption of a phonon, respectively. The dependence of τ −1 nk→mk+q on the occupation of state mk + q and the observation that f mk+q = f nk reveals that inelastic scattering is not commutativei.e., τ −1 nk→mk+q = τ −1 mk+q→nk . We note that for spin polarized materials, scattering only occurs between states in the same spin channel -i.e., there are no interactions between spin-up and spin-down electrons.
For elastic scattering, Supplementary Eq. (7) reduces to In contrast to inelastic scattering, elastic processes do not depend on the occupation of state mk + q. Accordingly, τ −1 nk→mk+q = τ −1 mk+q→nk and a primary assumption of the MRTA is satisfied. For this reason, we treat elastic scattering processes under the MRTA, whereas inelastic scattering processes are treated in the SERTA.

Scattering matrix elements
The general form of the quantum mechanical scattering matrix elements in Supplementary Eqs. (3), (4), and (6) is g nm (k, q) = mk + q|∆ q V |nk (9) where ∆ q V is an electronic perturbation associated with a scattering process [6]. In the present work we calculate matrix elements within the Born approximation [17]; namely, the electronic perturbation is assumed to only weakly impact the wave function of the final state mk + q. The scattering matrix elements considered in this work and the materials parameters needed to calculate them are summarized in Supplementary Table 1. G-vector summation The matrix elements include a sum over reciprocal lattice vectors G. In this work, we restrict the summation to only include a single reciprocal lattice vector, G = [0, 0, 0], such that only phonons within the first Brillouin zone are considered but Umklapp scattering processes (with respect to the electronic Brillouin zone) are taken into account.
Impurity scattering The inverse screening length β, required in the calculation of the ionized impurity matrix element, is given by where 1/β corresponds to the Debye length and Thomas-Fermi screening length for non-degenerate and degenerate doping regimes, respectively [18].

Transport properties
Electronic transport properties -namely, conductivity, Seebeck coefficient, and electronic component of thermal conductivity -are calculated through the Onsager coefficients [19,20]. The spectral conductivity, defined as is used to compute the moments of the generalized transport coefficients where ε F is the Fermi level at a certain doping concentration and temperature T . Electrical conductivity (σ), Seebeck coefficient (S), and the charge carrier contribution to thermal conductivity (κ) are obtained as B. Computational Framework

Brillouin-zone interpolation and integration
As described in the main text, we employ a combined Fourier-linear interpolation scheme when calculating scattering and transport properties. Electronic eigenvalues -calculated using density functional theory (DFT) on a coarse k-point mesh -are Fourier interpolated onto a denser mesh. Fourier interpolation is performed using the boltztrap2 software [22,23] which enforces symmetry using star functions and employs the Shankland algorithm to ensure that both quasi-particle Figure 1. Schematic of the linear-tetrahedron method. (a) A 2 × 2 × 2 k-point submesh can be broken up into (b) six tetrahedra. Adapted from Supplementary Ref. [21]. (c) The constant energy surfaces (light gray planes) defined by εa and ε b intersect the tetrahedron to produce the cross sections fa (dark gray triangle) and f b (dark gray quadrangle). The triangular cross section fa is defined by the points c1, c2, and c3. The k-points at the tetrahedron vertices have been numbered according to increasing energy, i.e., ε k 1 < ε k 2 < ε k 3 < ε k 4 . (d) Coordinate transformation from initial basis (black arrows) to transformed basis (pink arrows) that maps the cross section onto a 2D plane. The x * coordinates of all points on the cross section are zero. energies and their derivatives (group velocities) are exactly reproduced [24][25][26]. This approach aims to minimise the roughness function proposed in Supplementary Ref. [27].
Scattering rates are calculated on the Fourier interpolated k-point mesh. When calculating the partial decay rate, scattering is limited to the constant energy surface defined by ε = ε nk in the case of elastic processes [Supplementary Eq. (8)] and ε = ε nk ± ω q for inelastic processes [Supplementary Eq. (7)]. Note that, in our implementation of polar optical phonon scattering we rely on a single dispersionless phonon mode, whose energy ω po is independent of q. Due to finite k-point sampling, it is common replace the delta function in Supplementary Eqs. (7) and (8) by Gaussian or Lorentzian functions with finite broadening. This procedure has the effect that the calculated lifetimes will depend on the chosen broadening parameter.
An alternative approach is to employ the linear tetrahedron method to analytically integrate the scattering rates across the constant energy surface [21,28]. In this method, the Brillouin zone is divided into tetrahedra [Supplementary Figs. 1(a) and 1(b)]. For each electronic band, the eigenvalues are obtained for the kpoints at the corners of the tetrahedra. The constant energy surface defined by ε nk intersects a tetrahedron if ε min tetra < ε nk < ε max tetra , where ε min tetra and ε max tetra are the minimum and maximum energies of the tetrahedron's vertices [ Supplementary Fig. 1(c)]. Computing the intersections of ε nk with all tetrahedra gives rise to a set of tetrahedron cross-sections that define the constant energy surface. In the traditional implementation of the tetrahedron method, the integration for each tetrahedron is performed analytically after linearly interpolating the eigenvalues and matrix elements inside the tetrahedron. As we note in the main text, this approach is only valid for matrix elements that show a linear dependence on q. For ionized impurity scattering, where the matrix ele-ment has a 1/|q| 2 dependence, this assumption does not hold and results in severe overestimation of the scattering rate.
To overcome this limitation, we employ a modified linear-tetrahedron approach. The constant energy surface is determined in the same manner as the tetrahedron method. However, instead of analytically integrating within each tetrahedra, the tetrahedron cross sections (comprising the constant energy surface) are numerically resampled with hundreds of extra points. By only computing additional k-points that exactly satisfy the delta term in Supplementary Eqs. (7) and (8), this allows for "effective" k-point mesh densities that would be almost impossible to achieve with uniform k-point sampling. The scattering matrix elements are computed on the denser submesh by linear interpolation of the electronic wave functions ψ nk and group velocities v nk . We note that the scattering wave vector q is a geometric term that is known exactly for all points on the submesh. A primary advantage of this approach is that while the matrix elements cannot be linearly interpolated with q, the constituent parameters (electronic wave functions and group velocities) are linearly interpolatable.
In order to resample the constant energy surface, the tetrahedron cross sections are projected onto a twodimensional plane. First, the k-points that define the tetrahedron cross sections are identified. These are the points at the intersection of the constant energy surface and tetrahedron boundary under the assumption that the band energies vary linearly between adjacent vertices in the tetrahedron [points labelled c in Supplementary  Fig. 1(c)]. This results in three and four sets of k-points for triangular and quadrilateral cross sections, respectively, termed C. The first basis vector for the new coordinate system, B, is the vector normal to the plane of the cross section, namely where c 1 and c 2 are the coordinates of the first and second vertices defining the cross section. The second and third basis vectors are defined as The reciprocal space coordinates defining the cross section are transformed onto the new basis through In the new coordinate system, the first component of all coordinates will be the same, as all vertices lie on a plane. The last two components of the coordinates define a two-dimensional (2D) projection of the cross section which can be resampled through numerical quadrature schemes [ Supplementary Fig. 1(d)]. In the present work, we employ degree 50 Xiao-Gimbutas (containing 453 sample points, [29]) or Festa-Sommariva quadratures (454 points, [30]) for resampling triangular and quadrilateral tetrahedron cross-sections, respectively. Resampling, including generating sample points and integration weights w res i , is performed using the quadpy software [31]. The set of sample points are transformed back into the original coordinate system through The contribution of each tetrahedron to the constant energy surface is weighted by a geometric factor that accounts for the tetrahedron's shape in four dimensional space (reciprocal coordinates and energy space) [28]. Using the triple r i contragradient to vertices of the tetrahedron k i r i k i = δ ij , where the k-points have been numbered according to increasing energy, i.e., ε k1 < ε k2 < ε k3 < ε k4 , the tetrahedron weight is given by [28] We stress that this weight is distinct from the integration weights defined by Blöchl et al. [21] in which the contragradient cancels when averaging over all adjacent tetrahedra. The final integration weights w i for the sample k-point coordinates of each cross section are scaled by the tetrahedron weight to give w i = w res i · w tet .
When evaluating the density of states and the spectral conductivity in Supplementary Eq. (11), we employ the traditional approach to the lineartetrahedron method described by Blöchl et al. [21]. Specifically, we use the energy-dependent integration weights as described in Supplementary Ref. [32] and elsewhere. Unlike the partial decay rates τ nk→mk+q −1 , the final lifetimes τ nk vary smoothly across the Brillouin zone. Accordingly, use of the linear-tetrahedron method can significantly improve the convergence of transport properties without issue.

Optimization of scattering calculations
Under typically achievable carrier concentrations (10 16 to 10 21 cm 2 /Vs) the Fermi level will sit close to either the conduction or valence band edge. Accordingly, only kpoints that lie within a few hundred meV of the band edge will contribute to electronic transport. It is therefore unnecessary to compute the electron lifetimes for all k-points in the band structure, as most will have no impact on transport properties. From the generalized transport coefficients L in Eq. (12), it can be seen that each k-point's contribution to the transport properties is scaled by a factor (ε nk − ε F ) n −∂f 0 nk /∂ε nk , which depends entirely on the energy of the state. Accordingly, we have designed a procedure to assess which energy range is important for transport, illustrated in Supplementary  Fig. 2(a). We begin by denoting the "moment-coefficient weight" as where the indices n = 0, 1, 2, correspond to the moments of L n required to compute conductivity, Seebeck coefficient, and the electronic component of thermal conductivity, respectively. This is weighted by the spectral conductivity Σ crt under the assumption of a constant relaxation time [i.e., Supplementary Eq. (11) with τ = 1] to give Finally, we compute the normalized cumulative integral of the weights according to We can then define a tuneable parameter λ than controls the minimum and maximum energy ranges within which Convergence of electronic transport properties p as a function of λ at 300 K for GaAs, Si, SnSe, and CuAlO2. Absolute percentage difference from converged value |(p − p λ 0 )/p λ 0 | given for conductivity (p = σ, orange), Seebeck coefficient (S, teal), and electronic contribution to the thermal conductivity (κ, pink), respectively. p λ 0 corresponds to the value of the transport properties at λ = 0 -i.e., the scattering rates for all k-points are calculated explicitly. Convergence within 1 % is highlighted by a dashed gray line.
to calculate the scattering rates. Namely, where λ can vary between 0 (in which case ε min n and ε max n will be the minimum and maximum energies in the band structure) and 1 (where ε min n and ε max n will be the same value). A value of λ = 0.1, indicates that 90 % of the integrated w Σ crt n will be included in the energy range. Alternatively put, a value of λ = 0.1 results in ε min n and ε max taking the energies where w cum n = 0.05 and 0.95, respectively. The final energy range is given by ε min = min({ε min n : n = 0, 1, 2}) and ε max = max({ε max n : n = 0, 1, 2}). The scattering rate is only calculated for states where ε min ≤ ε nk ≤ ε max , with the scattering rates of the remaining states set to the average value of the rates that have been calculated explicitly. By setting λ to an appropriate value, the scattering rates for k-points outside the energy range will not impact the transport properties.
To demonstrate the impact of λ and determine reasonable values to use in our calculations, we have investigated the convergence of the transport properties for GaAs, Si, SnSe, and CuAlO 2 at 300 K [Supplementary Fig. 2(b)]. The conductivity, Seebeck coefficient, and electronic contribution to the thermal conductivity of all materials are converged to within than 1 % by λ = 0.02. In most cases, the Seebeck coefficient converges the fastest, most likely due to its weaker dependence on the scattering rate. The electronic contribution to the thermal conductivity is the slowest property to converge, as expected from its reliance on a broader momentum coefficient weight. If only the conductivity or Seebeck coefficient are of interest, a much larger value of λ can be used. For example, using a λ of 0.1 converges these properties to within 1 %. In our calculations, we employ a λ of 0.05 which offers a reasonable trade-off between speed and convergence. This property is controlled in our software implementation through the fd_tol parameter.

Software implementation
An open-source implementation of the formalism, used to perform all calculations in this work, is released as a package called amset [33]. amset is freely available under a modified Berkeley Software Distribution (BSD) license. The current version is developed and maintained using Git and is accessible at https:// hackingmaterials.lbl.gov/amset. The code can be run on both high-performance computing clusters or personal computers. amset is implemented in Python 3 and relies on several open-source libraries including pymatgen [34] for parsing vasp calculation outputs, BoltzTraP2 [20,35] for Fourier interpolation of electronic eigenvalues and group velocities, spglib [36] for symmetry analysis, quadpy [31] for numerical integration, and matplotlib [37] for plotting. The NumPy [38] and SciPy [39] libraries are used extensively to minimize the cost of expensive matrix operations. All-electron wave function coefficients are generated from the pseudo-wave functions using the MomentumMatrix functionality of the pawpyseed package [40].
amset can be used through either the the commandline or a Python application programming interface (API). A typical workflow, showing computational inputs and outputs, is illustrated in Supplementary Fig. (3). The primary inputs are vasprun.xml and WAVECAR vasp output files, calculated on a uniform k-point mesh. Additional settings, such as the materials parameters used to calculate scattering, the doping concentrations and temperatures to consider, and accuracy settings such as fd_tol, can be specified in a separate file or as commandline arguments. Information on all the available settings is provided on the amset website. After obtaining the first principles inputs, two pre-processing steps are required. Firstly, the all-electron wave function coefficients must be extracted from the vasp WAVECAR file using the wave tool. Secondly, the "effective-phononfrequency" should be calculated from phonon frequencies and eigenvectors, and the Born effective charges using the phonon-frequency tool. This process is described in more detail in Section I C 1. Scattering rates and transport properties are computed using the run command. The primary output is the transport file, which by default contains the calculated mobility, Seebeck coefficient, and electronic contribution to the thermal conductivity in the JavaScript Object Notation (JSON) format. The scattering rates, and interpolated eigenvalues and group velocities can be written to the mesh file with the Hierarchical Data Format version 5 (HDF5) format [41] using the write_mesh option. Finally, the plot command can be used to plot transport properties, lifetimes, and electron linewidths from the transport and mesh files. The sumo package is used for plotting band structures [42].

Timing analysis
A primary goal of the present approach is to be amenable to high-throughput computational workflows. To investigate the computational requirements of the amset package, we have illustrated the time taken to calculate the scattering rates of several of the test materials in Supplementary Fig. 4(a). All calculations were performed on a MacBook Pro with a quad core 2.9 GHz Intel Core i7 processor. The maximum time taken was 42 min for GaN, with most of the remaining materials completed in under 20 min. To understand which portions of the code are the most computationally demanding, we have broken down the results into the time taken to: (i) perform Fourier interpolation of electronic eigenvalues, (ii) compute the density of states through the tetrahedron method, (iii) obtain the scattering rates, (iv) calculate transport properties, and (v) write the output data to disk. We note, the benchmarks were performed with the write_mesh option enabled, so the output includes the scattering rates and interpolated band structure. In general, writing the output data takes the least amount of time relative to the other functions of the code. The breakdown for the rest of the computational steps depends strongly on the material and run time parameters, with most of the time spent calculating the scattering rates or transport properties.
To understand the scaling performance of amset with interpolation density, we have investigated the correlation of runtime with number of k-points. We find there is not a simple correlation between the total number of k-points and total runtime. Instead, each function of the code shows different scaling behaviour. The interpolation routines show O(n log n) scaling (where n is the total number of k-points in the dense mesh), which is consistent with the time complexity of the fast Fourier transform algorithm. The time taken to compute scattering does not correlate well with total number of kpoints. This is primarily as we only compute the scattering rates for the k-points which fall within the energy cutoffs defined by the λ parameter (see Section I B 2). In addition, we use the symmetry of the reciprocal lattice to limit our calculations to the k-points in the irreducible Brillouin zone (denoted k ir -points). The timing of the scattering routines correlates with the number of irreducible k-points that fall within the energy cutoffs, exhibiting a O(n 1.3 ) scaling complexity. We note that,  Supplementary  Table 4 and at the carrier concentrations and temperatures specified in Supplementary Table 6. (a) The total runtime for each system, broken up into the different functions of the code. (b) Correlation between time and number (denoted by #) of k-points for the interpolation, scattering, and transport routines. k ir indicates the k-points within the irreducible Brillouin zone. The number of temperatures and carrier concentrations are denoted by # T and # n, respectively. The computational complexity, provided in big O notation relative to the x-axis, is given in grey text and highlighted by dashed grey lines.
while the scattering rate is only calculated for the irreducible k-points within the energy cutoffs, the scattering rate for each state requires integrating the partial decay rates over the full Brillouin zone and not just the irreducible part. The time taken to compute transport properties correlates to the number the number of irreducible k-points multiplied by the number of carrier concentrations and temperatures included in the calculation, with a O(n 0.9 ) scaling complexity. The primary expense when computing transport properties is generating the energy-dependent tetrahedron integration weights used to obtain the spectral conductivity.
Supplementary Table 2. Time required to obtain firstprinciples inputs given in core hours. Calculations were performed as described in the Computational Methodology. We note that the DFPT calculation listed here is performed only for a single q-point at Γ and is used to obtain the effective phonon frequency, static and high-frequency dielectric constants, and piezoelectric constants rather than the matrix elements g(k, q). Static+NSCF (non self-consistent field) refers to a single point calculation on the relatively dense DFT kpoint meshes listed in Supplementary The total time to obtain transport properties is dominated by the calculation of the first-principles inputs (materials parameters and band structure calculation). In Supplementary Table 2, we provide the full timing information (in core hours) required to calculate all materials parameters used in this work. In Fig. (1) in the main text, we compare these times against DFPT+Wannier calculations performed using quantum espresso and epw. In Supplementary Table 3 we provide the full breakdown of the DFPT+Wannier calculations, including the references from which the timing information and mobility was extracted.

Reproducing the Brooks-Herring model of impurity scattering
A primary advantage of the present approach is that it allows, for the first time, evaluation of ionized impurity scattering in anisotropic multi-band systems. Most Supplementary Table 3. Time required to obtain electron mobility using DFPT+Wannier, as implemented in quantum espresso (DFPT to obtain g(k, q) portion) and epw (Wannier interpolation and scattering portion) in core hours. References are given to the publications in which the timing information and mobility results are reported  [44,45] modern computational evaluations of impurity scattering instead employ the closed-form Brooks-Herring formula [7,8]. We will not reproduce the full derivation here but refer the reader to the excellent introduction provided in Supplementary Ref. [46]. In this approach, the scattering matrix element where n ii and Z are the concentration and charge of the charge of the impurities, s is the static dielectric constant, and β is the inverse screening length given by Supplementary Eq. (10), is analytically integrated for a single parabolic band [7,8]. Under the assumption of complete overlap between the states the nk and mk + q, the resulting energy-dependent lifetime can be written where m * d is the density of states effective mass, 0 is the vacuum permittivity, G(b) = ln(b + 1) − b/(b + 1), and b = 8m * d ε/ 2 β 2 . Further integration of the energydependent lifetime yields the well-known Brooks-Herring mobility formula To validate our implementation of ionized impurity scattering, we have generated a model parabolic electronic structure according to where ε k and v k are the energy and group velocity at wave vector k, respectively. We calculated the ionized impurity scattering rate and resulting mobility using the AMSET package and Brooks-Herring formulas, parameterized according to Z = 1, m * d = 0.2 m 0 , s = 20 0 , n ii = 1 × 10 16 cm −3 to 1 × 10 19 cm −3 , and T = 500 K. A comparison between the two approaches is presented in Supplementary Fig. (5). Close agreement is observed for  the both the mobility and carrier lifetime, indicating our approach is accurately reproducing the Brooks-Herring results.
The Brooks-Herring formula is known to lead to inaccurate results for non-parabolic band structures or systems with multiple valleys. To demonstrate this, we compare our method against Brooks-Herring on an idealized Silicon band structure, as parameterized in Supplementary Refs. [47] and [48] and using the experimental effec- tive masses according to where m * = 0.98 m 0 , m * ⊥ = 0.19 m 0 , and k 0 denotes the wave vectors of the conduction band minima. The Brooks-Herring mobility is calculated using the harmonic mean of the effective masses, namely 3/(m −1, * + 2m −1, * ⊥ ) = 0.26 m 0 . As can be seen in Supplementary Fig. (6), the Brooks-Herring mobility is considerably over estimated by almost an order of magnitude relative to the mobility computed by AMSET. This agrees well with empirical investigations into the mobility of Silicon that have noted the overestimation of the Brooks-Herring result [49].

Electron linewidths
Access to band and k-dependent lifetimes can further be used to calculate electron linewidths that are qualitatively comparable to those measured through techniques such as angle-resolved photoemission spectroscopy (ARPES) [50]. In Supplementary Fig. (7) we plot the spectral band structure of SnO 2 along a high symmetry Brillouin zone path, where the spectral function A k (ε) = π −1 n (τ −1 nk /2)/[(ε − ε nk / ) 2 + (τ −1 nk /2) 2 ] was calculated at 300 K. The spectral function provides insight into the k-dependence of the carrier lifetimes. States close to the conduction band edge at Γ exhibit long lifetimes (low energy broadening) due to the reduced phase space of available states for scattering. Between the Z and R high symmetry points, the lowest conduction band is relatively flat leading to large scattering rates and considerable broadening of the spectral function.

Computational methodology
First-principles calculations were performed using Kohn-Sham DFT [51,52] as implemented in the Vienna ab initio Simulation Package (vasp) [53][54][55]. All ab initio inputs were computed within the generalizedgradient approximation (GGA) [56] using the Perdew-Burke-Ernzerhof (PBE) exchange-correlation functional [57]. Calculations were performed in a plane-wave basis set with scalar relativistic psueodpoentials and with the interactions between core and valence electrons described using the projector augmented-wave method (PAW) [58,59]. The set-up, submission, and management of first-principles calculations was handled using the atomate workflow management software with the default parameters of version 0.8.3 [60,61]. The plane-wave energy cutoff was set to 520 eV. Structure optimization was performed using the standard pymatgen MPRelaxSet with a reciprocal k-point density of 64 k-points/Å 3 [34]. The uniform non-self-consistent calculations used as input to the scattering calculations were run with a reciprocal k-point density of 1000 k-points/Å 3 . Spin-orbit interactions were included for calculations on CH 3 NH 3 PbI 3 as they were necessary to obtain the correct band ordering at the conduction band minimum.
Piezeoelectric constants, and static and high-frequency dielectric constants were computed using density functional perturbation theory (DFPT) based on the method developed by Baroni and Resta [62] and adapted to the PAW formalism by Gajdoš et al. [63]. Elastic constants were obtained through the stress-strain approach detailed in Supplementary Ref. [64]. These calculations were automated using the piezeoelectric_constant, dielectric_constant, and elastic_constant preset workflows available in atomate [60]. The calculation outputs (permitivies, elastic constants) were extracted using the pymatgen materials science software [34].
Absolute volume deformation potentials were calculated in the manner proposed by Wei and Zunger [65]. The deformation potential describes the change in energy of the bands with applied stress. Starting from a relaxed structure, a set of distorted structures are generated as follows. The Green-Lagrange strain tensor, S αβ , has 6 independent components (S 11 , S 22 , S 33 , S 12 , S 13 , S 23 ), each of which is applied independently to deform the structure. For each deformed structure, the deformation potential D nk,αβ at band, n, and k-point, k, is calculated as where ε 0 and ε s are the electronic energies of the bulk and strained structures, respectively, and ∆ζ is a correction that accounts for the shift in the electrostatic reference energy between the two calculations. In this work, ∆ζ, is calculated as ∆ζ = ζ s − ζ 0 , where ζ 0 and ζ s are the energies of the deepest core states of the bulk and strained structures, respectively [65]. We note that, in practice, even the reference energy levels can shift upon strain, leading to a small degree of error in the deformation potentials for non-covalent crystals [66,67]. By repeating this procedure for each of the 6 independent strain components, all elements of the 3 × 3 deformation potential tensor (at each band and k-point) can be calculated. To alleviate issues of numerical noise, we average the the deformation potentials for both contraction (−0.5 % strain) and expansion (+0.5 % strain) of the lattice. Furthermore, to reduce the computational requirements, the 12 independent calculations (comprising 6 independent strain components × 2 displacement strain magnitudes) are reduced using the symmetry of the bulk structure. For highly symmetric structures such as Si and GaAs, this means only 3 deformation calculations are required. We have released an open source tool deform as part of the amset package that automates the set-up of deformation calculations and the extraction of deformation potentials from vasp calculation outputs.
The "effective phonon frequency" used in the calculation of polar-optical phonon scattering was determined from the phonon frequencies ω qν (where ν is a phonon branch and q is a phonon wave vector) and eigenvectors e κν (q) (where κ is an atom in the unit cell). In order to capture scattering from the full phonon band structure in a single phonon frequency, each phonon mode is weighted by the dipole moment it produces according to where Z * κ is the Born effective charge. This naturally suppresses the contributions from transverse-optical and acoustic modes in the same manner as the more general formalism for computing Frölich based electron-phonon coupling [68,69]. The weight is calculated only for Γpoint phonon frequencies and averaged over the unit sphere scaled by 0.01 to capture both the polar divergence at q → 0 and any anisotropy in the dipole moments. The effective phonon frequency is calculated as the weighted sum over all Γ-point phonon modes according to We have released an open source tool phonon-frequency as part of the amset package that automates this computation from vasp calculation outputs.

Computational input settings
In this section we detail the computational input parameters used to calculate the materials properties in the vasp DFT code. We stress that in addition to the settings listed below, the accuracy of the calculated properties can depend on the exchange-correlation functional, the presence of spin-orbit interactions, and the treatment of highly correlated electrons, which must be assessed on a per material basis. To obtain accurate results, the crystal structure should first be relaxed using "tight" calculation settings including high force and energy convergence criteria. An example of the vasp settings required is: Our approach requires a vasprun.xml file from a "dense" uniform band structure calculation. Typically a k-point mesh density at least twice that needed to converge the total energy will be necessary to converge transport properties. Note, this refers to the initial DFT mesh before Fourier interpolation. In order to obtain accurate band gaps often a hybrid DFT functional such as HSE06 is required.
The wave function coefficients are required to calculate wave function overlaps. This requires the WAVECAR file to be written by vasp (achieved by setting LWAVE = True). The wave function coefficients can be extracted using the amset wave command. Coefficients are stored in the wavefunction.h5 file. An example of the vasp settings required to generate the wave function information is: amset includes a tool to assist with the calculation of the deformation potentials. The initial input is a "tight" optimised structure as described above. Deformed structures are generated using the amset deform create command, which will generate a set of deformed POSCARs each corresponding to a component of the strain tensor. Symmetry is automatically used to reduce the number of calculations needed. A single point calculation (no relaxation, i.e., NSW = 0) should be performed for each deformed POSCAR as well as the bulk (undeformed) structure. An example of the vasp settings required for the single point calculations is: ADDGRID = True EDIFF = 1E-8 PREC = Accurate NSW = 1 ICORELEVEL = 1 # write core levels to OUTCAR The deformation potentials can be calculated using the amset deform read command. This requires the paths to the bulk and deformation calculations as inputs. The bulk folder should be specified first, followed by the deformation folders. For example, amset deform read bulk def-1 def-2 def-3 This will write the deformation potentials to a deformation.h5 file in the current directory.
Static and high-frequency dielectric constants, piezoelectric constants, and the "effective polar phonon frequency" can be obtained using density functional perturbation theory. It is very important to first relax the structure using tight convergence setting. An example of the vasp settings required to perform DFPT is: Note, DFPT cannot be used with hybrid exchangecorrelation functionals. Instead, the LCALCEPS flag should be used in combination with IBRION = 6. The dielectric constants and polar phonon frequency can be extracted from the vasp outputs using the command amset phonon-frequency. This command should be run in a folder containing the vasprun.xml file from a DFPT calculation.

Materials parameters
All materials parameters were computed from firstprinciples in the manner described in the Computational Methodology. A summary of the materials parameters used to compute carrier scattering rates is provided in Supplementary Table 4. We have additionally employed the rigid scissor approximation such that band gaps match those calculated using the hybrid HSE06 exchange-correlation functional. Supplementary Table 5 gives the band gaps and k-point meshes employed in our calculations. Furthermore, we report the range of temperatures and carrier concentrations at which mobility and Seebeck coefficients are computed in Supplementary Tables 6 and 7.

Experimental data
In the main text, we calculate the mobility and Seebeck coefficient of 17 semiconductors and compare our results to experimental measurements. Our set of test materials spans a range of chemistries and doping-polarities and contains both isotropic and anisotropic materials. The set includes: (i) conventional semiconductors, Si, GaAs, GaN, GaP, InP, ZnS, ZnSe, CdS, CdSe, and SiC; (ii) the thermoelectric candidate SnSe; (iv) photovoltaic absorbers CH 3 NH 3 PbI 3 , PbS, and CdTe; and (iii) transparent conductors, SnO 2 , ZnO, and CuAlO 2 . The reference samples are of the highest purity and crystallinity in order to minimize the mesoscopic effects of grain boundary scattering and crystallographic one-dimensional and two-dimensional defects (e.g., line dislocations, edge dislocations, and stacking faults). We favor bulk crystals over thin films (which can exhibit surface effects that impact carrier transport, e.g., strain, oxidation, offstoichiometries, and surface dipole moments), however, in some cases we use epitaxial single crystal films. We also favor undoped or dilutely doped crystals (to less than 0.5 % at.) to avoid the formation of secondary crystal phases and degenerate doping. Lastly, we favor studies that look at a wide range of carrier concentrations and/or temperatures (greater than 300K). In all cases, experimental mobility is measured via the DC Hall effect. A summary of the reference data used in the comparisons against carrier mobility and Seebeck coefficient are provided in Supplementary Tables 6 and 7.
We note that for SnSe, significant anisotropy is seen in the measured carrier concentrations for the a and b directions [122]. We believe this discrepancy is an artefact of the Hall effect measurements from which the concentrations were calculated. The authors assumed a Hall factor r H of unity when extracting the carrier concentrations when in practice the Hall factor will depend on the band structure, temperature, and doping, and will likely be direction dependent. Although, the carrier concentration of a sample should be independent of the orientation of the sample, the measured carrier concentrations vary by up to three times suggesting that r H actually deviates from unity significantly. This anomalous behaviour was also recently highlighted by Ma et al. [123] who calculated the mobility of SnSe using DFPT+Wannier as implemented in EPW. To compare directly to their EPW results, we have used the same carrier concentrations in our calculations. We further note that use of a constant Hall factor may also explain the large anisotropy of the carrier mobilities between the b and c directions [122] which is not reproduced in DFPT+Wannier calculations.
Supplementary Table 4. Materials parameters used to compute scatterings rates. C is the elastic tensor in Voigt notation, with the unit GPa. s and ∞ are the static and high-frequency dielectric constants in 0. D vb and D cb are the absolute deformation potentials at the valence and conduction band edge, respectively. d is the dimensionless piezoelectric coefficient. ωpo is the effective polar phonon frequency given in THz. For all tensor properties, components that are not explicitly listed are zero    3. Mobility calculated using the HSE06 functional  Figure 13. Computed scattering rates compared against DFPT+Wannier calculations [3,99,113,123]. Results calculated at 300 K using the the lowest carrier concentrations for each material given in Supplementary Table. 6 .