Introduction

A conventional theoretical approach to materials exploration is to search for Hamiltonians that produce physical properties of interest (Fig. 1a). This is not only tedious but also nontrivial, since the parameter space to be explored is usually unknown a priori. Therefore, most of the research to date has been conducted for known Hamiltonians and their extensions, which makes it difficult to reach qualitatively new models and principles. In contrast, the inverse approach, which finds appropriate Hamiltonians directly from the desired properties, is not only efficient but also has the potential to unveil qualitatively new physics (Fig. 1a). Many proposals have been made for the inverse approach1,2,3,4,5,6,7,8,9,10,11,12,13. Since the early stage, perturbation theory2,7,14, potential interpolation15,16, and the eigenstate-to-Hamiltonian construction17 have been employed, but their applications were limited to objective functions expressed in terms of energy. In recent years, machine learning-based methods, such as generative models using neural networks1,18,19, Bayesian optimization using Gaussian processes5,20,21, and genetic algorithms22,23, have been developed, but they require large amounts of data and computational resources for training. In particular, Bayesian optimization and genetic algorithms do not necessarily improve the objective function after each parameter update, and generative models can fail in regions of the parameter space where data are insufficient. For these reasons, previous research has been limited to the quantitative estimation of a few parameters within known Hamiltonians. Thus, it is still challenging to explore new models and principles by taking full advantage of the inverse problem.

Fig. 1: Inverse design of Hamiltonian.

a In the conventional approach, the Hamiltonian is first constructed based on phenomenology or first principles, and then, the optimal parameters of the Hamiltonian are explored through physical properties calculated from the Hamiltonian. In contrast, in the inverse approach, the desired physical properties are prepared first, and then, the Hamiltonian to realize them is obtained directly. b Flowchart proposed in the present study to solve the inverse problem by using automatic differentiation. First, we prepare a Hamiltonian \({{{{{{{\mathcal{H}}}}}}}}({{{{{{{\mathbf{\theta }}}}}}}})\) that depends on parameters θ. Next, we calculate the objective function L(θ), which represents the desired physical properties. By optimizing θ to minimize L(θ), we obtain a Hamiltonian \({{{{{{{\mathcal{H}}}}}}}}({{{{{{{{\mathbf{\theta }}}}}}}}}_{{{{{{{{\rm{opt}}}}}}}}})\) that satisfies the desired physical properties, where θopt are the parameters after the optimization.

To address these issues, we develop a framework that can automatically design a Hamiltonian with desired physical properties by using automatic differentiation. Automatic differentiation enables us to compute the analytic derivatives of a function by systematically applying the chain rule; it is widely used in deep learning for backpropagation24, even for models with over a trillion parameters25. In recent years, automatic differentiation has been applied to physics, such as computing physical quantities represented by derivatives26,27, calculating conditions for solar cells28, applications to quantum gate control29,30,31, non-equilibrium steady state32, numerical renormalization group33, Hartree–Fock calculation34,35, molecular dynamics36, and density functional theory37,38. However, to the best of our knowledge, its application to the inverse design of a Hamiltonian has not been fully explored thus far.

In this article, we first describe the framework and its advantages over previous methods. Then, we demonstrate a proof of concept of this framework by applying it to two problems: the anomalous Hall effect (AHE) and the photovoltaic effect (PVE). We show that our framework can automatically construct the Haldane model with the quantum AHE on the honeycomb lattice. Moreover, by applying the framework to a model on the triangular lattice, we find a Hamiltonian that exhibits a six times larger AHE than that of the Haldane model. For the PVE, we are able to automatically generate a spin-charge-coupled Hamiltonian with electrons moving over an umbrella-shaped spin configuration, which can produce a photocurrent of about 700 A m−2. Our framework is applicable to a wide range of systems and physical properties, including first-principles Hamiltonians, strongly correlated electron systems, and interacting bosonic systems.

Results and discussion

Framework

The flowchart of our framework is shown in Fig. 1b. First, we prepare a Hamiltonian \(\mathcal{H}(\boldsymbol{\theta})\) with a set of parameters θ. We also define the objective function L(θ) to be minimized for achieving the desired properties; for instance, if the objective is to maximize the expectation value of a physical quantity P, we can take L(θ) = −〈P(θ)〉. Next, we compute the derivative \(\frac{\partial L}{\partial \boldsymbol{\theta}}\) by automatic differentiation. Then, we update the Hamiltonian by changing the parameters θ according to \(\frac{\partial L}{\partial \boldsymbol{\theta}}\), as is commonly done in machine learning. By repeating this procedure until θ converges, we end up with the Hamiltonian \(\mathcal{H}(\boldsymbol{\theta}_{\rm opt})\) that optimizes the desired properties, where θopt are the parameters after the convergence.
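
The loop in Fig. 1b can be sketched in JAX as follows. This is only a minimal illustration under stated assumptions, not the calculation used in this work: the two-level Hamiltonian, the choice P = σx, the plain gradient-descent update, and the names `hamiltonian`, `loss`, and `optimize` are all hypothetical.

```python
import jax
import jax.numpy as jnp

SIGMA_X = jnp.array([[0.0, 1.0], [1.0, 0.0]])
SIGMA_Z = jnp.array([[1.0, 0.0], [0.0, -1.0]])

def hamiltonian(theta):
    # Toy one-parameter Hamiltonian H(theta) (hypothetical example).
    return jnp.cos(theta) * SIGMA_Z + jnp.sin(theta) * SIGMA_X

def loss(theta):
    # L(theta) = -<P(theta)> in the ground state, with P = sigma_x.
    _, states = jnp.linalg.eigh(hamiltonian(theta))
    ground = states[:, 0]
    return -(ground @ SIGMA_X @ ground)

# dL/dtheta is obtained by automatic differentiation through eigh.
grad_loss = jax.jit(jax.grad(loss))

def optimize(theta0, learning_rate=0.1, n_steps=500):
    # Repeat "differentiate, then update" until theta stops changing.
    theta = theta0
    for _ in range(n_steps):
        theta = theta - learning_rate * grad_loss(theta)
    return theta

theta_opt = optimize(jnp.array(0.3))
final_loss = loss(theta_opt)
```

For this toy model the ground-state expectation value is 〈σx〉 = −sin θ, so the loop should drive θ towards −π/2, where 〈P〉 = 1.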

Our framework has the following advantages in comparison with the existing methods1,2,5,7,14,15,16,17,18,19,20,21,22,23: (i) It does not require training, hence, there is no need to collect data or consume computational resources on the training. (ii) It performs the optimization by using the analytical derivatives, which can achieve higher accuracy than the approximations based on neural networks even for large parameter space. (iii) It is applicable to a wide range of objective functions, unlike the perturbation theory. Therefore, our framework is able to deal with a large number of parameters in the Hamiltonian, which may lead to the findings of Hamiltonians that have not been reported thus far.

Automatic construction of the Haldane model showing spontaneous quantum AHE

First, we demonstrate that our framework can automatically find the Haldane model with a spontaneous quantum AHE39. We consider a tight-binding model on a honeycomb lattice with two sublattices, whose Hamiltonian reads

$${{{{{{{\mathcal{H}}}}}}}}=\mathop{\sum}\limits_{\begin{array}{c}i,{a}_{i}\in \{A,B\}\end{array}}{M}^{{a}_{i}}{c}_{i}^{{{{\dagger}}} }{c}_{i}+\mathop{\sum}\limits_{\langle i,j\rangle }{t}_{1}{c}_{i}^{{{{\dagger}}} }{c}_{j}+\mathop{\sum}\limits_{\langle \langle i,j\rangle \rangle }{t}_{2}^{{d}_{ij}}{c}_{i}^{{{{\dagger}}} }{c}_{j},$$
(1)

where \(c_{i}^{\dagger}\) (\(c_{i}\)) is the creation (annihilation) operator of a spinless fermion at site i; the first term describes an on-site staggered potential with real coefficients \(M^{a_{i}}\) (\(a_{i}\) = A or B denotes the sublattice), and the second and third terms represent the hopping of fermions to nearest- and second-neighbor sites, respectively. Here, we set \(t_{1}=1\) as an energy unit and parametrize \(t_{2}^{d_{ij}}\) as \(t_{2}^{d_{ij}}=\sigma(r^{d_{ij}})\exp({\rm i}\phi^{d_{ij}})\) with real variables \(r^{d_{ij}}\) and \(\phi^{d_{ij}}\), where \(\sigma(x)=1/(1+e^{-x})\) is the sigmoid function, introduced to keep the absolute value of \(t_{2}^{d_{ij}}\) bounded, and \(d_{ij}\) denotes the direction of the second-neighbor hopping, \(d_{ij}\in\{A_{1},A_{2},A_{3},B_{1},B_{2},B_{3}\}\) (see Fig. 2a). Thus, the model includes 14 parameters in total (2 on-site potentials, 6 amplitudes, and 6 phases), represented by \(\boldsymbol{\theta}=\{M^{A},M^{B},\{r^{d_{ij}}\},\{\phi^{d_{ij}}\}\}\). The Haldane model is given by taking \(M^{A}=+M\), \(M^{B}=-M\), and \(t_{2}^{d_{ij}}=t_{2}\exp({\rm i}\phi)\) regardless of \(d_{ij}\). The phase diagram is shown in Fig. 2b, which has two topologically nontrivial phases with a spontaneous quantum AHE corresponding to the nonzero Chern numbers C = ±1.
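
As an illustration of this parametrization, the 14 parameters can be stored and mapped to the complex second-neighbor hoppings as in the sketch below. The dictionary layout and the names `init_params` and `second_neighbor_hoppings` are assumptions made for the example; the initial ranges follow the "Methods" section.

```python
import jax
import jax.numpy as jnp

# Directions of the second-neighbor hopping, in the order used for the arrays.
DIRECTIONS = ("A1", "A2", "A3", "B1", "B2", "B3")

def init_params(key):
    # Random initial values: M in (-1, 1), r in (0, 1), phi in (-pi, pi).
    k_m, k_r, k_phi = jax.random.split(key, 3)
    n_dir = len(DIRECTIONS)
    return {
        "M": jax.random.uniform(k_m, (2,), minval=-1.0, maxval=1.0),
        "r": jax.random.uniform(k_r, (n_dir,), minval=0.0, maxval=1.0),
        "phi": jax.random.uniform(k_phi, (n_dir,), minval=-jnp.pi, maxval=jnp.pi),
    }

def second_neighbor_hoppings(params):
    # t2 = sigmoid(r) * exp(i * phi); the sigmoid keeps |t2| between 0 and 1.
    return jax.nn.sigmoid(params["r"]) * jnp.exp(1j * params["phi"])

params = init_params(jax.random.PRNGKey(0))
t2 = second_neighbor_hoppings(params)
n_params = sum(v.size for v in params.values())
```

Because the optimizer acts on the unconstrained variables r and ϕ, no explicit constraint on the hopping amplitude is needed during the updates.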

Fig. 2: Automatic construction of the Haldane model.

a A tight-binding model on a honeycomb lattice in Eq. (1). There are 14 parameters including the on-site potentials \(M^{a_{i}}\) and the amplitudes and phases of the second-neighbor hoppings \(t_{2}^{d_{ij}}\), where \(a_{i}\in\{A,B\}\) is the index of the sublattice and \(d_{ij}\in\{A_{1},A_{2},A_{3},B_{1},B_{2},B_{3}\}\) is the direction of the hopping. The nearest-neighbor hopping \(t_{1}\) is fixed to 1. b Phase diagram of the Haldane model, where \(M^{A}=+M\), \(M^{B}=-M\), \(t_{1}=1\), and \(t_{2}^{d_{ij}}=t_{2}\exp({\rm i}\phi)\). There are two topologically nontrivial phases with nonzero Chern numbers C = ±1. The yellow star represents where our framework reaches after the convergence. c Schedule of \(\log \beta\), where β is the inverse temperature. d, e Changes of the Hall conductivity σxy (d) and the Chern numbers C for the two bands (e) through the optimization process. The inset in d shows the change of the band gap. f–h Changes of the parameters: \(M^{a_{i}}\) (f), \(|t_{2}^{d_{ij}}|\) (g), and \(\phi^{d_{ij}}\) (h), where \(\phi^{d_{ij}}\) is the phase of \(t_{2}^{d_{ij}}\).

With this setup of \(\mathcal{H}(\boldsymbol{\theta})\), we try to obtain a Hamiltonian that maximizes the AHE by the framework in Fig. 1b. For this aim, we take the objective function as L(θ) = −σxy(θ), where σxy is the Hall conductivity. Details of the calculations are described in the “Methods” section. We find that σxy increases monotonically through the optimization, as shown in Fig. 2d. Note that we introduce temperature and control it as shown in Fig. 2c (β is the inverse temperature) to prevent \(\frac{\partial L}{\partial \boldsymbol{\theta}}\) from vanishing due to the quantization of σxy. In contrast to the continuous change of σxy, the Chern numbers of the two bands, which are separated by the band gap shown in the inset of Fig. 2d, converge quickly to C = ±1 in the very early stage of the optimization, as shown in Fig. 2e. The evolution of each parameter is plotted in Figs. 2f–h. We find that both \(M^{A}\) and \(M^{B}\) converge to zero, and \(|t_{2}^{d_{ij}}|\to 1\) and \(\phi^{d_{ij}}\to \pi/2\) for all \(d_{ij}\). These values correspond to the center of the topological phase with C = 1 in the Haldane model, indicated by the star in Fig. 2b. We confirm that different initial conditions converge to the same state (see Supplementary Note 1). Thus, our framework automatically constructs the Haldane model with a spontaneous quantum AHE under the condition of maximizing σxy. The optimal state is always at the center of the C = 1 phase because of the introduction of temperature; at nonzero temperature, σxy becomes largest at the center, where the band gap becomes largest in the topological phase. We note that the value of σxy in Fig. 2d is considerably smaller than the quantized value +1, which is also due to the finite temperature.
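
The role of temperature can be illustrated with the Fermi function alone. In the sketch below (a toy illustration, not the schedule of Fig. 2c; the energy 0.5 and the two values of β are arbitrary choices), the derivative of the occupation with respect to a band energy is sizable at high temperature but essentially zero at low temperature, which is why gradients of a nearly quantized σxy carry little information.

```python
import jax
import jax.numpy as jnp

def fermi(energy, beta, mu=0.0):
    # Fermi distribution 1 / (exp(beta * (E - mu)) + 1), written via the
    # sigmoid for numerical stability.
    return jax.nn.sigmoid(-beta * (energy - mu))

# Derivative of the occupation with respect to the band energy.
d_fermi = jax.grad(fermi, argnums=0)

energy = jnp.array(0.5)                       # band energy above mu (arbitrary)
grad_hot = d_fermi(energy, jnp.array(1.0))    # beta = 1: smooth occupation
grad_cold = d_fermi(energy, jnp.array(100.0)) # beta = 100: nearly a step function
```

Starting at small β and increasing it gradually therefore keeps the gradient informative early in the optimization while approaching the low-temperature limit at the end.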

Finding a Hamiltonian with large quantum AHE on a triangular lattice

To demonstrate that our framework can find more complex models automatically, we apply it to a triangular lattice assuming a four-sublattice unit cell (Fig. 3a). The Hamiltonian reads

$${{{{{{{\mathcal{H}}}}}}}}=\mathop{\sum}\limits_{\langle i,j\rangle }{t}_{1}^{ij}{c}_{i}^{{{{\dagger}}} }{c}_{j}+\mathop{\sum}\limits_{\langle \langle i,j\rangle \rangle }{t}_{2}^{ij}{c}_{i}^{{{{\dagger}}} }{c}_{j}+\mathop{\sum}\limits_{\langle \langle \langle i,j\rangle \rangle \rangle }{t}_{3}^{ij}{c}_{i}^{{{{\dagger}}} }{c}_{j}.$$
(2)

We take \({t}_{1}^{ij}=\exp ({{{{{{{\rm{i}}}}}}}}{\phi }_{1}^{ij})\) and \({t}_{m}^{ij}=\sigma ({r}_{m})\exp ({{{{{{{\rm{i}}}}}}}}{\phi }_{m}^{ij})\) for m = 2 and 3 (see the arrows in Fig. 3a). Thus, the model includes 38 parameters in total represented by \({{{{{{{\mathbf{\theta }}}}}}}}=\{{r}_{2},{r}_{3},\{{\phi }_{1}^{ij}\},\{{\phi }_{2}^{ij}\},\{{\phi }_{3}^{ij}\}\}\). As in the previous calculation, we take L(θ) = −σxy(θ) to maximize the AHE. We optimize the parameters with a schedule of temperature shown in Fig. 3b. At each optimization step, the fermion density is fixed at half filling by tuning the chemical potential using the bisection method.
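
Fixing the density by bisection can be sketched as follows. This is a minimal sketch under assumptions: the function names, the number of iterations, and the example spectrum are hypothetical, and the sketch does not address how gradients are handled through the chemical potential.

```python
import jax
import jax.numpy as jnp

def fermi(energies, mu, beta):
    # Fermi distribution, written via the sigmoid for numerical stability.
    return jax.nn.sigmoid(-beta * (energies - mu))

def chemical_potential(energies, beta, filling=0.5, n_iter=60):
    # The density increases monotonically with mu, so bisection on
    # [min(E), max(E)] converges to the mu that gives the target filling.
    lo, hi = jnp.min(energies), jnp.max(energies)
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        density = jnp.mean(fermi(energies, mid, beta))
        too_full = density > filling
        hi = jnp.where(too_full, mid, hi)
        lo = jnp.where(too_full, lo, mid)
    return 0.5 * (lo + hi)

# Example: a particle-hole symmetric spectrum, for which mu = 0 at half filling.
energies = jnp.array([-2.0, -1.0, 1.0, 2.0])
mu = chemical_potential(energies, beta=5.0)
density = jnp.mean(fermi(energies, mu, 5.0))
```

In an actual calculation, `energies` would contain the band energies at all k points of the current Hamiltonian.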

Fig. 3: Automatic construction of a Hamiltonian showing a six-times larger quantum anomalous Hall effect than the Haldane model.

a A tight-binding model on a triangular lattice with 38 parameters, including the nearest-neighbor (t1), the second-neighbor (t2) and the third-neighbor (t3) hoppings. The shades denote four-sublattice unit cells. The color of the arrows represents the optimum phase of each hopping, \({\phi }_{m}^{ij}\), after the convergence, according to the inset below. b Schedule of \(\log \beta\), where β is the inverse temperature. c, d Changes of the Hall conductivity σxy (c) and the Chern number C for four bands (d). e The band structure after the convergence plotted with the Berry curvature Ω(k) at each wavenumber k = (kx, ky). f and g Fictitious magnetic fluxes defined by the sum of phases along the counter-clockwise direction as \({{{\Phi }}}_{m}={\sum }_{ij}{\phi }_{m}^{ij}\) on the smallest triangles by the nearest-neighbor hopping t1 (f) and larger ones by the second-neighbor hopping t2 (g), which are indicated by the same color code as the inset of (a).

We find that the Chern numbers for four bands converge to C = 5, 1, −3, and −3 from the lower band, as shown in Fig. 3d. This indicates that σxy reaches 6 at half filling, which is six times larger than that in the Haldane model, although σxy in Fig. 3c is much smaller due to the finite temperature similar to the previous case. The band structure is shown in Fig. 3e with the Berry curvature Ω (see the “Methods” section). Note that the system recovers (approximately) threefold rotational symmetry after the convergence (see Supplementary Note 2). Ω of the lowest energy band is positive at all wave numbers, whose sum gives the largest C = 5, while the other bands include negative contributions. This indicates that our framework tries to maximize C for the lowest energy band. We note that the same conclusion is obtained for many other initial conditions, while some cases converge to C = 3, 3, −1, and −5 from the lower band, which gives the same value of σxy = 6. The reason why the solution in Fig. 3 is rather preferred is the finite temperature introduced in the optimization process, for the same reason as in the honeycomb lattice model for which the center of the topological phase was obtained (see Supplementary Note 2).

Let us discuss the optimized parameters. We find that both \(t_{2}\) and \(t_{3}\) converge to 1, while the phases take the various values shown by colors in Fig. 3a. We show, however, that their sums along closed loops in the counter-clockwise direction, \(\Phi_{m}=\sum \phi_{m}^{ij}\), representing the fictitious magnetic fluxes, take some regular values: \(\Phi_{1}\simeq 7\pi/4\) for the smallest triangles composed of \(t_{1}\) (Fig. 3f), and \(\Phi_{2}\) takes 0.91π and 1.59π for larger triangles of \(t_{2}\) facing right and left, respectively (Fig. 3g), while \(\Phi_{3}\) is always π (\(\phi_{3}^{ij}\) is either 0 or π). Although \(\phi_{m}^{ij}\) take different values for different initial conditions, \(\Phi_{m}\) converges to the same values. These results indicate that our framework automatically finds a model whose complex hoppings realize spontaneous fictitious magnetic fluxes to maximize σxy, which is hard to obtain by intuition. Based on the results, we can also refine the Hamiltonian by taking more regular values of the phases (multiples of π/4) (see Supplementary Note 2).
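
The statement that different phases can give the same flux can be checked with a few lines. In this sketch the two sets of phases are made-up values chosen for illustration, and reducing the sum modulo 2π is an assumption about how the flux is reported.

```python
import jax.numpy as jnp

def flux(phases):
    # Sum of hopping phases around a closed loop (counter-clockwise),
    # reduced to the interval [0, 2*pi).
    return jnp.mod(jnp.sum(phases), 2.0 * jnp.pi)

# Two hypothetical phase assignments on one triangle that differ bond by bond.
phases_a = jnp.array([3.0 * jnp.pi / 4, jnp.pi / 2, jnp.pi / 2])
phases_b = jnp.array([jnp.pi, -jnp.pi / 4, jnp.pi])

flux_a = flux(phases_a)
flux_b = flux(phases_b)
```

Both assignments give Φ = 7π/4, which reflects the fact that only the loop sums, and not the individual bond phases, are gauge invariant.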

Maximizing photovoltaic current generation in a spin-charge-coupled system

Finally, we apply our framework to optimize the PVE in a bulk system with broken spatial inversion symmetry40,41,42,43,44. An example is the shift current, which is understood as a shift in the real space of electron wave functions excited by light. For simplicity, here we focus on (quasi-)one-dimensional spin-charge-coupled systems where the spin configurations break spatial inversion symmetry45. The schematic is shown in Fig. 4a. Note that the model approximately describes chiral magnetic metals, such as CrNb3S646 and Yb(Ni1−xCux)3Al947. The Hamiltonian reads

$${{{{{{{\mathcal{H}}}}}}}}=\mathop{\sum}\limits_{i,\alpha }\left({t}_{1}{c}_{i\alpha }^{{{{\dagger}}} }{c}_{i+1\alpha }+{t}_{2}{c}_{i\alpha }^{{{{\dagger}}} }{c}_{i+2\alpha }+{{{{{{{\rm{H.c.}}}}}}}}\right)+J\mathop{\sum}\limits_{i,\alpha ,\beta }{c}_{i\alpha }^{{{{\dagger}}} }{{{{{{{{\boldsymbol{\sigma }}}}}}}}}_{\alpha \beta }{c}_{i\beta }\cdot {{{{{{{{\bf{S}}}}}}}}}_{i},$$
(3)

where \(c_{i\alpha}^{\dagger}\) (\(c_{i\alpha}\)) denotes the creation (annihilation) operator of an electron at site i with spin α. Here, we take \(t_{1}=\sqrt{2}\tanh(r_{t})\cos(\theta_{t})\times 0.1\) [eV], \(t_{2}=\sqrt{2}\tanh(r_{t})\sin(\theta_{t})\times 0.1\) [eV], and \(J=\log(1+\exp(r_{J}))\) [eV]; the spins are treated as classical and their configurations are parametrized as \({\bf S}_{i}=(\sin\theta_{i}\cos\phi_{i},\sin\theta_{i}\sin\phi_{i},\cos\theta_{i})\), with \(\theta_{i}=\pi\sigma(\eta_{i})\). The hoppings \(t_{1}\) and \(t_{2}\) are expressed through the hyperbolic tangent so that they remain bounded; otherwise, they would grow without limit during the optimization, since the shift current increases with increasing momentum derivatives of the band dispersions. We set \(t_{1}\) and \(t_{2}\) to be within about 0.1 eV, considering the situation in real materials. J is set to be positive without loss of generality. We set the number of sublattice sites to N = 12. Thus, the model includes 3 + 2N = 27 parameters in total represented by \(\boldsymbol{\theta}=\{r_{t},\theta_{t},r_{J},\{\eta_{i}\},\{\phi_{i}\}\}\). The quantity of our interest is the photocurrent under solar radiation, defined as \(I=\int {\rm d}\omega\,\sigma_{\rm PVE}(\omega)|E(\omega)|^{2}\) [A m−2], where \(\sigma_{\rm PVE}(\omega)\) is the nonlinear optical conductivity48,49, and \(|E(\omega)|^{2}\) denotes the intensity of the linearly polarized solar light with frequency ω, approximately given by blackbody radiation at T = 5500 K (the inset of Fig. 4a) (see the “Methods” section); we take L(θ) = −I. We consider a three-dimensional system in which the one-dimensional chains are arranged in a square lattice fashion for simplicity, taking the lattice constants \(a_{z}=9\) Å in the chain direction and \(a_{x}=a_{y}=4\) Å in the orthogonal directions, referring to a chiral magnet47. The fermion density is fixed at half filling as for the previous model.
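
The mapping from the 27 unconstrained parameters to the physical quantities can be sketched as below. The dictionary keys and the function names are assumptions made for this illustration; the initial ranges follow the "Methods" section.

```python
import jax
import jax.numpy as jnp

N_SITES = 12  # number of sublattice sites N

def init_params(key):
    # Random initial values in the ranges quoted in the Methods section.
    keys = jax.random.split(key, 5)
    return {
        "r_t": jax.random.uniform(keys[0], (), minval=-1.0, maxval=1.0),
        "theta_t": jax.random.uniform(keys[1], (), minval=-jnp.pi, maxval=jnp.pi),
        "r_J": jax.random.uniform(keys[2], (), minval=0.0, maxval=0.5),
        "eta": jax.random.uniform(keys[3], (N_SITES,), minval=-1.0, maxval=1.0),
        "phi": jax.random.uniform(keys[4], (N_SITES,), minval=-jnp.pi, maxval=jnp.pi),
    }

def physical_parameters(params):
    # Bounded hoppings (eV): sqrt(t1^2 + t2^2) < sqrt(2) * 0.1.
    amplitude = jnp.sqrt(2.0) * jnp.tanh(params["r_t"]) * 0.1
    t1 = amplitude * jnp.cos(params["theta_t"])
    t2 = amplitude * jnp.sin(params["theta_t"])
    # Positive coupling (eV): J = log(1 + exp(r_J)), i.e. the softplus function.
    coupling = jax.nn.softplus(params["r_J"])
    # Unit-length classical spins with polar angle theta_i = pi * sigmoid(eta_i).
    theta = jnp.pi * jax.nn.sigmoid(params["eta"])
    phi = params["phi"]
    spins = jnp.stack(
        [jnp.sin(theta) * jnp.cos(phi), jnp.sin(theta) * jnp.sin(phi), jnp.cos(theta)],
        axis=-1,
    )
    return t1, t2, coupling, spins

params = init_params(jax.random.PRNGKey(0))
t1, t2, coupling, spins = physical_parameters(params)
n_params = sum(v.size for v in params.values())
```

All constraints (bounded hoppings, positive J, unit spins) are built into the mapping, so the optimizer can update the raw parameters freely.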

Fig. 4: Automatic construction of a Hamiltonian for electrons moving on a noncoplanar spin texture, which can generate ~ 700 A m−2 under solar radiation.

a Schematic of the system. A photocurrent is generated by solar radiation (blackbody radiation at 5500 K in the inset) onto the one-dimensional spin-charge coupled system. b Schedule of temperature T [K]. c Change of the photocurrent I [A m−2]. The insets show the changes of the nearest-neighbor hopping \(t_{1}\) [eV], the second-neighbor hopping \(t_{2}\) [eV], and the coupling constant J [eV]. d–f Spin configurations after the convergence (d), plotted with the z components \(S_{z}\) (e) and the angles of spins projected onto the xy plane, \(\phi_{S_{xy}}\) (f). The \(S_{z}\) axis is taken in the direction of the total magnetization. g ω dependence of \(I(\omega)=\sigma_{\rm PVE}(\omega)|E(\omega)|^{2}\), where \(\sigma_{\rm PVE}(\omega)\) is the nonlinear optical conductivity and \(|E(\omega)|^{2}\) is the intensity of solar light (inset). h The band structure of electrons. \(I_{\rm band}(k)\) shown in the color bar represents the contribution to I from each band.

Figure 4c shows the optimization process of the photocurrent I under the schedule of temperature shown in Fig. 4b. We obtain I ~ 700 A m−2 after the convergence. This value is comparable to or larger than those for Ge semiconductors50 and perovskite materials51,52. Changes in the parameters \(t_{1}\), \(t_{2}\), and J are plotted in the inset of Fig. 4c. The optimized spin configuration is an umbrella-shaped chiral state with a three-site period, as shown in Fig. 4d–f. We note that other noncoplanar spin configurations are obtained for different initial conditions, but they generate smaller I (see Supplementary Note 3).

To elaborate on the mechanism behind the optimization of the photocurrent, we plot the ω dependence of \(I(\omega) = \sigma_{\rm PVE}(\omega) |E(\omega)|^2\) in Fig. 4g, together with \(\sigma_{\rm PVE}(\omega)\omega^{2}\) and \(|E(\omega)|^{2}\) in the inset. We find that I(ω) has a sharp peak at \(\omega \sim 7.15\times 10^{14}\) [rad s−1], due to the peak of \(\sigma_{\rm PVE}(\omega)\omega^{2}\) located at the frequency where \(|E(\omega)|^{2}\) becomes large. We show that dominant contributions to the peak come from the interband processes between the conduction and valence bands split by \(2J\simeq 0.5\) [eV] \(\simeq 7.15\times 10^{14}\) [rad s−1], as shown in Fig. 4h (see the “Methods” section). The results indicate that the enhanced photocurrent of ~ 700 A m−2 under solar radiation is generated by band engineering with automatic optimization of \(t_{1}\), \(t_{2}\), J, and the spin configurations. We note that the peak value of \(\sigma_{\rm PVE}(\omega)\) ~ 0.06 A V−2 is considerably large compared to existing materials, such as BaTiO340,53 and TaAs54, and is even an order of magnitude larger than the value obtained in the previous theoretical study45, although substantially large competing magnetic interactions may be needed to stabilize the umbrella spin configuration at room temperature.

Conclusions

Through the applications to AHE and PVE, our framework has proven capable of automatically finding Hamiltonians that optimize the physical properties of interest. The key aspect is in the use of automatic differentiation in the inverse problem, which provides the derivatives of the objective function in terms of a large number of parameters; although the current studies are limited to several tens of parameters, we can practically deal with a million or more. Since automatic differentiation is a versatile technique, our framework has a wide range of applicability, such as first-principles Hamiltonians computed by the Kohn–Sham equations, strongly correlated electron systems, quantum spin systems, and interacting bosonic systems, as long as the forward computation can be performed efficiently. In addition, it is applicable to a wide range of physical properties to be optimized, including the reproduction of experimental raw data. Thus, our findings will be useful for the exploration of new models and principles in materials science.

Methods

Application to the AHE

The Hall conductivity is calculated by using the Kubo formula as

$${\sigma }_{xy}=-\frac{{e}^{2}}{h}\frac{V}{2\pi {N}_{{{{{{{{\bf{k}}}}}}}}}}\mathop{\sum}\limits_{m,n,{{{{{{{\bf{k}}}}}}}}}(\, f({E}_{{{{{{{{\bf{k}}}}}}}}n},\beta )-f({E}_{{{{{{{{\bf{k}}}}}}}}m},\beta )){{\Omega }}({{{{{{{\bf{k}}}}}}}}),$$
(4)

where e is the elementary charge, h is the Planck constant, V is the volume of the Brillouin zone, \(N_{\mathbf{k}}\) is the number of k points, f(E, β) is the Fermi distribution function at inverse temperature β, and \(E_{\mathbf{k}n}\) is the energy at k in the nth band; Ω(k) is the Berry curvature given by

$${{\Omega }}({{{{{{{\bf{k}}}}}}}})={{{{{{{\rm{Im}}}}}}}}\frac{\langle {{{{{{{\bf{k}}}}}}}}n| \frac{\partial {{{{{{{\mathcal{H}}}}}}}}}{\partial {k}_{y}}| {{{{{{{\bf{k}}}}}}}}m\rangle \langle {{{{{{{\bf{k}}}}}}}}m| \frac{\partial {{{{{{{\mathcal{H}}}}}}}}}{\partial {k}_{x}}| {{{{{{{\bf{k}}}}}}}}n\rangle }{{({E}_{{{{{{{{\bf{k}}}}}}}}n}-{E}_{{{{{{{{\bf{k}}}}}}}}m})}^{2}+{{{{{{{\rm{i}}}}}}}}\delta },$$
(5)

where \(\left\vert \mathbf{k}n\right\rangle\) is an eigenstate at k in the nth band. We take e = h = 1, \(N_{\mathbf{k}}=100^{2}\), and \(\delta=10^{-5}\).
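
A compact sketch of Eqs. (4) and (5) for the standard Haldane model is given below, with ∂H/∂k obtained by forward-mode automatic differentiation. The Bloch Hamiltonian convention, the k grid, the zero chemical potential, the parameter values, and all function names are assumptions made for this illustration; it evaluates σxy only and does not reproduce the optimization.

```python
import jax
import jax.numpy as jnp

jax.config.update("jax_enable_x64", True)

SQ3 = 3.0 ** 0.5
# Nearest-neighbor vectors (bond length 1) and second-neighbor vectors.
A = jnp.array([[1.0, 0.0], [-0.5, SQ3 / 2], [-0.5, -SQ3 / 2]])
B = jnp.stack([A[1] - A[2], A[2] - A[0], A[0] - A[1]])
I2 = jnp.eye(2, dtype=jnp.complex128)
SX = jnp.array([[0, 1], [1, 0]], dtype=jnp.complex128)
SY = jnp.array([[0, -1j], [1j, 0]], dtype=jnp.complex128)
SZ = jnp.array([[1, 0], [0, -1]], dtype=jnp.complex128)

def bloch_hamiltonian(k, M, t2, phi):
    # Haldane model with t1 = 1, M^A = +M, M^B = -M, t2 * exp(i * phi).
    ka, kb = A @ k, B @ k
    d0 = 2.0 * t2 * jnp.cos(phi) * jnp.sum(jnp.cos(kb))
    dz = M - 2.0 * t2 * jnp.sin(phi) * jnp.sum(jnp.sin(kb))
    return d0 * I2 + jnp.sum(jnp.cos(ka)) * SX + jnp.sum(jnp.sin(ka)) * SY + dz * SZ

# dH/dk by automatic differentiation; output shape (2, 2, 2), last axis = (kx, ky).
velocity = jax.jacfwd(bloch_hamiltonian, argnums=0)

def hall_term(k, M, t2, phi, beta, delta=1e-5):
    energies, states = jnp.linalg.eigh(bloch_hamiltonian(k, M, t2, phi))
    dH = velocity(k, M, t2, phi)
    vx = states.conj().T @ dH[..., 0] @ states   # vx[m, n] = <km| dH/dkx |kn>
    vy = states.conj().T @ dH[..., 1] @ states
    occ = jax.nn.sigmoid(-beta * energies)       # Fermi function with mu = 0
    gap = energies[:, None] - energies[None, :]  # E_n - E_m
    omega = jnp.imag(vy * vx.T / (gap**2 + 1j * delta))   # Eq. (5), indexed [n, m]
    return jnp.sum((occ[:, None] - occ[None, :]) * omega)

def hall_conductivity(M, t2, phi, beta, n_grid=40):
    lattice = jnp.stack([A[0] - A[2], A[0] - A[1]])        # Bravais vectors
    recip = 2.0 * jnp.pi * jnp.linalg.inv(lattice).T       # reciprocal vectors (rows)
    u = jnp.arange(n_grid) / n_grid
    grid = jnp.stack(jnp.meshgrid(u, u, indexing="ij"), axis=-1).reshape(-1, 2)
    kpts = grid @ recip
    volume = jnp.abs(jnp.linalg.det(recip))                # Brillouin-zone area V
    terms = jax.vmap(hall_term, in_axes=(0, None, None, None, None))(kpts, M, t2, phi, beta)
    return -volume / (2.0 * jnp.pi * kpts.shape[0]) * jnp.sum(terms)   # Eq. (4), e = h = 1

sigma_topological = hall_conductivity(0.0, 0.3, jnp.pi / 2, beta=50.0)
sigma_trivial = hall_conductivity(3.0, 0.3, jnp.pi / 2, beta=50.0)
```

At low temperature the magnitude of σxy should approach 1 inside the topological phase and 0 in the trivial phase (|M| > 3√3 t2 |sin ϕ|); the overall sign depends on the conventions chosen above.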

The optimization starts from initial parameters randomly chosen as \(M^{A},M^{B}\in(-1,1)\), \(r^{d_{ij}}\in(0,1)\), and \(\phi^{d_{ij}}\in(-\pi,\pi)\) for the honeycomb lattice model, and \(r_{2},r_{3}\in(0,1)\) and \(\phi_{1}^{ij},\phi_{2}^{ij},\phi_{3}^{ij}\in(-\pi,\pi)\) for the triangular lattice model. Automatic differentiation is implemented using JAX55. Note that \(\frac{\partial \mathcal{H}}{\partial k_{x}}\) and \(\frac{\partial \mathcal{H}}{\partial k_{y}}\) in Eq. (5) are also calculated by using automatic differentiation. We employ RMSPROP56 as an optimization method, in which we take the learning rate, the decay factor, and the infinitesimal as 0.1, 0.99, and \(10^{-8}\), respectively.
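
For reference, one common form of the RMSProp update with these hyperparameters is sketched below. The exact variant used here is not specified in the text, so the placement of the infinitesimal outside the square root, the zero initial state, and the quadratic toy loss are assumptions.

```python
import jax
import jax.numpy as jnp

def rmsprop_step(params, grads, state, learning_rate=0.1, decay=0.99, eps=1e-8):
    # Running average of squared gradients, then a normalized gradient step.
    state = jax.tree_util.tree_map(
        lambda s, g: decay * s + (1.0 - decay) * g**2, state, grads
    )
    params = jax.tree_util.tree_map(
        lambda p, g, s: p - learning_rate * g / (jnp.sqrt(s) + eps),
        params, grads, state,
    )
    return params, state

def toy_loss(params):
    # Hypothetical objective with its minimum at x = 3, y = -1.
    return (params["x"] - 3.0) ** 2 + (params["y"] + 1.0) ** 2

params = {"x": jnp.array(0.0), "y": jnp.array(0.0)}
state = jax.tree_util.tree_map(jnp.zeros_like, params)
initial_loss = toy_loss(params)
grad_fn = jax.jit(jax.grad(toy_loss))

for _ in range(300):
    params, state = rmsprop_step(params, grad_fn(params), state)

final_loss = toy_loss(params)
```

Because the step is normalized by the running root-mean-square gradient, all parameters move at a comparable rate regardless of the scale of their gradients.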

Application to the PVE

According to the second-order optical response theory44,45, a nonlinear electric current produced by electric fields E(ω1) and E(ω2) with two frequencies ω1 and ω2, respectively, is given by

$$I({\omega }_{1}+{\omega }_{2};{\omega }_{1},{\omega }_{2})={\sigma }_{{{{{{{{\rm{opt}}}}}}}}}({\omega }_{1}+{\omega }_{2};{\omega }_{1},{\omega }_{2})E({\omega }_{1})E({\omega }_{2}),$$
(6)

with the second-order optical conductivity σopt(ω1 + ω2; ω1, ω2). In the case of ω1 = − ω2, a DC current is generated as

$$I(\omega )={\sigma }_{{{{{{{{\rm{PVE}}}}}}}}}(\omega )| E(\omega ){| }^{2},$$
(7)

where I(ω) = I(0; ω, −ω) and σPVE(ω) = σopt(0; ω, −ω). The ω integral I = ∫dωI(ω) gives a photocurrent generated by the shift current mechanism42,43,45, which is used for the objective function in the main text. We approximate solar radiation by blackbody radiation B(ω, T) at 5500 K as

$$| E(\omega ){| }^{2}=2{\mu }_{0}c{C}_{{{{{{{{\rm{solar}}}}}}}}}\frac{B(\omega ,T=5500\,{{{{{{{\rm{K}}}}}}}})}{\int\,{\rm {d}}\omega B(\omega ,T=5500\,{{{{{{{\rm{K}}}}}}}})},$$
(8)

where μ0, c, and Csolar are the magnetic constant, speed of light, and solar constant, respectively;

$$B(\omega ,T)=\frac{\hslash {\omega }^{3}}{4{\pi }^{3}{c}^{2}}\frac{1}{\exp (\frac{\hslash \omega }{{k}_{{\rm {B}}}T})-1},$$
(9)

where \(\hslash\) and \(k_{\rm B}\) are the reduced Planck constant and the Boltzmann constant, respectively. In Eq. (7), \(\sigma_{\rm PVE}(\omega)\) is computed as44,45

$$\begin{array}{r}{\sigma }_{{{{{{{{\rm{PVE}}}}}}}}}(\omega )=-\frac{Ve^3}{{(2\pi )}^{3}}\frac{1}{{N}_{k}{\omega }^{2}}({\sigma }_{{{{{{{{\rm{PVE}}}}}}}},1}+{\sigma }_{{{{{{{{\rm{PVE}}}}}}}},2}+{\sigma }_{{{{{{{{\rm{PVE}}}}}}}},3}+{\sigma }_{{{{{{{{\rm{PVE,4}}}}}}}}}),\end{array}$$
(10)

where

$${\sigma }_{{{{{{{{\rm{PVE}}}}}}}},1}=-\mathop{\sum}\limits_{k,a}f({E}_{k},\beta ){J}_{aa}^{(3)},$$
(11)
$${\sigma }_{{{{{{{{\rm{PVE}}}}}}}},2}=\mathop{\sum}\limits_{k,a,b}\left(\frac{{f}_{ab}{J}_{ab}^{(1)}{J}_{ba}^{(2)}}{\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{ab}}+\frac{{f}_{ab}{J}_{ab}^{(1)}{J}_{ba}^{(2)}}{-\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{ab}}\right),$$
(12)
$${\sigma }_{{{{{{{{\rm{PVE}}}}}}}},3}=\mathop{\sum}\limits_{k,a,b}\frac{{f}_{ab}{J}_{ab}^{(2)}{J}_{ba}^{(1)}}{{{{{{{{\rm{i}}}}}}}}\gamma -{E}_{ab}},$$
(13)
$${\sigma }_{{{{{{{{\rm{PVE,4}}}}}}}}}=-\mathop{\sum}\limits_{k,a,b,c}\frac{{J}_{ab}^{(1)}{J}_{bc}^{(1)}{J}_{ca}^{(1)}}{{{{{{{{\rm{i}}}}}}}}\gamma -{E}_{ca}}\left(\frac{{f}_{ab}}{\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{ba}}+\frac{{f}_{cb}}{\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{cb}}+\frac{{f}_{ab}}{-\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{ba}}+\frac{{f}_{cb}}{-\omega +{{{{{{{\rm{i}}}}}}}}\gamma/2 -{E}_{cb}}\right).$$
(14)

Here, a, b, and c denote the bands; \(E_{ab}=E_{ka}-E_{kb}\), \(f_{ab}=f(E_{ka},\beta)-f(E_{kb},\beta)\), and \(J_{ab}^{(n)}=\left\langle ka\right\vert \frac{\partial^{n}\mathcal{H}}{\partial k^{n}}\left\vert kb\right\rangle\). We use \(V=\frac{(2\pi)^{3}}{a_{x}a_{y}Na_{z}}\), \(N_{k}=100\), and \(\gamma=2\pi\times 10^{13}\) [rad s−1]. \(\frac{\partial^{n}\mathcal{H}}{\partial k^{n}}\) in \(J_{ab}^{(n)}\) are calculated by using automatic differentiation. We also calculate the contribution to I from each k point in each band, \(I_{\rm band}(k)\), by calculating I without taking the summations of k and the band indices in Eqs. (11)–(14). The optimization starts from initial parameters randomly chosen as \(r_{t}\in(-1,1)\), \(\theta_{t}\in(-\pi,\pi)\), \(r_{J}\in(0,0.5)\), \(\eta_{i}\in(-1,1)\), and \(\phi_{i}\in(-\pi,\pi)\).
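
The k derivatives entering \(J_{ab}^{(n)}\) can be obtained by nesting forward-mode differentiation, as in the sketch below for the Bloch Hamiltonian of Eq. (3). The basis ordering (site, spin), the gauge in which a hop by d sites carries the phase exp(ikd) with k measured in units of the inverse site spacing, the example spin texture, and the function names are assumptions made for this illustration.

```python
import jax
import jax.numpy as jnp

PAULI = jnp.array(
    [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]],
    dtype=jnp.complex64,
)

def bloch_hamiltonian(k, t1, t2, coupling, spins):
    n = spins.shape[0]
    shift1 = jnp.roll(jnp.eye(n), 1, axis=1)   # element [i, i+1]: c_i^dag c_{i+1}
    shift2 = jnp.roll(jnp.eye(n), 2, axis=1)   # element [i, i+2]: c_i^dag c_{i+2}
    hop = t1 * jnp.exp(1j * k) * shift1 + t2 * jnp.exp(2j * k) * shift2
    hop = hop + hop.conj().T                   # add the Hermitian conjugate
    kinetic = jnp.kron(hop, jnp.eye(2))
    exchange = coupling * sum(
        jnp.kron(jnp.diag(spins[:, a]), PAULI[a]) for a in range(3)
    )
    return kinetic + exchange

# n-th derivatives of H with respect to k by nested automatic differentiation.
dH1 = jax.jacfwd(bloch_hamiltonian, argnums=0)
dH2 = jax.jacfwd(dH1, argnums=0)
dH3 = jax.jacfwd(dH2, argnums=0)

def current_matrices(k, t1, t2, coupling, spins):
    # J^(n)_ab = <ka| d^n H / dk^n |kb> for n = 1, 2, 3.
    energies, states = jnp.linalg.eigh(bloch_hamiltonian(k, t1, t2, coupling, spins))
    args = (k, t1, t2, coupling, spins)
    mats = [states.conj().T @ d(*args) @ states for d in (dH1, dH2, dH3)]
    return energies, mats

# Hypothetical umbrella-like texture with a three-site period (N = 12).
site = jnp.arange(12)
polar, azimuth = jnp.pi / 3, 2.0 * jnp.pi * site / 3
spins = jnp.stack(
    [jnp.sin(polar) * jnp.cos(azimuth), jnp.sin(polar) * jnp.sin(azimuth),
     jnp.cos(polar) * jnp.ones(12)], axis=-1,
)
k0 = jnp.array(0.7)
energies, (J1, J2, J3) = current_matrices(k0, 0.1, 0.05, 0.25, spins)
```

With these matrices and the band energies, the sums in Eqs. (11)–(14) reduce to array operations over the band indices at each k point.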