Inverse Hamiltonian design by automatic differentiation

An ultimate goal of materials science is to deliver materials with desired properties at will. In the theoretical study, a standard approach consists of constructing a Hamiltonian based on phenomenology or first principles, calculating physical observables, and improving the Hamiltonian through feedback. However, there is also an approach that bypasses such a cumbersome procedure, namely, to obtain an appropriate Hamiltonian directly from the desired properties. Solving the inverse problem has the potential to reach qualitatively different principles, but most research to date has been limited to quantitative determination of parameters within known models. Here, we present a general framework that enables the inverse design of Hamiltonians by optimizing numerous parameters using automatic differentiation. By applying it to the quantum anomalous Hall effect (AHE), we show that our framework can not only rediscover the Haldane model but also automatically generate a new Hamiltonian that exhibits a six-times larger AHE. In addition, the application to the photovoltaic effect (PVE) gives an optimal Hamiltonian for electrons moving on a noncoplanar spin texture, which can generate $\sim 900$~A/m$^2$ under solar radiation. This framework would accelerate materials exploration by establishing unprecedented models and principles beyond human intuition and empirical rules.

A conventional theoretical approach to materials exploration is to search for Hamiltonians that produce physical properties of interest (Fig. 1a). This is not only tedious but also nontrivial since the parameter space to be explored is usually unknown a priori. Therefore, most of the research to date has been conducted for the known Hamiltonians and their extensions. However, these approaches make it difficult to reach qualitatively new models and principles. In contrast, the inverse approach to find appropriate Hamiltonians directly from the desired properties is not only efficient but also has the potential to unveil qualitatively new physics (Fig. 1a). Over the past years, many proposals have been made in this direction [1,2,5,14], in particular, for estimation of the model parameters from experimental or simulation data [6][7][8] and construction of the Hamiltonians from the desired properties [1,5,[15][16][17]. In most cases, however, the number of the parameters is a few due to the computational cost, and some methods have limitations on the models and physical quantities that can be applied. For these reasons, the previous research has also been limited to quantitative estimation of parameters within known Hamiltonians; it is still challenging to explore new models and principles by taking full advantage of the inverse problem.
To address these issues, we develop a framework that can automatically design a Hamiltonian with desired physical properties by using automatic differentiation. Automatic differentiation enables us to compute the analytic derivatives of any functions by adapting chain rules, which has been widely used in the field of deep learning in the process of backpropagation [18], even for over a trillion parameters [19]. The flowchart of our framework is shown in Fig. 1b. First, we prepare a Hamiltonian H(θ) with a set of parameters θ. We also define the objective function L(θ) to be minimized for achieving the desired properties; for instance, if the objective is to maximize the expectation value of a physical quantity P , we can take L(θ) = − P (θ) . Next, we compute the derivative ∂L ∂θ by automatic differentiation. Then, we update the Hamiltonian by changing the parameters θ according to ∂L ∂θ . By repeating this procedure until θ converge, we end up with the Hamiltonian H(θ opt ) that optimizes the desired properties, where θ opt are the parameters after the convergence. The key aspect of this framework is that it enables us to optimize a large number of parameters simultaneously, which dramatically reduces the computational cost compared to the conventional approach. More importantly, as there is in practice no limitation on the parameter space of the Hamiltonian, it may lead us to discover unprecedented Hamiltonians that are hard to obtain by human intuition. In the following, as a proof of concept of this framework, we show its application to two problems: AHE and PVE.
First, we demonstrate that our framework can automatically rediscover the Haldane model with a spontaneous quantum AHE [9]. We consider a tight-binding model on a honeycomb lattice with two sublattices, whose Hamiltonian reads where c † i (c i ) is the creation (annihilation) operator of a spinless fermion at site i; the first term describes an on-site staggered potential with real coefficients M a (a = A or B denotes the sublattice), and the second and third terms represent the hopping of fermions to nearest-and second-neighbor sites, respectively. Here, we set t 1 = 1 and parametrize t d 2 as t d 2 = σ(r d ) exp(iφ d ) with real variables r d and φ d , where σ(x) = 1/(1 + e −x ) is a sigmoid function and d denotes the direction of the second-neighbor hopping (d = A1, A2, A3, B1, B2, B3); see Fig. 2a Inverse design of Hamiltonian. a, In the conventional approach, the Hamiltonian is first constructed based on phenomenology or first principles, and then, the optimal parameters of the Hamiltonian are explored through physical properties calculated from the Hamiltonian. In contrast, in the inverse approach, the desired physical properties are prepared first, and then, the Hamiltonian to realize them is obtained directly. b, Flowchart proposed in the present study to solve the inverse problem by using automatic differentiation.  Fig. 2b, which has two topologically nontrivial phases with a spontaneous quantum AHE corresponding to the nonzero Chern numbers C = ±1.
With this setup of H(θ), we try to obtain a Hamiltonian that maximizes the AHE by the framework in Fig. 1b. For this aim, we take the objective function as L(θ) = −σ xy (θ), where σ xy is the Hall conductivity. Details of the calculations are described in Methods. We find that σ xy increases monotonically through the optimization, as shown in Fig. 2d. Note that we introduce temperature and control it as shown in Fig. 2c to avoid that ∂L/∂θ becomes zero due to the quantization (β is the inverse temperature). In contrast to the continuous change of σ xy , the Chern numbers of the two bands, which are separated by the band gap shown in the inset of Fig. 2d, converge quickly to C ±1 in the very early stage of the optimization, as shown in Fig. 2e. The evolution of each parameter is plotted in Figs. 2f-h. We find that both M A and M B converge to zero, and |t d 2 | → 1 and φ d → π/2 for all d. These values correspond to the center of the topological phase with C = 1 in the Haldane model, indicated by the star in Fig. 2b. We confirm that different initial conditions converge to the same state (see Supplementary Information). Thus, our framework automatically rediscovers the Haldane model with a spontaneous quantum AHE under the condition of maximizing σ xy . The reason why the optimal state is always at the center of the C = 1 phase is due to the introduction of temperature; at nonzero temperature, σ xy becomes largest at the center where the band gap becomes largest in the topological phase. We note that the value of σ xy in Fig. 2d is considerably smaller than the quantized value +1, which is also due to the finite temperature.
Next, we apply our framework to a triangular lattice assuming a four-sublattice unit cell (Fig. 3a). The Hamiltonian consists of the nearest-, second-, and thirdneighbor hopping, whose coefficients t ij 1 , t ij 2 , and t ij 3 , respectively, can take different values for different combinations of i and j within the unit cell; specifically we take t ij 1 = exp(iφ ij 1 ) and t ij m = σ(r m ) exp(iφ ij m ) for m = 2 and 3 (see the arrows in Fig. 3a). Thus, the model includes 38 parameters in total represented by θ = {r 2 , r 3 , {φ ij 1 }, {φ ij 2 }, {φ ij 3 }}. Again, we take L(θ) = −σ xy (θ) to be minimized, with a schedule of temperature shown in Fig. 3b. The fermion density is fixed at half filling at each step of the optimization by using the bisection method.
We find that the Chern numbers for four bands converge to C = 5, 1, −3, and −3 from the lower band,  as shown in Fig. 3d. This indicates that σ xy reaches 6 at half filling, which is six times larger than that in the Haldane model, although σ xy in Fig. 3c is much smaller due to the finite temperature similar to the previous case.
The band structure is shown in Fig. 3e with the Berry curvature Ω (see Methods). Note that the system recovers (approximately) threefold rotational symmetry after the convergence (see Supplementary Information). Ω of the lowest energy band is positive at all wave numbers, whose sum gives the largest C = 5, while the other bands include negative contributions. This indicates that our framework tries to maximize C for the lowest energy band. We note that the same conclusion is obtained for many other initial conditions, while some cases converge to C = 3, 3, −1, and −5 from the lower band, which gives the same value of σ xy = 6. The reason why the solution in Fig. 3 is rather preferred is the finite temperature introduced in the optimization process, for the same reason as in the honeycomb lattice model for which the center of the topological phase was obtained (see Supplementary   Information). Let us discuss the optimized parameters. We find that both |t 2 | and |t 3 | converge to 1, while the phases take the various values shown by colors in Fig. 3a. We show, however, that their sums along closed loops, Φ m = φ ij m , representing the fictitious magnetic fluxes, take some regular values: Φ 1 7π/4 for the smallest triangles composed of t 1 , and Φ 2 takes 0.91π and 1.59π for larger triangles of t 2 facing right and left, respectively ( Fig. 3f), while Φ 3 is always π (φ ij 3 is either 0 or π). Although φ ij m take different values for different initial conditions, Φ m converge to the same values. These results indicate that our framework automatically finds a new model whose complex hoppings realize spontaneous fictitious magnetic fluxes to maximize σ xy , which is hard to obtain by human intuition. Based on this discovery, we can also refine the Hamiltonian by taking more regular values of the phases (multiples of π/4); see Supplementary Information.
Finally, we apply our framework to optimize the PVE from the so-called shift current in noncentrosymmetric bulk systems [10][11][12][13]20]. For simplicity, here we focus on (quasi-)one-dimensional spin-charge coupled systems where the spin configurations break spatial inversion symmetry [21]. The schematic is shown in Fig. 4a. Note that the model approximately describes chiral magnetic metals, such as CrNb 3 S 6 [22] and Yb(Ni 1−x Cu x ) 3 Al 9 [23]. The Hamiltonian reads where c † iα (c iα ) denotes the creation (annihilation) operator of an electron at site i with spin α. Here, we take is the nonlinear optical conductivity, and |E(ω)| 2 denotes the intensity of the linearly polarized solar light with frequency ω, approximately given by blackbody radiation at T = 5500 K (the inset of Fig. 4a) (see Methods); we take L(θ) = −I. We consider a three-dimensional system in which the one-dimensional chains are arranged in a square lattice fashion for simplicity, taking the lattice constants a z = 9Å in the chain direction and a x = a y = 4Å in the orthogonal directions, referring to a chiral magnet [23]. The fermion density is fixed at half filling as for the previous model. Figure 4c shows the optimization process of the photocurrent I under the schedule of temperature shown in Fig. 4b. We obtain I ∼ 900 A/m 2 after the convergence. This value is comparable or larger than those for Ge semiconductors [24] and perovskites substances [25,26]. Changes of the parameters t 1 , t 2 , and J are plotted in the inset of Fig. 4c. The optimized spin configuration is an umbrella-shaped chiral state with three-site period, as shown in Fig. 4d. We also note that spin configurations with four-site period are also obtained for different initial conditions, but they generate smaller I (see Supplementary Information).
To elaborate the mechanism behind the optimization of the photocurrent, we plot the ω dependence of I(ω) = σ PVE (ω)|E(ω)| 2 in Fig. 4e, together with σ PVE (ω)ω 2 and |E(ω)| 2 in the inset. We find that I(ω) has a sharp peak at ω ∼ 7.15 × 10 14 [rad/s], due to the peak of σ PVE (ω)ω 2 locating at the frequency where |E(ω)| 2 becomes large. We show that dominant contributions to the peak come from the interband processes between the conduction and valence bands split by 2J 0.5 [eV] 7.15×10 14 [rad/s], as shown in Fig. 4f (see Methods). The results indicate that the enhanced photocurrent of ∼ 900 A/m 2 under solar radiation is generated by the band engineering with automatic optimization of t 1 , t 2 , J, and the spin configurations. We note that the peak value of σ PVE (ω) ∼ 0.04 A/V 2 is considerably large compared to existing materials, such as BaTiO 3 [10,27] and TaAs [28], and is also even an order of magnitude larger than the value obtained in the previous theoretical study [21], while we may need substantially-large competing magnetic interactions to stabilize the umbrella spin configuration at room temperature.
Through the applications to AHE and PVE, our framework has proven capable of automatically finding Hamiltonians that optimize the physical properties of interest.
The key aspect is in the use of automatic differentiation in the inverse problem, which provides the derivatives of the objective function in terms of large number of parameters; although the current studies are limited to several tens of parameters, we can practically deal with a million or more. Since automatic differentiation is a versatile technique, our framework has a wide range of the applicability, such as first-principles Hamiltonians computed by the Kohn-Sham equations, strongly correlated electron systems, quantum spin systems, and interacting bosonic systems. In addition, it is applicable to a wide range of physical properties to be optimized. Thus, our finding will open up new directions to explore new models and principles in materials science.

Application to the AHE
The Hall conductivity is calculated by using the Kubo formula as where e is the elementary charge, h is the Planck constant, V is the volume of the Brillouin zone, N k is the number of k points, f (E, β) is the Fermi distribution function at inverse temperature β, E kn is the energy at k in nth band; Ω(k) is the Berry curvature given by where |kn is an eigenstate at k in nth band. We take e = h = 1, N k = 100 2 , and δ = 10 −5 .

CODE AVAILABILITY
We have published the code to reproduce all the results on https://github.com/koji-inui/automatic-hamiltoniandesign.git. Figure S1 shows the optimization process of σ xy , the band gap, and the Chern number of each band for nine sets of different initial parameters of the Hamitonian in Eq. (1) in the main text. All the cases converge to the same results. The changes of the model parameters are shown in Fig. S2. All of them converge to the same results corresponding to the center of the C = 1 phase of the Haldane model indicated by the star in Fig. 2b in the main text.

S2. DISCOVERY OF A NEW HAMILTONIAN ON A TRIANGULAR LATTICE
A. Initial condition dependence Figure S3 shows the optimization process of σ xy for eighteen sets of different initial parameters of the Hamitonian on a triangular lattice in the main text. σ xy converges to three different values; the largest one takes the value around 3.2, while the rest two are around 2.5 and 2.4. Table S1 shows the Chern numbers of the four bands obtained from the different initial parameters. The number of the initial conditions that converge to the corresponding result is also listed. The majority is the group with C = 5, 1, −3, and −3, which gives σ xy 3.2 in Fig. S3: ten out of eighteen initial conditions converge to this result, and one of them is discussed in the main text. Meanwhile, the group with C = 3, 3, −1, and −5 (three out of eighteen) also gives σ xy 3.2 as the sum of C for the filled two bands are the same as the majority group. The rest two groups give the smaller σ xy 2.5 and 2.4 in Fig. S3. Note that the four groups give σ xy = 6, 6, 5, and 4 at zero temperature, corresponding to the sum of C for the filled two bands. Figure S4 shows the energy dispersion of each band for one of the optimal solution shown in Fig. 3e in the main text. The results indicate that the threefold rotational symmetry is approximately restored by the automatic optimization of σ xy .

C. Refinement of the model parameters
As discussed in the main text, the phases of the hopping coefficients converge to rather scattered values. We  find, however, that it is possible to regularize the parameters by hand while keeping the value of σ xy by the following procedures. First, we set |t 1 | = |t 2 | = |t 3 | = 1 and transform the hopping coefficients so as to shift the band bottom of the 1st band around k 0 = (π/2, π/(2 √ 3)) to the Γ point ast ij m = t ij m exp(id ij ·k 0 ), where d ij is a vector from site i to j in real space. Next, we set the phase of each hopping to a multiple of π/4 close to the value after the transformation. In addition, we tune a few of them by hand so as to make Φ m spatially uniform. Then, we   All the phases are set to be multiples of π/4, as represented by the color of the arrows. The amplitudes of the hopping are taken as |t1| = |t2| = |t3| = 1. b, Sum of the phases Φm defined in the main text for the triangles composed of t1 (top) and t2 (bottom). Figure S6 shows the optimization process of the photocurrent I for nine sets of different initial parameters of the spin-charge coupled model in Eq. (2) in the main text. After the convergence, the results are grouped roughly into three; one of them gives the largest I ∼ 900 A/m 2 , as discussed in the main text. Table S2 shows the values of I, t 1 , t 2 , and J after the convergence starting from the different initial parame-ters. Four out of nine solutions converge to the largest value of I ∼ 900 A/m 2 . Spin configurations after the convergence are shown in Fig. S7. The results with the optimal value of I (number 1, 3, 5, and 9) have an umbrellashaped structure with three-site period. We note that two of the other results (number 4 and 8) gives the modest better value of I ∼ 670 A/m 2 with a different umbrella structure with four-site period. The others have more disordered spin configurations with modulations of S z .