Introduction

The recent discovery of correlated insulating states and superconductivity in twisted bilayer graphene (TBG)1,2,3,4 has opened a new window to exploring strong correlation effects in systems whose doping can be easily tuned, enabling the exploration of a rich range of interaction-driven phenomena. Although the underlying reason for the correlated physics is understood to arise from a relatively narrow electronic bandwidth induced by the long wavelength Moiré pattern5,6, several details, including the symmetry breaking within the insulating phase and the nature and mechanism of pairing in the neighboring superconductor, remain under debate7,8,9,10,11,12,13,14,15,16,17,18,19. One of the difficulties in addressing these questions arises from the complexity of the theoretical treatment of TBG, which involves at least a pair of narrow bands per spin per valley with a symmetry-protected band touching, leading to eight bands in total. On top of that, the limited tunability of the band structure makes it experimentally difficult to explore the dependence of different phases on microscopic parameters.

Motivated by a recent experimental report20, we study a related system, twisted double-bilayer graphene (TDBG), which consists of a pair of bilayer-graphene sheets twisted with respect to one another in an AB–AB-stacking structure. Due to the absence of \({C}_{2}\) rotation symmetry, TDBG has a lower symmetry than TBG, which simplifies the problem by removing the band touching at the Dirac points, leading to a low-energy effective description involving one rather than two narrow bands per spin and valley. Moreover, the band separation can be controlled by applying a vertical displacement field, enabling the exploration of different regimes of band isolation and bandwidth within the same device.

We identify three main ingredients necessary to explain the emergence of insulating and superconducting behavior in TDBG. First, we perform an accurate calculation of the single-particle band structure to identify ranges of displacement field and twist angle for which a single band is isolated and relatively flat. We show that lattice relaxation, known to be important in TBG21,22, as well as several other effects such as trigonal warping, which are absent in TBG, significantly influence the band structure in TDBG, in excellent agreement with experiments. Moreover, we identify a hitherto-neglected in-plane orbital effect, which is used to explain the experimentally observed deviation of the in-plane \(g\)-factor from 2 (ref. 20), as well as the effect of in-plane field on the superconducting \({T}_{{\rm{c}}}\).

Second, we address the key question of the nature of the interaction-driven insulating state. The similarity of the phase diagram of TBG to that of the cuprates was invoked to argue that Mott physics is the underlying mechanism responsible for the correlated insulator1,7,12. On the other hand, a different route to correlated insulators is observed in graphene quantum-Hall systems, for instance, when the spin and valley degeneracies of the Landau levels are spontaneously broken by interactions23. This usually leads to ferromagnetic insulators, which are otherwise rare in correlated solids where antiferromagnetic order is the norm. For similar reasons, in TDBG with nonzero valley Chern number, ferromagnetism may be preferred24 at integer fillings. The situation here is reminiscent of strained graphene, where a suitably chosen strain profile leads to Landau levels arising from the opposite strain-induced magnetic fields acting on the two valleys25. At integer partial fillings, ferromagnetic ground states were obtained with repulsive interactions26, and we show that a similar scenario is likely to occur here in TDBG. Indeed, a related ground state with spontaneous quantum-Hall response, although metallic, was observed in twisted monolayer–monolayer graphene (TBG) with \({C}_{2}\)-breaking substrate potentials13,19,24,27,28,29.

Third, we investigate the nature of the superconducting phase by highlighting that the valley degree of freedom, which behaves as a pseudospin, allows for exotic pairing possibilities that are relatively rare in other materials. In particular, we show that spin-triplet, valley-singlet pairing, which is momentum-independent within each valley, is favored. We investigate the consequences of such a scenario and show that it can be used to explain the measured dependence of \({T}_{{\rm{c}}}\) on in-plane field20.

Results

Single-particle physics

We consider a system consisting of two AB-stacked graphene bilayers twisted relative to AB–AB stacking by a small angle \(\theta\), illustrated in Fig. 1. For a detailed discussion of the Hamiltonian and model parameters, see the Methods section. The bottom layer of the top BLG and the top layer of the bottom BLG are coupled via Moiré hopping between \(AA\) and \(AB\) sites, parametrized by \(({w}_{0},{w}_{1})\)21,22. In the original Bistritzer–MacDonald model, \({w}_{0}\) and \({w}_{1}\) are taken to be equal30. However, in a realistic twisted model, the ratio \(r\equiv {w}_{0}/{w}_{1}\) is smaller than one due to the lattice relaxation, which expands (shrinks) AB (AA) regions. In TBG, \(r\) is taken to be ~0.75 for the first magic angle21,22. Here, we similarly include lattice relaxation by taking \(r \, < \, 1\). This is crucial for the existence of a gap between the first and second conduction (valence) bands in TDBG, which is necessary to explain the band insulator at \(\nu =\pm 4\) filling. In this work, we take \(({w}_{0},\,{w}_{1})=(88,100)\,{\rm{meV}}\), corresponding to \(r=0.88\). For different values of \(({w}_{0},\,{w}_{1})\), we obtained qualitatively similar features (Methods).

Fig. 1

Twisted double BLG model (AB–AB stacking) with the gating voltage \(U\) across the system. Throughout this work, we assume the voltage drop across the layers is uniform, \({U}_{i}-{U}_{i+1}=U/3\).

Unlike TBG, a realistic description of TDBG does not exhibit magic-angle physics, whose origin is the vicinity to a chiral symmetric model with perfectly flat bands at specific angles31,32. In the quadratic approximation of the bilayer-graphene dispersion, the first conduction and valence bands in TDBG become almost perfectly flat at the angle \(\theta \approx 1.0{5}^{\circ }\) (ref. 24). However, once trigonal warping (\({\gamma }_{3}\)) and particle–hole asymmetry (\({\gamma }_{4}\)) terms are included, the flat bands acquire a significant dispersion and overlap with each other (Fig. 2a, b). These bands can only be separated by applying a strong enough gate voltage between the top and bottom layers (Fig. 2c). Using numerical simulations, we identify the parameter space of twist angle \(\theta\) and applied voltage \(U\) where the first conduction band is isolated (Fig. 3a). On the other hand, we find that there is barely any regime where the first valence band is isolated (Fig. 3c). Such a particle–hole asymmetry in the band structure originates from the \({\gamma }_{4}\) and \(\Delta\) terms. The results are consistent with the experimental findings20, showing that the system at charge neutrality remains metallic unless a rather large vertical electric field is applied. Furthermore, a correlated insulating phase is only observed on the electron-doped side, consistent with the theoretically expected particle–hole asymmetry. Note that the band is not as flat as that of magic-angle TBG. However, the bandwidth is still small compared with the interaction scale, which implies that strongly correlated physics can still arise. Indeed, there is some debate regarding the bandwidth of magic-angle TBG itself, with reported bandwidths ranging from 10 to 40 meV33.

Fig. 2

Moiré band structures of TDBG. a, b At \((\theta ,U)=(1.0{5}^{\circ },0)\). Solid (dotted) line represents the band originated from \({{\bf{K}}}_{+}\) (\({{\bf{K}}}_{-}\)) valley. Red, blue and black represent conduction, valence, and the other bands, respectively. a The band structure for the idealized model with only \({\gamma }_{0}\) and \({\gamma }_{1}\) being nonzero. The flat band is observed with the bandwidth \(0.25\) meV. b The band structure for the realistic model with overlapping bands. The “magic angle” does not exist in this case. c Moiré band structure at \((\theta ,\,U)=(1.3{3}^{\circ },60\,{\rm{meV}})\). The first conduction band (red) is isolated and relatively flat.

Fig. 3

Summary of single-particle calculations of TDBG. a Isolation region for the first conduction band (colored) with the bandwidth indicated by the color. We observe two separate isolation regions for \(\theta\) smaller or larger than \(1.{1}^{\circ }\). The former is not very robust and is sensitive to fine-tuning of parameters, whereas the latter is very robust and is associated with a valley Chern number of 2 (see b). b The Chern number of the first conduction band from the \({{\bf{K}}}_{+}\) valley. Note that the Chern number is defined as long as a direct bandgap is present. c A schematic plot of the insulating (black) regions and the regions where the first conduction/valence band is isolated (red/blue) in TDBG at \(\theta =1.3{3}^{\circ }\). The red dot is the charge neutrality point (CNP). In the shaded region, strongly correlated physics is expected near integer fillings. Asymmetry between electron and hole doping is predicted from the theory. d–g Color plots of the \(g\)-factors associated with orbital magnetic effects \({g}_{+}^{x}({\bf{k}})\), \({g}_{+}^{y}({\bf{k}})\), \({g}_{+}^{z}({\bf{k}})\), and the single-particle dispersion \({\xi }_{+}({\bf{k}})\) over the Moiré Brillouin zone for the first conduction band at \((\theta ,U)=(1.3{3}^{\circ },60\ {\rm{meV}})\), where the band is isolated. \({g}^{x,y,z}({\bf{k}})\) are in units of \({\mu }_{{\rm{B}}}\), and \(\xi ({\bf{k}})\) is in units of meV. Both \({g}^{x}\) and \({g}^{y}\) vanish at the high-symmetry points \(\Gamma\), \({K}_{1}\), and \({K}_{2}\).

Another crucial difference compared with TBG is the absence of twofold rotational symmetry, which protects the Dirac points in TBG. As a result, the physics of TDBG is controlled by a single narrow band (per spin per valley) rather than two as in TBG. The TDBG Hamiltonian has the following symmetries: (i) threefold rotation symmetry \({C}_{3}\), (ii) time-reversal symmetry \({\mathcal{T}}\), (iii) mirror reflection about the \(x\)-axis \({M}_{y}\), which only exists in the absence of a vertical electric field, and (iv) SU(2) spin-rotation symmetry. Finally, we assume that in the small-angle limit, there is a valley-charge-conservation symmetry \(U{(1)}_{{\rm{v}}}\), arising from the decoupling of Moiré and atomic lattice-scale physics.

In addition, the conduction band within each valley carries a nonzero Chern number. In ordinary condensed matter systems, \({\mathcal{T}}\)-symmetry forbids the existence of Chern bands. However, in Moiré systems, Chern bands carrying opposite Chern numbers for opposite valleys can arise due to the valley decoupling. The overall system still satisfies \({\mathcal{T}}\)-symmetry, which exchanges the two valleys. Therefore, spontaneous valley polarization would lead to a Chern band without explicitly breaking \({\mathcal{T}}\)-symmetry13,24,26,29. At \(U=0\), the reflection symmetry \({M}_{y}\) enforces \(C=0\) for both valleys. At \(U \, \ne \, 0\), the conduction band develops a nonvanishing Chern number, computed numerically in Fig. 3b, which is equal to \(\pm 2\) for the parameter region corresponding to band isolation. The evolution of the Chern number as a function of \(U\) is further confirmed using symmetry indicators (Methods). This can also be understood from the well-known behavior of an AB-stacked bilayer graphene under an electric field. Under the electric field, the bilayer graphene becomes gapped and accumulates opposite Berry curvatures at the \({{\bf{K}}}_{+}\) and \({{\bf{K}}}_{-}\) valleys, which amounts to a Chern number \({C}_{{\rm{v}}}=\pm 2\) for each valley34,35,36,37.

Finally, we discuss the effect of an applied magnetic field, which influences the single-particle physics in two distinct ways. First, it couples to the electron spin via the Zeeman effect, leading to the splitting of bands with opposite spin by \(2{\mu }_{{\rm{B}}}B\). Second, it couples to the electron orbital motion, leading to modifications in the band structure. For an out-of-plane field, the orbital effect arises from the magnetic field coupling to the planar motion of the electron38,39. It leads to an energy correction of \({\mu }_{{\rm{B}}}{g}_{\tau }^{z}({\bf{k}}){B}_{z}\), with a \({\bf{k}}\)-dependent \(g\)-factor \({g}_{\tau }^{z}({\bf{k}})\) satisfying \({g}_{-\tau }^{z}(-{\bf{k}})=-{g}_{\tau }^{z}({\bf{k}})\) due to time-reversal symmetry (\(\tau\) is a valley index). As shown in Fig. 3f, \({g}_{\tau }^{z}({\bf{k}})\) can be much larger than the Zeeman effect. For an in-plane field, the orbital effect arises from coupling to the interlayer motion of electrons. For an in-plane field \({\bf{B}}\), we can choose the gauge \({\bf{A}}(z)=-z\times {\bf{B}}\), which does not depend on \(x\) or \(y\), thus preserving the Moiré translation symmetry. The resulting change in the hopping parameters is obtained by the Peierls substitution, effectively providing an additional momentum shift of \(-\frac{e}{\hslash }\frac{(l \, + \, m)d}{2}\ {e}_{z}\times {\bf{B}}\) to the hopping connecting layers \(l\) and \(m\), where \(d\) is the interlayer separation (see the Methods section). This leads to an energy correction of the form \({\mu }_{{\rm{B}}}({g}_{\tau }^{x}({\bf{k}}){B}_{x}+{g}_{\tau }^{y}({\bf{k}}){B}_{y})\) to leading order in \({\bf{B}}\), with \({g}_{-\tau }^{x,y}(-{\bf{k}})=-{g}_{\tau }^{x,y}({\bf{k}})\). The orbital effect due to the in-plane field amounts to a very small relative momentum shift \(\sim \frac{eda}{\hslash }\approx 1{0}^{-5}\). However, it cannot be neglected since it is of the same order of magnitude as the Zeeman effect, \(\frac{e{v}_{F}d}{{\mu }_{{\rm{B}}}} \sim 1\) (see Fig. 3d, e). In general, the in-plane orbital contribution changes the band dispersion due to its \({\bf{k}}\)-dependence, whereas the Zeeman effect shifts the entire band uniformly. Moreover, it acts oppositely for different valleys. These properties can be crucial in understanding the effect of in-plane field on the insulating gap and the superconducting temperature (see the Methods section and Supplementary Note 6).
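The comparison between the in-plane orbital coupling and the Zeeman coupling can be checked with a short numerical estimate. The sketch below evaluates \(e{v}_{F}d/{\mu }_{{\rm{B}}}\), which reduces to \(2{m}_{e}{v}_{F}d/\hslash\) because \({\mu }_{{\rm{B}}}=e\hslash /2{m}_{e}\). The Fermi velocity and interlayer separation used here are typical graphene values assumed for illustration; they are not quoted in the text, and the function name is hypothetical.

```python
# Physical constants in SI units (CODATA values).
M_E = 9.1093837015e-31   # electron mass, kg
HBAR = 1.054571817e-34   # reduced Planck constant, J s


def orbital_to_zeeman_ratio(v_f: float, d: float) -> float:
    """Return e*v_F*d/mu_B, equal to 2*m_e*v_F*d/hbar since mu_B = e*hbar/(2*m_e)."""
    return 2.0 * M_E * v_f * d / HBAR


# Assumed typical graphene values (illustrative, not taken from the text).
V_F = 1.0e6     # Fermi velocity, m/s
D = 3.35e-10    # interlayer separation, m

ratio = orbital_to_zeeman_ratio(V_F, D)
print(f"e v_F d / mu_B ~ {ratio:.1f}")
```

With these assumed inputs the ratio comes out as a number of order unity, which is the sense in which the in-plane orbital effect is comparable to the Zeeman effect.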

Correlated insulating states

In the band isolation regime, the first conduction band carries a nonzero Chern number, as shown in Fig. 3a, b, which prevents the existence of exponentially localized Wannier functions40. As a result, one cannot construct a Hubbard model for the band unless valley symmetry is broken, or the model is enlarged to include more bands so that the net Chern number is zero. Instead of seeking a complicated real-space description, we discuss the interaction effects in momentum space, as in the case of quantum-Hall ferromagnetism. One major consequence of the absence of localized Wannier orbitals is the inadequacy of the Mott picture, where the insulating phase is driven by strong repulsion between localized orbitals. Thus, we will use the term “correlated insulator” to refer to the interaction-driven insulating phase in what follows.

To uncover the nature of the possible correlated insulating states at half- and quarter-filling20, we perform a self-consistent Hartree–Fock mean-field calculation similar to the one employed in refs. 8,24. Below, we sketch the derivation from the microscopic theory, relegating most details to Supplementary Notes 2 and 3. The interacting Hamiltonian in momentum space is given by

$${{\mathcal{H}}}_{{\rm{int}}}=\frac{1}{2\ {\rm{Vol}}}\sum _{q}\hat{\rho }(q)V(q)\hat{\rho }(-q),$$
(1)

where \(V(q)\) is the Fourier-transformed screened Coulomb interaction41,42. Since the screening coming from the distance between the system and the gate is comparable with the Moiré length scale, the screening length can be important for the interaction effects. The density \(\hat{\rho }(q)\) consists of an intravalley part \({\rho }^{+} \sim {c}_{\pm }^{\dagger }{c}_{\pm }\) and an intervalley part \({\rho }^{-} \sim {c}_{\pm }^{\dagger }{c}_{\mp }\), where \({c}_{\pm }^{\dagger }\) is the electron creation operator for \({{\bf{K}}}_{\pm }\) valley. The latter contribution arises from the small coupling between opposite valleys and gives rise to an intervalley Hund’s coupling term.

The resulting Hamiltonian consists of two parts, \({{\mathcal{H}}}_{{\rm{int}}}={{\mathcal{H}}}_{0}+{{\mathcal{H}}}_{J}\), where \({{\mathcal{H}}}_{0}\) contains the coupling between intravalley densities \({\rho }^{+}{\rho }^{+}\), whereas \({{\mathcal{H}}}_{J}\) contains the coupling between intervalley densities \({\rho }^{-}{\rho }^{-}\). A rough estimate of the energy scales of \({{\mathcal{H}}}_{0}\) and \({{\mathcal{H}}}_{J}\) gives \({V}_{0} \sim 35\ {\rm{meV}}\) and \(J \sim 0.6\ {\rm{meV}}\), respectively, for the experimentally relevant regime. Although \({{\mathcal{H}}}_{J}\) is significantly smaller than \({{\mathcal{H}}}_{0}\), it breaks the symmetry of the model down from two independent SU(2) spin-rotation symmetries, one for each valley, to a single SU(2). Thus, it can lift the degeneracy between symmetry-breaking states that are degenerate at the level of \({{\mathcal{H}}}_{0}\). Indeed, we found that \({{\mathcal{H}}}_{J}\) favors spin alignment between opposite valleys and can be written in the form of an intervalley Hund’s coupling, as in ref. 24.

Within the self-consistent Hartree–Fock mean-field theory, we consider the order parameter defined as

$$\langle {c}_{\sigma ,\tau }^{\dagger }({\bf{k}}){c}_{\sigma ^{\prime} ,\tau ^{\prime} }({\bf{k}}^{\prime} )\rangle ={M}_{\sigma \tau ,\sigma ^{\prime} \tau ^{\prime} }({\bf{k}}){\delta }_{{\bf{k}},{\bf{k}}^{\prime} }.$$
(2)

For a gapped phase, the matrix \(M({\bf{k}})\) must be a projector, i.e., \(M{({\bf{k}})}^{2}=M({\bf{k}})\), satisfying \({\rm{tr}}\ M({\bf{k}})=\nu\) for all \({\bf{k}}\). Given that there are four flavors of fermions due to spin (\(\sigma\)) and valley (\(\tau\)) degeneracies, any possible order parameter \(M\) can be expanded in terms of the generators of SU(4) \({\sigma }_{i}\otimes {\tau }_{j}\), which can be grouped based on their symmetry breaking into five categories: (i) \(\{{\sigma }_{0}{\tau }_{z}\}\) only breaks \({\mathcal{T}}\) and corresponds to a valley-polarized (VP) state, (ii) \(\{{\sigma }_{x,y,z}{\tau }_{0}\}\) breaks spin-rotation symmetry and corresponds to a spin-polarized (SP) state, (iii) \(\{{\sigma }_{x,y,z}{\tau }_{z}\}\) breaks both spin-rotation and time-reversal symmetry (but preserves some combination of the two) and corresponds to a spin-valley locked (SVL) state, (iv) \(\{{\sigma }_{0}{\tau }_{x,y}\}\) breaks \(U{(1)}_{{\rm{v}}}\) valley-charge conservation and corresponds to an intervalley coherent (IVC) state, and (v) \(\{{\sigma }_{x,y,z}{\tau }_{x,y}\}\) breaks both spin-rotation symmetry and \(U{(1)}_{{\rm{v}}}\) valley-charge conservation and corresponds to a spin-IVC locked (SIVCL) state (see Table 1). We note that any of these orders may break or preserve \({C}_{3}\) symmetry depending on its \({\bf{k}}\) dependence.
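The projector conditions and the five symmetry classes can be illustrated with a minimal numerical sketch. It assumes a \({\bf{k}}\)-independent order parameter of the form \(M=(1+Q)/2\), with \(Q\) a single representative generator \({\sigma }_{i}\otimes {\tau }_{j}\) from each class; this simplification and all names in the code are illustrative choices, not the Hartree–Fock solution of the paper. The script checks \({M}^{2}=M\), \({\rm{tr}}\ M=2\) (half-filling), and whether \(M\) commutes with the valley-charge generator \({\tau }_{z}\).

```python
import numpy as np

# Pauli matrices; index order of the 4x4 flavor space is spin (x) valley.
s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)


def flavor(sigma, tau):
    """Build sigma_i (x) tau_j acting on the spin-valley space."""
    return np.kron(sigma, tau)


# One representative generator Q per class (illustrative choice of axes).
orders = {
    "VP": flavor(s0, sz),
    "SP": flavor(sz, s0),
    "SVL": flavor(sz, sz),
    "IVC": flavor(s0, sx),
    "SIVCL": flavor(sz, sx),
}
tau_z = flavor(s0, sz)  # generator of U(1)_v valley-charge rotations

summary = {}
for name, q in orders.items():
    m = 0.5 * (np.eye(4) + q)                   # candidate order parameter
    is_projector = bool(np.allclose(m @ m, m))  # M^2 = M
    filling = int(round(float(np.trace(m).real)))  # tr M = nu
    keeps_u1v = bool(np.allclose(m @ tau_z, tau_z @ m))
    summary[name] = (is_projector, filling, keeps_u1v)
    print(f"{name:6s} projector={is_projector} nu={filling} U(1)_v kept={keeps_u1v}")
```

In this toy setting the VP, SP, and SVL states commute with \({\tau }_{z}\), while the IVC and SIVCL states do not, matching the classification given in the text.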

Table 1 Symmetry broken states and the remaining symmetries for all possible translation-symmetric gapped states at \({\mathbf{\nu}} =1,\,2,\,3\).

The results of the self-consistent Hartree–Fock calculation are summarized in the following (Supplementary Note 3). Restricting ourselves to translation-symmetric gapped states, we find there are five options: SP, VP, SVL, IVC, and SIVCL at half-filling \(\nu =2\) and three options: spin-valley-polarized (SVP), spin-polarized-IVC (SPIVC), and spin-valley-locked-IVC (SVLIVC) at quarter-filling \(\nu =1,\,3\), as in Table 1. By solving the Hartree–Fock self-consistency condition, the ground-state energy \(E\) and the correlation gap \(\Delta\) are computed for different states (Fig. 4a). Let us first consider what happens in the absence of intervalley Hund’s coupling. In this case, we find that the SP and SVL states at half-filling and similarly the SPIVC and SVLIVC states at quarter-filling are exactly degenerate since they are related by a spin rotation in one of the valleys. Similarly, due to the enlarged symmetry of the mean-field Hamiltonian, the SP and VP states and the IVC and SIVCL states have the same energy. Thus, we only need to numerically investigate the competition between SP and IVC at half-filling and SVP and SPIVC at quarter-filling. The result of such numerical investigation is shown in Fig. 4a, where we clearly see that SP has a lower energy than that of the IVC in most of the parameter regime. Similar results apply for the competition between SVP and SPIVC at quarter-filling. The correlation-induced gap \(\Delta\) for the SP state in the band isolation region ranges between 4 and 8 meV (see Fig. 4b).

Fig. 4

The results of the Hartree–Fock calculation. a Color plot (meV) of \({E}_{{\rm{IVC}}}-{E}_{{\rm{VP}}}\) per electron. b Color plot of the self-consistent gap \({\Delta }_{{\rm{SP/VP}}}\) for the SP/VP state in the band-isolated region (no \(J\)-term included). c, d Effect of the intervalley Hund’s coupling (\(J\)-term) on the gap for spin- and valley-polarized phases at half- and quarter-filling, respectively. At half-filling, the \(J\)-term increases (decreases) \({\Delta }_{{\rm{SP}}}\) (\({\Delta }_{{\rm{VP}}}\)). At quarter-filling, the \(J\)-term reduces the gap to the next excited state, making the quarter-filled insulator (SP + VP) less stable than the half-filled (SP) one. e The correlated gap \(\Delta\) for half-filling insulators (SP, VP) as a function of the in-plane \({B}_{x}\)-field at \((\theta ,\,U)=(1.3{3}^{\circ },\,60\,{\rm{meV}})\). Solid lines are for the SP state and dotted lines for the VP state. The Zeeman effect would increase (decrease) \(\Delta\) for the SP (VP) state with increasing \(B\). The valley orbital effect \({g}^{x,y}({\bf{k}})\) leads to a linear decrease in the gap with field, thus effectively decreasing (increasing) the \(g\)-factor for the SP (VP) state.

To understand the reason why IVC order is energetically unfavorable, we can employ the argument of ref. 29 as follows. IVC order between two valleys with opposite Chern number \(C\) is equivalent after a particle–hole transformation in one of the valleys to superconducting pairing between bands with the same Chern number i.e., a superconductor in a background magnetic field. This means that the order parameter necessarily includes \(| C|\) vortices within the Brillouin zone leading to increased energy. A more detailed analytic treatment of the energy competition between SP and IVC is provided in the Supplementary Note 4.

The inclusion of the effect of intervalley Hund’s coupling alters the competition between the phases as follows. First, since the term is ferromagnetic, it lowers the energy of the SP state, favoring the SP state over the VP-state, which is in turn favored over the SVL-state. Second, it lowers the energy of the filled bands for the SP state at half-filling, thus increasing \({\Delta }_{{\rm{SP}}}\). On the other hand, it reduces the energy of some of the empty bands for the VP-state, reducing \({\Delta }_{{\rm{VP}}}\) (see Fig. 4c, e). The Hund’s coupling term similarly reduces \({\Delta }_{{\rm{SVP}}}\) at quarter-filling by lowering the energy of one of the excited states (see Fig. 4d). We note here that the reduction of the correlated gap at quarter-filling relative to that at half-filling may explain why the former is more difficult to observe experimentally compared with the latter and requires the application of a magnetic field20.

In the presence of an in-plane field, the gap of the SP-phase at half-filling is expected to grow with a slope consistent with the Zeeman \(g=2\) factor. However, the orbital effect discussed earlier leads to a reduction in the effective \(g\)-factor by 20–50% depending on the band structure details (Fig. 4e), which is in agreement with the experimental data20. From the numerical calculation, we confirmed that such a reduction in gap also depends on the in-plane field direction, which exhibits threefold periodicity (see the Methods section). Therefore, the orbital effect can be directly verified in a rotating in-plane field setup, where we predict the modulation of the \(g\)-factor with period \(2\pi /3\) in the angle.

Superconductivity

When the correlated insulator is doped away from half-filling, a superconducting phase is observed below 3.5 K20. Our proposed scenario for the observed superconductivity is illustrated in Fig. 5a, where pairing takes place between time-reversal partners in opposite valleys. Such intervalley pairing between time-reversal partners has also been proposed43,44,45 and observed in transition metal dichalcogenides (TMD)46. However, unlike in TMD, where strong spin–orbit coupling implies a locking between spin and valley, here the proposed pairing takes place between electrons with the same spin. To understand this, we first note that doping a spin-polarized insulator is expected to give rise to a ferromagnetic metal with a spin-split Fermi surface. Similar to other ferromagnetic metals47,48,49, ferromagnetic spin fluctuations can act as a pairing glue responsible for superconductivity50. This motivates the following simplified Hamiltonian,

$${\mathcal{H}}=\sum _{{\bf{k}},\tau ,\sigma }{c}_{\sigma ,\tau ,{\bf{k}}}^{\dagger }{\xi }_{\sigma ,\tau ,{\bf{k}}}{c}_{\sigma ,\tau ,{\bf{k}}}-g\sum _{q}{S}_{q}\cdot {S}_{-q},$$
(3)

where the spin operator \({S}_{q}^{a}={\sum }_{{\bf{k}},\tau ,\sigma ,\sigma ^{\prime} }{c}_{\sigma ,\tau ,{\bf{k}}+q}^{\dagger }{{\boldsymbol{\sigma }}}_{\sigma ,\sigma ^{\prime} }^{a}{c}_{\sigma ^{\prime} ,\tau ,{\bf{k}}}\). This Hamiltonian can be obtained within an RPA treatment by identifying the ferromagnetic order as the leading instability in the doped itinerant phase. The ferromagnetic susceptibility is peaked at \(q=0\), which justifies a \({\bf{k}}\)-independent coupling.

Fig. 5

Spin-triplet superconductivity. a Triplet pairing between opposite valleys, \({c}_{\sigma ,+}({\bf{k}})\) and \({c}_{\sigma ,-}(-{\bf{k}})\), with exact energy match. b Schematic plot of \({T}_{{\rm{c}}}\) as a function of \(B\)-field.

Next, we consider the simplest possible intervalley superconducting pairing function \(\Delta\), which is \({\bf{k}}\)-independent (\(s\)-wave) within each valley. Note, however, that the overall orbital symmetry incorporating both momentum and valley may still be anti-symmetric, e.g., \(p\)-wave. For the proposed pairing, \(\Delta\) is proportional to \({\tau }_{x}\) or \({\tau }_{y}\) corresponding to valley triplet or singlet, respectively. The overall antisymmetry of \(\Delta\) implies that the former scenario corresponds to a spin-singlet \(i{\sigma }_{y}\), whereas the latter corresponds to a spin triplet \(i{\sigma }_{y}d\cdot {\boldsymbol{\sigma }}\). Here, \(d\) is the vector which captures the direction of the spin state. To see which of these is the dominant pairing channel, it is useful to decouple the interaction in the pairing channel as

$${{\mathcal{H}}}_{{\rm{int}}}=-g\sum _{{\bf{k}},q}{\rm{tr}}\ ({\boldsymbol{\sigma }}{\Delta }_{{\bf{k}}})\cdot ({{\boldsymbol{\sigma }}}^{T}{\Delta }_{{\bf{k}}+q}^{\dagger })$$
(4)

We now assume a \({\bf{k}}\)-independent \(\Delta\), decompose it into spin-singlet/valley-triplet \({\Delta }_{{\rm{s}}}\) and spin-triplet/valley-singlet \({\Delta }_{{\rm{t}}}\) components, and use

$${\boldsymbol{\sigma }}\cdot ({\Delta }_{t,s}{{\boldsymbol{\sigma }}}^{T})={\lambda }_{t,s}{\Delta }_{t,s},$$
(5)

where \({\lambda }_{{\rm{t}}}=1\) and \({\lambda }_{{\rm{s}}}=-3\). This means that the interaction is repulsive in the singlet channel and attractive in the triplet channel, making the latter the dominant pairing channel. A more detailed discussion of these pairing channels within the linearized BCS equation is provided in Supplementary Note 5.
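The channel eigenvalues in Eq. (5) can be verified numerically. The sketch below evaluates \({\sum }_{a}{\sigma }^{a}\Delta {({\sigma }^{a})}^{T}\) for the spin part of the singlet, \(i{\sigma }_{y}\), and of the triplet, \(i{\sigma }_{y}d\cdot {\boldsymbol{\sigma }}\), and extracts the proportionality constant. The particular \(d\) vector and the helper name are arbitrary choices made for illustration.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [sx, sy, sz]
i_sy = 1j * sy


def channel_eigenvalue(delta):
    """Return lambda such that sum_a sigma^a . delta . (sigma^a)^T = lambda * delta."""
    out = sum(s @ delta @ s.T for s in paulis)   # .T is a plain transpose
    lam = np.vdot(delta, out) / np.vdot(delta, delta)
    # Guard: the result must really be proportional to delta.
    assert np.allclose(out, lam * delta)
    return float(lam.real)


# Spin singlet: Delta_s proportional to i*sigma_y.
lam_s = channel_eigenvalue(i_sy)

# Spin triplet: Delta_t = i*sigma_y (d . sigma) for an arbitrary real d vector.
d = np.array([0.3, -1.2, 0.5])
delta_t = i_sy @ (d[0] * sx + d[1] * sy + d[2] * sz)
lam_t = channel_eigenvalue(delta_t)

print(f"lambda_s = {lam_s:+.1f}, lambda_t = {lam_t:+.1f}")
```

Since the decoupled interaction in Eq. (4) carries an overall factor \(-g\), a positive eigenvalue corresponds to attraction and a negative one to repulsion.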

We highlight here that spin-triplet pairing is only known to occur in liquid \({}^{3}\)He (ref. 51) and a few uranium compounds47,48,49, as it requires pairing that varies over the Fermi surface (e.g., \(p\)-wave), which is likely to be energetically unfavorable in typical solids. The existence of the valley degree of freedom here enables us to evade this difficulty and obtain a spin-triplet valley-singlet order parameter even for a \({\bf{k}}\)-independent interaction.

The experimental consequences of the proposed spin-triplet valley-singlet superconductivity can be investigated by writing the Ginzburg–Landau free energy for the order parameter \(\Delta ={\tau }_{y}{\sigma }_{y}d\cdot {\boldsymbol{\sigma }}\) in the presence of a magnetic field \({\bf{B}}\). Restricting ourselves to terms up to quartic order in \(d\) or \({\bf{B}}\), we can write the following free energy functional

$$F= \,\kappa [(T-{T}_{{\rm{c}}}+b{({\mu }_{{\rm{B}}}{\bf{B}})}^{2})d\cdot {d}^{* }+ia{\mu }_{{\rm{B}}}{\bf{B}}\cdot (d\times {d}^{* })\\ + \, c{\mu }_{{\rm{B}}}^{2}| {\bf{B}}\cdot d{| }^{2}+\alpha {(d\cdot {d}^{*})}^{2}+\eta | d\cdot d{| }^{2}]$$
(6)

Detailed microscopic derivation of the coefficients \(a,b,c,\kappa ,\alpha ,\eta\) is provided in the Supplementary Note 6. In the absence of spin–orbit coupling, the order parameter’s spin is expected to align with the magnetic field. Assuming the magnetic field is parallel to the \(z\)-axis, \({\bf{B}}=B{e}_{z}\), we can then write

$$d=\left(\frac{{\Delta }_{\uparrow \uparrow }+{\Delta }_{\downarrow \downarrow }}{2},\frac{{\Delta }_{\uparrow \uparrow }-{\Delta }_{\downarrow \downarrow }}{2i},0\right)$$
(7)

Substituting in the free energy (6) and using the fact that \(\eta =-\alpha /2\) yields

$$F= \frac{\kappa }{2}\sum _{s=\uparrow ,\downarrow }{F}_{s}\\ {F}_{s}= | {\Delta }_{ss}{| }^{2}(T-{T}_{{\rm{c}}}-{\sigma }_{s}a{\mu }_{{\rm{B}}}B+b{({\mu }_{{\rm{B}}}B)}^{2})+\frac{\alpha }{2}| {\Delta }_{ss}{| }^{4}$$
(8)
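The reduction from Eq. (6) to Eq. (8) can be checked numerically. The sketch below assumes that the quartic terms of Eq. (6) are \(\alpha {(d\cdot {d}^{* })}^{2}+\eta | d\cdot d{| }^{2}\), which is what a truncation at quartic order in \(d\) requires, and uses \(\eta =-\alpha /2\). Temperatures and fields are in arbitrary units with \({\mu }_{{\rm{B}}}\) absorbed into the field; all numerical values and function names are illustrative.

```python
import numpy as np


def d_vector(up, dn):
    """d vector of Eq. (7) built from the equal-spin gaps Delta_upup and Delta_dndn."""
    return np.array([(up + dn) / 2, (up - dn) / (2j), 0], dtype=complex)


def free_energy_eq6(d, field, t, a, b, c, alpha, eta, kappa=1.0):
    """Eq. (6) with t = T - T_c and field = mu_B * B (3-vector); quartic terms squared."""
    dd_star = np.vdot(d, d).real                 # d . d*
    dd = np.dot(d, d)                            # d . d (no conjugation)
    cross = np.cross(d, np.conj(d))              # d x d*
    quadratic = (
        (t + b * np.dot(field, field)) * dd_star
        + (1j * a * np.dot(field, cross)).real
        + c * abs(np.dot(field, d)) ** 2
    )
    quartic = alpha * dd_star ** 2 + eta * abs(dd) ** 2
    return kappa * (quadratic + quartic)


def free_energy_eq8(up, dn, bz, t, a, b, alpha, kappa=1.0):
    """Eq. (8): two decoupled spin species with sigma_s = +1 (up) and -1 (down)."""
    total = 0.0
    for delta, sign in ((up, +1), (dn, -1)):
        total += abs(delta) ** 2 * (t - sign * a * bz + b * bz ** 2)
        total += 0.5 * alpha * abs(delta) ** 4
    return 0.5 * kappa * total


# Arbitrary illustrative values.
up, dn = 0.7 + 0.2j, -0.3 + 0.5j
t, a, b, c, alpha, bz = -0.4, 0.9, 0.15, 0.6, 1.3, 0.8
f6 = free_energy_eq6(d_vector(up, dn), np.array([0.0, 0.0, bz]),
                     t, a, b, c, alpha, eta=-alpha / 2)
f8 = free_energy_eq8(up, dn, bz, t, a, b, alpha)
print(f"Eq. (6): {f6:.12f}   Eq. (8): {f8:.12f}")
```

The two expressions agree because \(d\cdot {d}^{* }=(| {\Delta }_{\uparrow \uparrow }{| }^{2}+| {\Delta }_{\downarrow \downarrow }{| }^{2})/2\) and \(d\cdot d={\Delta }_{\uparrow \uparrow }{\Delta }_{\downarrow \downarrow }\), so the cross term \(| {\Delta }_{\uparrow \uparrow }{| }^{2}| {\Delta }_{\downarrow \downarrow }{| }^{2}\) cancels when \(\eta =-\alpha /2\), and the \(c\) term vanishes because \({d}_{z}=0\).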

One important feature is that \(\alpha \, > \, 0\) which implies the stability of the phase considered.

The free energy (8) leads to the following dependence of the superconducting \({T}_{{\rm{c}}}\) on the applied field

$${T}_{c,\uparrow /\downarrow }(B)={T}_{{\rm{c}}}\pm a{\mu }_{{\rm{B}}}B-b{({\mu }_{{\rm{B}}}B)}^{2}.$$
(9)
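As a short worked consequence of Eq. (9), and treating \(a \, > \, 0\) and \(b \, > \, 0\) as field-independent constants (an assumption of this sketch), the majority-spin branch is a downward parabola in \({\mu }_{{\rm{B}}}B\). Its maximum and the field \({B}_{0}\) at which it vanishes follow from elementary algebra:

$$\frac{\partial {T}_{c,\uparrow }}{\partial ({\mu }_{{\rm{B}}}B)}=a-2b{\mu }_{{\rm{B}}}B=0\ \Rightarrow \ {\mu }_{{\rm{B}}}{B}^{* }=\frac{a}{2b},\qquad {T}_{c,\uparrow }({B}^{* })={T}_{{\rm{c}}}+\frac{{a}^{2}}{4b},$$

$${T}_{c,\uparrow }({B}_{0})=0\ \Rightarrow \ {\mu }_{{\rm{B}}}{B}_{0}=\frac{a+\sqrt{{a}^{2}+4b{T}_{{\rm{c}}}}}{2b},$$

which reduces to \({\mu }_{{\rm{B}}}{B}_{0}=\sqrt{{T}_{{\rm{c}}}/b}\) in the limit \(a\to 0\).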

The most remarkable feature of this result is that, for nonzero \(a\), \({T}_{{\rm{c}}}\) initially increases upon the application of magnetic field. This can be understood as follows: for a ferromagnetic metal with weakly spin-split Fermi surfaces, the application of the Zeeman field increases (decreases) the density of states for the majority (minority) spin Fermi surface, leading to a linear increase in \({T}_{{\rm{c}}}\) for the majority spin with the coefficient

$$a=2\chi {T}_{{\rm{c}}}\frac{N^{\prime} (0)}{N(0)}\mathrm{ln}\frac{\Lambda }{{T}_{{\rm{c}}}}$$
(10)

where \(\Lambda\) is the bandwidth, \(N(0)\) is the density of states at the Fermi energy, and \(\chi\) is the dimensionless magnetic susceptibility (Supplementary Note 6). A similar linear field dependence of \({T}_{{\rm{c}}}\) is known in superfluid \({}^{3}\)He (ref. 51), indicating independent pairing for each spin species. This behavior is in stark contrast to the monotonic decrease of \({T}_{{\rm{c}}}\) under increasing \(B\)-field in a spin-singlet superconductor. One crucial observation here is that \(a\) depends on several details and is expected to be very small, since \({T}_{{\rm{c}}}\ll \frac{N(0)}{N^{\prime} (0)} \sim {\epsilon }_{{\rm{F}}}\). Surprisingly, the measured value of \(a\) is of order 1 (ref. 20), which suggests the vicinity of a quantum critical point where the scaling of the susceptibility cancels against the other parameters. Indeed, the scaling \(\chi \sim {\epsilon }_{{\rm{F}}}/(T{\mathrm{log}}T)\) predicted by Hertz–Millis theory in the quantum critical regime for an itinerant ferromagnet52,53 leads to such a cancellation, resulting in \(a \sim 1\).

The origin of the quadratic term in Eq. (9) can be understood in terms of the in-plane orbital effect discussed in Sec. IA. First, note that Zeeman splitting cannot break Cooper pairs between aligned spins. Instead, it yields an initial linear increase in \({T}_{{\rm{c}}}({\bf{B}})\) followed by saturation at large fields when all the spins are aligned. On the other hand, the in-plane orbital effect can induce pair breaking by mismatching the energies of time-reversal partner states in opposite valleys, resulting in a quadratic decrease in \({T}_{{\rm{c}}}\) with the applied field whose coefficient is given by (see Supplementary Note 6)

$$b=\frac{1}{{T}_{{\rm{c}}}} \int_{{\rm{FS}}}d{\bf{k}}{({e}_{{\bf{B}}}\cdot {g}_{+,{\bf{k}}})}^{2}$$
(11)

where \({e}_{{\bf{B}}}\) is the unit vector along the external magnetic field. The average value of \({({e}_{{\bf{B}}}\cdot {g}_{+,{\bf{k}}})}^{2}\) over the Fermi surface depends strongly on the filling and the field direction, with a typical value around 1 (cf. Fig. 3d–f). Using this value, we can roughly estimate the in-plane field needed to destroy superconductivity as \({\mu }_{{\rm{B}}}{B}_{{\rm{c}}} \sim \sqrt{{T}_{{\rm{c}}}/b}\), yielding a value of about 3 T, which compares favorably to the experimental value20. Furthermore, if we consider an out-of-plane field instead, \(| {g}_{z}|\) is on average ~1–2 orders of magnitude larger than \(| {g}_{x,y}|\), yielding a critical field of \(\sim 0.1\) T, which is very close to the experimentally observed result20.
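
The field dependence in Eq. (9) can be explored with a short numerical sketch. The function names and the numbers below are illustrative assumptions, not values fitted to any experiment; energies are measured in units where \({k}_{{\rm{B}}}=1\), and `x` stands for \({\mu }_{{\rm{B}}}B\).

```python
import math

def tc_of_field(x, tc0, a, b, spin=+1):
    """Eq. (9): T_c for majority (spin=+1) or minority (spin=-1) pairing.

    x = mu_B * B in the same energy units as tc0; a is dimensionless, b is 1/energy.
    """
    return tc0 + spin * a * x - b * x * x

def peak_field(a, b):
    """Value of mu_B*B at which the majority-spin T_c is largest."""
    return a / (2.0 * b)

def critical_field(tc0, a, b):
    """Positive root of tc0 + a*x - b*x**2 = 0 (majority spin)."""
    return (a + math.sqrt(a * a + 4.0 * b * tc0)) / (2.0 * b)

# Illustrative numbers only (hypothetical, not taken from the experiment).
tc0, a, b = 1.0, 1.0, 0.5
x_peak = peak_field(a, b)
x_crit = critical_field(tc0, a, b)
```

For \(a=0\) the root returned by `critical_field` reduces to \(\sqrt{{T}_{{\rm{c}}}/b}\), which is the rough estimate used in the text; a nonzero \(a\) pushes the critical field of the majority-spin condensate to larger values.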

It is worth noting that the reduction of \({T}_{{\rm{c}}}\) at large field can also arise from the suppression of ferromagnetic fluctuations responsible for the pairing, as has been observed in the ferromagnetic superconductor UCoGe54. Such effects are neglected within our simplified analysis 3, which assumes a constant coupling \(g\).

Discussion

In this work, we theoretically investigated the physics of twisted double-bilayer graphene (TDBG), addressing the experimental observations of correlated insulating phases at integer fillings and the neighboring superconductor reported in ref. 20.

First, let us summarize a few important features of the band structure. Due to the absence of a \({C}_{2}\) symmetry in TDBG, isolated conduction and valence bands with nonzero valley Chern numbers can exist. Moreover, trigonal warping and particle–hole asymmetry in each bilayer graphene lead to (i) a significant broadening of each band so that they overlap in the absence of a displacement field, and (ii) asymmetry between electron- and hole-doped systems. As a result, the parameter space that can host strongly correlated physics is significantly constrained, and the tunability from displacement field at a particular filling becomes essential to realizing correlated states.

Second, we identified an important role played by the coupling of in-plane field to the orbital motion of the electron in TDBG. Despite being small compared with the bandwidth, this effect is comparable with Zeeman splitting, leading to a modified \(g\)-factor which compares favorably to the experimental value20 extracted from the slope of the half-filling gap as a function of in-plane field. Moreover, in our theory, this effect is responsible for the reduction of \({T}_{{\rm{c}}}\) under an in-plane field by providing the main pair-breaking mechanism when pairing takes place between aligned spins in opposite valleys. The resulting decrease in the superconducting \({T}_{{\rm{c}}}\) with in-plane field agrees qualitatively with the experimental results.

Furthermore, we have performed a self-consistent Hartree–Fock mean-field calculation to identify the possible symmetry broken correlated insulating states at integer fillings. Our prediction of a spin-polarized ferromagnet at half-filling is consistent with the observed increase in the gap with in-plane field.

Finally, we have proposed a pairing mechanism based on ferromagnetic fluctuations, motivated by the evidence for a ferromagnetic parent insulator. Such a mechanism leads naturally to the spin-triplet pairing suggested by experiments. In addition, we showed that the experimentally observed dependence of \({T}_{{\rm{c}}}\) on in-plane field suggests that the superconductor emerges in the vicinity of a quantum critical point.

In conclusion, our theoretically established phase diagram for twisted double-bilayer graphene captures all significant observations of the experiments reported in ref. 20. This includes single-particle features, such as the parameter range for band isolation, as well as correlation-induced features, including a ferromagnetic insulator at half-filling which leads to a spin-triplet superconductor upon doping. In addition to deepening our understanding of correlated Moiré materials, our results highlight how phases which are rare in conventional solids can be readily realized in this novel and tunable platform.

After completing this work, we became aware of two experimental papers55,56 which are consistent with ref. 20 and with the theoretical discussion presented here.

Methods

Numerical simulations for single particle

Here, we summarize the numerical methods used to calculate the single-particle physics. First, each bilayer-graphene (BLG) sheet is modeled by the following Bloch Hamiltonian:

$${h}_{{\bf{k}}}=\left(\begin{array}{cccc}{U}_{1}+\Delta &-{\gamma }_{0}f({\bf{k}})&{\gamma }_{4}{f}^{* }({\bf{k}})&{\gamma }_{1}\\ -{\gamma }_{0}{f}^{* }({\bf{k}})&{U}_{1}&{\gamma }_{3}f({\bf{k}})&{\gamma }_{4}{f}^{* }({\bf{k}})\\ {\gamma }_{4}f({\bf{k}})&{\gamma }_{3}{f}^{* }({\bf{k}})&{U}_{2}&-{\gamma }_{0}f({\bf{k}})\\ {\gamma }_{1}&{\gamma }_{4}f({\bf{k}})&-{\gamma }_{0}{f}^{* }({\bf{k}})&{U}_{2}+\Delta \end{array}\right),$$
(12)

which is labeled in the order of \({A}_{{\rm{1}}}\), \({B}_{{\rm{1}}}\), \({A}_{{\rm{2}}}\), \({B}_{{\rm{2}}}\). Here, we consider a realistic model of BLG illustrated in Fig. 1. AB stacking means that the \(A\)-site of the first layer (\({A}_{1}\)) sits on top of the \(B\)-site of the second layer (\({B}_{2}\)). This gives a small on-site energy \(\Delta\) for these sites. Here, \(f({\bf{k}})\equiv {\sum }_{l}{e}^{-i{\bf{k}}\cdot {\delta }_{l}}\), where \({\delta }_{1}=a(0,-1)\), \({\delta }_{2}=a(-\sqrt{3}/2,1/2)\), and \({\delta }_{3}=a(\sqrt{3}/2,1/2)\) are vectors from \(B\)-site to \(A\)-sites. One can expand \(f({\bf{k}})\) near \({{\bf{K}}}_{\pm }=\pm (4\pi /3\sqrt{3}a,0)\) as

$$f({{\bf{K}}}_{\pm }+{\bf{k}})=\frac{3}{2}(\mp {k}_{x}+i{k}_{y})a,$$
(13)

where \(a\) is the distance between carbon atoms. Throughout, we will use the phenomenological parameters extracted from ref. 57

$$({\gamma }_{0},{\gamma }_{1},{\gamma }_{3},{\gamma }_{4},\Delta )=(2610,361,283,138,15)\ {\rm{meV}},$$
(14)

where \({\gamma }_{0,1,3,4}\) and \(\Delta\) are the parameters illustrated in Fig. 1. In addition, the potential difference between the top and bottom graphene layers, \(U\), is an important parameter in the experiment, which is controlled by the gate voltage difference. For a displacement field strength \(D\), a dielectric constant \(\epsilon\) of the AB–AB system, and a thickness \(d\) of the BLG/BLG system, \(U={\epsilon }^{-1}D\cdot d\).
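
As a consistency check on Eqs. (12)–(14), the following sketch builds \(f({\bf{k}})\) and the \(4\times 4\) BLG Hamiltonian numerically. The function names are hypothetical, and lengths are measured in units of the carbon–carbon distance \(a\) (an assumption made here for convenience); energies are in meV.

```python
import numpy as np

A_CC = 1.0  # carbon-carbon distance a; assumption: lengths in units of a
# Nearest-neighbor vectors delta_1, delta_2, delta_3 from the text.
DELTAS = A_CC * np.array([[0.0, -1.0],
                          [-np.sqrt(3) / 2, 0.5],
                          [np.sqrt(3) / 2, 0.5]])
# Parameters of Eq. (14), in meV.
GAMMA0, GAMMA1, GAMMA3, GAMMA4, DELTA = 2610.0, 361.0, 283.0, 138.0, 15.0

def f(k):
    """Structure factor f(k) = sum_l exp(-i k . delta_l)."""
    return np.sum(np.exp(-1j * DELTAS @ np.asarray(k, dtype=float)))

def valley_point(sign):
    """K_+ (sign=+1) or K_- (sign=-1)."""
    return sign * np.array([4 * np.pi / (3 * np.sqrt(3) * A_CC), 0.0])

def blg_hamiltonian(k, u1=0.0, u2=0.0):
    """Eq. (12) in the basis (A1, B1, A2, B2); u1, u2 are layer potentials in meV."""
    fk = f(k)
    fc = np.conj(fk)
    return np.array([
        [u1 + DELTA, -GAMMA0 * fk, GAMMA4 * fc, GAMMA1],
        [-GAMMA0 * fc, u1, GAMMA3 * fk, GAMMA4 * fc],
        [GAMMA4 * fk, GAMMA3 * fc, u2, -GAMMA0 * fk],
        [GAMMA1, GAMMA4 * fk, -GAMMA0 * fc, u2 + DELTA],
    ])

# Band energies (meV) at a small momentum away from K_+.
energies = np.linalg.eigvalsh(blg_hamiltonian(valley_point(+1) + np.array([0.01, 0.0])))
```

Exactly at \({{\bf{K}}}_{\pm }\), where \(f\) vanishes, the matrix decouples into the dimer pair \(({A}_{1},{B}_{2})\) with energies \(\Delta \pm {\gamma }_{1}\) and two zero-energy states (for \({U}_{1}={U}_{2}=0\)), which gives a quick sanity check of the site ordering.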

Next, we couple two AB-stacked bilayer-graphene sheets by Moiré hopping terms. As we are interested in the physics near the charge neutrality point, we focus on the bands originating mostly from the vicinity of the \({{\bf{K}}}_{\pm }\) points. In the continuum model approximation30, the Moiré bands from the \({{\bf{K}}}_{\pm }\) valleys decouple; for the Moiré bands from the \({{\bf{K}}}_{+}\) valley, the Hamiltonian is given by

$${H}_{+}= \,\sum _{{\bf{k}}}\Bigg[\ {h}_{\frac{\theta }{2}}^{t}({{\bf{K}}}_{+}+{\bf{k}}){c}_{{\bf{k}},+,t}^{\dagger }{c}_{{\bf{k}},+,t}^{}+{h}_{-\frac{\theta }{2}}^{b}({{\bf{K}}}_{+}+{\bf{k}}){c}_{{\bf{k}},+,b}^{\dagger }{c}_{{\bf{k}},+,b}^{}\\ +\sum _{n}\left({T}_{n}{c}_{{\bf{k}}+{q}_{n},+,b}^{\dagger }{c}_{{\bf{k}},+,t}^{\,}+{T}_{n}^{\dagger }{c}_{{\bf{k}},+,t}^{\dagger }{c}_{{\bf{k}}+{q}_{n},+,b}^{\,}\right)\Bigg],$$
(15)

where \({c}_{{\bf{k}},+,t/b}^{\dagger }\) is a four-component electron creation operator for the top/bottom bilayer with momentum \({{\bf{K}}}_{+}+{\bf{k}}\). Here, \({h}_{\theta }({\bf{k}})=h({R}_{-\theta }{\bf{k}})\) with \({R}_{\theta }\) denoting the counter-clockwise rotation matrix by angle \(\theta\) relative to the \(x\)-axis. The momenta \({q}_{0,1,2}\) are given by \({q}_{0}={R}_{\theta /2}K-{R}_{-\theta /2}K=\frac{8\pi \sin (\theta /2)}{3\sqrt{3}a}(0,-1)\), \({q}_{1}={R}_{\phi }{q}_{0}\), and \({q}_{2}={R}_{-\phi }{q}_{0}\) where \(\phi =2\pi /3\). The hopping matrices \({T}_{n}\), \(n=0,1,2\), are given by

$${T}_{n}={\left(\begin{array}{cc}0&1\\ 0&0\end{array}\right)}_{{\rm{layer}}}\otimes {({w}_{0}+{w}_{1}{e}^{2\pi in{\sigma }_{3}/3}{\sigma }_{1}{e}^{-2\pi in{\sigma }_{3}/3})}_{{\rm{sublattice}}},$$
(16)

where \({w}_{0},{w}_{1}\) are Moiré hopping parameters. One crucial parameter tunable in experiments is the interlayer potential difference \(U\) set by the displacement field. In Fig. 6, we show how the band structure evolves with increasing \(U\). One can see that the first conduction band becomes isolated in the range \(U\in [40,80]\) meV. Furthermore, to illustrate how the band isolation arises, we plot the energy gap between different bands in Fig. 7. For a smaller value of \(r\), the gapped regimes in Fig. 7a–c expand in the parameter space of \((\theta ,U)\), giving rise to a wider band isolation regime (data available upon request).
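
The momenta \({q}_{n}\) and the sublattice part of the hopping matrices \({T}_{n}\) defined above can be tabulated with a few lines of code. This is a sketch under two assumptions that are stated here rather than taken from the text: the exponents in Eq. (16) are read as \({e}^{\pm 2\pi in{\sigma }_{3}/3}\) (i.e., with a factor of \(i\), as in the standard continuum model), and \({w}_{0}\) multiplies the identity in sublattice space. The numerical values passed for `w0` and `w1` are placeholders.

```python
import numpy as np

A_CC = 1.0  # carbon-carbon distance a; assumption: lengths in units of a
SIGMA1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def rotation(angle):
    """Counter-clockwise rotation matrix R_angle."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def moire_momenta(theta):
    """q_0, q_1 = R_phi q_0, q_2 = R_{-phi} q_0 with phi = 2*pi/3 (theta in radians)."""
    q0 = 8 * np.pi * np.sin(theta / 2) / (3 * np.sqrt(3) * A_CC) * np.array([0.0, -1.0])
    phi = 2 * np.pi / 3
    return [q0, rotation(phi) @ q0, rotation(-phi) @ q0]

def sublattice_hopping(n, w0, w1):
    """Sublattice factor of T_n, assuming exponents of the form exp(+-2*pi*i*n*sigma_3/3)."""
    alpha = 2 * np.pi * n / 3
    phase = np.diag([np.exp(1j * alpha), np.exp(-1j * alpha)])  # exp(i*alpha*sigma_3)
    return w0 * np.eye(2) + w1 * (phase @ SIGMA1 @ phase.conj().T)

qs = moire_momenta(np.deg2rad(1.33))
t_matrices = [sublattice_hopping(n, w0=0.8, w1=1.0) for n in range(3)]  # placeholder w0, w1
```

With this reading, the sublattice factor equals \({w}_{0}+{w}_{1}[\cos (2\pi n/3){\sigma }_{1}+\sin (2\pi n/3){\sigma }_{2}]\), and the three momenta \({q}_{n}\) have equal length and sum to zero, as expected for the three Moiré hopping processes related by \({C}_{3}\).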

Fig. 6

The band structure of the model at \(\theta =1.3{3}^{\circ }\) and \(U=0,14,30,60,90,110\) meV. At \(U=14\) meV, a Chern number of \(3\) is exchanged between the conduction and valence bands at three momenta that do not lie on the high-symmetry cut. At \(U=30\) meV and \(U=90\) meV, by contrast, the Chern number changes by \(1\), as can be seen from the gap closing between bands at the \({K}_{2}\) and \(\Gamma\) points, respectively.

Fig. 7

The band gap (in meV) as a function of \((\theta ,U)\). Uncolored regions indicate overlapping bands. a Gap between the first conduction and valence bands. b Gap between the first and second conduction bands. c Gap between the first and second valence bands.

Chern number

In the main text, we presented the Chern numbers carried by the first Moiré conduction bands of the \({{\bf{K}}}_{\pm }\) valleys. Here, we examine the evolution of the Chern number in more detail. First, at \(U=0\), the reflection symmetry \({M}_{y}\) enforces \(C=0\) for both valleys: \({M}_{y}\) maps the system back to itself without exchanging valleys but sends \({k}_{y}\mapsto -{k}_{y}\), so the Berry curvature flips sign24. In the quadratic-band approximation of BLG, as we increase \(U\), the band inversion between the conduction and valence bands occurs at the Moiré \({K}_{2}\)-point (\({K}_{1}\) for negative \(U\)) through a quadratic band touching. Thus, a Chern number of \(\pm 2\) is exchanged.

Next, let us understand the Chern number evolution in the realistic Hamiltonian with the parameters of Eq. (14) along the dotted line in Fig. 3b. With a trigonal warping term, the quadratic band touching point splits into four Dirac cones, three with positive chirality and one with negative chirality. The three off-center Dirac cones are located at generic momenta and therefore do not appear in band plots along high-symmetry lines. In the presence of particle–hole asymmetry terms, the degeneracy between the four Dirac cones is lifted, and the band inversion happens first at the three off-center Dirac cones, exchanging a Chern number of \(\pm 3\). Then, the band inversion occurs at the center Dirac cone, exchanging a Chern number of \(\mp 1\). In total, the Chern number still changes by \(\pm 2\). At larger values of the gate voltage \(U\), a band inversion happens between the first and second conduction bands at the \(\Gamma\) point, and the Chern number then changes by \(\mp 1\) (it can change by \(\mp 2\) for other parameter settings), decreasing the Chern number.

This can be further checked by inspecting symmetry indicators58,59,60. There are three \({C}_{3}\)-invariant momenta, \(\Gamma\), \({K}_{1}\), and \({K}_{2}\). For a Bloch state at one of these momenta, the \({C}_{3}\) rotation maps the state back to itself with a rotation eigenvalue:

$${R}_{2\pi /3}\left|{\bf{k}},n\right\rangle ={e}^{2\pi i{L}_{n,{\bf{k}}}/3}\left|{\bf{k}},n\right\rangle ,\quad {\bf{k}}={K}_{1},{K}_{2},\Gamma$$
(17)

where \({L}_{n,{\bf{k}}}\) is an angular momentum associated with the Bloch state \(\left|{\bf{k}},n\right\rangle\). Then, the Chern number of the \(n\)-th band can be determined modulo 3 by

$${C}_{n}\equiv {L}_{n,\Gamma }+{L}_{n,{K}_{1}}+{L}_{n,{K}_{2}}\,{\mathrm{mod}}\,3$$
(18)

Thus, by tracking how the \({C}_{3}\) eigenvalues at the three momenta change with the gate voltage \(U\), we can understand how the Chern number transitions happen in the system. Indeed, the aforementioned scenario can be confirmed. For example, consider the first Moiré conduction band of the \({{\bf{K}}}_{+}\) valley at \(\theta =1.3{3}^{\circ }\). At \(U=0\) meV, we start with \(({n}_{\Gamma },\,{n}_{{K}_{1}},\,{n}_{{K}_{2}})=(0,\,1,\,-1)\), where \({n}_{{\bf{k}}}\) abbreviates the angular momentum \({L}_{n,{\bf{k}}}\) of Eq. (17) for this band. At \(U=14\) meV, the Chern number changes by \(+3\), which can only be captured by the Berry curvature and not by the symmetry indicators. At \(U=30\) meV, the Chern number changes by \(-1\), manifested by \({n}_{{K}_{2}}:-1 \, \mapsto \, 1\). At \(U=90\) meV, the Chern number again changes by \(-1\), manifested by \({n}_{\Gamma }:0 \, \mapsto -{\!}1\). See Fig. 6 for details.
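
The bookkeeping in this example can be reproduced with a few lines of code. The sketch below applies Eq. (18) to the sequence of indicators quoted above; the variable names are hypothetical, and the starting value \(C=0\) at \(U=0\) is the one enforced by \({M}_{y}\).

```python
def chern_mod3(l_gamma, l_k1, l_k2):
    """Eq. (18): Chern number modulo 3 from the C3 angular momenta."""
    return (l_gamma + l_k1 + l_k2) % 3

# (U in meV, indicators (Gamma, K1, K2) after the transition, change in Chern number)
history = [
    (0, (0, 1, -1), 0),
    (14, (0, 1, -1), +3),   # invisible to the indicators: +3 = 0 mod 3
    (30, (0, 1, 1), -1),    # n_{K2}: -1 -> 1
    (90, (-1, 1, 1), -1),   # n_{Gamma}: 0 -> -1
]

chern = 0
trace = []
for u, indicators, change in history:
    chern += change
    trace.append((u, chern, chern_mod3(*indicators)))
    # The indicator sum must agree with the Chern number modulo 3 at every step.
    assert chern % 3 == chern_mod3(*indicators)
```

Each entry of `trace` lists the gate voltage, the running Chern number, and its value modulo 3 obtained from the indicators, which makes explicit why the \(+3\) transition at \(U=14\) meV leaves the rotation eigenvalues unchanged.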

Magnetic field effect

Under an in-plane magnetic field \({\bf{B}}=({B}_{x},{B}_{y},0)\), one can choose the gauge \({\bf{A}}(z)=-z\times {\bf{B}}\). Then, the effect of the magnetic field on the hopping terms is evaluated via the Peierls substitution, where the hopping term from \(R\) to \(R+{\boldsymbol{\delta }}\) is multiplied by the phase factor

$${e}^{i\frac{q}{\hslash }{\int }_{R}^{R+{\boldsymbol{\delta }}}dr\cdot {\bf{A}}(z)}={e}^{-i\frac{e}{\hslash }{{\boldsymbol{\delta }}}_{xy}\cdot \left[\left({R}_{z}+\frac{{{\boldsymbol{\delta }}}_{z}}{2}\right)\times {\bf{B}}\right]},$$
(19)

such that

$$\sum _{R,{\boldsymbol{\delta }}}{e}^{i\frac{q}{\hslash }{\int }_{R}^{R+{\boldsymbol{\delta }}}dr\cdot {\bf{A}}(z)}{c}_{R+\delta }^{\dagger }{c}_{R}=\sum _{{\bf{k}},{\boldsymbol{\delta }}}{e}^{-i({\bf{k}}+{\boldsymbol{\alpha }})\cdot {\boldsymbol{\delta }}}{c}_{{\bf{k}}}^{\dagger }{c}_{{\bf{k}}},$$
(20)

where \({\boldsymbol{\alpha }}=-\frac{q}{\hslash }{\bf{A}}({\bf{R}}_{z}+{{\boldsymbol{\delta }}}_{z}/2)=-\frac{e}{\hslash }\left[({\bf{R}}_{z}+\frac{{{\boldsymbol{\delta }}}_{z}}{2})\times {\bf{B}}\ \right]\), since \({\bf{A}}(z)\) is a linear function of \(z\). Hence, the effect of the in-plane field can be included by simply replacing \({\bf{k}}\) with \({\bf{k}}+{\boldsymbol{\alpha }}\) in all \({\bf{k}}\)-dependent matrix elements of the Bloch Hamiltonian, as follows (we take \({c}_{{\bf{k}}}={\sum }_{R}{e}^{-i{\bf{k}}\cdot R}{c}_{R}\)):

$${{\mathcal{H}}}_{l,m}({\bf{k}},{\bf{B}})={{\mathcal{H}}}_{l,m}\left({\bf{k}}-\frac{e}{\hslash }\frac{(l+m)d}{2}\ {e}_{z}\times {\bf{B}}\right)$$
(21)

where \({{\mathcal{H}}}_{l,m}\) is the matrix element connecting layers \(l\) and \(m\) (\(l,m=0,\ldots ,3\) from bottom to top) in Eq. (15), \(d=3.42\,\mathring{\rm{A}}\) is the interlayer distance, and \({e}_{z}\) is the unit vector in the \(z\) direction.
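
To get a feeling for the size of the momentum shift in Eq. (21), the following sketch evaluates it for a field of 1 T and compares it with the Moiré momentum \(| {q}_{0}|\). The function names are hypothetical, and the carbon–carbon distance of 1.42 Å used for \(| {q}_{0}|\) is an assumed standard value that is not quoted in the text.

```python
import numpy as np

E_CHARGE = 1.602176634e-19   # elementary charge in C
HBAR = 1.054571817e-34       # reduced Planck constant in J s
D_LAYER = 3.42e-10           # interlayer distance d in m (value from the text)

def momentum_shift(l, m, b_field):
    """Shift added to k in Eq. (21) for the block connecting layers l and m, in 1/m.

    b_field = (B_x, B_y) in tesla; layers are indexed 0..3 from bottom to top.
    """
    bx, by = b_field
    ez_cross_b = np.array([-by, bx])   # in-plane components of e_z x B
    return -(E_CHARGE / HBAR) * 0.5 * (l + m) * D_LAYER * ez_cross_b

def moire_momentum(theta_deg, a_cc=1.42e-10):
    """|q_0| in 1/m; a_cc = 1.42 Angstrom is an assumed carbon-carbon distance."""
    return 8 * np.pi * np.sin(np.deg2rad(theta_deg) / 2) / (3 * np.sqrt(3) * a_cc)

# Intralayer block of layer 1 (height d above layer 0) in a 1 T field along x.
shift = momentum_shift(1, 1, (1.0, 0.0))
ratio = np.linalg.norm(shift) / moire_momentum(1.33)
```

Under these assumptions the shift per layer and per tesla is roughly a thousandth of \(| {q}_{0}|\) at \(\theta =1.3{3}^{\circ }\), which is consistent with treating the in-plane orbital effect as a weak perturbation of the Moiré bands.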

Because the in-plane orbital effect is small relative to the energy gap, it suffices to treat it to first order in perturbation theory. This amounts to adding the following in-plane orbital term to the single-particle energies

$${\xi }_{n,\tau }({\bf{k}},{\bf{B}})={\xi }_{n,\tau }({\bf{k}})+{\mu }_{{\rm{B}}}{g}_{n,\tau }^{xy}({\bf{k}})\cdot {\bf{B}}$$
(22)

where \({g}_{n,\tau }^{xy}({\bf{k}})\) is given by

$${g}_{n,\tau }^{xy}({\bf{k}})=\frac{1}{{\mu }_{{\rm{B}}}}\langle {\psi }_{n,\tau }({\bf{k}})| {\nabla }_{{\bf{B}}}{{\mathcal{H}}}_{\tau }({\bf{k}},{\bf{B}}){| }_{{\bf{B}}=0}| {\psi }_{n,\tau }({\bf{k}})\rangle,$$
(23)

where \(\tau\) is the valley index. Time-reversal symmetry implies that \({g}_{n,\tau }^{xy}(-{\bf{k}})=-{g}_{n,-\tau }^{xy}({\bf{k}})\). The in-plane orbital \(g\)-factor transforms under \({C}_{3}\) rotation as

$${g}_{n,\tau }^{xy}({R}_{\pm 2\pi /3}{\bf{k}})={R}_{\mp 2\pi /3}{g}_{n,\tau }^{xy}({\bf{k}})$$
(24)

provided that the band \(n\) is non-degenerate at \({\bf{k}}\). This implies that \({g}_{n,\tau }^{xy}({\bf{k}})\) vanishes at any \({C}_{3}\)-invariant point. As pointed out in the Results, the in-plane orbital contribution generally affects the bands very differently from the Zeeman effect. For example, when the bands are partially filled, it can distort the Fermi surface in opposite ways in the two valleys, which can influence physical properties such as the superconducting \({T}_{{\rm{c}}}\) (see Supplementary Note 6).
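
The first-order expression in Eq. (23) is an instance of the Hellmann–Feynman theorem, and it can be checked numerically on a toy model. The two-layer Hamiltonian below is purely hypothetical (it is not the TDBG Hamiltonian); it only mimics the layer-dependent momentum shift of Eq. (21), with \({\mu }_{{\rm{B}}}\) set to 1 and arbitrary units.

```python
import numpy as np

def toy_hamiltonian(kx, b, t=0.3, shift=0.05):
    """Hypothetical two-layer model with a layer-dependent momentum shift."""
    k0 = kx                 # layer 0 sits at z = 0 and is unaffected by the field
    k1 = kx - shift * b     # layer 1 is displaced in momentum by the field
    return np.array([[0.5 * k0**2, t], [t, -0.5 * k1**2]])

def orbital_g(kx, band, db=1e-6):
    """First-order coefficient <psi| dH/dB |psi> evaluated at B = 0."""
    _, vecs = np.linalg.eigh(toy_hamiltonian(kx, 0.0))
    psi = vecs[:, band]
    # Central finite difference of the Hamiltonian matrix with respect to B.
    dh_db = (toy_hamiltonian(kx, db) - toy_hamiltonian(kx, -db)) / (2 * db)
    return float(np.real(psi.conj() @ dh_db @ psi))

def energy_slope(kx, band, db=1e-6):
    """Direct numerical derivative of the band energy with respect to B."""
    e_plus = np.linalg.eigvalsh(toy_hamiltonian(kx, db))[band]
    e_minus = np.linalg.eigvalsh(toy_hamiltonian(kx, -db))[band]
    return (e_plus - e_minus) / (2 * db)

g_lower = orbital_g(0.7, band=0)
slope_lower = energy_slope(0.7, band=0)
```

For a non-degenerate band the two numbers agree up to finite-difference error, which is the statement that the expectation value of \({\nabla }_{{\bf{B}}}{\mathcal{H}}\) gives the linear energy shift used in Eq. (22).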

The effect of an out-of-plane field on the energy bands is generally more complicated, since any gauge choice breaks translation symmetry. As a result, the band picture breaks down for large enough out-of-plane fields, where Landau levels form instead. In the following, we will consider the limit of weak out-of-plane fields, which can be treated perturbatively. In this case, the out-of-plane field induces an orbital valley Zeeman effect, as pointed out in refs. 38,39, whose \(g\)-factor is given by

$${g}_{n,\tau }^{z}({\bf{k}})=-\frac{4m}{{\hslash }^{2}}{\mathrm{Im}} \sum _{l\ne n}\frac{\langle n,\tau | {\partial }_{{k}_{x}}{{\mathcal{H}}}_{\tau }| l,\tau \rangle \langle l,\tau | {\partial }_{{k}_{y}}{{\mathcal{H}}}_{\tau }| n,\tau \rangle }{{\epsilon }_{n,\tau ,{\bf{k}}}-{\epsilon }_{l,\tau ,{\bf{k}}}}.$$
(25)

In summary, the single-particle energies have the following dependence on magnetic field

$${\xi }_{n,{\boldsymbol{\sigma }},\tau }({\bf{k}},{\bf{B}})={\xi }_{n,\tau }({\bf{k}})+{\mu }_{{\rm{B}}}(g{\boldsymbol{\sigma }}\cdot {\bf{B}}+{g}_{n,\tau }({\bf{k}})\cdot {\bf{B}}),$$
(26)

where \({\boldsymbol{\sigma }}\) is the electron spin operator (which is \(\pm {\!}1/2\) for up/down spins) and \(\tau =\pm\). The valley orbital \(g\)-factor is defined as

$${g}_{n,\tau }({\bf{k}})=({g}_{n,\tau }^{xy}({\bf{k}}),{g}_{n,\tau }^{z}({\bf{k}})).$$
(27)

We have also assumed that the spin-quantization axis is parallel to the field.
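
As a minimal illustration of Eq. (26), the sketch below evaluates the energy shift of a state in a magnetic field. The function name is hypothetical, the spin \(g\)-factor is set to the free-electron value \(g=2\) (an assumption), and the orbital \(g\)-vector used in the example is a placeholder of order 1 rather than a computed value.

```python
import numpy as np

MU_B_MEV_PER_T = 5.7883818060e-2   # Bohr magneton in meV/T

def energy_in_field(xi0, sigma, g_orb, b_field, g_spin=2.0):
    """Eq. (26): single-particle energy in meV.

    sigma = +1/2 or -1/2 along the field direction; g_orb and b_field are
    3-vectors (g_orb dimensionless, b_field in tesla).
    """
    b_vec = np.asarray(b_field, dtype=float)
    zeeman = g_spin * sigma * np.linalg.norm(b_vec)   # spin axis parallel to B
    orbital = float(np.dot(g_orb, b_vec))
    return xi0 + MU_B_MEV_PER_T * (zeeman + orbital)

b_in_plane = (1.0, 0.0, 0.0)                 # 1 T along x
g_k = np.array([1.0, 0.0, 0.0])              # placeholder orbital g-vector at (k, tau)

# Spin splitting at fixed (k, tau): the orbital term drops out.
spin_split = energy_in_field(0.0, +0.5, g_k, b_in_plane) - energy_in_field(0.0, -0.5, g_k, b_in_plane)

# Mismatch between time-reversal partners (k, tau) and (-k, -tau) with equal spin,
# using g(-k, -tau) = -g(k, tau): the Zeeman term drops out.
pair_mismatch = energy_in_field(0.0, +0.5, g_k, b_in_plane) - energy_in_field(0.0, +0.5, -g_k, b_in_plane)
```

The two differences separate the roles of the terms in Eq. (26): the Zeeman term splits opposite spins at the same momentum, while the orbital term detunes equal-spin states in opposite valleys, which is the pair-breaking channel discussed in the Results.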