Introduction

CMOS technology has followed Moore’s law of device scaling for the past 50 years to achieve higher transistor density, higher speed and improved power efficiency. A significant part of this device scaling, especially for the planar metal–oxide–semiconductor field-effect transistor (MOSFET), was achieved by scaling the gate length1. This scaling is reaching its limits as short-channel effects (SCE) significantly degrade the device performance. To partially overcome SCE, the tri-gate (FinFET) structure has been introduced2. For future technology nodes, the gate-all-around nanosheet FET, which sandwiches thin layers of silicon channel between multiple gates, is expected to provide additional improvements. Both configurations enhance the electrostatic control over the channel and allow for further gate length scaling. However, it has been reported3 that the required scaling of the silicon channel thickness below 10 nm severely degrades the carrier mobility due to increased surface-roughness scattering. In this context, two-dimensional (2D) semiconducting materials such as transition metal dichalcogenides (TMDs) are considered ideal candidates due to their naturally passivated surface and ultra-thin body (1 monolayer of MoS2 is ~ 0.65 nm thick), which provide excellent gate control and enhanced transport4,5,6,7. However, since many studies are performed with manually exfoliated flakes, for which collecting large datasets is very labor-intensive, there has been a strong focus on selecting only the top-performing devices, at the cost of device understanding. Until recently, only a few TMD studies8,9,10 have focused on devices fabricated using large-area grown films. Especially for device scaling11, a statistically significant set of data is still lacking.

Therefore, we carry out a study of the impact of geometrical scaling on an extensive data set of large-area grown tri-layer MoS2 MOSFETs (1300 devices). We investigate the impact of scaling the channel length (Lch) and width (Wch), contact length (Lcont) and effective oxide thickness (EOT) on various device performance metrics such as the on- and off-current (Ion, Ioff), contact resistance (Rc), subthreshold swing (SS), interface trap density (Dit) and threshold voltage (VT). We demonstrate that scaling the contact length down to 13 nm has no impact on the device performance. This confirms that carrier injection occurs exclusively from the edge of the metal directly into the thin TMD channel, which is in line with our TCAD simulations. Further, using our large data set, we make a detailed assessment of the scaling trends of SS and VT with device dimensions. We identify the variation in the number of MoS2 layers in the channel and contact regions as a possible source of SS degradation and VT variability for ultra-scaled TMD MOSFETs. Such insights are crucial for device understanding and enable device architectures such as double-gate12 or stacked TMD FETs to outperform Si FETs13. This article is an extension of our previous work presented at IEDM 2019 (ref. 10).

Results and discussion

We employ large-area MoS2 grown on a 2″ c-plane sapphire template by a metal–organic chemical vapor deposition (MOCVD) process using molybdenum hexacarbonyl and dihydrogen sulfide as the precursors. Atomic force microscopy (AFM) shows that the MoS2 consists of a fully closed and continuous 3-monolayer (ML) film, with nucleation of 4 ML and 5 ML island regions (Fig. 1a). The average thickness is 3.6 ML, measured using Rutherford backscattering spectrometry (RBS). The device schematic is illustrated in Fig. 1b and details of the fabrication process (Fig. 1c) are discussed in the Methods section. Three different gate oxides are used: (1) 50 nm SiO2, (2) 12 nm HfO2, and (3) 4 nm HfO2. An optical image after contact deposition is shown in Fig. 1d and cross-section TEM images of the final fabricated device are shown in Fig. 1e, f.

Figure 1

(a) Atomic force micrograph of CVD MoS2 on sapphire template shows a closed 3 ML layer with islands of 4 and 5 monolayers distributed randomly. (b) Device schematic with global back-gate and top source/drain contacts. (c) Fabrication flow for the back-gated devices. (d) Optical micrograph showing the patterned MoS2 channel with 10 nm thick Ni contacts. (e) Cross-section TEM shows a fabricated device with Lcont = 13 nm and Lch = 46 nm on 12 nm HfO2. (f) Zoom-in of the channel region for another device showing 3-monolayer MoS2 on nominal 4 nm HfO2. (g) Transfer characteristics at a fixed VDS = 1 V. Maximum drive current at VGS = 3 V scales with Lch, saturating for short-channel devices. The plot shows Lch = 34 nm, 44 nm, 50 nm, 70 nm, 100 nm, 200 nm, 300 nm, 500 nm, 1000 nm, 5000 nm.

Direct current measurements are performed in N2 ambient to avoid any impact of ambient humidity. A total of 1300 devices with varying Lch (30 nm to 5 μm), Lcont (13 nm to 500 nm) and Wch (200 nm to 10 μm) are measured at two different drain-source biases (VDS = 0.05 V and 1 V). Back-gate leakage is below the tool noise floor (< 1 pA) for the 50 nm SiO2 and 12 nm HfO2. Devices with 4 nm HfO2 have higher gate-drain leakage at VDS = 1 V due to the large contact pads. Therefore, the source current (IS), instead of the drain current (ID), is used in their analysis. Channel edge effects are negligible, as confirmed by the constant on-state current density for several Wch (Fig S1). Devices with short Lch and wide Wch, and therefore high absolute current, show a large parasitic voltage drop over the source-drain metal probes and are omitted from the analysis. The threshold voltage of the FETs at VDS = 0.05 V and 1 V is obtained both by linear extrapolation from the peak transconductance (VT,LE) and by the constant-current method (VT,CC, extracted at ID = 10 nA × Wch/Lch). SS is reported either as SSmin, which is the minimum value across the entire swing, or as SSCC, extracted at a current level of ID = 1 nA × Wch/Lch for the stated VDS bias.
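
The constant-current extraction of VT,CC, SSCC and SSmin described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, the sweep is synthetic (an ideal 100 mV/dec device with Wch = Lch), and real data would first need smoothing and noise-floor handling, which are omitted here.

```python
import math


def _crossing_segment(i_d, i_target):
    """Index k of the first sweep segment where ID rises through i_target."""
    for k in range(len(i_d) - 1):
        if 0 < i_d[k] <= i_target <= i_d[k + 1] and i_d[k + 1] > i_d[k]:
            return k
    raise ValueError("target current is not crossed on a rising segment")


def vgs_at_current(vgs, i_d, i_target):
    """Gate voltage at which ID = i_target, interpolated linearly in log10(ID)."""
    k = _crossing_segment(i_d, i_target)
    frac = math.log10(i_target / i_d[k]) / math.log10(i_d[k + 1] / i_d[k])
    return vgs[k] + frac * (vgs[k + 1] - vgs[k])


def vt_constant_current(vgs, i_d, w_ch, l_ch, i_ref=10e-9):
    """VT,CC: gate voltage where ID = i_ref * Wch/Lch (i_ref = 10 nA in the text)."""
    return vgs_at_current(vgs, i_d, i_ref * w_ch / l_ch)


def ss_constant_current(vgs, i_d, w_ch, l_ch, i_ref=1e-9):
    """SSCC in mV/dec: local inverse slope at ID = i_ref * Wch/Lch (1 nA in the text)."""
    k = _crossing_segment(i_d, i_ref * w_ch / l_ch)
    return 1e3 * (vgs[k + 1] - vgs[k]) / math.log10(i_d[k + 1] / i_d[k])


def ss_min(vgs, i_d):
    """SSmin in mV/dec: smallest inverse slope over all rising segments."""
    slopes = [
        1e3 * (vgs[k + 1] - vgs[k]) / math.log10(i_d[k + 1] / i_d[k])
        for k in range(len(i_d) - 1)
        if i_d[k + 1] > i_d[k] > 0
    ]
    return min(slopes)


# Synthetic sweep (illustrative only): ideal 100 mV/dec device with Wch = Lch.
vgs = [-1.0 + 0.05 * k for k in range(21)]
i_d = [1e-12 * 10 ** ((v + 1.0) / 0.1) for v in vgs]
vt_cc = vt_constant_current(vgs, i_d, w_ch=1.0, l_ch=1.0)
ss_cc = ss_constant_current(vgs, i_d, w_ch=1.0, l_ch=1.0)
```

Interpolating in log10(ID) rather than in ID keeps the result insensitive to the gate-voltage step in the subthreshold region, where the current is exponential in VGS.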

Scaling of on- and off-state currents

From the representative transfer characteristics in Fig. 1g, we observe that the off-state current increases significantly as Lch is scaled, as a result of a loss of gate control. Accordingly, we extract the minimum current in the entire back-gate sweep (Imin), and we observe that it is the same for both oxides and lower than the noise floor of the tool (< 1 pA). However, when comparing the Ioff in the scatter plot of Fig. 2a, which is extracted at a fixed displacement field of 0.4 V/nm below VT,CC (i.e., \( |V_{GS} - V_{T,CC}|/\mathrm{CET} = 0.4\;\mathrm{V/nm} \)), we note that the HfO2 sample exhibits higher Ioff than the SiO2 sample. This suggests that the subthreshold swing is limited by the high interface trap density (see the section on long channel electrostatics and Dit extraction). We also note that for both oxides, Ioff degrades with smaller Lch. This is mainly due to the SS degradation observed for short-Lch devices, which is discussed further in the section on short channel electrostatic degradation and variability.

Figure 2

(a) Scatter plot (with median line) showing Ion extracted at ns = 1e13 cm−2 and Ioff at a fixed displacement field of 0.4 V/nm below VT,CC, for VDS = 1 V. Ion for the 50 nm SiO2 and 12 nm HfO2 devices overlap, indicating no impact on low-field mobility and contact barrier. Ion roughly scales as 1/Lch for Lch > 500 nm and saturates for Lch < 50 nm. Ioff is higher for HfO2 compared to SiO2. (b) ID-VDS for Lch = 500 nm shows linear triode regime and saturation at high VDS. The dashed line follows the current at VDS = VOV. While the onset of saturation follows the VOV at low VGS, it saturates at VDS = 2.4 V for high VGS. The saturation current roughly scales as VOV^2 and VOV^1 at low and high VGS, respectively. (c) ID-VDS for Lch = 30 nm shows non-linear triode regime due to Schottky contacts and saturation at high VDS. Saturation current follows a similar trend as Lch = 500 nm but VDS at onset of velocity saturation is reduced to 1.4 V. (d) Conduction band profile for Lch = 30 nm device with Schottky contacts shown for low and high VDS. The Fermi levels at the source and drain are indicated by EFS and EFD, respectively. The Schottky barrier is shown as the abrupt potential change at the contact-channel interface. At low VDS, IDS is determined by Schottky contacts. At high VDS, though the potential drops significantly at the source contact, velocity saturation or pinch-off near the drain determines the ID characteristics.

Next, we evaluate the Ion at a fixed charge density (ns) of 10^13 cm−2 and do not observe any difference between the 50 nm SiO2 and 12 nm HfO2 samples (Fig. 2a). This indicates that the carrier transport in the MoS2 channel is predominantly limited by charged impurities14 in the MoS2 or at the interfaces, and not by remote phonons15 in the gate oxide.
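
To compare oxides of very different thickness, Ion and Ioff are read at bias points that depend on the gate capacitance. The sketch below shows one way to compute those bias points. It assumes the simple charge-sheet relation q·ns = C·(VGS − VT) with C = 3.9·ε0/CET, which is an assumption for illustration and not necessarily the exact procedure used for the figures. The function names and the example values of VT,CC and CET are hypothetical.

```python
Q_E = 1.602176634e-19      # elementary charge, C
EPS_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_SIO2 = 3.9               # relative permittivity of SiO2 (reference for EOT/CET)


def cap_per_area(cet_nm):
    """Gate capacitance per unit area (F/m^2) for an equivalent thickness in nm."""
    return EPS_0 * K_SIO2 / (cet_nm * 1e-9)


def overdrive_for_density(ns_cm2, cet_nm):
    """Gate overdrive VGS - VT (V) for sheet density ns, assuming q*ns = C*VOV."""
    return Q_E * ns_cm2 * 1e4 / cap_per_area(cet_nm)


def vgs_for_ioff(vt_cc, cet_nm, field_v_per_nm=0.4):
    """Gate bias (V) located a fixed displacement field below VT,CC."""
    return vt_cc - field_v_per_nm * cet_nm


# Illustrative values only: a 50 nm SiO2 back gate, and a thin high-k stack
# with an assumed CET of 3.8 nm and an assumed VT,CC of -0.4 V.
vov_sio2 = overdrive_for_density(1e13, cet_nm=50.0)
vgs_off = vgs_for_ioff(vt_cc=-0.4, cet_nm=3.8)
```

Under these assumptions, reaching ns = 10^13 cm−2 on 50 nm SiO2 takes an overdrive of roughly 23 V, which is why a fixed-VGS comparison between the oxides would not be meaningful.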

For the Ion, two distinct channel length scaling regimes can be identified in Fig. 2a. In the long-channel limit (roughly Lch > 500 nm), the Ion increases roughly in proportion to 1/Lch and the device operates in the triode region (illustrated in Fig. 2b for the 12 nm HfO2 sample and Lch = 500 nm), i.e., the gate overdrive VOV (= VGS − VT) > VDS for both oxides. The drain current also exhibits a strongly linear dependence on VDS in the triode region (Fig. 2b), suggesting that the channel resistance is dominant for this Lch and beyond. We also extract a low-field field-effect mobility of ~ 15 cm2/V.s (inset of Fig. 3c) using the transfer length method (TLM) for both the samples with 12 nm HfO2 and 50 nm SiO2. At higher lateral electric field (higher VDS), ID saturates (Fig. 2b), and the saturation current scales quadratically with VOV (here VT,CC = −0.4 V) due to channel pinch-off near the drain. However, for the highest VOV (~ 2 to 2.4 V), the saturation current scales roughly linearly with VOV, indicating that it is limited by saturation of the drift velocity at high lateral field16 (FLATERAL > 5 V/μm).

Figure 3

Scatter plot (with median line) of (a) Ion (at ns = 1e13 cm−2) versus Lcont for Lch = 30 nm (contact-limited), 100 nm (intermediate regime), and 500 nm (mobility-limited). No dependence on Lcont down to 13 nm indicates carrier injection from the edge of the metal directly into the MoS2 channel with LT < 13 nm. (b) SSCC and VT,CC versus Lch for Lcont = 13 nm and 300 nm. No systematic deviation with Lcont indicates identical electrostatics in both cases. (c) Rtotal/2 (at ns = 1e13 cm−2) versus Lch shows saturation below Lch = 50 nm due to contact resistance. An upper limit for Rc is obtained as the median Rtotal/2 for Lch = 30 nm. Median Rc values of 3 kΩ.μm, with best performers at 2 kΩ.μm, are obtained. (inset) TLM fit of Rtotal/2 (at ns = 1e13 cm−2) versus Lch gives Rc = 2.7 kΩ.μm and field-effect mobility = 15 cm2/V.s. (d) Rtotal/2 versus ns for Lch = 30 nm at VDS = 1 V for 8 devices. Rc reduces significantly at ns = 2e13 cm−2 due to better carrier injection into the accumulated channel.

In the short-channel limit (roughly Lch < 50 nm), the dependence of Ion on Lch saturates (Fig. 2a). Accordingly, in the output characteristics for Lch = 30 nm (Fig. 2c), we make two observations: (1) super-linear ID for VDS < 0.4 V and (2) saturation of ID for VDS > 1.4 V. The distinct super-linear dependence of ID on VDS (Fig. 2c) suggests that the Schottky contacts at the metal-MoS2 interface limit the current even though the bias conditions (VOV > VDS, here VT,CC = −0.3 V) ensure that the channel is continuously accumulated with electrons. At higher VDS, ID saturates similarly to the Lch = 500 nm device. The current at the onset of saturation is roughly proportional to VOV^(1.5–1.7) and VOV^(0.8–0.9) for low and high VOV, respectively, closely following the long-channel characteristics. This indicates that while contact resistance dominates at low VDS, velocity saturation or pinch-off near the drain determines the current at high VDS.
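
The power-law exponents quoted above (close to 2 for pinch-off, close to 1 for velocity saturation) can be estimated from the slope of log(Isat) versus log(VOV). The following is a minimal sketch of such a fit using ordinary least squares. The function name and the synthetic current values are hypothetical and only serve to show that the fit recovers a known exponent.

```python
import math


def power_law_exponent(v_ov, i_sat):
    """Least-squares slope of log(i_sat) versus log(v_ov), i.e. n in I ~ VOV^n."""
    xs = [math.log(v) for v in v_ov]
    ys = [math.log(i) for i in i_sat]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data (illustrative only): quadratic at low overdrive,
# linear at high overdrive.
v_low = [0.4, 0.6, 0.8, 1.0]
i_quadratic = [2e-5 * v ** 2 for v in v_low]
v_high = [2.0, 2.2, 2.4]
i_linear = [1e-4 * v for v in v_high]

n_low = power_law_exponent(v_low, i_quadratic)
n_high = power_law_exponent(v_high, i_linear)
```

Fitting the low- and high-overdrive ranges separately, as done here, is what allows the two regimes to be told apart; a single fit over the full range would return an intermediate exponent.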

We can further understand both these observations from the simulated conduction band profile of the Lch = 30 nm device (Fig. 2d) for low and high VDS. In the linear regime (VDS = 0.2 V and VOV > VDS), the drain-source potential is predominantly dropped across the reverse-biased source and forward-biased drain Schottky contacts. With increasing VDS (higher lateral field), the transmission probability across the Schottky contacts increases rapidly, especially across the reverse-biased source, giving rise to the super-linear dependence of ID on VDS. At even higher VDS (VDS = 1.2 V), the electric field in the channel near the drain is large enough to cause either pinch-off at low VOV or saturation of the carrier drift velocity at high VOV, which in turn saturates the current.

Contact length scaling

Figure 3a shows that Ion (at ns = 10^13 cm−2) does not degrade as Lcont is scaled down to 13 nm. This agrees with TCAD simulations10,17,18 that predict contact edge injection of carriers for MoS2 channels of 1–3 layers. This observation holds for three different Lch (30 nm, 100 nm, 500 nm) over a wide range of Lcont (500 nm to 13 nm) and for varying lateral field (VDS = 0.05 V, 1 V). In all three cases, as predicted, we do not observe any systematic degradation of Ion when scaling Lcont from 500 nm down to 13 nm. Even for the shortest Lch = 30 nm, where the channel resistance is negligible and the device is Schottky contact limited (ID-VDS is super-linear at low VDS in Fig. 2c), the contact resistance is independent of Lcont. Moreover, the electrostatic properties of the device are also unaffected by scaling down Lcont, as can be seen in Fig. 3b from the trend of SSCC and VT,CC (at VDS = 1 V) with Lch for the two extreme contact lengths. The SS degradation and VT roll-off with shorter Lch are independent of the contact length. The insensitivity to Lcont scaling also holds for other gate oxides and charge densities (plots not shown). In summary, for 3 ML MoS2, the active region of MoS2 under the metal contact where most of the electrons get injected (characterized by the transfer length LT) is shorter than 13 nm.

These results agree very well with our previous TCAD simulations with overlapping back-gate. For thin MoS2 (1–3 ML), these predict LT smaller than the minimum simulated Lcont of 2 nm (Fig S2). This is caused by the Schottky barrier (SB) at metal-MoS2 interface, which depletes the MoS2 underneath even at a high gate-field and prevents vertical electron injection. Therefore, injection is only allowed from the edge of the metal contact directly into the carrier-rich channel, which is also predicted in other work18,19. For thicker MoS2 (more than 5 ML), the MoS2 region underneath the contact is no longer depleted close to the oxide interface, and a longer section of the contact contributes to carrier injection20,21,22. In a top-gate-only configuration, the absence of gate field under the contact would cause the vertical injection to become even more ineffective for both thick and thin MoS2 channels. As a result, the contact length can also be downscaled for top-gated devices without any performance penalty (Fig S2). Moreover, reduction of contact barrier or MoS2 sheet resistance under the contact does not increase the LT for 1–3 ML MoS2 as the oblique trajectory still provides the least resistive path for carrier injection (Fig S2). However, such improvements could increase LT for thicker MoS2 where a substantial carrier injection happens under the contact19.

In other work21,23,24,25, transfer lengths of 80 nm to 630 nm have been calculated using the transfer length method (x-axis intercept), but those values are in contradiction with our results. As argued elsewhere26, this method should not be used for thin TMD layers and Schottky contacts. The Schottky barrier fully depletes the TMD below, therefore the sheet resistance below the contact and in the channel are not the same, which is a requirement of the transfer length method. However, the transfer length method can still be reliably used for mobility calculation, because it does not have this requirement of identical TMD sheet resistance in the channel and below the metal.

Contact resistance extraction

As we found in the section on scaling of on- and off-state currents that devices become more contact dominated as Lch is scaled, we now take a closer look at the value of the contact resistance. We extract the contact resistance (Rc) directly as half of the total device resistance (Rtot/2) for devices with the shortest Lch = 30 nm, without any need for extrapolation as in the TLM. Taking Rc ~ Rtot/2 gives an upper limit for Rc, as it assumes negligible channel resistance. Figure 3c shows a plot of Rtot/2 at a charge density of 10^13 cm−2 versus Lch. For Lch < 50 nm, Rtot/2 saturates, and we obtain a median nickel-MoS2 Rc ~ 3 kΩ.μm (at ns = 10^13 cm−2), which is in good agreement with the Rc extracted using TLM (inset of Fig. 3c). Our Rc values are comparable to those of state-of-the-art devices, which have been demonstrated with Au20 or indium27 contact metals. For increased VOV, the contact resistance drops further due to better carrier injection into the accumulated channel, and we obtain Rc ~ 1.2–2 kΩ.μm at ns = 2 × 10^13 cm−2 (Fig. 3d). For even higher carrier densities (compare ns = 2 × 10^13 cm−2 to 2.7 × 10^13 cm−2), Rc no longer improves significantly. Significant device-to-device variation in contact resistance is observed, possibly due to polymer residues between the contact metal and the MoS2 that were not completely removed after the transfer and contact lithography steps of the fabrication flow.
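
For comparison with the direct Rtot/2 estimate, the TLM fit mentioned above can be sketched as a straight-line fit of the width-normalized Rtot/2 against Lch. The intercept gives Rc and twice the slope gives the sheet resistance Rsh. The sketch assumes Rtot/2 = Rc + (Rsh/2)·Lch and converts Rsh to a mobility with μ = 1/(q·ns·Rsh), which is an assumption for illustration. The function names are hypothetical and the data are synthetic, generated from the Rc = 2.7 kΩ.μm and μ = 15 cm2/V.s quoted in the inset of Fig. 3c.

```python
Q_E = 1.602176634e-19  # elementary charge, C


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def tlm_extract(l_ch_um, r_half_ohm_um, ns_cm2):
    """Fit Rtot/2 = Rc + (Rsh/2) * Lch.

    Inputs: Lch in um, width-normalized Rtot/2 in Ohm.um, ns in cm^-2.
    Returns (Rc in Ohm.um, Rsh in Ohm/sq, mobility in cm^2/V.s).
    """
    slope, intercept = linear_fit(l_ch_um, r_half_ohm_um)
    r_sheet = 2.0 * slope
    mobility = 1.0 / (Q_E * ns_cm2 * r_sheet)
    return intercept, r_sheet, mobility


# Synthetic, noise-free data built from the values quoted in the text.
r_sheet_true = 1.0 / (Q_E * 1e13 * 15.0)          # about 42 kOhm/sq
lengths_um = [0.1, 0.2, 0.5, 1.0, 5.0]
r_half = [2700.0 + 0.5 * r_sheet_true * l for l in lengths_um]
rc, rsh, mu = tlm_extract(lengths_um, r_half, ns_cm2=1e13)
```

With measured data, the intercept is dominated by the long-channel points through the fitted slope, so the uncertainty on Rc from this fit is typically larger than that of the direct short-channel estimate.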

Long channel electrostatics and Dit extraction

Figure 4a shows that the subthreshold swing SSCC obtained at VDS = 0.05 V for different Lch, improves with thinner back-gate oxide due to better gate control of the charge in the channel. Consequently, we achieve the best subthreshold swing for the devices on 4 nm HfO2 substrate (Fig S3) with median SSmin = 90 mV/dec and 110 mV/dec (at VDS = 0.05 V) for Lch = 50 nm and 30 nm, respectively.

Figure 4

(a) Scatter plot (with median line) of SSCC versus Lch for the three different oxides. While SS improves with lower EOT, the degradation and scatter for short channel devices are attributed to electrostatic potential fluctuations caused by non-uniform thickness of MoS2 in the contact and channel regions. (b) Experimentally measured maximum accumulation capacitance from MOScap (Cacc) versus MIMcap capacitance (Cox). Systematically, the Cacc is lower than Cox corresponding to an additional 1 nm CET over the measured EOT. Simulations show this is caused by the quantum capacitance Cq (MoS2 having lower DOS than metal), the impact of the charge centroid (CC) further away in MOS than MIM, and additionally due to 0.4 nm of water or carbon residues stuck at the HfO2/MoS2 interface during transfer. Qualitative comparison between (c) simulated and (d) experimental SS versus log (ID) for different Lch. The simulated SS is for a uniform 3 monolayers MoS2 with SBH = 0.45 eV. Two transport regimes at the contacts– thermionic emission and tunneling through the SB are identified. In the thermionic regime, the relative increase of field in the channel from the source/drain Schottky contacts degrades gate control for short Lch devices. In the tunneling regime, the nearly equal tunneling lengths for the different Lch results in a similar but degraded SS compared to the thermionic regime.

In the long-channel limit, i.e., Lch > 1 μm, SSCC saturates to a constant median value of 80 mV/dec, 150 mV/dec and 1800 mV/dec for 4 nm HfO2, 12 nm HfO2 and 50 nm SiO2, respectively. This is determined by the charging of MoS2/oxide interface and channel defects (60° grain boundaries28 and point defects29), for which we calculate a trap density (Dit,min) of 4.5–7 × 10^12 cm−2 eV−1 from SSmin. This range of Dit values is roughly similar across the different dielectrics. We also confirm this Dit value using multi-frequency C-V measurements of TiN/HfO2/MoS2 MOScap30, where we obtain an acceptor-type trap density of 3.2–6 × 10^12 cm−2 eV−1 with energy levels near midgap.
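
The link between SS and Dit used above can be illustrated with the standard long-channel expression SS = ln(10)·(kT/q)·(1 + q·Dit/C), where C = 3.9·ε0/CET is the gate capacitance per unit area. The sketch below inverts it for Dit. It is a simplified estimate that neglects depletion and quantum capacitance, the function names are hypothetical, and pairing SS = 86 mV/dec with a CET of 1.9 nm is an assumption made only for the example.

```python
import math

Q_E = 1.602176634e-19      # elementary charge, C
K_B = 1.380649e-23         # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_SIO2 = 3.9               # relative permittivity of SiO2 (reference for CET)


def thermal_limit_mv(temp_k=300.0):
    """Ideal subthreshold swing ln(10)*kT/q in mV/dec."""
    return 1e3 * math.log(10) * K_B * temp_k / Q_E


def cap_per_area_cm2(cet_nm):
    """Gate capacitance per unit area in F/cm^2 for a CET given in nm."""
    return EPS_0 * K_SIO2 / (cet_nm * 1e-9) * 1e-4


def dit_from_ss(ss_mv, cet_nm, temp_k=300.0):
    """Dit in cm^-2 eV^-1 from SS = SS_thermal * (1 + q*Dit/C)."""
    excess = ss_mv / thermal_limit_mv(temp_k) - 1.0
    return excess * cap_per_area_cm2(cet_nm) / Q_E


def ss_from_dit(dit_cm2_ev, cet_nm, temp_k=300.0):
    """Inverse relation: SS in mV/dec for a given Dit and CET."""
    ratio = Q_E * dit_cm2_ev / cap_per_area_cm2(cet_nm)
    return thermal_limit_mv(temp_k) * (1.0 + ratio)


# Assumed pairing for illustration: SS = 86 mV/dec with CET = 1.9 nm.
dit_example = dit_from_ss(86.0, cet_nm=1.9)
```

Under these assumptions the example evaluates to about 5 × 10^12 cm−2 eV−1, the same order as the range quoted in the text. The expression also shows why a thinner CET improves SS at a fixed Dit: the trap capacitance q·Dit is compared against a larger gate capacitance.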

From C-V measurements, we find that the MOS capacitance is systematically lower than the target oxide capacitance due to exposure to water and/or atmospheric carbon during the wet transfer process from the sapphire template to the target substrates. Figure 4b shows how the maximum accumulation capacitance (Cacc) measured from TiN/HfO2/MoS2 MOScap (shown as the red diamonds) is lower than the value of Cox measured from TiN/HfO2/TiN MIMcap (without MoS2, shown as the black line). Equivalently, the capacitance equivalent oxide thickness (CET) values for MOScap (1.9 nm, 2.7 nm, and 3.8 nm) are systematically 1 nm higher than the EOT values of the MIMcaps (0.9 nm, 1.7 nm, and 2.6 nm). We calculate that the effect of quantum capacitance due to the limited density of states in MoS2, and the effect of charge centroid being a few angstrom away from the interface, are insufficient to account for this 1 nm difference. As the MIMcaps are not exposed to water or polymer during the fabrication, Fig. 4b shows the difference between the CET and EOT values can be explained by a 0.4 nm thick layer of water or hydrocarbons adsorbed from the ambient, or a combination thereof. In the future, we expect dry transfer in a controlled ambient will lower the CET, closer to the nominal EOT.

Short channel electrostatic degradation and variability

In the short-channel limit, i.e., Lch < 100 nm, Fig. 4a shows a degradation of median SSCC but also increased scatter (SSCC at VDS = 1 V in Fig S4). A similar trend is also seen for SSmin (Fig S3). We hypothesize that the increased median and scatter could both be caused by the Schottky contacts, where the median SS degradation with shorter Lch is related to the relative increase of depletion regions from the Schottky contacts, while the scatter could be due to the variation in Schottky barrier height31 (SBH) induced by the non-uniform thickness of the MoS2, seen in the AFM image in Fig. 1a.

We first verify the hypothesis of degraded median SS for shorter Lch by comparing representative experimental SS versus ID curves to simulations in Fig. 4c, d. We consider full SS–ID curves instead of extracting SS at a single current level to understand the injection mechanism over a wider operating range. The simulations are performed for SBH = 0.45 eV and a uniform 3 ML MoS2 channel. We observe two different regimes for SS in both the simulated and experimental data. In the first, low-current regime (ID < 1e−9 A/μm), the current is limited by the thermionic emission of carriers from the metal into the channel. Here, the barrier for electrons is the highest position of the conduction band edge inside the channel, which is determined by the gate bias. In this low-current regime, SS is determined by the change in the conduction band edge with gate bias. As discussed in the section on long channel electrostatics and Dit extraction, the lower limit for SS (which corresponds to SSmin in Fig S3) is defined by the interface trap density. The degradation of SSmin for short-Lch devices is due to the electrostatic potential of the source and drain metallurgical junctions influencing the channel potential and degrading the gate control. This is illustrated in Fig S5, where the conduction band energy is flat over most of the device for Lch = 100 nm, while it is lowered for Lch = 30 nm, with the region of maximum barrier reduced to a small portion near the center of the device. Note that this effect is similar to that in conventional MOSFETs.

The second regime (ID > 1e−9 A/μm) is reached when the conduction band in the channel is lowered further, and carriers can efficiently tunnel through the SB (Fig S6). Here, the thermionic component over the barrier saturates and the tunneling path length determines the current. Because it continuously changes with higher VGS, the SS is worse than the first regime. Correspondingly, in the experimental devices, the SSCC extracted at ID > 1e−8 A/μm (for Lch < 100 nm) shows a higher value than SSmin and stronger degradation with Lch. The SS for a given ID also becomes nearly independent of Lch, because the tunneling path length depends only on the gate voltage and the thicknesses and dielectric permittivities of the TMD32 and oxide, for the low lateral electric field (VDS = 0.05 V). This is illustrated in Fig S6 where the conduction band energy and tunneling rate are plotted along the edge carrier injection path for Lch = 30 nm and 100 nm, showing no significant difference. With further reduction in SBH, the SS value in the second regime improves, reaching closer to the thermionic limit of the first regime.

We study the increased SS scatter for short Lch seen experimentally, using simulations of devices with different uniform MoS2 channel thickness and SBH. Figure 5a shows the simulated SS value for two different SBH (0.45 eV, 0.75 eV) and three different uniform thicknesses (1, 3 and 5 layers) of MoS2 for Lch = 30 nm. Similar to the above case, we note two different regimes for SS irrespective of the barrier height. For the first regime of low ID (< 1e−8 A/μm for SBH = 0.45 eV and < 1e−11 A/μm for SBH = 0.75 eV), the SS is determined only by thermionic emission over the channel barrier. Therefore, the SS is independent of the channel thickness. However, the SS degrades for SBH = 0.75 eV compared to 0.45 eV, because the higher Schottky barrier field penetrates deeper into the channel. For the second regime of high ID (> 1e−8 A/μm for SBH = 0.45 eV and > 1e−10 A/μm for SBH = 0.75 eV), the SS depends on the tunneling length, which is sensitive to the thickness of the semiconductor among other parameters33. Consequently, the gate control over the Schottky barrier, and hence over the tunneling length, weakens with thicker MoS2, resulting in poor SS for the 5 ML MoS2 (Fig S7). In agreement with this observation, we also note that the difference in SS between the layers is more pronounced for the higher SBH of 0.75 eV.

Figure 5

(a) Simulated SS versus log (ID) for a uniform layer of 1, 3 and 5 monolayers of MoS2 for SBH = 0.45 eV and 0.75 eV. For ID > 1e−9 A/μm, tunneling through the Schottky barrier determines the SS. Consequently, a thinner channel results in better gate control, shorter tunneling length and therefore better SS. (b) Probability distribution versus device dimensions (Lch × Wch). AFM from Fig. 1a was used to compute the probability distribution for fabricating devices on only 3, 4 or 5 layers, or on a combination of those (mixed). Our experimental devices have a 60–70% probability of being mixed, leading to non-uniform gate control across the channel and contact regions. (c) VT,CC versus Lch for two different Wch (200 nm, 1000 nm) and VDS (0.05 V, 1 V). No VT roll-off at VDS = 0.05 V due to excellent gate control over the channel for 12 nm HfO2. VT roll-off of about 200 mV for VDS = 1 V due to higher lateral field at the source contact allowing for more carrier injection. No systematic VT deviation between Wch = 1 μm and 200 nm. (d) Benchmark plot showing gm,max versus SSmin. All values are at VDS = 1 V except refs. 39 (VDS = 0.1 V), 34 (VDS = 0.5 V), 40 (VDS = 1.2 V) and 37 (VDS = 1.5 V). In this work, 4 nm HfO2 provides best SS = 86 mV/dec and gm,max = 185 μS/μm.

In our experiments, we have even more variability due to non-uniform thickness within a single device. Even for the smallest functional device footprint (Lch ~ 30 nm × Wch ~ 200 nm), there is a high probability (~ 70%) of having a mixed device, i.e., regions of 3, 4 and 5 layers of MoS2 within the same device. This is illustrated in Fig. 5b, where the representative AFM image of the material (Fig. 1a) was used to compute the probability of fabricating devices with different dimensions on only 3, only 4 or only 5 layers, or on a mix of those. These mixed-thickness devices, together with the associated SBH variations, would result in non-uniform gate control and large scatter in the SS values of experimental devices. Also, note that the grain size and defects in the closed layers (1–3 ML) could additionally impact the device variability.
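
One way to obtain such footprint probabilities from a thickness map is to slide a device-sized window over the map and classify each placement as uniform or mixed. The sketch below illustrates this idea on a tiny hand-made grid. It is not the authors' procedure: `layer_map` is a hypothetical 2D grid of layer counts (for example, a thresholded AFM height image), the window size is given in pixels, and the function name is an assumption.

```python
def footprint_probabilities(layer_map, height_px, width_px):
    """Fraction of window placements that are uniform (per layer count) or mixed.

    layer_map: list of rows, each a list of integer layer counts per pixel.
    height_px, width_px: device footprint in pixels.
    Returns a dict such as {"3 ML": 0.5, "mixed": 0.5}.
    """
    rows, cols = len(layer_map), len(layer_map[0])
    counts = {}
    total = 0
    for r in range(rows - height_px + 1):
        for c in range(cols - width_px + 1):
            layers = {
                layer_map[r + i][c + j]
                for i in range(height_px)
                for j in range(width_px)
            }
            key = "mixed" if len(layers) > 1 else f"{next(iter(layers))} ML"
            counts[key] = counts.get(key, 0) + 1
            total += 1
    return {key: value / total for key, value in counts.items()}


# Tiny illustrative map: mostly 3 ML with small 4 ML and 5 ML islands.
layer_map = [
    [3, 3, 3, 4],
    [3, 3, 3, 4],
    [3, 3, 5, 5],
]
p_small = footprint_probabilities(layer_map, 1, 1)
p_large = footprint_probabilities(layer_map, 2, 2)
```

Even on this toy map, the fraction of mixed placements grows with the window size, which mirrors the trend described above: larger footprints are more likely to overlap an island boundary.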

Threshold voltage control

We analyse VT control for decreasing channel length, and Fig. 5c shows that there is no significant median VT roll-off at VDS = 0.05 V. With a higher VDS = 1 V, we notice a VT roll-off of about 200 mV from Lch = 500 nm to 30 nm. We attribute this roll-off to the higher lateral electric field across the reverse-biased Schottky contact, because VDS is fixed at 1 V for all Lch. This higher electric field allows for increased carrier injection in short channel devices, which lowers VT. This roll-off could be mitigated by improving the gate control through gate-oxide scaling, or by reducing the amount of defects at the MoS2/oxide interface.

VT control for decreasing channel width is also shown in Fig. 5c, and no systematic impact is seen as Wch is scaled from 1 μm down to 200 nm. However, we note that the narrow devices (Wch = 200 nm) show higher VT variability than wider devices (Wch = 1 μm), especially at VDS = 0.05 V. This increased VT variability could be attributed to the higher probability of finding devices on discrete layers (Fig. 5b) for narrower channel compared to a wider channel where the devices are always mixed. Other sources of variability such as bias-temperature instability, non-uniformity of the MoS2 grains etc. could also impact the VT variability and more dedicated experiments are required.

Benchmark, projection and conclusion

We present a benchmark chart (Fig. 5d) to compare the performance of our devices against flake and CVD 2D material FETs in the literature34,35,36,37,38,39,40,41,42,43. We choose the peak transconductance (gm,max) measured at VDS = 1 V and SSmin as the two metrics for comparison, similar to conventional Si transistors. The best corner is the top-left, since low SSmin and high gm,max are desired. Our SiO2 devices, owing to the thick EOT, provide low transconductance even for the shortest Lch devices. By scaling the EOT (12 nm HfO2 and 4 nm HfO2) and using an optimized process flow (see Methods), we gain in both transconductance and SS, achieving an Rc < 2 kΩ.μm for Ni contact metal and Dit < 5 × 10^12 cm−2 eV−1 for a CET of 1.9 nm. We demonstrate the highest gm,max = 185 μS/μm at VDS = 1 V and a minimum SS of 86 mV/dec for 4 nm HfO2. We also achieve Imax = 400 μA/μm at VDS = 1 V and VGS = 4 V for our 12 nm HfO2 samples (Fig S8).

Although the performance of our 2D devices is among the best in the literature, significant improvements are still needed to make 2D materials competitive with silicon channel devices for high-performance logic applications. Therefore, we propose a roadmap using the Power Performance Area (PPA) metric for technology comparison in Fig. 6. 2D-FET and silicon nanosheet technologies are compared using an inverter-based ring oscillator circuit, where each device consists of 4 vertically stacked sheets with scaled Lg = 14 nm and a gate-all-around structure, corresponding to the imec 2 nm node44. All devices are retargeted to an Ioff = 2 nA at Vdd = 0.7 V and the inverter-circuit area is kept the same for a fairer comparison between technologies. Starting from the baseline case (A), where experimental channel and contact parameters are assumed, the performance strongly improves in (B) when the Schottky barrier height is reduced. In (C), improvements to the 2D channel mobility result in a higher ring-oscillator operating frequency compared to silicon, owing to the superior electrostatic control of the 2D devices at shorter gate lengths. In (D), the ideal performance is simulated with more aggressively optimized material parameters.

Figure 6

(a) Power performance area (PPA) analysis comparing silicon and 2D in the same configuration of 4 stacked nanosheets with gate-all-around. (b) The baseline (A) is set with experimental values, Rc = 1.5 kΩ.µm (corresponding to ΦSB = 0.45 eV), µ = 15 cm2/Vs, Dit = 3 × 10^12 cm−2 eV−1, tchannel = 3 layers. For (B) the contacts are improved with Rc ≤ 50 Ω.µm (corresponding to ΦSB = 0.2 eV). For (C) the channel is further improved with µ = 200 cm2/Vs, Dit = 1 × 10^12 cm−2 eV−1, tchannel = 1 layer. For (D), more aggressive improvements are made with µ = 450 cm2/Vs and no Rc. For all curves, the area is the same and the bias conditions are such that at Vdd = 0.7 V, Ioff is fixed at 2 nA. Methodology from ref. 44.

In conclusion, we have scaled down the different device dimensions of CVD-grown MoS2 FETs and demonstrated gm,max = 185 μS/μm and SSmin = 86 mV/dec, which are among the best in the literature. Using our large dataset, we systematically identified the key obstacles to be tackled to outperform silicon. First, we showed that scaling Lcont for thin MoS2 does not impact the short-channel performance, which allows for an overall reduction in the device footprint and enables device and circuit level gate optimization45. Second, we identified that for Lch < 100 nm, the on-current is currently limited by high Schottky contact resistance (Rc = 1–2 kΩ.μm) at low VDS, and by a combination of velocity saturation and the Schottky barriers at high VDS. Third, we identified that our devices suffer from short-channel effects (SS degradation), caused by the Schottky barrier at intermediate current levels and the thick CET at low current levels. Reducing the CET is therefore crucial to keep optimal electrostatic control of the thin channel. We established that a 0.4 nm layer of water or adsorbed hydrocarbons (or a combination thereof) at the HfO2/MoS2 interface is the root cause of the higher-than-expected CET. This value is consistent across different thicknesses of HfO2. Therefore, an optimized transfer process free of water and carbon is needed to enable gate stack scaling below 1 nm, and additionally allow upscaling to 300 mm wafer processing. Finally, we have demonstrated using a PPA analysis that if the obstacles of Schottky contacts, gate stack scaling and mobility improvement can be tackled, MoS2 FETs are projected to significantly outperform silicon GAA FETs at the imec 2 nm node and beyond. Therefore, they are excellent candidates to continue logic scaling.

Methods

Device fabrication

For the device design, we use the back-gate configuration with top contacts (Fig. 1b). The fabrication flow is summarized in Fig. 1c. The MoS2 is delaminated from the sapphire growth substrate using water intercalation and transferred to three different target substrates: (1) Si/50 nm SiO2, (2) Si/50 nm SiO2/5 nm TiN/12 nm HfO2, or (3) Si/50 nm SiO2/5 nm TiN/4 nm HfO2. Before transfer, the target substrates are pre-cleaned using a solvent rinse, followed by an optimized forming gas anneal (FGA) or soft O2 plasma, for SiO2 and HfO2 back-gate oxides, respectively. The active channels are patterned using a PMMA mask and e-beam lithography, followed by reactive ion etching (Cl2 + O2) of the MoS2. Source and drain contacts of different lengths (Lcont) with different channel lengths (Lch) are subsequently defined on the active channel by another e-beam lithography exposure of ZEP520A-2 resist (ZEON Corp.), e-beam evaporation of 10 nm Ni, and metal lift-off in anisole. We ensure a chamber pressure below 10^−6 Torr while depositing the Ni contact metal. Finally, in a third e-beam lithography step, thicker Ni/Pd contact pads are lifted off.

TCAD calibration

All simulations46 are performed in Synopsys Sentaurus Device. The low-field mobility (μeff) is calibrated from the experimental TLM fit shown in Fig. 3c and implemented as a constant mobility model. An estimate for Dit is obtained from multi-frequency C-V measurements, as discussed in the section on long channel electrostatics and Dit extraction. An acceptor trap distribution that is uniform over the entire bandgap is assumed, with Dit = 3e12 cm−2 eV−1. With μeff and Dit fixed by experiments, the Schottky barrier height is fitted to the median transfer characteristics of Lch = 30 nm devices, which are predominantly contact-limited. For the Schottky injection, the non-local tunneling model based on the Wentzel-Kramers-Brillouin approach is used. All the parameters used in the simulation correspond to their median values.