The advent of microcomputers in the 1970s has dramatically changed our society. Since then, microprocessors have been made almost exclusively from silicon, but the ever-increasing demand for higher integration density and speed, lower power consumption and better integrability with everyday goods has prompted the search for alternatives. Germanium and III–V compound semiconductors are considered promising candidates for future high-performance processor generations, and chips based on thin-film plastic technology or carbon nanotubes could allow electronic intelligence to be embedded into arbitrary objects for the Internet-of-Things. Here, we present a 1-bit implementation of a microprocessor using a two-dimensional semiconductor, molybdenum disulfide. The device can execute user-defined programs stored in an external memory, perform logical operations and communicate with its periphery. Our 1-bit design is readily scalable to multi-bit data. The device consists of 115 transistors and constitutes the most complex circuitry so far made from a two-dimensional material.
Two-dimensional (2D) materials, such as semiconducting transition metal dichalcogenides (TMDs) [1,2], black phosphorus [3], silicene [4] and others, are considered promising candidates for future generations of electronic circuits. Silicon is not expected to be replaced in mainstream digital electronics in the mid-term future; however, similar to organic semiconductors [5] or carbon nanotubes [6], 2D materials offer a number of interesting properties that could lead to novel applications. Their ultrathin channel thickness provides improved electrostatic gate control and reduced short-channel effects [7,8], which ultimately results in better geometric scaling behaviour [9,10] and lower power consumption. 2D semiconductors are also among the leading candidates for tunnel field-effect transistors (FETs) [11,12], which work with a sub-threshold swing below 60 mV per decade and thus a low supply voltage. Together with their high mechanical flexibility and stability, optical transparency, excellent optoelectronic properties [13] and compatibility with standard semiconductor processing technology, this could lead to energy-efficient and flexible electronics [14,15,16].
The field of TMD-based electronics has progressed enormously during the past few years. Soon after the first realizations of bulk [17,18] and monolayer [2] FETs, basic electronic circuits were demonstrated [19,20]. Both n-type (NMOS) [19,20,21] and complementary (CMOS) [22,23] metal-oxide–semiconductor technologies have been developed and a good understanding of the FET device physics has been gained [24,25,26]. The work on devices has been paralleled by the development of growth techniques [27,28,29,30] for the large-scale fabrication of TMD films with good uniformity over the size of a wafer [30] and the development of technologies for transferring 2D materials onto bendable substrates [14,15,16]. Nevertheless, due to the plethora of challenges being faced in large-scale integration, previous work has so far been restricted to applications consisting of only a few transistors and with limited functionality. These challenges range from the necessity to match voltage levels and achieve high noise margins in cascaded logic stages to stringent requirements on device uniformity over millimetre-size dimensions.
Here, we demonstrate the feasibility of using a 2D semiconductor to realize a complex digital circuit—a microprocessor.
Figure 1a depicts the architectural block diagram of our microprocessor. For demonstration purposes, we minimized the transistor count and thus realized a device that operates on single-bit data only. We stress that this is not a fundamental limitation: the device is readily scalable to N-bit data, broadly speaking by connecting N of our devices in parallel. Although we reduced the architecture to the essentials, it comprises all basic building blocks that are common to most microprocessors. These are: an arithmetic logic unit (ALU), which forms the heart of the processor and is, in general, capable of performing basic arithmetic and logical operations (for simplicity, we have implemented only logical conjunction and disjunction here); an accumulator (AC), which holds one of the operands to be supplied to the ALU; an instruction register (IR), which stores the content of the program memory currently being executed, where the two most significant bits contain the instruction itself and the third bit contains the data; a control unit (CU), which receives the instruction code from the IR as input, orchestrates all resources by enabling components to access the internal bus via the control signals EA and EO, and conveys the operation selection code A/O to the ALU (conjunction, A/O=0; disjunction, A/O=1); a program counter (PC), which supplies the memory with the address of the active instruction; and, finally, an output register (OR), which allows the processor to transfer the results of a calculation to the output port. Although we retrieve the data directly from the program memory, our device can also process data stored in a separate data memory (Harvard architecture). In this case, the IR is supplied with an address that points to the data memory content, which is then placed on the bus. The memory is, as usual, implemented off-chip.
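The scaling argument can be illustrated with a short sketch. It assumes that an N-bit logic unit is simply N independent copies of the 1-bit ALU sharing the same A/O select line; the function names `alu_1bit` and `alu_nbit` are hypothetical and not taken from the paper.

```python
def alu_1bit(a, b, a_o):
    """1-bit ALU: conjunction when a_o == 0, disjunction when a_o == 1."""
    return (a | b) if a_o else (a & b)


def alu_nbit(a_bits, b_bits, a_o):
    """Apply one 1-bit slice per bit position; all slices share the A/O line.

    Assumption: the slices are independent, which holds for the purely
    logical operations implemented here.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    return [alu_1bit(a, b, a_o) for a, b in zip(a_bits, b_bits)]


if __name__ == "__main__":
    a = [1, 0, 1, 0]
    b = [1, 1, 0, 0]
    print("AND:", alu_nbit(a, b, 0))
    print("OR: ", alu_nbit(a, b, 1))
```

Because conjunction and disjunction act on each bit separately, no signal has to pass between slices in this sketch.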
Figure 1b depicts the timing diagram of the device, using three clock (CLK) signals. The execution of each instruction occurs in two sequences—a FETCH sequence followed by an execute (EXE) sequence. The FETCH sequence consists of two phases: in a first phase, the content of the external memory (at the address stored in the PC) is loaded into the IR; the PC is then incremented in a second step. During the EXE sequence, which is implemented here in a single phase, the microprocessor decodes and executes the command stored in the IR. This cycle is repeated continuously. Each phase is triggered by a CLK signal (CLK1, phase 1; CLK2, phase 2; CLK3, phase 3). In order to be flexible in terms of clock rate and timing, we generated the CLK signals externally; an on-chip implementation is straightforward. Figure 1c summarizes the instruction set that we have implemented. The instructions are encoded with two bits; some of them are followed by one bit of data. The no-operation (NOP) instruction has no effect other than to increase the PC. LDA allows the transfer of data from the memory into the AC. AND and OR perform logical conjunction and disjunction operations, respectively.
It is instructive to consider a simple example: a program fragment that loads a 0 into the accumulator (LDA, bit sequence 010) and then combines it with a 1 by logical conjunction (AND, bit sequence 101).
In a first step, triggered by CLK1, the bit sequence 010 is transferred from the memory into the IR. CLK2 then increments the PC, so that the next instruction becomes available but is not yet loaded into the IR. Triggered by CLK3, the CU then signals the AC (EA=1) to receive the data (0) from the IR via the internal bus. With the next CLK1 signal, the content of the IR is updated (IR=101), and the CU enables the ALU to perform a logical conjunction operation (A/O=0) between the data on the bus (1) and the value stored in the AC during the previous instruction. Triggered by CLK3, the result of this operation (0) is finally written into the OR (EO=1).
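The three-phase cycle can be mimicked in a few lines of software. This is a behavioural sketch only, not the hardware design. The opcodes LDA=01 and AND=10 follow from the bit sequences 010 and 101 quoted above, whereas NOP=00 and OR=11 are assumptions, as are the wrap-around of the program counter and the choice to leave the AC unchanged by AND and OR.

```python
# Opcodes: LDA and AND follow from the example in the text (010, 101).
# NOP = 00 and OR = 11 are assumed encodings.
NOP, LDA, AND, OR = 0b00, 0b01, 0b10, 0b11


def run(memory, cycles):
    """Run `cycles` instruction cycles and return (IR, AC, OR) after each one."""
    pc = ir = ac = out = 0
    trace = []
    for _ in range(cycles):
        ir = memory[pc]                  # CLK1: load memory[PC] into the IR
        pc = (pc + 1) % len(memory)      # CLK2: increment the PC (wraps, assumed)
        opcode, data = ir >> 1, ir & 1   # CLK3: decode and execute
        if opcode == LDA:
            ac = data                    # EA = 1: AC takes the data bit
        elif opcode == AND:
            out = ac & data              # A/O = 0, EO = 1
        elif opcode == OR:
            out = ac | data              # A/O = 1, EO = 1
        # NOP: nothing happens apart from the PC increment
        trace.append((format(ir, "03b"), ac, out))
    return trace


if __name__ == "__main__":
    program = [0b010, 0b101]             # LDA 0, then AND 1
    for step in run(program, 2):
        print(step)
```

For the fragment discussed above, the trace ends with the output register holding 0, in line with the described result.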
We now come to the actual device implementation using a 2D semiconductor. Our microprocessor was fabricated in gate-first technology on a silicon wafer with 280-nm-thick silicon dioxide. The substrate fulfills no other function than acting as a carrier medium and could thus be replaced by glass [31] or any other material, including flexible substrates [14,15,16]. We fabricated 18 devices per wafer, with FET channels made from large-area bilayer MoS2 films grown by chemical vapour deposition (CVD). Two Ti/Au metal layers were used to interconnect the transistors and Al2O3 was used as gate oxide. A detailed description of the device fabrication steps can be found in Methods. Subunits such as the ALU or the IR were provided with metal pads for individual testing in a wafer probe station. All subunits were eventually bonded together and the sample was placed back into the probe chamber, where it remained in vacuum for final testing of the complete circuit.
Figure 2a (bottom) shows a schematic drawing of the resulting MoS2 FET. The devices exhibit a field-effect mobility of ∼3 cm² V⁻¹ s⁻¹, a threshold voltage VT of ∼0.65 V (Supplementary Fig. 3), an on/off ratio of ∼10⁸, and uniform behaviour over a ∼50 mm² area of the wafer (Supplementary Fig. 4). The circuit is based on the NMOS logic family, where both pull-up (load) and pull-down networks were realized using n-type enhancement-mode FETs. The implementation of an inverter (see circuit schematic in Fig. 2d) using this logic family is shown in Fig. 2a (top). A careful design of the W/L ratios, where W and L denote the width and length of the FET channels, is crucial, as it determines the switching threshold voltage VM and thus the ability to cascade logic stages. For simple analytic modelling, we performed calculations based on long-channel FET theory [32]. The pull-down FET is described by $I_{D2} = K_2\,[(V_{IN} - V_T)\,V_{OUT} - V_{OUT}^2/2]$ in the triode regime and $I_{D2} = (K_2/2)\,(V_{IN} - V_T)^2$ in the saturation regime (red curves in Fig. 2e). The load FET is operated in the sub-threshold regime (VG1=0<VT), and thus acts as a current source over a large drain voltage range, $I_{D1} = K_1\,[1 - \exp(-\beta V_{D1})]$, with β being the reciprocal of the thermal potential. From the circuit schematic in Fig. 2d, it is apparent that $V_{D1} = V_{DD} - V_{OUT}$, and thus $I_{D1} = K_1\,[1 - \exp(-\beta (V_{DD} - V_{OUT}))]$ (blue symbols in Fig. 2e). The parameters K1 and K2 are taken from the experiment (Fig. 2b). By equating both currents, $I_{D1} = I_{D2}$, we obtain a relation between VOUT and VIN, from which the switching threshold VM can be determined (Supplementary Fig. 6). If both transistors are implemented with the same W/L ratio, VM drops below 1 V (Supplementary Fig. 6b), resulting in a low noise margin, especially in the presence of additional hysteresis. Asymmetric transistor design, on the other hand, allows VM to be shifted towards VDD/2 (Supplementary Fig. 6a), resulting in improved switching behaviour. The W/L ratios of the pull-up and pull-down transistors were hence made 45/2 (μm/μm) and 7/5, respectively.
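The determination of VM by equating the two currents can be sketched numerically. The code below assumes the standard long-channel expressions for the pull-down FET and a sub-threshold load of the form K1[1 − exp(−β(VDD − VOUT))]. The value K1 = 0.55 μA is the load current quoted later in the text; K2 is an illustrative assumption rather than a fitted value, and β assumes a thermal voltage of about 25.9 mV.

```python
import math

VDD = 5.0            # supply voltage (V), from the text
VT = 0.65            # threshold voltage (V), from the text
BETA = 1 / 0.0259    # reciprocal thermal voltage (1/V), room temperature assumed
K1 = 0.55e-6         # load current scale (A), from the quoted load current
K2 = 0.32e-6         # pull-down transconductance parameter (A/V^2), assumed


def load_current(v_out):
    """Sub-threshold load acting as a current source (assumed functional form)."""
    return K1 * (1 - math.exp(-BETA * (VDD - v_out)))


def pull_down_current(v_in, v_out):
    """Long-channel pull-down FET: cut-off, triode or saturation."""
    v_ov = v_in - VT
    if v_ov <= 0:
        return 0.0
    if v_out < v_ov:
        return K2 * (v_ov * v_out - v_out ** 2 / 2)
    return K2 / 2 * v_ov ** 2


def bisect(f, lo, hi, steps=60):
    """Root of a non-decreasing function with f(lo) <= 0 <= f(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def v_out(v_in):
    """Output voltage at which pull-down and load currents are equal."""
    return bisect(lambda v: pull_down_current(v_in, v) - load_current(v), 0.0, VDD)


def switching_threshold():
    """Input voltage VM at which VOUT equals VIN."""
    return bisect(lambda v: v - v_out(v), 0.0, VDD)


if __name__ == "__main__":
    print(f"VM = {switching_threshold():.2f} V")
```

With these illustrative parameters the threshold lands close to VDD/2. Reducing K1 relative to K2, as would happen if the load were made no wider than the pull-down transistor, moves VM towards lower voltages in this model.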
Logic NAND gates with M inputs were implemented by connecting M pull-down transistors with W/L=(M × 7)/5 in series. The processor was realized by using a combination of these elements. The minimum feature size of 2 μm was chosen rather large for two reasons. It makes the design immune to sample inhomogeneities (for example, small holes, cracks and contamination in the MoS2 film) and also allows for fast visual inspection of the lithographic structures with an optical microscope. Because of the immunity of 2D transistors to short-channel effects [7,8,9,10], we expect comparable performance when the devices are scaled to sub-micrometre dimensions, provided that low contact resistance can be achieved.
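The sizing rule can be checked with a first-order estimate. The sketch assumes the common approximation that FETs in series combine like conductances in series, so that the reciprocals of their W/L ratios add; the helper names are hypothetical.

```python
from fractions import Fraction

INVERTER_PULL_DOWN = Fraction(7, 5)   # W/L of the inverter pull-down FET


def nand_pull_down_ratios(m):
    """W/L of each of the M series pull-down FETs in an M-input NAND."""
    return [m * INVERTER_PULL_DOWN] * m


def series_equivalent(ratios):
    """Equivalent W/L of FETs in series (first-order approximation)."""
    return 1 / sum(1 / r for r in ratios)


if __name__ == "__main__":
    for m in (2, 3, 4):
        ratios = nand_pull_down_ratios(m)
        print(m, "inputs:", ratios[0], "each, equivalent", series_equivalent(ratios))
```

Under this approximation the series stack has the same effective W/L as the pull-down FET of the inverter, so the switching threshold of the NAND gates stays close to that of the inverter.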
Figure 2b shows the transfer characteristics of the load and pull-down transistors, where the ∼14 times higher current through the former demonstrates reliable control of the device characteristics by geometrical scaling. The output characteristic, depicted in Fig. 2c, shows clear current saturation due to channel pinch-off at the drain. The voltage transfer characteristics of our inverters exhibit excellent performance over a wide supply voltage range between VDD=2 and 7 V, with input and output logic levels being perfectly matched. Figure 2f (solid line) shows the results for VDD=5 V, for which the voltage gain reaches values of AV≈60. Although the voltage transfer curve shows some hysteresis (which mostly stems from trap charges in the gate oxide), the noise margin of the inverter (see shaded area in Fig. 2f), NM≈0.59 × (VDD/2), is sufficiently large for integration into multi-stage logic circuits. The NAND gates showed comparable performance. We estimate a static power consumption of $P = V_{DD}\,(I_{D,L} + I_{D,H})/2 \approx 1.4$ μW per logic gate, where ID,L and ID,H denote the currents at VIN=0 and 5 V (Fig. 2e), respectively. The total power consumption of the circuit, consisting of 41 stages, is thus ∼60 μW.
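The power figures can be reproduced with a short calculation. The sketch assumes that the static power of a gate is the supply voltage times the average of the currents in the two logic states. It further assumes that ID,L is negligible (pull-down FET switched off) and that ID,H equals the load current of about 0.55 μA quoted later in the text.

```python
VDD = 5.0          # supply voltage (V), from the text
I_D_LOW = 0.0      # current at VIN = 0 V (A); assumed negligible
I_D_HIGH = 0.55e-6 # current at VIN = 5 V (A); assumed equal to the load current
N_STAGES = 41      # number of logic stages, from the text


def static_power(vdd, i_low, i_high):
    """Average static power of one gate over both logic states (assumed model)."""
    return vdd * (i_low + i_high) / 2


if __name__ == "__main__":
    p_gate = static_power(VDD, I_D_LOW, I_D_HIGH)
    print(f"per gate: {p_gate * 1e6:.2f} uW")
    print(f"total:    {N_STAGES * p_gate * 1e6:.1f} uW")
```

Under these assumptions the per-gate value rounds to the quoted ≈1.4 μW, and the total for 41 stages comes out slightly below the quoted ∼60 μW.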
A microscope image of the microprocessor is shown in Fig. 3a. The device is composed of 115 MoS2 transistors and measures 0.6 mm² in size, excluding the bonding pads. Circuit schematics for a D-Latch and the ALU are shown in Fig. 3b,c, respectively. The complete schematic is presented in Supplementary Fig. 1. A D-Latch is a bi-stable circuit that can be used as a 1-bit data storage element, triggered by a CLK signal. It forms the basic building block of all our data registers (IR, AC and OR) and the PC. The ALU is a combinational logic circuit, entirely based on NANDs, that performs bitwise logic operations on 1-bit data. The additional input A/O signals to the ALU which operation to perform. Measurements of the ALU output for different input logic states are presented in Supplementary Fig. 8.
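Both building blocks can be modelled at gate level using NAND as the only primitive. The sketch below uses textbook NAND-only constructions of a gated D-latch and of a selectable AND/OR unit; these are assumptions for illustration and are not necessarily the schematics of Fig. 3b,c.

```python
def nand(*inputs):
    """NAND gate: output is 0 only when all inputs are 1."""
    return 0 if all(inputs) else 1


def inv(x):
    """Inverter built from a NAND with tied inputs."""
    return nand(x, x)


def alu(a, b, a_o):
    """NAND-only ALU: conjunction for a_o == 0, disjunction for a_o == 1."""
    and_ab = inv(nand(a, b))
    or_ab = nand(inv(a), inv(b))
    # Two-way multiplexer made of three NANDs plus an inverter.
    return nand(nand(and_ab, inv(a_o)), nand(or_ab, a_o))


class DLatch:
    """Gated D-latch from four NANDs (textbook form, assumed)."""

    def __init__(self):
        self.q, self.q_bar = 0, 1

    def update(self, d, clk):
        s = nand(d, clk)
        r = nand(s, clk)
        for _ in range(3):               # let the cross-coupled pair settle
            self.q = nand(s, self.q_bar)
            self.q_bar = nand(r, self.q)
        return self.q


if __name__ == "__main__":
    for a_o in (0, 1):
        print("A/O =", a_o, [alu(a, b, a_o) for a in (0, 1) for b in (0, 1)])
    latch = DLatch()
    print("store 1:", latch.update(1, 1), "hold:", latch.update(0, 0))
```

The latch is transparent while the clock input is 1 and holds its value while the clock input is 0, which is the behaviour needed for the registers described above.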
We first verified the functionality of the microprocessor by running the example program from above and measuring waveforms at different locations on the chip (see Methods for measurement details). As shown in Fig. 4a, the device is indeed able to deliver the correct result, with excellent signal integrity and with rail-to-rail performance, proving the ability to cascade logic stages based on 2D semiconductors. To further demonstrate the operability of the device, we present in Fig. 4b the results from a series of logical disjunction operations. The match of measured and expected outputs again shows correct operation. As shown in Supplementary Fig. 10, the device proved to be functional at CLK frequencies of 50 Hz. This is by no means a limitation of the TMD material itself, but is caused by the limitations of our measurement setup. Ultimately, the speed is limited by the current-driving capability of the pull-up transistor, which is operated in the sub-threshold regime (VGS=0<VT) and acts as a current source with ID≈0.55 μA. For a typical (external) capacitive load of CL≈1–10 pF, we estimate a maximum operation frequency of ≈2–20 kHz (Supplementary Fig. 11). To increase fMAX, ID could be increased by employing depletion-mode load FETs [20], by controlled chemical doping, by improving the carrier mobility of the 2D semiconductor or simply by reducing the transistor channel lengths.
In summary, we have reported a first step towards the development of microprocessors based on 2D semiconductors. The major challenge that we faced during device fabrication is yield. Although the yield for subunits was high (for example, ∼80% of ALUs were fully functional), the sheer complexity of the full system, together with the non-fault-tolerant design, resulted in an overall yield of only a few per cent of fully functional devices. Imperfections of the MoS2 film, mainly caused by the transfer from the growth substrate to the target substrate, were identified as the main source of device failure. However, as no metal catalyst is required for the synthesis of TMD films [27,28,29,30], direct growth on the target substrate is a promising route to improve yield. Besides that, we do not see any roadblocks that could prevent the scaling of our 1-bit design to multi-bit data. Our work demonstrates that integrated circuits consisting of 2D materials are a promising emerging technology.
Fabrication (Supplementary Fig. 7) started with patterning of the bottom metal (gate) layer by electron beam lithography (EBL) and evaporation of Ti/Au (5/25 nm). A 22-nm-thick Al2O3 gate oxide was then deposited using atomic layer deposition, followed by a second lithography step and wet chemical etching in potassium hydroxide to define the via-holes that connect the bottom and top metal layers where necessary. Following the procedure described in ref. 29, a large-area MoS2 film was grown by CVD on sapphire and then transferred onto the target wafer. The film is continuous over an area of ∼50 mm² with bilayer thickness and contains small multi-layer MoS2 islands and contamination. The MoS2 film was characterized by atomic force microscopy and Raman spectroscopy (Supplementary Fig. 2). In a third EBL step, rectangular MoS2 channels were patterned and subsequently etched using Ar/SF6 plasma. Before lift-off, mild treatment of the sample in oxygen plasma was performed to remove the crust from the surface of the polymer mask. The top metal (drain/source contact) layer was then formed by another EBL process and subsequent Ti/Au (5/35 nm) deposition. The sample was finally annealed in vacuum at 400 K for several hours to remove adsorbates from the surface and reduce device hysteresis.
For testing, we generated the CLK signals externally, using a digital I/O card (National Instruments PCI-6229) in a computer. The same card was used for emulating the external memory. The device was supplied with VDD=5 V, and waveforms were recorded with a Semiconductor Parameter Analyzer (Agilent 4155C), connected to the probe tips of a wafer probe station (Lakeshore TTPX).
The data that support the findings of this study are available from the corresponding author on request.
How to cite this article: Wachter, S. et al. A microprocessor based on a two-dimensional semiconductor. Nat. Commun. 8, 14948 doi: 10.1038/ncomms14948 (2017).
We thank Simone Schuler, Andreas Pospischil, Marco Furchi, Andreas Kleinl, Fabian Doná, Werner Schrenk, Markus Schinnerl, Peter Kröll and Benedikt Gottsbachner for technical assistance, Alois Lugstein and Emmerich Bertagnolli for providing access to CVD and atomic layer deposition systems, and Dumitru Dumcenco for helpful discussions. We acknowledge financial support by the Austrian Science Fund FWF (START Y 539-N16) and the European Union (Grant agreement No. 696656 Graphene Flagship).