Abstract
With the rapid development of artificial intelligence, parallel image processing is becoming an increasingly important capability of computing hardware. To meet the requirements of various image processing tasks, the basic pixel processing unit contains multiple functional logic gates and a multiplexer, which leads to notable circuit redundancy. The pixel processing unit therefore leaves considerable room for optimisation to address the area redundancy in parallel computing. Here, we demonstrate a pixel processing unit based on a single WSe2 transistor with multiple electrically switchable logic functions (AND and XNOR). We further integrate these pixel processing units into a low transistor-consumption image processing array, where both image intersection and image comparison tasks can be performed. At the same image processing power, the consumption of transistors in our image processing unit is less than 16% of that of traditional circuits.
Introduction
As artificial intelligence technology evolves, the amount of data that computers need to process has greatly increased, especially image and video data. To deal efficiently with increasingly massive data, adopting parallel computing hardware is the market trend. Parallel computing has been considered “the high end of computing” and is more suitable for hardware acceleration and image processing applications1,2. In commercial neural network hardware acceleration applications, the widely used NVIDIA graphics processing unit3 and Google Tensor processing unit4 are both based on parallel computing architectures. For image data, multiple pixels are processed in parallel for efficiency. Depending on the computing task, different function modules need to be involved in each unit; for example, the logic AND function is applied to find the intersection of images, while the logic XNOR function is applied to compare the similarity of images. This requires each pixel processing unit to contain multiple functional modules at the hardware level, which are switchable according to task requirements. The existing technical path has a very high area redundancy because each functional module of the pixel processing unit is physically implemented by a different circuit and the function selection depends on additional control circuits (such as multiplexers). The highly parallel computing process further increases the circuit redundancy because the complex pixel processing units need to be repeated in an array.
Unlike bulk materials, two-dimensional (2D) materials have atomic-level thickness and abundant electronic characteristics, which give them potential in the design of emerging electronic devices5,6,7. Many studies have been performed to build logic circuits using 2D materials, such as n-type metal-oxide-semiconductor (NMOS) logic circuits8,9,10,11 and various reconfigurable logic gate schemes12,13,14,15,16,17,18. These works have made remarkable progress in functional integration by utilising the characteristics of 2D materials. To promote 2D material devices to practical parallel computing applications, there are still important challenges to address, such as electrically switchable logic functions and compact device structures. Until now, a compact transistor structure that has multiple electrically switchable logic functions has been missing. A compact structure that only has the requisite terminals for power supply and input/output can guarantee the area efficiency of circuits. Multiple electrically switchable logic functions can satisfy different task processing demands. Such a device has the potential to perform the functions of a single-pixel computing unit to handle different graphics tasks.
In this work, we experimentally demonstrate an image processing array based on the 2D material WSe2. Through drain voltage regulation of the carrier injection barrier, the logic function is switchable between AND and XNOR in a single transistor without additional terminals or multiplexer circuits, which means that a single device can serve as the pixel processing unit. This single-transistor pixel processing unit greatly decreases the consumption of transistors in logic circuits (1 transistor implements logic AND and XNOR) compared to the NMOS logic family (2 transistors for logic AND, 8 transistors for logic XNOR, and additional multiplexer circuits). By assembling these WSe2 pixel processing units into an array, the image processing array can handle different graphic processing tasks, such as finding the intersection or similarity of images. At the same processing power, the transistor consumption of our image processing array is <16% of that of the traditional scheme, which has the potential to remove the circuit redundancy issue in parallel computing.
Results
Low transistor consumption image processing array
For image processing, each pixel data point is processed by a pixel processing unit. Physically using different circuits to implement various logic gates and adding multiplexers to switch functions increases the redundancy of the circuit. According to the two-surface-channel (TSC) working mechanism12, logic computing can be performed in a single double-gated transistor. Utilising the voltage-modulated barrier effect, we have successfully implemented a pixel processing unit with only a single TSC WSe2 transistor.
Figure 1a shows the packaged image processing array based on the TSC WSe2 transistors, and Fig. 1b shows the functional area, where the terminals are marked in different colours according to their electrical connections (green is input1, purple is input2, red is Op-Instruction and grey is output). The array consists of 3 × 3 TSC WSe2 transistors, and all the devices are encapsulated with an Al2O3 dielectric. The detailed fabrication process and energy dispersive spectroscopy (EDS) analysis are provided in Supplementary Notes 1 and 2. In addition, we demonstrate the scalability of the process in Supplementary Note 3 by using large-area CVD thin films. The devices have an adjustable switching ratio to adapt to different operating conditions (Supplementary Note 4). Figure 1c shows the circuit layout and its image processing abilities. The cross-sectional transmission electron microscopy (TEM) image and the schematic device structure of one of the pixel computing units are shown in Fig. 1d. The top and bottom gates function as inputs, the source terminal is chosen as the output (current signal) and the drain serves as the Op-Instruction (voltage signal). It is worth noting that no additional terminal is introduced to regulate the logic function of the device; therefore, the area efficiency is not sacrificed for multiple logic functions.
Mechanism of electrically switchable logic functions
Unlike the unipolar n-type semiconductor MoS2, which has sulfur vacancies and strong Fermi level pinning near the conduction band19, WSe2 is an ambipolar semiconductor whose Fermi level has been shown to shift effectively between the valence band and conduction band under an external field20. This makes it possible for both holes and electrons to act as carriers in the WSe2 channel when different voltages are applied. Therefore, WSe2 was selected as the channel material because a WSe2 transistor should have the potential to show various logic functions.
First, we studied the modulation effect of the drain-source voltage (VDS) on the polarity of the TSC WSe2 transistor. As Fig. 2a shows, the characteristic drain-source current (IDS)-bottom gate voltage (VBG) transfer curves are used to calibrate the basic performance of the device (with no input applied to the top gate, TG). As VDS increases, the majority carriers in the channel change significantly from only electrons to both electrons and holes. To analyse the mechanism, the band structures of the device in the electron-dominated region (VBG = −2 V) and the hole-dominated region (VBG = −9 V) at different VDS values were plotted. For identification, we colour-coded the carrier types: the blue and red regions are hole- and electron-dominated, respectively.
Figure 2b shows the evolution of the band diagram in the electron-dominated region with increasing VDS. Because the Al2O3 deposition process has an n-doping effect on the WSe2 channel, the device exhibits n-type behaviour at a low drain voltage. As VDS increases, the electron current does not show a noteworthy change. From the corresponding band diagram, we can see that the electrons are injected into the channel from the source electrode. The electron injection barrier (source/channel interface) is regulated by the gate voltage, and the drain voltage has little impact on this barrier because the source is grounded. Therefore, the electron-dominated region is not greatly affected by VDS. Figure 2c shows the evolution of the band diagram in the hole-dominated region with increasing VDS. In this case, the hole current changes as VDS increases. Unlike in the electron-dominated region, the hole current is determined by the barrier at the drain/channel interface and is regulated by both the gate voltage and the drain voltage. A large VDS thins the triangular hole barrier and distinctly increases the number of holes injected from the drain electrode into the channel.
We have demonstrated above that the hole current of the WSe2 channel (encapsulated with Al2O3) can be modulated by the drain voltage. This feature can be utilised in the TSC transistor to realise electrically switchable logic functions without additional control terminals. In Fig. 3, we swept both the top gate and bottom gate voltages from −9 V to −2 V to obtain the output current map. The horizontal and vertical axes represent VBG and VTG, respectively (the data came from the first pixel processing unit in the array). The test data of the remaining units are provided in Supplementary Note 6. According to the principle of the TSC transistor, the output current is jointly controlled by the two gates. When the two gate input voltages are “0” (−9 V) and “1” (−2 V), respectively, the TG and BG compete with each other for control of the channel potential and conduction. This competition leads to an intermediate channel potential that makes the Schottky barrier high for both electrons and holes, and the channel is turned off. When the two gate input voltages are both “0” or both “1”, the conductivity of the channel is controlled by VDS. When VDS is between 1 V and 3 V, the output current in the bottom-left corner (hole-dominated region) is much lower than that in the top-right corner (electron-dominated region), and the corresponding logic function of the TSC WSe2 transistor is the AND gate (Fig. 3a). When VDS is above 3 V, the hole current is nearly the same as the electron current, and the corresponding logic function of the device is the XNOR gate (Fig. 3b).
Taking the VDS as Op-Instruction and IDS as the output current signal, the transistor logic function can be switched by VDS without additional control terminals or multiplexer circuits. We can see that as the Op-Instruction signal changes from 1 V to 5 V, the basic functions of this single-pixel computing unit are switched from AND to XNOR.
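The switching behaviour described above can be summarised in a small behavioural sketch. This is only an idealised truth-table model, not a device simulation: the logic levels (−9 V and −2 V) and Op-Instruction values (1 V and 5 V) are taken from the text, while the hard 3 V threshold, the midpoint comparison and all function names are assumptions introduced for illustration (the measured transition is gradual).

```python
# Idealised behavioural model of one pixel processing unit (illustrative only).
LOGIC_LOW_V = -9.0   # gate voltage encoding logic "0" (from the text)
LOGIC_HIGH_V = -2.0  # gate voltage encoding logic "1" (from the text)
AND_VDS = 1.0        # Op-Instruction selecting AND (from the text)
XNOR_VDS = 5.0       # Op-Instruction selecting XNOR (from the text)
VDS_SWITCH = 3.0     # assumed hard threshold; the real transition is gradual


def to_bit(gate_voltage: float) -> int:
    """Map a gate voltage to a logic bit using the midpoint (an assumption)."""
    midpoint = (LOGIC_LOW_V + LOGIC_HIGH_V) / 2
    return 1 if gate_voltage > midpoint else 0


def pixel_unit(v_tg: float, v_bg: float, v_ds: float) -> int:
    """Return the idealised output: 1 for a high source current, 0 for off."""
    a, b = to_bit(v_tg), to_bit(v_bg)
    if a != b:
        return 0  # the two gates compete and the channel is turned off
    if a == 1:
        return 1  # electron-dominated region, barely affected by VDS
    # Both inputs "0": the hole branch only conducts at a large VDS.
    return 1 if v_ds > VDS_SWITCH else 0


def truth_table(v_ds: float) -> list:
    """Outputs for (TG, BG) = (0,0), (0,1), (1,0), (1,1) at a given VDS."""
    levels = (LOGIC_LOW_V, LOGIC_HIGH_V)
    return [pixel_unit(tg, bg, v_ds) for tg in levels for bg in levels]


print(truth_table(AND_VDS))   # AND truth table
print(truth_table(XNOR_VDS))  # XNOR truth table
```

In this model the only difference between the two instructions is whether the hole-dominated corner (both inputs “0”) conducts, which mirrors the drain-voltage dependence of the hole injection barrier.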
Analysis of transistor consumption
To visualise the consumption of transistors in various logic circuits, the normalised transistor consumption is used to measure the number of transistors required in the different logic gates relative to NMOS-based logic circuits. According to the output signal type, logic gates can be divided into V (voltage input)–I (current output) logic and V (voltage input)–V (voltage output) logic. By adding a depletion-load transistor, V–I logic can always be transformed into V–V logic; therefore, the transistor consumption calculation is based on V–V logic.
As the baseline, the transistor number of NMOS logic is 9 (XOR), 3 (NAND) and 3 (NOR). The transistor consumption calculation method is given by
where TXOR, TNAND, and TNOR indicate the number of transistors required to implement logic XOR, NAND, and NOR, respectively, and PG represents the number of planar gates in a single transistor.
Table 1 is the transistor consumption summary of various logic transistor technologies. In Supplementary Note 5, we analysed the circuit structure of each work and the calculation method of transistor consumption in detail. To solve the circuit redundancy issue in parallel computing, multiple logic functions should be implemented by as few transistors as possible. Notably, the addition of more multiplexer or control gate terminals to switch logic functions will also induce extra circuit redundancy. As Table 1 shows, a single compact TSC WSe2 transistor can implement a pixel processing unit, and the transistor consumption is <16% of the NMOS logic baseline.
Demonstration of image processing tasks
Based on this TSC WSe2 transistor, we fabricated a 3 × 3 image processing array. We demonstrate two different kinds of image processing tasks (finding the intersection or similarity of two images) in the same low transistor consumption hardware. The input images are simplified to binary pixels: input signals “0” (−9 V) and “1” (−2 V) represent pixels of low and high grey level, and the two images are fed to the top gates and bottom gates, respectively. The drain voltage is used as the Op-Instruction, and the source current is selected as the output signal. When the Op-Instruction imposes the AND instruction (1 V), the array realises the function of image intersection. When the XNOR instruction (5 V) is imposed, the array compares the similarity of the two images. Next, we explain the specific implementation method.
Figure 4 demonstrates how to find the intersection of two images in our array. As shown in Fig. 4a, a 3 × 3 pixel graph can be encoded as a 9 × 1 binary array, as we defined earlier. The two images are input to the top gate and bottom gate of each pixel computing unit. The drain electrodes of all pixel computing units are set to 1 V (Op-Instruction: AND), and the sampled currents are used as the output signals. Once the input and Op-Instruction have been given, the computing results are output synchronously. Through the colour bar, we can obtain the intersecting parts of the two input graphs. In addition, we randomly generated 100 groups of images (Row 1 and Row 2 in Fig. 4b). The ideal output (Truth Table) and the experimental results are shown in Row 3 and Row 4 of Fig. 4b, respectively. All the experimental output data are consistent with the simulation results.
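The ideal (truth table) output of the intersection task can be sketched in a few lines. This is a minimal illustration assuming row-by-row flattening of the 3 × 3 image, which the text does not specify; the two example patterns and the function names are hypothetical and are not taken from Fig. 4.

```python
def flatten(image):
    """Encode a 3 x 3 binary image as a 9-element list (row by row, assumed)."""
    return [pixel for row in image for pixel in row]


def intersect(image_a, image_b):
    """Pixel-wise AND: the ideal response under the AND Op-Instruction (1 V)."""
    return [a & b for a, b in zip(flatten(image_a), flatten(image_b))]


# Hypothetical example patterns (1 = high grey level, 0 = low grey level).
image_t = [[1, 1, 1],
           [0, 1, 0],
           [0, 1, 0]]
image_plus = [[0, 1, 0],
              [1, 1, 1],
              [0, 1, 0]]

result = intersect(image_t, image_plus)
print(result)  # only the shared vertical bar remains
```

Each element of the result corresponds to one pixel computing unit, so all nine outputs would be available at once in the array rather than computed in a loop.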
The demonstration of image similarity comparison is shown in Fig. 5a. The input graphs are processed and encoded in the same way as before, and the Op-Instruction is set to 5 V (XNOR). To judge the similarity of two input images, the 9 × 1 output data are passed through an activation function (sigmoid function). After the sigmoid function, the values are summed to obtain a final score (0–9). The score measures whether the two images are matching (9), mismatching (0) or partially matching (between 0 and 9). Figure 5b shows the test data of 26 randomly compared letter patterns. Only when the two letters are the same does the system output a value close to 9; for different letters, the score indicates how similar the two letters are. More details and data processing methods are described in Supplementary Note 7.
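The scoring step can be sketched as follows. The actual data processing is described in Supplementary Note 7 and is not reproduced here; this sketch assumes ideal XNOR outputs normalised to 0 or 1, and the `gain` and `threshold` parameters of the sigmoid are hypothetical values chosen only so that each pixel contributes close to 0 or 1.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def similarity_score(bits_a, bits_b, gain=12.0, threshold=0.5):
    """Sum of sigmoid-activated XNOR outputs for two 9-element binary arrays.

    `gain` and `threshold` are illustrative assumptions, not values from the paper.
    """
    score = 0.0
    for a, b in zip(bits_a, bits_b):
        xnor = 1.0 if a == b else 0.0  # idealised normalised output current
        score += sigmoid(gain * (xnor - threshold))
    return score


pattern = [1, 1, 1, 0, 1, 0, 0, 1, 0]
inverse = [1 - p for p in pattern]

print(similarity_score(pattern, pattern))  # close to 9 (matching)
print(similarity_score(pattern, inverse))  # close to 0 (mismatching)
```

With measured currents instead of ideal bits, the normalisation and sigmoid parameters would need to be fitted to the device output range.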
Discussion
In conclusion, we demonstrate a low transistor consumption image processing array based on TSC WSe2 transistors. Compared with existing complex logic circuits, this array system uses a single transistor to implement a pixel processing unit, greatly reducing circuit redundancy and improving transistor utilisation. Because the TSC WSe2 transistor can electrically switch between AND and XNOR logic functions, two different image processing tasks can be carried out in the same hardware without complex multiplexer circuits. We believe that this low transistor consumption scheme has the potential to solve the circuit redundancy issue in parallel computing.
Methods
Fabrication of 3 × 3 pixel processing array
We mechanically exfoliated a ~60 × 90 μm² WSe2 flake from a bulk crystal (from HQ Graphene) and transferred it onto a SiO2/Si substrate (300 nm SiO2 grown on p-doped Si). Reactive ion etching with Ar/CF4 plasma can be used to thin the flake if needed. Then, a 3 × 3 WSe2 flake array was patterned from it by defining a mask with electron-beam lithography (EBL) and removing the unmasked material by reactive ion etching with Ar/CF4 plasma. Cr/Au stacks (4 nm/18 nm) were patterned and deposited on another SiO2/Si substrate as the bottom gate electrodes by using EBL and electron-beam evaporation (EBE). Then, a 30-nm-thick Al2O3 gate oxide was deposited by atomic layer deposition (ALD) as the back gate dielectric. The patterned WSe2 flakes were transferred from the SiO2/Si substrate to the correct position on the Al2O3 gate oxide under an optical microscope by using a water-soluble transparent PVA film. Next, the source and drain electrodes were also patterned from Cr/Au stacks, and the top gate oxide (~30 nm Al2O3) was deposited on the channel surface by ALD (with 2 nm Al2O3 deposited by EBE as a seed layer). Finally, the top gate electrodes were patterned and deposited by EBE. After fabrication, the devices were annealed at 250 °C in a nitrogen atmosphere for 2 h to ensure good contact between the metal electrodes and the semiconductor.
Electrical measurements
In this study, all electronic measurements were performed at room temperature and under ambient conditions. The electronic measurements were conducted using a commercial KEYSIGHT B1500A source/measure unit on a probe station (Cascade Summit 11000).
Data availability
Relevant data supporting the key findings of this study are available within the article and the Supplementary Information file. All raw data generated during the current study are available from the corresponding authors upon reasonable request.
References
Barney, B. Introduction to Parallel Computing. Vol. 6, p. 10 (Lawrence Livermore National Laboratory, 2010).
Asanovic, K. et al. A view of the parallel computing landscape. Commun. ACM 52, 56–67 (2009).
Yang, Z., Zhu, Y. & Pu, Y. Parallel image processing based on CUDA. In 2008 International Conference on Computer Science and Software Engineering 198–201 (IEEE, 2008).
Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture 1–12 (IEEE, 2017).
Chhowalla, M., Jena, D. & Zhang, H. Two-dimensional semiconductors for transistors. Nat. Rev. Mater. 1, 16052 (2016).
Liu, Y. et al. Promises and prospects of two-dimensional transistors. Nature 591, 43–53 (2021).
Liu, C. et al. Two-dimensional materials for next-generation computing technologies. Nat. Nanotechnol. 15, 545–557 (2020).
Lin, Z. et al. Solution-processable 2D semiconductors for high-performance large-area electronics. Nature 562, 254–258 (2018).
Wachter, S., Polyushkin, D. K., Bethge, O. & Mueller, T. A microprocessor based on a two-dimensional semiconductor. Nat. Commun. 8, 14948 (2017).
Yu, L. et al. Design, modeling, and fabrication of chemical vapor deposition grown MoS2 circuits with E-mode FETs for large-area electronics. Nano Lett. 16, 6349–6356 (2016).
Xiang, L. et al. Low-power carbon nanotube-based integrated circuits that can be transferred to biological surfaces. Nat. Electron. 1, 237–245 (2018).
Liu, C. et al. Small footprint transistor architecture for photoswitching logic and in situ memory. Nat. Nanotechnol. 14, 662–667 (2019).
Wu, P., Reis, D., Hu, X. S. & Appenzeller, J. Two-dimensional transistors with reconfigurable polarities for secure circuits. Nat. Electron. 4, 45–53 (2020).
Resta, G. V. et al. Doping-free complementary logic gates enabled by two-dimensional polarity-controllable transistors. ACS Nano 12, 7039–7047 (2018).
Marega, G. M. et al. Logic-in-memory based on an atomically thin semiconductor. Nature 587, 72–77 (2020).
Xiong, X. et al. Reconfigurable logic-in-memory and multilingual artificial synapses based on 2D heterostructures. Adv. Funct. Mater. 30, 1909645 (2020).
Chen, H. et al. Logic gates based on neuristors made from two-dimensional materials. Nat. Electron. 4, 399–404 (2021).
Pan, C. et al. Reconfigurable logic and neuromorphic circuits based on electrically tunable two-dimensional homojunctions. Nat. Electron. 3, 383–390 (2020).
Kim, C. et al. Fermi level pinning at electrical metal contacts of monolayer molybdenum dichalcogenides. ACS Nano 11, 1588–1596 (2017).
Wang, Z. et al. The ambipolar transport behavior of WSe2 transistors and its analogue circuits. NPG Asia Mater. 10, 703–712 (2018).
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61925402, 61851402, 61734003, 62090032 and 62004040), Science and Technology Commission of Shanghai Municipality (19JC1416600), Shanghai Education Development Foundation and Shanghai Municipal Education Commission Shuguang Program (18SG01), and China Postdoctoral Science Foundation (2019M661358, 2019TQ0065).
Author information
Contributions
C.L. and P.Z. conceived the idea; C.L. and S.Z. designed and conducted the experiments; X.H., Z.T. and L.L. provided valuable input in the experiments; S.Z., C.L. and P.Z. co-wrote the manuscript and all authors contributed to the discussion and revision of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review information
Nature Communications thanks Wei Cao and the other anonymous reviewer for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zeng, S., Liu, C., Huang, X. et al. An application-specific image processing array based on WSe2 transistors with electrically switchable logic functions. Nat Commun 13, 56 (2022). https://doi.org/10.1038/s41467-021-27644-3
DOI: https://doi.org/10.1038/s41467-021-27644-3