Classification with a disordered dopant-atom network in silicon

Abstract

Classification is an important task at which both biological and artificial neural networks excel1,2. In machine learning, nonlinear projection into a high-dimensional feature space can make data linearly separable3,4, simplifying the classification of complex features. Such nonlinear projections are computationally expensive in conventional computers. A promising approach is to exploit physical materials systems that perform this nonlinear projection intrinsically, because of their high computational density5, inherent parallelism and energy efficiency6,7. However, existing approaches either rely on the systems’ time dynamics, which requires sequential data processing and therefore hinders parallel computation5,6,8, or employ large materials systems that are difficult to scale up7. Here we use a parallel, nanoscale approach inspired by filters in the brain1 and artificial neural networks2 to perform nonlinear classification and feature extraction. We exploit the nonlinearity of hopping conduction9,10,11 through an electrically tunable network of boron dopant atoms in silicon, reconfiguring the network through artificial evolution to realize different computational functions. We first solve the canonical two-input binary classification problem, realizing all Boolean logic gates12 up to room temperature, demonstrating nonlinear classification with the nanomaterial system. We then evolve our dopant network to realize feature filters2 that can perform four-input binary classification on the Modified National Institute of Standards and Technology handwritten digit database. Implementation of our material-based filters substantially improves the classification accuracy over that of a linear classifier directly applied to the original data13. Our results establish a paradigm of silicon-based electronics for small-footprint and energy-efficient computation14.
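
To illustrate the nonlinear-projection idea that the abstract refers to (a generic illustration, not the authors' implementation), the sketch below shows how the XOR problem, which is not linearly separable in its two-dimensional input space, becomes separable after a simple hand-chosen nonlinear feature map. All names and values in the snippet are illustrative; in the experiment the dopant network itself plays the role of the feature map, with the control voltages selecting which projection is realized.

```python
# Minimal sketch (not from the paper): XOR is not linearly separable in its
# original 2D input space, but a simple nonlinear projection into a
# higher-dimensional feature space makes it separable by a single hyperplane.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([0, 1, 1, 0])                                   # XOR labels

def project(x):
    """Hypothetical nonlinear feature map: append the product term x1*x2."""
    return np.array([x[0], x[1], x[0] * x[1]])

Z = np.array([project(x) for x in X])

# In the projected space a linear readout separates the classes,
# for example the hyperplane z1 + z2 - 2*z3 - 0.5 = 0.
w, b = np.array([1.0, 1.0, -2.0]), -0.5
pred = (Z @ w + b > 0).astype(int)
print(pred)                      # [0 1 1 0] -> matches the XOR labels
assert np.array_equal(pred, y)
```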

Fig. 1: Simplifying classification by nonlinear projection.
Fig. 2: Device structure and charge transport mechanism.
Fig. 3: Evolution of Boolean logic.
Fig. 4: Feature filtering and handwritten digit classification.

Data availability

Data are available from the corresponding author upon reasonable request.

References

  1. Hubel, D. H. & Wiesel, T. N. Receptive fields of single neurones in the cat’s striate cortex. J. Physiol. 148, 574–591 (1959).

  2. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  3. Haykin, S. Neural Networks and Learning Machines (Pearson Prentice Hall, 2008).

  4. Cover, T. M. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans. Electron. Comput. EC-14, 326–334 (1965).

  5. Torrejon, J. et al. Neuromorphic computing with nanoscale spintronic oscillators. Nature 547, 428–431 (2017).

  6. Tanaka, G. et al. Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123 (2019).

  7. Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).

  8. Du, C. et al. Reservoir computing using dynamic memristors for temporal information processing. Nat. Commun. 8, 2204 (2017).

  9. Hung, C. S. & Gliessman, J. R. Resistivity and Hall effect of germanium at low temperatures. Phys. Rev. 96, 1226–1236 (1954).

  10. Mott, N. F. Conduction in glasses containing transition metal ions. J. Non Cryst. Solids 1, 1–17 (1968).

  11. Gantmakher, V. F. Electrons and Disorder in Solids (Clarendon Press, 2005).

  12. Bose, S. K. et al. Evolution of a designless nanoparticle network into reconfigurable Boolean logic. Nat. Nanotechnol. 10, 1048–1052 (2015).

  13. LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

  14. Xu, X. et al. Scaling for edge inference of deep neural networks. Nat. Electron. 1, 216–222 (2018).

  15. Zabrodskii, A. G. & Zinov’eva, K. N. Low-temperature conductivity and metal-insulator transition in compensate n-Ge. Sov. Phys. JETP 59, 425–433 (1984).

  16. Jenderka, M. et al. Mott variable-range hopping and weak antilocalization effect in heteroepitaxial Na2IrO3 thin films. Phys. Rev. B 88, 045111 (2013).

  17. Miller, J. F. & Downing, K. Evolution in materio: looking beyond the silicon box. In Proc. 2002 NASA/DoD Conference on Evolvable Hardware 167–176 (IEEE, 2002).

  18. Harding, S. & Miller, J. F. Evolution in materio: a tone discriminator in liquid crystal. In Proc. 2004 Congress on Evolutionary Computation 1800–1807 (IEEE, 2004).

  19. Mohid, M. & Miller, J. F. Evolving robot controllers using carbon nanotubes. In Proc. 13th European Conference on Artificial Life 106–113 (MIT Press, 2015).

  20. Wolfram, S. Approaches to complexity engineering. Physica D 22, 385–399 (1986).

  21. Backus, J. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Commun. ACM 21, 613–641 (1978).

  22. Maass, W., Natschläger, T. & Markram, H. Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14, 2531–2560 (2002).

  23. Dale, M., Stepney, S., Miller, J. F. & Trefzer, M. Reservoir computing in materio: an evaluation of configuration through evolution. In Proc. 2016 IEEE Symposium Series on Computational Intelligence (SSCI) 1–8 (IEEE, 2016).

  24. Björk, M. T., Schmid, H., Knoch, J., Riel, H. & Riess, W. Donor deactivation in silicon nanostructures. Nat. Nanotechnol. 4, 103–107 (2009).

  25. Pierre, M. et al. Single-donor ionization energies in a nanoscale CMOS channel. Nat. Nanotechnol. 5, 133–137 (2010).

  26. Hartstein, A. & Fowler, A. B. High temperature ‘variable range hopping’ conductivity in silicon inversion layers. J. Phys. C 8, L249–L253 (1975).

  27. Minsky, M. & Papert, S. Perceptrons: An Introduction to Computational Geometry (MIT Press, 1969).

  28. Chen, T. et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. ACM SIGPLAN Not. 49, 269–284 (2014).

  29. Lee, J. et al. UNPU: an energy-efficient deep neural network accelerator with fully variable weight bit precision. IEEE J. Solid-State Circuits 54, 173–185 (2019).

  30. Li, C. et al. Analogue signal and image processing with large memristor crossbars. Nat. Electron. 1, 52–59 (2018).

  31. Tapson, J. & van Schaik, A. Learning the pseudoinverse solution to network weights. Neural Netw. 45, 94–100 (2013).

  32. Such, F. P. et al. Deep neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. Preprint at http://arxiv.org/abs/1712.06567 (2017).

  33. Kingma, D. P. & Ba, L. J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2015).

  34. Aharony, A., Zhang, Y. & Sarachik, M. P. Universal crossover in variable range hopping with Coulomb interactions. Phys. Rev. Lett. 68, 3900–3903 (1992).

  35. Pettersson, J. et al. Extending the high-frequency limit of a single-electron transistor by on-chip impedance transformation. Phys. Rev. B 53, R13272–R13274 (1996).

  36. The Green500. TOP500.org https://www.top500.org/green500/ (2019).

  37. Hu, M. et al. Memristor-based analog computation and neural network classification with a dot product engine. Adv. Mater. 30, 1705914 (2018).


Acknowledgements

We thank T. Bolhuis, M. H. Siekman and J. G. M. Sanderink for technical support. We thank C. P. Lawrence, B. J. Geurts, M. Nass, A. J. Annema, M. Dale and J. Dewhirst for discussions. We thank W. M. Elferink, R. Hori, J. Wildeboer and T. Dukker for help with measurements. We acknowledge financial support from the MESA+ Institute for Nanotechnology, and the Netherlands Organisation for Scientific Research (NWO): NWA Startimpuls grant number 680-91-114 and Natuurkunde Projectruimte grant number 400-17-607.

Author information

Authors and Affiliations

Authors

Contributions

T.C. and W.G.v.d.W. designed the experiments. J.v.G., T.C., B.v.d.V. and S.V.A. fabricated the samples. T.C., J.v.G. and B.v.d.V. performed the measurements and simulations. T.C. analysed the data with input from all authors. H.B., H.C.R.E. and P.A.B. provided theoretical input. B.d.W. and H.C.R.E. contributed to the measurement scripts. T.C. and W.G.v.d.W. wrote the manuscript, and all authors contributed to revisions. W.G.v.d.W. and F.A.Z. conceived the project. W.G.v.d.W. supervised the project. F.A.Z. co-supervised the sample fabrication.

Corresponding author

Correspondence to Wilfred G. van der Wiel.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Cyrus Hirjibehedin and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Fabrication steps and dopant concentration.

a, Thermal oxidation. b, Implantation window definition and growth of 35 nm oxide. c, Ion implantation. d, Photolithography and contact-pad lift-off. e, Electron-beam lithography and nanoelectrode lift-off. f, Reactive ion etching (RIE) of silicon. g, Height profile of the metal electrodes with respect to silicon before (black) and after (red) RIE. The etch depth of silicon is estimated by measuring the height change of the metal electrodes with respect to the silicon surface (indicated by the black line on the atomic force microscopy image in the inset, not to scale). Assuming that the metal is not etched by RIE, the etch depth of silicon is around 83 nm. h, Secondary ion mass spectrometry of the boron dopant depth profile after implantation. On the basis of the etch depth, the boron concentration near the recessed silicon surface is of the order of 5 × 10^17 cm^−3.

Extended Data Fig. 2 Nonlinear and tunable hopping conduction.

a, I–V characteristics at 4.2 K for different total RIE etching times. As the total etching time increases, the nonlinearity becomes increasingly prominent, signalling the dominance of hopping conduction. b, Drain current versus control voltage for a constant source–drain voltage VSD = 1.2 V at 4.2 K. The source (S), drain (D) and control (C) electrodes are shown in the inset. The hysteresis at negative gate voltage is probably due to charging of the other five floating electrodes. c, Schematic plot of the electrochemical potential µ versus position r, illustrating the tunability. The solid lines represent impurity states and the arrows represent hopping of carriers between states. See Supplementary Note 3 for a detailed discussion. d, Fits of the temperature-dependent I–V curves to the model described by equation (2) in Supplementary Note 2. Black dashed lines represent the fitted curves. e, Conductance versus the reciprocal of the cube root of the source–drain voltage at different temperatures. The black circle groups data at temperatures below 140 K. See also Supplementary Note 2 for further discussion.
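
As a schematic illustration of the scaling check in panel e (synthetic data, not the measured curves), one can plot the conductance G = I/V against VSD^−1/3 and look for a straight line on a semi-logarithmic scale:

```python
# Sketch of the panel-e scaling check with a synthetic I–V trace; replace
# `voltage` and `current` with measured source–drain data. In the hopping
# regime, ln(G) is expected to be roughly linear in V**(-1/3).
import numpy as np
import matplotlib.pyplot as plt

voltage = np.linspace(0.2, 2.0, 50)                            # V (synthetic)
current = 1e-9 * np.exp(-1.5 * voltage**(-1/3.0)) * voltage    # A (synthetic)

conductance = current / voltage                                # S
x = voltage**(-1/3.0)

plt.semilogy(x, conductance, "o")
plt.xlabel(r"$V_{SD}^{-1/3}$ (V$^{-1/3}$)")
plt.ylabel("Conductance (S)")
plt.show()
```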

Extended Data Fig. 3 Evolved logic gates at 77 K.

a, Abundance plot of the 14 non-trivial truth tables at 77 K. From a search with 10,000 sets of randomly generated control voltages, we found all 16 possible truth tables that can be realized in a two-input–one-output configuration. b, Thermal stability of a NAND gate evolved at 77 K. Above 140 K the output current was clipped at the compliance limit, and therefore the fitness was not extracted. The error bars represent the standard deviation of ten tests (see also Supplementary Note 4). c, Boolean logic gates evolved at 77 K in a device other than the one in Fig. 3c. Red circles are experimental output currents, and black lines represent the normalized desired output currents. The left six panels show the six major logic gates evolved with input voltage levels of 0 V and 0.5 V. The right two panels show a NAND and an XNOR gate evolved with input voltage levels of −0.25 V and 0.25 V, demonstrating the adaptability of the dopant network to different voltage levels (see also Supplementary Note 6).

Extended Data Fig. 4 Convergence of genetic algorithm in the configuration space.

a, Genetic algorithm convergence for the six major Boolean logic gates at 77 K. The best fitness of the 20 genomes is plotted against generation. b, Histograms of the control voltages that configure the dopant network to XNOR gates with fitness F larger than 1.5. The first control voltage is concentrated in a narrow range, whereas the others do not show a preferred range. The ranges of the five control voltages are (−600, 600), (−1,200, 1,200), (−1,200, 1,200), (−1,200, 1,200) and (−600, 600) mV. c, Control voltages for the six major logic gates. d, Control voltages for the 16 filters, which are visualized in e. The filters ‘0110’ and ‘0010’ have the smallest separation. See Supplementary Notes 3 and 7 for further discussion.
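
For readers unfamiliar with the evolutionary search, the sketch below outlines a generic genetic algorithm over 20 genomes of five control voltages with the ranges quoted above (assumed to be in mV). The device response and the fitness definition are placeholders, not the paper's actual measurement loop or fitness function.

```python
# Schematic genetic algorithm for configuring the control voltages (a sketch,
# not the authors' code). A genome is one set of five control voltages.
import numpy as np

rng = np.random.default_rng(0)
LOW  = np.array([-600.0, -1200.0, -1200.0, -1200.0, -600.0])   # mV (assumed)
HIGH = np.array([ 600.0,  1200.0,  1200.0,  1200.0,  600.0])
POP, GENERATIONS, N_PARENTS = 20, 100, 5      # 20 genomes, as in panel a

def device_fitness(genome):
    """Placeholder for the experimental loop: apply the control voltages,
    sweep the logic inputs, record the output current and score it."""
    return -float(np.sum((genome / HIGH) ** 2))   # toy objective, optimum at 0

def evolve():
    pop = rng.uniform(LOW, HIGH, size=(POP, 5))
    for _ in range(GENERATIONS):
        scores = np.array([device_fitness(g) for g in pop])
        parents = pop[np.argsort(scores)[::-1][:N_PARENTS]]     # fittest genomes
        children = []
        while len(children) < POP - N_PARENTS:
            a, b = parents[rng.integers(N_PARENTS, size=2)]
            child = np.where(rng.random(5) < 0.5, a, b)         # uniform crossover
            child = child + rng.normal(0.0, 0.05 * (HIGH - LOW))  # mutation
            children.append(np.clip(child, LOW, HIGH))
        pop = np.vstack([parents, np.array(children)])
    return pop[0], device_fitness(pop[0])

best_voltages, best_fitness = evolve()
print(best_voltages, best_fitness)
```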

Extended Data Fig. 5 Evolution of logic gates at two ends of hopping conduction.

a, Evolved logic gates at 4.2 K, at which the charge transport mechanism is still VRH (Methods). b, Evolved logic gates at 140 K. Red circles are experimental output currents and black lines represent the normalized desired output currents. See Supplementary Note 5 for a detailed discussion.

Extended Data Fig. 6 Measurement setup.

a, Schematic of the measurement setup used in this work. b, Equivalent circuit of the current measurement setup. Iout and Rout represent the output current and output resistance of the device. CL is the parasitic capacitance of about 4 nF. RIV and RF are the input resistance and feedback resistance of the I/V converter, respectively. c, Schematic of an integrated high-speed current-reading circuit. Here, RIV is a resistor that converts current to voltage, CL is the parasitic capacitance, which can be reduced to below 1 fF, and RO is a resistor that sets the amplification.
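
The bandwidth penalty imposed by the parasitic capacitance in panel b follows from the standard RC low-pass relation f_3dB = 1/(2π·RIV·CL). A back-of-the-envelope sketch with the 4 nF value from the caption and illustrative RIV values:

```python
# RC-limited measurement bandwidth: f_3dB = 1 / (2*pi*R_IV*C_L).
# C_L = 4 nF is taken from panel b; the R_IV values are illustrative.
import math

C_L = 4e-9                        # parasitic capacitance (F)
for R_IV in (1e6, 1e7, 1e8):      # example current-to-voltage resistances (ohm)
    f_3dB = 1.0 / (2 * math.pi * R_IV * C_L)
    print(f"R_IV = {R_IV:.0e} ohm -> f_3dB ≈ {f_3dB:.1f} Hz")
# Reducing C_L towards the ~1 fF level of the integrated circuit in panel c
# raises these bandwidths proportionally.
```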

Extended Data Fig. 7 Bandwidth and energy efficiency scaling.

a, Scaling of the allowed bandwidth with signal intensity in a log–log plot. The black, blue and red solid lines represent the three indicated cases. A larger required SNR (red) and a smaller RIV (blue) lower the bandwidth. The horizontal black dashed line represents the limit set by the hopping relaxation time at 77 K, which increases with temperature. b, Scaling of the equivalent energy efficiency with signal intensity in a log–log plot. A larger SNR (red) and a smaller RIV (blue) lower the energy efficiency. The horizontal black dashed line represents the limit at 77 K and fixed power consumption. If the power consumption of the dopant network is lowered, the limit and all three scaling trends shift upwards. The three black dotted lines mark three representative computational technologies: the most energy-efficient high-performance computer36, a neural-network (NN) accelerator29 and memristors37 (Supplementary Note 8). c, Current flow pattern of a NAND gate (NAND10 in d) with inputs of 500 and 0 mV. A large parasitic current flows from input 1 to control electrode 2 (black curved arrow); this parasitic current limits the energy efficiency and can be avoided by using electrostatically coupled electrodes (Supplementary Note 8). d, Measured power consumption of a NAND gate for the four input combinations. The standard deviations in the current are calculated from ten measurements. The differential resistances Rdiff are measured around the voltages in the second column.
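
The equivalent energy efficiency in panel b can be read as the energy spent per operation when the device is run at its RC-limited rate. The sketch below uses hypothetical numbers (not the measured values of panel d) purely to show how such an estimate is formed:

```python
# Illustrative energy-per-operation estimate: at an RC-limited rate f, a
# device dissipating power P spends roughly P/f joules per operation.
# All numbers below are hypothetical stand-ins, not measured values.
import math

C_L  = 4e-9      # parasitic capacitance (F), from Extended Data Fig. 6
R_IV = 1e6       # illustrative current-to-voltage resistance (ohm)
P    = 1e-6      # assumed power dissipated in the dopant network (W)

f = 1.0 / (2 * math.pi * R_IV * C_L)    # RC-limited operation rate (Hz)
print(f"rate ≈ {f:.1f} Hz, energy per operation ≈ {P / f:.2e} J")
# Lowering the network power P, or shrinking C_L with an integrated readout,
# reduces the energy per operation, which is the trend described in panel b.
```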

Extended Data Fig. 8 Backgate-induced nonlinearity and evolved logic gates at room temperature.

a, A positive voltage VSub with respect to the drain is applied to the n-type substrate (Fig. 2b) to widen the depletion region at the p–n junction and suppress band conduction. b, Gates evolved at room temperature. Red circles are experimental outputs, and black lines represent the normalized desired outputs. The output current levels, and the separation between these levels, are more than one order of magnitude larger than those of the logic gates evolved at 77 K, owing to the increased hopping conductance (Supplementary Note 3). The increased noise intensity is mainly due to the settings of the current measurement circuit (Methods).

Extended Data Fig. 9 Experimental response of the 16 filters.

Each filter is evolved to respond to the feature shown in blue. The output currents corresponding to features other than the desired one are not zero, but the output current of the targeted feature is clearly separated from the other currents. Error bars represent the standard deviation obtained from ten tests.

Extended Data Fig. 10 Enhancing robustness of the linear classifier against noise.

Besides optimizing the SNR, the linear classifier’s tolerance to noise can also be increased by taking noise into account during the training phase. The accuracy remains above 92% at a noise amplitude of 0.05 nA (see Supplementary Note 8 for a detailed discussion).
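
The noise-aware training strategy can be mimicked in software by adding Gaussian noise to the filter-output features while fitting the linear readout. The sketch below uses synthetic 16-dimensional features as stand-ins for the measured filter currents; it is not the authors' training code, and the >92% figure refers to the real measured data.

```python
# Sketch of noise-aware training of a linear (softmax) readout. The features
# X stand in for the measured output currents of the 16 filters (in nA);
# here they are random placeholders, so the accuracy stays near chance.
import numpy as np

rng = np.random.default_rng(1)
N_TRAIN, N_FEAT, N_CLASS = 2000, 16, 10

X = rng.normal(size=(N_TRAIN, N_FEAT))        # placeholder filter outputs
y = rng.integers(N_CLASS, size=N_TRAIN)       # placeholder digit labels
Y = np.eye(N_CLASS)[y]                        # one-hot targets

W = np.zeros((N_FEAT, N_CLASS))
b = np.zeros(N_CLASS)
lr, noise_amplitude = 0.1, 0.05               # noise level used during training

for epoch in range(200):
    Xn = X + rng.normal(0, noise_amplitude, X.shape)   # noise augmentation
    logits = Xn @ W + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                  # softmax probabilities
    grad = (p - Y) / N_TRAIN                           # cross-entropy gradient
    W -= lr * Xn.T @ grad
    b -= lr * grad.sum(axis=0)

# Evaluate the trained readout on noisy features as well.
Xn_test = X + rng.normal(0, noise_amplitude, X.shape)
acc = (np.argmax(Xn_test @ W + b, axis=1) == y).mean()
print(f"accuracy under noise (placeholder data): {acc:.2f}")
```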

Supplementary information

Supplementary Information

Nine supplementary notes are included. The first three notes address the charge-transport mechanism and the origin of the tunable nonlinearity. Notes 4 to 6 detail device characteristics such as stability and reproducibility. The last three notes cover benchmarking topics, including speed, energy consumption and classification accuracy.

About this article

Cite this article

Chen, T., van Gelder, J., van de Ven, B. et al. Classification with a disordered dopant-atom network in silicon. Nature 577, 341–345 (2020). https://doi.org/10.1038/s41586-019-1901-0
