Inspired by the brain’s neural networks, scientists have for decades tried to construct electronic circuits that can process large amounts of data. However, it has been difficult to achieve energy-efficient implementations of artificial neurons and synapses (connections between neurons). In a paper in Nature, Ambrogio et al.1 report an artificial neural network containing more than 200,000 synapses that can classify complex collections of images. The authors’ work demonstrates that hardware-based neural networks that use emerging nanoelectronic devices can perform as well as software-based networks running on ordinary computers, while consuming much less power.
Artificial neural networks are not programmed in the same way as conventional computers. Just as humans learn from experience, these networks acquire their functions from data obtained during a training process. Image classification, which involves learning and memory, requires thousands of artificial synapses. The states (electrical properties) of these synapses need to be programmed quickly and then retained for future network operation.
Nanoscale synaptic devices that have programmable electrical resistance, such as phase-change-memory (PCM) devices, show promise because of their small physical size and excellent retention properties. PCM devices contain a material known as a chalcogenide glass, which can switch reversibly between an amorphous phase (of high resistance) and a crystalline phase (of low resistance). The device’s resistance state is programmed by crystallizing part of the material using local heating produced by an applied voltage. This state is retained long after the voltage has been removed, and further programming can be achieved by crystallizing other parts of the material.
Unfortunately, PCM devices can be programmed in only one direction: from high to low resistance, by changing from low to high crystallinity. To achieve the desired resistance state with good precision, sequences of hundreds of voltage pulses are required. If the desired state is overshot, the chalcogenide glass must be completely reset to the amorphous phase and the step-by-step programming restarted. This shortcoming, combined with variations between devices caused by the manufacturing process, can slow or even prevent network training, as previous work by the group that performed the current study has shown2. As a result, the prototype networks that have been constructed using these devices3,4 are impractical and have much lower image-classification accuracies than do software-based networks.
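The cost of one-directional programming can be illustrated with a toy simulation. The sketch below is not the authors' device model: the function name `program_pcm`, the normalized conductance scale, and the step-size statistics are all assumptions chosen for illustration. It only shows why a device that can move in one direction needs many small pulses, plus a full reset whenever it overshoots.

```python
import random


def program_pcm(target, tolerance, step_mean=0.002, step_jitter=0.001,
                max_pulses=100_000, seed=0):
    """Toy model of programming a one-directional device (all values assumed).

    Conductance starts at 0.0 (fully amorphous, high resistance) and can only
    increase with each pulse. Overshooting past target + tolerance forces a
    full reset to 0.0, after which step-by-step programming restarts.
    Returns (final_conductance, pulses_used, resets).
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    g = 0.0
    pulses = 0
    resets = 0
    while pulses < max_pulses:
        if abs(g - target) <= tolerance:
            return g, pulses, resets
        if g > target + tolerance:
            # Overshoot: the only way back is a complete reset.
            g = 0.0
            resets += 1
            continue
        # Each pulse crystallizes a little more material; the increment
        # varies randomly and is never negative.
        g += max(0.0, rng.gauss(step_mean, step_jitter))
        pulses += 1
    raise RuntimeError("target not reached within max_pulses")


g, pulses, resets = program_pcm(target=0.5, tolerance=0.005)
print(f"conductance={g:.4f} pulses={pulses} resets={resets}")
```

With these assumed parameters, reaching the target takes a few hundred pulses. Narrowing `tolerance` relative to the step size makes overshoots, and therefore resets, more frequent.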
The breakthrough of Ambrogio and colleagues’ work lies in a two-tier, bio-inspired approach. In biological neural networks, short-term changes in the states of synapses support a variety of computations, whereas long-term changes provide a platform for learning and memory5. For this reason, the authors’ artificial neural network uses synaptic ‘cells’ that contain two types of synapse: short-term and long-term (Fig. 1).
The short-term synapses are used regularly during network training. They require only brief state retention, but fast and precise programming to the desired state. Such features are provided by an electronic switch called a transistor, which has a capacitor (a device for storing electric charge) attached to one of its electrodes, known as the gate6. The transistor’s state is programmed by a fast voltage pulse applied to the gate. The capacitor maintains this voltage for a few milliseconds, providing brief state retention.
After the network has been trained on several thousand images and the short-term synapses have changed states substantially, the synaptic states are written into long-term synapses. The cycle is then repeated until all of the training images have been presented to the network. The long-term synapses are used for network operation after training is complete. They consist of PCM devices that have state-retention times of years, at the expense of tedious, energy-intensive programming.
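The division of labour between the two synapse types can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' circuit: the class `TwoTierSynapse`, the helper `train`, the additive combination of the two weights, and the fixed transfer interval are all hypothetical.

```python
class TwoTierSynapse:
    """Toy synaptic cell: a fast volatile weight plus a slow retained one."""

    def __init__(self):
        self.short_term = 0.0  # stands in for the transistor-capacitor state
        self.long_term = 0.0   # stands in for the PCM state
        self.transfers = 0     # number of slow, costly long-term writes

    @property
    def weight(self):
        # Assumption: the effective weight is the sum of both contributions.
        return self.long_term + self.short_term

    def update(self, delta):
        """Cheap, frequent update applied during training."""
        self.short_term += delta

    def transfer(self):
        """Fold the accumulated short-term change into the long-term state."""
        self.long_term += self.short_term
        self.short_term = 0.0
        self.transfers += 1


def train(synapse, deltas, transfer_every):
    """Apply one update per training example, transferring periodically."""
    for i, delta in enumerate(deltas, start=1):
        synapse.update(delta)
        if i % transfer_every == 0:
            synapse.transfer()
    if synapse.short_term != 0.0:
        synapse.transfer()  # final write so nothing is left in volatile state


cell = TwoTierSynapse()
train(cell, [0.25] * 10, transfer_every=4)
print(cell.long_term, cell.short_term, cell.transfers)
```

In this sketch ten cheap updates cost only three long-term writes, which is the point of the scheme: the expensive PCM programming happens once per batch of examples rather than once per example.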
An advantage of this technique is that the transfer of states from short- to long-term synapses can be done in electronic-circuit blocks separate from the network, while the network carries out other tasks. Moreover, although the authors’ synaptic cells are more complicated in practice than this two-synapse picture suggests (each contains one capacitor, two PCM devices and five transistors), they are still about half the size of artificial synapses used in other networks6.
Ambrogio et al. tested their synaptic-cell approach using a fairly complex artificial neural network containing multiple layers of neurons and more than 200,000 PCM devices. The authors carried out classification tasks using three standard sets of images: greyscale handwritten numbers from the MNIST database7, and colour images from the CIFAR-10 and CIFAR-100 databases8. The accuracies obtained were 98%, 88% and 68%, respectively. These results are strikingly similar to those obtained using TensorFlow, a leading neural-network software library.
Despite these impressive findings, a key limitation of the work is that only the PCM devices were actually fabricated; the other components of the synaptic cells and the neurons were simulated computationally. The authors took care to use accurate models that consider variations between transistors, and they proposed a method to minimize the impact of such variability on synaptic-cell performance. Most importantly, they carried out a detailed power assessment, and found that their proposed technology would consume about 100 times less power than current state-of-the-art networks, while providing a similar classification performance. Nevertheless, only a working hardware prototype will convince industry of the technology’s performance and low-power advantages. Furthermore, the estimated power consumption is still a far cry from that of biological neural networks, leaving plenty of room for improvement.
However, Ambrogio and colleagues’ work is more than a crucial stepping stone to the integration of PCM devices in neural-network hardware. It will also inspire device research, because it creates a need for nanoscale short-term synapses to replace the bulky transistor–capacitor ones. A wall in emerging memory technologies has been breached: networks based on these devices can work as well as their software counterparts. This finding suggests that advances in artificial intelligence will not only continue, but also be accelerated by emerging hardware.
Nature 558, 39–40 (2018)