## Introduction

Neuromorphic computing systems built with memristors could have much-improved power efficiency and computing throughput than traditional hardware1,2,3,4,5,6. Recent implementations with memristors are, however, primarily artificial neural networks for supervised learning algorithms7,8,9,10,11,12,13. Unsupervised learning networks inspired by biological systems can learn through data sets without labels and hence are more energy- and cost-efficient14,15,16. Compared with the supervised network, the unsupervised learning are more similar with human brain and has more extensive application, considering that most data and information are unlabeled in real world. Besides, unsupervised approach can cluster or pre-process the unlabeled complex data to smaller subspaces for subsequent classification through another supervised network. A self-organizing map (SOM), also called a ‘Kohonen network’, is a frequently used unsupervised algorithm inspired by the topological maps in the sensory-processing areas of the brain, where neurons responding to similar inputs are spatially located very close17,18. As a result, SOMs can identify relationships of input data and are well suited for clustering and optimization problems such as language recognition and text mining, financial predictions, and medical diagnosis19,20,21,22,23,24. Furthermore, SOM is a nonlinear dimension-reduction tool that automatically maps high-dimensional data to a lower dimension (usually two- or one-dimensional), more effectively in nonlinear distributions than classical linear algorithms such as multi-dimensional scaling25 or principal components analysis26.

However, implementing SOM in conventional CMOS-based hardware is limited by the complexity in calculating the similarities and determining neighborhoods, which imposes an enormous increase in computing time and power consumption as the number of neurons and features increase27. It is therefore imperative to seek emerging energy-efficient hardware with parallel computing capacity for SOM networks. Memristor, a two-terminal resistance switch with multiple conductance states as synaptic weights, has been organized into large-scale crossbar arrays to implement parallel and energy-efficient in-memory computing using physical laws28,29,30,31,32. On the other hand, experimental demonstrations of SOM using memristors are yet to be achieved due to two main challenges: finding the shortest Euclidean distance with the unnormalized vectors and implementing complex topology of SOM output layer without extra hardware cost33,34.

Herein we report our experimental implementation of SOMs in a 128 × 64 1-transistor 1-memristor (1T1R) crossbar array and its applications in data mining and optimization. The similarities between inputs and weight are directly calculated through Euclidean distance in the hardware. The neighborhood function of SOM is directly realized by the topological structure of the memristor array without extra circuits. Memristor-based 1D-SOM and 2D-SOM are successfully employed to solve color clustering and traveling salesman problems. Compared with traditional hardware, the memristor-based SOM system has better power efficiency and higher parallelism in computing, extending the application range of memristor-based neuromorphic computing systems in artificial intelligence.

## Results

### SOM topography and algorithm

An SOM is composed of an input layer and an output layer (Fig. 1a). The input layer has multiple dimensionless nodes that act as input neurons, while the output layer is a one-dimensional (1D) line or two-dimensional (2D) grid of neurons for 1D and 2D SOM, respectively (Fig. S1). Each input node is connected to every output neuron through a weighted connection (synapses). Unlike an ordinary fully-connected unsupervised neural network, the output neurons in SOM can communicate to their topological neighborhoods. For example, for a given pattern, the best matching unit (BMU) is determined by finding the neuron with the synaptic weight vector that is most similar to the input vector. During weight updating, both the synapse connected to the BMU and those connected to the BMU’s topological neighborhoods are modified to increase the strength of the match. The weights of neighborhoods are determined by the distance between the BMU and neighborhood neurons through the neighborhood function. Typically, the neighborhood function is a unimodal function that is symmetric around the winner’s location. The connection strength decreases monotonically with the distance from the winner. Response of neurons to similar inputs is spatially located very close, among which typical ones are shown in Supplementary Fig. S1. Due to the unique neighborhood function of the SOMs, this mapping retains the relationship between input data as faithfully as possible, thus describing a topology-preserving representation of input similarities in terms of distances in the output space35.

Euclidean distance, which refers to the distance between the input vector (feature vectors) and weight matrix (dictionary vectors) in a Euclid space, represents similarities between two vectors. A smaller Euclidean distance between two elements means they are more similar, and the smallest Euclidean distance corresponds to the BMU36,37. For a certain n-dimension input X and an n×m weight matrix W, the Euclidean distance is calculated by

$${{{{{{{\boldsymbol{ D}}}}}}}}={{{||}}{{{{{{{\boldsymbol{ X}}}}}}}}-{{{{{{{\boldsymbol{ W}}}}}}}}{{||}}}^{2}={{{{{{{{\boldsymbol{ X}}}}}}}}}^{2}-2{{{{{{{\boldsymbol{ W}}}}}}}}{{\cdot}}{{{{{{{\boldsymbol{ X}}}}}}}}+{{{{{{{{\boldsymbol{ W}}}}}}}}}^{2}$$
(1)

After finding the BMU, the weights of the winner and its neighborhoods are updated based on the following equations.

$${\Delta} {{{{{\boldsymbol{W}}}}}}= \eta \cdot {T}_{i} \cdot ({{{{{\boldsymbol{ X}}}}}}-{{{{{\boldsymbol{ W}}}}}})$$
(2)

Where η is the learning rate of matrix W and Ti is the neighborhood function of the ith neuron determined by

$${T}_{i}={{\exp }}\left(-\frac{{\left({r}_{c}-{r}_{i}\right)}^{2}}{2\cdot {{\delta }}}\right)$$
(3)

In this equation, ri and rc represent the location of neuron i and winner c, $${\left({r}_{c}-{r}_{i}\right)}^{2}$$ denotes the topological distance between neuron i and the winner neuron, and δ is a time-carrying parameter that guides the reduction of the neighborhood function during training.

### Implementing SOM in a memristive crossbar array

We experimentally implemented the SOMs in a 128 × 64 1T1R memristor array composed of Pd/TaOx/Ta memristors (Fig. 1b, c). The programming and computing were implemented by off-chip peripheral circuits on custom-built print circuit boards (PCBs) and MATLAB scripts, as have been successfully used in our early demonstrations38,39,40. The 1T1R memristor array acted as a weight matrix (W), consisting of two types of weights (Wdata in data rows and Wsquared in square rows). In our SOM systems, all the weights are linearly mapped into the conductance of the memristors, as shown in Fig. 1d. In the computing, normalized input vectors (0–1) to the data rows were coded with amplitudes of voltage pulses of fixed pulse width.

The BMU is detected by finding the shortest Euclidean distance between the Wdata and input vectors. Only if the input vectors and Wdata are both normalized, the shortest Euclidean distance is equivalent to the smallest dot product between the input vectors and Wdata. Unfortunately, numerically normalizing the weight vectors after every updating step is difficult and resource-demanding. In our SOM, we use the square row method to calculate the Euclidean distance without normalizing the weight matrix in every step. For a certain input vector X, X2 was a constant; Wdata·X term was calculated using the standard vector-matrix multiplication approach through Ohm’s Law and Kirchhoff’s Current Law in a crossbar array. During the training process, in each training epoch, for all nodes the X2 term are only depended on the input value and equal with each other, as a result the X2 term will not affect the comparison when determining the winner. However, the W will be updated in each epoch, for different node the weight is different. As a result, the W2 term affects the selection of the winning neuron and cannot be neglected. To calculate the $${{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}^{{{{{{{{\boldsymbol{2}}}}}}}}}$$ term, the inputs to the square rows were kept at −1/2 (represented by −0.1 V as the read voltage was 0.2 V), and the weights of the square rows in a certain column (for example, mth column in the array) were all the same ($${w}_{{m\_square}}=\frac{\mathop{\sum }\limits_{{{{{{{{\boldsymbol{ i}}}}}}}}{{{{{{{\boldsymbol{=}}}}}}}}{{{{{{{\boldsymbol{1}}}}}}}}}^{{{{{{{{\boldsymbol{ n}}}}}}}}}{{{{{{{{\boldsymbol{ w}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}\_{{{{{{{\boldsymbol{im}}}}}}}}}^{{{{{{{{\boldsymbol{2}}}}}}}}}}{l}$$, where l is the number of square rows and n is the number of data rows). As a result, the output current at the mth column is $${I}_{m}={{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}\cdot {{{{{{{\boldsymbol{ X}}}}}}}}-l\cdot \frac{{{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{squared}}}}}}}}}}{{{{{{{{\boldsymbol{2}}}}}}}}}=\frac{1}{2}({2\cdot {{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}\cdot {{{{{{{\boldsymbol{ X}}}}}}}}-{{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}^{{{{{{{{\boldsymbol{2}}}}}}}}})$$, opposite to the Euclidean distance between the input vector and the weight. As a result, the column with the largest measured current has the smallest Euclidean distance with the input, and the corresponding neuron will be the winner or the BMU. We used multiple additional rows in the crossbar (square rows in Fig. 1d), ensuring that the memristors in the data rows and square rows have the same conductance dynamic range. In the most extreme situation (all the weights in a column are the maximum value “1”), the largest number of the additional square rows l should be equal to the number of data rows n; so the largest weight in Wsquared will not go beyond the largest weight in Wdata, while in most cases, l is smaller than n. In previous one-row method, which use width (0~T) to code the input (0~1)41, to fit the conductance dynamic range of data row and additional row, the conductance of the device in the one additional row is $$\frac{{{{{{{{{\boldsymbol{ W}}}}}}}}}^{2}}{l}$$. As a result, the pulse width of the additional row should be l×T, which expands l times time cost than our multiple rows method. Besides, we use multiple devices to represent the W2, which can significantly reduce the impact of the writing errors of one device. Compare with one-row method, our multiple square row method needs a larger area but has less time cost and less disturbance from writing errors.

After finding the BMU, the weight updating depends on the topological distance of the neurons in the output map and the neighborhood function. The intrinsic topology of the memristor-array-based neurons is 1D-line because devices in the same column act as the synapse connecting to one neuron in the array. Figure 1d presents the schematic of utilizing this 1D-topology to form a 2D rectangular SOM output map. For a 5 × 5 SOM, the first five columns of the array act as the first row of the output map; the 6th to 10th columns of the array act as the second row of output map; and the last five columns act as the last row of the output map. The weights in the data rows (Wdata) will be updated according to Eq. (2), with a gaussian neighborhood function without the depression part has been used in our SOM, and those (Wsquared) in the square rows will be updated by

$$\triangle {{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{squared}}}}}}}}}=\frac{{{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}^{2}}{l}-{{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{squared}}}}}}}}}$$
(4)

Because of the limitation of our hardware-measure system that can only read and write in one direction (Write on rows and read on columns), the square rows are calculated in software part in offline mode. The squared term can be directly calculated in on-line mode based on the memristor array41. The squared values can be obtained in the memristor crossbar by sequential backward and forward read operations through the W matrix. First in backward reading process, a reading pulse proportional to 1 are applied to the n column. The scaled weight elements of data rows can be obtained while the square rows were floated. Then forward reading process, these values are then used as inputs that are fed into the data rows whereas floating the square rows. The output collected at column n then corresponds to the squared value. The network will be trained to achieve a self-organizing map by repeating the iterative training process with the search steps and the update steps. In the testing process, inputs from testing sets are applied in the arrays one by one, and the best matching unit with the biggest output current will be found. Different post-processing methods are adopted for various applications, such as transferring the weight to RGB pixel for image processing or sequence the neurons for solving TSP. The flow chart of training and testing processes is illustrated in Fig. 1e. It is worth pointing out that calculating the $$\triangle{{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{data}}}}}}}}}{{{{{{{\boldsymbol{,}}}}}}}}$$ $$\triangle {{{{{{{{\boldsymbol{ W}}}}}}}}}_{{{{{{{{\boldsymbol{squared}}}}}}}}}$$ and neighborhood function Ti are computationally expensive. As a result, they are achieved in software and will be accomplished with circuits integrated onto the chip in the future.

### Color clustering and image processing

Clustering is the analysis and organization of a collection of patterns through the similarities42. In our SOM, the similarities between input data are mapped into the output neurons’ topographical or spatial organization relation, which gives the SOMs adjustable clustering ability. We used the memristor-based SOM for color clustering, and clustered 256 colors into an 8 × 8 SOM with a 5 × 64 crossbar array. The inputs to three data rows were the normalized R (red), G (green), B (blue) (0–1) color components of a pixel, and inputs to the two square rows were −1/2. A randomly selected color was first applied to the data rows during training, and the weights were updated according to the algorithms as described in Eqs. 24. The evolution of the weights is illustrated in Fig. S2, and the final conductance map after the training process is shown in Fig. 2a. In the clustering process, a color vector was applied to each data row, −1/2 was still applied on the square rows. The 1st, 2nd, and 3rd rows are data rows, representing the red, green, and blue color components of output neurons, respectively. To implement a 2D-rectangular output map, devices in the 1st to the 8th columns of memristor array act as eight output neurons’ weights in the first row of the 2D 8 × 8 output map, the devices in the 9th to the 16th columns of the array act as the second row of the output map, whereas the devices in the 57th to 64th columns representing the weights of output neurons in the last row of the output map. Figure 2b shows the self-organized pattern map of the trained color mapping SOMs. After training, nearby neurons tend to respond to the same input type colors, e.g., reds are concentrated to the bottom left corner, yellows are concentrated to the bottom  right corner, and oranges are located in the middle bottom, just between the reds and yellows. The clustering ability of the SOMs is further proven on standard cluster databases, as shown in Figs. S3 and S4. And after the clustering process, the memristor-based 2D-SOM/1D-SOM system enables 94.6% and 95% high classification accuracy for IRIS data set and wine data set, respectively. Compared with supervised method (96.08% and 94.7% for IRIS and wine data set, respectively)43,44, our SOM can achieve competitive accuracy, which is also better than memristor-based unsupervised K-mean system (93.3% for IRIS data set). Besides, in comparison with other unsupervised cluster algorithm (e.g., K-means) SOM does not need any additional information at all. K-means needs to determine the number of classes in prior. Therefore, SOM is less affected by initialization and has more generality. In training process, SOM updates the winner and the adjacent nodes. Most approach only updates a single node. Therefore, SOM is less affected by noise data. SOM also has good visualization and elegant topology diagram, which make people easier to understand and analyze45.

The color mapping capability of our SOM, characterized by the number of output colors, is affected by the strength of the neighborhood function. The strength is evaluated by the neighborhood function factor A, which is represented  as the initial size of neighborhood area. Based on the SOMs, a dynamic color mapping capability ratio (or resolution in color clustering) can be easily achieved by tuning the neighborhood function factor in the software part for different applications, as shown in Fig. 2c. More neighborhood neurons’ weights will be updated with a larger neighborhood factor. With a large neighborhood function factor (from 5 to 50), the size of the cluster will be increased, and obviously, the number of the final clusters will be decreased. However, a sizeable compression ratio achieved by a smaller number of clusters may cause serious data loss and affect the following pattern recognition. It is worth noting that when the neighborhood function value is tiny (e.g. 0.1), only the weights of the winner will be updated. The total number of clusters is decreased since the SOMs lose the ability to recognize mixed colors from the neighborhood function.

Compared with other methods46 (Fig. S5), the square row approach for calculating Euclidean distance in our SOM has a stronger color mapping ability (a higher resolution in color clustering). Figures 2d and S6 show the comparison of three similarity calculation methods, Euclidean distance, dot product, and normalized dot product, in the task of clustering 256 colors into 64 neurons with identical parametric configuration for the highest resolution. The Euclidean distance method shows the best result as 256 colors are successfully compressed into 48 colors with 48 firing neurons. In contrast, only 6 and 9 colors are left with dot product and normalized dot product. That is because the clustering potential is strongly related to the accuracy of the similarities obtained by the memristor array, and our approach can directly calculate the precise Euclidean distance.

These results support that our memristor-based SOM shows good performance in different image processing tasks, including compression (Fig. S7) and segmentation. As shown in Fig. 2e, using the self-organizing feature of SOM, color-based image segmentation can be realized. After training, the neurons in the output layer are clustered into five categories according to each neuron’s position and response color in the SOM (Fig S8). Pixels in the colorful origin image acts as the input and are applied into the network and classified according to the category of the corresponding response neuron. Three sub-images were obtained, while the pixels in one sub-image are the nodes in the same cluster. As a result, the original image was successfully segmented into three sub-images by different colors (red, green, and yellow). As a result, the flowers can be segmented into different parts: red petal, yellow petal, and green leaf (Fig. 2e). It is worth noting that two other parts (black stamen and white background) of the original figure can be also segmented based on the result in Fig. S8. To show the energy-efficient nature of our memristor-based SOM in image processing, we performed an energy estimation, in which the energy consumption of the array is ~3.84 μJ for the image segment tasks, which is much lower than that in CMOS platforms27 (see Supplementary Note 1).

### Solving traveling salesman problem

Our memristor-based SOM can be used to address optimization problems, such as the traveling salesman problem (TSP), that intends to find the shortest yet most efficient route among multiple nodes. In clustering application, people normally use the SOM with 2D planer output layer to maintain more topology information of input data. However, in solving TSP problems, the SOM with 1D-ring output layer is more common. In clustering, the SOM is a mapping  tool which  maps the high dimension data to low dimension space, the neighborhood function is used to maintain the topology information of input data. In TSP, SOM is considered as an elastic ring, with the training process, the nodes turn to catch the cities, and due to the shrink ability from the neighborhood function, the ring trends to minimize the perimeter. The principle of solving TSP by memristor-based SOM is illustrated in Fig. 3a. The topology of SOMs’ output layer is a 1D ring, representing the calculated route of TSP. In our memristor-based SOM, the last column of the array is considered as the neighborhood of the first column,  like a set of neurons joins together in a one-dimensional ring. During the training process, the coordinates of the cities were applied to the data rows. The winner neuron, representing the closest to the chosen city, was found by detecting the column with the largest output current. The weight of the winner and its neighborhood were updated according to Eqs. 24. It seems that the winner node moves in the city plane and induces its neighborhood on the ring to do so, but with a decreasing intensity along the ring. This correlation between the motion of neighborhood nodes intuitively leads to a minimization of the distance between two neighborhoods, hence giving a short tour47. In the testing process, all cities are applied to the SOM one by one, normally different winners were found when applying different cities. As shown in Fig. 3a, for example, in a four-city TSP, the coordinate will be applied one by one, while the 2nd node is the winner when city A is applied, whereas the 1st, the nth, and the 4th nodes are the winners when city B, city C, and city D are applied, respectively. The intrinsic physical sequence of nodes can directly correspond to the city sequence. Therefore, the route can be directly obtained without extra configuration of the network by sorting the winners according to the intrinsic sequence of neurons in the memristor array. In this example, by sorting the winner sequence in the crossbar array (the 1st column −>the 2nd column −>the 4th column− > the nth column), the obtained shortest route between chosen cities should be departed from city B via city A, city D to city C.

We experimentally solved a 10-city TSP with our memristor-based SOM, in which the coordinates of the cities were randomly generated as shown in Fig. 3b–d. The conductance map and normalized weight of the training process are shown in Fig. S9. With a gradually decreasing neighborhood function in the training process, the nodes will progressively become independent and eventually attach to different cities. After 210 training epochs, the final conductance map of the crossbar array is shown in Fig. 3b. And Fig. 3c shows the city map and normalized weight vectors of all nodes that mapped into the city space; the blue dots represent the position of 10 cities, and the orange dots are the normalized weights of 45 different nodes. After training, the city’s coordinate will be applied to the SOM one by one to find the winner neuron, just like every city will ‘catch’ one nearest node of the ring. Based on this method, the testing result of 10 cities is presented in Fig. 3d. The orange numbers are the serials number of the winner nodes in the whole array when 10-city coordinates are acted as inputs. The blue dash line is the shortest route calculated by our SOM, determined by the node sequence in the memristor array (5 → 9 → …→38 → 41 → 45). In the experimental training process, some neuron outputs deviate from the original positions, which is caused by the writing error and the variation of the devices. However, in our SOM, the number of nodes is three to five times the number of cities. As a result, in most cases, the extra neurons help overcome this issue.

The performance of the SOM to solve TSP is analyzed with the accuracy and probability (P) which are defined as Accuracy = $$\frac{{{{{{{{\boldsymbol{Shortest}}}}}}}}\_{{{{{{{\boldsymbol{Distance}}}}}}}}}{{{{{{{{{\boldsymbol{Distance}}}}}}}}}_{{{{{{{{\boldsymbol{ t}}}}}}}}{{{{{{{\boldsymbol{ested}}}}}}}}}}$$ and $${P}_{{Accuracy}}=P({{Distance}}_{{tested}}\le \frac{{{{{{{{{\rm{Distance}}}}}}}}}_{{{{{{{{\rm{shortest}}}}}}}}}}{{Accuracy}})$$, respectively. As shown in Fig. 3e, the accuracy increases with the number of training epochs and the shortest route can be obtained just after 50 training epochs. The inset of Fig. 3e shows the probabilities of different accuracies with the number of training epochs. The success probability that finding the shortest route (P100%) is nearly 58% after 100 training epochs. And P95% can beyond 90% after 100 epochs, whereas the P90% and P85% can achieve nearly 100% just after 40 epochs.

The performance of the SOM with different conditions is futured studied in Fig. 4. Figure 4a shows the result of 20-node SOM with a different number of cities. The system can obtain perfect results for a six-city TSP. With the increase of the number of cities, the complexity of the problem increases, the ratio between the winner nodes and the cites and the accuracy decrease. It is worth noting that which city should be firstly arrived is randomly selected when different cities make the same node win during the testing process. The accuracy and probability decrease due to the device writing error and the lack of nodes in the SOM. The writing-error-induced performance degradation is further confirmed in Fig. 4b. The simulation is based on a 4 × 70 array with different writing errors (from 0% to 5%). High accuracy and probability (P95%) are obtained with low writing error (below 1%). With increasing of the writing errors, the ratio between the winners and the cities decreases, which implies the deterioration of the ability to distinguish nearby cities. The accuracy decreases to 75% and P95% decreases to 13% with a 5% writing error. A significant impact of writing errors on the accuracy due to the SOM does not rely on minimizing an output label error or a cost function which can provide a feedback mechanism to help network stabilization41. As a result, the Euclidean distance between the weight vectors and the inputs should be calculated precisely to solve the TSPs with SOM. To reduce the write errors, we adopt a write-and-verify method in our hardware SOM system. To further reduce the writing errors, multiple-device-as-one-synapse approach is used, which decreases the total writing error and increase the robustness of the network48. As shown in Fig. S10, we simply added more sub-arrays in the memristor array to implement multiple device synapses. For example, a three-device-weight in a 4×N weight map can be easily achieved with a 12×N array by parallelly connecting three identical sub-3×N arrays, without any extra circuit. This effective method has been tested and verified with experiment. For an 8-city TSP, based on our 20 columns(neurons) array, the experiment accuracy and P95% increased from 78% to 93% and from 64% to 78%, respectively when using five devices as one weight, as shown in Fig. 4c.

Increasing the number of output nodes will also improve the performance of SOM. As presented in Fig. 4d, for a 20 cities TSP, the accuracy and P95% are only 74% and 39% when using 40 output nodes. Because if there are not enough nodes, some cities may be far from all the nodes or  more nodes will converge to the same city, which stems from the shrinking ability of the SOM that makes nodes tend to cluster together. With the number of neurons rising from 40 to 80, the accuracy increases from 74% to 91%; whereas P95% increases to 68%. It is worth noting that the accuracy and probability will not remarkably increase when continually using more neurons. As a result, to obtain a solution for N-city TSP, ~3N to 5N neurons are needed in our SOM hardware for acceptable accuracy.

Besides normal 2D-TSP, 3D TSPs can also be easily implemented by our SOM just by using three inputs in data rows (Fig. 4e) The city map in 3D space, and the weight maps of all neurons after training in of 15-city 3D-TSP solved by 5 × 45 array is presented in Fig. 4f. And Fig. 4g shows the testing result of 15-city 3D-TSP in which the blue dots represent the position of the cities. Orange dashed line is the shortest route in 3D space determined by sorting the winners. Besides, a larger-scale memristor-based SOM with ideal device performance will address TSP with a much large number of cities. Fifty-city TSP is demonstrated with simulation Fig. S11 to show the potential of solving the complex problem of our system. For hardware-implementation of large-scale SOM, one can use a single large array or multiple small arrays, and architecture-level design (which is out of the scope of this work) may be necessary for large-scale and complex problems. Our memristor-based SOM also shows the potential of energy-efficient in solving optimization problems. In solving 10-city TSP by the 12 × 45 SOM, the total energy consumption in the memristor array part is only 12.71 nJ. And the detail of energy consumption of the system has been presented in Supplementary Note 1.

In previous work, memristor-based Hopfield networks have been proposed to solve the optimization problems49,50. The principle of SOM and Hopfield for TSP is totally different. SOM is considered as an elastic ring. The shortest route is obtained by the shrink ability from the neighborhood function. And Hopfield minimizes the energy function. The memristors in SOM are written at each cycle which may suffer reliability issues with cycling, while in the Hopfield the meristor conducatnce are static. However, in hardware implementation solving the TSP based on Hopfield network is extremely expensive. For an N-city TSP, N2 neurons and N4 synapses are needed since the number of fully interconnected nodes is proportional to the square of the number of cities. And for our memristor-based SOM, even counting on the additional square rows, only around 4N neurons and 16N synapses are needed, proving that our SOM can solve more complex optimization problems with less energy consumption and hardware cost. Besides compared with our previous work about solving TSP with memristor based on Hopfield network51, which obtained 40% successful probability with 104 pre-determined iteration cycles for 8-city TSP, our SOM system can achieve higher successful probability (58%) only with hundreds training epochs for 10-city TSP.

## Discussion

We have experimentally demonstrated in-situ SOM in memristor crossbar arrays. The Euclidean distance is directly calculated in the hardware by adopting additional rows of 1T1R cells. The similarities between input vectors and weight vectors are computed in one readout step without normalized weights. We have further employed the memristor-based SOM in clustering and solving the traveling salesman problems. Taking advantage of the intrinsic physical properties of memristors and the massive parallelism of the crossbar architecture, the novel memristor-based SOM hardware has advantages in computing speed, throughput, and energy efficiency, compared with current state-of-the-art SOM implementation, as shown in Supplementary Table 2. Unlike a digital counterpart, the entire read operation is performed in a single time step, so the latency does not scale with the size of the array. The energy consumptions of the memristor device are extremely low in both inference (~40 fJ) and update (~2.42 pJ) processes. And even for compute-intensive tasks such as image segment or optimization problems (e.g., solving TSP), the memristor-based SOM can achieve a high energy efficiency system due to the small energy required in a memristor-based weight array over an all-digital system. Besides, SOM hybrid systems such as SOM-MLP52,53, SOM-RNN54, SOM-LSTM55,56, show better performance and are more robust in pattern classification and prediction than simple artificial neural network systems. Our results encourage advances in the hardware implementation of unsupervised neural networks using emerging devices and provide a promising path towards machine learning or neuromorphic computing based on memristors.

## Method

### The fabrication process of 1T1R array

The 1T1R arrays are composed of Pd/TaOx/Ta memristors. The front-end and part of the back-end processes to build the transistor arrays were completed in a commercial foundry. Three different metal layers Ag/Ti/Pd (3 nm/7 nm/50 nm) were deposited by sputtering under a decent vacuum background after the argon plasma treatment to remove the native oxide layers on the via, followed by an annealing process at 300 °C for 40 min in N2 atmosphere. A 2-nm Ti adhesive layer and 20 nm Pt bottom electrode was deposited by e-beam evaporator and patterned by lift-off process. A 5-nm Ta2O5 switching layer was deposited by sputtering followed by lift-off process for patterning. Finally, Ta/Pt (20 nm/20 nm) top electrodes was deposited and patterned by sputtering and lift-off process, respectively.

### Measurement system

An in-house measurement system has been built to electrically read and write the 1T1R chip. The programming and computing were implemented by off-chip peripheral circuits on custom-built print circuit boards (PCBs) and MATLAB scripts, the details of the systems can be found in our early demonstrations38,39,40.