Letter

Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks

Nature (2018)

Abstract

From bacteria following simple chemical gradients1 to the brain distinguishing complex odour information2, the ability to recognize molecular patterns is essential for biological organisms. This type of information-processing function has been implemented using DNA-based neural networks3, but has been limited to the recognition of a set of no more than four patterns, each composed of four distinct DNA molecules. Winner-take-all computation4 has been suggested5,6 as a potential strategy for enhancing the capability of DNA-based neural networks. Compared to the linear-threshold circuits7 and Hopfield networks8 used previously3, winner-take-all circuits are computationally more powerful4, allow simpler molecular implementation and are not constrained by the number of patterns and their complexity, so both a large number of simple patterns and a small number of complex patterns can be recognized. Here we report a systematic implementation of winner-take-all neural networks based on DNA-strand-displacement9,10 reactions. We use a previously developed seesaw DNA gate motif3,11,12, extended to include a simple and robust component that facilitates the cooperative hybridization13 that is involved in the process of selecting a ‘winner’. We show that with this extended seesaw motif DNA-based neural networks can classify patterns into up to nine categories. Each of these patterns consists of 20 distinct DNA molecules chosen from the set of 100 that represents the 100 bits in 10 × 10 patterns, with the 20 DNA molecules selected tracing one of the handwritten digits ‘1’ to ‘9’. The network successfully classified test patterns with up to 30 of the 100 bits flipped relative to the digit patterns ‘remembered’ during training, suggesting that molecular circuits can robustly accomplish the sophisticated task of classifying highly complex and noisy information on the basis of similarity to a memory.
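The classification task described in the abstract can be sketched in idealized, non-molecular form: each 'memory' is a binary weight vector, an input pattern is scored by its weighted sum against each memory, and the largest sum wins. The 9-bit 'L' and 'T' patterns below are illustrative stand-ins, not the paper's actual 10 × 10 digit data.

```python
import numpy as np

def classify(x, memories):
    """Return the index of the memory with the largest weighted sum."""
    sums = [np.dot(x, w) for w in memories]
    return int(np.argmax(sums))

# Two toy 3x3 "memories", flattened row-major: an 'L' and a 'T'.
L = np.array([1, 0, 0, 1, 0, 0, 1, 1, 1])
T = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0])

# A noisy 'L' with one flipped bit still overlaps the 'L' memory more
# than the 'T' memory, so it is classified as 'L' (index 0).
noisy_L = np.array([1, 0, 0, 1, 0, 1, 1, 1, 1])
print(classify(noisy_L, [L, T]))  # → 0
```

This mirrors how the network tolerates flipped bits: corruption only changes the outcome once the input becomes as similar to a competing memory as to its own.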


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Wadhams, G. H. & Armitage, J. P. Making sense of it all: bacterial chemotaxis. Nat. Rev. Mol. Cell Biol. 5, 1024–1037 (2004).

  2. Mori, K., Nagao, H. & Yoshihara, Y. The olfactory bulb: coding and processing of odor molecule information. Science 286, 711–715 (1999).

  3. Qian, L., Winfree, E. & Bruck, J. Neural network computation with DNA strand displacement cascades. Nature 475, 368–372 (2011).

  4. Maass, W. On the computational power of winner-take-all. Neural Comput. 12, 2519–2535 (2000).

  5. Kim, J., Hopfield, J. & Winfree, E. Neural network computation by in vitro transcriptional circuits. Adv. Neural Inf. Process. Syst. 17, 681–688 (2005).

  6. Genot, A. J., Fujii, T. & Rondelez, Y. Scaling down DNA circuits with competitive neural networks. J. R. Soc. Interface 10, 20130212 (2013).

  7. Muroga, S. Threshold Logic and its Applications (Wiley Interscience, New York, 1971).

  8. Hopfield, J. J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl Acad. Sci. USA 79, 2554–2558 (1982).

  9. Yurke, B., Turberfield, A. J., Mills, A. P., Simmel, F. C. & Neumann, J. L. A DNA-fuelled molecular machine made of DNA. Nature 406, 605–608 (2000).

  10. Zhang, D. Y. & Seelig, G. Dynamic DNA nanotechnology using strand-displacement reactions. Nat. Chem. 3, 103–113 (2011).

  11. Qian, L. & Winfree, E. Scaling up digital circuit computation with DNA strand displacement cascades. Science 332, 1196–1201 (2011).

  12. Thubagere, A. J. et al. Compiler-aided systematic construction of large-scale DNA strand displacement circuits using unpurified components. Nat. Commun. 8, 14373 (2017).

  13. Zhang, D. Y. Cooperative hybridization of oligonucleotides. J. Am. Chem. Soc. 133, 1077–1086 (2011).

  14. Redgrave, P., Prescott, T. J. & Gurney, K. The basal ganglia: a vertebrate solution to the selection problem? Neuroscience 89, 1009–1023 (1999).

  15. Zhang, D. Y. & Winfree, E. Control of DNA strand displacement kinetics using toehold exchange. J. Am. Chem. Soc. 131, 17303–17314 (2009).

  16. Yurke, B. & Mills, A. P. Using DNA to power nanostructures. Genet. Program. Evol. Mach. 4, 111–122 (2003).

  17. Cardelli, L. & Csikász-Nagy, A. The cell cycle switch computes approximate majority. Sci. Rep. 2, 656 (2012).

  18. Chen, Y.-J. et al. Programmable chemical controllers made from DNA. Nat. Nanotechnol. 8, 755–762 (2013).

  19. LeCun, Y., Cortes, C. & Burges, C. J. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/index.html.

  20. Deng, L. The MNIST database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag. 29, 141–142 (2012).

  21. Cherry, K. M. WTA Compiler. http://www.qianlab.caltech.edu/WTAcompiler/ (2017).

  22. Rojas, R. Neural Networks: A Systematic Introduction (Springer, Berlin, 2013).

  23. Zhang, D. Y. & Seelig, G. DNA-based fixed gain amplifiers and linear classifier circuits. In DNA 2010: DNA Computing and Molecular Programming (eds Sakakibara, Y. & Mi, Y.) 176–186 (Springer, 2011).

  24. Chen, S. X. & Seelig, G. A DNA neural network constructed from molecular variable gain amplifiers. In DNA 2017: DNA Computing and Molecular Programming (eds Brijder, R. & Qian, L.) 110–121 (Springer, Cham, 2017).

  25. Cho, E. J., Lee, J.-W. & Ellington, A. D. Applications of aptamers as sensors. Annu. Rev. Anal. Chem. 2, 241–264 (2009).

  26. Li, B., Ellington, A. D. & Chen, X. Rational, modular adaptation of enzyme-free DNA circuits to multiple detection methods. Nucleic Acids Res. 39, e110 (2011).

  27. Pei, R., Matamoros, E., Liu, M., Stefanovic, D. & Stojanovic, M. N. Training a molecular automaton to play a game. Nat. Nanotechnol. 5, 773–777 (2010).

  28. Fernando, C. T. et al. Molecular circuits for associative learning in single-celled organisms. J. R. Soc. Interface 6, 463–469 (2009).

  29. Aubert, N. et al. Evolving cheating DNA networks: a case study with the rock–paper–scissors game. In ECAL 2013: Advances in Artificial Life (eds Liò, P. et al.) 1143–1150 (MIT Press, Cambridge, 2013).

  30. Lakin, M. R., Minnich, A., Lane, T. & Stefanovic, D. Design of a biochemical circuit motif for learning linear functions. J. R. Soc. Interface 11, 20140902 (2014).

  31. Zadeh, J. N. et al. NUPACK: analysis and design of nucleic acid systems. J. Comput. Chem. 32, 170–173 (2011).


Acknowledgements

We thank R. M. Murray for sharing an acoustic liquid-handling robot. We thank C. Thachuk and E. Winfree for discussions and suggestions. K.M.C. was supported by an NSF Graduate Research Fellowship. L.Q. was supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund (1010684), a Faculty Early Career Development Award from NSF (1351081), and the Shurl and Kay Curci Foundation.

Reviewer information

Nature thanks R. Schulman and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Affiliations

  1. Bioengineering, California Institute of Technology, Pasadena, CA, USA

    • Kevin M. Cherry
    •  & Lulu Qian
  2. Computer Science, California Institute of Technology, Pasadena, CA, USA

    • Lulu Qian

Authors

  1. Kevin M. Cherry

  2. Lulu Qian

Contributions

K.M.C. developed the model, designed and performed the experiments, and analysed the data; K.M.C. and L.Q. wrote the manuscript; L.Q. initiated and guided the project.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Lulu Qian.

Extended data figures and tables

  1. Extended Data Fig. 1 DNA implementation of winner-take-all neural networks.

    The winner-take-all computation is broken into five subfunctions: weight multiplication, summation, pairwise annihilation, signal restoration and reporting. In the chemical reactions listed next to the five subfunctions, the species in black are needed as part of the function, the species in grey are needed to facilitate the reactions and the waste species are not shown. kf and ks are the rate constants of the pairwise-annihilation and signal-restoration reactions, respectively. In the DNA-strand-displacement implementation, weight multiplication and signal restoration are both catalytic reactions. The grey circle with an arrow indicates the direction of the catalytic cycle. Representative, but not all possible, states are shown for the pairwise-annihilation reaction. Zigzag lines indicate short (5 or 7 nucleotide) toehold domains and straight lines indicate long (15 or 20 nucleotide) branch-migration domains in DNA strands, with arrowheads marking their 3′ ends. Each domain is labelled with a name, and asterisks in the names indicate sequence complementarity. Black-filled and white-filled arrowheads indicate the forwards and backwards directions of a reaction step, respectively. All DNA sequences are listed in Supplementary Table 1.

  2. Extended Data Fig. 2 Seesaw circuit implementation of winner-take-all neural networks.

    a, Same as Fig. 1a. b, Seesaw circuit diagram11 for implementing the winner-take-all neural network. Each black number indicates the identity of a seesaw node. A total of n + 3m nodes are required for implementing a winner-take-all neural network with m memories that each has n bits. The location and absolute value of each red number indicates the identity and relative initial concentration of a DNA species, respectively. A red number on a wire connected to a node (or between two nodes) indicates a free signal molecule, which can be an input or fuel strand. A red number inside a node indicates a gate molecule, which can be a weight, summation gate or restoration gate. A red number on a wire that stops perpendicularly at two wires indicates an annihilator molecule. A negative red number inside a half node with a zigzag arrow indicates a reporter molecule.

  3. Extended Data Fig. 3 Experimental characterization of winner-take-all DNA neural networks.

a, Two-species winner-take-all behaviour. The experimental data (left, same as Fig. 2a) were used to identify the reverse rate constant kr = 0.4 s−1 of the annihilation reaction in simulations (right). All fluorescence kinetics data and simulations are shown over the course of 2.5 h. The standard concentration is 50 nM (1×). Initial concentrations of the annihilator, restoration gates, fuels and reporters are 75 nM (1.5×), 50 nM (1×), 100 nM (2×) and 100 nM (2×), respectively. b, A 4-bit pattern recognition circuit with input concentration varying from 50 nM to 500 nM. In each output trajectory plot, dotted lines indicate fluorescence kinetics data and solid lines indicate simulation. The patterns to the left and right of the arrow indicate input signal and output classification, respectively. c, Applying thresholding to clean up noisy input signals. The thresholding mechanism has been reported previously in work on seesaw DNA circuits11. The extended toehold in the threshold molecule has 7 nucleotides. In b and c, to compare the range of inputs, the concentration of each input strand is shown relative to 50 nM. The initial concentration of each weight molecule is either 0 or 50 nM; weight fuels are twice the concentration of weight molecules. The initial concentrations of the summation gates, annihilator, restoration gates, restoration fuels and reporters are 100 nM (1×), 400 nM (4×), 100 nM (1×), 200 nM (2×) and 200 nM (2×), respectively, with a standard concentration of 100 nM. Source Data

  4. Extended Data Fig. 4 A winner-take-all DNA neural network that recognizes 9-bit patterns as ‘L’ or ‘T’.

    In each output trajectory plot, dotted lines indicate fluorescence kinetics data and solid lines indicate simulation. The standard concentration is 50 nM (1×). The initial concentration of each input strand is either 0 or 50 nM (1×). The initial concentration of each weight molecule is either 0 or 10 nM (0.2×); weight fuels are twice the concentration of weight molecules. The initial concentrations of the summation gates, annihilator, restoration gates, restoration fuels and reporters are 50 nM (1×), 75 nM (1.5×), 50 nM (1×), 100 nM (2×) and 100 nM (2×), respectively. The patterns to the left and right of the arrow indicate input signal and output classification, respectively. In addition to the perfect inputs, 28 example input patterns with 1–5 corrupted bits were tested. Note that 5 is the maximum number of corrupted bits, because an ‘L’ with more than 5-bit corruption will be as similar as or more similar to a ‘T’, and vice versa. Source Data

  5. Extended Data Fig. 5 A winner-take-all DNA neural network that recognizes 100-bit patterns as one of two handwritten digits.

    a, Choosing the test input patterns on the basis of their locations in the weighted-sum space. b, Overlap between the two memories: ‘6’ and ‘7’. c, 36 test patterns with the number of flipped bits shown next to their weighted sums. d, Recognizing handwritten digits with up to 30 flipped bits compared to the perfect digits. Dotted lines indicate fluorescence kinetics data and solid lines indicate simulation. The standard concentration is 100 nM. Initial concentrations of all species are listed in Extended Data Fig. 10. The input pattern is shown in each plot. Note that 40 is the maximum number of flipped bits because all patterns have exactly 20 1s. Source Data

  6. Extended Data Fig. 6 Three-species winner-take-all behaviour and rate measurements for selecting DNA sequences in winner-take-all reaction pathways.

    a, Fluorescence kinetics data for a three-species winner-take-all circuit. Initial concentrations of the three weighted-sum species are shown on top of each plot as a number relative to a standard concentration of 50 nM (1×). The initial concentrations of the annihilator, restoration gates, fuels and reporters are 75 nM (1.5×), 50 nM (1×), 100 nM (2×) and 100 nM (2×), respectively. b, Measuring the rates of 15 catalytic gates. Fluorescence kinetics data (dotted lines) and simulations (solid lines) of the signal restoration reaction are shown, with a trimolecular rate constant (k) fitted using a Markov chain Monte Carlo package (https://github.com/joshburkart/mathematica-mcmc). The reporting reaction was needed for the fluorescence readout. Initial concentrations of all species are listed as a number relative to a standard concentration of 50 nM. c, The 15 catalytic gates sorted and grouped on the basis of their rate constants. All rate constants are within ±65% of the median. The two coloured groups of three rate constants are within ±5% of the median. These two groups of catalytic gates were selected for signal restoration in the winner-take-all DNA neural networks that remember two to nine 100-bit patterns (Methods section ‘Sequence design’). Source Data

  7. Extended Data Fig. 7 A winner-take-all DNA neural network that recognizes 100-bit patterns as one of three handwritten digits.

    a, Circuit diagram. b, Choosing the test input patterns on the basis of their locations in the weighted-sum space. c, Overlap between the three memories: ‘2’, ‘3’ and ‘4’. d, Recognizing handwritten digits with up to 28 flipped bits compared to the ‘remembered’ digits. Dotted lines indicate fluorescence kinetics data and solid lines indicate simulation. The standard concentration is 100 nM. Initial concentrations of all species are listed in Extended Data Fig. 10. The input pattern is shown in each plot. Note that 40 is the maximum number of flipped bits because all patterns have exactly 20 1s. Source Data

  8. Extended Data Fig. 8 Workflow of the winner-take-all compiler.

    The compiler21 is a software tool for designing DNA-based winner-take-all neural networks. Users start by uploading a file that describes a winner-take-all neural network. Alternatively, the weight matrix and test patterns can be drawn graphically. Next, a plot of the weighted-sum space provides a visual representation of the classification decision boundaries. The kinetics of the system can be simulated using Mathematica code downloaded from the compiler website, and the set of reaction functions are displayed online. Finally, the compiler produces a list of DNA strands that are required to experimentally demonstrate the network as designed by the user.

  9. Extended Data Fig. 9 Size and performance analysis of logic circuits for pattern recognition.

    a, Logic circuits that determine whether a 9-bit pattern is more similar to ‘L’ or ‘T’. b, Logic circuits that recognize 100-bit handwritten digits. To find a logic circuit that produces correct outputs for a given set of inputs, with no constraint on other inputs, we first created a truth table including all experimentally tested inputs and their corresponding outputs. The outputs for all other inputs were specified as ‘don’t care’, meaning the values could be 0 or 1. The truth table was converted to a Boolean expression and minimized in Mathematica, and then minimized again jointly for multiple outputs and mapped to a logic circuit in Logic Friday (https://download.cnet.com/Logic-Friday/3000-20415_4-75848245.html). In the minimized truth tables shown here, ‘X’ indicates a specific bit of the input on which the output does not depend. For comparison, minimized logic circuits were also generated from training sets with a varying total number of random examples from the MNIST database. The performance of each logic circuit, defined as the percentage of correctly classified inputs, was computed using all examples in the database. To make the minimization and mapping to logic gates computable in Logic Friday, the size of the input was restricted to the 16 most significant bits, determined on the basis of the weight matrix of the neural networks.

  10. Extended Data Fig. 10 Species and their initial concentrations in all neural networks that recognize 100-bit patterns.

    a, List of species and strands. Reporters were annealed with top strands (that is, Rep[j]-t) in 20% excess. All other two-stranded complexes were annealed with a 1:1 ratio of the two strands and then PAGE-purified (Methods section ‘Purification’). b, Weights and example inputs in the neural network that recognizes ‘6’ and ‘7’. c, Weights in the neural network that recognizes ‘1’–‘9’. Weights and inputs used in all experiments are listed in Supplementary Table 2. Detailed protocols for all experiments are listed in Supplementary Table 3.
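The winner-selection dynamics in Extended Data Fig. 1 can be sketched with a minimal deterministic mass-action model of its two core subfunctions: pairwise annihilation (S1 + S2 → waste, rate kf) and catalytic signal restoration (Si → 2 Si, rate ks, fuel assumed to be in excess). The rate constants, concentrations and time step below are illustrative choices, not the fitted values reported in the paper.

```python
def simulate_wta(s1, s2, kf=1.0, ks=0.1, dt=0.01, steps=2000):
    """Forward-Euler integration of a two-species winner-take-all model.

    Pairwise annihilation removes one unit of each species at rate kf*s1*s2;
    catalytic restoration regrows each species in proportion to itself.
    """
    for _ in range(steps):
        annihilate = kf * s1 * s2 * dt
        s1 += ks * s1 * dt - annihilate
        s2 += ks * s2 * dt - annihilate
    return s1, s2

# The species with the larger initial concentration "wins": annihilation
# consumes both equally, so the smaller one is driven towards zero while
# restoration regrows the survivor.
w1, w2 = simulate_wta(0.6, 0.4)
print(w1 > 10 * w2)  # → True
```

The key design point this captures is that annihilation is stoichiometric (it cancels the two signals one-for-one, leaving only their difference) while restoration is catalytic, amplifying whatever remains back towards a high output level.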

Supplementary information

  1. Supplementary Table 1

    DNA sequences

  2. Supplementary Table 2

    Weights and inputs

  3. Supplementary Table 3

    Experimental protocols

Source data

About this article


DOI

https://doi.org/10.1038/s41586-018-0289-6
