Hybrid computing using a neural network with dynamic external memory

Nature 538, 471–476 (2016)
doi:10.1038/nature20101

Abstract

Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, but, like a neural network, it can learn to do so from data. We demonstrate that a DNC trained with supervised learning can successfully answer synthetic questions designed to emulate reasoning and inference problems in natural language. We show that it can learn tasks such as finding the shortest path between specified points and inferring the missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees. When trained with reinforcement learning, a DNC can complete a moving blocks puzzle in which changing goals are specified by sequences of symbols. Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are inaccessible to neural networks without external read–write memory.

Figures

    Figure 1: DNC architecture.

    a, A recurrent controller network receives input from an external data source and produces output. b, c, The controller also outputs vectors that parameterize one write head (green) and multiple read heads (two in this case, blue and pink). (A reduced selection of parameters is shown.) The write head defines a write and an erase vector that are used to edit the N × W memory matrix, whose elements’ magnitudes and signs are indicated by box area and shading, respectively. Additionally, a write key is used for content lookup to find previously written locations to edit. The write key can contribute to defining a weighting that selectively focuses the write operation over the rows, or locations, in the memory matrix. The read heads can use gates called read modes to switch between content lookup using a read key (‘C’) and reading out locations either forwards (‘F’) or backwards (‘B’) in the order they were written. d, The usage vector records which locations have been used so far, and a temporal link matrix records the order in which locations were written; here, we represent the order locations were written to using directed arrows.
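
The operations this caption describes can be made concrete. Below is a minimal NumPy sketch of content lookup, reading and writing, under variable names of our own choosing (the paper's precise formulation, including the interpolation between content and allocation weightings, is given in its Methods):

```python
import numpy as np

def content_weighting(memory, key, beta):
    """Content lookup ('C' mode): a softmax over the cosine similarity
    between a controller-emitted key and every row of memory."""
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    scores = beta * sim                       # beta sharpens the focus
    e = np.exp(scores - scores.max())
    return e / e.sum()                        # weighting over the N locations

def read(memory, weighting):
    """A read head returns a weighted average of the memory rows."""
    return memory.T @ weighting

def write(memory, weighting, erase, add):
    """The write head first erases, then adds, at the weighted rows."""
    return memory * (1 - np.outer(weighting, erase)) + np.outer(weighting, add)

M = np.random.randn(4, 3)                     # N = 4 locations, word size W = 3
w = content_weighting(M, key=M[2] + 0.01, beta=20.0)
print(w.round(2))                             # concentrates on location 2
print(read(M, w))                             # approximately recovers row 2
M = write(M, w, erase=np.ones(3), add=np.zeros(3))
print(M.round(2))                             # row 2 is now roughly erased
```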

    Figure 2: Graph tasks.

    a, An example of a randomly generated graph used for training. b, Zone 1 interchange stations of the London Underground map, used as a generalization test for the traversal and shortest-path tasks. Random seven-step traversals (an example of which is shown on the left) were tested, yielding an average accuracy of 98.8%. Testing on all possible four-step shortest paths (example shown on the right) gave an average accuracy of 55.3%. c, The family tree that was used as a generalization test for the inference task; four-step relations such as the one shown in blue (from Freya to Fergus, her maternal great uncle) were tested, giving an average accuracy of 81.8%. The symbol sequences processed by the network during the test examples are shown beneath the graphs. The input is an unordered list of (‘from node’, ‘to node’, ‘edge’) triple vectors that describes the graph. For each task, the question is a sequence of triples with missing elements (denoted ‘_’) and the answer is a sequence of completed triples.
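
To make the input format concrete, here is a hypothetical one-hot encoding of the (‘from node’, ‘to node’, ‘edge’) triples, with a reserved slot for the missing-element symbol ‘_’; the names are taken from Figure 3, and the paper's actual vector encoding (given in its Methods) may differ:

```python
import numpy as np

def one_hot(symbol, vocab):
    v = np.zeros(len(vocab) + 1)              # last slot reserved for '_'
    v[len(vocab) if symbol == "_" else vocab.index(symbol)] = 1.0
    return v

def encode_triple(frm, to, edge, nodes, edges):
    """A (from, to, edge) triple as one concatenated input vector."""
    return np.concatenate([one_hot(frm, nodes),
                           one_hot(to, nodes),
                           one_hot(edge, edges)])

nodes = ["Victoria", "Oxford Circus", "Green Park"]
edges = ["Victoria line", "Central line"]
# Graph description phase: complete triples.
x = encode_triple("Victoria", "Green Park", "Victoria line", nodes, edges)
# Question phase: a triple with a missing destination.
q = encode_triple("Victoria", "_", "Victoria line", nodes, edges)
print(x.shape, q.shape)                       # both (11,) for this tiny vocabulary
```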

    Figure 3: Traversal on the London Underground.

    a, During the graph definition phase, the network writes each triple in the map to a separate memory location, as shown by the write weightings (green). During the query phase, the start station (Victoria) and lines to be traversed are recorded. The triple stored in each location can be recovered by a logistic regression decoder, as shown on the vertical axis. b, The read mode distribution during the answer phase reveals that read head 1 (pink) follows temporal links forwards to retrieve the instructions in order, whereas read head 2 (blue) uses content lookup to find the stations along the path. The degree of coloration indicates how strongly each mode is used. c, The region of the map used. d, The final content key used by read head 2 is decoded as a triple with no destination. e, The memory location returned by the key contains the complete triple, allowing the network to infer the destination (Tottenham Court Rd).

    Figure 4: Mini-SHRDLU analysis.

    a, In a short example episode, the network wrote goal-related information to sequences of memory locations. (‘G’, ‘T’, ‘C’ and ‘A’ denote goals; the numbers refer to the block number; ‘b’, ‘a’, ‘l’ and ‘r’ denote ‘below’, ‘above’, ‘left of’ and ‘right of’, respectively.) The chosen goal was T (‘T?’), and the read heads focused on the locations containing goal T. b, The constraints comprising goal T. c, The policy made an optimal sequence of moves to satisfy its constraints. d, On 800 random episodes, the first five actions that the network took for the chosen goal were decoded from memory using logistic regression at the time-step after the goal was written (box in a with arrow to c). Decoding accuracy for the first action is 89%, compared to 17% using action frequencies alone, indicating that the network had determined a plan at the time of writing, many steps before execution. Error bars represent 5–95 percentile bootstrapped confidence intervals on validation data. e, Within trials, we average the location contents associated with each goal label into single vectors. Across trials, we create a dataset of these vectors and perform t-SNE (t-distributed stochastic neighbour embedding) dimensionality reduction down to two dimensions. This shows that each goal label is coded geometrically in the memory locations.
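
The analyses in panels d and e can be sketched with scikit-learn. The arrays below are random stand-ins for the quantities named in the caption (location contents at the time-step after a goal was written, and the first action later taken), so only the method is illustrated, not the reported result:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
episodes, word_size, n_actions = 800, 64, 12  # illustrative sizes
memory_contents = rng.standard_normal((episodes, word_size))
first_actions = rng.integers(0, n_actions, size=episodes)

# Panel d: decode the first planned action from memory at write time.
decoder = LogisticRegression(max_iter=1000).fit(memory_contents, first_actions)
print("decoding accuracy:", decoder.score(memory_contents, first_actions))

# Panel e: embed the per-goal averaged location contents in two dimensions.
goal_vectors = rng.standard_normal((200, word_size))  # one vector per goal per trial
embedding = TSNE(n_components=2, perplexity=30.0).fit_transform(goal_vectors)
print(embedding.shape)                        # (200, 2)
```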

    Figure 5: Mini-SHRDLU results.

    a, 20 replicated training runs with different random-number seeds for a DNC and LSTM. Only the DNC was able to complete the learning curriculum. b, A single DNC was able to solve a large percentage of problems optimally from each previous lesson (perfect), with a few episodes solved in extra moves (success), and some failures to satisfy all constraints (incomplete).

    Extended Data Fig. 1: Dynamic memory allocation.

    We trained the DNC on a copy problem, in which a series of 10 random sequences was presented as input. After each input sequence was presented, it was recreated as output. Once the output was generated, that input sequence was not needed again and could be erased from memory. We used a DNC with a feedforward controller and a memory of 10 locations—insufficient to store all 50 input vectors with no overwriting. The goal was to test whether the memory allocation system would be used to free and re-use locations as needed. As shown by the read and write weightings, the same locations are repeatedly used. The free gate is active during the read phases, meaning that locations are deallocated immediately after they are read from. The allocation gate is active during the write phases, allowing the deallocated locations to be re-used.
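
A NumPy sketch of the allocation machinery this figure probes, following the usage, free-gate and allocation rules from the paper's Methods (the variable names are ours):

```python
import numpy as np

def update_usage(usage, prev_write_w, prev_read_ws, free_gates):
    """Writing raises a location's usage; an active free gate on a head
    that just read a location releases (deallocates) it."""
    retention = np.prod(1 - free_gates[:, None] * prev_read_ws, axis=0)
    usage = usage + prev_write_w - usage * prev_write_w
    return usage * retention

def allocation_weighting(usage):
    """Focus new writes on the least-used locations first."""
    a, prod = np.zeros_like(usage), 1.0
    for j in np.argsort(usage):               # locations in ascending usage
        a[j] = (1 - usage[j]) * prod
        prod *= usage[j]
    return a

usage = np.array([0.9, 0.1, 0.8])
read_ws = np.array([[0.0, 0.0, 1.0]])         # one read head, at location 2
usage = update_usage(usage, prev_write_w=np.zeros(3),
                     prev_read_ws=read_ws, free_gates=np.array([1.0]))
print(usage.round(2))                         # location 2 freed: [0.9 0.1 0. ]
print(allocation_weighting(usage).round(2))   # next write targets location 2
```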

    Extended Data Fig. 2: Altering the memory size of a trained network.

    A DNC trained on the traversal task with 256 memory locations was tested while varying the number of memory locations and graph triples. The heat map shows the fraction of traversals of length 1–10 performed perfectly by the network, out of a batch of 100. There is a clear correspondence between the number of triples in the graph and the number of memory locations required to solve the task, consistent with our earlier analysis (Fig. 3) suggesting that the DNC writes each triple to a separate memory location. The network appears to exploit all available memory, regardless of how much memory it was trained with. This supports our claim that memory is independent of processing in a DNC, and points to large-scale applications such as knowledge graph processing.

    Extended Data Fig. 3: Probability of achieving optimal solution.

    a, The performance of a DNC, with 10 goals, at satisfying constraints in minimal time, as the minimum number of moves to a goal and the number of constraints in a goal are varied. Performance was highest with a large number of constraints in each goal. b, The performance of an LSTM on the same test.

    Extended Data Fig. 4: Effect of link matrix sparsity on performance.

    We trained the DNC on a copy problem, for which a sequence of length 1–100 of size-6 random binary vectors was given as input, and an identical sequence was then required as output. A feedforward controller was used to ensure that the sequences could not be stored in the controller state. The faint lines show error curves for 20 randomly initialized runs with identical hyper-parameters, with link matrix sparsity switched off (pink), sparsity used with K = 5 (green) and with the link matrix disabled altogether (blue). The bold lines show the mean curve for each setting. The error rate is the fraction of sequences, out of a batch of 100, that were not copied perfectly. There does not appear to be any systematic difference between no sparsity and K = 5, and we observed similar behaviour for values of K between 2 and 20 (plots omitted for clarity). The task cannot easily be solved without the link matrix because the input sequence has to be recovered in the correct order. Note the abrupt drops in error for the networks with link matrices: these are the points at which the system learns a copy algorithm that generalizes to longer sequences.
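
For reference, the temporal link matrix whose contribution this figure measures can be sketched as follows, following the update rules in the paper's Methods (names are ours); L[i, j] tracks the degree to which location i was written immediately after location j. The sparse variant additionally keeps only the K largest entries in each row and column, a pruning step omitted here:

```python
import numpy as np

def update_links(L, precedence, write_w):
    """One write step: decay stale links, then add links from the newly
    written location back to the previously written one."""
    outer = write_w[:, None] + write_w[None, :]
    L = (1 - outer) * L + np.outer(write_w, precedence)
    np.fill_diagonal(L, 0.0)                  # no self-links
    precedence = (1 - write_w.sum()) * precedence + write_w
    return L, precedence

N = 4
L, p = np.zeros((N, N)), np.zeros(N)
for loc in [0, 2, 3]:                         # write to locations 0, then 2, then 3
    L, p = update_links(L, p, np.eye(N)[loc])

w_read = np.eye(N)[2]                         # a read head focused on location 2
print(L @ w_read)                             # forward mode 'F': location 3
print(L.T @ w_read)                           # backward mode 'B': location 0
```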

Tables

    Extended Data Table 1: bAbI best and mean results
    Extended Data Table 2: Hyper-parameter settings for bAbI, graph tasks and Mini-SHRDLU
    Extended Data Table 3: Curriculum results for graph traversal
    Extended Data Table 4: Curriculum results for inference
    Extended Data Table 5: Curriculum results for shortest-path task
    Extended Data Table 6: Curriculum for Mini-SHRDLU

Videos

    Video 1: Shortest Path Visualisation
    This video shows a DNC successfully finding the shortest path between two nodes in a randomly generated graph. By decoding the memory usage of the DNC (as in Fig. 3) we were able to determine which edges were stored in the memory locations it was reading from and writing to at each time-step. The edges being read are shown in pink on the left, and the edges being written are shown in green on the right; the colour saturation indicates the relative strength of the operation. During the initial query phase, the DNC receives the labels of the start and end nodes ("390" and "040", respectively). During the ten-step planning phase it attempts to determine the shortest path, repeatedly reading edges close to or along the path (indicated by the grey shaded nodes). Beginning with edges attached to the start and end nodes, it appears to move further afield as the phase progresses. At the same time it writes to several of the edge locations, perhaps marking those edges as visited. Finally, during the answer phase, it successively reads the outgoing edges of the nodes along the shortest path, allowing it to correctly answer the query.
    Video 2: Mini-SHRDLU Visualisation
    This video shows a DNC successfully performing a reasoning problem in a blocks world. A sequence of letter-labelled goals (S, K, R, Q, E) is presented to the network one step at a time. Each goal consists of a sequence of defining constraints, presented one constraint per time-step. For example, S is: 6 below 2 (6b2); 2 right of 5 (2r5); 6 right of 1 (6r1); 5 above 1 (5a1). On the right, the write head edits the memory, writing down information about the goals. Ultimately, the DNC is commanded to satisfy goal 'Q', which it subsequently does by using the read heads to inspect the locations containing goal Q. The constraints constituting goal 'Q' are shown below, and the final board position is correct.

References

  1. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F. et al.) 1097–1105 (Curran Associates, 2012)
  2. Graves, A. Generating sequences with recurrent neural networks. Preprint at http://arxiv.org/abs/1308.0850 (2013)
  3. Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems Vol. 27 (eds Ghahramani, Z. et al.) 3104–3112 (Curran Associates, 2014)
  4. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015)
  5. Gallistel, C. R. & King, A. P. Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience (John Wiley & Sons, 2011)
  6. Marcus, G. F. The Algebraic Mind: Integrating Connectionism and Cognitive Science (MIT Press, 2001)
  7. Kriete, T., Noelle, D. C., Cohen, J. D. & O’Reilly, R. C. Indirection and symbol-like processing in the prefrontal cortex and basal ganglia. Proc. Natl Acad. Sci. USA 110, 16390–16395 (2013)
  8. Hinton, G. E. Learning distributed representations of concepts. In Proc. Eighth Annual Conference of the Cognitive Science Society Vol. 1, 1–12 (Lawrence Erlbaum Associates, 1986)
  9. Bottou, L. From machine learning to machine reasoning. Mach. Learn. 94, 133–149 (2014)
  10. Fusi, S., Drew, P. J. & Abbott, L. F. Cascade models of synaptically stored memories. Neuron 45, 599–611 (2005)
  11. Ganguli, S., Huh, D. & Sompolinsky, H. Memory traces in dynamical systems. Proc. Natl Acad. Sci. USA 105, 18970–18975 (2008)
  12. Kanerva, P. Sparse Distributed Memory (MIT Press, 1988)
  13. Amari, S.-i. Characteristics of sparsely encoded associative memory. Neural Netw. 2, 451–457 (1989)
  14. Weston, J., Chopra, S. & Bordes, A. Memory networks. Preprint at http://arxiv.org/abs/1410.3916 (2014)
  15. Vinyals, O., Fortunato, M. & Jaitly, N. Pointer networks. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 2692–2700 (Curran Associates, 2015)
  16. Graves, A., Wayne, G. & Danihelka, I. Neural Turing machines. Preprint at http://arxiv.org/abs/1410.5401 (2014)
  17. Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at http://arxiv.org/abs/1409.0473 (2014)
  18. Gregor, K., Danihelka, I., Graves, A., Rezende, D. J. & Wierstra, D. DRAW: a recurrent neural network for image generation. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 1462–1471 (JMLR, 2015)
  19. Hintzman, D. L. MINERVA 2: a simulation model of human memory. Behav. Res. Methods Instrum. Comput. 16, 96–101 (1984)
  20. Kumar, A. et al. Ask me anything: dynamic memory networks for natural language processing. Preprint at http://arxiv.org/abs/1506.07285 (2015)
  21. Sukhbaatar, S. et al. End-to-end memory networks. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 2431–2439 (Curran Associates, 2015)
  22. Magee, J. C. & Johnston, D. A synaptically controlled, associative signal for Hebbian plasticity in hippocampal neurons. Science 275, 209–213 (1997)
  23. Johnston, S. T., Shtrahman, M., Parylak, S., Gonçalves, J. T. & Gage, F. H. Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68 (2016)
  24. O’Reilly, R. C. & McClelland, J. L. Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682 (1994)
  25. Howard, M. W. & Kahana, M. J. A distributed representation of temporal context. J. Math. Psychol. 46, 269–299 (2002)
  26. Weston, J., Bordes, A., Chopra, S. & Mikolov, T. Towards AI-complete question answering: a set of prerequisite toy tasks. Preprint at http://arxiv.org/abs/1502.05698 (2015)
  27. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997)
  28. Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In Proc. 26th International Conference on Machine Learning (eds Bottou, L. & Littman, M.) 41–48 (ACM, 2009)
  29. Zaremba, W. & Sutskever, I. Learning to execute. Preprint at http://arxiv.org/abs/1410.4615 (2014)
  30. Winograd, T. Procedures as a Representation for Data in a Computer Program for Understanding Natural Language. Report No. MAC-TR-84 (DTIC, MIT Project MAC, 1971)
  31. Epstein, R., Lanza, R. P. & Skinner, B. F. Symbolic communication between two pigeons (Columba livia domestica). Science 207, 543–545 (1980)
  32. McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457 (1995)
  33. Kumaran, D., Hassabis, D. & McClelland, J. L. What learning systems do intelligent agents need? Complementary learning systems theory updated. Trends Cogn. Sci. 20, 512–534 (2016)
  34. McClelland, J. L. & Goddard, N. H. Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665 (1996)
  35. Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015)
  36. Rezende, D. J., Mohamed, S., Danihelka, I., Gregor, K. & Wierstra, D. One-shot generalization in deep generative models. In Proc. 33rd International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 1521–1529 (JMLR, 2016)
  37. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D. & Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proc. 33rd International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 1842–1850 (JMLR, 2016)
  38. Oliva, A. & Torralba, A. The role of context in object recognition. Trends Cogn. Sci. 11, 520–527 (2007)
  39. Hermann, K. M. et al. Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 1693–1701 (Curran Associates, 2015)
  40. O’Keefe, J. & Nadel, L. The Hippocampus as a Cognitive Map (Oxford Univ. Press, 1978)
  41. Graves, A., Mohamed, A.-r. & Hinton, G. Speech recognition with deep recurrent neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (eds Ward, R. et al.) 6645–6649 (Curran Associates, 2013)
  42. Wilson, P. R., Johnstone, M. S., Neely, M. & Boles, D. Dynamic storage allocation: a survey and critical review. In Memory Management (ed. Baker, H. G.) 1–116 (Springer, 1995)
  43. Ross, S., Gordon, G. J. & Bagnell, J. A. A reduction of imitation learning and structured prediction to no-regret online learning. In Proc. Fourteenth International Conference on Artificial Intelligence and Statistics (eds Gordon, G. et al.) 627–635 (JMLR, 2010)
  44. Daumé, H. III, Langford, J. & Marcu, D. Search-based structured prediction. Mach. Learn. 75, 297–325 (2009)
  45. Williams, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
  46. Sutton, R. S., McAllester, D., Singh, S. P. & Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems Vol. 12 (eds Solla, S. A. et al.) 1057–1063 (MIT Press, 1999)
  47. Schulman, J., Moritz, P., Levine, S., Jordan, M. & Abbeel, P. High-dimensional continuous control using generalized advantage estimation. Preprint at http://arxiv.org/abs/1506.02438 (2015)
  48. van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008)
  49. Dean, J. et al. Large scale distributed deep networks. In Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F. et al.) 1223–1231 (Curran Associates, 2012)
  50. Werbos, P. J. Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 1550–1560 (1990)
  51. Tieleman, T. & Hinton, G. RmsProp: divide the gradient by a running average of its recent magnitude. Lecture 6.5 of Neural Networks for Machine Learning (COURSERA, 2012); available at http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf


Author information

  1. These authors contributed equally to this work: Alex Graves and Greg Wayne.

Affiliations

  1. Google DeepMind, 5 New Street Square, London EC4A 3TW, UK

     Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, Adrià Puigdomènech Badia, Karl Moritz Hermann, Yori Zwols, Georg Ostrovski, Adam Cain, Helen King, Christopher Summerfield, Phil Blunsom, Koray Kavukcuoglu & Demis Hassabis

Contributions

A.G. and G.W. conceived the project. A.G., G.W., M.R., T.H., I.D., S.G. and E.G. implemented networks and tasks. A.G., G.W., M.R., T.H., A.G.-B., T.R. and J.A. performed analysis. M.R., T.H., I.D., E.G., K.M.H., C.S., P.B., K.K. and D.H. contributed ideas. A.C. prepared graphics. A.G., G.W., M.R., T.H., S.G., A.P.B., Y.Z., G.O. and K.K. performed experiments. A.G., G.W., H.K., K.K. and D.H. managed the project. A.G., G.W., M.R., T.H., K.K. and D.H. wrote the paper.

Competing financial interests

The authors declare no competing financial interests.

Reviewer Information Nature thanks Y. Bengio, J. McClelland and the other anonymous reviewer(s) for their contribution to the peer review of this work.


Supplementary information

PDF files

  1. Supplementary Information

    This file contains a glossary of symbols and the complete equations.
