Abstract
The ability to learn new tasks and generalize to others is a remarkable characteristic of both human brains and recent artificial intelligence systems. The ability to perform multiple tasks simultaneously is also a key characteristic of parallel architectures, evident in the human brain and exploited by traditional parallel computers. Here we show that these two characteristics reflect a fundamental tradeoff between interactive parallelism, which supports learning and generalization, and independent parallelism, which supports processing efficiency through concurrent multitasking. Although the maximum number of possible parallel tasks grows linearly with network size, under realistic scenarios their expected number grows sublinearly. Hence, even modest reliance on shared representations, which support learning and generalization, constrains the number of parallel tasks. This has profound consequences for understanding the human brain’s mix of sequential and parallel capabilities, as well as for the development of artificial intelligence systems that can optimally manage the tradeoff between learning and processing efficiency.
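The multitasking capacity analysed in this work corresponds to a maximum independent set (MIS) of a task interference graph, in which nodes are tasks and edges connect tasks that share representations and therefore cannot run concurrently. As a minimal illustrative sketch (not the authors' code, which is linked under Code availability; the graph below is a hypothetical example):

```python
from itertools import combinations

# Hypothetical interference graph on five tasks: each edge marks a pair of
# tasks that rely on shared representations and so interfere in parallel.
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 4)}

def conflicts(a, b):
    """True if tasks a and b interfere (share representations)."""
    return (a, b) in edges or (b, a) in edges

def max_parallel_set(n_tasks):
    """Brute-force maximum independent set: the largest group of mutually
    non-interfering tasks that can be executed simultaneously."""
    for k in range(n_tasks, 0, -1):  # try the largest sets first
        for subset in combinations(range(n_tasks), k):
            if all(not conflicts(a, b) for a, b in combinations(subset, 2)):
                return subset
    return ()

print(max_parallel_set(5))  # → (0, 2): at most two of the five tasks run in parallel
```

Brute force is exponential and only feasible for toy graphs; finding the MIS is NP-hard in general, which is why the paper studies its expected size statistically as a function of network structure.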
Data availability
Example data files are available at https://github.com/lordgrilo/Multitasking_capacity. Source data are provided with this paper.
Code availability
Code to reproduce the simulations and analysis reported here is available at https://github.com/lordgrilo/Multitasking_capacity.
Change history
05 March 2021
A Correction to this paper has been published: https://doi.org/10.1038/s41567-021-01212-4
Acknowledgements
G.P. has received funding support from Fondazione Compagnia San Paolo and from Intesa Sanpaolo Innovation Center. S.M. and J.D.C. acknowledge support from the John Templeton Foundation. The opinions expressed in this publication are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.
Author information
Authors and Affiliations
Contributions
G.P., S.M., B.D., K.Ö., N.K.A., T.L.W. and J.D.C. designed the research. G.P. developed and performed analytical and numerical calculations. S.M. and D.T. designed, implemented and performed the neural network simulations. S.M., K.Ö., B.D. and N.K.A. provided tools and performed neural network analysis. J.D.C. and T.L.W. conceptualized research and provided advice for all parts of the work. G.P., S.M., B.D., K.Ö., N.K.A., T.L.W. and J.D.C. wrote the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Physics thanks Hartmut Lentz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–15 and Sections 1–10.
Source data
Source Data Fig. 1
MIS from neural network simulation data.
Source Data Fig. 2
Degree distribution prediction; MIS size simulation data and predictions for interference graphs with Gaussian degree distribution; MIS size simulation data and predictions for the task structure graph with Gaussian degree distribution.
Source Data Fig. 3
Simulated and predicted effective capacity data.
Rights and permissions
About this article
Cite this article
Petri, G., Musslick, S., Dey, B. et al. Topological limits to the parallel processing capability of network architectures. Nat. Phys. 17, 646–651 (2021). https://doi.org/10.1038/s41567-021-01170-x
This article is cited by
- Knowledge generalization and the costs of multitasking, Nature Reviews Neuroscience (2023)