The ability to learn new tasks and generalize to others is a remarkable characteristic of both human brains and recent artificial intelligence systems. So is the ability to perform multiple tasks simultaneously, a hallmark of parallel architectures that is evident in the human brain and exploited in traditional parallel computing. Here we show that these two characteristics reflect a fundamental tradeoff between interactive parallelism, which supports learning and generalization, and independent parallelism, which supports processing efficiency through concurrent multitasking. Although the maximum number of possible parallel tasks grows linearly with network size, under realistic scenarios their expected number grows sublinearly. Hence, even modest reliance on shared representations, which support learning and generalization, constrains the number of parallel tasks. This has profound consequences for understanding the human brain’s mix of sequential and parallel capabilities, as well as for the development of artificial intelligence systems that can optimally manage the tradeoff between learning and processing efficiency.
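The parallel-processing limit described above can be illustrated with a toy model (a minimal sketch, not the paper's actual code; the helper names `interference_graph` and `max_independent_set` and the example task sets are illustrative assumptions): tasks are treated as input→output mappings, two tasks interfere when they share an input or an output node, and the number of tasks that can run concurrently is the size of a maximum independent set (MIS) of the resulting interference graph.

```python
import itertools


def interference_graph(tasks):
    """Build interference edges between tasks.

    Each task is an (input_node, output_node) pair. Two tasks interfere
    (cannot run in parallel) if they share an input or an output node.
    """
    edges = set()
    for (i, (a_in, a_out)), (j, (b_in, b_out)) in itertools.combinations(
        enumerate(tasks), 2
    ):
        if a_in == b_in or a_out == b_out:
            edges.add((i, j))
    return edges


def max_independent_set(n, edges):
    """Brute-force maximum independent set size (feasible for small n only)."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # Try subsets from largest to smallest; the first independent one is maximum.
    for r in range(n, 0, -1):
        for subset in itertools.combinations(range(n), r):
            chosen = set(subset)
            if all(adj[i].isdisjoint(chosen) for i in chosen):
                return r
    return 0


# Four tasks over fully disjoint input/output nodes: all can run in parallel.
disjoint = [(i, 100 + i) for i in range(4)]
print(max_independent_set(4, interference_graph(disjoint)))  # → 4

# Shared representations: tasks 0 and 1 share an input, 0 and 2 share an output.
shared = [(0, 10), (0, 11), (1, 10), (2, 12)]
print(max_independent_set(4, interference_graph(shared)))  # → 3
```

Even this tiny example shows the qualitative effect in the abstract: introducing a few shared nodes reduces the independent-set size below the network's nominal task count, and the gap widens as sharing increases.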
Example data files are available at https://github.com/lordgrilo/Multitasking_capacity. Source data are provided with this paper.
Code to reproduce the simulations and analysis reported here is available at https://github.com/lordgrilo/Multitasking_capacity.
G.P. has received funding support from Fondazione Compagnia San Paolo and from Intesa Sanpaolo Innovation Center.
The authors declare no competing interests.
Peer review information Nature Physics thanks Hartmut Lentz and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
MIS from neural network simulation data.
Degree-distribution prediction; MIS-size simulation data and predictions for interference graphs with Gaussian degree distribution; MIS-size simulation data and predictions for the task-structure graph with Gaussian degree distribution.
Simulated and predicted effective-capacity data.
Cite this article
Petri, G., Musslick, S., Dey, B. et al. Topological limits to the parallel processing capability of network architectures. Nat. Phys. (2021). https://doi.org/10.1038/s41567-021-01170-x