Introduction

New innovations in transforming science education to promote success and broaden participation require an understanding of how students learn. Evidence has shown that learning interventions, both long and short term, can be accompanied by lasting, content-related brain changes, suggesting that classroom instruction may influence the measurable neural processes by which students consolidate, access, or store information.1,2 Physics in particular can be a challenging discipline for many students as it requires both a conceptual understanding and recall of physical principles, along with acquisition of procedural skills for solving problems. Neuroimaging studies on physics learning indicate that cognition about physical concepts (e.g., velocity, acceleration, and force) is encoded into specific neural representations,3 and these representations may change during progressive stages of physics learning.4 Moreover, problem solving is known to engage an extensive frontoparietal central executive network (CEN), both generally across domains of knowledge5 and specifically regarding physics concepts.6 Collectively, these findings highlight a putatively influential role science learning may have on functional brain architecture and underscore the complexity of neural processes linked with proficiency in physics problem solving.

Insight into the scientific learning process may be gained by considering the obstacles students encounter. A wealth of cognitive science and education research has identified consistent patterns in how students think about physics, with a preponderance of studies focusing on difficulties mastering Newtonian mechanics.7,8,9 Physics students consistently struggle to learn key concepts and novice students are known to invoke intuitive but incorrect ideas of physical causality when solving problems.10,11 These misleading conceptions frequently interfere with a student’s ability to successfully acquire new physics knowledge,12 and a broad, but sometimes conflicting, body of literature has attempted to characterize these ideas to support conceptual change across instruction.13,14,15,16,17 One model posits that these so-called “folk physics” notions18,19 may be implicitly linked to associative memory, with naive reasoning arising from context-based extrapolations of remembered personal experiences.20 Another describes students’ reasoning as being based on common sense, but weakly organized, physical intuitions.21 Yet another view argues that ontological differences in the way students think about physical processes impact how persistent incorrect conceptions are across instruction.14 A contrasting opinion holds that students use ontological categories dynamically, and that the range of physics reasoning processes may be better explained by varying levels of coherence (integration of concepts) and robustness (applicability across contexts) in how students build patterns of associations between their existing cognitive resources (e.g., memories, beliefs, and facts).15,22 Despite these many models, little is known about the underlying neural processes of how students access, deploy, and attempt to resolve physics conceptions during reasoning. 
The limited work that has been done on this topic indicates that the anterior cingulate cortex (ACC) may be engaged when students view physically causal scenes that conflict with their strongly held intuitions.23 In addition, episodic, associative, and spatial recall are known to be supported by hippocampal and retrosplenial cortex (RSC) activity,24,25 and reasoning processes are linked with dorsolateral prefrontal cortex (dlPFC) and posterior parietal cortex (PPC) activity.5 However, no prior work has identified the specific neural processes that underlie physics reasoning nor any neurobiological differences associated with students’ differing use of incorrect physics conceptions. Such an understanding would inform existing behavioral models and might help us more fully understand how students learn physics.

We acquired functional magnetic resonance imaging (fMRI) data from 107 undergraduate students after the conclusion of a semester of university-level physics instruction. During fMRI, students were presented with questions adapted from the force concept inventory (FCI),26 a widely adopted test of conceptual problem solving that presents scenarios of objects at rest or in motion and asks students to choose between a Newtonian solution and several reasonable non-Newtonian alternatives, each of which mirrors a common confusion. Physics and baseline perceptual questions (Supplementary Fig. 1) were presented as blocks composed of three sequential view screens (i.e., “phases”): problem initiation in which students viewed text and a figure describing a physical scenario (Phase I), question presentation in which the students viewed a physics question about the scenario (Phase II), and answer selection wherein four possible answer choices were displayed for selection (Phase III). Brain activity across full questions (all phases), as well as within each phase, was assessed. We then explored putative links between the neural substrates of physics problem solving and accuracy, difficulty, strategy, and student conceptualization of physics ideas. First, we probed for brain-behavior correlations revealed by parametric modulation of the BOLD signal in independent meta-analytically defined a priori reasoning and memory-linked regions of interest (ROIs; Supplementary Fig. 2) located in the left dlPFC, ACC, left PPC, left hippocampus, and RSC, and across the whole brain. Second, because student response patterns across FCI questions are heterogeneous, and even incorrect answer choices provide meaningful information about students’ conceptions,27 we distinguished subtypes of “physics thinkers” based on their FCI answer choices. 
Specifically, we applied community detection to FCI answer distributions to identify subgroups of similarly responding students and contrasted brain activity between groups to examine differential ways of thinking about the behavior of physical phenomena.

Results

Physics problem solving engages visual motion, central executive, and default mode processes

FCI responses (mean accuracy = 61%, mean response time (RT) = 20.2 s) were consistent with previous reports27,28 and significantly differed (p < 0.001) from control responses (mean accuracy = 98%, mean RT = 15.8 s), suggesting overall task compliance. Maps of FCI > Control blocks revealed activation across a fronto-temporo-parietal network, including the prefrontal cortex (PFC), left dorsal striatum, PPC, RSC, and dorsal posterior cingulate cortex, lateral occipitotemporal cortex (V5/MT+), and cerebellum (Fig. 1a; Supplementary Table 1). To tease apart constituent neural processes, we analyzed sequential phases of the problem-solving process and observed multiple dissociable whole-brain networks linked with problem initiation (Phase I), question presentation (Phase II), and answer selection (Phase III). Phase I was associated with a similar activity pattern as the FCI > Control contrast, Phase II maps were characterized by right-emphasized dorsal posterior parietal and V5/MT+ engagement, and Phase III maps included medial, anterior, and posterior nodes of the default mode network (DMN; Fig. 1b–d; Supplementary Table 2). These network transitions from fronto-temporo-parietal (Phase I) to dorsal attention (DAN; Phase II) followed by default mode cooperation (Phase III) point to the potentially important role V5–DMN–CEN interactions may have within physics reasoning processes. Meta-analytic functional decoding, which is a technique used to provide data-driven inferences about which mental functions are likely associated with specific brain activation patterns (see SI for more details), was performed on the resulting unthresholded z-statistic maps by using Neurosynth,29 indicating that terms for switching, default, motion, and reasoning were associated with physics problem solving (Fig. 1 radar plots; Supplementary Table 3).

Fig. 1

Physics problem solving-related brain activity. Activation of FCI > Control for a problem solving across all phases, b–d each sequential problem phase, and e parametric modulation across all phases by problem difficulty. Activation maps were thresholded by using a cluster-defining threshold of P < 0.001 and a cluster extent threshold of P < 0.05, FWE corrected. Adjacent radar plots depict functional decoding results of the top ten weighted terms for each network. Note that term weightings depend on the values of each input map; thus, each radar plot depicts an arbitrary scale and comparison of values across plots is not recommended.29

Decoding sequential phases indicated that problem initiation may reflect visuospatial attention, perceptual/motor, and memory retrieval; question presentation was associated with switching, visual short-term memory, and numbers, and answer selection was linked to DMN-related terms (e.g., unconstrained (free), mentalizing, and ambiguous), consistent with mental exploration of a solution. Next, to assess information exchange across GLM-identified regions during problem solving, we performed task-based functional connectivity (FC) analyses for three seeds centered on peaks of the overall FCI > Control map located in the left V5/MT+, the left dlPFC, and the RSC. Psychophysiological interaction (PPI) results (Fig. 2; Supplementary Table 4) revealed greater physics problem solving-related coupling (relative to control conditions) of the left V5/MT+ with DAN brain areas, the left dlPFC with V5/MT+ and DMN areas, and the RSC with frontoparietal, DMN, and salience network (SN) regions. These outcomes suggest that complex visual information may be carried through a dorsal stream to frontoparietal regions that direct CEN–DMN network exchanges during physics reasoning.

Fig. 2

Psychophysiological interaction (PPI) results. Whole-brain PPI task-based functional connectivity associated with FCI > Control for a left V5/MT+, b left dlPFC, and c RSC seeds. PPI maps were thresholded by using a cluster-defining threshold of P < 0.001 and a cluster extent threshold of P < 0.05, FWE corrected

Difficulty, but not accuracy or strategy, modulates brain activity during problem solving

To relate brain function to behavioral measures impacting student success, we tested our hypotheses that activity in meta-analytically derived ROIs (i.e., left dlPFC, left PPC, ACC, left hippocampus, and RSC) would be parametrically modulated by student-reported strategy and normative problem difficulty,30 but not answer accuracy. While no significant BOLD signal modulations were observed in these a priori ROIs, an exploratory whole-brain parametric modulation analysis revealed that DAN and occipital activity were positively modulated by problem difficulty (Fig. 1e; Supplementary Table 5). This indicates that the network associated with physics reasoning is consistently activated, regardless of whether a correct answer is achieved, and does not reflect students’ perception of their reasoning strategy. Importantly, the most salient relation appears to be between the degree of difficulty and engagement of brain regions linked with visuospatial perceptual, memory, and attentional processes, as assessed by functional decoding (Fig. 1e right).

Students demonstrate dissociable brain activity linked to knowledge fragmentation

We next performed module analysis31 on students’ answer patterns to probe potential relationships between brain activity and students’ conceptual coherence (i.e., integration of physics knowledge)22 and to assess if distinct reasoning profiles were rooted in underlying functional brain differences. We analyzed answer distributions by using a community detection algorithm32 to parse student subgroups who provided similar responses across FCI questions. Percent overlap was assessed between answers provided by each group and previously identified “conceptual modules” present in the FCI test31 (Supplementary Table 6). Conceptual modules are communities of incorrect FCI answer choices that are usually selected together. They represent students’ dissociable non-Newtonian (incorrect) notions about physical phenomena, some of which demonstrate a high degree of conceptual coherence, while others are more suggestive of a fragmented collection of physics ideas.21,31,33 The set of conceptual modules selected by a group (their reasoning profile) represents distinguishable arrangements of students’ (mis)interpretations and confusions about the physical world. Module analysis detected 13 student groups across 107 students who answered similarly to each other during FCI problem solving with a modularity of Q = 0.53 (Fig. 3a). Four groups had ten or more members (i.e., normative groups). ANOVA indicated a significant difference in mean framewise displacement (FD) head motion between one or more of the groups (F(3, 178) = 8.213, p < 0.001). Post hoc multiple comparison Tukey HSD tests indicated that students in Group D showed significantly greater head motion (p < 0.05). The three remaining normative groups showed no significant differences in in-scanner head motion and were thus selected for further analysis. The answer distributions of these three groups were characterized based on prevalence of conceptual modules (Fig. 3b). 
These groups, composed of 24, 17, and 10 students, were carried into group-level neuroimaging analyses to assess brain activity and connectivity differences during problem solving.

Fig. 3

Inhomogeneity in students’ conceptual approach. a Module analysis of student responses across FCI answer distributions. Heat map colors represent student responses to multiple-choice FCI questions and black horizontal lines distinguish groups identified by community detection. b Scaled within-group overlap of incorrect FCI responses across nine previously measured physics conceptual models31 (Supplementary Table 6) for top three normative groups. c Group differences in problem solving-related brain networks (FCI > Control, all phases) across the three normative groups. Increased activity is shown for Groups A and B relative to Group C (top) and Group C relative to Groups A and B (bottom). No significant differences were observed between Groups A and B. Group difference maps were thresholded by using a cluster-defining threshold of P < 0.001 and a cluster extent threshold of P < 0.05, FWE corrected

Group A (n = 24) achieved an accuracy rate of 77% across all FCI questions, indicative of being highly Newtonian thinkers.27 Of the non-Newtonian responses provided by this group, incorrect answers almost exclusively aligned with a common naive physics idea known as the “impetus force” (m1, Fig. 3b top), which is the incorrect belief that moving objects experience a propelling force. Group B (n = 17) achieved an accuracy rate of 73% across all FCI questions, which is also indicative of high Newtonian thinking. The reasoning profile for Group B (Fig. 3b middle) indicated that students gave incorrect answers by either falling victim to the impetus force fallacy (m1) or to another common, but less coherent set of physics conceptions that we term the “confusion about gravitational action” module (m9). Group C (n = 10) achieved an accuracy rate of 53% across all FCI questions, indicative of non-Newtonian thinking. The reasoning profile for Group C (Fig. 3b bottom) indicated that students’ incorrect answers were primarily associated with five conceptual modules that each occurred at relatively similar rates: the “impetus force” module (m1), “more force yields more result” module (m2), “confusion relating speed and path” module (m5), “sudden forces induce instantaneous path change” module (m6), and “an object’s mass determines how it falls” module (m7).

We performed a whole-brain, one-way ANOVA to identify between-group differences in physics-related brain activity (FCI > Control, all phases). Omnibus results indicated that one or more subgroups showed significantly different brain activity during problem solving. Post hoc tests were performed across each combination of group pairs (Fig. 3c; Supplementary Table 7). Group A (vs. C) students demonstrated greater activity during problem solving in the left lateral orbitofrontal cortex (lOFC) as well as in the left inferior parietal lobule, bilateral V5/MT+, and right cerebellum. Group B (vs. C) students also exhibited greater activity in the left lOFC. Group C (vs. both A and B) students showed greater activity in the cuneus extending into the lingual gyri. In addition, Group C students also showed increased activity relative to Group A in the caudal medial frontal gyrus, ACC, bilateral precentral and postcentral gyri along the precentral sulcus, bilateral anterior insular cortex (aIC), and left superior temporal gyrus. Overall, students who answered by using more coherent physics conceptions, even if incorrect, showed increased reliance on an lOFC–V5/MT+ network, whereas students who held less consistent ideas involving multiple conceptual approaches showed increased primary visual and SN activity. One possible interpretation of these differences is that in the absence of stable and coordinated physics conceptions, students engage relatively more visual search processes for salient problem features.

Discussion

Our fMRI results suggest that when students solve physics problems, they activate a network of bilateral dlPFC, left lOFC, PPC, RSC, and V5/MT+ areas, consistent with previous CEN-supported problem-solving findings across knowledge domains.5 Yet, V5/MT+ and RSC involvement with the CEN appears to be a feature of physics problem solving in particular. Both areas support visuospatial information processing,34 with the bilateral V5/MT+ system being linked to visual motion processing including imagining implied motion and maintaining motion information in working memory,35,36,37 and RSC supporting spatial cognition and episodic memory retrieval, especially when imagined scenes are mentally transformed between specific viewpoints.24 Thus, these regions may aid in the mental imagery of motion, as informed by remembered physical scenarios, and build internal representations of physical systems, which is considered an essential step in physics solution generation.38 Shifts in physics-related brain activity across problem phases indicate reliance on memory-linked associations. We find that V5/MT+, CEN, DAN, and DMN transitions support sequential problem-solving phases. Notably, answer generation elicited concurrent DMN, lateral fronto-parietal, and V5/MT+ activity. Interestingly, while CEN-supported tasks often evoke DMN deactivations, this DMN–CEN coherence likely indicates reliance on episodic and semantic memory retrieval processes39,40 during physics cognition, a notion consistent with the constructivist theory of learning.41 In addition, the PCC is functionally heterogeneous, connecting DMN and fronto-parietal networks, and serving as a possible hub across brain systems to direct attentional focus.42 Further, the FCI is differentiated from other fMRI tasks by its relatively long trials, requiring sustained cognition to generate answers. The DMN may thus be activated along with the CEN to allow for mental exploration necessary in solution derivation.

Problem solving-related brain activity was shown to differ based on how students think, not how correct they are. We found that students’ problem solving-related brain function cannot be categorized by simply considering their “incorrect” versus “correct” answers. Rather, module analysis indicates that variance in conceptual approach better characterizes brain differences, which in turn impacts success rate. An existing framework of learning conceptualizes physics cognition as relying on dual “knowledge structure” and “control structure” processes.22 Under this model, students apply executive functions to select or inhibit associational patterns that ground how they describe the physical world. Here, associational patterns, known as knowledge structures, are conceptualized as flexible, contextually primed collections of linked knowledge elements called “resources” that students activate to scaffold reasoning. Ideally, students learn to activate stable associations between physical laws, enabling long deductive chains to be carried out during problem solving. However, when this does not occur, students’ non-Newtonian processes can vary: strongly associated yet inappropriate resources may stably activate across contexts, or more basic, axiomatic physical beliefs (e.g., intuitive notions such as closer is stronger or more effort gives more result)21 may form weak, unstable links that do not support ancillary deductive elaboration. These differences are described along an axis of “compilation” or memory chunking. Students without precompiled knowledge structures require additional cognitive resources to assemble associations during reasoning, whereas physics experts can access well-developed associational patterns that do not need to be actively assembled during problem solving.

We adopt this resource framework to interpret brain function with the goal of relating neuroimaging findings to educational knowledge and practice. Physics-related CEN and DAN activations were linked to varied cognitive terms consistent with the idea of a control structure, and DMN involvement during reasoning may reflect associational mappings within semantic or episodic memory circuits.39,40 Thus, dlPFC–RSC FC may support the idea that control processes guide knowledge structure selection. Under this interpretation, reasoning subgroups may be thought of as differentiated by knowledge structure use. Groups A and B applied predominantly Newtonian (i.e., compiled) thinking, but Group C was less consistent in their approach. Of the non-Newtonian modules activated, Group A consistently used an arguably concrete impetus model, Group B applied an impetus model while also expressing confusion about gravitational action, and Group C utilized multiple modules characterized by simple, vague, or confused ideas that differed across problems. We argue that these groups can be described along a continuum of knowledge compilation, coherence, and robustness. Groups A, and to a lesser extent, B demonstrated stable, strongly associated knowledge structures, whereas Group C showed more labile associational patterns that were limited by problem context. In this manner, less coherent, more variable knowledge structures were associated with increased primary visual and SN activity, whereas precompiled, stable reasoning strategies more strongly activated lOFC and V5/MT+, areas implicated by physics thinking in the CEN. These findings suggest that chunked knowledge can reduce working memory demands, allowing for increased focus on other control structure aspects of problem solving.22 However, when students continually reidentify associational patterns across problems, they may rely more heavily on visually guided SN activity to select which problem features deserve their attention.43

A fundamental goal of educational neuroscience is to bridge understanding of brain function with the insights, findings, and models of education research. Under a resource framework, our results suggest that physics students struggle most when they do not understand how to choose appropriate and coherently chunked resources from long-term memory, thus relying on increased SN activity during problem solving. Learning obstacles also occur when students access compiled but nonphysical conceptions during reasoning, allowing for increased CEN brain function linked to control processes. While the latter still represents a type of incorrect physics thinking, it more closely resembles the kind of cognition instructors aim to teach.22 As others have pointed out,44 it is a long path between brain imaging and the potential development of lesson plans, yet these insights may begin to inform aspects of physics classroom practice: instruction that explicitly attends to how students select, link, and reorganize resources may be critical in developing appropriately compiled knowledge to map back onto control processes.22 Learning physics is complex, yet a disproportionate focus is often placed on whether students answer questions correctly. Our results suggest that the conceptual foundations of wrong answers are accompanied by functional brain differences during reasoning and can reveal much more about students’ ability to succeed than simple measures of accuracy. A focus on accuracy alone oversimplifies the complex processes engaged during physics reasoning. Instructors who leverage (rather than ignore or attempt to simply overwrite) students’ incorrect conceptions to facilitate conceptual change and transition existing resources about physical phenomena into stable and accessible knowledge structures may better serve students in connecting what they believe with what they predict.

In sum, we find that the neural mechanisms underlying conceptual physics problem solving are characterized by integrated visual motion, central executive, attentional, and default mode brain systems, with solution generation relying on critical DMN–CEN engagement during reasoning. Furthermore, we explored whether measures of student success show underlying neurobiological bases, finding that students’ physics conceptions manifest as brain differences along an axis of relative knowledge fragmentation and robustness. Critically, accuracy alone did not predict brain function, but students achieved increased success when they made use of stable, strongly associated knowledge structures. We acknowledge that our results may be specific to the FCI questions used here, that additional or varied brain dynamics may be more relevant for different kinds of physics problem solving, and that sample sizes across Groups A–C are relatively small and uneven. Despite these concerns, we are confident that our findings serve to deepen understanding of how students learn. Together, our results demonstrate that associational and control processes operate in tandem to support physics problem solving and offer potential educational insight toward promoting student success.

Methods

Participants

One hundred and seven healthy right-handed undergraduate students (age 18–25 years; 48 women) enrolled in introductory calculus-based physics at Florida International University (FIU) took part in this study. MRI data were acquired no more than 2 weeks after the end of the academic semester. Written informed consent was obtained in accordance with FIU Institutional Review Board approval.

FCI task

The Force Concept Inventory, a widely used45 and reliable46 test of conceptual understanding in Newtonian physics,26 which includes a series of questions about physical scenarios, was adapted for the MRI environment. FCI questions do not require mathematical calculation; rather they force students to choose between a correct answer and multiple common sense alternatives. The task included three phases: participants viewed a figure and descriptive text presenting a physical scenario (Phase I), a physics question was presented (Phase II), and participants viewed four possible answers and were instructed to choose the correct answer and mentally justify why their solution made the most sense (Phase III). Participants provided a self-paced button press to advance between phases and provide their final answer; a fixation cross was shown after answer selection before presentation of the next scenario. Question blocks were of maximum duration 45 s and were followed by a fixation cross of minimum duration 10 s. Control questions presented everyday physical scenarios and queried students on general reading comprehension instead of physics content. Control questions also included three phases (Control I, Control II, and Control III) to match the presentation of FCI questions. Post-scan debriefing included a paper-based questionnaire in which students rated the degree to which they had used “knowledge and reasoning” or had relied on a “gut feeling” to solve each FCI question.

fMRI acquisition and preprocessing

Functional images were acquired with an interleaved gradient-echo, echo planar imaging sequence (TR/TE = 2000/30 ms, flip angle = 75°, FOV = 220 × 220 mm, matrix size = 64 × 64, and voxel dimensions = 3.4 × 3.4 × 3.4 mm, 42 axial oblique slices). A T1-weighted series was acquired by using a 3D fast spoiled gradient recall brain volume (FSPGR BRAVO) sequence with 186 contiguous sagittal slices (TI = 650 ms, bandwidth = 25.0 kHz, flip angle = 12°, FOV = 256 × 256 mm, and slice thickness = 1.0 mm). Preprocessing was performed by using FSL (www.fmrib.ox.ac.uk/fsl) and AFNI software libraries. Anatomical and functional images were skull stripped, the first five frames of each functional run were discarded, rigid-body motion correction was performed, functional images were high-pass filtered (110 s), and a 12-degree-of-freedom affine transformation was applied to co-register the series with each structural volume. Nonlinear resampling was applied to transform all images into MNI152 space, and functional volumes were spatially smoothed by using a 5-mm Gaussian kernel. All motion-corrected non-registered 4D data underwent visual inspection, and TRs associated with visually identified motion artifacts were flagged for exclusion and their corresponding FD values were recorded. The minimum of the distribution of these artifact-linked FDs was used as a common scrubbing threshold across subjects during analyses. TRs with excessive motion (including one frame before and two frames after) were censored out during the GLM analysis if they met or exceeded a threshold of 0.35-mm FD.47 Runs containing excessive motion (≥33% of within-block motion) were discarded from the analysis, resulting in the omission of three runs from two individuals. Six motion parameters (translations and rotations) were included as nuisance regressors in all analyses.
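
The censoring rule described above can be illustrated with a short sketch. It assumes a per-run list of FD values (in mm) is already available; the function name, argument names, and example values are hypothetical and are not taken from the study's analysis code. The run-level exclusion criterion is not reproduced here.

```python
def censor_mask(fd, threshold=0.35, before=1, after=2):
    """Return one boolean per TR; True marks a TR to censor.

    A TR is flagged when its FD meets or exceeds the threshold, and the
    flag is extended to one frame before and two frames after it.
    """
    n = len(fd)
    mask = [False] * n
    for i, value in enumerate(fd):
        if value >= threshold:
            first = max(0, i - before)       # clip at the start of the run
            last = min(n, i + after + 1)     # clip at the end of the run
            for j in range(first, last):
                mask[j] = True
    return mask


# Hypothetical FD trace (mm) for a seven-TR excerpt.
fd_trace = [0.10, 0.12, 0.40, 0.08, 0.09, 0.11, 0.05]
mask = censor_mask(fd_trace)
censored_fraction = sum(mask) / len(mask)
```

In this excerpt the single supra-threshold TR causes four TRs to be censored in total.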

General linear model analyses

Stimulus-timing files were created for each participant based on question phase onset/offset times. FCI and control questions were modeled as blocks from question onset to the onset of a concluding fixation cross triggered by answer selection. The contrast FCI > Control was modeled across full question duration; three additional GLM analyses were performed for the individual phases. Timing files were convolved with a hemodynamic response function, and the first temporal derivatives of each convolved regressor were included to account for any offset in peak BOLD response. General linear modeling for within- and between-subject analyses was performed in FSL by using FEAT. Group-level activation maps for all contrasts were thresholded with a cluster-defining threshold (CDT) of P < 0.001 and a cluster extent threshold (CET) of P < 0.05 (FWE corr).
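
As a rough illustration of how a block regressor and its temporal derivative can be built, the sketch below convolves a boxcar with a single-gamma response. The gamma parameters (mean lag 6 s, standard deviation 3 s), the kernel length, and all function names are assumptions made for this example; the text above does not specify the response function used for the GLM, and FEAT constructs these regressors internally.

```python
import math

TR = 2.0  # seconds, from the acquisition parameters


def gamma_hrf(tr=TR, mean_lag=6.0, sd=3.0, duration=24.0):
    """Gamma-shaped response sampled every TR and normalised to unit sum.

    mean_lag, sd, and duration are assumed values for illustration only.
    """
    shape = (mean_lag / sd) ** 2
    scale = sd ** 2 / mean_lag
    kernel = []
    for i in range(int(duration / tr)):
        t = i * tr
        density = t ** (shape - 1) * math.exp(-t / scale)
        kernel.append(density / (math.gamma(shape) * scale ** shape))
    total = sum(kernel)
    return [k / total for k in kernel]


def boxcar(onsets, durations, n_scans, tr=TR):
    """Return 1.0 for scans inside a block and 0.0 elsewhere."""
    box = [0.0] * n_scans
    for onset, dur in zip(onsets, durations):
        start = int(round(onset / tr))
        stop = int(round((onset + dur) / tr))
        for i in range(start, min(stop, n_scans)):
            box[i] = 1.0
    return box


def convolve(signal, kernel):
    """Causal discrete convolution truncated to the length of signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j, k in enumerate(kernel):
            if i - j >= 0:
                out[i] += signal[i - j] * k
    return out


def temporal_derivative(x):
    """First difference, padded with a leading zero to keep the length."""
    return [0.0] + [x[i] - x[i - 1] for i in range(1, len(x))]


# Hypothetical question block: onset at 10 s, 20 s long, in a 30-scan run.
box = boxcar([10.0], [20.0], n_scans=30)
regressor = convolve(box, gamma_hrf())
derivative = temporal_derivative(regressor)
```

Including the derivative column lets the model absorb small shifts in the timing of the peak response, which is the purpose stated above.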

Task-based functional connectivity analysis

We tested for PPI associated with the FCI task across three seeds centered on peaks from the overall FCI > Control map located in the left V5/MT+, left dlPFC, and RSC. ROIs were transformed into native space, and time series were extracted from unsmoothed data and included as regressors in separate within-subject PPI analyses performed on spatially smoothed 4D data sets. Design matrices for the within-subject PPI analyses contained regressors for the ROI time series, the condition difference vector modeling the differences between FCI and Control timing files, a vector representing the sum of the FCI and Control conditions, and the interaction between the task difference vector and ROI time series. The interaction term was calculated by zero-centering the task explanatory variable, and the mean of the ROI time series was set to zero. All task and interaction regressors, but not the ROI time series, were convolved with a Gamma-modeled hemodynamic response. PPI analyses were carried out separately for each ROI, and the resultant beta maps were averaged within-subject and carried into three separate group-level analyses. ROI-to-voxel task-based functional connectivity analyses were thresholded at a significance of P < 0.001 CDT, P < 0.05 CET (FWE corr).
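
The four PPI regressors described above can be sketched as follows. This is a simplified illustration under stated assumptions: "zero-centering" is taken to mean shifting the task vector so that the midpoint between its minimum and maximum is zero, hemodynamic convolution is omitted, and all names and example values are hypothetical.

```python
def demean(x):
    """Subtract the mean so the series averages to zero."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def zero_centre(x):
    """Shift so the midpoint between min and max is zero (assumed meaning)."""
    mid = (max(x) + min(x)) / 2.0
    return [v - mid for v in x]


def ppi_regressors(fci, control, roi):
    """Return the four design-matrix columns as a dict of lists."""
    task_diff = [f - c for f, c in zip(fci, control)]
    task_sum = [f + c for f, c in zip(fci, control)]
    psych = zero_centre(task_diff)   # task explanatory variable, zero-centred
    physio = demean(roi)             # ROI time series with mean set to zero
    interaction = [p * r for p, r in zip(psych, physio)]
    return {
        "roi": list(roi),
        "task_diff": task_diff,
        "task_sum": task_sum,
        "interaction": interaction,
    }


# Hypothetical six-scan excerpt: boxcars for each condition and a seed signal.
fci = [1, 1, 0, 0, 0, 0]
control = [0, 0, 0, 1, 1, 0]
roi = [2.0, 4.0, 3.0, 1.0, 1.0, 1.0]
design = ppi_regressors(fci, control, roi)
```

The interaction column is large where the seed signal departs from its mean in the direction favored by the FCI condition, which is what the group-level PPI maps test for.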

Brain-behavior correlates

Separate within-subject parametric modulation analyses were performed for accuracy, difficulty, and self-reported problem-solving strategy. Design matrices were identical to GLM analyses but included a single parametric modulator with the same FCI question timing but with a regressor height modeled by differences in the behavioral measures. Accuracy was modeled with regressor heights of 1, 0, or −1 corresponding to correct, no response, or incorrect answer provided. Difficulty was measured as a normative miss rate per FCI question, as measured externally.30 Problem-solving strategy was measured on a Likert scale by a post-scan questionnaire. If any parametric modulator had zero variance within a run (i.e., the student reported using an identical strategy for all questions), then the run was discarded to avoid rank deficiency in the design matrix. The resulting beta maps were then averaged across within-subject runs. Brain-behavior correlations were tested via two separate analyses. First, we extracted within-subject parametric modulator beta values within five hypothesis-driven ROIs and conducted one-sample, two-sided t tests on the beta distributions for significant variations from baseline (Supplementary Fig. 3). Second, group-level analyses were performed with whole-brain beta maps resulting from the parametric modulation GLMs to determine if significant network-level activity was present during problem solving associated with the behavioral measures.
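
A minimal sketch of the modulator coding and the ROI-level test is given below. The helper names and example responses are hypothetical, and only the t statistic is computed; obtaining a two-sided p value would additionally require the t distribution, which is left out here.

```python
import math
import statistics


def accuracy_heights(responses, answer_key):
    """Code each question as 1 (correct), 0 (no response), or -1 (incorrect)."""
    heights = []
    for given, correct in zip(responses, answer_key):
        if given is None:
            heights.append(0)
        elif given == correct:
            heights.append(1)
        else:
            heights.append(-1)
    return heights


def has_zero_variance(modulator):
    """True when every value is identical, which would make the design rank deficient."""
    return len(set(modulator)) <= 1


def one_sample_t(betas):
    """t statistic for testing whether the mean beta differs from zero."""
    n = len(betas)
    return statistics.mean(betas) / (statistics.stdev(betas) / math.sqrt(n))


# Hypothetical run with four questions; None marks a missing response.
heights = accuracy_heights(["B", None, "A", "D"], ["B", "C", "C", "D"])
keep_run = not has_zero_variance(heights)
```

A run in which a student gave the same strategy rating to every question would return True from has_zero_variance and be dropped, matching the exclusion rule above.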

Student response profiles

Given evidence indicating that student responses to the FCI provide insight into how students think about physics problems,31 we performed a module analysis of the observed FCI answer distributions to identify student response profiles. The data were treated as a bipartite matrix of Students × Responses. This bipartite matrix was computed and then projected into a weighted adjacency matrix of students, A = MMᵀ, where M is the bipartite matrix and Mᵀ is its transpose. Each element in A represents the count of how many times one student agreed with any other student (values from 0 to 9, for 9 questions). Next, we performed nonparametric sparsification48 on A to identify the backbone of the graph. Backboning identifies important links within a network and reduces the number of spurious links. A significance value was computed for each edge weight and the edge weights were thresholded at P < 0.01. We performed community detection (InfoMapR32) on the backbone network to identify subgroups of students who provided similar responses to the FCI prompts. We assessed the scaled within-group overlap of incorrect FCI responses across a set of nine previously measured physics modules consisting of jointly selected incorrect FCI response items31 (Supplementary Table 6). Each group’s relative conceptual module representation was scaled by group size to allow for comparisons across groups of different sizes. Alignment with conceptual modules indicates that students draw on specific non-Newtonian physics conceptions. Finally, we tested for network differences across student groups. An omnibus test was conducted for the FCI > Control contrast as well as for the three whole-brain PPI maps. Significant F-test results were further interrogated with post hoc t tests across groups. Maps were thresholded at P < 0.001 CDT, P < 0.05 CET (FWE corr).
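
The projection from the Students × Responses matrix to the student agreement matrix can be sketched as below. The answer strings are invented for illustration and use three questions rather than nine; the sparsification and community detection steps are not reproduced, and the function names are hypothetical.

```python
CHOICES = "ABCD"  # four answer choices per question, as shown in Phase III


def bipartite_matrix(answers, choices=CHOICES):
    """One row per student, one indicator column per (question, choice) pair."""
    students = sorted(answers)
    rows = []
    for student in students:
        row = []
        for picked in answers[student]:
            row.extend(1 if picked == c else 0 for c in choices)
        rows.append(row)
    return students, rows


def project(M):
    """Compute A = M M^T; A[i][j] counts the questions students i and j answered alike."""
    n = len(M)
    return [
        [sum(a * b for a, b in zip(M[i], M[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical answers: one string per student, one letter per question.
answers = {"s0": "ABC", "s1": "ABD", "s2": "CCD"}
students, M = bipartite_matrix(answers)
A = project(M)
```

Here s0 and s1 agree on two questions, s1 and s2 on one, and s0 and s2 on none. The diagonal of A simply holds the number of questions each student answered and carries no information about agreement between students.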

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.