Hippocampal GABA enables inhibitory control over unwanted thoughts

Intrusive memories, images, and hallucinations are hallmark symptoms of psychiatric disorders. Although often attributed to deficient inhibitory control by the prefrontal cortex, difficulty in controlling intrusive thoughts is also associated with hippocampal hyperactivity, arising from dysfunctional GABAergic interneurons. How hippocampal GABA contributes to stopping unwanted thoughts is unknown. Here we show that GABAergic inhibition of hippocampal retrieval activity forms a key link in a fronto-hippocampal inhibitory control pathway underlying thought suppression. Subjects viewed reminders of unwanted thoughts and tried to suppress retrieval while being scanned with functional magnetic resonance imaging. Suppression reduced hippocampal activity and memory for suppressed content. 1H magnetic resonance spectroscopy revealed that greater resting concentrations of hippocampal GABA predicted better mnemonic control. Higher hippocampal, but not prefrontal GABA, predicted stronger fronto-hippocampal coupling during suppression, suggesting that interneurons local to the hippocampus implement control over intrusive thoughts. Stopping actions did not engage this pathway. These findings specify a multi-level mechanistic model of how the content of awareness is voluntarily controlled.

This paper uses functional MRI (fMRI) and Magnetic Resonance Spectroscopy (MRS) to study the brain mechanisms underlying control over unwanted thoughts. Many tests were made so as to infer that these mechanisms were specific both to the brain regions involved (e.g. hippocampus rather than primary motor/visual cortex) and what was being controlled (e.g. thoughts rather than actions). I find the paper to be impressive in the quality with which a broad range of techniques have been used and harnessed together to answer an important question (how does the brain suppress unwanted thoughts?), a question of relevance to many psychiatric disorders.
In what follows I will review each section of the results, covering the main findings, and focussing on methodology:
(1) Thought suppression engages a functionally specific hippocampal pathway
The GLM-based mass univariate analysis of the fMRI data (Fig 1) used a correction for multiple comparisons of cluster-level inferences using high cluster forming thresholds (p<0.001) -see Subjects performed a Think/No-Think task and a contrast of Think versus No-Think identified left and right hippocampus, whereas a contrast of Go versus Stop identified left and right M1. I'm not especially familiar with this literature. I expect the Go-versus-Stop paradigm has been scanned many times using fMRI with similar results; is this correct? Is this the first time Think/No-Think has been scanned? Additionally, DLPFC was activated during suppression of thoughts or actions.
(2) Hippocampal GABA predicts (i) reduced BOLD and (ii) successful thought suppression
(i) Hippocampal GABA predicted hippocampal BOLD response during Think and No-Think tasks (more GABA, less BOLD) but not during Go or No-Go tasks (Actions). DLPFC and visual cortical GABA did not make these predictions. These inferences were made using correlations over subjects with bootstrapped confidence intervals.
(ii) Hippocampal GABA predicted 'suppression induced forgetting' (impairment of later memory for suppressed items). Again inferences were made using correlations over subjects with bootstrapped confidence intervals. Looks fine.
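As an illustration of this kind of inference, the sketch below computes a percentile-bootstrap confidence interval for a Pearson correlation over subjects. It is a minimal sketch under stated assumptions, not the authors' analysis: the data are synthetic (18 simulated subjects), the function names `pearson_r` and `bootstrap_ci` are hypothetical, and the manuscript may have used a different (for example, robust) estimator.

```python
import math
import random


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample subjects (pairs) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # skip degenerate resamples with zero variance
        stats.append(pearson_r(xs, ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic illustration only: simulated "GABA" and "BOLD" values, not study data.
rng = random.Random(1)
gaba = [rng.gauss(0.0, 1.0) for _ in range(18)]
bold = [-0.6 * g + rng.gauss(0.0, 0.8) for g in gaba]

r = pearson_r(gaba, bold)
lo, hi = bootstrap_ci(gaba, bold)
print(f"r = {r:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Under this approach a correlation is treated as reliable when the interval excludes zero; resampling whole subjects keeps each subject's paired measurements together.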
Think/No-think tasks modulated the (undirected) connectivity between hippocampus and DLPFC. Here DLPFC was found, and this inference made, using a whole brain search using the method known as Psycho-Physiological Interaction (PPI). Here the statistical threshold was not set using a whole brain correction, but a region of interest centred on the DLPFC (Fig 4a). This seems fine given its expected role (from work prior to this paper) in behavioural inhibition.
To test for directed changes in connectivity the authors then used Dynamic Causal Modelling (Fig 4d,e). Subjects were split into those with high versus low hippocampal GABA. For higher hippocampal GABA subjects, the best network model was one in which DLPFC provided input and the "No-Think" task modulated connectivity between DLPFC and hippocampus. This was not the case for the low GABA group. This inference was made using the 'exceedance probability' measure, indicating which (of the tested) models was the most likely (frequently used) in the population from which the subjects were drawn. Again, the application of this methodology is sound.
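To make the exceedance probability concrete: in random-effects model comparison, model frequencies in the population are commonly described by a Dirichlet posterior, and the exceedance probability of a model is the posterior probability that it is more frequent than every other tested model. The sketch below estimates this by Monte Carlo sampling. It is only an illustration: the Dirichlet parameters are hypothetical, the function name `exceedance_probabilities` is an assumption, and it does not reproduce the software the authors used.

```python
import random


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Estimate P(model k has the largest population frequency) for each k.

    `alpha` holds the Dirichlet parameters of the posterior over model
    frequencies. A Dirichlet draw is a vector of independent Gamma(alpha_k, 1)
    variates divided by their sum; the normalisation does not change which
    entry is largest, so it is skipped here.
    """
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        draws = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


# Hypothetical posterior parameters for three candidate models (not study values).
alpha_high_gaba = [9.0, 3.0, 2.0]
xp = exceedance_probabilities(alpha_high_gaba)
for k, p in enumerate(xp, start=1):
    print(f"model {k}: exceedance probability ~ {p:.3f}")
```

The probabilities sum to one across the tested models, which is why the measure only ranks models relative to the set that was compared.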

SUMMARY
Overall, the findings suggest that GABAergic inhibition local to the hippocampus implements prefrontal control over intrusive thoughts. This finding is consistent with previous literature (e.g. reductions of the BOLD signal in hippocampus) but the additional use of MRS with fMRI nails this down to GABA. The data analyses have been conducted in an exemplary manner and clearly support the findings.
Reviewer #2 (Remarks to the Author): 30 young adults were recruited to the study. They performed a fMRI session, where they performed a Think/No-Think task, and the Stop Signal (SS) task, to investigate the relationship between GABAergic activity in the hippocampus and inhibitory control of unwanted thoughts. MRS data were acquired in a separate session from the right hippocampus, the right DLPFC and the visual cortex. The authors showed that hippocampal GABA was inversely related to BOLD signal suppression in the hippocampus in response to thought suppression. Appropriate controls were performed. This is an interesting study, which uses complex methodology, and was clearly performed with care and thought. The manuscript is well-written and guides the reader through the data well. However, I am somewhat unconvinced by the specificity of the results, as discussed in detail below. In addition, MRS of the hippocampus is difficult and while the authors acknowledge this and have provided some data to reassure the reader of the quality of their spectra it is currently not possible to assess the data quality fully here, making it difficult to know how to interpret their results.
Major Points
1. The authors introduce the paper in terms of a number of psychiatric conditions and then test their hypotheses on healthy controls. It is not immediately clear to me that the mechanisms underlying the inhibition of intrusive thoughts in psychiatric disorders are the same as the mechanisms underlying the instructed inhibition of thoughts in healthy controls.
2. Quantification of GABA from the hippocampus is difficult. I am reassured by the line-width reliability across the 3 voxels, but it would be useful to have some values for the fit for the GABA per se for all three voxels to determine the reliability of the measures.
3. I am not convinced that the hippocampal GABA measure is "specific". The authors say that other "difficult non-memory tasks sometimes also reduce hippocampal activity" but this was not the case with the control task here. I do not think that this can therefore be claimed to be specific to the task in question. This issue should either be addressed in the discussion directly, and the interpretation amended accordingly, or a control experiment performed.
4. The hippocampal GABA levels and the task data were acquired on 2 different days. What assurance can the authors give that either of these measures is sufficiently stable across time to make this an appropriate analysis approach? This is a particularly important question given the
A second type of conceptual error here is presenting the GABA differences as causal, when the findings are only correlational. The authors attribute a causal role to bulk measures of HC GABA (as measured by JPRESS) when they say "GABA enables," "depends on GABA", "GABA alters", "low GABA compromises", "GABA influences." In fact, the study provides evidence for associations with bulk measures of HC GABA. Speculations about causal relationships should be minimized and clearly framed as speculation or hypotheses for future testing.
The manuscript's title incorporates both of these misleading conceptual frames, using the terms "GABA enables" and "unwanted thoughts." In contrast, the study actually shows evidence that HC GABA is associated with volitional memory suppression.
Overstating and overgeneralizing the findings occurs in many places in the text. For example, line 10 states "In so doing, we isolate a fundamental mechanism enabling inhibitory control over thought: GABAergic inhibition of hippocampal activity." "Isolate a fundamental mechanism" is much too strong a phrase, "enabling" is speculative, and "control over thought" is much too general to associate with HC GABA based on this study.

Correlations
Line 202 states "Because the robust and partial correlation analyses yielded similar conclusions, we focus on the partial correlations for simplicity." The reader assumes that a study principally aiming to examine the association between HC BOLD and HC GABA would have an a priori statistical plan for testing this association. If so, which of these two approaches to correlation analysis was chosen a priori? All results should be reported using the a priori method, with secondary comments on the convergence or divergence of results found with an alternate method.
As written, there is a confusing intermixing of robust and partial correlation approaches. For example, line 202 suggests that the results of the partial correlation analyses are presented in the main paper. However, line 214 indicates that CI are used for testing significance and cites the papers on robust correlations. This suggests that robust correlations are being reported for these comparisons. Again on line 353, the citations for robust correlations are given in a context where they appear to be reporting partial correlations.
In addition, there is a lack of consistency in how correlations are applied in the manuscript. Sometimes the authors provide direct comparisons between correlations, and sometimes they don't. For example, the authors report that HC BOLD-GABA correlations are significant during memory task components and not significant during motor task components. They interpret this as a selective finding, but they omit direct comparison of the correlations across tasks. However, the authors include a direct comparison between correlations for a different contrast on lines 241-244. Sometimes they include the GO condition BOLD responses as covariates in relevant analyses (line 240-1) and sometimes they don't (lines 224). The result is the appearance of selectively focusing on findings that support their model and not making sincere attempts to challenge or disprove the model. The relatively low power of the key contrasts (N=18) may have a role in this selective reporting.
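One way such a direct comparison could be made is to bootstrap the difference between the two correlations, resampling subjects so that the dependence between them (both share the GABA measure) is preserved. The sketch below is illustrative only: the data are synthetic, the names `pearson_r` and `bootstrap_r_difference` are hypothetical, and this is not a method reported in the manuscript.

```python
import math
import random


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_r_difference(gaba, bold_a, bold_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for r(gaba, bold_a) - r(gaba, bold_b), resampling subjects."""
    rng = random.Random(seed)
    n = len(gaba)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gaba[i] for i in idx]
        a = [bold_a[i] for i in idx]
        b = [bold_b[i] for i in idx]
        if min(len(set(g)), len(set(a)), len(set(b))) < 2:
            continue  # skip degenerate resamples with zero variance
        diffs.append(pearson_r(g, a) - pearson_r(g, b))
    diffs.sort()
    return diffs[int((alpha / 2) * n_boot)], diffs[int((1 - alpha / 2) * n_boot) - 1]


# Synthetic illustration only: one condition related to "GABA", one unrelated.
rng = random.Random(2)
gaba = [rng.gauss(0.0, 1.0) for _ in range(18)]
bold_memory = [-0.6 * g + rng.gauss(0.0, 0.8) for g in gaba]
bold_motor = [rng.gauss(0.0, 1.0) for _ in gaba]

d_lo, d_hi = bootstrap_r_difference(gaba, bold_memory, bold_motor)
print(f"95% CI for the difference in r: [{d_lo:.2f}, {d_hi:.2f}]")
```

A claim of task selectivity would be supported if this interval excluded zero, rather than by one correlation being significant and the other not.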
What type of robust correlation was used? Was it bend, skipped, or some other? If bend, what percentage was used? If skipped, how many outliers were removed? For all statistical results, it is necessary to include either df or N.

MRS
The authors state that good shims were obtained in the HC voxel for 18 of the 24 participants. It is necessary to state whether a specific line width threshold was used for exclusion of spectra, and if so, what threshold was used. It appears that the mean (s.e.) of the linewidth is presented. Please present the mean (s.d.). The authors state that 4 voxels were excluded for lipid contamination. Please clarify how many were excluded from each voxel location.
The authors helpfully teach the reader that 2D-JPRESS offers some advantages over PRESS in regions of high inhomogeneity, like the HC. However, they fail to mention an apparent disadvantage of the 2D-JPRESS method when compared to the more commonly used MEGA-PRESS approach. Specifically, it appears from the cited JPRESS studies that the reliability of GABA/Cr measurements is considerably less with JPRESS than is typically reported for MEGA-PRESS. Given the HC target location, and the appearance of valid GABA measurements, this is not a criticism of the choice to use JPRESS. However, for readers familiar with MEGA-PRESS, the apparently lower reliability of the JPRESS approach should be mentioned among the limitations of the study.
The issue of the stability of HC GABA measurements is particularly relevant in the current study because the interval between the BOLD measures and the GABA measures with which they were correlated was relatively long (mean = 13 days). It is essential to also report the range of interval days. Are the authors aware of any data on the stability of MRS GABA measures in HC or other regions across intervals in the range occurring in this study? Even the mean value (13 days) is quite long, and this aspect of the design represents a limitation of the study that should be acknowledged.
If estimates of glutamate content and gray matter fraction are to be used as covariates, then the mean (s.d.) of these measurements must be reported. Since these are inherently noisy measurements, the reader will want to see some information about their distribution.
There is some confusion in the supplement about the duration of the MRS acquisitions. On line 145, it states "TR/TE=2400/31-229ms, DTE=2ms, 4 signal averages per TE step … yielding a total acquisition time of 13 min 28 sec." The math doesn't seem to add up. I get a total of 16 minutes for this acquisition. Similarly on line 153 it states "In addition, water unsuppressed 2D 1H MRS data were acquired from each voxel with 2 signal averages recorded for each TE step (acquisition time 3 min 28 sec)." However, I calculate a total of 8 minutes for acquisition, if there are the same number of TE steps. Please clarify.
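The arithmetic behind this query can be laid out explicitly. The sketch below assumes that the TE steps run from 31 to 229 ms inclusive in 2 ms increments (reading "DTE" as the TE increment) and that each signal average at each TE step costs exactly one TR, with no additional preparation or dummy scans; the function names are hypothetical. If the sequence departs from these assumptions, the totals would differ.

```python
def te_steps(te_first_ms, te_last_ms, delta_te_ms):
    """Number of TE increments, counting both endpoints."""
    return (te_last_ms - te_first_ms) // delta_te_ms + 1


def acquisition_seconds(tr_ms, n_steps, averages_per_step):
    """Total scan time assuming one TR per average per TE step."""
    return tr_ms * n_steps * averages_per_step / 1000.0


steps = te_steps(31, 229, 2)                         # 100 TE steps
suppressed = acquisition_seconds(2400, steps, 4)     # 4 averages per step
unsuppressed = acquisition_seconds(2400, steps, 2)   # 2 averages per step

for label, secs in [("water-suppressed", suppressed),
                    ("water-unsuppressed", unsuppressed)]:
    minutes, seconds = divmod(int(secs), 60)
    print(f"{label}: {minutes} min {seconds} s")
```

Under these assumptions the totals are 960 s (16 min) and 480 s (8 min), which matches the figures I calculate above rather than the stated 13 min 28 sec and 3 min 28 sec.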
Please clarify whether or not signal from the macromolecule multiplet at ~3.0 ppm is included in the GABA estimate from this method.
Minor point
In addition to the primary findings relating HC GABA to both HC BOLD response during suppression (negative correlation) and SIF (positive correlation), there is also a finding that HC GABA is negatively correlated with HC BOLD during retrieval. In fact, the correlation is stronger for retrieval than for suppression. The authors address this in a reasonable way. However, they may be missing an opportunity to clarify a parsimonious view of why both findings emerge. The authors correctly point out that bulk tissue GABA measurements in brain cannot distinguish between the various compartments in which the GABA is located. In fact, the great majority of GABA in HC and cortex is located in the cytoplasm of GABAergic interneurons. Cytoplasmic GABA serves, in part, as a reservoir both for the filling of synaptic vesicles with GABA and for extrasynaptic GABA release (as in tonic inhibition). Thus, some have argued that MRS GABA reflects the capacity for GABA-mediated effects during times of high demand (e.g. during tasks). It is quite possible that both retrieval and suppression evoke and depend on an increase in HC GABA-mediated effects. If so, then the BOLD response during both task components could be negatively associated with the bulk tissue GABA content in HC as measured by MRS. The association with bulk GABA does not distinguish between the specific GABA-mediated effects involved in the different tasks.
Reviewer #4 (Remarks to the Author): Schmitz and colleagues reported a multimodal neuroimaging study in which they investigated how hippocampal GABA contributes to suppressing unwanted thoughts, with fMRI and 1H magnetic resonance spectroscopy (MRS). During fMRI scanning, 30 participants performed an adapted Think/No-Think (TNT) task and a Stop-signal (SS) task, which were interleaved in a mixed block/event-related design. 1H MRS data were obtained on a separate day to measure GABA concentrations in three regions of interest (ROIs), including the right hippocampus, the right dorsolateral prefrontal cortex (DLPFC) and the primary visual cortex. Three major results are reported: (1) fMRI data revealed that suppression led to reduced hippocampal activation and impaired memory for suppressed memories; (2) 1H MRS data revealed that greater hippocampal GABA concentrations predicted better mnemonic control in both retrieval and suppression conditions; (3) Higher hippocampal GABA specifically predicted stronger suppression-induced negative coupling between the DLPFC and the hippocampus. The authors concluded that GABAergic inhibition local to the hippocampus plays a critical role in mediating fronto-temporal inhibitory control pathway involved in the suppression of unwanted thoughts or memories.
Overall, there are several novel and significant strengths for this well-written manuscript, particularly the use of both fMRI and 1H MRS to address an important question of how hippocampal GABA contributes to suppressing unwanted memories in humans. It would be wise to publish this novel piece of work with no delay. The experimental design was very thoughtful and well controlled, involving a TNT task interleaved with a SS task. The authors have done a good job of including control regions in 1H MRS and conducting dynamic causal modeling analysis for fMRI data. The associations of hippocampal GABA concentrations with hippocampal activation, functional coupling and dynamic causal interactions are very interesting. These findings will not only have important implications for understanding the neurobiological mechanisms underlying suppression of unwanted thoughts/memories, but also provide novel insights into the intrusive symptoms of various psychiatric disorders. Despite the above novel and potentially important aspects, I do have several suggestions (detailed below) to improve the manuscript.
Major comments: 1. In the Introduction section, the authors emphasized several aspects of diminished lateral PFC engagement in cognitive control and hippocampal hyperactivity seen in a variety of psychiatric disorders. Although they attempted to build a link of local GABAergic inter-neuron network with hippocampal hyperactivity, it is still not that clear about the logic of how hippocampal local GABA actually modulates long-range PFC region(s) thought to drive top-down control over unwanted thoughts or memories. This point should be better framed to aid readers. For instance, the author may want to clarify this point by building up more thoughtful arguments about potential GABA neuromodulatory pathways acting on long-range PFC regions.
2. Another point related to above, the authors may want to point out how tonic hippocampal GABA network functioning may actually modulate their observed phasic hippocampal BOLD signals/activity and functional coupling with the DLPFC in their current fMRI study. This way may be helpful for readers to better understand the link of tonic high/low GABA concentrations with their observed effects on both behavioral and neuroimaging levels.
As they introduced that tonically disinhibiting GABAergic interneuron networks in the hippocampus has been linked to desynchronized hippocampal rhythms, reduced overall activity and impaired memory performance (line 41-42), one would thus expect to see an overall reduction pattern in hippocampal BOLD activity between high versus low hippocampal GABA groups. It would be great if the authors could look into their fMRI data about this point.
Did the authors collect resting state fMRI data? It would be great to verify whether hippocampal GABA is tonically related to task-free intrinsic hippocampal activity and intrinsic hippocampal-DLPFC connectivity at a resting rather than an active task state.
3. The central findings in this study are that hippocampal GABA levels were predictive of not only suppression-induced forgetting, but also BOLD hippocampal activity and connectivity as well as hippocampal-DLPFC dynamic causal interactions. Unfortunately, the authors did not report whether there was any potential difference in the memory acquisition phase between high versus low hippocampal GABA groups. Based on the above concern in Comment 2, one would expect that tonic hippocampal GABA concentrations might contribute to hippocampal-dependent memory processing not only during the suppression phase but also during the acquisition phase. This point is also somewhat in line with their observed correlation with general memory performance regardless of Think/No-Think trials. It would be relevant to see any potential difference in memory performance between high versus low GABA groups during the training phase. They may simply compare training time and memory performance during the TNT training phase between the two groups.
4. In the training phase, participants were trained only to reach a learning criterion of at least 40% for the critical memories on the Think/No-Think task. What is the mean rate across participants? How large are the individual differences after this training procedure? In reality, there must be some participants reaching higher or lower than average. It is unclear whether this potential variance was taken into account in their analyses of fMRI data and 1H MRS data.
5. The authors have done a good job of analyzing hippocampal-DLPFC dynamic causal interactions and their links to local hippocampal GABA concentrations. This analytic approach looks only into hippocampal-DLPFC neural pathways while ignoring other potentially important neural pathways. As the authors have noted in the Introduction section, suppression of unwanted thoughts is most likely to be carried out through polysynaptic pathways from the DLPFC that down-regulate hippocampal activity. The authors may want to point out this limitation in their manuscript.
6. The authors reported significant correlations of hippocampal GABA with suppression-induced forgetting, hippocampal activity and hippocampal-DLPFC functional coupling. It would be interesting to know whether there is any reliable moderating relationship among GABA, brain activity/functional coupling and memory performance. In other words, they may also want to consider GABA-brain-behavior moderation analysis (e.g., https://github.com/canlab/MediationToolbox) on the whole brain activity and hippocampal-based connectivity. This approach may provide some complementary data to illustrate other possible modulatory pathways on the whole brain level.
7. For suppression-induced hippocampal BOLD activity, did the authors only look into No-Think trials regardless of subsequent memory status (i.e., later remembered or forgotten)? If memory status was considered, how did they differ while linking to hippocampal GABA concentrations? These data may be helpful to better understand the link of hippocampal GABA with suppression-induced forgetting and corresponding neural activity.
8. In the Methods section, there appears to be no description of 1H MRS data acquisition and analysis, fMRI functional connectivity analyses, or dynamic causal modeling analyses. I would encourage the authors to include these parts in the Methods.
Minor comments:
9. More details are needed to aid readers about how regional GABA concentrations were computed for each ROI. For instance, it appears that the three ROIs show quite different profiles for their frequency distribution of observed GABA concentrations in each voxel. How are the overall GABA concentrations then computed for each ROI?
10. In Figure legend S1, I believe that "sagittal and axial slices" should be "sagittal and coronal slices".
11. On line 532: In the fMRI analysis section on line 589-690, the authors wrote "Each model included within-session global scaling (default)". Please clarify whether this is the same as "global intensity normalization" implemented in SPM or not.
12. In the Supplemental Materials, it is unclear what the abbreviations of "SP and IP" on line 205 stand for.

Reviewer 1
Reviewer Comment 1.1: This paper uses functional MRI (fMRI) and Magnetic Resonance Spectroscopy (MRS) to study the brain mechanisms underlying control over unwanted thoughts. Many tests were made so as to infer that these mechanisms were specific both to the brain regions involved (e.g. hippocampus rather than primary motor/visual cortex) and what was being controlled (e.g. thoughts rather than actions). I find the paper to be impressive in the quality with which a broad range of techniques have been used and harnessed together to answer an important question (how does the brain suppress unwanted thoughts?), a question of relevance to many psychiatric disorders.
Author Response 1.1: We greatly appreciate reviewer 1's positive response.

Reviewer Comment 1.2:
In what follows I will review each section of the results, covering the main findings, and focussing on methodology: (1). Thought suppression engages a functionally specific hippocampal pathway The GLM-based mass univariate analysis of the fMRI data (Fig 1) used a correction for multiple comparisons of cluster-level inferences using high cluster forming thresholds (p<0.001) -see (2) Hippocampal GABA predicts (i) reduced BOLD and (ii) successful thought suppression (i) Hippocampal GABA predicted hippocampal BOLD response during Think and No-Think tasks (more GABA, less BOLD) but not during Go or No-Go tasks (Actions). DLPFC and visual cortical GABA did not make these predictions. These inferences were made using correlations over subjects with bootstrapped confidence intervals.
Author Response 1.5: Yes this summary of our findings and methods is accurate.
Reviewer Comment 1.6: (ii) Hippocampal GABA predicted 'suppression induced forgetting' (impairment of later memory for suppressed items). Again inferences were made using correlations over subjects with bootstrapped confidence intervals. Looks fine.
Author Response 1.6: Yes this summary is correct.
Reviewer Comment 1.7: Think/No-think tasks modulated the (undirected) connectivity between hippocampus and DLPFC. Here DLPFC was found, and this inference made, using a whole brain search using the method known as Psycho-Physiological Interaction (PPI). Here the statistical threshold was not set using a whole brain correction, but a region of interest centred on the DLPFC (Fig 4a). This seems fine given its expected role (from work prior to this paper) in behavioural inhibition.
To test for directed changes in connectivity the authors then used Dynamic Causal Modelling (Fig 4d,e). Subjects were split into those with high versus low hippocampal GABA. For higher hippocampal GABA subjects, the best network model was one in which DLPFC provided input and the "No-Think" task modulated connectivity between DLPFC and hippocampus. This was not the case for the low GABA group. This inference was made using the 'exceedance probability' measure, indicating which (of the tested) models was the most likely (frequently used) in the population from which the subjects were drawn. Again, the application of this methodology is sound.
Author Response 1.7: Yes, this summary is accurate. Thank you for the feedback on the appropriateness of our methods.

Reviewer Comment 1.8. SUMMARY
Overall, the findings suggest that GABAergic inhibition local to the hippocampus implements prefrontal control over intrusive thoughts. This finding is consistent with previous literature (e.g. reductions of the BOLD signal in hippocampus) but the additional use of MRS with fMRI nails this down to GABA. The data analyses have been conducted in an exemplary manner and clearly support the findings.
Author Response 1.8: We greatly appreciate reviewer 1's positive response to the work and thank them for their efforts. We hope our responses to their comments are satisfactory.

Reviewer 2
Reviewer Comment 2.1: 30 young adults were recruited to the study. They performed a fMRI session, where they performed a Think/No-Think task, and the Stop Signal (SS) task, to investigate the relationship between GABAergic activity in the hippocampus and inhibitory control of unwanted thoughts. MRS data were acquired in a separate session from the right hippocampus, the right DLPFC and the visual cortex. The authors showed that hippocampal GABA was inversely related to BOLD signal suppression in the hippocampus in response to thought suppression. Appropriate controls were performed. This is an interesting study, which uses complex methodology, and was clearly performed with care and thought. The manuscript is well-written and guides the reader through the data well.
Author Response 2.1: We thank the reviewer for their nice remarks about the work.
Reviewer Comment 2.2: However, I am somewhat unconvinced by the specificity of the results, as discussed in detail below. In addition, MRS of the hippocampus is difficult and while the authors acknowledge this and have provided some data to reassure the reader of the quality of their spectra it is currently not possible to assess the data quality fully here, making it difficult to know how to interpret their results.

Author Response 2.2:
We address these concerns below, where they are further elaborated.

Reviewer Comment 2.3: Major Points
1. The authors introduce the paper in terms of a number of psychiatric conditions and then test their hypotheses on healthy controls. It is not immediately clear to me that the mechanisms underlying the inhibition of intrusive thoughts in psychiatric disorders are the same as the mechanisms underlying the instructed inhibition of thoughts in healthy controls.

Author Response 2.3:
The reviewer correctly notes that the current design did not study psychiatric populations, but rather healthy adults; as such, our conclusions do not directly relate to psychiatric populations. Indeed, our main goal was to understand the thought suppression mechanism as it normally operates as a way to highlight what might go wrong in some mental disorders.
Given the above, the main question is whether our experimental model of thought control is relevant to the control of intrusive thoughts in daily life. If the answer depended only on the current study, it might not be clear. Fortunately, there is much more data to go on, about which the reviewer may not be aware.  2015) measured suppression-induced forgetting on the task used here, and also collected EEG. They then exposed participants to a traumatic video clip depicting an event that people find distressing. Over the next week, the participants kept diaries of intrusive thoughts about the film. After a week, they completed a clinical instrument (the Impact of Events Scale), which measures intrusion symptoms. Participants' success at suppressing retrieval during the task (i.e. suppression-induced forgetting and also the N2 ERP component) predicted the frequency and distress of trauma-film related intrusions, and people's PTSD score on clinical scales.
It is also worth noting the diversity of stimuli used. The foregoing designs have used simple word pairs, face-scene pairs, word-object pairs, word-line drawing pairs, and even, in some cases, people's own autobiographical memories. Both neutral and emotionally negative contents have been used as well. In general, these various materials consistently identify a common pathway involving the right DLPFC and the down-regulation of hippocampal activity.
So, empirical data exist that permit confidence in the generality of the processes we are measuring, and that suggest their clinical relevance.
Author Action Taken 2.3: The reviewer's comment gave us a clear appreciation that, in our effort to be economical in our presentation, we might have failed in our job of communicating the depth of support for our experimental model. If unaddressed, this would be a significant problem because some readers might have the same response. We therefore revised the manuscript to more fully articulate the evidence base supporting the relevance of this model (see e.g., Lines 80-93).
In addition, we further addressed the reviewer's concern by introducing a new paragraph in the final discussion that explicitly discusses the issue of generalization for readers to consider (see Lines 551-574). This paragraph acknowledges the limitation of using emotionally neutral word pairs while also making the case that the existing literature supports the potential relevance of this work. We thank the reviewer for highlighting this shortcoming of our exposition, which enabled us to strengthen our case.
Reviewer Comment 2.4: 2. Quantification of GABA from the hippocampus is difficult. I am reassured by the line-width reliability across the 3 voxels, but it would be useful to have some values for the fit for the GABA per se for all three voxels to determine the reliability of the measures.

Author Response 2.4:
The spectral fitting methods used in this study enable the estimation of metabolite peak amplitudes, but it is not possible to directly estimate the uncertainty and reproducibility of these peaks without performing repeated measurements, which we were not able to do given the already long acquisition times required. Alternatively, a common metric used to estimate the uncertainty on a metabolite measurement is the Cramér-Rao Lower Bound (CRLB) of variance. CRLB cut-off thresholds of 20-50% are widely used as metabolite rejection criteria in the literature. However, it should be noted that there are limitations to the interpretability of CRLB values, and a low CRLB does not guarantee an accurate or reproducible metabolite concentration and vice-versa (Kreis and Boesch 2003, and Kreis 2004). Therefore other quality control measures should be used alongside CRLB, including line-width and inspection of residual plots, both of which were considered in this study.
Quality control criteria using line-width were included in the original submission, where the mean line-width for the hippocampal ROIs was shown to be comparable to that of the other two ROIs. Visual inspection of the raw 2D spectra and post-fitting residual plots revealed lipid contamination in four ROIs (three in the visual cortex and one in DLPFC). No unexplained features were identified in the residual plots for any of the hippocampal ROIs.
In the table below we provide the mean Cramér-Rao Lower Bound (CRLB) values (±standard error of the mean) for GABA for each of the three voxels. The CRLB values were higher in hippocampus due to the location of the voxel in an area of the brain with lower signal-to-noise ratio (SNR), a direct consequence of increased B0 susceptibility. In addition, to ensure specificity of the measurements, the hippocampal ROI was also smaller than the other two volumes (see Figures 2 and S1 Methods part IV.2.a), which also results in decreased SNR. However, we elected to include subjects that passed the quality assurance screening on our other metrics (N=18), which included the line-widths obtained from the higher-order shims, and visual inspection of the fit and residuals in the spectral plots produced for each voxel and subject.
The second reason is that the relationships reported in this manuscript were relatively unaffected when subjects were weighted according to their GABA CRLB values. Specifically, under the assumption that higher CRLB reflects lower quality data, we used weighted least squares regression to give each data point its proper amount of influence over the parameter estimates. Each subject was therefore precisely weighted by subtracting their CRLB from a constant value, ensuring all weights were positive values. Individuals with higher CRLB values therefore contributed proportionally smaller weights to the model. Below we show the standardized coefficients (betas) for the primary relationships demonstrated with hippocampal GABA, in linear regression models with and without the CRLB weighting. Of the observed significant relationships with hippocampal GABA, inference on only one relationship was affected by weighting with CRLB (NT GABA/HIP BOLD). The actual magnitude of this effect on the Beta value was, however, quite small. In general, the weighted least squares regression analyses demonstrate that our relationships did not change substantially when carefully adjusting the amount of influence of each datapoint over the parameter estimates according to CRLB.
Finally, a third reason not to exclude subjects according to CRLB comes from a recent review paper by Roland Kreis (Kreis, 2016), "The Trouble With Quality Filtering Based on Relative Cramér-Rao Lower Bounds". In this paper, Kreis argues that removal of 1H MRS data based on CRLB cut-points introduces selection biases into the data, and inflates Type II error. Kreis further concluded that "CRLB should not be used to eliminate bad MRS data - certainly not as sole criterion - because they may just reflect low levels of the measured quantity". In that paper, this point is illustrated by showing that rejection of subjects on the basis of a fixed CRLB threshold can lead to biases between a patient population and healthy controls, but we would argue that the same holds when comparing ROIs of different sizes and different brain areas affected by different artefacts.
Author Action Taken 2.4: We now describe the weighted least squares regression of CRLB, as well as a separate weighted least squares regression assessing the impact of line widths (Hz) of the hippocampal voxel, in the main text MRS methods, and report the results of both the unweighted and weighted regression models in the supplemental information.
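To make the weighting scheme described in Response 2.4 concrete, the following is a minimal sketch of a one-predictor weighted least squares fit in which each subject's weight is a constant minus their CRLB. The function names, the choice of constant (max CRLB + 1) and the numbers are illustrative assumptions and are not taken from the manuscript; the sketch also returns raw rather than standardized coefficients.

```python
from typing import List, Optional, Sequence, Tuple


def crlb_weights(crlb: Sequence[float], constant: Optional[float] = None) -> List[float]:
    """Weight = constant - CRLB, so a higher CRLB (assumed noisier) counts less.

    The constant is not given in the response; max(CRLB) + 1 is an assumption
    chosen only to keep every weight strictly positive.
    """
    if constant is None:
        constant = max(crlb) + 1.0
    weights = [constant - c for c in crlb]
    if min(weights) <= 0:
        raise ValueError("constant must exceed every CRLB value")
    return weights


def wls_fit(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) minimising sum_i w_i * (y_i - a - b * x_i) ** 2."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical values for illustration only, not data from the study.
gaba = [1.10, 1.35, 0.95, 1.50, 1.20]
bold = [-0.20, -0.45, -0.05, -0.60, -0.30]
crlb = [18.0, 25.0, 40.0, 22.0, 30.0]

unweighted = wls_fit(gaba, bold, [1.0] * len(gaba))  # equal weights = ordinary least squares
weighted = wls_fit(gaba, bold, crlb_weights(crlb))
print("unweighted (intercept, slope):", unweighted)
print("CRLB-weighted (intercept, slope):", weighted)
```

Z-scoring both variables before fitting would yield standardized coefficients of the kind reported in the response; comparing the weighted and unweighted slopes is the robustness check of interest.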
Reviewer Comment 2.5: 3. I am not convinced that the hippocampal GABA measure is "specific". The authors say that other "difficult non-memory tasks sometimes also reduce hippocampal activity" but this was not the case with the control task here. I do not think that this can be therefore claimed to be specific to the task in question. This issue should either be addressed in the discussion directly, and the interpretation amended accordingly, or a control experiment performed.

Author Response 2.5:
The reviewer is correct to note that our specificity claim rests on a juxtaposition of our GABA/suppression finding to other difficult (non-suppression) tasks that also reduce hippocampal activity. Fortunately, we included the motor response inhibition task, in part, because it is exactly the sort of difficult task we had in mind. After running a one-sample t-test on the simple effect [Stop - Go], we can reassure the reviewer that our motor response inhibition task reliably reduced hippocampal activity, although this effect was smaller than that observed for thought suppression. This is something that should have been included in the original submission, and the reviewer is correct to point it out.
Is this motor-stopping related reduction in hippocampal BOLD related to hippocampal GABA as well?
Can a participant's tendency for difficult tasks to reduce hippocampal activity explain our findings? We show that this motor-stopping task-induced reduction in BOLD is (a) uncorrelated with hippocampal GABA and (b) does not explain the significant relationship between GABA and BOLD during the retrieval suppression task. Indeed, we observed no change in the relationship between hippocampal GABA and hippocampal activity during the retrieval suppression task, even when we controlled for reductions in activity in that structure during motor stopping in a partial correlation analysis. Moreover, the selectivity of this relationship of hippocampal GABA to memory function extends to the behavioural level as well. We observed that the relationship between hippocampal GABA and memory inhibition performance (SIF), if anything, is improved (see line 367) when we controlled for motor stopping performance (SSRT) in a partial correlation analysis.
Author Action Taken 2.5: To address the reviewer's comment, we now report the reduction in hippocampal activity during motor inhibition at lines 278-284, in the section exploring the functional specificity of relationships between hippocampal GABA and hippocampal BOLD response in the Think/No-Think and Stop-signal tasks. This finding adds force to the evidence that follows, establishing the specificity of our relationship of hippocampal GABA to BOLD response during the No-Think and Think conditions. We thank the reviewer for this suggestion, as it tightens our case.
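The partial correlation logic used in Response 2.5 can be sketched as follows: the correlation between two measures (e.g. hippocampal GABA and suppression-related BOLD) is recomputed after removing what each shares with a third (e.g. BOLD reduction during motor stopping). This is a first-order partial correlation written with the standard textbook formula; the variable names and values are hypothetical and do not reproduce the authors' analysis pipeline.

```python
import math
from typing import Sequence


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


def partial_r(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after controlling for a single covariate z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical values for illustration only, not data from the study.
gaba = [1.10, 1.35, 0.95, 1.50, 1.20, 1.05]
bold_nothink = [-0.20, -0.45, -0.05, -0.60, -0.30, -0.15]
bold_stop = [-0.10, -0.05, -0.12, -0.08, -0.02, -0.11]

print("zero-order r:", pearson_r(gaba, bold_nothink))
print("partial r (controlling for motor stopping):", partial_r(gaba, bold_nothink, bold_stop))
```

If the partial coefficient stays close to the zero-order coefficient, the covariate does not account for the relationship, which is the pattern the response describes.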
Reviewer Comment 2.6: 4. The hippocampal GABA levels and the task data were acquired on 2 different days. What assurance can the authors give that either of these measures is sufficiently stable across time to make this an appropriate analysis approach? This is a particularly important question given the gender split: GABA is thought (though not definitively shown) to vary with the menstrual cycle.

Author Response 2.6:
[...] (Near et al., 2014). Indeed, the observed magnitude of intra-subject variability was approximately the same as in longitudinal 1H MRS studies conducted at much shorter intervals, indicating that the majority of variance between timepoints arises from measurement error. These findings indicate that 1H MRS indices of GABA, in cognitively normal adults, reflect stable biological traits. Our decision to acquire fMRI and MRS in separate sessions also reflects a deliberate strategy to maximise data quality: in piloting the study, we found that the long acquisition times required to acquire MRS in multiple voxels (~1 hour) led to participant fatigue and discomfort and to a reduction in data quality (e.g. head motion) when combined with the fMRI acquisitions in a single session.
Nevertheless, we assessed whether the relationships reported in this manuscript were affected when subjects were weighted according to their inter-scan interval. Specifically, under the assumption that longer intervals reflect lower quality data, we used weighted least squares regression to give each data point its proper amount of influence over the parameter estimates. Each subject was precisely weighted by the number of days between the fMRI and MRS acquisition, by subtracting this interval from a constant to ensure positive values. Individuals with longer intervals therefore contributed proportionally smaller weights to the model. Below we show the standardized coefficients (betas) for the primary relationships demonstrated with hippocampal GABA, in linear regression models with and without Interval weighting. Of the observed significant relationships with hippocampal GABA, none were affected by weighting with Interval; in fact most were slightly improved. The weighted least squares regression analyses therefore demonstrate that our relationships did not change substantially when carefully adjusting the amount of influence of each datapoint over the parameter estimates according to Interval between the fMRI and MRS scans.

[Table: standardized coefficients (betas) for the unweighted and Interval-weighted regression models]
Author Action Taken 2.6: To address the reviewer's comments, we revised the manuscript in several ways. First, we now include in the main methods text additional information about our pre-scan screening form on line 632, which instructed participants to refrain from alcohol or other psychoactive drugs in the 24-hour period prior to the scan. Participants were also screened for medical history indicators, such as history with psychotropic medications, prior experience with mental health issues, or head injury. We did not, however, collect information from our female participants concerning the point they were at in their menstrual cycles. Finally, we also now cite the Near et al. (2014) paper demonstrating the longitudinal reliability of GABA (see Lines 200-201) and describe the weighted least squares regression of Interval in the main text MRS methods and report the above table in the supplemental results.

Reviewer Comment 2.7:
If I understand correctly the subjects were trained on the tasks prior to any of the imaging. Could it not therefore be the case that the levels of hippocampal GABA here reflect how well subjects were able to learn how to perform this task, rather than reflecting the ability to inhibit thoughts per se?
Author Response and Action Taken 2.7: In principle, yes, the reviewer could be right. The data suggest, however, that this is unlikely to be a concern. First, as reported in Table S1 in our supplement, there were no differences in memory performance on the word pairs at the end of the training phase (immediately before fMRI scans were acquired) across our Low and High GABA groups (t = 0.63, p = 0.54). Indeed, the correlation between GABA measurements and this index of initial word pair learning was not significant, r = -0.095, 95% CI: [-0.4994, 0.4075]. More generally, as can be seen in Table S1 in the supplement, the two groups showed nearly identical performance on various measures from the stop-signal reaction time task, suggesting that on both memory and motor measures, the groups were comparable in their ability to learn and perform tasks. Given these observations, our data point to a specific relationship between suppression-induced forgetting and hippocampal GABA, not to the broad ability to learn the materials needed to do the task or to general features of participant performance.
Author Action Taken 2.7: The answer to the reviewer's question seems like it would be of interest to readers. To report the relevant findings, we have now inserted a new sentence at Lines 357-359, in which we report that there was no correlation between HC GABA and initial memory performance. In this sentence, we also steer readers more directly to Table S1 for further exploration of how GABA might relate to performance measures in general.
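Response 2.7 reports a correlation together with a 95% confidence interval but does not state how the interval was computed. As an illustration of one common approach, the sketch below computes a percentile bootstrap interval for a Pearson correlation; the method, the function names, the number of resamples and the data are all assumptions made for this example.

```python
import math
import random
from typing import Sequence, Tuple


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)


def bootstrap_ci(x: Sequence[float], y: Sequence[float], n_boot: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for r, resampling subjects with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # Skip degenerate resamples in which either variable has zero variance.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        stats.append(pearson_r(xs, ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical values for illustration only, not data from the study.
gaba = [1.10, 1.35, 0.95, 1.50, 1.20, 1.05, 1.40, 1.25]
learning = [0.82, 0.79, 0.85, 0.80, 0.88, 0.77, 0.84, 0.81]

print("r:", pearson_r(gaba, learning))
print("95% bootstrap CI:", bootstrap_ci(gaba, learning))
```

An interval that spans zero, as in the response, is consistent with no reliable relationship between GABA and initial learning, although with a small sample it cannot rule out a modest one.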

Reviewer Comment 2.8:
There were relationships between hippocampal GABA and BOLD signal here and elsewhere, but the BOLD signal is complex and not well understood. Were any behavioural relationships demonstrated, either with GABA or BOLD? This would be very useful to understand the importance of this relationship.

Author Response 2.8:
We agree with the reviewer that relationships to behaviour are helpful in understanding the data. We did observe a relationship between hippocampal BOLD and memory inhibition performance (suppression-induced forgetting; SIF). This is reported on lines 105-107 of the current manuscript. We also observed a relationship between hippocampal GABA and memory inhibition performance (SIF). This is reported on line 350 and in Table 1c. These functional relationships show that the ability to down-regulate a thought (as estimated from suppression-induced forgetting) is indeed linked to hippocampal down-regulation during suppression, and to hippocampal GABA, in line with the hypothesis.
Reviewer Comment 2.9: The MRS methods are not given in the main body of the manuscript. Given the detail in which the behavioural and fMRI acquisitions are described this seems like an odd omission from the main text, particularly as the behavioural and fMRI acquisition and analysis are relatively standard while the MRS is certainly not.
Author Response 2.9: The reviewer raises a very good point. Too much of the spectroscopy methodology was relegated to the supplemental section in our original submission.
Author Action Taken 2.9: We have revised the manuscript to strike a better balance between the fMRI and spectroscopy methodology, particularly the post-processing steps. We have now added basic information about acquisition sequences used for both the fMRI and MRS data to the main body text Methods section (Lines 766-774 and Lines 803-818). We have also now added information about the covariates used for each ROI (glutamate and grey matter) and descriptions of the various additional control analyses, e.g. weighted least squares regression (using CRLB, line width, and Interval). See lines 172-224 and supplemental results.
Reviewer Comment 2.10: Minor Points 1. I am not sure that the bins in the histograms in figure 2 are informative - smaller bins would give a more detailed distribution that would be more informative to the reader.

Author Response and Action Taken 2.10:
We agree that the distributions in figure 2 may obfuscate the data somewhat. We have removed these plots and replaced them with simple numerical descriptions of the distributions (mean ± standard deviation), which are more precise and easier to compare between regions (see lines 215-217).
Reviewer Comment 2.11: 2. Many readers will not be familiar with 2D MRS -it would be useful if the authors could expand the figure legend to figure 2 to explain the figures to the non-expert.

Author Response 2.11:
We agree that more information should be given about the 2D plot, especially so that non-experts can understand the report better.
Author Action Taken 2.11: We have simplified the figure legend and clarified its components (lines 228-239). We further have attempted to improve Figure 2 itself to better visually capture how the metabolite concentrations are estimated from model fitting.

Reviewer 3
Reviewer Comment 3.1: The authors present intriguing evidence suggesting an association between hippocampal GABA content and suppression of memory retrieval in a paired associates task. The manuscript has many strengths, including the memory suppression paradigm, the fMRI approach, and the neural circuit models. In addition, the approach to measuring hippocampal GABA is commendable and uncommon thus far in the literature.
Author Response 3.1: We greatly appreciate reviewer 3's encouraging feedback.
Reviewer Comment 3.2: However, there are significant problems with the overall conceptual framework and with the approach to correlational analyses used in support of the principal aims, as well as some concerns about the GABA measures. Until these issues are addressed, it is difficult to assess the overall impact of the work.

Author Response 3.2:
We address these concerns below, where they are further elaborated.

Reviewer Comment 3.3: Conceptual Framework
The authors present potentially important evidence suggesting an association between hippocampal GABA content and suppression of memory retrieval in a paired associates task. However, the conceptual framework offered in the introduction and discussion focuses primarily on cognitive and clinical phenomena that lack a clear relationship to suppression of paired associate retrieval. Word retrieval is generalized here to represent "thinking," "thought," "intrusive thoughts," "intrusive symptomatology" and "awareness." Relatedly, retrieval suppression is conceptualized as "thought suppression," "suppression of intrusive memories," and "control of awareness." The Oxford dictionary's first definition of thought is "An idea or opinion produced by thinking, or occurring suddenly in the mind." It is true that paired associate word retrieval could be considered a simple subtype of thinking, but it is not generally considered a valid proxy for the complex, clinically relevant thought processes discussed at length in the paper. For example, the words "thought," "think" or "thinking" are found 155 times in the manuscript, but only once in the list of references. The HC BOLD and GABA findings pertaining to retrieval and retrieval suppression are important on their own. However, these findings do not permit generalization to cognitively and phenomenologically distinct and more complex processes such as pathological worry, rumination, obsession and hallucination.

Author Response 3.3:
We can understand why the reviewer might suspect that the cognitive and clinical phenomena of interest in this paper may lack a clear relationship to the suppression of paired associate retrieval, and why our findings might not permit generalization to complex processes like rumination, pathological worry, obsession and hallucination. Indeed, if we were in the reviewer's position and this were the only study we were focusing on, we might also share this view. Data exist, however, that support the relevance of this experimental model to the processes of interest here, and we apologize to the reviewer for not doing a better job of presenting this background in the paper.
Prior work with the current Think/No-Think paradigm (used in over 100 publications) supports its relevance as a model of the suppression of intrusive thoughts and memories in clinical samples. This evidence documents the generality of the phenomenon and its mechanisms and their relationship to clinical phenomena. First we summarise this evidence, and then discuss the analytic considerations.

Generalizability across Materials.
Most early applications of the Think/No-Think paradigm (see, e.g. Anderson & Green, 2001) used word pairs of the sort used here. Like the reviewer, however, we also considered it important to establish the generality of the phenomenon and its cognitive and neural mechanisms. Over the last 16 years, suppression-induced forgetting (SIF) has been established with a broad variety of materials: Word-word pairs; word-scene pairs; object-scene pairs; word-line drawing pairs; face-scene pairs; face-word pairs, and word-object pairs. SIF has been found for both emotionally neutral and negative materials. Critically, SIF has been found with (a) autobiographical memories, and even (b) intrusive, person-specific worries about recurrently feared future events. In all cases, suppressing retrieval reduces the accessibility of the suppressed content, establishing a content-general phenomenon that appears relevant to complex constructs (e.g. worries).

Generalizability of the Neural Mechanism.
At present, we are aware of 16 fMRI studies using the Think/No-Think procedure, and a similar number of ERP studies. These studies suggest a fronto-hippocampal inhibitory control pathway that supports retrieval suppression in a materials-general manner. These data suggest that the pathway engaged to suppress retrieval of word pairs maps very well onto the pathway used to suppress upsetting images and memories, to the point that the very same prefrontal cortex region identified in a word pair study can be used as an a priori ROI for the analysis of an autobiographical memory study, recovering the full pattern of effective connectivity with the hippocampus. These data, which were unfortunately not highlighted in our initial submission, provide a good empirical grounding for optimism about the generalization of the current findings to a broader range of stimuli that the reviewer would consider to be more transparently relevant to clinical disorders.

Examples of Clinical Relevance:
Of course, the generality of the phenomenon and neural mechanism need not imply its clinical relevance. It is reasonable and appropriate to consider whether the foregoing mechanism may be entirely irrelevant to how people control intrusive thoughts in daily life, and may in no way be related to clinical disorders. reported a similar relationship between SIF for person-specific future worries and trait anxiety. In the latter instance, the same fronto-hippocampal network was implicated with an effective connectivity analysis (DCM) that was highly similar to the one used here. found that people's self-reports of how well they control intrusive thoughts and memories in daily life, as measured by the thought control ability questionnaire (aka, the TCAQ scale), are well predicted by suppression-induced forgetting. The TCAQ is a standard clinical scale devised to measure individual differences in the ability to control intrusive thoughts, and strongly predicts anxiety.
7. Intrusive memories of Analogue Trauma. Strebb et al. (2015) measured suppression-induced forgetting on the word-pair task used here, and also collected EEG. They then exposed participants to a traumatic video clip depicting an event that people find very distressing. Over the next week, the participants kept diaries of intrusive thoughts about the film. After a week, they completed a clinical instrument (the Impact of Events Scale), which measures intrusion symptoms. Participants' success at suppressing retrieval during the verbal paired associate task (i.e. suppression-induced forgetting and also the N2 ERP component) predicted the frequency and distress of trauma-film related intrusions, and people's PTSD score on clinical scales.

Retrieval Suppression as a Model of the Control of Intrusive Thought.
It's very easy to understand the reviewer's skepticism about accepting forgetting on an episodic memory test for paired associates as a proxy for the ability to suppress thoughts in general. Clearly human thought is not just about episodic memory, and it is not, as a general rule, reducible to something as simple as associative retrieval, especially of simple word pairs. How then, could we feel justified in making the generalization that we make in the paper about the relevance of this work to clinically relevant intrusive thoughts?
It is important to consider the fact that the Think/No-Think task doesn't model all varieties of thought. Rather, it is intended to model processes involved in perseverative thoughts that spring to mind unbidden. As the reviewer notes, the Oxford English dictionary includes thoughts that "occur suddenly in the mind". When confined to this sense of "thought", we suggest that our method credibly indexes a process shared with the ability to control perseverative thoughts, as evidenced by the clear relationships of suppression effects to intrusive symptomatology just reviewed.
But why should this be true?
Addressing this is simpler than it might seem. The perseverative nature of involuntary, intrusive thoughts renders automatic memory retrieval a natural model of this situation: If not from memory, from where would a repeated thought spring? Recurring thoughts or ruminations clearly do have a memory component, and this likely involves hippocampal activity, an idea that converges with the role of this structure in mind wandering and the "default mode".
Moreover, many intrusive thoughts are involuntary images that clinical psychologists have flagged as critical in disorders including anxiety, depression, PTSD, schizophrenia, and bipolar disorder (see, e.g. Brewin, Gregory, Lipton, & Burgess, 2010 for a review of the evidence for this trans-diagnostic symptom). This form of recurring intrusive image has already been examined in the Think/No-Think task and suppression of these experiences engages the same fronto-hippocampal network engaged during suppression of words (with additional suppression in visual cortex). Finally, worries about the future are instances of episodic future thinking, which Daniel Schacter and Donna Addis and colleagues have spent the last 5-10 years arguing involves activity in the hippocampus in service of scenario construction. Thus, the suppression of future worries can be modeled as the suppression of repeated intrusive images/scenarios generated initially during episodic prospection (please see Benoit, Davies, & Anderson, 2016, PNAS). Here too, the same fronto-hippocampal pathway identified in the current study has now been shown to be engaged when people suppress their imagination for future events.
The foregoing illustrates that the line between "intrusive memories" and "intrusive thoughts" is not altogether clear and that many if not most intrusive thoughts of clinical significance reflect involuntary retrievals. Indeed, this intimate linkage between thinking and retrieval is reflected in the name of our procedure, first introduced in 2001: The Think/No-Think paradigm. Thus, although suppressing automatic retrieval of simple pairs does differ in various respects from the particular phenomena of clinical interest, there are nevertheless core processes indexed by this task that are demonstrably relevant to these clinical symptoms, and there are excellent analytical reasons for this.
The Upshot. The foregoing illustrates that we have reasonable empirical and theoretical grounding for adopting this simple task as a model of processes important to the clinical phenomena of main interest. There is a long-term historical effort behind the current study that lends credibility to its relevance. We hope these considerations clarify why we believe our generalization to be appropriate. We respect that the reviewer may still disagree, but we hope they will consider granting us the courtesy of allowing us to have a different view on this subject.

Author Action Taken 3.3:
We have retained our conceptual framing in terms of intrusive thoughts because we believe that both empirical and theoretical considerations warrant this.
However, we accept responsibility for the fact that our initial submission invited the kind of reaction that the reviewer had, given that it did not represent the background evidence and considerations clearly enough. We therefore have elaborated on this background in the introduction (see, e.g., Lines 80-93). Moreover, we now include a new paragraph in the discussion that raises the issue of generalizability for readers to consider (see Lines 551-574). We thank the reviewer for prompting us to do this, because we should not take for granted that readers will be aware of the literature behind this work and because it is appropriate for readers to reflect explicitly about whether generalization is warranted.

Reviewer Comment 3.4:
A second type of conceptual error here is presenting the GABA differences as causal, when the findings are only correlational. The authors attribute a causal role to bulk measures of HC GABA (as measured by JPRESS) when they say "GABA enables," "depends on GABA", "GABA alters", "low GABA compromises", "GABA influences." In fact, the study provides evidence for associations with bulk measures of HC GABA. Speculations about causal relationships should be minimized and clearly framed as speculation or hypotheses for future testing.

Author Response and Action Taken 3.4:
The reviewer is correct. We agree that our treatment of the findings would be improved if we tried to maintain a clearer separation, throughout the text, between hypotheses about causality and statistical association. We have revised the results sections reporting the intermodal correlations with GABA, and, where applicable, toned down our language describing the findings accordingly. Elsewhere in the introduction and discussion, we have also toned down the implication of causality (e.g. see lines 10, 529, 622). We do, however, continue to include causal statements in our hypotheses and in interpretative statements based on the associations.

Reviewer Comment 3.5:
The manuscript's title incorporates both of these misleading conceptual frames, using the terms "GABA enables" and "unwanted thoughts." In contrast, the study actually shows evidence that HC GABA is associated with volitional memory suppression.
Author Response 3.5: As noted above, we believe that the conceptual framing of our work in terms of thought suppression is justified, and is not in error. We considered revising the title of the manuscript to eliminate reference to the causal role of GABA in mediating the ability we are measuring. In the end, this decision comes down to the function of the title-whether it is an empirical summary, or a conceptual interpretation that we wish to emphasise. We decided on the latter. Based on the evidence presented in the manuscript, we argue that hippocampal GABA may enable the suppression of intrusive thoughts. This is the idea we wish to preserve.
Author Action Taken 3.5: Although we have retained our title, we do agree that in the body of the manuscript, we should carefully separate statistical association from the causal interpretation we are attributing to it. In doing so, we will highlight the issue of causality for the reader, encouraging them to draw their own conclusions based on the data. We also included an explicit statement in the discussion frankly acknowledging that experimental manipulations of GABA are required to draw causal conclusions, unlike the correlational approach used here. Specifically: (Lines 571-574) "Ultimately, however, determining whether successful thought suppression relies on local hippocampal GABA requires a direct test of this generalization, together with experimental manipulations of GABA rather than the individual differences correlational approach used here."

Reviewer Comment 3.6: Overstating and overgeneralizing the findings occurs in many places in the text. For example, line 10 states "In so doing, we isolate a fundamental mechanism enabling inhibitory control over thought: GABAergic inhibition of hippocampal activity." "Isolate a fundamental mechanism" is much too strong a phrase, "enabling" is speculative, and "control over thought" is much too general to associate with HC GABA based on this study.

Author Response and Action Taken 3.6:
We respect the reviewer's goal of ensuring that our language is calibrated to the data. In response, we generally scrutinised the manuscript to see whether any of the language used was overstated or overgeneralized and we made modifications to tune our statements more precisely. Here are 3 examples to illustrate the sort of changes we made to act upon this request:
1. Line 10.
Previous: "In so doing, we isolate a fundamental mechanism enabling inhibitory control over thought: GABAergic inhibition of hippocampal activity."
Revised: "In so doing, we provide evidence for a mechanism…."
2. Line 532.
Previous: Our results point to GABAergic inhibition of hippocampal retrieval processes as a key mechanism underlying the suppression of thought.
Revised: Our results point to GABAergic inhibition of hippocampal retrieval processes as a potential mechanism that enables such thoughts to be suppressed.
3. Line 625.
Previous: If so, the current work establishes a transdiagnostic framework that specifies one important computational reason why persistent intrusive thoughts emerge from hippocampal disinhibition.
Revised: If so, the current work offers a transdiagnostic framework that specifies one computational reason why persistent intrusive thoughts emerge from hippocampal disinhibition.
We generally made an effort to change things in the spirit of the reviewer's recommendation, even where the reviewer did not specifically mention a passage. Nevertheless, it is possible that there remain some cases of language that the reviewer might take a different view on. In fairness, we perhaps have a different view of our conceptual framework, a view that we believe has a solid evidence base. We hope that in the event that such cases arise, the reviewer will consider them honest differences of opinion and consider giving us latitude.

Reviewer Comment 3.7: Correlations
Line 202 states "Because the robust and partial correlation analyses yielded similar conclusions, we focus on the partial correlations for simplicity." The reader assumes that a study principally aiming to examine the association between HC BOLD and HC GABA would have an a priori statistical plan for testing this association. If so, which of these two approaches to correlation analysis was chosen a priori? All results should be reported using the a priori method, with secondary comments on the convergence or divergence of results found with an alternate method.
Author Response 3.7: We agree that an a priori statistical plan is essential. We assure the reviewer, however, that we conducted the robust and partial correlation analyses using a consistent a priori strategy throughout the entire manuscript. Indeed, the task design (using both motor and memory inhibition tasks) was selected to facilitate a particular a priori analysis approach. We acknowledge however that the results in the initial version were distributed widely, traversing multiple sections of the results in both the main body text and supplemental information, rendering this strategy somewhat hard to follow.
Author Action Taken 3.7: Based on the reviewer's feedback, in the revision we have substantially revised the results sections describing the intermodal relationships with GABA in order to improve the clarity of our original a priori strategy. We have more fully and explicitly described our a priori strategy in the beginning of the intermodal section (see Lines 244-258). Crucially, we have also replaced Figure 3, which depicted only a subset of the intermodal relationships, with two comprehensive Tables (Tables 1 and 2, pages 18-19), which serve both as an organisational framework of our analysis strategy and also as a core repository for all of our primary individual differences analyses. Tables 1 and 2 are now consistently referred to in the results sections for each analysis step, as opposed to the diffuse reporting employed in the prior draft.
Reviewer Comment 3.8: As written, there is a confusing intermixing of robust and partial correlation approaches. For example, line 202 suggests that the results of the partial correlation analyses are presented in the main paper. However, line 214 indicates that CI are used for testing significance and cites the papers on robust correlations. This suggests that robust correlations are being reported for these comparisons. Again on line 353, the citations for robust correlations are given in a context where they appear to be reporting partial correlations.
Author Response and Action Taken 3.8: Author Action 3.7 addresses this concern. In brief, we more fully and explicitly described our a priori strategy, including a uniform method of inference, in the beginning of the intermodal section (see Lines 244-258). We also include Tables 1 and 2, which succinctly describe when robust or partial correlation was used and how each was performed (covariates used, degrees of freedom, and the resulting relationship).
Reviewer Comment 3.9: In addition, there is a lack of consistency in how correlations are applied in the manuscript. Sometimes the authors provide direct comparisons between correlations, and sometimes they don't. For example, the authors report that HC BOLD-GABA correlations are significant during memory task components and not significant during motor task components. They interpret this as a selective finding, but they omit direct comparison of the correlations across tasks. However, the authors include a direct comparison between correlations for a different contrast on lines 241-244. Sometimes they include the GO condition BOLD responses as covariates in relevant analyses (lines 240-241) and sometimes they don't (line 224). The result is the appearance of selectively focusing on findings that support their model and not making sincere attempts to challenge or disprove the model. The relatively low power of the key contrasts (N=18) may have a role in this selective reporting.
Author Response 3.9: We agree with the reviewer that the mixture of partial correlation techniques and comparisons between correlated correlations (Meng's z) was inconsistent. We have removed the Meng's z comparisons entirely from this revision of the manuscript. To infer functional and anatomical specificity, we now instead employ a uniform partial correlation strategy throughout the manuscript. These analyses are fully detailed in Tables 1 and 2. As described in Author Action 3.7, we have also substantially revised the Results section describing the intermodal relationships to better reflect the organization and consistency of reporting in the Tables (Lines 244-258). In all cases, we start by reporting the robust correlations, then the 'Control' partial correlations (controlling for sex, grey matter, and glutamate), then the partial correlations controlling for these covariates plus a covariate from the Stop signal task ('Functional Specificity') or from the DLPFC region of interest ('Anatomical Specificity') (see, e.g., lines 288-300 and lines 301-323).
Reviewer Comment 3.10: What type of robust correlation was used? Was it bend, skipped, or some other? If bend, what percentage was used? If skipped, how many outliers were removed? For all statistical results, it is necessary to include either df or N.
Author Response and Action Taken 3.10: Author Action 3.7 addresses this concern. In brief, we more fully and explicitly described our a priori strategy, including the type of robust correlation conducted (skipped) and outlier detection method, in the beginning of the intermodal section (see Lines 244-258). Tables 1 and 2 describe, for each robust correlation analysis, the number of outliers removed and the degrees of freedom. The Tables also describe the degrees of freedom for all partial correlation analysis as well. We believe Tables 1 and 2 visually capture our a priori strategy, and comprehensively address the reviewer's concerns regarding the organization and clarity of the correlation results.
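For readers who want to see how the tiered partial correlations and their degrees of freedom fit together, the following is a minimal sketch, not the analysis code used in the manuscript. It assumes a partial correlation is computed by regressing the covariates out of both variables and correlating the residuals, with df = N - 2 - k for k covariates. The function names (`residualize`, `partial_corr`) and all simulated values are hypothetical, and the skipped-correlation outlier step is not shown.

```python
import numpy as np

def residualize(y, covariates):
    """Residuals of y after least-squares regression on the covariates plus an intercept."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta

def partial_corr(x, y, covariates):
    """Partial correlation of x and y given covariates. Returns (r, df)."""
    covariates = np.asarray(covariates, dtype=float)
    if covariates.ndim == 1:
        covariates = covariates[:, None]
    rx = residualize(x, covariates)
    ry = residualize(y, covariates)
    r = float(np.corrcoef(rx, ry)[0, 1])
    df = len(rx) - 2 - covariates.shape[1]
    return r, df

# Simulated data only: N = 18 matches the sample size quoted by the reviewer,
# but every value below is invented for illustration.
rng = np.random.default_rng(0)
n = 18
controls = rng.normal(size=(n, 3))      # stand-ins for sex, grey matter, glutamate
gaba = rng.normal(size=n)
bold = -0.6 * gaba + 0.5 * controls[:, 1] + rng.normal(scale=0.5, size=n)
stop_signal = rng.normal(size=n)        # stand-in for the extra specificity covariate

# 'Control' tier: three covariates.
r_control, df_control = partial_corr(gaba, bold, controls)
# 'Functional Specificity' tier: the same covariates plus one more.
r_specific, df_specific = partial_corr(gaba, bold, np.column_stack([controls, stop_signal]))
print(f"control: r={r_control:.2f}, df={df_control}; specificity: r={r_specific:.2f}, df={df_specific}")
```

Each added covariate costs one degree of freedom, which is why reporting df alongside every coefficient matters at this sample size.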

Reviewer Comment 3.11: MRS
The authors state that good shims were obtained in the HC voxel for 18 of the 24 participants. It is necessary to state whether a specific line width threshold was used for exclusion of spectra, and if so, what threshold was used. It appears that the mean (s.e) of the linewidth is presented. Please present the mean (s.d.). The authors state that 4 voxels were excluded for lipid contamination. Please clarify how many were excluded from each voxel location.
Author Response and Action Taken 3.11: We agree that this information would be useful to report. We excluded linewidths ±3 SDs from the mean linewidth for a given voxel, in accordance with the recommendation of Waddell et al (2007). We have now added the SD to the mean and SEM values for the linewidths (see lines 188-192). For each voxel, an average 1D spectrum was produced by averaging the data across all repetitions and all TEs. These spectra were visually inspected by two independent raters (TWS and MMC) to determine whether they suffered from lipid contamination. In four cases, both raters identified a large unexpected peak on the right-hand side of the spectrum, with the left tail of that peak significantly displacing the baseline for the remaining peaks. All four voxels were subsequently excluded from further analysis. There were no cases of a spectrum being flagged for lipid contamination by one rater but not the other. We have broken down the excluded voxels according to their location (see line 195).
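The linewidth criterion can be illustrated with a short sketch. It assumes the rule means that a spectrum is excluded when its linewidth lies more than 3 SDs from the mean linewidth for that voxel location, and that the mean, SD, and SEM are then reported for the retained spectra. The function name `linewidth_qc` and the linewidth values are hypothetical and are not the study's data.

```python
import numpy as np

def linewidth_qc(linewidths_hz, n_sd=3.0):
    """Flag spectra within n_sd sample SDs of the mean linewidth and summarise the kept ones."""
    lw = np.asarray(linewidths_hz, dtype=float)
    mean, sd = lw.mean(), lw.std(ddof=1)
    keep = np.abs(lw - mean) <= n_sd * sd
    kept = lw[keep]
    kept_sd = kept.std(ddof=1)
    summary = {
        "n": int(kept.size),
        "mean": float(kept.mean()),
        "sd": float(kept_sd),
        "sem": float(kept_sd / np.sqrt(kept.size)),
    }
    return keep, summary

# Invented linewidths (Hz) for 24 spectra at one voxel location; the last is deliberately poor.
linewidths = [9.0] * 12 + [11.0] * 11 + [40.0]
keep, summary = linewidth_qc(linewidths)
print(summary)
```

Reporting the SD alongside the SEM, as the reviewer requested, lets readers judge the spread of shim quality directly rather than inferring it from the sample size.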

Reviewer Comment 3.12:
The authors helpfully teach the reader that 2D-JPRESS offers some advantages over PRESS in regions of high inhomogeneity, like the HC. However, they fail to mention an apparent disadvantage of the 2D-JPRESS method when compared to the more commonly used MEGA-PRESS approach. Specifically, it appears from the cited JPRESS studies that the reliability of GABA/Cr measurements is considerably less with JPRESS than is typically reported for MEGA-PRESS. Given the HC target location, and the appearance of valid GABA measurements, this is not a criticism of the choice to use JPRESS. However, for readers familiar with MEGA-PRESS, the apparently lower reliability of the JPRESS approach should be mentioned among the limitations of the study.
Author Response and Action Taken 3.12: We thank the reviewer for this point. We agree that 2D JPRESS sequences remain somewhat more exotic in the literature compared to MEGA-PRESS sequences, rendering assessments of their reliability across studies somewhat less robust. We have added a note on this inherent limitation at the end of the MRS methods section (see lines 222-224).
Reviewer Comment 3.13: The issue of the stability of HC GABA measurements is particularly relevant in the current study because the interval between BOLD measures and the GABA measures with which they were correlated was relatively long (median = 13 days). It is essential to also report the range of interval days. Are the authors aware of any data on the stability of MRS GABA measures in HC or other regions across intervals in the range occurring in this study? Even the median value (13 days) is quite long, and this aspect of the design represents a limitation of the study that should be acknowledged.
Author Response 3.13: Longitudinal 1H MRS indices of GABA are reliable within cognitively healthy young adults at mean intervals of more than half a year, e.g. 229 ± 42 days (Near et al., 2014). Indeed, the observed magnitude of intra-subject variability in that study was approximately the same as in longitudinal 1H MRS studies conducted at much shorter intervals, indicating that the majority of variance between timepoints arises from measurement error. These findings indicate that 1H MRS indices of GABA, in cognitively normal adults, reflect stable biological traits. The interval between first and second visits in our study ranged from 1-111 days (mean ± SD: 26 ± 34 days), which is considerably shorter than the interval in Near et al. (2014). The decision to acquire fMRI and MRS in separate sessions also reflects a deliberate strategy to maximise data quality: In piloting the study, we found that the long acquisition times required to acquire MRS in multiple voxels (~1 hour) led to participant fatigue and discomfort and to a reduction in data quality (e.g. head motion) when combined with the fMRI acquisitions in a single session.
Nevertheless, we assessed whether the relationships reported in this manuscript were affected when subjects were weighted according to their inter-scan interval. Specifically, under the assumption that longer intervals reflect lower quality data, we used weighted least squares regression to give each data point its proper amount of influence over the parameter estimates. Each subject's weight was derived from the number of days between the fMRI and MRS acquisitions, by subtracting this interval from a constant to ensure positive values. Individuals with longer intervals therefore contributed smaller weights to the model. Below we show the standardized coefficients (betas) for the primary relationships demonstrated with hippocampal GABA, in linear regression models with and without the Interval weighting. Of the observed relationships with hippocampal GABA, none were weakened by weighting with Interval; in fact, most were slightly strengthened. The weighted least squares regression analyses therefore demonstrate that our relationships did not change substantially when the influence of each data point over the parameter estimates was adjusted according to the Interval between the fMRI and MRS scans.
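The interval weighting can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the analysis: the constant is taken here to be the longest interval plus one day (the response does not state the value used), variables are z-scored with unweighted means and SDs before fitting, and the function names and data are hypothetical.

```python
import numpy as np

def interval_weights(interval_days, constant=None):
    """Weights that decrease with inter-scan interval: constant minus interval."""
    days = np.asarray(interval_days, dtype=float)
    if constant is None:
        constant = days.max() + 1.0   # assumed constant; keeps every weight positive
    weights = constant - days
    if np.any(weights <= 0):
        raise ValueError("constant must exceed the longest interval")
    return weights

def standardized_beta(x, y, weights=None):
    """Slope of z-scored y on z-scored x, fitted by (optionally weighted) least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    design = np.column_stack([np.ones_like(zx), zx])
    # Weighted least squares is ordinary least squares on rows scaled by sqrt(weight).
    root_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], zy * root_w, rcond=None)
    return float(coef[1])

# Invented values for six participants; the intervals span the reported 1-111 day range.
intervals = [1, 5, 13, 26, 60, 111]
gaba = [0.8, 1.0, 1.1, 1.3, 1.6, 1.9]
bold = [-0.5 * g + 3.0 for g in gaba]   # exactly linear, so both betas equal -1

weights = interval_weights(intervals)
beta_unweighted = standardized_beta(gaba, bold)
beta_weighted = standardized_beta(gaba, bold, weights)
print(weights, beta_unweighted, beta_weighted)
```

Because the weights depend on the chosen constant, a larger constant flattens the weighting toward ordinary least squares, so the constant should be reported alongside the weighted coefficients.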