Individual differences in fear acquisition: multivariate analyses of different emotional negativity scales, physiological responding, subjective measures, and neural activation

Negative emotionality is a well-established and stable risk factor for affective disorders. Individual differences in negative emotionality have been linked to associative learning processes which can be captured experimentally by computing CS-discrimination values in fear conditioning paradigms. Literature suffers from underpowered samples, suboptimal methods, and an isolated focus on single questionnaires and single outcome measures. First, the specific and shared variance across three commonly employed questionnaires [STAI-T, NEO-FFI-Neuroticism, Intolerance of Uncertainty (IU) Scale] in relation to CS-discrimination during fear-acquisition in multiple analysis units (ratings, skin conductance, startle) is addressed (NStudy1 = 356). A specific significant negative association between STAI-T and CS-discrimination in SCRs and between IU and CS-discrimination in startle responding was identified in multimodal and dimensional analyses, but also between latent factors negative emotionality and fear learning, which capture shared variance across questionnaires/scales and across outcome measures. Second, STAI-T was positively associated with CS-discrimination in a number of brain areas linked to conditioned fear (amygdala, putamen, thalamus), but not to SCRs or ratings (NStudy2 = 113). Importantly, we replicate potential sampling biases between fMRI and behavioral studies regarding anxiety levels. Future studies are needed to target wide sampling distributions for STAI-T and verify whether current findings are generalizable to other samples.

translational models in fear and anxiety research [20][21][22] . During fear acquisition training, an initially neutral stimulus (the to-be-conditioned stimulus, CS+) is paired with an aversive event (the unconditioned stimulus, US) and thereby becomes a predictor of the US while a second stimulus (CS−) is never paired with the US. Subsequently, the CS+ elicits (anticipatory) defensive responses that can be assessed at different response levels, all capturing slightly different time-windows and sub-processes, for a review see 23 . These include self-report (e.g., ratings of fear or US expectancy), physiological responding [e.g., skin conductance responses (SCRs), fear-potentiated startle responses (FPS)] and neuro-functional activation (e.g., BOLD fMRI). Skin conductance responses are the most commonly used measures of conditioned responding and are assessed as phasic arousal-related changes in sweat gland activity 24,25 . Fear potentiated startle, which follows a valence gradient in responding 26,27 , measures the increase in the startle reflex elicited by a sudden event (such as a burst of white noise) in the presence of threat as compared to the absence of threat.
Focusing on individual differences in negative emotionality 1 , e.g. 28 , and combining it with fear conditioning research 29 holds promise to provide critical insights into the mechanisms underlying individual risk and resilience for the development of anxiety and/or stress-related disorders 19,29 . A recent review identified three scales linked to the broader construct of negative emotionality that have been consistently associated with individual differences in fear conditioning performance 29 and vulnerability to pathological fear and anxiety: the trait anxiety scale of Spielberger's State-Trait Anxiety Inventory (STAI-T 30 ), the Big Five neuroticism scale of the NEO-FFI (NEO-FFI-N 31 ) and the intolerance of uncertainty scale (IUS 32 ).
Trait anxiety reflects the general tendency to react anxiously and to show cognitive as well as affective styles related to pathological anxiety in response to a wide range of events and contexts. There has been a long-standing debate on whether the STAI-T is a "good" measure of anxiety. Based on confirmatory factor analytical approaches in large samples of healthy individuals 33,34 and patients 34 , it was suggested that the STAI-T measures "general negative affect" rather than "measuring anxiety or depression in a strict sense". The latter two hypothetical sub-factors had been proposed previously 35 but lacked sufficient discriminant validity in newer work using larger samples 33,34 .
In turn, neuroticism, one of the "Big-Five" constructs derived factor-analytically, reflects the tendency to show negative affect such as anger, envy, guilt, and depressed mood and to be emotionally highly reactive and vulnerable to stress 36 . Neuroticism has also been described as "sensitivity of defensive distress systems that become active in the face of threat, punishment or uncertainty" 37 and is considered an established risk factor for psychopathology 38 . Recently, it was reported that neuroticism may be associated with experiencing more intense negative emotions, but not with the variability in experiencing negative emotions 39 .
Finally, intolerance of uncertainty is defined as the dispositional cognitive bias to perceive and interpret ambiguous situations as threatening 32,40 , which has been suggested to be a possible trans-diagnostic factor contributing to maintaining affective disorders including anxiety disorders and depression 41,42 . Relatedly, patients suffering from affective disorders are characterized by heightened scores on the IUS 43 . Of note, several different scales assessing intolerance of uncertainty co-exist 32,44 .
All three constructs (trait anxiety, neuroticism and intolerance of uncertainty), as assessed through the above mentioned scales, are related to-or can be subsumed under-the broader umbrella negative emotionality. All three have been associated with individual differences in fear conditioning performance 29 . Yet results in the literature are heterogeneous and partly inconclusive at the behavioral and neuro-functional level. The fear conditioning field, similar to the field of personality neuroscience in general 45 , suffers from a number of well described problems. These problems include: (1) generally underpowered samples (typically below N = 30 per group 29 ), and (2) sub-optimal statistical approaches such as dichotomizing continuous variables, which gives rise to interpretation problems and causes a massive loss of power and increased Type II error rates (i.e., false negatives) [46][47][48][49] . Furthermore, the majority of results in the fear conditioning field originate from univariate analyses focusing on (3) single constructs related to negative emotionality (for a discussion see 29 , for a few exceptions see [50][51][52][53][54] ) and (4) singular outcome measures (such as ratings, SCRs, FPS or BOLD fMRI) each tapping into slightly different underlying processes 23 . Attempts at multivariate integration have thus far been rare. As a consequence, separate lines of research and isolated findings have emerged that are notoriously difficult to integrate and interpret into one bigger picture. Hence, we echo recent calls for a paradigm shift embracing more complex multivariate approaches, the use of larger data sets and the dimensionality of the data 16,55 . The overarching aim of this work is to enhance our understanding of the mechanisms through which negative emotionality may convey risk for affective psychopathology by integrating separate lines of research (using different scales and outcome measures) that have emerged in parallel and are difficult to integrate.
To achieve this aim, we start by integrating dimensional measures as derived from three commonly employed scales in the field (i.e., STAI-T, NEO-FFI-N and IUS) with the three most commonly used measures of conditioned responding (ratings, SCRs, FPS)-as identified and summarized by a recent review by our group 29 . These measures are obtained in a large sample (Study 1, N = 356) and combined into one statistical model that is set up to investigate whether any of these scales is linked to specific fear conditioning performance over-and-beyond the other scales. Subsequently we investigate whether it is the shared variance across the scales and across outcome measures that explains these potential associations and thus supports a prominent role for the general construct negative emotionality, or whether the scales remain specifically associated with specific measures of conditioned responding. Specific and directed hypotheses on the outcome of this model are difficult to derive from the existing literature because results in the field are extremely heterogeneous.
Additionally, we aim to replicate the main findings from Study 1 in a re-analysis of a second pre-existing sample (Study 2, N = 113), which also allows us to extend our inferences to the neuro-functional level (a brief Introduction to Study 2 is provided below).
Questionnaires. Participants filled in a batch of questionnaires prior to the experiment. This batch included (1) questions to obtain demographic information, (2) the State-Trait Anxiety Inventory 30 , (3) the NEO-FFI 31,57 , (4) the Intolerance of Uncertainty Scale 32 and (5) the locus of control IPC Scale (Internal control, Powerful others external control, Chance control) 58 . Upon completion of the experiment (i.e., after extinction and reinstatement), participants filled in a post-experimental awareness questionnaire 23 , the answers to which were orally confirmed with the experimenter. The questionnaire included estimations on the total number of received electrotactile stimuli and the total number of experimental stimuli presented during the experiment. Also, it contained questions about perceived CS-US contingencies during the experiment (first as a free recall then as a forced choice). Based on this, participants were classified as either aware (N = 236, able to correctly report CS-US contingencies in free recall and/or forced choice) or unaware (N = 87, unable to report correct CS-US contingencies across questions). Twenty-one participants who reported a tendency towards the correct contingencies but also some unsureness were counted as aware. Data on CS-US awareness were missing from twelve participants.
The trait scale of the STAI (STAI-T) consists of 20 items, evaluated on a four-point Likert scale, allowing individuals to score between minimally 20 and maximally 80 points. Despite its potentially misleading name (i.e. trait anxiety inventory), the STAI-T more likely assesses how a respondent generally feels, and is thought to target relatively stable aspects significant for "anxiety proneness", including calmness, confidence and security 59 . Congruently, the STAI-T has been criticized for representing a psychometrically inhomogeneous scale itself 33,34 representing facets of anxiety and depression. Based on confirmatory factor analytical approaches in large samples of healthy individuals 33,34 and patients 34 , it was suggested that the STAI-T measures "general negative affect" rather than "measuring anxiety or depression in a strict sense". The latter two hypothetical sub-factors had been proposed previously 35 but lacked sufficient discriminant validity in newer work using larger samples 33,34 .
The neuroticism scale of the NEO-FFI (NEO-FFI-N) consists of 12 out of 60 items, which were derived factor analytically and should represent one of the five higher order Big-Five personality traits 31 , i.e., neuroticism. Scores on this NEO-FFI-N scale can range between 0 and 48. Neuroticism refers to the tendency to express negative emotionality and has been suggested to be associated with defensive responding to uncertainty, threat, and punishment 60 .
The IUS consists of 27 items that aim to assess an individual's tendency to react to the uncertainties of life, or more precisely their intolerance towards these uncertainties 32,40 . Each item is evaluated on a five-point Likert scale, allowing respondents to score between 27 and 135. Intolerance of uncertainty is defined as a cognitive bias that affects how uncertain situations are perceived, interpreted, and dealt with cf. 61,62 . Several factor solutions have been suggested, including a four-factor solution 40 which includes: (1) uncertainty is stressful and upsetting, (2) uncertainty causes inability to act, (3) uncertain events are negative and should be avoided, and (4) being uncertain is unfair. No official German version of this questionnaire exists; however, Gerlach et al. 32 created a German translation and investigated the underlying factor structure, in which this four-factor structure could not be replicated. In Study 1, this German translation of the full 27 items of the IUS is used.
For those individuals having one or more, but not all, missing items on either of the questionnaires, missing values were imputed using a single imputation with the predictive mean matching method in the MICE package in R. This imputation method draws observed values from other subjects with a similar response pattern on other variables. In total, data was imputed for 8 subjects on the STAI-T, for 5 subjects on the NEO-FFI-N, and for 26 subjects on the IUS. Forty-six participants have missing data for the full STAI-T and IUS, one misses the full STAI-T only, and one participant has full missing data on the NEO-FFI-N. Because this data cannot be considered as missing at random, it is not imputed, but maintained as missing data.
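The donor-based logic of predictive mean matching can be illustrated with a simplified sketch. This is not the MICE implementation used in the study: it fits a single ordinary least squares model and omits the random draw of regression parameters that MICE performs. The function name `pmm_impute`, the donor pool size of 5, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def pmm_impute(target, predictors, n_donors=5, seed=0):
    """Simplified single imputation by predictive mean matching.

    target: 1-D array with NaN marking missing values.
    predictors: fully observed predictor(s), one row per participant.
    ASSUMPTION: plain OLS predictions; no parameter uncertainty is modelled.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(target, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(predictors, dtype=float)])
    observed = ~np.isnan(y)

    # Fit the regression on complete cases and predict for everyone.
    coef, *_ = np.linalg.lstsq(X[observed], y[observed], rcond=None)
    predicted = X @ coef

    imputed = y.copy()
    donor_values = y[observed]
    donor_predictions = predicted[observed]
    for i in np.flatnonzero(~observed):
        # Donors are the observed cases whose predicted value is closest.
        distance = np.abs(donor_predictions - predicted[i])
        nearest = np.argsort(distance)[:n_donors]
        # The imputed value is an actually observed value from one donor.
        imputed[i] = donor_values[rng.choice(nearest)]
    return imputed

# Hypothetical toy data: one item score missing for the fourth participant.
toy_predictor = np.arange(10, dtype=float)
toy_item = 2.0 * toy_predictor
toy_item[3] = np.nan
toy_imputed = pmm_impute(toy_item, toy_predictor, n_donors=3)
```

Because imputed values are borrowed from donors, they always stay within the set of values that were actually observed, which suits bounded Likert-type items.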
Overall reliabilities of the questionnaires and subscales of interest were high in the final sample, as indicated by Cronbach's α: 0.91 for STAI-T; 0.86 for NEO-FFI-N; 0.94 for IUS. In addition, a wide range of scores was covered in the acquired sample: 21-76 for STAI-T with 38 ± 9 (mean ± SD); 1-40 for NEO-FFI-N with 20 ± 8 (mean ± SD); and 27-135 for IUS with 62 ± 18 (mean ± SD).
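For readers unfamiliar with the reliability index reported above, Cronbach's α can be computed from an item-by-respondent score matrix as sketched below. The function name and the toy data are hypothetical; the sketch assumes complete data (no missing items) and is not the code used in the study.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix without missing values."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                              # number of items
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical toy data: 5 respondents answering 4 items on a four-point scale.
toy_scores = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [2, 1, 2, 1],
])
toy_alpha = cronbach_alpha(toy_scores)
```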
Experimental design. All participants underwent a fear conditioning, extinction and return of fear paradigm. Data of interest for the current research question concerns fear acquisition training only. Data acquired during the same experimental session but involving experimental manipulations after this fear acquisition training phase or involving a methodological validation in sub-samples of this sample is published elsewhere 63 . Therefore, experimental details will only be provided for the fear acquisition training phase and the preceding habituation phases of the experiment.

Instructions. Participants were not instructed with respect to the CS-US contingencies or the learning element of the study.
Visual material-conditioned stimuli. Black geometrical shapes (i.e., a rectangle and an ellipse) served as conditioned stimuli (CS). One of these shapes (CS+) co-terminated with the unconditioned stimulus (US) during all fear acquisition training trials, whereas the other shape did not (CS−). In other words, a 100% reinforcement ratio was used during this experimental phase. Each CS type was presented consecutively for maximally two times, and nine times in total during fear acquisition training (9 CS+ and 9 CS− trials). Allocation of the shapes to CS+ or CS− was counterbalanced across participants, as well as the order in which the CS+/ CS− appeared. The CSs were presented for 6 s on a colored computer screen (blue, purple, green, or yellow). The background color served as contextual stimulus, which has no relevance to the fear acquisition training phase, but is of value in the context of post-acquisition experimental manipulations-not included here. The background color remained constant within experimental phases and was counterbalanced across participants. CS presentations were interleaved with inter trial intervals (ITI) consisting of a white fixation cross on a black computer screen, with variable durations (11.5 ± 1.5 s). Prior to fear acquisition training, subjects underwent an explicitly US-free CS habituation phase in which both stimulus types (i.e., the CS+ and CS−) were presented two times each.
Electro-tactile material-unconditioned stimulus. A train of three electro-tactile square wave pulses, 2 ms each, with 50 ms intervals, served as US. The US was produced by a DS7A electrical stimulator (Digitimer, Welwyn Garden City, UK) and delivered through a surface electrode with a platinum pin (Specialty Developments, Bexley, UK) to the dorsal part of the right hand. The intensity of the electro-tactile US was individually adjusted using a stair-case procedure to reach an unpleasant but tolerable level (range US intensities 0.3-70 mA, mean ± SD = 4.7 ± 5.2, median 3.5). The intensity of the US was gradually increased after conferring with the participant. The participant could then herself/himself elicit the US by pressing the space bar. After delivery of each US, the participant rated the averseness of the US on a scale from one to ten, with one not being aversive at all to ten being unbearable. The experimenter aimed to reach a final averseness rating of seven, which was not explicitly communicated to the participant.
Acoustic material-startle probes. A burst of 95 dB(A) white noise was used to elicit a startle response. Startle probes were presented binaurally via headphones (Sennheiser, Wedemark, Germany) four or five seconds after CS onset in half of all CS habituation trials (one out of two trials) and in two thirds of all CS fear acquisition training trials (six out of nine trials). Additionally, startle probes were presented in one third of all ITIs, either five or seven seconds after ITI onset. To obtain a stable baseline for startle reactivity, five consecutive startle probes-interleaved six seconds-were presented during a white fixation cross on a black computer screen.
Procedure. Experimental instructions were provided in written form and importantly did not contain instructions with respect to CS-US contingencies. Participants were instructed to attend to the visual stimuli presented on the screen, and ignore the acoustic startle probes. It was made explicit that startle probes had the sole purpose of enabling physiological data acquisition. Participants started by filling out the questionnaires. Afterwards, they proceeded with the US intensity calibration phase. In a step-wise procedure the US intensity was increased to a level described by the participant as very annoying but not painful, corresponding to a rating of at least 7 on a ten-point scale (with ten being the maximally aversive sensation that could be induced by the electrode). Next, the actual experiment started with the startle habituation phase and continued with an explicitly US-free CS habituation phase. Subsequently the uninstructed fear acquisition training phase of interest started. Presentation of all stimuli was controlled using Presentation Software (NeuroBehavioral Systems, Albany, California, USA). After completing the full experiment, thus after the post-acquisition training phases that included extinction training, reinstatement administration, and return of fear test phases, participants completed the post experimental awareness questionnaire. Twenty-eight participants had to be excluded from the fear conditioning experiment either due to voluntary discontinuation or technical failure during data acquisition.
Subjective data recording-fear ratings. Participants indicated their level of fear, anxiety, and distress towards both CS types within intermittent rating blocks. The following text was presented on screen: "How much stress, fear or anxiety did you experience the last time you saw symbol X?", with the "X" referring to one of the CS types at a time. Participants were given seven seconds to provide their response on the computerized visual analogue scale (VAS) ranging from 0 (none) to 100 (maximum), which had to be confirmed within the given time window by pressing the enter key. One rating block was presented at the end of the habituation phase, and three rating blocks were presented during fear acquisition training. Rating blocks were always presented after minimally one and maximally four CS+ and CS− presentation(s) (see 65 for a graphical overview on the design including ratings). The last rating in the fear acquisition training phase either occurred after the seventh or eighth acquisition trial. Nineteen participants failed to confirm their selected VAS values within seven seconds, and therefore have missing rating data. Post processing was conducted in R version 3.6.0 (2019-04-26).
Physiological data recording and processing-skin conductance and startle responding. Methods and procedures to acquire physiological data have been previously described in Sjouwerman et al. 65 . Physiological data were recorded using a BIOPAC MP100 amplifier (BIOPAC Systems Inc., Goleta, CA, USA) and AcqKnowledge 3.9.2 software. Data preprocessing was conducted in MATLAB (version 2014b), response quantification was conducted manually in a custom made program, and post processing was conducted in R version 3.6.0 (2019-04-26).
For physiological measurements, additional data is missing for some participants (SCR = 8, startle = 30) due to technical failures including for example saving failure of the physiological data file only, data extraction problems, erroneously adjusting the gain during the experiment causing SCR amplitudes to be uninterpretable, or electrode misplacement during data acquisition.
Skin conductance. For skin conductance recording, participants first cleaned their hands with warm water. Afterwards, two hydrogel and Ag/AgCl sensor recording electrodes (Ø 55 mm) were attached to the distal and proximal hypothenar eminence of the left hand. Skin conductance data were recorded continuously at 1,000 Hz with a gain of 5mΩ. In case participants' skin conductance moved beyond the scaling window, the gain (i.e. resistance) was increased or decreased to reduce or increase sensitivity of the skin conductance being recorded prior to the start of the experiment. Offline, data were downsampled to 10 Hz. According to published guidelines 24 , data were scored manually as foot-to-peak responses with response onsets starting between 0.9 and 4.0 s after CS or US onset. Increases smaller than 0.02 µS were scored as zero responses. Responses confounded by recording artifacts, such as electrode detachment, responses moving beyond the sampling window, or excessive baseline activity were discarded and scored as missing values. Raw skin conductance response (SCR) amplitudes were normalized by log transformation and range corrected by dividing by an individual's maximum response amplitude (either CS or US).
Participants not showing valid SCRs to the US in over two thirds of all fear acquisition training trials (i.e., six out of nine trials 56 ) were classified as physiological non-responders (n = 16), and all SCR trials were set to missing values.
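The SCR normalization steps described above (zero responses below 0.02 µS, log transformation, range correction by the individual maximum) can be sketched as follows. The exact form of the log transformation is not stated in the text; log(1 + x) is assumed here because it keeps zero responses at zero. The function name and toy amplitudes are hypothetical.

```python
import numpy as np

MIN_AMPLITUDE_US = 0.02  # microsiemens; smaller increases count as zero responses

def normalize_scr(raw_amplitudes):
    """Log-transform and range-correct one participant's SCR amplitudes.

    raw_amplitudes: foot-to-peak amplitudes (microsiemens) across all CS and
    US trials of one participant; NaN marks trials scored as missing.
    """
    amp = np.asarray(raw_amplitudes, dtype=float)
    # Sub-threshold increases become zero responses; NaN stays NaN.
    amp = np.where(amp < MIN_AMPLITUDE_US, 0.0, amp)
    # ASSUMPTION: log(1 + x) as the log transformation.
    logged = np.log1p(amp)
    # Range correction: divide by the participant's largest response (CS or US).
    max_response = np.nanmax(logged)
    if not max_response > 0:
        # No positive response at all: nothing to scale by.
        return np.full_like(logged, np.nan)
    return logged / max_response

# Hypothetical amplitudes: one sub-threshold trial, one missing trial.
toy_scr = normalize_scr([0.01, 0.5, np.nan, 1.0])
```

After this step every participant's values lie between 0 and 1, which removes between-subject differences in overall response magnitude before CS+ and CS− trials are compared.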
Startle responding. Startle responding was measured by using Ag/AgCl electromyogram (EMG) electrodes. Two electrodes were placed below the right eye over the orbicularis oculi muscle and one electrode was placed on the participants' forehead to obtain a reference signal. Startle data were filtered online (band-pass: 28-500 Hz), rectified, and integrated (averaged over 20 samples). According to published guidelines 66 , data were scored manually as foot-to-peak with response onsets within 20-120 ms post startle probe onset. Responses confounded by a blink occurring up to 50 ms before startle probe onset were scored as missing values. Similarly, trials confounded by recording artifacts or excessive baseline activity within the same time window were scored as missing values. Raw data were t-transformed across the experimental phases up to the fear acquisition training phase.
Participants not showing valid startle responses in over one third of all trials from the habituation phases and the fear acquisition training (i.e., more than 9 out of 28) were classified as physiological non-responders (n = 16), and startle responses for these participants were set to missing values. Note that these startle non-responders do not overlap with SCR non-responders.
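As an illustration of the within-subject transformation of startle magnitudes mentioned above, the sketch below assumes the conventional T-score definition (mean 50, standard deviation 10, computed per participant across all included trials). The text does not spell out the formula, so both the definition and the use of the sample standard deviation are assumptions, and the function name and toy values are hypothetical.

```python
import numpy as np

def t_transform(raw_magnitudes):
    """Convert one participant's startle magnitudes to T-scores.

    ASSUMPTION: T = 50 + 10 * z, with z computed from the participant's own
    mean and sample standard deviation; NaN marks missing trials.
    """
    values = np.asarray(raw_magnitudes, dtype=float)
    mean = np.nanmean(values)
    sd = np.nanstd(values, ddof=1)
    return 50.0 + 10.0 * (values - mean) / sd

# Hypothetical raw magnitudes (arbitrary units) with one missing trial.
toy_startle = t_transform([12.0, 30.0, np.nan, 18.0, 24.0])
```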
Fear acquisition is operationalized as CS+/CS− discrimination during the fear acquisition training phase (i.e., average CS+ minus average CS− responding). This includes responding towards 9 CS+ and 9 CS− trials for SCRs, and 6 CS+ and 6 CS− trials for startle responses (not all trials were 'startled' , see above), and all 3 intermittent CS+ and 3 intermittent CS− ratings.
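The CS-discrimination score defined above (average CS+ responding minus average CS− responding) can be written as a short function. How missing trials enter the averages is not described in the text; the sketch assumes they are simply skipped. Names and toy values are hypothetical.

```python
import numpy as np

def cs_discrimination(cs_plus, cs_minus):
    """Average CS+ response minus average CS- response for one participant.

    ASSUMPTION: trials scored as missing (NaN) are ignored when averaging.
    """
    return float(np.nanmean(cs_plus) - np.nanmean(cs_minus))

# Hypothetical normalized responses; one CS+ trial is missing.
toy_discrimination = cs_discrimination([0.6, 0.8, np.nan], [0.2, 0.4, 0.3])
```

Positive values indicate stronger responding to the CS+ than to the CS−, i.e., successful differential fear acquisition.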

Study 1: Statistical analyses
Study 1 consists of three analysis steps. First, univariate zero-order correlational analyses are conducted between the three questionnaires, or their respective subscale, and the three outcome measures of conditioned responding recorded in our study. For each outcome measure, a CS-discrimination value is calculated by subtracting CS− responding from CS+ responding. All variables are treated as dimensional. For all univariate analyses, multiple testing will be corrected by using the Benjamini-Hochberg method (p BH ). Correlation coefficients were compared using freely available online computer software 67 . Exploratory correlational analyses with awareness and US-intensity are reported in the Supplementary Information.
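The Benjamini-Hochberg adjustment referred to above can be sketched as follows. This is a generic implementation of the standard step-up procedure, not the code used in the study, and the p-values are hypothetical; which tests were grouped into one correction family is not specified in the text.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by m / rank.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted

# Hypothetical uncorrected p-values from four correlation tests.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

An association is then called significant when its adjusted value falls below the chosen alpha level, which controls the false discovery rate rather than the family-wise error rate.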
In a second step which serves the aim to integrate potential effects of these independent and dependent measures in a single model, a path model is constructed in which relationships between the three independent variables and the three dependent variables are estimated simultaneously. In this path model, correlations among the independent measures, i.e., questionnaires or scales, are allowed.
In a third step we take into account that the three independent measures are likely to highly correlate with each other, which makes it possible that these questionnaires, or subscales of questionnaires, all tap into the same larger construct, i.e., "negative emotionality". Similarly, the three measures of conditioned responding highly correlate and might be part of the larger construct "fear learning". This hypothesis, i.e., whether it is the shared variance across the questionnaires or scales or unique variance of individual questionnaires or scales that explains differences in specific outcome measures, will be examined by employing a structural equation model. In this model, two latent variables will be defined, one for the three questionnaires/scales and one for the three outcome measures. The regression weight between the latent variable negative emotionality and the first questionnaire/scale, as well as the regression weight between the latent variable fear learning and the first outcome measure will be fixed to 1. Subsequently, a complementary structural equation model will be constructed that in addition to the paths defined in the model described above (under step 3) includes the paths that showed significant associations in the established path model in step 2.
Scientific Reports | (2020) 10:15283 | https://doi.org/10.1038/s41598-020-72007-5
For both path models and structural equation models, two-sided model fit will be evaluated based on root mean square error of approximation (RMSEA) values, indicating excellent fit when < 0.01, good fit when < 0.05, fair fit when < 0.08, mediocre fit when < 0.10, and poor fit at > 0.10 RMSEA values 68,69 . To improve model fit, backward selection of significant and trend-significant paths will be executed. Trends (p < 0.1) will be included in interim models, but not in final models. Full models, interim models, as well as final models will be reported.
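The RMSEA cut-offs listed above can be expressed as a small helper, together with one common formula for RMSEA based on the model chi-square. The formula shown uses N − 1 in the denominator; some software uses N instead, and the text does not state which variant was applied. A value exactly equal to 0.10 is not covered by the stated cut-offs and is treated as poor here by assumption. Function names and inputs are hypothetical.

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA from the model chi-square, its degrees of freedom and sample size.

    ASSUMPTION: the N - 1 variant of the formula; truncated at zero.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def rmsea_fit_label(value):
    """Map an RMSEA value onto the verbal fit categories used in the text."""
    if value < 0.01:
        return "excellent"
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "fair"
    if value < 0.10:
        return "mediocre"
    return "poor"  # ASSUMPTION: exactly 0.10 is counted as poor

# Hypothetical model: the chi-square is smaller than its degrees of freedom.
toy_rmsea = rmsea(chi_square=10.0, df=12, n=300)
toy_label = rmsea_fit_label(toy_rmsea)
```

Because the numerator is truncated at zero, any model whose chi-square does not exceed its degrees of freedom yields an RMSEA of exactly 0.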

Study 1: Results
Study 1 aims to enhance our understanding of the mechanisms through which negative emotionality may convey risk for affective psychopathology by integrating separate lines of research (using different scales and outcome measures) that have emerged in parallel and are difficult to integrate. In a first step, we present univariate analyses illustrating associations between the three commonly employed scales in the field (i.e., STAI-T, NEO-FFI-N and IUS) with the three most commonly used measures of conditioned responding (ratings, SCRs, startle responding). We then move to multivariate analyses integrating these variables into a single model (Step 2) and exploring the role of potentially latent higher-order factors (Step 3).

Step 1: Univariate analyses
Univariate analyses revealed a significant albeit small negative correlation between STAI-T and CS discrimination in SCRs (r = − 0.19, p BH = 0.007), whereas correlations between CS discrimination in SCRs and either NEO-FFI-N or IUS were not significant when correcting for multiple testing but were trend-wise significant (NEO-FFI-N) and significant (IUS), respectively, only when not correcting for multiple comparisons (see Fig. 1). This effect seems descriptively driven by the combination of weakly and non-significantly increasing CS− responding, and weakly and non-significantly decreasing CS+ responding with increasing scores on these questionnaires/scales. The correlation coefficients for CS discrimination in SCRs and the three questionnaires do however not differ significantly from each other (SCR discrimination for STAI-T and NEO-FFI-N: z = − 1.691, p = 0.091; STAI-T and IUS: z = − 0.977, p = 0.329; NEO-FFI-N and IUS: z = 0.498, p = 0.619). Furthermore, a small negative correlation between IUS and CS discrimination in startle responding (r = − 0.15, p BH = 0.039) during fear acquisition training was observed, with no (BH corrected) significant associations for STAI-T and NEO-FFI-N and CS discrimination in startle responding but only trend-wise significant associations when not controlling for multiple comparisons. The negative association between IUS and startle responding is driven by significantly increasing CS− responding with increasing scores on the IUS, while the association between the CS+ and scores on the IUS is not statistically significant. Even though the associations for the NEO-FFI-N and STAI-T with CS discrimination were only trend-significant without correction, a similar pattern (i.e., significant positive association; r = 0.157-0.183, p BH < 0.05) between these questionnaire scores and CS− responding was observed. The correlation coefficients for the CS discrimination and each of the three questionnaires/scales were not significantly different (all z < 0.69, all p > 0.492).
CS discrimination in ratings was not significantly associated with any of the three questionnaires, or scales of questionnaires. But note that CS− responding in ratings, similar to the pattern observed in startle responding, was positively-albeit only trend-wise-associated with all three questionnaires/scales. Exploratory analyses for associations between STAI-T scores and US intensities as well as with awareness are reported in the Supplementary Information.

Step 2: Multivariate analyses
Path analysis on the sum scores of the three questionnaires/scales and the three outcome measures of fear learning revealed the expected significant positive associations among the scores on the three questionnaires (all p's < 0.001) as well as positive correlations among the three outcome measures (all p's < 0.001). In addition, a significant path between STAI-T and CS discrimination in SCRs was observed in this full model in which all paths between all variables were included (Fig. 2, grey font and grey paths). The final model generated through backwards selection (Fig. 2, blue font and blue path) also yielded a significant path between IUS and CS discrimination in startle, which was only trend-significant in the full model.

Step 3: Multivariate analyses testing for shared vs. unique variance
To test the hypothesis whether the associations observed in step 2 are specific to specific questionnaire scores or driven by shared variance of the three questionnaires or scales, we set up three structural equation models: (1) an initial model with two latent variables ("negative emotionality" and "fear learning", Fig. 3, grey lines and font), (2) an interim model (Fig. 3, black lines and font) in which we added the unique paths identified in Step 2 to the initial model (i.e., STAI-T and SCR discrimination as well as IUS and startle discrimination), and (3) a final reduced model generated through backward selection (Fig. 3, blue lines and blue font). The initial model shows that the three questionnaires or scales, STAI-T, NEO-FFI-N, and IUS are indeed closely related to the latent variable "negative emotionality" with all factor loadings > 0.69. Similarly, the three measures of fear acquisition, SCR, startle responding and ratings are also closely related to the latent variable "fear learning" with slightly lower factor loadings (yet, all > 0.46). This pattern is maintained in the interim and final model. Importantly, in this initial model (Fig. 3, grey lines and font) in which the path between STAI-T and SCR, and IUS and startle responding were not yet included, a significant negative relation (β st = − 0.246) between the two latent factors "negative emotionality" and "fear learning" was observed, suggesting that there is a general predictive effect of higher negative emotionality being linked to reduced differential fear learning. This model shows good fit (RMSEA = 0.019).
In the interim model, the two unique significant paths identified in Step 2 (i.e., STAI-T and SCR discrimination as well as IUS and startle discrimination) were added to the initial model. As a result, the relation between the two latent factors disappeared (black dotted line in Fig. 3), whereas the two unique paths were significant in this interim model. This path structure was entered in the final model, and the paths between STAI-T and SCR and between IUS and startle remained significant. The interim and the final model both show excellent fit (RMSEA = 0). This suggests that it may be the unique variance in STAI-T that predicts fear learning in SCR responding (β st = − 0.167), and the unique variance in IUS that predicts fear learning in startle responding (β st = − 0.149), rather than the shared variance across the questionnaires/scales and across outcome measures. Note that these standardized coefficients of the final model are identical to the coefficients estimated with the final path model in Step 2.
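The RMSEA values reported for the three models can be related to the model chi-square. The following is a minimal sketch, assuming the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))); some software divides by N instead of N − 1. The chi-square and df values in the usage lines are hypothetical and not taken from this study. The sketch illustrates why a model whose chi-square does not exceed its degrees of freedom yields RMSEA = 0.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the RMSEA from a model chi-square, its degrees of
    freedom, and the sample size (N - 1 convention; an assumption here)."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    # Misfit beyond what is expected by chance (chi-square minus df),
    # floored at zero, scaled by df and sample size.
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical fit statistics for illustration only.
print(rmsea(chi_square=9.0, df=8, n=356))  # small positive value
print(rmsea(chi_square=5.0, df=7, n=356))  # chi-square below df gives 0.0
```

Because the numerator is floored at zero, RMSEA = 0 does not mean the model reproduces the data perfectly, only that the chi-square is no larger than its expected value under a correct model.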

Study 1: Interim discussion
As expected, we observed moderate to strong links between the scores derived from the three questionnaires/scales linked to negative emotionality (STAI-T, NEO-FFI-N, IUS) in a large sample, whereas links between the three outcome measures of conditioned responding (SCRs, startle, ratings) were weak to moderate. These weak to moderate correlations among outcome measures are consistent with the idea that they tap into slightly different processes and capture different timings with respect to CS processing, as discussed in 29 . Uncorrected univariate analyses, which would mirror the approach employed in studies focusing on these questionnaires/scales and outcome measures in isolation, suggest significant associations between STAI-T and SCR, IUS and SCR, and IUS and startle responding. These significant correlations were small and each explained between 10 and 20% of the variance. Even when ignoring the strong correlations among the measured questionnaires/scales, this leaves space for other individual traits beyond negative emotionality to affect physiological responding. Additionally, these univariate analyses suggest trends for the NEO-FFI-N and SCR and startle, as well as for STAI-T and startle responding. Importantly, the correlation coefficients between one outcome measure and the different questionnaire scores do not differ significantly from each other. The results from the multivariate path model, on the other hand, may suggest a slightly different conclusion. When all measures are integrated in one statistical model, only the associations between STAI-T and CS discrimination in SCR, and between IUS and CS discrimination in startle responding, remain significant. This may suggest a certain level of specificity, indicating that more general measures of negative emotionality (such as the STAI-T) might be specifically associated with outcome measures that also reflect very general physiological arousal or general affective processes (SCR).
Our results suggest that these may be distinct from more specific measures (such as the IUS) in the negative emotionality domain, which seem to be associated with measures reflecting valence specific processes related to fear learning.
Remarkably, the structural equation model that includes latent variables for negative emotionality and fear responding only, reveals a strong association between both latent variables with good fit. All questionnaires/scales load strongly on the latent factor negative emotionality, with STAI-T showing the strongest factor loading. These high factor loadings indicate that including negative emotionality as latent factor may be indeed appropriate and informative. The fear learning latent variable is represented by less strong, but still medium sized factor loadings, suggesting that these readouts are related and there might be a broad underlying "fear learning" variable, but it also underlines that there is room for dissociation between them in particular because from a theoretical and neurobiological perspective 70 these different outcome measures capture slightly different cognitive-affective processes and timings with respect to CS processing 23,56,71 . The initial effect between the latent variables is eliminated when adding the two identified specific paths.
In sum, our results speak in favor of substantial shared variance and the existence of a latent "negative emotionality" factor across the three questionnaires and scales included, which supports convergent validity. In addition, specific paths between specific scales and outcome measures were identified, yet it cannot be excluded that these specific results represent overfitting in this particular sample. Our results may indicate that it is rather the shared variance across the questionnaires/scales that predicts fear learning. The specificity of the paths identified here needs to be replicated and further investigated in another well-powered sample, potentially including item-level factor analytical approaches, which would require a substantially larger sample than included in this study. Similarly, additional individual difference factors in the personality domain, for example those targeting positive emotionality, could improve the amount of explained variance between individual difference factors and physiological measures. Ultimately, this might contribute to a more dimensional framework for understanding the mechanisms behind individual differences in fear acquisition.

Study 2: Brief introduction
Surprisingly, the neurocognitive processes underlying an association between negative emotionality and the ability to discriminate signals of danger and safety remain largely unknown to date 29 . A number of studies have investigated neural associations with trait anxiety 52,72-75 , intolerance of uncertainty 52,72-75 or neuroticism 52,72-75 , but studies integrating fMRI results with concurrently acquired psychophysiological measures and measures of negative emotionality in one and the same study are rare or even non-existent in the field, as reviewed in 29 .
Therefore, we aim to address this fundamental gap in the literature and extend our findings from Study 1 by exploring the neuro-functional mechanisms potentially underlying the specific association between the STAI-T score and CS+/CS− discrimination in SCRs observed in Study 1. To achieve this aim, we re-analyzed data from a large pre-existing sample (N = 113).
Of note, recording startle responses in the MR scanner has been difficult due to technical challenges, which we have only very recently been able to overcome 26 . To date, we do not have sufficiently large samples with simultaneous recordings of startle and BOLD-fMRI to follow up on the association between the IUS and CS+/CS− discrimination in startle.

Study 2: Methods
Participants. Study 2 is based on a pre-existing dataset. One hundred and twenty-four participants were included in Study 2. Participants were recruited from a large screening sample described in 76 in which they had been screened by a psychologist for neurological disorders using the M.I.N.I. interview 77 as a diagnostic tool. Any current or prior psychiatric or neurological disorder or self-reported abuse of illegal drugs led to exclusion. Participants were re-invited for two separate studies based on exposure to life adversity in order to study its impact on a post-extinction manipulation (i.e., reinstatement; an experimental phase not included in the analyses of the current manuscript and published elsewhere 78 , first study also in (Scharfenort and Lonsdorf 79 ), second unpublished). These two studies employed identical experimental protocols (including experimenter) for fear acquisition, and thus participants were pooled across both studies into Study 2 of this manuscript.
Out of the 124 participants, eleven had to be excluded due to technical issues (N = 6), pathological anatomy (N = 2) or missing items on the STAI-T (N = 3), which resulted in 113 participants included for fMRI, rating and SCR analyses of fear acquisition. The final sample included 44 females and 69 males, ranging in age between 19 and 34 years. On average, participants were 25 ± 3.5 (SD) years old. All participants had normal or corrected-to-normal vision.

Questionnaires. Participants filled in a batch of questionnaires prior to the experiment. This batch included (1) questions to obtain demographic information, (2) the state scale of the State-Trait Anxiety Inventory 30 , and (3) the NEO-FFI 31,57 . The trait scale of the State-Trait Anxiety Inventory (STAI-T) was already acquired within the context of the screening sample, and based on the results obtained in Study 1, the STAI-T is of main interest here.
Overall reliability of the STAI-T was high in the final sample as indicated by Cronbach's α = 0.93. Notably, a smaller range of STAI-T scores was obtained in Study 2 as compared to Study 1. STAI-T scores ranged between 20 and 59 with a mean of 35 ± 9 (SD).
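For readers less familiar with the reliability index reported above, the following is a minimal sketch of how Cronbach's α can be computed from an item-by-respondent score matrix. The function name and the toy data are illustrative assumptions and do not reproduce the STAI-T data or the software used in this study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the respondents' sum scores.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): three respondents answering three items.
toy_scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(cronbach_alpha(toy_scores))
```

In the toy data all items rank the respondents identically, so the items are perfectly consistent and α reaches its maximum.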
After completing the experiment (i.e., after fear acquisition training in the MR-environment), participants filled in a post-experimental awareness questionnaire 23 and results were orally confirmed with the experimenter. Participants were asked to estimate the total number of received electrotactile and other experimental stimuli, as well as about perceived CS-US contingencies during the experiment (first as free recall, then forced choice). Consequently, participants were classified as either aware (N = 101, able to correctly report CS-US contingencies in free recall and/or forced choice) or unaware (N = 12, unable to report correct CS-US contingencies across questions).
Instructions. As in Study 1, participants were not instructed with respect to the CS-US contingencies or the learning element of the study.
Visual material. Two different white fractals presented on a grey background (RGB [230,230,230], 340 × 320 pixel, resolution: 1,024 × 768) served as conditioned stimuli (CSs; duration: 6-8 s, mean: 7 s). A white cross in the middle of the grey background screen served as inter-trial interval (ITI; duration: 10-16 s; mean: 13 s). One of the fractals (CS+) co-terminated with the unconditioned stimulus (US) during all fear acquisition training trials (100% reinforcement rate), whereas the other fractal did not (CS−). A 100% reinforcement rate was chosen to render it likely that all participants learn the association between CS+ and US within the 14 presentations. To allow for a differentiation between CS+ and US-related neural activity despite this high reinforcement rate, the CS+ duration was jittered between 6 and 8 s (mean 7 s). Allocation of the fractals to CS+/CS− was counterbalanced over all subjects.
Electrotactile US. Similar to Study 1, the US consisted of a train of three electrotactile stimuli (interval 50 ms, duration 10 ms). The US was administered through a surface electrode on the dorsal part of the right hand via a DS7A electrical stimulator (Digitimer, Welwyn Garden City, UK). Before the experiment started, intensity was calibrated individually to a maximum tolerable level with a mean US intensity ± SD of 7.18 ± 4.47 mA; see Study 1 for details on the US calibration procedure.
Procedure. Fear acquisition training occurred between 1 and 6 pm. As mentioned, experimental phases on the subsequent day (extinction, reinstatement and reinstatement test) are not of interest to the current manuscript. Participants were instructed outside the MR-environment. They were instructed to attend to the visual stimuli on the screen; no instruction with respect to CS-US contingency was given.

CS-US awareness. CS-US awareness was assessed as in
After positioning the participant within the MR-environment, two skin conductance recording electrodes were attached, as well as one stimulation electrode for US delivery. When positioned in the MR-scanner, the US calibration procedure was started. Next the actual experiment started with a CS habituation phase. Both CSs were explicitly unreinforced and presented seven times each. In the subsequent uninstructed fear acquisition training phase of interest, both CSs were presented 14 times each.
Presentation of all stimuli was controlled using Presentation Software (NeuroBehavioral Systems, Albany, California, USA). After completing the experiment for that day, thus after fear acquisition training, participants completed the post-experimental awareness questionnaire.
Subjective data recording-subjective ratings. Subjective ratings were acquired retrospectively, i.e., after all fear acquisition training trials. Participants indicated their level of stress/fear/tension elicited by the preceding CS+ and CS− presentations on a 25-stepped visual analog scale (VAS, anchored at 0 and 100). Ratings of participants who failed to confirm their rating for both CS types were set to missing values (N = 9). When a participant was missing either the CS+ or the CS− rating, the respective rating was replaced with the mean CS+ or CS− rating of all valid responses of the other participants on that rating (number of replaced values: CS+ = 7, CS− = 4).
Similar to what has been described previously in 79 , SCRs were recorded using a BIOPAC MP100 amplifier (BIOPAC Systems Inc., Goleta, California, USA). Ag/AgCl electrodes were placed on the palmar side of the left hand on the distal and proximal hypothenar. Data were processed and scored according to published guidelines 24 . Data were down-sampled to 10 Hz. The phasic SCRs to the CS onsets were manually scored off-line using custom-made software. SCR amplitudes (in µS) were scored as the first response initiating 0.9-4.0 s after CS onset 24 . To normalize the distribution, the SCRs were log transformed 80 and range-corrected by division through an individual's maximum response amplitude 81 .
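The SCR normalization described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring software used in the study: the log transform is assumed to be ln(1 + amplitude), the range correction is assumed to be applied after the log transform using the participant's largest transformed response, and CS discrimination is assumed to be the mean CS+ response minus the mean CS− response. All function names and values are hypothetical.

```python
import math
from statistics import mean


def transform_scr(amplitudes_us):
    """Log-transform raw SCR amplitudes (in microsiemens) of one participant,
    then range-correct by that participant's maximum transformed response."""
    logged = [math.log(1.0 + a) for a in amplitudes_us]  # assumed ln(1 + x)
    max_response = max(logged)
    if max_response == 0.0:
        # Participant without any scorable response: keep all zeros.
        return [0.0 for _ in logged]
    return [value / max_response for value in logged]


def cs_discrimination(cs_plus, cs_minus):
    """Differential responding: mean CS+ response minus mean CS- response."""
    return mean(cs_plus) - mean(cs_minus)


# Hypothetical raw amplitudes: first three trials CS+, last three trials CS-.
raw = [0.80, 0.50, 0.60, 0.10, 0.00, 0.20]
corrected = transform_scr(raw)
print(cs_discrimination(corrected[:3], corrected[3:]))
```

Range correction places every participant's responses on a 0 to 1 scale, which reduces the influence of between-person differences in overall response magnitude on the discrimination score.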

BOLD-fMRI. MRI data acquisition and preprocessing have been previously described in Scharfenort and Lonsdorf 79 . MRI data were acquired on a 3-T MR-scanner (MAGNETOM Trio, Siemens, Germany) using a 32-channel head coil. Functional data were obtained using an echo planar imaging (EPI) sequence (TR = 2460 ms, TE = 26 ms). For each volume, 40 slices with a voxel size of 2 × 2 × 2 mm (1 mm gap) were acquired sequentially. Structural images were obtained using a T1 MPRAGE sequence. Preprocessing and analyses were performed using standard procedures in SPM8 (Wellcome Trust Centre for Neuroimaging, UCL, London, UK). Preprocessing included coregistration to the individual structural image, realignment, normalization to group-specific templates (created via the DARTEL algorithm 82 ), as well as smoothing (6 mm FWHM).

Statistical analyses.
For fMRI data, four effects-of-interest regressors were built at the first level (i.e., early and late half of the acquisition trials for CS+ and CS−) as well as ten nuisance regressors (BOLD responses for CS+ and CS− during habituation, USs, ratings, and six movement parameters derived from realignment). All regressors of interest were modeled as stick functions and time-locked to stimulus onset for acquisition analyses. The general linear model was used to compute regression coefficients (beta values) for the regressors in each voxel. CS discrimination contrasts (CS+ > CS−; CS− > CS+ for the full acquisition phase) were estimated on the first level and taken into the second-level analysis employing voxel-wise regression analyses with the STAI-T. The main effect of task was estimated in a full factorial model with two regressors for both CS+ and CS− (early and late) and contrasts of interest were set to CS+ > CS− and CS− > CS+ covering the full phase.
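How the full-phase CS discrimination contrasts combine the four effects-of-interest regressors can be sketched as a weighted sum of beta values. The contrast weights, regressor order, and beta values below are illustrative assumptions; the actual contrast vectors specified in SPM8 for this study are not reported in the text, and nuisance regressors (implicitly weighted zero) are omitted.

```python
# Assumed order of the four effects-of-interest regressors.
REGRESSORS = ["CS+ early", "CS+ late", "CS- early", "CS- late"]

# Full-phase contrasts: early and late halves are weighted equally.
CS_PLUS_GT_CS_MINUS = [1, 1, -1, -1]
CS_MINUS_GT_CS_PLUS = [-1, -1, 1, 1]


def contrast_value(betas, weights):
    """Weighted sum of beta estimates for one voxel (contrast estimate)."""
    if len(betas) != len(weights):
        raise ValueError("betas and weights must have the same length")
    return sum(b * w for b, w in zip(betas, weights))


# Hypothetical beta values of a single voxel, in the order of REGRESSORS.
voxel_betas = [0.8, 0.6, 0.3, 0.2]
print(contrast_value(voxel_betas, CS_PLUS_GT_CS_MINUS))
print(contrast_value(voxel_betas, CS_MINUS_GT_CS_PLUS))
```

The resulting per-participant contrast values are what a second-level, voxel-wise regression on the STAI-T score would take as its dependent variable.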
ROI analyses were based on key areas for fear conditioning [primary ROI: amygdala; secondary ROIs: hippocampus, dorsal anterior cingulate cortex/dmPFC (dmPFC), pallidum/putamen, ventromedial prefrontal cortex (vmPFC), thalamus, insula (Fullana et al. 83 ; Sehlmeyer et al. 82 )]. Masks were derived from the Harvard-Oxford cortical and subcortical structural atlases; with a probability threshold of 0.7. As no vmPFC and dmPFC mask is provided by the Harvard-Oxford atlases, we used a box (20 × 16 × 16 mm) centered on coordinates from a previous (independent) study [vmPFC: x, y, z: 0, 40, − 12; dmPFC: 0, 43, 29 84 ]. Due to strong a-priori predictions with respect to the amygdala and the use of additional regions as secondary ROIs, correction was performed separately for each ROI. A statistical threshold of p < 0.05 (FWE corrected within the ROI) was considered significant.
Parameter estimates from peak voxels were extracted from the first (individual) level. Correlation tests were performed for CS discrimination as well as CS+ and CS− specific responding in SCRs and ratings with scores on the STAI-T. For ratings and SCRs, correlational analyses were also carried out with the parameter estimates to explore the link between physiological responding and brain activation in areas linked to the STAI-T. Exploratory correlational analyses with the STAI-T and awareness as well as with US intensity are reported in the Supplementary Information.

Study 2: Results
Main effects of task. Successful fear acquisition was evident by significantly larger SCR amplitudes for the CS+ than for the CS− during the full fear acquisition phase [t(112) = 10.14, p < 0.001, d = 0.95]. Similarly, post-acquisition fear ratings were higher for the CS+ as compared to the CS− [t(102) = 15.65, p < 0.001, d = 1.52]. On a neuro-functional level CS-discrimination (CS+ > CS−; Table 1) was reflected in areas typically activated in fear acquisition (i.e., thalamus, amygdala, dmPFC/dACC, insula/frontal operculum and putamen/pallidum). Stronger activation to the CS− than the CS+ was observed in the vmPFC (T-maps are available on neurovault https://identifiers.org/neurovault.image:305007). Note that when correcting for the number of ROIs (i.e., 7) the hippocampus would not meet the corrected significance threshold of 0.007.

Associations of CS+/CS− discrimination in SCRs and ratings with STAI-T scores. We explored associations between the STAI-T score and CS+/CS− discrimination in SCRs, post-experimental ratings as well as in BOLD-fMRI. In contrast to what was observed in Study 1, the STAI-T score was not significantly associated with CS+/CS− discrimination in SCRs or ratings during fear acquisition training in univariate correlation analyses (SCR: r = − 0.05, p = 0.59; ratings: r = − 0.15, p = 0.13), as illustrated in Fig. 4. Despite the absence of differences in CS discrimination, we explored possible associations with CS+ or CS− responding individually. STAI-T was weakly (r = 0.31, p BH < 0.01) and positively correlated with CS− responding (i.e., higher CS− ratings in individuals with higher STAI-T scores) in ratings, but not with either CS+ or CS− responding in SCR.

Neuro-functional associations of CS+/CS− discrimination with STAI-T scores.
On a neural level, however, higher STAI-T scores were associated with significantly stronger CS+/CS− discrimination-related activation of the right amygdala, the putamen (bilaterally) and the thalamus (bilaterally) during fear acquisition training (Table 2). Robustness checks revealed that a model including the covariate 'life adversity' (as participants in Study 2 were initially recruited based on this variable) yielded comparable results. More precisely, statistical values differed only at the last decimal place, with the exception of the right thalamus, which does not meet the 0.05 threshold when including the covariate 'life adversity' (data not shown). Of note, these areas are also significantly implicated in CS+/CS− discrimination irrespective of STAI-T in this sample (see above).
Additional exploratory analyses revealed that, of these areas, CS discrimination in SCRs was only positively associated with peak voxel activation in the left putamen (r = 0.240, p BH = 0.028) and with the first cluster in the left thalamus (r = 0.322, p BH = 0.002). The latter might be driven by a positive association between SCRs for the CS+ and the left thalamus (r = 0.315, p BH = 0.002), whereas the CS− does not show a significant association. A graphical representation of these associations can be found in the Supplementary Material (Supplemental Figure S1). Ratings for CS discrimination, the CS+ and the CS− were not significantly associated with peak voxel activation in any of our ROIs (all p BH > 0.512).
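The corrected p-values (p BH) reported for these correlational analyses can be illustrated with a short sketch, assuming that "BH" denotes the Benjamini-Hochberg false discovery rate procedure. The function name and input p-values are hypothetical and do not correspond to the tests run in this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted
    # values stay monotone (each is capped by the one ranked above it).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical uncorrected p-values from four correlation tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A test is then considered significant when its adjusted p-value falls below the chosen false discovery rate (e.g., 0.05).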
An exploratory analysis testing associations between STAI-T and US intensity scores as well as associations with awareness are reported in the supplementary information.

Comparing the STAI-T distributions across both samples (behavioral study vs. fMRI study).
To explore whether the distributions of STAI-T values in Study 1 and Study 2 differ (see Fig. 6A), a two-sample Kolmogorov-Smirnov test was performed. This test indicates that the two samples come from different distributions, D = 0.219, p < 0.001. As can be derived from Fig. 6B, the fMRI sample includes substantially more individuals with low STAI-T values (i.e., < 50) as compared to the behavioral sample.
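The test statistic D of the two-sample Kolmogorov-Smirnov test is the largest absolute difference between the two empirical cumulative distribution functions. The following minimal sketch computes D only; it does not compute the p-value, and the score lists are hypothetical rather than the STAI-T data of either study.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute difference between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both empirical CDFs are step functions that change only at observed
    # values, so it suffices to evaluate the difference at those values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample A <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample B <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical questionnaire scores for a behavioral and an fMRI sample.
behavioral = [28, 35, 41, 47, 52, 60, 66, 71]
imaging = [22, 27, 30, 33, 36, 40, 45, 51]
print(ks_two_sample_statistic(behavioral, imaging))
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap between the samples).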

Study 2: Interim summary
Results of Study 2 show a positive association of the STAI-T score with CS+/CS− discrimination on a neuro-functional level in the right amygdala, the putamen (bilateral) and the thalamus (bilateral). These regions have all been implicated in the ability to discriminate signals of danger from signals of safety, both in the literature 83,85 and in the paradigm and sample reported here. Of note, the amygdala is a core region implicated in fear learning [86][87][88][89][90] and has been previously linked to individual differences in discriminating signals of danger from signals of safety 73 . This previous work has often not included the simultaneous acquisition of both autonomic (i.e., SCRs) and neuro-functional measures in the same experimental phase 91,92 , while others have recorded both measures during fear acquisition 72,74 or fear expression 73 . Importantly, also in domains of threat processing, similar positive associations between STAI-T score and amygdala reactivity as reported here have been observed [93][94][95][96] .
Table 1. Neural activation reflecting CS+/CS− discrimination during fear acquisition (main effects of task) in the defined ROIs for left (L) and right (R) regions separately. Cluster size (k), MNI coordinates (x, y, z), and statistical values for uncorrected (0.001uc) and family-wise error corrected p-values in the ROI using small-volume correction (SVC FWE ) are reported. Note that we used seven ROIs with the amygdala being the primary ROI and the other six being secondary ROIs. Correcting for multiple comparisons related to these seven regions would yield a corrected significance threshold of 0.007 (0.05/7), which the right and left hippocampus would not meet. Coordinates are in MNI space.
In addition, our work provides evidence for an involvement of the amygdala in individual differences underlying the strength of fear learning beyond the average (i.e., a general role in fear acquisition and expression). This is important as evidence suggesting the role of the amygdala in fear acquisition has been questioned 83 and also as there is accumulating evidence that aggregated results across a group do not necessarily generalize to individuals e.g., 97,98 .
Despite the observed associations between the STAI-T score and CS+/CS− discrimination on a neural level, we did not observe a significant association between CS+/CS− discrimination in SCRs and the STAI-T score. This is twice the number of subjects included in Study 2 and hence the non-replication of SCR results should be treated with caution. Furthermore, we highlight that the distributions of STAI-T scores in Study 1 and Study 2 (see Fig. 6) differ significantly. The distribution in Study 2 is substantially more left skewed. More precisely, Study 1 (behavioral) contained more individuals with a high STAI-T score (> 60) than Study 2 (fMRI), as evident from the much flatter right tail of the density in Study 2. In Study 1 scores reach values up to 76, whereas in Study 2 the maximum score is 59. Moreover, the imaging study (Study 2) includes proportionally more individuals with STAI-T scores falling in the lower quartile and thus in a group that would be characterized as having no or low anxiety (STAI-T < 37). Hence, we call for caution when interpreting this null finding as a replication failure of findings in Study 1. Instead, sample bias (possibly originating from highly anxious individuals not signing up for fMRI studies) may, in addition to the differences in power between the studies, also contribute to the different results in both studies. Hence, we replicate a recent report of a profound sampling bias in MRI studies in a large set of pooled studies, which showed that participants in MRI studies had lower trait anxiety scores compared to participants in behavioral studies 99 . This implies that good characterization and reporting of study populations and experimental parameters is highly important, especially in individual difference research 29 .

General discussion Study 1 and 2
The overarching aim of both studies presented here was to investigate the putatively specific and shared variance between three commonly used questionnaires associated with negative emotionality and their relation to conditioned responding measured during a fear conditioning experiment in multiple units of analysis (ratings, skin conductance, startle, BOLD-fMRI). These relations were investigated in two large samples (N Study1 = 356; N Study2 = 113). The three questionnaires selected for this study (the trait scale of the STAI, the neuroticism scale of the NEO-FFI and the Intolerance of Uncertainty Scale) were selected because of the abundance of literature on them in the field of individual differences in fear conditioning research, for a review see 29 . These three questionnaires share a substantial part of their variance, here operationalized as a latent 'negative emotionality' variable. Our results hint at potentially specific associations between the STAI-T (Study 1) and the discrimination between cues signaling danger (CS+) or safety (CS−) in the arousal-related outcome measure of skin conductance responding, and between intolerance of uncertainty and discriminating danger and safety in the valence-related outcome measure of startle responses. These results should, however, be interpreted with caution, as overfitting in this particular study cannot be excluded. Importantly, we also find support for the existence of a negative emotionality latent variable that has an effect on general fear learning.
Notably, univariate correlational analyses that do not account for shared variance between measures of emotional negativity revealed comparable negative associations of all three questionnaires with CS+/CS− discrimination in SCRs, with the STAI-T showing the strongest and significant association. Note that associations with NEO-FFI-N and IUS were not statistically significant, but correlation coefficients were comparable to the one derived from the STAI-T/SCR association. Results derived from the multivariate path model approach imply that the observed univariate associations of NEO-FFI-N and IUS with SCR CS+/CS− discrimination might be fully explained by their shared variance with STAI-T. Of note, the observed association between the STAI-T and CS+/CS− discrimination was negative in Study 1 and Study 2 (i.e., high scores are associated with less discrimination), although clearly non-significant in Study 2, while others have observed positive associations in small samples 73,91 . There is a plethora of potential and plausible reasons underlying these seeming discrepancies. As discussed in a recent review 29 , these include a number of procedural factors including high vs. low reinforcement rate 100 , potency of the experimental situation 101 , instructions 102 , additional triggered outcome measures such as startle that impact on the learning process 65 , as well as sample biases or exclusion of specific participants 56,99 , to name just a few. Hence, it is possible that neither result is necessarily 'wrong', as different associations may unfold depending on the specific sample, experimental context or boundary conditions. While the large sample in Study 1 (in comparison to the typical sample size in individual difference studies in the field, systematically summarized in 29 ) should contribute to trust in our findings, systematic investigations are highly warranted.
Hence, we urge authors to focus more on procedural details, demands and related processes, and potential sampling bias in future studies to explore whether this may facilitate mechanistic conclusions 29 .
Of note, in the substantially smaller Study 2, we do not observe an association between the STAI-T score and CS+/CS− discrimination in SCRs or ratings. Our sample size calculation revealed that Study 2 was most likely underpowered to detect an association between CS+/CS− discrimination and STAI-T scores (given the correlation coefficient observed in Study 1) and in addition represents individuals sampled from a different distribution, likely caused by the nature of the study (i.e., fMRI). Hence, we replicate a recent report suggesting that samples for fMRI and behavioral studies are drawn from different populations 99 .
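The logic of such a sample size calculation can be sketched with the Fisher z approximation for a two-sided test of a correlation coefficient. This is an illustrative approximation and not necessarily the procedure used for the calculation mentioned above; the correlation of 0.2, the alpha level and the power in the usage lines are hypothetical placeholders, not values from this study.

```python
import math
from statistics import NormalDist


def required_n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect a correlation r (two-sided test),
    based on the Fisher z transformation of r."""
    if not 0.0 < abs(r) < 1.0:
        raise ValueError("r must be non-zero and between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value
    z_power = NormalDist().inv_cdf(power)              # quantile for power
    fisher_z = math.atanh(abs(r))                      # Fisher z of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


# Hypothetical effect sizes for illustration.
print(required_n_for_correlation(0.2))
print(required_n_for_correlation(0.1))
```

Because the required N grows roughly with the inverse square of the correlation, halving the expected effect size approximately quadruples the sample needed.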
Given that nearly all published studies in the literature (with few exceptions 103 ) fall well below the sample size in Study 1 (N = 356), the null findings across outcome measures in these studies 72,74,[104][105][106][107] are difficult to interpret. More large-N studies are needed to determine whether these different results originate from fluctuations around the null (i.e., absence of a true effect) or whether there is indeed a true effect.
In Study 2, we did observe a positive association between the STAI-T and CS− responding in subjective ratings (i.e., higher CS− ratings in individuals with higher scores). This may be in line with the non-significant positive trend observed in univariate analyses for CS− ratings and STAI-T in Study 1. Note that for all three questionnaires/scales in Study 1 (i.e., STAI-T, NEO-FFI-N, IUS) the association between the respective questionnaire/scale and CS− responding in ratings reached trend level only. A link between negative emotionality and CS− responding is in line with the suggestion of deficient safety signal processing in individuals with affective disorders or those at risk 108 ; previous results in a similarly large sample 103 ; as well as previous reports on associations between STAI-T and deficits in safety signal (e.g., CS−) processing 103,[109][110][111] . Results should, however, be treated with caution, as this association was only observed in the smaller Study 2 and was only a trend in Study 1.
In Study 2, however, CS+/CS− discrimination in a number of brain areas of key relevance to fear processing and expression are positively associated with the STAI-T score as well as its sub-components (amygdala, bilateral putamen, bilateral thalamus). The construct of trait anxiety as assessed by the STAI-T has been criticized in the literature for representing a psychometrically inhomogeneous scale 112 , capturing facets of both anxiety and depression 33,35,112,113 . Factor analyses on the single items of these questionnaires in larger well powered studies could address this question further. Exploratory factor analyses performed on the items included in Study 1 are not included here as the results likely represented over-fitting in this particular study sample (i.e., items derived from one scale loaded primarily on a single factor and few plausibly expected cross loadings between similar items across scales were observed). Note that the sample size of Study 1 in relation to the number of items was too small for this purpose and hence, data are not shown and included. Future work in appropriately sized samples should focus on unraveling cross-questionnaire factors that may inform us on the specific mechanisms and components underlying the association with negative emotionality and the discrimination between danger and safety cues (CS+/CS− discrimination) in fear acquisition.
It is also noteworthy that the selection of questionnaires related to negative emotionality for Study 1 was motivated exclusively by evidence from the available literature in the field of human fear conditioning 29 , in which the three selected measures have commonly been used (albeit in isolation). Hence, future work should extend these findings by targeting additional measures not included in this report, such as specific measures of depression, to further unravel a potentially mechanistic component of negative emotionality that drives the link between dispositional negativity and fear learning on an autonomic level.
Furthermore, it is noteworthy that we observe a specific association not only between differential SCRs and the STAI-T score but also between fear-potentiated startle (i.e., CS+ > CS− in startle responding) and intolerance of uncertainty scores. Although this potentially interesting finding awaits replication, it is noteworthy that others have also observed intolerance of uncertainty scores to be negatively associated with startle responding during the uncertain but not the certain threat condition 114 , suggesting that intolerance of uncertainty was not predictive of general aversive responding but specific to responses to uncertain aversiveness.
Importantly, although our work provides clear evidence for substantial shared variance between the three questionnaires, the specific dissociations in outcome measures and questionnaire scores (i.e., specific association of STAI-T with CS-discrimination in SCRs, and of IUS with CS-discrimination in FPS) may provide insights into the underlying processes. Different outcome measures capture and reflect diverse aspects and represent unique sources of variance in fear processing 23 and emotional processing per se 115,116 . SCRs are thought to reflect general arousal. Startle, in turn, is considered a rather fear-specific index 23 that by definition reflects an enhanced reflexive response towards an unexpected, and thereby uncertain, event. Hence, both results may carry complementary mechanistic information corresponding to multi-causal vulnerability in fear and anxiety. As it was technically not yet feasible to implement combined EMG-fMRI measurements at the time of data acquisition, future studies profiting from this novel option 26,117 are warranted to investigate the neurobiological mechanisms underlying the specific association between intolerance of uncertainty and FPS. Our results clearly highlight the value of multimodal work and multivariate analysis tools and suggest that 'compound profiles' that integrate multiple input and outcome measures, and hence potentially capture multiple processes, may in the long run prove useful from a 'personalized medicine' perspective. Yet, the associations observed here between measures of negative emotionality and physiological responding are not large, nor even of medium size. This has to be kept in mind when discussing their potential for biomarker development.
Yet, we speculate that a multivariate composite of different response patterns ('profile') may show stronger associations, which may hold potential for the development of clinically useful products in the long run. Our results suggest that negative emotionality in general, and trait anxiety specifically, may serve as a potential starting point to identify individuals with difficulties in discriminating signals of threat from signals of safety. These individuals might benefit from tailored discrimination training programs. Future well-powered multimodal studies including a wider range of personality aspects (e.g., depression, optimism) are needed to ultimately obtain more comprehensive personality profiles that might allow specific individuals to be linked to specific interventions.
In sum, it is fundamental to uncover factors, and their potential interactions, that contribute to individual risk and resilience to pathological fear, although fear conditioning protocols may rather model adaptive fear 118 . Hence, improved understanding of the personality-related and neurobiological processes underlying individual differences in experimental fear learning can be expected to translate into improved understanding of how adaptive responding to threats turns into maladaptive fear responding 119,120 . It will thus be important to extend the investigation of individual differences and the underlying neurobiology beyond experimental fear acquisition to tests focusing on the long-term retention of fear and extinction memory (i.e., return of fear 121 ), and ultimately to clinical populations. We provide a very first step towards the overarching aim of ultimately improving our mechanistic understanding of pathological fear and emotional responding. We provide initial insights into inter-individual differences in fear processing by finding support for specific associations between trait anxiety and physiological responding, as well as for a general link between negative emotionality and fear learning, using multivariate approaches across units of analysis in two samples.