Evaluating an internet-delivered fear conditioning and extinction protocol using response times and affective ratings

Pavlovian fear conditioning is widely used to study mechanisms of fear learning, but high-throughput studies are hampered by the labor-intensive nature of examining participants in the lab. To circumvent this bottleneck, fear conditioning tasks have been developed for remote delivery. Previous studies have examined remotely delivered fear conditioning protocols using expectancy and affective ratings. Here we replicate and extend these findings using an internet-delivered version of the Screaming Lady paradigm, evaluating the effects on negative affective ratings and response time to an auditory probe during stimulus presentation. In a sample of 80 adults, we observed clear evidence of both fear acquisition and extinction using affective ratings. Response times were faster when probed early, but not later, during presentation of stimuli paired with an aversive scream. The response time findings are at odds with previous lab-based studies showing slower as opposed to faster responses to threat-predicting cues. The findings underscore the feasibility of employing remotely delivered fear conditioning paradigms with affective ratings as the outcome. Findings further highlight the need for research examining optimal parameters for concurrent response time measures or alternate non-verbal indicators of conditioned responses in Pavlovian conditioning protocols.

Fear conditioning and extinction are widely used experimental protocols in behavioral science important for studying associative learning and anxiety 1,2 . Processes related to threat-induced associative learning are generally held to be theoretically important for understanding the emergence and treatment of anxiety- and trauma-related psychopathology 3 . The popularity of the fear conditioning paradigm is partly due to the possibility of translating findings between humans and animals, which informs the understanding of neurobiological mechanisms of associative fear learning 4,5 .
During fear conditioning, a neutral cue (for example sounds, geometrical figures or pictures of human faces) is paired with an aversive unconditioned stimulus (US), often an electrical shock or an aversive sound 2 . Through associative learning, the previously neutral cue becomes a conditioned stimulus (CS+) predictive of the aversive outcome, and starts to elicit conditioned defensive responses (CRs). This is adaptive in an evolutionary sense, but exaggerated threat reactions can contribute to pathological anxiety 6,7 . In human research, experimental protocols typically include a control stimulus (CS−) never paired with the aversive outcome and often conceptualized as a conditioned safety signal 2 . Initially considered a neutral control, response to the CS− is of theoretical interest as a measure of safety learning or conditioned inhibition 8 . During extinction, the CSs are presented again, but omitting the US, typically leading to a decrease in CR expression. Of relevance for clinical anxiety disorders, deficits in extinction have been related to anxiety-related psychopathology 6 , and trait-anxiety 9 . Furthermore, extinction is commonly used as an experimental model for exposure-based psychological treatment and lab-based extinction studies can be translated to novel treatment interventions in clinical samples 10,11 .
To study fear acquisition and extinction, the CR needs to be quantified, and for this purpose various methods have been used. In lab-based studies, the CR is most often quantified through freezing behavior in rodents, and in humans through psychophysiological measurements, such as skin conductance responses (SCR), fear-potentiated startle (FPS), heart-rate changes or pupil dilation, and less commonly through response time measurements. Additionally, self-report based measures are often used, most commonly ratings of US-expectancy during stimulus presentation or affective ratings of experimental stimuli pre and post learning phases 2,12,13 .

Methods
Participants. Eighty adults (61 women, 14 men, 5 other/do not want to specify; mean (SD) age: 35.5 (13.5), range: 18-71 years) were recruited from the general population using social media advertisements. Participants were required to be above 18 years of age, have no hearing impairments, and normal or corrected to normal vision, which was assessed by self-report. Participants were reimbursed with a gift card valued 200 SEK (approximately $25) on completion. The study was approved by the Swedish Ethical Review Authority and conducted in accordance with the Helsinki Declaration. All participants actively provided informed consent to participate in the study.
Materials. Consent and questionnaire data were collected and managed using REDCap (Research Electronic Data Capture), a secure, web-based software platform designed to support data capture for research studies 22,23 , hosted at Uppsala University. To assess trait anxiety, participants completed the Spielberger State-Trait Anxiety Inventory-Trait version (STAI-T) using the REDCap server 24 . The fear conditioning experiment, Screaming Internet Lady, was delivered online using the PsyToolkit platform 20,21 . For experimental stimuli, images of two female faces were selected from the FACES database 25 . Neutral facial expressions were used as conditioned stimuli (CS) and fearful facial expressions in conjunction with a fearful scream served as the unconditioned stimulus (US). Reaction time probes were 440 Hz tones with a duration of 200 ms.
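As an illustration of the probe stimulus described above (a 440 Hz tone lasting 200 ms), the following is a minimal sketch of how such a tone could be synthesized with the Python standard library. The sample rate, amplitude, bit depth and the function name are assumptions made for the example; the text does not describe how the tone used in the experiment was produced.

```python
import math
import struct
import wave


def write_probe_tone(path, freq_hz=440.0, duration_ms=200,
                     sample_rate=44100, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to `path`; return the frame count.

    freq_hz and duration_ms follow the text (440 Hz, 200 ms).
    sample_rate and amplitude are assumptions, not taken from the paper.
    """
    n_frames = int(sample_rate * duration_ms / 1000)
    frames = bytearray()
    for i in range(n_frames):
        sample = amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))  # little-endian int16
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

At the assumed 44.1 kHz sample rate, a 200 ms tone corresponds to 8820 frames.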
Procedure. Participants were recruited through social media advertisements targeting adults (18 years and older) living in Sweden. A web link to a REDCap server with information about the study and registration was included in the advertisement. After actively providing informed consent to participate in the study, subjects answered screening questionnaires. Those passing inclusion and exclusion criteria were provided a link to further questionnaires and the fear conditioning task. Participants completed all questionnaires and experiments through the internet using their own computers and a standard web browser.
Prior to commencing the fear conditioning task, participants were instructed to sit in an undisturbed place and wear headphones. They then individually set their computer sound volume to be unpleasantly loud (but not so loud as to hurt their hearing) using a two-step procedure. First, they listened to a recording of a neutral text and set their sound-level to their preferred listening volume. After this, they listened to a test sound (an alarm-sound) and were instructed to set their computer volume to be markedly unpleasant but bearable when listening to this sound. They were then provided with the instructions to the fear conditioning task: that they were to see faces on the screen, might hear screams, and that they would occasionally hear a tone (the auditory probe) that signaled that they should press the spacebar as quickly as possible. The Screaming Internet Lady fear conditioning task is an online adaptation of the Screaming Lady task 26 , consisting of three phases: acquisition, extinction, and reinstatement. The instruction to press the spacebar as quickly as possible when hearing the probe tone was repeated before the extinction and reinstatement phases in accordance with previous studies 17,18 . Before the acquisition phase they were shown the following instruction: "Now you will start the first task. It takes about 4 min. You will see faces on the screen and may hear screams. Sometimes, you will hear a tone. When you hear the tone, press the spacebar as fast as possible. Click to start the first task. Remember to keep your attention on the screen and be ready to press the spacebar when you hear the tone." And before the extinction phase they were shown the following instructions: "Now, you will start the second task. It takes about 4 min and just like before you press the spacebar as fast as possible when you hear the tone. Click to start the task and be ready to press the spacebar as fast as possible when you hear the tone."
During fear acquisition, participants were presented with two neutral female faces serving as conditioned stimuli (CS), see Fig. 1. The CSs were presented 16 times each in a sequential order. One of the images (CS+) was presented for 4000 ms and always followed by an image of the same woman with a fearful facial expression and a scream for 1000 ms, serving as unconditioned stimulus (US). The scream was played at the intensity set by the participant as described above. The other image (CS−) was presented for 5000 ms and never followed by the US. The two CSs were counterbalanced across participants. CS presentations were interspersed with fixation crosses in the center of the screen for a random amount of time (mean: 2500 ms, range: 2000-3000 ms, in steps of 100 ms). In 12 of the 16 presentations of each CS, a probe tone was played for 200 ms. Six of the probes for each CS were played 500 ms (early probe) and six 2500 ms (late probe) after CS onset and indicated to the participant to press the spacebar as quickly as possible. The omission of the probe in 25% of the trials and the two different timings of the probe were used to make their presentation unpredictable, see Fig. 1. Responses faster than 100 ms or slower than 1500 ms were discarded, similar to previous studies using reaction time as an index of fear conditioning 17,18 .
The fear extinction phase was identical to the fear acquisition, but without the US. Thus, neither CS+ nor CS− were followed by the US during extinction. Reinstatement started with two unsignaled screams (but no fearful face) and then proceeded with 10 trials each of CS+ and CS−, in sequential order, without presentation of the US. The probe tone was played in 8 of the trials for each CS (4 early and 4 late probes).
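The trial structure described above can be summarized in a short sketch. The counts, durations, probe onsets, inter-trial intervals and the 100-1500 ms response window are taken from the text; the random assignment of probe types to trials, the shuffled trial order, and all function and field names are assumptions made for illustration, since the exact sequencing used in the PsyToolkit script is not given here.

```python
import random

CS_DURATION_MS = {"CS+": 4000, "CS-": 5000}    # from the text
PROBE_ONSET_MS = {"early": 500, "late": 2500}  # from the text


def build_trials(rng, n_per_cs=16, n_early=6, n_late=6, reinforce_cs_plus=True):
    """Return a list of trial dicts for one phase (defaults: acquisition)."""
    trials = []
    for cs in ("CS+", "CS-"):
        n_unprobed = n_per_cs - n_early - n_late
        probes = ["early"] * n_early + ["late"] * n_late + [None] * n_unprobed
        rng.shuffle(probes)  # assumption: probe type is randomly assigned to trials
        for probe in probes:
            trials.append({
                "cs": cs,
                "cs_ms": CS_DURATION_MS[cs],
                "us": reinforce_cs_plus and cs == "CS+",
                "probe_onset_ms": PROBE_ONSET_MS[probe] if probe else None,
                # fixation cross: 2000-3000 ms in steps of 100 ms (mean 2500 ms)
                "iti_ms": rng.choice(range(2000, 3001, 100)),
            })
    rng.shuffle(trials)  # assumption: the actual presentation order is not specified
    return trials


def is_valid_rt(rt_ms):
    """Responses below 100 ms or above 1500 ms (or absent) count as missed."""
    return rt_ms is not None and 100 <= rt_ms <= 1500
```

Under these assumptions, `build_trials(random.Random(0))` yields the 32 acquisition trials, and `build_trials(random.Random(0), n_per_cs=10, n_early=4, n_late=4, reinforce_cs_plus=False)` yields the 20 CS trials of the reinstatement phase (the two unsignaled screams that precede them are not modeled).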
Before and after fear acquisition, after extinction and after reinstatement, participants rated how much they agreed from 1 (not at all) to 7 (completely) with feeling calmness, fear, unpleasantness and irritation, while viewing the neutral CS faces. The instructions for the negative affective ratings were as follows: "Rate how much you agree with the following statements. When I view this face, I feel [calm/fear/unpleasantness/irritation]". A composite rating score of the four feelings was created for each rating phase by reversing ratings on calmness and taking the mean of the four ratings.
Figure 1. [...] During CS+ trials the presentation was followed by an aversive sound (scream) accompanied by the same face with a fearful expression. During CS− trials only the neutral expression was shown and no sound was played. During CS presentation, response times were probed after either 500 ms or 2500 ms by playing a brief 440 Hz tone. Participants were instructed to press the spacebar as quickly as possible when the probe was played. Trials during extinction and reinstatement had the same structure but the US (sound and fearful expression) was always omitted. The face images shown in the figure are sample images from the FACES database, but not the same as the ones used in the experiment. The bottom panel illustrates the outline of the entire experiment, including number of trials of each type for each experimental phase.
Participants were sent an e-mail approximately 4-5 weeks after the initial experiment, in which they were invited to complete the follow-up assessment. Reminder e-mails were sent out if they did not complete the assessment. There was large variability in participants' latency to complete the follow-up, and thus time from experiment to follow-up ranged from 5 to 15 weeks. Only affective ratings were collected at the follow-up.
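The composite negative affect score (calmness reverse-scored and combined with fear, unpleasantness and irritation) can be sketched as below. This assumes the usual reversal on a 1-7 scale (reversed = 8 − rating) and an unweighted mean of the four items; the function name is hypothetical.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # rating scale from the text


def composite_negative_affect(calm, fear, unpleasantness, irritation):
    """Mean of the four ratings after reverse-scoring calmness.

    Assumption: reversal is (SCALE_MIN + SCALE_MAX - rating), i.e. 8 - rating.
    """
    for value in (calm, fear, unpleasantness, irritation):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"rating {value} outside {SCALE_MIN}-{SCALE_MAX}")
    reversed_calm = SCALE_MIN + SCALE_MAX - calm
    return mean([reversed_calm, fear, unpleasantness, irritation])
```

With this scoring, a fully calm and otherwise unaffected response (7, 1, 1, 1) gives the scale minimum of 1, and the opposite pattern (1, 7, 7, 7) gives the maximum of 7.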

Missed responses and statistical analysis.
In order to deal with missing responses for the response time measurements the following steps were taken. Subjects with incomplete data were excluded phase by phase according to the following principle: a maximum of 2 missed responses was allowed within each stimulus and probe-timing category, i.e. a maximum of 2 missed responses to early probes for the CS− during the acquisition phase, and so forth. As stated above, in addition to trials where the subject had not given a response, all trials with a response time below 100 ms or above 1500 ms were considered missed responses. In cases where participants had missed responses, but did not meet criteria for exclusion, imputation was performed for each phase separately according to the following principles: if the first trial within a stimulus and probe-timing category was missing, the mean from the first trial of the other categories was imputed; if the last trial within a stimulus and probe-timing category was missing, the mean of the previous two trials was imputed; in other instances, the mean of the preceding and following trial within that stimulus and probe-timing category was imputed. For each of the phases none of the included subjects had more than 4 missed responses in total and the average percentage of missed responses across the entire experiment was low (M = 1.6%; SD = 2.8%; min = 0%; max = 12.5%; n = 61) and similar across phases and stimulus types (see Tables S1 and S2 for details). All statistical analyses were performed in JASP (version 0.14.1).
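The exclusion and imputation rules above can be written out as a small sketch for one participant and one phase. Response times are given per stimulus and probe-timing category, with `None` marking a missed response (including responses outside 100-1500 ms, which should be set to `None` beforehand). The data layout and function names are hypothetical, and the handling of a missing donor value (averaging whatever donors remain) is an assumption, since the text does not cover that case.

```python
def should_exclude(categories, max_missed=2):
    """True if any stimulus x probe-timing category has more than max_missed misses."""
    return any(sum(rt is None for rt in rts) > max_missed
               for rts in categories.values())


def impute_phase(categories):
    """Impute missed responses within one phase.

    categories: dict mapping (stimulus, probe_timing) -> list of RTs in trial
    order, with None for missed responses. Returns a new dict; the input is
    left unchanged and only observed values are used as donors.
    """
    imputed = {}
    for key, rts in categories.items():
        filled = list(rts)
        last = len(rts) - 1
        for i, rt in enumerate(rts):
            if rt is not None:
                continue
            if i == 0:
                # first trial missing: mean of the first trial of the other categories
                donors = [other[0] for k, other in categories.items() if k != key]
            elif i == last:
                # last trial missing: mean of the previous two trials
                donors = rts[max(0, i - 2):i]
            else:
                # otherwise: mean of the preceding and following trial
                donors = [rts[i - 1], rts[i + 1]]
            donors = [d for d in donors if d is not None]  # assumption: skip missing donors
            filled[i] = sum(donors) / len(donors) if donors else None
        imputed[key] = filled
    return imputed
```

A participant would first be checked with `should_exclude` for the phase in question, and `impute_phase` would be applied only to those who are retained.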

Results
US ratings and contingency awareness. As expected, participants rated the US high on negative affect (M = 5.7; SD = 1.0; n = 79, on the scale 1-7). Analysis of contingency ratings showed that participants were largely contingency aware. The average contingency rating for CS+ (M = 67.4%; SD = 32.4%; median = 75%; n = 80) was substantially higher (t(79) = 12.96; p < 0.001) than for CS− (M = 8.6%; SD = 19.6%; median = 0%; n = 80), with 62 subjects (77.5%) reporting a rate of screams at least 20 percentage points higher following the CS+ than following the CS−.
[...] Table 1. Furthermore, simple main effects analyses showed main effects of Phase both for the CS+ category, where ratings increased from the pre- to the post-acquisition measurement (F(1,79) = 42.24; p < 0.001), and for the CS−, where ratings decreased (F(1,79) = 4.66; p = 0.034), see Table 1. Thus, post-acquisition differences in negative affective ratings are driven not only by increases to the CS+ but also by decreases to the CS−.
Figure 2. Affective ratings to conditioned stimuli (CS). Negative affective ratings were similar for stimuli paired with aversive outcome (CS+) and stimuli never paired with aversive outcome (CS−) pre acquisition, but differed after acquisition (acq), with increased negative ratings of CS+ and reduced negative ratings of CS−. Post extinction (ext), ratings of CS+ and CS− still differed, but to a lesser degree, whereas reinstatement (reinst) increased negative ratings of both CS. Affective ratings is a composite score consisting of the mean rating (1 [not at all] to 7 [totally]) of how much participants agree with feeling fear, discomfort, irritation and calm (reversed score). Points and error-bars indicate means and SEMs.
To analyze the effect of extinction on affective ratings, we entered rating values from the post-acquisition and post-extinction assessments into a 2 × 2 repeated measures ANOVA with factors Phase (acquisition; extinction) and Stimulus (CS+; CS−).
Eight subjects were excluded from this analysis due to incomplete data during the extinction phase, thus 72 participants were included. The result showed main effects of Phase (F(1,71) = 7.24; p = 0.009) and Stimulus (F(1,71) = 24.97; p < 0.001) as well as a Phase by Stimulus interaction effect (F(1,71) = 23.44; p < 0.001). Simple main effects analyses showed a main effect of Stimulus both post-acquisition (F(1,71) = 38.60; p < 0.001) and post-extinction (F(1,71) = 7.63; p = 0.007), although the effect was much larger after acquisition (d = 0.72) than after extinction (d = 0.33). Also, the main effect of Phase was only detected for the CS+ (F(1,71) = 30.04; p < 0.001) with significant decreases in negative affective ratings, whereas no effect of Phase was detected for the CS− (F(1,71) = 1.28; p = 0.261), where negative affective ratings were similar after acquisition and extinction, see Fig. 2 and Table 1. Thus, we found clear indications of extinction for affective ratings, as shown by the Phase by Stimulus interaction, and this effect was driven by decreases to the CS+, whereas no change occurred for the CS−. Additionally, the current protocol does not appear to lead to complete extinction, since there was still an effect of Stimulus even after extinction.
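For readers who want to relate the test statistics above to one another, the sketch below relies on two standard identities for a fully within-subject 2 × 2 design: the Phase by Stimulus interaction F(1, n−1) equals the squared paired t statistic computed on each participant's difference of differences, and for any such paired contrast a standardized effect size can be obtained as d_z = t/√n = √(F/n). The text does not state which formula for d was used in JASP, so the conversion is offered only as a plausibility check; function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(diffs):
    """One-sample t statistic on per-participant difference scores."""
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))


def interaction_f(cs_plus_a, cs_minus_a, cs_plus_b, cs_minus_b):
    """Phase x Stimulus interaction F(1, n-1) for a 2 x 2 within-subject design.

    Each argument is a list with one value per participant (phase a or b,
    CS+ or CS-). The interaction equals the squared paired t on
    (CS+ - CS-) in phase a minus (CS+ - CS-) in phase b.
    """
    diffs = [(pa - ma) - (pb - mb)
             for pa, ma, pb, mb in zip(cs_plus_a, cs_minus_a, cs_plus_b, cs_minus_b)]
    return paired_t(diffs) ** 2


def d_z_from_f(f_value, n):
    """Standardized paired effect size implied by a 1-df within-subject F."""
    return sqrt(f_value / n)
```

Applied to the simple main effects of Stimulus reported above, `d_z_from_f(38.60, 72)` is about 0.73 and `d_z_from_f(7.63, 72)` is about 0.33, close to the reported d = 0.72 and d = 0.33.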
To analyze possible effects of reinstatement, we entered rating values from the post-extinction and post-reinstatement assessment points into a 2 × 2 repeated measures ANOVA with factors Phase (extinction; reinstatement) and Stimulus (CS+; CS−). An additional 2 subjects were excluded from this analysis due to incomplete data after the reinstatement phase, thus 70 participants were included. The results showed main effects of Phase [...], indicating increases in negative affective ratings for both stimuli. Thus, we found no clear evidence for reinstatement, as indicated by the absence of a Phase by Stimulus interaction, but rather that negative affective ratings tend to increase for both stimuli, and differences between CS+ and CS− are maintained.
Of the 70 participants who had complete data for acquisition, extinction and reinstatement, 33 (47%) completed the long-term follow-up assessment, which was performed approximately 5-10 weeks (M = 52 days; SD = 15 days; Min = 35 days; Max = 106 days) after the initial experiment. To investigate whether differential responding to experimental stimuli was still present, we adjusted the follow-up rating scores for baseline ratings (by subtracting the pre-assessment rating scores from the follow-up rating scores for the CS+ and CS− respectively), and performed a t-test. The results showed a difference in rating-scores with a small to moderate [...] differential responding to the CS+ and CS− at follow-up. To do this we investigated the correlation between number of days elapsed from the first test to follow-up and the baseline-adjusted CS difference score (CS− ratings subtracted from CS+ ratings) but we did not find any association (r(31) = −0.11; p = 0.546). This indicates that the effect is not diminished across the time-span studied here, but this should be interpreted with caution since the analysis is underpowered to detect anything but fairly large effects (r > 0.5 approx.) due to the small sample size.

Response times. To test whether [...] This indicates that fear conditioning affects response time specifically to early probes, which could be used as a non-verbal learning index for remotely delivered fear conditioning protocols, although the effect size is small (d = 0.26 for CS− vs CS+ during early probes).
For extinction, we focused on response times to early probes, since this was the only result that indicated learning effects during the acquisition phase. Thus, we compared average response times for the CS+ and CS− during the acquisition and extinction phases, restricted to early probes, using a 2 × 2 repeated measures ANOVA with factors Phase (acquisition; extinction) and Stimulus (CS−; CS+). Since 7 additional subjects had to be excluded due to incomplete data during the extinction phase, a total of 63 subjects were included in this analysis. The results showed a main effect of Phase (F(1,62) [...] Fig. S2 where trial-by-trial data is displayed. The main effect of Stimulus in combination with the absence of a Phase by Stimulus interaction effect indicates that extinction was not achieved by this protocol. However, simple main effects analyses showed a main effect of Stimulus only during acquisition (F(1,62) = 4.53; p = 0.037) and not during the extinction phase (F(1,62) = 0.54; p = 0.464). This suggests that the main effect of Stimulus is driven by differences during acquisition, and lends tentative support to an interpretation that the response time difference established during acquisition is reduced during extinction when the US is omitted.
To evaluate possible effects of reinstatement we also focused on response times to early probes. We compared average response times for the CS+ and CS− during the extinction and reinstatement phases, to see if CS differentiation reemerges after presentations of 2 un-signaled US, using a 2 × 2 repeated measures ANOVA with factors Phase (extinction; reinstatement) and Stimulus (CS−; CS+). Two additional subjects had to be excluded due to incomplete data during the reinstatement phase, thus, 61 subjects were included in this analysis. The results did [...] Consequently, we do not find any support that the protocol elicited reinstatement effects, but rather that response times during extinction and reinstatement are highly similar, and do not differ between the CS+ and CS−.

Relations between fear conditioning and trait anxiety. Affective ratings.
To explore possible associations with trait anxiety, as has been demonstrated previously using remotely delivered fear conditioning 14 , we used the STAI-T to investigate correlations between this measurement and affective ratings of conditioned stimuli during the different phases. Thus, we calculated an acquisition index where ratings of each stimulus during the pre-assessment were subtracted from ratings after acquisition and also calculated a CS difference score (CS diff ), where ratings of the CS− were subtracted from ratings of the CS+, adjusted for baseline ratings. Similarly, we adjusted ratings of the CS+ and CS− post-extinction, by subtracting baseline ratings for each stimulus, and also calculated an extinction index by subtracting CS diff after extinction from CS diff after acquisition. Pertaining to acquisition, the results showed a weak correlation between STAI-T and the baseline-[...] n = 72). Although exploratory, we did find indications that specifically CS− ratings after acquisition are related to trait anxiety, but this should be interpreted with caution.
Response times. We also performed similar analyses for the response time measure. Since we only found evidence of learning effects for early probes during acquisition, we restricted the analyses to this outcome. Thus, we investigated the relationship between STAI-T score and response time to the CS−, the CS+ and the CS diff score for early probes only during acquisition. Similar to affective ratings we found a weak correlation between response time to the CS− and STAI-T (r(68) = 0.24; CI 95% = 0.002 to 0.447; p = 0.048; n = 70), no correlation to the CS+ (r(68) = 0.09; CI 95% = −0.15 to 0.32; p = 0.467; n = 70) and an uncertain weak negative association to the CS diff score (r(68) = −0.22; CI 95% = −0.43 to 0.02; p = 0.066; n = 70), see Fig. 5. Note that given the direction of the effect of learning on response times, where CS+ responses are faster than CS− responses, a negative CS diff score indicates larger stimulus differentiation. Thus, the negative correlation for CS diff suggests that larger stimulus differentiation is related to higher scores on the trait-anxiety measure.
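The confidence intervals reported for these correlations are consistent with the standard Fisher z-transformation approach, sketched below. That this is the method JASP applied here is an assumption, as the text does not say how the intervals were computed; the function name is hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z transformation: atanh(r) is treated as normally
    distributed with standard error 1 / sqrt(n - 3).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit / sqrt(n - 3)
    z = atanh(r)
    return tanh(z - half_width), tanh(z + half_width)
```

For example, `fisher_ci(0.09, 70)` gives approximately −0.15 to 0.32, matching the interval reported for the CS+. Entering the rounded value r = 0.24 for the CS− gives roughly 0.005 to 0.45, slightly off the reported 0.002 to 0.447, which would be expected if the unrounded correlation was marginally below 0.24.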

Discussion
In this internet-based study of fear conditioning, we first replicate and extend previous findings of increased negative affective ratings to a cue (CS+) predicting an aversive scream. Next, we also demonstrate that negative affective ratings to CS− (predicting no aversive scream) are reduced during fear acquisition. Finally, we found tentative support for stimulus-concurrent reaction time as an index of fear learning. However, the latter findings were weak and should be replicated before drawing firm conclusions about the suitability of using this approach to index remote, non-verbal cued fear responses during conditioning. Concerning negative affective ratings, essentially, our results replicate a previous study on remote fear conditioning 14 , in that we see clear indications of learning using affective ratings. We also extend previous findings, showing that differential responding to conditioned stimuli post-acquisition is driven by both increases to CS+, reflecting fear learning, as well as decreases to CS−, possibly reflecting safety learning or relief conditioning. We also find strong evidence for extinction using affective ratings, which could not be established by Purves et al. 14 , since they only recorded affective ratings pre acquisition and post extinction. The current results demonstrate the feasibility of investigating questions related to extinction processes using remote fear conditioning and affective ratings. We did not, however, find evidence of reinstatement using the current methodology, but rather a general increase in negative affective ratings after un-signaled US presentations. It should be noted that our study was poorly designed to detect such effects because affective ratings were collected only after the whole reinstatement phase, which included 2 un-signaled US presentations and 8 trials of each CS.
Thus, the additional non-reinforced trials may have caused further extinction and attenuated any effect of US presentation on affective ratings. This could have been mitigated by using trial-by-trial affective ratings or rating sooner after the US presentations 9 . Furthermore, when investigating reinstatement, a between-subject design would have been much preferred in order to isolate the effects of un-signaled US presentation. Pertaining to results from the follow-up test, we could show that effects on affective ratings were still present, even after six weeks or more. This supports the conclusion that the experimental protocol elicits long-term retention. We could not, however, find any support for spontaneous recovery since the affective ratings and CS difference scores were largely unchanged from the end of the initial experimental session to the follow-up.
We could not replicate the findings of Purves et al. 14 of an association between CS+ affective ratings post-extinction and trait anxiety, but instead identified such an association for post-acquisition affective ratings of CS−, in that higher trait anxiety was associated with higher ratings of negative affect for the CS− and not the CS+. Our findings are exploratory and should be interpreted with caution, but they are in line with previous lab-based studies that have found similar results using trial-by-trial distress ratings during fear conditioning 9 , and also with studies on clinical populations that have shown enhanced fear responses to safety stimuli in patients with anxiety disorders using various outcome measures 6 . Future studies employing remote fear conditioning with larger sample sizes could possibly lead to firmer conclusions regarding this effect.
The second aim of our study was to test the feasibility of using concurrent measures of simple reaction time to an auditory probe during CS presentation as a learning index for aversive Pavlovian conditioning and extinction. In this regard, we only found weak evidence that the current fear conditioning protocol has an impact on response time. Effects were small and could only be observed for probes presented early during CS presentation and only in the acquisition phase. Furthermore, effects were opposite to the predicted direction, as previous lab-based studies investigating this effect 17,18 observed slower response times to CS+ using a similar protocol in a lab environment, and interpreted this result as an effect of attention capture by the visual CS+. We instead found faster response time to early probes during CS+ presentation, which would be more in line with an interpretation that exposure to an aversively conditioned stimulus leads to a rapid but quickly dissipating increase in vigilance. This could be construed as evolutionarily adaptive, since it would allow the organism to respond faster to possible environmental threats. Note that previous lab-based studies using other experimental set-ups have found similar effects of faster response times to the CS+ compared to the CS− 27-32 but results have been inconsistent, with other studies showing no effects or effects in the opposite direction 13,33 . In addition, we only found tentative support that conditioned response time modulation undergoes extinction. Overall, we view our results as inconclusive with regard to the question of whether response times can be used as a non-verbal learning index in remote fear conditioning and extinction experiments. We did find differences in response time to conditioned stimuli during acquisition, but the effects were weak, suggesting that protocols that lead to more robust effects need to be developed in order for this experimental paradigm to be useful.
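To give a rough sense of what "larger sample sizes" means for weak correlations of this kind, the sketch below uses the common Fisher z approximation for the sample size needed to detect a correlation with a two-sided test. The 80% power and 5% alpha defaults are conventional choices made for the example, not values taken from the study, and the function name is hypothetical.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate n needed to detect a true correlation r (two-sided test).

    Based on the Fisher z approximation:
    n = ((z_{1-alpha/2} + z_{power}) / atanh(|r|))^2 + 3.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_beta = nd.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / atanh(abs(r))) ** 2 + 3)
```

Under this approximation, a correlation of about r = 0.24 (the size of the CS− response time correlation reported in the Results) would call for a sample of about 135 participants, whereas r = 0.5 needs only around 30, in line with the remark that the follow-up analysis (n = 33) could detect only fairly large effects.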
Similar to our findings on affective ratings, we also found an association between response times to the CS− and trait anxiety, as well as an uncertain association between response time CS diff score and trait anxiety. Although an association between responses to the safety cue and trait anxiety could be expected based on previous studies 6,9 , the direction of this effect is surprising. We found that slower responses to the CS− were associated with higher trait anxiety, and also an uncertain negative association to the CS diff score. Because the conditioning effect indicated that responses are faster to the CS+, this means that larger stimulus differentiation is associated with higher levels of trait anxiety. This runs counter to previous studies showing smaller stimulus differentiation in pathological anxiety due to enhanced responding to safety cues 6 . This finding is exploratory and should be interpreted with caution. Furthermore, much larger sample sizes are likely required in order to reliably detect an association of the current magnitude. If replicated, however, this could prove to be an interesting finding in that it would show that conditioned response time modulation is differently associated with pathological anxiety compared to other CR-estimation methods, which could shed light on how fear learning is related to problematic anxiety. Interestingly, previous studies on fear generalization have shown similar effects of slower response times to CS− than CS+ during fear learning 31,32 (but see also Lissek et al. 33 ). In these studies, the participants rated the perceived risk of US occurrence on each trial, and the response time during this task was also measured. During a subsequent generalization test longer response latencies were consistently observed to ambiguous generalization stimuli compared to the stimuli used during the learning phase 31-33 .
In light of this, slower response time to the CS− during the acquisition phase could possibly reflect ambiguity of the stimulus threat value, with larger latencies reflecting an uncertainty as to whether the stimulus will be followed by a US. Thus, it is possible that in our study, participants with higher trait anxiety perceived the CS− (i.e. the safety stimulus) as more ambiguous which slowed response times and produced larger CS differentiation. If this is the case, the finding that slower response time to the CS− is associated with higher trait anxiety would be in line with previous findings in that it reflects uncertainty as to whether CS− signals the absence of the US. This could be investigated further in larger samples using remote fear conditioning, examining the association between perceived risk of US-occurrence and response time, and how this relationship is moderated by trait anxiety. Some limitations of our study should be mentioned. First, there is a large majority of women in the sample, which may hamper generalization of the findings to men. Second, participants performed the fear conditioning task online in their own homes or in other locations where we did not have control over their environment or compliance. Nonetheless, participants were instructed to complete the task in an undisturbed location and they indicated in questions after the conditioning paradigm that they had performed the paradigm in accordance with instructions. Furthermore, although we only found weak evidence for the response time outcome, we found strong effects for affective ratings in line with previous findings, which indicates that compliance was adequate to detect conditioning effects.
Although remote data collection has several drawbacks compared to in-lab testing, it also has several large advantages which could provide valuable contributions to the field. Data collected remotely is likely to be noisier due to low experimental control, non-compliance with instructions, and distractions that are present in a non-controlled environment. However, this can possibly be overcome with large sample sizes, and noise may be reduced using control indicators built into the experimental design. Using remote data collection, it is possible to reach much larger sample sizes at a fraction of what it would cost to collect data in the lab, and it is much easier to reach understudied populations. Indeed, university students are heavily overrepresented in psychological research, and the possibility to generalize from student populations to the general population is limited and can lead to faulty conclusions 34 . Arguably, collecting larger and more diverse samples using remote data collection could improve replicability. Remote data collection could be particularly useful for studies investigating the associations between personality variables and processes related to fear and extinction learning. These types of studies generally depend on using regression analysis to identify associations with small effect sizes, and need large samples to achieve adequate power. For example, previous studies have differentiated between trait anxiety and trait fearfulness, and these personality constructs may well have unique associations to fear and extinction processes 35,36 . Using remote data collection could be one way to shed more light on this issue using large samples from the general population. Furthermore, previous research on how fear learning affects response times has arrived at conflicting results 13 .
One reason for this is likely to be differences in methodology since various experimental procedures have been used and effects may be dependent on the type of task and the type of response that is measured. In order to untangle these discrepancies future studies could make use of remote data collection to examine various experimental procedures in large studies with adequate power. Additionally, other non-verbal measures of fear and extinction related processes can be explored. Recent studies have shown the feasibility of such an approach measuring avoidance learning in a game-based environment 37 as well as using measures of perceptual discrimination following Pavlovian conditioning 38 . Ultimately, this line of research could provide valuable tools for non-verbal assessment of fear and extinction learning outside a lab environment.
In conclusion, the internet-based fear conditioning paradigm described in this study is suitable for investigating remotely delivered fear acquisition and extinction using affective ratings as outcome in large samples in a cost-effective manner. More research is needed to find optimal parameters for experimental design before reaction time can be reliably employed in large-scale online studies of fear conditioning. The current study constitutes a first step in this direction.