Introduction

Referendums are one of the main collective decision-making instruments of modern democracies. They allow people to participate in political decision-making, but they also entail the challenge of presenting complex issues in a comprehensible and concise way. To be suitable for politics, the wording of the question must not only be easy to understand and to the point, but also phrased unambiguously and neutrally (Electoral Commission 2015).

The two important European Union membership referendums in the United Kingdom, those of 1975 and 2016, differ fundamentally in their linguistic structure regarding their possible response options. While in 1975, the usual yes/no choice was available for voting, in 2016 voters had to choose between two verbs: remain vs. leave. The 2016 poll results revealed that 51.89% of the voters opted for the checkbox answer leave (the European Union) and 48.11% for remain (a member of the European Union) when asked the question Should the United Kingdom remain a member of the European Union or leave the European Union? (Fig. 1).

Fig. 1: The ballot paper.

The referendum ballot paper for voters in England, Scotland, and Northern Ireland (Electoral Commission 2016 p. 16).

In 1975, however, 67.2% of the respondents answered yes when asked the question Do you think that the United Kingdom should stay in the European Community (the Common Market)? In this case, not only were the response options (yes and no) different, but so was the wording of the question, since only one verb (stay) was used. This substantial change in the voting tradition justifies a linguistic consideration of the suitability of verbs as response options in a referendum in general and of the neutrality of the formulations used in the 2016 question.

Previous studies have dealt intensively with the requirements of questioning in decision-making processes and have shown how micro deviations in the formulation of questions can lead to large variations in the response (cf. section “Theoretical and Linguistic Background”). Since studies in the area of response options have been lacking up to now, we examine in the following potential framing effects of replacing the neutral yes/no response option with a heterogeneous (static vs. dynamic) verbal alternative (remain/leave).

Even though the Brexit vote served us as a case study, since two very different votes were held on a substantively comparable issue within less than 50 years, it is important to emphasise that the focus of our study is not the question of whether the choice of words in the referendum text could have influenced the voters’ decision in any way. For this reason, we have also deliberately refrained from analysing the entire Brexit referendum campaign.

Rather, the present study aimed to examine framing effects, independent of the topic, that can occur when different or even heterogeneous, but logically equivalent verbs are used as a substitute for a yes/no question. The goal was to examine the extent to which verbs can be suitable as alternative answers or whether their use should be limited to the formulation of questions, due to their frame-inherent structure. This is even more important if we consider that decision-making processes are not uncommon on a political level. Therefore, it is vital to understand how either choosing certain verbs or raising a neutral yes/no question might influence voting behaviour. In the present study we examined whether the concepts remain and stay have equivalent evaluative structures when used in combination with leave.

We assumed that due to the different underlying etymological roots of these verbs, which might unconsciously activate different framing effects, we might find divergent evaluative structures. To investigate the potentially unconscious, or “implicit”, framing effects associated with these three originally stative verbs (leave, remain and stay), we designed a version of the implicit association test (IAT) (Greenwald et al. 1998, 2015, Lane et al. 2007). The IAT aims at assessing the associative links between mental concepts and their attributes (Nosek et al. 2011, Uhlmann et al. 2012). It comprises a series of tasks that require participants to classify word stimuli into dichotomous categories, such as positive vs. negative. To measure the strength of associations, scholars rely on reaction times to stimuli that represent the concepts and attributes, which are presented in rapid succession. The underlying assumption of the IAT is that experiences can be represented by the facilitation of the information processing of associated concepts, as measured by the response time (Fazio & Olson 2003, Nosek et al. 2011). Thus, automatic attitudes are exhibited as influences on the duration of button-press responses, controlled by participants’ implicit evaluations.

For example, in the original version of the IAT, Greenwald et al. (1998) had participants classify nouns as referring typically to black vs. white Americans (the “target concepts”; e.g., “ebony” vs. “heather”) by left vs. right manual key presses, as well as different words as “pleasant” or “unpleasant” (the “attributes”). Concept and attribute classification were then combined, so that either “pleasant” or “unpleasant” were associated via classification mapping to “black” or “white”. Participants responded faster when mapping “white” to “pleasant” and “black” to “unpleasant” compared to the other mappings, which, according to the authors, suggests an underlying automatic associative structure that corresponds to certain stereotypes (in this example, racial stereotypes), and the IAT has been used numerous times since (for meta-analyses and reviews, see Hofmann et al. 2005, Lane et al. 2007, Greenwald et al. 2009).

The IAT is part of a larger family of so-called implicit measures, which have been developed to tap into reasons for individual perceptions, judgments, and behaviour (Greenwald et al. 2009, Nosek et al. 2011). Note that the IAT, unlike some other implicit measures, meets psychometric criteria including reliability and predictive validity (Fazio & Olson 2003, Hofmann et al. 2005, Greenwald et al. 2009).

The following investigation attempted to illustrate, through a combination of linguistic as well as experimental analysis, that there are subliminal linguistic factors that can influence decision-making processes. These linguistic factors require more attention in the future.

Theoretical and linguistic background

Collective decision-making processes are an important topic in various scientific fields (Wooley et al. 2010, Jeanson et al. 2012, Robinson et al. 2014). In the present study, the question was whether it was a good choice to replace the neutral way of decision-making by means of a yes/no question with other, semantically more substantial linguistic means, such as an alternative choice between two verbs.

The 53-page report (Electoral Commission 2015) and the assessment guidelines for referendum questions elaborated by the Electoral Commission suggest an awareness of the relevance of linguistic factors in policy making. The first mention of remain dates back to October 2013 (section 2.5). In this context, the yes/no vote was also called into question (sections 2.7 and 5.18), although this would represent a deviation from previous referendum questions. In the struggle to bypass a yes/no vote, leave appeared as a potential adversary to remain, but the Electoral Commission also admitted that it had not yet been able “to fully test the second of these two alternative question wordings in the time available [to them] before [they] reported” (section 2.8). Further tests followed in 2014, but section 3.9 (as well as 3.32 and 5.12) of the declaration expressly refers to the fact that “whilst qualitative research can identify participant reported views regarding neutrality of question wording based on participant perceptions, the approach does not capture any unconscious impact of question wording and structure. It is thus possible that questions might influence participants to answer in a particular way without them being aware of it”. Our study has been designed to shed light on this potentially important, yet underexplored impact of different verbal concepts as response options in political referendums.

For rational decision-making, it is necessary that decisions concerning the phrasing of a question and the linguistic material used for the encoding in a particular context are coherent (Fahlmann 1979, Arrow 1982, Brachmann & Schmolze 1985, Tversky & Kahneman 1986, Donini et al. 1996, Baader et al. 2004). Studies on rational decision-making have shown that framing effects do influence rational decisions (Dawes 1988, Levin et al. 1998, Stanovich & West 2000, 2008, Shafir & LeBoeuf 2002, Bruine de Bruin et al. 2007, West et al. 2008). In general, word and sentence meanings, which are mentally processed by each individual, undergo a subjective filtering mechanism (Lee 2001, Kövecses 2002, 2006, Croft & Cruse 2004, Van Gorp 2005, Evans & Green 2006, Langacker 1987, 2008, Evans 2009, Ruiz de Mendoza Ibáñez & Galera Masegosa 2014).

Moreover, the substantial influence that framing effects can have on readers has been demonstrated in several areas, such as psychology (Baumeister et al. 2001, Berry et al. 1997, Pennebaker et al. 2003), marketing (Anderson & Jolson 1980, Goh & Bockstedt 2013, Jensen et al. 2013), or news coverage (De Lange et al. 2012, Bosman et al. 2015). They play a major role in the cognitive sciences, artificial intelligence research (Minsky 1975), linguistics, and the communication sciences (Löbner 2015). Research on the reception of individual words has been conducted primarily in the field of sentiment analysis and polarity effect measurement. The focus in the different areas was on opinion building and mining (Neviarouskaya et al. 2009, Baccianella et al. 2010, Maks & Vossen 2012), big data analysis and information processing (Liebmann et al. 2012, Reschke & Anand 2011, Agarwal & Dhar 2014, Villarroel Ordenes et al. 2017), and risk perception and prediction (Arrow 1982, Oliveira et al. 2016). All these approaches have in common that they analyse the role of linguistic elements in decision-making processes (synchronic perspective) without focusing on the etymologically inherent framing effects (diachronic perspective).

The synchronic view considers language from a static perspective; it focuses on how language is used by its speakers at a given moment in time. The diachronic view, on the other hand, concentrates on the historic evolution of words, on how their meaning evolves (de Saussure 1967). Even though arbitrariness has long been considered as one of the key features of the linguistic sign, a purely symbolic construal of language can only be entertained from a synchronic perspective. When examined diachronically, language seems far less random; instead, it appears to be directly connected to our perception and our experience (De Saussure 1989, Barsalou & Wiemer-Hastings 2005, Borghi & Cimatti 2009, Borghi & Bikofski 2014, Borghi & Caruana 2015).

Framing effects in particular originate at the interface of perception and linguistic coding. Due to the concept encoded in the linguistic root (cf. “parameters” below), an element can develop new functional areas. It is therefore particularly noteworthy that native speakers (as well as non-native speakers) usually have no access to etymology (diachrony), yet they do have access to the conceptual parameters that demarcate an element from other elements in the lexicon, and their use of language (synchrony) is based on these parameters (Weisgerber 1954/1973, Wierzbicka 1996, Varela 2005, Morera 2007, Lang & Maienborn 2011, Hernández Arocha 2014, Ströbel 2017).

Near-synonyms in particular are an adequate tool to illustrate the strong connection between etymologically anchored parameters, which will be marked in the following with “[]”, and framing effects (Ströbel 2017). The adjectives beautiful, good-looking, handsome, pretty, attractive, lovely, and stunning can all function as synonyms in a given context, although, from a diachronic perspective, they refer to different parameters. Beautiful is derived from the Latin word bellus, good-looking refers to a [quality] and also implies a visual [evaluation], handsome has a “tactile” [contact] character and refers to the [shape] of something, attractive (< lat. ad ‘to’ + trahere ‘to draw’) to a [directed movement], lovely with its Proto-Indo-European root *leubh- ‘to love’ to an [emotion], stunning (< lat. tonāre ‘to thunder’) to an [overwhelming force] and pretty originally referred to a scalar [value] or “a diminutive beauty, without the higher qualities of gracefulness” (Watson et al. 2013, Online Etymology Dictionary 2020). In the same way, diligent, determined, industrious and enterprising can, in a given context, due to semantic bleaching through frequency, be regarded as synonyms for hardworking, even though diligent (< lat. legere ‘to collect, read’, < PIE root *leg̑- ‘to collect’ [action]) puts the focus rather on the acquisition process and therefore allows a shift from a physical (‘to collect’) to a mental (‘to read’) [effort], determined (< lat. de ‘down/off’ [direction] + termināre < PIE root *ter- ‘(go to the) end’ [end point]) displays a telic, industrious (< lat. in ‘in’ + struere ‘to build’, < PIE root *ster- ‘stiff’ [material]) a durative, enterprising (< lat. inter ‘among/in between’ + prehendere ‘to take’ [directed action]) a causative character (Ströbel 2017).

In the present study, we were interested in whether response options such as stay, remain, and leave, which have been employed in the 1975 and 2016 UK referendums, might have activated different framing effects due to their underlying etymology.

The unique feature of our article is that it is the first time that the question of whether verbs are suitable as response options for political referendums in particular as well as decision-making processes in general has been addressed.

Regarding the formulation of the question itself and the use of verbs in the question, various studies already exist (Sudman & Bradburn 1982, Deutskens et al. 2004, Weisberg 2005, Friborg et al. 2006, Holleman 2006, Chessa & Holleman 2007, Swain et al. 2008, Saris et al. 2010, Kamoen & Holleman 2017). It has long been assumed that questions containing negation or negating verbs (cf. not to allow/to forbid) are more often answered with “no/disagree” than with “yes/agree” (Schuman & Presser 1981/1996, Kamoen et al. 2013, Warriner et al. 2013, Holleman et al. 2016). Moreover, negative questions have been assumed to imply higher processing effort (Hoosain 1973, Sherman 1973, Clark 1976, Horn 1989, Kaup et al. 2006, Dillman et al. 2009).

However, these assumptions were put into perspective by the study of Kamoen et al. (2018). On the one hand, the conceptual interpretative framework must be considered (cf. to allow appears conceptually more complex and thus also more ambiguous than to forbid, cf. Holleman 2000), and on the other hand, the reference to the status quo seems to play a role. Similarly, the conceptually structuring role of interactions (cf. stay, remain vs. leave) is pointed out in Talmy’s force dynamics (Talmy 1988). Force-interactive patterns imply strategies of identification and framing as well as positioning (Hart 2011), as a process is represented as a power interaction (cf. maintaining the status quo vs. changing the status quo) between actors (e.g. the voters, the EU). Kamoen et al. (2018) therefore suggest that a question on political issues should best be formulated in terms of a change from the status quo. For this reason, they also recommend that a country that is currently in the European Monetary Union should be asked whether it should leave the union rather than remain in it (Kamoen et al. 2018).

These findings underline that verbs should be treated with caution, especially when it comes to decision-making, due to their complexity and the different association spectra they carry; furthermore, the results show that especially the choice between static (remain) vs. dynamic (leave) in relation to the status quo can have an impact on voting behaviour.

Our article therefore tackles a very important question: If even small changes in the question can have large effects on the response (Schuman & Presser 1981, Cicourel 1982, Jobe & Mingay 1991, Molenaar & Smit 1996), what effects might the replacement of the yes/no response option by a heterogeneous (static vs. dynamic) verbal alternative (remain/leave) cause?

The close connection between perception and linguistic encoding is a vital part of natural languages (Damasio et al. 1996, Chao et al. 1999, Vigliocco et al. 2011, 2014).

While yes etymologically refers to an intensified affirmation and no to an intense negation and thus both conceptually offer little room for interpretation, all three verbs (stay, remain and leave) show a conceptually higher complexity. Therefore, we assume that the differences we will find in the IATs (see our description earlier and in the section “Method”) are due to implicit framing effects rooted in the etymology of the three verbs.

Remain can be traced back to the Latin verb remanere ‘to be left behind (to die)’, a derivation of the prefix re- ‘back’ and the verb manere ‘to stay’; it implies the notion of being forced to stay in a [location] and suggests (in its most positive reading) maintaining the status quo (Levin 1993, Sorace 2000). This negative association is even more salient in the derived nominal form remain, as in ‘human remains.’

In comparison, stay is derived from the Proto-Indo-European *sta-, which is associated with a fixed [position]. Stay appears homogeneous from a diachronic as well as a synchronic point of view. In many languages, the continuation of the Latin verb stare can be found in auxiliary verbs (e.g., fr. être or sp. estar). Whereas the concept has thus functionally expanded, its semantics has not. The verb is therefore easily accessible and widely used in everyday language, which could even strengthen its positive interpretation. Furthermore, the combination /st-/ is usually associated with a force acting against gravity (cf. STrength, STatics, STability, etc.), ending a movement (cf. STandstill, STop, STanding, etc.), accompanying a movement (cf. STamping, STepping, STinging, STorming, etc.) or evoking an elongated object (cf. STreet, STroke, STripe, etc.) (Bolinger 1965, Philps 2000, 2012, Bottineau 2007, 2008, Ströbel 2018a, 2018b, 2019a, 2019b). The semantic stability of stay, combined with the sound-symbolic associations (“strength” and “straightness”), can even enhance the positive framing effects.

Leave is derived from the Old English causative læfan ‘to allow sb. to stay in a [location] to survive’, which can be traced back to the Proto-Indo-European root *leip- ‘stick, adhere’ implying [contact]. The notion [contact] is further supported by the fact that, from a sound-symbolic perspective, /l-/ expresses connection as in link, latch, lock (Chomsky & Halle 1968, Rhodes & Lawler 1981). With time, leave turned into a dynamic achievement verb (Vendler 1967, Van Valin 1993, Van Valin & LaPolla 1997) and is nowadays associated with the dissolution of [contact] combined with an intended [change of location] (Gropen et al. 1991). Furthermore, leave (/liːv/) shares phonological similarities with believe (/bɪˈliːv/, [+value]): BeLEAVE in the UK.

Taking this into account, while both verbs (remain and leave) originally refer to ‘staying in a place’, remain etymologically implies a negative (‘dying’) and leave a positive (‘surviving’) outcome ([value]) of the situation.

To briefly summarize and embed this linguistic and theoretical analysis before presenting two IAT studies: These three verbs etymologically refer to three stative and therefore durative associations, namely (a) staying in a fixed [position] with a strongly positive [value], (b) being left behind (to remain) in an unpleasant ([-value]) [location], and (c) not having to leave a safe [location] or heading to another one ([change of location], [+value]). In contrast to stay and remain, which maintained their stative reading ([position] or [location]), leave, over time, also developed dynamic and punctual readings implying a [change of location], e.g., leave everything like it is (stative) vs. let’s leave (dynamic).

Language in general is assumed to have a clear semantic effect on thought: it can affect the way we conceptualize the world, and it can be associated with positive or negative experiences ([value(s)]) by virtue of its semantic embedding or its contextual reference (Sher & McKenzie 2006, 2008, Mandel 2008, 2014, Neviarouskaya et al. 2009, Reschke & Anand 2011, Maks & Vossen 2012, Chick et al. 2016).

Arguably, the framing effects emanating from these three verbs are even amplified by their association with bodily positions and sensorimotor movements (staying [position] or being stuck (remaining) in a [location] vs. leaving a [location]). Studies in fields ranging from cognitive development to artificial intelligence show that different sensorimotor-based experiences shape the use and comprehension of complex situations and metaphorical statements (Nolfi & Floreano 2000, Oyama 2000, Beer 2003, Grossmann et al. 2008). The linguistic perspective is covered by theories in cognitive science that support the claim that many concepts are grounded in sensorimotor processes (Varela et al. 1991, Wilson 2002, Gibbs 2005, Barsalou 2008, Pezzulo 2011, Shapiro 2011, Wilson & Foglia 2011).

Processing such verbs probably re-activates memory records of previous episodes (O’Regan & Noë 2001, Thompson & Varela 2001, Noë 2004, Spivey 2007, Goldman & de Vignemont 2009, Gallese 2010, De Jaegher et al. 2010, Núñez 2010, Thompson & Cosmelli 2011, Goldman 2012) that include sensorimotor-based experience, so that their evaluation is partly based on an “embodied” representation of their meaning (Thompson 2007, Di Paolo 2009, Froese & Ziemke 2009, Jirak et al. 2010, Meteyard et al. 2012, Buccino et al. 2016). By just hearing or reading these verbs, our frame knowledge or our knowledge shaped by recurrent language action gets activated (Boulenger et al. 2008, Glenberg et al. 2008, Horoufchin et al. 2018).

Thus, even individual verbs never stand in isolation, but call up further framing effects that trigger each other (Gallese & Lakoff 2005, Casasanto & Lupyan 2015, Ströbel 2016). The present contribution will examine such potential framing effects of three isolated sensorimotor-based verbs. The novel approach consists in the notion that these apparently isolated verbs have a representative function for a political decision with considerable consequences.

In the present study, we designed a version of an IAT in which we examined the idea that response times would be shorter when paired target categories (remain vs. leave or stay vs. leave) and attribute labels (positive vs. negative) match an individual’s automatic associations than when they do not match, that is, conversely, that response times would be longer when the paired target category and attribute labels contradict automatic associations (Greenwald et al. 2003). To this end, we conducted two IAT studies. In Study 1, we tested two groups of participants (i.e., leave combined with remain vs. leave combined with stay) in order to assess the relative evaluation of these terms within each group, and subsequently, we tested whether the difference in evaluation relative to leave changes as a function of the other verb (i.e., “stay” group vs. “remain” group). In Study 1, only a minority of the participants reported having English as their native language. Study 2 was thus a replication study testing participants only in English-speaking countries.

Method

Stimulus material

We represented the sensorimotor concepts of stay, leave, and remain with the following terms: to represent the attribute categories of good vs. bad, we made use of word lists of previous IAT studies (e.g., Greenwald et al. 1998, Nosek et al. 2005, Hekman et al. 2010) and selected five positively and five negatively connoted terms from these lists that are morphologically similar. We used the adjectives good, outstanding, fantastic, wonderful, and excellent as well as bad, dreadful, nasty, terrible, and faulty.

Procedure

We followed the general procedure of the seven-block IAT (Greenwald et al. 2015). As part of the task, participants had to press keys of a computer keyboard to assign stimuli that appeared on the screen to the left (using the key ‘e’ of a QWERTZ keyboard) or the right (using the key ‘i’). Blocks 1 and 2 of the seven-block IAT were training blocks. In block 1, the target concepts (remain vs. leave or stay vs. leave, respectively) were presented on the upper left and right sides of the screen and participants assigned the appearing stimuli to these categories. Participants then trained with the attribute categories in block 2 by assigning the stimuli to the categories of good vs. bad. In block 3, the tasks from blocks 1 and 2 were combined. Now, participants had to assign stimuli of two categories to the left (e.g., remain and good) or to the right (e.g., leave and bad). Block 4 repeated this combination with an additional number of trials. The assignment of categories to the left and right trained in block 2 was reversed during block 5. Then, blocks 6 and 7 repeated the tasks from blocks 3 and 4 but with reversed combinations (e.g., leave and good vs. remain and bad).

Table 1 provides an overview of an exemplary order of categories for our IAT. In our experiment, we counterbalanced the combinations and their order of appearance across participants to control for potential condition order effects.

Table 1 Illustrative structure of one IAT variant used in our experiment.

Per block, we presented the stimuli in random order on a light grey background in black letters. Stimuli were centred on the screen and remained there until the participants reacted. After each response, the next stimulus appeared after 400 ms. Except for block 4 and 7, all blocks of the seven-block IAT consisted of 20 trials and the two remaining blocks encompassed 40 trials. In sum, we recorded 180 reaction times per respondent.
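The block structure and trial counts described above can be checked with a short sketch. The block labels, the field names and the helper seven_block_iat are hypothetical and chosen for illustration only; they follow the exemplary (non-counterbalanced) order given in the text and are not taken from the software used in the experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    number: int
    task: str      # illustrative label, not the on-screen wording
    trials: int
    scored: bool   # True if the block enters the D-measure (blocks 3, 4, 6, 7)


def seven_block_iat(target: str = "remain") -> list[Block]:
    """Exemplary order for one IAT variant; the real study counterbalanced it."""
    short_n, long_n = 20, 40  # trial counts stated in the text
    return [
        Block(1, f"{target} vs. leave", short_n, False),
        Block(2, "good vs. bad", short_n, False),
        Block(3, f"{target}+good vs. leave+bad", short_n, True),
        Block(4, f"{target}+good vs. leave+bad", long_n, True),
        Block(5, "reversed single-category training", short_n, False),
        Block(6, f"leave+good vs. {target}+bad", short_n, True),
        Block(7, f"leave+good vs. {target}+bad", long_n, True),
    ]


blocks = seven_block_iat("remain")
total_trials = sum(b.trials for b in blocks)
scored_trials = sum(b.trials for b in blocks if b.scored)
print(total_trials, scored_trials)  # 180 120
```

Five blocks of 20 trials plus two blocks of 40 trials give the 180 reaction times per respondent, of which the 120 from blocks 3, 4, 6 and 7 are the combined-task trials.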

All procedures performed in the present study were in accordance with the Declaration of Helsinki and comparable ethical standards. The study was not approved by an ethics committee because no physical or psychological discomfort or harm due to participation in this study was expected. Moreover, we did not use invasive methods and did not test underage persons or patients.

Measures

This study explored the polar evaluation of the concepts stay vs. remain relative to leave in two groups (remain vs. leave and stay vs. leave, respectively). As the dependent measure, we constructed the so-called D-measure. The D-measure reflects the difference in reaction times between the experimental conditions of the within-subject design of the IAT. To compute it, we employed an algorithm developed by Greenwald, Nosek and Banaji (2003) to aggregate the 120 reaction times measured by the IAT during blocks 3 and 4 as well as 6 and 7. The algorithm accounts for extreme reaction times, i.e. responses that are too fast or too slow, and for incorrect classifications, and it normalizes reaction time differences across the experimental conditions of the task. Table 2 provides an overview of the algorithm underlying the D-measure.

Table 2 The improved scoring algorithm developed by Greenwald et al. 2003.

As the D-measure is a relative measure, it can show negative as well as positive values. In our case, negative values indicate that the respective participant perceived leave as more positive and less negative than the respective other sensorimotor concept, i.e., stay or remain. In turn, positive values indicate that leave is associated less strongly with positive connotations but more strongly with negative ones when compared to stay or remain, respectively.
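To make the logic of such a score concrete, the following is a minimal sketch of one common variant of the improved scoring algorithm. It rests on several assumptions that are not stated in the text: trials slower than 10,000 ms are dropped, error trials are replaced by the block mean of correct responses plus a 600 ms penalty, and the standard deviation is pooled over all trials of the two blocks being compared. The function name d_measure, its arguments and the example latencies are hypothetical. The participant-level exclusion for too many very fast responses is omitted here.

```python
from statistics import mean, stdev


def d_measure(practice_a, test_a, practice_b, test_b, penalty_ms=600):
    """Sketch of a D-score; each argument is a list of (latency_ms, correct).

    'a' blocks pair stay/remain with good, 'b' blocks pair leave with good,
    so a positive result means slower responses when leave is paired with
    good. This mapping is an assumption made for illustration.
    """
    def clean(trials):
        # Assumed cut-off: discard implausibly slow trials.
        return [(rt, ok) for rt, ok in trials if rt <= 10_000]

    def penalised(trials):
        # Assumed error treatment: mean of correct trials plus a penalty.
        correct_mean = mean(rt for rt, ok in trials if ok)
        return [rt if ok else correct_mean + penalty_ms for rt, ok in trials]

    quotients = []
    for block_a, block_b in ((practice_a, practice_b), (test_a, test_b)):
        block_a, block_b = clean(block_a), clean(block_b)
        pooled_sd = stdev([rt for rt, _ in block_a + block_b])
        diff = mean(penalised(block_b)) - mean(penalised(block_a))
        quotients.append(diff / pooled_sd)
    return mean(quotients)


# Invented latencies, for illustration only.
fast = [(600, True), (640, True), (700, True), (660, False)]
slow = [(800, True), (850, True), (900, True), (870, False)]
print(d_measure(fast, fast, slow, slow) > 0)  # True: leave+good was slower
```

Because the difference is divided by a within-person standard deviation, the score is unit-free and comparable across participants with different overall response speeds.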

As part of our analysis, we first compared the implicit evaluation of stay and remain relative to leave (in two within-group comparisons). Secondly, we tested, across groups, the difference in evaluation of stay vs. remain relative to leave to assess which comparison is more balanced in terms of evaluative structure.

Participants

We conducted two studies using this experimental setup. Study 1 (10/2019) used a convenience sampling strategy while we conducted Study 2 (5/2021) with English native speakers from different English-speaking countries. In both studies, participants were first informed about the purpose of the study (to study a linguistic question of the understanding of two verbs) using a few survey items as well as a reaction time test. Next, we communicated how data were stored, analysed, and utilized for scientific publication. We also informed participants about European data security laws and their right to withdraw their consent. Lastly, participants were notified of the fact that they would not receive any payment for taking part in the survey. Only participants agreeing to these conditions were then forwarded to the study. All subjects provided written consent. As this research explored the polar evaluation of the concepts stay vs. remain relative to leave in two experimental groups (remain vs. leave and stay vs. leave, respectively), participants were randomly assigned to one of the experimental groups in both studies.

For Study 1, we recruited participants by sharing the link to our online survey via our professional networks in various countries, emphasizing English-speaking regions. Of the 185 participants taking the test, 121 were female. On average, participants were 33.42 years old (std. dev. 14.63 years), with age ranging from 17 to 83 years. Although we focused on participants from English-speaking regions, only 46 of the participants indicated that their native language was English. Of the 185 participants, 92 took part in the condition of remain vs. leave, while 93 subjects worked on the IAT on stay vs. leave.

For Study 2, we contacted participants via the online service Clickworker (approximately €1 for a 4-to-6-minute task). Here, we sampled groups representative of the populations of the UK and the USA. Of the 355 people who took our test, 181 were from the UK. We surveyed 209 female, 141 male, and 5 diverse people. Age ranged from 18 to 70 years. Of the 355 participants, 182 took part in the condition of remain vs. leave, while 173 subjects worked on the IAT on stay vs. leave.

Results: study 1

We first report the results of each IAT independently for each experimental group. Since the average D-measure of the IAT using the combination remain vs. leave is positive (μ = 0.189, σ = 0.486), the IAT reveals that participants on average perceived the word remain as slightly more positive and less negative than the word leave, t(91) = 3.739, p < 0.001, r = 0.365. The r-family effect size measure, based on Pearson’s correlation r, indicates a medium-sized effect. According to the conventions suggested by Cohen (1988), r ≈ 0.1 corresponds to a small effect, r ≈ 0.3 to a medium-sized effect, and r ≈ 0.5 to a large effect. Please note, however, that the D-measures have an interindividual range from −1.159 to 1.077, indicating that at least a few participants had the opposite association.
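The reported effect size is consistent with the common conversion of a t statistic into an r-family effect size, r = sqrt(t² / (t² + df)). That this exact formula was used is an assumption; the sketch below only shows that it reproduces the reported value from the reported t and degrees of freedom. The function name r_from_t is hypothetical.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Convert a t statistic and its degrees of freedom into r (assumed formula)."""
    return math.sqrt(t**2 / (t**2 + df))


# Values reported for the remain vs. leave IAT in Study 1.
r_remain = r_from_t(3.739, 91)
print(round(r_remain, 3))  # 0.365
```

Since t enters the formula squared, the sign of t does not affect r, so the same helper applies to negative t values from between-group comparisons.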

Similarly, the IAT investigating stay vs. leave showed that participants, on average (μ = 0.515, σ = 0.331), considered the verb stay to be more positive and less negative than the verb leave, t(92) = 12.979, p < 0.001, r = 0.804, indicating a large effect size. Interestingly, the range of values that we observed (−0.327 to 1.268) included fewer negative associations than in the other group.

After this analysis of the IATs and their D-measures, within each group, we compared both groups. The D-measures of the stay vs. leave IAT included more positive and higher values. A simple mean value comparison using an independent, two-sided (i.e., undirected) t-test showed a highly significant difference in the D-measures between the groups remain vs. leave and stay vs. leave, t(183) = −5.329, p < 0.001, r = 0.367. This means that, in relation to the word leave, the word remain is connoted less positively and more negatively than the word stay. Statistically, this effect is of medium size.
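The between-group comparison can be approximated from the reported summary statistics alone. The sketch below assumes a Student t-test with pooled variance, which matches the reported degrees of freedom (92 + 93 − 2 = 183); the function name pooled_t is hypothetical, and because the inputs are rounded means and standard deviations, the result is expected to be close to, but not identical with, the reported t value.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Study 1: remain vs. leave group first, stay vs. leave group second.
t_stat, df = pooled_t(0.189, 0.486, 92, 0.515, 0.331, 93)
print(round(t_stat, 2), df)
```

The negative sign simply reflects the order of subtraction: the mean D-measure of the remain group is lower than that of the stay group.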

Robustness checks

To inspect the robustness of our results, we replicated the above analyses for native English speakers only. We found qualitatively similar results regarding (1) the IAT using the combination remain vs. leave (μ = 0.264, σ = 0.527), t(25) = 2.551, p < 0.05, r = 0.454; (2) the IAT testing the combination stay vs. leave (μ = 0.547, σ = 0.330), t(19) = 7.428, p < 0.001, r = 0.862; and (3) the difference between the groups remain vs. leave and stay vs. leave, t(44) = −2.097, p < 0.05, r = 0.301. An additional ANOVA (Analysis of Variance) with the independent variables verb combination (leave vs. stay; leave vs. remain) and native language (English as native vs. non-native language) revealed that the D-measures differed across the IATs (F(1,181) = 19.48, p < 0.001, ω² = 0.092) but not across native and non-native English speakers (F(1,181) = 0.15, ns.). Notably, the interaction of both variables was not significant, meaning that the differences across the IATs did not differ between native and non-native English speakers (F(1,181) = 0.20, ns.).

Finally, we ran an OLS (Ordinary Least Squares) regression (F(4,180) = 8.19, p < 0.001, ω² = 0.154) to control for potentially confounding effects of participants’ age (b = −0.003, ns.), gender (b = −0.051, ns.), and native-speaker status (b = 0.125, ns.). Again, the results supported the finding that the D-measures differed between the two groups (b = 0.319, p < 0.001).

Results: study 2

In line with the reporting of our first study, we first report the results of each IAT. The average D-measure of the IAT in the combination remain vs. leave was positive (μ = 0.526, σ = 0.410), indicating that participants on average perceived the word remain as slightly more positive and less negative than the word leave, t(181) = 17.309, p < 0.001, r = 0.790. The r-family effect size measure indicated a strong effect. As in Study 1, we found individual D-measures with an opposite association, as the range is from −0.908 to 1.373. People from the UK (μ = 0.504, σ = 0.415) and the USA (μ = 0.547, σ = 0.406) did not differ in D-measures, t(180) = −0.7114, ns.

Participants in the IAT stay vs. leave also exhibited on average a positive D-measure (μ = 0.592, σ = 0.351). As such, participants perceived the verb stay to be more positive and less negative than the verb leave, t(172) = 22.213, p < 0.001, r = 0.861. The effect size measure indicated a strong effect. As in Study 1, the range of values included fewer negative associations (−0.391;1.319) compared to the combination of remain vs. leave. Again, people from the UK (μ = 0.562, σ = 0.367) and the USA (μ = 0.622, σ = 0.333) did not differ in the D-measures, t(171) = −1.134, ns.

Our final set of analyses compared the IATs across the two groups tested in Study 2. We found, similarly to Study 1, that in comparison to the word leave, the word remain (μ = 0.526, σ = 0.410) was connoted less positively and more negatively than the word stay (μ = 0.592, σ = 0.351). Compared to Study 1, however, the group difference in the D-measures between the two IATs was smaller, t(353) = −1.619, p < 0.053, r = 0.086 (two-tailed). Given that Study 1 informed a directed prediction, we consider a one-tailed t-test justifiable, in which case the result would be statistically significant (i.e., p < 0.027). Hence, we take the results of Study 2 as a confirmation of the pattern observed in Study 1.

Discussion

Using a version of the IAT, we tested the suitability of two near synonyms (stay and remain) in combination with a potential antonym (leave) as an alternative response option in decision-making processes. The important findings were that remain was found to be connoted less positively and more negatively than its near synonym stay when evaluated in relation to leave. However, remain was evaluated still more positively than leave (Study 1). In addition, we found a substantial increase in negative perceptions of leave between Study 1 and Study 2. Notably, there were no significant differences in the empirical effects (the D-measure) between English native speakers (L1) and non-native speakers (L2) in Study 1 and people from the UK and the USA (L1) in Study 2. Furthermore, age groups and gender did not reliably affect the observed data.

How can it be that near-synonyms (stay vs. remain) display such salient differences in their evaluative structures, and why do framing effects appear to be affected neither by language nor by age or gender? And finally, what could explain the dynamics in the perception of leave? We will discuss these questions in turn.

We assume that participants perceived the verb stay to be more positive and less negative than both its potential synonym remain and its antonym leave because of implicit framing effects rooted in the etymology of stay.

As already mentioned in the “Theoretical and Linguistic Background” section, even though speakers may not have any direct access to etymology at all, the conceptual parameters that determine the functional range of an element in a given language are similarly accessible to L1 and L2 speakers. Therefore, stay, besides its etymological association with a fixed [position], displays from a phonetic-symbolic perspective prototypical readings associated with (a) a [force] acting against gravity (cf. STrength, STability, etc.), ending a movement (cf. STandstill, STop, etc.) or accompanying a movement (cf. STamping, STorming, etc.), as well as, (b) a specific [shape] (cf. STreet, STroke, etc.). The sound-symbolic associations of stay, implying “strength” and “straightness”, as well as the etymological associations suggesting “stability” and “security”, have the power to activate positive framing effects. Remain, on the contrary, displays the negative notion of being forced to stay in a [location] as it can be traced back to the Latin verb remanere ‘to be left behind (to die)’. Similarly, leave (derived from the Old English causative læfan ‘to allow sb. to stay in a [location] to survive’) is associated with an unpleasant situation, in which originally the survival of someone was ensured by leaving them behind.

Taking this into account, both verbs, remain and leave, originally refer to ‘staying in a place’, but etymologically remain implies a negative (‘dying’ [-value]) and leave a positive (‘surviving’ [+value]) outcome of the situation. Nevertheless, the difference between remain vs. leave was not as salient as expected in the IAT (cf. Study 1, but even less in Study 2). We assume that a suppressive effect was at work in Study 1: the combination remain vs. leave (other than remain vs. stay) was closely linked to the Brexit debate, and the majority of participants of Study 1 had an academic background and might therefore have been more likely to associate leave with something negative, which has more than levelled the etymologically positive [value] (Moore 2016). Nevertheless, in Study 2, people with different backgrounds from the UK (μ = 0.504, σ = 0.415) and the USA (μ = 0.547, σ = 0.406) did not differ in D-measure (t(180) = −0.7114, ns), which makes it unlikely that the level of support for membership in the European Union dilutes the linguistic effect that we seek to isolate. Furthermore, Study 2 was conducted with a considerable time lag to Study 1, and it cannot be ruled out that, in the meantime, the evaluation or perception of these three verbs fluctuated or changed due to awareness of the social and economic consequences of Brexit, which could also explain the deviations encountered in Study 2.

In other words, a suppressive effect linked to the synchronic or actual connotations of remain vs. leave might have levelled out the diachronic or etymological associations of remain, and even more of leave. This is supported by the finding that the r-family effect size measure only indicates a medium-sized effect and that the D-measures (ranging from −1.159 to 1.077) indicate that at least a few participants had the opposite association.

As mentioned before, our study is not concerned with the justification of potential choice of pairs of response options; it rather aimed at contributing to a discussion as to whether verbs in general should be used as response options due to inherent framing effects and therefore their lack of neutrality. In this context it is notable that in Study 1, we could not find significant differences in the IAT between L1 and L2 speakers, nor could we detect differences in Study 1, as well as in Study 2, between different age groups or genders.

This might be due to the fact that all three verbs are sensorimotor-based concepts. As indicated previously, bodily concepts are experienced by humans in a particularly intense manner and sensorimotor-based verbs are more likely to be remembered than others. We assume, therefore, that when we hear verbs such as stay, remain, or leave, some kind of “embodied” processing mechanism referring to the neural systems responsible for balancing the body in that particular [position] or carrying out the respective [movement] is implicitly activated, to some degree. This kind of mental simulation or mental “re-enactment” generates a stronger identification with what is said, since the perception is ‘experienced’ not only audibly or visually, but also at the sensorimotor level, and this experience is independent of language, age, and gender.

Sensorimotor-based concepts can be found in many areas of language, like, for instance, in to comprehend (<lat. comprehendere ‘to catch hold of, seize’ [contact]) or to grasp (an idea). Both verbs imply that in order to understand something, one must first examine it with the sense of touch [contact]. In grammar, too, sensorimotor concepts are often used to verbalize complex facts, as when movement in space is used to express movement in time (cf. I am going to do sth. [change of location]). The strong anchoring of these concepts in the mental lexicon as well as in grammatical structures may explain why we could not find significant differences in the IAT between L1 and L2 speakers, age groups or gender.

Framing effects in isolated linguistic units are based on the interaction of perception, cognition, and language. The underlying processes are rarely actively accessible even to L1 speakers. Nevertheless, the underlying universal structures or basic parameters (such as [location], [position], [contact], [change of location], etc.) seem to have an influence, at least unconsciously, on the concrete contexts of use. This obvious paradox between a lack of transparency on the one hand and intuitively coupled connotations with the coding parameters on the other hand is also evident in the IAT.

It should be noted that the IAT has been criticized on methodological grounds, for example, because it provides only a relative rather than an absolute measure of evaluative structure. Moreover, the resulting D-measure may not be sufficiently reliable to allow strong diagnostic predictions at the level of the individual, especially because the link between implicit measures of evaluative structure and actual behaviour may not be very strong in certain applied contexts (cf. e.g., Antons et al. 2017 for discussion). Yet, for the present purposes, we were interested in relative rather than in absolute effects on evaluative structure. Moreover, given the statistical power of general elections (with millions of voting participants), statistically small effects on evaluative structure, even if reflecting only weak relations to overt behaviour, could still have a significant and meaningful impact on explicit voting behaviour.

Taking this into account, verbs in general and sensorimotor verbs in particular have the inherent power, due to their framing effects, to convey a great deal of information in a concise manner, and to condense complex issues into more simplified packages of information by defining patterns of perception (via embodiment and enaction processes) to which people can respond. In other words, verbs are suitable in many contexts (marketing, campaigning, etc.) as they help to address symbolic themes residing in segments of public awareness. Nevertheless, as suggested by our data, they might be, due to their strong language-, gender-, and age-independent framing effects, less suitable as alternative response options.

Therefore, it might be wise to limit the use of such verbs to the formulation of questions, rather than use them as linguistic anchors for the behavioural choice decision itself. This could contribute to voting behaviour less prone to bias and to a more objective formulation of referendum questions.

In order to be able to show even more clearly the greater scope for interpretation and the stronger context-dependency of verbal response options in comparison to the conceptually simpler, context-independent and already established yes/no alternative, we can envisage further studies analysing non-sensorimotor based concepts, simple vs. complex predicates (cf. hold/keep vs. give away/let go), the role of negation (do vs. do not) or verbs sharing the same etymology but differing in their scalar orientation (cf. decrease vs. increase).

Conclusion

As many studies have shown that small changes in the formulation of a question can have large effects on the decision, our article addressed the question of whether two heterogeneous (static vs. dynamic) verbs can function as response alternatives in an important decision-making process. We used the two referendums on membership of the European Union in the United Kingdom in 1975 and 2016 as a case study, as the yes/no response option was replaced in the second referendum (2016), in contrast to 1975, by a verbal and conceptually more complex alternative response option (remain/leave). Our aim was therefore to illustrate, through a combined linguistic and experimental analysis, that verbal framing effects can influence decision-making processes. Overall, the data from two temporally separated IATs, focussing on L2 (Study 1, n = 185) and L1 speakers (Study 2, n = 355), suggest that the exact wording of dichotomous response options has the potential to influence response choice. Verbs therefore seem less suitable as a replacement for yes/no response options.