Introduction

The kind of food we buy in a grocery store, or order in a restaurant or a snack bar, is not only determined by what we desire to eat, but also by what we perceive in our surroundings. For instance, an attractive food image may capture one’s attention and evoke a positive emotion that results in the decision to buy the food that is displayed or to visit a restaurant that serves it, while an aversive image may lead to the opposite behaviour (Dalenberg et al., 2014). Images play an important role in advertising food since they have a strong impact on customer attitudes and expectations (Jaeger and MacFie, 2001; Townsend and Kahn, 2013). Food images are for instance effectively used for culinary tourism promotion and for destination marketing to influence travel planning (Andersson et al., 2016; Liu et al., 2013). Images can trigger emotional processing, which in turn influences consumers’ perceptions and attitudes towards a product or service (Lee et al., 2009; Poor et al., 2013). Emotions are a major driver behind human consumption behaviour (Desmet and Schifferstein, 2008; Gutjar et al., 2015). To date, there is a growing interest in the emotional response to food images (for a recent review, see e.g. Lagast et al., 2017). One of the main drivers is the increasing popularity of the online-to-offline (O2O) food delivery market (Wang et al., 2020; Xu and Huang, 2019), which has increased significantly since the outbreak of the COVID-19 pandemic. O2O food delivery services allow customers to order food items through mobile apps or websites and have them delivered to their workplace or home within a short timeframe. Food images play an essential role in this process. Compared to text, images more effectively attract attention, are processed more rapidly and automatically (Luna and Peracchio, 2003; Townsend and Kahn, 2013), and more effectively elicit emotions. 
Also, restaurants and food courts increasingly use digital menu boards showing static or even dynamic food images to provide more appealing descriptions of the dishes served (Beldona et al., 2014; Peters and Mennecke, 2011, 2013; Toet et al., 2019c). Finally, people increasingly share food images over the internet (‘digital grazing’; Spence et al., 2015), as reflected in the growing number of sites dedicated to food images on platforms like Instagram, Flickr and Snapchat (Abbar et al., 2015; Mejova et al., 2016).

Although it is clear that perceiving a food image triggers an emotional response, and that this response influences consumer behaviour substantially, it remains unclear whether this response depends solely on the food image that is seen at a particular moment or whether previously seen food images affect this response. Indeed, it is important to note that an emotional response does not disappear instantaneously after the stimulus disappears, but instead slowly decays over time (in the order of seconds: Schönfelder et al., 2013). As a result, the emotional response to a particular food image is most likely not a ‘pure’ response to the stimulus itself, but rather a mixture or sum of (both positive and negative) emotional experiences over a limited time. From the perception literature we know that our percept is not solely determined by what we perceive at a given moment in time, but also by the information we received and processed recently (in the order of seconds). For instance, if participants are asked to rate the attractiveness of faces, then the rating for a given face does not only depend on its attractiveness, but also on the rating of the faces they recently saw (Kok et al., 2017; Van der Burg et al., 2019; Xia et al., 2016). Sequential dependencies in visual perception have been predominantly reported when participants were asked to judge low-level features such as colour (Barbosa and Compte, 2020), orientation (Fischer and Whitney, 2014), and motion (Alais et al., 2017). However, several studies also reported sequential effects associated with more complex stimuli or judgements, like for face attractiveness ratings in an online dating paradigm (Taubert et al., 2016), and even for goalkeepers’ decisions about the direction of their dives in penalty shoot-outs (Misirlisoy and Haggard, 2014). These sequential effects can either be negative or positive. 
In case of a positive or assimilative sequential effect, the percept of a current stimulus is biased toward the previous stimulus. For instance, a face is perceived as more attractive after seeing an attractive face, and as less attractive after seeing an unattractive face (Kok et al., 2017; Taubert et al., 2016; Van der Burg et al., 2019; Xia et al., 2016). In case of a negative or repulsive effect, the percept is biased away from the previous stimulus. For instance, the perceived orientation of a Gabor patch is shifted away from its orientation on the previous trial (i.e., the tilt aftereffect: Gibson and Radner, 1937). It was recently postulated that a negative sequential dependency reflects a low-level perceptual process, whereas a positive sequential effect reflects higher-order post-perceptual (e.g., decisional) processes (Fritsche et al., 2017, but see Taubert et al., 2016). Regardless of sign, both serial effects may have functional benefits. From a behavioural perspective, a negative serial dependence makes it easier to detect changes in the world by enhancing perceptual differences. A positive serial dependence reduces perceptual differences, as if the brain integrates (or averages) information over time. A potential behavioural advantage of this process could be that the associated averaging process (termed an association field: Kiyonaga et al., 2017) reduces noise in the visual system (induced by eye-movements, blinks, light conditions, occlusion, etc.), resulting in a more stable or coherent percept of the world around us.

Whereas the vast majority of studies investigated serial dependencies using clearly defined perceptual features within the visual, auditory and even tactile domain (such as motion, colour, orientation, frequency, etc.), very little research has been conducted on serial dependencies for amodal stimuli, like emotions, which can be evoked through different sensory channels. An exception is a study by Liberman et al. (2014), who reported a positive serial dependence when participants judged the emotions of facial expressions. Although it is possible that such serial dependencies for facial expressions reflect an emotional aftereffect, it is also possible that they reflect a purely visual inter-trial effect. Indeed, faces are typically processed by the fusiform face area (FFA, Kanwisher et al., 1997), a part of the human visual system that is specialized for facial recognition and that is also sensitive to emotional facial expressions (Harry et al., 2013). It therefore remains an open question whether serial dependencies can be observed for emotions, which are typically processed in neural networks involving different parts of the human brain, like the amygdala (Rasia-Filho et al., 2000).

According to the circumplex model of affect (Russell, 1980), emotions can be described by their two principal dimensions: valence (pleasantness; the degree of positive or negative affective response to a stimulus) and arousal (the intensity of the affective response to a stimulus; the degree of activation or deactivation). In this exploratory study we investigate whether sequential effects also play a role in the affective appraisal of food images, since it is known that such stimuli typically evoke emotional responses (Kaneko et al., 2018). To this end, we re-analysed an existing data set, containing both published (Kaneko et al., 2018) and unpublished data from 16 different countries. Participants watched a randomly ordered sequence of 60 different food images and reported their affective appraisal of each image in terms of valence and arousal. Through an inter-trial analysis we investigate whether the valence and arousal ratings on a given trial depend on the ratings on the trials immediately preceding that trial (up to four trials back; for a similar approach see: Van der Burg and Goodbourn, 2015; Van der Burg et al., 2019). We expect a positive serial dependence if (a) the food images evoke an emotional response, and (b) that response remains present for a while, such that the emotion for the subsequent food image is merged with the previous emotion. There is ample evidence that people’s nationality may affect their perception and experience of food (Ichijo and Ranta, 2016; O’Connor, 2009). For instance, Kaneko et al. (2018) observed that Japanese participants systematically reported lower mean arousal and used a smaller range of the valence scale for food images than European participants. 
Given that our sample contains participants from 16 different countries (Bulgaria, Canada, China, Netherlands, Ethiopia, Germany, Indonesia, Iran, Japan, Korea, Nigeria, Pakistan, South Africa, Taiwan, Turkey, and United Kingdom), we also examine whether nationality moderates the serial dependence for affective appraisal of food images (in terms of valence and arousal). To the best of our knowledge, no previous study examined whether serial dependencies in general (regardless of the stimuli) depend on the participants’ nationality, as most studies used rather small sample sizes (in the order of 10–20 participants) with participants from a rather homogeneous group (e.g., Australian students in the study by Van der Burg et al., 2019). Besides nationality, we also examine whether the participants’ age, gender, and body mass index (BMI) moderate the serial dependence for valence and arousal, as these are known to influence food-evoked emotions. For instance, Burger et al. (2011) observed that BMI correlated positively with the rated appeal of discretionary foods (desserts, energy dense foods), while Abdella et al. (2019) found that age, gender and BMI all affect food craving. Note that BMI is also related to nationality: people in more individualistic countries (e.g., United Kingdom) have higher BMI than people in collectivist countries (e.g., Japan; Masood et al., 2019).

Methods

Participants

In total, 1322 participants from 16 different countries [Bulgaria (BG), Canada (CA), China (CN), Netherlands (NL), Ethiopia (ET), Germany (DE), Indonesia (ID), Iran (IR), Japan (JP), Korea (KP), Nigeria (NG), Pakistan (PK), South Africa (ZA), Taiwan (TW), Turkey (TR) and United Kingdom (UK)] participated in the experiment. The data from 44 participants (3.1%) were discarded from further analyses, because of the following exclusion criteria: age (< 18 years, > 80 years, or not provided), body height (unrealistic: < 140 or > 210 cm, or not provided), or weight (unrealistic: < 35 kg, or not provided). Note that we excluded people based on their body height or weight as these two numbers are required to calculate the BMI. The demographic information from the remaining 1278 participants (795 females; mean age: 29.8 years, ranging from 18 to 72 years; mean BMI: 23.6, ranging from 13.2 to 70.2) for each country is shown in Table 1. Most participants were recruited through postings on social media and direct emailing. Groups from Germany and the United Kingdom were recruited via the Prolific database (https://prolific.ac), and participants from Japan via the Crowdworks database (https://crowdworks.jp). Exclusion criteria at recruitment were age (either younger than 18 years or older than 80 years) and colour vision deficiencies. Participation was voluntary, and all participants were naïve as to the purpose of the experiment. The experimental protocol was reviewed and approved by the TNO Ethics Committee (Ethical Approval Ref: 2017-011), and was in accordance with the Helsinki Declaration of 1975, as revised by the World Medical Association (World Medical Association, 2013).

Table 1 Demographic information for each country.

Stimuli

The experiment was set up and run using Gorilla (see www.gorilla.sc). The stimulus set consisted of 60 different food images (850 × 640 pixels): 50 images were selected from the CROCUFID cross-cultural food image database (Toet et al., 2019b) and 10 images were selected from the FoodCast research image database (FRIDa, Foroni et al., 2013). Figure 1b shows some representative images that were used in the present study. The full set of stimuli is described in more detail elsewhere (Kaneko et al., 2018; Toet et al., 2018) and is also available online (https://osf.io/cyqg7/download). The food images were selected such that their associated mean valence ratings (as determined in earlier studies, see Toet et al., 2018, 2019a) are distributed along the entire scale (ranging from very low valence for rotten, mouldy or contaminated food, via neutral for raw onions, boiled eggs or potatoes, to very high valence for fresh fruit, chocolates and pastries). The 10 additional food images from the validated FRIDa database were selected such that their associated valence scores (as reported in their accompanying data file, see Foroni et al., 2013) also cover a large part of the scale. These validated FRIDa images were included as anchor points for verification purposes (see Toet et al., 2018). Based on previous studies, the associated mean valence ratings range between 2 and 90 on a scale from 0 to 100 (Kaneko et al., 2018; Toet et al., 2018).

Fig. 1: Rating tool and stimuli.

A EmojiGrid affective self-report tool used to rate valence (horizontal axis) and arousal (vertical axis) for each image. B Example images used in the present study.

Measures

Valence and arousal were measured with the EmojiGrid (see Fig. 1a; this tool was introduced in Toet et al., 2018). The EmojiGrid is a rectangular grid (similar to the Affect Grid: Russell et al., 1989) that is labelled with emoji showing different facial expressions. The facial expressions of the emoji along the horizontal (valence) axis vary from unpleasant (left) via neutral to pleasant (right) (scale 0–100), while the intensity of their expressions gradually increases along the vertical (arousal) axis (scale 0–100). The opening of the mouth and the shape of the eyes represent the arousal dimension, while the concavity of the mouth, the orientation and curvature of the eyebrows, and the vertical distribution of these features over the area of the face express the degree of valence. Participants report their affective state (valence and arousal) by placing a checkmark at the appropriate location on the grid. Previous validation studies confirmed that the facial expressions of the emoji and their arrangement over the valence-arousal space agreed with the users’ intuition (Toet et al., 2018). In previous studies we found that users were intuitively able to use the EmojiGrid to report their food-evoked emotions (Kaneko et al., 2018; Toet et al., 2018). The EmojiGrid has been extensively validated with different age groups, nationalities, and ethnicities for a range of different affective stimuli (Kaneko et al., 2018; Toet et al., 2018, 2019a; Toet and Van Erp, 2019).

Procedure

Participants took part in an anonymous online survey. The survey commenced by presenting general information about the experiment and thanking participants for their contribution. Also, the participants were asked to perform the survey on a (laptop) computer only (smartphones and other handheld devices with small displays were not allowed) and to activate the full-screen mode of their web browser to maximize the resolution and to avoid distractions from other software that could be running in the background. Subsequently, participants were informed that they would see 60 different food images during the experiment, and they were asked to rate their first impression of each image. It was emphasized that there were no correct or incorrect answers, and that it was important to respond seriously. Subsequently, participants electronically signed an informed consent by clicking “I agree to participate in this study”, affirming that they were at least 18 years old and voluntarily participated in the study. The survey then continued with an assessment of the demographics (age, gender, height, and weight). Next, the participants were shown the EmojiGrid response tool together with a brief explanation about its use: “Click on a point in the grid that best matches your feelings towards the picture”. No further reference was made to the dimensions of valence and arousal. Then they performed two practice trials to further familiarize themselves with the use of this tool. Immediately after these practice trials, the actual experiment started. The 60 different food images were randomly presented over the course of the experiment. On each trial, participants saw a food image and reported their affective appraisal using the EmojiGrid. The entire experiment lasted about 10 min on average. The instructions were provided in English for all countries, except for China, Indonesia, Iran and Turkey. 
For these countries, the instructions were translated into Mandarin, Bahasa Indonesia, Farsi and Turkish, respectively. Screenshots illustrating the procedure and the screen layout are available online (https://osf.io/cyqg7/download).

Data analysis

Valence serial dependence

We performed an inter-trial analysis to examine whether the valence rating on a given trial t depends on the valence rating on the previous trial (t−1; for a similar approach, see e.g. Alais et al., 2015, 2017; Harvey et al., 2014; Van der Burg et al., 2013, 2015, 2019). To this end, for each participant individually, we first calculated the median valence rating over all 60 food images (i.e., a neutral valence rating). Subsequently, each image was labelled ‘low valence’ if the participant’s valence rating for that image was below the participant’s median valence rating, and ‘high valence’ if it was equal to or above the median. Then, we binned the data into two bins. One bin contained those trials in which the preceding food image was labelled low valence, and the other bin contained those trials in which the preceding food image was labelled high valence. We then calculated the mean valence rating for each bin. The difference between the mean valence ratings of the high and low valence bins represents the valence inter-trial effect for that participant. Note that this analysis was conducted not only for one trial back, but also for up to four trials back (i.e., trials t−2, t−3, and t−4). These analyses were identical to the one described above, except that two new bins were created for each trial distance. For instance, for a sequential effect two trials back (t−2), one bin contained those trials in which the food image on trial t−2 was labelled low valence, and the other bin contained those trials in which the food image on trial t−2 was labelled high valence. For each of the four analyses (t−1, t−2, etc.), the first four trials of the experiment were excluded, since we were interested in inter-trial effects up to four trials back. We conducted a repeated measures ANOVA on the mean valence inter-trial effect with trial distance (1–4) as within-subjects variable. 
Here and elsewhere in the manuscript, α was set to 0.05 and, if applicable, p-values were Huynh–Feldt corrected for sphericity violations. The trial distance effect was further investigated using one-sample two-tailed t-tests for each trial distance to examine whether the inter-trial effect was significantly different from zero. Unless otherwise stated, all statistical analyses were conducted using JASP (Love et al., 2019).
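
To make the binning procedure concrete, the following minimal Python sketch computes the inter-trial effect for a single participant. It is an illustration rather than the analysis code used in the study: the function name `inter_trial_effect`, its parameters `lag` and `skip`, and the example ratings are hypothetical, and the sketch assumes that the ratings are supplied in presentation order.

```python
from statistics import mean, median


def inter_trial_effect(ratings, lag=1, skip=4):
    """Inter-trial effect for one participant (hypothetical helper).

    ratings: ratings (0-100) in the order in which the images were shown.
    lag:     trial distance to condition on (1 = previous trial, up to 4).
    skip:    number of initial trials to exclude (4 in the multi-lag analysis).
    Returns mean(rating | trial t-lag was 'high') minus
            mean(rating | trial t-lag was 'low'), or None if a bin is empty.
    """
    cutoff = median(ratings)
    # Median split: 'high' means >= the participant's median, 'low' means < median.
    is_high = [r >= cutoff for r in ratings]

    after_low, after_high = [], []
    # Start late enough that trial t-lag exists and the first `skip` trials are dropped.
    for t in range(max(skip, lag), len(ratings)):
        if is_high[t - lag]:
            after_high.append(ratings[t])
        else:
            after_low.append(ratings[t])

    if not after_low or not after_high:
        return None  # the difference is undefined when one bin is empty
    return mean(after_high) - mean(after_low)


# Hypothetical ratings for a short sequence of eight trials.
example = [10, 20, 80, 90, 30, 70, 40, 60]
effects = {lag: inter_trial_effect(example, lag=lag) for lag in (1, 2, 3, 4)}
```

Calling the function with `lag` set to 1 through 4 yields the four per-participant values that would enter the repeated measures ANOVA. A positive value corresponds to an assimilative effect and a negative value to a repulsive effect. For the analyses restricted to the previous trial, `skip=1` would discard only the first trial.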

Arousal serial dependence

A similar inter-trial analysis was conducted to investigate whether the mean arousal rating on a given trial t depends on the arousal rating on the previous four trials. The statistical analysis was the same as the analysis for the valence serial dependence.

Serial dependencies for individual images

We performed both a valence and an arousal inter-trial analysis for each food image used in the present experiment to examine whether the ratings of all images were susceptible to the valence and arousal rating on the previous trial, respectively. Here, and in the remaining analyses, we focus on the previous trial (t−1) only. Therefore, only the first trial was discarded from further analyses. Because each image was shown only once to each participant, a given image (if it was not the first image in the experiment) followed either a low rated image or a high rated image for that participant. Therefore, for each food image, we conducted an independent t-test on the valence rating with previous valence rating (low versus high) as between-subjects variable. Similar analyses were conducted for each image on the arousal rating. Statistical analyses were performed using the SciPy module in Python (Virtanen et al., 2020).
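
The per-image comparison can be sketched as follows. This is a hedged illustration, not the authors' script: the helper names `split_by_previous` and `student_t` and the data layout (one list of `(image_id, rating)` pairs per participant, in presentation order) are hypothetical. The sketch assumes that 'low' and 'high' are defined relative to each participant's own median, as in the main analysis, and that the pooled-variance form of the independent t-test is intended (this matches the default of SciPy's `ttest_ind`, which would additionally return the p-value).

```python
from math import sqrt
from statistics import mean, median, variance


def split_by_previous(participants):
    """Group each image's ratings by the label of the preceding trial.

    participants: list of sequences [(image_id, rating), ...], one per
                  participant, in presentation order (assumed layout).
    Returns {image_id: {"high": [...], "low": [...]}}.
    """
    groups = {}
    for seq in participants:
        cutoff = median(rating for _, rating in seq)
        # Pair every trial with its predecessor; the first trial has none
        # and is therefore dropped automatically.
        for (_, prev_rating), (image_id, rating) in zip(seq, seq[1:]):
            label = "high" if prev_rating >= cutoff else "low"
            groups.setdefault(image_id, {"high": [], "low": []})[label].append(rating)
    return groups


def student_t(after_high, after_low):
    """Pooled-variance independent-samples t statistic and degrees of freedom.

    Requires at least two observations in each group.
    """
    n1, n2 = len(after_high), len(after_low)
    pooled_var = ((n1 - 1) * variance(after_high)
                  + (n2 - 1) * variance(after_low)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(after_high) - mean(after_low)) / standard_error, n1 + n2 - 2
```

Because every participant contributes at most one rating per image, the two groups for a given image contain different participants, which is why a between-subjects test is appropriate here.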

The effect of gender, age, BMI and nationality

To examine whether demographic variables moderate the valence and arousal serial dependencies, we conducted an ANCOVA on the valence and arousal inter-trial effects with age and BMI as continuous variables and gender and nationality as categorical variables. For each participant, the BMI was calculated using Eq. (1) (Keys et al., 1972):

$${\mathrm {body}}\,{\mathrm {mass}}\,{\mathrm {index}} = \frac{{{\mathrm {weight}}\,\left( {{\mathrm {kg}}} \right)}}{{{\mathrm {height}}\,\left( {\mathrm {m}} \right)^2}}$$
(1)

Here, the BMI is computed as the ratio between the participants’ self-reported weight (kg) and squared height (m²).
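
As a small worked illustration of Eq. (1) together with the exclusion criteria listed in the Participants section, consider the sketch below. The function names and the example values are hypothetical, and the sketch assumes that height was recorded in centimetres (the unit in which the exclusion thresholds are stated) and that missing answers are represented as `None`.

```python
def bmi(weight_kg, height_cm):
    """Body mass index following Eq. (1): weight (kg) / height (m) squared."""
    height_m = height_cm / 100.0  # convert cm to m before squaring
    return weight_kg / height_m ** 2


def is_included(age, height_cm, weight_kg):
    """Apply the reported exclusion criteria to one participant.

    Excluded: age < 18, age > 80, height < 140 cm, height > 210 cm,
    weight < 35 kg, or any of the three values not provided (None).
    """
    if age is None or height_cm is None or weight_kg is None:
        return False
    return 18 <= age <= 80 and 140 <= height_cm <= 210 and weight_kg >= 35


# Hypothetical participant: 70 kg and 175 cm give a BMI of about 22.9.
example_bmi = bmi(70, 175)
```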

If a particular covariable had a significant effect on a sequential dependence, we conducted an ANOVA on the mean rating with previous rating (low or high) as within-subject variable and the covariable as between-subject variable.

Results

Valence serial dependencies

We first examined whether the valence rating on a given trial t depends on the valence rating up to four trials back (i.e., either trial t−1, t−2, t−3 or t−4; see Manassi et al., 2018, for a similar approach). Figure 2a depicts the mean valence rating (on a 100-point scale) on a given trial t as a function of the valence rating for trial t−1 back to trial t−4. The valence inter-trial effect (i.e., the difference between the high and low valence ratings in the upper panel) is plotted as a function of the trial distance relative to the current trial t in the lower panel.

Fig. 2: Sequential effects for food image processing.

A Upper panel: Mean valence rating (on a 100-point scale) on trial t as a function of the valence rating (low: purple squares, or high: orange circles) on trial t−1 back to trial t−4. The dotted line indicates the group mean valence rating. Lower panel: Valence inter-trial effect (i.e., the difference between the high and low valence rating in the upper panel) as a function of the trial distance relative to the current trial t. B Upper panel: Mean arousal rating on trial t as a function of the arousal rating (low: purple squares, or high: orange circles) on trial t−1 back to trial t−4. The dotted line indicates the group mean arousal rating. Lower panel: Arousal inter-trial effect (i.e., the difference between the high and low arousal rating in the upper panel) as a function of the trial distance relative to the current trial t. Error bars represent ±2 standard errors of the mean. Significance levels: *p < 0.05, **p < 0.01, and ***p < 0.001.

The ANOVA on the mean valence inter-trial effect yielded a significant trial distance effect, F(3, 3831) = 34.855, p < 0.001, indicating that the inter-trial effect varied as a function of trial distance relative to the current trial t (see lower panel in Fig. 2a). The inter-trial effect for trial t−1 (2.4) was significantly different from zero, t(1277) = 9.277, p < 0.001, as the valence rating on a given trial t was greater when the valence rating on the previous trial was high than when it was low (i.e., a positive serial dependence). The inter-trial effect was not significantly different from zero with regard to trial t−2 (−0.1, t(1277) = 0.457, p = 0.647) and trial t−3 (−0.4, t(1277) = 1.749, p = 0.081). Interestingly, a repulsive inter-trial effect was observed with respect to trial t−4 (−0.7), t(1277) = 2.824, p = 0.005, as the valence rating on trial t was significantly lower when the valence rating four trials back was high than when it was low.

Arousal serial dependencies

Subsequently, we investigated whether the arousal rating on a given trial t depends on the arousal rating up to four trials back (i.e., trial t−1, t−2, t−3, and t−4). Figure 2b depicts the mean arousal rating (on a 100-point scale) on a given trial t as a function of the arousal rating for trial t−1 back to trial t−4. The arousal inter-trial effect (i.e., the difference between the high and low arousal ratings in the upper panel) is plotted as a function of the trial distance relative to the current trial t in the lower panel.

The ANOVA on the mean arousal inter-trial effect yielded a significant trial distance effect, F(3, 3831) = 80.590, p < 0.001, indicating that the inter-trial effect varied as a function of trial distance relative to the current trial t (see lower panel in Fig. 2b). The inter-trial effects were significantly different from zero for trial t−1 (4.3, t(1277) = 16.907, p < 0.001), t−2 (1.6, t(1277) = 6.267, p < 0.001), and t−3 (0.5, t(1277) = 2.085, p < 0.05). The sign of these sequential effects was positive, suggesting that the arousal rating on trial t was significantly higher when the arousal ratings on trial t−1, t−2, or t−3 were high than when the arousal ratings on these preceding trials were low. The sequential effect was not significantly different from zero for trial t−4 (0.2, t(1277) = 0.605, p = 0.545).

Serial dependencies for individual images

Figure 3a and b illustrate the mean arousal rating as a function of the mean valence rating for each food image, plotted separately for trials preceded by a low and by a high rating, for the valence and the arousal inter-trial effect, respectively.

Fig. 3: Valence and arousal inter-trial effects for each image (left versus right panels, respectively).

A and B Mean arousal rating as a function of the mean valence rating for each of the 60 food images. In panel A, the orange circles signify those trials whereby the image was preceded by an image with a high valence rating, whereas the purple squares signify the opposite. In panel B, the orange circles signify those trials in which the image was preceded by an image with a high arousal rating, whereas the purple squares signify the opposite. C Valence inter-trial effect as a function of the mean valence rating for each food image. Red circles indicate a significant difference (p < 0.05; two-tailed independent t-test), whereas grey circles signify non-significant differences. D Arousal inter-trial effect as a function of the mean arousal rating for each food image. Note that the continuous lines in all panels represent quadratic fits to the data points (see also Kaneko et al., 2018).

As shown in Fig. 3 (upper panels), the mean valence and arousal ratings for the 60 food images used in the present study were rather diverse and spread over the valence-arousal space, forming a U-shaped pattern (see also Kaneko et al., 2018; Toet et al., 2018). A question is then whether the observed valence and arousal inter-trial effects hold for each image used, or whether the sequential effects observed were driven by a subset of the images used. For instance, the observed inter-trial effects could be driven predominantly by images whose rating was ~50.0 (i.e., a rather neutral rating) and to a lesser extent by images rated at either ceiling or floor performance, as in the latter two cases there is simply less room for an increment or decrement, respectively. Here, and in the remaining part of the manuscript, we only take into account the ratings on the previous trial. Figure 3c and d illustrate the valence and arousal inter-trial effects for each image, as a function of the mean valence and mean arousal rating, respectively. Here, each circle represents a food image (like in Fig. 3a and b).

Regarding valence, a positive serial dependence was observed for 55 (out of 60) food images (i.e., 91.7% of the images). The two-tailed independent t-tests yielded a significant inter-trial effect for 50.0% of all the images (see the red circles in Fig. 3c). To correct for multiple comparisons, we applied a false discovery rate (FDR) correction to the p-values (Benjamini and Hochberg, 1995). After correction, there was a significant effect for 35.0% of the images. In general, a positive serial dependence for valence was observed for most food images used (although not all food images showed a significant effect). However, as is clear from the quadratic fit in Fig. 3c, the valence inter-trial effect was most pronounced for the images whose mean valence rating was far away from either floor (mean valence rating < 10.0) or ceiling (mean valence rating > 70.0) performance, and rather minimal when the valence rating approached either ceiling or floor performance.
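
For readers unfamiliar with the correction, the Benjamini and Hochberg (1995) step-up procedure can be sketched in a few lines of Python. This is a generic illustration of the published procedure, not the code used for the present analyses: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the FDR level `q = 0.05` is assumed to equal the α used elsewhere in the manuscript.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (same order as p_values) that is True
    where the null hypothesis is rejected at false discovery rate q.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) for which p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    # Reject every hypothesis whose p-value has rank <= k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten tests (one per image in a real analysis).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(example_p)
```

In this example several uncorrected p-values fall below 0.05 but do not survive the correction, which mirrors the drop in the proportion of significant images reported above.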

Regarding arousal, a positive serial dependence was observed for 59 out of 60 food images (i.e., 98.3%). The t-tests yielded a significant inter-trial effect for 66.7% of the images (see the red circles in Fig. 3d). Note that after FDR correction, a significant inter-trial effect was observed for 63.3% of the images. In contrast to the valence rating, the mean arousal rating never approached floor performance, and the optimal quadratic fit was more like a linear fit, with the food images above the fit showing a significant effect regardless of the mean arousal rating.

We observed a valence as well as an arousal sequential effect for the vast majority of food images. However, this assimilative effect was not statistically significant for all images. A likely explanation is that each image was presented only once to each participant, making the inter-trial effect rather noisy. In the case of valence, the strongest inter-trial effects were observed for those images that were rated as rather neutral (i.e., valence preference was not clearly determined). In the case of arousal, there was no clear subset that drove the inter-trial effect.

The effect of gender, age, BMI and nationality

We conducted an ANCOVA on both the mean valence and arousal inter-trial effects with age and BMI as continuous variables and gender and nationality as categorical variables to examine whether these co-variables moderate the observed effects. With regard to the arousal rating, the ANCOVA yielded a significant effect of gender, F(1, 1244) = 5.266, p = 0.022, indicating that gender moderated the arousal serial dependence. None of the other variables were significant (all F-values ≤ 1.413, p-values ≥ 0.133). With regard to the valence rating, the ANCOVA yielded a trend towards a significant gender effect, F(1, 1244) = 3.450, p = 0.063, suggesting that gender moderated the valence serial dependence (note that without nationality as covariate, the effect of gender was significant, F(1, 1274) = 4.712, p = 0.030). All other effects were not significant (all F-values ≤ 2.580, p-values ≥ 0.108).

Next, we examined how gender affected the serial dependencies. Figure 4a illustrates the mean valence rating as a function of the valence rating on the previous trial for both males (n = 483) and females (n = 795).

Fig. 4: The effect of gender on the valence and arousal serial dependencies.

A Mean valence rating as a function of the valence rating on the previous trial for males and females. B Mean arousal rating as a function of the arousal rating on the previous trial for males and females. Error bars represent ±2 standard errors of the mean. Significance levels: *p < 0.05, **p < 0.01, and ***p < 0.001.

We conducted an ANOVA on the mean valence rating with previous valence rating (low or high) as within-subject variable and gender (male or female) as between-subject variable. The ANOVA yielded a significant previous valence rating × gender interaction, F(1, 1276) = 5.217, p = 0.023, as the valence serial dependence was larger for males than for females (3.4 versus 2.2). Importantly, the inter-trial effect was significant for both males and females, t(482) = 8.380, p < 0.001, and t(794) = 6.671, p < 0.001, respectively. The main effect of gender was also significant, F(1, 1276) = 9.217, p = 0.002, as the valence rating was overall higher for males (43.3) than for females (41.8).

Figure 4B illustrates the mean arousal rating as a function of the arousal rating on the previous trial for both males and females. We conducted an ANOVA on the mean arousal rating with previous arousal rating (low or high) as within-subject variable and gender (male or female) as between-subject variable. The ANOVA yielded a significant previous arousal rating × gender interaction, F(1, 1276) = 3.945, p = 0.047, as the arousal serial dependence was larger for males than for females (5.1 versus 4.1). Importantly, the inter-trial effect was significant for both males and females, t(482) = 12.873, p < 0.001, and t(794) = 12.947, p < 0.001, respectively. The main effect of gender was also significant, F(1, 1276) = 23.135, p < 0.001, as the arousal rating was overall higher for females (59.1) than for males (55.5).
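The interaction tests reported above can be related to the per-participant inter-trial effects. As a minimal sketch, and assuming each participant is summarised by a single difference score (mean rating after a high previous rating minus mean rating after a low one), the interaction F in a 2 × 2 mixed design equals the squared pooled-variance t statistic comparing the difference scores of the two groups, with 1 and n_a + n_b − 2 degrees of freedom (1276 for 483 males and 795 females). The function name and the toy numbers are hypothetical, and this is not necessarily the routine that was used for the reported analysis.

```python
from statistics import mean, variance


def interaction_f(effects_a, effects_b):
    """F(1, n_a + n_b - 2) for the within x between interaction in a 2 x 2 mixed design.

    Each argument is a list of per-participant inter-trial effects
    (difference scores) for one group. The result is the squared
    pooled-variance t statistic comparing the two group means.
    """
    n_a, n_b = len(effects_a), len(effects_b)
    # Pooled sample variance of the difference scores across both groups
    pooled = ((n_a - 1) * variance(effects_a)
              + (n_b - 1) * variance(effects_b)) / (n_a + n_b - 2)
    t = (mean(effects_a) - mean(effects_b)) / (pooled * (1 / n_a + 1 / n_b)) ** 0.5
    return t * t


# Toy difference scores (hypothetical, not study data)
group_a = [1.0, 2.0, 3.0]
group_b = [3.0, 4.0, 5.0]
f_value = interaction_f(group_a, group_b)
```

Working on difference scores makes explicit that the interaction asks only whether the size of the inter-trial effect differs between groups, independently of the overall rating level captured by the main effect of gender.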

Discussion and conclusion

In the present study, we investigated whether sequential effects play a role in the affective appraisal of food images. Here we showed for the first time that both the valence and arousal ratings for a food image are contingent upon the valence and arousal rating of a different food image on the previous trial, respectively. More specifically, a positive sequential dependence was observed with regard to the valence rating on the previous trial, indicating that the valence rating for a given image was higher when the previous food image was rated high on valence than when the previous food image was rated low on valence. A similar positive sequential effect was observed for the arousal rating. For the arousal rating, this assimilative effect was also observed up to three trials back, indicating that even the arousal rating of a food image three trials back affects the rating on the current trial. Interestingly, for the valence rating, we observed a negative (repulsive) effect with regard to the valence rating four trials back, indicating that the valence rating on a given trial was higher when a food image four trials back was rated low on valence than when it was rated high on valence.
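The n-back effects summarised above can be illustrated with a short sketch. It assumes that a rating counts as "high" when it exceeds the midpoint of a 0 to 100 scale; this split criterion, the function name, and the toy ratings are assumptions made for illustration and are not taken from the analysis itself.

```python
from statistics import mean


def n_back_effect(ratings, lag=1, split=50.0):
    """Mean current rating after a 'high' rating `lag` trials back minus
    the mean current rating after a 'low' one.

    `split` is an assumed criterion separating low from high ratings.
    Returns None when either bin is empty.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    after_low, after_high = [], []
    # Pair each rating with the rating given `lag` trials earlier
    for earlier, current in zip(ratings, ratings[lag:]):
        (after_high if earlier > split else after_low).append(current)
    if not after_low or not after_high:
        return None
    return mean(after_high) - mean(after_low)


# Toy rating sequence for one participant (hypothetical, not study data)
toy_ratings = [10, 20, 90, 80, 10]
one_back = n_back_effect(toy_ratings, lag=1)
two_back = n_back_effect(toy_ratings, lag=2)
```

A positive return value corresponds to an assimilative (positive) dependence at that lag, and a negative value to a repulsive one.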

The present study is not the first study reporting a serial dependence using emotional stimuli. Liberman et al. (2014) found a positive serial dependence when participants were instructed to rate the emotion of a facial expression on successive trials. Although the positive serial dependence in the Liberman et al. study can, in principle, be assigned to an emotional serial dependence, it is more likely that their effect reflects a visual phenomenon. Although, in the present study, participants were instructed to rate the food images by visual inspection, it is most likely that the observed serial dependence reflects an emotional process that is by definition amodal (LeDoux, 2007; McDonald, 1998). In other words, if participants have an emotional response when they view a food image, then we propose that some residual emotional activity remains present in the brain when viewing the subsequent food image, such that the evoked emotional response to the new image is a mixture of the previous emotion and the actual emotion triggered by the new image. This may also explain why we observed a positive serial dependence (as the emotions are merged), and why we found effects up to three trials back in our current study using food images, and others even up to 5 or more trials back using faces (Van der Burg et al., 2019; Xia et al., 2016). These sequential dependencies over multiple trials may reflect the temporal window over which visual information (in the case of faces) or emotional information (in the case of food images) is merged. However, an alternative possibility is that the inter-trial effect over multiple trials does not reflect the temporal window over which multiple emotional experiences are merged, but instead represents a carry-over effect from one trial to another (i.e., trial t−2 affects trial t−1, and trial t−1 affects trial t, explaining why trial t−2 indirectly affects trial t). Another intriguing open question is why such a merging or carry-over effect should extend up to three trials back for arousal ratings but only one trial back for valence ratings, as observed in the present study.
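The carry-over account can be made concrete with a simple first-order model. This is only an illustrative sketch: the symbols r_t (the mean-centred rating on trial t), β (an assumed carry-over weight between 0 and 1) and ε_t (the response evoked by the image on trial t itself) are introduced here for illustration and are not estimated from the data.

```latex
\begin{align}
r_t &= \beta\, r_{t-1} + \varepsilon_t \\
    &= \beta^{2} r_{t-2} + \beta\,\varepsilon_{t-1} + \varepsilon_t \\
    &= \beta^{k} r_{t-k} + \sum_{j=0}^{k-1} \beta^{j}\,\varepsilon_{t-j}
\end{align}
```

Under this model, trial t−k influences trial t only indirectly, with weight β^k. For an illustrative value of β = 0.3, the weights for one, two and three trials back would be 0.3, 0.09 and 0.027, so the indirect influence shrinks geometrically with lag.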

In the present study, participants were always instructed to rate the food images. Therefore, it remains unclear whether simply viewing a food image is sufficient to observe an assimilative serial dependence or whether an explicit emotional task is a prerequisite to observe the effect. Recently, Van der Burg et al. (2019) found evidence for a positive serial dependence when participants were instructed to judge the attractiveness of a face from trial-to-trial. However, this positive serial dependence disappeared when the task alternated between a gender judgement task and attractiveness task even though the same stimuli were used. Thus, when the previous trial was associated with a different task, the serial dependence for facial attractiveness was absent. This supports the idea that positive serial dependencies (for at least facial attractiveness) are task specific (but see Van der Burg et al., 2018). However, other studies reported both a positive serial dependence (Fornaciai and Park, 2018) and a negative serial dependence (Van der Burg et al., 2013) when no judgement was made on the preceding trial (i.e., a passive trial), indicating that sequential dependencies can, in principle, arise rather effortlessly and automatically. The present study does not afford any conclusions about whether simply viewing an image is sufficient to elicit an assimilative serial dependence as the participants always performed the task.

To our knowledge, the present study is the first study investigating whether nationality moderates serial dependencies. In general, we observed an assimilative serial dependence with regard to the valence and arousal rating. Such positive after-effects are proposed to reflect post-perceptual processes, such as decisional and response selection processes, and it is therefore possible that such processes may be driven by the participants’ country of residence, as food perception in general depends largely on the participants’ culture or nationality (Ichijo and Ranta, 2016; O’Connor, 2009). However, we did not find any evidence that nationality moderates the positive serial dependence for either the valence or the arousal rating. That nationality did not moderate the serial dependence for food-evoked emotions suggests that the inter-trial effect generalizes across different countries (or at least, across the 16 different countries we tested). Interestingly, using the same EmojiGrid rating method, Kaneko et al. (2018) reported that the valence and arousal rating of food images differed significantly across different nationalities. Even though a part of our data set was identical to the data set in their study, we did not observe any evidence for a nationality effect with regard to the serial dependencies. In the present study we also did not find any evidence that age and BMI moderate the valence and arousal serial dependencies.

Whereas nationality, age and BMI did not moderate the inter-trial effects, we found that gender had a significant effect on both serial dependencies. Overall, mean valence was rated higher by males than by females, while females gave higher mean arousal ratings than males. These results may reflect gender differences in cerebral responses to food images: while females show a greater activation than males in lateral prefrontal midline cortical regions (including the insula, a region implicated in arousal-related feelings of hunger, visceral sensations and current need states), males in contrast show a greater activation than females in the amygdala (a primal limbic structure involved in determining the valence or attractiveness of food; Killgore and Yurgelun-Todd, 2010). Furthermore, it is also known that females report higher arousal ratings than males for some types of food (e.g. fruits and desserts; Padulo et al., 2017). While males and females both showed positive valence and arousal sequential dependencies, the inter-trial effects were significantly larger for males than for females. This is an intriguing finding, as it may suggest that females apply a different strategy than males. For instance, a larger inter-trial effect may indicate that males are more inclined than females to repeat the previous response (i.e. response bias). However, the food images were presented in a random order (such that positive and negative images were randomly shown), and we used an EmojiGrid, making it difficult for the participants to simply repeat the previous response, as they could in a two-alternative forced-choice (2AFC) paradigm. What is more likely is that the gender difference in the serial dependencies for food images is due to sex differences in the brain (Cahill, 2006). For instance, it might be possible that short-term memory consolidation for emotions is decreased in males compared to females, explaining why the most recent emotional experiences are more pronounced on the current trial for males than for females (see LaBar and Cabeza, 2006 for an interesting review regarding memory for emotions).

Summarizing, we reported here for the first time that the affective appraisal of a food image depends on the food images that have been perceived in the immediate past. This effect occurred for most food images. This finding may be relevant for the design of websites of O2O food delivery services or restaurant menus. In principle, one could design these sites such that they influence a person’s preference by manipulating the order of the images. Whether the effect is large enough to be relevant is still an open question. Although an effect of 5 points on a 100-point scale may look small, it may be quite appreciable when the response range is taken into account (see Fig. 3 for the minimum and maximum valence and arousal ratings).
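The point about the response range can be expressed as a simple ratio. In the sketch below, Δ denotes the inter-trial effect and R_min and R_max the lowest and highest mean ratings actually observed; the range of 30 to 70 is purely hypothetical and is used only to show the arithmetic, not to represent the values in Fig. 3.

```latex
\[
\text{relative effect} = \frac{\Delta}{R_{\max} - R_{\min}},
\qquad
\frac{5}{100 - 0} = 5\%
\quad\text{versus}\quad
\frac{5}{70 - 30} = 12.5\%
\]
```

The same 5-point effect thus accounts for a larger share of the variation when ratings occupy only part of the scale.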