Introduction

Synesthesia, affecting approximately 1 in 23 people1, is a condition where one attribute of an (inducing) stimulus automatically engages the experience of additional (concurrent) features that are not physically present in the inducing stimulus. For instance, in grapheme-color synesthesia (GCS), reading black letters triggers color experiences.

The prevailing view is that synesthesia is a congenital condition. For instance, the neonatal hypothesis posits that all infants are born synesthetes, but whereas most individuals lose this trait following the pruning of neural connections, synesthesia that persists into adulthood is due to reduced pruning2. However, although infants do exhibit cross-modal correspondences3, these have only superficial features in common with standard forms of synesthesia and probably involve different mechanisms4. Provisional polygenetic evidence5,6 indicates a heritable component to synesthesia. Although synesthesia tends to run in families7,8, the existence of monozygotic twins, only one of whom has synesthesia9,10, suggests only a moderate genetic contribution.

The weak influence of heritable factors suggests that there may be a major role for learning in both shaping and engendering synesthesia. Simner and colleagues tested grapheme-color consistency in synesthetic children between 6 and 7 years of age and again in the same children a year later. This interim year appeared critical in transforming chaotic pairings into consistent fixed associations11. The same cohort was retested 3 years later and found to have even more consistent pairings12. Therefore, GCS appears to emerge in the early school years, when the first major pressures to use graphemes are encountered, and then becomes cemented in later years. In fact, for certain abstract inducers, such as graphemes, it is implausible that humans are born with synesthetic associations to these stimuli. Hence, learning must be involved in the development of at least some forms of synesthesia13,14.

Further support for a role for learning in synesthesia comes from the non-arbitrary source and pattern of grapheme-color associations15,16. Also, specific color-grapheme associations in genuine synesthetes have been traced to the colors on childhood grapheme-based jigsaw puzzles and refrigerator magnets17. In addition, Elias and colleagues describe an adult who spent 8 years learning digit-color associations from cross-stitch (where specific numbers signify the use of a particular colored thread)18. This subject showed similar effects to genuine synesthetes on color-digit associations, Stroop tests and one conscious priming test. However, it was not reported to what extent this subject experienced synesthetic phenomenology.

The substantial involvement of a learning component in synesthesia raises the possibility that phenomenological synesthetic experiences can be acquired by means of training. Training to induce synesthetic traits has been attempted many times already, with mixed results14,19,20,21,22,23,24,25,26,27,28,29. Stroop and priming effects, with slower reaction times to graphemes linked to colors incongruent to trained color-grapheme associations, are commonly reported. However, trained subjects have not as yet demonstrated any synesthetic conditioning responses14. Moreover, previous studies have failed to elicit subjective reports indicative of clear synesthetic phenomenology, which is considered the hallmark of genuine synesthesia30.

An alternative explanation for the previous absence of phenomenological reports is that the training regimes may have lacked key components that characterize the developmental trajectory of natural synesthesia29. For instance, previous training studies have tended to apply very limited training durations, at most usually only up to one week. Given that synesthesia may take months or years to fully develop11, longer training regimes may yield more effective results. Also, most training paradigms have failed to adapt task difficulty to improving performance, both limiting training on the specific task and failing to reflect the developmental nature of skills, like reading and numeracy. When adaptive training paradigms have been compared with non-adaptive equivalents, enhanced synesthetic priming effects have been found with adaptive versions28. Furthermore, the trained associations have usually been arbitrary, despite the non-random nature of genuine synesthetes' associations, which commonly involve semantic connections between the grapheme and color15.

A related concern has to do with sustaining motivation on repetitive training tasks. In school environments, for instance, when children are learning to read, motivation is enhanced by strategies such as peer-, teacher- and parental- pressure, rewards for good performance, a large mix of learning stimuli and so on. Strategies for optimizing motivation in adults on such tasks would increase the naturalism of the training and may therefore lead to closer correspondences with genuine synesthetic traits.

Here, we implemented a synesthetic training regime considerably closer to putative real-life synesthesia development than has previously been used. We significantly extended training time compared to all previous studies, employed a range of measures to optimize motivation (such as making tasks adaptive), and selected our letter-color associations from the most common associations found in synesthetic and normal populations15. Participants were tested on a range of cognitive and perceptual tasks before, during and after training. We predicted that this extensive training regime would cause our participants to simulate synesthesia far more closely than previous synesthesia training studies have achieved.

Methods

All methods were carried out in accordance with the approved guidelines.

Participants

33 subjects were recruited from the University of Sussex student population. During an initial briefing, participants were asked whether they had any letter-color synesthetic phenomenology and were excluded from the experiment if they answered that they had. In addition, any participant who had color consistency scores (see below) comparable to synesthetes was excluded from the experiment. Thus, all participants were initially not grapheme-color synesthetes. Participants were also initially assessed using two measures of visual imagery: Vividness of Visual Imagery Questionnaire (VVIQ)31; and the Spontaneous Use of Imagery Scale (SUIS)32. Given that previous findings have suggested, based on self-report questionnaires, that grapheme-color synesthetes score highly on visual imagery33,34, a subset of participants with no evidence of initial grapheme-color phenomenology and the greatest combined visual imagery scores on the VVIQ and SUIS were selected, to increase the chances of successful training. Note that the connection between visual imagery ability and GCS should be taken as provisional, given the absence of independent behavioral evidence for this association. This final sample size was based on that used in previous synesthesia training studies. Motivation levels to complete the study were also considered during selection. The final sample consisted of 14 subjects, 2 male and 12 female, mean age 19.35 (SD = 1.78). Mean VVIQ scores for this sample were 38.75 (SD = 10.27) and for the SUIS were 38.79 (SD = 7.65). Participants were reimbursed for their time in all the test and training sessions. The study was approved by the University of Sussex Life Sciences and Psychology Cluster Research Ethics Committee. Informed consent was obtained from all subjects.

Behavioral procedure

A test session lasting ~3 hours was administered before and after training. This included working memory, long-term memory, IQ, perceptual and phenomenological assessments. At the midpoint of training, in the 5th week, a subset of tests was also administered. 3 months after the final testing session a follow up session was administered on another subset of tests (Table s1).

Training consisted of ~30 minute sessions including 4–5 tasks per day, 5 days per week, for 9 weeks, with one or two new tasks each week replacing older tasks (Table s3). In addition, “homework” was assigned on each training day, which involved reading an e-book at home, with colored letters to match the training tasks (see below), using a similar paradigm to Colizoli and colleagues22. Participants were paid an extra £1 at the end of the week for each training task they scored higher on than at the end of the previous week. For full details of the training tasks, see supplementary information.

Test tasks

For details of which tasks were administered at what stage, see Table s1.

Color Naming Stroop

The Color Naming Stroop procedure was adapted from previous studies35,36. Stimulus presentation was controlled by E-Prime 1.2 (Psychology Software Tools, Pittsburgh, PA, USA). Participants were presented with 130 trials. For half of the trials, one of the 13 trained letters was presented in a color congruent with the trained association and for the other half the color was incongruent. The order of stimulus presentation was randomized. Each trial started with a fixation cross presented at the center of the screen for 1000 ms, followed by the grapheme, which covered a visual angle of 0.64°. The color of the background was set to 196,188,150 (RGB). The onset of each grapheme was accompanied by a beep sound (used for manual coding of reaction times). The stimulus remained on the screen until the participant made a response into a microphone. Participants were required to name the veridical color as fast as they could, while ignoring the trained color. Participants' voice response reaction times were manually coded using Audacity audio editing software (http://audacity.sourceforge.net). The time was measured for each trial from the peak of the beep (which coincided with the presentation of the stimulus) to the onset of the participant's voice response. Only correct trials were included (average accuracy pre-training: 94%; mid-training 97%; post-training: 93%). Responses that fell outside of +/−2 SDs (calculated per participant) were removed (7/130 on average per participant pre-training, 8/130 mid-training and 8/130 post-training).
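The exclusion and scoring steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis code: the function name `congruency_effect`, the tuple layout of a trial, the use of the sample standard deviation over all correct trials of a participant, and the definition of the congruency effect as mean incongruent minus mean congruent reaction time are all assumptions made for the example.

```python
from statistics import mean, stdev


def congruency_effect(trials, n_sd=2.0):
    """Congruency effect in ms for one participant (hypothetical helper).

    trials: list of (rt_ms, is_congruent, is_correct) tuples.
    Assumed pipeline: keep correct trials only, drop reaction times more
    than n_sd sample SDs from that participant's mean, then return
    mean(incongruent) - mean(congruent).
    """
    correct_rts = [rt for rt, _, ok in trials if ok]
    centre, spread = mean(correct_rts), stdev(correct_rts)
    kept = [
        (rt, congruent)
        for rt, congruent, ok in trials
        if ok and abs(rt - centre) <= n_sd * spread
    ]
    congruent_rts = [rt for rt, congruent in kept if congruent]
    incongruent_rts = [rt for rt, congruent in kept if not congruent]
    return mean(incongruent_rts) - mean(congruent_rts)


# Made-up data: one incorrect trial and one extreme outlier are excluded.
example_trials = [
    (500, True, True), (510, True, True), (520, True, True), (490, True, True),
    (600, False, True), (610, False, True), (590, False, True), (620, False, True),
    (3000, False, True),   # removed by the +/-2 SD rule
    (450, True, False),    # removed because the response was incorrect
]
example_effect = congruency_effect(example_trials)
```

A positive value means slower naming on incongruent trials. Whether the SD criterion was applied across all trials or within each condition is not stated in the text, so the pooled version here is one plausible reading.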

Synesthetic Stroop

This was identical to the Color Naming Stroop task, except that participants were required to name the trained color for the letter shown and ignore the veridical color presented. Again, only correct trials were included (average accuracy 93%) and responses that fell outside of +/−2 SDs (calculated per participant) were removed (9/130 on average per participant). As can be seen from this and the previous related test, accuracy was high and very few trials were excluded.

Color Consistency Test

Participants completed the internet-based standardized grapheme-color consistency test (www.synesthete.org)37. In this test, each participant was presented with the graphemes A–Z and 0–9 three times in randomized order (108 trials). Participants were instructed to select a color that best fit with each grapheme and to use their first instinct and to always pick a color for each grapheme. Colors were represented on a plane varying in lightness along the vertical and in saturation along the horizontal axis with a separate bar to adjust hue. Each participant was given a demonstration of how the selection procedure worked before they began.

Analysis was based on a recent method optimized to maximize sensitivity and specificity38. For the purpose of this study, we were interested in the consistency of trained and untrained letters before and after training. Trained and untrained letters were analyzed separately for each of the testing sessions.

Consistency was calculated on the basis of Euclidean distances in CIELUV color space (i.e., L*u*v*: the L* axis represents perceived lightness, the u* axis contrasts green (negative values) against red (positive values) and the v* axis contrasts blue (negative) against yellow (positive)). Consistency is typically below 135 for genuine GCS38.
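As a rough illustration of a distance-based consistency score, the sketch below takes the three color picks for each grapheme, already expressed as (L*, u*, v*) triples, and sums the pairwise Euclidean distances. The aggregation (sum of the three pairwise distances per grapheme, then the mean across graphemes) and the names `grapheme_distance` and `consistency_score` are assumptions for the example; the exact scoring of the published method38 may differ, so values from this sketch should not be compared against the 135 cutoff.

```python
from itertools import combinations
from math import dist


def grapheme_distance(picks_luv):
    """Sum of pairwise Euclidean distances between one grapheme's color picks.

    picks_luv: sequence of (L*, u*, v*) triples, one per presentation.
    """
    return sum(dist(a, b) for a, b in combinations(picks_luv, 2))


def consistency_score(picks_by_grapheme):
    """Mean per-grapheme distance; a lower score means more consistent picks."""
    distances = [grapheme_distance(p) for p in picks_by_grapheme.values()]
    return sum(distances) / len(distances)


# Made-up picks: "a" is chosen identically each time, "b" varies once.
example_picks = {
    "a": [(50.0, 10.0, 10.0)] * 3,
    "b": [(50.0, 0.0, 0.0), (50.0, 3.0, 4.0), (50.0, 0.0, 0.0)],
}
example_score = consistency_score(example_picks)
```

Computing the score separately for the trained and the untrained letter sets, as described above, amounts to calling `consistency_score` on two dictionaries.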

Synesthetic conditioning

The procedure for the synesthetic conditioning task was adapted from previous studies14,27,39,40.

Apparatus. Stimulus presentation was controlled by E-Prime 1.2 (Psychology Software Tools, Pittsburgh, PA, USA). Auditory materials were presented at 100 dB via headphones by a stereo integrated amplifier. Skin Conductance (SC) was measured with two non-shielded disposable electrodes (Biopac System Inc, Goleta, CA, USA), pre-gelled with an isotonic 0.5% NaCl solution. SC data were acquired with a Biopac MP36 skin conductance level meter (Biopac System Inc, Goleta, CA, USA) and SC data were recorded using Biopac Student Lab software version 3.7.7 (Biopac System Inc, Goleta, CA, USA).

Procedure. Skin conductance response (SCR) was continuously measured to assess autonomic arousal. It was sampled at 1000 Hz with two electrodes, attached to the thenar and hypothenar eminences of the non-dominant hand. Participants were seated 60 cm in front of a computer screen. They were asked to relax, remain silent and to attend to squares that would appear on the screen. Five possible colored squares (red, green, blue, yellow and white), covering a visual angle of ~10.6°, could be shown centrally, superimposed on a peripheral light beige background set to 196,188,150 (RGB). The white square included a letter which was associated with a color during training. No motor or verbal response was required. Each square was shown for 2 s and the inter-trial-interval (ITI) was ~10 s. In the habituation phase, stimuli were presented in a random order 12 times for a total of 60 trials.

In the conditioning phase, a total of 29 trials were presented in a fixed pseudo-random order. Seven squares of the same color were followed by a loud startling sound, which acted as the unconditioned stimulus (UCS). Six white squares, including a letter associated with the UCS's color during the training, were used as conditioned stimuli (CS). The letter stimulus was selected on the basis of the strongest self-reported letter-color association for each individual. None of the CS stimuli were followed by the UCS. An additional 16 squares showing the other three colors were used as neutral filler stimuli. Neutral stimuli were never followed by the startling sound and were only considered for the analysis if the preceding trial was not a UCS or CS trial.
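The trial counts for the habituation and conditioning phases can be checked with a short sketch. The stimulus labels, the helper `habituation_trials` and the choice to shuffle the full 60-trial list at once are assumptions for illustration; the conditioning order in the study was a fixed pseudo-random sequence that is not given here, so only its composition is represented.

```python
import random
from collections import Counter

COLORS = ["red", "green", "blue", "yellow"]
LETTER_SQUARE = "white+letter"  # white square containing the trained letter


def habituation_trials(rng, repeats=12):
    """Five stimuli, each shown `repeats` times, in random order (assumed scheme)."""
    trials = (COLORS + [LETTER_SQUARE]) * repeats
    rng.shuffle(trials)
    return trials


# Composition of the conditioning phase as described in the text.
CONDITIONING_COUNTS = {
    "color_followed_by_ucs": 7,   # one color, always followed by the loud sound
    "letter_square_cs": 6,        # white letter squares, never followed by the sound
    "neutral_filler": 16,         # the other three colors
}

habituation = habituation_trials(random.Random(0))
habituation_counts = Counter(habituation)
conditioning_total = sum(CONDITIONING_COUNTS.values())
```

The totals come out at 60 habituation trials (5 stimuli × 12) and 29 conditioning trials (7 + 6 + 16), matching the figures in the text.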

In the extinction phase, two white squares including the same letter as previous trials and two squares of the letter's associated color were presented in alternating order for a total of 24 trials. These were never preceded by the UCS. These trials were included to extinguish the conditioned response and were not considered for the analysis.

Analysis. Using Ledalab (version 3.4.3) for continuous decomposition, SC data were down-sampled to 20 Hz and separated into phasic and tonic activity41. For analysis of phasic SCR a response window of 2 s was used. The starting point was defined as the offset of the stimuli (i.e., colored squares). Note that this rather short response window was used to minimize the likelihood of physiological random noise. SCRs were defined as the average phasic driver in the response window with higher SCR indicating higher autonomic arousal. This score represents phasic activity most accurately41. One participant was excluded from the analysis due to a previous injury to her non-dominant hand.
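The windowed scoring step can be illustrated as follows. This sketch does not reproduce Ledalab's continuous decomposition; it assumes a phasic driver signal is already available and only shows down-sampling and averaging within a 2 s window that starts at stimulus offset. The block-averaging down-sampler and the function names are assumptions for the example, and Ledalab's own resampling may work differently.

```python
def downsample(signal, factor):
    """Average consecutive blocks of `factor` samples (1000 Hz -> 20 Hz uses 50)."""
    n_blocks = len(signal) // factor
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def mean_in_window(driver, rate_hz, start_s, window_s=2.0):
    """Mean of the phasic driver from start_s to start_s + window_s."""
    first = int(round(start_s * rate_hz))
    last = first + int(round(window_s * rate_hz))
    window = driver[first:last]
    return sum(window) / len(window)


# Made-up 10 s recording at 1000 Hz with activity between 4 s and 6 s.
raw = [0.0] * 4000 + [1.0] * 2000 + [0.0] * 4000
driver_20hz = downsample(raw, 50)

# A square shown from 2 s to 4 s has its offset at 4 s; score the 2 s after it.
stimulus_onset_s, stimulus_duration_s = 2.0, 2.0
scr = mean_in_window(driver_20hz, 20, stimulus_onset_s + stimulus_duration_s)
```

In this toy recording the whole window falls inside the active segment, so the score equals the segment's amplitude.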

Cattell Culture Fair IQ

In order to assess the potential of the synesthetic training regime for general cognitive enhancement, the Cattell Culture Fair form 2a was given to participants both before and after training. A control group (n = 9, mean age 22.5, 2 females and 7 males) carried out the same test 9 weeks apart with no training component in between, in order to assess the potential confound of improvement due to practice on the test.

Phenomenology questionnaire

Participants were interviewed about the effects of training during training week 5, immediately after training was complete and then again three months later (two participants were unable to attend the three month follow up session). Participants were invited to describe whether they used any mnemonic strategies to aid in letter-color associations and if so what they were. The extent of color phenomenology was assessed for each letter using two methods. First, participants were asked to respond to the question, “Which statement characterizes your grapheme-color associations best: a) Whenever I see a letter there is only that letter, but no color at all; b) I can't even think of an associated color, no matter how hard I try; c) I know the associated color, but I never see it; d) I see the color in front of my mind's eye; e) I see the color outside my head (e.g., a few inches away); f) I see the color floating on the surface where the letter/number is.” Second, participants were shown a black letter and asked to describe any associated color phenomenology, both inside the lab and during their daily lives. Subjects were finally given a chance to report any additional effects of their training.

Results

Training effects

The effects of training on performance were well explained by linear functions (all tasks p < 0.001 – see table s4 and figure s1), demonstrating that participants significantly improved on all tasks across sessions and effectively learnt the letter-color associations.

Test results

Phenomenological interview

Participants were asked a series of questions about their subjective perceptual response to each of the 13 trained letters.

Intermediate

8 out of 14 participants reported phenomenological experiences related to the trained letters (see table s5). For instance, one participant reported, “When reading a sign on campus I saw all the letter E's colored green on the sign.” All those reporting no phenomenology chose the phenomenological category “I know the associated colors but never see them” whereas 7 out of the 8 participants reporting phenomenology chose the category “In front of mind's eye” and 1 chose the option “I see the color floating on the surface where the letter is.”

12 out of 14 participants reported associating personas to letters (ordinal linguistic personification (OLP)). For instance, one participant felt that “x” was “aggressive”, while another reported that “u” induced “pity”. Although we did not test for OLP prior to training, we did ensure using the color consistency test that our participants began the study with no synesthetic traits. We note that OLP is regarded as a distinct form of synesthesia which is uncommon in the general population, but which commonly co-occurs with GCS42.

Post Training

Table 1 shows the extent of phenomenology immediately following the full 9 weeks of training. 9 out of 14 participants reported phenomenological experiences (including all but 1 of those reporting this at the intermediate stage), 3 were borderline and only 2 reported no phenomenology. Phenomenological descriptions were similar to the intermediate stage, though with more subjects providing reports. The same 12 out of 14 participants as in the intermediate stage reported ordinal linguistic personification. Note that neither imagery self-report measures VVIQ nor SUIS correlated with the extent of phenomenology, nor correlated with any other post-training test. In addition, the extent of phenomenology did not correlate with training progress or any post-training test (for more details, see supplementary information).

Table 1. Summary of the phenomenological interview immediately following training

Color Naming Stroop

Participants were shown one of the 13 trained letters on each trial, colored either congruent or incongruent with the letter-color trained associations and had to verbally report the veridical color as fast as possible. A repeated measures ANOVA of congruency effect revealed a significant effect of training stage (pre-training, intermediate and post training) (F(2, 26) = 11.78, p < 0.001, effect size: partial eta2 = 0.475) (see Figure 1a). Further analyses revealed a significantly greater congruency effect immediately following training, compared with before training (t(13) = 3.90 p < 0.001, effect size: Cohen's d = 1.45) and with the mid-training stage (t(13) = 3.17 p < 0.004, effect size: Cohen's d = 0.87). The mid-training stage also had a significantly greater congruency effect, compared with pre-training (t(13) = 2.36 p < 0.017, effect size: Cohen's d = 0.69).
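For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity eta_p² = (F × df_effect) / (F × df_effect + df_error). The helper below is a small sketch of that identity; the function name is our own, and the authors presumably obtained their values directly from statistical software.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Training-stage effect reported above: F(2, 26) = 11.78.
stage_effect = partial_eta_squared(11.78, 2, 26)
```

Applied to F(2, 26) = 11.78, the identity gives a value that rounds to the reported 0.475.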

Figure 1

Color Naming Stroop task.

For each trial, participants were presented with a colored grapheme and were required to name the veridical color as fast as they could, while ignoring the trained color association. Response times (±S.E.) before, during and after training are shown, a) overall and b) split according to semantic and non-semantic stimuli.

Note that even before training there was a significant congruency effect (t(13) = 5.78 p < 0.001, effect size: Cohen's d = 0.29). This was presumably due to the fact that the associations in this study were chosen based on preferences from large-scale studies, for both synesthetic and normal populations, where semantic associations may be a common basis for links between letters and colors1. Therefore, the contribution of semantic (e.g. r = red) versus non-semantic (e.g. x = dark grey) associations to the overall Stroop effect was also investigated (see Figure 1b). A 2 (semantic, non-semantic) × 3 (pre-training, intermediate and post training) repeated measures ANOVA of congruency effect revealed a main effect of training stage (F(2,26) = 11.10, p < 0.001, effect size: partial eta2 = 0.461) and of semantics (F(1,13) = 35.17, p < 0.001, effect size: partial eta2 = 0.73), with a trend towards a significant interaction (F(2,26) = 2.71, p = 0.086, effect size: partial eta2 = 0.172). Further analyses revealed a significantly greater congruency effect for semantic associations, but not non-semantic associations, post-training compared with pre-training (semantic: t(13) = 4.64 p < 0.001, effect size: Cohen's d = 1.36; non-semantic: t(13) = 0.54 p > 0.1, effect size: Cohen's d = 0.23) and compared with the mid-training stage (semantic: t(13) = 2.76 p = 0.008, effect size: Cohen's d = 0.67; non-semantic: t(13) = 0.598 p > 0.1, effect size: Cohen's d = 0.21). There was also a significantly greater congruency effect for semantic associations, but not non-semantic associations, halfway through training compared with pre-training (semantic: t(13) = 3.15 p < 0.003, effect size: Cohen's d = 0.64; non-semantic: t(13) = 0.067 p > 0.1, effect size: Cohen's d = 0.02).

Color consistency test

Participants were presented with all letter and number graphemes three times in random order and each time had to pick the color they most closely associated with the grapheme. Consistency for the 13 letters that participants were trained on was contrasted with those that they did not train on. Data from one participant for the pre-training stage were lost due to technical issues. As figure 2 illustrates, there was a significant test-stage (pre- vs. post-training) by letter set (trained vs. untrained) interaction (F(1,12) = 17.01 p < 0.001, effect size: partial eta2 = 0.586). There was no significant difference between the trained and untrained letters at the pre-training stage (t(12) = 1.38, p > 0.1, effect size: Cohen's d = 0.11), which is unsurprising since letter-color associations had not been trained at this point. In contrast, there was a significant difference between the trained letters after training and both the untrained post-training letters (t(13) = 4.08 p < 0.001, effect size: Cohen's d = 1.30) and the trained letters at the pre-training stage (t(12) = 4.67 p < 0.001, effect size: Cohen's d = 1.47). Crucially, post-training consistency scores for trained letters surpassed the threshold assumed to validate synesthesia, when applied to genuine synesthetes38. These results did not depend on whether the associations either had some conceptual association (e.g. r for red), or lacked any such association, or whether the colors were of a standard (i.e. where all the RGB values are either 0 or 255) or non-standard nature (see supplementary information).

Figure 2

Color consistency scores on the Color Consistency Test.

Color consistency scores (±S.E.), based on the CIELUV Euclidean distance algorithm, using the online Color Consistency Test for the 13 trained and 13 untrained letters before and after training. A lower score reflects increased color consistency. Values below the dashed line are standardly assumed to signify genuine synesthesia.

Synesthetic conditioning

In this task, participants attended to a stream of colored squares, with an aversive stimulus paired with one color, while Skin Conductance Response (SCR) for presentation of its associated letter was monitored. The results of the conditioning procedure are shown in Figure 3. Although there was no significant time (pre conditioning versus post) by stimulus type (inducer versus neutral) interaction, due to theoretical and practical interest, we further conducted paired-samples t-tests for the critical comparisons. These tests showed that SCRs for inducer stimuli were significantly higher during the conditioning phase than during the habituation phase (t(12) = 2.20, p = .024, effect size: Cohen's d = 0.79). This was not the case for the neutral trials (t(12) = 1.60, p > 0.1, effect size: Cohen's d = 0.44). In previous studies using this paradigm, only genuine synesthetes exhibited a conditioning effect36,39,40; trained controls showed automatic associations by means of a Stroop paradigm, but not a conditioning effect36. Therefore, these results provide further evidence that our participants were experiencing synesthetic-like phenomenology, although this evidence should be viewed as only suggestive because the interaction was not significant.

Figure 3

Synesthetic conditioning demonstrating automaticity and phenomenology of associations.

During habituation, color and letter stimuli were presented with no aversive stimulus. During conditioning, an aversive stimulus consistently followed the color part of the strongest letter-color association for each individual. The conditioned response (±S.E.) was measured for presentation of the letter part by Skin Conductance Response (SCR). Higher SCRs indicate greater autonomic arousal.

Synesthetic Stroop

Note that due to the nature of the experiment, this test could only be performed after training. One participant's results were not included due to a technical issue. A 2 by 2 ANOVA (semantic/non-semantic versus congruent/non-congruent) revealed a main effect of semantics (F(1,12) = 54.89, p < 0.001, effect size: partial eta2 = 0.82) and of congruency (F(1,12) = 33.01, p < 0.001, effect size: partial eta2 = 0.73), with a significant interaction (F(1,12) = 6.47, p < 0.026, effect size: partial eta2 = 0.35). Unlike the Color Naming Stroop Task, both semantic and non-semantic subsets of the experiment showed a significant Stroop congruency effect (semantic 223.0 ms: t(12) = 4.66 p < 0.001, effect size: Cohen's d = 0.90; non-semantic 329.7 ms: t(12) = 5.81 p < 0.001, effect size: Cohen's d = 1.23). Therefore, as is shown in figure 4, incongruent stimuli induced slower responses for both semantic and non-semantic stimuli, but semantic stimuli were responded to significantly faster overall than non-semantic stimuli.

Figure 4

Synesthetic Stroop Task.

For each trial, participants were presented with a colored grapheme and were required to name the trained color as fast as they could, while ignoring the veridical color. Response times (±S.E.) after training for congruent and incongruent stimuli, split according to whether the associations had a semantic component or not.

Cattell Culture Fair IQ test

The trained group were compared to a control group who only carried out the IQ test twice, 9 weeks apart, without any other tasks or tests. A 2 × 2 ANOVA revealed a significant group (trained vs control) by session (pre-training/1st test vs post-training/2nd test) interaction (F(1,20) = 6.30, p = 0.021, effect size: partial eta2 = 0.24), a main effect of session (F(1,20) = 7.31, p = 0.014, effect size: partial eta2 = 0.268), but no effect of group (F(1,20) = 0.99, p > 0.1, effect size: partial eta2 = 0.05). Specifically, the trained synesthetes demonstrated a significantly greater gain in IQ following training (see Fig s2 - trained group gain: 3 raw points, equivalent to 12.46 IQ points; control group gain: 0.11 raw points, equivalent to −0.11 IQ points; t(21) = 2.61 p = 0.008, effect size: Cohen's d = 1.21).

Discussion

Following a 9 week training regime that associated 13 letters of the alphabet with specific colors, non-synesthetic participants passed a range of tests designed to demonstrate genuine synesthesia, including the color consistency test38, synesthetic Stroop tasks36,43 and a classical conditioning test14,27,39,40. These tests demonstrate that such an intensive training regime generates highly automatic letter-color associations. Crucially, by the end of training, the majority of participants also developed phenomenological experiences that were very similar to those described by genuine grapheme-color synesthetes. In many participants, such experiences were already present after only 5 weeks of training and for some participants the experiences were described as occurring on a daily basis. These results cast doubt on claims that genuine synesthetic phenomenology can only occur in a (genetically) distinct, rare subset of the population following early developmental influences30.

For the two Stroop tests we used here, we can provisionally make a direct comparison between our trained subjects and genuine synesthetes. Post-training we found a ~100 ms congruency effect for the Color Naming Stroop Test and a ~230 ms congruency effect for the Synesthetic Stroop Test. These data compare favorably with results from genuine synesthetes, although a direct comparison is difficult for several reasons. Stroop effects in genuine synesthetes vary considerably depending on the type of synesthetes tested (i.e., projector vs. associators) and the specific task requirements (i.e., voice key vs. keyboard, number of different inducer/concurrent stimuli used etc.). Previous studies on genuine synesthetes found Stroop effects that ranged on average from approximately 30 ms up to 200 ms36,43,44. While Stroop effects and Stroop-type effects (i.e., inducer-concurrent priming and concurrent-inducer priming) in short training studies tend to be smaller (i.e., around 20 ms) they can in some instances reach the same magnitude14,22,27,28. However, the Stroop effects in training studies are more difficult to interpret because they seem to depend on the nature of the training (i.e., duration, adaptivity vs. non-adaptivity, number of consistent (and inconsistent) pairings)29.

Each of these synesthesia-like behaviors, including the phenomenological reports, was particularly strong for those letter-color associations that involved a clear semantic component (for instance r = red). This has the intriguing implication that the formation of at least some forms of synesthesia is fuelled by conceptual associations.

Could similar forms of training account for the prevalence of ‘natural’ synesthesia? It is well known that strategies to enhance semantic richness, usually involving binding or chunking, can significantly enhance task performance45,46,47,48,49. In line with this, there is increasing evidence that synesthesia leads to a specific profile of perceptual and memory advantages34,50,51,52. In addition, at least two case studies have been reported where a particularly rich form of synesthesia is accompanied by exceptional memory53,54,55. Although the exact reasons for this advantage remain unclear, it has been suggested that synesthesia provides for a richer world experience, which in turn leads to memory advantages34,51,52. Therefore it is possible that learning pressures, for instance for digits and letters, during early school years in certain individuals may lead to the formation of various semantic hooks, to aid memory, with color-letter associations a prime candidate. These habitual memory aids may then crystallize into synesthetic traits.

Indeed, the current study provisionally indicates a link between learning reliant on synesthetic associations and enhanced cognitive ability. Compared to a control group who took the IQ test twice, 9 weeks apart, without any training, and whose IQ remained the same, participants who undertook the synesthesia training regime increased their IQ on a fluid intelligence test by an average of 12 points. It should be emphasized that this result is peripheral to our central goal of simulating synesthesia. Furthermore, in our design it is of course possible that the working memory aspects of the training regime, and not its synesthetic features, generated this effect. Nevertheless, finding any IQ improvement in healthy young adults, let alone the marked improvement we observed, is notoriously difficult56, and such improvements are usually limited to those in the lower IQ range (which does not apply to our student group). Our finding of an IQ improvement therefore provisionally indicates that cognitive training including synesthetic associations may in the future be a promising new tool for enhancing general mental ability in vulnerable clinical groups. Future ‘active control’ studies, including similar working memory tasks but without synesthetic components, are needed to establish the utility of this method.

One might worry that the phenomenological experiences could have occurred because this is what participants assumed the experimenters were looking for (i.e., ‘demand characteristics’). However, this is unlikely because participants were told instead that the aim of the training was to investigate whether the pairing of colors and letters could aid cognition. Although some participants might well have entertained the possibility that this was in fact a study about synesthesia, presumably the same is true of most synesthesia training studies, and ours is the only one to date to have demonstrated clear synesthetic phenomenology. Therefore, demand characteristics do not readily account for the results we found.

It should be emphasized that we are not claiming to have trained non-synesthetes to become genuine synesthetes. When we retested our participants three months after training, much of their synesthetic phenomenology had disappeared, although they still showed a significant color-naming Stroop effect (not significantly different from the post-training results; see supplementary information, including fig. s6). This fading of phenomenology further speaks against an explanation in terms of demand characteristics. As for why the phenomenology disappeared, there are key differences between even our extensive training regime and the likely course of normal GCS development. Our training covered a much shorter timeframe than real synesthesia does. In addition, our participants had spent many years with overlearned graphemes that were not associated with colors, so that when training stopped and they were again exposed, on the whole, only to achromatic letters, the associations were likely to fade or be overwritten. Caution is also warranted in extrapolating our data to the general population, given our relatively small sample of 14 subjects, selected from a student population.

It is also difficult to know whether the phenomenology induced in our participants is equivalent to that of genuine synesthetes, who have had such experiences since childhood. Although this provides another reason to refrain from equating our trained participants with genuine synesthetes, it is worth noting the considerable heterogeneity in the forms of phenomenology that genuine synesthetes report. For instance, some genuine synesthetes are able to place their concurrents at a specific spatial location, whereas others do not perceive their concurrents in the environment at all37.

Leaving aside the question of whether it is possible to turn non-synesthetic adults into genuine synesthetes, our results are nonetheless consistent with the idea that synesthetic phenomenology has a major developmental component. Specifically, synesthesia may be primarily founded on conceptual associations that aid learning, in line with the notion that the genetic factors which determine the phenotype of synesthesia may be those that determine the upper limits of creativity, imagery and memory, all processes known to be enhanced in genuine synesthesia29.

Finally, and independently of synesthesia, our findings highlight major new avenues for learning to influence perceptual content in adults. The results of this study demonstrate that with a total training period of less than 24 hours it is possible to alter how the majority of adults experience features of the world at a phenomenological level. Indeed, some of our subjects reported perceiving colors superimposed on the external world that were similar in quality to veridical perception, while the majority described some phenomenological alterations to achromatic graphemes as a result of training. There is good evidence from linguistics that language can shape perception; for instance, Russian speakers, who make an extra distinction between light and dark blues in their language, are better able to visually discriminate shades of blue57. We have shown here that, for some people, learning an extra dimension to a pre-learnt alphabet, even in adulthood, can change their subjective experience of those letters.