Simulating reading acquisition: The link between reading outcome and multimodal brain signatures of letter–speech sound learning in prereaders

During reading acquisition, neural reorganization of the human brain facilitates the integration of letters and speech sounds, which enables successful reading. Neuroimaging and behavioural studies have established that impaired audiovisual integration of letters and speech sounds is a core deficit in individuals with developmental dyslexia. This longitudinal study aimed to identify neural and behavioural markers of audiovisual integration that are related to future reading fluency. We simulated the first step of reading acquisition by performing artificial-letter training with prereading children at risk for dyslexia. Multiple logistic regressions revealed that our training provides new precursors of reading fluency at the beginning of reading acquisition. In addition, an event-related potential around 400 ms and functional magnetic resonance imaging activation patterns in the left planum temporale to audiovisual correspondences improved cross-validated prediction of future poor readers. Finally, an exploratory analysis combining simultaneously acquired electroencephalography and hemodynamic data suggested that modulation of temporoparietal brain regions depended on future reading skills. The multimodal approach demonstrates neural adaptations to audiovisual integration in the developing brain that are related to reading outcome. Despite potential limitations arising from the restricted sample size, our results may have promising implications both for identifying poor-reading children and for monitoring early interventions.


Replication of prediction based on artificial-letter training in enlarged sample
To confirm the higher predictive accuracy of the training-based learning rate compared with RAN, the prediction analysis was also performed in a larger sample of 35 participants (18 normal, 17 poor; Supplementary Tables S1 and S3). Again, the cross-validated prediction accuracy of the learning rate in the artificial-letter training (P=0.0210; 68.6%; sensitivity: 58.8%; specificity: 77.8%) outperformed that of the established behavioural reading precursor (RAN: P=0.1109; 57.1%; sensitivity: 64.7%; specificity: 50%). The comparable results in this larger sample reinforce the finding reported in the manuscript.
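The reported accuracy, sensitivity, and specificity follow directly from a 2 x 2 confusion table. The sketch below illustrates the relation, treating "poor reader" as the positive class. The cell counts are not given in the text; they are inferred here from the reported percentages and group sizes (17 poor, 18 normal) and should be read as an assumption. The class name `Confusion` is purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for a binary classifier with 'poor reader' as the positive class."""
    true_pos: int   # poor readers classified as poor
    false_neg: int  # poor readers classified as normal
    true_neg: int   # normal readers classified as normal
    false_pos: int  # normal readers classified as poor

    @property
    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def accuracy(self) -> float:
        total = self.true_pos + self.false_neg + self.true_neg + self.false_pos
        return (self.true_pos + self.true_neg) / total


# ASSUMPTION: counts inferred from the reported percentages (17 poor, 18 normal);
# they are not taken from the supplementary tables.
learning_rate = Confusion(true_pos=10, false_neg=7, true_neg=14, false_pos=4)
ran = Confusion(true_pos=11, false_neg=6, true_neg=9, false_pos=9)

for name, c in (("learning rate", learning_rate), ("RAN", ran)):
    print(f"{name}: accuracy={c.accuracy:.1%}, "
          f"sensitivity={c.sensitivity:.1%}, specificity={c.specificity:.1%}")
```

With these inferred counts, the three measures round to the percentages given above for both predictors.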

Prediction based on late negativity ERP
Adding the left-hemispheric incongruency difference of the late negativity (644-704 ms), RAN, and the learning rate to the multiple logistic regression model resulted in a somewhat reduced cross-validated prediction accuracy of 75% (Supplementary Table S3). Based on the applied stepwise procedure, the late negativity ERP (P=0.0747) and the learning rate (P=0.0261) were included in the model (specificity: 80%; sensitivity: 69.2%).
This result confirms that ERP data reflecting audiovisual integration improve the prediction of future reading outcome. Including the ERP of the initial time window of audiovisual integration (382-442 ms) in the model resulted in a better prediction than including the late negativity ERP (644-704 ms). A regression model including both ERP components (382-442 ms: P=0.0986; 644-704 ms) only reached a cross-validated prediction accuracy of 71.4%, which was lower than the prediction accuracy of the models including each ERP component separately (Supplementary Table S3). Therefore, the two ERP components probably reflect distinct neural processes, each of which carries different information regarding audiovisual integration that might be crucial for successful reading acquisition.
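To make the notion of cross-validated prediction accuracy concrete, the following sketch fits a logistic regression and evaluates it with leave-one-out cross-validation. Several points are assumptions: the text does not state which cross-validation scheme was used, the stepwise predictor selection is omitted, the data are synthetic, and the group sizes are arbitrary. The function names are illustrative, and plain gradient descent stands in for whatever fitting routine the original analysis used.

```python
import math
import random


def sigmoid(z):
    # Clip the input to avoid overflow in math.exp for extreme values.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent; returns weights [bias, w1, ..., wk]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Predict 1 (poor reader) if the estimated probability exceeds 0.5."""
    return int(w[0] + sum(a * b for a, b in zip(w[1:], x)) > 0)


def loo_accuracy(X, y):
    """Leave-one-out CV: each case is predicted by a model fitted without it."""
    hits = 0
    for i in range(len(X)):
        w = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(w, X[i]) == y[i]
    return hits / len(X)


# SYNTHETIC data only: two standardized predictors (say, a learning rate and an
# ERP incongruency difference) for 15 "normal" (0) and 13 "poor" (1) cases.
rng = random.Random(0)
X = [[rng.gauss(-0.5, 1.0), rng.gauss(-0.5, 1.0)] for _ in range(15)]
X += [[rng.gauss(0.5, 1.0), rng.gauss(0.5, 1.0)] for _ in range(13)]
y = [0] * 15 + [1] * 13

print(f"leave-one-out accuracy: {loo_accuracy(X, y):.1%}")
```

Because every prediction comes from a model that never saw the predicted case, adding a weak or redundant predictor can lower the cross-validated accuracy even though it can only improve the in-sample fit.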

Supplementary ERP analyses
Time windows were selected using adaptive segmentation based on global field power (GFP) peaks to define ERP components 1.

Supplementary fMRI analysis
Event-related BOLD activity was analyzed by computing a 2 x 2 analysis of variance (ANOVA) to investigate the interaction of the factors reading fluency (normal vs. poor) and congruency (incongruent vs. congruent). Whole-brain analysis revealed a significant interaction of reading fluency and congruency in the right middle frontal gyrus (MFG; P<0.001 uncorrected, cluster-level corrected P<0.05; Supplementary Figure S3). While future poor readers engaged the right MFG significantly more strongly for incongruent than for congruent pairs (t(26)=3.51, P=0.0084), future normal readers showed a significantly stronger deactivation for incongruent than for congruent pairs (t(26)=-3.17, P=0.0189) in this region. For incongruent pairs, activation in the right MFG was significantly enhanced for poor readers compared with normal readers (t(26)=4.43, P=0.0008), while no difference was found for the congruent condition (t(26)=-1.47, P=0.4716). Hence, the incongruency difference was significantly more positive for future poor readers than for normal readers (t(26)=4.73, P<0.0001). In beginning readers, overactivation in the right MFG has previously been reported to predict poor word decoding skills 2, an ability that strongly relies on letter-speech sound binding.
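In a 2 x 2 design with one between-subject factor (reading fluency) and one within-subject factor (congruency), the interaction can be examined by computing each participant's incongruency difference (incongruent minus congruent) and comparing it between groups. The sketch below illustrates this with a pooled-variance two-sample t statistic, whose degrees of freedom are n1 + n2 - 2. The beta values are hypothetical, the function name is illustrative, and the sketch does not reproduce the P-value adjustment used in the reported post hoc tests.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; df = len(a) + len(b) - 2."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


# HYPOTHETICAL per-participant beta values (incongruent, congruent); not study data.
poor = [(0.9, 0.2), (0.6, 0.1), (1.1, 0.4), (0.7, 0.3)]
normal = [(-0.3, 0.1), (0.0, 0.2), (-0.4, 0.0), (-0.1, 0.3)]

# Incongruency difference per participant: incongruent minus congruent.
poor_diff = [inc - con for inc, con in poor]
normal_diff = [inc - con for inc, con in normal]

t, df = pooled_t(poor_diff, normal_diff)
print(f"group difference in incongruency effect: t({df}) = {t:.2f}")
```

A positive t value here means that the incongruency difference is more positive in the first group, which is the direction described above for future poor readers.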
Linear mixed models with the factors reading fluency (normal vs. poor) and congruency (congruent vs. incongruent) were also fitted to mean beta values derived from the right-hemispheric PT and vOT ROIs. These analyses yielded no significant main or interaction effects.

Methods
Familial risk for dyslexia: Parents completed the Adult Reading History Questionnaire (ARHQ) to assess familial risk for dyslexia. 31 children had at least one parent with an ARHQ value greater than 0.3, indicating an increased familial risk, 2 children had an older sibling with reading difficulties, and 2 children had a history of specific language impairment.

Standardization of reading fluency test:
Due to the lack of age-matched norms for the middle of 1st grade, the fluency tests were standardized in an independent sample of 75 German-speaking children from school districts similar to those of the study sample.
With parents' written informed consent, test standardization was performed at school, also five to seven months after the onset of formal reading acquisition.
Artificial-letter training: First, participants were introduced to the six grapheme-phoneme correspondences: each false font character appeared on a computer screen while its corresponding speech sound was presented over headphones. Then, in a series of test trials, participants were presented with one speech sound and two to four false font characters and were instructed to click on the false font character that had previously been associated with this speech sound. The trials of the computerized training contained background images including a banner on which one to six false font characters of the untrained set were shown implicitly but clearly. Participants' performance was calculated based on 131 test trials by introducing a weighting factor that accounted for the varying number of presented items per trial (Aw = I/Imax × A; Aw: weighted accuracy; I: number of presented items; Imax: maximum permissible number of presented items; A: unweighted accuracy of the trial, correct/incorrect).
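The weighting formula can be sketched as follows. Two points are assumptions: Imax is taken to be 4 because trials showed two to four characters, and the per-trial values are summed, since the text does not state how they were aggregated across the 131 trials. The function names are illustrative.

```python
from typing import Iterable, Tuple

I_MAX = 4  # ASSUMPTION: trials presented two to four false font characters


def weighted_accuracy(n_items: int, correct: bool, i_max: int = I_MAX) -> float:
    """Aw = I / Imax * A for a single trial, with A = 1 (correct) or 0 (incorrect)."""
    if not 2 <= n_items <= i_max:
        raise ValueError(f"n_items must lie between 2 and {i_max}")
    return n_items / i_max * (1.0 if correct else 0.0)


def total_score(trials: Iterable[Tuple[int, bool]]) -> float:
    """Sum of weighted accuracies (ASSUMPTION: the aggregation rule is not given)."""
    return sum(weighted_accuracy(n, ok) for n, ok in trials)


# Hypothetical trials: (number of presented characters, response correct?)
example_trials = [(2, True), (3, True), (4, False), (4, True)]
print(total_score(example_trials))  # 0.5 + 0.75 + 0.0 + 1.0
```

Under this weighting, a correct choice among four alternatives counts twice as much as a correct choice between two, which reflects the lower chance level of the harder trials.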

Task design:
The task included block-wise presentation of bimodal congruent, bimodal incongruent, unimodal visual, and unimodal auditory stimuli (four conditions). In each condition, the six trained false fonts and/or speech sounds were presented 9 times, resulting in 54 trials per condition. Stimulus presentation time was 613 ms with an interstimulus interval of 331 or 695 ms. Sixteen unimodal and bimodal blocks (four blocks per condition) alternated pseudorandomly, each consisting of 15 randomly presented stimuli and targets (6 targets per condition). Fixation periods of 6 or 12 s were presented between blocks. The duration of the task was 375 s. In addition to the described paradigm, the subjects also completed three further parts 6 of the implicit audiovisual target detection task, including untrained false fonts and phonemes, real letters and the corresponding speech sounds, and digits and spoken number names.
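The trial counts in this design can be cross-checked with a few lines of arithmetic. The sketch below also gives a rough estimate of the time taken up by stimulation, assuming that the two interstimulus intervals occurred equally often, which the text does not state. All constant names are illustrative.

```python
N_CONDITIONS = 4
N_ITEMS = 6                 # trained false fonts and/or speech sounds
REPETITIONS = 9
BLOCKS_PER_CONDITION = 4
STIMULI_PER_BLOCK = 15      # regular stimuli plus targets
TARGETS_PER_CONDITION = 6
STIM_MS = 613
ISI_MS = (331, 695)
TASK_MS = 375_000

trials_per_condition = N_ITEMS * REPETITIONS                   # 54
slots_per_condition = BLOCKS_PER_CONDITION * STIMULI_PER_BLOCK  # 60
total_blocks = N_CONDITIONS * BLOCKS_PER_CONDITION              # 16

# The block slots hold the regular trials plus the targets.
assert slots_per_condition == trials_per_condition + TARGETS_PER_CONDITION

# ASSUMPTION: both interstimulus intervals are equally frequent.
mean_soa_ms = STIM_MS + sum(ISI_MS) // len(ISI_MS)               # 1126 ms
stimulation_ms = total_blocks * STIMULI_PER_BLOCK * mean_soa_ms
remaining_ms = TASK_MS - stimulation_ms                          # fixation and other time

print(f"trials per condition: {trials_per_condition}")
print(f"estimated stimulation time: {stimulation_ms / 1000:.1f} s")
print(f"estimated remaining time:   {remaining_ms / 1000:.1f} s")
```

The 54 trials and 6 targets per condition fill the 60 stimulus slots of the four blocks exactly; the timing estimate is only approximate because the actual sequence of intervals is not reported.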

Presentation of stimuli:
Using video goggles (VisuaStimDigital, Resonance Technology, Northridge, CA), false font characters were presented centrally in black on a grey background (mean visual angle: 2.8° horizontally; 4.8° vertically). Speech sounds, spoken by a female speaker, were digitally recorded (sampling rate: 44.1 kHz; 32 bit) and normalized using Audacity (± 1 dB). For high-quality binaural auditory stimulation, in-ear headphones were used (MR confon GmbH, Magdeburg), and the acoustic noise of the MRI was kept to a minimum by implementing a SofTone factor in the sequence. Participants wore sound-absorbing over-ear headphones, which were additionally padded with a custom-made foam layer. Sound level was individually adjusted. In addition, a sound-absorbing mat was placed around the participants in the MRI bore. Besides conventional head padding, we used a custom-made EEG head pad to reduce head movement.

Task performance: Mean task performance and reaction times were calculated for each participant. Group comparisons (normal vs. poor reading fluency) of mean task performance and reaction times were performed with two-tailed independent t-tests. Three subjects from the poor reading group were excluded from this analysis due to technical problems that resulted in inaccurate recording of task performance.

EEG data processing: EEG data processing was performed using Brain Vision Analyzer

Normalization to Montreal Neurological Institute (MNI) standard space was performed based on deformations derived from the segmentation and a paediatric anatomical template (age range 5.9-8.5 years) created using the Template-O-Matic toolbox 4.

Movement artefact correction: Movement artefact correction was performed using the ArtRepair toolbox 5. Based on the scan-to-scan motion threshold of 1.5 mm/TR, affected volumes were repaired using linear interpolation between the nearest unrepaired scans. Out of 28 data sets, 12 had at least one volume exceeding the defined threshold, and less than 8.5% of the scans were repaired per participant. The data sets of four participants were not analyzed in full, either because the last blocks exceeded the defined motion thresholds (2 subjects, 4 and 6 blocks discarded, respectively) or because scanning had to be stopped prematurely (2 subjects, 2 blocks discarded).
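The repair rule described above (flag volumes above the motion threshold, then interpolate linearly between the nearest unrepaired scans) can be illustrated with a simplified sketch. This is not the ArtRepair implementation: each volume is reduced to a single hypothetical signal value, the function names are illustrative, and flagged volumes at the start or end of the series are simply given the value of the nearest good scan, which is an assumption.

```python
def flag_volumes(motion_mm_per_tr, threshold=1.5):
    """Indices of volumes whose scan-to-scan motion exceeds the threshold."""
    return [i for i, m in enumerate(motion_mm_per_tr) if m > threshold]


def repair(series, bad):
    """Replace flagged values by linear interpolation between the nearest good scans."""
    bad = set(bad)
    good = [i for i in range(len(series)) if i not in bad]
    if not good:
        raise ValueError("no unflagged volumes to interpolate from")
    out = list(series)
    for i in sorted(bad):
        prev = max((g for g in good if g < i), default=None)
        nxt = min((g for g in good if g > i), default=None)
        if prev is None:        # flagged run at the start (ASSUMPTION: copy next good)
            out[i] = series[nxt]
        elif nxt is None:       # flagged run at the end (ASSUMPTION: copy last good)
            out[i] = series[prev]
        else:
            frac = (i - prev) / (nxt - prev)
            out[i] = series[prev] + frac * (series[nxt] - series[prev])
    return out


# Hypothetical values: one signal sample and one motion estimate per volume.
signal = [10.0, 11.0, 50.0, 60.0, 14.0, 15.0]
motion = [0.0, 0.2, 2.1, 1.8, 0.3, 0.1]

bad = flag_volumes(motion)
repaired = repair(signal, bad)
print(bad, repaired)
print(f"repaired fraction: {len(bad) / len(signal):.1%}")
```

In this toy series, the two flagged volumes are replaced by values on the straight line between their unflagged neighbours, and the repaired fraction is the quantity that stayed below 8.5% per participant in the data described above.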