Enhanced activations in the dorsal inferior frontal gyrus specifying the who, when, and what for successful building of sentence structures in a new language

It has been argued that the principles constraining first language acquisition also constrain second language acquisition; however, neuroscientific evidence for this claim is scant, and scanter still for third and subsequent languages. We conducted fMRI experiments to evaluate this claim, focusing on the building of complex sentence structures in Kazakh, a language new to participants who had already acquired at least two languages. The participants performed grammaticality judgment and subject-verb matching tasks with spoken sentences. We divided the participants into two groups based on the performance levels attained in one of the experimental tasks: High in Group I and Low in Group II. A direct comparison of the two groups, which examined who parsed the structures, indicated significantly stronger activations for Group I in the dorsal left inferior frontal gyrus (L. IFG). Focusing on Group I, we tested the contrast between the initial and final phases of our testing, which examined when the structures were parsed, as well as the contrast that examined what structures were parsed. These analyses further demonstrated focal activations in the dorsal L. IFG alone. Among the individual participants, stronger activation in the dorsal L. IFG during the sentence presentations predicted higher accuracy rates and shorter response times in the tasks that followed. These results cannot be explained by task difficulty or memory load; instead, they indicate a critical and consistent role of the dorsal L. IFG during the initial to intermediate stages of grammar acquisition in a new target language. Such functional specificity of the dorsal L. IFG provides neuroscientific evidence consistent with the claims of the Cumulative-Enhancement model for language acquisition beyond a target second or third language.

The participants shown in Figs. 3a-d are listed here as a subset of the participants reported in our previous study 1. Participants 1-14 were bilinguals and Participants 15-31 were multilinguals in that study; see our previous paper 1 for details, including the definition of bilinguals/multilinguals. In the present study, there were no statistical differences between the bilinguals and multilinguals under the G4 condition in either the behavioral results or the brain activations.

The rANCOVA (uncorrected p < 0.001 at the voxel level and FWE-corrected p < 0.05 at the cluster level). The NV pair′ contrast was performed in the first-level analyses; activations were averaged among the OS, OO, and SO conditions to replicate the results of Group I (Fig. 4d). An exclusive mask of negative activation for the first quarter was applied (one-sample t-test, uncorrected p < 0.05). (d) L. IFG activations revealed by the interaction of a two-way [groups × quarters] rANCOVA (uncorrected p < 0.001 at the voxel level and FWE-corrected p < 0.05 at the cluster level). The Sentence′ contrast was performed in the first-level analyses; activations were averaged among the OS, OO, and SO conditions.

The stimulus sentences were checked for grammaticality and accuracy by native speakers of Kazakh.

Stimuli
Under the G4 condition, each trial began with a visual cue of "G4" (Fig. 2). In a task trial, this cue was followed by the "Lexical list" of five Kazakh words: two verbs (in the simple past tense with a third-person singular suffix), two nouns (a pronoun and a proper noun, in the nominative case), and adamdï (or adam). This order of presentation was maintained throughout all trials. Because the verbs used in the study generally had more syllables than the nouns, we set the stimulus length to 0.6 s for a verb and 0.5 s for a noun; the original pitch of each word was maintained throughout. For the Lexical list, each of the five Kazakh words presented auditorily was accompanied by a visual presentation of its English translation (see Fig. 2b). Each visual presentation started 0.25 s before the auditory presentation and continued until 0.25 s after the end of the auditory presentation. We further added an interval of 0.5 s between successive visual stimuli.
In each demo trial (Fig. 2a), a five-word sentence ("Sentence," capitalized here) was presented auditorily. A visual sign (either + or −) was simultaneously presented for 5.0 s, indicating the grammaticality of the sentence (see Table 1), either grammatical (+) or ungrammatical (−); an ungrammatical sentence always included an error in the verb suffix.
We then randomly extracted an "NV pair" from the Sentence in the same trial. After a pause of 2.7 s, the NV pair was presented auditorily with a silent period of 1 s between the noun and the verb. For the NV pair, another +/− sign was simultaneously presented for 2.7 s. This sign indicated the correctness of the NV pairing, either matched (+) or mismatched (−) with the SV pairs in the sentence structure. Note that there were two SV pairs in each Sentence (see Table 1). In the NV pairs, the nouns were always presented without the accusative-case suffix, and the verbs were always presented with a third-person singular suffix in the simple past tense.
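The Lexical-list timing described above can be expressed as a simple schedule computation. Below is a minimal sketch in Python, assuming the stated durations (0.6 s for verbs, 0.5 s for nouns, a 0.25 s visual lead-in and lead-out, and a 0.5 s inter-stimulus interval) and treating adamdï as a noun for length purposes; the function name and data representation are illustrative, not taken from the study's presentation scripts.

```python
# Sketch of the Lexical-list timing (durations from the text; the
# function name and word representation are illustrative assumptions).

def lexical_list_schedule(word_kinds):
    """Return (visual_onset, audio_onset, audio_duration) per word, in s.

    word_kinds: sequence of 'verb' (0.6 s audio) or 'noun' (0.5 s audio).
    Each visual translation starts 0.25 s before its audio and ends
    0.25 s after it; successive visual stimuli are separated by 0.5 s.
    """
    audio_dur = {"verb": 0.6, "noun": 0.5}
    t, schedule = 0.0, []
    for kind in word_kinds:
        dur = audio_dur[kind]
        visual_onset = t
        audio_onset = t + 0.25
        schedule.append((visual_onset, audio_onset, dur))
        # visual ends 0.25 s after the audio, then a 0.5 s interval follows
        t = audio_onset + dur + 0.25 + 0.5
    return schedule

# Example: two verbs, two nouns, and adamdï (treated here as a noun)
events = lexical_list_schedule(["verb", "verb", "noun", "noun", "noun"])
```

Under these assumptions, the second word's visual stimulus begins 1.6 s into the list (0.25 s lead-in + 0.6 s audio + 0.25 s lead-out + 0.5 s interval).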
All participants in the present study successfully completed the G1-G3 conditions in the previous study 1. Given that the participants were familiar with the Kazakh lexical items to be used in the present study, the Lexical list was not shown during the demo trials under the G4 condition. For most participants (23 out of 31), the first session in the G4 condition was conducted on the same day that they completed the G3 condition.
During the scans, the participants wore MRI-compatible video goggles (resolution = 800 × 600 pixels, frame rate = 60 fps; VisuaStim Digital, Resonance Technology Inc., Northridge, CA). A small red cross was always shown at the center of the goggle screen, and the participants were instructed to fixate on it as much as possible during the tasks. The Presentation software package (Neurobehavioral Systems, Albany, CA) controlled the presentation of the stimuli and the collection of the behavioral data [accuracy and response times (RTs)].

Tasks
In each task trial (Fig. 2b), a Lexical list was presented as explained above, and then a Sentence was presented after a pause of 0.75 s. We tested the grammaticality task ("GR task"), in which a Sentence using all five words in the Lexical list was presented. Symbols of a blue "+" and a yellow "−" were then presented. The participants judged whether the Sentence was grammatical (+) or ungrammatical (−), and responded within 3.0 s by pressing one of the two colored buttons (blue or yellow) on a response pad ("GR response").
After a pause of 0.75 s, we tested the subject-verb task ("SV task"), in which an NV pair was presented. Symbols of a green "+" and a red "−" were then presented. The participants judged whether the NV pair was matched (+) or mismatched (−) with the SV pairs in the sentence structure, and responded within 3.0 s by pressing one of the two colored buttons (green or red) ("SV response"). We measured RTs from the onset of each response period (GR or SV).
The demo and task blocks under the G4 condition were given over one to three days, with intervals of 12 ± 15 days (mean ± SD; range: 1-63 days) between days when applicable. At the beginning of each day, the participants reviewed their knowledge of the G1-G3 conditions outside of the scanner until they performed accurately under each condition (at least six out of eight task trials correct for both the GR and SV tasks). A one-day experiment (12 scanning runs or fewer) took less than 2.5 hours, including instructions (see the Introduction) and one or two breaks outside of the scanner.
As in our previous experiment 1, three "Words" trials in a scanning run were followed by a block of eight G4 trials and then by two more Words trials (i.e., 13 task trials in total). Each Words trial began with a visual cue of "W," followed by the presentation of a Lexical list of five words [two verbs, two nouns, and adamdï (or adam); the order was maintained throughout all trials]. In the Words trials, the Lexical list was basically the same as in the G4 trials, except that the words could have suffixal changes. A set of five words ("Lexicons") was then presented auditorily. The Lexicons consisted of the same five words as the Lexical list, in the same order, but one to four of them had a different suffix. The duration of the Lexicons phase was adjusted to 4.35 s. Symbols of a blue "1," yellow "2," green "3," and red "4" were then presented. The participants judged how many words had a different suffix, and responded within 3.0 s by pressing one of the four colored buttons on a response pad ("Words response").
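The Words-task judgment above amounts to counting, across the five matched word pairs, how many differ only in their suffix. A hypothetical sketch, with a (stem, suffix) representation and placeholder items that are illustrative rather than the study's actual Kazakh materials:

```python
# Hypothetical sketch of the Words-task scoring: count how many Lexicons
# items differ from the Lexical list in their suffix. The (stem, suffix)
# split and the example items below are illustrative placeholders only.

def count_suffix_changes(lexical_list, lexicons):
    """Both arguments are equal-length lists of (stem, suffix) pairs in
    matched order; only suffixes may differ between the two lists."""
    changed = 0
    for (stem_a, suffix_a), (stem_b, suffix_b) in zip(lexical_list, lexicons):
        assert stem_a == stem_b, "stems are identical in a Words trial"
        if suffix_a != suffix_b:
            changed += 1
    return changed  # the correct button is this number (1-4)

lexical_list = [("v1", "a"), ("v2", "a"), ("n1", ""), ("n2", ""), ("adam", "dï")]
lexicons     = [("v1", "b"), ("v2", "a"), ("n1", "x"), ("n2", ""), ("adam", "dï")]
n_changed = count_suffix_changes(lexical_list, lexicons)
```

Here two of the five items carry a changed suffix, so the correct response would be button "2."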

fMRI data analyses
The acquisition timing of each axial slice of the MRI scans (see the main text) was corrected using the middle slice (the 15th slice chronologically) as a reference for the functional images. We spatially realigned each volume to the first volume of the consecutive runs, and a mean volume was then calculated. We set the threshold for head movement during a single run at a displacement of 2 mm in any of the three directions and a rotation of 1.4° around any of the three axes; these thresholds were determined empirically from our previous studies 1,2. If a run included one or more images exceeding this threshold, we replaced each outlying image with an interpolated image, i.e., the average of the images chronologically immediately before and after the outlying one. We then spatially realigned each volume to the first volume of the consecutive runs again. Even after this realignment procedure, one or two runs had to be excluded from the analyses for three participants because of excessive head movement. The realigned data were resliced every 3 mm using seventh-degree B-spline interpolation. Each participant's structural image was coregistered to the mean functional image generated during realignment. The resultant structural image was spatially normalized to the standard brain space defined by the Montreal Neurological Institute (MNI), using the extended version of the unified segmentation algorithm with light regularization, a generative model that combines tissue segmentation, bias correction, and spatial normalization in a single model 3. The resultant deformation field was applied to each realigned functional image for spatial normalization with non-linear transformations. All normalized functional images were then smoothed with an isotropic Gaussian kernel of 9 mm full-width at half maximum (FWHM).
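The outlier-replacement step above can be sketched in a few lines: flag volumes whose realignment parameters exceed the 2 mm / 1.4° thresholds, then replace each flagged interior volume with the mean of its chronological neighbours. The array shapes and function names are illustrative, and rotations are expressed in degrees here to match the text (SPM itself reports them in radians):

```python
import numpy as np

def flag_outlier_volumes(params, trans_thresh=2.0, rot_thresh=1.4):
    """Flag volumes whose realignment parameters exceed the thresholds.

    params: (n_volumes, 6) array of [tx, ty, tz, rx, ry, rz]; translations
    in mm, rotations in degrees (matching the 2 mm / 1.4 deg criteria).
    """
    trans_bad = np.any(np.abs(params[:, :3]) > trans_thresh, axis=1)
    rot_bad = np.any(np.abs(params[:, 3:]) > rot_thresh, axis=1)
    return np.where(trans_bad | rot_bad)[0]

def replace_outlier_volumes(volumes, outlier_idx):
    """Replace each flagged (interior) volume with the mean of the volumes
    immediately before and after it, as described in the text."""
    fixed = volumes.copy()
    for i in outlier_idx:
        fixed[i] = 0.5 * (volumes[i - 1] + volumes[i + 1])
    return fixed

# Toy run: five "volumes" of shape 2 x 2 with values 0, 1, 4, 9, 16
run = (np.arange(5.0) ** 2)[:, None, None] * np.ones((2, 2))
params = np.zeros((5, 6))
params[2, 0] = 2.5          # 2.5 mm translation exceeds the 2 mm threshold
bad = flag_outlier_volumes(params)
clean = replace_outlier_volumes(run, bad)
```

In this toy example the third volume (value 4) is flagged and replaced by the mean of its neighbours, (1 + 9) / 2 = 5.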
In the first-level analysis (i.e., the fixed-effects analysis within a participant), each participant's hemodynamic responses were estimated for the following events: Lexical list, Sentence, GR response, NV pair, and SV response (see Fig. 2b), together with Lexicons and Words response. The events of Sentence, GR response, NV pair, and SV response were separated for each of the OS, OO, SO, and SS conditions. Each event was modeled with a boxcar function [taking a constant value during the event period (0 otherwise)] convolved with a hemodynamic response function. Low-frequency noise was removed by high-pass filtering at 1/128 Hz. To minimize the effects of head movement, the six realignment parameters obtained from preprocessing were included as nuisance factors in a general linear model. The estimated responses for each event were then generated with the general linear model for each participant.
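How a single first-level regressor is formed from a boxcar and an HRF can be illustrated as follows. This is a minimal sketch, not the study's SPM model: the double-gamma HRF parameters below (response peak near 5 s, undershoot near 15 s) are common SPM-style defaults assumed for illustration, and the event timing is arbitrary.

```python
import math
import numpy as np

# Illustrative sketch: one event regressor = boxcar over the event period
# convolved with a double-gamma HRF. HRF parameters are assumed SPM-style
# defaults, not taken from the study's analysis code.

def gamma_pdf(t, shape, scale=1.0):
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = (t[pos] ** (shape - 1) * np.exp(-t[pos] / scale)
                / (math.gamma(shape) * scale ** shape))
    return out

def canonical_hrf(tr, length=32.0):
    """Double-gamma HRF sampled every tr seconds (peak ~5 s, undershoot ~15 s)."""
    t = np.arange(0.0, length, tr)
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

def event_regressor(n_scans, tr, onset, duration):
    t = np.arange(n_scans) * tr
    box = ((t >= onset) & (t < onset + duration)).astype(float)  # boxcar
    return np.convolve(box, canonical_hrf(tr))[:n_scans]

# Arbitrary example: an event starting at 10 s lasting 4.35 s, TR = 1 s
reg = event_regressor(n_scans=120, tr=1.0, onset=10.0, duration=4.35)
```

The resulting regressor is zero before the event onset and rises to a delayed peak, which is what the GLM then fits against the measured BOLD signal.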
In the second-level analysis (i.e., the random-effects analysis for a group), the estimated responses above were used for the inter-subject comparison. We adopted whole-brain analyses, in which we performed the following tests with three nuisance factors (age, gender, and laterality quotient): one-way repeated-measures analyses of covariance (rANCOVAs) within each group (Figs. 4a, 4b, 4d, and 4e), unpaired t-tests for direct group comparisons (Fig. 4c), rANCOVAs for all participants (Supplementary Figs. S3a and S3b), and two-way [groups × quarters] rANCOVAs (Supplementary Figs. S3c and S3d). For the anatomical identification of the activated regions, we used the Anatomical Automatic Labeling (AAL) method (http://www.gin.cnrs.fr/AAL2/) 4 and the cortical data with region names as provided by Neuromorphometrics Inc. (http://Neuromorphometrics.com/) under academic subscription. In addition to the whole-brain analyses, we performed analyses for each region of interest (ROI) using the MarsBaR toolbox (http://marsbar.sourceforge.net/).

Supplementary Figure S3. Global and local syntax-related activations. (a) Bilateral activations for all participants in the [Sentence − Lexical list] contrast (abbreviated as Sentence′). Activations during the fourth quarter are shown under the OS and SO conditions [family-wise error (FWE) corrected p < 0.05 at the voxel level]. (b) Localized activations in the [NV pair − Lexical list] contrast (abbreviated as NV pair′). The Sentence′ and NV pair′ contrasts were performed in the second-level analyses. (c) L. IFG activations for all participants revealed by the main effect of quarters in a two-way [groups × quarters] rANCOVA.

Supplementary Table S2. Basic data for participants.
The table shows the number of languages (second to fifth languages, L2-L5) to which each participant had been exposed, if any. Language proficiency in English was evaluated with the Listening Comprehension sub-test of the Avant STAMP 4S (Standards-based Measurement of Proficiency - 4 Skills); this English proficiency test was administered to all participants. For the other languages, participants first received a sample test for at least one language other than English, and those who were able to answer the first question of the sample test then received the full test. If a participant stopped taking the sample test, no score was registered. DOR: duration of residence in a country where the language is an official language; DOE: duration of exposure to each language; Lg.: language; DE: German; ES: Spanish; FR: French; IT: Italian; KO: Korean; PT: Portuguese; RU: Russian; ZH: Chinese.