Rhythmic motor behavior explains individual differences in grammar skills in adults

A growing body of literature has reported a relationship between music and language, particularly between individual differences in perceptual rhythm skill and grammar competency in children. Here, we investigated whether motoric aspects of rhythm processing, as measured by rhythmic finger tapping tasks, also explain the rhythm-grammar connection in 150 healthy young adults. We found that all expressive rhythm skills (spontaneous, synchronized, and continued tapping), along with rhythm discrimination skill, significantly predicted receptive grammar skills as measured by auditory sentence comprehension and grammaticality judgment (e.g., singular/plural, past/present), even after controlling for verbal working memory and music experience. Among these, synchronized tapping and rhythm discrimination explained unique variance in sentence comprehension and grammaticality judgment, respectively, indicating differential associations between different rhythm and grammar skills. Together, we demonstrate that even simple and repetitive motor behavior can account for seemingly high-order grammar skills in the adult population, suggesting that the sensorimotor system continues to support syntactic operations.

synchronization performance 26. In addition, endogenous rhythmic behavior has been evaluated by means of spontaneous motor tapping, which involves finger tapping at one's most natural and comfortable interval, often associated with a 'personally preferred' or 'optimal' tempo 27,28. Here we measured these motoric rhythm skills as well as a perceptual rhythm skill using a same/different rhythm discrimination task. For grammar measures, we employed two language tasks on spoken sentences wherein participants were asked either to judge the syntactic well-formedness of spoken sentences with a subject- or object-relative center-embedded clause or to identify the grammatical agent linked to an action verb 29,30. In addition, we assessed participants' verbal working memory, which was used as a covariate along with demographic information including age, gender, and music background. We predicted that performance on the rhythm tasks would be associated with performance on the grammar tasks, even after controlling for verbal working memory.

Description of the behavioral tasks and outcomes
A total of 150 participants (mean age = 20.4 years, SD = 2.46) underwent two language tasks on spoken sentences (grammaticality judgment and sentence comprehension), three rhythm tasks (rhythm discrimination, spontaneous tapping, and auditory beat tapping), and a working memory task (letter-number sequencing) (Fig. 1). The behavioral results are summarized in Fig. 2 and Table 1.
The grammaticality judgment task assessed participants' ability to discern syntactic well-formedness in a series of spoken sentences that differed in syntactic complexity, containing either a subject-relative (SR) or object-relative (OR) center-embedded clause (Fig. 1A). Half of these sentences were grammatically correct while the other half contained a morpho-syntactic error. As expected, participants performed worse at detecting grammatical errors in OR compared to SR sentences (Fig. 2A; F(1,149) = 38.1, P = 6 × 10⁻⁹), owing to their non-canonical structure 31.
The sentence comprehension task also consisted of a series of SR and OR sentences (all syntactically well-formed). Half of these sentences were mixed with multi-talker babble noise. In this task, participants identified the gender of a noun (e.g., boys) linked to an action verb (e.g., assist), while ignoring a noun related to any of the following preference verbs: love, adore, dislike, and hate (Fig. 1B). As was the case with the grammaticality judgment task, OR sentences yielded lower comprehension accuracy than SR sentences (χ²(1) = 527.8, P < 2 × 10⁻¹⁶). The background noise lowered accuracy (χ²(1) = 125.7, P < 2 × 10⁻¹⁶), especially for OR sentences, resulting in a significant interaction with sentence type (χ²(1) = 10.7, P = 0.001) (Fig. 2B).

In the rhythm discrimination task, participants determined whether pairs of short rhythmic sequences were the same or different (Fig. 1C). Half of the rhythms were metrically simple in that the inter-tone intervals were regularly arranged into groups of 1-s beat intervals, rendering a sense of beat 16. The other half were metrically complex, making it difficult to detect the underlying beat. Consistent with previous findings 16,32, metrically complex rhythms were more difficult to discriminate than simple rhythms (Fig. 2C; F(1,149) = 105.6, P < 2 × 10⁻¹⁶).
In the spontaneous tapping task, participants were instructed to tap their right index finger consistently at their most natural and comfortable tempo (Fig. 1D). The average tempo (0.66 s) and its range (0.18-1.6 s) were similar to those in a previous report 28. We gauged participants' internal rhythm performance using a consistency index derived from the variance of inter-tap intervals (ITI; see Methods for more details).
In the auditory beat tapping task, participants tapped their right index finger to metronome beats with inter-beat intervals (IBIs) of 500, 750, 1125, and 1687 ms (synchronization phase) (Fig. 1E). These IBIs roughly correspond to 120, 80, 53, and 36 beats per minute (BPM). Participants were also required to continue tapping consistently at the same tempo after the metronome stopped halfway through (continuation phase). A synchronization index was calculated from the uncertainty of the distribution of tap timings with respect to beat timings for the synchronization data, and a consistency index was computed from the ITI variance for the continuation data (see Methods for more details). We found that longer IBIs yielded more synchronized tapping responses during the synchronization phase (χ²(3) = 177.0, P < 2 × 10⁻¹⁶), while shorter IBIs yielded more consistent tapping during the continuation phase (χ²(3) = 16.4, P = 9 × 10⁻⁴) (Fig. 2E).
Lastly, we measured participants' working memory using the letter-number sequencing task (Fig. 1F). In this task, participants were required to verbally re-organize a series of alternating numbers and letters in ascending order (e.g., for K3F7R5, the correct answer is 357FKR). Working memory has been shown to be positively associated with a wide range of music and language abilities including sentence comprehension 33 and grammaticality judgment 34, as well as timing 35, perceptual rhythm 25, and motor rhythm 36. Thus, we included the working memory measure as a covariate along with the demographic variables of age, gender, and music background in the regression analyses.
To further examine whether different rhythm skills account for unique variance in grammar skills, we performed multiple regression analyses with all four rhythm measures (which were highly inter-related with each other) included as predictors. In this most stringent model, beat synchronization still robustly accounted for sentence comprehension (b = 0.40, t = 3.62, P = 4 × 10⁻⁴), whereas rhythm discrimination (b = 0.20, t = 2.22, P = 0.028) was predictive of grammaticality judgment, even after controlling for the other rhythm measures (Table 3). This suggests that there are dissociable aspects of temporal processing in the relationship between rhythm and grammar skills.

Discussion
Although a growing body of research has demonstrated connections between musical rhythm and linguistic grammar skills 5-8, the evidence has been limited to children and mostly to a perceptual rhythm skill, i.e., musical rhythm discrimination. In the present study with 150 healthy young adults, we demonstrated that a set of rhythm skills in both receptive and expressive domains was associated with receptive grammar tasks. Notably, even simple and repetitive motor behavior involving spontaneous or synchronized finger tapping was predictive of comprehension of spoken sentences that varied in syntactic structure. This finding extends the existing evidence beyond perceptual rhythm toward expressive/motoric rhythm skills, as well as from children to young adults. Together, these results suggest that common neurobiological mechanisms may be at play in both rhythm and syntactic processing 9, contributing to the association between individual differences in rhythm and grammar that persists into adulthood.

The present findings may shed light on the role of the motor system in auditory syntactic processing 37, by showing that even a relatively simple motor task such as spontaneous or continued finger tapping explained both grammaticality judgment and syntactic interpretation of spoken sentences. Spontaneous rhythmic behavior has been theorized to reflect the activity of self-sustaining internal oscillators 27,28, which is likely regulated by cortico-striatal motor circuits 38,39. Our data speak to the functional role of internal rhythmicity in readily analyzing syntactic structures during auditory sentence processing. In addition, beat synchronization tapping had robust explanatory power in predicting sentence comprehension performance, which was significant even when controlling for the effects of the other rhythm measures. Given that sensorimotor synchronization to external rhythm is thought to rely on predictive timing mechanisms 26,40, the current finding supports the idea that temporal prediction (e.g., when the next event occurs) may play a functional role in syntactic prediction (e.g., what word comes next given the preceding words in a sentence) 37,41. By contrast, the synchronized tapping measurements had a less robust relationship with grammaticality judgment performance, suggesting that this task may require mechanisms that cannot be solely explained by (forward) temporal prediction, such as (backward) re-analysis of preceding syntactic content 42,43. This re-analysis of temporal structure may have been captured by the significant relationship between rhythm discrimination and grammaticality judgment, perhaps because both tasks require back-and-forth comparison of words or tones in relation to preceding ones to achieve a timely judgment of the linguistic or rhythmic structure. This finding is in line with the idea that musical rhythm processing may recruit neurobiological resources for rule-based temporal processing shared with the syntax processing system 9.
Together, the current results suggest that different rhythm skills may uniquely contribute to accounting for different grammar skills. This underscores the importance of assessing multiple rhythm skills to gain a more complete picture of the rhythm-grammar relationship, which may be manifested through multiple neurocognitive mechanisms such as precise temporal prediction and re-analysis of temporal structures. Moreover, from a broader perspective, our findings suggest that the relationship between music and language may not be a unitary construct, but rather a consequence of multiple mechanisms shared by different aspects of music and language. For instance, phonological processing may be correlated more dominantly with components of rhythm processing that tap into precise auditory encoding 44,45. Thus, future investigation of a wide range of music and language tasks using a well-powered sample may provide a more comprehensive understanding of the multifaceted associations between music and language.
In sum, the current study provides evidence for an association between expressive rhythm skills and receptive grammar skills in an adult population. This is in line with clinical observations that children with developmental language disorder often exhibit deficits in producing rhythmic movements 46, which might build upon common genetic underpinnings 47,48. Moreover, we found that different rhythm skills uniquely explained different grammar skills, suggesting that there are dissociable aspects of temporal processing in the rhythm-grammar relationship. Our findings suggest that what has been regarded as a core linguistic operation, i.e., syntax, is associated with domain-general temporal processing in the sensorimotor system.
to immediately repeat the practice trial if they chose. There was no time constraint in this task. Participants underwent 48 trials presented in a randomized order without feedback during the main task. A 15-s break was provided every 12 trials. We computed a d-prime (d′) score for each sentence type, which was obtained by calculating the difference of the z-scores between hit and false alarm rates. Rates of 0 and 1 were adjusted to prevent an indefinite d′ score 49.
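The d′ computation described above can be sketched as follows. The exact 0/1 rate adjustment from ref. 49 is not specified in the text, so this sketch assumes the common 1/(2N) correction; the authors' adjustment may differ.

```python
from statistics import NormalDist

def dprime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate).

    Rates of exactly 0 or 1 are adjusted (here with the common
    1/(2N) correction, an assumption) to avoid infinite z-scores.
    """
    def adjusted_rate(successes, n):
        rate = successes / n
        if rate == 0.0:
            rate = 1 / (2 * n)
        elif rate == 1.0:
            rate = 1 - 1 / (2 * n)
        return rate

    hit_rate = adjusted_rate(hits, hits + misses)
    fa_rate = adjusted_rate(false_alarms, false_alarms + correct_rejections)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Example: 20 hits / 4 misses, 6 false alarms / 18 correct rejections
print(round(dprime(20, 4, 6, 18), 2))
```

Note that without the rate adjustment, a participant with perfect performance (hit rate 1, false-alarm rate 0) would receive an infinite d′.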

Sentence comprehension task
The language materials comprised 64 base sentences, each consisting of six words: a male noun (e.g., men, sons, kings), a female noun (e.g., women, daughters, queens), a gender-neutral noun (e.g., children, farmers, artists), the relative pronoun 'that', a transitive action verb (e.g., help, protect, tease, bully), and one of four transitive preference verbs: love, adore, hate, and dislike. The six words were arranged to contain either an SR or OR center-embedded structure by switching the position of the noun and the verb within the relative clause (Fig. 1B). An action verb and a preference verb were located in either the main or relative clause. Half of the sentences were mixed with a background noise of multi-talker babble consisting of two male and two female speakers at a signal-to-noise ratio of −1 dB. The sentence type (i.e., SR or OR), the clarity of the sentence (i.e., clear or noisy), the action verb location (i.e., main or relative clause), and the gender of the agent (i.e., female or male) were fully counterbalanced across trials. In each trial, participants listened to a spoken sentence and indicated the gender of the individuals performing an action, while disregarding those who love/adore/hate/dislike others, by pressing either the 'male' (left arrow) or 'female' (right arrow) key within 3 s. After the task instruction, participants underwent 8 practice trials during which the corresponding sentences were concurrently presented on the screen. They were provided with visual feedback and (if needed) additional verbal instruction from the experimenter following each response. Then they received 16 more trials with visual feedback but without visual presentation of the sentences. During the main task, participants completed a total of 64 trials presented in a randomized order without feedback and without visual presentation of the sentences. The sentences used during the practice trials were not presented. There was a 15-s break after every 16 trials.
Percent accuracy for each combination of sentence type and clarity conditions is displayed in Fig. 2B. We used a mixed-effects logistic regression model (glmer in the lme4 package) 50 to analyze the binary correct/incorrect responses, with the factors of sentence type (SR and OR), clarity (clear and noisy), and their interaction as fixed effects and a random intercept for participant. We evaluated statistical significance using Type III Wald chi-square tests in the car package 51.

Rhythm discrimination task
Rhythm stimuli comprised 10 metrically simple and 10 metrically complex rhythmic tone sequences and their respective variants, chosen from a set of rhythms used by Grahn and Brett 16. Each rhythm consisted of seven or eight woodblock sounds (20 ms in duration) with intervals of 250, 500, 750, or 1000 ms between the sounds, spanning 3 s. A variant of each rhythm was made by switching the order of two adjacent intervals in the rhythm (Fig. 1C). On each trial, participants listened to a pair of rhythms presented sequentially with a 2-s interval and judged whether the two rhythms were the same or different by pressing either the 'same' (left arrow) or 'different' (right arrow) key. There was no time constraint in this task. Each rhythm was presented twice, once paired with itself (i.e., 'same') and once paired with its variant (i.e., 'different'), resulting in a total of 40 trials presented in a randomized order. Participants performed two practice trials (one 'same' and one 'different') with feedback, using one of the 10 simple rhythms and its variant. Participants were allowed to replay the practice if they chose and hear the same two rhythm pairs again. No feedback was given during the main task. We computed a d′ score for each rhythm type. The d′ data were analyzed using a repeated-measures ANOVA with a within-subject factor of rhythm type (metrically simple and metrically complex).

Spontaneous tapping task
In the spontaneous tapping task, participants were instructed to tap the space bar with their right index finger consistently at their most natural and comfortable rate. They were encouraged not to tap too fast or too slow but at a tempo at which they could keep tapping most consistently (Fig. 1D). Each of the four trials was terminated after 43 taps on the space bar (i.e., 42 ITIs). There was a short practice with 15 taps. The task was performed prior to the auditory beat tapping task (Fig. 1G), to prevent exposure to metronome beats from influencing spontaneous motor tempo. The first two ITIs were discarded from each trial. We obtained a mean ITI from the remaining 40 ITIs, calculated after excluding ITIs exceeding the initial mean ITI ± 3 standard deviations. We obtained a consistency index from the coefficient of variation (CV) of the ITIs, i.e., the standard deviation of the remaining ITIs divided by the mean ITI. The consistency index was computed as −log(CV) for each trial and averaged across the four trials (Fig. 2D).
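The consistency index computation described above can be sketched as follows, assuming the natural logarithm (the base is not specified in the text):

```python
import math
import statistics

def consistency_index(itis, n_discard=2, sd_cutoff=3.0):
    """Consistency index for a tapping trial, following the steps in the text:

    1. Drop the first `n_discard` inter-tap intervals (ITIs).
    2. Exclude ITIs beyond the initial mean +/- `sd_cutoff` SDs.
    3. CV = SD of remaining ITIs / mean of remaining ITIs.
    4. Consistency = -log(CV); higher values indicate steadier tapping.
    """
    itis = itis[n_discard:]
    mean0 = statistics.mean(itis)
    sd0 = statistics.stdev(itis)
    kept = [t for t in itis if abs(t - mean0) <= sd_cutoff * sd0]
    cv = statistics.stdev(kept) / statistics.mean(kept)
    return -math.log(cv)
```

For example, a series of ITIs tightly clustered around 600 ms yields a higher index than a series with large jitter, since the CV (and hence log CV) is smaller.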

Auditory beat tapping task
Each trial of the task started with synchronization and ended with continuation. During the former phase, participants tapped the space bar with their right index finger in synchrony with a metronome beat (i.e., isochronous woodblock sounds, each 20 ms in duration). They were instructed to start tapping after the first four beats, accompanied by a countdown timer on the screen. The metronome stopped immediately after 21 taps (i.e., 20 ITIs), after which the continuation phase began, wherein participants continued tapping consistently while trying to maintain the same tempo without metronome beats. The continuation phase lasted until 21 tapping responses were received. Each trial used one of four IBIs, equally spaced on a log scale: 500, 750, 1125, and 1687 ms (Fig. 1E). Each of these four tempos was repeated four times for a total of 16 trials. The presentation order of the four tempos was pseudo-randomized within each four-trial set, such that there was no transition from the shortest to the longest IBI or vice versa. Participants underwent a practice trial with an IBI of 850 ms and were allowed to repeat the practice until they understood the instructions. A 30-s break was given after each set.
We analyzed beat tapping data during the synchronization and continuation phases separately. For the synchronization tapping data, we computed a circular metric based on Shannon entropy (SE):

SE = −∑_{i=1}^{M} p(i) log p(i),    SI = 1 − SE / log N,

where M is the number of bins, covering the range from −180° to +180°; p(i) is the probability of tap timings (i.e., relative phase angles with respect to the nearest beat timings) assigned to the i-th bin; and N is the total number of tapping responses. The bin size was set to 15° based on an optimal bin search procedure described in Kim et al. 25. The SE quantifies the degree of spreading of a data distribution, i.e., the relative phase distribution in our data. We derived a synchronization index (SI) from SE as shown in the formula above. SI ranges from 0 (all responses occurred in different bins) to 1 (all responses occurred in a single bin), exhibiting a less skewed distribution compared to other consistency measures (e.g., 23). For the continuation tapping data, with no reference beat timings, we computed a consistency index in the same way as for the spontaneous tapping data. The synchronization and consistency indices were computed for each tapping trial and averaged across the four trials for each tempo condition. The synchronization and consistency indices were analyzed using the Friedman test, a nonparametric test for repeated-measures data, with a within-subject factor of IBI (500, 750, 1125, and 1687 ms).
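A minimal sketch of this entropy-based synchronization index follows. The normalization SI = 1 − SE/log N is an assumption consistent with the stated 0-to-1 range (SE = 0 when all taps fall in one bin; SE = log N when every tap falls in a different bin) and may differ in detail from Kim et al.:

```python
import math

def synchronization_index(tap_times, beat_times, bin_deg=15):
    """Entropy-based synchronization index (SI); a sketch assuming
    the normalization SI = 1 - SE / log N.

    Each tap is converted to a relative phase angle (degrees, -180..180)
    with respect to the nearest beat, phases are assigned to 15-degree
    bins, and the Shannon entropy of the bin distribution is normalized
    so that SI = 1 when all taps share one bin and SI approaches 0 as
    taps spread maximally across bins.
    """
    ibi = beat_times[1] - beat_times[0]  # assumes isochronous beats
    phases = []
    for t in tap_times:
        nearest = min(beat_times, key=lambda b: abs(t - b))
        phases.append(360.0 * (t - nearest) / ibi)  # degrees in [-180, 180]
    n = len(phases)
    n_bins = 360 // bin_deg
    counts = [0] * n_bins
    for ph in phases:
        idx = min(int((ph + 180.0) / bin_deg), n_bins - 1)
        counts[idx] += 1
    se = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
    return 1.0 - se / math.log(n)

# Taps exactly on the beat land in a single phase bin, so SI = 1.
beats = [i * 0.5 for i in range(21)]  # 500-ms IBI
print(round(synchronization_index(beats, beats), 2))  # → 1.0
```

Jittered or drifting taps spread across bins and pull the index toward 0, matching the description of SE as a measure of spread of the relative phase distribution.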

Letter-number sequencing task
The verbal items consisted of a set of spoken letters ('C', 'D', 'F', 'H', 'J', 'K', 'L', 'P', 'Q', 'R', 'S', 'T', 'W') and numbers ('one' to 'nine'). On each trial, participants heard a list of alternating numbers and letters presented via headphones with a 1.5-s interval (Fig. 1F). After presentation of the sequence, they were required to say the numbers in order from smallest to largest and then the letters in alphabetical order. An experimenter manually recorded the correctness. There were two practice trials, with list lengths of three and four. The main task started with a set of four trials of three list items each, and the list length increased after every four trials until participants failed to recall all four trials at a given length, after which the experiment ended. The largest length was eight, resulting in a maximum of 24 trials. The letter-number sequencing score was computed as the number of trials correctly recalled.
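The sorting rule that defines a correct response can be sketched as follows (a hypothetical helper for illustration, not the authors' scoring code):

```python
def lns_answer(sequence):
    """Correct response for a letter-number sequencing trial:
    numbers sorted in ascending order, then letters sorted
    alphabetically (e.g., 'K3F7R5' -> '357FKR')."""
    numbers = sorted(ch for ch in sequence if ch.isdigit())
    letters = sorted(ch for ch in sequence if ch.isalpha())
    return ''.join(numbers + letters)

print(lns_answer('K3F7R5'))  # → 357FKR
```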

Multiple linear regression analysis
For each of the behavioral tasks, we used a representative outcome obtained by collapsing scores across task conditions (Fig. 2). For example, a single synchronization measure was computed by averaging the synchronization index (SI) across the four IBI conditions. A multiple linear regression analysis was performed for each combination of the two grammar and four rhythm measures, in which one of the four rhythm measures, working memory (i.e., letter-number sequencing score), gender, age, and years of musical training were entered as independent variables to predict each of the two grammar measures (Table 2). The resulting P values of the regression coefficients were adjusted with the false discovery rate (FDR) procedure (α = 0.05) to correct for multiple testing.
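The regressions themselves were fit in R, but the FDR adjustment step can be illustrated with a generic standard-library implementation of the Benjamini-Hochberg step-up procedure (not the authors' code):

```python
def fdr_adjust(pvals):
    """Benjamini-Hochberg step-up FDR adjustment.

    The adjusted value for the i-th smallest p value is
    min over j >= i of (m * p_(j) / j), capped at 1, where m is the
    total number of tests. Adjusted values are returned in the
    original input order.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

print(fdr_adjust([0.005, 0.1, 0.2, 0.9]))
```

An adjusted value below α = 0.05 then counts as significant after correction, as in Tables 2 and 3.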

Figure 1. Overview of the experimental procedures. (A) Example sentences in the grammaticality judgment task. Participants indicate whether each spoken sentence is grammatically correct. Half of the sentences contain a subject-verb agreement (SVA) error or a past tense error. The relative clause is underlined, and the syntactic error is shown in italics in parentheses. (B) Example sentences in the sentence comprehension task. Participants indicate the gender of individuals linked to an action verb, but not to four pre-designated preference verbs (love, adore, hate, and dislike), in each spoken sentence presented with or without multi-talker babble noise. The relative clause is underlined, and the target action verb is in bold. (C) Schematic representation of rhythm sequences in the rhythm discrimination task. Participants listen to each pair of rhythms and indicate whether they were the same or different. (D) Spontaneous tapping task. Participants are instructed to tap consistently at their own tempo without an external metronome. (E) Auditory beat tapping task. Participants tap along with metronome beats presented at one of four tempos (inter-beat intervals of 500, 750, 1125, or 1687 ms) (synchronization phase) and continue tapping after the metronome stops (continuation phase). (F) Schematic representation of the auditory sequence in the letter-number sequencing task. Participants listen to a sequence of alternating letters and numbers and repeat them back in sorted order. (G) Each participant completes the six behavioral tasks in one of the two orders shown. See Methods for more details.

Figure 2. Rhythm and grammar performance. Half-violin plots display the distributions of task performance. Box plots display the median (horizontal line within the box), the first quartile (lower boundary), and the third quartile (upper boundary). (A) Grammaticality judgment accuracy. (B) Sentence comprehension accuracy. (C) Rhythm discrimination accuracy. (D) Spontaneous tapping consistency (left) and mean tapping interval (right). (E) Auditory beat tapping consistency during the synchronization phase (top) and continuation phase (bottom).

Table 1. Mean raw scores (M) and standard deviations (SD) of demographic and behavioral variables.

Table 2. Standardized regression coefficients, t values, and uncorrected P values for predicting each of the two grammar measures as a function of demographic variables and each of the four rhythm measures. Significant values are in bold, and covariates are italicized.

Table 3. Standardized regression coefficients, t values, and uncorrected P values for predicting each of the two grammar measures as a function of demographic variables and all four rhythm measures. Significant values are in bold, and covariates are italicized.