Musical interaction is influenced by underlying predictive models and musical expertise

Musical interaction is a unique model for understanding humans’ ability to align goals, intentions, and actions, which also allows for the manipulation of participants’ internal predictive models of upcoming events. Here we used polyrhythms to construct two joint finger tapping tasks that even when rhythmically dissimilar resulted in equal inter-tap intervals (ITIs). Thus, behaviourally a dyad of two musicians tap isochronously at the same rate, yet with their own distinct rhythmical context model (RCM). We recruited 22 highly skilled musicians (in 11 dyads) and contrasted the effect of having a shared versus non-shared RCM on dyads’ synchronization behaviour. As expected, tapping synchronization was significantly worse at the start of trials with non-shared models compared to trials with a shared model. However, the musicians were able to quickly recover when holding dissimilar predictive models. We characterised the directionality in the tapping behaviour of the dyads and found patterns mostly of mutual adaptation. Yet, in a subset of dyads primarily consisting of drummers, we found significantly different synchronization patterns, suggesting that instrument expertise can significantly affect synchronization strategies. Overall, this demonstrates that holding different predictive models impacts synchronization in musicians performing joint finger tapping.


Musical interaction is influenced by
Musical interaction is a unique model for understanding humans' ability to align goals, intentions, and actions, which also allows for the manipulation of participants' internal predictive models of upcoming events. Here we used polyrhythms to construct two joint finger tapping tasks that even when rhythmically dissimilar resulted in equal inter-tap intervals (ITIs). Thus, behaviourally a dyad of two musicians tap isochronously at the same rate, yet with their own distinct rhythmical context model (RCM). We recruited 22 highly skilled musicians (in 11 dyads) and contrasted the effect of having a shared versus non-shared RCM on dyads' synchronization behaviour. As expected, tapping synchronization was significantly worse at the start of trials with non-shared models compared to trials with a shared model. However, the musicians were able to quickly recover when holding dissimilar predictive models. We characterised the directionality in the tapping behaviour of the dyads and found patterns mostly of mutual adaptation. Yet, in a subset of dyads primarily consisting of drummers, we found significantly different synchronization patterns, suggesting that instrument expertise can significantly affect synchronization strategies. Overall, this demonstrates that holding different predictive models impacts synchronization in musicians performing joint finger tapping.
Successful interpersonal interaction and communication is dependent on shared predictive models [1][2][3][4] . We understand others' actions by inferring their goals, beliefs, and attitudes 5,6 . Recently, musical paradigms have proven very useful for studying interpersonal interaction mechanisms, since precise predictive models can be easily experimentally manipulated (e.g tonality or meter) and quantified using interaction dynamics, which occur at a timescale of milliseconds 7 . Here we used a minimalistic musical paradigm to investigate the interactions between a dyad of two musicians having the same or different top-down predictions.
Humans are highly adept at coordinating movements with one another and do so often without noticing, which again influences social behaviour. Hence, interpersonal coordination may be a human predisposition, evident in behaviours such as the tendency towards synchronised walking 8 , or how dyads in rocking chairs attempt to synchronize even when the natural frequencies of their chairs are incongruent 9 . This tendency to synchronize movements is also visible in body movement in more complex interactions, such as the bodily synchronization during joke telling 10 . Synchronization in these tasks appears to be spontaneous, seemingly without conscious effort. In contrast, musical performance requires the ability to synchronize movements on-the-fly between performers, and is a case of intentional interpersonal coordination. Music consists of temporally related sounds, where movements must be precisely coordinated both within and between performers 11,12 . Thus, in music, interpersonal coordination is a result of intended joint action rather than a spontaneous occurrence or basal mimicry.
Much of interpersonal coordination research -and joint action more broadly -has been concerned with understanding the mechanisms that enable people to successfully coordinate with one another 13 -specifically, the interplay between the top-down predictive and bottom-up reactive mechanisms. It is now well established that we make continuous predictions about the sensory consequences of our own actions, allowing us to attenuate perceptions of predicted sensations and amplify salience of externally caused ones 14 . Prediction hence plays a major role in joint action, embedded within a common coding framework of action and perception, scaling across the what, when, and where of another's actions -i.e. what another's intention to act is, when they will act www.nature.com/scientificreports www.nature.com/scientificreports/ the fact that the dyads were actually tapping different rhythms. Hence, we were able to quantify the effects of the internal predictive model on synchronization within the dyads. Previous research has emphasized the necessity of shared predictive models of successful interactions, and we therefore hypothesized that a non-shared RCM would result in decreased synchronization measures 19 . We expected this effect to occur due to discrepancy between the perceived stimuli and the internal predictive model. In particular, the polyrhythm used in the experiment has a bar length of 1.5 seconds whereas the straight rhythm has a bar length of 2 seconds. Thus, any rhythmical adjustment done on the timescale of the bar will be different between the two. (a) An illustration of how an isochronous rhythmic sequence with a 500 ms ITI can be resolved into two distinct rhythms -a 3-against-4 polyrhythm at 160 BPM, and a 4/4 straight rhythm at 120 BPM. (b) The two scenarios in our experiment. Participants either tapped along with a computer metronome -the noninteractive condition; or in a bidirectionally coupled state. (c) An overview of the conditions, showing both the meter and the task. Note that condition A + B is balanced, so that both member 1 and member 2 performs the two distinct rhythms an equal amount of times.
Further, based on earlier studies, we hypothesized that a leading-following strategy would be more prominent in the non-shared RCM condition 2,4 (for an overview of synchronization strategies, see Fig. 3). While the underlying mechanisms of synchronization strategies remains a topic of discussion, recent research suggests that mutual adaptation is the most efficient strategy and is premediated by a merging of self-other representation 19 . In joint finger tapping tasks, this would entail considering the auditory feedback created by the other as linked to one's own action. In contrast, the leading-following strategy necessitates a consistent categorizing of the auditory feedback as belonging to the other person. In the non-shared RCM condition, we hypothesized that differences in tapping would force the members into considering the auditory feedback as dissociated from their own tapping. From this, one of the members, presumably the one with the most confidence in their own RCM, would then gravitate towards a leading role. Here, simulated data is shown exhibiting a mutual adaptation dynamic. In the first graph the ITI of a simulated sequence is shown. Cross-correlation is performed on the ITI timeseries, giving correlation coefficients for lag −1, 0 and +1. (b) An illustration of different synchronization strategies and their corresponding lag patterns. Mutual adaptation occurs when a perception-action loop is formed between the two members of a dyad, such that they equally weigh the incoming auditory stimuli from the other member, and their own model of the task. Leading-following depends on one of the dyad members to attenuate the information coming from the other member, here illustrated by the missing connection between the blue member's tapping and the orange member's headphone. As such, the leader puts more confidence in their own internal model. In the case of leading-leading, as observed in our experiment, both participants exhibit leading behaviour. Here, they both discard information from the other member and rather relies on their own model. 2), and did therefore not require ethical approval (reference number 87/2016). Nonetheless, the experiment was performed in accordance with Aarhus University's policy for responsible conduct of research and the Declaration of Helsinki. Participants signed consent forms and was informed that their participation was entirely voluntarily, and that they could exit the experiment at any time without any penalty to the payment for participation. In addition, participants were debriefed and given contact information to the researchers performing the experiment. The collected data were anonymized, and no national identification number was stored.
Paradigm. As we were interested in the interpersonal effects of the RCM, we decided to employ a joint finger tapping paradigm with bidirectional coupling -participant 1 received auditory feedback of participant 2's taps, and simultaneously participant 2 received auditory feedback of participant 1's taps. In addition, we added a non-interactive scenario where the participants tapped along with a computer metronome to serve as a baseline condition (Fig. 2b). Here, they did not receive any auditory feedback from their tapping. To constrain the amount of shared information within the dyads in the bidirectional condition we restricted auditory feedback from the tapping to only contain a transient sound of equal amplitude and length independent of tapping strength. This way the only information communicated between the pairs were their willingness to adapt, through adjusting their ITI.
Participants. In order to have participants that could reliably perform the 3-against-4 polyrhythm we recruited highly skilled musicians with normal hearing and no known sensorimotor nor neurological disorders. Participants were recruited predominantly from the Aarhus region, Denmark. A total of 30 paid volunteers participated. They were paired depending on the time slots they signed up for, for a total of 15 pairs with both same and mixed gender. Out of these, 2 dyads were discarded due to their inability to reliably perform the polyrhythm, and one dyad were discarded due to figuring out the hidden condition. The remaining participants self-reported their musical abilities primarily as professional (n = 14) and semi-professional (n = 9), with one participant self-reporting as amateur. The participant's mean age was 23.2 years (SD = 2.8), and 2 were female. Their expertise as measured by the mean of years of formal musical training and years of actively playing ranged from 5 to 24 years with a mean of 13.6 years (SD = 4.6). Most musicians had a percussion instrument (reported as Drums, Classical Percussion, Percussion, Cajon) as their primary instrument (n = 15), with the rest playing harmonic or melodic instruments such as the Piano, Guitar, Saxophone and Trombone.

Materials and apparatus.
Two Arturia Beatstep MIDI controllers were used as tapping devices. MIDI from both devices were recorded on the computer running the experiment, using an M-Audio MIDI-to-USB converter. In order to reduce latency between a participant's tap and the resulting sound, we used two overclocked Teensy 3.0 microcontrollers, each with an SGTL5000-based audio board. These were connected to the gate voltage output on the MIDI controllers. Using a custom sound generating script, this resulted in tap-to-sound latencies of under 1 ms. The paradigm was programmed in Python, using PsychoPy 32 and Pyo 33 , presented using a computer running Windows XP, with two monitors -one for each participant.
Stimuli. Each trial was initiated by one bar of an isochronous and unaccented metronome sound. The metronome stimuli were made with a freely available click sound. For rhythm A, the metronome was four beats at 120 BPM. For rhythm B, the metronome was four beats at 160 BPM. In the bidirectional scenario the metronome stopped after one bar. In the non-interactive scenario, the metronome continued throughout the length of the trial. In condition A + B the metronome for rhythm B started 500 ms later than the metronome for rhythm A, so that the participants would start tapping at the same time.
Task and procedure. Upon arrival, the members of each pair were given information on voluntarily participating in a research project, received task instructions, and signed consent forms. They were told that they would be participating in an experiment wherein we would be looking at the differences between playing a polyrhythm, and a straight rhythm. As in previous experiments their main objective was to "synchronize with the sound/ metronome, and try to maintain tempo" 34 . Participants were given examples of the two tasks they would be performing -a 120 BPM 4/4 simple rhythm (condition A) and the triplet of a faster 160 BPM 3-against-4 polyrhythm (condition B), both resulting in an inter-tap interval (ITI) of 500 ms. They were told there would be two scenarios -a non-interactive computer scenario wherein they would be tapping along with a computer metronome, and an interactive bidirectional scenario wherein they would be tapping along with their partner. In the bidirectional scenario the participants' auditory feedback was the tapping of their partner, so that dyad member 1 heard member 2's tapping, and vice versa. In the non-interactive scenario, the participants only heard the computer metronome.
The participants were instructed that a metronome would always start the task with one bar of clicks, and images and text on the screens would inform them of which rhythm to play. Further, they were told to tap with their index finger on a marked pad on the MIDI interface, and to sit as still as possible with their eyes open looking at a fixation cross during the actual tapping. Auditory stimuli were delivered using ER-2 insert earphones (Cortech Solutions), which provided an external noise reduction of >30 dB. Sound levels were set at a comfortable level for each participant. Participants were placed in the same room. While the dimensions of the room made it unfeasible to position the participants back-to-back, they were positioned such that no visual contact was possible without removing the earphones. www.nature.com/scientificreports www.nature.com/scientificreports/ For the participants, there were four different tasks -non-interactive: playing a 4/4 simple rhythm (A) referred to as condition A, or the triplet in a 3-against-4 polyrhythm (B) referred to as condition B; and bidirectional: playing rhythm A, or B. However, unbeknownst to the participants there were trials wherein one participant played rhythm A, and the other rhythm B (and vice-versa). This hidden condition, containing the non-shared RCM, is referred to as condition A + B (see Fig. 2c). Since we did not want the participants to know that they would be performing different rhythms at the same time, we informed them that they would always tap the same. As their goal ITI would remain the same in all conditions, this was technically correct. An overview of the conditions can be found in Fig. 2c. Each scenario had 100 trials, with 50 trials for each rhythm, for a total of 200 trials. This results in 25 trials per scenario for each possible combination of rhythms. Each trial lasted 12 seconds, with the first 2 seconds consisting of the metronome, and the rest of tapping.
Data analyses. The paradigm software produced files containing note-on and -off timestamps, which were pre-processed in MATLAB 35 . Only the tapping onset times were analysed, and missing data were dealt with by pair-wise deletion of the onset time. One pair of participants were excluded at this stage of the analysis due to excessive missing data (more than 25% trials with more than 25% missing taps). From the onset times we calculated the inter-tap interval (ITI), the time between two successive taps. Five outcome measures were calculated -ITIs, synchronization indices (SI), cross-correlations, signed asynchrony, and tapping variability as measured by the standard deviation of the asynchronies (SD asy ). To quantify stability in synchronization behaviour we also calculated the trajectory of the SI during the trial by calculating the average SI at four timepoints consisting of four equal subdivisions of the taps in each trial, so that timepoint 1 represents SI in the first quarter of each trials, timepoint 2 represents the next quarter of each trial, etc. From MATLAB the data was exported to R for statistical analyses 36 . Averages are reported as mean with standard deviation in cases where a normal distribution exists, and median with interquartile range if the distributions did not pass the Shapiro-Wilk test.
To measure how well participants synchronised their tapping to either the metronome or each other, we used synchronization indices (SI). These are calculated based on the variance of relative phase between two signals, giving a unitless number ranging from 0 to 1, based on the following formula 37 : where N represents the number of taps in each trial, and θ 1 and θ 2 are the phases of each dyad member (or computer metronome and participant in the noninteractive scenario). An SI of 1 represents perfect synchronization, and 0 no synchronization. For the non-interactive scenario, we calculated the SI with respect to the metronome signal for each participant, for each of the two rhythms (A and B). The resulting SI were then averaged per participant per rhythm and compared using a Wilcoxon rank sum test. For the bidirectional scenario the SI were calculated between the pair's time series, indicating how well the participants synchronized to each other. To classify the effect of condition on synchrony while accounting for dyadic differences we used a linear mixed effects model (LMM) with the dyads as the random factor and condition as fixed effect 38 . A rank transformation was applied to the SI, and p-values were obtained by likelihood ratio tests of the model with a null-model retaining only the random factor. Asynchrony in the bidirectional scenario was calculated as the difference between tap timing of partners in milliseconds. It is signed based on which person is the point of reference, so that when calculated with dyad member 1 being the reference, a positive asynchrony means that member 1's tap precedes member 2's tap, and a negative asynchrony the opposite. We calculated the mean asynchrony per trial. The standard deviation of the asynchronies (SD asy ), was calculated by taking the standard deviation of the absolute asynchronies per trial, and gives an indication of the variability of the finger tapping. A high SD asy indicates that the participants are adjusting their taps over a wide range, whereas a low SD asy indicates a more stable and narrow adjustment range. We used the same LMM approach as described earlier to analyse both the signed mean asynchrony and the tapping variability (SD asy ), with dyads as random factor, and condition as fixed effect.
To assess directionality in the interaction between the participants we calculated cross correlations at lag −1, 0, and +1 between each dyad member's tapping time series (Fig. 3a). The relation between these coefficients gives an indication as to how the dyad interacts. A perfectly synchronized trial with no variation in tempo would give high correlation at all lags, whereas a trial with high synchrony, but with some variation in tempo, would produce the highest correlation at lag 0. If a leader-follower dynamic is present, a positive correlation at either lag −1 or lag +1 occurs, depending on which participant is the follower (more adaptive one). This occurs due to one member, the follower, lagging one tap behind the leader. Mutual adaptation, wherein both members exhibit following behaviour leads to positive correlation at both lag −1 and lag +1, and negative correlation at lag 0 4 . This lag pattern occurs due to both members constantly adjusting their ITI in opposite directions on a tap-by-tap basis causing the negative correlation coefficient at lag 0. Hence, they are both more correlated to the previous tap of the other dyad member, resulting in an oscillation around an optimal ITI and the corresponding positive correlation at lag −1 and lag +1 (for an overview, see Fig. 3).
A challenge in correlation analysis of timeseries such as these is the assumption of stationarity. One way of addressing this issue is to divide the data into overlapping windows, which comes at the cost of making data interpretation harder, due to the increasing risk of spurious correlations 39 . As our participants consisted of highly trained musicians, and our data showed highly stable ITIs, we decided to use conventional cross correlation over the entire trial length. To account for dyadic differences, we performed a two-way MANOVA with condition and dyad as dependent variables on the Fisher Z-transformed lag coefficients. (2019) 9:11048 | https://doi.org/10.1038/s41598-019-47471-3 www.nature.com/scientificreports www.nature.com/scientificreports/ Public significance statement. This study shows that when a pair of musicians thinks differently about a rhythm they play together, their performance is worse. However, they are able to recover back to normal performance levels after a few taps. When musicians synchronize they use different strategies. Interestingly, we find that the strategies used by drummers may be different from other musicians.

Synchronization indices. A likelihood ratio test revealed a significant effect of condition on synchroni-
zation in the bidirectional scenario (χ 2 (2) = 13.03, p = 0.0015). We performed Bonferroni corrected post-hoc test, using the multcom package in R 40 . Here, we found that the SI was significantly lowered by approximately −0.023 (p = 0.0016) in condition B compared to condition A, and by approximately −0.021 (p = 0.0123) in condition A + B compared to A (see Fig. 4). No significant difference between condition B and condition A + B were found, nor between the rhythms in the non-interactive scenario. In the trajectory analysis we performed three separate one-way ANOVAs of the rank-transformed SI over the four timepoints, for each of the condition, and a significant effect was only found in condition A + B (F(3, 2068) = 4.767, Bonferroni corrected p = 0.0077) (see Fig. 5b). A Tukey HSD test showed that the SI significantly increased between timepoint 1 and 3 (p = 0.008) and between timepoint 1 and 4 (p = 0.005). While the change in SI between timepoint 1 and 2 appears to suggest an increase, it only approached significance (p = 0.067). This means that the synchronization levels increased over time in condition A + B.
Tapping variability, SD asy . A highly significant effect of condition on SD asy was found using a likelihood ratio test (χ 2 (2) = 16.5, p = 0.0003). The results mirror the findings from the synchronization index, summarized in Table 1. Due to this apparent similarity, we performed a correlation to determine their relationship. A Kendall's tau-b correlation showed a strong negative correlation between SD asy and SI (tau-b = 0.496, p < 0.0001).

Inter-tap intervals.
A likelihood ratio test showed significant differences in the participants' ITIs between the non-interactive and bidirectional scenario. (χ 2 (1) = 5.43, p = 0.0198). In the non-interactive scenario, the median ITI for both rhythms was 500.24 ms (IQR = 498.9, 503.1), decreasing to 497.34 ms (IQR = 490.7, 510.1) in the bidirectional scenario. Three example trials can be seen in Fig. 5a. No significant difference between conditions in either scenario was found.
Asynchrony. Overall, the participants exhibited a low mean asynchrony in the bidirectional scenario of only −3.64 ms (SD = 30.8). A likelihood ratio test on the absolute asynchronies only gave a close to significant effect of condition on asynchrony (χ 2 (2) = 4.87, p = 0.0876).
Inter-dyad differences in tapping strategy. In order to explore the between-dyad variability in the cross-correlations, we clustered the data by dyad with the lag data as input using Ward's clustering method (Fig. 6a) 41 . A similarity profile analysis using the R-package SIMPROF classified three clusters as significantly different at α < 0.05 42 . Visually inspecting the averaged lag coefficients shows that cluster 2 and cluster 3 both exhibit patterns resembling mutual adaptation, whereas cluster 1 shows little to no apparent pattern in its correlation coefficients (Fig. 6b). www.nature.com/scientificreports www.nature.com/scientificreports/ To further classify these clusters, we looked at participants' primary instruments, categorizing them as 1: drums & percussion (D/P), 2: harmonic & melodic (H/M). We also considered the combinations of primary instruments within a dyad by pairing them in three levels according to whether they consisted of two D/P Figure 5. In (a) representative trials from one dyad are plotted as the intertap interval in seconds over the number of taps, for the three conditions. In (b) we show the trajectory of the SI over four timepoints. The timepoints represent the median SI over four equal subdivisions of the taps for each trial. Error bars are bootstrapped 95% confidence intervals. P-values are obtained from a Tukey's HSD test. www.nature.com/scientificreports www.nature.com/scientificreports/ participants, two H/M participants, or one of each. Further, we classified their experience defined as the mean of years of formal musical training and years of actively playing per participant. We compared these values between the clusters, in addition to the synchronization index and tapping variability. Only instrument and dyad pairing yielded a statistically significant difference (see Table 2).

Discussion
In cases of a non-shared RCM, participants performed significantly better at the end of the trials than at the start. This pattern was not seen in either of the conditions with an equal RCM. This indicates that the internal predictive models impact synchronization when performing joint finger tapping, even though the intended motor outputs in the different conditions are the same. In addition, we identified a subgroup of our participants exhibiting an Figure 6. Results from clustering the cross-correlations. A dendrogram of the resulting clusters is shown in (a), with colour indicating significant differences at α < 0.05. In (b), the averaged correlation coefficients for lag −1, lag 0, and lag +1 is shown for each of the clusters, with standard error of the mean error bars. Clusters 2 and 3 can be described as mutual adaptation. Cluster 1 is an example with no clear pattern of interaction, that may come from both participants attempting to lead, causing a leading-leading pattern. Interestingly, this was found primarily in dyads with drummers. (2019) 9:11048 | https://doi.org/10.1038/s41598-019-47471-3 www.nature.com/scientificreports www.nature.com/scientificreports/ interaction pattern not reported in coupled interactions before -with very low correlation coefficients across lags −1, 0, and +1, as if they did not interact much with each other, while exhibiting stable tapping behaviour. The subgroup consisted primarily of drummers paired with drummers. Earlier studies primarily report two synchronization strategies in coupled conditions, namely mutual adaptation and leading-following 2,4,34 . We observed a third possible strategy we term leading-leading, wherein synchronization occurs without any apparent dyadic interaction.
Spontaneous synchronization is thought to occur due to a merging of self-other representations, wherein an action-perception loop is created such that one person's action output becomes another's perceptual input, and vice-versa 4,19,43 . However, explicit synchronization as encountered in our study may be a more complex phenomenon. Musical rhythmic interaction relies on both maintaining the tempo and synchronizing with other performers, and thus participants may choose to weight these two aspects differently. In our study, we did not find any differences in inter-tap intervals between the bidirectional conditions. The participants performed at a median ITI of 497.34 ms, with an IQR of 490.7 ms to 510.1 ms. This is consistent with research on the just-noticeable differences in tempo sensitivity, which has shown a detection threshold of around 2% difference in sequences at 500 ms 44 , and suggests that the RCM did not significantly influence our participants' ability to maintain tempo.
While tempo remained relatively stationary, we found differences in synchronization measures. The dyads in our experiment exhibited an increase in SI in the trials with a non-shared RCM. This effect was significant when comparing the first 25% of taps in a trial with the third and fourth 25% of taps. While there was a trend towards an increase in SI in the second 25% of taps compared to the first 25%, this did not reach statistical significance. This effect was not found in neither condition with a shared RCM. A possible explanation here is that the participants start out by keeping their two distinct RCMs, but after recognizing that synchronization is impaired, they abandon their individual RCM and default to a following behaviour. This then leads to an interaction that, as a whole, does not exhibit a large negative effect on synchronization measures. When looking at the entire trials we found that condition A + B was in fact in between condition A and condition B in synchronization measures. The dyads achieved an average SI of 0.955 in condition A + B, as compared to 0.962 in condition A and 0.949 in condition B. Hence, the musicians in our study appear to be resilient to the effects of holding a non-shared RCM.
A plausible explanation of this observed behaviour is found in the brain's tendency for computational optimization. Viewed from the framework of predictive coding under the free energy principle, the brain constantly attempts to optimize its use of energy by minimizing the error between its predictions and the sensory input 45 . Thus, when humans are interacting, spontaneous synchronization may occur as a result of individuals minimizing the differences in self/other representations 19 . In our study, the decreased SI seen at the start of the non-shared RCM trials may represent the participants individually searching for a magnitude of variability wherein they are able to minimize the difference between self and other representation. Our data then suggests that explicit synchronization such as rhythmic joint action may share many traits with spontaneous synchronization.
Previous studies on joint finger tapping in a bidirectionally coupled setting predominantly report mutual adaptation as measured by cross-correlations 4,34 . Similar patterns of interaction are also found in experiments studying, for instance, imitative hand movements 46 , in target directed tapping tasks 47 , and in piano performance 48 . This type of behaviour, wherein participants constantly adapt to each other, is proposed to be the most energy efficient way of interpersonal coordination 19 .
Consistent with this, when analysing the lag coefficient of our participants we found that two out of three clusters showed patterns in line with mutual adaptation. These two clusters included the majority of our dyads and differed only in strength of the correlation while retaining the same pattern. However, we found a third cluster which exhibited a lag pattern not indicative of either leading-following nor mutual adaptation (see Fig. 5b). The pattern, consisting of weak but positive correlations at all three lags, shows that there are little to no systematic interaction between the dyad members. Yet, the participants still achieve synchronized behaviour. A likely reason for this type of pattern to occur is if both participants are attempting to lead, and hence do not adapt to each other. This type of leading-leading-pattern is, to our knowledge, only previously observed in uncoupled interactions 4 .
The cluster found in our data did not show significant differences from the other two in terms of experience or synchronization measures, however, it did have a significantly higher occurrence of drummers and a higher occurrence of dyads consisting of drummers paired with drummers. In ensembles, and particularly in rhythmic  www.nature.com/scientificreports www.nature.com/scientificreports/ bands, drummers are usually expected to be the main timekeeper. They usually produce rhythmic patterns consisting of multiple actions on differing metrical levels, as exemplified by the archetypical four-on-the-floor rhythm found in disco, pop, EDM, and rock. This pattern exists on a 4/4 meter, with the bass drum being hit every beat, the snare every other beat, and the hi-hat usually every half beat. To play such a rhythm a higher internal stability is needed than in for instance more melodic or monophonic instruments such as the voice, or the saxophone. Thus, one may argue that drummers are trained to be less metrically adaptive than other musicians.
It should, however, be noted that the cluster found in our data is relatively small, and individual differences or social factors could also play a large role in the observed data. For instance, synchronization is reduced when performing a repetitive task with a partner that arrives late 49 . As the two other clusters in our data also contained drummers, and dyads consisting of drummers paired with drummers, we would not generalize this effect to all drummers, but rather posit that this lag pattern may come about when two musicians that both are highly confident in their own internal RCM interact. To further explore the factors underlying if this type of interaction a follow-up study is necessary.
It is well known that musicians' skills depend on instrument, level of expertise, and which genre of music they predominantly perform in [50][51][52][53] . On a neuronal level, these differences may manifest in low-level processing such as the mismatch negativity (MMN), a pre-attentive component of the auditory event-related potential considered to be a prediction error signal 54,55 . EEG studies have shown differences in sensitivity and amplitudes in MMN responses between musicians belonging to different genres 53,56 . On a behavioural level, drummers have been shown to be better at certain rhythmical tasks, yet the extent of these differences between types of musicians are not yet fully understood [57][58][59] . It might well be that these differences do not fully manifest in the common single-person experiments typically used in research on rhythm perception and production. Interacting with a computer-generated pacing signal, even if such signal is adaptive, does not fully capture the intricacies of joint action. Our study is one of the first to show that instrument-specific differences on an individual level may also impact synchronization strategies in interpersonal interaction.

conclusion.
We tested the effect of dyads having a shared or non-shared rhythmic context model on synchronization in a joint finger tapping task. Having a non-shared model resulted in impaired synchronization at the start of a task, but the dyads recovered quickly. This suggests that in rhythmic joint action, musicians are able to efficiently adapt their own predictive models in order to facilitate interaction. Most dyads exhibited lag coefficient patterns indicative of mutual adaptation, except for a subset considering predominantly of drummers which showed a novel leading-leading-pattern suggesting little to no dyadic interaction. We find that a possible explanation for this is that the drummer's role in normal musical performance may differ from many other instruments. Drummers often serve the role of timekeeper and motorically require a higher degree of internal synchronization of motion. The finding is of high interest, as it complements the strategies of leading-following or mutual adaptation in synchronization behaviour, by providing a third alternative wherein synchronization occurs without any apparent dyadic interaction. Further, it emphasises that differences between individual musicians, such as which instrument they play, also affect interpersonal synchronization strategies.

Data Availability
The data reported in this paper is available upon request. Due to data protection laws, a data processing agreement may be required.