We thank Patrick Savage and Shinya Fujii for their highly relevant comment about our recent Review (Music in the brain. Nat. Rev. Neurosci. 23, 287–305 (2022))1, in which they extend the predictive coding of music (PCM) framework to encompass the perception of music — and music listeners — from cultures beyond the Western tradition (Towards a cross-cultural framework for predictive coding of music. Nat. Rev. Neurosci. https://doi.org/10.1038/s41583-022-00622-4 (2022))2. As Savage and Fujii rightly point out, there are music genres outside the Western tradition that do not include harmony — often being based on musical modes other than the major and minor modes — and music pieces that are non-isochronous or unmetered.

One key offering of the PCM account is that it explains music perception (and action, emotion and learning) as guided by the brain’s real-time generative model. This model relies on cultural background (and thereby on experience-dependent learning), musical competence, the context in which we experience music and our current brain state (including attentional set and emotional states), as well as individual traits and innate biological factors. It is important to note that the musical percept is not necessarily tied to the auditory input3,4. The percept (that is, posterior beliefs) is the product of belief updating under a hierarchical generative model, which may differ fundamentally among listeners. A key example in our Review is that certain musical excerpts may be heard with different metres or different tonalities, depending on the musical priors (see Fig. 3 in the Review), and this experience can be manipulated by priming (that is, changing prior beliefs)5,6. Accordingly, the PCM model explains how growing up in a certain musical culture profoundly influences our experience of music by shaping the predictive frameworks that underlie perception, action, affect and learning.
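As a toy illustration of this point, the sketch below applies Bayes' rule to a single, metrically ambiguous input under two different sets of prior beliefs. It is a minimal sketch and not an implementation of the hierarchical generative model discussed in the Review: the two candidate metres, the listener names and all probability values are hypothetical, chosen only to show that identical auditory input can yield different percepts when priors differ.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypotheses (here, candidate metres)."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())
    return {h: p / evidence for h, p in unnormalised.items()}


# Hypothetical likelihood: the rhythm fits both metres equally well (ambiguous input).
likelihood = {"duple": 0.5, "triple": 0.5}

# Hypothetical priors for two listeners with different musical backgrounds.
listener_a_prior = {"duple": 0.8, "triple": 0.2}
listener_b_prior = {"duple": 0.3, "triple": 0.7}

post_a = posterior(listener_a_prior, likelihood)
post_b = posterior(listener_b_prior, likelihood)

# The "percept" is taken here to be the metre with the highest posterior belief.
percept_a = max(post_a, key=post_a.get)
percept_b = max(post_b, key=post_b.get)

print(percept_a, percept_b)
```

Because the input is uninformative in this sketch, each listener's posterior simply follows their prior; priming would correspond to changing the prior dictionary before the update.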

We agree that Western-based harmony is a special case of the more general phenomenon of tonality. As we note in our Review: “The experience of music is therefore intimately linked to brain-bound predictive models: for example, tonality … and metre”1. We included a discussion of harmony as a well-researched example of how music perception may be subject to a statistically learned musical grammar, for listeners from a Western culture. Furthermore, the statistical learning processes involved in harmony or tonality, and thereby the principles that underwrite predictive processing, have been generalized beyond musical cultures through behavioural and scanning studies using artificial tonal systems and grammars7,8,9,10,11.

For many musical genres (including contemporary styles of music with roots in African music that are now considered Western, such as modal jazz), it would make sense to exemplify PCM by tonality. However, tonality may not even be an endpoint prediction for melody (or harmony), as it is intertwined with rhythmic predictions12.

As Savage and Fujii correctly point out2, there is a need for neuroscientific studies of music involving stimuli and listeners with a non-Western background. Clearly, we do not fully understand the predictive coding involved in the processing of non-isochronous and unmetered music. An obvious experiment would be to examine the neural correlates of temporal violations in such music in encultured listeners, for example using the mismatch negativity recorded by electroencephalography or magnetoencephalography. The PCM model predicts that the mismatch negativity in response to violations of such temporal predictions would have a larger amplitude and a shorter latency in encultured listeners than in Western listeners.
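The amplitude part of this prediction can be illustrated with a precision-weighted prediction error, a standard quantity in predictive coding accounts. The sketch below is a toy under stated assumptions and not a model of the mismatch negativity: the function name, the onset times and the precision values are all hypothetical, and the sketch says nothing about latency. It assumes only that encultured listeners hold more precise temporal predictions for the musical style in question.

```python
def weighted_prediction_error(expected_onset_ms, observed_onset_ms, precision):
    """Absolute timing error scaled by the precision (confidence) of the prediction."""
    return precision * abs(observed_onset_ms - expected_onset_ms)


# Hypothetical temporal violation: a note arrives 60 ms later than predicted.
violation = {"expected_onset_ms": 500.0, "observed_onset_ms": 560.0}

# Assumed precisions: higher for a listener encultured in the style (values are illustrative).
encultured_error = weighted_prediction_error(**violation, precision=1.0)
non_encultured_error = weighted_prediction_error(**violation, precision=0.25)

print(encultured_error, non_encultured_error)
```

Under these assumptions the same physical deviation produces a larger weighted error for the encultured listener; whether this maps onto a larger mismatch negativity is exactly what the proposed experiment would test.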

We very much look forward to seeing and evaluating evidence from empirical neuroscientific investigations within the exciting field of cross-cultural neuroscience of music; it is an ideal way to probe and expand the compass of the PCM framework to instantiations of music from a breadth of cultures.