Introduction

Prosody, often referred to as the music of speech, is defined as the organizational structure of speech, including linguistic functions such as tone, intonation, stress, and rhythm (Gussenhoven and Chen, 2020; Ladd, 2008). It has been well-established that prosody plays a key role in sentence processing in both L1 (first or native language) and L2 (second or non-native languages), including lexical activation and segmentation (e.g., Cutler and Butterfield, 1992; Cutler and Norris, 1988; Norris et al., 2006; Sanders et al., 2002), syntactic parsing (e.g., Cole et al., 2010a; Frazier et al., 2006; Hwang and Schafer, 2006; Ip and Cutler, 2022; Lee and Watson, 2011; O’Brien et al., 2014; Roncaglia-Denissen et al., 2014; Schafer et al., 2000), information structure marking (e.g., Birch and Clifton, 1995; Breen et al., 2010; Calhoun, 2010; Clopper and Tonhauser, 2013; Katz and Selkirk, 2011; Kügler and Calhoun, 2020; Namjoshi and Tremblay, 2014; Steedman, 2000; Welby, 2003; Xu, 1999), and pragmatic information signaling such as speech attitudes, acts and emotion (e.g., Braun et al., 2019; Lin et al., 2020; Pell et al., 2011; Prieto, 2015; Repp, 2020).

Prosody has been investigated extensively given its significant status in language processing and its highly interdisciplinary nature involving linguistics, psychology, cognitive science, and computer science, especially since the publication of two early reviews: Shattuck-Hufnagel and Turk (1996) and Cutler et al. (1997). Over a decade later, further publications provided comprehensive, state-of-the-art overviews of the theoretical and experimental advances in prosody research (Cole, 2015; Wagner and Watson, 2010). However, to the best of our knowledge, no bibliometric overview of prosody research has been conducted to offer a better understanding of how research in this area has evolved and where the boundaries of prosody research might be pushed in the future.

The present study used a bibliometric approach which was initially used in library and information sciences for the analysis and classification of bibliographic material by sketching representative summaries of the extant literature (Broadus, 1987; Pritchard, 1969). Based on a large volume of bibliometric information, mathematical and statistical methods in bibliometric analysis make it possible to extract patterns that reveal the characteristics of publications in a specific discipline. In addition, with the assistance of network mapping techniques, bibliometric analysis can also be used to visualize the state of the intellectual structure of a specific research topic or field. In this study, we have used such an approach to perform co-citation analysis and keyword analysis to review publications on prosody in linguistic journals. Co-citation is a measure that gauges the connection between frequently referenced documents, with the intensity of co-citation determined by the frequency at which two documents have been jointly cited (Small and Sweeney, 1985). Co-citation analysis is important in bibliometric studies as “co-citation identifies relationships between papers which are regarded as important by authors in the specialty, but which are not identified by such techniques as bibliographic coupling or direct citation” (Small and Sweeney, 1985, p. 19). Keyword analysis involves comparing the frequency of keywords in different periods, to identify significant changes to the key topics which is helpful in predicting the emerging trends of a research field (e.g., Lee, 2023; Lei and Liu, 2019a; Zhang, 2019).

Bibliometric analysis has been widely used in different areas of linguistics. For instance, Zhang (2019) used this method to examine the field of second language acquisition (SLA); Lei and Liu (2019a) conducted a bibliometric analysis of the 2005–2016 research trends in applied linguistics; and Fu et al. (2021) employed this approach to analyze the evolution of the visual word recognition literature between 1999 and 2018. Since no bibliometric analysis has been conducted on prosody in linguistics, the present study takes a bibliometric approach to describe the intellectual structure and the emerging trends of research on prosody in linguistics. The following research questions are addressed:

  1. What is the research productivity of linguistic journals on prosody?

  2. What is the intellectual structure in the field of prosody in terms of influential authors, references, and venues of publications?

  3. What are the research trends of works on prosody in linguistics?

Methodology

Data

The bibliometric data used in this study were retrieved from Web of Science (henceforth WoS) on 14 June 2022. We chose WoS for three reasons. First, WoS is a more widely used library resource than other databases such as Scopus and Google Scholar. For instance, the number of WoS subscribers is twice that of Scopus (Zhang, 2019). Second, WoS provides only academic citations. That is, compared to databases such as Google Scholar, which mixes academic and non-academic citations, WoS is more appropriate for calculating the scholarly value of publications. Third, the availability of co-citation information in WoS makes it possible to conduct co-citation analysis, one of the main bibliometric methods used in this study. As for the search terms, the present study used a combination of “prosod*” (the asterisk (*) is a truncation wildcard that matches zero or more characters, so that “prosod*” retrieves, e.g., “prosody” and “prosodic”), “autosegmental-metrical”, “metrical structure”, “accent”, “intonation*”, “stress”, “suprasegment*”, “F0” (fundamental frequency), “rhythm”, and “pitch” as the search query, with the terms joined by the Boolean OR operator. Moreover, the Boolean NOT operator was used to exclude research on “semantic prosody” (Footnote 1). The timeframe for the search was from January 2001 to December 2021.

Since the present study focuses on prosody research in linguistic journals, English-language research articles (excluding book reviews, editorial reviews, etc.) published in high-quality journals in the field of linguistics were included. Only published research articles were included to guarantee the quality and reliability of the publications under a strict quality control mechanism such as peer review (Zhu and Lei, 2021). As for the selection of high-quality journals, the current study chose SSCI-indexed (Footnote 2) international journals in the field of linguistics for two reasons. First, those journals have rigorous peer-review processes. Second, most SSCI-indexed journals are accessible to academia worldwide. More than 200 SSCI-indexed journals in the field of linguistics have published research articles on prosody. However, some of those journals published fewer than three prosody-related articles in the past 21 years. A cut-off point of 30 articles per journal in the past 21 years was set to ensure that the majority of linguistic journals that published research on prosody were included for analysis. The cut-off point of 30 was set for two reasons. First, with this criterion, publications in the included journals cover more than 70% of the total publications. Second, as a rule of thumb, 30 papers per journal ensured a sufficient number of data points for robust statistical analysis and kept the focus on journals with a more substantial presence in the field of prosody. The list of journals included in and excluded from the analysis can be found in the supplementary data.
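The cut-off criterion can be illustrated with a minimal sketch. The journal names and article counts below are hypothetical placeholders, not the study's data, and `select_journals` is an illustrative helper rather than part of any bibliometric tool.

```python
def select_journals(article_counts, cutoff=30):
    """Keep journals with at least `cutoff` articles and report their coverage.

    Returns the retained journals and the share of all articles they contain.
    """
    kept = {journal: n for journal, n in article_counts.items() if n >= cutoff}
    total = sum(article_counts.values())
    coverage = sum(kept.values()) / total if total else 0.0
    return kept, coverage


# Hypothetical counts, for illustration only (not the study's data).
counts = {
    "Journal A": 420,
    "Journal B": 95,
    "Journal C": 30,
    "Journal D": 12,
    "Journal E": 2,
}
kept, coverage = select_journals(counts)
print(sorted(kept), f"{coverage:.1%}")
```

With real data, the coverage value would be the figure to compare against the 70% criterion described above.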

Data cleaning

To avoid coding errors, data cleaning was performed using the measures proposed by Zhang (2019). Specifically, if different author names were used to refer to the same author, they were recoded to one unique version. For instance, “Elisabeth O. Selkirk”, “Selkirk, E. O.”, “Selkirk, EO”, “Selkirk, E.”, and “Selkirk, E” were all recoded as “Selkirk, E.”. Similarly, different keywords used to refer to the same concept were also recoded. For instance, “Event-related Potential”, “Event-related Potential (ERP)”, and “ERP” were all recoded as “Event-related Potential”. In addition, singular and plural forms of the same concept were identified and recoded as one. For instance, “boundary tone” and “boundary tones” referred to the same concept, hence all instances of “boundary tones” were recoded as “boundary tone”. However, keywords that merely resemble each other were not recoded as the same, since their meanings can differ. For instance, “bilinguals” and “bilingualism” were kept as separate keywords since the former refers to people who speak two or more languages while the latter denotes an ability of people or a characteristic of a community.
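A minimal sketch of this kind of recoding is given below. The lookup tables contain only the examples mentioned above; the function names and the decision to lower-case keywords before matching are assumptions made for illustration, not the procedure of Zhang (2019).

```python
# Hypothetical recoding tables built from the examples in the text.
AUTHOR_ALIASES = {
    "Elisabeth O. Selkirk": "Selkirk, E.",
    "Selkirk, E. O.": "Selkirk, E.",
    "Selkirk, EO": "Selkirk, E.",
    "Selkirk, E": "Selkirk, E.",
}
KEYWORD_ALIASES = {
    "event-related potential (erp)": "event-related potential",
    "erp": "event-related potential",
    "boundary tones": "boundary tone",
}


def clean_author(name):
    """Map a known name variant to its canonical form; leave others unchanged."""
    name = name.strip()
    return AUTHOR_ALIASES.get(name, name)


def clean_keyword(keyword):
    """Lower-case, collapse whitespace, then map known variants to one form.

    Only variants listed in KEYWORD_ALIASES are merged, so similar but
    distinct keywords (e.g. "bilinguals" vs. "bilingualism") stay separate.
    """
    key = " ".join(keyword.lower().split())
    return KEYWORD_ALIASES.get(key, key)


print(clean_author("Selkirk, EO"))
print(clean_keyword("Event-related Potential (ERP)"))
print(clean_keyword("bilinguals"), clean_keyword("bilingualism"))
```

An explicit lookup table, rather than automatic stemming, keeps the analyst in control of which forms are merged, which matters for pairs such as “bilinguals” and “bilingualism”.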

Data analysis

In this study, co-citation analysis and keyword analysis were performed. The data, which spanned 21 years, were divided into three periods (i.e., 2001–2007, 2008–2014, and 2015–2021), and the results of the two bibliometric analyses in the three periods were compared with each other to reveal important changes over the last 21 years.

Co-citation analysis assumes that if publications are frequently cited together, they probably share similar themes (Hjørland, 2013). This technique has frequently been used in previous bibliometric studies to reveal the intellectual structure of a particular research field (Rossetto et al., 2018; Zhang, 2019). Based on the references in the surveyed articles (i.e., papers that are cited by publications retrieved for the present study), the co-citation network links two publications if they are co-cited by a third publication. The greatest strength of co-citation analysis is that, apart from identifying the most influential authors, references, and venues of publications, it is also capable of discovering thematic clusters. It should be noted that clusters in the present bibliometric study are groups or sets of closely related items. Co-cited items fall into the same cluster through the clustering techniques in VOSviewer (detailed information can be found in van Eck and Waltman, 2010).
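The counting step behind a co-citation network can be sketched as follows. The reference lists are hypothetical, and the sketch covers only pairwise co-citation counts; it does not reproduce the layout or clustering that VOSviewer applies to those counts.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited by the same article.

    `reference_lists` holds one list of cited references per citing article.
    Pairs are stored as alphabetically sorted tuples so (A, B) equals (B, A).
    """
    counts = Counter()
    for refs in reference_lists:
        # set() guards against a reference listed twice in one article.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing articles, for illustration only.
papers = [
    ["Ladd 2008", "Selkirk 1984", "Nespor and Vogel 1986"],
    ["Ladd 2008", "Selkirk 1984"],
    ["Ladd 2008", "Baayen et al. 2008"],
]
counts = cocitation_counts(papers)
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy example, the pair cited together by two articles receives the strongest link, which is the sense in which co-citation frequency measures relatedness.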

The prosody-related articles published in linguistic journals between 2001 and 2021 cited more than 50,000 unique references. It would be impossible to interpret such a massive number of nodes if all the cited references were included in a network map. Therefore, when constructing the network maps in VOSviewer (van Eck and Waltman, 2017), we followed Zhang (2019) and set the cut-off at the value that retained the top 50 most-cited items in each map, thereby restricting the number of nodes.
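One way to derive such a cut-off value is sketched below. The helper names and the toy counts are hypothetical; in practice the value would be entered as a minimum-citations setting in the mapping software.

```python
def min_citation_threshold(citation_counts, top_n=50):
    """Return the citation count of the item ranked `top_n`.

    Using this value as a minimum keeps at least the `top_n` most-cited
    items (all items, if there are fewer than `top_n`).
    """
    ranked = sorted(citation_counts.values(), reverse=True)
    if not ranked:
        return 0
    return ranked[min(top_n, len(ranked)) - 1]


def items_at_or_above(citation_counts, threshold):
    """Select every item whose citation count meets the threshold."""
    return {k: v for k, v in citation_counts.items() if v >= threshold}


# Hypothetical citation counts, for illustration only.
toy_counts = {"ref_a": 9, "ref_b": 7, "ref_c": 7, "ref_d": 3, "ref_e": 1}
threshold = min_citation_threshold(toy_counts, top_n=2)
selected = items_at_or_above(toy_counts, threshold)
print(threshold, sorted(selected))
```

Because of ties at the threshold, the selection can contain slightly more than `top_n` items, as the toy example shows.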

Keyword analysis was used to identify important topics in the retrieved publications in each period, and a cross-period comparison of the frequencies of those topics was conducted to determine whether significant diachronic changes existed. The analysis involved four steps. First, author-supplied keywords and keywords-from-abstracts (Footnote 3) were retrieved from each article. Keywords extracted from abstracts supplemented the author-supplied keywords, compensating for papers that specified no keywords or only a restricted set. The approach for extracting these keywords from abstracts was adapted from Zhang (2019). Second, the raw frequency of each keyword was computed. Third, the raw frequencies of each topic were normalized for the statistical test in the next step. Normalization was a prerequisite for a valid comparison because the number of publications differed substantially across the three periods. We adopted the method proposed by Lei and Liu (2019b): for example, the normalized frequency of an author-supplied keyword in a period was calculated as normalized frequency = (raw frequency in that period / total number of publications in that period) × 10,000. Last, a one-way chi-square test of the normalized frequencies of each topic in the three periods was conducted to identify significant cross-period differences between the research topics.
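The normalization and the test can be sketched as below. The raw counts and publication totals are hypothetical, and the sketch assumes that the one-way chi-square test compares the three normalized frequencies against equal expected values; the source does not spell out the expected frequencies. With three periods the test has two degrees of freedom, for which the p-value has the closed form exp(−χ²/2), so no statistics library is needed.

```python
import math


def normalized_frequency(raw, n_publications, scale=10_000):
    """Normalized frequency = raw / total publications in the period * scale."""
    return raw / n_publications * scale


def chi_square_three_periods(frequencies):
    """One-way chi-square test against equal expected frequencies.

    Assumes exactly three periods (df = 2), where the chi-square
    survival function is exactly exp(-statistic / 2).
    """
    if len(frequencies) != 3:
        raise ValueError("expected one frequency per period (three in total)")
    expected = sum(frequencies) / 3
    if expected == 0:
        return 0.0, 1.0
    statistic = sum((f - expected) ** 2 / expected for f in frequencies)
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Hypothetical keyword counts and publication totals, for illustration only.
raw_counts = [5, 20, 45]
publications = [1000, 2000, 3000]
normed = [normalized_frequency(r, n) for r, n in zip(raw_counts, publications)]
statistic, p_value = chi_square_three_periods(normed)
print(normed, statistic, p_value)
```

A keyword would be reported as having changed significantly when the resulting p-value falls below 0.05.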

Results and discussion

In this section, the information about the productivity, authors, and affiliations of the retrieved publications is presented, followed by a co-citation analysis to visualize the intellectual structures in terms of influential publications, references, and authors, as well as the keyword analysis that could facilitate the identification of prominent topics in the field.

Annual volume of publications, authors, and their affiliations

A total of 4598 publications on prosody in the SSCI-indexed key linguistic journals were retrieved. Figure 1 shows the annual productivity of prosody articles in linguistic journals. From the early 2000s, the number of publications exhibited an upward trend and has remained above 300 per year since 2019 (see Fig. 1). A dip in the number of publications in 2020 can be observed, likely due to the COVID-19 pandemic, which slowed the research and publication process across the board.

Fig. 1 Distribution of prosody articles from 2001 to 2021.

Table 1 shows the 25 most prolific authors, regions, and institutions (authors’ affiliations) with which the publications were associated. In terms of authors, 58 published 10 or more articles, accounting for only 0.81% of the total number of authors, while 5279 authors published a single article. Regarding regions, the USA, Germany, England, Canada, the Netherlands, Australia, France, China, Spain, and Japan (in descending order of involvement in prosody-related publications) made up the top ten, and each was associated with more than 100 publications. As for authors’ affiliations, the top five most prolific institutions published more than 100 articles. It is not surprising that the most prolific authors were closely associated with the most prolific regions and institutions.

Table 1 The most prolific authors, regions and institutions/authors’ affiliations in prosody research from 2001 to 2021.

Co-citation analysis and network mapping

Top-cited sources of publications

Figures 2–4 show the network maps of the top 50 most-cited sources in the three periods (2001–2007, 2008–2014, 2015–2021), respectively, using the smart local moving algorithm (Waltman and van Eck, 2013). The term “sources” denotes the academic journals or books in which the references have been published. A companion density view is provided to illustrate the most-cited sources of publications in the respective period. The network maps show the major clusters of the top 50 most-cited sources in the three periods.

Fig. 2 Network map of the most-cited sources of publications (2001–2007).

Fig. 3 Network map of the most-cited sources of publications (2008–2014).

Fig. 4 Network map of the most-cited sources of publications (2015–2021).

Firstly, as shown in Table 2, the Journal of the Acoustical Society of America and the Journal of Phonetics have remained the top two most-cited journals across the three periods. The number of citations of the two journals has increased sharply across the three periods (2329 citations in the first period, 6249 in the second, and 9978 in the third). Five other journals (i.e., Journal of Memory and Language, Journal of Speech, Language, and Hearing Research, Language and Speech, Cognition, and Phonetica) have always remained among the top 10 most-cited sources. These journals publish work on the production and comprehension of language and speech (including prosody), serving as valuable venues for novice researchers pursuing research in this area. It is also worth noting that ‘thesis’ is one of the most-cited sources of publications, probably because certain doctoral theses (e.g., Rooth, 1985) by influential experts on prosody have received substantial and continuous attention over the years.

Table 2 The top 10 sources of publications in the three periods.

These network maps indicate not only sub-areas of prosody research but also an interesting merging and splitting of research areas across the three periods. The first period (Fig. 2) shows five major clusters (Footnote 4) representing five main areas in prosody. From left to right, the first relates to linguistic investigation (e.g., Journal of Phonetics, Journal of the Acoustical Society of America, Language and Speech); the second, a small cluster at the top, relates to L2 learning (e.g., TESOL Quarterly, Language Learning); the third cluster, at the bottom, concerns psycholinguistic aspects (e.g., Journal of Memory and Language, Cognition, Language and Cognitive Processes); the fourth, a widely spread cluster at the top right of the map, covers language development and language disorders (e.g., Journal of Child Language and Journal of Speech and Hearing Research); the last cluster, located at the bottom right of the map, represents neurolinguistic research on prosody (e.g., Brain and Language, Neuropsychologia). The clusters in the second (Fig. 3) and third periods (Fig. 4) are similar to those in the first period (Fig. 2). However, several changes are worth noting. For example, the second period witnessed a merger of psycholinguistic and neurolinguistic journals (top), which then became the largest cluster of all, dominating the whole map. In addition, the third period again separated the experimental approach from the formal/theoretical approach (e.g., Laboratory Phonology vs. Linguistic Inquiry). Consistent with the increasing number of references and authors in the L2 research area identified in the sections below, the L2 prosody cluster of the network map expanded slightly from the 2001–2007 period (top of Fig. 2) to the 2015–2021 period (top of Fig. 4).

The highly influential references

Figures 5–7 show the network maps of the top 50 most-cited references in the three periods (2001–2007, 2008–2014, and 2015–2021), respectively. The network map in Fig. 5 shows four major clusters of the top 50 (out of 24,383) most-cited references that were cited 16 times or more between 2001 and 2007. Figure 6 has five clusters of the top 50 (out of 51,107) cited references that were cited 37 times or more. Figure 7 represents four clusters of the top 50 (out of 82,660) cited references that were cited at least 46 times.

Fig. 5 Network map of the most-cited references (2001–2007).

Fig. 6 Network map of the most-cited references (2008–2014).

Fig. 7 Network map of the most-cited references (2015–2021).

Fifteen references appeared on the top 50 list across all three periods, with Beckman and Pierrehumbert (1986), Chomsky and Halle (1968), Hayes (1995), Ladd (1996; 2008), Pierrehumbert (1990), Selkirk (1984), and Nespor and Vogel (1986) remaining in the top 20 throughout (the top 10 references are listed in Table 3). Five publications by Cutler and colleagues (1986; 1987; 1988; 1992; 1997) and four publications by Ladd (1996, 2008; Ladd et al., 1999, 2000) were on the list. Two clusters led by Ladd (1996/2008) (Footnote 5) and Nespor and Vogel (1986), representing intonational phonology and prosodic phonology respectively, were among the top three most-cited references between 2001 and 2007 and between 2008 and 2014, and continued to be widely cited between 2015 and 2021, with Ladd ranking 4th and Nespor and Vogel (1986) ranking 8th. Another important reference in the same cluster as Ladd across the three periods is Pierrehumbert and Beckman (1988), whose work is associated with the “Autosegmental-Metrical” (AM) approach that describes prosody on autonomous tiers for metrical structure and tones. The same cluster in the third period (Fig. 7) also covered publications on information structure (e.g., Rooth, 1992) and the use of prosody in marking information structure (Breen et al., 2010). These publications first appeared on the top 50 list only between 2015 and 2021. A possible reason is the recent interest in the acoustic realization of focus, as well as in testing, in a range of languages, the Roothian theory that focus indicates the presence of alternatives that are relevant for the interpretation of discourse (e.g., Braun et al., 2019; Braun and Tagliapietra, 2010; Gotzner, 2017; Repp and Spalek, 2021; Spalek et al., 2014; Tjuka et al., 2020; Yan and Calhoun, 2019; Yan et al., 2023).

Table 3 The top 10 influential references in the three periods.

The largest cluster is located on the right side of the 2015–2021 map (Fig. 7), and it appears to be the only cluster that emerged in the last period. It concerns statistical methods and tools, such as mixed-effects modeling (MEM) with crossed random effects for subjects and items (Baayen et al., 2008) and logit mixed models (Jaeger, 2008) (Footnote 6). This newly emerged cluster indicates the importance of applying state-of-the-art statistics in prosody research. These two references have been influential in motivating researchers, especially psycholinguists and cognitive psychologists, to switch from ANOVA to MEM, with the latter now being the dominant type of analysis. Some of the most-cited references in this cluster concern tools commonly used in prosodic research and analysis, such as R (R Core Team, 2017) and Praat (Boersma and Weenink, 2018) (Footnote 7). Others focus on model fitting procedures, e.g., parsimonious mixed models (Bates et al., 2015) and ‘maximal’ models (Barr et al., 2013). Although Baayen et al. (2008) was already cited 56 times between 2008 and 2014, ranking 14th, its citations more than doubled to 118 in the most recent period, ranking 5th between 2015 and 2021, with Bates et al. (2015) and Barr et al. (2013) ranking second and third with 287 and 177 citations, respectively.

The most influential authors

Table 4 shows the top 50 most-cited authors across the three periods. It is not surprising that some authors of the most-cited references discussed in the previous section are also the most-cited authors overall (e.g., D.R. Ladd, M. Beckman, J. Pierrehumbert, E. Selkirk). Twenty-one of the top 50 authors remained very influential across the three periods, among whom nine ranked in the top 20 in all three periods (i.e., A. Cutler, J. Flege, C. Gussenhoven, D.R. Ladd, M.J. Munro, J. Pierrehumbert, E. Selkirk, L.D. Shriberg, Y. Xu). A. Cutler and J. Flege remained among the top five most-cited authors across all three periods.

Table 4 The top 50 influential authors in the three periods.

With the trend of applying mixed-effects models using R in prosody research, Bates et al. (2015), Barr et al. (2013), Baayen et al. (2008), and R Core Team (2017) entered the most-cited authors’ list in the third period, i.e., 2015–2021 (see the middle cluster in Fig. 10, network map of the most-cited authors). Among the many researchers who became influential authors, S.A. Jun joined the bottom right cluster (see Fig. 10), and the other influential authors in this cluster remained highly cited across the three periods (i.e., Y. Xu, L.D. Shriberg, J. Pierrehumbert, C. Gussenhoven, D.R. Ladd, P. Prieto, A. Arvaniti, M. Beckman, and E. Grabe). Notably, at the bottom of the 2001–2007 map (Fig. 8, network map of the most-cited authors) is the smallest cluster, represented by J. Flege and M.J. Munro, most likely the L2 prosody cluster. The cluster expanded continuously across the three periods (left side of the 2008–2014 map, Fig. 9, and of the 2015–2021 map, Fig. 10) and was joined by researchers in similar fields such as T.M. Derwing, P.K. Kuhl, and K. Saito. It is interesting to note that some researchers are notably prolific within specific areas, such as Flege and Munro in the realm of L2 prosody, while others, like Ladd and Pierrehumbert, hold influence across the broader spectrum of the field. This divergence could probably be detected through cluster analysis. For instance, the former might have citations concentrated within a single cluster, while the latter could be cited across multiple clusters (see Fig. 10).

Fig. 8 Network map of the most-cited authors (2001–2007).

Fig. 9 Network map of the most-cited authors (2008–2014).

Fig. 10 Network map of the most-cited authors (2015–2021).

Keyword analysis

Keywords that occurred 10 times or more in the retrieved publications across the three periods were submitted to a chi-square analysis to test for significant changes across the three periods. This resulted in a total of 207 author-supplied keywords (out of 7269, 2.85%) and 37 keywords-from-abstracts (out of 821, 4.50%). The cut-off point of 10 was chosen because we observed that the p-values of nearly all keywords with frequencies below 10 were larger than 0.05, indicating that the frequencies of these keywords remained stable across all three periods and did not undergo significant changes.

The results revealed that 61 keywords experienced a significant change in frequency (p < 0.05) and the other keywords showed no significant change (p ≥ 0.05) across the three periods. The results of the keyword analysis uncovered important research trends in the field of prosody in the past 21 years. Firstly, it is unsurprising that the top ten most frequent author-supplied keywords (see Table 5) are closely related to (1) the concept of prosody (including prosody itself, intonation, phonology, stress and accent/accents), (2) the two main aspects of the investigation of prosody (i.e., speech perception and speech production), (3) the notion in information structure (i.e., focus) that is usually signaled by prosody and widely studied by prosody researchers, (4) the language that is possibly most commonly investigated (i.e., English), and (5) bilingualism, which appears to be widely researched, especially from the second period (2008 onwards). It is important to note that in Table 5, bilingualism was the only keyword on the top ten list whose frequency increased throughout the three periods, indicating the increasing significance of bilingual prosody research. Seven out of 10 topics remained the most discussed throughout, while the other two topics displayed a downward trend. The possible reasons for these trends are discussed further below.

Table 5 The top ten most frequent author-supplied keywords (the numbers below each year are normed frequencies).

In the keyword analysis, as mentioned above, the biggest group contains keywords whose normed frequencies remained unchanged across the three periods, suggesting that these topics are consistently discussed. One of the important findings is that areas closely related to prosody, such as syntax (total count of author-supplied keywords across three periods: 50), morphology (53), lexical stress (52), and conversation analysis (59), turned out to be frequently discussed (≥30) research topics across the three periods. This suggests these areas have received constant attention in prosody research given the importance of prosody in these areas (Cole et al., 2010b; Fodor, 1998; Harley et al., 1995; Kjelgaard and Speer, 1999; Pratt, 2018; Pratt and Fernandez, 2016; Selkirk, 2011, 1984). Another key point is that English, French, Dutch, Mandarin Chinese, and Japanese are languages in which researchers have maintained interest throughout the history of prosody research. Among these languages, English occurs most frequently, probably due to the importance of prosody in language comprehension in English (as reviewed in Calhoun et al., 2023; Cole, 2015; Cutler et al., 1997) as well as its status as a lingua franca, which leads to a large number of both L1 and L2 speakers. Other languages such as Mandarin Chinese (as a tonal language) and Japanese (as a pitch accent language) were also frequently investigated due to their typical prosodic features and their large number of speakers (see Kügler and Calhoun, 2020). More importantly, intonation, fundamental frequency (f0), accent(s), pitch accent (Footnote 8), stress and tone, which are expected to be key topics in prosody research, have indeed been the most discussed throughout and continue to be the focus of prosody research.

We now turn to the keywords that experienced a significant change, whose trends can be divided into three groups. A sample of the three groups is provided in Table 6. Group 1 displays a general increase across the three periods, Group 2 a rise across the first two periods followed by a fall in the third period (although all normed frequencies in the last period were higher than in the first period), and Group 3 a general decrease. Group 1 concerns topics that involve a second language or more than one language: bilingualism, second language (L2), second language acquisition, foreign accent(s) and cross-linguistic influence (CLI), suggesting that studying prosody in L2 or multilingual speakers beyond their native languages gained popularity across the three periods. This is probably because of the introduction of the L2 intonation learning theory (Mennen, 2015), which has been attested to be a useful model for predicting the difficulties that L2 learners encounter based on the intonation differences between learners’ L1 and L2. Group 1 also contains topics that might represent newly developed directions in the last two decades: language attitudes, voice onset time (VOT), sound change, tone sandhi (TS), and syntax-phonology interface. Electroencephalography (EEG) is also a topic in Group 1, indicating its rising importance in prosody research and the close connection to neuroscience in investigating brain activity in response to prosodic stimuli. This reflects the increasingly interdisciplinary nature of prosody research.

Table 6 Samples of the most frequent keywords (the numbers are normed frequencies).

While some topics gained increasing attention across the three periods, others dropped from the second period to the third after rising from the first to the second (Group 2). The representative topics, in descending order of frequency, are gesture, aphasia, language contact(s), and Cantonese. Many of these topics rose from no occurrence in the first period and maintained 10 or more occurrences in the second and third periods. As expected, gesture, as a key visual cue, became more popular in the second period. This is probably because research in this field was boosted by the special issue Audiovisual Prosody edited by Krahmer and Swerts (2009), which seems to have had a lasting effect, as the topic was still more frequent in the third period than in the first. Further, aphasia, a communication disorder resulting from brain damage, received less attention in the third period than in the second. Inspection of the entire keyword list shows that in the third period, aphasia appeared in a number of other forms: primary progressive aphasia, aphasia rehabilitation, aphasia severity, deep dysphasia, fluent aphasia, and music and aphasia.

A number of topics seem to have become less interesting to prosody researchers and exhibit a decreasing trend in frequency (Group 3). Although prosody and phonology were highly frequent across the three periods, their normed frequencies nevertheless showed a downward trend. At first, it seemed implausible that the two had become less important. However, it is reasonable to assume that prosody and phonology were replaced as keywords by more specific terms such as F0, pitch, or stress, which were preferred in later empirical investigations of prosody. The frequencies of syllable and syllable structure decreased, possibly for similar reasons: these terms are relatively general and may have been replaced by more specific keywords such as onset, coda, or rhyme, which are used in linguistic analysis to describe the components and organization of syllables.

Conclusion and implications

The present study has provided a systematic review of prosody research from 2001 to 2021 in linguistic journals through a bibliometric analysis. Several significant implications for prosody research emerge from its key findings. First, our results show a general rise in prosody-related publications over the past two decades, indicating the increased significance of prosody in broader linguistic research. Second, the co-citation analysis identified the most cited authors, references, and journals, providing valuable information for scholars, especially novice researchers in the prosody field, about where the most influential prosody research can be found, who is doing that research, and what areas it covers. Another important finding worth noting is that prosody research has seen a marked increase in the use of statistical methods, especially mixed-effects models, in the two most recent periods (2008–2014, 2015–2021) compared with the first period (2001–2007). This increase is likely due to the influential special issue Emerging Data Analysis published in the Journal of Memory and Language in 2008. It is therefore reasonable to argue that special issues, as a distinctive mode of communication in academia, are effective in highlighting essential or emerging research topics in a specific discipline.

Additionally, the findings of the present bibliometric analysis shed light on research trends in prosody. For example, they reveal that intonation, stress, and accent remained the most-discussed topics across the three periods, given their high relevance to prosody. It is also unsurprising that speech perception and production are among the most-discussed topics. Some trends were observed by comparing the normed frequencies across the three periods. For instance, bilingualism has gained popularity as a research topic since the second period, reflecting researchers’ increased focus on it as more people become bilingual or multilingual due to globalization. However, some languages (e.g. English, Chinese, and Japanese) have consistently remained the most researched. The prevalence of English and Chinese might be partially attributed to their extensive speaker and learner bases and the extensive existing literature on their prosody, while the rise of Japanese in prosody research could be attributed to the pioneering contributions made by Pierrehumbert and Beckman.

The bibliometrics-based method has gained popularity in the past decade as a way to offer systematic reviews of research trends in many fields (e.g. Fu et al., 2021; Lei and Liu, 2019a; Wu, 2022; Zhang, 2019). Although this quantitative analysis is based on a substantial number of research papers and reveals developmental patterns of a research topic across different periods, the present study has some limitations. First, as this paper aims to review a large number of prosody-related studies and identify major trends in research on prosody, we have to acknowledge that the literature search cannot guarantee coverage of every piece of relevant literature, owing to the selection of search terms and the authors’ choice of keywords in their publications. In particular, our choice of search terms may not capture the entire landscape of prosody-related concepts. For instance, concepts such as “duration” and “emphasis” play pivotal roles in prosody analysis, but including them as search terms might have led to an overly broad search with irrelevant results. It should be noted, however, that although these terms were not included as search terms, they appeared in our list of keywords given their relevance to prosody. Future studies could explore a broader spectrum of prosodic elements, thereby further advancing the field of prosody research.

Another possible limitation concerns the sources of the prosody-related articles: some prominent journals were excluded from the publication analysis or keyword analysis because they did not meet the criteria of the filtering process in the present study. For example, the Journal of the Acoustical Society of America is a common venue for prosodists to publish high-quality research, Frontiers in Psychology publishes extensively on linguistics, and Linguistic Inquiry might be a major source of citations. It should be noted that such journals do appear in the co-citation analysis if they are frequently cited, and including them in the publication and keyword analyses in future studies might be beneficial. Additionally, given the quantitative nature of the study, a more detailed qualitative analysis is needed to complement it; and given the space limitations of the paper, it is not possible to delve into every aspect of the significant trends observed in prosody research.

Furthermore, although the qualitative analysis of the research trends was supported by quantitative data, some degree of subjectivity was still involved. Therefore, the interpretations of research trends in our paper need to be confirmed or substantiated by other experts on prosody. Bibliometric reviews are best read together with traditional reviews to gain a fuller picture of research in prosody.