Phonological network fluency identifies phonological restructuring through mental search

We investigated network principles underlying mental search through a novel phonological verbal fluency task. Post exclusion, 95 native-language Mandarin speakers produced as many items that differed by a single segment or lexical tone as possible within one minute. Their verbal productions were assessed according to several novel graded fluency measures, and network science measures that accounted for the structure, cohesion and interconnectedness of lexical items. A multivariate regression analysis of our participants’ language backgrounds included their mono- or multi-lingual status, English proficiency, and fluency in other Chinese languages/dialects. Higher English proficiency predicted lower error rates and greater interconnectedness, while higher fluency in other Chinese languages/dialects revealed lower successive similarity and lower network coherence. This inverse relationship between English and other Chinese languages/dialects provides evidence of the restructuring of the phonological mental lexicon.

www.nature.com/scientificreports
Finally, when Mandarin participants remembered lexical items presented in either Chinese characters (syllable-sized) or pinyin (Mandarin romanization) during the form preparation paradigm, onset effects occurred when onset letters overlapped but not when onsets of Chinese characters overlapped 30 . This revealed that a principal paradigm used in the theory's creation cues the memory of orthographic units rather than solely their phonological representations. A recent study implementing a picture-naming form preparation task connects phonology and orthography, and more firmly places the proximate unit within the realm of literacy acquisition. Kindergartners, who were learning via pinyin, showed onset effects, grade 1 and grade 2 students showed tonal syllable effects, while grade 4 students and adults, in line with the proximate unit principle, revealed atonal syllable effects 31 .
One possible explanation for the mixed results of the above-mentioned studies, particularly those with Mandarin-speaking participants, lies in the participants' often rich language backgrounds. The bulk of participants tested to date have been university students with varying degrees of proficiency not only in English but also in other Chinese languages/dialects, which makes them candidates for putative bilingual advantages or disadvantages.
Bilingual disadvantages have been attributed to lower proficiency in both languages when compared to monolinguals [32][33][34][35][36][37] , increased lexical competition 38,39 or language attrition due to a drop in first language (L1) frequency of use during increased use of a second language (L2) 40,41 . Advantages in processing are hypothesized to be due to gains in executive control due to switching between languages 42 . Counter evidence [43][44][45][46] has however begun to mount as to bilingualism's putative benefits. An example of this can be seen within the verbal fluency domain where despite a number of bilingual effects in the letter fluency task, a recent meta-analysis 47 revealed no reliable effect. The same meta-analysis however showed a bilingual disadvantage in semantic fluency.
Recent studies have also pointed to an interaction between other Chinese languages during Mandarin processing. Wu et al. 48 found that bilingual speakers of both Jinan Mandarin and Mandarin (Putonghua or Standard Chinese), which critically differ in the tones assigned to lexical items, were facilitated in lexical judgments in an auditory lexical decision task, unlike their Mandarin monolingual peers. Facilitation was also seen in a speech production task wherein native-Mandarin speakers of greater proficiency in other Chinese languages were faster in producing phonological neighbors of Mandarin monosyllables 49 . These two studies support a building hypothesis that typologically similar languages facilitate rather than inhibit processing 50 .
To explain a possible influence of either English proficiency or other Chinese languages/dialects on mental search it is necessary to rely on the literature of phonological/orthographic effects. In speech recognition tasks with speakers of alphabetic scripts, two proposals have taken form that entail either the cross-activation of orthographic and phonological representations [51][52][53][54] , or the restructuring of phonological representations due to the acquisition of orthography 55,56 . Mandarin has played a unique role in this literature because its speakers undergo years of learning a logographic script, which critically differs from learning an alphabetic script due to its demand on memory and reliance on handwriting 57 . fMRI studies have shown that phonological and orthographic brain networks activated during auditory and reading tasks across multiple age groups critically differ from those seen in English speakers [58][59][60] . For instance, when children and adults of either English L1 or Mandarin L1 background performed auditory rhyme judgments on orthographically/phonologically inconsistent pairs (e.g., pint /pʰaɪntʰ/, mint /mɨntʰ/), only English adults and children with high reading skill showed greater activation in brain areas related to phonological processing 58 . Meanwhile, in the area of speech production, there is evidence of cross-activation in picture naming [61][62][63][64] . Recent Mandarin speech production results support the implication that orthography is active during retrieval of phonological information, without providing evidence of restructuring. Naming latencies of colored line drawings were facilitated for words that shared radicals (components of Chinese characters) 64 despite the characters never being shown to the participants during the task.
Using the same paradigm, Tibetan L2 speakers of Mandarin showed that interactivity of Chinese orthography and phonological information during speech production occurs for L2 speakers as well 63 .
How might the acquisition of an alphabetic language affect Mandarin speakers? There is evidence to suggest that greater fluency leads to greater segmentation in Mandarin. Mandarin speakers of high English proficiency showed onset priming effects in Mandarin when portions of syllable structure overlapped (e.g., CV→CVN) 65 . Meanwhile, Mandarin-English bilinguals also showed priming effects in Mandarin after multiple repetitions of target items in a form preparation task 66 . A potential interpretation of these results is that greater proficiency in an alphabetic language, in which the orthography capitalizes on phonological information, leads to greater segmentation of Mandarin during the processing of Chinese characters, despite there being little transparency between Mandarin phonology and Chinese characters 67 . Rather than this interpretation being one of cross-activation, it is dependent on acquisition and subsequent proficiency in the L2, and as such represents a possible route for the restructuring hypothesis.
A current gap in the literature on phonological segmentation in speech production is the lack of investigation into the nature of the participants' phonological representations and how such representations might vary between participants, particularly due to knowledge of other languages. One methodology being used in the study of mental representations involves the use of network science with psycholinguistic tasks 68 . In such studies lexical items are nodes and their edges a relational parameter based on linguistic aspects such as semantics 69 , phonology 70 , orthography 71 , and/or two 72 , or more layers 73,74 . The use of networks to study variation amongst a population has thus far been done through networks created from individual participants' responses in semantic fluency tasks. Results from such studies have revealed differences between monolingual and bilingual speakers 75 , children of typical hearing and those raised with cochlear implants 76 , and healthy controls and patients of mild cognitive impairment and Alzheimer's 77 . The innovative application of network science to semantic fluency has occurred despite the challenge of establishing which words are or aren't neighbors (i.e., those words that share an edge) 78,79 . Contrary to this limitation, a task implementing phonological verbal fluency is uniquely fit for participant-level networks due to the modeling assumption that phonemes are mental categories shared across the speakers of a language 2,80 .
While individual networks have not yet been explored in the literature dedicated to phonological networks, a wealth of methodological and theoretical tools for the modeling of speech processing has taken shape. At the macro level, topological features have been analyzed both within 49,73,74,81,82 and between languages [82][83][84][85][86] to reveal commonalities, such as positive mixing by degree (wherein nodes with many edges tend to be neighbors of nodes with many edges). At the meso-level, analyses of communities (groups of nodes within the network) 87 , and components (subgraphs and unconnected nodes, i.e., "isolates"), suggest that words of greater connectivity are slowed in recognition and recalled less accurately 88 . At the micro level, the word-level measure known as clustering coefficient (the proportion of a node's neighbors that are also neighbors of each other) expanded the previous understanding of interactivity between lexical items in both speech production 89 , and recognition 90,91 by showing that a word's interconnectedness affects lexical processing.
The use of phonological networks within a fluency paradigm builds on a long history of probing the phonological mental lexicon through the letter fluency task 92,93 . Letter fluency has participants produce words based on their sharing an alphabetic letter. Dependent variables from the task have primarily included three simple measures 94 : number of valid productions, number of clusters, and number of switches. A network phonology approach improves on the biases of the letter fluency task by making each measure quantitative and dependent on knowledge of phonology rather than orthography. For instance, in the letter fluency task, two responses are considered part of a cluster if they begin with the same two letters. This is problematic because while letters might correspond to phonemes, they often do not (e.g., cat /kʰaetʰ/, car /kʰɑɹ/, cake /kʰeɪkʰ/). We replace the qualitative concept of clusters by counting our participants' network components, and measuring their networks' clustering coefficients (i.e., node interconnectedness). Additional measures are introduced to account for graded similarity, our participants' switching (divergence from a cluster), and their rate of producing items with a syllable bias through measuring the consecutive production of syllable neighbors (i.e., items that share the same atonal syllable).
In the current study, we construct for the first time individual-level phonological networks wherein nodes are monosyllabic lexical items produced within a phonological verbal fluency task that asks participants to produce phonological neighbors to monosyllabic stimuli. Participants were instructed on the creation of networks wherein edges were defined by the relational parameter known as phonological edit distance 95 , in which two lexical items are immediate phonological neighbors (i.e., edit = 1) if they differ from one another by the addition (at /aetʰ/ → cat /kʰaetʰ/), deletion (cat /kʰaetʰ/ → at /aetʰ/), or substitution (mat /maetʰ/ → cat /kʰaetʰ/) of a single phonological segment or lexical tone (ma1 /ma 55 / → ma3 /ma 214 /). We hypothesized that by giving instructions to produce neighbors that required the segmentation of the target stimuli we would induce differential results between those participants that tend toward segmentation and those that tend toward syllable processing. We first described our participants' networks through a correlation analysis and visualization of example networks in order to identify whether participants used a search strategy indicative of mental search of a segmental or syllabic nature. In terms of network structure, we predicted our participants' networks would vary in clustering coefficient, number of network components and particularly mixing by degree, where we assumed networks would reveal both positive (assortative) and negative (disassortative) values, contrary to the results reported from whole-vocabulary phonological networks. We then performed a multivariate analysis to investigate whether our participants' language backgrounds influenced search through the mental lexicon. Because of the rather mixed results with regard to bilingualism, and specifically the null effect in letter fluency, we were agnostic as to its possible effects.
However, we hypothesized that differential effects for fluency and network measures due to both English proficiency and knowledge of other Chinese languages/dialects would present a case for phonological restructuring. We hypothesized that speakers with greater fluency in other Chinese languages/dialects would be biased towards mental search of syllable-sized units and thus tend toward a greater proportion of syllable neighbors. A greater proportion of syllable neighbors would in turn result in low clustering coefficient, a high number of network components, a higher mean rate of switching, and greater errors, seeing as many items within a syllable family are non-items. Meanwhile, we predicted that participants with higher English proficiency would tend towards greater segmentation (i.e., low edit distance between target stimuli and verbal productions), which would lead to high clustering coefficient, a lower number of network components and fewer errors due to their greater facility in segmenting related to proficiency in processing an alphabetic language.

Methods
[...] recruited from the Hong Kong metropolitan area, were included in this study. Before beginning the experiment, participants completed a short biographical survey which included, besides age and sex, the name of their home province, self-rated spoken proficiency on a scale of 1 (beginner) to 10 (native speaker) in English (levels 5-8, M: 6.74; SD: 0.94) and other Chinese languages/dialects and/or other non-Chinese languages.
Self-assessed English proficiency was used in the current study. Given the inordinate cost and duration of administration of language proficiency instruments (e.g., 2-5 hours at $799 96 ), self-assessment is used as a replacement 97,98 , particularly in low-stakes and low-resource settings. Despite its limitations, self-assessment has shown reliability and validity as a measure of oral proficiency when compared with other direct methods [99][100][101] . All participants reported native-level proficiency in Mandarin with no history of speech or hearing disorders. To create a rough representation of knowledge of other Chinese languages, we summed the number of Chinese languages/dialects (Num_Chinese) that fell within the self-rated values of 3-10 (levels 1-3; M: 1.92; SD: 0.66). Num_Chinese was recently found to significantly facilitate the spoken production of phonological neighbors 49 . From our participants' spoken language proficiency ratings, we created the variable Multilingual by categorizing speakers as either monolingual (48), for rating only one language between 9-10, or multilingual (47), for rating more than one language between 9-10. Note that because only five participants rated three languages between 9-10 in fluency, we did not create bilingual and trilingual categories, but collapsed them into the single multilingual category.
From the original 107 participants recruited, one participant was excluded due to researcher error in acquiring demographic data. Seven participants were excluded due to excessive error rates lying 2.5 standard deviations above the group mean for productions of nonwords (5) and repetitions (2). A further four participants were excluded due to being the sole participants in English proficiency levels 1, 4 and 9, and Num_Chinese level 4.
The Hong Kong Polytechnic University's Human Subjects Ethics Sub-committee (reference number: HSEARS20140908002) reviewed and approved the details pertinent to all experiments conducted in this study prior to beginning recruitment. The methods were carried out in accordance with guidelines and regulations. The participants gave their informed consent and were compensated with 50HKD for their participation.
Stimuli. The lexical statistics to describe our stimuli and later our participants' word-level networks come from the tonal fully segmented schematic representation (C_G_V_X_T) found in the updated version of the Database of Mandarin Neighborhood Statistics 102 . The current instantiation of the database is freely available here: https://github.com/karlneergaard/Database_of_word-level_statistics.
The stimuli consisted of 6 monosyllabic Mandarin words that were high in single edit neighbors (M: 21.67; SD: 4.13), and differed in syllable structure: ye1 (/iɛ 55 [...]
Procedure. Seated in a quiet room and wearing headphones equipped with an adjustable microphone, each participant was exposed to 8 short videos featuring a male Mandarin speaker in his 20s from the Beijing area who verbally provided the experiment's instructions, practice, and six stimuli. During the experiment's instructions phase, participants were told to produce as many phonological neighbors to a given stimulus as possible within 1 minute. To illustrate what qualified as phonological neighbors the speaker presented the monosyllable jie1 /tɕiɛ 55 / and neighbor examples according to the replacement of the glide, jue1 /tɕyɛ 55 /, the addition of a final nasal, jian1 /tɕiɛn 55 /, the replacement of the onset, xie1 /ɕiɛ 55 /, the replacement of the monophthong, jia1 /tɕia 55 /, and the substitution of lexical tone, jie2 /tɕiɛ 35 /. The practice phase had participants produce phonological neighbors to the monosyllable jing3 /tɕiŋ 214 / for 1 minute. Prior to beginning the experiment participants were instructed to not produce non-items, which included syllables that do not correspond to existing Chinese characters. Participants had one minute for each of the six randomized stimuli.
Measures. The current measures depend on shared phonological similarity between participants' verbal productions and an experimenter-provided stimulus, as is illustrated in Fig. 1a [...]
Error. Verbal fluency tasks require participants to simultaneously suppress and retain lexical items in working memory, which draws from executive control processes 103 . To assess the level of executive control used during the task we summed both types of error response (Error): repetitions and nonword productions.
In order to capture similarity over time, and thus a participant's rate of switching, we introduce the measure, 'running edit' (RE). RE entails the mean of successive weighted edit distances of all items produced per trial. Similar to the method of measuring switching 94 , nonword productions and repetitions were included in its calculation. Note that an RE of 1 means that every successive production was an immediate neighbor of the one that preceded it, while lower values mean that mental jumps were made that repeatedly deviated from successive similarity.
Given the question of whole-syllable retrieval specific to Mandarin speakers, and the known tendency to manipulate lexical tone while maintaining the syllable in phonological association tasks 49,104 , we devised a variable that would test if successive productions were syllable neighbors (SN). SN entails the proportion of successive syllable neighbors within a trial including repetitions and nonwords.
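The SN measure can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the (atonal syllable, tone) input format, and the choice of denominator (the number of successive pairs within the productions) are assumptions, as is the decision to count an exact repetition as sharing the atonal syllable.

```python
def syllable_neighbor_proportion(productions):
    """Proportion of successive productions that share an atonal syllable.

    productions: list of (atonal_syllable, tone) tuples in order of production.
    Assumption: the denominator is the number of successive pairs, and a
    repetition counts as a syllable neighbor because the atonal syllable matches.
    """
    pairs = list(zip(productions, productions[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for (syl_a, _), (syl_b, _) in pairs if syl_a == syl_b)
    return hits / len(pairs)


# Hypothetical trial: wai4 -> zai4 -> zai1 -> zai2 -> zai3
trial = [("wai", 4), ("zai", 4), ("zai", 1), ("zai", 2), ("zai", 3)]
sn = syllable_neighbor_proportion(trial)  # three of the four successive pairs match
```

Under these assumptions, a trial dominated by cycling through the tones of one atonal syllable yields a value close to 1.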
Network. The correct lexical items of each trial were constructed into undirected graphs wherein two given items shared an edge if they had an edit distance of one. Mean clustering coefficient (CC), mixing by degree (M), and the number of components per network (NC) were calculated through the use of the igraph package in R 105 .
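The graph construction can be illustrated with a short pure-Python sketch. It is not the igraph-based analysis used in the study: all function names are hypothetical, items are assumed to be tuples of segments with the tone as the final element, and nodes with fewer than two neighbors are assumed to have a clustering coefficient of zero.

```python
from itertools import combinations


def one_edit_apart(a, b):
    """True if a and b differ by one substitution, addition, or deletion."""
    if len(a) > len(b):
        a, b = b, a
    if len(b) - len(a) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    # b is longer by one element: dropping one element of b must yield a
    return any(a == b[:i] + b[i + 1:] for i in range(len(b)))


def build_graph(items):
    """Undirected graph as an adjacency dict; edges join items at edit distance one."""
    adj = {item: set() for item in items}
    for a, b in combinations(items, 2):
        if one_edit_apart(a, b):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def components(adj):
    """Connected components (isolates included), found by depth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def local_clustering(adj, node):
    """Proportion of a node's neighbor pairs that are themselves neighbors."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: undefined values are treated as zero
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Illustrative items: ban3, dan3, gan3 share the rime an3; wai4 is unrelated.
items = [("p", "a", "n", "214"), ("t", "a", "n", "214"),
         ("k", "a", "n", "214"), ("u", "a", "ɪ", "51")]
graph = build_graph(items)
nc = len(components(graph))                      # number of components (NC)
cc_ban3 = local_clustering(graph, items[0])      # CC of one node in the triangle
```

In this toy network the three an3 items form a fully connected triangle and wai4 is an isolate, so the network has two components.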
At the word level, clustering coefficient (CC) is the proportion of a node's neighbors that are also neighbors of each other, and is represented as falling between 0 and 1. In studies that contrasted the effects of high versus low CC, words high in CC have been tied to greater speech errors 89 , lower accuracy in auditory perception 90 [...]
A network's M is represented as falling between -1 and 1. Negative M values are referred to as disassortative. They occur when networks have star-like patterns, i.e., wherein few nodes are connected to many nodes. Positive M values are referred to as assortative. They occur when high degree nodes are connected to other high degree nodes. We included this measure in the current study because any deviation from assortativity, which has been found in all phonological networks investigated to date 73,74,[81][82][83][84]86 , including Mandarin 49 , would be indicative of differences between participant-level and whole vocabulary investigations of the phonological mental lexicon.
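Mixing by degree is commonly computed as the Pearson correlation between the degrees found at the two ends of each edge. The sketch below follows that standard definition; it is offered as an illustration under that assumption and not as the exact routine used in the study, and the function name and edge-list input are hypothetical.

```python
from math import nan


def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each undirected edge.

    edges: list of (node_a, node_b) pairs. Returns nan when the value is
    undefined (no edges, or every node has the same degree).
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Count each undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for a, b in edges:
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    if not xs:
        return nan
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        return nan
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    return cov / var


# A star: one hub connected to three leaves (disassortative).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
# A triangle plus a separate pair: like-degree nodes connect (assortative).
mixed = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
m_star = degree_assortativity(star)
m_mixed = degree_assortativity(mixed)
```

The star pattern gives the most negative value the measure allows, while the second toy network, in which every edge joins nodes of equal degree, gives the most positive.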
Knowing from previous association tasks 49,104 that Mandarin speakers produce edit distances greater than one when asked to produce minimal pairs, we assumed our participants' networks would feature both disconnected lexical items (isolates) and separate groupings of items (islands). Our final dependent variable, number of components (NC), is a quantitative account of the number of clusters within a trial's verbal productions. It represents the success (few components) or failure (many components) to produce a coherent network. An example network with an NC of three can be seen in Fig. 1c.
The custom in the analysis of networks is to exclude isolates/islands and then report values for the largest connected component. This tradition is due to calculations diverging to infinity without an available edge between components. In order to represent all given correct lexical items, while simultaneously penalizing poorer performance marked by the presence of islands/isolates, we calculated a weighted average across all components for both CC and M, wherein isolates were set to zero. Thus, in a 10-node network wherein component A has 9 nodes and a CC of 1, and component B is an isolate, the weighted CC value would be reduced by 10% and equal 0.9.
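The worked example above is consistent with weighting each component's value by its share of the network's nodes. A minimal sketch under that assumption follows; the function name and input format are hypothetical, and the node-count weighting is inferred from the 10-node example rather than stated explicitly in the text.

```python
def weighted_mean_by_size(component_values):
    """Average a per-component measure, weighting each component by its node count.

    component_values: list of (value, n_nodes) pairs, one per component.
    Isolates are expected to carry a value of 0.0.
    """
    total_nodes = sum(n for _, n in component_values)
    if total_nodes == 0:
        raise ValueError("network has no nodes")
    return sum(value * n for value, n in component_values) / total_nodes


# Component A: 9 nodes with CC = 1; component B: a single isolate set to 0.
weighted_cc = weighted_mean_by_size([(1.0, 9), (0.0, 1)])
```

With these inputs the isolate contributes one tenth of the nodes and nothing to the sum, which reproduces the 10% reduction described in the text.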

Results
Responses were transcribed to pinyin by two native-Mandarin speaking volunteers. Using the above-mentioned database 102 , items were categorized as either real lexical items or nonwords. Real lexical items consisted of syllables that could be ascribed to at least one Chinese character whether or not they qualified as monosyllabic words. Items were classified as correct responses if they did not correspond to a nonword in the database and were not repetitions.
Outliers were excluded at the trial level: trials that lay 2.5 standard deviations above (WE: 11 trials; SN: 1 trial) or below (RE: 12 trials) the mean were removed, as were trials identified through the use of a boxplot (M: 4 trials). The total number of exclusions (28) accounted for 4.9% of all trials. According to the same criteria, no exclusions were made based on the distributional properties of CC. Finally, no exclusions were made based on NC or Error, because their distributional features were addressed in the statistical models to follow. Table 1 displays both correlations between the dependent variables and their descriptive attributes by min, max, mean, and standard deviation.
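The standard-deviation criterion can be sketched as below. This is an illustration only: the function name and parameters are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def exclude_outliers(values, n_sd=2.5, side="above"):
    """Drop values lying more than n_sd standard deviations from the mean.

    side="above" removes only high outliers, side="below" only low outliers.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    m, sd = mean(values), stdev(values)
    if side == "above":
        return [v for v in values if v <= m + n_sd * sd]
    if side == "below":
        return [v for v in values if v >= m - n_sd * sd]
    raise ValueError("side must be 'above' or 'below'")


# Hypothetical per-trial scores with one extreme high value.
scores = [1.0] * 20 + [100.0]
kept = exclude_outliers(scores, side="above")
```

A one-sided rule of this kind matches the description in which some measures were trimmed only above the mean and others only below it.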
As is shown in Fig. 2a, nonwords and repetitions were proportionally rarer than correct responses. Meanwhile, in Fig. 2b we see that participants by and large produced phonologically similar responses, such that hop-1 [...]. While a syllable-driven search method is evident in the data, it was not the dominant search method. A split of all trials based on SN > 0.50 revealed that 34% of trials were created with a predominantly syllable-driven search method.
In Fig. 3 we illustrate two search methods that produced networks of equivalent fluency scores (3a, WE = 15.66; 3b, WE = 15.57), despite not having an equivalent number of correct responses (3a = 16; 3b = 22). In Fig. 3a, we see that the participant utilized their awareness of syllable constituents in their maintenance of the rime an3 /an 214 / while primarily manipulating onsets. This led to a network high in interconnectedness (CC = 0.914), with a low proportion of syllable neighbors (SN = 0.176), consisting of fifteen hop-1 responses and zero hop-2 responses. In Fig. 3b we see an example of a syllable-driven search, wherein the identification of an immediate neighbor (e.g., wai4 /uaɪ 51 / → zai4 /tsaɪ 51 /) brought with it syllable neighbors (zai1, zai2, zai3) whether or not they were non-items (zai2). In contrast to the network in Fig. 3a, the syllable-driven method seen in 3b resulted in lower CC (0.391) and a much higher proportion of syllable neighbors (SN = 0.760), consisting of seven hop-1 responses, twelve hop-2 responses, and four hop-4 responses.
To understand the roles of RE and M it is necessary to look at the graphs in terms of the correlation analysis. As is evident from their respective RE values (3a, RE = 0.87; 3b, RE = 0.89), both search methods can produce high RE values. From the correlations in Table 1 we see that rather than inform us on the use of syllables as a means of search, RE entails search methods that either closely tie to phonological similarity (high RE) and thus result in high CC, or deviate from successive similarity (low RE) and result in less network coherence (high NC).
Statistical analysis. The statistical analysis utilized the mcglm package in R 108 , which allowed for the fitting of regression models with multiple dependent variables and distributional family types. Of the fluency (WE, RE, SN) and network variables (CC, M, NC), only M was best fit with a normal distribution. The Tweedie variance function 109 was used to fit WE, RE, SN and CC, while the Poisson-Tweedie variance function 110 was used to fit the count variables, NC and Error.
As can be seen in Table 2, higher self-rated spoken-English proficiency predicted greater CC. English proficiency also significantly predicted the proportion of syllable neighbors and the number of errors produced by participants, such that higher proficiency participants produced fewer syllable neighbors (lower SN) and fewer errors than lower proficiency speakers. Greater Num_Chinese was associated with the production of both less successively similar items (lower RE) and fewer successive syllable neighbors (lower SN). Networks of less coherence (high NC) were also produced by participants with greater Num_Chinese. Finally, multilingual speakers on average produced fewer errors (Multilingual, M: 1.66; SD: 1.85; Monolingual, M: 1.93; SD: 2.33). No effects were found for either WE or M.
To disentangle the effects displayed in Table 2, in Fig. 4 we graphed interactions with tensor product smooths through the use of generalized additive models within the mgcv package in R 111 . Figure 4a illustrates that higher Num_Chinese participants produced low successive similarity (low RE) and a greater number of network components (high NC), while simultaneously illustrating the strong negative correlation between NC and RE previously shown in the correlation analysis. Contrary to our predictions, high Num_Chinese did not equate to a tendency toward the syllable-driven search method. This means that high Num_Chinese participants predominantly chose items that were neither syllable neighbors nor immediate neighbors, resulting in low network coherence (high NC).
In contrast to Num_Chinese, in Fig. 4b we see that greater proficiency in English resulted in networks of greater precision. While this is implied in the significant CC effect, it is also evident in an interaction between SN, Error, and English. Figure 4b shows that higher proficiency English speakers produced fewer syllable neighbors (low SN), and that as SN increased so did the proportion of errors.

Discussion
In the current study we asked if variation in phonological representations of Mandarin-speaking participants, elicited through a novel verbal fluency task, would inform on the question of segmentation in speech production. Mixed results in speech production studies, and evidence of segmentation among both speakers of high English fluency 65 , and young participants learning through pinyin 31 , led us to hypothesize that differential results would arise due to biases towards either segmental or syllable-driven search through the mental lexicon. We further hypothesized that these search methods would be due to the influence of our participants' language backgrounds, namely, their self-reported English proficiency (English), number of other Chinese languages/dialects spoken (Num_Chinese), and the number of languages spoken with native-level proficiency (Multilingual). Our analysis revealed variation in network structure, distinct mental search methods, an effect of error for multilingual speakers, and an almost inverse relation between English and Num_Chinese.
Due to the novelty of implementing participant-level phonological networks, we begin by discussing the network measures. As expected, both mean clustering coefficient (CC) and number of network components (NC) varied across participants. Of methodological and theoretical import are the findings related to mixing by degree (M). All previous measures of M with whole-vocabulary phonological networks have shown positive mixing by degree 49,73,74,[81][82][83][84]86 , also known as assortativity. Assortativity describes the state in which nodes of high degree tend to be connected to other high degree nodes while disassortativity entails high degree nodes connected to many low degree nodes 112 . Vitevitch 81 theorized that in an assortative network the spread of activation during lexical selection would be restricted due to high clustering, effectively lessening the number of lexical candidates, while in a disassortative network, activation would be spread throughout the mental lexicon leading to the need to reject a greater number of lexical candidates. Vitevitch and colleagues 113 later found evidence of assortativity through investigating failed lexical retrieval through simulations with jTRACE 114 and a number of psycholinguistic tasks that used phonological degree values extracted from the 1967 version of Webster's Seventh Collegiate Dictionary.
In contrast to past investigations on phonological networks, our participants' networks were of equal assortative and disassortative proportions. On the one hand, this difference between whole-vocabulary measures and the current variation in M highlights a need to account for individual differences in cognitive modeling 115 . Variation in structure according to M would imply speakers experience differential lexical competition and facilitation during selection due to how information spreads across lexical networks. This has been implied in cross-linguistic differences in the effect of phonological degree (i.e., phonological neighborhood density) where high degree has been shown to slow recognition in English 95,116 , but speed recognition in both Mandarin 117 and Spanish 118 . On the other hand, the positive correlation seen between fluency (weighted edit: WE) and M suggests that disassortativity might be a result of low fluency rather than the structure of the lexicon. Future work can resolve the second implication simply through manipulating task duration, while the first implication will need to be addressed through investigating a greater number of individual differences than were addressed in the current study.
Through graph visualization and correlation analysis we identified two distinct patterns of search: one in which syllable constituents such as onsets, tones, and rimes were manipulated to create immediate neighbors, and the other in which all four tones of a given atonal syllable were successively produced. Both the segment- and syllable-driven search methods were successful at producing high-fluency networks. Contrary to our expectations, the successive production of syllable neighbors (SN) correlated with a high rate of successive similarity (running edit: RE), and a low number of network components. However, in line with our predictions, this search method resulted in more errors and a low clustering coefficient. A split of the distribution based on SN > 0.50 revealed that the syllable-driven method accounted for roughly 34% of all trials. In light of the proximate unit principle 14,15 , it would appear that while there was a clear influence of atonal syllables on search, it was less influential than our participants' ability to segment. However, one important limitation to consider is that, given that the task instructions purposefully biased participants toward segmentation, we do not know whether an opposite bias towards syllable neighbors would lead to a lower proportion of immediate neighbors. Future research would benefit from investigating whether participants' fluency and network structure alter if given a task that involves the manipulation of syllable targets rather than segments.
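The SN > 0.50 split can be made concrete with a short sketch. It assumes, for illustration only, that each response is coded as an (atonal syllable, tone) pair and that SN is the proportion of successive response pairs sharing an atonal syllable while differing in tone; the exact operationalization is the one given in the Methods, and the function names and example sequences here are hypothetical.

```python
def syllable_neighbor_rate(responses):
    """Proportion of successive pairs that share an atonal syllable
    but differ in tone (an assumed stand-in for the SN measure)."""
    pairs = list(zip(responses, responses[1:]))
    if not pairs:
        return 0.0
    hits = sum(
        1 for (syl_a, tone_a), (syl_b, tone_b) in pairs
        if syl_a == syl_b and tone_a != tone_b
    )
    return hits / len(pairs)


def search_style(responses, threshold=0.50):
    """Label a trial by the SN > threshold split described in the text."""
    if syllable_neighbor_rate(responses) > threshold:
        return "syllable-driven"
    return "segment-driven"


# Illustrative sequences, not participant data.
tone_cycling = [("ma", 1), ("ma", 2), ("ma", 3), ("ma", 4)]
segment_swaps = [("ma", 1), ("ba", 1), ("bi", 1), ("bi", 2)]

print(search_style(tone_cycling))   # all three successive pairs are syllable neighbors
print(search_style(segment_swaps))  # only one of three successive pairs is
```

A participant who cycles through the four tones of one atonal syllable falls above the threshold, whereas one who swaps onsets and rimes falls below it, which mirrors the two search patterns distinguished above.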
The current evidence shows that variation exists among Mandarin speakers in the segmentation of syllabic constituents during speech production. Our investigation into our participants' language backgrounds sheds light on how that variation might have taken form. While the findings of English and Num_Chinese are noteworthy, those of Multilingualism are lacking. Of the seven dependent variables analyzed, only error showed a slight multilingual advantage. On the one hand, this could be used to argue for a benefit of executive function. Yet, given the null effect of letter fluency in the most recent meta-analysis 47 , and the growing evidence that bilingual/multilingual effects are best explained by alternative hypotheses, such as social demographics 45,46 , intelligence 45 , or culture 44 , a multilingual advantage in verbal fluency that rests on a notably small effect should be viewed skeptically. The effects of English proficiency and Num_Chinese, however, present a case for phonological restructuring seen through network precision.
Greater knowledge of other Chinese languages/dialects led to networks of less precision. Higher Num_Chinese led to lower successive similarity between responses, and lower network cohesion (high NC). Surprisingly, greater knowledge of other Chinese languages also equated to a lower proportion of successive syllable neighbors. In short, participants across the three levels of Num_Chinese predominantly chose items that were neither syllable neighbors nor immediate neighbors. Meanwhile, low network coherence for high Num_Chinese speakers points towards greater competition between lexical items from other Chinese languages/dialects during the task. This stands in contrast to the accounts of facilitation in lexical selection seen in previous studies 48,49 . However, it should be noted that those tasks differed from the current fluency paradigm and reflected facilitation through speed of reaction time.
The influence of greater English proficiency on our participants' networks is one of greater precision. Higher-proficiency English speakers produced a lower proportion of successive syllable neighbors, fewer errors, and more interconnected networks than lower-proficiency speakers. What can be gleaned from this is that lower-proficiency speakers drew on less knowledge of manipulating phonemic information due to less experience with an alphabetic language. Thus, rather than depending on segmentation of phonological units, which in turn creates highly interconnected networks, our lower-proficiency participants tended toward the use of atonal syllables and thus the increased likelihood of producing errors that comes with the syllable-driven search method.
While the contrast in network precision between English and Num_Chinese is evidence of the orthographic restructuring of the phonological mental lexicon, there are limitations to be considered. The first lies in our use of self-reported English proficiency, which does not account for the multi-dimensionality of assessing language proficiency 119 . Future research might focus on particular aspects of proficiency where segmentation is expected to affect learning outcomes. Next, due to the task design, we were unable to account for why participants of greater Num_Chinese experienced greater lexical competition. For instance, it is possible that participants consciously or unconsciously used knowledge of radicals during search 63,64 . Given the constraints of the task, and the low phonological consistency of radicals, this search criterion would likely lead to spurious neighbors and thus to responses that were neither successive syllable neighbors nor immediate neighbors. This is, however, speculation, seeing as the current design was not able to account for the orthographic nature of our participants' responses, since any given spoken response corresponds to numerous homophones.
In summary, the current study expanded on the explanatory capabilities of a phonological fluency task through the use of network science and phonological edit distance. The edit distance metric allowed for a graded assessment of fluency but, more importantly, provided a quantifiable means to assess the contents and structure of our participants' responses. Critical to the evidence supporting the claim of phonological restructuring was the ability to quantify successive similarity, network coherence, and node interconnectedness. It is prudent to point out that the identification of restructuring is only likely to occur with language pairs such as English and Mandarin, specifically due to the contrast in orthographic systems. Future uses of phonological network fluency would benefit from applications both to languages other than Mandarin and to clinical populations, wherein graded fluency and network characteristics can inform on the knowledge and access of phonology.