The multidimensionality of Japanese kanji abilities

The aim of this study was to identify the cognitive structures of kanji abilities in the Japanese general population and to examine age and cohort effects on them. From a large database of the most popular kanji exam in Japan, we analyzed high school graduation level data of 33,659 people in 2006 and 16,971 people in 2016. Confirmatory factor analyses validated the three-dimensional model of kanji abilities, including factors of reading, writing and semantic comprehension. Furthermore, the age effect on writing, and correlations between writing and semantic dimensions, were different between 2006 and 2016, suggesting reduced writing ability and stagnation in integrated mastery of kanji orthography and semantics in current-day Japanese adults. These findings provide the first evidence of the multidimensional nature of Japanese kanji abilities, and age/cohort differences in that dimensional structure. The importance of the habit of handwriting for literacy acquisition is discussed.

However, the rapid spread of digital writing devices, such as PCs and smartphones, in recent decades has drastically reduced the frequency of handwriting, e.g., an 11.4% decrease in Japanese adults who habitually write letters by hand over the eight-year period from 2004 to 2012 16. This is despite the fact that the frequency of kanji use (e.g., typing) does not seem to have changed. In addition, there was a 22.8% decrease in the number of Japanese people who took the most popular kanji exam in Japan, the Japan Kanji Aptitude Test (Nihon Kanji Noryoku Kentei: Kanken) over the ten-year period from 2006 to 2016 (about 2.6 million people in 2006, and about 2 million people in 2016) 17. On the other hand, the number of people who took the Test of English for International Communication (TOEIC) increased by 63.8% over the same period (about 1.5 million people in 2006, and about 2.5 million people in 2016) 18. Although opinion polls taken by the Agency for Cultural Affairs in Japan reported that public interest in learning the Japanese language, which includes spoken and written language, has not generally changed in recent decades 19, increased attention to learning English or internationalization may have resulted in a decrease in the time and effort dedicated to learning Japanese kanji. These environmental changes possibly affect age-dependent acquisition of kanji abilities in Japanese, particularly the dimension related to writing accuracy or orthographic lexicon in adults, as well as integrated mastery of multidimensional kanji skills.
The primary purpose of the present study was to identify the cognitive structures of kanji abilities in the general Japanese population. To examine the validity of multidimensional models of Japanese kanji abilities, we retrospectively investigated a large database of the Kanken, using confirmatory factor analyses (CFA). In addition, we examined the effects of age and cohort on kanji abilities, using comparable data from 2006 and 2016. We hypothesized that (1) the three-factor model of Japanese kanji abilities, including factors of reading accuracy (kanji phonology), writing accuracy (kanji orthography), and semantic comprehension (kanji semantics), fits better than two- or single-factor models, (2) the age of examinees affects kanji abilities factor-specifically, and also affects the relationships among the factors, and (3) the pattern of age effects shown in the 2006 data differs from that in the 2016 data.

Methods
Nature of data. We investigated a large database of the most popular kanji exam, i.e. the Kanken, which a large number of Japanese take voluntarily or semi-voluntarily. The Kanken started in 1975 and provides twelve levels of difficulty from the easiest (Level 10) to the most difficult (Level 1, including Pre-2 and Pre-1). From the entire dataset, the present study focused mainly on Level 2 data (mastery of 2,136 daily-use kanji; 12th school year level) of 33,659 people (aged 9-106 years) in 2006, and 16,971 (aged 8-91) in 2016, who had simultaneously taken the exam at public test sites open to everyone. Additionally, to examine the replicability of the factor structures derived from the main data, we used an independent dataset, namely, Level 2 data from two data points per year in 2006 and 2016, i.e. 12,050 people (aged 9-82 years) and 9,255 (aged 10-78) in 2006; and 9,141 (aged 10-81) and 2,671 (aged 11-78) in 2016. Each of these groups had taken the exam at non-public test sites (schools or public offices) on two dates in each year. Furthermore, we considered Level Pre-2 data (1,940 daily-use kanji; 10-11th school year level) of 17,796 people (aged 8-97 years) in 2006, and 12,586 (aged 9-92) in 2016; Level 3 data (1,607 daily-use kanji; 9th grade level) of 15,769 people (aged 8-106) in 2006, and 12,470 (aged 8-91) in 2016; and Level 4 data (1,322 daily-use kanji; 7-8th grade level) of 9,125 people (aged 7-86) in 2006 and 6,227 (aged 8-91) in 2016, who took the exam at public sites. The extremely small samples of preschool age children (6 years or younger) were excluded from the analysis.
The following characteristics of the Kanken support the methodological validity of using this dataset in the study: (1) as many as ten subtests that could broadly measure likely components of Japanese kanji abilities, which include reading and writing accuracy, and semantic comprehension; (2) simultaneous implementation at more than one public site in each of the 47 prefectures in Japan, thus reducing region-specific effects; (3) multiple levels of difficulty and multisite implementation around the same period using alternative versions of exam papers, which enabled us to examine the replicability of the results of factor analyses; (4) a large number of examinees varying in age from elementary school age to advanced age; and, (5) [...]
Measures. Level 2 of the Kanken is composed of ten subtests including (1) Reading, (2) Radicals, (3) Structure of compounds, (4) Completion of compounds, (5) Meaning of compounds, (6) Antonyms and synonyms, (7) Homophones, (8) Error correction, (9) Kana suffixes, and (10) Writing. Based on the nature of tasks, we hypothesized the correspondence between subtests and likely factors as follows: subtest 1 and reading accuracy, subtests 2-5 and semantic comprehension, subtests 6-10 and writing accuracy (see Fig. 1). The time limit for this level of the exam was 1 hour and the criterion for certification was 80% or higher of a maximum score of 200. Pass rates for this level in 2016 were 19.1-23.1%.
Reading accuracy
1. Reading: This subtest requires examinees to write the correct pronunciation (i.e., convert it to kana) of a marked kanji word appearing in each of 30 sentences, taking context into consideration. Again, a kanji word can alternatively be written only with kana letters, which have highly regular letter-sound correspondence. Thus, this conversion from kanji characters to kana letters is commonly used in kanji education in Japan. Each correct item was given a score of 1, adding up to a maximum of 30.
Semantic comprehension
2. Radicals: The examinees were required to extract a radical from each of 10 kanji characters. Radicals are visual components of kanji characters, most of which represent the semantic category. For example, the left part of the kanji 海 (umi or kai, sea) is the radical 氵 (sanzui), which means "water" or "fluid". In general Japanese kanji dictionaries, 214 radicals are used for classifying kanji characters and each kanji is assigned one radical. Each correct item was given a score of 1, adding up to a maximum of 10.
3. Structure of compounds: This subtest requires examinees to classify 10 two-character kanji compounds into five categories based on their structure. The categories are cases where the two characters have similar meanings, the two characters have opposite meanings, the former modifies the latter, the latter is an object/complement of the former, and the former negates the meaning of the latter. Each correct item was given a score of 2, adding up to a maximum of 20.
4. Completion of compounds: Examinees were required to complete 10 four-character kanji compounds by choosing, from kana words, the one that precedes or follows each of 10 two-character kanji compounds and converting it from kana to kanji. There were ten prepared options of kana words. Each correct item was given a score of 2, adding up to a maximum of 20.
5. Meaning of compounds: For each of 5 sentences, examinees were required to choose, from the 10 four-character kanji compounds in subtest 4, the one whose meaning the sentence represents. Each correct item was given a score of 2, adding up to a maximum of 10.
Writing accuracy
6. Antonyms and synonyms: The examinees were required to choose an antonym or synonym for each of 10 two-character kanji compounds from kana words and write it correctly in kanji. There were ten prepared options of kana words. Each correct item was given a score of 2, adding up to a maximum of 20.
7. Homophones: This subtest required examinees to write the two distinct homophonic kanji words corresponding to marked kana letters in each of 5 pairs of sentences. Each correct item was given a score of 2, adding up to a maximum of 20.
8. Error correction: The examinees were required to identify a homophonic error of a kanji character in each of 5 sentences and write the correct one. Each correct item was given a score of 2, adding up to a maximum of 10.
9. Kana suffixes: This subtest required examinees to write a correct kanji character and a kana suffix accompanying it, based on marked kana letters in each of 5 sentences. Each correct item was given a score of 2, adding up to a maximum of 10.
10. Writing: The examinees were required to write a correct kanji word that was written as marked kana letters in each of 25 sentences. Each correct item was given a score of 2, adding up to a maximum of 50.
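The scoring scheme of the ten Level 2 subtests can be summarized as a small lookup table. The Python sketch below is illustrative only: the names `LEVEL2_SUBTESTS`, `max_score`, and `meets_criterion` are hypothetical, and the Homophones entry assumes that 5 pairs of sentences yield 10 scored items. The points per item and the 80% criterion are taken from the description of the exam.

```python
# Subtest -> (number of scored items, points per correct item).
# Assumption: "5 pairs of sentences" in Homophones means 10 scored items.
LEVEL2_SUBTESTS = {
    "Reading": (30, 1),
    "Radicals": (10, 1),
    "Structure of compounds": (10, 2),
    "Completion of compounds": (10, 2),
    "Meaning of compounds": (5, 2),
    "Antonyms and synonyms": (10, 2),
    "Homophones": (10, 2),
    "Error correction": (5, 2),
    "Kana suffixes": (5, 2),
    "Writing": (25, 2),
}


def max_score(subtests):
    """Maximum total score over all subtests."""
    return sum(items * points for items, points in subtests.values())


def meets_criterion(total, subtests=LEVEL2_SUBTESTS, criterion_percent=80):
    """True if the total reaches the certification criterion (integer arithmetic)."""
    return total * 100 >= criterion_percent * max_score(subtests)


print(max_score(LEVEL2_SUBTESTS))   # 200
print(meets_criterion(160))         # True
```

Under these assumptions the per-subtest maxima add up to the stated maximum of 200, so the certification criterion corresponds to a total of 160 points.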
Subtests in the other levels. Whereas all subtests of the Level Pre-2 of the 2016 data coincided with those of Level 2, some alternatives were involved in the Level Pre-2 of the 2006 data and Levels 3 and 4 of both datasets. For the Level Pre-2 of the 2006 data, we hypothesized that the semantic comprehension factor included Radicals, Structure of compounds, Completion of compounds (two subtests: one employed four-character compounds and another used two-character ones), and Homophones (all of the subtests excluding Radicals were multiple-choice questions). We hypothesized that the writing accuracy factor included Antonyms and synonyms, Error correction, Kana suffixes, and Writing. For Levels 3 and 4 of both cohorts, we hypothesized that the semantic factor included Radicals, Structure of compounds, Completion of compounds (two-character), and Homophones (the latter three employed multiple-choice), and the writing factor included Antonyms and synonyms, Error correction, Kana suffixes, Writing, and Completion of compounds (a task that requires examinees to write a correct kanji character that was written as marked kana letters in each four-character compound). In all cases, the reading accuracy factor included only the Reading subtest.
(2020) 10:3039 | https://doi.org/10.1038/s41598-020-59852-0 www.nature.com/scientificreports
Statistical analyses. Data were analyzed in four steps. All statistical analyses were conducted using R version 3.4.3 (The R Foundation for Statistical Computing, Vienna, Austria).
Step 1: The goodness of fit for each of three structural models of Japanese kanji abilities was assessed with CFAs with the Level 2 data of the Kanken implemented at public sites (2006, 2016), using maximum likelihood estimation. In addition to the traditional χ² statistics, the following indices of model fit were employed: the root mean square error of approximation (RMSEA) with its 90% confidence interval, the comparative fit index (CFI), the Tucker-Lewis index (TLI), the standardized root mean square residual (SRMR), and Akaike's information criterion (AIC). RMSEA values <0.05 suggest a good fit, and values <0.08 are considered acceptable. We also calculated p-values for the test of the close-fit hypothesis that RMSEA ≤0.05. This one-sided null hypothesis should be retained (i.e. p close ≥0.05) for a good fit 21. The CFI and TLI values should be >0.95, the SRMR values should be <0.08 for a good fit, and lower AICs indicate relatively better fit 22-24. Furthermore, internal consistency was assessed with the coefficient omega of composite reliability 25 for the subtests loaded by each factor after these CFAs, as well as for all ten subtests before the analyses.
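The cut-offs in Step 1 can be written down as a simple check. The Python sketch below is not the analysis code of the study (which used R); the function names are hypothetical, the RMSEA formula is the commonly used point estimate based on the model χ², degrees of freedom, and sample size, and the label "poor" for values of 0.08 or more is an assumption, since the text defines only "good" and "acceptable".

```python
import math


def rmsea(chi2, df, n):
    """Common point estimate of RMSEA (assumed definition, not taken from the study)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def describe_fit(rmsea_value, cfi, tli, srmr):
    """Apply the cut-offs quoted in Step 1 to a set of fit indices."""
    if rmsea_value < 0.05:
        rmsea_label = "good"
    elif rmsea_value < 0.08:
        rmsea_label = "acceptable"
    else:
        rmsea_label = "poor"  # assumed label for values outside the quoted ranges
    return {
        "rmsea": rmsea_label,
        "cfi_ok": cfi > 0.95,
        "tli_ok": tli > 0.95,
        "srmr_ok": srmr < 0.08,
    }


# The CFI, TLI, and SRMR inputs are placeholders, not figures from the study.
print(describe_fit(0.047, cfi=0.97, tli=0.96, srmr=0.03))
print(describe_fit(0.070, cfi=0.97, tli=0.96, srmr=0.03))
```

Comparing AIC values across models needs no cut-off: the model with the lowest AIC is simply preferred among the candidates.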
Step 2: To examine the replicability of the best fitting model identified in Step 1, CFAs were undertaken with the Level 2 data of the Kanken implemented at non-public sites on the two dates and the Level Pre-2, 3, and 4 data implemented at public sites in 2006 and 2016. Maximum likelihood was used for parameter estimation. Composite reliability was calculated for the subtests loaded by each factor.
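As a rough illustration of composite reliability, coefficient omega for a single factor can be computed from standardized loadings. The sketch below assumes uncorrelated residuals, so that each residual variance equals one minus the squared loading; whether this matches the exact estimator used in the study is not stated in the text, and the loadings shown are hypothetical.

```python
def composite_reliability(loadings):
    """Coefficient omega for one factor from standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - loading**2.
    """
    total = sum(loadings)
    residual = sum(1.0 - loading * loading for loading in loadings)
    return total ** 2 / (total ** 2 + residual)


# Hypothetical standardized loadings for five subtests on one factor.
hypothetical_loadings = [0.8, 0.7, 0.75, 0.85, 0.9]
print(round(composite_reliability(hypothetical_loadings), 2))  # roughly 0.90
```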
Step 3: One-way analyses of variance (ANOVAs) were used to investigate factor-specific differences among four age groups: high school (13-18 years), university (19-22 years), early adulthood (23-39 years), and middle adulthood (40-59 years). Children aged twelve or younger (n = 149, in 2006; n = 93, in 2016) and adults aged sixty or older (n = 652, in 2006; n = 908, in 2016) were excluded from the analyses in Steps 3 and 4, because of small sample sizes and, in the case of the latter, concerns about the effects of cognitive decline caused by normal aging and neurodegenerative diseases. These analyses employed the sums of standardized scores (z-scores) on subtests loaded by each factor in the comparable Level 2 data of the exam implemented at public sites in 2006 and 2016, to examine the age effects on the abilities of each cohort. Tukey's HSD tests were used for post-hoc comparisons among age groups. Statistical testing in Steps 3 and 4 was two-tailed, and α was set at 0.05.
Step 4: Pearson's correlation coefficients among factors were compared by age group with Fisher's z test. These analyses also used the sums of z-scores on subtests loaded by each factor in the comparable Level 2 data of the exam implemented at public sites in 2006 and 2016, to assess the age effects on the correlations in each cohort. To see the simple effects of the Kanken total scores on the correlations among factors, we compared the correlations between two groups: examinees whose total Kanken scores were at or above the median, and examinees whose total scores were below it.
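The two computations in Step 3, summing z-scores over the subtests of a factor and comparing groups with a one-way ANOVA, can be sketched as follows. This is a minimal illustration, not the study's R code: the function names are hypothetical, the population standard deviation is an assumed choice for standardization, and Tukey's HSD is not implemented.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one subtest (assumes the scores are not all identical)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def factor_composites(scores_by_subtest, subtests):
    """Sum of z-scores over the subtests assigned to one factor, per examinee."""
    standardized = [z_scores(scores_by_subtest[name]) for name in subtests]
    return [sum(per_examinee) for per_examinee in zip(*standardized)]


def one_way_anova_f(groups):
    """F statistic with its between- and within-group degrees of freedom."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy composites for three hypothetical age groups (not data from the study).
f_value, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value, df1, df2)
```

In practice the composites returned by `factor_composites` would be split by age group and passed to `one_way_anova_f`, once per factor and per cohort.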
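Fisher's z test for two correlations from independent groups can be sketched in a few lines of Python. The function name is hypothetical. The inputs in the usage line are the rounded correlations and group sizes reported in the Results for the 2016 high- and low-score groups, so the output only approximates the published statistic.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent Pearson correlations; returns (z, two-tailed p)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-tailed normal p-value
    return z, p


# High-score vs. low-score examinees in 2016, using the rounded values in the text.
z, p = fisher_z_test(0.38, 8077, 0.44, 7893)
print(round(z, 2), p < 0.001)
```

The sign of z depends only on the order in which the two groups are passed; the two-tailed p-value is unaffected.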

Results
Means and standard deviations of total and subtest scores on Level 2 of the Kanken implemented at public sites in 2006 and 2016 are shown in Table 1. As expected, raw scores on each subtest in these two datasets were broadly similar to each other, though it was hard to draw rigorous direct comparisons between the scores on the different test papers.
Composite reliability coefficients estimated for all ten subtests of these Level 2 data in 2006 and 2016 were 0.94 and 0.93 respectively, and all item-total correlations without that item itself were 0.44 or higher (0.77, 0. [...] Table 2. The three- and two-dimensional models were not nested. The two-dimensional model we hypothesized was composed of the reading comprehension factor (including subtests of Reading, Radicals, Structure of compounds, and Meaning of compounds), and the writing accuracy factor (including subtests of Completion of compounds, Antonyms and synonyms, Homophones, Error correction, Kana suffixes, and Writing). In the latter group of subtests, examinees were required to write kanji characters accurately, whereas the items of the former did not require them to write a whole kanji character but rather to write kana letters, copy a part (radical) of a kanji character, or fill in the bubble.
CFAs with Level 2 data from 2006 showed that the RMSEA estimate for the three-dimensional model (0.047) was lower than those for the two- (0.070) and unidimensional models (0.070), and tests of close fit supported a close fit only for the three-dimensional model (p close > 0.05). Additionally, the AIC value for the three-dimensional model was lower than those for the other two. The CFI, TLI, and SRMR values indicated a good fit for all three models, though χ² statistics were significantly large due to the large sample sizes. In the three-dimensional model (Fig. 1), composite reliability coefficients for semantic comprehension (0.81) and writing accuracy (0.92) were adequate, and those for reading comprehension (0.74) and writing accuracy (0.92) were also acceptable in the two-dimensional model.
In line with the results of CFAs with the 2006 data, the RMSEA for the three-dimensional model with the 2016 data (0.055) was also lower than those for the two- (0.063) and unidimensional ones (0.062), though none of the models passed the test of close fit (all p close ≤ 0.05). Similarly, the lowest AIC value for the three-dimensional model was replicated with the 2016 data, and the CFI, TLI, and SRMR values also indicated good fits for all three models. Composite reliability coefficients for semantic comprehension (0.78) and writing accuracy (0.90) in the three-dimensional model and for reading comprehension (0.72) and writing accuracy (0.92) in the two-dimensional model were at a similar level to those with the 2006 data.

Replicability of model fit.
To examine the replicability of the good fit for the three-dimensional model, we conducted CFAs with the Level 2 data of the exam implemented at non-public sites on two dates and the Level Pre-2, 3, and 4 data at public sites in 2006 and 2016. The results of the CFAs are shown in Table 3.
The CFAs showed that the RMSEA estimates for the Level 2 data of the exam implemented at non-public sites at the later date in 2006 (0.43) and at both dates in 2016 (0.040, 0.054, respectively), and for the Level 3 data at public sites in 2016 (0.049), were consistent with a close fit (all p close > 0.05), and were acceptable in all other cases (all RMSEA ≤0.071). In addition, the CFI, TLI, and SRMR values with these data indicated a preferred fit for the three-dimensional model. Composite reliability coefficients with these data for writing accuracy were adequate (0.86 to 1.02). The coefficients for semantic comprehension were also acceptable (0.70 to 0.77) in most cases, though those with the Level 2 data of the exam implemented at the earlier date in 2016 (0.68), and the Level 3 (0.67) and Level 4 (0.68) of the 2016 data were slightly low.
Effects of age on the scores in each cohort. We conducted one-way ANOVAs for four age groups on the sums of z-scores in each of the three dimensions of Japanese kanji abilities, using comparable Level 2 data in 2006 and 2016, to examine the effects of age in each cohort. The results of the ANOVAs are shown in Table 4. [...] higher than those of high-school age children, and the scores of young adults were higher than those of university and high-school age examinees on all three dimensions (all p < 0.001). In addition, the scores of middle adults were higher than those of young adults and high-school and university age examinees on reading accuracy and semantic comprehension, in both 2006 and 2016 (all p < 0.001). In terms of writing accuracy, the scores of middle adults differed from those of high-school and university age examinees (all p < 0.001) but not from those of young adults in 2006 (p = 0.660), whereas significant differences among all of these age groups were shown in 2016 (all p < 0.001).

Effects of age on the correlations among dimensions in each cohort.
To examine the effects of age on integration of kanji abilities in each cohort, we compared correlations among dimensions by age group (Table 4). We investigated only the correlations between the dimensions of semantic comprehension and writing accuracy, because of the ceiling effect of the Reading subtest. In Level 2 of the 2006 data, 72.8% of middle adult examinees and 60.3% of young adults got 90% or higher on the Reading subtest. Similarly, 56.7% of middle adults and 47.3% of young adults got 90% or higher on the subtest in 2016.
The correlations between the sums of z-scores on subtests loaded by semantic comprehension and writing accuracy factors in university age examinees (r = 0.73) and young (r = 0.77) and middle adults (r = 0.77) were higher than that of high-school children (r = 0.70) in 2006 (z = 4.1, p < 0.001, z = 10.6, p < 0.001, z = 8.2, p < 0.001, respectively). In addition, significant differences between the correlations in high-school age examinees (r = 0.71) and the other groups (r = 0.73, r = 0.75, r = 0.74, respectively) were also shown in 2016 (z = 2.5, p = 0.01, z = 3.6, p < 0.001, z = 3.1, p < 0.001, respectively). In contrast, the correlations in young and middle adults were not higher than that of university age examinees in 2016 (z = 1.6, p = 0.12, z = 0.9, p = 0.37, respectively), whereas these differences were significant in 2006 (z = 7.0, p < 0.001, z = 5.1, p < 0.001, respectively). The correlations in middle adults were not higher than those of young adults in either the 2006 (z = 0.8, p = 0.45) or the 2016 cohort (z = 0.6, p = 0.52). In contrast to the higher correlations in the older groups whose scores were generally higher, supplementary analyses showed that the correlation in the high score group (r = 0.43, n = 16,441) was not higher than that in the low score group (r = 0.42, n = 16,417) in 2006 (z = 1.43, p = 0.15). In the 2016 data, the correlation in the examinees with high scores (r = 0.38, n = 8,077) was lower than that of people with low scores (r = 0.44, n = 7,893; z = 4.62, p < 0.001).

Discussion
This study of the most popular kanji exam in Japan tested three hypotheses concerning the dimensionality of Japanese kanji abilities and age effects on them in two cohorts ten years apart. Our results (1) supported the three-dimensional model of Japanese kanji abilities, including facets of reading, writing, and semantic comprehension (see Fig. 1); (2) showed that each ability, as well as the correlations between them, grew with increasing age from adolescence to adulthood; and (3) showed that the patterns of age-related increase in writing and in the orthography-semantics relationship differed between 2006 and 2016. These findings represent the first reported evidence of the multidimensional factor structure of Japanese kanji abilities, and the age/cohort differences between them.
Multidimensionality of Japanese kanji abilities. The good fit of the three-dimensional model for this large dataset supports the supposition that Japanese kanji abilities largely depend on the lexical route in dual-route models. It is assumed that multiple abilities reflecting phonological, orthographic, or semantic processing are needed to master and manage kanji characters that have multiple pronunciations, multiple meanings, and visual complexity.
The underlying basis or cognitive demands of literacy acquisition include not only universal but also language-specific factors 26. Previously, visual memory 27, visuospatial cognition 3, and morphological awareness 28 have been reported as cognitive predictors of Japanese kanji acquisition, in addition to relatively small contributions of phonological processing and rapid automatized naming, which are crucial in alphabetical orthographies 29. These reports suggest that a broad range of cognitive functions underpins multidimensional kanji abilities in Japanese. Decades ago, a low prevalence of developmental dyslexia was reported in Japan (0.1-2%) 30,31. However, as noted by Uno and coauthors 3, many more people than previously recognized may experience difficulties in kanji acquisition caused by a variety of cognitive atypicalities.
[...] in the two cohorts. First, whereas writing accuracy reached a peak in early adulthood in 2006, a further increase in this area was observed from early to middle adulthood in 2016. Second, whereas the correlations between writing accuracy and semantic comprehension also peaked in early adulthood in 2006, no increase in these correlations was observed after university age in 2016. These results indicate reduced kanji writing ability in young adults, as well as weakened orthographic-semantics relationships in adults, a decade later. A possible explanation for this is that the rapid spread of digital writing devices, such as PCs and smartphones 20, which provide the correct kanji or at least options from which to select it, has affected kanji writing ability in the past decade. Alternatively, or in addition, increased attention to learning English 18 or internationalization, which may have resulted in a decrease in the time and effort dedicated to learning kanji, could affect kanji skills.
Although the strong correlations between writing and semantic factors appear to be contrary to the distinctiveness of the dimensions, these close relationships are expected considering the highly consistent relationship between kanji orthography and semantics 32 . Even when the meaning of kanji is unknown, we can sometimes read them aloud. However, in such cases, we cannot single out the correct kanji character from a number of homophones. A decreased orthographic-semantics relationship suggests increased ambiguity in the differential use of homophonic kanji in current-day Japanese adults, which could lead to reduction in use frequency of specific kanji characters or possibly kanji itself.
Finally, our results imply that the habit of handwriting, once essential for daily tasks, may be advantageous for the acquisition of writing-related skills in Japanese. In the present day, digital writing devices are increasingly replacing handwriting not only in Japan but also worldwide; however, whether this technology should be applied to early literacy education is controversial 33,34. On the one hand, the ease of typing or other supportive capabilities of digital tools are seen to be advantageous for literacy learning, particularly in children with undeveloped motor skills 35 or reading/writing difficulties 36. On the other hand, it has been argued that the coupling of motor action and perception during handwriting can facilitate literacy acquisition, based on evidence from experimental 37, neuroimaging 38, and intervention studies 39. Our data support the latter view and imply that keeping the habit of handwriting may be important for integrated mastery of higher-level literacy skills in adults.
Limitations. First, in this retrospective analysis of data that were not obtained for research purposes, the sample was not randomly drawn from each age group of the general population. Differences in motivation for taking the exam may have affected scores. Second, this cross-sectional study cannot indicate whether the observed age-related differences reflect a developmental process. Finally, the ceiling effects of the Reading subtest possibly influenced the results of the CFAs. The distinctiveness of the reading dimension and its relationships with the other two dimensions need to be examined in future studies.

Conclusion
The current study showed new evidence of the multidimensional nature of Japanese kanji abilities composed of reading, writing, and semantic comprehension. The different pattern of age-related effects on the abilities between 2006 and 2016 cohorts suggested reduced kanji writing ability and stagnation in integrated mastery of kanji orthography and semantics in current-day Japanese adults. This decline appears against the backdrop of the rapidly spreading use of digital writing devices and/or increased attention to learning English or internationalization. These findings warrant further research on cognitive and neurobiological bases of kanji acquisition in Japanese people, and the effect of handwriting on literacy acquisition in children and adults acquiring Japanese or other orthographies.

Data availability
The data analyzed in this study are available from the corresponding author upon reasonable request.