From language to meteorology: kinesis in weather events and weather verbs across Sinitic languages

Huang, Chu-Ren; Dong, Sicong; Yang, Yike; Ren, He

doi:10.1057/s41599-020-00682-w


Article
Open access
Published: 04 January 2021


Humanities and Social Sciences Communications volume 8, Article number: 4 (2021)


Abstract

Interactions among the environment, humans and language underlie many of the most pressing challenges we face today. This study investigates the use of different verbs to encode various weather events in Sinitic languages, a language family spoken over a wide range of climates and with 3000 years of continuous textual documentation. We propose to synergise the many concepts of kinesis that grew from Aristotle’s original ideas to account for the correlation between meteorological events and their linguistic encoding. It is observed that the two most salient key factors of weather events, i.e., mass of weather substances and speed of weather processes, are the two contributing components of kinetic energy. Leveraging the linguistic theory that kinesis underpins conceptualisation of verb classes, this paper successfully accounts for the selection of verbs for different meteorological events in all Sinitic languages in terms of both language variations and changes. Specifically, weather events with bigger weather substances and faster weather processes tend to select action verbs with high transitivity. The kinesis driven accounts also predict the typological variations between verbal and nominal constructions for weather expressions. The correlation between kinesis and the selection of verbs is further corroborated by an experiment on the perception of native Sinitic language speakers, as well as analyses of regional variations of verb selections that do not follow general typological patterns. It is found that such typological exceptions generally correspond to variations in meteorological patterns. By explicating the pivotal role of kinesis in bridging weather events and the linguistic encoding of weather, this study underlines the role of cognition as the conceptualisation of physical and sensory inputs to sharable knowledge encoded by language.

Introduction: environment, human and language

The impact of human beings on the environment and the impact of environmental and climate changes on the future of humanity are two of the most pressing challenges for our current scholarly pursuits. However, it is impractical to get the required time-span and geographical scale of data to scientifically establish eco-environmental changes and/or impact causality, given the time it takes for the impact on the environment or humans to be fully documented. To tackle this shared challenge, the scientific community has proposed alternatives to purely quantitative-measurement studies. A notable alternative is to leverage traditional ecological knowledge (TEK, e.g., Pierotti and Wildcat, 2000; Huntington, 2000). That is, although we cannot afford to gather enough data for full cycles of eco-environmental changes, such experience and knowledge have already been collected and accumulated in the shared memory of the people who have lived in a particular ecological system for the past centuries or millennia. Thus, it is possible to form a comprehensive account by leveraging relevant TEK that has been converted to up-to-date forms of scientific knowledge representations.

The incorporation of TEK has been shown to benefit the research of sensitive and vulnerable environments (e.g., Berkes et al., 2000), biological and ecological education (e.g., Kimmerer, 2002) and Arctic science and climate change studies (e.g., Huntington, 2000; Parlee et al., 2005; Gearheard et al., 2010; Cuerrier et al., 2015; Savo et al., 2016). Also, recent ontology-based studies of TEK have provided various approaches that link diverse ecological and environmental information (e.g., Vossen et al., 2008; Ludwig, 2016; Huang et al., 2018). Interestingly, previous efforts to integrate TEK predominantly involve languages with an exclusively oral tradition. To the best of our knowledge, there has been no large-scale, text-based analysis of synchronic and diachronic data that could benefit from linguistic features, nor have there been studies focusing on the systematic correlations between the use of different linguistic devices and the meteorological events they describe.

Linguistic studies on language and environment, especially weather, started with the debate of linguistic relativism and later focused on the theory of linguistic typology based on weather and space expressions (e.g., Levinson and Wilkins, 2006; Eriksen et al., 2012). Coincidentally, one of the most frequently and hotly debated issues in this area involves the interpretation of weather expressions in the Eskimo-Aleut languages, especially snow terms in Inuit (e.g., Pullum, 1989; Kaplan, 2003; Regier et al., 2016). Unfortunately, many years of focused discussion on this relatively small set of data did not bring the two sides of linguistic relativism significantly closer to each other. These debates focused on the relation between the number of categories a language can describe and its environment. With rare exceptions, such as Cruikshank (2012), previous linguistic and linguistic anthropological studies on weather expressions did not link the linguistic system to meteorological knowledge of weather events.

This study, though not extracting TEK per se, attempts to provide a tool to bring the two above-mentioned approaches closer together to benefit both fields. We will look at both the selection of different verbs to describe different weather events, and the correlation between such selection and meteorological patterns. The data set we chose to work with is verbs in weather expressions in different Sinitic languages. The choice of this language family is crucial in three ways and helps us to establish an environment as close to a controlled study as possible, given the scale and complexity of the issues involved.

First, Sinitic languages are a language family with well-known and well-documented typological patterns. Working with the Sinitic languages allows us to focus on distributional patterns that do not follow the general patterns predicted by linguistic typology. This is a crucial research design that lays the foundation for the robust results that will be elaborated on further in the methodology section. Second, the language family exhibits the same degree of diversity as the Romance languages (Norman, 1988, p. 213) and has more than 1000 well-documented languages and dialects. It also has unbroken written documentation containing meteorological data for over 3000 years (Dong et al., 2020a). We can therefore compare weather expressions with attested morphosyntactic differences across languages and verify the validity of our hypothesis with diachronic data. Third, Sinitic languages spread over a wide range of climates consisting of 12 temperature zones, 24 moisture regions and 56 climatic sub-regions (Zheng et al., 2010), which yields more linguistic data on meteorological diversity. Therefore, Sinitic languages provide a good opportunity to establish the mechanisms underlying the correlation between the linguistic device (e.g., selection of verbs) and meteorological events in a significant language family. This research may then be used as the basis of comparison with other comprehensively studied languages to determine if the generalisations are cross-linguistic or language-specific.

Crucially, in order to bridge the knowledge systems of language and meteorology beyond the dependency relation studied previously, we propose to link the common-sense linguistic concept of kinesis with theories of kinetics. Note that the ideal model of a meteorological event is the movements of meteorological objects subject to some external forces, especially gravity. This is especially true for precipitation but can also apply to other meteorological events.

The concept of kinesis in daily language and in several disciplines is derived from Aristotle. The word originally meant motion and movement in Greek, and Aristotle classified kinesis under his notion of actuality. He also variously described it as energy and work. For the purpose of our current discussion, we can interpret kinesis as the physical reality of motion, which can be conceptualised as (1) the perception or experience of motion, (2) the force that can be translated into motion (as possessed by the object) and (3) the energy/work exerted by an external body to cause motion. When linguists (e.g., Hopper and Thompson, 1980, 1985) borrowed the concept of kinesis as the cognitive motivation for transitivity and the encoding of transitive verbs, they referred mostly to (1) and (2). The same conceptual premise applies to other studies showing that perception of physical properties accounts for several linguistic behaviours (e.g., Wolff and Song, 2003; Malt et al., 2014).

However, when physicists study kinetics (as the study of kinesis), they mostly focus on (2) and (3). Newton’s second law deals with kinesis as force and defines it as the rate of change of momentum, i.e., mass times acceleration (\(F = \frac{\Delta p}{\Delta t} = ma\)). The same equation can be transformed to the definition of kinetic energy (\(\mathrm{KE} = \frac{1}{2}mv^2\)) to represent (3). Note that the physical definition of kinesis (as force or as kinetic energy) can be broken down to the contribution of perceivable properties, i.e., mass and speed/acceleration. Based on this observation, we propose that, integrating several versions of definition/explication of kinesis, the perception of physical properties of kinesis is the cognitive basis of linguistic encoding. For meteorological events as physical motion with objects of different masses and speeds, it follows that the overall impression of kinesis/kinetic energy or the more salient perception of mass or speed correlates with different linguistic encoding schemes, including noun-verb choices and transitivity of verbs. Hence the goal of our current study is to resolve the following research question:

Can kinesis account for how meteorological events shape languages, including the choices and variations in encoding different meteorological events?
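As a brief reminder of the standard physics invoked above (a textbook step, not part of the original argument), the kinetic energy of a body can be obtained from Newton’s second law by computing the work done by a force \(F\) that accelerates a body of constant mass \(m\) from rest to speed \(v\) along a straight line. The integration variable \(v'\) is introduced here only for illustration:

\[
\mathrm{KE} = \int F\,\mathrm{d}x = \int m\,\frac{\mathrm{d}v'}{\mathrm{d}t}\,\mathrm{d}x = \int_{0}^{v} m\,v'\,\mathrm{d}v' = \frac{1}{2}mv^2
\]

Both force and kinetic energy thus depend only on mass and on speed or its rate of change, which are the two perceivable properties the argument relies on.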

Related previous studies and approaches

Weather and language

Typological research on weather and language saw renewed interest and several innovative studies recently. Eriksen et al. (2010, 2012) proposed a comprehensive framework of the language of weather, based on previous cross-linguistic studies on argument structure patterns of weather expressions, such as Ruwet (1991), Bartens (1995), Saarinen (1997), Mettouchi and Tosco (2011) and Salo (2011). The framework involves two typologies: a formal typology of weather constructions and a semantic typology of weather events. The formal typology consists of three major types, i.e., predicate type, argument type and argument-predicate type, depending on which element is primarily responsible for encoding the weather phenomenon. The semantic typology classifies the linguistic meanings of weather events as either dynamic or static, and further into four subtypes.

This dual module view of weather expressions allows for variety in selecting formal encoding types for a given event type (as assigned to a weather type), depending on the language. Following this framework, subsequent studies have examined how weather events are expressed in several languages. For instance, Andrason (2019) explicates the encoding mechanisms of the subtypes of weather events in the Polish language with the typologies of Eriksen et al. (2010, 2012). Similarly, Andrason and Visser (2019) analyse precipitation constructions in isiXhosa within such a framework. It is important to underline that this approach does not refer to any empirical knowledge of meteorology. Instead, it relies on the event types of weather expressions that are assumed to have already been linguistically encoded.

Weather expressions in Sinitic languages

Van Hoey’s (2018) study of Mandarin Chinese also follows Eriksen et al.’s (2010, 2012) framework. More recently, a series of studies (e.g., Dong, 2018, 2019; Dong et al., 2020c, forthcoming; Huang and Dong, 2020; Ren and Dong, forthcoming) examined a range of different meteorological events and their linguistic expressions in Mandarin and other Sinitic languages. Based on the morphosemantic and grammatical behaviours of weather words, chiefly for different forms of precipitation and other weather events involving water (e.g., rain, snow, fog, frost), these studies demonstrated the diversity of verbs in weather expressions, especially those encoded with the argument type. For example, in terms of directionality, weather events such as fog and dew could select verbs of falling, while they do not move downwards in reality (Dong et al., 2020c). In addition, contrary to the predictions of previous studies, unaccusative, unergative and transitive verbs can all be attested in weather expressions in Sinitic languages (Dong et al., 2020b). The results suggest that previous linguistic typology of weather expressions that focused on the prediction of different types of linguistic realisation may have limitations. These recent studies took a closer look at how meteorological events are conceptualised and classified as the basis for the prediction of types of linguistic expressions. This approach is summarised as a new typology, Typology of Meteorological Events (TyME, Dong et al., 2020a). TyME relies on two features, [±Process] and [±Material], to classify meteorological events and map the classification to linguistic expressions.

Kinesis and verbs

One of the fundamental issues in language and cognition is the conceptual basis of the noun-verb dichotomy. Research on this topic by philosophers (e.g., Quine, 1960), psychologists (e.g., Gentner, 1982), as well as many linguists and neuro-linguists, has a significant influence on the developments of each field and on each other. A new trend on this topic is to compare the noun-verb dichotomy in the context of other cognitive concepts. Barsalou et al. (2008) argue for a language model based on situated simulation in the spirit of embodied cognition. Strik Lievers and Winter (2018) show that concepts of eventivity play a role in the categorical encoding of different sensory modalities. We consider that the encoding of meteorological events is a set of highly relevant facts that have not yet been studied from this perspective. Weather expressions consist of a set of highly embodied events that are consistently given mixed noun/verb encoding in most languages in the world. While most previous studies on the noun-verb dichotomy take the embodied, object-like concept of noun as default, the embodied but eventive nature of meteorological events offers a different and potentially enlightening perspective to the cognitive basis of grammatical categories.

Although this study does not aim to resolve the noun-verb dichotomy per se, we designed our study with the possibility of providing new data to contribute to this topic. One important reason why previous studies typically started by assuming entities (i.e., referential form) as the basic concept is its intuitive interpretation; another is that the opposing concept of event is complex and not well-defined. However, theoretically, it is possible to view the eventive concept as opposite (and not diagonal) to the entity concept. Givón (1984, p. 52) argued that the nature of verbs, as opposed to nouns, was the lack of temporal stability. More recently, borrowing from formal ontology, Huang (2015, 2016) proposed that the [+V] ‘verby’ feature in linguistic theories stood for perdurant concepts, which are by definition time-dependent; ‘nouny’ concepts are, by contrast, endurant and independent of time.

On the other hand, Hopper and Thompson (1980) introduced kinesis as the cognitive basis of verbs, and it has been widely adopted in the linguistic studies of verbal semantics. Hopper and Thompson (1985) later elaborated that the semantic features of prototypical verbs were kinesis (movement occupying a certain amount of time), visibility (a tangible moving entity and process) and effectiveness (rate of movement over time). Adopting both lines of theories and with the observable effects of meteorological events in mind (i.e., the [±Process] and [±Material] features of TyME), we hypothesise that these observable elements of kinesis play a central role in the encoding of weather words.

Methodology and data

Recent research on the relation between environment, language and cognition has mostly involved either in vitro controlled experiments using a man-made environment (e.g., Christensen et al., 2016; Nölle et al., 2020), or a data analytic approach looking at the correspondences between two different systems without accounting for possible interactions that may impact either system (e.g., Everett, 2013; Palmer et al., 2017). In contrast with such paradigms, our methodology is inspired by Wilhelm Dilthey’s notion of human sciences as a field of scholarship (Makkreel and Rodi, 1990). He was driven by the quest to understand the complex issues and challenges of the human condition, and sought to integrate scientific methodologies based on empirical evidence. This approach amounts to a study based on empirical evidence from a structured context that crucially includes human experience and how human beings collectively understand these experiences, elaborating on the interpretation of Makkreel (2016).

Furthermore, interpretive tools are adopted from several disciplines. This approach is adopted because the nature and complexity of our research question do not lend themselves to a clean, causal analysis following the paradigm of natural sciences. Instead, a structured context of human experience is constructed with the explicit goals of facilitating the interpretation and understanding of the human condition with the help of corroborating scientific evidence. The application of inter-disciplinary approaches to collective human behaviour data for complex societal issues is similar in spirit to the web-usage driven studies in social sciences, such as Bokányi et al.’s (2016) study of how word frequency reflects demographic differences found in the United States. A similar approach has also been adopted in Li et al. (2020) to establish correlations between important socio-economic events and historical changes of lexical competition, by investigating the co-development relations between the pair of near synonyms ‘gamble’ and ‘game’. Hence in this methodological section, we will focus on how the structured context is construed with empirical data, and what scientific tools and corroborating evidence are used to help us provide an interpretive account.

Structured context of empirical data

One of the often-raised challenges to studies of collective human behaviour is what is called Galton’s problem (Naroll, 1961; Roberts and Winters, 2013). One version of Galton’s problem that concerns our current study is how to rule out external dependencies while still examining the sum of complex human behaviour that, in general, cannot be disassociated from its socio-cultural-historical contexts. This indeed would be an inherent challenge to the interpretive human sciences that Dilthey advocated. Our research is designed to minimise such concerns with a carefully constructed and structured context. In particular, with the goal of establishing a correlation between meteorological events and the variations of weather expressions, the challenge is to eliminate the potential influence of socio-cultural factors (e.g., both inheritance and contact).

We meet this challenge by contextualising our study within the detailed and well-documented knowledge of the typology of Sinitic languages. For all major Sinitic languages (including Mandarin, Wu, Cantonese, etc.), it is well known that the most consistent and significant differences are between the Northern and Southern types, roughly demarcated by the Yangtze River. This generalisation is, for instance, well corroborated by the isoglosses presented in the dialectal geographical studies of Cao (2008) and Xiang and Cao (2005). Thus, in the structured context, we focus on the significant exceptions to this north/south dichotomy, i.e., taking typology as the sum of internal (dynamic change within a language) and external (contact, migration, etc.) factors. For significant regional variations that fail to follow the general typological patterns, we can safely conclude that the best attested external dependencies did not contribute to these collective behaviours. In addition, if we can show that the same variable accounts for the full range of idiosyncrasies, then by Occam’s razor, we interpret that the functional variable in the proposed hypothesis is the likely cause of these variations. In this study, we focus on exceptional northern Chinese weather expressions that follow southern patterns, and vice versa. We also look at significant exceptions in a continuous region that is otherwise homogeneous. Of course, we are aware that small-scale perturbations are likely due to other external factors and do not include them in our current study.

Another crucial aspect of the structured context of this study is the ontology-lexicon interface approach (Vossen et al., 2008; Huang et al., 2010). The crucial assumption is the system-to-system correspondence between the traditional knowledge system and the formal scientific knowledge system. This mapping relation between knowledge systems is the foundation for the TEK approach, because piecemeal information such as isolated factoids cannot be easily incorporated in a new system and rarely leads to the discovery of new knowledge. A system-to-system interface means that any node-to-node mapping has the potential of creating new links and new information. Furthermore, the structured context of a connected conceptual network provided by the knowledge system also means that any node-to-node mapping will have implications that can be verified to corroborate the original hypothesis. For example, in the structured context of kinesis as motion, mass and speed are crucial contributing elements to kinetic energy. In this structured context, the kinesis hypothesis predicts that mass and speed are contributing factors in linguistic events. Thus, it provides additional ways to verify the implications and to corroborate the original hypothesis.

The third aspect of the structured context is the cognitive basis of linguistic encoding of verbs, as discussed above. Given that meteorological events are typically embodied and imageable, their verbal encoding should be fairly straightforward. Yet, in addition to frequent nominal encoding, verbal encoding also varies significantly in different languages. For instance, Dong et al. (2020b) report that many Sinitic languages use the highly transitive verb 打 dǎ ‘to hit’ in 打霜 dǎshuāng hit-frost ‘to frost’, while using directional verbs such as 下 xià ‘to fall’, for snow. Given that the movement of snow is faster and easier to observe, this preference of a non-volitional verb with low transitivity cannot be easily accounted for. To the extent that there is a cognitive basis for noun-verb encoding, the high variability of different verbs for similar meteorological events in related languages remains a puzzle to be solved.

This structured context approach is essential to our current study and to empirical human sciences. This approach also offers an important alternative to tackle the interaction of separate cause-effect relations. That is, although it is not possible to study such issues in an experimental paradigm that would establish direct causal relations, it is still possible to make and prove hypotheses by putting the relevant issues in several linked structured contexts. As each structured context is self-contained and likely independent of each other, a hypothesis that is corroborated in these contexts is not only strongly supported but also has the potential to lead to new knowledge discovery in each of the structured contexts. Based on this research design, we will extract data as appropriate for each structured context, and apply research methodology as befitting each structured context to test the hypothesis.

Lexical choices in Sinitic languages: a study in the structured context of ontology-lexicon interface

Weather expressions from 221 Sinitic languages/dialects are extracted and compiled based on Li (1993–2003), Tao (2007), Xu and Miyata (1999) and Zhang and Mo (2009). Note that given the ongoing dispute on how to differentiate languages and dialects within the Sinitic languages, we do not make explicit reference to either. Suffice it to say that there are 221 data points of (mostly) dialects. The data cover nine common weather phenomena: rain, snow, hail, fog, dew, frost, wind, thunder and lightning. All these meteorological events are compatible with argument type encoding in modern Sinitic languages (Dong, 2019), hence suitable for the study of the verb types they select. The transitivity of selected verbs was then analysed in terms of the kinesis of their correlated weather events.

Among the 221 languages/dialects, 216 contain somewhat generic weather expressions, e.g., ‘raining average rain’ but not ‘heavy rain’. See our datasets in the Dataverse repository for the full data. We classified the languages/dialects into three groups based on the transitivity level of the selected verb, regardless of word order variations: high transitivity, low transitivity and both. The verbs with high transitivity include 打 dǎ ‘to hit’, 拍 pāi ‘to slap’, 扯 chě ‘to pull’, etc. Verbs with low transitivity include 下 xià ‘to fall’, 落 luò ‘to fall, to drop’, 起 qǐ ‘to rise; to begin’, etc. The distribution of degrees of transitivity levels for each weather event, in terms of attested percentages of languages/dialects, is provided in Table 1. For example, 24.2% of languages use high transitivity verbs to encode the occurrence of dew, while 68.2% use low transitivity verbs, and 7.6% use both kinds.
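The grouping and percentage calculation described above can be illustrated with a minimal sketch. It assumes that each data point lists the verbs attested for one weather event; the verb inventories contain only the examples named in the text (the source lists end with ‘etc.’), and the function names, dialect labels and toy records are hypothetical rather than taken from the published dataset.

```python
from collections import Counter

# Verb inventories limited to the examples given in the text (assumption:
# the full inventories in the published dataset are longer).
HIGH_TRANSITIVITY = {"打", "拍", "扯"}
LOW_TRANSITIVITY = {"下", "落", "起"}


def classify_dialect(verbs):
    """Return 'high', 'low' or 'both' for the verbs one dialect uses for one event."""
    has_high = any(v in HIGH_TRANSITIVITY for v in verbs)
    has_low = any(v in LOW_TRANSITIVITY for v in verbs)
    if has_high and has_low:
        return "both"
    if has_high:
        return "high"
    if has_low:
        return "low"
    return None  # verb outside the toy inventories: left unclassified


def transitivity_distribution(records):
    """Percentage of dialects per transitivity group for a single weather event."""
    labels = [classify_dialect(verbs) for verbs in records.values()]
    labels = [label for label in labels if label is not None]
    total = len(labels)
    if total == 0:
        return {group: 0.0 for group in ("high", "low", "both")}
    counts = Counter(labels)
    return {g: round(100 * counts[g] / total, 1) for g in ("high", "low", "both")}


# Hypothetical toy records for one event: dialect label -> attested verbs.
toy_records = {
    "dialect_A": ["打"],
    "dialect_B": ["下"],
    "dialect_C": ["打", "下"],
    "dialect_D": ["落"],
}
distribution = transitivity_distribution(toy_records)
print(distribution)
```

Word order is ignored here because only the identity of the verb is inspected, which mirrors the statement that classification is made regardless of word order variations.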

Table 1 Distribution of transitivity degrees of verbs in Sinitic languages (%).

An experiment on the cognitive basis of weather verb encoding

Although the transitivity variations could be influenced by weather types, the cognitive motivation for the encoding of verbs needs to be independently verified. We designed an experiment using several made-up novel weather nouns. Since the informants have no prior knowledge of these weather events, their choice of encoding verbs and/or interpretations should depend on the controlled conditions of the experiment. More specifically, this study investigated the role of mass and speed, the components of kinetic energy, in the selection of weather verbs with different degrees of transitivity: 打 dǎ ‘to hit’, 起 qǐ ‘to rise’, 上 shàng ‘to rise’ and 下 xià ‘to fall’. Note that these four verbs are frequently used in weather expressions in standard Mandarin Chinese and are familiar to all speakers of Chinese (Sinitic languages) regardless of their own language background (e.g., Cantonese, Hakka, Mandarin, or Min). The use of these common verbs is crucial as any language/dialect specific verbs may receive varying interpretations in terms of kinesis and transitivity depending on each informant’s multilingual background. Moreover, a verb not commonly used for weather events could introduce a confounding factor to the preference.

To examine the mass and speed of each verb, we designed two conditions. In the first condition, we explicitly asked the informants to judge the mass and the speed involved in the general events of the four verbs on a four-point scale: ‘very heavy/fast’, ‘heavy/fast’, ‘light/slow’ and ‘very light/slow’. In the second condition, we created pseudo-weather events encoded with the four verbs, respectively, and asked the informants to judge the mass and the speed of these weather events on the same four-point scale. The two conditions will be referred to as ‘direct’ and ‘context’, respectively. There were 16 experiment questions and 20 filler questions, which were divided into nine blocks (four questions per block) following a Latin square design and were randomly presented to the informants. For the full questionnaire, the cleaned raw data and the analysis script, see our datasets in the Dataverse repository.
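The source does not say how the Latin square was constructed or how items were assigned to blocks. Purely as an illustration, the sketch below uses a cyclic rotation, which is one common way to obtain a Latin square, so that each of the nine blocks appears exactly once in every presentation position across nine orders. The question identifiers, the random seed and the function names are hypothetical.

```python
import random


def cyclic_latin_square(n):
    """Row r is 0..n-1 rotated by r, so each symbol occurs once per row and column."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def is_latin_square(square):
    """Check that every row and every column contains each symbol exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


N_BLOCKS = 9             # from the text
QUESTIONS_PER_BLOCK = 4  # from the text

# Hypothetical IDs: 16 experimental (E) and 20 filler (F) questions.
questions = [f"E{i}" for i in range(1, 17)] + [f"F{i}" for i in range(1, 21)]
rng = random.Random(0)   # fixed seed only to make this sketch reproducible
rng.shuffle(questions)   # assumption: items are assigned to blocks at random
blocks = [
    questions[i * QUESTIONS_PER_BLOCK:(i + 1) * QUESTIONS_PER_BLOCK]
    for i in range(N_BLOCKS)
]

orders = cyclic_latin_square(N_BLOCKS)


def presentation_order(informant_index):
    """Question sequence for one informant, cycling through the rows of the square."""
    row = orders[informant_index % N_BLOCKS]
    return [question for block in row for question in blocks[block]]


print(presentation_order(0))
```

A cyclic square balances block position but not which block precedes which; a design that also balances order effects would need a different construction.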

We conducted the experiment online and invited our informants to attend the experiment remotely. Note that crowdsourcing-based tests were shown to have comparable results with laboratory-based tests (e.g., Wang et al., 2015), provided that they are conducted following quality control protocols (e.g., Wang et al., 2017; Yang et al., 2018). In this study, we removed all the responses from an informant if (1) the time spent on the test was longer than 514 s (the upper bound; the lower bound is negative in our data); (2) for each subset of questions, only one point on the scale was used; or (3) the informant was not a native speaker of Sinitic languages. Prior to the experiment, ethical approval was obtained from the Human Subjects Ethics Sub-committee (HSESC) of the Hong Kong Polytechnic University (Reference #: HSEARS20190102001). All informants gave their consent to participate in the experiment.
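A minimal sketch of how exclusion criteria of this kind could be applied is given here. It is not the published analysis script: the record layout and field names are hypothetical, the three criteria are treated as alternatives (any one of them removes the informant), and criterion (2) is read as requiring a single scale point in every subset of questions. Only the 514 s threshold comes from the text.

```python
MAX_DURATION_S = 514  # upper bound on completion time reported in the text


def uses_single_scale_point(ratings_by_subset):
    """True if, in every subset of questions, only one distinct rating was given."""
    return all(len(set(ratings)) == 1 for ratings in ratings_by_subset.values())


def is_valid(informant):
    """Apply the three exclusion criteria; any one of them removes the informant."""
    too_slow = informant["duration_s"] > MAX_DURATION_S
    flat_ratings = uses_single_scale_point(informant["ratings_by_subset"])
    non_native = not informant["native_sinitic"]
    return not (too_slow or flat_ratings or non_native)


# Hypothetical informant records (ratings on the four-point scale coded 1-4).
informants = [
    {"id": "P1", "duration_s": 300, "native_sinitic": True,
     "ratings_by_subset": {"mass": [1, 3, 4, 2], "speed": [2, 2, 3, 4]}},
    {"id": "P2", "duration_s": 600, "native_sinitic": True,
     "ratings_by_subset": {"mass": [1, 3, 4, 2], "speed": [2, 2, 3, 4]}},
    {"id": "P3", "duration_s": 250, "native_sinitic": True,
     "ratings_by_subset": {"mass": [3, 3, 3, 3], "speed": [2, 2, 2, 2]}},
    {"id": "P4", "duration_s": 250, "native_sinitic": False,
     "ratings_by_subset": {"mass": [1, 2, 3, 4], "speed": [4, 3, 2, 1]}},
]

valid_ids = [p["id"] for p in informants if is_valid(p)]
print(valid_ids)
```

Under the alternative reading of criterion (2), in which a single flat subset is enough to exclude, `all` in `uses_single_scale_point` would be replaced by `any`.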

For the data analysis, we adopted the ordinal logistic regression model, which assumes the hierarchy but not the distance of the levels within the dependent variable (Harrell, 2015). In model construction, either Mass or Speed was included as the dependent variable, while Verb and Condition were included as the independent variables. To examine whether language background and demographic information affected the ratings, we added Language and Age as independent variables. For the variable Language, we roughly divided participants into two groups of northern and southern, according to their ‘dialectal’ linguistic background, i.e., the regional Sinitic languages they use. Since it has been well-documented in Chinese dialectology that many isoglosses roughly follow the north vs. south pattern along the line of Yangtze River (e.g., Cao, 2008), we examine in this experiment the potential differences in kinesis perception by the northern vs. southern Sinitic language speakers. The models were fitted with the ‘MASS’ package (Venables and Ripley, 2002) in R (R Core Team, 2020). The figure was plotted with the ‘ggplot2’ package (Wickham, 2016). Likelihood ratio tests were used to determine whether the effects of independent variables reached significance.
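To make concrete what ‘hierarchy but not distance’ means, the sketch below computes category probabilities under the cumulative-logit (proportional odds) form that the `polr` function of the ‘MASS’ package uses, namely logit P(Y ≤ k) = ζ_k − η. It is written in Python for illustration only and is not the analysis script; the cut-points, the linear predictors and the labels `verb_A` and `verb_B` are invented values, not fitted estimates.

```python
import math


def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(eta, cutpoints):
    """P(Y = k) for an ordinal response under a cumulative-logit model.

    P(Y <= k) = logistic(cutpoint_k - eta), with cutpoints strictly increasing.
    Only the order of the categories matters; the spacing between them is
    carried by the estimated cutpoints rather than assumed in advance.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))]
    return probs


# Four-point scale from the text, ordered from lowest to highest.
LEVELS = ["very light/slow", "light/slow", "heavy/fast", "very heavy/fast"]

# Hypothetical values: three cutpoints separate four ordered levels.
cutpoints = [-1.0, 0.0, 1.5]
linear_predictor = {"verb_A": 0.0, "verb_B": 1.2}  # illustrative, not estimates

probabilities = {
    verb: category_probabilities(eta, cutpoints)
    for verb, eta in linear_predictor.items()
}
for verb, probs in probabilities.items():
    print(verb, dict(zip(LEVELS, (round(p, 3) for p in probs))))
```

A larger linear predictor shifts probability mass towards the higher categories, which is how an effect of Verb or Condition on the Mass or Speed ratings would show up in such a model.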

After the data cleaning procedures, 367 informant responses, out of 564 informants, were identified as valid. Significant effects were found for the rating of Mass and Speed when controlled for either Verb or Condition (all p-values < 0.001). This suggests that the mass and the speed properties of the tested verbs are perceived differently by native speakers in the two conditions. No effects of Language or Age were found in the models, revealing that native speakers of Sinitic languages share similar mental representations. As shown in Fig. 1, which presents the ratings of Mass and Speed across verbs and conditions, there is an obvious hierarchy of the verbs in terms of Mass and Speed (from high to low): 打 dǎ > 下 xià > 上 shàng > 起 qǐ. Moreover, although ratings in the two conditions were statistically different, the above hierarchy was unaffected.

**Fig. 1: Ratings of *Mass* and *Speed* across conditions and verbs by native Sinitic language speakers.**

Correlation between the distribution of meteorological events and weather expressions in Sinitic languages

Lastly, with the encoding preference of mass and speed verbs attested and the wide range of variations of weather verb encoding in Sinitic languages documented, we explore possible correlations between transitivity of verbs and kinesis in weather events. Since the overall patterns of weather expressions in Sinitic languages arise from the accumulation of multiple factors apart from kinesis, including the shared ancestry and a history of contact and borrowing, it is difficult to attribute the selection of verbs exclusively to either a historical relationship (e.g., inheritance or contact) or a functional relationship (e.g., kinesis in terms of mass and speed). As discussed above, we pre-empted Galton’s problem in this study by looking at structured context only. That is, we take the typological distribution of Sinitic languages as given, as such distribution is the result of the sum of historical and functional relationships. Given this structured condition, we then focus on the inconsistencies in the selection of verbs among related languages in adjacent geographical areas. Additional typological knowledge, such as the northern/southern dichotomy and the central geographical areas of a certain Sinitic language, are also incorporated in our investigation. The assumption is that smaller-scale regional variations that deviate from an established pattern may indicate functional fluctuations. Once significant variations, not predicted by the typological distribution, are identified, we compare them with the corresponding meteorological map in two structured contexts: whether there are general correspondences in geographical areas, and whether the exceptional weather word encoding is consistent with the prediction of kinesis driven encoding.

Discussion and analysis of the results from three studies

Mechanisms for the use of action verbs

As shown in Table 1, precipitation events, i.e., rain, snow and hail, are almost uniformly expressed with verbs meaning to fall, which show low transitivity. The other weather phenomena, however, utilise verbs with high transitivity at different levels. Our major findings are as follows.

First, although frost, dew and fog are all condensed water, they exhibit different tendencies in this regard. Frost is encoded with verbs with high transitivity in 56.1% (including ‘High’ and ‘Both’ in Table 1) of the Sinitic languages we investigated, mostly in the form of 打霜 dǎshuāng hit-frost ‘to frost’. Dew and fog were also found to co-occur with such verbs, e.g., 扯露 chělù pull-dew ‘to dew’ in Guiyang Chinese, 打露 dǎlù hit-dew ‘to dew’ in Changsha Chinese, 打雾 dǎwù hit-fog ‘to fog’ in Lichuan Chinese and 拉雾 lāwù drag-fog ‘to fog’ in Xining Chinese, but the percentage is 31.8% for dew and 7.9% for fog, both much lower than that for frost.

Second, lightning expressions have the largest proportion of verbs with high transitivity in Sinitic languages. According to our data, the verbs denoting the lightning events in Sinitic languages can be divided into two groups: action verbs with high transitivity such as 打 dǎ ‘to hit’, 扯 chě ‘to pull’ and 划 huá ‘to scratch’, and low transitivity change-of-state verbs meaning ‘to flash’, such as 闪 shǎn, 烁 shuò, 亮 liàng and 熠 yì. The high transitivity group appears in 87.3% of the languages, as shown in Table 1.

Third, according to our data, it is common for Sinitic languages to use different verbs for rain with different intensity, e.g., average rain vs. heavy rain. Volitional verbs with high transitivity are used for heavy rain, e.g., 跑 pǎo ‘to run’ and 打 dǎ ‘to hit’ in Wuhan Chinese, 压 yā ‘to press’ in Ningbo Chinese and 做 zuò ‘to do’ in Haikou Chinese. On the other hand, only non-volitional verbs meaning ‘to fall’, such as 下 xià and 落 luò are used to denote raining without any particular description.

Fourth, Sinitic languages also tend to use verbs with high transitivity to express strong wind. While 53.5% of high transitivity verbs are used in describing average wind, we found that 15 Sinitic languages have lexical items for blowing strong wind, and all of them adopt high transitivity verbs, such as 打 dǎ ‘to hit’ in Guangzhou Chinese, 做 zuò ‘to do’ in Ningbo Chinese and 拍 pāi ‘to slap’ in Fuzhou Chinese.

Based on the above findings, we can generalise a rule that verbs with high transitivity tend to occur in weather expressions that involve big weather substances and/or fast weather processes. Fog droplets, being around 0.01–0.02 mm in diameter (Houghton, 1932), are so small that they are suspended in the air. Dew droplets are bigger, ranging from about 0.035 mm in diameter at their initial stage to about 0.2 mm at sunrise (Hughes and Brimblecombe, 1994). And obviously, frost that includes ice crystals is much bigger and heavier than fog and dew droplets. So, their percentages of high transitivity verb usage are in direct proportion to their size or mass.
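
The size differences quoted above translate into much larger mass differences, because the mass of a spherical droplet grows with the cube of its diameter. The Python sketch below works this out for the diameters cited in the text. It assumes perfectly spherical droplets of liquid water with a density of 1000 kg/m³, which is a simplification, and the function name `droplet_mass_kg` is introduced here for illustration only.

```python
import math

WATER_DENSITY = 1000.0  # kg/m^3, assumed density of liquid water

def droplet_mass_kg(diameter_mm):
    """Mass of a spherical water droplet with the given diameter in mm."""
    radius_m = diameter_mm / 1000.0 / 2.0
    volume_m3 = (4.0 / 3.0) * math.pi * radius_m ** 3
    return WATER_DENSITY * volume_m3

# Diameters cited in the text (mm).
droplets = {
    "fog (upper bound)": 0.02,
    "dew (initial stage)": 0.035,
    "dew (at sunrise)": 0.2,
}

fog_mass = droplet_mass_kg(droplets["fog (upper bound)"])
for name, d in droplets.items():
    m = droplet_mass_kg(d)
    print(f"{name}: {m:.3e} kg, {m / fog_mass:.1f} times the fog droplet")
```

Under these assumptions, a dew droplet at sunrise is about a thousand times as massive as the largest fog droplet, although its diameter is only ten times as large.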

As for lightning, the light generated can be perceived as a movement with a very high speed; hence, its expressions are also associated with high transitivity verbs. As for rain, heavy rain and average rain differ mainly in terms of rainfall intensity, which is measured by the rainfall over a given area for a given time (Ahrens, 2012, p. 144); thus, the size of raindrops and the intensity of precipitation are crucial. Therefore, mass and speed/rate also account for the preference for high transitivity verbs when describing heavy rain. Lastly, the 15 locations with independent entries for strong wind in dictionaries are all located in China’s south-eastern coastal or near coastal areas that are usually hit by typhoons (Yang and Lei, 2004), and traditionally many people there make a living from sea fishing and ocean trade (Lin, 1994; Lou and Gu, 2005), which may lead to the experience of fiercer wind offshore. Thus, it can be seen that the experience of winds as high kinesis weather events underlies their lexicalisation and the selection of verbs.

Note that precipitation events tend not to take high transitivity verbs, even though they have relatively big weather substances and fast processes, because their palpable and consistent downward movement is prominent in conceptualisation, unlike fog, dew, frost and thunder without (obvious) movement, or wind and lightning with inconsistent movement. This conceptual differentiation corresponds to the contribution of acceleration in Newton’s second law and change in speed in kinetic energy. Another aspect regarding thunder should be noted. Technically, thunder is the sound wave created during the explosive expansion of air caused by lightning (Ahrens, 2012, p. 290), so it could be described with verbs denoting sound-making activities, such as 响 xiǎng ‘to make sound’, 吼 hǒu ‘to roar’, 叫 jiào ‘to shout’ and 哀 āi ‘to whine’ in Sinitic languages. As a sound, it is difficult to measure its mass or speed. In fact, its usage with high transitivity verbs comes from its intertwined relation with lightning in the conceptualisation by Chinese people. Thunder and lightning normally accompany each other, and Chinese people usually say that the thunder strikes people or animals, while in fact, it is the lightning that strikes them.

A question then ensues: why do weather phenomena with bigger mass and higher speed tend to choose the prototypical action verbs with high transitivity such as 打 dǎ ‘to hit’? We have mentioned that the prototypical properties of verbs are kinesis, visibility and effectiveness (Hopper and Thompson, 1985). As mentioned above, we hypothesise that the linguistic conceptualisation of kinesis can be linked to the physical measurement of kinetic energy, the energy possessed by a moving object. The kinetic energy (KE) of a moving object of mass m travelling at speed v is given by \(\mathrm{KE} = \frac{1}{2}mv^2\). Based on the theories in linguistics and physics discussed above, we propose that the propensity for a weather phenomenon to select action verbs correlates with its kinetic energy. In other words, a weather phenomenon with a more sizable weather substance will have bigger mass and will be more visible, and a phenomenon exhibiting a faster movement shows higher effectiveness and visibility, both of which yield higher KE (which indicates that the three features of prototypical verbs can actually be reduced to kinesis by Occam’s razor). And the weather phenomenon with higher KE is more likely to select a verb with high transitivity. That is, other things being equal, bigger weather substances with faster movements have higher KE, are more kinetic, visible and effective, and tend to select action verbs with high transitivity, which, in turn, also prototypically exhibit kinesis, visibility and effectiveness.
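
As a worked illustration of the relation between mass, speed and kinetic energy, the following Python sketch shows that the two components contribute differently: kinetic energy grows linearly with mass but quadratically with speed. The masses and speeds are hypothetical values chosen for easy arithmetic, not measurements of any weather event.

```python
def kinetic_energy(mass_kg, speed_m_s):
    """Kinetic energy in joules: KE = 1/2 * m * v**2."""
    return 0.5 * mass_kg * speed_m_s ** 2

base = kinetic_energy(1.0, 2.0)          # hypothetical reference object
double_mass = kinetic_energy(2.0, 2.0)   # same speed, twice the mass
double_speed = kinetic_energy(1.0, 4.0)  # same mass, twice the speed

print(base, double_mass, double_speed)  # 2.0 4.0 8.0
```

Doubling the mass doubles the kinetic energy, whereas doubling the speed quadruples it.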

The above analysis supports the role of kinesis in accounting for how meteorological events shape languages. Note that the probability of lexical selection in weather expressions is not claimed to be accurately measured by the formula of kinetic energy. Our proposal primarily aims to underline the importance of kinesis in bridging linguistics and meteorology. Some recent studies have shown that kinetic energy can also account for the selection of verbs in weather expressions of other languages. For example, according to the data of Andrason (2019), high transitivity verbs such as bić ‘to beat’ and rozedrzeć ‘to tear apart’ may also co-occur with weather nouns for lightning in the Polish language.

An account for ‘verbhood’ from kinetic energy

The kinetic energy model can be further applied to the grammatical category of verb, i.e., to provide an account for the concept of ‘verbhood’ based on the mass and speed involved in an event. As analysed above, kinetic energy and the semantic feature of kinesis of prototypical verbs are highly compatible. In other words, prototypical verbs are associated with substances with a bigger size and higher speed. Therefore, it is reasonable to assume that an event with higher kinetic energy is more likely to be assigned a verbal category. Evidence has been found in Archaic Chinese to support that the grammatical categories of weather words can be reliably predicted by the size and perceived mobility of the weather substances.

According to Ren (2018), 雨 yǔ ‘to rain’, 雪 xuě ‘to snow’ and 雹 báo ‘to hail’ can function as weather verbs in Archaic Chinese, while 雾 wù ‘fog’, 露 lù ‘dew’ and 霜 shuāng ‘frost’ have almost no verbal usage. Given an equal gravitational force and Galileo’s observation that all objects fall at the same speed, the kinetic energy of a falling object is directly correlated to its mass. However, in the real world, air provides resistance and negates kinetic energy. Thus, objects with the smallest kinetic energy, i.e., the smallest mass, are likely to have their kinetic energy greatly reduced or even cancelled by the resistance of the air. This is exactly the kind of physical condition required for the suspension weather events (e.g., fog) to occur. Similarly, water on fixed surfaces (e.g., dew and frost) may not be able to fall, either. Therefore, fog, dew and frost do not show visible movement, and moreover, dew and fog lack sizable substances, making them less likely to function as verbs than precipitation.
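
The suspension argument can be made concrete with Stokes' law, which gives the terminal fall speed of a small sphere in a viscous fluid. The Python sketch below is only an order-of-magnitude illustration: Stokes' law holds for very small droplets at low Reynolds number, and the values for air viscosity, air density and water density are typical textbook figures assumed here rather than taken from the sources cited in this paper.

```python
G = 9.81                 # m/s^2, gravitational acceleration
WATER_DENSITY = 1000.0   # kg/m^3 (assumed)
AIR_DENSITY = 1.2        # kg/m^3 (assumed, near sea level)
AIR_VISCOSITY = 1.8e-5   # Pa*s (assumed, air at about 20 degrees Celsius)

def stokes_terminal_speed(diameter_mm):
    """Terminal fall speed (m/s) of a small water droplet in still air.

    v = 2 * (rho_water - rho_air) * g * r**2 / (9 * mu)
    Valid only for small droplets (low Reynolds number).
    """
    radius_m = diameter_mm / 1000.0 / 2.0
    density_difference = WATER_DENSITY - AIR_DENSITY
    return 2.0 * density_difference * G * radius_m ** 2 / (9.0 * AIR_VISCOSITY)

for diameter_mm in (0.01, 0.02):  # fog droplet diameters cited in the text
    v = stokes_terminal_speed(diameter_mm)
    print(f"{diameter_mm} mm droplet: about {v * 100:.2f} cm/s")
```

With these assumed values, fog droplets settle at roughly a centimetre per second or less, slow enough for even weak air currents to keep them suspended.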

As for phenomena without tangible substances, e.g., thunder and lightning, speed is the most important factor. Since thunder and lightning are conceptualised in an intertwined manner as analysed, they both should have high perceived mobility, which leads to higher kinetic energy. As also shown in Ren (2018), 雷 léi ‘to thunder’ and 电 diàn ‘lightning flashes’ were encoded as verbs in Archaic Chinese. The cross-linguistic investigation conducted in Dong et al. (2020a) can lend support to this hypothesis: fog, dew and frost tend to be encoded as nouns across languages, and precipitation events can function as verbs in more languages than fog, dew and frost. Hence, the kinetic energy model can account for how meteorological events shape languages; this can be seen in the selection of verbs and the corresponding grammatical categories to represent such events.

Apart from kinesis, multiple factors, including language-specific ones, may affect the final realisation. A good example is that the weather words in Archaic Chinese mentioned above are all nouns in modern Sinitic languages. Hu (2005) argued that the Chinese vocabulary had undergone an essential change during the late Archaic Chinese and Middle Chinese periods: from implying to presenting. For example, an act was previously implied in a verb but is now presented in a VP. Thus, 誓 shì ‘to vow’ changed to 发誓 fāshì express-vow ‘to make a vow’, and similarly, 雨 yǔ ‘to rain’ changed to 下雨 xiàyǔ fall-rain ‘to rain’.

Evidence from experiments

Based on the results of our experiment, as shown in Fig. 1, the verbs are perceived to vary in mass and speed, both in general use and in weather expressions, and such knowledge is shared by native speakers of Sinitic languages. The verb 打 dǎ ‘to hit’ has the highest level of mass and speed, which corresponds to its high transitivity. The results support our hypothesis that the concept of kinesis, as measured by kinetic energy, can bridge meteorological and linguistic behaviours, i.e., mass and speed underlie both the perception of 打 dǎ ‘to hit’ and the perception of heavy frost, fast lightning, intense rain and strong wind, thus accounting for their frequent combination. Since no effects of Language (i.e., speakers of different linguistic backgrounds) were found in the models, the perception of verbs cannot be predicted by linguistic background of the informants. This provides further evidence to rule out non-functional factors and support the functional account based on the concept of kinesis. From a synchronic perspective, at least, the kinesis account is corroborated by the cross-linguistic uniformity. Our results are in line with the relevant findings in Britton and Huang (2019) that the degrees of embodiment and imageability of 打 dǎ ‘to hit’ are higher than those of 上 shàng ‘to rise’, 上升 shàngshēng ‘to rise’, 起来 qǐlái ‘to rise’ and 下降 xiàjiàng ‘to descend’, as shown in Table 2 (based on a scale of 1 to 7). In other words, the verb 打 dǎ ‘to hit’ has a closer relation to bodily experience and involves more obvious movements, coinciding with its high level of kinesis.

Table 2 Experimental results of embodiment and imageability of weather relevant verbs in Britton and Huang (2019).

Evidence from geographical distribution

As introduced previously, comparisons between variations of weather expressions and local meteorological distribution may show a clear correlation between linguistic device and kinesis. For all of the geographical matching of meteorological patterns and linguistic usage patterns, it is not expected that the two areas will be exactly matched. First, historical relationships such as small-scale migration and contact borrowing may be present. Second, small-scale climate pattern shifts are possible within a couple of millennia. And lastly, not only is data density lower for linguistic maps, but there is also no reliable smoothing theory. For instance, between any two relatively close locations sharing the same linguistic usage, there is no reliable way to rule out another location with a different usage without field work. With these caveats, we next analyse cases of rain and frost expressions and hail nouns to provide further support.

The most prominent macroscopic difference among Sinitic languages is the distinction between the northern and southern languages, specifically, between mostly Mandarin and mostly non-Mandarin languages (Xiang and Cao, 2005, p. 116). The Yangtze River is the geographical landmark that tends to demarcate the north/south divide (Li and Xiang, 2010, p. 113). As such, verbs in rain expressions in Sinitic languages can be divided into two major groups. Languages spoken to the north of the Yangtze River predominantly use 下 xià ‘to fall’; languages spoken to the south of the Yangtze River predominantly use an older form 落 luò ‘to fall, to drop’ (Cao, 2008). This dichotomy, however, has one obvious exception. The lexicon map 005 in Cao (2008) shows that most parts of Sichuan Province and Chongqing City use 落 luò ‘to fall, to drop’ to describe raining in spite of being Mandarin speaking areas located to the north of the Yangtze River.

Figure 2 shows the map of China with three data-driven maps superimposed: the area where Mandarin Chinese is spoken is bordered in red (Li et al., 1987), the area where the main weather verb for raining is 落 luò ‘to fall, to drop’ is bordered in yellow and blue (Cao, 2008), and the yearly average rainfall of China is given in different shades of grey (Xu and Zhang, 2017). What is not shown directly on the map, but discussed earlier, is the area where the raining verb is 下 xià ‘to fall’, which is the area bordered in red, minus the greyish areas bordered in yellow. What this map shows crucially is that the exceptional use of the raining verb 落 luò ‘to fall, to drop’ roughly corresponds to the Mandarin speaking areas with yearly average rainfall of more than 1000 mm. These areas include a significant part of Sichuan and Chongqing (close to the centre of the red area), but also smaller slices just north of the Yangtze River (which coincides roughly with the rightmost lower bound of the red border). In fact, the areas using 落 luò ‘to fall, to drop’ roughly follow the 1000 mm rainfall line (note that, on the southwestern plateau, areas where non-Sinitic languages coexist with Mandarin lie next to areas having abundant rainfall, which may account for their being outside the extent of 落 luò ‘to fall, to drop’). This indicates that the selection of verbs in rain expressions does have a strong correlation with kinetic energy as measured by the volume of rainfall. The fact that the distribution of the verb 落 luò ‘to fall, to drop’ largely matches rainfall distribution, contrary to typological patterns, strongly suggests that the functional relationship is the valid one.

**Fig. 2: Map of distribution of rain expressions, yearly average rainfall and Mandarin border.**

Another interesting observation can be made based on Fig. 2. Among the Sinitic languages choosing the verb 落 luò ‘to fall, to drop’, most also adopt the cognate noun 雨 yǔ ‘rain’ as the noun for the weather product, while a few use 水 shuǐ ‘water’ in the form of 落水 luòshuǐ fall-water ‘to rain’. The isogloss of 落水 luòshuǐ ‘to rain’, based on data from Cao (2008), is marked by the blue border in Fig. 2. The area it covers roughly overlaps with areas where yearly average rainfall is above 1500 mm, and contains most of the places with rainfall above 2000 mm. The pattern suggests that speakers using 落水 luòshuǐ ‘to rain’ in general experience heavier rainfall than their immediate neighbours, though they all live in a region with abundant rain. Note that this distribution is different from the case of using 落 luò in Sichuan/Chongqing. In that case, there is a consistent pattern of rainfall intensity overriding the default typology of Sinitic languages. The blue area in question is in fact typologically complex, with several Sinitic languages such as Cantonese, Hakka and Mandarin spoken. Linguistic typology, as the sum of historical relations, predicts that different weather expressions should be used. The fact that they share the default form of 落雨 luòyǔ ‘to rain’, and opt for 落水 luòshuǐ ‘to rain’ regardless of typological differences, strongly suggests a functional relationship. And the fact that the use of the mass noun 水 shuǐ ‘water’ over the cognate weather noun 雨 yǔ ‘rain’ corresponds roughly to the distribution of extremely heavy rainfall further supports our kinesis driven hypothesis.

The investigation of the geographical distribution of frost expressions also shows a correlation with meteorological patterns. According to our data in Table 1, 56.1% of Sinitic languages use high transitivity verbs such as 打 dǎ ‘to hit’ to encode this weather event, while other languages use verbs with lower transitivity. Figure 3 shows the isogloss of transitive verb usage, as well as the area of frost damage. The area bordered in red represents the isogloss of the use of 打 dǎ ‘to hit’ for frost. In addition, the area with frost damage based on data in Feng et al. (1999) is traced in blue. Note that the use of 打 dǎ as the verb for frost covers a wide stretch of area that is both typologically and topologically complex. The area covers the Yangtze River, the Yellow River, the Qin Mountains, the Yungui Plateau, etc., and contains a variety of Sinitic languages including Mandarin and many southern Sinitic languages such as Wu, Gan, Xiang, etc. Historical relationships cannot predict this distribution. However, the correlation with frost damage to crops could easily lead to a perception of higher impact. And it is natural to attribute higher impact to higher kinetic energy.

**Fig. 3: Map of distribution of frost expressions with high transitivity verbs and frost damage areas.**

Lastly, hail nouns also provide evidence for the correlation between weather expressions and meteorological events. Hail nouns in Sinitic languages can be grouped into three major types based on the root morphemes they use: nouns with the morpheme 雹 báo ‘hail’, which is the oldest form and the cognate noun for the historical weather verb; nouns with the morpheme 蛋 dàn ‘egg’, which is more recent than 雹 báo ‘hail’; and nouns with the morpheme 冷 lěng ‘cold’, which is the most recent form (Xiang, 2012). According to Xiang (2012), the three types demonstrate a centre-periphery distribution, with 冷 lěng ‘cold’ at the centre covering areas that previously used 蛋 dàn ‘egg’ and, earlier, 雹 báo ‘hail’. Figure 4 shows both the isoglosses of hail nouns with 冷 lěng ‘cold’ and 蛋 dàn ‘egg’ based on data from Cao (2008) and Xiang (2012); as well as the areas suffering from hail damage based on data from Wang and Wang (2001). The regions, centres and zones in Fig. 4 are areas where hail damage frequently occurs, and they differ mainly in geographical locations (Wang and Wang, 2001: 383). In general, hail regions represent the broad geographical areas that are susceptible to hail, hail zones represent scattered isolated incidents of hail, and hail centres are areas where hail is most concentrated. Figure 4 shows that most of the areas using the two newer forms of hail nouns are in areas seriously affected by hail damage. The scale of hail damage can be associated with the size and number of hailstones, and thus a higher degree of kinetic energy. Note that ‘egg’ is quite commonly used in Sinitic languages to describe particularly large hailstones. Thus, the conventionalisation of this noun for the weather product can be attributed to the experience of heavier hail. The choice of 冷 lěng ‘cold’ does not have any apparent link to kinesis, but its new usage to replace the original form of 雹 báo ‘hail’ also seems to be correlated with the more frequent experience of hail damage. 
The morpheme 雹 báo ‘hail’ is not frequently used colloquially, while 冷 lěng ‘cold’ and 蛋 dàn ‘egg’ are highly frequent daily used words, so it is plausible that the more immediate experience compelled the speakers in these regions to adopt a more vivid and colloquial term to replace a more pedantic one.

**Fig. 4: Map of distribution of hail nouns and hail damage areas.**

On smog

One last study we present shows that the same weather word chooses different verbs in Archaic and modern Chinese. The weather condition with the most immediate ecological concerns in China over the past few years is 霾 mái ‘smog’. The grammatical behaviours of 霾 mái in Archaic and modern Chinese are quite intriguing and can shed light on the current issues.

(1)	终	风	且	霾	惠	然	肯	来
	zhōng	fēng	qiě	mái	huì	rán	kěn	lái
	as-well	wind-blow	as-well	dust-storm-arise	compliant	alike	willing	come
	‘The wind blows, with clouds of dust. Kindly he seems to be willing to come to me.’ (Zhong Feng, in Shi Jing/Book of Odes)

In the pre-Qin ancient documents we consulted, 霾 mái only appears once and is used as a verb, as shown above in (1). According to Xu (121/1963), 霾 mái in Archaic Chinese means 风雨土也 wind-fall-dust-YE ‘wind blowing the dust to rise and fall’. Hence, similar to wind, thunder and lightning, 霾 mái denotes a dynamic event with higher kinetic energy. It is thus plausible to assume that in modern Chinese, 霾 mái also tends to co-occur with high transitivity verbs. However, this prediction is incorrect.

Given that 霾 mái is frequently coordinated with 雾 wù ‘fog’ in actual use, especially in daily colloquial language, we searched both ‘v 霾’ and ‘v 雾霾’ in the BCC balanced corpus (Xun et al., 2016; Accessed 26 Feb 2019). Statistics of the preceding collocating verbs show that prototypical action verbs are seldom used to denote the occurrence of smog. Moreover, the majority of verbs used are existential verbs such as 有 yǒu ‘to have’ and 出现 chūxiàn ‘to appear’. Accordingly, it is interesting to question why 霾 mái is not as ‘energetic’ as it used to be.

In fact, 霾 mái denotes different weather events in Archaic and modern Chinese: dust storm vs. smog. The emergence of the meaning of smog in modern Chinese can be corroborated by its sudden rise in frequency, as evidenced by the uses of ‘霾’ in the BCC diachronic corpus (Xun et al., 2016; Accessed 24 Feb 2019). The frequency of 霾 mái has risen drastically since 2012 (freq.: 2012: 43, 2013: 501, 2014: 317, 2015: 181). Its use frequency before 2012 was low and sporadic; it occurred fewer than 10 times per year and was not attested on a yearly basis. Shi (2015) also showed that 霾 mái was infrequently used both in Archaic and modern Chinese. The corpus data suggest that the public awareness of the air pollution problem in the past decade has triggered the change of the dominant meaning of 霾 mái from ‘dust stirring storm’ to ‘hanging particle pollutant’, as well as its rise in frequency. In this case, the borrowing of an infrequently used word in the same conceptual class of weather phenomenon to represent a new post-industrial age weather phenomenon created a discontinuity with its previously encoded knowledge. Thus, 霾 mái in Archaic Chinese denotes the dynamic dust storm and indeed has high kinetic energy, which accounts well for its verbal usage. However, the knowledge encoded in 霾 mái is no longer present in Mandarin, and its grammatical behaviours nowadays can also be explained by the physical characteristics of smog. The suspended static particles, e.g., the ones frequently referred to in the media as PM 2.5 or PM 10, whose diameters are at most 0.0025 mm and 0.01 mm, respectively, have very low kinetic energy. Therefore, it follows that 霾 mái is the least likely to co-occur with high transitivity verbs in modern Chinese. Thus, we show again that when the same word refers to two different weather events in different historical periods, kinesis, rather than historical relationships, makes the correct prediction for the selection of grammatical categories and encoding verbs.

In this paper, we have focused on the effect of meso-scale to large-scale meteorological events. An important reason for such a choice is that this scale allows us to better describe the overall patterns in China without being distracted by local variations. Although smog (霾 mái) is a micro-scale meteorological event, it does command global attention and the experience is shared broadly even though the events happen locally. We have yet to deal with any micro-scale meteorological events that are experienced in specific areas only, and this should be an area for future exploration. Our preliminary study of freezing rain and a specific term referring to it in Sinitic languages, 下凌 xiàlíng fall-freezing_rain, does support the thesis posed in the current paper (Dong et al., forthcoming).

Conclusion

Following Wilhelm Dilthey’s concept of human sciences as a discipline, this paper has examined a series of structured contexts of human experiences based on language use, experimental and meteorological distributional data from Sinitic languages and Archaic Chinese. Within these contexts, we found that kinesis, measured by kinetic energy and perceived through the mass and speed of weather objects, can predict how weather events shape language in two respects. First, it helps to predict the selection of verbs for different meteorological events. More specifically, both experimental and distributional data show that a weather phenomenon involving bigger substances and faster movements tends to co-occur with high transitivity verbs when such a phenomenon is encoded with the argument type. Second, differences in kinetic energy are shown to correlate with variations of weather expressions that do not follow known typological patterns. Other things being equal, a weather phenomenon with higher kinetic energy tends to be encoded with a verb with higher transitivity (e.g., 打 dǎ ‘to hit’) or with a noun marked in mass or size (e.g., 水 shuǐ ‘water’, 蛋 dàn ‘egg’).

In terms of TEK and ecological ontology, this result means that we have identified a principle and several linguistic cues that could facilitate the interpretation of traditional meteorological knowledge based on selections or variations of weather expressions. In addition, in situations where observation data are not possible to obtain, such as in the reconstruction of meteorological knowledge based on preserved oral or written documents, linguistic cues can be leveraged to enhance and/or verify the descriptive content.

Moreover, for linguistic research, our study introduces a new approach to typological studies on weather expressions, i.e., looking into the diversity of verbs in such expressions. Our results also lend support to the discussion on the concept of ‘verbhood’ in grammatical studies, by claiming that visibility, kinesis and effectiveness are largely compatible with kinetic energy, which provides the experiential basis for the concept of kinesis as one of the basic concepts of ‘verbhood’.

On the emergent research issue of the relation between environment, language and cognition, we implemented a new approach that is significantly different from the dominant paradigms. The approach we implemented follows Dilthey’s paradigm of human sciences as an ideal synergetic collaboration between humanities and natural sciences. To tackle issues with complex sub-event interactions that cannot be fully isolated, e.g., the kind of issues that lead to Galton’s problem, we proposed to study a series of linked structured contexts. Each structured context contains a specific generalisation that can be solved with a well-established scientific methodology. While the solution in each of the structured contexts is not definitive and is open to alternative accounts, the full picture emerges when the functional relationship is shown to hold in all different structured contexts. To challenge this robust account, a competing functional relationship must be shown to hold in all (and more) contexts. For instance, unless it can be shown that speakers use different strategies to encode different meteorological events, our kinesis-based account predicts synchronic variations among Sinitic languages and diachronic changes from Archaic Chinese to modern Mandarin, and offers the best explanation for the relation between environment, language and cognition. Most crucially, we have demonstrated that the connected structured context approach is an effective way to bridge the humanities and natural sciences and allows interpretative approaches to help us better understand complex issues, as suggested earlier by Dilthey.

Data availability

The datasets generated during and/or analysed during this study are available in the Dataverse repository: https://doi.org/10.7910/DVN/REQ51X.

References

Ahrens D (2012) Essentials of meteorology: an invitation to the atmosphere, 6th edn. Brooks/Cole, Cengage Learning, Belmont
Andrason A (2019) Weather in Polish: a contribution to the typology of meteorological constructions. Studia Linguistica 73(1):66–105. https://doi.org/10.1111/stul.12091
Andrason A, Visser M (2019) Precipitation constructions in isiXhosa. South Afr J Afr Lang 39(1):16–28. https://doi.org/10.1080/02572117.2019.1572307
Barsalou LW, Santos A, Simmons WK, Wilson CD (2008) Language and simulation in conceptual processing. In: de Vega M, Glenberg A, Graesser A (eds) Symbols and embodiment: debates on meaning and cognition. Oxford University Press, Oxford, pp. 245–284
Bartens R (1995) Suomalais-ugrilaisten kielten meteorologisista ja muita luonnonolosuhteista merkitsevistä ilmauksista. J de la Société Finno-Ougrienne 86:33–65
Berkes F, Colding J, Folke C (2000) Rediscovery of traditional ecological knowledge as adaptive management. Ecol Appl 10(5):1251–1262. https://doi.org/10.1890/1051-0761(2000)010[1251:roteka]2.0.co;2
Bokányi E, Kondor D, Dobos L, Sebők T, Stéger J, Csabai I, Vattay G (2016) Race, religion and the city: twitter word frequency patterns reveal dominant demographic dimensions in the United States. Pal Commun 2(1). https://doi.org/10.1057/palcomms.2016.10
Britton J, Huang C-R (2019) The Chinese verb embodiment database (CVED). Paper Presented at the 28th Joint Workshop on Linguistics and Language Processing, Waseda University, Tokyo, 13–14 December 2019
Cao Z (ed) (2008) Hanyu fangyan ditu ji (Linguistic atlas of Chinese dialects). The Commercial Press, Beijing
Christensen P, Fusaroli R, Tylén K (2016) Environmental constraints shaping constituent order in emerging communication systems: structural iconicity, interactive alignment and conventionalization. Cognition 146:67–80. https://doi.org/10.1016/j.cognition.2015.09.004
Cruikshank J (2012) Are glaciers ‘good to think with’? Recognising indigenous environmental knowledge. Anthropol Forum 22(3):239–250. https://doi.org/10.1080/00664677.2012.707972
Cuerrier A, Brunet ND, Gérin-Lajoie J et al. (2015) The study of Inuit knowledge of climate change in Nunavik, Quebec: a mixed methods approach. Human Ecol 43(3):379–394. https://doi.org/10.1007/s10745-015-9750-4
Dong S (2018) Lun wu de fangxiang (Directions of fog). Yuyan Wenzi Zhoubao Aug 2. https://doi.org/10.13140/RG.2.2.15796.81289
Dong S (2019) Hanyu qixiang ci yanjiu (A study on meteorological words in Chinese). Postdoctoral report, Peking University. https://doi.org/10.13140/RG.2.2.22648.29445
Dong S, Huang C-R, Ren H (2020a) Towards a new typology of meteorological events: a study based on synchronic and diachronic data. Lingua. https://doi.org/10.1016/j.lingua.2020.102894
Dong S, Xu J, Huang C-R (2020b) Angry thunder and vicious frost: remarks on the unaccusativity of Chinese weather verbs. Paper presented at the 21st Chinese Lexical Semantics Workshop, City University of Hong Kong, Hong Kong, 28-30 May 2020
Dong S, Yang Y et al. (2020c) Directionality and momentum of water in weather: a morphosemantic study of conceptualisation based on Hantology. In: Hong J-F, Zhang Y, Liu P (eds) Chinese lexical semantics: 20th workshop, CLSW 2019, Beijing, China, June 28–30, 2019, revised selected papers. Springer, Cham, pp. 575–584. https://doi.org/10.1007/978-3-030-38189-9_59
Dong S, Yang Y et al. (forthcoming) Directionality of atmospheric water in Chinese: a lexical semantic study based on linguistic ontology. SAGE Open
Eriksen PK, Kittilä S, Kolehmainen L (2010) Linguistics of weather: cross-linguistic patterns of meteorological expressions. Stud Lang 34(3):565–601. https://doi.org/10.1075/sl.34.3.03eri
Eriksen PK, Kittilä S, Kolehmainen L (2012) Weather and language. Lang Linguistics Compass 6(6):383–402. https://doi.org/10.1002/lnc3.341
Everett C (2013) Evidence for direct geographic influences on linguistic sounds: the case of ejectives. PLoS ONE 8(6):e65275. https://doi.org/10.1371/journal.pone.0065275
Feng Y, He W et al. (1999) Wo guo dong xiaomai shuang donghai de qihou fenxi (Climatological study on frost damage of winter wheat in China). Acta Agronomica Sin 25(3):335–340
Gearheard S, Pocernich M, Stewart R et al. (2010) Linking Inuit knowledge and meteorological station observations to understand changing wind patterns at Clyde River, Nunavut. Clim Change 100(2):267–294. https://doi.org/10.1007/s10584-009-9587-1
Gentner D (1982) Why nouns are learned before verbs: linguistic relativity versus natural partitioning. Center for the Study of Reading Technical Report (257)
Givón T (1984) Syntax: an introduction, Vol 1. John Benjamins Publishing Company, Amsterdam and Philadelphia
Harrell FE (2015) Regression Modeling Strategies, with Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Springer, New York
Hopper PJ, Thompson SA (1980) Transitivity in grammar and discourse. Language 56(2):251–299. https://doi.org/10.2307/413757
Hopper PJ, Thompson SA (1985) The iconicity of the universal categories ‘noun’ and ‘verbs’. In: Haiman J (ed) Iconicity in syntax: proceedings of a symposium on iconicity in syntax, Stanford, June 24-6, 1983. John Benjamins Publishing Company, Amsterdam and Philadelphia, pp. 151–183
Houghton H (1932) The size and size distribution of fog particles. Physics 2(6):467–475. https://doi.org/10.1063/1.1745072
Huang C-R, Lenci A, Gangemi A et al. (2010) Ontology and the lexicon: a natural language processing perspective. Cambridge University Press, New York
Huang C-R (2015) Notes on Chinese grammar and ontology: the endurant/perdurant dichotomy and Mandarin D-M compounds. Lingua Sin 1(1). https://doi.org/10.1186/s40655-015-0004-6
Huang C-R (2016) Endurant vs perdurant: ontological motivation for language variations. In: Park JC, Chung J-W (eds) Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation. Hankookmunhwasa, Seoul, pp. 15–25
Huang C-R, Dong S (2020) From lexical semantics to Traditional Ecological Knowledge: on precipitation, condensation and suspension expressions in Chinese. In: Hong J-F, Zhang Y, Liu P (eds) Chinese lexical semantics: 20th workshop, CLSW 2019, Beijing, China, June 28–30, 2019, revised selected papers. Springer, Cham, pp. 255–264
Huang C-R, Hsieh S-K, Prévot L et al. (2018) Linking basic lexicon to shared ontology for endangered languages: a linked data approach toward Formosan languages. Journal of Chinese Linguistics 46(2):227–268. https://doi.org/10.1353/jcl.2018.0009
Hu C (2005) Cong yinhan dao chengxian (shang)–shi lun zhonggu cihui de yi ge benzhi bianhua (From implying to presenting (part I): an essential change of Chinese vocabulary in the middle times). Essays Linguistics 31:1–21
Hughes R, Brimblecombe P (1994) Dew and guttation: formation and environmental significance. Agric Forest Meteorol 67(3–4):173–190. https://doi.org/10.1016/0168-1923(94)90002-7
Huntington HP (2000) Using traditional ecological knowledge in science: methods and applications. Ecol Appl 10(5):1270–1274. https://doi.org/10.1890/1051-0761(2000)010[1270:utekis]2.0.co;2
Kaplan LD (2003) Inuit snow terms: how many and what does it mean? In: Trudel F (ed) Building capacity in Arctic societies: dynamics and shifting perspectives. CIÉRA, Montreal, pp. 263–269
Makkreel RA (2016) Wilhelm Dilthey. In: Zalta EN (ed) The Stanford encyclopedia of philosophy (fall 2016 edition). https://plato.stanford.edu/archives/fall2016/entries/dilthey/
Makkreel RA, Rodi F (1990) Wilhelm Dilthey: selected works, vol. I: introduction to the human sciences. Princeton University Press, Princeton, NJ
Kimmerer RW (2002) Weaving traditional ecological knowledge into biological education: a call to action. BioScience 52(5):432–438. https://doi.org/10.1641/0006-3568(2002)052[0432:wtekib]2.0.co;2
Levinson SC, Wilkins DP (eds) (2006) Grammars of space: explorations in cognitive diversity. Cambridge University Press, Cambridge
Li L, Huang C-R, Wang VX (2020) Lexical competition and change: a corpus-assisted investigation of gambling and gaming in the past centuries. SAGE Open
Li R, et al. (eds) (1987) Zhongguo yuyan ditu ji (Language atlas of China). Longman, Hong Kong
Li R (ed) (1993–2003) Xiandai hanyu fangyan da cidian (Great dictionary of modern Chinese dialects) (42 Volumes). Jiangsu Education Publishing House, Nanjing
Li X, Xiang M (2010) Hanyu fangyan jichu jiaocheng (Basics of Chinese dialects). Peking University Press, Beijing
Lin R (1994) Lun shi qi shiji Zhongguo yu nanyang ge guo haishang maoyi de yanbian (On the change of ocean trade between China and south-eastern Asian countries in the 17th century). J Chinese Soc Econ Hist 3:40–47. https://doi.org/10.13469/j.cnki.zgshjjsyj.1994.03.006
Lou D, Gu S (2005) Zhongguo yuye ziyuan yu chanye de kongjian fenbu geju ji yanhua (Fishery resources, spatial distribution of fishery and its evolution in China). Chinese J Agric Res Reg Planning 26(1):27–31
Ludwig D (2016) Overlapping ontologies and Indigenous knowledge. From integration to ontological self-determination. Stud Hist Philos Sci Part A 59:36–45. https://doi.org/10.1016/j.shpsa.2016.06.002
Naroll R (1961) Two solutions to Galton’s problem. Philos Sci 28(1):15–39. https://doi.org/10.1086/287778
Malt BC, Ameel E, Imai M et al. (2014) Human locomotion in languages: constraints on moving and meaning. J Memory Lang 74:107–123
Mettouchi A, Tosco M (2011) Impersonal configurations and theticity: the case of meteorological predications in Afroasiatic. In: Malchukov A, Siewierska A (eds) Impersonal constructions: a cross-linguistic perspective. John Benjamins Publishing Company, Amsterdam and Philadelphia, pp. 307–322
Nölle J, Fusaroli R et al. (2020) Language as shaped by the environment: linguistic construal in a collaborative spatial task. Pal Commun 6(1). https://doi.org/10.1057/s41599-020-0404-9
Norman J (1988) Chinese. Cambridge University Press, New York
Palmer B, Lum J et al. (2017) How does the environment shape spatial language? Evidence for sociotopography. Linguistic Typology 21(3):457–491. https://doi.org/10.1515/lingty-2017-0011
Parlee B, Manseau M, Łutsël K'é Dene First Nation (2005) Using traditional knowledge to adapt to ecological change: Denésǫłıné monitoring of Caribou movements. ARCTIC 58(1):26–37. https://doi.org/10.14430/arctic386
Pierotti R, Wildcat D (2000) Traditional ecological knowledge: the third alternative (commentary). Ecol Appl 10(5):1333–1340. https://doi.org/10.1890/1051-0761(2000)010[1333:tektta]2.0.co;2
Pullum GK (1989) Topic…Comment: the great Eskimo vocabulary hoax. Natural Lang Linguistic Theory 7:275–281. https://doi.org/10.1007/bf00138079
Quine WVO (1960) Word and object. MIT Press, Cambridge
R Core Team (2020) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. https://www.r-project.org/
Ren H (2018) ‘Mingci dong yong’ yu shanggu hanyu mingci he dongci de yuyi shuxing (Noun-Verb conversion in Archaic Chinese: from the perspective of lexical semantic analysis). Doctoral dissertation, Peking University
Ren H, Dong S (forthcoming) Gu Hanyu qixiang shijian bianma leixing de gongshi fenbu yu lishi yanbian (How to encode meteorological events in Classical Chinese: from a typological perspective). Bulletin of Linguistic Studies
Regier T, Carstensen A, Kemp C (2016) Languages support efficient communication about the environment: words for snow revisited. PLoS ONE 11(4):e0151138. https://doi.org/10.1371/journal.pone.0151138
Roberts S, Winters J (2013) Linguistic diversity and traffic accidents: lessons from statistical studies of cultural traits. PLoS ONE 8(8):e70902. https://doi.org/10.1371/journal.pone.0070902
Ruwet N (1991) Syntax and human experience. (Trans. Goldsmith J, (ed)) The University of Chicago Press, Chicago and London
Saarinen S (1997) Il pluit, idet dozhd’, it rains, es regnet, llueve, piove. In: Alinei M, Barros Ferreira M, Contini M (eds) Atlas linguarum Europae (ALE), Volume I: Cinquième fascicule, Cartes et Commentaires. Istituto Poligrafico e Zecca dello Stato, Roma, pp. 1–34
Salo M (2011) Meteorological verbs in Uralic languages–are there any impersonal structures to be found. In: Malchukov A, Siewierska A (eds) Impersonal constructions: a cross-linguistic perspective. John Benjamins Publishing Company, Amsterdam and Philadelphia, pp. 395–438
Savo V, Lepofsky D, Benner JP et al. (2016) Observations of climate change among subsistence-oriented communities around the world. Nat Clim Change 6(5):462–473. https://doi.org/10.1038/nclimate2958
Shi H (2015) Mai de shiyi lishi yu guifan wenti (The definations, historical record and the improper use of the term 霾(mai)). J Putian University 22(4):69–74
Strik Lievers F, Winter B (2018) Sensory language across lexical categories. Lingua 204:45–61
Tao G (ed) (2007) Nantong fangyan cidian (Dictionary of Nantong dialect). Jiangsu People’s Publishing, Ltd, Nanjing
Van Hoey T (2018) Does the thunder roll? Mandarin Chinese meteorological expressions and their iconicity. Cognitive Semantics 4(2):230–259. https://doi.org/10.1163/23526416-00402003
Venables WN, Ripley BD (2002) Modern Applied Statistics with S. Springer, New York
Vossen P, Agirre E, Calzolari N, Fellbaum C, Hsieh SK, Huang CR, Isahara H, Kanzaki K, Marchetti A, Monachini M, Neri F (2008) KYOTO: a system for mining, structuring, and distributing knowledge across languages and cultures. In: Proceedings of the 6th Language Resources and Evaluation Conference (LREC, 2008)
Wang S, Huang CR, Yao Y, Chan A (2015) Mechanical Turk-based experiment vs laboratory-based experiment: a case study on the comparison of semantic transparency rating data. In: Zhao H (ed) Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, Shanghai, pp. 53–62
Wang S, Huang C-R et al. (2017) Word intuition agreement among Chinese speakers: a Mechanical Turk-based study. Lingua Sin 3(1):13. https://doi.org/10.1186/s40655-017-0032-5
Wang W, Wang J (2001) Jiyu san zhong xinxiyuan de Zhongguo bingbao zaihai quyu fenyi yanjiu (The distributive pattern of hail disasters based on three data sources in China). Geogr Res 20(3):380–387
Wickham H (2016) ggplot2: elegant graphics for data analysis. Springer, Cham
Wolff P, Song G (2003) Models of causation and the semantics of causal verbs. Cogn Psychol 47(3):276–332
Xiang M, Cao H (2005) Hanyu fangyan dilixue (Geodialectology of Chinese). Chinese Literature and History Press, Beijing
Xiang M (2012) Hanyu cihui dilixue de yiban chengxu: yi bingbao de yanjiu wei li (The general procedures of the Chinese word geography: a case study of Chinese bingbao ‘hail’). Geogr Sci Res 1:7–23. https://doi.org/10.12677/gser.2012.12002
Xu B, Miyata I (eds) (1999) Hanyu fangyan da cidian (A comprehensive dictionary of Chinese dialects). Zhonghua Book Company, Beijing
Xu S (1963) Shuo wen jie zi (Explaining graphs and analysing characters). Zhonghua Book Company, Beijing, Vol. 121
Xu X, Zhang Y (2017) Zhongguo qixiang beijing shujuji (Meteorological background dataset of China). https://doi.org/10.12078/2017121301
Xun E, Rao G et al. (2016) Dashuju beijing xia BCC yuliaoku de yanzhi (The construction of the BCC Corpus in the age of Big Data). Corpus Linguistics 3(1):93–109
Yang Y, Huang CR, Dong S, Chen S (2018) Semantic transparency of radicals in Chinese characters: an ontological perspective. In: Politzer-Ahles S, Hsu Y-Y et al. (eds) Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation, Hong Kong, pp. 788–797
Yang Y, Lei X (2004) Wo guo denglu taifeng yinqi de da feng fenbu tezheng de chubu fenxi (Statistics of strong wind distribution caused by landfall typhoon in China). J Tropical Meteorol 20(6):633–642. https://doi.org/10.16032/j.issn.1004-4965.2004.06.003
Zhang W, Mo C (2009) Lanzhou fangyan cidian (Dictionary of Lanzhou dialect). China Social Sciences Press, Beijing
Zheng J, Yin Y, Li B (2010) Zhongguo qihou quhua xin fang’an (A new scheme for climate regionalization in China). Acta Geogr Sin 65(1):3–12. https://doi.org/10.11821/xb201001002

Acknowledgements

This study was partly funded by the Hong Kong Polytechnic University–Peking University Research Centre on Chinese Linguistics (RP2U2), as well as the Hong Kong Polytechnic University Project #ZZJL ‘Transitivity in Light Verb Constructions: Studies in Mandarin Chinese and Beyond’. Earlier versions of parts of this paper were presented as Dong et al. (2020c) and Huang and Dong (2020) at the 20th Chinese Lexical Semantics Workshop (CLSW). We would like to thank the reviewers and audience of CLSW 2019 for their helpful comments. We also wish to express our gratitude to Kathleen Ahrens for her helpful comments on earlier versions of this paper.

Author information

Authors and Affiliations

Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong, China
Chu-Ren Huang & Yike Yang
Center for Chinese Linguistics, Peking University, Beijing, China
Chu-Ren Huang
School of Humanities and Social Sciences, Harbin Institute of Technology, Shenzhen, Guangdong, China
Sicong Dong
Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, China
He Ren

Authors

Chu-Ren Huang
Sicong Dong
Yike Yang
He Ren

Corresponding authors

Correspondence to Chu-Ren Huang or Sicong Dong.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Huang, CR., Dong, S., Yang, Y. et al. From language to meteorology: kinesis in weather events and weather verbs across Sinitic languages. Humanit Soc Sci Commun 8, 4 (2021). https://doi.org/10.1057/s41599-020-00682-w

Received: 21 March 2020
Accepted: 19 November 2020
Published: 04 January 2021
DOI: https://doi.org/10.1057/s41599-020-00682-w