Diverse climate actors show limited coordination in a large-scale text analysis of strategy documents

Networks of non-state actors and subnational governments have proliferated since the Paris Agreement formally recognized their contributions to global climate change governance. Understanding the ways these actors are taking action and how they align with each other and national governments is critical given the need for coordinated actions to achieve ambitious global climate goals. Here, we present a large analysis (n = 9,326), applying large-scale natural language processing methods and social network analysis to the climate strategy documents of countries, regions, cities and companies. We find that climate mitigation in employee travel and office operations, green building standards, and municipal and citizen actions are common themes in climate actions across companies and city and regional governments, whereas approaches to setting targets in specific sectors and emissions scopes are more diverse. We also find links between the strategies of regions and countries, whereas companies are disconnected. Gaps in climate action for most actors include adaptation and consumption/supply-chain emission reduction efforts. We suggest that although actors may appear to be self-organizing and allocating climate actions in a mutually beneficial and synergistic way, there may also be missed opportunities for deeper coordination that could result in more ambitious action. Climate actors such as cities, regions, countries and companies show diversity in climate actions, with gaps in adaptation and consumption/supply-chain emissions reductions, according to a machine-learning-based natural language processing and social network analysis.

Networks of non-state actors (e.g., business and civil society) and subnational governments (e.g., cities and regions) have proliferated in number and membership since the 2015 Paris Agreement formally recognized their contributions to global climate change governance. They range from small clubs of specific actors, such as the U.S. Climate Alliance, which includes 25 states collectively pledging action on climate change, to the Global Covenant of Mayors for Climate and Energy (GCoM), a network of more than 10,000 cities and subnational actors that includes nearly 10 percent of the global population 1 . The latter is an example of a transnational climate action network, which links non-state and subnational actors across national borders and often performs governance functions including norm coordination, capacity building, rule implementation, and collective learning facilitation. Such networks also help to pool and distribute financial, managerial, and technical resources across cities aiming to take action on climate change 2 . Transnational climate initiatives frequently include national government actors as well; the New York Declaration on Forests, for example, pledges an ambitious goal to halt deforestation by 2030. Such initiatives suggest that the structure and composition of global climate governance have changed 3,4 .
This shift to a more polycentric climate governance system 5,6 has raised questions about how to ensure policy coherence and integration 7-9 and avoid fragmentation that could undermine progress towards achieving collective climate goals 7,10-12 . Many scholars underline the need to orchestrate 13 efforts from intergovernmental organizations like the United Nations Framework Convention on Climate Change (UNFCCC), nation states, and non-state and subnational entities. This approach involves using indirect strategies, such as endorsements, convenings, agenda setting, or the provision of resources and assistance, to achieve policy coherence, in which multiple institutions work to achieve a shared goal 10,12,13 . Scholars suggest that "alignment," "interlinkages," "overlaps," "interactions," and "interplay" can generate synergies and avoid "conflictive fragmentation" that disrupts and dilutes climate action coordination 10,14 . Several researchers posit that links between different actors and regimes can generate interaction effects 15-17 that lead to convergence, and potentially, coherence, across different actors.
The degree of orchestration, defined as the coordination and linking of non-state climate actions to each other and to national and international climate initiatives, is an active area of research in global climate governance. When actions are coordinated at a global level, regional and national orchestrators of climate action work to fill strategic gaps, act on comparative advantages, give voice to underrepresented demographics, and address questions of equity in specific regional contexts 18 . This finding echoes Abbott 19 in that transnational initiatives perform activities, such as information sharing and capacity-building, that national governments may be less suited to implement. As Jordan et al. 9 point out, this dispersion of different types of activities across different actors may suggest that actors are dividing up labor in a mutually beneficial way, fulfilling Ostrom's 6 description of effective polycentricity.
But scarce evidence exists regarding whether such linking and coordination amongst climate actors occur. Do cities, regions, countries, and companies (the primary climate actors) take actions in similar ways, or are they dividing them up in a way that may be coordinated, or worse, dispersed and redundant? Kuyper et al. 20 note that "sites of climate governance are crowded, highly contested, and often disjointed from one another," particularly across multiple governance levels. Whether there are commonalities between the ways in which non-state and subnational actors take climate action (e.g., suggesting coordination or orchestration at best, or potential overlaps at worst) or whether atomization has led to specialization in climate actions (e.g., suggesting gaps or redundancies in climate action) are questions that remain unexplored in the growing universe of polycentric climate action. Unpacking the "black box of orchestration" is needed to better understand the mechanisms by which coordination within a polycentric system might occur 21 .
While information on the characteristics (e.g., governance function, the composition of participants, and thematic focus areas) and potential mitigation impact of subnational and non-state action has grown 22 , there is a dearth of information as to how climate action "initiatives align, scale-up, and form low-carbon pathways." 23 Moreover, as Jordan et al. 9 identify, there is a lack of empirical evaluation of how non-state and subnational climate actors are governing climate change, specifically whether they are filling "gaps" in regimes or simply reproducing already existing functions. Although ex-post performance evaluation data that would allow evaluation of the latter two gaps are scarce, there has been a proliferation in the number of non-state and subnational actors self-reporting climate commitments through transnational climate initiatives and disclosure platforms like CDP (formerly Carbon Disclosure Project) 24 . The content in these databases, however, varies widely: some platforms are pledge-based and only require actors to report a climate commitment (e.g., "commitments to commit") with little supporting information, while others require their members to include detailed documents regarding specific actions and policies. The EU Covenant of Mayors for Climate and Energy (EUCoM), for instance, requires its members to submit a Sustainable Energy and Climate Action Plan (SEAP) that details specific measures and policies they intend to implement to achieve their targets, as well as regular biennial monitoring and progress reports that include emissions inventories 25 . CDP's annual disclosure asks companies and subnational governments to respond to a lengthy survey on mitigation targets, risks, and adaptation measures, including greenhouse gas (GHG) emissions inventory data. Countries' own submitted Nationally Determined Contributions (NDCs) to the Paris Agreement have also been criticized for the lack of standardization and comparability in structure and content 26 .
Due to the heterogeneity and size of these political texts, systematic analysis can be unwieldy and challenging. Natural language processing (NLP) and automated content analysis techniques such as topic modeling apply statistical and machine learning methods to text, allowing scholars to quantitatively and systematically analyze large corpora of text (i.e., "text as big data" techniques) 27,28 . Compared to "top-down" qualitative coding techniques 29 , topic modeling enables researchers "to discover topics from the data, rather than assume them." 30 It may also counter certain biases introduced through non-automated coding that relies on subjective interpretations or can be influenced by selection biases 31 .
In this study, we address this critical gap in understanding how various actors are pledging climate actions to better understand their role and function in the new polycentric climate governance system. We apply NLP techniques, including topic modeling (an unsupervised machine learning method that allows for "discovery" of topics in a text corpus 30 ), to what we believe to be the largest cross-sectional database of climate actions, covering 9326 actors: 5536 cities, 76 regions, 3542 companies, and 172 countries. We also employ network analysis techniques to understand connections and disconnections in actors' commitments. Although network analysis is usually applied to describe relationships between individuals, researchers have increasingly applied these tools to describe various actors and the types of texts and messages they produce 32,33 . We use these techniques first, to identify common (or diverse) themes in climate actions; and second, to examine linkages (or disconnections) between these actions and the actors themselves, with the aim of characterizing the expanding universe of global climate action and identifying where greater coordination and orchestration may be needed.

Results
What climate actions are being taken? We identified a 30-topic model as the most robust model after comparing 20-, 30-, and 40-topic models (see "Methods"). The 30 topics in our document corpus (see Supplementary Table 3 for high probability keywords and representative document excerpts that were used to identify and label topics) identify the most common strategies and actions actors employ to address climate change. The majority of these topics are focused on climate change mitigation (29 out of 30), although one topic emerged that focused on climate change adaptation, primarily mentioned in countries' Paris Agreement pledges. The mitigation-focused topics range from employee travel and office operations, to green building standards that include certification schemes, to offsets. The most prevalent topics in the document corpus relate to city or municipal-level climate actions, focused on citizen actions (8%) and municipal public actions (8%). These topics appear most commonly for cities, which are the most dominant actor type represented in the corpus, although their mean document lengths are below average compared to other actor types (Table 1; Supplementary Fig. 1).
We observe substantial differences in the ways in which different actors are committing to climate change actions. The topic model predicts the probability of actors' documents belonging to each topic, with Fig. 1 illustrating the per-document, per-topic probabilities for groups of climate actors. Some topics are more likely to appear in some actor groups' climate actions than others, and some clear trends emerge in Fig. 1. Countries' NDCs have on average a 70% probability of consisting of Topic 11 on climate change adaptation, while cities' climate actions have a much smaller probability of mentioning adaptation (2.5%). Companies' climate strategies are most likely to focus on LED lighting (8.5%) and air and HVAC systems (12.8%), presumably within office buildings and other facilities. In articulating mitigation strategies that focus on employee travel (7%) and offsets for business and employee travel (4.3%), companies are the only actor type to explicitly mention a focus on "scope 3" or indirect emissions 34 , although regions have a high probability (57%) of mentioning waste and transport efficiency, which could apply to scope 3 emissions if downstream. Companies also tend to focus on "scope 1" (direct emissions) site-specific energy sources and efficiency, including a focus on boilers and furnaces (8%), natural gas (8%), and fuel efficiency within fleet vehicles (5.1%). These topics, while represented to some degree as top topics for cities, have a much smaller probability of occurring in cities' climate actions. Regions have a much higher probability of mentioning public and community water management (26%), while cities and municipal governments tend to focus on the building sector, specifically lighting (8.4%) and energy efficiency (7%). Cities also have a higher probability of mentioning sustainable transportation (9.7%), lighting specifically in schools and public buildings (8.5%), and citizen actions (23%).
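The group-level probabilities discussed above amount to averaging each document's topic distribution within an actor type. A minimal sketch of that aggregation follows; the function name `mean_topic_probs`, the three-topic layout, and all numbers are illustrative assumptions, not the study's code or data.

```python
from collections import defaultdict

def mean_topic_probs(doc_topics, actor_types):
    """Average per-document topic probabilities within each actor type."""
    sums = {}
    counts = defaultdict(int)
    for probs, actor in zip(doc_topics, actor_types):
        if actor not in sums:
            sums[actor] = [0.0] * len(probs)
        for k, p in enumerate(probs):
            sums[actor][k] += p
        counts[actor] += 1
    return {a: [s / counts[a] for s in sums[a]] for a in sums}

# Toy document-topic matrix: one row per document, each row sums to 1.
# Values are invented for illustration only.
doc_topics = [
    [0.7, 0.2, 0.1],  # a country document
    [0.1, 0.6, 0.3],  # a city document
    [0.1, 0.2, 0.7],  # another city document
]
actor_types = ["country", "city", "city"]

group_means = mean_topic_probs(doc_topics, actor_types)
```

Here `group_means["city"]` holds the mean probability of each topic across the two city documents, which is the kind of quantity a figure of per-group topic probabilities would plot.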
Examining the top terms for each actor type and the words that most commonly occur alongside them (i.e., word collocations, skip-grams or n-grams) 35 can help to understand the strategies actors employ and the commonalities in their approaches. These word collocations are in some ways more informative than single-word probabilities, such as those included in Supplementary Table 3, because they provide more context for commonly occurring words and phrases. For instance, a frequently appearing word in actors' climate actions may be "emissions," but without additional context, it is not clear whether an actor is referring to "reducing emissions" or perhaps the opposite, "increasing emissions." Figure 2 shows the top 25 four-word n-grams by actor type, providing insight into actors' approaches to climate change. For instance, cities' top four-word collocation is "public participation stakeholder engagement," suggesting that cities involve the public in implementing climate mitigation measures such as recycling and composting, which shows up as the fourth most frequent collocation. Cities demonstrate technological approaches in their climate mitigation actions, which can be observed through word collocations on improving fuel economy (second most frequent) and building on-site renewable energy generation. Policy approaches such as building codes and standards, and fuel economy standards for private transportation, are commonly referenced. Cities are also the only actor group to articulate strategies for preserving biodiversity in green spaces. Companies tend to emphasize monetary savings, payback periods, efficiency, and investment requirements. Several of the word collocations emphasize reducing scope 2 or indirect emissions from purchased electricity, including neutrality targets applied to these emissions.
Like cities, companies' actions tend to focus on buildings: green building standards such as Leadership in Energy and Environmental Design (LEED) show up frequently, as well as actions that mention heating, ventilation, and air conditioning (HVAC) systems.
Fig. 2 Top four-word n-grams by actor type. These skip-grams or n-grams provide more context surrounding an actor's climate actions than single-word probabilities, since they show how an actor is referring to a particular word. For example, companies (b) tend to emphasize estimating annual CO2 emissions savings the most frequently, while cities (a) tend to emphasize public participation and stakeholder engagement. Countries (c) refer to international institutions and processes, such as the United Nations Framework Convention on Climate Change and Intended Nationally Determined Contributions. Regions (d) mention increasing resiliency as a common strategy.
Regions and countries are more similar to each other in their climate action strategies. They are the only actors to frequently mention adaptation and resilience among their most common word collocations. Many of the countries' common word collocations seem to relate to the articulation of Intended Nationally Determined Contributions (INDCs) and reporting requirements through the UNFCCC, which emphasize the need for transparency and clarity in reporting and publishing communications. Regions, on the other hand, do show similarity to cities' and companies' climate action strategies, with frequently occurring phrases emphasizing heating and cooling efficiency and audits in the building sector, reducing the cost of public transport, investing in more efficient public lighting, and installing white roofs and insulation. Regions are the only actor group to commonly mention sustainable farming practices as well as the installation of on-shore wind power. They also mention collaboration with local governments and increasing awareness among the public as implementation methods. Increasing resilience appears as the most common word collocation for region actors.
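The collocation counts discussed above can be illustrated with a minimal sketch that tallies contiguous four-word n-grams across documents. It assumes lowercased whitespace tokenization with no stop-word removal and does not cover skip-grams; the function name `top_ngrams` and the sample sentences are hypothetical, and the study's actual preprocessing is described in its Methods.

```python
from collections import Counter

def top_ngrams(documents, n=4, k=25):
    """Return the k most frequent contiguous n-grams across all documents."""
    counts = Counter()
    for text in documents:
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts.most_common(k)

# Invented sample texts, for illustration only.
docs = [
    "public participation stakeholder engagement for recycling",
    "promote public participation stakeholder engagement",
    "improve fuel economy standards",
]

top = top_ngrams(docs, n=4, k=3)
```

In this toy corpus the most frequent four-gram is ("public", "participation", "stakeholder", "engagement"), which occurs twice. Counting phrases rather than single words is what separates "reducing emissions" from "increasing emissions".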
We also discovered some clear trends when comparing actors in developed versus developing countries (Fig. 3). With the exception of national government actors, where there are more developing country actors (130 out of 172 countries), there are more climate actors from developed countries reflected in our text corpus. Comparing developed versus developing countries' Paris Agreement pledges, developing countries' documents reflected the climate adaptation topic at a much higher frequency than developed countries' documents (85% versus 24%). Developed country actors also tend to focus more on mitigation target-setting in their Paris pledges than developing countries (69% versus 20%). Regions in developing countries tend to emphasize waste and transport efficiency more than regions in developed countries (59% versus 51%). Cities in developed countries tend to emphasize citizen actions (15% versus 2%) and sustainable transport (11% versus 2.5%), while cities in developing countries tend to emphasize waste and public transport (21% versus 1.7%).
How do actors' climate efforts relate to each other? We employed network analysis in two ways: to illustrate relationships between topics and actor groups (Fig. 4), and to understand connections between different actors and the strength of these relationships through their climate commitments (Fig. 5) 36 (see "Methods" for more details). Figure 4 shows a network displaying significant positive correlations between topics, with nodes colored according to the actor type with the highest per-document probability of a particular topic and sized according to the overall prevalence of the topic. From the network visualization, three clear topic clusters are observed, which appear to be closely related to actor types: one composed primarily of company-dominated topics; a second composed primarily of country-dominated topics, including target-setting and climate change adaptation; and a third based on city-dominated topics. Region-dominated topics act as linkages between different actor types: community-building programs and government promotion of waste, transport and efficiency appear to link company and city actors, while public and community water management appears to link company, country, and city actors, suggesting that this topic is positively correlated across all actor groups.
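As a rough illustration of how a topic correlation network like the one described above could be assembled, the sketch below computes Pearson correlations between topic columns of a document-topic matrix and keeps positive pairs above a cutoff. The fixed `threshold` stands in for the significance test the text refers to, which is not reproduced here; the function names and the toy matrix are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def positive_topic_edges(doc_topics, threshold=0.0):
    """Edges (i, j, r) for topic pairs whose correlation r exceeds the threshold."""
    n_topics = len(doc_topics[0])
    cols = [[row[k] for row in doc_topics] for k in range(n_topics)]
    edges = []
    for i in range(n_topics):
        for j in range(i + 1, n_topics):
            r = pearson(cols[i], cols[j])
            if r > threshold:
                edges.append((i, j, r))
    return edges

# Invented document-topic probabilities: topics 0 and 1 rise and fall together.
doc_topics = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.1, 0.7],
]

edges = positive_topic_edges(doc_topics, threshold=0.5)
```

With these toy values only topics 0 and 1 are linked. In a full analysis each node would additionally carry the actor type with the highest probability for that topic and the topic's overall prevalence.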
To examine spatial relationships between actors' climate commitments, we developed a geographic-based network analysis. All 6498 actors with available geolocation information are represented as nodes in the geographically explicit network linkages map in Fig. 5. This map includes 8133 edges, which are directed connections from an actor to the actor it is most similar to. There are more edges than actors, as in some cases an actor is [...]
The overall size of a network can indicate how similar actors' climate actions are by measuring the network diameter, the maximum distance between any two actors in the network. As Table 2 shows, the average network diameter is 9, meaning that within nine steps all actors' commitments are connected, although the diameter is shorter for some actors (i.e., companies (5), regions (6), and countries (6)) than others. The average distance (1.82) between any two actors, however, is much smaller (Table 2). Considered alongside the network's eccentricity distribution by actor, which shows the longest shortest path between an actor node and all other nodes (Supplementary Fig. 2), this short path length suggests that most actors' climate actions are not too dissimilar from each other in terms of the strategies they are adopting, and that a small number of outliers are driving up the diameter statistic.
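The network statistics above can be made concrete with a small sketch. It links each actor to its closest other actor by Euclidean distance between topic vectors, then computes eccentricity, diameter, and average shortest path length by breadth-first search. Treating edges as undirected for the path metrics, breaking distance ties by dictionary order, and assuming a connected graph are choices made here for illustration; the actor labels and vectors are invented.

```python
import math
from collections import deque

def nearest_neighbor_edges(vectors):
    """Directed edge from each actor to its closest other actor (Euclidean)."""
    edges = []
    for a, va in vectors.items():
        best = min((b for b in vectors if b != a),
                   key=lambda b: math.dist(va, vectors[b]))
        edges.append((a, best))
    return edges

def bfs_distances(adj, start):
    """Hop counts from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_metrics(edges):
    """Diameter, average shortest path length, and per-node eccentricity.

    Edges are treated as undirected here (an assumption); distances only
    cover reachable nodes, so the graph is assumed to be connected.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    ecc, total, pairs = {}, 0, 0
    for node in adj:
        others = [d for n, d in bfs_distances(adj, node).items() if n != node]
        ecc[node] = max(others)   # longest shortest path from this node
        total += sum(others)
        pairs += len(others)
    return max(ecc.values()), total / pairs, ecc

# Invented two-topic vectors for four hypothetical actors.
vectors = {"A": [0.9, 0.1], "B": [0.8, 0.2], "C": [0.5, 0.5], "D": [0.1, 0.9]}
edges = nearest_neighbor_edges(vectors)
diameter, avg_path, ecc = path_metrics(edges)
```

In this toy case the actors form a chain A-B-C-D, so the diameter is 3 while the average path length is lower. The end nodes A and D carry the largest eccentricity, a small-scale version of outliers driving up the diameter.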
The geographic actor similarity network map in Fig. 5 reveals connections between actors located within different regions, the density of which appears to differ based on actor type. When examining the density of edges within certain regions, it appears that the highest proportion of actors' edges connect to actors within the same region (Supplementary Fig. 3). For example, 95% of connections originating from Europe lead to other actors within Europe. When these connections are disaggregated by actor type, however, different trends emerge. Cities' and regions' climate actions are most closely connected to other cities and regions within their own geographies (Supplementary Figs. 4 and 5), which could explain the close within-region connections overall. Ninety-eight percent of city connections originating in Europe (n = 4803) are connected to another city in Europe, and close to 83 percent of city connections originating in North America (n = 185) are most similar to another city in North America. Of city connections originating from the Middle East and North Africa (n = 27), however, 70% are most similarly connected to cities in Europe. Companies appear to be the most connected to actors outside of their own region, with companies in Europe (n = 1296), North America (n = 1068) and East Asia and the Pacific (n = 697) connected with [...] (Supplementary Fig. 6). North American companies still have high similarities to other companies in North America (64%).
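The within-region percentages reported above are shares of directed edges whose source and target fall in the same region. A minimal sketch of that tally follows, using a hypothetical `region_of` lookup and invented actor labels.

```python
from collections import Counter

def within_region_share(edges, region_of):
    """For each source region, the fraction of its edges that end in the same region."""
    totals, same = Counter(), Counter()
    for src, dst in edges:
        region = region_of[src]
        totals[region] += 1
        if region_of[dst] == region:
            same[region] += 1
    return {region: same[region] / totals[region] for region in totals}

# Invented actors and regions, for illustration only.
region_of = {
    "city_eu_1": "Europe",
    "city_eu_2": "Europe",
    "city_eu_3": "Europe",
    "city_na_1": "North America",
    "city_na_2": "North America",
}
edges = [
    ("city_eu_1", "city_eu_2"),
    ("city_eu_2", "city_eu_1"),
    ("city_eu_3", "city_na_1"),
    ("city_na_1", "city_na_2"),
]

shares = within_region_share(edges, region_of)
```

Filtering `edges` by actor type before calling the function would give the disaggregated city, region, and company shares.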

Discussion
Based on our analysis, we observe distinctions between actors' climate actions that illustrate both jurisdictional and sectoral differences in emissions sources and the power to manage them. The greater likelihood of some topics to appear in some actors' climate actions over others suggests that actors of a certain type tend to act more similarly to actors of the same type than to actors of another type. In other words, cities tend to describe climate actions that are more similar to those of other cities than to those of companies. While there are some similarities and overlaps in the topics, sectors, and ways in which cities, regions, companies, and countries take climate action, this finding of group similarity is consistent with previous research that has documented the heterogeneity of country, industry, and firm responses to environmental issues 37 . Considering the various motivations for these different actor groups to address climate change, these conclusions may be unsurprising. Corporate climate responses have been primarily influenced by internal economic motivations 38 that frame a "business case" for climate action in terms of cost-cutting and revenue-generating opportunities 39 . We observed this rationale in companies' emphasis on monetary savings, payback periods, and investment costs in commonly co-occurring phrases. Cities have asserted themselves as climate change actors to demonstrate leadership, but also due to the increasing climate risks they perceive and experience 40 . Although we did not observe many cities' climate actions referring to climate change risks and adaptation, this gap is likely because the transnational climate initiatives many cities participate in tend to focus on climate mitigation rather than adaptation.
Regions and countries are more similar in their approach to climate policy in that their larger jurisdictions necessitate broader, crosscutting approaches, which we observed in our network analysis, although the inconsistency with which countries articulated their Paris NDCs is well documented in the literature 26 .
We also observe similarities in climate actions between actors within the same geography, although we acknowledge (see Methods: Actor Similarity) that the strength of these similarities depends on the type of distance metric selected. We observe in-region similarity particularly for actors in Europe and marginally so for those in the United States, echoing prior research on city and corporate climate actions that has found geographical region to be an important factor influencing climate actions 41,42 .
Fig. 5 A network visualization of the similarity between municipal, regional, and national actors based on the text of their commitments. Each actor is represented by a node placed over its geographic coordinates. Edges are directed, meaning they are drawn from an actor to the actor it is most similar to, calculated using Euclidean distance between actors' topic vector representations. Each edge is colored based on its source's longitude. Bottom panels are insets showing company connections between North America and Europe (a) and density of city connections within Europe (b). The map also reveals strong city-actor connections between actors in the global South (c) and some linkages in company and city actors in Europe and East Asia (d). Basemap source: Wikimedia Commons 96 .
Table 2 The first two columns describe the number of linkages actors of that type have within the same region versus with other regions. The last three columns describe network metrics that indicate the degree of connectivity of nodes in that actor class. Average path length is the average shortest path length between two nodes: the shorter the distance, the more closely two actors are connected. Diameter indicates the maximum distance between two nodes within an actor group. Weighted degree is the average sum of the weights connected to a node, or a measure of how connected a particular node is.
Overarching policy frameworks that are influenced by a country's international obligations to the UNFCCC, for example, explain why developed countries' climate actions emphasize target-setting and developing countries mention climate change adaptation at a much higher frequency (Fig. 3). The presence of market-based mechanisms, such as the EU Emissions Trading Scheme, has resulted in European companies viewing climate change as a core business issue and reducing carbon emissions as a central policy 43 , which may explain why cost and monetary savings are a frequent topic for companies in our corpus. Different country contexts, which determine varying market structures and characteristics, business cultures, and regulatory environments, also play a role in shaping business responses to climate change 44 . The high participation of cities in the EU Covenant of Mayors for Climate and Energy (EUCoM), which encompasses more than 9000 primarily small cities (population <50,000), could explain the similarity in commitments between these actors, particularly when considering the active role that territorial coordinators, usually regional or provincial authorities, play in helping smaller or less-resourced cities meet requirements for network membership 45 . These territorial coordinators go as far as taking over "the responsibility to draft the climate action plan for their signatories or to finance the drafting of SEAPs" 45 .
Although the aim of our analysis is not to describe in exhaustive detail the strategies different actors take in addressing climate change, we are able to identify some key themes. First, most climate actions in our analysis pay little attention to adaptation, vulnerability reduction and resilience. These themes appear to feature prominently as top topics only for countries, and predominantly for developing countries 24 . A similar study 46 found that only 23% of 200 E.U. cities had dedicated climate adaptation strategies. Scholars 41,47 have suggested that one reason for the greater presence of mitigation-focused policies and strategies at the local level is their ability to complement or integrate sector-specific policies for transport or waste management 46 . National governments still struggle to clearly define urban adaptation measures, which complicates local governments' ability to design and implement them 48 . In terms of companies, the lack of in-depth understanding of firm-level adaptation to climate change is partly due to the lack of a large sample of companies that have publicly disclosed adaptation strategies 37 .
Second, few actors explicitly address consumption-based or supply chain emissions. These "scope 3" indirect emissions are only referenced in corporate climate actions that address business or employee travel. Supply-chain emissions, or emissions occurring outside of an actor's immediate boundary, are critical considerations for cities, where consumption-based emissions comprise the majority of an actor's carbon footprint, and increasingly for companies like automobile manufacturers, where a substantial part of emissions arises from the use and end-of-life cycles 49 . This finding is consistent with other studies that have found a lack of consumption-based or supply chain commitments in city 50 and company 39 climate commitments. Very few cities actually include consumption-based emissions in their accounting inventories 51 and few international cooperative climate initiatives require municipal governments to include commitments that address these emissions 50 . A study of 22 major multinational companies 39 found that corporate climate efforts were predominantly linked to owned emission sources, and only 12% of funds to address climate impacts were invested in value chain emission sources. A few topics emphasizing waste, packaging, and recycling, which had a higher probability of appearing in corporate climate actions than in those of other actors, however, suggest that some companies are incorporating downstream impacts into their climate strategies.
Third, reported climate actions primarily seem to take two broad approaches: technological solutions and "soft" policy or management approaches, with businesses seeming to favor the former and cities, regions and countries relying more on the latter. Companies' emphasis on renewable energy, LED lighting, boiler and furnace efficiency, and air and HVAC systems reflects technocratic climate strategies, a finding consistent with a survey 42 of corporate climate management that identified plant retrofits, clean technologies, fuel switching, offset projects, and energy efficiency as primary strategies. Companies' use of more technology-focused methods to address climate change may be in response to what scholars 43 identify as the "recognized weaknesses in public policy frameworks for corporate action on climate change," which include gaps in regulation (i.e., not all greenhouse gases being covered), low economic incentives (i.e., cost of carbon not sufficiently high), and regulatory uncertainty (i.e., unpredictable government support for climate policy measures). Cities and regions appear to focus on citizen engagement and community-building programs as primary strategies. Education and awareness-building campaigns that engage citizens to change consumption patterns are key strategies municipalities have adopted to implement climate mitigation, including "soft mobility" campaigns to encourage citizens to increase usage of public transportation 50,52 .
The variation in climate actions by actor type and lack of strong connections between actor types and across geographies (save for some exceptions), however, may not necessarily suggest that the global climate governance system is characterized by "fragmentation and functional overlaps" rather than by coherence and hierarchy 53 . Division and allocation of specific, targeted actions for sectors over which actors have control may actually result in greater efficiencies in action, in line with what Ostrom imagined a polycentric governance system would achieve 5,6 . Further, our finding that climate actions tend to be more similar within actor groups may be evidence of orchestration within actor networks. For example, in the EU Covenant of Mayors, the secretariat, the Joint Research Centre of the European Union, has observed such strong similarities in SEAPs under the same regional coordinator (an institution that coordinates lower administrative units' participation in the larger network) that starting in 2014 it adopted a "grouped approach" that allows coordinators to submit a template for the municipalities under their guidance 54 . For instance, the province of Limburg in the Netherlands developed a common SEAP approach for 42 of its municipalities, providing them information on specific measures in renewable energy and the buildings sector 54 . For corporate actors, the Science-Based Targets Initiative provides a set of standards and guidance for companies' emission reduction targets that are aligned with 1.5 °C and 2 °C scenarios 55 .
These examples of orchestration, however, occur within subsets of actor types, and the gaps in climate action we observed (i.e., in adaptation, supply chain/consumption-based emissions) and the few linkages between disparate actors seem to suggest that there are missed opportunities for actors to connect and bridge climate strategies to enhance the existing polycentric climate governance system. For a fragmented system to be coherent and achieve "polycentric order," 56 information sharing is needed, at a minimum, to facilitate continuous policy learning between actors; such order does not necessarily require strong relations between actors or formal coordinating institutions 56 . Diffusion, which can be defined as a process that occurs when policy decisions in a given jurisdiction are "systematically conditioned by prior policy choices made in other [jurisdictions]" 57 , may be one way in which climate actors in a polycentric or multi-level system can actually increase the speed of policy change to achieve efficiencies and narrow policy gaps to achieve global climate goals 58 .
While our analysis provides a broad survey of major actors' climate actions, it is limited by the lack of time-series data, that is, regular and repeated reporting on climate actions, strategies, and policies. Data on their effectiveness and performance could provide a deeper, longitudinal analysis of how climate actions are evolving, converging or diverging. Regular reporting or collection of climate action and policy documents over time could allow for examinations of changes over time to better connect actions to specific outcomes.
Despite these data limitations, this study provides a proof of concept to understand how NLP and network analysis techniques may be useful to evaluate policy inputs that are by nature qualitative and challenging to empirically evaluate, but could yield insights as to the ways climate actors could better link and learn from each other. Such learning could establish new norms for ambitious climate action and build coalitions and support to pursue them 7,59-65 , and help actors explore policy pathways and political mechanisms to scale and implement decarbonization 66 . These "indirect effects" could catalyze necessary know-how and support for governments to pursue national climate policies and international commitments with greater ambition 19,67-70 , but they are inherently difficult to quantify 7,71 . These next steps could address calls 9 for more empirical analysis of the ways in which climate actors operating at multiple levels and domains interact and whether such interactions are ultimately positive for global climate governance.

Methods
Data collection and preparation of climate actions database and text corpus. We collated sources of mostly publicly available information on country, corporate, and subnational climate actors and commitments from a range of data providers, including the CDP Annual Supply Chain Disclosure Survey, Carbonn Climate Registry, CDP Cities, EU Covenant of Mayors, Global Covenant of Mayors, Compact of States and Regions, Under 2 Coalition, C40 Cities for Climate Leadership, RE100, We Mean Business, Compact of Mayors, We Are Still In, Climate Mayors, and Climate Alliance (see Supplementary Table 1 for more details on the data sources compiled). Data were available in tabular format (.csv or .xlsx), or we scraped data from the reporting website using the Beautiful Soup Python package 72 . Data on the actors' location (i.e., country, region), actor type (e.g., country, city, region, company), and climate actions were compiled for this analysis. Climate actions in this analysis primarily refer to specific sectors (e.g., buildings, transport, waste) and actions (e.g., installing LED lighting, increasing percentage of electric vehicles) actors take to implement specific climate mitigation and adaptation activities. We did not include specific emission reduction target commitments (e.g., reducing emissions 20% from 2005 baseline by 2020) because we did not observe syntactic diversity in these targets that provided much variation or insight into the strategies and ways in which these actors are tackling climate change.
Data limitations. Available data for country and non-state actor climate action are limited to self-reported data by the actors themselves and largely restricted to the networks and registries listed in Supplementary Table 1, which others 73 have found mainly cover actors in developed countries. These climate action initiatives are driven by an agenda developed in the Global North 18 , and under-represent small and medium enterprises (SMEs), as well as smaller cities and regions. Smaller entities, or those based in the Global South, may be taking climate action, but may not have incentives or resources to report to these platforms. The cost of collecting and reporting data can also form a barrier; the costs of monitoring transportation and energy use vary, for instance, depending on access to technology and human resources. Our analysis, therefore, is limited to what data are available, which are not necessarily representative of all existing climate actions because of the reporting gaps mentioned above. There have been recent efforts, such as the SME Climate Hub launched in partnership with the UNFCCC in September 2020, to further engage smaller private actors to report on climate actions (https://smeclimatehub.org).
Text preprocessing. All non-English text data were translated into English using the Google Cloud Translate API. We eliminated all commitments where actors report climate actions that are <25 words in length. To prepare the corpus for analysis, we removed common stopwords (e.g., "a", "and", "the") from the SMART stopwords list 74 , which is built into the STM package 30 . We also removed 66 custom stopwords (Supplementary Table 2) based on an analysis of high-frequency words and place-specific words (e.g., "Indonesia") whose removal did not take away from the semantic content of actors' commitments. The WordNet Lemmatizer in the Python NLTK package was used to remove inflectional affixes from words with the same stem (e.g., produced, production, producing, producer, etc. become produce). The final corpus of climate action text totals 4,064,798 words and contains climate actions from 9,326 actors, with an average document length of 436 words, although the range of document length by actor group is quite variable (Table 1; Supplementary Fig. 1).
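The filtering and stopword steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the tiny stopword sets, the regular-expression tokenizer, and the function names are hypothetical stand-ins, lemmatization is omitted, and applying the 25-word threshold to the raw token count (before stopword removal) is an assumption.

```python
import re

# Illustrative stand-ins only: the paper uses the SMART stopword list
# plus 66 custom stopwords (Supplementary Table 2).
STOPWORDS = {"a", "and", "the", "of", "to", "in"}
CUSTOM_STOPWORDS = {"indonesia"}  # hypothetical place-specific example
MIN_WORDS = 25  # documents with fewer raw tokens are dropped (assumed order)


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def preprocess(documents):
    """Drop documents under MIN_WORDS tokens, then strip stopwords."""
    removable = STOPWORDS | CUSTOM_STOPWORDS
    cleaned = []
    for doc in documents:
        tokens = tokenize(doc)
        if len(tokens) < MIN_WORDS:
            continue  # too short to keep
        cleaned.append([t for t in tokens if t not in removable])
    return cleaned
```

In a fuller version, a lemmatization pass (the text names the WordNet Lemmatizer in NLTK) would follow the stopword filter.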
Topic modeling. The topic modeling used in this analysis builds on Latent Dirichlet Allocation or LDA 28 , a common text analysis technique that identifies and allows for prediction of topic probabilities in a text corpus. The topic model represents the overall themes present in a corpus (topics) as probability distributions over words in a vocabulary; so while the probability of the word train might be high in a topic relating to public transportation, it might be relatively low in one relating to building sector emissions. Documents are modeled as being formed word-by-word by a generative process where first a topic is selected according to some probability distribution specific to each document, and then a word is selected from that topic in accordance with the topic's distribution over vocabulary words. Using what the model considers to be outputs of this process (the documents in our corpus), we can infer the probabilities of each topic given a document, and each word given a topic through a training process.
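The word-by-word generative process described above can be made concrete with a toy simulation. The sketch below is illustrative only: the two topics, their word probabilities, and the function name are invented for the example, and it shows only the forward (generating) direction, not the inference step that a topic model performs.

```python
import random


def generate_document(doc_topic_probs, topic_word_probs, length, rng):
    """Sample a document word by word under the generative story.

    doc_topic_probs: dict topic -> probability for this document.
    topic_word_probs: dict topic -> dict word -> probability.
    """
    topics = list(topic_word_probs)
    topic_weights = [doc_topic_probs[t] for t in topics]
    words = []
    for _ in range(length):
        # Step 1: pick a topic from the document's topic distribution.
        topic = rng.choices(topics, weights=topic_weights)[0]
        # Step 2: pick a word from that topic's word distribution.
        vocab = list(topic_word_probs[topic])
        word_weights = [topic_word_probs[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights)[0])
    return words


# Hypothetical toy topics: "train" is likely under transport, rare under buildings.
TOPICS = {
    "transport": {"train": 0.6, "bus": 0.3, "boiler": 0.1},
    "buildings": {"boiler": 0.6, "insulation": 0.35, "train": 0.05},
}
```

Calling `generate_document({"transport": 0.8, "buildings": 0.2}, TOPICS, 50, random.Random(0))` yields a synthetic document dominated by transport vocabulary; training a topic model amounts to inferring such distributions from observed documents.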
We implemented our topic modeling using the stm package for R 75 . We specifically used the Spectral algorithm, which is stm's default 76 , without the inclusion of covariates. When the structural topic modeling (STM) algorithm is used without covariates, it is a correlated topic model but has several additional benefits over LDA. One major advantage of STM is that, while LDA assumes that all documents in a corpus discuss topics with the same diction, STM allows groups of documents to vary word usage within topics 75 . In addition, the Spectral algorithm implemented in the stm package provides more stable and consistent results because it is deterministic, an advantage over LDA, which is prone to problems of multi-modality in which there are multiple and sometimes equally likely outcomes 77 . When the number of documents is large, as in our case, the Spectral algorithm has been shown to perform very well and is consistent across machines 75 . We experimented with several algorithm specifications, including the LDA algorithm, and found that the Spectral algorithm as implemented by the stm package yielded the most coherent and consistent topics, after multiple runs and across various machines.
To determine the number of topics in the text, we examined metrics provided by the STM package, including exclusivity (i.e., uniqueness), held-out likelihood (i.e., cross-validation), semantic coherence of models (i.e., whether the topics contain words that are representative of a single coherent concept), and residuals (i.e., error). To guide our choice of the number of topics, we optimized for two metrics: held-out likelihood, which favors topics that are likely to produce documents held out of the training set, and semantic coherence 78 , which favors topics that assign high probabilities to words that appear close to one another in the corpus. From maximizing over these two metrics of performance and comparing 20-, 30-, and 40-topic models, we found that a model with 30 topics best maximized distinctness and coherence between topics, while minimizing overlap and the number of "junk" topics (i.e., words that commonly co-occur but together as keywords lack coherence as a singular topic). We found that this balance of examining statistical parameters and our own evaluation of topic models yielded the best result 79 .
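To illustrate what a coherence score rewards, the sketch below implements one commonly used co-occurrence-based formulation: for each pair of a topic's top words, it takes the log of the smoothed number of documents containing both words divided by the number of documents containing the higher-ranked word. Whether this matches the exact computation in the stm package is an assumption, and the function name and toy corpus are hypothetical.

```python
import math


def semantic_coherence(top_words, documents):
    """Co-document-frequency coherence for one topic's top words.

    top_words: topic words ordered from most to least probable.
    documents: list of token lists.
    Assumes every top word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score
```

Scores are at most slightly above zero and become more negative as a topic's top words co-occur less often, so a higher value indicates a more coherent topic under this measure.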
After selecting the 30-topic model, we produced brief summaries of each topic (i.e., topic labels; see Supplementary Table 3) by taking into account both the probability of words being generated by a specific topic, and by looking at how the topic was expressed in documents with a high probability of producing the topic. We do acknowledge, however, that these topic labels are subjective interpretations. This is a common limitation for topic modeling and other unsupervised statistical classification techniques, particularly as labeling is often determined through examination of the most probable words, which are not necessarily exclusive to a topic and represent a small fraction of the probability distribution 80 .
Actor similarity analysis Topic network analysis. The network map (Fig. 4) of the topics identified in the STM was developed using the topicCorr function in the STM package 75 to find positive correlations between topics in our selected 30-topic model. This function uses the estimated marginal topic proportion correlation matrix and eliminates edges where the correlation falls below 0, resulting in a network graph that only shows topics with positive correlations. Community clusters are determined using the fast greedy hierarchical clustering algorithm, which is based on a modularity measure that reaches a maximum in each cluster, so detected topics are most likely to appear together in given texts 81 . The network is visualized according to a standard Fruchterman-Reingold layout employed in R using the ggplot package 82 . Nodes were sized according to the mean topic prevalence and colored according to the actor type that had the highest per-document per-topic probability (i.e., gamma statistic) for each topic 83 .
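The idea of keeping only positively correlated topic pairs can be sketched as follows. This is a simplified stand-in for the R function named above, not a port of it: it computes plain Pearson correlations between columns of a document-by-topic proportion matrix, and the function names and the cutoff parameter are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def positive_topic_edges(theta, cutoff=0.0):
    """Return (topic_i, topic_j, correlation) for correlations above cutoff.

    theta: list of rows, one per document, each a list of topic proportions.
    """
    n_topics = len(theta[0])
    columns = [[row[k] for row in theta] for k in range(n_topics)]
    edges = []
    for i in range(n_topics):
        for j in range(i + 1, n_topics):
            r = pearson(columns[i], columns[j])
            if r > cutoff:
                edges.append((i, j, r))
    return edges
```

The resulting edge list could then be passed to a community detection and layout step; those steps are not reproduced here.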
Actor similarity by geography. In order to analyze similarity relationships between actors' climate commitments, we then constructed a network where each node corresponds to an actor, and each edge is weighted by the inverse of the Euclidean distance between 30-dimensional vector representations of the actors it connects. Following methods similar to previous studies 80 , we compared topic distributions between documents, since topic proportions per document are vectors of the same length. Each value in these vector representations corresponds to the prevalence of one of the 30 topics listed in Supplementary Table 3, meaning the Euclidean distance metric reflects the degree to which actors discuss different topics.
There are trade-offs and limitations in the selection of similarity metrics, with cosine similarity and Euclidean distance being two common metrics in NLP and text clustering 77,84,85 . In some cases, researchers 84 found Euclidean distance to perform worst in unsupervised clustering of similar text documents, while others 85 found it to perform best in their evaluation of short texts of 20 words in length. Another study 77 further found issues when applying cosine similarity: correlations between cosine similarity and shared top words and top documents were slightly less clear, with multiple cases where high cosine similarity appeared alongside a comparatively low number of top words or documents in common. As a sensitivity check, we calculated the similarity between all topic-document distribution pairs using both Euclidean distance and cosine similarity (Supplementary Figs. 3-7) and then visually inspected documents to evaluate the better metric. Our evaluation of cosine similarity is similar to that of Roberts et al. 77 : we observed more similarity among the action plans of European city actors, as indicated by the Euclidean distance metric, than between European and Middle East/North African actors, which the cosine similarity metric suggests are more similar. This finding makes sense, considering most European city actors pledge and report actions through the EU Covenant of Mayors for Climate and Energy, which provides specific guidance on how actors should develop their action plans to meet the requirements of the initiative 86,87 .
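For reference, the two metrics compared above, and the inverse-distance edge weight used for the actor network, can be written out directly. The sketch below is a plain-Python illustration under the assumption that each actor is represented by a list of topic proportions; the function names are hypothetical, and 4-dimensional vectors stand in for the 30-dimensional ones.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two topic-proportion vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def edge_weight(u, v):
    """Inverse Euclidean distance; undefined for identical vectors."""
    return 1.0 / euclidean_distance(u, v)


# Hypothetical 4-topic example vectors.
actor_a = [0.5, 0.5, 0.0, 0.0]
actor_b = [0.4, 0.4, 0.1, 0.1]
```

Euclidean distance is sensitive to how concentrated each vector is across topics, whereas cosine similarity depends only on direction, which is one reason the two metrics can rank actor pairs differently.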
The most prevalent topic representations for each actor node were used to construct the network map. Edges are directed, and are drawn from an actor to the actor it is closest to by this metric. The edges are also shaded based on the source's longitude.
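The directed nearest-neighbor construction can be sketched as below, assuming each actor maps to a vector of topic proportions. The function name, the tie-breaking rule (first actor encountered wins), and the treatment of identical vectors are assumptions for illustration; edge shading by longitude is left out.

```python
import math


def nearest_neighbor_edges(actors):
    """Draw one directed edge from each actor to its closest other actor.

    actors: dict mapping actor name -> list of topic proportions.
    Returns a list of (source, target, weight), where weight is the
    inverse Euclidean distance (infinite if the vectors are identical).
    """
    edges = []
    for source, vec in actors.items():
        best_target, best_dist = None, math.inf
        for target, other in actors.items():
            if target == source:
                continue
            d = math.dist(vec, other)  # Euclidean distance (Python 3.8+)
            if d < best_dist:
                best_target, best_dist = target, d
        if best_target is not None:
            weight = 1.0 / best_dist if best_dist > 0 else math.inf
            edges.append((source, best_target, weight))
    return edges
```

Because each actor points only to its single closest neighbor, the graph has as many edges as actors, and an edge from A to B does not imply an edge from B to A.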
To understand what Fig. 5 reveals about how actors from different regions interact, we matched each actor to one of eight regions and computed the proportion of the total number of edges that we observed between each pair of regions:

$$\mathrm{Frequency}_{i,j} = \frac{\text{number of edges from an actor in region } i \text{ to an actor in region } j}{\text{total number of edges observed for actors in region } j}$$

These values were calculated for all of the edges in the network as well as disaggregated by actor type and placed into corresponding heat maps (Supplementary Figs. 3-7).
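A small sketch of this region-pair frequency follows. One point is an interpretation rather than something the text states: the denominator is taken here to be the number of edges that point to an actor in region j. The function name and the region labels in the example are hypothetical.

```python
from collections import Counter


def region_frequencies(edges, region_of):
    """Share of edges into region j that originate in region i.

    edges: iterable of (source_actor, target_actor) pairs.
    region_of: dict mapping actor name -> region label.
    Assumption: the denominator counts edges whose target is in region j.
    """
    pair_counts = Counter()
    target_totals = Counter()
    for source, target in edges:
        i, j = region_of[source], region_of[target]
        pair_counts[(i, j)] += 1
        target_totals[j] += 1
    return {
        (i, j): count / target_totals[j]
        for (i, j), count in pair_counts.items()
    }
```

Arranging the resulting values in a region-by-region grid gives the kind of matrix that a heat map would display.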
Sensitivity analysis. We conducted several sensitivity analyses and robustness checks to evaluate our choice of topic model and algorithm selection. We first evaluated results in relation to the length and number of actors included in our text corpus, given the variation in the number of actors (minimum 76 regional actors, maximum 5536 city actors) and the length of their climate actions in our database (minimum length 25 words to maximum over 20,000 words) (Table 1). First, we assessed whether the dominance of one data source for cities affected the topic model by randomly selecting 400 texts from the CDP (n = 535) and EU Covenant of Mayors (n = 4699), which represented the largest sources of data. Second, we randomly selected a number of actors' texts to achieve a balanced corpus length for each actor group, since previous studies 88 have found that relatively shorter (between 300 and 600 words) documents improve the accuracy and consistency of the topic modeling approach. We also repeated our topic model using a noun-only, lemmatized (i.e., root form) version of the text corpus to evaluate whether reporting styles or differences in writing about climate actions impacted the topic model or results. The results of the sensitivity analysis are in Supplementary Table 5 and Supplementary Figs. 8-10.
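The source-balancing check described above amounts to drawing an equal-sized random sample from each data source before refitting the topic model. A minimal sketch, assuming documents are grouped by source in a dictionary; the function name, the fixed seed, and the placeholder document identifiers are illustrative choices, not details from the study.

```python
import random


def subsample_by_source(docs_by_source, n_per_source, seed=0):
    """Randomly draw up to n_per_source documents from each data source.

    docs_by_source: dict mapping source name -> list of documents.
    A seeded generator is used so the draw can be repeated exactly.
    """
    rng = random.Random(seed)
    sample = {}
    for source, docs in docs_by_source.items():
        k = min(n_per_source, len(docs))  # never request more than exist
        sample[source] = rng.sample(docs, k)  # sampling without replacement
    return sample


# Placeholder corpora sized like the two largest city sources in the text.
corpora = {
    "CDP": [f"cdp_doc_{i}" for i in range(535)],
    "EU Covenant of Mayors": [f"eucom_doc_{i}" for i in range(4699)],
}
balanced = subsample_by_source(corpora, n_per_source=400)
```

Refitting the model on `balanced` and comparing topics with the full-corpus model is then a matter of inspection, as the text describes.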
Through visual inspection, we determined that the 30-topic model developed from our corpus that includes all actors' text is not affected by variable lengths in actors' documents or dominance of one data source. For the noun-only, lemmatized text corpus, we found that the topics were very similar to those of the unlemmatized text: a topic on climate adaptation, for instance, still appeared and was still most commonly found in countries' action plans. As Supplementary Figs. 8 and 9 reveal, topics on business travel and employee commuting, consumption, renewable energy, and efficiency were still most common amongst business actors. Topics focused on municipal climate actions, public buildings, waste and transport, and mobility are prevalent amongst city climate actors. We also still found the same tendency for climate actions within the same regions to be more similar than for actors located in different regions (Supplementary Fig. 10).

Data availability
Code and data to reproduce figures are available on figshare (https://doi.org/10.6084/m9.figshare.13501701) 94 . CDP data were provided under a license to A. Hsu that prohibits public resharing of the climate strategy disclosure data used for this analysis. All other data were compiled into a single database in comma-delimited (.csv) format from publicly available sources, which are detailed in Supplementary Table 1. Contextual data for subnational actors were extracted from the ClimActor database (https://doi.org/10.1038/s41597-020-00682-0) 95 .

Code availability
All statistical analyses were conducted using the R statistical programming environment (Version 3.6.2) and the stm package 75 for the topic analysis. The quanteda package 89 was used for word collocation analysis. Figures were made using the ggplot package in R 82 . For the network analysis, the tidytext 83 , stm 75 and Textnets 32 packages were used. Gephi was used to generate and analyze the geographic network graphs presented, and the Python package NLTK 90 was used for minor pre-processing tasks. Pandas 91 and Numpy 92 were also used for dataframe and matrix manipulations, and some additional plotting was done with Matplotlib 93 . R and Python code to reproduce the figures is available upon reasonable request.