Main

Social media platforms, such as Twitter, provide important locations for the everyday discussion and debate of climate change1. The nature of this role is highly contested, with some pointing to its democratizing potential while others argue that social media is accelerating political polarization2. Monitoring polarization is important given that a highly polarized environment has the potential to drive antagonism between ideological groups, generate political deadlock and threaten pluralist democracies3. The study of online polarization has thus gained momentum in recent years4,5,6.

In this paper, we analyse tweets related to the Conference of the Parties (COP) to clarify the nature of polarization in political debates on climate change. Specifically, we are interested in how the climate discussion is structured on Twitter in terms of the plurality of views and the interaction patterns among ideologically opposed groups. We find that a prominent opposition to the dominant pro-climate discourse has established itself since late 2019, resulting in a highly polarized online climate debate.

Twitter is the ideal platform for studying climate communication because it is widely used by politicians and journalists2, has broad social and cultural influence7, and because of the rich structural data it captures. Of course, Twitter is not directly analogous to public opinion, and our results probably derive from a combination of the platform’s well-documented tendency to foster polarization and the broader contexts for climate politics8,9,10. However, many studies highlight the importance of Twitter (and social media in general) as a critical tool for studying climate communication1,11,12,13,14,15,16, political polarization6,17 and misinformation18. Beyond social media, a broad literature considers the polarization and politicization of climate issues using other computational techniques and more traditional approaches19,20,21,22,23. Here we extend this literature by exploiting tools from the growing field of infodemics5,24,25,26,27,28,29,30,31.

The motivation for our focus on COP is threefold. First, the COP discussion can be characterized as a discrete, regularly repeated online event that lends itself to a quantitative, multi-year analysis of climate polarization (a key gap in the literature). Second, by focusing on a specific event, we ensure that tweet content is thematically focused (in our case on climate politics) and that the network of interactions is sufficiently connected to allow robust network analysis (which is not always possible with sampled datasets32). This event focus is a common feature of previous climate communication studies on Twitter (for example, on the IPCC report12 or the Finnish elections15). For a review of the benefits of studying specific events or controversies, see ref. 33. Finally, COP is the pre-eminent international forum for climate diplomacy, directing considerable public attention towards climate change34,35,36. This makes COP the ideal target for studying the intersection between climate change and political polarization.

Here we first highlight the significance of COP21 and COP26 relative to other COPs. Second, we derive a spectrum of climate ideologies (defined by constructing a synthetic distribution of opinions based on similarities in user–user interactions), which reveals two prominent groups: an ideological minority and a majority. We reveal that polarization (measured as the bimodality of the ideology distribution) is low pre-COP25, before a large increase in COP26 (with supplementary data suggesting that the increase in polarization probably started in 2019 around the global climate strikes). Third, we emphasize the political dimension of COP, revealing broad international engagement from elected politicians and highlighting the political parties who oppose urgent climate action. Fourth, we investigate discussion topics during COP26 and highlight the overlap between minority rhetoric and established climate-contrarian views37. Notably, the issue of political hypocrisy is identified as a salient issue of cross-ideological appeal. Finally, we supplement our analysis with Twitter data on climate scepticism and climate change and show that our COP analysis is broadly representative of the wider climate discussion on Twitter.

Results

We start by highlighting the significance of COP21 and COP26 (Fig. 1). Figure 1a shows the number of posts on Twitter from 2014 to 2021. The inset shows general online engagement with COP measured using Google Trends, revealing that Twitter engagement closely reflects wider online attention. Within our study period, COP21 and COP26 are of particular significance, with the Paris Agreement signed at COP21 and the Glasgow Climate Pact agreed at COP26. Consequently, content creation and engagement (that is, retweet count) are larger for COP21 and COP26 than in the intermediate years. Our data show the influence of local engagement (inset), where the overall Google Trends scores are presented alongside country-specific scores for France (the host of COP21) and the United Kingdom (the host of COP26). Supplementary Sections 1A and 1B show a similar analysis for YouTube and Reddit, where activity is significantly lower than on Twitter.

Ideological polarization during COP

To assess the emergence of a broad climate-contrarian community, we now analyse the evolving nature of ideological polarization between COP20 and COP26. Polarization is most often quantified in terms of the modality of a distribution of surveyed opinions38 (the definition we choose to use here; see the discussion in the Methods), although other valid definitions exist. However, on Twitter true opinion data are unavailable, so instead we infer a synthetic opinion distribution from retweet data as a proxy (Methods). In subsequent sections, we validate this proxy method by showing how opposite ends of the ideological spectrum correspond to distinct views on climate change.

We start by assuming that the climate ideology of an individual, i, can be expressed as a single number, $$x_i$$ (ref. 27). Polarization then refers to the properties of the probability distribution, $$\mathcal{P}(x)$$, of ideological scores across a population. The ideological spectrum is extracted from the Twitter retweet network using the “latent ideology” method4,5,39. Loosely speaking, the method produces an ordering of users and influencers where accounts with similar retweet interactions are close to each other in the ordering, resulting in similar scores (Methods). We specify that the majority group maps to −1 and the minority to +1. Any account with an ideology score less than (more than) zero is part of the majority (minority).

We calculate the latent ideology for COP21 and COP26 (Fig. 2), where influencers are selected as the top 300 most retweeted accounts, excluding a small number (3%) that confound the results (Methods). Influencer demographics and labelling are discussed in Supplementary Section 1D.

The latent ideology shows unimodal user ideology for COP21, whereas the COP26 user ideology is multimodal, as confirmed by Hartigan’s diptest (Methods): the bimodality statistic, D, increases from COP21 to COP26 (COP21: D = 0.0023; 95% confidence interval, (0.0020, 0.0026); P = 0.003; COP26: D = 0.049; 95% confidence interval, (0.048, 0.050); P < 2.2 × 10⁻¹⁶). Despite the special significance of COP21, similarly low polarization (that is, unimodal ideologies) is found for all COPs prior to COP26 (Extended Data Fig. 1).

For both COP21 and COP26, influencers split into majority and minority actors. The majority are largely pro-climate accounts. Focusing on the minority gives some indication of the ideological divide present in these datasets. The COP21 minority has three influencers: @BjornLomborg, @Tony__Heller and @JunkScience. These individuals are climate focused and self-identify as outside the climate mainstream: @JunkScience quotes a Nature Climate Change article referring to him as “the most influential climate science contrarian”40, @BjornLomborg references his book False Alarm: How Climate Change Panic Costs Us Trillions and @Tony__Heller links to his climate-critical blog realclimatescience.com.

For COP26, we find 56 minority influencers. Of these, 6 have a clear climate focus. The remainder include media organizations and journalists (for example, @newsmax, @nypost, @GBNEWS, @PrisonPlanet and @bennyjohnson), politicians (@SteveBakerHW and @laurenboebert) and accounts campaigning against COVID-19 restrictions (@BernieSpofforth and @JamesMelville). This last group may not have strong views on climate; however, their presence in the minority remains important given how similarities in user–content interactions are used by recommendation systems41.

Qualitatively, the increase in polarization is robust to variable influencer number, different influencer definitions, different data-collection time windows and the removal of tweets related to COVID-19 (Supplementary Figs. 3–7). Further analysis also suggests that bot activity and deleted content do not confound the observed increase in polarization (Supplementary Sections 2C and 2D). Our analysis shows that around 30% of climate-sceptic accounts from 2015 are no longer active on Twitter. However, deletion rates would need to exceed 80% to explain the observed increase in polarization.

One important question is whether growing polarization is a consequence of shifting views (that is, individuals moving from a majority to a minority ideological position) or changes in minority activity (that is, users with pre-existing climate-sceptic views expressing those views more prominently on Twitter). We assess this by recomputing the ideological spectrum using an equal number of minority and majority influencers who appear in both the COP21 and COP26 datasets (Fig. 3). This shows that minority influencers from COP21 remain in the minority for COP26, and majority influencers remain in the majority; over half of the minority influencers selected using this method are climate-focused accounts. However, for the standard retweet-based COP26 minority (Fig. 2), only 11% are climate focused. This demonstrates that the promotion of climate-contrarian views is shifting away from climate-focused accounts towards a broader set of non-specialized influencers. The observed increase in polarization is probably due to users with existing minority views expressing those views more prominently on Twitter, although determining this precisely is difficult.

The political dimension of COP26

To better understand how ideological views on climate change are associated with political leanings, we now highlight the role of elected politicians in the COP26 dataset (Methods). We do this by recomputing the latent ideology twice: first, using exclusively politicians as influencers, and second, excluding politicians, generating a two-dimensional spectrum (Fig. 4). Political engagement between COP20 and COP25 is discussed in Supplementary Section 1E; for COP21, we find only one minority politician (Roger Helmer, former UK Independence Party Member of the European Parliament).

Marking the median positions of select Anglophone political parties shows how the majority and minority, which appear homogeneous along the climate axis, split into groups with more geographical and political nuance. In the minority, we find a large block dominated by the US Republicans and former UK Brexit / UK Independence Party politicians, alongside a smaller block corresponding to the Canadian Conservative party. Related tweet extracts include calls for a “Net Zero Referendum” (Nigel Farage), claims that COP “has absolutely zero credibility” (Lauren Boebert) and statements that “everything you’ve being [sic] told by climate alarmists is a lie” (Maxime Bernier).

In the majority, we find most other mainstream political parties. It is perhaps surprising that some parties criticized as weak on climate action appear in the majority, notably the Australian Liberals42. However, this reflects pro-climate rhetoric by Scott Morrison, which attracted majority retweets (for example, “pleased to agree a new low emissions tech partnership”) as well as criticism (“#ScottyFromMarketing”).

One apparent oddity is that left-leaning political groups (for example, UK Labour and the Greens) appear ideologically closer to the minority on the climate axis than more conservative parties. The analysis below suggests that this is due to cross-ideological accusations of political hypocrisy.

Topics of discussion

Topics in the COP26 discussion can be extracted using BERT topic modelling43 (Methods) and placed on the ideological spectrum (Fig. 4). See Supplementary Section 1I for the COP21 results.

Majority topics

Majority topics have a clear climate focus, making explicit reference to specific COP themes, including “women’s day”, “transport day” and “climate finance”. Beyond these, there are topics related to climate activism with a specific emphasis on youth protests, indigenous groups, the need for “climate justice”, and the “decolonisation” of climate change.

Potentially the most striking rift between users in the majority relates to whether they are COP supportive or not. Many pro-climate accounts are critical of the COP process, describing it as ineffective and accusing it of “greenwashing”. This theme is a clear shift from COP21, where only select influencers were COP critical (7% of labelled COP21 influencers; Supplementary Section 1D), most notably George Monbiot and Naomi Klein. Since then, criticism of the COP process has grown significantly (35% of labelled COP26 influencers).

Minority topics

The COP26 minority discuss a broad range of climate-related topics. Cross-referencing these with a taxonomy of “climate contrarian” claims37 shows that the COP26 minority promote and engage with all five of the leading contrarian claim types (Table 1).

Other topics not specific to climate include tweets critical of particular politicians, most notably Joe Biden (referred to as “sleepy Joe”), Boris Johnson (for promoting green policies) and Justin Trudeau (for allegedly destroying the Canadian oil/gas industry). Finally, there are topics of wider relevance to the political right, particularly COVID-19 (the “plandemic”), vaccines (“#NoVaccinePassports”) and illegal immigration (“[stop] illegal economic migrants”).

Political hypocrisy and the ideological divide

Understanding content that bridges the ideological divide is important for assessing which topics may act as a gateway into the ideological minority, particularly since Twitter recommends content on the basis of similarities in user–content interactions between accounts41. To assess this, we rank tweets according to the number of cross-ideological retweets—that is, majority author but minority retweeter, or vice versa. This reveals the theme of political hypocrisy, which includes references to the use of private jets and diesel cars, the continued use and development of fossil fuels, and the dumping of raw sewage. Half of all majority tweets referencing hypocrisy have been posted since December 2020 (Supplementary Section 1K and Extended Data Fig. 2).
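As an illustration of the ranking described above, the following minimal Python sketch counts, for each tweet, the retweets in which author and retweeter sit on opposite sides of the ideological spectrum. The record layout, account names and scores are hypothetical and are not taken from the study's data; accounts without a score (or with a score of exactly zero) are simply skipped here, which is an assumption.

```python
from collections import Counter

def rank_cross_ideological(retweets, ideology):
    """Rank tweets by retweets that cross the ideological divide.

    `retweets` holds (tweet_id, author, retweeter) tuples and `ideology` maps
    account -> latent ideology score; both layouts are hypothetical.
    """
    counts = Counter()
    for tweet_id, author, retweeter in retweets:
        a, r = ideology.get(author), ideology.get(retweeter)
        if a is None or r is None:
            continue                      # account without an ideology score
        if a * r < 0:                     # opposite signs: majority <-> minority
            counts[tweet_id] += 1
    return counts.most_common()           # most cross-ideological first

# Toy data: negative scores are majority accounts, positive scores minority.
ideology = {"maj_1": -0.9, "maj_2": -0.6, "min_1": 0.8, "min_2": 0.7}
retweets = [
    ("t1", "maj_1", "min_1"),
    ("t1", "maj_1", "min_2"),
    ("t1", "maj_1", "maj_2"),    # same side, not counted
    ("t2", "min_1", "maj_2"),
    ("t3", "maj_2", "unknown"),  # unscored retweeter, skipped
]
ranking = rank_cross_ideological(retweets, ideology)
```

Using the sign of the product of the two scores keeps the test symmetric, so majority-to-minority and minority-to-majority retweets are treated alike.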

News media reliability

Given the distinct topics discussed by the majority and minority, we may expect that these ideological groups reference different news media outlets. We show this using heat maps of ideology against independent news media reliability scores (Fig. 5 and Methods). This reveals that the ideological majority preferentially reference news domains with high trust scores, whereas the minority often reference domains with low scores. This result is robust if we use country-specific NewsGuard scores (Supplementary Section 1F). In Supplementary Section 1J, we show the formation of ideological echo chambers during COP, a common feature of polarized communities on social media9,27.

The wider climate discussion on Twitter

We now show that the COP discussion is broadly representative of the wider climate discussion on Twitter. This is important since keyword-based data collection using the search term ‘COP2x’ may fail to capture certain climate-related communities.

First, we cross-reference our COP dataset with two supplementary datasets: 1.3 million tweets using terms associated with climate scepticism, and all original tweets since 2012 using the term ‘climate change’ (Supplementary Section 1K). This reveals that (1) the activity of the COP26 minority is highly correlated to the activity of the broader climate-sceptic community on Twitter (Extended Data Fig. 3), (2) the COP26 minority started to engage with climate issues much more recently than the COP26 majority (Extended Data Fig. 4) and (3) climate-sceptic activity was very low, but present, before 2019. Note that there is no evidence of a change in the interaction rate between pro-climate and climate-contrarian groups (Supplementary Section 1K).

The expression of climate scepticism saw significant growth from 2019 onwards, peaking during the global climate strikes in September 2019 and the Australian bushfires in January 2020. This growth does not appear to have translated into significant engagement from sceptics during COP25, most likely due to its lesser importance (Fig. 1; major new agreements were not negotiated at COP25).

Increased sceptic activity does not necessarily imply an increase in the number of Twitter users with climate-sceptic views but more likely implies an increase in users expressing those views (which is in itself important). Possible drivers of this growth include (1) the issue of political hypocrisy (see above), specifically following the approval of a new Canadian oil pipeline in June 2019; (2) a backlash to the direct impact of the global climate strikes (minority content is particularly critical of Greta Thunberg and Extinction Rebellion); and (3) the belief that the climate movement is unreliable, with minority users blaming the Australian bushfires on arson, not climate change.

Discussion

We have investigated ideological polarization around climate change by analysing the discussion around COP on Twitter. Our results show that ideological polarization, measured in terms of bimodality, was low and largely flat between COP20 and COP25 before a significant increase during COP26, driven by growing right-wing activity.

Cross-referencing the COP dataset with additional data on climate scepticism highlights 2019 as a key year when the expression of climate scepticism grew on Twitter. Our data point towards the role of political hypocrisy and a potential backlash to direct action from climate activists (see refs. 44,45,46 for related discussions) as potential factors in this growth.

The opposition to climate action is a known feature of populist politics47, largely due to the association of climate change with issues of institutional trust and populist attitudes towards science. This trend has probably been catalysed by anti-science sentiments during the COVID-19 pandemic48. However, surveys suggest that right-wing views on climate are more subject to change than left-wing views49. Consequently, there is reason to believe that growing right-wing opposition to climate action may be reversible.

It is perhaps surprising that the events with the greatest increase in climate scepticism on Twitter took place since 2019 and not earlier, particularly given Trump’s election5,50 and Brexit51. Climate issues were not a central feature of the 2016 Brexit debate. Yet, many members of the COP26 minority were prominent Brexit campaigners. This shift may be a sign that these politicians see opposition to climate action as a topic with growing popular appeal; note, for example, Nigel Farage’s “Net Zero Referendum” campaign.

Given that rapid and effective climate action depends on broad international consensus and collaboration, the growth in polarization may risk political deadlock if it fuels antagonism to climate action3. Policymakers should consider how actionable factors may be driving this polarization; perceptions of political hypocrisy may be critical in this regard. Our analysis suggests that these perceptions are worsening, not improving. Similar concerns regarding hypocrisy discourse around climate change have been raised previously52,53. For instance, researchers have shown that tweets referencing climate hypocrisy tend to have higher virality53.

Our analysis focuses on Twitter because this is where the COP discussion is the most active and where we find influencers from across the political spectrum. The data were acquired using a keyword search (‘COP2x’), which ensures that the tweets are thematically focused and the data can be feasibly acquired from the Twitter API. In principle, this approach may fail to capture the full climate conversation on Twitter, but supplementary data show that our results are broadly representative of the wider climate discussion. Future work should acquire larger datasets with a broader focus (perhaps using a random tweet sample if sufficient data can be acquired, although this approach is problematic for structural analysis32) and could consider a wider range of platforms54. Our data suggest that the COP discussion is not particularly active on YouTube or Reddit, although this may not be the case for other climate events.

Given significant engagement with climate politics during COP26 from groups and politicians opposed to climate action, future work should monitor how this evolves during COP27 and onwards. Possible questions include (1) whether ideological minorities are growing or declining in influence, (2) whether social media polarization is having a broader impact on public debates and (3) whether ideological echo chambers are becoming more or less isolated as climate communication strategies develop.

Finally, it is a value judgement as to what constitutes a healthy plurality of views on social media or unhealthy polarization. Consensus should not be expected55,56. However, tracking trends in polarization over time is critical for understanding the political context for accelerated climate action and how political actions may impact public opinion.

Methods

Datasets

Twitter data including tweets and user information were collected using the official Twitter API for academic research (https://developer.twitter.com/en/docs/twitter-api), using the search query “cop2x”, x ∈ {0, …, 6}. For each COP, data were collected from 1 June in the year of the conference to 31 May in the following year, except COP26, for which data were collected up to and including 14 November 2021. Statistics for each COP are provided in Supplementary Table 5. Each dataset was downloaded between October and November 2021.

Twitter accounts associated with elected politicians were labelled using an existing dataset of political Twitter handles from 26 countries collected between September 2017 and February 2021. The dataset is freely available at TwitterPoliticians.org or on FigShare (https://figshare.com/articles/dataset/The_Twitter_Parliamentarian_Database/10120685). The dataset is discussed in detail in ref. 58. Note that select politicians elected in 2021 who were missing from the dataset were added manually to the database if they appeared as prominent influencers in the COP26 network (for example, @laurenboebert). For practical purposes, an account is labelled as a politician even if that politician no longer holds elected office.

Network construction

The Twitter interaction network was constructed by taking the full corpus of tweets for each COP and focusing exclusively on retweets. Such an approach is typical in the Twitter analysis literature, where retweets are considered evidence of a user endorsing the message of the original poster; this is despite many Twitter users stating in their biography that retweets should not be understood as endorsements. This is in contrast to quote tweets or comments, which are less likely to represent a clear endorsement of a tweet. After selecting all the retweets from the full Twitter dataset, we filtered by language using the Twitter API language metadata, selecting only those retweets written in English.

From this set of English-language retweets, a network was constructed by defining a node for each unique user in the dataset. This includes any user who authored an original English-language tweet, or retweeted an English-language tweet, containing the keyword ‘cop2x’, x ∈ {0, …, 6}. A directed edge is formed from node A to node B if user A retweeted a post authored by user B. Edges are weighted according to the number of unique retweets between those two users.
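The construction above can be sketched in a few lines of Python. This is a minimal illustration only: the tuple layout `(retweet_id, retweeter, author, lang)` and the function name are hypothetical stand-ins for the API payload, and the sketch lists only users that appear on at least one edge (authors who were never retweeted would need to be added as isolated nodes).

```python
from collections import Counter

def build_retweet_network(retweets):
    """Return weighted directed edges {(retweeter, author): weight}.

    `retweets` is an iterable of (retweet_id, retweeter, author, lang) tuples;
    this record layout is a hypothetical stand-in for the API payload.
    """
    seen = set()      # guard against duplicate records of the same retweet
    edges = Counter()
    for retweet_id, retweeter, author, lang in retweets:
        if lang != "en" or retweet_id in seen:
            continue  # keep English-language, unique retweets only
        seen.add(retweet_id)
        edges[(retweeter, author)] += 1   # edge A -> B: A retweeted B
    return edges

sample = [
    (1, "alice", "bob", "en"),
    (2, "alice", "bob", "en"),
    (2, "alice", "bob", "en"),   # duplicate record, counted once
    (3, "carol", "bob", "fr"),   # non-English, dropped
    (4, "bob", "alice", "en"),
]
edges = build_retweet_network(sample)
nodes = {user for edge in edges for user in edge}
```

Storing edges in a `Counter` keyed by `(retweeter, author)` gives the edge weight directly as the number of unique retweets between the two users.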

Measuring polarization

Assessments of polarization in social systems have become a key research theme in computational social science, particularly in recent years following the Trump presidency5,50 in the United States and Brexit in the United Kingdom51. Despite this, there is no one agreed definition of polarization, with variable definitions depending on the research question and field.

In the social sciences, the term ‘polarization’ is typically understood as some form of distance measure on a (typically one-dimensional) distribution of opinions. Under this general framework, polarization may be quantified in numerous ways, including but not limited to spread, dispersion, regionalization, community fracturing, distinctness and group size. For an extensive discussion of these “senses” of polarization, see ref. 38. As stated in ref. 38, “the most common measure of polarization in the political literature is probably bimodality, which is the idea that the population can be usefully broken down into two subpopulations”. This is the definition we choose in the current paper, in particular because it reflects how prominently politicians (for whom Twitter is particularly influential) see content that is pro-climate or climate sceptical. We stress, however, that using alternative definitions of polarization may lead to our results being interpreted differently.

One of the limitations of this family of polarization measures is that they typically consider opinion distributions in the absence of structure; we may observe polarized views among individuals, but we do not necessarily know how different individuals interact with each other. It is this structural factor that is focused on in network science, where polarization is often thought of as a distance measure on two (or more) network communities; for a nice example of such work with relevance to climate change, see ref. 15. However, this structural point of view (polarization in terms of interactions) often fails to consider polarization in terms of opinions. This is a limitation, but it reflects the reality of most social media studies of polarization, where structure is known but ground-truth opinion (for example, from surveying individuals) is not known. This is the case for Twitter, where the ‘true’ opinion of an account is unknown.

The latent ideology measure39 used in the current study aims to infer a synthetic opinion distribution from network structure, on the basis of the premise that the structural separation of group interactions on a particular topic should correlate with differences in group opinions on that topic. Without external validation, such synthetic opinion distributions can be dangerous, particularly if a network appears structurally polarized for reasons other than individual views on a topic (for instance, due to geographical factors). However, with validation, such an approach has the benefit of combining the nuanced social science concept of polarization with the structural approach typical in social media studies.

Having extracted the distribution of opinions using the latent ideology method, we quantified polarization in terms of bimodality using Hartigan’s diptest (see below). The choice to measure polarization in terms of bimodality is deliberate since it is a relative measure (as opposed to an absolute measure), which can be applied to a synthetic distribution of opinions. Using absolute measures is difficult with synthetic distributions given that absolute opinion scores are not easily mapped to scores that may be derived from surveys (for example, ‘Out of 10, how strongly do you support climate action?’).

Latent ideology

The latent ideology estimation was developed in refs. 4,39 and adapted for exploiting retweet interactions in ref. 5. Following ref. 5, we infer ideological scores for Twitter users using correspondence analysis59 and retweet interactions.

First, we built a matrix A such that each element $$a_{ij}$$ is the number of times user i retweeted influencer j. To select only users that are interested in the COP26 debate, we pruned out users that retweeted fewer than two influencers.

We then executed the correspondence analysis method according to the following steps. Given the adjacency matrix normalized by the total number of retweets as $$P = A\left(\sum_{ij} a_{ij}\right)^{-1}$$ and the vectors of row and column sums as $$\mathbf{r} = P\mathbf{1}$$ and $$\mathbf{c} = \mathbf{1}^{\mathrm{T}}P$$, respectively, and considering the matrices $$D_{\mathrm{r}} = \mathrm{diag}(\mathbf{r})$$ and $$D_{\mathrm{c}} = \mathrm{diag}(\mathbf{c})$$, we can compute the matrix of standardized residuals of the adjacency matrix as $$S = D_{\mathrm{r}}^{-1/2}(P - \mathbf{r}\mathbf{c})D_{\mathrm{c}}^{-1/2}$$. The usage of the standardized residual matrix allows the method to account for differences in users’ activity and influencers’ popularity. Next, singular value decomposition is applied to the matrix S as $$S = UD_{\alpha}V^{\mathrm{T}}$$ with $$UU^{\mathrm{T}} = VV^{\mathrm{T}} = I$$ and $$D_{\alpha}$$ being the diagonal matrix of singular values. The standard row coordinates $$X = D_{\mathrm{r}}^{-1/2}U$$ can be considered as the estimates of the user ideologies. In our study, we only consider the first dimension that corresponds to the largest singular value. Users’ ideological positions are computed by rescaling the row estimates into the interval [−1, 1], while the influencers’ ideological positions are calculated by the median of the weighted positions of their retweeters.
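The steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the function name is hypothetical, power iteration stands in for a library singular value decomposition (only the leading dimension is needed), and the influencer score is taken as the median of retweeter scores with each retweeter repeated once per retweet, which is one possible reading of “median of the weighted positions”. The sign of the resulting axis is arbitrary and would still need to be oriented so that the majority lies at −1.

```python
import math
import random
from statistics import median

def latent_ideology(A, min_influencers=2, iters=500, seed=0):
    """First correspondence-analysis dimension of a user x influencer matrix.

    Returns (kept_row_indices, user_scores, influencer_scores).
    """
    # Prune users who retweeted fewer than `min_influencers` distinct influencers.
    kept = [i for i, row in enumerate(A) if sum(v > 0 for v in row) >= min_influencers]
    rows = [A[i] for i in kept]
    n, m = len(rows), len(rows[0])

    total = sum(map(sum, rows))
    P = [[v / total for v in row] for row in rows]
    r = [sum(row) for row in P]                               # r = P 1
    c = [sum(P[i][j] for i in range(n)) for j in range(m)]    # c = 1^T P

    # Standardized residuals S = D_r^{-1/2} (P - r c) D_c^{-1/2}.
    S = [[(P[i][j] - r[i] * c[j]) / math.sqrt(r[i] * c[j]) if c[j] > 0 else 0.0
          for j in range(m)] for i in range(n)]

    # Power iteration on S S^T for the leading left singular vector u
    # (a stand-in for a full SVD; assumes S is not identically zero).
    rng = random.Random(seed)
    u = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        v = [sum(S[i][j] * u[i] for i in range(n)) for j in range(m)]   # S^T u
        u = [sum(S[i][j] * v[j] for j in range(m)) for i in range(n)]   # S v
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]

    # Standard row coordinates X = D_r^{-1/2} u, rescaled to [-1, 1].
    x = [u[i] / math.sqrt(r[i]) for i in range(n)]
    lo, hi = min(x), max(x)
    users = [2 * (xi - lo) / (hi - lo) - 1 for xi in x]

    # Influencer score: median of retweeter scores, repeated by retweet count
    # (an assumed reading of the weighted median).
    influencers = []
    for j in range(m):
        weighted = [users[i] for i in range(n) for _ in range(rows[i][j])]
        influencers.append(median(weighted) if weighted else float("nan"))
    return kept, users, influencers

# Toy matrix: two blocks of users, one bridging user, one user to be pruned.
A = [
    [3, 1, 0, 0],
    [2, 2, 0, 0],
    [1, 3, 0, 0],
    [0, 1, 1, 0],   # bridges the two blocks
    [0, 0, 2, 2],
    [0, 0, 1, 3],
    [5, 0, 0, 0],   # retweeted a single influencer: pruned
]
kept, users, influencers = latent_ideology(A)
```

In this toy example the first dimension separates the two blocks of users, with the bridging user placed between them. For realistic matrix sizes, a sparse SVD routine from a numerical library would normally replace the explicit loops.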

Hartigan’s diptest

Hartigan’s diptest is a nonparametric test to measure the multimodality of a distribution from a sample60. It calculates the maximum difference over all sample points between the unimodal distribution function that minimizes that maximum difference and the empirical distribution function. The test produces a statistic D, which quantifies the magnitude of multimodality, and a statistical significance P. If P < 0.01, we say that the ideology distribution shows statistically significant multimodality. Conversely, if P ≥ 0.01, we cannot reject the unimodality of the distribution.

The diptest calculates D from the full set of influencer and user ideology scores. To estimate errors for the diptest, we used a bootstrapping procedure. This involves selecting 70% of the users and influencers at random from the pre-computed ideology scores and recalculating the diptest from this sample. Repeating the sampling process 1,000 times gives a distribution of diptest scores from which diptest errors can be computed.
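The resampling procedure can be sketched as below. This is a minimal illustration with several assumptions that the text does not specify: the 70% sample is drawn without replacement, the interval is a simple percentile interval, and the function name is hypothetical. The statistic is passed in as a callable; the demo uses the population standard deviation purely as a placeholder, and it is not Hartigan's dip statistic, for which a dedicated implementation would be supplied instead.

```python
import random
from statistics import pstdev

def subsample_interval(scores, statistic, frac=0.7, n_rep=1000, level=0.95, seed=0):
    """Percentile interval of `statistic` over repeated random subsamples."""
    rng = random.Random(seed)
    k = max(1, round(frac * len(scores)))          # size of each 70% subsample
    reps = sorted(statistic(rng.sample(scores, k)) for _ in range(n_rep))
    tail = (1 - level) / 2
    lower = reps[int(tail * (n_rep - 1))]
    upper = reps[int((1 - tail) * (n_rep - 1))]
    return lower, upper

# Synthetic bimodal "ideology scores" (illustrative values, not study data).
gen = random.Random(1)
scores = [gen.gauss(-0.7, 0.1) for _ in range(300)] + \
         [gen.gauss(0.7, 0.1) for _ in range(200)]

point = pstdev(scores)                       # placeholder statistic only
lower, upper = subsample_interval(scores, pstdev)
```

Fixing the seed makes the interval reproducible; changing `statistic` is the only step needed to reuse the same resampling loop with a dip-statistic implementation.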

Selecting influencers

Applying the latent ideology to a set of influential accounts on Twitter does not guarantee that those accounts will arrange themselves in the latent space on the basis of political or climate ideology. In a number of cases, the dominant factor that determines the principal ideological axis is geography. Focusing exclusively on English-language Twitter reduces the effect of these geographic factors. However, some additional filtering is required to avoid the latent ideology partitioning accounts on the basis of geography.

Factors that may confound ideological scores include (1) language (for example, English versus non-English), (2) geography (for example, accounts focused on Indian politics) and (3) prominent topics outside the core discussion (for example, discussions in the blockchain community); see Supplementary Section 2B for the details. These factors are mitigated by selecting English-language tweets and by performing some minor filtering of the influencer set. For each COP, less than 3% of the accounts are removed from the set of influencers as part of the filtering process.

In Supplementary Section 1C, we discuss other influencer definitions and show that the observed increase in polarization during COP26, relative to previous COPs, is robust across a range of measures (Supplementary Figs. 3–7).

Topic extraction using BERT

BERTopic43 is a topic modelling tool that extracts latent topics from a collection of documents. The base algorithm uses pre-trained transformer-based language models to build document embeddings and produces topic representations by clustering embeddings and applying a class-based term frequency–inverse document frequency procedure61.

BERTopic is well suited to analysing Twitter data, where tweets naturally act as documents. Because it generates sentence vector representations that can preserve semantic structure, coherent and consistent themes can be derived from the text. In contrast, traditional topic modelling typically uses the bag-of-words approach to define topics on the basis of word frequency.
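To illustrate the class-based term frequency–inverse document frequency step mentioned above, the following Python sketch weights each term t in a cluster c as tf(t, c) × log(1 + A / f(t)), where tf is the term's share of the words in the cluster, f(t) is its frequency across all clusters and A is the average number of words per cluster. This form reflects our understanding of the general scheme and is an assumption; the embedding and clustering stages are omitted, and the toy clusters and function name are hypothetical.

```python
import math
from collections import Counter

def class_tf_idf(clusters):
    """clusters: {topic: [token lists]} -> {topic: {term: weight}}."""
    # Term counts per cluster: all documents in a cluster are pooled.
    tf = {topic: Counter(tok for doc in docs for tok in doc)
          for topic, docs in clusters.items()}
    total_freq = Counter()
    for counts in tf.values():
        total_freq.update(counts)                     # f(t) across all clusters
    avg_words = sum(total_freq.values()) / len(tf)    # A: mean words per cluster
    return {
        topic: {term: (n / sum(counts.values())) * math.log(1 + avg_words / total_freq[term])
                for term, n in counts.items()}
        for topic, counts in tf.items()
    }

# Toy clusters of tokenized tweets (illustrative only).
clusters = {
    "finance": [["climate", "finance", "fund"], ["finance", "pledge"]],
    "protest": [["climate", "youth", "strike"], ["youth", "march"]],
}
weights = class_tf_idf(clusters)
top_terms = {topic: max(w, key=w.get) for topic, w in weights.items()}
```

Terms that are frequent within one cluster but rare elsewhere receive the largest weights, so a word shared by every cluster (here "climate") is ranked below cluster-specific words.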

News media URL classification

To highlight the different news sources used by the ideological minority and majority, we exploited data retrieved from NewsGuard (https://www.NewsGuardtech.com/). NewsGuard is a tool that provides trust ratings for news and information websites. NewsGuard assesses the credibility and transparency of news and information websites on the basis of nine journalistic criteria. These criteria are individually assessed and then combined to produce a single “trust score” from 0 to 100 for a given news media outlet. The scores are assigned by a team of journalists, not algorithmically. Scores are not given to platforms (for example, Twitter and Facebook), individuals or satire content. More detail regarding the rating process is available at https://www.NewsGuardtech.com/ratings/rating-process-criteria/. To complement the news media trust scores, NewsGuard also provides a political leaning for news outlets (far left, slightly left, slightly right or far right), which allows us to gauge the ideological leanings of the news sources referenced in the COP Twitter discussion. Note that NewsGuard classifies a far larger set of news sources as slightly left or right than far left or right.

Using the database of news media trust scores, we cross-referenced the domains found in individual tweets with the corresponding trust score from the NewsGuard database. For COP21, we have 5.7 million tweets (including non-English tweets), of which 3.8 million contain a URL. Of these URLs, we were able to classify 730,000 using the NewsGuard dataset (19% of tweets with a URL). In contrast, for COP26 we have 10.2 million tweets (including non-English tweets), of which 2.8 million contain a URL (note that far fewer tweets contain URLs relative to COP21). Of these 2.8 million URLs, we were able to classify 560,000 (20% of tweets with a URL) using the NewsGuard dataset.
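The cross-referencing step can be illustrated with a short Python sketch. The trust-score table, the domains and the scores below are hypothetical placeholders (the NewsGuard data format is not described here), and the sketch assumes that URLs are already expanded; shortened links would need to be resolved first.

```python
from urllib.parse import urlparse

def domain_of(url):
    """Lower-cased host of a URL with any leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def match_trust_scores(urls, trust_scores):
    """Map each URL whose domain is rated to that domain's trust score."""
    return {u: trust_scores[domain_of(u)] for u in urls if domain_of(u) in trust_scores}

# Hypothetical trust-score table keyed by domain (scores from 0 to 100).
trust_scores = {"reliable-news.example": 95.0, "dubious-news.example": 20.0}
urls = [
    "https://www.reliable-news.example/cop26/finance",
    "http://dubious-news.example/story?id=7",
    "https://unrated-blog.example/post",       # domain without a rating
]
matched = match_trust_scores(urls, trust_scores)
coverage = len(matched) / len(urls)            # share of URLs that could be classified
```

The `coverage` value corresponds to the kind of share reported above, for example 730,000 classified URLs out of 3.8 million for COP21 (about 19%).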

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.