Introduction

News information is essential for people to stay informed about the events, characters, and communities of the outside world (Leban et al., 2014; McCombs and Reynolds, 2002). Unlike print and broadcast media, widespread Web connectivity has endowed online news with unprecedented geographic reach and spreading speed (Althaus and Tewksbury, 2000; Wu, 2007). Thus, online platforms such as digital news portals and social media websites have become a primary source of news for many people (Thurman, 2008). To alleviate the information overload brought about by the vast amount of news, online platforms display only a small set of selected news to their users (Das et al., 2007). Instead of relying on human editors to choose news manually, many online platforms employ artificial intelligence (AI) techniques (LeCun et al., 2015) to select news in a personalized way that accommodates individual information needs (Okura et al., 2017), and these techniques have achieved notable success in improving the efficiency of users' information acquisition (Moller, 2022; Vermeulen, 2022).

Unfortunately, machine-aided news delivery is not as credible as we might expect. It can be intentionally manipulated by humans to alter certain aspects of the delivered news, such as sentiment and opinions, as Facebook's "emotional contagion" experiment (Kramer et al., 2014) did. That study caused an uproar among academia and the public about the risks of potentially unethical uses of AI techniques in human-centered applications (Davies, 2016; Del Vicario et al., 2016; Hallinan et al., 2020; Larson, 2018; Ruxton and Mulder, 2019). More recently, Facebook has been accused of using algorithms that amplify hateful or harmful content in the news feed to optimize its profit ("60 Minutes" interview, Facebook whistleblower Frances Haugen; Hemphill and Banerjee, 2021). Beyond financial incentives, intentional manipulation of displayed news sentiment with political motives has shown great power in swaying the outcomes of political events such as elections (Bovet and Makse, 2019; Gu et al., 2017; Ratkiewicz et al., 2011). Thus, deliberate or malicious manipulation of news sentiment can pose considerable threats to individuals, society, and democracies (Gallotti et al., 2020; Kucharski, 2016; Mihaylov et al., 2018, 2015; Shao et al., 2018).

Although human-driven manipulation of news sentiment has been recognized and may be prohibited by law in the future (Beridze and Butcher, 2019), personalized news recommendation AI can itself manipulate news sentiment without human interference because of algorithmic bias (Gibney, 2020; Zou and Schiebinger, 2018), as shown in Fig. 1. This is mainly because AI models learned on massive user data can inherit and even amplify the biases encoded in human behaviors (Courtland, 2018). As the proverb goes, "for evil news rides post, while good news baits" (John Milton), users prefer to interact with negative news articles rather than positive ones (Hornik et al., 2015; Naveed et al., 2011). AI recommender systems may capture this pattern and form their own sentiment prejudices in news selection, which leads to sentiment manipulation of the recommended news. Because the recommender operates as a human-in-the-loop system, the sentiment bias is further magnified during the iterative interactions between users and news feed providers, which may generate unforeseeable negative psychological and societal impacts (Han et al., 2019; Johnston and Davey, 1997).

Fig. 1: The amplification of sentiment bias in the loop of human-AI interactions.

The AI news recommender selects a few news articles from the full set of recent news according to users' personal interests inferred from their profiles. Users interact with the news displayed to them, and their behaviors such as clicks are used to update the user profiles in the database. In this loop, since users have a biased preference for negative news sentiment, the recommendation AI learned on user data can inherit and amplify the sentiment bias, which leads to AI's manipulation of the sentiment of the selected news. Users' further biased behaviors strengthen the sentiment bias, and such a highly biased sentiment orientation, evoked across a large number of users, can generate broader social impacts and influence the overall sentiment of future news. This amplification loop can leave AI heavily controlling the sentiment of the news displayed to users.

In fact, researchers are aware of the significant impact of sentiment information on personalized recommender systems. Many methods explore how to incorporate sentiment information from user-generated content, e.g., reviews (Yang et al., 2013) and social media posts (Khattak et al., 2020; Kumar et al., 2020; Sun et al., 2018), into recommendation algorithms, which can bolster the model's ability to capture item properties (Huang et al., 2020) and user preferences (Gurini et al., 2013). Some recent studies even successfully use sentiment signals to enhance recommendation diversity in the sentiment dimension (Wu et al., 2020a). However, the sentiment signal in recommender systems is a mixed blessing, since it may introduce unwanted biases into the recommendation results. Unfortunately, the effects of sentiment bias in recommender systems are rarely studied. Only a few works study the influence of review sentiment on recommendation accuracy (He et al., 2022; Lin et al., 2021), which is only the tip of the iceberg: the broader societal impacts of sentiment bias remain largely unexamined.

In this study, we reveal the sentiment manipulation phenomenon of AI in personalized news delivery. Through extensive experiments on a large-scale real-world news recommendation dataset (Wu et al., 2020b) with one million users, we discover that users' biased preferences for negative news sentiment can be captured by various state-of-the-art AI models when optimizing recommendation accuracy. These models further reinforce the sentiment bias by increasing the exposure of negative news in the recommendation results, which may pose potential risks to the public. Since such unwanted news sentiment manipulation mainly stems from the algorithm's sentiment bias learned from user data, we propose a sentiment-debiasing method based on a decomposed adversarial learning framework (Wu et al., 2021) to remove AI's sentiment manipulation. Our approach aims to build a debiased, sentiment-agnostic model from the biased data to achieve fair news selection across different sentiments. Experimental results show that our method can remove the vast majority of the sentiment bias introduced by the AI model, mitigating its sentiment manipulation at only a minor performance loss. The results also reveal that our approach can further improve the sentiment diversity of news distribution. The insights provided by our study can help the public become aware of the potential risks of AI-empowered news personalization techniques, and inspire researchers to improve the responsibility of AI involved in Internet journalism and other channels of information spread for the well-being of humans.

Methods

Problem formulation

Given a target user u, we denote his/her historically clicked news as [D1, D2, …, DN], where N is the history length. Given a candidate news article Dc, the goal of the recommendation model is to predict a click score \(\hat{y}\) that indicates the (non-normalized) probability of the user u clicking Dc. A set of candidate news is ranked according to the corresponding click scores, and the top news with the highest click scores are displayed to the user u. In addition, we denote the sentiment polarity categories of clicked news and candidate news as [s1, s2, …, sN] and sc, respectively. The goal of our method is to rank clicked candidate news at high positions while keeping the overall sentiment orientation of the top recommendation results consistent with the average sentiment of the news corpus.
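For concreteness, a minimal Python sketch of this ranking protocol follows; the score_fn interface and the names used are illustrative, not from our implementation:

```python
from typing import Callable, List

def recommend(score_fn: Callable[[List[str], str], float],
              clicked_news: List[str],
              candidates: List[str],
              top_k: int = 10) -> List[str]:
    """Rank candidate news by predicted click score and return the top-k.

    score_fn maps (user history, candidate news) to a click score y_hat.
    """
    ranked = sorted(candidates,
                    key=lambda d: score_fn(clicked_news, d),
                    reverse=True)
    return ranked[:top_k]
```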

Framework

Next, we introduce the details of our proposed sentiment-debiasing framework that can remove the model’s sentiment manipulation (Fig. 2). The core of this framework is a decomposed news model that aims to learn sentiment-aware and sentiment-independent news information, and a decomposed user model that captures sentiment-related user interests and sentiment-independent user interests. Their details are described as follows.

Fig. 2: The framework of our sentiment-debiasing approach.

It removes the sentiment manipulation of personalized news recommendations by learning sentiment-agnostic news and user representations via decomposed adversarial learning.

As shown in the left box in Fig. 2, the decomposed news model takes the news texts and news sentiment as input. Here the news sentiment is inferred from the news texts. We use VADER (Hutto and Gilbert, 2014) to compute a real-valued sentiment score for each news article, and then quantize this score into a discrete sentiment category s as the input. The news texts are processed by a text model that learns a hidden embedding representing the semantic information of the news. Following the text modeling approach in NRMS (Wu et al., 2019c), we first convert the words in the news texts into a sequence of word embeddings through a word embedding lookup table, then use a multi-head self-attention network (Vaswani et al., 2017) to learn hidden word representations by capturing the interactions among words, and finally use an attention pooling network to summarize the hidden word representations into a unified news text representation, denoted as ht. The sentiment category is converted into a latent embedding hs.
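For illustration, a minimal PyTorch sketch of this text model follows; the vocabulary size, embedding dimension, and number of attention heads are illustrative assumptions rather than our exact settings:

```python
import torch
import torch.nn as nn

class AttnPool(nn.Module):
    """Attention pooling: summarize a sequence of vectors into one vector."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, 1)

    def forward(self, x):                                 # x: (batch, seq_len, dim)
        weights = torch.softmax(self.query(x), dim=1)     # (batch, seq_len, 1)
        return (weights * x).sum(dim=1)                   # (batch, dim)

class NewsTextModel(nn.Module):
    """Word embedding -> multi-head self-attention -> attention pooling."""
    def __init__(self, vocab_size=50000, dim=256, heads=8, n_sentiments=5):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pool = AttnPool(dim)
        self.sent_emb = nn.Embedding(n_sentiments, dim)   # sentiment category -> h_s

    def forward(self, word_ids, sent_cat):
        x = self.word_emb(word_ids)                       # (batch, seq_len, dim)
        x, _ = self.self_attn(x, x, x)                    # contextual word representations
        h_t = self.pool(x)                                # news text embedding h_t
        h_s = self.sent_emb(sent_cat)                     # sentiment embedding h_s
        return h_t, h_s
```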

Since the text representation ht learned from news texts may still contain sentiment information, we apply an additional orthogonal regularization to the text embedding ht and the sentiment embedding hs to encourage them to be orthogonal. The regularization loss function \({{{{\mathscr{L}}}}}_{{\rm {R}}}\) is formulated as follows:

$${{{{\mathscr{L}}}}}_{{\rm {R}}}=\frac{| {{{{\bf{h}}}}}_{{\rm {t}}}\cdot {{{{\bf{h}}}}}_{{\rm {s}}}| }{| | {{{{\bf{h}}}}}_{{\rm {t}}}| | \cdot | | {{{{\bf{h}}}}}_{{\rm {s}}}| | },$$
(1)

where \(\|\cdot \|\) denotes the L2 norm. By optimizing this regularization loss, the text embedding usually contains less sentiment information. However, this loss usually cannot be perfectly optimized, and the sentiment embedding may also drift from the real sentiment space, so the text embedding may still encode some sentiment information. To further reduce the sentiment information it contains, we apply adversarial learning to purify it. Specifically, a sentiment discriminator is used to predict the sentiment category s from the text embedding ht. The soft sentiment category label \(\hat{{{{\bf{s}}}}}\) is predicted as follows:

$$\hat{{{{\bf{s}}}}}={{{\rm{softmax}}}}({{{\bf{W}}}}{{{{\bf{h}}}}}_{{\rm {t}}}+{{{\bf{b}}}}),$$
(2)

where W and b are linear projection parameters. The loss function \({{{{\mathscr{L}}}}}_{{\rm {D}}}\) for learning the sentiment discriminator is as follows:

$${{{{\mathscr{L}}}}}_{{\rm {D}}}=-\mathop{\sum }\limits_{i=1}^{C}{{{{\bf{s}}}}}_{i}\log ({\hat{{{{\bf{s}}}}}}_{i}),$$
(3)

where C is the number of sentiment categories, and si and \({\hat{{{{\bf{s}}}}}}_{i}\) are the real and predicted labels for the ith class. The negative gradients inferred by the sentiment discriminator are used to train the text model adversarially, encouraging it to remove sentiment information. When the discriminator and the text model reach a Nash equilibrium, most of the sentiment information encoded in the text embedding ht is effectively removed. Thus, ht can be regarded as a sentiment-agnostic news embedding. We apply the decomposed news model to the user's clicked news and candidate news to learn their sentiment-agnostic embeddings and sentiment embeddings. We denote the sentiment-agnostic embeddings of clicked news and candidate news as [ht,1, ht,2, …, ht,N] and ht,c, respectively. Their sentiment embeddings are denoted as [hs,1, hs,2, …, hs,N] and hs,c, respectively.
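For illustration, a minimal PyTorch sketch of the orthogonal regularization (Eq. (1)) and the sentiment discriminator (Eqs. (2) and (3)) follows; the layer sizes and number of sentiment categories are assumptions, and the adversarial sign flip is applied later in the unified loss (Eq. (6)) rather than inside the discriminator:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def orth_reg(h_t, h_s, eps=1e-8):
    """Eq. (1): |h_t . h_s| / (||h_t|| ||h_s||), averaged over the batch."""
    cos = (h_t * h_s).sum(dim=-1).abs() / (
        h_t.norm(dim=-1) * h_s.norm(dim=-1) + eps)
    return cos.mean()

class SentimentDiscriminator(nn.Module):
    """Eq. (2): predict the sentiment category from the text embedding h_t."""
    def __init__(self, dim=256, n_sentiments=5):
        super().__init__()
        self.proj = nn.Linear(dim, n_sentiments)   # parameters W and b

    def forward(self, h_t):
        return self.proj(h_t)                      # logits of the soft label s_hat

def disc_loss(logits, sent_cat):
    """Eq. (3): cross-entropy between real and predicted sentiment labels."""
    return F.cross_entropy(logits, sent_cat)
```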

The decomposed user model takes the sentiment-agnostic and sentiment embeddings of clicked news as input. It contains a sentiment-agnostic user model that learns a debiased user embedding ud from the sentiment-agnostic news embeddings and a sentiment-based user model that learns a bias-aware user embedding ub (right box in Fig. 2). The debiased user embedding mainly captures sentiment-independent user interest, while the bias-aware user embedding encodes sentiment biases. Following NRMS (Wu et al., 2019c), we use two independent multi-head self-attention networks with attention pooling modules to capture the relatedness between different news and learn unified user embeddings. Although sentiment-aware and sentiment-independent information is largely decomposed in the news model, the user model may re-encode sentiment information into the user embeddings. Thus, we apply an additional orthogonal regularization loss \({{{{\mathscr{L}}}}}_{{\rm {R}}}^{{\prime} }\) to the user embeddings learned by the two user models, which is formulated as follows:

$${{{{\mathscr{L}}}}}_{{\rm {R}}}^{{\prime} }=\frac{| {{{{\bf{u}}}}}_{{\rm {d}}}\cdot {{{{\bf{u}}}}}_{{\rm {b}}}| }{| | {{{{\bf{u}}}}}_{{\rm {d}}}| | \cdot | | {{{{\bf{u}}}}}_{{\rm {b}}}| | }.$$
(4)

By optimizing this loss, the user interest information can also be effectively decomposed into sentiment-aware and sentiment-independent components.
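A corresponding sketch of the decomposed user model is given below, reusing the AttnPool module and orth_reg function from the sketches above; the dimensions are again illustrative assumptions:

```python
import torch.nn as nn
# Reuses AttnPool and orth_reg from the sketches above.

class DecomposedUserModel(nn.Module):
    """Two independent self-attention encoders over clicked-news embeddings:
    one over sentiment-agnostic embeddings (-> debiased u_d) and one over
    sentiment embeddings (-> bias-aware u_b), regularized by Eq. (4)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn_d = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.pool_d = AttnPool(dim)
        self.pool_b = AttnPool(dim)

    def forward(self, h_t_clicked, h_s_clicked):   # each: (batch, N, dim)
        x_d, _ = self.attn_d(h_t_clicked, h_t_clicked, h_t_clicked)
        x_b, _ = self.attn_b(h_s_clicked, h_s_clicked, h_s_clicked)
        u_d, u_b = self.pool_d(x_d), self.pool_b(x_b)
        user_orth = orth_reg(u_d, u_b)             # Eq. (4)
        return u_d, u_b, user_orth
```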

After learning the decomposed news and user embeddings, we compute two ranking scores based on them. One is a debiased ranking score (denoted as \({\hat{y}}_{{\rm {d}}}\)), which measures the relevance between the debiased user embedding and the sentiment-agnostic candidate news embedding via the inner product (i.e., \({\hat{y}}_{{\rm {d}}}={{{{\bf{u}}}}}_{{\rm {d}}}\cdot {{{{\bf{h}}}}}_{{\rm {t,c}}}\)). This score reflects how well the candidate news content matches the debiased user interest. The other is a bias-aware ranking score (denoted as \({\hat{y}}_{{\rm {b}}}\)), computed as the relevance between the bias-aware user embedding and the sentiment embedding of the candidate news using their inner product (i.e., \({\hat{y}}_{{\rm {b}}}={{{{\bf{u}}}}}_{{\rm {b}}}\cdot {{{{\bf{h}}}}}_{{\rm {s,c}}}\)). This score reflects the impact of sentiment bias on users' click behaviors. To capture the sentiment bias patterns in the training data, the two scores are summed into a unified score \(\hat{y}\) for model training. Following many prior studies (Wu et al., 2019b, c), we use negative sampling to construct representative training samples. More specifically, for each clicked news \({D}_{{\rm {c}}}^{+}\) (regarded as a positive sample), we sample T non-clicked news \([{D}_{{\rm {c}},1}^{-},{D}_{{\rm {c}},2}^{-},...,{D}_{{\rm {c}},T}^{-}]\) (regarded as negative samples) and jointly predict their click scores (the choice of T is discussed in Supplementary Fig. 6). The loss function \({{{{\mathscr{L}}}}}_{{\rm {P}}}\) for learning the recommendation model is formulated as follows:

$${{{{\mathscr{L}}}}}_{{\rm {P}}}=-\log \left(\frac{\exp ({\hat{y}}^{+})}{\exp ({\hat{y}}^{+})+\mathop{\sum }\nolimits_{i = 1}^{T}\exp ({\hat{y}}_{i}^{-})}\right),$$
(5)

where \({\hat{y}}^{+}\) and \({\hat{y}}_{i}^{-}\) stand for the click scores of the positive sample and its associated ith negative sample. In the test stage, only the debiased ranking score \({\hat{y}}_{{\rm {d}}}\) is used for ranking. In this way, the influence of sentiment bias is removed from the recommendation results. To learn the entire model, the unified loss function \({{{\mathscr{L}}}}\) on each training sample \(({D}_{{\rm {c}}}^{+},{D}_{{\rm {c}},1}^{-},{D}_{{\rm {c}},2}^{-},...,{D}_{{\rm {c}},T}^{-})\) is formulated as follows:

$${{{\mathscr{L}}}}={{{{\mathscr{L}}}}}_{{\rm {P}}}-\frac{\alpha }{N+T+1}\mathop{\sum}\limits_{d\in {{{\mathscr{D}}}}}{{{{\mathscr{L}}}}}_{{\rm {D}}}^{d}+\beta ({{{{\mathscr{L}}}}}_{{\rm {R}}}^{{\prime} }+\frac{1}{N+T+1}\mathop{\sum}\limits_{d\in {{{\mathscr{D}}}}}{{{{\mathscr{L}}}}}_{{\rm {R}}}^{d}),$$
(6)

where \({{{\mathscr{D}}}}\) means the union of historical clicked news, positive sample and negative samples, \({{{{\mathscr{L}}}}}_{{\rm {D}}}^{d}\) and \({{{{\mathscr{L}}}}}_{{\rm {R}}}^{d}\) represent the adversarial loss and regularization loss on the news d, and α and β are two coefficients that control the intensity of the adversarial loss and the orthogonal regularization loss, respectively (the selection of these coefficients is shown in Supplementary Fig. 5). The loss function for training the discriminator is \(\frac{1}{N+T+1}{\sum }_{d\in {{{\mathscr{D}}}}}{{{{\mathscr{L}}}}}_{D}^{d}\). By training the discriminator and the recommendation model towards convergence, our model can be effectively debiased to get rid of the sentiment manipulation issue. Since the recommendation model and the sentiment discriminator are two adversaries, they cannot be optimized simultaneously. Thus, we adopt a batch-wise training method to learn them in turn on each batch of training samples, as shown in Algorithm 1. In this way, the two adversaries can be jointly trained on the same data.
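For illustration, a PyTorch sketch of the training objective (Eqs. (5) and (6)) follows; the tensor shapes and helper names are assumptions:

```python
import torch
import torch.nn.functional as F

def rec_loss(y_pos, y_neg):
    """Eq. (5): softmax loss over one positive and T negative click scores.

    y_pos: (batch,) scores of clicked news; y_neg: (batch, T) scores of
    the sampled non-clicked news.
    """
    logits = torch.cat([y_pos.unsqueeze(1), y_neg], dim=1)   # (batch, 1 + T)
    targets = torch.zeros(logits.size(0), dtype=torch.long,
                          device=logits.device)              # positive at index 0
    return F.cross_entropy(logits, targets)

def unified_loss(loss_p, disc_losses, orth_user, orth_news_losses, alpha, beta):
    """Eq. (6): L = L_P - alpha * mean(L_D) + beta * (L_R' + mean(L_R)),
    where the means run over the N + T + 1 news articles in D."""
    return (loss_p
            - alpha * torch.stack(disc_losses).mean()
            + beta * (orth_user + torch.stack(orth_news_losses).mean()))
```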

Algorithm 1

Training algorithm of our approach

1: Initialize the recommendation model parameter set Θm and the sentiment discriminator parameter set Θd

2: repeat

3: Randomly select a batch of samples s from the entire training set \({{{\mathscr{S}}}}\)

4: Freeze the recommendation model parameter set Θm

5: Compute \({{{{\mathscr{L}}}}}_{{{D}}}\) on s

6: Optimize Θd based on \({{{{\mathscr{L}}}}}_{{{D}}}\)

7: Freeze the sentiment discriminator parameter set Θd

8: Compute \({{{\mathscr{L}}}}\) on s

9: Optimize Θm based on \({{{\mathscr{L}}}}\)

10: until model convergence
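A minimal PyTorch sketch of this batch-wise alternating optimization follows, reusing disc_loss and rec_loss from the sketches above; the optimizer choice and the model's forward interface (e.g., the news_embeddings helper) are illustrative assumptions:

```python
import torch

def train(rec_model, discriminator, loader, alpha, beta, epochs=3):
    """Batch-wise alternating optimization of the two adversaries (Algorithm 1)."""
    opt_m = torch.optim.Adam(rec_model.parameters())       # Theta_m
    opt_d = torch.optim.Adam(discriminator.parameters())   # Theta_d
    for _ in range(epochs):                                # until convergence
        for batch in loader:
            # Lines 4-6: freeze the model, update the discriminator on L_D.
            h_t = rec_model.news_embeddings(batch).detach()  # assumed helper
            loss_d = disc_loss(discriminator(h_t), batch["sent_cat"])
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()

            # Lines 7-9: update the model on the unified loss L; only opt_m
            # steps here, so the discriminator parameters stay frozen.
            out = rec_model(batch)                           # assumed forward
            adv = disc_loss(discriminator(out["h_t"]), batch["sent_cat"])
            loss = (rec_loss(out["y_pos"], out["y_neg"])
                    - alpha * adv
                    + beta * (out["orth_user"] + out["orth_news"]))
            opt_m.zero_grad(); loss.backward(); opt_m.step()
```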

Results

AI’s manipulation of news delivery sentiment

We perform analyses and experiments on a public large-scale news recommendation dataset named MIND (Wu et al., 2020b), which is constructed from the real interaction logs of 1 million users collected on the Microsoft News platform over 6 weeks from October 12 to November 22, 2019. The sentiment of each news article is indicated by a real value from −1 to 1 (see the “Methods” section). We classify news sentiment into five categories according to polarity and intensity. From the sentiment distribution of news in the corpus (Fig. 3, left), we observe that most news has neutral sentiment, and the overall sentiment orientation of the full news set is nearly neutral (the average sentiment score is −0.0174). However, the click probabilities of news with different sentiments differ significantly (Fig. 3, middle; p < 0.001 among different sentiment categories). This verifies users' biased news reading behavior, i.e., more negative news is more likely to attract clicks. In fact, many news categories with strong negative sentiment (Supplementary Table 2) involve common topics, such as health, crime, and disaster, which can be consumed by a broader audience than topics appealing to specific interests (e.g., soccer and basketball).
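As a concrete illustration, real-valued sentiment scores can be quantized into five polarity/intensity categories as sketched below; VADER's compound score lies in [−1, 1], but the exact cut points shown are assumptions rather than the thresholds used in our experiments:

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def sentiment_category(text: str) -> int:
    """Map a news text to one of five polarity/intensity bins (0..4)."""
    score = analyzer.polarity_scores(text)["compound"]  # real value in [-1, 1]
    bins = [-0.6, -0.2, 0.2, 0.6]                       # illustrative cut points
    return sum(score > b for b in bins)                 # 0: very negative ... 4: very positive
```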

Fig. 3: Sentiment bias and AI’s sentiment manipulation.

Left: the sentiment distribution of news in the dataset. We categorize the sentiment polarities according to the real-valued sentiment scores. Most news has very weak or neutral sentiment, while the rest has positive or negative sentiment orientation with stronger intensity. The overall sentiment of the news corpus is nearly neutral (the average sentiment score is −0.0174). Middle: the click probability of news with different sentiment polarities. News articles with more negative sentiment are more likely to be clicked by users, which is the core source of sentiment bias. The differences in click probabilities among the sentiment polarity categories are significant (p < 0.001 according to two-sided t-tests). Right: average sentiment scores of the full news set, the news displayed to users in the training data, users' clicked news in the training data, and the top 50 recommendation results given by a state-of-the-art news recommendation model, NRMS (Wu et al., 2019c). The negative sentiment is amplified in a cascaded way due to users' biased news reading choices and the algorithmic biases AI learns from user data. This provides evidence of AI's news sentiment manipulation in the loop of human–machine interactions on news delivery platforms.

To investigate AI’s sentiment manipulation phenomenon, we compare the average sentiment of the full news set, the news displayed to users in this dataset, users’ clicked news, and the top news recommended by a state-of-the-art (SOTA) AI-based news recommendation approach (Wu et al., 2019c) (Fig. 3 right). The results indicate that the displayed news articles amplify the negative sentiment by 124% compared with the full news set, which is mainly due to the sentiment bias of the original recommender system for generating this dataset. The negative sentiment orientation is strengthened by users’ click behaviors (+117%) because of the biased user preferences for negative news sentiment. The SOTA news recommendation AI learned on such click data further magnifies the negative sentiment 1.76 times in its top recommendation results. The cascaded amplification of negative sentiment reveals the worrying increase of sentiment bias in the loop of human–machine interactions, where news sentiment may be heavily manipulated by AI after multiple rounds of biased data accumulation and biased AI model learning.

Results of sentiment-debiasing

To verify the effectiveness of our proposed sentiment-debiasing method in removing AI's sentiment manipulation, we compare it with several SOTA AI-empowered news recommendation methods (An et al., 2019; Liu et al., 2020; Okura et al., 2017; Wang et al., 2018; Wu et al., 2019a, c) in terms of sentiment bias and recommendation accuracy. The recommendation accuracy is measured by the area under the ROC curve (AUC) and the normalized discounted cumulative gain of the top 10 recommended news (nDCG@10) (Wu et al., 2020b), which are formulated as follows:

$${\rm {AUC}}=\frac{{\sum }_{p\in {{{\mathscr{P}}}}}{\sum }_{n\in {{{\mathscr{N}}}}}I[P(p) \,>\, P(n)]}{| {{{\mathscr{P}}}}| | {{{\mathscr{N}}}}| },$$
(7)
$${{{\rm{nDCG@K}}}}=\frac{\mathop{\sum }\nolimits_{i = 1}^{K}({2}^{{r}_{i}}-1)/{\log }_{2}(1+i)}{\mathop{\sum }\nolimits_{i = 1}^{{N}_{p}}1/{\log }_{2}(1+i)},$$
(8)

where P(·) is the predicted click score of a sample, \({{{\mathscr{P}}}}\) and \({{{\mathscr{N}}}}\) denote the positive and negative sample sets, respectively, and I[·] is an event indicator function. The symbol Np represents the number of positive samples, and ri is the relevance score of the news at the ith rank, which is 1 for clicked news and 0 for non-clicked news. Note that nDCG@10 is the instance of nDCG@K computed on the top 10 recommendation results. Since the MIND dataset provides real impression logs, we use the candidate news in each impression to compute the recommendation accuracy metrics. The sentiment bias can be reflected by the average sentiment of the top-K recommended news. Since the original impression data in the dataset already contain some sentiment bias, they cannot be used to evaluate the degree of sentiment bias removal. Instead, we use the entire news set as the candidate set to be ranked and use the average sentiment of the top-K ranked news as the sentiment bias measurement. In our experiments, we repeat each experiment 5 times, and the average performance with 0.95 confidence intervals (if applicable) is reported. The ideal minimal bias is benchmarked by the average sentiment of randomly ranked news (i.e., the average sentiment of the full news set), and the absolute difference between this benchmark and the average sentiment of the top recommendation results generated by AI algorithms is used as the metric for quantitatively evaluating AI's sentiment bias, where a smaller sentiment bias indicates lighter sentiment manipulation.
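A sketch of these metrics for a single impression follows, assuming binary relevance labels stored in NumPy arrays; the sentiment_bias helper name is ours, introduced for illustration:

```python
import numpy as np

def auc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Eq. (7): fraction of (positive, negative) pairs ranked correctly."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    return float((pos[:, None] > neg[None, :]).mean())

def ndcg_at_k(scores: np.ndarray, labels: np.ndarray, k: int = 10) -> float:
    """Eq. (8): DCG of the top-k ranking divided by the ideal DCG."""
    k = min(k, len(scores))
    order = np.argsort(-scores)
    dcg = ((2 ** labels[order][:k] - 1) / np.log2(np.arange(2, k + 2))).sum()
    n_pos = int(labels.sum())
    ideal = (1.0 / np.log2(np.arange(2, n_pos + 2))).sum()
    return float(dcg / ideal)

def sentiment_bias(scores, sentiments, corpus_mean, k=50):
    """Absolute gap between the average sentiment of the top-k ranked news
    and the unbiased benchmark (the corpus average sentiment)."""
    top = np.argsort(-scores)[:k]
    return abs(float(sentiments[top].mean()) - corpus_mean)
```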

The sentiment bias comparison (Fig. 4, left) shows that all compared SOTA baseline methods introduce heavy sentiment bias, which provides consistent evidence of AI's sentiment manipulation through the amplification of negative content in news delivery. The average sentiment of our approach is very close to that of random ranking, indicating that most of the sentiment bias is eliminated. Specifically, the sentiment bias in the top 50 recommended news is reduced by 97.3% compared with its base model NRMS (Wu et al., 2019c) and by 96.7% compared with the least biased baseline, DKN (Wang et al., 2018). The recommendation accuracy results (Fig. 4, right) show that our approach achieves performance comparable to other SOTA methods, sacrificing only 2.9% absolute AUC and 2.5% nDCG@10 relative to the best-performing NRMS model. These results verify the effectiveness of our methodology in reducing sentiment bias without heavy performance loss.

Fig. 4: The sentiment bias and recommendation performance of different methods.

Left: the average sentiment scores of the top K news recommended by different methods. The “Random” dashed line (black) represents recommending news randomly, and the expectation of its average sentiment is the average sentiment of the full news set. We use this score as an unbiased benchmark, and the distance to it is regarded as sentiment bias. The average sentiments of all compared SOTA deep learning-based news recommendation methods: EBNR (Okura et al., 2017), DKN (Wang et al., 2018), NAML (Wu et al., 2019a), LSTUR (An et al., 2019), NRMS (Wu et al., 2019c), and KRED (Liu et al., 2020), are much more negative than the unbiased benchmark, which indicates their sentiment manipulation phenomenon. The average sentiment of news recommended by our approach (red line) is very close to the benchmark, especially for the top 50 news articles that are preferentially displayed to users. This shows that our approach effectively mitigates AI models' sentiment manipulation. Right: the recommendation accuracy is evaluated by the AUC and nDCG@10 scores of news ranking. The results show that our approach achieves results comparable to other SOTA methods (the maximal performance drop is 2.9% AUC and 2.5% nDCG@10). The error bars represent mean scores with 0.95 confidence intervals (n = 5 independent experiments).

To further understand the impact of sentiment debiasing on the recommendation results, we compare our approach with its base model NRMS (Wu et al., 2019c) in terms of the sentiment distributions of their recommended news as well as the sentiment correlations between recommended news and users' historically clicked news (Fig. 5). We find that in the debiased recommendation results the proportion of negative news is reduced while that of positive news is increased (Fig. 5, upper left). In addition, the overall sentiment intensity is slightly decreased (from 0.3311 to 0.3286, t-test p < 0.01), which means that our debiased model tends to recommend less emotional content (Fig. 5, upper middle). We also observe a large difference in the standard deviation of sentiment (t-test p < 0.001) between the original and debiased models (Fig. 5, upper right). This shows that our debiased approach tends to recommend news with various sentiments, which can promote the sentiment diversity (Wu et al., 2020a) of the news distributed to individuals. From the lower panels of Fig. 5, we find that the sentiment of news recommended by the original biased model is significantly correlated with the average sentiment of users' clicked news (Pearson r = 0.5109, p < 0.001), whereas no such significant correlation exists in the debiased recommendation results (Pearson r = −0.0030, p = 0.7569). These results reveal that biased AI models may tend to provide users with content of homogeneous sentiment, which may strengthen the polarization of social opinions. Our approach has a greater ability to recommend news with diverse sentiments, which can help mitigate the filter-bubble problem (Bergstrom and Bak-Coleman, 2019) and better satisfy users' diverse needs for news information (see Supplementary Fig. 4 for an example).
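This correlation analysis can be reproduced with a standard Pearson test, as in the sketch below; the per-user average arrays are random placeholders standing in for values computed from the logs:

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder per-user averages; in practice these come from the interaction logs.
avg_clicked = np.random.uniform(-1, 1, size=1000)      # avg sentiment of clicked news
avg_recommended = np.random.uniform(-1, 1, size=1000)  # avg sentiment of recommended news

r, p = pearsonr(avg_clicked, avg_recommended)
print(f"Pearson r = {r:.4f}, p = {p:.4g}")
```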

Fig. 5: Impact of sentiment debiasing on the sentiment of recommended news.

Upper: the distributions of sentiment orientation, sentiment intensity, and sentiment variance of the biased and debiased recommendation results. The left plot shows that negative news is demoted in the debiased recommendation results while positive and neutral news articles are promoted. The middle plot shows that the sentiment intensity of the debiased recommendations is slightly weaker than that of the biased ones (p < 0.01). The right plot shows that the sentiment standard deviation of the debiased recommendations is much larger than that of the biased ones, indicating that our sentiment-debiasing method improves sentiment diversity. Lower: the correlations between the average sentiment of clicked news and of recommended news given by the biased and debiased models. Darker colors indicate higher probability densities. The left plot shows that the sentiments of news recommended by the biased model have a significant correlation with historically clicked news (r = 0.5109, p < 0.001). The right plot indicates that in the debiased recommendation results such a correlation is not significant (r = −0.0030, p = 0.7569).

Recommendation topic analysis

We then analyze the high-frequency topic categories in the original news set and the recommendation results (Fig. 6; the topic categories are sorted in descending order of frequency). The “newscrime” category has a strong negative sentiment orientation, but its rank is promoted in the recommendation results without debiasing, an indication of the amplification of negative sentiment. Although crime news can effectively attract users' attention, displaying it excessively may be inappropriate because of its potential societal impacts (Mastro et al., 2009). By contrast, in the debiased recommendation results generated by our approach, the “newscrime” category is demoted. In addition, topics with relatively strong positive sentiment, such as “recipes” and “lifestyleroyals”, gain more exposure. These results further support the effectiveness of our sentiment-debiasing approach in reducing the sentiment bias associated with the amplification of negative sentiment.

Fig. 6: Sentiment analysis of news topics.

The top-frequency fine-grained news topic categories with their average sentiment orientations in the original news set, recommendations without debiasing, and debiased recommendations. The topic categories are sorted by their frequencies in descending order (from left to right). The results show that some topics with strong negative sentiment orientation are promoted by the biased recommender, while our debiased model demotes some negative news topics such as “newscrime” and promotes news topics with positive sentiment such as “recipes”.

Model component analysis

Next, we verify the effectiveness of the decomposed adversarial learning framework in our approach (see the “Methods” section for more details). We use a leave-one-out scheme to evaluate the contributions of the core techniques in our approach, including the adversarial learning mechanism, the orthogonal regularization, and the decomposition framework. From the results on recommendation accuracy and sentiment bias (Fig. 7), we observe that the adversarial learning mechanism plays the most important role in reducing sentiment bias, though at some cost in recommendation accuracy. The orthogonal regularization improves accuracy and reduces sentiment bias at the same time. This is because it encourages the model to disentangle sentiment-aware and sentiment-independent information, which aids the elimination of sentiment bias. The decomposition framework is of great importance, especially for maintaining recommendation accuracy. Since removing sentiment bias and optimizing user clicks can be contradictory objectives, the canonical adversarial training method (Zhang et al., 2018) without information decomposition finds it difficult to balance debiasing and performance. These experimental results corroborate the effectiveness of our methodology in alleviating AI's sentiment manipulation without heavy performance decreases.

Fig. 7: Effectiveness of the core techniques used in our approach.

The contribution of each module is evaluated by the changes in sentiment bias and recommendation performance when removing it from our approach. Left: the sentiment bias is indicated by the average sentiment of the top 50 and top 500 news. The dashed line represents the unbiased benchmark. Right: the recommendation performance is indicated by the AUC and nDCG@10 scores. The adversarial learning mechanism contributes most to the removal of sentiment bias. The orthogonal regularization technique improves model performance and decreases sentiment bias. The decomposition framework has a major contribution to the model performance. Differences between different bars are significant (p < 0.01 in the left figure and p < 0.001 in the right figure according to the two-sided t-test). Error bars stand for mean scores with 0.95 confidence intervals (n = 5 independent experiments).

Discussion

With the explosion of online information, people's daily lives depend heavily on personalized services to alleviate information overload (Littman, 2015). Among them, personalized news delivery is a special one that can generate huge impacts on users' emotions, decisions, and views of the outside world (Fischer et al., 2020). Although AI techniques have been successfully incorporated into many news recommender systems to improve user experiences, their potential ethical risks and intrinsic causes are neither fully identified nor addressed. Our work provides quantitative empirical evidence that news recommendation AI can manipulate the sentiment orientation of displayed news by increasing the recommendation chances of news with stronger negative sentiment. Since users behave differently towards news with different sentiments, AI models learned on large-scale user data will encode these sentiment biases and generate more biased recommendation results. The sentiment bias can be amplified in the loop of human–AI interactions, leading to heavier sentiment manipulation by news recommender models. Since users are vulnerable to the sentiment manipulation of news feeds (Chen et al., 2021), using biased AI for news selection carries great risks of generating unforeseeable negative societal impacts. We should be vigilant about AI's sentiment manipulation brought about by unwanted algorithmic biases when developing and using personalized news feed services.

To rid personalized news delivery of AI's sentiment manipulation, in this work we propose a sentiment-debiasing method that eliminates the model's sentiment bias inherited from user data. We decompose news information into a sentiment-aware component and a sentiment-independent component and regularize them to be orthogonal. By applying adversarial learning to the sentiment-independent part, its encoded sentiment bias can be effectively removed, making the recommendation results sentiment-agnostic. Our approach removes most of AI's sentiment bias with only a minor accuracy loss, which indicates that the sentiment manipulation problem is effectively mitigated without severely harming user experiences. Our work can promote the responsibility of AI-empowered news delivery and provide users with both effective and trustworthy channels of information acquisition. In addition, our proposed methodology can be generalized to reduce other types of biases in AI systems, such as gender (Park et al., 2018) and racial (Obermeyer et al., 2019) biases, to build more controllable, inclusive, and fair machine intelligence for the good of humanity.

However, we still need to be cautious when handling sentiment biases in news recommendations, since removing them can change the impacts of other types of biases (e.g., gender bias; see Supplementary Fig. 5) on the recommendation results. This chain reaction may amplify (or, fortunately, alleviate) the bias effects on the news information delivered to users. In future work, we would like to study how to jointly mitigate the effects of multiple types of biases on personalized recommendations.