Introduction

Online social networks and information networks have become integral parts of our everyday life. However, the opportunities offered by such networks are often constrained not only by our previous interactions1,2,3,4,5, but also by algorithms. For instance, algorithms could make some people or content more visible than others via classification6, ranking or recommendations7. In this regard, search engines and recommender systems are increasingly used for various applications such as whom to follow, whom to cite, or whom to hire. Typically, these applications use algorithms to order items (e.g., people and academic papers) based on “importance” or “relevance”, and may therefore produce social inequalities by discriminating certain individuals or groups of people in top ranks. In fact, it has been shown that recommender systems such as Who-to-Follow (WTF)8 tend to increase the popularity of users who are already popular7,9,10. A similar effect has been found in PageRank11, where nodes in high ranks stabilize their position and give little opportunity to other nodes to occupy higher positions12. This tendency towards the “popular” arises because these algorithms harness structural information, in particular, the in- and out-degree of nodes. For this reason, modeling the directionality of links—which is often left out for simplicity—is crucial to really understand how these algorithms work on different types of networks.

However, social networks are complex systems, and many other structural properties may also alter the distribution of nodes and groups in the ranking. For example, previous studies have shown that homophily—the tendency to connect to similar others—affects the visibility of minorities in degree rankings13 and people recommender systems14. Consequently, it can reinforce societal issues such as the glass ceiling effect15,16,17 and the invisibility syndrome18. Despite these findings, little is known about the extent to which the combination of multiple structural properties can alter the visibility of minorities in top ranks from ranking and recommendation algorithms. A further complication is that debiasing ranking outcomes and making them fair is very challenging since they can be mitigated in different ways19: by intervening on the score distribution of candidates20, on the ranking algorithm21, or on the ranked outcome22. While most of these studies tackle fairness in ranking, they do not explore the effects of networked data in ranking. This paper is a step towards this goal. Since such algorithms are so deeply involved in social, economic, and political processes, we need to first understand how our connections affect them to then apply appropriate interventions towards fair results.

Figure 1
figure 1

Inequality and inequity. Every column represents a network with certain level of homophily. All networks contain 20 nodes: 20% belong to the minority group (orange), and 80% to the majority group (blue). Edges follow a preferential attachment with homophily mechanism. The top row shows the graph and the level of homophily within groups (MM: majorities and mm: minorities). The second row shows all nodes in descending order (from \(+\) to −) based on their PageRank scores. The third row represents the rank inequality: Gini coefficients of the rank distribution for every top\(-k\%\) (black line). Gini\(_{global}\) refers to the Gini coefficient of the entire rank distribution (i.e., at top-\(100\%\)). We see that the lower the k, the lower the Gini of the rank distribution. The bottom row represents the rank inequity: Percentage of minorities found in each top-\(k\%\) of the rank distribution (orange line). ME is the mean error of these percentages compared to a fair baseline or diversity constraint (i.e., how much the orange line deviates from the dotted line across all top-k’s). Here we see three main patterns: (a, b) When the majority group is heterophilic, minorities are on average over-represented, \(ME>0.0\). (d, e) When majorities are homophilic, minorities are on average under-represented, \(ME<0.0\). (c) When both groups are neutral, the observed fraction of minorities is almost as expected, \(ME\approx 0\).

To this end, we propose DPAH, a network model that generates directed scale-free networks with binary-attributed nodes. It encodes two main mechanisms of edge formation found in social networks: homophily and preferential attachment23,24,25 (see Methods for more details). Moreover, it allows to control for the fraction of minorities, edge density, and the skewness of the out-degree distribution. By using this model, we systematically study how these structural properties of social networks impact the ranking of nodes in PageRank and WTF. In particular, we investigate two ranking issues, inequality and inequity, and show how they get affected by the ranking algorithm together with the type of network. We measure inequality by quantifying the skewness of the rank distribution of nodes that PageRank and WTF produce, and inequity as how well-represented the minorities are in the top of the rank compared to the proportion of minorities in the network. In this work we study both ranking issues and measure their correlation. Furthermore, we quantify them globally using the whole rank distribution, and locally within each top-k% rank. The goal is to identify both the overall inequality and inequity trend that these algorithms produce, and the tipping points where minorities start gaining visibility in the top of the rank.

As an example, consider the directed networks shown in Fig. 1. Every column represents a network with two types of nodes, minority (orange) and majority (blue), and different levels of homophily within groups. Homophily h, is a parameter ranging from 0 to 1 and determines the tendency of two nodes of the same color to be connected. \(h_{MM}\) and \(h_{mm}\) represent homophily within majorities and minorities, respectively. When nodes are ranked using PageRank (second row), the position of the minorities in the rank varies systematically. For instance, when majorities are heterophilic (\(h_{MM}=0.2\), columns a and b), minorities often appear at the top (+). In contrast, when majorities are homophilic (\(h_{MM}=0.8\), columns d and e), minorities tend to appear at the tail of the rank (-). Next, we explain this systematic ranking behavior in top ranks by further varying the structure of the network.

Results

Inequality and inequity in ranking

Inequality refers to the dispersion or distribution of importance among individuals. This importance is the ranking score assigned to every node by the algorithm. We compute the Gini coefficient of the rank distribution to measure how far the ranking scores of individuals deviate from a totally equal distribution (see Methods for more details). As shown in Fig. 2, a very low Gini score (\(\text {Gini}<0.3\)) means that individuals are very similar with respect to their ranking scores. If the Gini score is extremely high (\(\text {Gini}\ge 0.6\)), it means that only a few individuals capture most of the rank. In other words, the rank distribution is very skewed. Values in between (\(0.3\le \text {Gini}<0.6\)) represent moderate skewed distributions. Note that we measure inequality globally by using the whole rank distribution, and locally for each top-k%. From our example in Fig. 1, we see that PageRank on average generates moderate skewed ranking distributions for all the depicted networks (\(Gini_{global}\approx 0.5\)). However, for very small top-k%’s, the Gini is very low. This means that the top individuals possess very similar ranking scores.

Inequity refers to group fairness. In particular, it measures the error distance between the fraction of minorities in the top-k% and a given fair baseline (e.g., a diversity constraint or quota). This baseline may be adjusted depending on the context of the application19,26,27. Here, a ranking is fair when its top-k% preserves the proportional representation of groups in the network (i.e., equivalent to demographic parity19,28). Therefore, the error represents the local inequity at each top-k%, and ME the mean of these errors across all top-k% ranks or global inequity. As shown on the last row of Fig. 1, we measure the local inequity in two steps. First, we compute the fraction of minorities that appear in each top-k% rank (orange line). Second, we compute the error between the observed fraction of minorities in each top-k% rank and a fair baseline (e.g., the actual fraction of minorities in the network, in this example \(20\%\)). Then, we average these error scores across all top-k% ranks to determine the global inequity score (ME values). Ideally, a fair ranking should reach \(ME=0\). However, in order to allow for small fluctuations we introduce the smoothing factor \(\beta\). Thus, a fair ranking is such that \(-\beta \le ME \le \beta\). The value of \(\beta\) is arbitrary, and allows for a smooth definition of “low mean error” or fairness. We set \(\beta =0.05\). As shown in Fig. 2, when \(ME>\beta\), then minorities are over-represented in the top-k% (blue region). When  \(ME<-\beta\), then minorities are under-represented (red region), otherwise the ranking is representing very well the minorities in the top of the rank (green region). Alternatively, we can say that the top rank (i) replicates the proportional representation of groups when ME is zero or very low, (ii) amplifies the representation of minority nodes when \(ME>\beta\), and (iii) reduces the representation of minority nodes—and benefits the majority group—when \(ME<-\beta\). Note that \(ME\approx 0\) may be an artifact of a numerical cancellation as in (c), the neutral case in Fig. 1. In such cases, we could argue that the ranking is still fair since overall it was biased towards both groups across all top-k’s.

Figure 2
figure 2

Regions of disparity. We measure inequality (y-axis) as the skewness of the rank distribution, and inequity (x-axis) as the mean differences between the proportional representation of groups in top-k% ranks and the network. Highly skewed distributions lie in regions I to III (darker colors), and fair rankings, where minorities are well represented in top ranks, lie in regions II, V, VIII (green). We set \(\beta =0.05\) which is arbitrary and allows for a flexible region of group fairness.

Finally, we refer to the relationship between inequality and inequity as disparity. For example, if a ranking distribution achieves \(Gini=0.65\) and \(ME=0.5\), we say that the disparity lies in the region III (dark blue), i.e., high inequality and high inequity, see Fig. 2.

Figure 3
figure 3

DPAH  model and ranking of nodes. (a) Illustration of the directed network model with preferential attachment and homophily (DPAH). First, \(n=8\) nodes are created and randomly labeled according to the fraction of minorities \({f_m= 0.25}\). Then, the following algorithm repeats until a desired edge density is fulfilled. At time t, a source node p is drawn from a power-law (activity) distribution, and a target node j is drawn with a probability proportional to the product of its in-degree \(K_j^{in}=4\) and the pair-wise homophily \(h_{pj}=h_{mM}=0.9\). At time \(t+1\), a new edge is added between nodes \(l\rightarrow o\) based on the same mechanism. (b) The PageRank score of each node is shown under PR. Nodes in each top-k% of the rank are grouped based on the unique PageRank scores. In this example, the top-60% of nodes concentrate most of the PageRank and their scores are somewhat similar (i.e., low Gini). Also, the ranking is fair from top-80% onwards, since they capture the same fraction of minorities as in the population, 25%. Local values are measured per top-k%, and global values are measured using the whole distribution for inequality (Gini), and the average across all top-k% ranks for inequity (mean error).

Growth network model with homophily and directed links

In order to examine the effect of homophily on the ranking of minorities in social networks, first we need to develop realistic network models that capture not only a variety of group mixing, but also the directionality of links. Many online social networks are directed networks in their nature, including the follower-followee structure on Twitter, citation networks29, and the hyperlink structure of the Web. Directed links are the key components of many algorithms such as Google Scholar30 PageRank and Who-to-Follow.

To this end, we prop ose DPAH, a directed pref erential attachment with homophily network growth model. We generate these networks by adjusting the number of nodes \(n=2000\), the edge density \(d=0.0015\), the fraction of minorities \(f_m \in \{0.1, 0.2, 0.3, 0.4, 0.5\}\), the in-class homophily \(h_{MM}, h_{mm} \in \{0.0, 0.1, \ldots , 1.0\}\), and the power-law exponents of the activity distributions \(\gamma _M = \gamma _m = 3.0\). We refer to the minority group as m, and to the majority group as M. Note that the between-class homophily is the complement of the in-class homophily. That is, \(h_{Mm}=1-h_{MM}\) and \(h_{mM}=1-h_{mm}\). Furthermore, an activity score is assigned to every node. This score is drawn from a power-law distribution and determines with what probability the existing node becomes active to create additional links to other nodes. This means that more active nodes possess higher out-degree (see Methods for more details). Each combination of network structure is generated 10 times, nodes are ranked using PageRank and WTF separately, and inequality and inequity scores are computed and averaged across network types (and top-k’s for local disparity) for each algorithm.

Figure 3a illustrates the generation of a network using the DPAH model. First, n labeled nodes are created. In this example \(n=8\). Then, at time t, node p is selected as source node with a probability proportional to its activity. Then, p connects to an existing node j with a probability related to their pair-wise homophily \(h_{pj}\) and preferential attachment that is based on \(k^{in}_{j}\), the in-degree of node j. By this process, we ensure that the out- and in-degree distributions of nodes follow seemingly power-law distributions that have been observed in many large social networks31. The algorithm stops once the network reaches an expected density. Note that source nodes can be either new nodes joining the network for the first time (e.g., node p at time t) or existing nodes (e.g., node l at time \(t+1\)). Since the network size is given, “a node joining the network for the first time” is a 0-degree node that has been selected to create its first edge. Once a source node connects to a target node successfully, the source node becomes available in the next rounds to become a target candidate. This means that in the beginning the model faces a cold start problem since there are no existing (target) nodes to connect to. Thus, the first 1% of new edges are between a source node (drawn from the activity distribution) and any other node with probability as in Equation (3). For the sake of completeness we show the computation of local and global disparities of this network in Fig. 3b.

How do homophily and directional links influence the ranking of minorities globally and locally?

Global disparity. As expected, we found that the Gini coefficient of the rank distributions is large. \(Gini_\text {global}\ge 0.6\) (regions I, II and III; dark colors) for both PageRank (see Fig. 4) and WTF (see Supplementary Figure S1). As we will see later, this is mainly due to the preferential attachment mechanism32,33. Moreover, we find that on average: (i) Balanced networks (\(f_m=0.5\)) can get a fair ranking (green) when both groups possess the same homophily scores (\(h_{MM}=h_{mm}\)). The same applies for neutral networks (\(h_{MM}=h_{mm}=0.5\)) regardless of their fraction of minorities (\(f_m\le 0.5\)). (ii) When the fraction of minorities decreases (\(f_m<0.5\)), groups can be fairly represented in the rank in two regimes: First, when both groups are homophilic, homophily within minorities must be higher than homophily within majorities (\(h_{mm}>h_{MM}>0.5\)). Second, when both groups are heterophilic, homophily within majorities must be higher than homophily within minorities (\(h_{mm}<h_{MM}<0.5\)) to balance the importance of groups.

Figure 4
figure 4

The effects of homophily and fraction of minorities in the global disparity of PageRank. Columns represent the fraction of minorities in the network, x-axis indicates the homophily within minorities, and y-axis the homophily within majoritie s. Colors denote the region where the disparity lies in according to our interpretation in Fig. 2. First, we see that, on average, there is never low global inequality (i.e., regions IV to IX—lighter colors—do not appear). This makes sense because these are scale-free networks. Second, depending on the level o f homophily within groups, minorities on average can be under-represented (region I, red), or over-represented (region III, blue), or well-represented (region II, green). For example, when \(f_{m}=0.1\), minorities are on average under-represented when \(h_{MM}\ge 0.7\) and \(h_{MM}\ge h_{mm}\).

Local disparity

We also compute inequality and inequity within each top-k% rank in order to see to what extent they ch ange when k increases. In the case of PageRank, we see in Fig. 5 that inequality varies (i.e., from light to dark col or s) in different regimes mainly due to the size of k (x-axis), and inequity due to the interplay between homophily within groups, \(h_{MM}\) and \(h_{mm}\). In particular: (i) Only at the top-5% of the rank we see a few cases of low inequality (regions VII, VIII and IX; very light colors), this means that nodes at the very top possess very similar ranking scores, but they are very far from the rest of the population, i.e., the larger the top-k%, the hig her the Gini (darker colors). This holds for WTF up to roughly the top-30% (see Supplementary Figure S2). Overall, PageRank converges to high inequality faster than WTF. (ii) Inequity (regions: red, blue, green) is consistent across all top-k% ranks for both algorithms. In other words, if the ranking algorithm favors or harms one group in the top-5%, it will continue to do so until converging to the fair regime (regions II, V, VIII; green). With a few exceptions, this fair regime is only reached when k is very large. For example, if a minority group is under-represented at the top-5%, it will remain under-represented at the top-80% (see \(h_{mm}=0.1\) and \(h_{MM}\ge 0.7\) in Fig. 5 for PageRank, and Supplementary Figure S2 for WTF). (iii) Minorities are often over-represented when majorities are heterophilic \(h_{MM}<0.5\); (regions III, VI, IX; blue). In contrast, minorities are often under-represented when majorities are homophilic \(h_{MM}>0.5\) (regions I, IV, VII; red). This is consistent up to \(\approx\) top-\(80\%\) for both algorithms.

In summary, our results sugges t that the size of k does not have an influence on inequity. This means that if the algorithm amplifies inequity at the top-5%, it will also amplify inequity at larger top-k%’s. Therefore, increasing the selection pool (larger k) does not improve the representation of minorities. This can be explained by the fact that the preferential attachment mechanism disproportionately affects nodes ranking12.

Figure 5
figure 5

The effects of homophily and fraction of minorities in the local disparity of PageRank. Columns represent the fraction of minorities (10%, 30% and 50%) and rows show homophily within minorities (from top to bottom: heterophilic, neutral and homophilic). The x-axis denotes the top-k% rank and the y-axis shows homophily within majorities. Colors refer to the regions of disparity introduced in Fig. 2. One can see that the minority suffers most (red) when the majority is homophilic and the minority is either heterophilic or neutral. Moreover, inequality is lowest (very light colors) only for a few cases at top-5%. This means that the top best ranked nodes are very similar and their ranks are far from the majority of nodes (i.e., due to preferential attachment). Moreover, inequity remains mostly consistent regardless of top-k%. In other words, if the ranking algorithm favors one group in the top-5% (e.g., red or blue), it will continue to do so until entering the fair regime (green).

Correlation and feature importance

We compute the Spearman correlation between inequality and inequity, and conduct a random forest regression to measure the importance of each network property on both inequality and inequity values (see Supplementary Appendix A.3 for more detai ls). Results are shown in Table 1 for PageRank and Supplementary Table S2 for WTF. We find that inequality and inequity are positively correlated in both global and local regimes. In other words, the more skewed the rank distribution (i.e., high Gini), the more unfair with either group (i.e., mean error far from zero), and vice versa. This correlation is stronger and more significant in PageRank than in WTF. In terms of feature importance, we find that global inequality (Gini) is mainly explained by both homophily values, whereas g lobal inequity (ME) is mainly driven by homophily within majorities. Local inequality (\(Gini_k\)), however, is mainly explained by the top-\(k\%\) rank, and local inequity (\(ME_k\)) by the homophily within the majority group. Notice that we added the variable \(\epsilon\) to verify whether the network-based features are better than random (see Supplementary Appendix A.3 for more details). In the case of PageRank, all network-based features perform better than by chance. However, randomness seems to be more relevant for explaining rank inequality (Gini) in WTF.

Table 1 Ten-fold cross-validation for PageRank.

These results are in agreement with what we see in previous figures; even though majority nodes produce most of the inequality and inequity in the rank, their interplay with minority nodes can change or intensify the direction of bias. In fact, both homophily values can explain \(75\%\) (49%) of Gini, the global inequality in PageRank (WTF), \(92\%\) (88%) of ME, the global inequity, and \(78\%\) (74%) of \(ME_k\), the local inequity. However, the top-k% rank together with the homophily within majority nodes explain \(84\%\) (86%) of \(Gini_k\), the local inequality.

How do different social mechanisms of edge formation contribute to disparity?

So far, we show that PageRank and WTF on our network model produce high inequality and a wide-range of possible inequity outcomes. How much of that inequality or inequity was a product of homophily or preferential attachment? To see the effects of these two mechanisms alone, we generate new networks by turning on and off the homophily and preferential attachment features (see Methods for the details of the models).

Figure 6 shows the inequality and inequity produced by PageRank on a variety of models: DPA (Directed Preferential Attachment), DH (Directed Homophily), Random, and DPAH (see Supplementary Figure S3 for WTF). Results from both algorithms show that networks whose nodes connect through preferential attachment (DPA) produce on average higher inequality compared to DH and Random. However, when preferential attachment is combined with homophily (DPAH), this inequality increases even further. Additionally, we see that WTF produces higher inequality compared to PageRank (see Supplementary Appendix A.4 for more details). Inequity, on the other hand, is mainly driven by homophily. This means that, homophily (DPAH and DH) influences both, inequality and inequity in both algorithms.

Figure 6
figure 6

The effects of homophily and preferential attachment in the global disparity of PageRank. We generated directed networks using four different models of edge formation. DPA: only preferential attachment. DH: only homophily. DPAH: our proposed model that combines DPA and DH. Random: a baseline where nodes are connected randomly. We see the following patterns: (i) Homophily (DH) produces a moderate-to-high level of inequality (\(0.3<Gini<0.8\)), while preferential attachment (DPA) produces a consistent moderate inequality (\(Gini\approx 0.5\)). When both mechanisms are combined (DPAH), the rank inequality increases even further (\(0.7<Gini<0.9\)). (ii) Random and Preferential attachment (DPA) are always fair (\(ME=0\) or \(|ME|\le \beta\)), while in the cases where homophily is involved (DH and DPAH) inequity is often high (\(|ME|>\beta\)). Thus, in general preferential attachment is the main driver of inequality, while homophily influences both inequality and inequity. Vertical and horizontal error bars represent the standard deviation over 10 runs of the Gini and ME, respectively.

Note that in Fig. 6, we fixed the activity of nodes to \(\gamma _M=\gamma _m=3.0\). However, when we set these parameters to \(\gamma _M=\gamma _m<3.0\) (more active nodes or lower values of \(\gamma\) as found in several scale-free networks34), inequality decreases, see Supplementary Figure S5. This behavior holds even if the minority group is the only one increasing its activity (\(\gamma _m=1.5 < \gamma _M=3.0\)) which in turn increases inequity against the majority, see Supplementary Figure S6. Additionally, in Supplementary Figure S4, we see that edge density also plays a role in the inequality produced by PageRank and WTF. This means that, by further adjusting these two parameters (node activity and edge density), we would expect changes only to inequality since inequity is mainly affected by homophily as we saw before.

Disparities on empirical networks

First, we fit the DPA, DH, and DPAH models to each of the empirical networks in order to find the mechanism that best explains the inequality and inequity found in the rank. The parameters passed to these models are inferred from the real networks and described in Table 3. Second, we rank nodes in the empirical and fitted networks using PageRank and WTF, and compute the disparities (inequality vs. inequity) found in their rank distribution. Results are shown in Fig. 7 for PageRank and Supplementary Figure S9 for WTF. Disparity values from the real-world networks are labeled as empirical (black dot), and disparity values from the fitted networks are labeled according to the model (x marks). We see that each network tells a different story. This can be explained by the nature or domain of these networks. For instance, APS and Hate are best explained by the DPA model. This means that scientists tend to cite authors that have already many citations, and users in Twitter tend to retweet content posted by popular users (i.e., popular in terms of the number of retweets they get). Blogs and Wikipedia on the other hand, are best explained by our DPAH model. Notice that both are hyper-link networks. In other words, people tend to add not only popular references to their Web pages, but also related to their topics (i.e., political leaning in Blogs, and gender in Wikipedia). Note that the Hate network shows the lowest (empirical) inequality. This is due to the fact that it possesses low out-degree exponents (\(\gamma _M=2.2\), \(\gamma _m=1.7\)).

Figure 7
figure 7

Global disparity in PageRank on empirical networks. Each column represents an empirical network. Citation/retweet networks (APS and Hate) and Hyper-link networks (Blogs and Wikipedia). Inequality and inequity are shown in the y- and x-axis, respectively. The disparity in ranking that we see in empirical networks are best explained as follows: (i) citation/retweet networks by preferential attachment PA, and (ii) hyper-link networks by preferential attachment and homophily DPAH.

Strategies towards a fair ranking

Results from both algorithms show that while the homophily within majorities is the main driver for inequality and inequity, minorities may overcome unfair rankings by connecting strategically in the network. For instance, when both groups are equally active, minorities should adjust their homophily based on the homophily of the majority. (i) When majorities are homophilic \(h_{MM}>0.5\), minorities should increase their homophily such that \(h_{mm}>h_{MM}\). (ii) When majorities are (somewhat) neutral (\(h_{MM}=0.5\pm 0.1\)), minorities may connect arbitrarily with any group without being too homophilic, otherwise they will become over-represented in the rank. (iii) When majorities are heterophilic \(h_{MM}<0.5\), one solution to achieve a fair rank is to increase the size of the minority group, and make sure that both groups behave similarly in terms of homophily (\(h_{MM}\approx h_{mm}\)). Otherwise, minorities will be over-represented regardless of their in-class homophily. On the other hand, when one group is more active than the other, achieving a fair rank becomes challenging. Nevertheless, if the objective is to increase the visibility of minorities in the rank, then the minorities themselves should be more active in the network by creating more connections to increase their out-degree. Note that these “strategies” without algorithmic intervention may work in scenarios such as a citation or collaboration networks, but they might not work in other scenarios. In such cases, we need additional recommender systems to help under-represented groups discover those “strategic” links that will help them climb to higher ranks.

Discussion and future work

In this work we have proposed a systematic study to measure the inequality and inequity produced by PageRank and Who-To-Follow (WTF). Our approach disentangles the effect of network structure on the rank distributions of these two algorithms by using synthetic networks. By doing so, we control for the properties of the network and measure how these changes affect the rankings. In particular, we studied six prominent structural properties of social networks: homophily, preferential attachment, fraction of minorities, edge density, node activity and the directionality of links. We found that the systemic bias produced by these algorithms in the rank is mainly due to homophily imbalance (\(h_{MM}\gg h_{mm}\) or \(h_{mm}\gg h_{MM}\)) for inequity, and the interplay between our six properties of interest for inequality. Consequently, our systematic study makes PageRank and Who-To-Follow interpretable and explainable since our results show the necessary structural conditions to achieve a fair rank. A potential avenue to reduce inequity is to then create synthetic connections before the ranking as it is done for correcting the class imbalance problem in supervised learning35. Alternatively, these conditions or strategic connections may be added into the network to change its structure as a collective fairness intervention. For instance, recommender systems could suggest relevant articles not only based on popularity and (keyword) similarity but also based on fairness by fulfilling diversity constraints.

Notice that our model simplifies the role of homophily and minorities. First, it assumes that all nodes of the same group have the same in-class and between-class homophily. This means that rich mixing patterns might get ignored since some nodes can exhibit local differences36. Second, while there exist multiple definitions of minorities37 we adopted the one by Italian jurist Francesco Capotorti: “a group numerically inferior to the rest of the population of a State, in a non-dominant position...38,39. However, we constrained this population to all nodes in a given network (neither the State nor the world population). Notice as well that the ranking amplifies the representation of minorities by reducing the representation of majorities in top ranks (e.g., when the majority is heterophilic, see Fig. 1). This aligns with the definition of minorities by Wirth40 that implies that “minorities objectively occupy disadvantageous positions in society”. This means that “a minority may actually, from a numerical standpoint, be the majority” (e.g., people living in poverty in under-developed countries). In other words, under these two definitions, being in disadvantage in the top-k% (inequity) is not a group size issue only, but a combination of group size and homophily as we have previously shown. More complex definitions of minorities are out of the scope of this paper. A further limitation is that we focus on a single binary attribute (e.g., \(\text {color}\in \{\text {Black}, \text {White}\}\)), this means that multiple sources of inequality and inequity (e.g., intersections of disadvantage such as being poor and of color) cannot be captured at once41. Addressing these issues is beyond the scope of this paper and we leave them for future work.

Finally, we disentangled the individual effects of preferential attachment and homophily in the rank by comparing the disparities of our proposed DPAH model with two variants and a baseline: networks with preferential attachment only (DPA), networks with homophily only (DH), and directed Erdös-Rényi (Random)42 graphs. Further research can investigate other topologies and social mechanisms of edge formation such as clustering43, transitivity44, and reciprocity45. Similarly, other structural properties such as monophily46 and second order homophily47 can be studied to measure their influence on ranking.

Conclusions

In this work we have investigated under which conditions PageRank and Who-To-Follow (WTF) reduce, replicate or amplify the representation of minorities in top ranks. In particular, given the rank distribution produced by these algorithms, we computed inequality as the dispersion among individuals in terms of ranking scores, and inequity as whether minorities are over-, under- or well-represented in top ranks compared to their representation in the network. We studied these two metrics separately and in combination to better understand the mechanisms that can explain them.

To that end, we proposed DPAH, a growth network model that allows to generate realistic scale-free directed networks with different levels of homophily, fraction of minorities, node activity, and edge density. In these networks, we found that both inequality and inequity are positively correlated and mainly driven by the homophily within majorities. This means that, when the majority group is highly homophilic, the minority group is under-represented in top ranks. Also, when the majority is highly heterophilic, the minority benefits tremendously since it is over-represented in the top-k%. However, minorities can overcome these disparities by connecting strategically with others. Thus, equity in ranking is a trade-off between homophily and the fraction of minorities.

Our systematic study makes PageRank and Who-to-Follow explainable and interpretable to help data scientists understand and estimate the disparity that these algorithms produce given the structure of networks, which is key for proposing targeted interventions. We hope our results create awareness among majority and minority groups about these disparities since they may replicate and even amplify the biases found in social networks.

Data and methods

Synthetic networks

Network models have been proposed with various social mechanisms. For instance, the classic stochastic-block model48 which allows for homophily between and across groups, and the configuration model49 which generates links among nodes by preserving a given degree distribution. On the other hand, the preferential-attachment model25 produces scale-free networks due to cumulative advantage50. Although these models can reproduce certain properties of real-world networks such as degree or homophily, they fail at guaranteeing similar visibility of minorities as their empirical counterpart. In this direction, Karimi et al.13 and Fabbri et al.14 devise social network models with preferential attachment, adjustable homophily and fraction of minorities. They demonstrate how the degree rank of the minority group in a network is a function of the relative group sizes and the presence or absence of homophily. However, the former models undirected networks, and the latter did not control for edge density and node activity (i.e., power-law out-degree distributions) as we do in this work for minority and majority groups.

Directed network

We define a directed network as: Let \(G=(V,E,C)\) be a node-attributed unweighted graph with \(V=\{v_1,\ldots ,v_n\}\) being a set of n nodes, \(E \subseteq V \times V\) a set of e directed edges, and \(C=\{c_1,\ldots ,c_n\}\) a list of binary class labels where each element \(c_i\) represents the class membership of node \(v_i\). The fraction of minorities \(f_m\) captures the relative size of the minority class—with respect to C—in the network. We refer to the minority group as m, and to the majority group as M. A network is balanced when all class labels have the same number of nodes (\(f_m=0.5\)), otherwise it is unbalanced (\(f_m<0.5\)). Networks fulfill a predefined edge density level d. Since n and d are given, networks stop growing when \(e=d n (n-1)\).

In order to generate directed links, inspired by the activity-driven network model51, we assign an activity score to each node that determines with what probability the existing node becomes active and creates additional links to other nodes. It has been shown that in empirical networks the activity of the nodes follows a power-law distribution51. Therefore, we assign an activity to each node drawn from a power-law distribution \(\rho (\gamma ) = X^{-\gamma }\). Note that each group possess its own activity distribution and they are defined by its power-law exponent \(\gamma _M\) and \(\gamma _m\) for majority and minority nodes, respectively. The level of activity of a group is inversely proportional to \(\gamma\). That is, groups with higher out-degree produce lower \(\gamma\) (more skewed).

Then, the probability of connecting a source (active) node \(v_i\) to a target node \(v_j\) (or in other words the probability of connecting to \(v_j\) given the source node \(v_i\)) is explained by any of the following three mechanisms of edge formation.

Preferential attachment (DPA)

Also known as the rich-get-richer effect or cumulative advantage in social networks25,50. It indicates that nodes tend to connect to popular nodes. We define popularity as the in-degree of the node. Therefore, the probability that a source node \(v_i\) connects to a target node \(v_j\) is proportional to the in-degree of the target node \(v_j\).

$$\begin{aligned} P(i\rightarrow j) = P(j|i) = \frac{k^{in}_{j}}{\sum _{l=1}^{N} k^{in}_{l}} \end{aligned}$$
(1)

Homophily (DH)

It is the tendency of individuals to connect (or interact) with similar others24,49. Thus, the probability that a source node \(v_i\) connects to a target node \(v_j\) is driven by the homophily between their classes \(c_i\) and \(c_j\). We assign a homophily value to each dyad based on pre-defined homophily parameters within majorities and minorities, \(h_{MM}\) and \(h_{mm}\), respectively. Homophily values range from 0.0 to 1.0. If the homophily value is high, that means that nodes of the same class are attracted to each other more often than nodes of different attributes. Following the definitions from previous work13,14,52, nodes of the same class with homophily \(h_{aa}=0.5\) are referred to as neutral (i.e., they connect randomly to either class), otherwise they are heterophilic if \(h_{aa}<0.5\) (i.e., more likely to connect to the other class), or homophilic when \(h_{aa}>0.5\) (i.e., more likely to connect to the same class). Note that in- and between-class homophily values are complementary: \(h_{mm}=1-h_{mM}\) and \(h_{MM}=1-h_{Mm}\).

$$\begin{aligned} P(i\rightarrow j) = P(j|i) = \frac{h_{ij}}{\sum _{l=1}^{N} h_{il}} \end{aligned}$$
(2)

Preferential attachment with homophily (DPAH)

We propose DPAH, a directed growth network model with adjustable homophily and fraction of minorities. DPAH stands for Directed network with Preferential Attachment and Homophily. This mechanism combines DPA and DH, and is an extension of the BA-Homophily model13.

$$\begin{aligned} P(i \rightarrow j) = P(j|i) = \frac{h_{ij}k^{in}_{j}}{\sum _{l=1}^{N} h_{il} k^{in}_{l}} \end{aligned}$$
(3)

Note that DPA and DH are especial cases of DPAH where only the in-degree mechanism varies. This means that, the out-degree distribution remains the same as in DPAH: it is driven by the activity model. Additionally, we include a random model where both source and target nodes are chosen at random (i.e., directed Erdös-Rényi model42). Table 2 shows the parameters adjusted in each model. Number of nodes n and edge density d are arbitrary in the sense that they are not part of the edge formation mechanism. Thus, we fix them to make a fair comparison across all models.

Table 2 Model parameters.

Empirical networks

We inspect four networks from different domains and compute the inequalities and inequities produced by PageRank and WTF. Table 3 shows the most important properties of these networks.

  • APS: The American Physical Society citation network wh ose nodes represent articles, and edges represent citations. The binary class of each node is \(\text {pacs}\) and encodes two different Physics sub-field s w he re \(\text{05.20.-y}\) ( C la ssical statistical mechanics) is the minority.

  • Hate: A retweet network53 where nodes denote users, and edges represent ret weets among them. Users are labeled as either \(\text {hateful}\) or \(\text {normal}\) depending on the sentiment of their tweets. Hateful users represent the minority.

  • Blogs: An hyper-link network from political blog posts about the 2004 U.S. election54. Nodes represent blog pages, and edges hyper-links among them. Each blog is labeled as either \(\text {right-}\) or \(\text {left-}\)leaning. The latter represents the minorities.

  • Wikipedia: A Wikipedia hyper-link network where nodes repres ent U.S. politicians55,56 labeled as either male or female. Female politicians re present the minorities.

Table 3 Empirical Networks.

Ranking and recommendation algorithms

There exist a variety of ranking and recommendation algorithms that follow different strategies depending on the nature of the problem. For instance, in information systems, items such as content, Web pages, and products are ranked to recommend users what to read or buy57. In social networks, however, people are ranked to identify their hierarchy or importance58,59,60, and recommended to other users in order to establish new connections61,62,63,64. These rankings and recommendations are based on algorithms that often rely on whom we are already connected with. In this work, we focus on two such algorithms widely used in practice65: PageRank11 and Who-to-Follow (WTF)8. While PageRank determines the global ranking of nodes in comparison with all other nodes, WTF deals with ranking nodes in a node level and thus remains a local measure. For that reason, we focus on these two algorithms to capture both dimensions.

PageRank

It was invented to rank all web pages in the Web11, and has been used in several applications65. For example, to study citation and co-authorship networks66,67,68. PageRank assigns an importance score to every single node in a network. This score takes into account the number and quality of incoming links of each node. The PageRank of node i is defined as follows:

$$\begin{aligned} PR(i) = (1-\alpha ) + \alpha \sum _{j \in N_i} \frac{PR(j)}{k^{out}_j} \end{aligned}$$
(4)

where \(i\in V\), \(N_i\) represents all neighbors of node \(v_i\) (e.g., all nodes \(v_i\) points to), and \(k^{out}_j\) the out-degree of node \(v_j\). The damping factor \(\alpha\), or probability of following links using a Random Walker, is set to 0.85 as suggested by Brin and Page69. We use the fast-pagerank70 python package to compute the PageRank score of all nodes using sparse adjacency matrices.

Who-to-follow (WTF)

This recommendation algorithm was created and used by Twitter to suggest new people to follow8. It is based on SALSA71 which in turn is based on Personalized PageRank72. In a nutshell, for each user u (or node \(v_i \in V\)), the algorithm looks for its circle of trust, which is the result of an egocentric random walk (similar to personalized PageRank)8. Then, based on this circle-of-trust, the algorithm ranks all users that are not yet friends with u but are connected through the circle of trust. Then, we take the top-k of these (recommended) users, and add up the counter of being selected as a recommendation to each of them. This is done for every node u in the network. At the end, the rank of each node encodes the number of times a user was suggested as a recommendation across all nodes in the network. Thus, the WTF score for each node is defined as follows:

$$\begin{aligned} WTF(i) = \sum _{j\in V} \mathbbm {1}_{SALSA(j)}(i) \end{aligned}$$
(5)

where SALSA(j) refers to the top-k users the SALSA algorithm recommends to node j. In this work we select the top-10 users as recommendations. \(\mathbbm {1}_A(x)\) denotes the indicator function or boolean predicate function to test set inclusion (i.e., whether \(x \in A\)).

Gini coefficient

The Gini coefficient was developed by the Italian Statistician Corrado Gini73 to measure the income inequality of a society. It is defined as the mean of absolute differences between all pairs of individuals for some measure. In our setup this measure is the score given to every node by PageRank and Who-To-Follow. The minimum value is 0 when all individuals’ scores are equal, and its maximum value is 1 when there is a big gap or discrepancy between scores74.

We define the Gini coefficient of the rank distribution X as follows. For more details see75:

$$\begin{aligned} Gini(X)=\frac{\sum _{i=1}^{{\hat{n}}} (2i- {\hat{n}} -1)x_i}{n\sum _{i=1}^{{\hat{n}}} x_i} \end{aligned}$$
(6)

where \(x \in X\) is an observed value in the rank distribution, \({\hat{n}}=|X|\) is the number of values observed, and i is the rank of values in ascending order.