Friendship paradox biases perceptions in directed networks

Social networks shape perceptions by exposing people to the actions and opinions of their peers. However, the perceived popularity of a trait or an opinion may be very different from its actual popularity. We attribute this perception bias to friendship paradox and identify conditions under which it appears. We validate the findings empirically using Twitter data. Within posts made by users in our sample, we identify topics that appear more often within users’ social feeds than they do globally among all posts. We also present a polling algorithm that leverages the friendship paradox to obtain a statistically efficient estimate of a topic’s global prevalence from biased individual perceptions. We characterize the polling estimate and validate it through synthetic polling experiments on Twitter data. Our paper elucidates the non-intuitive ways in which the structure of directed networks can distort perceptions and presents approaches to mitigate this bias.


Introduction
We observe our peers to learn social norms, assess risk, or copy behaviors. However, these observations can be systematically biased [1,2,3,4,5], distorting how we see the world. One of the better known sources of bias is the friendship paradox in social networks [6], which states that people are less popular than their friends are, on average. Consequences of friendship paradox can skew how we compare ourselves to friends: people tend to be less happy than their friends are [7], and researchers tend to have less impact than their co-authors do, on average [8]. In fact, any trait that is correlated with popularity is likely to be misperceived [9,10]. This may explain why adolescents systematically overestimate how much their peers drink or engage in risky behaviors [2,5] and why social media use is often associated with negative social comparisons [11].
In contrast to friendship networks, many online social networks are directed. On Twitter, for example, we subscribe to, or follow, others to see their posts, but the information does not flow in the opposite direction unless those people also follow us back. For convenience, we refer to the people whose posts we see as our friends, and to those who see our posts as our followers. Note that this nomenclature does not imply a bidirectional friendship relationship. An individual's in-degree is the number of his or her friends, and the out-degree is the number of followers. The asymmetric nature of links in directed networks leads to four variants of the friendship paradox [12]: your friends (or followers) have more friends (or followers) than you do, on average. Empirically, this effect can be quite large, with upwards of 90% of social media users observing that they have a lower in-degree and out-degree than both their friends and followers [13]. However, the conditions under which these four variants of the paradox exist have not been comprehensively analyzed. We carry out this analysis and show that while two variants of the friendship paradox occur in any directed network [14], the remaining two exist only if an individual's in-degree and out-degree are correlated.
The friendship paradox can alter an individual's observations of the network's state. We consider directed networks where nodes have a trait, such as gender, political affiliation, or whether they used a certain hashtag in their posts. The trait's global prevalence is simply the fraction of all nodes with that trait. On the other hand, its observed prevalence is the fraction of friends of any node that have the trait. In networks where the more influential (higher out-degree) nodes are likely to have the trait, its observed prevalence will be substantially higher than its actual prevalence. Our analysis shows that, similar to the generalized friendship paradox in undirected networks [15,10], correlation between nodes' trait and their out-degree amplifies this perception bias.
In reality, an individual's perception of a trait is shaped by its local prevalence among his or her friends. We identify a new paradox in directed networks, as a result of which a trait will appear significantly more prevalent locally, among an individual's friends, than it is globally among all people. We show that this effect is stronger in networks where higher out-degree nodes are connected to nodes with a lower in-degree.
Surprisingly, although individual observations are biased, we can still make efficient estimates of the global prevalence of a trait. We present a polling algorithm that obtains a statistically efficient estimate of a trait's global prevalence, with a smaller error than alternative polling methods. The proposed method leverages the friendship paradox to reduce the error of the polling estimate by trading off the bias of the estimate against its variance. We analytically characterize this tradeoff and provide an upper bound for the variance.
We demonstrate that perception bias can be large in a real-world network. To this end, we extracted a subgraph of the directed Twitter social network and collected messages posted by users within this subgraph. Treating the occurrence of particular hashtags within messages as traits or topics enables us to measure the perception bias. We identify hashtags that appear much more frequently within users' social feeds than they do among all messages posted by everyone, leading users to overestimate their prevalence. We also validate the performance of the proposed polling algorithm through synthetic polling experiments on our Twitter subgraph.
Our paper elucidates some of the non-intuitive ways that directed social networks can bias individual perceptions. Since collective phenomena in networks, such as social contagion and adoption of social norms, are driven by individual perceptions, the structure of networks and the paradoxes endemic in them can impact social dynamics in unexpected ways. This work shows how we can begin to quantify and mitigate these biases.

Results
Consider a directed network $G = (V, E)$ with $N = |V|$ nodes and $|E|$ links. A link $(i, j)$ pointing from $i$ to $j$ indicates that $i$ is a friend of $j$ or, equivalently, that $j$ follows $i$. Here, the direction of the link indicates the flow of information. The out-degree of a node $v$, $d_o(v)$, measures the number of followers it has, and its in-degree, $d_i(v)$, the number of friends.
We define three random variables, $X$, $Y$ and $Z$, that correspond to different node sampling methods. A node $v$ with out-degree $d_o(v)$ has that many followers or, equivalently, $v$ is a friend to $d_o(v)$ nodes. Therefore, a node $Y$ that is obtained from $V$ by sampling proportional to the out-degree of nodes is called a random friend. Similarly, a node $v$ that has $d_i(v)$ links pointing to it is a follower of $d_i(v)$ other nodes. Therefore, a node $Z$ that is obtained from $V$ by sampling proportional to the in-degree of nodes is called a random follower. Below, we formalize these terms.
Random node $X$ is a node sampled uniformly from $V$:
$$P(X = v) = \frac{1}{N}, \quad v \in V.$$
Random friend $Y$ is a node sampled from $V$ proportional to its out-degree:
$$P(Y = v) = \frac{d_o(v)}{\sum_{v' \in V} d_o(v')}, \quad v \in V.$$
Random follower $Z$ is a node sampled from $V$ proportional to its in-degree:
$$P(Z = v) = \frac{d_i(v)}{\sum_{v' \in V} d_i(v')}, \quad v \in V.$$
For any directed network, the average in-degree $\mathbb{E}\{d_i(X)\} = \sum_{v \in V} d_i(v)/N$ and the average out-degree $\mathbb{E}\{d_o(X)\} = \sum_{v \in V} d_o(v)/N$ are the same, because every link contributes exactly one to each sum. Therefore, we use $\bar{d}$ to denote both the average in-degree and the average out-degree of a random node $X$: $\bar{d} = \mathbb{E}\{d_o(X)\} = \mathbb{E}\{d_i(X)\}$.
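The three sampling schemes can be illustrated with a short Python sketch. The edge list `EDGES` and the helper names below are hypothetical and chosen only for illustration; a pair `(u, v)` is taken to mean that `u` is a friend of `v` (that is, `v` follows `u`), matching the link direction defined above.

```python
import random
from collections import Counter

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]


def degrees(edges):
    """Return the node list, out-degrees (followers) and in-degrees (friends)."""
    out_deg = Counter(u for u, _ in edges)  # d_o(v): number of followers of v
    in_deg = Counter(v for _, v in edges)   # d_i(v): number of friends of v
    nodes = sorted(set(out_deg) | set(in_deg))
    return nodes, out_deg, in_deg


def sampling_distributions(edges):
    """Probability mass functions of random node X, friend Y and follower Z."""
    nodes, out_deg, in_deg = degrees(edges)
    n, m = len(nodes), len(edges)
    p_x = {v: 1 / n for v in nodes}           # uniform over nodes
    p_y = {v: out_deg[v] / m for v in nodes}  # proportional to out-degree
    p_z = {v: in_deg[v] / m for v in nodes}   # proportional to in-degree
    return p_x, p_y, p_z


def sample(dist, rng):
    """Draw one node from a distribution given as {node: probability}."""
    nodes = list(dist)
    return rng.choices(nodes, weights=[dist[v] for v in nodes], k=1)[0]


if __name__ == "__main__":
    p_x, p_y, p_z = sampling_distributions(EDGES)
    rng = random.Random(0)
    print("random friend:", sample(p_y, rng), "random follower:", sample(p_z, rng))
```

Equivalently, a random friend is the source endpoint and a random follower is the target endpoint of a link drawn uniformly from $E$, which is often the easier way to sample them in practice.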

Four Variants of the Friendship Paradox in Directed Networks
Four different variants of the friendship paradox exist in directed networks [12]. The first two (Theorem 1) state that (1) random friends have more followers than random nodes do, and (2) random followers have more friends than random nodes do (on average). The magnitudes of these are set by the heterogeneity (measured by the variance) of the in- and out-degree distributions of the underlying network. Theorems 1 and 2 were independently proved recently in [14] utilizing vector norms. All omitted proofs are in the Appendix.
Theorem 1. Let $G = (V, E)$ be a directed network. Then,
1. a random friend $Y$ has more followers than a random node $X$, on average; i.e.,
$$\mathbb{E}\{d_o(Y)\} - \mathbb{E}\{d_o(X)\} = \frac{\text{Var}\{d_o(X)\}}{\bar{d}} \geq 0;$$
2. a random follower $Z$ has more friends than a random node $X$, on average; i.e.,
$$\mathbb{E}\{d_i(Z)\} - \mathbb{E}\{d_i(X)\} = \frac{\text{Var}\{d_i(X)\}}{\bar{d}} \geq 0.$$
The remaining two variants of the friendship paradox state that (3) random friends have more friends than random nodes do, and (4) random followers have more followers than random nodes do (on average). In contrast to the first two variants of the paradox (Theorem 1), these require positive correlation between the in-degree and out-degree of nodes in the network (Theorem 2).
Theorem 2. Let $G = (V, E)$ be a directed network where the in-degree $d_i(X)$ and out-degree $d_o(X)$ of a random node $X$ are positively correlated. Then,
1. a random friend $Y$ has more friends than a random node $X$ does, on average; i.e.,
$$\mathbb{E}\{d_i(Y)\} - \mathbb{E}\{d_i(X)\} = \frac{\text{Cov}\{d_i(X), d_o(X)\}}{\bar{d}} \geq 0;$$
2. a random follower $Z$ has more followers than a random node $X$ does, on average; i.e.,
$$\mathbb{E}\{d_o(Z)\} - \mathbb{E}\{d_o(X)\} = \frac{\text{Cov}\{d_i(X), d_o(X)\}}{\bar{d}} \geq 0.$$
Theorem 2 states that in networks where the in- and out-degrees of a random node are positively correlated, (1) the expected number of friends of a random friend is greater than the expected number of friends of a random node, and (2) the expected number of followers of a random follower is greater than that of a random node.
Figure 1 illustrates the four variants of the friendship paradox in the subgraph of the directed social network of Twitter (see Methods). Specifically, it shows the fraction of individuals with a specific in-degree (or out-degree) that experiences the paradox. Note that this fraction is high: at least half of the users with fewer than 100 friends or followers observe that they are less popular and less well-connected than their friends and followers are, on average.
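As an informal numerical check of the four variants, the sketch below computes the gaps $\mathbb{E}\{d_o(Y)\} - \bar{d}$, $\mathbb{E}\{d_i(Z)\} - \bar{d}$, $\mathbb{E}\{d_i(Y)\} - \bar{d}$ and $\mathbb{E}\{d_o(Z)\} - \bar{d}$ on a hypothetical toy network, and compares them with $\text{Var}\{d_o(X)\}/\bar{d}$, $\text{Var}\{d_i(X)\}/\bar{d}$ and $\text{Cov}\{d_i(X), d_o(X)\}/\bar{d}$, which follow from the definitions of $X$, $Y$ and $Z$. The edge list and function names are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]


def expectation(values, weights):
    """Weighted mean of `values` under sampling proportional to `weights`."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


def paradox_gaps(edges):
    """Return the four friendship-paradox gaps and their closed forms."""
    out_deg = Counter(u for u, _ in edges)
    in_deg = Counter(v for _, v in edges)
    nodes = sorted(set(out_deg) | set(in_deg))
    d_o = [out_deg[v] for v in nodes]
    d_i = [in_deg[v] for v in nodes]
    uniform = [1] * len(nodes)
    d_bar = expectation(d_o, uniform)  # equals the mean in-degree as well

    gaps = {
        "friends_have_more_followers": expectation(d_o, d_o) - d_bar,    # E{d_o(Y)} - d_bar
        "followers_have_more_friends": expectation(d_i, d_i) - d_bar,    # E{d_i(Z)} - d_bar
        "friends_have_more_friends": expectation(d_i, d_o) - d_bar,      # E{d_i(Y)} - d_bar
        "followers_have_more_followers": expectation(d_o, d_i) - d_bar,  # E{d_o(Z)} - d_bar
    }
    # Closed forms: variance or covariance of the degrees divided by d_bar.
    var_o = expectation([x * x for x in d_o], uniform) - d_bar ** 2
    var_i = expectation([x * x for x in d_i], uniform) - d_bar ** 2
    cov = expectation([a * b for a, b in zip(d_i, d_o)], uniform) - d_bar ** 2
    closed = {
        "friends_have_more_followers": var_o / d_bar,
        "followers_have_more_friends": var_i / d_bar,
        "friends_have_more_friends": cov / d_bar,
        "followers_have_more_followers": cov / d_bar,
    }
    return gaps, closed


if __name__ == "__main__":
    gaps, closed = paradox_gaps(EDGES)
    for name in gaps:
        print(f"{name}: gap={gaps[name]:.4f} closed_form={closed[name]:.4f}")
```

On this toy network the in- and out-degrees happen to be positively correlated, so all four gaps are positive; with a negative covariance the last two would change sign while the first two would not.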

Perception Biases in Directed Networks
When nodes have distinguishing traits or attributes, the friendship paradox can bias perceptions of those attributes. For simplicity, we assume that each node has a binary-valued attribute ($f : V \to \{0, 1\}$). Such binary functions are useful for representing, among others, voting preferences (Democratic or Republican), demographic characteristics (female or male), contagions (infected vs susceptible), or the spread of information in networks (using a particular hashtag or not).

Global Perception Bias
The global prevalence of the attribute in a directed network is given by $\mathbb{E}\{f(X)\}$, the expected attribute value of a random node $X$. For example, when only 5% of nodes have the attribute ($f(v) = 1$), say because they tweeted about a topic, the global prevalence is $\mathbb{E}\{f(X)\} = 0.05$. However, nodes' perceptions of the prevalence of the attribute are determined by its value among their friends. In other words, nodes' perception of how prevalent the attribute is, is given by the expected attribute value of a randomly chosen friend $Y$: $\mathbb{E}\{f(Y)\}$. On Twitter this translates into how many people see the topic in their social feed, which aggregates posts made by friends. Under some conditions, the perceived prevalence of the attribute $\mathbb{E}\{f(Y)\}$ will be very different from its actual prevalence $\mathbb{E}\{f(X)\}$. We define this difference as the global perception bias:
$$B_{\text{global}} = \mathbb{E}\{f(Y)\} - \mathbb{E}\{f(X)\} = \frac{\text{Cov}\{f(X), d_o(X)\}}{\bar{d}} = \rho_{d_o,f}\,\frac{\sigma_{d_o}\,\sigma_f}{\bar{d}}, \tag{9}$$
where $\rho_{d_o,f}$ is the Pearson correlation coefficient between the out-degree and the attribute value of a random node, $\sigma_{d_o}$ is the standard deviation of the out-degree distribution, and $\sigma_f$ is the standard deviation of the binary attributes (see appendix for the derivation). When the attribute is positively correlated with out-degree ($\rho_{d_o,f} > 0$), a random friend's attribute is larger than the attribute value of a random node, on average. In undirected networks this effect is known as the generalized friendship paradox [10], and it has the same intuition: when popular people (i.e., those with many followers) are more likely to possess some trait ($\rho_{d_o,f} > 0$), that trait will be overrepresented among the friends of any individual. As a result, people will tend to overestimate the trait's prevalence. This may explain the observation that adolescents overestimate the number of smokers or heavy drinkers among their peers [2]. All that is required for the bias to arise is that peers who engage in risky behaviors tend to be more popular.
Note that the magnitude of the friendship paradox in Theorem 1, $\mathbb{E}\{d_o(Y)\} - \mathbb{E}\{d_o(X)\} = \sigma_{d_o}^2/\bar{d}$, increases with the standard deviation of the out-degree distribution ($\sigma_{d_o}$) and decreases with the average degree ($\bar{d}$). Global perception bias $B_{\text{global}}$ also increases with $\sigma_{d_o}$ and decreases with $\bar{d}$ when the correlation coefficient $\rho_{d_o,f}$ remains fixed. Hence, the friendship paradox amplifies global perception bias, increasing the deviation between the actual and observed prevalence of the attribute in the network.
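A small sketch may help make the global perception bias concrete. It computes $B_{\text{global}}$ on a hypothetical toy network in two ways: directly as $\mathbb{E}\{f(Y)\} - \mathbb{E}\{f(X)\}$, and through the identity $B_{\text{global}} = \text{Cov}\{f(X), d_o(X)\}/\bar{d}$, which follows from the definitions of $X$ and $Y$. The edge list, the trait assignment `TRAIT` and the function name are assumptions for illustration, not data from the study.

```python
from collections import Counter

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]
# Hypothetical binary attribute f(v); node 0 is the most-followed node.
TRAIT = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0}


def global_perception_bias(edges, trait):
    """Return B_global computed directly and via Cov{f(X), d_o(X)} / d_bar."""
    out_deg = Counter(u for u, _ in edges)
    nodes = sorted(trait)
    n, m = len(nodes), len(edges)
    d_bar = m / n

    prevalence = sum(trait[v] for v in nodes) / n  # E{f(X)}
    # A random friend Y is the source endpoint of a uniformly sampled link.
    perceived = sum(trait[u] for u, _ in edges) / m  # E{f(Y)}
    direct = perceived - prevalence

    cov = sum(trait[v] * out_deg[v] for v in nodes) / n - prevalence * d_bar
    return direct, cov / d_bar


if __name__ == "__main__":
    direct, via_cov = global_perception_bias(EDGES, TRAIT)
    print(f"B_global direct={direct:.4f}, via covariance={via_cov:.4f}")
```

In this toy example 40% of nodes have the trait, but because the most-followed node has it, about 56% of link sources do, so the trait looks more common than it is.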

Local Perception Bias
Since information in a directed network flows to individuals from their friends, their perceptions of the world are given by the values of the attribute among their friends. One problem with using $B_{\text{global}}$ to measure perception bias is that $\mathbb{E}\{f(Y)\}$ captures the expected attribute value among the friends of all individuals, rather than among the friends of a randomly chosen individual $X$. Therefore, we define an alternate measure of perception bias, local perception bias, that considers a node's perception of an attribute based on its expected value among its friends.
Formally, the perception $q_f(v)$ of a node $v \in V$ about the prevalence of an attribute $f$ is
$$q_f(v) = \frac{1}{d_i(v)} \sum_{u \in Fr(v)} f(u),$$
where $Fr(v)$ denotes the set of friends of $v$. Local perception bias is then the deviation of the expected perception of a random individual from the global prevalence:
$$B_{\text{local}} = \mathbb{E}\{q_f(X)\} - \mathbb{E}\{f(X)\}.$$
To help quantify this value, we define the attention that a node $v \in V$ allocates to each of her friends:
$$A(v) = \frac{1}{d_i(v)}.$$
This definition is motivated by the observation that users with more friends tend to receive more messages [16], making them less likely to see any specific friend's post [17]. This allows us to succinctly express the expected perception of a random node $X$ as
$$\mathbb{E}\{q_f(X)\} = \bar{d}\,\mathbb{E}\{f(U)A(V) \mid (U, V) \sim \text{Uniform}(E)\}.$$
Here, $\bar{d}$ is the expected number of friends of a random node, and $U$ and $V$ denote the endpoints of a link sampled uniformly from $E$. Intuitively, $\mathbb{E}\{f(U)A(V) \mid (U, V) \sim \text{Uniform}(E)\}$ represents the expected influence of an interaction along a link drawn at random from the network: i.e., the attribute $f(U)$ of the friend $U$ times the attention that the follower $V$ pays to that friend. Hence, the expected perception $\mathbb{E}\{q_f(X)\}$ of a random node $X$ is the product of the average number of interactions $\bar{d}$ and the average influence of an interaction.
The appendix notes show that local perception bias is non-negative, i.e., $B_{\text{local}} \geq 0$, which indicates local overestimation of the global prevalence, if the following conditions are met:
$$\text{Cov}\{f(X), d_o(X)\} \geq 0, \tag{13}$$
$$\text{Cov}\{f(U), A(V) \mid (U, V) \sim \text{Uniform}(E)\} \geq 0. \tag{14}$$
The first condition (Eq. 13) specifies non-negative correlation between the out-degree and the attribute of a random node, implying that popular nodes are more likely to have the attribute. The second condition (Eq. 14) specifies non-negative correlation between the attention of a follower and the attribute of a friend, suggesting that nodes with higher attribute values will appear as friends of nodes that follow few others. These two conditions are sufficient for non-negative local perception bias, leading individuals to overestimate the attribute's prevalence. Further, the two conditions (Eq. 13 and Eq. 14) also ensure that $B_{\text{local}} \geq B_{\text{global}} \geq 0$, as shown in Appendix Section 2. Hence, under these two conditions, local perception bias and global perception bias will both indicate overestimation of the global prevalence.
However, $B_{\text{global}}$ and $B_{\text{local}}$ can differ significantly in certain settings. For example, there exist situations where the two measures have different signs, with one measure suggesting overestimation and the other suggesting underestimation of an attribute's prevalence by individuals in the social network. In such cases, we propose using $B_{\text{local}}$ to measure the perception bias, as it takes more structural properties of the network into account. Further, global perception bias $B_{\text{global}}$ and local perception bias $B_{\text{local}}$ are equal if and only if the attribute $f(U)$ of $U$ and the attention $A(V)$ of $V$ for a random link $(U, V)$ are uncorrelated, i.e.,
$$\text{Cov}\{f(U), A(V) \mid (U, V) \sim \text{Uniform}(E)\} = 0,$$
as we show in the appendix notes A3.
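The relationship between the two bias measures can be explored numerically. The sketch below computes each node's perception as the fraction of its friends with the attribute, then $B_{\text{local}}$, $B_{\text{global}}$ and the link-level term $\bar{d}\,\text{Cov}\{f(U), A(V)\}$ with attention taken as the reciprocal of the in-degree. Under the assumption that every node has at least one friend, the definitions imply $B_{\text{local}} = B_{\text{global}} + \bar{d}\,\text{Cov}\{f(U), A(V)\}$, which is consistent with the equality condition stated above. The toy network, trait assignment and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]
# Hypothetical binary attribute f(v).
TRAIT = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0}


def local_perception_bias(edges, trait):
    """Return (B_local, B_global, d_bar * Cov{f(U), A(V)}) for a network.

    Assumes every node has at least one friend (non-zero in-degree).
    """
    friends = defaultdict(list)
    for u, v in edges:
        friends[v].append(u)
    nodes = sorted(trait)
    n, m = len(nodes), len(edges)

    # q_f(v): fraction of v's friends that have the attribute.
    perception = {v: sum(trait[u] for u in friends[v]) / len(friends[v]) for v in nodes}
    # A(v): attention v pays to each friend, taken as 1 / in-degree.
    attention = {v: 1 / len(friends[v]) for v in nodes}

    prevalence = sum(trait[v] for v in nodes) / n
    b_local = sum(perception.values()) / n - prevalence
    b_global = sum(trait[u] for u, _ in edges) / m - prevalence

    # Covariance of f(U) and A(V) over a uniformly sampled link (U, V).
    mean_f_u = sum(trait[u] for u, _ in edges) / m
    mean_a_v = sum(attention[v] for _, v in edges) / m
    mean_prod = sum(trait[u] * attention[v] for u, v in edges) / m
    link_term = (m / n) * (mean_prod - mean_f_u * mean_a_v)
    return b_local, b_global, link_term


if __name__ == "__main__":
    b_local, b_global, link_term = local_perception_bias(EDGES, TRAIT)
    print(f"B_local={b_local:.4f}  B_global={b_global:.4f}  link term={link_term:.4f}")
```

In this example the trait holders are followed mostly by nodes with few friends, so the link-level covariance is positive and the local bias exceeds the global one.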

Empirical Validation
To measure perception bias, we used data from Twitter (see Methods) to compare the actual and perceived popularity of various hashtags mentioned in text posts. We treat each hashtag $h$ as a binary attribute, with $f_h(v) = 1$ if a user $v$ used the hashtag $h$ in his or her posts. We focus on the most popular hashtags, each used by more than 1,000 people in our data set. The bulk of these hashtags were used by fewer than 2% of the people, with the most popular hashtags being used by just 8% of the people in the subgraph. Figure 2b shows the histogram of $B_{\text{local}}$ values for all hashtags. Although its peak is at zero, the distribution is skewed, with 865 hashtags having positive bias, meaning that they appear more popular than they really are. Which hashtags have the most bias? Figure 3 shows the top-20 and bottom-10 hashtags ranked by $B_{\text{local}}$. Among the most positively biased hashtags are those associated with social movements (#ferguson, #mikebrown, #michaelbrown), memes and current events (#icebucketchallenge, #alsicebucketchallenge, #ebola, #netneutrality), and sports and entertainment (#emmys, #robinwilliams, #sxsw, #applelive, #worldcup). For example, #ferguson, with $\mathbb{E}\{q_f(X)\} = 12.1\%$, is perceived as the most popular hashtag. While it is also one of the more widely-used hashtags, with $\mathbb{E}\{f(X)\} = 3.1\%$, perception bias makes it appear about four times more popular to Twitter users, on average, than it actually is. Interestingly, there are also hashtags with negative bias, indicating that they appear less popular than they actually are. Among these hashtags are Twitter conventions aimed at getting more followers (#tfb, #followback, #follow, #teamfollowback) or more retweets (#shoutout, #pjnet, #retweet, #rt). Many of these hashtags are actually among the top-20 most popular Twitter hashtags (#oscars, #tcot, #quote and #rt), but due to the structure of the network, they appear less popular to users. This occurs either because people who use these hashtags do not have many followers ($\text{Cov}\{f(X), d_o(X)\} < 0$), or because the attention of their followers is diluted, as they follow many others ($\text{Cov}\{f(U), A(V)\} < 0$). For example, for #oscars, both of the covariances are negative. The ranking of hashtags based on global bias is available in Figure 6.
[Figure caption: Hashtags can appear much more popular than they actually are (e.g., #ferguson) or less popular (e.g., #oscars) due to local perception bias.]

Node Perception Bias
At an individual level, the popularity of a hashtag $h$ among the friends of a user $v \in V$ is given by $q_{f_h}(v)$. The individual-level perception bias is then
$$B_h(v) = q_{f_h}(v) - \mathbb{E}\{f_h(X)\}.$$
Figure 4 shows the empirical distribution of $B_h(v)$ for all users and hashtags. Most of the mass of the histogram is at $B_h(v) > 0$, suggesting that most of the people in our data overestimate the popularity of these hashtags. Figure 4b compares the individual-level perception bias for two hashtags that have similar global prevalence: #nyc ($\mathbb{E}\{f(X)\} = 0.021$) and #rt ($\mathbb{E}\{f(X)\} = 0.019$). Of the two hashtags, #nyc is perceived as more popular than it is globally (with $B_{\text{local}} = 0.022$), whereas #rt appears less popular (with $B_{\text{local}} = -0.011$).

Estimating Global Prevalence via Polling
Polling estimates the global prevalence $\mathbb{E}\{f(X)\}$ of an attribute by sampling random individuals and averaging their answers to some question. The accuracy of a poll depends on two key factors: (i) the method of sampling individuals (the sampling distribution), and (ii) the question presented to them. We propose a practical polling algorithm (Algorithm 1) that differs from currently used polling algorithms in both aspects. First, our algorithm samples random followers (step 1 of Algorithm 1) instead of random individuals, as is done by most alternative methods. Second, instead of being asked about their own attribute, the sampled individuals are asked about their perception (step 2 of Algorithm 1): "What do you think is the fraction of individuals with attribute 1?" Consequently, we call the proposed algorithm the Follower Perception Polling (FPP) algorithm.

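Algorithm 1: Follower Perception Polling (FPP)
Input: sampling budget $b$.
1. Sample $b$ random followers $Z_1, \dots, Z_b$ (nodes drawn from $V$ proportional to in-degree).
2. Ask each sampled follower $Z_k$ for her perception $q_f(Z_k)$, i.e., the fraction of her friends with attribute 1.
3. Compute the estimate $\hat{f}_{\text{FPP}} = \frac{1}{b} \sum_{k=1}^{b} q_f(Z_k)$.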
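The polling procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: followers are sampled with replacement, each reports the exact fraction of her friends with the attribute, the estimate is the plain average of those reports, and every node is assumed to have at least one friend. The toy network, the trait assignment and the name `fpp_estimate` are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]
# Hypothetical binary attribute f(v).
TRAIT = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0}


def fpp_estimate(edges, trait, budget, rng):
    """Follower perception polling: average the perceptions of random followers."""
    friends = defaultdict(list)
    for u, v in edges:
        friends[v].append(u)

    total = 0.0
    for _ in range(budget):
        # The target endpoint of a uniformly sampled link is a random follower,
        # i.e. a node drawn with probability proportional to its in-degree.
        _, follower = rng.choice(edges)
        # The follower reports the fraction of her friends with the attribute.
        total += sum(trait[u] for u in friends[follower]) / len(friends[follower])
    return total / budget


if __name__ == "__main__":
    rng = random.Random(0)
    print("FPP estimate:", fpp_estimate(EDGES, TRAIT, budget=25, rng=rng))
```

With a large budget the estimate concentrates around $\mathbb{E}\{f(Y)\}$ rather than $\mathbb{E}\{f(X)\}$, which reflects the bias that the algorithm accepts in exchange for lower variance.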
As random followers have more friends than random nodes (on average), according to Theorem 1, the key idea behind the FPP algorithm is to sample individuals who have more friends. As a result, the variance of the perceptions of random followers will be smaller (compared to that of random nodes) and hence will result in a more accurate (lower mean-squared error) estimate of the global prevalence of the attribute. We show analytically (see Methods) that (i) the bias of the estimate $\hat{f}_{\text{FPP}}$ produced by the FPP algorithm is the same as the global perception bias $B_{\text{global}}$, and (ii) the variance of the estimate $\hat{f}_{\text{FPP}}$ is bounded above by a function of the correlation between out-degree and the attribute as well as spectral properties of the network (i.e., the second largest eigenvalue of the degree-discounted bibliographic coupling matrix).
The FPP algorithm assumes that every node has a non-zero in-degree and out-degree. To evaluate the polling algorithm, we extract from our dataset a subgraph of 5,409 Twitter users that satisfies this assumption. We use the polling algorithm to estimate the popularity of the 500 most frequent hashtags mentioned by users in this subgraph. We compare the performance of the proposed FPP algorithm on this induced subgraph to two alternative algorithms, giving three methods in total:
1. Intent Polling (IP): asks random users whether they used a hashtag (orange in Figure 5).
2. Node Perception Polling (NPP): asks random users what fraction of their friends used the hashtag (red in Figure 5).
3. Follower Perception Polling (FPP): asks random followers what fraction of their friends used the hashtag (green in Figure 5).
Node perception polling (NPP) differs from IP in terms of the question asked: random nodes are asked about their perception in NPP, whereas they are asked about their own attribute in IP. Follower perception polling (FPP) differs from NPP in terms of the sampling method: random followers are sampled (based on the friendship paradox) in FPP, whereas naive sampling of random nodes is used in NPP. Hence, comparing the performance of IP with NPP illustrates the effect of polling perceptions instead of attributes, and comparing the performance of FPP with NPP illustrates the effect of friendship-paradox-based perception polling in contrast to naive-sampling-based perception polling. Figure 5a shows the (empirical) squared bias of the three polling algorithms for a fixed sampling budget $b = 25$, which corresponds to querying 0.5% of the nodes. As asserted in Theorem 3 (see Methods), for each hashtag, the FPP estimate is biased by an amount equal to the $B_{\text{global}}$ value for that hashtag. Hence, IP, which yields an unbiased estimate, outperforms NPP and FPP in terms of bias for most hashtags, as illustrated in Figure 5a. However, as illustrated in Figure 5b, FPP produces an estimate with a smaller variance than both IP and NPP. Hence, in terms of the mean squared error (defined as $\text{MSE}\{T\} = \text{Bias}\{T\}^2 + \text{Var}\{T\}$ for an estimate $T$), the FPP estimate is more accurate than both the IP and NPP estimates for most hashtags, as illustrated in Figure 5c. Increasing the sampling budget decreases the performance gap between FPP and the other two algorithms (Figure 5d). However, even with $b = 250$ (5% of the nodes polled), FPP outperforms IP in more than 80% of the cases, and it outperforms NPP in more than 55% of the cases.
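The comparison above rests on the decomposition of mean squared error into squared bias plus variance. The following sketch shows how such a synthetic polling experiment could be set up: it repeats each poll many times on a hypothetical toy network and reports the empirical squared bias, variance and MSE of the three designs. Sampling with replacement, the toy data and all function names are assumptions of this example, and on a five-node network the ranking of the methods need not match the one reported for the Twitter subgraph.

```python
import random
from collections import Counter, defaultdict
from statistics import fmean, pvariance

# Hypothetical toy network: (u, v) means u is a friend of v, i.e. v follows u.
EDGES = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (2, 0), (3, 1), (4, 2), (1, 2)]
# Hypothetical binary attribute f(v).
TRAIT = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0}


def perceptions(edges, trait):
    """q_f(v): fraction of v's friends with the attribute (needs in-degree > 0)."""
    friends = defaultdict(list)
    for u, v in edges:
        friends[v].append(u)
    return {v: fmean([trait[u] for u in friends[v]]) for v in trait}


def poll(answers, nodes, weights, budget, rng):
    """Sample `budget` nodes with the given weights and average their answers."""
    sampled = rng.choices(nodes, weights=weights, k=budget)
    return fmean([answers[v] for v in sampled])


def compare_polls(edges, trait, budget, trials, seed=0):
    """Empirical (squared bias, variance, MSE) of IP, NPP and FPP."""
    rng = random.Random(seed)
    nodes = sorted(trait)
    in_deg = Counter(v for _, v in edges)
    q = perceptions(edges, trait)
    truth = fmean([trait[v] for v in nodes])  # global prevalence E{f(X)}

    uniform = [1] * len(nodes)
    by_in_degree = [in_deg[v] for v in nodes]
    designs = {
        "IP": (trait, uniform),    # own attribute, random nodes
        "NPP": (q, uniform),       # perception, random nodes
        "FPP": (q, by_in_degree),  # perception, random followers
    }
    results = {}
    for name, (answers, weights) in designs.items():
        estimates = [poll(answers, nodes, weights, budget, rng) for _ in range(trials)]
        bias_sq = (fmean(estimates) - truth) ** 2
        variance = pvariance(estimates)
        results[name] = (bias_sq, variance, bias_sq + variance)
    return results


if __name__ == "__main__":
    for name, (b2, var, mse) in compare_polls(EDGES, TRAIT, budget=5, trials=4000).items():
        print(f"{name}: bias^2={b2:.4f} var={var:.4f} mse={mse:.4f}")
```

Running the same loop per hashtag on a larger graph, and counting how often each design attains the lowest MSE, would mirror the kind of comparison summarized in Figure 5.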

Discussion
Social networks can have surprising, even counter-intuitive behaviors. For example, previous work has shown that the "majority illusion" may lead people to observe that the majority of their friends has some attribute, even when it is globally rare [9]. The illusion is created by the friendship paradox, which can also bias the observations individuals make in directed networks in non-obvious ways. Our analysis identifies the conditions under which friendship paradox can distort how popular some attribute or behavior (e.g., drinking, smoking, etc.) is perceived to be, making it appear several times more prevalent than it actually is. Specifically, the following two conditions amplify local perception bias: (1) positive correlation between an individual's attribute and popularity (number of followers in a directed network) and (2) positive correlation between the attributes of individuals and the attention of their followers. The first condition suggests that bias exists when popular people (i.e., those followed by many others and hence more visible) have the attribute, for example, engaging in risky behavior, having a specific political affiliation, or simply using a particular hashtag. Their influence is amplified when they are followed or seen by "good listeners", i.e., people who follow fewer others and thus are able to pay more attention to the influentials.
We validated these findings empirically using data from the Twitter social network. We measured perceptions of popularity of hashtags, i.e., words or phrases preceded by a '#' sign that are frequently used to identify topics on Twitter. Such hashtags serve many important functions, from organizing content, to expressing opinions, to linking topics and people. We measured a hashtag's global prevalence as the fraction of all people using it, and its perceived popularity as the fraction of friends using it. Our analysis identified hashtags that appeared several times more popular than they actually were, due to local perception bias. Such hashtags were associated with social movements, memes and current events. Interestingly, as our data was collected in 2014, some of the most biased hashtags were #icebucketchallenge and #alsicebucketchallenge, the explosively popular Ice Bucket Challenge. Perception bias could have potentially amplified their spread, as well as the spread of other costly behaviors that require social proof [18]. For example, the #MeToo movement has grown into an international campaign to end sexual harassment and assault in the workplace by highlighting just how endemic the problem is. It spread through online social networks as women posted their own stories of harassment using the hashtag #metoo. Perception bias may have amplified the spread of such hashtags by making them appear more common and thus easier to use.
We also presented an algorithm that leverages friendship paradox in directed networks to efficiently (in a mean-squared error sense) estimate the true prevalence of an attribute. In essence, the idea behind the algorithm is that perceptions of random followers should have a smaller variance compared to the perceptions of random individuals. This is because random followers are more informed than random individuals (according to friendship paradox). It was shown that the variance of this algorithm is bounded by a function of the second largest eigenvalue of the degree-discounted bibliographic coupling matrix and the correlation between the out-degree and the attribute. Empirical results illustrate that the proposed algorithm outperforms other widely used polling algorithms.
Our work suggests that one way to mitigate perception bias is to alter the local network topology to allow more information to reach the low-attention users. This opens up new research avenues on how link recommendation can alleviate perception bias. However, our empirical study has limitations, namely, the nature of the subsample of the network we studied. Social networks are huge, necessitating analysis of subgraphs sampled from the entire network. However, by leaving out some nodes, the data collection process itself may distort the properties of the sample. Specifically, since we observed only the outgoing links from the seed nodes, we do not have information about the followers of these nodes. Addressing the limitations of analysis imposed by sampling is an important research direction. Despite this limitation, our work shows that the friendship paradox can lead to surprising biases, especially in directed networks, and suggests potential strategies for mitigating these biases.

Data
The dataset used in this study was collected from Twitter in 2014. We started with a set of 100 users who were active in discussing ballot initiatives during the 2012 California election and expanded this set by retrieving the accounts of the individuals they followed, reaching a total of 5,599 users. We refer to these individuals as seed users. Next, we identified all friends of the seed users, collecting all directed links that start with one of the seed users. We then collected all posts made by the seed users and their friends (over 600K users in total) over the period June-November 2014. The posts include their activity, i.e., tweets and retweets. These tweets mention more than 18M hashtags. With this data-collection approach, seed users are fully observed (their activity and what they see in their social feeds), and their friends are only partially observed (only their activity).
Table 1 reports properties of the Twitter dataset, considering only the seed users and using the variables defined in Section 2.1. Note that the average degree $\bar{d}$ (where $\bar{d} = \mathbb{E}\{d_o(X)\} = \mathbb{E}\{d_i(X)\}$) is relatively large at 123.55. However, since the distributions of the in- and out-degree are highly heterogeneous, the variance of the in- and out-degrees is relatively large (two orders of magnitude larger than $\bar{d}$). The covariance between the in- and out-degrees of nodes is also relatively large, with a correlation coefficient [...]. Due to the relatively large variance (compared to $\bar{d}$) of the in- and out-degree distributions, the expected out-degree of a random friend ($\mathbb{E}\{d_o(Y)\}$) and the expected in-degree of a random follower ($\mathbb{E}\{d_i(Z)\}$) are larger than the average degree $\bar{d}$, as stated in Theorem 1. Note also that, due to positive covariance between the in- and out-degrees of nodes, the expected in-degree of a random friend ($\mathbb{E}\{d_i(Y)\}$) and the expected out-degree of a random follower ($\mathbb{E}\{d_o(Z)\}$) are also larger than $\bar{d}$, as stated in Theorem 2.
The accuracy of a poll depends on the method of sampling respondents and the question asked of them. For example, in the case of estimating an election outcome, asking people "Who do you think will win?" (expectation polling) is better than asking "Who will you vote for?" (intent polling) [19]. This is because in expectation polling, an individual names the candidate who is more popular among her friends, thus summarizing the views of a number of individuals in the social network, rather than providing only her own voting intention. Our polling algorithm is motivated by [20,19,21], which show that polling methods asking individuals to summarize information in their neighborhood outperform polling methods that ask only about the attribute of each individual. The authors of [20] studied the polling problem analytically in the context of an undirected network and proposed a method to obtain an unbiased estimate of the global prevalence with bounds on its variance. The analysis of Algorithm 1 for directed graphs is motivated by these results in [20] for undirected social networks. The authors of [21] proposed asking randomly sampled neighbors (instead of random nodes) on undirected social networks the simple question "What fraction of your neighbors have the attribute 1?" (neighborhood expectation polling). In this case, sampled individuals provide the average opinion among their neighbors. Further, since random friends have more friends than random individuals (by the friendship paradox for undirected graphs), this approach yields an estimate with a smaller variance than asking random nodes the same question. Motivated by these works, Algorithm 1 exploits the friendship paradox on directed networks to obtain a statistically efficient estimate of the global prevalence of an attribute using the biased perceptions of random followers.
Analysis of the FPP Algorithm Recall that in order to reduce the variance, the FPP algorithm polls perceptions q f (Z) of random followers Z instead of attributes f (X) of random individuals X. However, it is not guaranteed that the estimatef FPP will be unbiased. The following result shows that the bias of the FPP algorithm is the same as the global perception bias B global .
Theorem 3. The bias of the estimate $\hat{f}_{FPP}$ computed in Algorithm 1 is equal to the global perception bias $B_{global}$, i.e., $\mathrm{Bias}\{\hat{f}_{FPP}\} = E\{\hat{f}_{FPP}\} - E\{f(X)\} = B_{global}$.
Hence, the same factors (specified in Eq. (9)) that increase (decrease) the global perception bias will increase (decrease) the bias of the estimate $\hat{f}_{FPP}$ produced by the FPP algorithm. The aim of the FPP algorithm is to compensate for its bias $B_{global}$ with a reduced variance and thereby achieve a smaller mean squared error. We also highlight that Algorithm 1 can be modified to generate an unbiased estimate, $\hat{f}_{UnbiasedFPP}$ (Eq. (16)). This unbiased estimate is based on the concept of social sampling proposed in [20] for undirected social networks, where queried individuals provide a weighted value of their friends' attributes in a manner that results in an unbiased estimate. It is useful in contexts where unbiasedness is preferred over mean squared error as the performance criterion. However, it does not lead to an algorithm as intuitive and easy to implement as Algorithm 1, since the modified estimate $\hat{f}_{UnbiasedFPP}$ requires each sampled individual to calculate a weighted average of the attributes of her neighbors.
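The claim of Theorem 3 can be checked exactly on a small network. The sketch below makes two assumptions that are not spelled out in this section: a random friend $Y$ and a random follower $Z$ are taken to be the two ends of a uniformly sampled link, and the global perception bias is taken to be $E\{f(Y)\} - E\{f(X)\}$. The function name `fpp_exact_bias` is hypothetical.

```python
import numpy as np

def fpp_exact_bias(A, f):
    """Exact bias of polling q_f(Z) over random followers Z.

    A[u, v] = 1 means there is a link from u to v, i.e. v follows u.
    f is a 0/1 attribute vector. Every node is assumed to have a friend.
    """
    A = np.asarray(A, dtype=float)
    f = np.asarray(f, dtype=float)
    d_in = A.sum(axis=0)     # number of friends of each node
    d_out = A.sum(axis=1)    # number of followers of each node
    M = A.sum()              # number of links

    q = (A.T @ f) / d_in     # q_f(v): fraction of v's friends with f = 1
    p_follower = d_in / M    # P(Z = v), follower end of a random link (assumed)
    p_friend = d_out / M     # P(Y = u), friend end of a random link (assumed)

    bias_fpp = p_follower @ q - f.mean()   # E{q_f(Z)} - E{f(X)}
    b_global = p_friend @ f - f.mean()     # E{f(Y)} - E{f(X)} (assumed definition)
    return bias_fpp, b_global

# Toy network: node 0 is followed by nodes 1, 2, 3; node 0 follows node 1.
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
print(fpp_exact_bias(A, [1, 0, 0, 0]))
```

Under these assumptions the two returned values coincide on any network in which every node has at least one friend, because weighting $q_f(v)$ by the in-degree of $v$ cancels the normalization inside $q_f$.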
Before analyzing the variance of the estimate $\hat{f}_{FPP}$ produced by Algorithm 1, we digress briefly to review the bibliographic coupling matrix. Bibliographic coupling originated from the analysis of citation networks [22] and is used to symmetrize a directed graph by transforming it into an undirected graph for purposes such as clustering. The bibliographic coupling matrix $B$ of a directed graph with adjacency matrix $A$ is defined as $B = AA^T$. Hence, the weight of the link between nodes $i, j$ in the new undirected graph is $B(i, j) = \sum_{v \in V} A(i, v)A(j, v)$, which is the number of mutual followers of $i$ and $j$, i.e., the number of individuals who follow both nodes. This conveys the similarity of $i, j$ in terms of the number of mutual followers. However, when determining the similarity of two nodes $i, j$ using $B$, a mutual follower with a large number of friends (a likely scenario) is weighted the same as a mutual follower with a small number of friends (a rarer scenario). The latter type of mutual follower should be given more weight than the former when evaluating the similarity of two nodes. Similarly, the number of followers of $i$ and $j$ should also be taken into consideration when assessing their similarity. Based on these observations, [26] proposed the degree-discounted bibliographic coupling matrix $B_d$, which discounts the contributions of the nodes $i, j$ by their out-degrees (the number of followers) and of each mutual follower $k$ by her in-degree (the number of friends). Please see [26,27] for more details on the degree-discounted bibliographic coupling.
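The two coupling matrices can be illustrated on a toy graph. The plain matrix $B = AA^T$ follows the definition above. The discounted variant is only a sketch: the form $D_o^{-\alpha} A D_i^{-\beta} A^T D_o^{-\alpha}$ and the default exponents are assumptions for illustration, so other exponents can be passed in to match [26]. Function names are hypothetical.

```python
import numpy as np

def bibliographic_coupling(A):
    """B = A A^T; B[i, j] counts the mutual followers of i and j."""
    A = np.asarray(A, dtype=float)
    return A @ A.T

def degree_discounted_coupling(A, alpha=0.5, beta=1.0):
    """ASSUMED form: D_o^{-alpha} A D_i^{-beta} A^T D_o^{-alpha}.

    A[u, v] = 1 means v follows u. The exponents are illustrative defaults.
    """
    A = np.asarray(A, dtype=float)
    d_out = A.sum(axis=1)    # followers of each node
    d_in = A.sum(axis=0)     # friends of each node
    # Zero-degree nodes have all-zero rows or columns in A, so replacing
    # their degree by 1 avoids division by zero without changing the result.
    w_out = np.where(d_out > 0, d_out, 1.0) ** (-alpha)
    w_in = np.where(d_in > 0, d_in, 1.0) ** (-beta)
    left = w_out[:, None] * A * w_in[None, :]
    return (left @ A.T) * w_out[None, :]

# Nodes 0 and 1 are both followed by node 2; node 0 is also followed by node 3.
A = np.zeros((4, 4))
A[0, 2] = A[0, 3] = A[1, 2] = 1
print(bibliographic_coupling(A)[0, 1])       # one mutual follower (node 2)
print(degree_discounted_coupling(A)[0, 1])
```

In the discounted matrix, the single mutual follower (node 2) contributes less than a full unit because she has two friends, and the entry is further scaled down by the out-degrees of nodes 0 and 1.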
Returning to the analysis of the estimate $\hat{f}_{FPP}$ of Algorithm 1, the following result gives an upper bound on the variance of this estimate under certain conditions on the structure of the network.
where $M = \sum_{v \in V} d_i(v)$, $\lambda_2$ is the second largest eigenvalue of $B_d$, and $f$ is the $N \times 1$ vector of binary attributes.
Theorem 4 shows that the variance of the friendship paradox based polling Algorithm 1 depends on the correlation between the out-degrees and the attributes, through $\|D_o^{1/2} f\|^2$, and on the structure of the graph, through the second largest eigenvalue $\lambda_2$ of the matrix $B_d$. Specifically, a smaller $\lambda_2$ implies that the bibliographic coupling network has good expansion (i.e., an absence of bottlenecks) [28]. Hence, if the nodes in the network $G = (V, E)$ cannot be clustered into distinct groups based on their mutual followers (i.e., bibliographic similarity), then the variance of the algorithm will be smaller (due to a smaller $\lambda_2$).
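The two quantities that Theorem 4 depends on can be computed directly for a small network. The sketch below assumes $B_d = D_o^{-1/2} A D_i^{-1} A^T D_o^{-1/2}$, which is one form consistent with the description of degree discounting but is not confirmed by this section, and it assumes every node has at least one friend and one follower. The function name is hypothetical.

```python
import numpy as np

def variance_ingredients(A, f):
    """Return (lambda_1, lambda_2, ||D_o^{1/2} f||^2) for a directed graph.

    A[u, v] = 1 means v follows u; f is a 0/1 attribute vector.
    ASSUMED: B_d = D_o^{-1/2} A D_i^{-1} A^T D_o^{-1/2}.
    """
    A = np.asarray(A, dtype=float)
    f = np.asarray(f, dtype=float)
    d_out = A.sum(axis=1)
    d_in = A.sum(axis=0)
    if (d_out == 0).any() or (d_in == 0).any():
        raise ValueError("sketch assumes every node has a friend and a follower")

    # S = D_o^{-1/2} A D_i^{-1/2}, so that B_d = S S^T is symmetric.
    S = A / np.sqrt(d_out)[:, None] / np.sqrt(d_in)[None, :]
    B_d = S @ S.T

    eigenvalues = np.linalg.eigvalsh(B_d)   # sorted in ascending order
    lambda_1, lambda_2 = eigenvalues[-1], eigenvalues[-2]

    # ||D_o^{1/2} f||^2 = sum_v d_o(v) f(v)^2
    weighted_norm = float(d_out @ (f * f))
    return lambda_1, lambda_2, weighted_norm

# Directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0 with one extra link 0 -> 2.
A = np.zeros((4, 4))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[u, v] = 1
print(variance_ingredients(A, [1, 0, 0, 1]))
```

With this assumed normalization the largest eigenvalue equals 1, so $\lambda_2$ lies between 0 and 1 and measures how close the coupling network is to splitting into weakly connected groups.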
1. a random friend $Y$ has more followers than a random node $X$, on average; i.e., $E\{d_o(Y)\} \geq E\{d_o(X)\}$;
2. a random follower $Z$ has more friends than a random node $X$, on average; i.e., $E\{d_i(Z)\} \geq E\{d_i(X)\}$.
Proof. Part 1: Proof of part 2 follows using similar arguments.
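The argument for Part 1 can be sketched as follows. The sketch assumes that a random friend $Y$ is the friend end of a uniformly sampled link, so that $P(Y = v) = d_o(v)/M$ with $M = \sum_{v \in V} d_o(v)$, and that $X$ is uniform over the $N$ nodes. It is a reconstruction of the standard argument and may differ from the authors' derivation in presentation.

```latex
\begin{align*}
E\{d_o(Y)\} &= \sum_{v \in V} d_o(v)\,\frac{d_o(v)}{M}
             = \frac{\frac{1}{N}\sum_{v \in V} d_o(v)^2}{\frac{1}{N}\sum_{v \in V} d_o(v)}
             = \frac{E\{d_o(X)^2\}}{E\{d_o(X)\}}, \\
E\{d_o(Y)\} - E\{d_o(X)\}
            &= \frac{E\{d_o(X)^2\} - E\{d_o(X)\}^2}{E\{d_o(X)\}}
             = \frac{\mathrm{Var}\{d_o(X)\}}{E\{d_o(X)\}} \;\geq\; 0 .
\end{align*}
```

Equality holds only when every node has the same out-degree.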

A2 Proof of Theorem 2
Theorem. Let $G = (V, E)$ be a directed network where the in-degree $d_i(X)$ and out-degree $d_o(X)$ of a random node $X$ are positively correlated. Then,
1. a random friend $Y$ has more friends than a random node $X$ does, on average; i.e., $E\{d_i(Y)\} > E\{d_i(X)\}$;
2. a random follower $Z$ has more followers than a random node $X$ does, on average; i.e., $E\{d_o(Z)\} > E\{d_o(X)\}$.
Proof. Part 1: Hence, positive correlation ($\mathrm{Cov}\{d_i(X), d_o(X)\} > 0$) between the in-degree $d_i(X)$ and out-degree $d_o(X)$ of a random individual $X$ implies that $E\{d_i(Y)\} > E\{d_i(X)\}$. The proof of part 2 follows using similar arguments.
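The step leading to the conclusion of Part 1 can be sketched as follows, again assuming that a random friend $Y$ is the friend end of a uniformly sampled link, so that $P(Y = v) = d_o(v)/M$. The sketch uses the fact that the mean in-degree and mean out-degree both equal $M/N$, since every link contributes one unit to each.

```latex
\begin{align*}
E\{d_i(Y)\} &= \sum_{v \in V} d_i(v)\,\frac{d_o(v)}{M}
             = \frac{E\{d_i(X)\,d_o(X)\}}{E\{d_o(X)\}}, \\
E\{d_i(Y)\} - E\{d_i(X)\}
            &= \frac{E\{d_i(X)\,d_o(X)\} - E\{d_i(X)\}\,E\{d_o(X)\}}{E\{d_o(X)\}}
             = \frac{\mathrm{Cov}\{d_i(X), d_o(X)\}}{E\{d_o(X)\}} .
\end{align*}
```

The sign of the difference is therefore the sign of the covariance, which is why this variant of the paradox requires correlated in-degree and out-degree.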

A3 Derivation of B local
Let $Y$ denote a uniformly sampled friend of a random node $X$. Further, let $A_{uv}$ denote the element $(u, v)$ of the adjacency matrix of the network: $A_{uv} = 1$ if there is a link pointing from $u$ to $v$ and $A_{uv} = 0$ otherwise. Then, by definition of the function $q_f$ in Section 2 of the main text, Therefore, which proves the first statement. Next