Investigating and modeling the dynamics of long ties

Long ties, the social ties that bridge different communities, are widely believed to play crucial roles in spreading novel information in social networks. However, some existing network theories and prediction models indicate that long ties might dissolve quickly or eventually become redundant, thus putting into question the long-term value of long ties. Our empirical analysis of real-world dynamic networks shows that, contrary to such reasoning, long ties are more likely to persist than other social ties, and that many of them constantly function as social bridges without being embedded in local networks. Using a cost-benefit analysis model combined with machine learning, we show that long ties are highly beneficial, which instinctively motivates people to expend extra effort to maintain them. This partly explains why long ties are more persistent than what has been suggested by many existing theories and models. Overall, our study suggests the need for social interventions that can promote the formation of long ties, such as mixing people with diverse backgrounds.
Social network analysis has long been involved in studying social ties. The authors study the dynamics of long ties using a mobile phone dataset, finding that long ties typically persist for longer than short ties, suggesting implications for social intervention to support long ties.


Introduction
Social network analysis provides a powerful instrument to investigate the structure of society by aggregating interpersonal relationships among individuals [1][2][3][4][5] . In the social network literature, a large body of research centers on how tightly clustered social ties and groups are formed, as well as how they evolve, spread information and behaviors, and promote group solidarity [6][7][8][9][10][11][12] . Meanwhile, a smaller but increasing number of studies focus on weak ties, which may function as "bridges" between different communities because of the unique roles they play in global network structures and information diffusion 1, 13-20 .
One recent development in the literature is the concept of "long ties." These are social ties that have a large tie range, which is measured by the length of the second shortest path between two connected nodes (see Fig. 1). Long ties work as important social network bridges between different communities [21][22][23][24][25][26]. Structurally, long ties may be considered to be weak ties, as they are not positioned in a "cohesive embedded network" where individuals can easily contact or spend time with common neighbors 14,22,27. Yet, despite the seeming weakness (in terms of low frequency or intensity of contact) of long ties, many studies have shown that long ties are crucial for the widespread dispersion of novel information and contagious behaviors 1,14,18,25,[28][29][30]. Relatedly, these bridges may have other special characteristics, such as exhibiting a higher level of direct reciprocity 31.
Still, one crucial perspective lacking in the literature of long ties is the dynamics. Evidence from static social networks may not be generalizable to dynamic networks 32 . In particular, existing social network theories and prediction models may indirectly imply that long ties should dissolve quickly or eventually become redundant, thus putting into question the long-term value of long ties.
The critical role of long ties would be challenged if empirical evidence from dynamic networks suggests that long ties tend to dissolve or become short ties. Firstly, it is possible that long ties may dissolve rapidly. According to various theories 14,27 and prediction models 9,33, social ties are likely to dissolve quickly when they lack sufficient common neighbors to reinforce their relationships or when they have few interactions (i.e., interactions with weak tie strength). Long ties likely satisfy this condition, and thus their role in bridging different communities might be limited 16. Secondly, long ties may evolve to become redundant "short ties." By triadic closure 33,34, a person may introduce other friends to their long ties, thereby forming common neighbors and switching the long tie to a short tie. Therefore, two people who had a long tie may become increasingly similar, for example, regarding the information they digest or the opinions they hold 35. Eventually, the previously long tie becomes largely redundant, as there now exist other paths where the same piece of novel information can flow between the two individuals 27,36.
Our study combines empirical analysis and computational modeling to provide a dynamic perspective of long ties. First, using two-year social network data, we find that contrary to what is implied by existing theories and models, not only are long ties more likely to persist than shorter-range ties but also that many of them continue to be long ties. To explain this finding, we propose three possible hypotheses: degree heterogeneity, survival bias, and valuable long ties 37,38. Investigating these hypotheses, we empirically show that the first two mechanisms might not fully explain our main results.
Next, we propose a cost-benefit analysis model to support our last hypothesis: that individuals spend extra effort to maintain long ties because these ties are highly beneficial, providing novel information or different expertise. The model combines strategic network formation models from the game theory literature 3,7 and node embedding techniques in machine learning [39][40][41] to simulate the dynamics of social networks. This interdisciplinary approach has been shown to be effective in trading off the model's power to explain mechanisms against its power to predict 42.
Our model describes the social tie formation process as a result of a meeting procedure and a subsequent rational decision procedure. We verify the model by utilizing real-world data. Ultimately, we find that our model partly explains the persistency of long ties, which is the main conclusion of our empirical analysis.

Long ties last longer
In this work, we employ tie range to characterize the local network structure of a social tie. As the length of our data is two years, we partition the data into eight phases; our results are robust to other ways of partitioning, as well (see Supplementary Note 2). To begin our analysis, we classify all social ties by tie range in the first phase, and then, we observe the evolution of those ties in the subsequent phases.
First, we examine the dynamics of tie strength, which is measured by interaction frequency (the number of calls or texts) and interaction duration (the total duration of the calls). We define $y_t$ as the interaction frequency or duration in phase $t$. We present $E[y_t \mid y_1 > 0]$ in the eight phases, as shown in Fig. 2. This conditional expectation indicates that we focus our analysis on ties that already exist in phase 1 (see Supplementary Note 4). Observing the magnitudes in just the first phase, we find a "U-shape" in the data that is consistent with the results of the prior work 25. Our result shows that interaction frequency and duration initially decrease with the tie range, but later increase with the tie range. In particular, long ties (tie range ≥ 6) appear to be as intimate as short ties with tie range = 2, in that the average interaction frequency or duration for these two types of ties is close in the first phase.
By comparing the dynamics of short ties and long ties in Fig. 2, we find that long ties continue to be stronger. For example, in the long run, the average interaction duration and frequency of social ties with a tie range ≥ 6 appear to be even slightly larger than those with a tie range of 2. Furthermore, social ties with a tie range of 5 also appear to be stronger than ties with a tie range of 3 or 4. In Supplementary Note 2, we discuss the robustness of our findings by adjusting the time window that determines the length of each phase.
Fig. 2: Dynamics of tie strength given initial tie range. Tie strength is measured by interaction duration (the total call volume in seconds) and interaction frequency (the number of calls or texts). Each phase represents a season (three months). We take logarithms (log) for both interaction duration and frequency. All ties are classified according to their tie range in the first phase. The curves represent (a) the average (log) interaction frequency and (b) the average (log) interaction duration conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the mean log interaction duration and frequency (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
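The error bars described for Fig. 2 are normal-approximation confidence intervals for a mean of log-transformed values. The following is a minimal sketch of that calculation, not the authors' code: the function name `mean_ci95` and the sample durations are hypothetical, and the sketch assumes strictly positive values so that the logarithm is defined.

```python
import math

def mean_ci95(values):
    """Return (mean, lower, upper) of a normal-approximation 95% CI for the mean."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    # Unbiased sample variance, then the standard error of the mean.
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, mean - half_width, mean + half_width

# Hypothetical call durations (seconds) for ties in one tie-range class.
durations = [30, 120, 600, 45, 900]
log_durations = [math.log(d) for d in durations]
mean, lower, upper = mean_ci95(log_durations)
```

With large groups of ties the interval becomes very narrow, which matches the remark that some error bars are smaller than the data point markers.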
To understand what mechanisms drive the patterns above, we decompose the dynamics of interaction frequency or duration into persistence probability and interaction increments. We let the difference in the interaction frequency or duration between phase $t$ and phase 1 be $\Delta y_t = y_t - y_1$. Then, we define the persistence probability as $P[y_t > 0 \mid y_1 > 0]$ and the interaction increments as $E[\Delta y_t \mid y_t > 0, y_1 > 0]$.
The dynamics of the persistence probability and interaction increments are presented in Fig. 3. We find that long ties have the largest persistence probability in all subsequent phases, followed by closely embedded ties with a tie range of 2. Meanwhile, we find that social ties with a mid-sized tie range (i.e., 3 or 4) dissolve the fastest. This pattern is consistent with the overall effect presented in Fig. 2.
Regarding the interaction increments, we find that they generally increase with tie range. This indicates that, conditional on a persistent social tie, the interaction frequency and duration appear to be larger when there is a long tie. By contrast, social ties with a tie range of 2 have the smallest interaction increments. From this, we conjecture that persistent short ties typically require less effort to maintain, as they can be indirectly maintained through their common friends; by contrast, we speculate that long ties require a lot of time investment in order to be maintained.
Fig. 3: Each phase represents a season (three months). All ties are classified according to their tie range in the first phase. The curves represent (a) the probability of persisting, (b) the average (∆ log) interaction duration, and (c) the average (∆ log) interaction frequency conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the means (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
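The decomposition into persistence probability and interaction increments can be illustrated with a small sketch. It assumes that a tie "persists" in phase t when its interaction total in that phase is non-zero, and that increments are measured on the log scale for persisting ties only; the function name and the toy data are hypothetical and are not taken from the dataset.

```python
import math

def persistence_and_increment(series, t):
    """series: dict tie -> list of per-phase totals (index 0 is phase 1).

    Returns (persistence probability, mean log increment) for phase t,
    both conditional on the tie existing in phase 1.
    """
    base = [s for s in series.values() if s[0] > 0]   # ties present in phase 1
    if not base:
        raise ValueError("no tie exists in phase 1")
    alive = [s for s in base if s[t - 1] > 0]         # still active in phase t
    persistence = len(alive) / len(base)
    if not alive:
        return persistence, None
    increment = sum(math.log(s[t - 1]) - math.log(s[0]) for s in alive) / len(alive)
    return persistence, increment

# Hypothetical interaction frequencies for four ties over two phases.
ties = {
    ("a", "b"): [10, 20],  # persists and grows
    ("a", "c"): [5, 0],    # dissolves
    ("b", "c"): [0, 7],    # absent in phase 1, so excluded
    ("c", "d"): [4, 4],    # persists, unchanged
}
p, inc = persistence_and_increment(ties, t=2)
```

In practice the same function would be applied separately to each tie-range class defined in phase 1.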

Many long ties are persistently long
Next, we investigate the dynamics of tie range. We first examine the dynamic trends of tie range in the first two phases by analyzing the social ties that exist in both phases. We present the transition probability matrix between tie ranges in the left panel of Fig. 4. We find that long ties have a large likelihood of evolving into short ties. In particular, for longer ties, i.e., those with a tie range of 5 or ≥ 6, the most likely outcome is a tie range of 2, with a probability of 32% or 36%, respectively. Few short ties become long ties, since such an evolution requires every common neighbor to lose its tie with at least one of the two endpoints. In addition, long ties appear to be a stable status. For example, a social tie with a tie range ≥ 6 in phase 1 has a probability of 34% or 15% of having a tie range of 5 or ≥ 6 in phase 2, respectively.
We further analyze the tie range dynamics in phase 4 and phase 8, which are presented in the middle and right panels of Fig. 4. We find the patterns in phases 4 and 8 are largely consistent with the pattern in phase 2. In particular, for those with a tie range = 5 or ≥ 6 in phase 1, they have a probability of 26% or 38%, respectively, to persist with a tie range ≥ 5 in phase 4; they also have a probability of 41% or 52%, respectively, to persist with a tie range ≥ 5 in phase 8. These results indicate that although long ties have a high probability of becoming short ties, they can also persist as long ties. This finding suggests that it is not necessary for a social tie to become a short-range tie to be long-lasting.
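A transition probability matrix of this kind can be estimated by counting, for ties present in both phases, how often each initial tie-range class ends up in each later class, and then normalizing each row. The sketch below is illustrative only; the function name, the class labels, and the toy ties are assumptions.

```python
from collections import Counter, defaultdict

def transition_matrix(range_first, range_later):
    """Row-normalized transition probabilities between tie-range classes.

    Both arguments map a tie to its tie-range label in one phase.
    Only ties that exist in both phases are counted.
    """
    counts = defaultdict(Counter)
    for tie, r0 in range_first.items():
        if tie in range_later:
            counts[r0][range_later[tie]] += 1
    return {
        r0: {r1: c / sum(row.values()) for r1, c in row.items()}
        for r0, row in counts.items()
    }

# Hypothetical tie-range labels; tie "e5" dissolves before the later phase.
phase1 = {"e1": "2", "e2": ">=6", "e3": ">=6", "e4": "5", "e5": ">=6"}
phase2 = {"e1": "2", "e2": "2", "e3": ">=6", "e4": "5"}
matrix = transition_matrix(phase1, phase2)
```

Each row sums to one, so an entry such as `matrix[">=6"]["2"]` reads as the share of surviving ties with an initial range of ≥ 6 that later have a range of 2.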
Next, we proceed to jointly investigate tie range and tie strength (i.e., the frequency and the total duration of interactions). As shown in Fig. 5, in general, those ties that become short-range (e.g., tie range = 2) are those with more interactions; for social ties that have an arbitrary initial tie range but later change to a tie range of 2, the interaction frequency and duration are always the greatest. For the persistence probability, the same trend generally holds. The one exception here is for those with a tie range ≥ 6: if they continue to be social ties with a tie range ≥ 6, their tie strength remains strong. Note that although we are only discussing phase 1 and phase 2, our results are equally robust when we examine any phase t and its first subsequent phase, t + 1 (see Supplementary Fig. S9).

Explaining the results: Three hypotheses
In the previous sections, we show that long ties are not only stronger but also last longer. Moreover, quite a few strong long ties continue to be long ties. To explain the observed patterns, we next propose and discuss three hypotheses pertaining to degree heterogeneity, survival bias, and valuable long ties.
First, one plausible explanation for the observed patterns is degree heterogeneity. As shown in Supplementary Fig. S10, we find that individuals who have fewer friends are more likely to have long ties. Thus, they tend to retain relationships with a small number of friends, but with greater tie strength.
To reduce the impact of degree heterogeneity, we plot the results conditional on the degree subgroup (see Supplementary Note 6). Specifically, we separate individuals by their degree and obtain multiple degree subgroups. We then plot the main results for each degree subgroup in Supplementary Fig. S11. We find that the patterns observed in our main text are found in all degree subgroups. This finding shows that although degree heterogeneity may provide an explanation for the observed patterns, it does not fully explain our main results.
The second plausible explanation is survival bias: only very valuable long ties survive, even though newly formed long ties are likely to be weaker than newly formed short ties. Under this explanation, surviving long ties tend to continue to persist, or perhaps even become stronger, while others dissolve rapidly. To test this hypothesis, we need to examine (1) whether newly formed long ties are weaker than newly formed short ties in the beginning and (2) whether newly formed long ties have a smaller persistence probability, such that only very strong long ties survive. We find that while (1) is supported, (2) is not supported; thus, survival bias cannot fully explain our results.
To investigate these two ideas, we divide social ties into one of two categories: existing ties, and new ties. An existing tie is one that has had any interactions in the previous phase, while a new tie has had no such interactions. After separating all ties into existing or new ones, we perform the same analysis as that found in the previous sections. We use the tie range in phase 2 as the reference, and we investigate whether there was non-zero interaction frequency or duration in order to determine if it is a new or existing tie.
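The split between new and existing ties can be sketched as follows. This is a simplified illustration under the stated definition (an existing tie had a non-zero interaction in the previous phase, a new tie did not); the function name and the sample weights are hypothetical.

```python
def split_new_and_existing(prev_phase, curr_phase):
    """Classify the ties active in curr_phase by whether they were active before.

    Both arguments map a tie (i, j) to its interaction frequency or duration.
    Returns (new_ties, existing_ties) as two sets.
    """
    new, existing = set(), set()
    for tie, weight in curr_phase.items():
        if weight <= 0:
            continue  # not an active tie in the reference phase
        if prev_phase.get(tie, 0) > 0:
            existing.add(tie)
        else:
            new.add(tie)
    return new, existing

# Hypothetical interaction frequencies in phase 1 and phase 2.
phase1 = {("a", "b"): 3, ("a", "c"): 0}
phase2 = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 4, ("c", "d"): 0}
new_ties, existing_ties = split_new_and_existing(phase1, phase2)
```

The two resulting sets can then be passed through the same tie-strength and persistence analyses as the full set of ties.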
We first examine whether newly formed long ties are weaker initially than newly formed short ties. In Fig. 6, we show that while existing ties present a "U-shape" in the relationship between interaction frequency (duration) and tie range in phase 2, this "U-shape" pattern does not hold for new ties. Instead, as indicated by Fig. 6, for new ties, the longer the new tie is, the fewer interactions the two people have in phase 2. This result supports our conjecture that newly formed long ties are likely to be weaker than newly formed short ties.
Next, we investigate whether newly formed long ties have a smaller persistence probability. However, we observe that for newly formed ties, there exists a "U-shape" between tie range and persistence probability; newly formed long ties have the highest persistence probability (see Supplementary Note 7). This finding contradicts our conjecture that the persistence probability of newly formed long ties would be the smallest. Thus, for the two notions we examined, we find that (1) is supported while (2) is not supported. Therefore, the survival bias hypothesis does not fully explain our main results.
Valuable long ties.
Our last hypothesis is that long ties tend to be more valuable. This hypothesis is consistent with weak tie theory and the roles of long ties, as conjectured in previous studies 1,14. However, while most computational models that simulate real-world networks highlight homophily 43 (the phenomenon that individuals with similar attributes tend to be friends), previous models do not typically consider the benefits of social exchange between people with different skills or information sets 42. Recent work 42 provides an example of how one can consider homophily and social exchange jointly, but this work is restricted to static social networks. Below, we propose a computational model that combines game theory and machine learning in order to examine long tie dynamics. This model helps support our hypothesis on valuable long ties, while also incorporating the first two hypotheses.

The model explaining long ties' persistency
Here, we propose a game-theoretical computational model that simulates the dynamics of social networks. Specifically, the model combines the embedding techniques in machine learning [39][40][41]44 and the strategic network formation in economics 7,45 . Compared to the common network formation game models in the economics literature, our model stresses the high-dimensional heterogeneity, as well as the values of social exchange. Compared to network embedding techniques, our model helps understand the social network formation mechanisms. Ultimately, our model integrates the strategic network formation approach to explain the mechanisms, while the embedding techniques improve the predictability of the computational model. Our study echoes Hofman's (2021) recent paper that discusses the trade-off between explanation and prediction in computational social science 46 .
Our model considers two procedures during the formation of social ties: the meeting procedure and the choice procedure. This two-step model takes into account the dynamics of social ties: people first meet others randomly, and then make rational decisions about their choice of friends. The meeting procedure models reality, wherein people meet each other at random. There may exist many potential neighbor candidates who are mutually beneficial (e.g., some potentially valuable long ties), but the extremely low meeting probability can prevent the social tie from being formed. Moreover, when first meeting a new neighbor, a person may lack sufficient information to assess that neighbor and is therefore unable to make a rational decision about the social tie. After getting to know a new friend over a period of time (one phase in our study), the individual can then start to make a rational decision about that person. The choice procedure assumes that individuals are rational when choosing their network neighbors and that each individual maximizes their utility function.
Formally, let $I$ be the set of individuals and let $i$ (or $j$) be their index. Additionally, let $t$ index the discrete time steps (or phases), so that $t \in \mathbb{N}^+$. Also, let $A^{(t)}$ denote the adjacency matrix in phase $t$. $A^{(t)}$ is symmetric, i.e., $A^{(t)}_{ij} = A^{(t)}_{ji}$ for all $i, j \in I$ and for all $t \in \mathbb{N}^+$. To account for the heterogeneity of individual attributes, we use the "endowment vector" $w_i$, which is a $K$-dimensional vector as in the embedding techniques 39,40. As in embedding techniques, each dimension measures a certain latent attribute of an individual, such as a type of skill or useful information. A larger $w_{ik}$ indicates that the individual has a higher endowment in the $k$-th dimension.
In each phase, the neighbor set of $i$ consists of two components: the new friend set $M^{(t)}_i$ and the existing friend set $N^{(t)}_i$. The first component is determined by the meeting procedure, where $p^{(t)}_{ij}$ denotes the probability for $i$ and $j$ to "meet" each other in phase $t$.
If $A^{(t-1)}_{ij} = 1$, that is, the two individuals were connected in phase $t-1$, the meeting is maintained with probability $q$. Otherwise, $p^{(t)}_{ij}$ is a small probability that depends on the network topology between $i$ and $j$. Inspired by our previous comparison between newly formed ties and existing ties, we can imagine that if this is a long tie, the probability would be much smaller. Formally, we parametrize $p^{(t)}_{ij}$ through a distance metric $d_{t-1}(i, j)$, which depends on the network topology between individual $i$ and individual $j$ in phase $t-1$. We define the distance metric to be proportional to the probability of random walks from $i$ to $j$. Here, $q$ describes the probability of maintaining the meeting procedure in phase $t$.
The second component is the existing friend set $N^{(t)}_i$, which is determined by the rational choice procedure. It is a subset of all friends of $i$ in phase $t-1$. This means that individuals make rational decisions after maintaining their friendships for a period of one phase. The rationale behind this notion is that individuals need a significant amount of time to assess the value of an existing friend, so the rational choice procedure happens in the phase immediately following the meeting procedure. For a connected social tie in phase $t-1$, the friendship must survive both the meeting procedure (a random draw from $\mathrm{Bern}(q)$) and the rational choice procedure. The choice procedure is modeled using the utility function in Equation (3). Here, $c_{ij}$ can be understood as a function that maps any $j$ in the neighbor set of $i$ in phase $t-1$ to a real number in $[0, 1]$. The utility function sums over all of $i$'s neighbors in phase $t-1$ and also enumerates over all of $j$'s neighbors in phase $t-1$, which are $i$'s "friends' friends." The depreciation factor $\delta$, which ranges in $(0, 1)$, measures how the value of a potential friend depreciates as the distance on the network increases. We separate the benefit that $j$ brings to $i$ into two parts: the direct benefit, $\sigma(w_{jk} - w_{ik})$, and the indirect benefit, which comes from $j$'s neighbors. The design of these benefit terms was intended for our valuable long tie hypothesis: we hope to observe that long ties have, on average, larger values in the direct benefit term. By the Cauchy-Schwarz inequality, Equation (3) can be solved for the optimal $c_{ij}$, which gives Equation (4).
This model provides major improvements based on the framework proposed in prior work 42. First, different from their paper, we establish a model for network dynamics. In particular, we incorporate a meeting procedure; this addresses the phenomenon that, in reality, there are many neighbor candidates who do not form links purely because they have no opportunity to meet. Second, our model also takes into account the "weight" (i.e., the interaction frequency or duration) of the links. This is different from Yuan et al. 42, where the weights between the links are binary.
Third, Yuan et al. 42 assume that the marginal utility of additional neighbors does not depend on other existing neighbors; by contrast, our model does not make this assumption, and it also accounts for the network externality (i.e., the benefits of friends of friends) 7. We provide additional analyses to verify the fitting capacity of our model in Supplementary Note 8.
Figure 7 provides the main implications derived from the learning results of our model. We first present the average benefit, i.e., $\sigma(w_{jk} - w_{ik})$ plus the indirect benefit, in Fig. 7. The average is taken over all candidate neighbors given the tie range in phase $t-1$. From this, we find a "U-shape", i.e., the average benefit decreases with the tie range at the beginning, but later increases with the tie range. This is consistent with our previous findings regarding the "U-shape" between tie range and tie strength.
Next, we separate the benefits in Equation (3) into the direct effect and the indirect effect. We present the average direct effect, which is $\sigma(w_{jk} - w_{ik})$, in Fig. 7. We observe an increasing pattern with the tie range, indicating that as the tie range increases, the average direct benefit that a tie brings also increases. This result supports our hypothesis that long ties tend to be more valuable, which also explains the results in the previous sections. We also compute the average indirect effect. Overall, the results from our learning model suggest that long ties are generally more valuable (with greater direct effects). This model also takes into account the degree heterogeneity and survival bias hypotheses, although they are probably not the primary drivers. We also compare our model with other baseline models in Supplementary Note 9, but they cannot provide the implications that we plot in Fig. 7.
Fig. 7: Panels a, b, and c show the average benefit, the direct effects, and the indirect effects for each tie range learned from our model, respectively. Error bars are 95% confidence intervals for the benefits (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.

Conclusion
In this study, we combine empirical analysis and an interdisciplinary computational model to investigate the dynamics of long ties. We find that long ties persist longer than shorter-range ties and that many long ties are persistently long. These results are contrary to what is suggested by several prior theories and prediction models. To better understand our results, we propose three hypotheses (degree heterogeneity, survival bias, and valuable long ties) and then go on to discuss the limitations of both the degree heterogeneity hypothesis and the survival bias hypothesis. Finally, we discuss an interdisciplinary model that combines game theory and machine learning to support our valuable long-tie hypothesis. Verified by real-world data, our model partly explains why long ties are more persistent than what has previously been suggested by existing theories and models.
Our results also signal the importance of social interventions that promote the formation of

Data description
In our study, we use a nationwide call detail record dataset. Users' private information has been anonymized, and thus we are unable to identify them. The data provider is a company that functions as the main service provider for most of the mobile phone users in a European region. The data cover the period from Jan. 2015 to Dec. 2016. From the dataset, we retrieve the total number of calls, the total number of texts, and the total duration of calls between any two people in each month. See Supplementary Note 1 for more details.
We establish a temporal social network with the dataset. We consider discrete time steps (or phases): for each phase, we construct a "snapshot" of the network, where the node indicates a user and the edge represents the interaction between two users. A key question is how we determine the length of the time window of each phase. In our main results, we treat every three months as a phase. In Supplementary Note 2, we also use one month or six months to verify the robustness of our results.
To maintain a temporal network where the node set is stable and the global network structure does not change dramatically with the dynamics of a few nodes, we only consider the interactions among users who have at least one call or text in every phase. We construct a temporal directed network with 45,192 nodes and 385,533 edges on average for each phase.
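The snapshot construction and the stable-node filter can be sketched as below. This is an illustrative sketch rather than the authors' pipeline: the record layout `(month_index, sender, receiver, count)` is hypothetical, and the sketch assumes that a user counts as active in a phase if they appear as either the sender or the receiver of at least one interaction.

```python
from collections import defaultdict

def build_phases(records, months_per_phase=3):
    """Aggregate monthly directed records into per-phase edge weights."""
    phases = defaultdict(lambda: defaultdict(int))
    for month, src, dst, count in records:
        phases[month // months_per_phase][(src, dst)] += count
    return {p: dict(edges) for p, edges in phases.items()}

def restrict_to_stable_users(phases):
    """Keep only edges among users who are active in every phase."""
    active = [{u for edge in edges for u in edge} for edges in phases.values()]
    stable = set.intersection(*active) if active else set()
    return {
        p: {e: w for e, w in edges.items() if e[0] in stable and e[1] in stable}
        for p, edges in phases.items()
    }

# Hypothetical monthly records: (month index from 0, sender, receiver, count).
records = [
    (0, "a", "b", 2), (1, "a", "b", 1), (2, "c", "a", 1),  # phase 0
    (3, "b", "a", 4), (4, "a", "b", 1),                    # phase 1
]
phases = build_phases(records)
stable_phases = restrict_to_stable_users(phases)
```

Setting `months_per_phase` to 1 or 6 gives the alternative window lengths used for the robustness checks. The filter here is a single pass; a stricter variant could repeat it until the node set stops changing.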
In terms of the weight of the directed network, we consider two variables as mentioned in the main text: interaction frequency and duration. Interaction frequency is the total number of calls or texts that node $i$ sends to $j$; we filter out the few calls with zero-second duration. Interaction duration is the total length of the calls that $i$ makes to $j$ in each phase, and does not account for texting.

Tie range and long ties
Tie range 14,25 is defined as the length of the second shortest path between two connected nodes (Fig. 1). It indirectly reflects the network distance of the connection. Consistent with previous long tie studies 22,25, there is no clear cutoff of tie range that decides whether a tie is short or long. A good reference is the Milgram experiment, which suggested that the average network distance between every two people is approximately 6. In our study, we treat social ties with a tie range of 2 as short ties, and ties with a tie range of 5 or ≥ 6 as long ties. In addition, we perform a sensitivity check of our results by randomly dropping a proportion (5%) of nodes or edges (see Supplementary Note 3). Our main results are not sensitive to a few nodes or edges happening to exist on the network.
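Tie range can be computed as the shortest path length between the two endpoints once the direct edge is ignored. The sketch below uses a breadth-first search and treats the network as undirected, which is a simplifying assumption; the helper names and toy graphs are hypothetical.

```python
from collections import deque

def undirected(edges):
    """Build a symmetric adjacency dict from a list of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def tie_range(adj, u, v):
    """Length of the shortest u-v path that avoids the direct edge (u, v).

    Returns float('inf') when the direct edge is the only connection.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if x == u and y == v:
                continue  # skip the direct tie itself
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == v:
                    return dist[y]
                queue.append(y)
    return float("inf")

triangle = undirected([("a", "b"), ("b", "c"), ("c", "a")])
ring = undirected([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")])
bridge = undirected([("x", "y")])
```

Under this sketch, a tie inside a triangle has a tie range of 2 (a short tie), a tie on a five-node ring has a tie range of 4, and an isolated bridge has an infinite tie range.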

Details in learning
Based on Equation (4), we construct the loss function to minimize the MSE loss between $c_{ij}$ and its right-hand side. We use stochastic gradient descent to optimize the loss function. For each epoch, the loss function is composed of the loss over positive samples (connected pairs) and the loss over negative samples (disconnected pairs).
The observed value of $c_{ij}$ is derived from $D^{(t)}_{ij}$, the interaction duration between $i$ and $j$ in phase $t$. To reduce the impact of extreme values, we take the logarithm of $D^{(t)}_{ij}$. To facilitate the learning process, we apply mini-batch stochastic gradient descent with the Adam optimizer 52. Consistent with conventional network embedding algorithms, node sampling probability is proportional to node degree (d

Competing interests The authors declare no competing interests.
Correspondence Correspondence and requests for materials should be addressed to Yuan Yuan (email: yuanyuan@purdue.edu).

Supplementary Note 1: Data processing and summary statistics
In our study, we use a nationwide mobile phone call dataset covering the phone call logs of about 45 thousand (45,192) people over 2 years, from Jan. 2015 to Dec. 2016. The dataset covers a European region with more than 50 thousand but fewer than 100 thousand citizens. We aggregate the monthly phone call and texting log for each pair of users. Then we take a series of snapshots by aggregating all activities happening in a time window. We have flexibility in the choice of the time window. We establish a directed graph including all phone call logs in the time window. As mentioned in the main text, we primarily consider two types of edge weights: interaction frequency and interaction duration. Interaction frequency is the number of phone calls between two people, and interaction duration is the sum of call volumes of all phone calls in an interval.
We next discuss how to select the time window. Note that the selection of the time window affects the proportion of each possible tie range. A time window that is too narrow may make each snapshot so sparse that many short-range ties are treated as long-range ties. A time window that is too wide may leave too few snapshots for us to analyze the network dynamics. Eventually, we choose a season (three months) as the time window for the main text. Each season, or three months, is regarded as a "phase." As the length of our data is two years, we partition the data into eight phases. Following the definition of tie range, we classify all connections with respect to tie range in each phase. Because few ties have a tie range above 6, we merge them into a single category, ≥ 6. In addition, some ties with an infinite tie range cannot be ignored. As illustrated in Tab. S1, social ties with a tie range of 5 or ≥ 6 only take a small proportion of all connections.
We also present the statistics for interaction duration, interaction frequency, degree, and tie range for each phase in Tab. S2. As shown in the Table,

Supplementary Note 2: Robustness of the choice of time windows
To test the robustness of our results to the choice of time window, we further adjust the time window.
When the time window is set to one month, we obtain 24 monthly snapshots. We calculate the tie range of each edge in every snapshot. Consistent with the main text, we use the logarithm of interaction frequency and duration so that a few extreme values do not unduly affect the averages. Fig. S1(a&b), (c&d) present our main results after adjusting the time window. We observe trends very similar to those obtained with the three-month time window.
The results from weekly aggregation are presented in Fig. S3. We again find that long-range ties (especially those with tie range ≥ 7) have greater interactions than short-range ties (tie range = 2) in the long term. Note that when we aggregate over small time windows, the distribution of tie range shifts toward a fatter tail (Fig. S4); we thus change the cutoff to 7 so that roughly the same proportion of ties counts as "long-range ties." We also examine the results from daily aggregation. Since a snapshot of one-day interactions may miss many persistent social ties that happen not to interact on that specific day, we use a sliding window with a length of seven days and move the window day by day.
In this way, our resolution remains at the day level. The results, presented in Fig. S5, are consistent with the main text.
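The seven-day sliding window with a one-day step can be sketched as below. The input layout (a list of per-day edge sets) and the function names are hypothetical and chosen only for illustration.

```python
def sliding_windows(n_days, length=7, step=1):
    """Return (start, end) day indices, end exclusive, for each window."""
    return [(start, start + length) for start in range(0, n_days - length + 1, step)]

def window_edges(daily_edges, start, end):
    """Union of the edges active on any day in [start, end)."""
    active = set()
    for day in range(start, end):
        active |= daily_edges[day]
    return active

# Ten days of toy data: two ties that never interact on the same day.
daily = [{("a", "b")}, set(), {("b", "c")}] + [set()] * 7
windows = sliding_windows(len(daily))
```

Because consecutive windows overlap by six days, a tie that is silent on a given day still appears in the snapshot as long as it was active at some point within the window.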

Supplementary Note 3: Sensitivity check
Since the tie range of an edge is easily affected by a node or edge that is distant in the network, we examine how sensitive our results are to the existence of a few nodes or edges. We randomly drop a proportion (5%) of nodes or edges and then replicate our main result. As shown in Fig. S6, dropping either nodes or edges does not affect our main conclusions. This indicates that our results are not sensitive to a few nodes or edges that happen to exist in the network.
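The perturbation used in this check can be sketched as follows. This is an illustrative version only: the edge-list representation, the fixed seeds, and the function names are assumptions, and the re-computation of tie ranges on the perturbed network is left out.

```python
import random

def drop_edges(edges, proportion=0.05, seed=0):
    """Remove a random proportion of edges (edges assumed unique)."""
    rng = random.Random(seed)
    edges = list(edges)
    n_drop = int(round(proportion * len(edges)))
    dropped = set(rng.sample(edges, n_drop))
    return [e for e in edges if e not in dropped]

def drop_nodes(edges, proportion=0.05, seed=0):
    """Remove a random proportion of nodes together with their incident edges."""
    rng = random.Random(seed)
    nodes = sorted({u for e in edges for u in e})
    n_drop = int(round(proportion * len(nodes)))
    dropped = set(rng.sample(nodes, n_drop))
    return [(u, v) for u, v in edges if u not in dropped and v not in dropped]

# Toy network: a path with 100 edges and 101 nodes.
edges = [(i, i + 1) for i in range(100)]
kept_after_edge_drop = drop_edges(edges)
kept_after_node_drop = drop_nodes(edges)
```

Tie ranges would then be recomputed on the perturbed network before repeating the main analysis.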

Supplementary Note 4: Explanation of the decreasing pattern
Note that the curves in Fig. 2 in the main text all exhibit decreasing trends. This is because our analyses report the average interaction frequency and interaction duration given that a tie exists at phase 1, E[y_t | y_1 > 0]. Therefore, as t (> 1) increases, we expect a proportion of social ties to terminate, which drives the decreasing pattern.
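The conditioning argument can be made explicit with a short decomposition. This is a sketch under the assumption that outcomes are nonnegative and that a terminated tie contributes y_t = 0; the notation follows the text above.

```latex
\begin{align}
E[y_t \mid y_1 > 0]
  &= P(y_t > 0 \mid y_1 > 0)\, E[y_t \mid y_t > 0,\ y_1 > 0]
   + P(y_t = 0 \mid y_1 > 0) \cdot 0 \\
  &= P(y_t > 0 \mid y_1 > 0)\, E[y_t \mid y_t > 0,\ y_1 > 0].
\end{align}
```

The first factor is the share of phase-1 ties that are still active at phase t. It cannot exceed 1 and typically shrinks as ties terminate, so the conditional average can decline even if activity among the surviving ties stays flat.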
We hope to clarify that the result is not driven by a decaying trend in activity (E[y_t]).
Here we discuss our "degree heterogeneity" hypothesis. First, as shown in Fig. S10, individuals with fewer neighbors, i.e., a lower degree, tend to have more long ties. We then categorize social ties by degree and plot the trends for each subgroup in Fig. S11. We find that our main results persist in all degree subgroups. Therefore, the degree heterogeneity hypothesis cannot fully explain our main results.

Supplementary Note 7: Survival bias hypothesis
To test this hypothesis, we examine whether (1) newly formed long ties are weaker than newly formed short ties at the beginning; and (2) newly formed long ties have a smaller persistence probability, such that only very strong long ties survive.
The corresponding plot is presented in the main text. We find that for new ties, tie strength is weakest for those with tie range ≥ 6. By contrast, for existing ties, the trend appears to be a "U-shape." Thus, (1) is supported: newly formed long ties are weaker than newly formed short ties at the beginning.
For (2), we repeat the analysis by decomposing the outcome into persistence probability and interaction increments. However, we find that newly formed long ties still have the largest persistence probability. Thus, (2) is not supported. We therefore believe that the survival bias hypothesis cannot fully explain our main results.
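The persistence probability by tie range can be computed along the following lines. This is a minimal sketch that takes persistence to mean that a tie present in phase t is also present in phase t + 1; the input layout and the function name are assumptions made for the example.

```python
from collections import defaultdict

def persistence_by_range(range_at_t, edges_next):
    """Share of ties in each tie-range class that are still present next phase.

    range_at_t: {edge: tie_range} for ties present in phase t (hypothetical layout).
    edges_next: set of edges present in phase t + 1.
    """
    total = defaultdict(int)
    kept = defaultdict(int)
    for edge, tie_range in range_at_t.items():
        total[tie_range] += 1
        if edge in edges_next:
            kept[tie_range] += 1
    return {r: kept[r] / total[r] for r in total}

range_at_t = {("a", "b"): 2, ("a", "c"): 2, ("b", "d"): 6}
edges_next = {("a", "b"), ("b", "d")}
probs = persistence_by_range(range_at_t, edges_next)
```

Restricting `range_at_t` to ties that first appear in phase t gives the comparison for newly formed ties described in (2).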

Supplementary Note 8: Details in learning
Here we provide more technical details regarding the learning process of our proposed model. In our proposed model, we need to learn both the hyper-parameter δ and the endowments. However, training δ and the endowment vectors simultaneously may cause the gradients to become uncontrollable.
Therefore, we first find the optimal δ and then train the endowment vectors by minimizing the loss. From the data, we observe a positive indirect effect from common friends, and thus δ should be a small positive value. As shown in Fig. S13, the model performs better when we set δ to 0.2 than with the other options: the fitted value ĉ_ij is closest to the real-world data c_ij.
After determining the value of δ, we infer the endowment vectors. To speed up training, we adopt a sampling strategy. We set the maximum number of epochs to 500 and randomly sample 1000 nodes in each epoch. According to the loss function, the sampled nodes and their neighbors receive gradient updates, and their endowment vectors are updated in each epoch. We set aside a testing set of 1000 nodes to track the learning curve of the model.
As shown in Fig. S14, the loss stabilizes after about 100 epochs.
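The sampling schedule can be sketched in plain Python. This illustrates only which nodes are touched in each epoch; the loss, the gradient step, and the PyTorch embeddings are omitted, and the function names and the way the testing nodes are held out are assumptions of this example.

```python
import random

def split_nodes(nodes, n_test=1000, seed=0):
    """Hold out n_test nodes for tracking the learning curve (assumed split)."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

def nodes_to_update(adj, sampled):
    """Sampled nodes plus their neighbors: the vectors updated in one epoch."""
    updated = set(sampled)
    for node in sampled:
        updated |= adj[node]
    return updated

def epoch_batches(adj, train_nodes, n_epochs=500, sample_size=1000, seed=0):
    """Yield (sampled nodes, nodes whose vectors are updated) for each epoch."""
    rng = random.Random(seed)
    for _ in range(n_epochs):
        sampled = rng.sample(train_nodes, min(sample_size, len(train_nodes)))
        yield sampled, nodes_to_update(adj, sampled)

# Toy network: a ring of 50 nodes, with small sizes so the sketch runs quickly.
adj = {k: {(k - 1) % 50, (k + 1) % 50} for k in range(50)}
train, test = split_nodes(list(adj), n_test=10)
batches = list(epoch_batches(adj, train, n_epochs=3, sample_size=5))
```

In an actual training loop, each yielded set would determine which endowment vectors receive a gradient update in that epoch.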
As to the dimensionality of the endowment vectors, we investigate how different choices affect our main results. We test endowment vectors with two to five dimensions. Note that too large a dimensionality may raise the issue of computational complexity. We present the results corresponding to Fig. 7(a) in the main text in Fig. S15. As shown in the figure, the conclusions from different dimensions are largely similar. We therefore use a dimensionality of four as an illustration in the main text.
We implemented our algorithm in PyTorch. The endowment vectors are implemented as embeddings in PyTorch, and we use the Adam optimizer with regularization for the optimization.
where δ is the direct benefit from the connection between node i and node j, and a_ij is the time investment of node i in node j at phase t. We can further consider higher-order indirect benefits, such as the benefits from three-hop neighbors (neighbors' neighbors' neighbors), which depend on the number of paths of length three between node i and node j at phase t − 1. Note that considering even higher-order indirect effects (e.g., four-hop neighbors) gives rise to the issue of high computational complexity.
Furthermore, we introduce another baseline, which is a simplified version of our model with the indirect effects removed. Compared to the version in the main text (i.e., Eq. 3), only the indirect effects are removed.
In the baseline results, panels (a) and (b) do not reveal the specialty of long-range ties: the curves are non-increasing in tie range, and the benefits become a constant (δ) after a certain cutoff (3 and 4, respectively). The curves in these two panels no longer exhibit the "U-shape" reported in the main text. Panel (c) is the result of the simplified version of our model. Although it reflects that longer-range ties have greater benefits, it also tends to consider short-range ties (those with a tie range of 2) the least beneficial. Thus, it does not present the anticipated "U-shape" either. Taken together, none of these baseline models reflects the "U-shape" observed in the previous empirical results.
Tab. S2: Statistics on interaction duration (ID), interaction frequency (IF), degree (d), and tie range (TR) in eight snapshots at the interval of a season (three months).
Tie strength is measured by interaction duration (a&b; the total call volume in seconds) and interaction frequency (c&d; the number of calls or texts). The time window is either half a year (a&c; six months) or a month (b&d). We take logarithms (log) for both interaction duration and frequency. All ties are classified according to their tie range in the first phase. The curves represent the average (log) interaction duration or frequency conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the mean log interaction duration and frequency (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
Tie strength is measured by interaction duration and interaction frequency (the number of calls or texts). Each phase represents a week. We take logarithms (log) for both interaction duration and frequency. All ties are classified according to their tie range in the first phase.
The curves represent the average (log) interaction duration or frequency conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the mean log interaction duration and frequency (assuming normal distribution).
Note that when defining the lifespan, we explore two choices: (a&b) a social tie has interactions in the first and the last phases, regardless of whether it has interactions in the phases in between; (c&d) the social tie must have interactions in every phase within the lifespan. The former choice allows for ties being re-established after termination, while the latter does not. Lifespan is measured in months. We examine the two years separately. Error bars are 95% confidence intervals for lifespans of social ties that exist in the first month (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
Interaction frequency is the number of calls or texts. Persistence probability is defined as the probability of social ties persisting from phase t to phase t + 1. The numbers in the cells indicate the mean (log) interaction duration (top row), the mean (log) interaction frequency (middle row), and persistence probability (bottom row).
Tie strength is measured by interaction duration (a-c; the total duration of the calls in seconds) and interaction frequency (d-f; the number of calls or texts). ND_1 indicates node degree in phase 1. The median node degree of the snapshot in phase 1 is 12. Each phase represents a season (three months). We take logarithms (log) for both interaction duration and frequency. All ties are classified according to their tie range in the first phase. The curves represent the average (log) interaction duration or frequency conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the mean log interaction duration and frequency (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
Fig. S12: Dynamics of interaction frequency, interaction duration, and persistence probability of surviving or newly formed ties throughout the next seven phases, conditional on a tie existing in phase 2. Each phase represents a season (three months). Interaction duration is measured in seconds. We take logarithms (log) for both interaction duration and frequency. All ties are classified according to their tie range in the first phase. The curves represent the average (log) interaction duration or frequency conditional on a tie existing in phase 1 with the given tie range. Error bars are 95% confidence intervals for the mean log interaction duration and frequency (assuming normal distribution). Note that error bars are sometimes smaller than the data point markers.
None of the baselines can replicate the "U-shape" found in empirical data. Note that error bars are sometimes smaller than the data point markers.