How Physical Proximity Shapes Complex Social Networks

Social interactions among humans create complex networks and, despite a recent increase in online communication, interactions mediated through physical proximity remain a fundamental way for people to connect. A common way to quantify the nature of the links between individuals is to consider repeated interactions: frequently occurring interactions indicate strong ties, such as friendships, while ties with low weights can indicate random encounters. Here we focus on a different dimension: rather than the strength of links, we study the physical distance between individuals when a link is activated. The findings presented here are based on a dataset of proximity events in a population of approximately 500 individuals. To quantify the impact of physical proximity on the dynamic network, we use simulated epidemic spreading processes in two distinct networks of physical proximity. We consider the short-range network, defined by interactions at d ≲ 1 meter, and the long-range network, which includes all interactions at d ≲ 10 meters. Since these two networks arise from the same set of underlying behavioral data, we are able to quantitatively measure how the specific definition of the proximity network (short-range versus long-range) impacts the resulting network structure as well as spreading dynamics in epidemic simulations.
We find that the short-range network – consistent with the literature – is characterized by densely-connected neighborhoods bridged by weak ties. More surprisingly, however, we show that spreading in the long-range network is quite different, mainly shaped by spurious interactions.


The dataset
For every participant in the Copenhagen Networks Study [1] in February 2014 (696 active users) we calculate the number of 5-minute timebins in which we have data on Bluetooth observations (whether or not they contain any other participant in the study). Out of this population we choose users with data in at least 60% of the timebins covered (Figure 1). As the Bluetooth scans do not yield false positives (devices that are not actually present during the scan), we make the discovery network symmetric by assuming that if user i observed user j in timebin t, the opposite is also true. After the matrix is made symmetric, the median time coverage for the users reaches 84%. The degree and node strength have broad distributions in the full network (Figure 2). The strength of the nodes has a slightly longer tail in the intimate network, indicating the existence of a few nodes with very strong links that are reduced in the random sampling.
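The selection and symmetrization steps above can be illustrated with a minimal sketch. It is not the authors' code: the input layout (`scan_bins` mapping each user to the set of timebin indices with scan data, and `observations` as `(observer, observed, timebin)` triples) and all function names are assumptions made for illustration.

```python
def coverage(scan_bins, n_bins):
    """Fraction of timebins in which each user has any Bluetooth scan data."""
    return {user: len(bins) / n_bins for user, bins in scan_bins.items()}


def select_users(scan_bins, n_bins, threshold=0.6):
    """Keep users with data in at least `threshold` of all timebins."""
    return {u for u, c in coverage(scan_bins, n_bins).items() if c >= threshold}


def symmetrize(observations):
    """If i observed j in timebin t, assume j also observed i in t."""
    symmetric = set()
    for i, j, t in observations:
        symmetric.add((i, j, t))
        symmetric.add((j, i, t))
    return symmetric


def coverage_after_symmetrization(scan_bins, symmetric_obs, n_bins):
    """Recompute coverage, counting a timebin as covered for a user
    whenever a symmetrized observation places that user in it."""
    filled = {u: set(bins) for u, bins in scan_bins.items()}
    for i, _j, t in symmetric_obs:
        filled.setdefault(i, set()).add(t)
    return coverage(filled, n_bins)


# Toy data (hypothetical): 4 timebins, 3 users.
scan_bins = {"a": {0, 1, 2}, "b": {0}, "c": {0, 1, 2, 3}}
selected = select_users(scan_bins, n_bins=4)
sym = symmetrize([("c", "a", 3)])
filled_cov = coverage_after_symmetrization(scan_bins, sym, n_bins=4)
```

In the toy data, user "a" has no scan in timebin 3, but the symmetrized observation from "c" fills that bin, which is the mechanism by which coverage improves after symmetrization.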
The temporal dynamics of the social networks show the expected daily and weekly pattern, as quantified by the number of active links in the network in 1-hour timebins (Figure 3). Weekdays display morning and afternoon peaks of activity, with decreased activity during lunchtime. Fridays show a significantly reduced afternoon peak but a much more pronounced evening peak, indicating social gatherings. Most of the weekends show significantly lower activity, except for occasional parties (for example on the Saturday evening of day 22). This shows that although the analyzed network is primarily driven by work-related interactions, it also contains a non-trivial number of social interactions.
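As a rough illustration of the activity measure used here, the sketch below counts distinct undirected links per 1-hour timebin. The event format `(user_i, user_j, timestamp_in_seconds)` and the function name are assumptions for illustration, not the study's actual data layout.

```python
from collections import defaultdict


def active_links_per_hour(events, bin_seconds=3600):
    """Count distinct undirected links observed in each time bin.

    events: iterable of (user_i, user_j, timestamp_in_seconds) with
    integer timestamps; (i, j) and (j, i) count as the same link.
    """
    links = defaultdict(set)
    for i, j, ts in events:
        if i == j:
            continue  # ignore self-observations
        links[ts // bin_seconds].add(frozenset((i, j)))
    return {b: len(s) for b, s in sorted(links.items())}


# Toy events (hypothetical): two links in the first hour, one in the second.
events = [("a", "b", 10), ("b", "a", 200), ("a", "c", 3599), ("a", "b", 3600)]
hourly = active_links_per_hour(events)
```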
The difference in the entropy of interactions between the intimate and the (sampled) ambient network (shown in the main text) is not explained by a simple change in the degree of the nodes. Plotting the difference in entropy against the change in degree reveals only a weak correlation (R² = 0.17), as shown in Figure 4.
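The comparison above can be sketched as follows. The entropy of interactions is defined in the main text, not in this section, so this sketch assumes it is the Shannon entropy of a node's normalized link weights; that definition, the use of the squared Pearson correlation for R², and the function names are all assumptions for illustration.

```python
import math


def interaction_entropy(weights):
    """Shannon entropy (in bits) of a node's link-weight distribution.

    Assumed definition: weights are normalized to probabilities first.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return sxy * sxy / (sxx * syy)
```

Given per-node entropy differences and degree changes between the two networks, `r_squared(degree_changes, entropy_differences)` would give the kind of value reported above.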

Link weights and spreading
Strong links in the networks also tend to be more infectious, as shown in Figure 5. The correlation is strongest in the sampled ambient network (R² = 0.90). In the intimate (and ambient) network the infectious power of the strongest links is lower than we would expect from a linear model, due to the high clustering of these links in the tightly-connected neighborhoods (as shown in the main text).

Figure 1: Quality of the data. We calculate in how many 5-minute bins we have Bluetooth data for the users. Out of all users we chose those with data quality of over 60%. After filling the observations by making the observation matrix symmetric, the data quality improves slightly (shaded area).

RSSI and interaction proximity
RSSI is a noisy proxy for distance. Numerous lab experiments have shown a strong (logarithmic) relation between RSSI and distance [2][3][4]. Recently, the relation between proximity measured by RSSI and social ties has also been shown [5]. The distribution of RSSI observations decays exponentially (Figure 6). The dataset used here does not contain data allowing for direct validation of RSSI as a proximity indicator, as it does not include any signal that can reliably identify proximity with higher or even comparable resolution. Instead, we use the overlap of other sensed Bluetooth devices to show how RSSI relates to physical proximity.
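One common way to express a logarithmic relation between RSSI and distance is the log-distance path-loss model, sketched below in its noise-free form. This is a generic textbook model, not a calibration from this study: the reference RSSI at 1 m (`rssi_at_d0 = -60.0`) and the path-loss exponent (`exponent = 2.0`) are purely illustrative assumptions, and real measurements add substantial noise on top.

```python
import math


def rssi_from_distance(d, rssi_at_d0=-60.0, exponent=2.0, d0=1.0):
    """Noise-free log-distance model: RSSI falls linearly in log10(distance).

    All default parameter values are illustrative assumptions.
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    return rssi_at_d0 - 10.0 * exponent * math.log10(d / d0)


def distance_from_rssi(rssi, rssi_at_d0=-60.0, exponent=2.0, d0=1.0):
    """Invert the model to get a (noisy in practice) distance estimate."""
    return d0 * 10.0 ** ((rssi_at_d0 - rssi) / (10.0 * exponent))
```

Under these assumed parameters, each tenfold increase in distance lowers the RSSI by 20 dBm, so a fixed RSSI threshold corresponds to an approximate distance cutoff.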
For every observation between participants we calculate the Jaccard overlap between the sets of all other devices they sensed in the timebin. For example, for a 10-minute timebin, which on average contains two scans, we take the RSSI value to be the smallest RSSI between the users in that timebin, and for every user we create a set of all unique devices sensed in that period (usually in two scans). Distant interactions (with small RSSI) show smaller overlap of the discovered devices (Figure 7). This holds for timebins (over which the overlap is calculated) spanning four orders of magnitude, including extremely short windows of 5 seconds and long windows of 4 hours. For timebins shorter than the Bluetooth scanning interval used (5 minutes), the overlap between single scans is calculated (and the scans can be separated in time by up to the size of the timebin). For the longer timebins, the overlap between devices sensed in multiple scans is used. For high RSSI values, only a negligible fraction of observations do not show at least 50% overlap (distributions in Figure 7), which confirms that higher RSSI values are indicative of closer physical proximity.

Figure 7: Jaccard overlap between sensed Bluetooth devices depending on the RSSI between participants. Higher RSSI indicates closer physical proximity, which also results in higher overlap. Observations are binned in 5 dBm bins. Plots are shown for different time windows across which the overlap is calculated. For all time windows, higher RSSI indicates higher overlap of discovered devices. Lines indicate mean values. Distributions are shown for t = 10 min.
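The overlap analysis described in this section can be sketched as follows. The input format (triples of an RSSI value and the two participants' sets of other sensed devices within one timebin) and the function names are assumptions for illustration. The caller is assumed to have already removed the two participants themselves from the device sets and to have chosen the per-timebin RSSI value.

```python
import math
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap |A ∩ B| / |A ∪ B| (0.0 if both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def mean_overlap_by_rssi(pairs, bin_width=5):
    """Mean Jaccard overlap per RSSI bin.

    pairs: iterable of (rssi, devices_i, devices_j) for one timebin each.
    Bins are keyed by their lower edge in dBm, e.g. -55 covers [-55, -50).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rssi, devices_i, devices_j in pairs:
        lower_edge = math.floor(rssi / bin_width) * bin_width
        sums[lower_edge] += jaccard(devices_i, devices_j)
        counts[lower_edge] += 1
    return {edge: sums[edge] / counts[edge] for edge in sorted(sums)}


# Toy observations (hypothetical device identifiers).
pairs = [
    (-52, {"x", "y"}, {"x", "y"}),
    (-54, {"x"}, {"y"}),
    (-90, {"x"}, {"x", "y"}),
]
means = mean_overlap_by_rssi(pairs)
```

Rerunning the same function on device sets aggregated over different window lengths would give the per-window curves described in the text.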