How Physical Proximity Shapes Complex Social Networks

Stopczynski, Arkadiusz; Pentland, Alex ‘Sandy’; Lehmann, Sune

doi:10.1038/s41598-018-36116-6

Article
Open access
Published: 07 December 2018

Arkadiusz Stopczynski^1,2,
Alex ‘Sandy’ Pentland² &
Sune Lehmann ORCID: orcid.org/0000-0001-6099-2345^1,3

Scientific Reports volume 8, Article number: 17722 (2018)

Abstract

Social interactions among humans create complex networks and – despite a recent increase of online communication – the interactions mediated through physical proximity remain a fundamental way for people to connect. A common way to quantify the nature of the links between individuals is to consider repeated interactions: frequently occurring interactions indicate strong ties, such as friendships, while ties with low weights can indicate random encounters. Here we focus on a different dimension: rather than the strength of links, we study physical distance between individuals when a link is activated. The findings presented here are based on a dataset of proximity events in a population of approximately 500 individuals. To quantify the impact of physical proximity on the dynamic network, we use simulated epidemic spreading processes in two distinct networks of physical proximity. We consider the network of short-range interactions, defined as \(d\lesssim 1\) meter, and the long-range network, which includes all interactions with \(d\lesssim 10\) meters. Since these two networks arise from the same set of underlying behavioral data, we are able to quantitatively measure how the specific definition of the proximity network – short-range versus long-range – impacts the resulting network structure as well as spreading dynamics in epidemic simulations. We find that the short-range network – consistent with the literature – is characterized by densely-connected neighborhoods bridged by weak ties. More surprisingly, however, we show that spreading in the long-range network is quite different, mainly shaped by spurious interactions.

Introduction

Social interactions among humans form complex networks. While these interactions have recently begun to occur via many different channels – email, social networks, texts, and calls¹ – the interactions mediated through physical proximity remain a fundamental way for people to connect². A common way to quantify the nature of a link is to consider repeated interactions: frequently occurring interactions indicate strong ties, such as friendships, while ties with small weights can indicate random encounters. Here we focus on a different dimension: rather than the strength of links, we study physical distance between individuals when a link is activated. Using epidemics as an example application, we show that changing our definition of what constitutes a social tie, based on the distance between pairs of individuals, leads to strong structural differences in the resulting networks, and we quantify those differences.

The findings presented here are based on a dataset of proximity events in a population of approximately 500 students at the Technical University of Denmark³. These students are densely interconnected via networks of interactions, both virtual (Facebook, calls, texts) and based on physical proximity (both within the university campus and outside). The full dataset – known as the Copenhagen Networks Study – contains two years of high-resolution records of students’ activity (the aforementioned networks along with GPS location and questionnaires), collected primarily through smartphones distributed to students at the beginning of their university education. Here, we explore the dynamic network where every person is represented by a node, and two nodes are connected if they are within a certain physical distance d of each other. While this network is small from the perspective of population-level epidemiological studies, the access to physical proximity sampled at the 5-minute level provides a very detailed view of possible empirical spreading paths (see Table 1 for details).

Table 1 Overview of networks statistics.

To quantify the impact of physical proximity on the dynamic network, we use simulated epidemic spreading processes in two distinct networks of physical proximity. We consider the network of short-range interactions, defined as \(d\lesssim 1\) meter, and the long-range network, which includes all interactions with \(d\lesssim 10\) meters⁴. Below we show that the short-range and long-range networks are fundamentally different in terms of structure and dynamics.

The key novelty of this work arises from the fact that we are able to explore dynamics of two distinct types of spreading mechanisms (in many ways similar to e.g. droplet vs. airborne spreading mechanisms) based on the same underlying empirical behavioral data. Because we are able to consider two fundamentally distinct networks arising from a single underlying dataset, we can be certain that the differences in infection patterns are related solely to differences in how the disease is able to spread on each of the networks. This implies that differences in spreading patterns are not due to other differences in behavior that one might encounter when comparing two disparate datasets of actual human behavior, such as mobility, culture, population density, demographics, etc. Similarly, having both short- and long-range networks directly observable allows us to sidestep creation of synthetic networks via randomization schemes.

In the literature on physical proximity, the tacit rules of human interactions in physical space have been an object of interest since the 1950s^5,6,7. Yet little is known about how the structure of person-to-person proximity networks changes as we vary the distance between two individuals that is taken to define a connection between them. Previous research into proximity networks has been based on self-reported data^6,8,9 or tightly-controlled laboratory observation⁵.

We expect the social network of individuals to be closely related to the structure of the short-range network, but with some differences. This similarity arises because, in social networks, the difference between friend and stranger is typically expressed via different personal spaces for each social category⁶. Interactions with individuals with whom we are not familiar tend to occur at larger distances (we use the term ‘interaction’ for all proximity events, including those in the long-range network). Since people function in bounded spaces, however, we do not have complete freedom to only allow friends to be physically close to us. Rides on buses, random meetings in elevators, or busy dining halls force us to be in close proximity to strangers. Thus, while the majority of our proximity interactions are with friends and families, our interaction network is not fully explained by the underlying social network, as expressed by, for example, link strengths. The long-range network contains all of the links in the short-range network, but in addition also spurious connections to people passing by and the ‘familiar strangers’^10,11, those individuals we encounter repeatedly but have never gotten to know. Thus, considering the proximity of pairs engaging in interactions, and moving beyond simply considering the weight of the links in the network, provides a new source of information regarding potential spreading paths.

From the network science literature we know that social networks exhibit non-trivial structure on every level from degree distribution^12,13, over motifs^14,15, to communities^16,17,18, and at times even an overall hierarchical organization¹⁹. In light of the research on physical proximity discussed above, it is interesting to keep these key findings from the social networks literature in mind as we explore the differences between the short-range and long-range networks.

Results

The proximity networks are based on Bluetooth scans providing a measure of pairwise proximity between N = 464 highly-connected participants – freshmen students at a large university³. We define an interaction between users i, j in a 5-minute timebin t (the smartphones were configured to scan for nearby devices every 5 minutes) as γ_ijt = s, where the signal strength s is reported by the handsets as the received signal strength indicator (RSSI). Two users are considered to be interacting within a given timebin if their phones registered each other at least once in that timebin, regardless of the reported signal strength. This densely-connected dynamic network of all Bluetooth interactions is based on a total of 1 472 094 interactions, taking place over 28 days. RSSI, measured in dBm, is defined as the observed signal power relative to 1 mW.

The long-range, sampled long-range, and short-range network

The long-range network is created by interactions occurring at any distance covered by Bluetooth range, between 0 and 10–15 meters. In order to capture only close-range interactions, we establish the short-range network by selecting the subset of interactions with γ_ijt ≥ −75 dBm, corresponding to distances of approximately 1 meter or less⁴ (see Supplementary Information for additional details on the choice of threshold). The short-range network consists of f = 18.3% of all interactions.
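The selection of short-range interactions can be sketched as a simple filter on RSSI. This is a minimal illustration, not the authors' code: the `Interaction` record, its field names, and the toy readings are assumptions made for the example; only the −75 dBm threshold comes from the text.

```python
from typing import NamedTuple

class Interaction(NamedTuple):
    # Hypothetical record layout, chosen for illustration only.
    i: int        # first participant id
    j: int        # second participant id
    t: int        # index of the 5-minute timebin
    rssi: float   # received signal strength in dBm

SHORT_RANGE_THRESHOLD_DBM = -75.0  # threshold stated in the text

def short_range(interactions):
    """Keep only interactions whose RSSI is at or above the threshold."""
    return [x for x in interactions if x.rssi >= SHORT_RANGE_THRESHOLD_DBM]

# Toy data (invented): two strong readings and two weak ones.
interactions = [
    Interaction(1, 2, 0, -60.0),
    Interaction(1, 3, 0, -88.0),
    Interaction(2, 3, 1, -75.0),
    Interaction(1, 2, 1, -91.0),
]
close = short_range(interactions)
fraction = len(close) / len(interactions)  # the analogue of f in the text
```

In the toy data the retained fraction plays the role of f; in the study, f is measured from the full set of interactions.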

Since the short-range network contains only a fraction of all interactions, the simulated spreading processes taking place on this network are trivially slower and smaller than processes occurring on the long-range network. The intuitive reason for this is that with an average of one fifth of the interactions, a node in the short-range network has correspondingly fewer opportunities for spreading a disease than in the long-range network. The difference in number of interactions therefore prevents us from directly comparing the interplay between structure and dynamics of spreading processes for the short-range and long-range networks using simulated disease models with the same parameters.

In order to be able to compare directly, we create a sampled long-range network, which contains the same fraction of interactions as the short-range network, but chosen at random among all interactions (see Fig. 1a). As we argue below, the sampled long-range network thus contains both close and distant interactions and shares most topological properties with the full long-range network, while being based on precisely the same number of interactions as the short-range network.
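A sampled long-range network of this kind can be sketched by drawing a fixed number of interactions uniformly at random, without replacement. The function name, the tuple layout `(i, j, t)`, and the toy data are assumptions for illustration; the source does not specify the implementation.

```python
import random

def sample_interactions(interactions, n_keep, seed=None):
    """Draw n_keep interactions uniformly at random, without replacement."""
    rng = random.Random(seed)  # a seed makes a realization reproducible
    return rng.sample(interactions, n_keep)

# Toy long-range network: (i, j, t) tuples, invented for the example.
long_range = [(1, 2, t) for t in range(6)] + [(2, 3, t) for t in range(4)]
n_short = 3  # number of interactions in a (hypothetical) short-range network

sampled = sample_interactions(long_range, n_short, seed=0)
```

Repeating the draw with different seeds gives independent realizations, which is how quantities averaged over many realizations can be estimated.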

Link weights in the three networks

We start our analysis by studying similarities and differences in the distribution of link weights between the three networks (long-range, sampled long-range, and short-range). For each of the networks, we calculate the weights as described below, using the long-range network as an example. We first create an adjacency matrix A_i×j×t with timebins t containing interactions aggregated over 5-minute intervals corresponding to the Bluetooth scanning rate. This matrix has entries a_ijt = 1 when an interaction is present and a_ijt = 0 otherwise. The weight w_ij of a link connecting two individuals is defined as the total number of interactions occurring on that link, \({w}_{ij}={\sum }_{t}\,{a}_{ijt}\). Note that because the sampled long-range network is generated by sampling interactions at random from the full network, it is possible to calculate the weight distribution for this network analytically.
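The weight definition w_ij = Σ_t a_ijt can be sketched as a count of distinct timebins per undirected pair. The input layout, an iterable of `(i, j, t)` tuples, is an assumption for this example.

```python
from collections import Counter

def link_weights(interactions):
    """Return w_ij: the number of timebins in which the pair (i, j) interacted.

    interactions: iterable of (i, j, t) tuples. Pairs are treated as
    undirected, and repeated records in one timebin count once (a_ijt = 1).
    """
    present = {(min(i, j), max(i, j), t) for i, j, t in interactions}
    return Counter((i, j) for i, j, _ in present)

# Toy data (invented): pair (1, 2) meets in three timebins, (2, 3) in one.
toy = [(1, 2, 0), (2, 1, 0), (1, 2, 1), (1, 2, 5), (3, 2, 1)]
weights = link_weights(toy)
```

The number of links is then `len(weights)` and the total number of interactions is `sum(weights.values())`.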

We use a number of closely related (but distinct) terms to describe connections between pairs of individuals. A quick overview of the terms follows.

Interaction: A single measurement of proximity between a pair of individuals.

Signal strength: The RSSI measured by a smartphone for a single interaction. The signal strength can be considered a measure of distance.

Link: An abstract description of the connection between two individuals; a link implies at least one interaction. Links are sometimes denoted ties or connections in the literature.

Weight: The number of interactions observed on a given link; sometimes called strength in the literature.

As shown in Fig. 1b, the distribution of link-weights in all three networks is broad, with many weak links (containing few interactions) and a small number of links of very high weight. The short-range network and the sampled long-range network contain the same number of interactions, but the number of resulting links in the two networks is strikingly different. The approximately 1.4 million interactions in the full long-range network are distributed across 42 838 links, resulting in an average link-weight of a little over 34 interactions per link. We create the short-range and sampled long-range networks by removing 81.7% of the interactions from the full long-range network, leaving 269 094 interactions in both of these networks. The resulting number of links is much higher in the sampled long-range network. Averaged over 100 realizations, this network has 26 511 ± 68 links, corresponding to around 61.9% of the links in the full long-range network. In contrast, the short-range network has only 13 474 links, corresponding to only 31.5% of the links in the full long-range network. These differences are illustrated in the Fig. 1b inset.

Let us investigate these differences with respect to link weight in further detail. First, let us consider the weakest links. In terms of low-weight links, the sampled long-range network simply retains around f = 18.3% of the long-range network’s links, with small differences. The reason for these differences can be understood by considering links with weight 1. Of course, (100–18.3)% of links with weight 1 are removed, but the sampling process also creates new links of weight 1 by down-sampling the weight of some number of links with weight 2, 3, etc. In the short-range network a much higher fraction of links with weight 1 is removed: this network has about half as many links with weight 1 as we find in the sampled long-range network.
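The down-sampling argument can be made concrete with a binomial sketch. It assumes each interaction is kept independently with probability f, which only approximates drawing a fixed number of interactions without replacement; the function names are hypothetical.

```python
from math import comb

def thinned_weight_pmf(w, k, f):
    """P(a link of original weight w keeps exactly k interactions),
    assuming each interaction is kept independently with probability f."""
    if k < 0 or k > w:
        return 0.0
    return comb(w, k) * f**k * (1 - f) ** (w - k)

def survival_probability(w, f):
    """P(a link of original weight w keeps at least one interaction)."""
    return 1.0 - thinned_weight_pmf(w, 0, f)

f = 0.183  # sampling fraction from the text
p_keep_weight1 = thinned_weight_pmf(1, 1, f)        # equals f
p_weight2_becomes_1 = thinned_weight_pmf(2, 1, f)   # equals 2 f (1 - f)
```

Under this approximation, a weight-1 link survives with probability f, while links of weight 2 and above feed new weight-1 links into the sampled network. This is the mechanism behind the small deviations from f described above.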

Now, considering high-weight links, we find that these links in the short-range network are relatively unaffected by removing interactions according to physical distance: in the short-range network the highest-weight links typically maintain ~80% of their interactions. This is in stark contrast to the sampled long-range network, where link-weight is depleted in proportion to the sampling fraction, and high-weight links maintain only ~18% of the interactions from the full long-range network.

In summary, the weight distribution in the short-range network suggests that friends (with high-weight links) tend to be physically close and that most low-weight links correspond to random encounters (encounters between strangers), consistent with results on interaction distance from both quantitative measurements⁴ as well as sociology⁶.

Differences in local structure

The key comparison is between the short-range network and the two long-range networks. Since our sampling is uniform over interactions, we expect the sampled long-range network to be structurally very similar to the full long-range network, with weights decreased in proportion to the down-sampling fraction. As we discuss above, however, many low-weight links disappear as part of the sampling process, and the overall network structure is complex, reflecting non-trivial and highly correlated underlying social behaviors. Therefore, it is useful to quantitatively confirm that the structures of the long-range and sampled long-range networks remain remarkably similar – and distinct from the short-range network.

Starting from the single node perspective, we find important differences between the short-range and the long-range networks. We can quantify this difference using the Shannon entropy. For a node i, we start from a link with neighbor j with weight w_ij and define \(\pi ({w}_{ij})={w}_{ij}/{\sum }_{k}\,{w}_{ik}\) to mean the fraction of the node’s total interactions taking place on that link. Now, we define the node entropy as \(S(i)=-\,{\sum }_{j}\,\pi ({w}_{ij})\,{\mathrm{log}}_{2}\,\pi ({w}_{ij})\). Since infection probability is approximately proportional to link weight (see SI), this quantity can be interpreted as the expected number of yes/no questions needed to establish which of i’s links caused an infection. The distribution of entropy for all three networks is plotted in Fig. 2a. For the short-range network (blue), the distribution peaks at 4 bits, corresponding to an effective group of 2⁴ = 16 potential sources of infection. Comparing the long-range (green) and sampled long-range (orange) networks, we find as expected that the distributions of node entropies are very similar, emphasizing the structural similarity between these two networks. The distribution for the sampled long-range network is created by averaging per-user entropy values over 100 random realizations of the sampled long-range network. Both peak at around 6 bits, corresponding to a larger effective group of 2⁶ = 64 potential sources of infection in these networks.
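The node entropy S(i) defined above can be computed directly from a node's link weights. The sketch below takes the list of weights w_ij for one node; the function name is an assumption. A node with 16 equally weighted links gives 4 bits, matching the "effective group of 16" reading in the text.

```python
import math

def node_entropy(weights):
    """Shannon entropy (in bits) of how a node's interactions are
    distributed over its links: S = -sum_j pi_j * log2(pi_j)."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

uniform_16 = node_entropy([1] * 16)      # 16 equally weighted links
one_dominant = node_entropy([97, 1, 1, 1])  # interactions concentrated on one link
```

A node whose interactions are concentrated on a few strong links has lower entropy than a node with the same number of links but evenly spread interactions.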

These results provide a striking illustration of how the close proximity zone is preferentially reserved for strong ties (e.g. friends or acquaintances) while the distant zone is a more public space where many more random interactions happen, resulting in a correlation between physical proximity and tie strength as reported in ref. ⁹.

Meso-level structural differences

In the previous section we showed that in the short-range network a large fraction of interactions takes place on high-weight links. We now study the interplay between meso-level network structure and link-weight in the short-range and long-range networks. Specifically we are interested in the structures formed by the highest weight links. To explore these, we start building the networks from empty, adding their respective strongest links one-by-one. As links are added, we keep track of the number of connected components in the network as well as total weight of interactions added through the links, revealing the differences in the networks with respect to the structures created by the heaviest links.
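The link-addition procedure can be sketched with a union-find structure. Here components are counted only among nodes that already have at least one link, so the count first rises as isolated pairs appear and then falls as they merge. That counting convention, the function name, and the toy weights are assumptions for illustration.

```python
def component_growth(weights):
    """Add links from heaviest to lightest; after each addition record
    (number of connected components among linked nodes, total weight added).

    weights: dict mapping an undirected pair (u, v) to its weight w_uv.
    """
    parent = {}

    def find(x):
        # Find the component root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components, total, history = 0, 0, []
    ordered = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), w in ordered:
        for node in (u, v):
            if node not in parent:      # first time this node gets a link
                parent[node] = node
                components += 1
        ru, rv = find(u), find(v)
        if ru != rv:                    # the link merges two components
            parent[ru] = rv
            components -= 1
        total += w
        history.append((components, total))
    return history

# Toy network (invented weights).
history = component_growth({(1, 2): 10, (3, 4): 8, (2, 3): 5, (1, 4): 1})
```

In the toy example, the two heaviest links form separate pairs, the third link bridges them, and the last link adds weight without changing the component count.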

Figure 2b illustrates how the process of adding links gradually grows the long-range and short-range networks, respectively. In the lower panel of Fig. 2b we show the number of connected components and the total number of interactions in the networks as the links are added. First, notice that the full and sampled long-range networks display identical behavior, with the number of neighborhoods peaking with approximately 120 strongest links added. This behavior is consistent across 100 random realizations of the sampled network. This is in contrast to the short-range network, where the number of components continues to grow up to 240 heaviest links in the network.

In both types of networks, the strongest links in the network first create small isolated neighborhoods of highly interacting nodes. Figure 2b (upper panel) shows snapshots of the sampled long-range (orange) and short-range (blue) networks at points (120, 250, 300 links) indicated on the plot below, illustrating this point. We see that at 250 strongest links in the long-range network, a large connected component is beginning to form, making the network significantly more connected. At this point, the short-range network, however, is still divided into many small neighborhoods. We also note that while the x-axis indicates the absolute number of the heaviest links added to the networks, the total number of interactions included in the networks at any number of links is strikingly different. In fact, it is important to underscore just how large a fraction of interactions is concentrated on the high-weight links. The short-range network has a total of 13 474 links and the sampled long-range network has ~26 500 links. Figure 2b (bottom panel), however, shows that in the short-range network the 250 strongest links in the network account for approximately 50% of the interactions. In the long-range network the picture is less skewed. Here, the top 250 links account for approximately 25% of the interactions. Thus, while the percolation transition occurs for a very small number of high-weight links in both networks, these links include a large fraction of the total number of interactions.

Our analysis shows, therefore, that the short-range network not only contains fewer links than the sampled long-range network, but that the configuration of the heaviest links is more fragmented than in the long-range case. This structural property of the short-range network, the highly-connected neighborhoods bridged by weak ties, is consistent with well known structures found in other social networks, such as mobile phone networks and online social networks^17,20,21,22. In the long-range network, however, this structure is less pronounced, obscured by the presence of spurious links, distinct communities bridged by a small number of strong links not present in the short-range network.

Spreading process is captured in neighborhoods

Having investigated differences between short- and long-range networks with respect to structure, we now explore the differences in how diseases spread on the networks. Using a simple Susceptible-Infected-Recovered (SIR) model, we run simulations of a disease spreading across the networks. Our model is intentionally simplistic, intended to illustrate the structural differences between short- and full-range transmission, rather than emulate a specific disease. We use the actual temporal sequence of proximity interactions observed in the data, choosing parameter values to create a situation where large outbreaks are likely, but not guaranteed (see Methods for details of the epidemic modeling). While we report results for a specific choice of parameters and a single realization of the sampled long-range network, these results are robust across a wide range of values of the transmission parameters and realizations of the sampled network.

Based on the structural analysis, our hypothesis is that, in the short-range network, the simulated pathogen tends to be more contained within small sets of highly interacting individuals. We quantify the contained-in-communities behavior as follows. For each infection event, occurring on link w_ij, where node i infects node j, we measure which fraction I_j of the node’s direct (1-hop) neighborhood has already been infected. Since this is a weighted network, we define \({I}_{j}={W}_{\{-i\}}^{-1}\,{\sum }_{k\in {\mathcal I} (j),k\ne i}\,{w}_{jk}\), where \( {\mathcal I} (j)\) is the set of j’s infected neighbors and \({W}_{\{-i\}}={\sum }_{k\ne i}\,{w}_{jk}\) is the sum of all weights excluding the infecting link. A value of I_j = 0 indicates that no-one in the direct neighborhood besides the infecting node has been yet infected; a value of I_j = 0.5 indicates that neighbors accounting for 50% of link weights connecting to j have already been infected. Figure 3a shows a kernel density estimation of I as a function of the fraction of infected nodes, based on 500 runs of the spreading process in the short-range (left), sampled long-range (middle), and long-range (right) networks.
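The neighborhood infection fraction I_j can be sketched as a weighted share of already-infected neighbors, excluding the infecting link. The argument layout (a dict from neighbor to w_jk) and the choice to return 0.0 when j has no other neighbors are assumptions for this example; the source does not state how that edge case is handled.

```python
def neighborhood_infection(neighbor_weights, infector, infected):
    """Weighted fraction I_j of j's neighborhood that is already infected.

    neighbor_weights: dict mapping each neighbor k of j to the weight w_jk.
    infector: the node i that infects j (its link is excluded).
    infected: set of nodes infected so far.
    """
    others = {k: w for k, w in neighbor_weights.items() if k != infector}
    total = sum(others.values())          # W_{-i}
    if total == 0:
        return 0.0                        # assumption: no other neighbors
    hit = sum(w for k, w in others.items() if k in infected)
    return hit / total

# Toy neighborhood of j (invented weights): i is the infecting node.
w_j = {"i": 10, "a": 6, "b": 3, "c": 1}
I_j = neighborhood_infection(w_j, infector="i", infected={"i", "a"})
```

In the toy example, neighbor "a" carries 6 of the 10 units of weight not on the infecting link, so I_j is 0.6.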

In the case of the short-range network, we observe behavior which suggests that the spreading agent is indeed slowed by neighborhoods, consistent with the behavior of both simulated and real spreading processes found in the literature^23,24,25,26,27. As is evident from Fig. 3a, early in the epidemic outbreak, when the fraction of infected nodes is low, the disease agent can saturate small neighborhoods and infect new nodes in neighborhoods where a large fraction (I > 0.80) of neighbors are already infected. Conversely, it is still possible to find neighborhoods with a low fraction (I < 0.20) of infected nodes very late in the outbreak. These effects are possible because the spreading agent does not jump easily between neighborhoods of densely connected nodes.

The disease spreading is very different in the full and sampled long-range cases. In contrast to the contained-in-communities picture, the infection progresses smoothly through the network. In the long-range networks, the neighborhood infection is more closely proportional to the fraction F of the total network infected. Cuts at particular levels of overall network infection F in Fig. 3b show that the pattern of more spread-out I in the short-range network is consistent through the spreading progression and across random starting conditions (seed node and time). Visually, the distributions of I at given F are narrower for the long-range networks, with peak values of neighborhood infection I closer to values of overall network infection F. To quantify this effect, we consider the distribution of R² of a linear model fitting infection of the neighborhoods I to the progress of the infection (fraction of network infected F), calculated for each of the aforementioned 500 realizations of an epidemic. The distribution of R² peaks at around 0.4 in the short-range network vs 0.75 in the two long-range networks, as shown in Fig. 3c. This indicates that direct proportionality between the global (F) and local (I) infection level is a significantly better model for the long-range networks.
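The R² comparison can be sketched with an ordinary least-squares fit of I against F for one simulated epidemic. Using a fit with an intercept is an assumption here; the source only says "a linear model", and the data below are invented.

```python
def r_squared(F, I):
    """R^2 of an ordinary least-squares line I ~ a + b * F."""
    n = len(F)
    mean_f = sum(F) / n
    mean_i = sum(I) / n
    sxx = sum((f - mean_f) ** 2 for f in F)
    sxy = sum((f - mean_f) * (y - mean_i) for f, y in zip(F, I))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_f
    ss_res = sum((y - (intercept + slope * f)) ** 2 for f, y in zip(F, I))
    ss_tot = sum((y - mean_i) ** 2 for y in I)
    return 1.0 - ss_res / ss_tot

# Invented (F, I) pairs: one run where I tracks F, one where it scatters.
F = [0.1, 0.2, 0.3, 0.4, 0.5]
tracking = r_squared(F, [0.12, 0.19, 0.31, 0.42, 0.49])
scattered = r_squared(F, [0.80, 0.05, 0.90, 0.10, 0.60])
```

A run in which neighborhood infection follows the global infection level yields R² close to 1, while a run with widely scattered I yields a much lower value.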

Thus we find that, while the infection in the short-range network tends to be captured inside closely connected communities, the picture is quite different in the long-range network. While both types of behavior have been described in the literature^8,23,24,25,26,27,28, the important finding in this context is that the two networks are representations of the same underlying behavioral data originating from a single population. These findings underscore how long-range spreading dramatically taps into spurious connections outside the social networks, resulting in fundamentally different types of spreading – in some ways mimicking the differences between droplet and airborne spreading mechanisms^29,30,31,32.

Community structure increases infected-infected interactions

Our analysis of link weights showed that the short-range network tends to have fewer links with more interactions on each link. But why is the disease trapped within communities in the first place? One of the reasons that an infection remains ‘stuck’ in a neighborhood is that a disease can only spread via interactions between infected and susceptible nodes. Thus, if a local group is fully infected, we tend to see a large fraction of infected-infected interactions, which cannot help spread the disease. In Fig. 4a we quantify this tendency by plotting how frequently infected-infected links are active in the sampled long-range and short-range networks, respectively.

We observe a clear difference between the two networks. In the sampled long-range network, where the local connection patterns have high entropy, there is only a low level of activity among infected or recovered individuals. The spreading agent quickly reaches the entire network due to a large number of available susceptible-infected links. This behavior is in contrast to the short-range network, where infected-infected interactions represent a larger fraction of interaction events. Thus, as above, given the same number of interactions and the same underlying behavioral data, outbreaks are significantly slower and more contained in the short-range network relative to the sampled long-range case (Fig. 4b).

Statistics of spreading outcomes

Finally, in Fig. 5 we summarize a number of statistics related to disease spreading in the three networks. These results confirm that the structural differences between the short-range and long-range interaction networks discussed above lead to reliably different outcomes in simulated epidemics. Firstly, in Fig. 5a, we show that when outbreaks do happen in the short-range network, they are smaller in terms of the total number of nodes infected. Moreover, the probability that an outbreak is contained – reaching only a small fraction of the network (<20%) – is higher in the short-range network than in the long-range networks (Fig. 5a inset). Finally, the time an infection needs to reach 50% of the short-range network is significantly longer, with the peak of the distribution for the sampled long-range network occurring after 7 days, while in the short-range network the peak is delayed to 10 days (Fig. 5).

Thus, consistent with the literature, short-range interactions are organized in a way that slows down spreading relative to the long-range case. The sampled long-range network features precisely the same number of interactions as the short-range network, but is structurally more similar to the full long-range network according to the measures considered here. Our results show that taking the physical distance of interactions into account results in networks that can significantly alter the outcome of a simulated outbreak. The qualitative behavior described above is reproduced across a wide range of parameter values.

Discussion

We have demonstrated a strong structural difference between the short-range networks that support short-range transmission processes and the long-range networks that support transmission across distances up to 10 meters. Summarizing our findings, we find that the proximity of interactions correlates with link-weight: on average we stay closer to our friends. In the short-range network, we find spreading patterns consistent with our knowledge of spreading on various online social networks and modeling studies^23,24,25,26,27. In the long-range network we observe a large proportion of proximity interactions between individuals with weak or absent social ties, resulting in a complex local network structure. This non-social ‘noise’ in the network allows for faster and more powerful outbreaks to take place, even when considering exactly the same number of interactions, consistent with results of synthetic proximity-aware spreading simulations³³.

It is, of course, well known that the definition of ‘interaction’ impacts the network structure and spreading dynamics. For example, networks of sexual contacts are analyzed separately from other types of pathogen spread^34,35, even though both types of networks are physical interaction networks. A central work in understanding the role of physical proximity is by Read et al.⁸, where questionnaire data regarding ‘close’ and ‘distant’ interactions were collected from 49 participants over 14 non-consecutive days. This study, however, did not address how differences in mode of transmission can affect the network of infections. Recently, a multitude of new approaches have been developed for collecting data regarding close interactions with the purpose of modeling spreading using various methods, including Bluetooth, RFID, and questionnaires^8,28,36,37,38,39.

Here we argue that from the perspective of a spreading agent, the relatively subtle difference of what ‘interaction’ is in the short-range and long-range networks makes an important difference, even given the same underlying social system. Our results suggest that long-range spreading is less related to the underlying social network and closer to a well-mixed system than simulations on purely social structures might lead one to suggest.

Methods

The dataset

The dataset used in this paper comes from the Copenhagen Networks Study³. We use one month of data (February 2014). Out of 696 freshmen student participants active in that month, we chose students with at least 60% of Bluetooth observations present (resulting in a median of 80%) and who belong to a single connected component. Observations are defined as 5-minute bins in which the user has performed scans, whether the scans contained any devices or not. Since Bluetooth scans do not result in false positives, we symmetrized the observation matrix (resulting in an undirected network), assuming that \({\gamma }_{ijt}\iff {\gamma }_{jit}\). This results in improved data coverage, with a median of 85% of 5-minute bins containing data. More information regarding the dataset is provided in the Supplementary Information.

RSSI and interaction distance

The Received Signal Strength Indicator (RSSI) can be used to estimate the distance between wireless devices⁴⁰. Sekara & Lehmann⁴ showed the stability of RSSI in modern mobile phones; the same phones were used in the Copenhagen Networks Study. Based on these results, we use γ_ijt = RSSI ≥ −75 dBm as an indicator that an interaction was closer than 1 meter. This value can be considered a conservative estimate, as the measurements in ref.⁴ were performed without obstacles. Thus, we expect that γ_ijt ≥ −75 dBm may not include all the close interactions, but it should not include distant interactions. When the interaction matrix is symmetrized, we take the smallest distance (largest RSSI) that occurred between users in a given time bin: γ_ijt = γ_jit = min(γ_ijt,γ_jit).
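As a hedged illustration of the thresholding described above, the sketch below splits a set of symmetrized RSSI readings into the short-range and long-range networks. The dictionary layout and the names `symmetrize_rssi` and `split_networks` are assumptions, not part of the study's code. Following the prose ("smallest distance (largest RSSI)"), the sketch keeps the maximum RSSI per pair and time bin, which is the same as taking the minimum over the corresponding distances.

```python
from typing import Dict, Set, Tuple

SHORT_RANGE_RSSI_DBM = -75  # threshold stated in the text (closer than ~1 m)

Directed = Tuple[int, int, int]  # (observer i, observed j, time bin t)
Pair = Tuple[int, int, int]      # (min(i, j), max(i, j), time bin t)


def symmetrize_rssi(rssi: Dict[Directed, float]) -> Dict[Pair, float]:
    """Keep the strongest reading (smallest distance) per pair and time bin."""
    strongest: Dict[Pair, float] = {}
    for (i, j, t), dbm in rssi.items():
        key = (min(i, j), max(i, j), t)
        strongest[key] = max(dbm, strongest.get(key, dbm))
    return strongest


def split_networks(strongest: Dict[Pair, float]) -> Tuple[Set[Pair], Set[Pair]]:
    """Return (short_range, long_range) sets of temporal links."""
    long_range = set(strongest)  # every observed interaction
    short_range = {k for k, dbm in strongest.items()
                   if dbm >= SHORT_RANGE_RSSI_DBM}
    return short_range, long_range
```

By construction the short-range links are a subset of the long-range links, which matches the statement that both networks arise from the same underlying data.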

We note that the approach presented here has some limitations. While all mobile phones used for data collection in the study were the same model and the obtained RSSI values are comparable in this sense, it is important to emphasize that our distance threshold is noisy; RSSI may differ depending on where the phone is placed, environmental conditions, etc. In that sense, our results can be considered a lower bound of the difference between the two types of networks, since a perfectly noisy threshold would produce two randomly sampled networks with no difference between them.

Epidemic simulations

To show the dynamics of the spreading process in the short-range (droplet) and long-range (airborne) networks, we use a simple Susceptible-Infected-Recovered (SIR) simulation. We run a large number of simulations (N = 10 000) on the full temporal network, where every interaction between Infected and Susceptible participants can lead to infection with probability β = 0.02. Users stay in the Infected state for μ_t = 7 days, after which they are moved to the Recovered state and cannot be re-infected. The starting time bin and seed node are chosen at random in every simulation and used for simulation on all three networks (long-range, sampled long-range, and short-range). We use one month of data (28 days, 8 064 5-minute time bins) with periodic boundary conditions, having the 28 days repeat indefinitely. The parameter values are chosen so that outbreaks are likely but not guaranteed, and with sizes that do not trivially saturate the entire network. The parameters themselves, as well as the resulting epidemic curves (with peaks between 7 and 14 days), are consistent with those reported in the literature regarding both simulated and observed flu outbreaks^8,37,41. This model is intentionally simplistic, intended to illustrate the structural differences between long- and short-range transmission rather than to emulate a specific disease. The qualitative behavior of our analysis is unchanged across a wide range of parameter values.
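A single run of such a simulation can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the contact layout (a dictionary from time bin to a list of undirected pairs), the function name `run_sir`, and the choice to update states once per time bin are all hypothetical. The default values β = 0.02, 7 infectious days, and 8 064 bins are taken from the text.

```python
import random
from typing import Dict, List, Optional, Tuple

BINS_PER_DAY = 288           # 5-minute bins in one day
N_BINS = 28 * BINS_PER_DAY   # 8 064 bins in the 28-day month


def run_sir(contacts: Dict[int, List[Tuple[int, int]]], seed: int, start: int,
            beta: float = 0.02, infectious_days: int = 7,
            n_bins: int = N_BINS,
            rng: Optional[random.Random] = None) -> int:
    """Run one SIR outbreak on a temporal network; return the final size."""
    rng = rng or random.Random()
    duration = infectious_days * BINS_PER_DAY
    infected_until = {seed: duration}  # node -> step at which it recovers
    recovered = set()
    step = 0
    while infected_until:
        t = (start + step) % n_bins    # periodic boundary: the month repeats
        newly_infected = []
        for i, j in contacts.get(t, ()):
            for src, dst in ((i, j), (j, i)):
                susceptible = dst not in infected_until and dst not in recovered
                if src in infected_until and susceptible and rng.random() < beta:
                    newly_infected.append(dst)
        step += 1
        for node in newly_infected:    # infectious from the next bin onward
            infected_until.setdefault(node, step + duration)
        for node in [n for n, end in infected_until.items() if end <= step]:
            del infected_until[node]
            recovered.add(node)
    return len(recovered)
```

To compare networks as described above, one would draw `seed` and `start` once per simulation and pass the same values to `run_sir` for the long-range, sampled long-range, and short-range contact dictionaries.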

References

1. Lazer, D. et al. Life in the network: the coming age of computational social science. Science 323, 721 (2009).
2. Eagle, N., Pentland, A. S. & Lazer, D. Inferring friendship network structure by using mobile phone data. Proc. Natl. Acad. Sci. 106, 15274–15278 (2009).
3. Stopczynski, A. et al. Measuring large-scale social networks with high resolution. PLoS One 9, e95978, https://doi.org/10.1371/journal.pone.0095978 (2014).
4. Sekara, V. & Lehmann, S. The strength of friendship ties in proximity sensor data. PLoS One 9, e100915 (2014).
5. Sommer, R. Studies in personal space. Sociometry 247–260 (1959).
6. Hall, E. T. The hidden dimension (Anchor Books, New York, 1969).
7. Eagle, N. & Pentland, A. Reality mining: sensing complex social systems. Pers. Ubiquitous Computing 10, 255–268 (2006).
8. Read, J. M., Eames, K. T. & Edmunds, W. J. Dynamic social networks and the implications for the spread of infectious disease. J. Royal Soc. Interface 5, 1001–1007 (2008).
9. Mossong, J. et al. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Medicine 5, e74 (2008).
10. Milgram, S., Sabini, J. E. & Silver, M. E. The individual in a social world: essays and experiments (McGraw-Hill Book Company, 1992).
11. Sun, L., Axhausen, K. W., Lee, D.-H. & Huang, X. Understanding metropolitan patterns of daily encounters. Proc. Natl. Acad. Sci. 110, 13774–13779 (2013).
12. Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
13. Eubank, S. et al. Modelling disease outbreaks in realistic urban social networks. Nature 429, 180–184 (2004).
14. Milo, R. et al. Network motifs: simple building blocks of complex networks. Science 298, 824–827 (2002).
15. Szell, M., Lambiotte, R. & Thurner, S. Multirelational organization of large-scale social networks in an online world. Proc. Natl. Acad. Sci. 107, 13636–13641 (2010).
16. Girvan, M. & Newman, M. E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99, 7821–7826 (2002).
17. Onnela, J.-P. et al. Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. 104, 7332–7336 (2007).
18. Fortunato, S. Community detection in graphs. Phys. Reports 486, 75–174 (2010).
19. Ahn, Y.-Y., Bagrow, J. P. & Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 466, 761–764 (2010).
20. Bakshy, E., Rosenn, I., Marlow, C. & Adamic, L. The role of social networks in information diffusion. In Proceedings of the 21st international conference on World Wide Web, 519–528 (ACM, 2012).
21. Bagrow, J. P., Lehmann, S. & Ahn, Y.-Y. Robustness and modular structure in networks. Netw. Sci. 3, 509–525 (2015).
22. Scott, J. Social network analysis (Sage, 2017).
23. Fowler, J. H. & Christakis, N. A. Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study. BMJ 337, a2338 (2008).
24. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P. Community structure in time-dependent, multiscale, and multiplex networks. Science 328, 876–878 (2010).
25. Salathé, M. & Jones, J. H. Dynamics and control of diseases in networks with community structure. PLoS Computational Biology 6, e1000736 (2010).
26. Karsai, M. et al. Small but slow world: How network topology and burstiness slow down spreading. Phys. Rev. E 83, 025102 (2011).
27. Cauchemez, S. et al. Role of social networks in shaping disease transmission during a community outbreak of 2009 H1N1 pandemic influenza. Proc. Natl. Acad. Sci. 108, 2825–2830 (2011).
28. Xiao, X., van Hoek, A. J., Kenward, M. G., Melegaro, A. & Jit, M. Clustering of contacts relevant to the spread of infectious disease. Epidemics 17, 1–9 (2016).
29. Garner, J. S. Guideline for isolation precautions in hospitals. Infect. Control 17, 54–80 (1996).
30. Weinstein, R. A., Bridges, C. B., Kuehnert, M. J. & Hall, C. B. Transmission of influenza: implications for control in health care settings. Clin. Infectious Diseases 37, 1094–1101 (2003).
31. Klovdahl, A. S. et al. Networks and tuberculosis: an undetected community outbreak involving public places. Soc. Science & Medicine 52, 681–694 (2001).
32. Liverman, C. T. et al. Preparing for an influenza pandemic: Personal protective equipment for healthcare workers (National Academies Press, 2007).
33. Sun, P., Cao, X.-B., Du, W.-B. & Chen, C.-L. The effect of geographical distance on epidemic spreading. Phys. Procedia 3, 1811–1818 (2010).
34. Liljeros, F., Edling, C. R., Amaral, L. A. N., Stanley, H. E. & Åberg, Y. The web of human sexual contacts. Nature 411, 907–908 (2001).
35. Rocha, L. E., Liljeros, F. & Holme, P. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Computational Biology 7, e1001109 (2011).
36. Salathé, M. et al. Digital epidemiology. PLoS Computational Biology 8, e1002616 (2012).
37. Salathé, M. et al. A high-resolution human contact network for infectious disease transmission. Proc. Natl. Acad. Sci. 107, 22020–22025 (2010).
38. Danon, L. et al. Networks and the epidemiology of infectious disease. Interdiscip. Perspectives on Infectious Diseases 2011 (2011).
39. Christakis, N. A. & Fowler, J. H. Social network sensors for early detection of contagious outbreaks. PLoS One 5, e12948 (2010).
40. Aamodt, K. CC2431 location engine. Appl. Note AN042 (Rev. 1.0), SWRA095, Tex. Instruments (2006).
41. Mills, C. E., Robins, J. M. & Lipsitch, M. Transmissibility of 1918 pandemic influenza. Nature 432, 904–906 (2004).


Acknowledgements

Thanks to Enys Mones, Vedran Sekara, and Piotr Sapiezynski for helpful discussions and Radu Gatej for technical assistance. This work was supported by the Villum Foundation [Young Investigator Program ‘High Resolution Networks’ grant (to S.L.)], The Independent Research Fund [Sapere Aude Program ‘Micro dynamics of influence in social systems’], and the University of Copenhagen (UCPH Excellence Programme for Interdisciplinary Research “Social Fabric” grant). Funders had no role in design of the study and collection, analysis, and interpretation of data, nor did funders have a role in writing the manuscript.

Author information

Authors and Affiliations

Technical University of Denmark, Lyngby, Denmark
Arkadiusz Stopczynski & Sune Lehmann
Media Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
Arkadiusz Stopczynski & Alex ‘Sandy’ Pentland
Niels Bohr Institute, University of Copenhagen, Copenhagen, Denmark
Sune Lehmann

Authors

Arkadiusz Stopczynski
Alex ‘Sandy’ Pentland
Sune Lehmann

Contributions

All authors designed the research. A.S. carried out the simulations. A.S. and S.L. drafted the paper. All authors read and approved of the final manuscript.

Corresponding author

Correspondence to Sune Lehmann.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Stopczynski, A., Pentland, A. S. & Lehmann, S. How Physical Proximity Shapes Complex Social Networks. Sci Rep 8, 17722 (2018). https://doi.org/10.1038/s41598-018-36116-6


Received: 24 July 2018
Accepted: 14 November 2018
Published: 07 December 2018
DOI: https://doi.org/10.1038/s41598-018-36116-6
