Abstract
Social interactions among humans create complex networks and – despite a recent increase of online communication – the interactions mediated through physical proximity remain a fundamental way for people to connect. A common way to quantify the nature of the links between individuals is to consider repeated interactions: frequently occurring interactions indicate strong ties, such as friendships, while ties with low weights can indicate random encounters. Here we focus on a different dimension: rather than the strength of links, we study physical distance between individuals when a link is activated. The findings presented here are based on a dataset of proximity events in a population of approximately 500 individuals. To quantify the impact of the physical proximity on the dynamic network, we use simulated epidemic spreading processes in two distinct networks of physical proximity. We consider the network of short-range interactions defined as \(d\lesssim 1\) meter, and the long-range network, which includes all interactions \(d\lesssim 10\) meters. Since these two networks arise from the same set of underlying behavioral data, we are able to quantitatively measure how the specific definition of the proximity network – short-range versus long-range – impacts the resulting network structure as well as spreading dynamics in epidemic simulations. We find that the short-range network – consistent with the literature – is characterized by densely-connected neighborhoods bridged by weak ties. More surprisingly, however, we show that spreading in the long-range network is quite different, mainly shaped by spurious interactions.
Introduction
Social interactions among humans form complex networks. While these interactions have recently begun to occur via many different channels – email, social networks, texts, and calls^{1} – the interactions mediated through physical proximity remain a fundamental way for people to connect^{2}. A common way to quantify the nature of a link is to consider repeated interactions: frequently occurring interactions indicate strong ties, such as friendships, while ties with small weights can indicate random encounters. Here we focus on a different dimension: rather than the strength of links, we study physical distance between individuals when a link is activated. Using epidemics as an example application, we show that changing our definition of what constitutes a social tie, based on the distance between pairs of individuals, leads to strong structural differences in the resulting networks, and we quantify those differences.
The findings presented here are based on a dataset of proximity events in a population of approximately 500 students at the Technical University of Denmark^{3}. These students are densely interconnected via networks of interactions, both virtual (Facebook, calls, texts) and based on physical proximity (both within the university campus and outside). The full dataset – known as the Copenhagen Networks Study – contains two years of high-resolution records of students’ activity (the aforementioned networks along with GPS location and questionnaires), collected primarily through smartphones distributed to students at the beginning of their university education. Here, we explore the dynamic network where every person is represented by a node, and two nodes are connected if they are within a certain physical distance d of each other. While this network is small from the perspective of population-level epidemiological studies, the access to physical proximity sampled at the 5-minute level provides a very detailed view of possible empirical spreading paths (see Table 1 for details).
To quantify the impact of the physical proximity on the dynamic network, we use simulated epidemic spreading processes in two distinct networks of physical proximity. We consider the network of short-range interactions defined as \(d\lesssim 1\) meter, and the long-range network which includes all interactions \(d\lesssim 10\) meters^{4}. Below we show that the short-range and long-range networks are fundamentally different in terms of structure and dynamics.
The key novelty of this work arises from the fact that we are able to explore dynamics of two distinct types of spreading mechanisms (in many ways similar to e.g. droplet vs. airborne spreading mechanisms) based on the same underlying empirical behavioral data. Because we are able to consider two fundamentally distinct networks arising from a single underlying dataset, we can be certain that the differences in infection patterns are related solely to differences in how the disease is able to spread on each of the networks. This implies that differences in spreading patterns are not due to other differences in behavior that one might encounter when comparing two disparate datasets of actual human behavior, such as mobility, culture, population density, demographics, etc. Similarly, having both short- and long-range networks directly observable allows us to sidestep creation of synthetic networks via randomization schemes.
In the literature on physical proximity, the tacit rules of human interactions in physical space have been an object of interest since the 1950s^{5,6,7}. Yet little is known about how the structure of person-to-person proximity networks changes as we vary the definition of which distance between two individuals corresponds to a connection between the two. Previous research into proximity networks has been based on self-reported data^{6,8,9} or tightly-controlled laboratory observation^{5}.
We expect the social network of individuals to be closely related to the structure of the short-range network, but with some differences. This similarity arises because, in social networks, the difference between friend and stranger is typically expressed via different personal spaces for each social category^{6}. Interactions with individuals with whom we are not familiar tend to occur at larger distances (we use the term ‘interaction’ for all proximity events, including those in the long-range network). Since people function in bounded spaces, however, we do not have complete freedom to only allow friends to be physically close to us. Rides on buses, random meetings in elevators, or busy dining halls force us to be in close proximity to strangers. Thus, while the majority of our proximity interactions are with friends and families, our interaction network is not fully explained by the underlying social network, as expressed by, for example, link strengths. The long-range network contains all of the links in the short-range network, but in addition also spurious connections to people passing by and the ‘familiar strangers’^{10,11}, those individuals we encounter repeatedly but have never gotten to know. Thus, considering the proximity of pairs engaging in interactions, and moving beyond simply considering the weight of the links in the network, provides a new source of information regarding potential spreading paths.
From the network science literature we know that social networks exhibit nontrivial structure on every level from degree distribution^{12,13}, over motifs^{14,15}, to communities^{16,17,18}, and at times even an overall hierarchical organization^{19}. In light of the research on physical proximity discussed above, it is interesting to keep these key findings from the social networks literature in mind as we explore the differences between the short-range and long-range networks.
Results
The proximity networks are based on Bluetooth scans providing a measure of pairwise proximity between N = 464 highly-connected participants – freshmen students at a large university^{3}. We define an interaction between users i, j in a 5-minute timebin t (the smartphones were configured to scan for nearby devices every 5 minutes) as γ_{ijt} = s, where the signal strength s is reported by the handsets as received signal strength indicator (RSSI). Two users are considered to be interacting within a given timebin if their phones registered each other at least once in that timebin, regardless of the reported signal strength. This densely-connected dynamic network of all Bluetooth interactions is based on a total of 1 472 094 interactions, taking place over 28 days. RSSI, measured in dBm, is defined as the observed signal power relative to 1 mW.
The long-range, sampled long-range, and short-range networks
The long-range network is created by interactions occurring at any distance covered by Bluetooth range, between 0 and 10–15 meters. In order to capture only close-range interactions, we establish the short-range network by selecting the subset of interactions with γ_{ijt} ≥ −75 dBm, corresponding to distances of approximately 1 meter or less^{4} (see Supplementary Information for additional details on the choice of threshold). The short-range network consists of f = 18.3% of all interactions.
Since the short-range network contains only a fraction of all interactions, the simulated spreading processes taking place on this network are trivially slower and smaller than processes occurring on the long-range network. The intuitive reason for this is that with an average of one fifth of the interactions, a node in the short-range network has correspondingly fewer opportunities of spreading a disease than in the long-range network. The difference in number of interactions therefore prevents us from directly comparing the interplay between structure and dynamics of spreading processes for the short-range and long-range networks using simulated disease models with the same parameters.
In order to be able to compare directly, we create a sampled long-range network, which contains the same fraction of interactions as the short-range network, but chosen at random among all interactions (see Fig. 1a). As we argue below, the sampled long-range network thus contains both close and distant interactions and shares most topological properties with the full long-range network, while being based on precisely the same number of interactions as the short-range network.
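The construction of the three networks can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the tuple layout `(i, j, t, rssi)`, the function name `build_networks`, and the toy data are assumptions made for the example; only the −75 dBm threshold and the idea of sampling as many interactions as survive the threshold come from the text.

```python
import random

def build_networks(interactions, rssi_threshold=-75, seed=0):
    """Split interactions into long-range, sampled long-range, and short-range sets.

    interactions: list of (i, j, t, rssi) tuples (hypothetical layout).
    """
    long_range = list(interactions)
    # Short-range: keep only interactions at or above the RSSI threshold.
    short_range = [e for e in long_range if e[3] >= rssi_threshold]
    # Sampled long-range: same number of interactions, chosen uniformly at
    # random (without replacement) regardless of signal strength.
    rng = random.Random(seed)
    sampled = rng.sample(long_range, len(short_range))
    return long_range, sampled, short_range

# Toy data, invented for illustration only.
toy = [(0, 1, 0, -60), (0, 1, 1, -62), (1, 2, 0, -88),
       (2, 3, 1, -91), (0, 3, 2, -70), (1, 3, 2, -80)]
long_range, sampled, short_range = build_networks(toy)
print(len(long_range), len(sampled), len(short_range))
```

Repeating the call with different `seed` values gives independent realizations of the sampled network, which is how averages over realizations could be obtained.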
Link weights in the three networks
We start our analysis by studying similarities and differences in the distribution of link weights between the three networks (long-range, sampled long-range, and short-range). For each of the networks, we calculate the weights as described below, using the long-range network as an example. We first create an adjacency matrix A_{i×j×t} with timebins t containing interactions aggregated over 5-minute intervals corresponding to the Bluetooth scanning rate. This matrix has entries a_{ijt} = 1 when an interaction is present and a_{ijt} = 0 otherwise. The weight w_{ij} of a link connecting two individuals is defined as the total number of interactions occurring on that link, \({w}_{ij}={\sum }_{t}\,{a}_{ijt}\). Note that because the sampled long-range network is generated by sampling interactions at random from the full network, it is possible to calculate the weight distribution for this network analytically.
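As a small illustration of the weight definition, the sketch below counts, for each undirected pair, the number of distinct timebins in which an interaction is present. The `(i, j, t)` tuple layout and the function name `link_weights` are assumptions made for this example.

```python
from collections import Counter

def link_weights(interactions):
    """Return {(i, j): w_ij} where w_ij is the number of timebins with a_ijt = 1.

    interactions: iterable of (i, j, t) tuples (hypothetical layout).
    """
    # a_ijt is binary and the network is undirected, so collapse duplicates
    # and both orderings of a pair within the same timebin.
    present = {(min(i, j), max(i, j), t) for i, j, t in interactions}
    weights = Counter()
    for i, j, _ in present:
        weights[(i, j)] += 1
    return weights

toy = [(0, 1, 0), (1, 0, 0), (0, 1, 1), (1, 2, 0)]
weights = link_weights(toy)
print(dict(weights))
```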
We use a number of closely related (but distinct) terms to describe connections between pairs of individuals. A quick overview of the terms follows. Interaction: A single measurement of proximity between a pair of individuals. Signal strength: The RSSI measured by a smartphone for a single interaction. The signal strength can be considered a measure of distance. Link: An abstract description of the connection between two individuals; a link implies at least one interaction. Links are sometimes denoted ties or connections in the literature. Weight: Number of interactions observed on a given link; sometimes called strength in the literature.
As shown in Fig. 1b, the distribution of link-weights in all three networks is broad, with many weak links (containing few interactions) and a small number of links of very high weight. The short-range network and the sampled long-range network contain the same number of interactions, but the number of resulting links in the two networks is strikingly different. The approximately 1.4 million interactions in the full long-range network are distributed across 42 838 links, resulting in an average link-weight of a little over 34 interactions per link. We create the short-range and sampled long-range networks by removing 81.7% of the interactions from the full long-range network, leaving 269 094 interactions in both of these networks. The resulting number of links is much higher in the sampled long-range network. Averaged over 100 realizations, this network has 26 511 ± 68 links, corresponding to around 61.9% of the links in the full long-range network. In contrast, the short-range network has only 13 474 links, corresponding to only 31.5% of the links in the full long-range network. These differences are illustrated in the Fig. 1b inset.
Let us investigate these differences with respect to link weight in further detail. First, let us consider the weakest links. In terms of low-weight links, the sampled long-range network simply retains around f = 18.3% of the long-range network’s links, with small differences. The reason for these differences can be understood by considering links with weight 1. Of course, (100 − 18.3)% of links with weight 1 are removed, but the sampling process also creates new links of weight 1 by downsampling the weight of some number of links with weight 2, 3, etc. In the short-range network a much higher fraction of links with weight 1 are removed: this network has about half as many links with weight 1 as we find in the sampled long-range network.
Now, considering high-weight links, we find that these links in the short-range network are relatively unaffected by removing interactions according to physical distance: in the short-range network the highest-weight links typically maintain ~80% of their interactions. This is in stark contrast to the sampled long-range network, where link-weight is depleted in proportion to the sampling fraction, and high-weight links maintain only ~18% of the interactions from the full long-range network.
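The effect of uniform sampling on a single link can be illustrated with a binomial thinning approximation: if each interaction is kept independently with probability f, a link of original weight w ends up with weight k with probability C(w, k) f^k (1 − f)^(w − k). This is an assumption made for the sketch; sampling a fixed number of interactions without replacement is strictly hypergeometric, and the binomial form is only a close approximation when the total number of interactions is large. The function names are hypothetical.

```python
from math import comb

def thinned_weight_pmf(w, f):
    """P(sampled weight = k | original weight = w) for k = 0..w (binomial thinning)."""
    return [comb(w, k) * f**k * (1 - f)**(w - k) for k in range(w + 1)]

def survival_probability(w, f):
    """Probability that a link of weight w keeps at least one interaction."""
    return 1 - (1 - f)**w

f = 0.183
p_keep_w1 = survival_probability(1, f)          # weight-1 link survives
p_w2_to_w1 = thinned_weight_pmf(2, f)[1]        # weight-2 link becomes weight 1
print(p_keep_w1, p_w2_to_w1)
```

Under this approximation a weight-1 link survives with probability f, while links of weight 2 and above feed new weight-1 links into the sampled network, which is the mechanism described in the paragraph above.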
In summary, the weight distribution in the short-range network suggests that friends (with high-weight links) tend to be physically close and that most low-weight links correspond to random encounters (encounters between strangers), consistent with results on interaction distance from both quantitative measurements^{4} as well as sociology^{6}.
Differences in local structure
The key comparison is between the short-range network and the two long-range networks. Since our sampling is uniform over interactions, we expect the sampled long-range network to be structurally very similar to the full long-range network, with weights decreased in proportion to the downsampling fraction. As we discuss above, however, many low-weight links disappear as part of the sampling process, and the overall network structure is complex, reflecting nontrivial and highly correlated underlying social behaviors. Therefore, it is useful to quantitatively confirm that the structures of the long-range and sampled long-range networks remain remarkably similar – and distinct from the short-range network.
Starting from the single-node perspective, we find important differences between the short-range and the long-range networks. We can quantify this difference using the Shannon entropy. For a node i, we start from a link with neighbor j with weight w_{ij} and define \(\pi ({w}_{ij})={w}_{ij}/{\sum }_{k}\,{w}_{ik}\) to mean the fraction of the node’s total interactions taking place on that link. Now, we define the node entropy as \(S(i)=-{\sum }_{j}\,\pi ({w}_{ij})\,{\mathrm{log}}_{2}\,\pi ({w}_{ij})\). Since infection probability is approximately proportional to link weight (see SI), this quantity can be interpreted as the expected number of yes/no questions needed to establish which of i’s links caused an infection. The distribution of entropy for all three networks is plotted in Fig. 2a. For the short-range network (blue), the distribution peaks at 4 bits, corresponding to an effective group of 2^{4} = 16 potential sources of infection. Comparing the long-range (green) and sampled long-range (orange) networks, we find as expected that the distributions of node entropies are very similar, emphasizing the structural similarity between these two networks. The distribution for the sampled long-range network is created by averaging per-user entropy values over 100 random realizations of the sampled long-range network. Both peak at around 6 bits, corresponding to a larger effective group of 2^{6} = 64 potential sources of infection in these networks.
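A minimal sketch of the node entropy, assuming a node's link weights are available as a plain list (the function name `node_entropy` is chosen for this example). A node whose interactions are spread evenly over 16 links has an entropy of 4 bits, matching the interpretation of 2^4 = 16 equally likely sources.

```python
from math import log2

def node_entropy(weights):
    """Shannon entropy (in bits) of a node's link-weight distribution."""
    total = sum(weights)
    # pi(w_ij) is the fraction of the node's interactions on each link.
    return -sum((w / total) * log2(w / total) for w in weights if w > 0)

even = node_entropy([5] * 16)        # 16 equally weighted links
skewed = node_entropy([97, 1, 1, 1]) # one dominant link, three weak ones
print(even, skewed)
```

The skewed example shows why entropy is more informative than degree here: four links dominated by a single strong tie give an entropy well below the 2 bits of four equal links.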
These results provide a striking illustration of how the close proximity zone is preferentially reserved for strong ties (e.g. friends or acquaintances) while the distant zone is a more public space where many more random interactions happen, resulting in a correlation between physical proximity and tie strength as reported in ref. ^{9}.
Meso-level structural differences
In the previous section we showed that in the short-range network a large fraction of interactions takes place on high-weight links. We now study the interplay between meso-level network structure and link-weight in the short-range and long-range networks. Specifically, we are interested in the structures formed by the highest-weight links. To explore these, we build each network up from an empty graph, adding its strongest links one-by-one. As links are added, we keep track of the number of connected components in the network as well as the total weight of interactions added through the links, revealing the differences in the networks with respect to the structures created by the heaviest links.
Figure 2b illustrates how the process of adding links gradually grows the long-range and short-range networks, respectively. In the lower panel of Fig. 2b we show the number of connected components and the total number of interactions in the networks as the links are added. First, notice that the full and sampled long-range networks display identical behavior, with the number of neighborhoods peaking with approximately 120 strongest links added. This behavior is consistent across 100 random realizations of the sampled network. This is in contrast to the short-range network, where the number of components continues to grow up to the 240 heaviest links in the network.
In both types of networks, the strongest links in the network first create small isolated neighborhoods of highly interacting nodes. Figure 2b (upper panel) shows snapshots of the sampled long-range (orange) and short-range (blue) networks at points (120, 250, 300 links) indicated on the plot below, illustrating this point. We see that at 250 strongest links, a large connected component is beginning to form in the long-range network, making the network significantly more connected. At this point, the short-range network, however, is still divided into many small neighborhoods. We also note that while the x-axis indicates the absolute number of the heaviest links added to the networks, the total number of interactions included in the networks at any number of links is strikingly different. In fact, it is important to underscore just how large a fraction of interactions is concentrated on the high-weight links. The short-range network has a total of 13 474 links and the sampled long-range network has ~26 500 links. Figure 2b (bottom panel), however, shows that in the short-range network the 250 strongest links in the network account for approximately 50% of the interactions. In the long-range network the picture is less skewed. Here, the top 250 links account for approximately 25% of the interactions. Thus, while the percolation transition occurs for a very small number of high-weight links in both networks, these links include a large fraction of the total number of interactions.
Our analysis shows, therefore, that the short-range network not only contains fewer links than the sampled long-range network, but that the configuration of the heaviest links is more fragmented than in the long-range case. This structural property of the short-range network, the highly-connected neighborhoods bridged by weak ties, is consistent with well-known structures found in other social networks, such as mobile phone networks and online social networks^{17,20,21,22}. In the long-range network, however, this structure is less pronounced, obscured by the presence of spurious links, distinct communities bridged by a small number of strong links not present in the short-range network.
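The link-addition procedure can be sketched with a union-find structure that tracks connected components as links are added in order of decreasing weight. This is an illustration under stated assumptions: the input format `{(i, j): w}` and the name `component_growth` are invented for the example, and components are counted only among nodes that already have at least one link, so the count first rises as isolated pairs appear and then falls as they merge.

```python
def component_growth(weights):
    """Add links from heaviest to lightest and record the network's growth.

    weights: dict {(i, j): w_ij} (hypothetical layout).
    Returns a list of (links_added, n_components, cumulative_weight).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = 0
    total_weight = 0
    history = []
    ordered = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for n, ((i, j), w) in enumerate(ordered, start=1):
        for node in (i, j):
            if node not in parent:      # first time this node gets a link
                parent[node] = node
                components += 1
        ri, rj = find(i), find(j)
        if ri != rj:                    # the link merges two components
            parent[ri] = rj
            components -= 1
        total_weight += w
        history.append((n, components, total_weight))
    return history

toy = {(0, 1): 10, (2, 3): 8, (1, 2): 3, (4, 5): 2}
history = component_growth(toy)
print(history)
```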
Spreading process is captured in neighborhoods
Having investigated differences between short- and long-range networks with respect to structure, we now explore how these differences affect the way diseases spread on the networks. Using a simple Susceptible-Infected-Recovered (SIR) model, we run simulations of a disease spreading across the networks. Our model is intentionally simplistic, intended to illustrate the structural differences between short- and full-range transmission, rather than emulate a specific disease. We use the actual temporal sequence of proximity interactions observed in the data, choosing parameter values to create a situation where large outbreaks are likely, but not guaranteed (see Methods for details of the epidemic modeling). While we report results for a specific choice of parameters and a single realization of the sampled long-range network, these results are robust across a wide range of values of the transmission parameters and realizations of the sampled network.
Based on the structural analysis, our hypothesis is that, in the short-range network, the simulated pathogen tends to be more contained within small sets of highly interacting individuals. We quantify the contained-in-communities behavior as follows. For each infection event, occurring on link w_{ij}, where node i infects node j, we measure which fraction I_{j} of node j’s direct (1-hop) neighborhood has already been infected. Since this is a weighted network, we define \({I}_{j}={W}_{\{i\}}^{-1}\,{\sum }_{k\in {\mathcal I} (j),k\ne i}\,{w}_{jk}\), where \( {\mathcal I} (j)\) is the set of j’s infected neighbors and \({W}_{\{i\}}={\sum }_{k\ne i}\,{w}_{jk}\) is the sum of all weights excluding the infecting link. A value of I_{j} = 0 indicates that no one in the direct neighborhood besides the infecting node has yet been infected; a value of I_{j} = 0.5 indicates that neighbors accounting for 50% of link weights connecting to j have already been infected. Figure 3a shows a kernel density estimation of I as a function of the fraction of infected nodes, based on 500 runs of the spreading process in the short-range (left), sampled long-range (middle), and long-range (right) networks.
In the case of the short-range network, we observe behavior which suggests that the spreading agent is indeed slowed by neighborhoods, consistent with the behavior of both simulated and real spreading processes found in the literature^{23,24,25,26,27}. As is evident from Fig. 3a, early in the epidemic outbreak, when the fraction of infected nodes is low, the disease agent can saturate small neighborhoods and infect new nodes in neighborhoods where a large fraction (I > 0.80) of neighbors are already infected. Conversely, it is still possible to find neighborhoods with a low fraction (I < 0.20) of infected nodes very late in the outbreak. These effects are possible because the spreading agent does not jump easily between neighborhoods of densely connected nodes.
The disease spreading is very different in the full and sampled long-range cases. In contrast to the contained-in-communities picture, the infection progresses smoothly through the network. In the long-range networks, the neighborhood infection is more closely proportional to the fraction F of the total network infected. Cuts at particular levels of overall network infection F in Fig. 3b show that the pattern of more spread-out I in the short-range network is consistent through the spreading progression and across random starting conditions (seed node and time). Visually, the distributions of I at given F are narrower for the long-range networks, with peak values of neighborhood infection I closer to values of overall network infection F. To quantify this effect, we consider the distribution of R^{2} of a linear model fitting infection of the neighborhoods I to the progress of the infection (fraction of network infected F), calculated for each of the aforementioned 500 realizations of an epidemic. The distribution of R^{2} peaks at around 0.4 in the short-range network vs 0.75 in the two long-range networks, as shown in Fig. 3c. This indicates that direct proportionality between the global (F) and local (I) infection level is a significantly better model for the long-range networks.
Thus we find that while, in the short-range network, the infection tends to be captured inside closely connected communities, the picture is quite different in the long-range network. While both types of behavior have been described in the literature^{8,23,24,25,26,27,28}, the important finding in this context is that the two networks are representations of the same underlying behavioral data originating from a single population. These findings underscore how long-range spreading dramatically taps into spurious connections outside the social networks, resulting in fundamentally different types of spreading – in some ways mimicking the differences between droplet and airborne spreading mechanisms^{29,30,31,32}.
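The neighborhood infection level I_j can be computed directly from a node's link weights and the current set of infected nodes. The sketch below assumes weights are stored as a nested dictionary `{j: {k: w_jk}}`; this layout, the function name, and the convention of returning 0 when j has no neighbor other than i are assumptions made for the example.

```python
def neighborhood_infected_fraction(j, i, weights, infected):
    """Weighted fraction of j's neighbors (excluding infector i) already infected.

    weights:  dict {j: {k: w_jk}} (hypothetical layout)
    infected: set of currently or previously infected nodes
    """
    others = {k: w for k, w in weights[j].items() if k != i}
    total = sum(others.values())        # W: all of j's weight except the infecting link
    if total == 0:
        return 0.0                      # j has no neighbor besides i (assumed convention)
    return sum(w for k, w in others.items() if k in infected) / total

toy_weights = {"j": {"i": 5, "a": 3, "b": 1}}
I_j = neighborhood_infected_fraction("j", "i", toy_weights, infected={"i", "a"})
print(I_j)
```

In this toy case neighbor a carries 3 of the 4 weight units not on the infecting link, so I_j is 0.75.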
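For one simulated epidemic, the R^{2} of a linear fit of I against F can be computed as below. The text does not specify the exact form of the linear model, so this sketch assumes ordinary least squares with an intercept; the function name and the toy values are invented for illustration.

```python
def r_squared(F, I):
    """Coefficient of determination for an OLS fit I ~ a + b * F (assumed model)."""
    n = len(F)
    mean_f = sum(F) / n
    mean_i = sum(I) / n
    sxx = sum((f - mean_f) ** 2 for f in F)
    sxy = sum((f - mean_f) * (y - mean_i) for f, y in zip(F, I))
    syy = sum((y - mean_i) ** 2 for y in I)
    slope = sxy / sxx
    intercept = mean_i - slope * mean_f
    # Residual sum of squares around the fitted line.
    ss_res = sum((y - (intercept + slope * f)) ** 2 for f, y in zip(F, I))
    return 1 - ss_res / syy

# Toy infection events: (global fraction infected F, neighborhood fraction I).
proportional = r_squared([0.1, 0.2, 0.3, 0.4], [0.2, 0.4, 0.6, 0.8])
scattered = r_squared([0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.9, 0.3])
print(proportional, scattered)
```

A run in which local infection tracks global infection gives a value near 1, while a run in which neighborhoods saturate independently of the global state gives a value near 0.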
Community structure increases infected-infected interactions
Our analysis of link weights showed that the short-range network tends to have fewer links with more interactions on each link. But why is the disease trapped within communities in the first place? One of the reasons that an infection remains ‘stuck’ in a neighborhood is that a disease can only spread via interactions between infected and susceptible nodes. Thus, if a local group is fully infected, we tend to see a large fraction of infected-infected interactions, which cannot help spread the disease. In Fig. 4a we quantify this tendency by plotting how frequently infected-infected links are active in the sampled long-range and short-range networks, respectively.
We observe a clear difference between the two networks. In the sampled long-range network, where the local connection patterns have high entropy, there is only a low level of activity among infected or recovered individuals. The spreading agent quickly reaches the entire network due to a large number of available susceptible-infected links. This behavior is in contrast to the short-range network, where infected-infected interactions represent a larger fraction of interaction events. Thus, as above, given the same number of interactions and the same underlying behavioral data, outbreaks are significantly slower and more contained in the short-range network relative to the sampled long-range case (Fig. 4b).
Statistics of spreading outcomes
Finally, in Fig. 5 we summarize a number of statistics related to disease spreading in the three networks. These results confirm that the structural differences between the short-range and long-range interaction networks discussed above lead to reliably different outcomes in simulated epidemics. Firstly, in Fig. 5a, we show that when outbreaks do happen in the short-range network, they are smaller in terms of total number of nodes infected. Moreover, the probability that an outbreak is contained – reaching only a small fraction of the network (<20%) – is higher in the short-range network than in the long-range networks (Fig. 5a inset). Finally, the time an infection needs to reach 50% of the short-range network is significantly longer, with the peak of the distribution for the sampled long-range network occurring after 7 days, while in the short-range network the peak is delayed to 10 days (Fig. 5).
Thus, consistent with the literature, short-range interactions are organized in a way that slows down spreading relative to the long-range case. The sampled long-range network features precisely the same number of interactions as the short-range network, but is structurally more similar to the full long-range network according to the measures considered here. Our results show that taking the physical distance of interactions into account results in networks that can significantly alter the outcome of a simulated outbreak. The qualitative behavior described above is reproduced across a wide range of parameter values.
Discussion
We have demonstrated a strong structural difference between the short-range networks that support short-range transmission processes and the long-range networks that support transmission across distances up to 10 meters. Summarizing our findings, we find that the proximity of interactions correlates with link-weight: on average we stay closer to our friends. In the short-range network, we find spreading patterns consistent with our knowledge of spreading on various online social networks and modeling studies^{23,24,25,26,27}. In the long-range network we observe a large proportion of proximity interactions between individuals with weak or absent social ties, resulting in a complex local network structure. This non-social ‘noise’ in the network allows for faster and more powerful outbreaks to take place, even when considering exactly the same number of interactions, consistent with results of synthetic proximity-aware spreading simulations^{33}.
It is, of course, well known that the definition of ‘interaction’ impacts the network structure and spreading dynamics. For example, networks of sexual contacts are analyzed separately from other types of pathogen spread^{34,35}, even though both types of networks are physical interaction networks. A central work in understanding the role of physical proximity is by Read et al.^{8}, where questionnaire data regarding ‘close’ and ‘distant’ interactions were collected from 49 participants over 14 non-consecutive days. This study, however, did not address how differences in mode of transmission can affect the network of infections. Recently, a multitude of new approaches have been developed for collecting data regarding close interactions with the purpose of modeling spreading using various methods, including Bluetooth, RFID, and questionnaires^{8,28,36,37,38,39}.
Here we argue that from the perspective of a spreading agent, the relatively subtle difference of what ‘interaction’ is in the short-range and long-range networks makes an important difference, even given the same underlying social system. Our results suggest that long-range spreading is less related to the underlying social network and closer to a well-mixed system than simulations on purely social structures might lead one to suggest.
Methods
The dataset
The dataset used in this paper comes from the Copenhagen Networks Study^{3}. We use one month of data (February 2014). Out of 696 freshmen student participants active in that month, we chose students with at least 60% of Bluetooth observations present (resulting in a median of 80%) and who belong to a single connected component. Observations are defined as 5-minute bins in which the user has performed scans, whether the scans contained any devices or not. Since Bluetooth scans do not result in false positives, we symmetrized the observation matrix (resulting in an undirected network), assuming that \({\gamma }_{ijt}\iff {\gamma }_{jit}\). This results in improved data coverage, with a median of 85% of 5-minute bins containing data. More information regarding the dataset is provided in the Supplementary Information.
RSSI and interaction distance
The Received Signal Strength Indicator (RSSI) can be used to estimate the distance between wireless devices^{40}. Sekara & Lehmann^{4} showed the stability of RSSI in modern mobile phones; the same phones were used in the Copenhagen Networks Study. Based on these results, we use γ_{ijt} = RSSI ≥ −75 dBm as an indicator that an interaction was closer than 1 meter. This value can be considered a conservative estimate, as the measurements in ref.^{4} were performed without obstacles. Thus, we expect that γ_{ijt} ≥ −75 dBm may not include all the close interactions, but it should not include distant interactions. When the interaction matrix is symmetrized, we take the smallest distance (largest RSSI) observed between the two users in a given timebin, γ_{ijt} = γ_{jit} = max(γ_{ijt},γ_{jit}).
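The symmetrization step, which keeps the largest RSSI (that is, the smallest estimated distance) seen by either phone in a timebin, can be sketched as follows. The dictionary layout `{(i, j, t): rssi}` and the function name are assumptions made for this example.

```python
def symmetrize(rssi):
    """Merge directed observations into undirected ones, keeping the largest RSSI.

    rssi: dict {(i, j, t): dBm}, where i is the scanning device (hypothetical layout).
    Returns {(min(i, j), max(i, j), t): dBm}.
    """
    sym = {}
    for (i, j, t), s in rssi.items():
        key = (min(i, j), max(i, j), t)
        # Largest RSSI corresponds to the smallest estimated distance.
        sym[key] = max(s, sym.get(key, s))
    return sym

toy = {(0, 1, 0): -80, (1, 0, 0): -70, (1, 2, 0): -90}
sym = symmetrize(toy)
print(sym)
```

A pair observed in only one direction, such as (1, 2) in the toy data, is kept with its single reading, which is how symmetrization improves data coverage.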
We note that the approach presented here has some limitations. While all mobile phones used for data collection in the study were the same model and the obtained RSSI values are comparable in this sense, it is important to emphasize that our distance threshold is noisy; RSSI may differ depending on where the phone is placed, environmental conditions, etc. In that sense, our results can be considered a lower bound of the difference between the two types of networks, since a perfectly noisy threshold would produce two randomly sampled networks with no difference between them.
Epidemic simulations
To show the dynamics of the spreading process in the droplet and airborne networks we use a simple Susceptible-Infected-Recovered (SIR) simulation. We run a large number of simulations (N = 10 000) on the full temporal network, where every interaction between an Infected and a Susceptible participant can lead to infection with probability β = 0.02. Users stay in the Infected state for μ_{t} = 7 days, after which they move to the Recovered state and cannot be reinfected. The starting time bin and seed node are chosen at random in every simulation and used for the simulation on all three networks (long-range, sampled long-range, and short-range). We use one month of data (28 days, 8 064 5-minute time bins) with periodic boundary conditions, so that the 28 days repeat indefinitely. The parameter values are chosen so that outbreaks are likely but not guaranteed, with sizes that do not trivially saturate the entire network. The parameters themselves, as well as the resulting epidemic curves (with peaks between 7 and 14 days), are consistent with those reported in the literature regarding both simulated and observed flu outbreaks^{8,37,41}. This model is intentionally simplistic, intended to illustrate the structural differences between full and short-range transmission rather than to emulate a specific disease. The qualitative behavior of our analysis is unchanged across a wide range of parameter values.
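The simulation described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the `contacts` layout (time bin index mapped to a list of undirected pairs), the function name, and the choice that a node infected in one bin becomes infectious from the next bin are all assumptions made for the example. The defaults β = 0.02, a 7-day infectious period, and periodic repetition of the time bins follow the text.

```python
import random

BINS_PER_DAY = 24 * 12  # 288 five-minute bins per day


def simulate_sir(contacts, nodes, n_bins, beta=0.02,
                 infectious_bins=7 * BINS_PER_DAY,
                 seed_node=None, start_bin=None, rng=None):
    """Run one SIR outbreak on a temporal network and return its final size.

    contacts: dict mapping bin index -> list of (i, j) pairs (assumed layout).
    Time wraps around after n_bins (periodic boundary conditions).
    """
    rng = rng or random.Random()
    nodes = list(nodes)
    if seed_node is None:
        seed_node = rng.choice(nodes)
    if start_bin is None:
        start_bin = rng.randrange(n_bins)

    infected = {seed_node}
    recovered = set()
    recover_at = {seed_node: infectious_bins}  # step at which a node recovers
    step = 0
    while infected:
        # Move nodes whose infectious period has elapsed to Recovered.
        done = {n for n in infected if recover_at[n] <= step}
        infected -= done
        recovered |= done

        # Each Infected-Susceptible contact transmits with probability beta.
        newly = set()
        for i, j in contacts.get((start_bin + step) % n_bins, ()):
            for src, dst in ((i, j), (j, i)):
                susceptible = (dst not in infected and dst not in recovered
                               and dst not in newly)
                if src in infected and susceptible and rng.random() < beta:
                    newly.add(dst)

        # New infections become infectious from the next bin onward.
        for n in newly:
            recover_at[n] = step + 1 + infectious_bins
        infected |= newly
        step += 1
    return len(recovered)
```

To compare networks as in the text, the same `seed_node` and `start_bin` would be passed to a run on each network's contact dictionary, and the procedure repeated over many random draws.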
References
1. Lazer, D. et al. Life in the network: the coming age of computational social science. Science 323, 721 (2009).
2. Eagle, N., Pentland, A. S. & Lazer, D. Inferring friendship network structure by using mobile phone data. Proc. Natl. Acad. Sci. 106, 15274–15278 (2009).
3. Stopczynski, A. et al. Measuring large-scale social networks with high resolution. PLoS One 9, e95978, https://doi.org/10.1371/journal.pone.0095978 (2014).
4. Sekara, V. & Lehmann, S. The strength of friendship ties in proximity sensor data. PLoS One 9, e100915 (2014).
5. Sommer, R. Studies in personal space. Sociometry 247–260 (1959).
6. Hall, E. T. The hidden dimension, vol. 1990 (Anchor Books, New York, 1969).
7. Eagle, N. & Pentland, A. Reality mining: sensing complex social systems. Pers. Ubiquitous Computing 10, 255–268 (2006).
8. Read, J. M., Eames, K. T. & Edmunds, W. J. Dynamic social networks and the implications for the spread of infectious disease. J. R. Soc. Interface 5, 1001–1007 (2008).
9. Mossong, J. et al. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Medicine 5, e74 (2008).
10. Milgram, S., Sabini, J. E. & Silver, M. E. The individual in a social world: essays and experiments (McGraw-Hill Book Company, 1992).
11. Sun, L., Axhausen, K. W., Lee, D.-H. & Huang, X. Understanding metropolitan patterns of daily encounters. Proc. Natl. Acad. Sci. 110, 13774–13779 (2013).
12. Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
13. Eubank, S. et al. Modelling disease outbreaks in realistic urban social networks. Nature 429, 180–184 (2004).
14. Milo, R. et al. Network motifs: simple building blocks of complex networks. Science 298, 824–827 (2002).
15. Szell, M., Lambiotte, R. & Thurner, S. Multirelational organization of large-scale social networks in an online world. Proc. Natl. Acad. Sci. 107, 13636–13641 (2010).
16. Girvan, M. & Newman, M. E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99, 7821–7826 (2002).
17. Onnela, J.-P. et al. Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. 104, 7332–7336 (2007).
18. Fortunato, S. Community detection in graphs. Phys. Reports 486, 75–174 (2010).
19. Ahn, Y.-Y., Bagrow, J. P. & Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 466, 761–764 (2010).
20. Bakshy, E., Rosenn, I., Marlow, C. & Adamic, L. The role of social networks in information diffusion. In Proceedings of the 21st international conference on World Wide Web, 519–528 (ACM, 2012).
21. Bagrow, J. P., Lehmann, S. & Ahn, Y.-Y. Robustness and modular structure in networks. Netw. Sci. 3, 509–525 (2015).
22. Scott, J. Social network analysis (Sage, 2017).
23. Fowler, J. H. & Christakis, N. A. Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study. BMJ 337, a2338 (2008).
24. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P. Community structure in time-dependent, multiscale, and multiplex networks. Science 328, 876–878 (2010).
25. Salathé, M. & Jones, J. H. Dynamics and control of diseases in networks with community structure. PLoS Computational Biology 6, e1000736 (2010).
26. Karsai, M. et al. Small but slow world: How network topology and burstiness slow down spreading. Phys. Rev. E 83, 025102 (2011).
27. Cauchemez, S. et al. Role of social networks in shaping disease transmission during a community outbreak of 2009 H1N1 pandemic influenza. Proc. Natl. Acad. Sci. 108, 2825–2830 (2011).
28. Xiao, X., van Hoek, A. J., Kenward, M. G., Melegaro, A. & Jit, M. Clustering of contacts relevant to the spread of infectious disease. Epidemics 17, 1–9 (2016).
29. Garner, J. S. Guideline for isolation precautions in hospitals. Infect. Control 17, 54–80 (1996).
30. Weinstein, R. A., Bridges, C. B., Kuehnert, M. J. & Hall, C. B. Transmission of influenza: implications for control in health care settings. Clin. Infectious Diseases 37, 1094–1101 (2003).
31. Klovdahl, A. S. et al. Networks and tuberculosis: an undetected community outbreak involving public places. Soc. Science & Medicine 52, 681–694 (2001).
32. Liverman, C. T. et al. Preparing for an influenza pandemic: Personal protective equipment for healthcare workers (National Academies Press, 2007).
33. Sun, P., Cao, X.-B., Du, W.-B. & Chen, C.-L. The effect of geographical distance on epidemic spreading. Phys. Procedia 3, 1811–1818 (2010).
34. Liljeros, F., Edling, C. R., Amaral, L. A. N., Stanley, H. E. & Åberg, Y. The web of human sexual contacts. Nature 411, 907–908 (2001).
35. Rocha, L. E., Liljeros, F. & Holme, P. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Computational Biology 7, e1001109 (2011).
36. Salathé, M. et al. Digital epidemiology. PLoS Computational Biology 8, e1002616 (2012).
37. Salathé, M. et al. A high-resolution human contact network for infectious disease transmission. Proc. Natl. Acad. Sci. 107, 22020–22025 (2010).
38. Danon, L. et al. Networks and the epidemiology of infectious disease. Interdiscip. Perspectives on Infectious Diseases 2011 (2011).
39. Christakis, N. A. & Fowler, J. H. Social network sensors for early detection of contagious outbreaks. PLoS One 5, e12948 (2010).
40. Aamodt, K. CC2431 location engine. Appl. Note AN042 (Rev. 1.0), SWRA095, Tex. Instruments (2006).
41. Mills, C. E., Robins, J. M. & Lipsitch, M. Transmissibility of 1918 pandemic influenza. Nature 432, 904–906 (2004).
Acknowledgements
Thanks to Enys Mones, Vedran Sekara, and Piotr Sapiezynski for helpful discussions and Radu Gatej for technical assistance. This work was supported by the Villum Foundation [Young Investigator Program ‘High Resolution Networks’ grant (to S.L.)], The Independent Research Fund [Sapere Aude Program ‘Micro dynamics of influence in social systems’], and the University of Copenhagen (UCPH Excellence Programme for Interdisciplinary Research “Social Fabric” grant). Funders had no role in design of the study and collection, analysis, and interpretation of data, nor did funders have a role in writing the manuscript.
Author information
Contributions
All authors designed the research. A.S. carried out the simulations. A.S. and S.L. drafted the paper. All authors read and approved of the final manuscript.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Stopczynski, A., Pentland, A. S. & Lehmann, S. How Physical Proximity Shapes Complex Social Networks. Sci Rep 8, 17722 (2018). https://doi.org/10.1038/s41598-018-36116-6
Keywords
 Short-range Networks
 Long-range Case
 Network Proximity
 Received Signal Strength Indicator (RSSI)
 Higher Link Weights