Introduction

Infectious individuals can contribute differently to new infections, and variation in contacts among hosts is one of the most important factors contributing to unequal pathogen transmission1,2. Contact rates vary because of differences in individual traits, including individual behavior, as well as changes in the overall contact patterns over time and across space. Network theory has been extensively applied to understand and model contact patterns, and epidemiology is one of the most active areas in which network theory is applied3. Networks have been used extensively to describe the underlying contact patterns for sexually transmitted diseases and other directly transmitted diseases across large spatial scales. Recently, characterization of contact networks within community settings such as hospitals, schools, or households has been recognized as necessary to accurately predict transmission dynamics and identify interventions for diseases that require close contacts. Confined environments may have spatial “hotspots” for disease transmission, and defining the contact network is integral for effective control programs. Transmission in hospitals contributed approximately 50% of all secondary infections of severe acute respiratory syndrome in Hong Kong in 20034 and transmission within schools fueled the fall 2009 wave of pandemic influenza H1N1 in the United States5. Technological advances, such as proximity loggers and radio-frequency identification devices, have facilitated the construction of high-resolution contact networks relevant to infectious diseases that require close contacts among hosts6,7,8. Analysis of the generated contact networks has confirmed high contact rates in community settings and provided insights into the network structure in these types of settings and its implications for disease transmission6,7,8,9.

In farm animal agricultural settings, contact networks have been used to describe farm-to-farm disease transmission at large spatial scales, within a country or between countries10,11,12,13,14,15,16,17. Links between farms are constructed using animal movement databases. However, the contact structure within premises and at lower scales such as the pen level is less understood despite its influence on disease transmission, especially for highly transmissible diseases18,19. Animal-to-animal contact networks within farm animal groups have rarely been characterized at higher temporal and spatial resolution. It is therefore unknown how these networks change over time, including their structure (e.g. degree distribution), the role of individuals (e.g. degree order, the individual rank in the degree distribution of the network) and the implications for the modeling approaches used to describe transmission. Homogeneous-mixing mean-field compartmental models, which assume constant contact numbers through time and among individuals, are commonly applied at these levels.

In this study, we constructed animal contact networks using real-time animal position data at high temporal and spatial resolutions for three groups of calves. Our objectives were to quantify individual and temporal (within-day and between-day) heterogeneity in animal contact networks at the pen level and assess the implications of these sources of heterogeneity for disease transmission. To that end, we incorporated the identified sources of heterogeneity (e.g. temporal heterogeneity in contact networks and changes in network order) into a contact-based disease transmission model within an agent-based modeling framework. The validity of some commonly used assumptions in disease transmission models (e.g. constant number of contacts among individuals and over time) is investigated and discussed.

Results

Dynamic Contact Network

Degree distributions for the aggregated number of contacts at the pen level for the complete period of observation (192-h) were better characterized by a gamma distribution than a normal distribution, indicating that the degree distributions were skewed (Supplementary Fig. 1). A detailed description of the parameters and goodness of fit is provided in Table S1a. The degree distributions of hourly contact networks (during 2–3 am interval, shortened as 2 am thereafter; 8 am, 2 pm and 8 pm hourly interval for pen #1) were better characterized by a normal distribution than a gamma distribution when the animals were more active (at 8 am, 2 pm and 8 pm intervals), while the degree distribution fit better with a gamma distribution during 2 am, when the animals were inactive and the number of contacts was substantially lower than at other times of day and therefore the network was more sparse (Supplementary Table 1, Supplementary Fig. 2).

The time series of the hourly number of contacts is shown in Fig. 1 for all three pens. An analysis of variance (ANOVA) was carried out to investigate temporal and individual variability. The ANOVA results revealed highly significant hourly variability nested in day (P = 0.003), individual variability nested in pen (P = 0.001) and variability at the pen level (P = 0.003), but between-day variability was marginally non-significant (P = 0.06). Further spectrum analysis confirmed a peak frequency corresponding to an approximate 6.25-h period in the number of contacts. Thus there was a clear diurnal cycle in the network degree distribution.

Figure 1

Observed time series of total degree in the animal contact network.

Total degree is the sum of the degrees of all nodes (individual cattle), which, in this study, is equivalent to the total number of contacts within each hour. The grey area is bracketed by the 1st and 3rd quartiles. A clear diurnal cycle in the number of contacts exists for all three pens. This cycle vanishes if the data are aggregated at the daily level.

The quadratic assignment procedure (QAP) was applied to quantify pairwise contact network structure similarity for different contact rate intervals (see below). QAP can reveal changes in both network topology and the roles of individual nodes. QAP results showed that, on average, the structure of the contact network was 90.33% similar between any two low-contact rate intervals (from 2 am to 5 am), 79.62% similar between any two high-contact rate intervals (at 8 am, 2 pm and 8 pm intervals) and only 46.67% similar between a low- and a high-contact rate interval. These results further showed that, in addition to the variability in network degree distribution described above, changes in the network structure (actual correlation between different nodes/calves) were not uniform across the hours of a day. An example of actual network structures is shown in Fig. 2.

Figure 2

Observed animal contact networks during 2 AM, 8 AM, 2 PM and 8 PM.

Showing 21 cattle in Pen #1 on August 11, 2011. Line width is proportional to the number of contacts in that time period, i.e. the thickest line corresponds to the largest number of contacts between two cattle. Thickness of the lines is not directly comparable between different hours.

The degree order was also highly variable throughout the observation period. The distribution of observed mean degree order of animals in each pen was very different from the hypothetical condition in which the order of the network was consistent throughout the period (Fig. 3). This was further supported by the two-sided Kolmogorov-Smirnov (K–S) test results showing that none of the observed distributions of mean rank in the three pens was similar to the hypothetical population with constant degree order (P = 0.001 for all three pens, K–S test). Thus in the observed contact network, a calf active during a previous hour may become less active in successive periods, with no clear predictable pattern.

Figure 3

Mean degree order of animals in observed and hypothetical pens.

The rank of each animal in each hour was recorded and averaged for the entire 192-h period. For comparison, we simulated a hypothetical group of 21 individuals with constant rank (from 1 to 21) through the entire period. Note pen #3 had 27 individuals instead of 21 as in pen #1 and #2.

Modeling Disease Transmission in a Dynamic Network

The time series of mean disease prevalence for all four conditions are presented in Fig. 4. Characteristics of disease transmission such as maximum prevalence, its associated occurrence date, outbreak size (n), duration of outbreak (Tf) and basic reproduction number (R0) are summarized in Table 1. The maximum prevalence occurrence date, outbreak size and the duration of outbreak did not differ substantially among the four conditions for parameter set 2 (higher R0), whilst in parameter set 1 (lower R0) these characteristics were more distinct. There were substantial differences in the maximum prevalence and the numerical R0. Of the four conditions in both parameter sets, R0 was highest in C3 (with degree distribution change but no degree order change) and lowest in C2 (with degree order change but no distribution change) and it was consistent with other characteristics such as maximum prevalence, outbreak size and duration of outbreak. In general, conditions with no degree order change (i.e. constant network degree order throughout time) had substantially higher maximum prevalence than those with degree order changes, for both parameter sets (set 1: C1 − C2 = 4.89; C3 − C4 = 4.82; set 2: C1 − C2 = 10.47; C3 − C4 = 3.12). This was expected because the individuals with consistently higher degree order would have more contacts through time, which caused a higher probability of infection. The conditions with temporal variability yielded higher maximum prevalence than no temporal variability counterparts for both parameter sets (set 1: C3 − C1 = 3.69; C4 − C2 = 3.76; set 2: C3 − C1 = 1.89; C4 − C2 = 9.24), indicating the influence of temporal variability in disease dynamics as well.

Table 1 Comparison of disease characteristics in four simulated conditions
Figure 4

Time series of mean daily prevalence under four simulated conditions.

C1: no temporal variability nor degree order change; C2: no temporal variability with order change; C3: temporal variability with no order change; and C4: temporal variability with order change. The dynamics of these four conditions vary substantially for both parameter sets. (A) . (B) . The time series data are in hourly resolution and the figure is shown/labeled at a daily resolution.

The difference (D) between the prevalence time series of each pair of the four conditions within each parameter set (D12, D13, D14, D23, D24 and D34) was computed and fitted to an ARIMA (Auto-Regressive Integrated Moving Average) model. For none of these six difference series, under either parameter set, did the three fitted coefficients (p, d, q) resemble white noise, for which the ARIMA parameters should be 0, 0, 0. The results demonstrated the statistically distinct disease dynamics of these four conditions. In general, the disease dynamics of C1 and C4 were more similar (Fig. 4 and Table 1) than those of other pairs of conditions. We believe that including temporal variability tended to increase disease prevalence and incorporating degree order change tended to decrease transmission probability. These two factors acted in opposite directions and offset each other. Thus the final disease dynamics with both temporal and degree order change (C4) were similar to the condition with neither change (C1).

The Gini coefficients of these four conditions are presented in Table 1. In the first two conditions (C1 and C2, without temporal variability), the Gini coefficients were both very close to zero, indicating all individuals had almost equal contribution to the new infection, despite C1 having no degree order change and higher prevalence. In the latter two conditions (C3 and C4, with temporal variability), C3 had a larger Gini coefficient, indicating the individuals with constantly high degree order contributed to more new infections than the lower order ones.

Discussion

In this study, we have presented a high-resolution direct contact network of calves in a pen. We have found that resolution (or temporal/spatial scaling) substantially alters the observed pattern of contact structure. The degree distribution is less skewed at higher temporal resolution (in our study, 1 h) than at lower ones (1 d or longer period). Increasing to even higher temporal resolution, for example, at quarter-hour or even minute level, may further change network structure. However, as the resolution increases, the effect of system noise and stochasticity also increases, hence reducing the signal/noise ratio. Such scaling issues have been studied in landscape and conservation ecology28,29 but have rarely been addressed in epidemiology, especially for temporal scales. Homogeneous compartmental models assume that contact patterns within a population form a regular random network3. However, we show that contact network degree distribution varies with both time and individual, suggesting that non-regular dynamic networks characterize the animal-to-animal contact network at the pen level better than a regular static network. Previous studies have considered either the importance of individual heterogeneity30,31,32 or temporal change in the contact network33, but lacked a unified framework to consider both factors simultaneously.

We have shown that the dynamic changes in the contact network are able to change the disease dynamics at the pen level. Furthermore, the network change has a larger effect for diseases with smaller R0 (e.g. R0 < 2, parameter set 1 in the simulation, as opposed to parameter set 2). For larger R0 conditions, although the disease dynamics are still statistically different across the four conditions, in practice they may not show substantial differences because of fast dynamics and large outbreak size. However, for smaller R0 conditions, the temporal variability in degree distribution and network order change (in C2) are further mingled with system stochasticity due to smaller transmission probability, resulting in a much smaller mean of R0 but with substantially larger variance and larger variance in outbreak size than in other conditions as well34. Other studies have demonstrated disease dynamics are further influenced by population size (especially smaller networks)35,36.

In our simulation of disease dynamics, we assume that the disease does not substantially change the individual's behavior. This implicit assumption is appropriate for non-clinical conditions and for demonstrating the importance of dynamic networks on disease dynamics. Nevertheless, in realistic systems modeling clinical diseases, animals may change their behavior during the infected stage, resulting in a different contact pattern37,38,39,40. Furthermore, we have assumed frequency-dependent transmission through contact to simplify the model. To model more realistic directly transmitted disease systems and design effective control strategies, it is important to make the correct assumptions about transmission mode (frequency-dependent or density-dependent, or a combination of both), observe the actual contact network for all the individuals over time and understand both temporal and individual heterogeneity in the contact network41,42,43.

In summary, our study uses high temporal and spatial resolution observation data to reveal that the animal contact network is highly variable and dynamic, in both contact network structure and degree order. These differences in contact network structure are able to alter simulated disease dynamics and individual contributions to new infections, especially for diseases with smaller R0. These findings can lead to better experimental design and more effective control strategies for diseases transmitted directly through contacts.

Methods

Investigating Animal Contact Network Structure

Animal contact networks were constructed from position data recorded by a wireless remote location system in three pens of calves over 8 d (21, 21 and 27 animals in pen #1, #2 and #3, respectively), as described in our previous study20 (see also the brief summary in SI methods) and were undirected in this study (i.e. animal i contacting animal j implied animal j contacting animal i simultaneously). All the experiments were approved by and complied with the animal regulation policy of Kansas State University. A contact was defined as two animals being within a distance of 1 foot (~0.3 m, about the length of a calf head) in a fixed time interval (10 s); if two animals were in contact for several consecutive intervals, each 10-s interval was regarded as an individual contact. Therefore, the contact networks not only described whether two animals were in contact, but also explicitly measured how many contacts (and the total duration of contacts, since each contact lasted for a fixed 10 s) were made in each given period. In this study the total number of contacts of each individual was aggregated at an hourly level, as well as at a daily level, for comparison. Among various quantitative measurements of network structure, centrality measurements were used to differentiate the relative importance of the individuals in the group21,22,23,24. We computed degree centrality, which specifically measures the number of edges on a node (in this study, equivalent to one individual calf's total number of contacts with other animals in a 1-h period; one calf could have more than one contact with another calf in that period). The degree distributions of the contact networks at pen level (pen #1 through pen #3) and for all pens combined for the entire observation period were fitted with different probability distributions, including gamma and normal distributions, using maximum likelihood methods, and the goodness of fit was determined by a two-sided Kolmogorov-Smirnov (K–S) test.
Contacts were further divided on an hourly basis and the degree distribution of hourly networks was computed and compared against that of the entire observation period.
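The contact definition above can be illustrated with a minimal sketch. It assumes a hypothetical input layout (one dictionary of animal positions in metres per 10-s interval); the function names `hourly_contacts` and `degree` are illustrative only and are not taken from the study's own processing pipeline.

```python
import math
from collections import defaultdict
from itertools import combinations

CONTACT_DISTANCE_M = 0.3048   # 1 foot, the contact threshold given in the text
INTERVAL_S = 10               # each 10-s interval counts as one contact

def hourly_contacts(positions):
    """Count pairwise contacts per hour.

    positions: {interval_index: {animal_id: (x_m, y_m)}} (hypothetical layout).
    Returns {hour: {(a, b): n_contacts}} with a < b, i.e. an undirected,
    weighted edge list for each hour.
    """
    intervals_per_hour = 3600 // INTERVAL_S
    edges = defaultdict(lambda: defaultdict(int))
    for k, snapshot in positions.items():
        hour = k // intervals_per_hour
        for a, b in combinations(sorted(snapshot), 2):
            (xa, ya), (xb, yb) = snapshot[a], snapshot[b]
            if math.hypot(xa - xb, ya - yb) <= CONTACT_DISTANCE_M:
                edges[hour][(a, b)] += 1
    return edges

def degree(edge_counts):
    """Weighted degree: each animal's total number of contacts in one hour."""
    deg = defaultdict(int)
    for (a, b), n in edge_counts.items():
        deg[a] += n
        deg[b] += n
    return dict(deg)

# Two 10-s intervals with three animals (made-up coordinates).
toy = {
    0: {"a": (0.0, 0.0), "b": (0.2, 0.0), "c": (5.0, 5.0)},
    1: {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (0.1, 0.2)},
}
edges = hourly_contacts(toy)
print(degree(edges[0]))
```

Because the network is undirected, each contact is stored once per pair and added to the degree of both animals.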

The degree order, which measures the order of individual degrees in the network (lowest number corresponding to 1, highest number corresponding to the number of cattle in the pen, in ascending order), was computed for each hour and averaged over the entire period (192 h) to investigate whether certain calves were consistently more active (consistently higher degree order) in the contact network throughout the time. A hypothetical population of animals with constant hourly degree order was simulated and the distribution of summed contact of each pen and the hypothetical population was compared by the K–S test to investigate whether the observed network order was consistent through time.
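As a rough illustration of the degree order calculation, the sketch below ranks animals within each hour and averages the ranks over hours. The tie-breaking rule (by animal id) and the assumption that every animal appears in every hour are ours; the text does not state how ties or missing animals were handled.

```python
def degree_order(degrees):
    """Rank animals by degree in ascending order (1 = fewest contacts).

    Ties are broken by animal id, which is an assumption of this sketch.
    """
    ranked = sorted(degrees, key=lambda a: (degrees[a], a))
    return {animal: r for r, animal in enumerate(ranked, start=1)}

def mean_degree_order(hourly_degrees):
    """Average each animal's hourly rank over all hours.

    hourly_degrees: list of {animal_id: degree}, one dict per hour; every
    animal is assumed to be present in every hour.
    """
    totals = {}
    for degrees in hourly_degrees:
        for animal, r in degree_order(degrees).items():
            totals[animal] = totals.get(animal, 0) + r
    n_hours = len(hourly_degrees)
    return {animal: s / n_hours for animal, s in totals.items()}

# Two hours in which the most and least active animals swap places.
hours = [{"a": 5, "b": 1, "c": 3}, {"a": 1, "b": 5, "c": 3}]
print(mean_degree_order(hours))
```

With a constant degree order the mean ranks would stay spread over 1 to N, as in the hypothetical population; when ranks change from hour to hour, the means are pulled toward the middle rank, which is the contrast the K–S comparison is meant to detect.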

An analysis of variance (ANOVA) was performed to further test whether the degree distribution of the networks varied among different individuals in different pens and/or in different hours on different days. The hour factor was nested in the day and the individuals were nested in the pen. Furthermore, spectrum analysis was performed with the timeSeries package in R to explore the periodic (within-day) change of the time series of aggregated degrees (number of total contacts of all calves in each pen) in the networks.
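The idea behind the spectrum analysis can be sketched with a plain periodogram. This is not the R routine used in the study; it is a simple discrete Fourier transform written with the Python standard library, applied here to a synthetic series, and the function name `dominant_period` is hypothetical.

```python
import cmath
import math

def dominant_period(series, dt_hours=1.0):
    """Return the period (in hours) of the strongest periodogram peak.

    The series is mean-centred so that the zero-frequency term does not
    dominate; only positive frequencies up to the Nyquist limit are scanned.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n * dt_hours / best_k

# Synthetic 192-h series with a built-in 24-h cycle (not the observed data).
synthetic = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(192)]
print(dominant_period(synthetic))
```

On real contact counts the peak is broader and noisier than in this synthetic case, so the period read off the periodogram is only approximate.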

The ANOVA and spectrum analysis focused on the number of contacts (network degree distribution) and did not reveal the internal structural change of the networks (e.g. correlation of two individual calves across two different hours). Thus a quadratic assignment procedure (QAP) was applied to further investigate the similarities between the networks in different hours in a day25. The QAP was designed to measure the structure similarity (e.g. nominal, ordinal and interval associations, for both network topology and the roles of individual nodes) between two networks with the same nodes (in our study, the same individual animals). The QAP was applied to every pair of networks from two different hours within the same day (total of 24 × 23/2 = 276 pairs per day). The entire day was divided into three different intervals based on number of contacts: low-contact intervals from 12 am to 5 am (at hourly interval, e.g. 12 am indicated 12 am to 1 am and the same held thereafter); high-contact intervals during 8 am, 2 pm and 8 pm; and the remaining hourly intervals were considered as having medium contact. The average percentage of similarity between low-contact intervals and other low-contact intervals (whenever P > 0.05 of QAPs) for all days across all three pens was calculated. The percentages of similarity between two high-contact intervals and between high- and low-contact intervals were also calculated. These results further measured and revealed the potential network structure change. To illustrate, the actual networks were plotted at four different hourly intervals in a given day (e.g., 2 am, 8 am, 2 pm and 8 pm, corresponding to animal sleeping, feeding and other social behaviors, on August 11, 2011, for each pen) for visualization of both temporal and individual heterogeneities.
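A common form of the QAP is a permutation test on the correlation between two adjacency matrices, in which the node labels of one matrix are shuffled (rows and columns together) to build the null distribution. The sketch below shows that generic form under our own assumptions (Pearson correlation on off-diagonal cells, one-sided P value); it does not reproduce how the percentages of similarity reported here were derived, and the function names are hypothetical.

```python
import random

def offdiag_corr(a, b):
    """Pearson correlation between the off-diagonal cells of two matrices.

    Assumes both matrices have non-constant off-diagonal entries.
    """
    n = len(a)
    xs = [a[i][j] for i in range(n) for j in range(n) if i != j]
    ys = [b[i][j] for i in range(n) for j in range(n) if i != j]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def qap_test(a, b, n_perm=999, seed=0):
    """Return (observed correlation, one-sided permutation P value)."""
    rng = random.Random(seed)
    n = len(a)
    observed = offdiag_corr(a, b)
    hits = 0
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        # Relabel the nodes of b: rows and columns are permuted together.
        b_perm = [[b[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if offdiag_corr(a, b_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy symmetric contact matrix compared with itself (made-up numbers).
net = [[0, 1, 2, 3, 4],
       [1, 0, 5, 6, 7],
       [2, 5, 0, 8, 9],
       [3, 6, 8, 0, 10],
       [4, 7, 9, 10, 0]]
r, p_value = qap_test(net, net)
print(r, p_value)
```

Permuting rows and columns together keeps each matrix a valid network on the same set of animals, which is why the test respects the dependence between cells that share a node.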

Modeling Direct Transmitted Pathogen Dynamics

As shown in the results section, the actual contact network was highly dynamic, featuring substantial individual and temporal heterogeneity. Compartmental models such as directly transmitted SIR type (susceptible-infected-recovered) usually assume a constant number of contacts over time and for any individual. Therefore, with the typical compartmental model, the underlying network corresponds to a regular random network3. However, because the assumption of same degree distribution over time and among individuals was not consistent with our analysis of the observed contact network, it was necessary to investigate how changes in the contact network could further impact disease dynamics quantitatively26,27. To do so, the sources of heterogeneity (temporal change in degree distribution and individual rank) were incorporated in a simple discrete time, agent-based SIR-type model. The probability of infection of the ith susceptible individual in a time period t ($P_{i,t}$) was a function of the number of pairwise contacts and was proportional to the number of infected animals (j) at t: $P_{i,t} = \beta_0 \sum_j C_{ji,t} I_{j,t} / N_t$, where $\beta_0$ was the transmission coefficient (a constant), $C_{ji,t}$ represented the number of pairwise contacts of animal i with animal j in time t, and $I_{j,t}$ and $N_t$ represented the jth infected animal and the total number of animals in time t, respectively. Because of the closed population, $N_t \equiv N$ for any given t. Thus we could re-organize the expression to $P_{i,t} = (\beta_0 / N) \sum_j C_{ji,t} I_{j,t}$. To simplify the model, recovery was considered independent of contact; an infected individual had a constant recovery probability (γ) in any time and once it recovered from the infected state, it would stay in the recovered state, assuming no leaking or waning immunity. Two sets of parameters (parameter sets 1 and 2) were fed into the model, representing diseases with smaller and larger basic reproduction numbers (R0). A total of 100 individuals were simulated with one infected at the beginning of simulation; the other 99 animals were initially susceptible.
The simulation lasted for 100 days, with a 1-h time step. The hourly number of contacts for each individual was simulated from the fitted distribution (see SI figure 2) using the mean and variance of contacts in each hour and the number of contacts was assumed frequency-dependent and independent of population size, according to the observed data (e.g. pen #3 had 27 animals but the number of contacts was not higher than pen #1 and pen #2, which both had 21 animals). We used this assumption to simplify the model, but as discussed later, the model could be altered to be density-dependent or for more complicated conditions for the specific disease system at hand. The degree order was simulated through a random permutation (from 1 to population size N = 100) in each hour. The time series of mean disease prevalence and contribution of new infection from each individual were investigated.
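The structure of such a simulation can be sketched as follows. This is a simplified illustration rather than the model used in the study: pairwise contacts are replaced by a mean-field approximation in which a susceptible animal with c contacts is infected with probability β0 · c · I / N, hourly contact numbers are drawn from a normal distribution truncated at zero, and every numeric default (`beta0`, `gamma`, the mean and standard deviation of contacts) is a hypothetical placeholder.

```python
import random

def simulate_sir(n=100, hours=24 * 100, beta0=0.02, gamma=0.005,
                 hourly_mean=None, sd=3.0, reshuffle_order=True, seed=0):
    """Return the hourly prevalence (fraction infected) of one stochastic run.

    hourly_mean: 24 mean contact numbers, one per hour of the day; a constant
    list means no temporal variability. reshuffle_order=True re-draws the
    degree order every hour; False keeps the same animals most active.
    """
    rng = random.Random(seed)
    if hourly_mean is None:
        hourly_mean = [10.0] * 24
    state = ["S"] * n
    state[0] = "I"                      # one initially infected animal
    order = list(range(n))              # order[rank] = animal holding that rank
    prevalence = []
    for t in range(hours):
        if reshuffle_order:
            rng.shuffle(order)          # random permutation of the degree order
        mu = hourly_mean[t % 24]
        draws = sorted(max(0.0, rng.gauss(mu, sd)) for _ in range(n))
        contacts = [0.0] * n
        for rank, animal in enumerate(order):
            contacts[animal] = draws[rank]
        n_inf = state.count("I")
        new_state = state[:]            # synchronous update of all animals
        for i in range(n):
            if state[i] == "S":
                p = min(1.0, beta0 * contacts[i] * n_inf / n)
                if rng.random() < p:
                    new_state[i] = "I"
            elif state[i] == "I" and rng.random() < gamma:
                new_state[i] = "R"      # recovered animals stay recovered
        state = new_state
        prevalence.append(state.count("I") / n)
    return prevalence

run = simulate_sir(hours=24 * 10, seed=1)
print(max(run))
```

In this sketch the two switches correspond to the two sources of heterogeneity: a non-constant `hourly_mean` introduces temporal variability, and `reshuffle_order` toggles the degree order change.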

A total of four conditions were simulated for comparison and 100 simulations were run for each condition. The first condition (C1) used neither temporal change (in the mean contact number in each hour) nor degree order change, serving as a baseline scenario representing general model assumptions regarding the contact networks for infectious disease studies, such as constant number of contacts over time and among individuals. That is, if a certain individual had the highest contact number (degree order) at the first time interval (h) in the network, it would remain the most active during the entire period of the simulation. In this condition, individual variability still existed (however, the mean hourly contact number remained the same through simulation) and individual variability was simulated by the variance of the aggregated total observations across all three pens (69 animals) at that time interval; such variability determined the individual ranks. In contrast, condition C4 was simulated with both temporal network degree distribution and degree order changes to investigate the effects of these two sources of variability.

Two more conditions were also simulated for further comparison. Condition (C2) assumed the mean contact number was the same at any time interval (hourly, h), but the degree order changed during the simulation (no temporal with order change). The simulation of the next condition (C3) incorporated temporal variability (change of number of contacts at an hourly basis using the observed contact network), but assumed the contact network degree order was static over time (temporal with no order change).

The maximum prevalence and its associated occurrence day in each condition for each set of parameters were recorded, along with outbreak size (n), duration of outbreak (Tf) and basic reproduction number (R0). The R0 was numerically computed from the secondary infections in the simulations. To compare the complete dynamics among these four conditions (C1–C4), we computed the time series of the difference (D) of any two conditions a and b. Thus a total of six new time series, D12, D13, D14, D23, D24 and D34 were calculated. Each of the new time series was fit to an autoregressive integrated moving average (ARIMA) model, with three parameters (p, d, q). If the fitted three parameters were 0, 0, 0, it indicated the difference between the two time series a and b resembled white noise; hence, those two time series were assumed to be similar. Otherwise, the two time series were statistically different. Besides the dynamics of prevalence, we also investigated individual contribution to new infections by computing the Gini coefficient, a parameter that quantified the heterogeneity in the group of individuals for the new infection. A homogeneous population would have a zero Gini coefficient, while a more heterogeneous system would give a higher Gini coefficient.
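For reference, the Gini coefficient of individual contributions can be computed with the standard rank-based formula, as in the sketch below. The input is a list with the number of new infections attributed to each individual; the function name and the toy inputs are illustrative.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal).

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, where x_i are
    the values sorted in ascending order and i runs from 1 to n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

print(gini([1, 1, 1, 1]))   # equal contributions
print(gini([0, 0, 0, 4]))   # one individual causes all new infections
```

For a group of n individuals the coefficient ranges from 0 (equal contributions) to (n − 1)/n (a single individual responsible for all new infections).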