The effect of temperature, farm density and foot-and-mouth disease restrictions on the 2007 UK bluetongue outbreak

In 2006, bluetongue (BT), a disease of ruminants, was introduced into northern Europe for the first time and more than two thousand farms across five countries were affected. In 2007, BT affected more than 35,000 farms in France and Germany alone. By contrast, the UK outbreak beginning in 2007 was relatively small, with only 135 farms in southeast England affected. We use a model to investigate the effects of three factors on the scale of BT outbreaks in the UK: (1) place of introduction; (2) temperature; and (3) animal movement restrictions. Our results suggest that the UK outbreak could have been much larger had the infection been introduced into the west of England either directly or as a result of the movement of infected animals from southeast England before the first case was detected. The fact that air temperatures in the UK in 2007 were marginally lower than average probably contributed to the UK outbreak being relatively small. Finally, our results indicate that BT movement restrictions are effective at controlling the spread of infection. However, foot-and-mouth disease restrictions in place before the detection and control of BT in 2007 almost certainly helped to limit BT spread prior to its detection.


S1.1.2 Direct movements
Direct movements were extracted for each year of the analysis for both cattle and sheep from the underlying database as follows: 1. Cattle: individual cattle movements between farms (i.e. the departure and destination locations were both farms) were aggregated from the underlying database for each pair of farms for each day of the analysis year. 2. Sheep: movements of sheep between farms, whereby the departure and destination locations were both farms, were extracted from the database for each pair of farms for each day of the analysis year. 3. The two datasets were then merged, so the total number of animals moved between each two farms for each day equalled the number of both cattle and sheep moved.

S1.1.3 Indirect movements
Indirect movements were extracted for sheep from the underlying database as follows: 1. Total number of sheep moved from farms to markets was aggregated from the database for each farm and market for day of the analysis year. 2. Total number of sheep moved from markets to farms was aggregated from the database for each market and farm for day of the analysis year.
The mixing mechanism that assigns farm to farm links when sheep are moved via markets has not changed and a value of q = 0.5 was used for this paper.

S1.1.4 Temperatures
Historical daily temperature data for farm locations were extracted from the UKCP09 5km daily observed archive [S2] for 1960 to 2015, by assigning farm temperature as the daily value of temperature at the grid point containing the farm. The UKCP09 dataset is derived from the UK Met Office archive of weather observations at synoptic stations, using regression to account for factors including altitude and coastal influence, and interpolation to transform the network values to a regular grid. Note that a 5-day moving average (involving the current day and preceding four days) was applied to the farm-specific temperature data before use. As a result, values produced by the temperature-dependent functions of the model are less sensitive to daily variations in temperature.

S1.2 Farm-specific values introduced S1.2.1 Degree of susceptibility
Degree of susceptibility (dS) values are an important factor in the calculation of transmission probabilities. In the previous model, all farms were assumed to be fully-susceptible at the start. The value for a farm would be updated after infection and, as the total number of infected animals (final size) on the farm was unknown, the maximum prevalence achieved on the farm was used. With the introduction of the within-farm model, it is now possible to calculate the final size of the outbreak on each farm separately and adjust each farm's dS value accordingly, thereby correctly estimating each farm's individual re-infection risk. Also, in the new model, each farm is given a dS value at the start. These can all be 100% (i.e. fully susceptible), as in the previous model. Alternatively, they can take different values that represent the proportion of animals on each farm that are protected from infection by either acquired immunity or vaccination.

S1.2.2 Prevalence curve
Farm-specific prevalence curves were introduced using a within-farm model. Although the main model is stochastic, the within-farm model was chosen to be deterministic for computational reasons. The model is a two-host, one-vector model with host incubation and recovery (i.e. SEIRtype) and vector incubation (i.e. SEI -type). The host population is assumed to be constant, while the vector population varies seasonally (driven by a temperature-dependent recruitment rate that is related to the new vector to host ratio function described in section S1.3). The model equations are given below.
Hosts (i = C for cattle or S for sheep) The parameters ν, a and μ are temperature-dependent functions. The vector to host ratio m is both time-and temperature-dependent. Temperature estimates (derived from real data) for the start and end of each daily time step are read in at the start. Temperature then varies linearly between Tstart and Tend for that day. The previous recruitment rate ρ is replaced with , leading to the new m function used in main code.
The within-farm model is used to produce a prevalence curve for each farm that reflects the number of animals on that farm, the location (and hence temperature profile) for that farm and the time of year that the infection was introduced to that farm. Using the number of days since infection was introduced, we can obtain from the prevalence curve a measure of the farm's infectivity (dI), which indicates how infectious the farm is to other farms. Figure S1 shows the range of prevalence curves that could occur during a single simulation when infection is introduced on 15 th of each month from May to October. The curve labelled 'mean all' is the mean across years (1990 -2015) when each year consists of a time series created by taking the mean across all farm grid points.  Figure S1: Range of prevalence curves that could occur during a single simulation with ma ≈ 500. Date of introduction is shown in the title of each subfigure. Curves were generated using temperature data from 1990 -2015. Curve 'mean all' is the mean across years when using the mean across all farm grid points. Curve 'mean SL' is the mean across years when using a randomly-chosen farm grid point whose Jun-Jul-Aug mean temperature falls within the low (0-0.2) quantile band. Curves 'mean SM' and 'mean SH' are the equivalent curves for the middle (0.4-0.6) and high (0.8-1) quantile bands, respectively. Curve '2006 all' shows the mean across all farm grid points for year 2006 only.
As the within-farm model is only used to produce a prevalence curve for a farm once that farm has become infectious, and our definition of infectious is having at least one infectious animal, the within-farm model is always initiated with one infectious animal (a cow if the farm is classed as cattle or mixed and a sheep otherwise). Using one exposed animal or one infectious vector instead produces an almost identical curve. The curves in Figure S1 were produced using the default parameter values shown in Table 3 in the main manuscript. These produce summer ma (i.e. the product of m and a) values that are approximately equal to 500. As the model is very sensitive to the size of ma, alternative values were considered when analysing the main (i.e. between-farm) model (see S3.1).
The within-farm model also provides farm-specific time series giving the number of infectious vectors released each day. These are used to calculate the risk to nearby farms resulting from vector dispersal, as described below in section S1.4.
The movement of exposed (i.e. latently-infected) animals is also a route of transmission. As in the previous model, the proportion of exposed animals on an exposed farm is assumed to be 0.01. For infected and detected farms, this was previously increased to 0.02. However, the proportion of exposed animals on these farms is now drawn from the within-farm model (by generating a 'proportion exposed' curve at the same time as the prevalence curve).

S1.2.3 Vector to host ratio
The model has been modified so that different parameter values for the vector to host ratio (m) function can be specified for each farm.

S1.3 Vector to host ratio function replaced
A new seasonal vector-to-host ratio (m) function has been introduced. It is a simplified version of that presented in Sanders et al. [S3]. It produces a seasonal pattern of m values that, when combined with the temperature-dependent biting rate (a), reasonably represents the pattern of trap catch data [S3,S4], in particular the summer plateau so often observed. Note that we have assumed that a trap acts like a host, attracting vectors that are looking to feed. Therefore, the number of vectors caught each night represents the number of vectors per host per night that are seeking to feed. We take this to represent the number of bites per host per night (i.e. number of vectors per host (m) x number of bites per vector per night (a)).
The model is highly sensitive to m, and hence ma, so a range of parameter values were considered, corresponding to ma plateau values equal to 100, 500 and 1000. The results are shown below in section S3.1.

S1.4 Distance kernel replaced
The distance kernel that controls between-farm vector transmission has been replaced with one determined by continuous release, diffusion and mortality of infectious vectors. The details are given below along with the new 'probability of exposure by vector movement' calculation, which had to be modified as a result of the changes.
The basis for our calculation is a standard result for diffusion in two dimensions with additional decay [S5]. The concentration at distance r and time t, C(r,t), from a point source on an infinite plane surface is given by where M is the total number added at the start, μ is the per capita mortality rate of infectious vectors and D is the diffusion coefficient (see equation 1 below). If the source is continuous, rather than instantaneous, specifically if vectors emanate at rate ω(t), then the concentration is found by integrating the instantaneous solution with respect to time t [S5 p. 31 -32].
Unfortunately, in this case an algebraic solution is not possible, not least because μ changes daily with the discrete temperature data. However, we can calculate a discrete approximation by treating each day as a separate instantaneous release and then summing across the different days. The sum can be written as Note that in the discrete case, we stop at t -1 (i.e. the last day that vectors are released before time t).
The accuracy of the approximation can be increased by dividing each day into a number of sub-steps (nsteps). The higher the number of sub-steps, the better the approximation. Unfortunately, increasing the number of sub-steps also increases computation time. As shown in Figure S2, 10 substeps was found to be sufficient for a good approximation, with accuracy only an issue for small distances (e.g. < 0.5km).  Within the main code, the C(r,t) function is evaluated for different Tinf (i.e. time since the source farm became infectious) and different r (i.e. distance from source to target farm). The values for ω(t') (i.e. the number of new infectious vectors released on day t') are drawn from the within-farm model, as mentioned earlier. Each mortality rate μi is the average mortality rate for time step i to i + 1. The advantage of this approach over the previous distance kernel approach that involved the relatively arbitrary parameter α2 is that the remaining parameter D can be estimated as follows from data on the distance travelled by a vector in t days after its release. Let xt be a vector containing the distances from the source farm that each vector travelled in time t (assuming random walk motion). For diffusion in two-dimensions, it is known that The left-hand side of this equation is the mean squared displacement (MSD) in time t. Using data from the study described in Kluiters et al. [S6], we obtained an estimate for D of 0.531. This corresponds to an average distance of approximately 1.5 km per day.
C(r,t) is interpreted as the number of infectious vectors at distance r from the source farm t days after the source farm became infectious. This is used to calculate the average number of new cases arising on the target farm as a result of vector dispersal from the source farm, as in the previous model. However, the formula for the rate at which cases arise at farm i as a result of infection at farm j (ηij) now becomes As before, dSi is the degree of susceptibility of farm i, βvh is the probability of transmission from vector to host given an effective contact and ai(T) is the biting rate at farm i when the temperature is T.

S1.5 Detection mechanism modified
Detection of cases determines when and where movement restrictions are imposed. It is, therefore, important to accurately capture this process. In the previous model, the parameters (λD and λDS) governing the probability of detection were estimated from a single report in the literature in combination with the fixed prevalence curve we were then using within the model. We hoped to update these estimates, but were unable to find good quality data. There are a few reports that quote the number of animals or proportion of a herd showing clinical signs (e.g. [S7]). However, these rarely include the total number of animals that were infectious. As a result, it is impossible to calculate the number of infectious animals showing clinical signs. Also, this approach neglects the fact that (with the exception of dairy farmers) most farmers check their stock relatively infrequently (could be as little as once a week in late summer) and from a distance, only investigating further if an animal is showing severe clinical signs. Add to this the fact the some of the signs of BTV infection are common to other diseases and it is easy to see that slight illness in one animal might be dismissed as nothing requiring immediate attention, unless of course BTV had already been confirmed on another farm. So, the probability of detection is really a combination of the proportion of animals showing severe clinical signs and the frequency with which animals are checked. To try and capture this we have introduced pre-and post-detection parameter values. Prior to the first case of BTV being identified, the pre-detection values are used. Afterwards, the model switches to the post-detection values. The actual values were chosen during the validation step (as described below). Table 3 in the main manuscript. Most were obtained from the literature (Gubbins et al. [S8] and references therein). Others were estimated during validation as described below. Additional parameter values considered during validation and sensitivity analysis are listed in Table S1.   Figure S3 were produced.   Rather than prevalence, the curves show seroprevalence (i.e. the cumulative number of infected animals) as the tests used during the outbreak detect antibodies and not necessarily active infection.

S2. Validation BTV model functions, parameters and default values are given in
For BTV, the model gives a final seroprevalence of either 10.7% or 43.4%, depending on the nature of the introduction. As shown in Table S2, initial estimates during the UK outbreak were 8.9% (0.3 -49.2) for cattle and 3.6% (0.05 -13.0) for sheep [S9]. At the end of the outbreak and after further testing, the estimate for cattle was revised to 11% (1 -57) [S10]. These are consistent with our estimates following the introduction of one latent vector.
In order to further validate the within-farm model, the exercise was repeated using parameter values for Schmallenberg virus (SBV), which is transmitted by the same vectors. See Table S3 for a comparison of BTV and SBV parameter values. The results for SBV show that the seroprevalence quickly tends to 100%, exceeding 90% at either 29 days post infection (dpi) or 25 dpi, depending on the mode of introduction. This is consistent with results published in Wernike et al. [S11], which show seroprevalence on a farm in Germany (close to the original source of the SBV outbreak) rising to 100% within 4 weeks of the first infected animal being detected.

S2.2 Between-farm model
To test the main (i.e. between-farm) model, infection was introduced by making nine farms in Suffolk and Essex latently-infected on 5 August (Table S4). The date and places of introduction were based on information in Gloster et al. [S12] and the BTV case data for the 2007 UK outbreak. The index cases were farms that were found to be infected within one week of the first reported case. For reasons given below, we assumed that they were all exposed on the same night. Reasons:  If we assume infectious midges were blown over from mainland Europe, then we started with one (or more) exposed animals on one or more farms.  The average time it takes an exposed animal to become infectious ('time to first infectious farm') equals the average host incubation period (HIP), which is 7 days for cattle and 5 days for sheep.  At best, detection is likely to occur several days later.  The average time from infectious animals appearing on the primary farm to infectious animals appearing on another (secondary) farm as a result of infection on the primary farm (i.e. 'time to secondary infectious farm') must equal the sum of the average extrinsic incubation period (EIP) and the average HIP.  EIP varies with temperature, but in September in the UK is likely to be 6 -14 days.  So, the average 'time to secondary infectious farm' equals 11 -21 days.  Therefore, it is likely that all farms found to be infected within one week of the first reported case were exposed on the same night as the first case, as insufficient time had elapsed for them to be secondary cases.
As discussed above, the detection parameters could not be estimated from data and so were selected as follows:  The pre-detection values had to be low enough to give a median date of first detection that was close to the observed date of 21 September (following introduction on 5 August).  The post-detection values had to be such that the median number of farms detected by the end of the year was approximately 42 (the number reporting clinical signs during the UK outbreak). Note that the total number of farms infected during the outbreak (i.e. all farms identified on or before 15 March 2008) was 135. Around 25 of these were detected by a combination of surveillance and tracing. It is impossible to say how many might have been detected later as a result of clinical signs. However, the fact that approximately half of all cases were only found following premovement testing in 2008 suggests that many of them would have gone undetected.
Unfortunately, 8 cases of foot-and-mouth disease (FMD) were also reported in the UK in 2007. For the purposes of validation only, FMD movement restrictions were approximated within the model by removing all animal movements for two periods: 1) 3 August to 7 September (day 215 to 250) 2) 12 September to 4 October (day 255 to 277).
Through a process of trial and error, rather than fitting, we found that by using 0.001 before the first case was detected (i.e. λD0 = λDS0 = 0.001) and 0.01 thereafter (i.e. λD1 = λDS1 = 0.01) we obtained results consistent with the observed data. Starting with nine latently-infected index cases on the 5 August (as described above), the median total number of farms infected was 118 (min 95 -max 261) with 37 (min 27 -max 52) detected by the end of the year (see Table S5). The first case was detected on 10 Sep (min 12 Aug -max 28 Sep). The number of farms detected each week is shown in Figure  S4. Increasing or decreasing the post-detection values (λD1, λDS1) affects the total number of farms detected, but has little effect on the total number of farms infected. Naturally, it has no effect on the date of first detection. Table S5 also contains results for the case where the pre-detection values (λD0, λDS0) are both 0.01 (i.e. equal to the default post-detection values). While the numbers of farms infected and detected were unchanged, the date of first detection was far too early (the maximum was almost two weeks earlier than the actual date).  In terms of spatial distribution, Figure S5 shows each farm's probability of detection (i.e. the proportion of simulations in which they were detected). Individual simulations in which the number of farms detected equals the median and maximum values of 500 simulations, respectively, are also shown along with a plot of cases reported in 2007. The model predictions show a more clustered pattern of cases than that observed. This could be due to some movements being omitted (e.g. illicit movements, which are not recorded in the movement databases) or the way in which we have modelled the diffusion of vectors (e.g. without including the effects of wind). It could be because we do not allow movements from high risk to low risk zones. In reality, these are allowed if the source farm tests negative. In terms of pathogen transmission, this means that we could be preventing the movement of latently-infected animals, which would otherwise be allowed. This is only an issue after the first case has been detected. Prior to that, longrange movement of infection can occur as shown in Figures S5a and S5b. There is a small chance that the lack of dispersal could be due to the FMD restrictions in our model. These are approximate. Figure S6 shows that small numbers of movements occurred within the periods during which we prevented all movements, in particular between 24 Aug (when movements outside the FMD Surveillance Zone were allowed subject to additional controls) and 7 Sep (after which all FMD restrictions were lifted). These movements occurred before BTV was detected and BTV movement restrictions were implemented. So, there is a chance that they could have resulted in dissemination of BTV that is not captured by our model. However, the results of tracing and testing carried out by Defra [S10] suggest that this is not very likely. Finally, the relatively wide scattering of observed cases could be due to additional (i.e. on the same night) or subsequent introductions. Gloster et al. [S12] shows that several other nights in August, September and October were considered suitable for vector incursion and it has been stated that it is likely that several incursions took place between 4 th August and 20 th December 2007 [S10].

S2.3 Additional simulations of between-farm model
In order to verify that no other combinations of λD0, λDS0, λD1 and λDS1 were capable of producing results consistent with the outbreak data, particularly when p1 and p2 were also allowed to vary, an additional 100,000 simulations were executed. Parameter values for p1 and p2 (Set 4 in Table S1) were combined with values for λD0, λDS0, λD1 and λDS1 (Set 5 in Table S1) to produce 1000 parameter sets. For each parameter set, 100 simulations were run and three median values were calculated, namely the median number of farms infected, the median number of farms detected and the median date of first detection. For each of these, a feasible range (criterion) based on the observed value was specified. Only parameter sets that satisfied all three criteria were retained. The criteria are listed below and the results presented in Table S6 and Figure S7.   Figure S7: Each point corresponds to a parameter set that produced results which satisfied all three criteria.
It is clear from Table S6 and Figure S7 that the parameter values selected in the previous subsection and used throughout this paper as the default values, namely p1 = 10.59, λD0 = 0.001 and λD1 = 0.01, are consistent with these results. Furthermore, only combinations involving closely related values can produce results consistent with the outbreak data.

S3. Sensitivity analysis S3.1 Effect of different ma values
As explained in section 1.3, the product of the new seasonal vector-to-host ratio (m) and the temperature-dependent biting rate (a) reasonably represents the pattern of trap catch data, in particular the summer plateau. For validation purposes, we used parameter values that produce a plateau approximately equal to 500 biting midges per host per day, which could be considered a conservative estimate. As the model is sensitive to m, and hence ma, here we present results for scenarios in which the plateau is approximately equal to 100 and 1000 too. Figure S8 shows plots of ma versus time. The blue and green curves were generated using daily temperature data (with 5-day smoothing) from 2006 and 2007, respectively. The red curve was produced using data corresponding to the mean across years 1995-2014. As indicated by the horizontal black lines, the plateau height of the mean curve is approximately 100, 500 and 1000, respectively. Essentially, the figures are identical except for the height of the plateaux. To facilitate direct comparison with the validation results, the first set of sensitivity results (Tables S7  and S8) was generated using the default parameter values (except for the two controlling the plateau height and the four detection parameters), nine specified index cases on 5 August in Suffolk and Essex, 2006 movement data and 2007 temperature data. The second set of sensitivity results (Tables S9 and S10) was produced using the same parameter values, one random index case on 1 June in Hampshire, 2013 movement data and 'average' temperature data. The 'average' temperature data corresponds to data from one year (2011) with a mean across the summer months (MJJASO) that closely matches the median of summer values over the last 10 years (1996 -2015). These results are useful as they reveal the behaviour of the model when using a different starting time, a different starting place, different movement and temperature data and a different number of index cases. These settings are also used to produce some of the results in the main paper.    In Table S9, when the pre-and post-detection parameters are (0.001, 0.001) and (0.01, 0.01) respectively, the median number of farms infected is 10 when the plateau of ma is approximately equal to 100, and 837.5 when the plateau is approximately equal to 1000. This illustrates the sensitivity of the model to the number of bites per host per day.

S3.2 Effect of detection parameters
Detection rates were revisited as part of the validation step, where results for a range of values were presented. Detection rates affect the timing of control measures, yet there is little information about them. Additional results with different levels of ma and different initial and running conditions are included in Tables S7 -S10.

S3.3 Partial Rank Correlation Coefficients
To assess the sensitivity of the model to small perturbations in parameter values, we calculated the partial rank correlation coefficient (PRCC) of each parameter with respect to the median outbreak size. This method allows parameter values to be varied simultaneously. As we did not have feasible distributions for each parameter, we chose to use Uniform distributions with limits equal to +/-10% of the default estimate in each case. 100 parameter sets were created using Latin Hypercube Sampling. For each parameter set, the model was run 100 times. Each time, infection was introduced into Hampshire on 1 June and 2013 movement data and average temperature data were used. The results are presented in Table S11 with parameters considered to be significant shown in bold. Top of the list is p1, which is the main parameter governing the height of the ma plateau. This is followed by ν2, the virus' development threshold, and μ2, one of the parameters controlling vector mortality. The remaining parameters in bold include those that determine the vector biting rate, the effect of temperature on m, the host to vector transmission probability and vector diffusion. This clearly demonstrates that median outbreak size is largely determined by the level of vector transmission.