Climate induces seasonality in pneumococcal transmission

Streptococcus pneumoniae is a significant human pathogen and a leading cause of infant mortality in developing countries. Considerable global variation in pneumococcal carriage prevalence has been observed, and the ecological factors contributing to it are not yet fully understood. We use data from a cohort of infants in Asia to study the effects of climatic conditions on both the acquisition and the clearance rates of the bacterium, finding significantly higher transmissibility during the cooler and drier months. Conversely, the length of a colonization period is unaffected by the season. Independent carriage data from studies conducted on the African and North American continents suggest similar effects of the climate on the prevalence of this bacterium, which further validates the obtained results. Further studies could be important to replicate the findings and to explain the mechanistic role of cooler and drier air in the physiological response to nasopharyngeal acquisition of the pneumococcus.

parameters under the three scenarios considered. In each cell, the first number is the posterior mean, and the next two numbers denote the upper and lower limits of the confidence intervals, respectively. We use the symbols ↓ and ↑ to denote the minimum and maximum values of the posterior means in each column.

Seasons based on climate
In Supplementary Table 2 we show the posterior summaries for clearance   Above x is any of the , , , . We set the parameter q equal to the prevalence of strains other than , , , : For the parameter q' we use the following approximation: Additionally, we set = 0.0058, which is the median frequency of the strains in the serotype distribution.
Motivation for considering a reduced state space. The key point here is to treat separately the states with the strains that were actually observed at the two sampling times, and to lump together the possible states that involve the unobserved strains. The probability of a transition from the pair {*, Ø} to the pair {*, *}, defined as above, is the only approximation that introduces a deviation of the lumped Markov model from the full Markov model with all the strains treated separately. This is because a constant is used instead of explicitly modeling the serotype diversity distribution. While this is a reasonable approximation in itself, we can also predict a priori that the probability of the chain visiting state {*, *} during the month between the observation times is very low for both the lumped and the full model. This is because the rate of cocolonization is low in general. On the other hand, the sampling is dense enough to detect most of the colonizations, which are known to last at least approximately one month.

Strain-effects in the biologically detailed model
In the biologically realistic colonization model, the clearance rates are scaled according to the estimates given in Table 2 of the paper of reference 1. In that paper, point estimates are given for the clearance rates of the 28 most common serotypes. Based on that information, we construct a serotype-specific modifier to the clearance rates as follows: $h(k) = \mathrm{cl}(k) / \mathrm{median}(\mathrm{cl})$. Above, cl(k) is the point estimate in the paper for the clearance rate of strain k, and median(cl) is the median of all the estimated clearance rates. We thus normalize the point estimates so that the strains with the median clearance rate have $h(k) = 1$. For those strains that were not considered in the paper of reference 1, we set $h(k) = 1$ in our analysis.
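The normalization described above can be illustrated with a minimal Python sketch. It assumes that the modifier is the ratio of a serotype's point estimate to the median of all point estimates, which is the simplest form consistent with the median serotype receiving a modifier of 1. The function name `clearance_modifiers`, the serotype labels and the rates are hypothetical and are not taken from the cited paper.

```python
from statistics import median


def clearance_modifiers(point_estimates, all_serotypes):
    """Return a modifier h(k) for every serotype in `all_serotypes`.

    Assumption: h(k) = cl(k) / median(cl) for serotypes with a point
    estimate, and h(k) = 1 for serotypes without one.
    """
    med = median(point_estimates.values())
    return {
        k: point_estimates[k] / med if k in point_estimates else 1.0
        for k in all_serotypes
    }


# Hypothetical clearance rates (illustrative only, not published estimates).
example_rates = {"A": 0.02, "B": 0.04, "C": 0.08}

# Serotype "D" has no point estimate, so its modifier defaults to 1.
modifiers = clearance_modifiers(example_rates, ["A", "B", "C", "D"])
```

In this toy input the median rate is that of serotype "B", so "B" and the unlisted serotype "D" both receive a modifier of 1, while "A" clears at half and "C" at twice the median rate.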
We assume that the strain label affects the colonization dynamics for any strains x and y under the neutral model as follows: As the effect of the strain is multiplicative on the rate of clearance, the effects of the season and of the label of the colonizing strain on the clearance rate are independent of each other. From this assumption it follows that the relative differences between the clearance rates of different strains are the same in every season. Observe that the strain effect of an unobserved strain is set equal to 1.
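The consequence of the multiplicative assumption can be checked numerically with a short Python sketch. It assumes only that the clearance rate is the product of a season-specific rate and a strain modifier. The season names, rates, modifiers and the function `clearance_rate` are hypothetical placeholders, not estimates from the study.

```python
def clearance_rate(seasonal_rate, strain_modifier=1.0):
    """Clearance rate as the product of a seasonal rate and a strain modifier.

    The default modifier of 1 corresponds to an unobserved strain.
    """
    return seasonal_rate * strain_modifier


# Hypothetical seasonal baseline rates and strain modifiers.
seasonal_rates = {"cool_dry": 0.030, "hot_wet": 0.025}
strain_modifiers = {"x": 0.5, "y": 2.0}

# Ratio of the clearance rates of strains x and y within each season.
ratios = {
    season: clearance_rate(rate, strain_modifiers["x"])
    / clearance_rate(rate, strain_modifiers["y"])
    for season, rate in seasonal_rates.items()
}
```

Because the seasonal rate cancels in the ratio, every entry of `ratios` equals the ratio of the two strain modifiers, whichever season is considered.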

Effects of exposures in the biologically detailed model
In the biologically realistic colonization model, we also assume that the colonization history of a host affects the dynamics of future infections. We denote this colonization history with ( (1: )), that is the time studies, that there exist both serotype-specific and serotype-independent immunity acquired via past colonizations, but reacquisition of the same serotype is perfectly possible. However, it seems that the serotype-specific acquired immunity works mostly by reducing the acquisition rate 2, while the serotype-independent immunity works by increasing the clearance rate of future colonizations 3,4.
Based on these previous findings, we set the effects of previous acquisitions to be the following:
parameter    explanation
= 0.7    Serotype-specific immunity to acquisition
= 1      Serotype-independent immunity to acquisition
= 1      Serotype-specific increased clearance rate
changing one colonization state to include one additional strain that was not originally observed at that time, but was observed before or after that sampling time. Parameter q denotes the probability of missing a serotype, which we set to 0.2. Current estimates of this probability for the swabbing method 5 are typically less than or equal to that. The likelihood for these 'modified' time series is then obtained similarly as explained in equation (2).
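The construction of the 'modified' time series can be sketched in Python as follows. The sketch only enumerates the candidate series obtained by adding, at one sampling time, one strain that was not observed then but was observed at some other sampling time. How the probability q = 0.2 enters the likelihood is not reproduced here. The function name `modified_series` and the example serotype labels are hypothetical.

```python
def modified_series(series):
    """Yield every series that differs from `series` by one added strain.

    `series` is a list of frozensets, one per sampling time. A strain may
    be added at time t only if it was not observed at t but was observed
    at some other sampling time.
    """
    for t, observed in enumerate(series):
        # Strains seen at any sampling time other than t.
        seen_elsewhere = set().union(
            *(s for i, s in enumerate(series) if i != t)
        )
        for strain in sorted(seen_elsewhere - observed):
            modified = list(series)
            modified[t] = observed | {strain}
            yield modified


# Hypothetical observations at three consecutive sampling times.
example = [frozenset({"19F"}), frozenset(), frozenset({"19F", "23F"})]
candidates = list(modified_series(example))
```

For this toy input there are three candidates: one adds "23F" at the first sampling time, and two add "19F" or "23F" at the second, at which nothing was observed.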

Monthly birth rates in the cohort
The babies included in the study were selected independently of the month in which they were born. In detail, between October 2007 and November 2008, all the pregnant women in the camp attending the SMRU antenatal clinic at 28-30 weeks gestation were invited to consent to their infant's participation in a pneumonia cohort study. The mothers were subsequently randomized into a cohort that was sampled each month and a cohort that was not sampled systematically each month. The latter cohort was excluded from the seasonality analysis presented here.
Approximately 80% of the women are estimated to attend the SMRU antenatal clinic 8, so the data cover the majority of the births in the camp during that time period, and the distribution could reflect the actual distribution of births. A peak in births is observed in December 2007, when 114 babies were born, compared with 51 babies born in July 2008. We also show the distribution of birth months for the subset of infants who were sampled each month and who form the cohort analysed in the current study.
Supplementary Figure 5: The numbers of births in different months during the onset of the study, 2007-2008. The dates of birth cover both the cohort that was sampled each month and the cohort that was sampled infrequently.
Supplementary Figure 6: The numbers of births in different months in the cohort that was sampled each month. This is the cohort that was analysed in the current study.

The minimum, mean and maximum temperatures during the study
Supplementary Figure 7 shows the minimum, mean and maximum monthly temperatures in the study region during the considered years. The three quantities are observed to correlate, and the curve for the minimum temperature appears the most stable.
Supplementary Figure 7: The measured maximum, mean and minimum temperatures in the Mae Sot region during the study.