The role of influenza in the epidemiology of pneumonia

Interactions arising from sequential viral and bacterial infections play important roles in the epidemiological outcome of many respiratory pathogens. Influenza virus has been implicated in the pathogenesis of several respiratory bacterial pathogens commonly associated with pneumonia. Though clinical evidence supporting this interaction is unambiguous, its population-level effects—magnitude, epidemiological impact and variation during pandemic and seasonal outbreaks—remain unclear. To address these unknowns, we used longitudinal influenza and pneumonia incidence data, at different spatial resolutions and across different epidemiological periods, to infer the nature, timing and the intensity of influenza-pneumonia interaction. We used a mechanistic transmission model within a likelihood-based inference framework to carry out formal hypothesis testing. Irrespective of the source of data examined, we found that influenza infection increases the risk of pneumonia by ~100-fold. We found no support for enhanced transmission or severity impact of the interaction. For model-validation, we challenged our fitted model to make out-of-sample pneumonia predictions during pandemic and non-pandemic periods. The consistency in our inference tests carried out on several distinct datasets, and the predictive skill of our model increase confidence in our overall conclusion that influenza infection substantially enhances the risk of pneumonia, though only for a short period.


S-1.2 Dataset 2
The New York city portion of dataset 2 consists of pneumonia and influenza cases in New York city reported weekly. The dataset spans a period of 212 weeks from 1920 to the end of 1923. The time series of pneumonia case reports contained one missing observation and one outlier, and the time series of influenza case reports consisted of 4 missing observations. These missing data observations and the outlier were filled in by values obtained by linear interpolation, before the analyses were carried out. The population sizes of each of these cities are also taken from 1920 census. The population of New York city is taken from the 1920 census to be 5,620,048. In addition, the supplementary portion of dataset 2 consist of pneumonia and influenza weekly case reports, for the same time period in 4 other cities; (i) Chicago, (ii) Philadelphia, (iii) Baltimore, and (iv) Los Angeles. The missing values in these dataset were likewise filled in using linear interpolation, before the analyses were carried out. These data are shown in Fig. S-2. The population sizes of each of these cities are also taken from 1920 census, and are shown in Fig. 3A in the main text.

S-2 Models
We adapt the SIRS model [3,4], to serve as a basis of the underlying epidemiology of bacterial pneumonia. In this framework, S(t), I(t), and R(t) represent the numbers of susceptible, infected and recovered individuals, respectively. We further subdivide the susceptible and infected compartments into two sub-compartments S-2

Philadelphia
Time Weekly influenza and pneumonia cases (log10) q q q q q q q q q q q q q q q q q q qq q q q q qq q q q q q q q q q q q q q q qq qqq q q q q q q qq q

Baltimore
Time Weekly influenza and pneumonia cases (log10) q q q q q q q q q q q q q q q q q q q q q q qq q q q qqq q q q q qq q q q q q q q q q q q q q q q q q q qq qq q

Los Angeles
Time Weekly influenza and pneumonia cases (log10) q q q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q qq q q q q q influenza pneumonia missing data Hypothesis I: θ > 1 Transmission Impact Hypothesis II: φ > 1 Susceptibility Impact Hypothesis III: ξ > 1 Pathogenesis Impact φλ γ λ γ ρ p ξρ p Figure S-3: Schematic representation of the model and the three hypotheses. We used a previously developed model [2] for transmission dynamics of bacterial pneumonia. In this model, the population was categorized into susceptible (S), infectious (I), and recovered (R) classes. Individuals progress along S → I → R → S at per capita rates λ, γ and , respectively. Progression of individuals recently infected with influenza were tracked separately, via classes SF and IF . Pneumonia case reports were taken to be fraction of the infecteds as they recover. Births and deaths were present in the model, but omitted in this illustration for clarity. We test three hypothesized pathways of influenza-bacterial pneumococcal interaction: Hypothesis 1 (θ > 1): Individuals infected with bacterial pneumonia, contribute more to transmission of pneumonia if they have been recently infected with influenza. Hypothesis 2 (φ > 1): Individuals recently infected with influenza are more susceptible to bacterial pneumonia. Hypothesis 3 (ξ > 1): Individuals infected with bacterial pneumonia are more likely to be reported, if recently infected with influenza. each: susceptibles and infecteds that are currently or very recently infected with influenza, S F (t) and I F (t), respectively, and the ones that are not, S U (t), and I U (t), respectively. Let F(t) be the number of influenza cases reported at time t. Let ρ F be the reporting rate of influenza, ie, F (t) ρ F is the number of influenza infecteds. Assuming that influenza operates independent of pneumonia, it is equally likely to be prevalent among individuals infected, susceptible and recovered with respect to pneumonia, the expected number of susceptibles who are also infected with influenza. Here, N (t) is the population size. And, the remaining susceptibles, are currently not infected with influenza. The force of infection, or the pneumonia hazard, is composed of two parts: a part that is proportional to the fraction of infecteds at current time, and a part that is independent of this dynamics. This is slightly different from the standard formulation, which only includes the former. The addition of the latter is to account for high levels of asymptomatic carriage of the bacteria generally seen in the population [5,6,7]. We will also model the transmission to be seasonal. Seasonality is implemented using 6 cubic spline functions. Hence, if I(t) is the number of pneumonia infecteds, and β(t) is the seasonal transmission rate, then the force We allow for the susceptibility to pneumonia among those that are infected with influenza to be different from those that are not infected with influenza. This is modeled by hazard multiplier φ-individuals infected with influenza experience pneumonia hazard φ times those that are not infected with influenza. This allows for testing hypothesis 2 (φ > 1). Setting φ = 1 assumes that both groups are equally susceptible, and is the null expectation for hypothesis 2.
We also allow for influenza to affect the transmission of bacterial pneumonia. We differentiate the contribution of an infected host to the hazard depending on whether or not she is infected with influenza. We introduce transmission multiplier θ-individuals recently infected with influenza contribute θ times to transmission compared to those that are not. This allows for testing hypothesis 1 (θ > 1). Setting θ = 1, assumes that both groups contribute equally to the hazard, and is the null expectation for hypothesis 1. To accommodate this addition, we consider the force of infection in an expanded form: With per capita recovery rate γ, loss of immunity rate , and birth/mortality rate µ, we have the following set of equations for the deterministic framework of the pneumonia model.
The observed variable in this model is number of pneumonia cases, C P , observed at weekly intervals. They are assumed to be on average a fraction, ρ p , of all infecteds in a given week as they recover. We allow for influenza to affect the reporting of pneumonia cases, via affecting the severity of the disease. We introduce severity multiplier, ξ-individuals recently infected with influenza are reported ξ times compared to those that are not. This allows for testing hypothesis 3 (ξ > 1). Again, setting ξ = 1, assumes that both groups are reported equally, and is the null expectation for hypothesis 3.

S-2.1 Incorporation of demography.
For analyses in dataset 2, we did not consider the change in populations over the 4 year period. The population sizes in the 5 major cities were held constant to the census recordings for the cities in 1920. The sizes are tabulated in Fig. 3 in the main text. For analyses in dataset 1A, and dataset 1B, we incorporated the population data from Illinois in the model. The population levels in Illinois during this period are shown in Fig. S-1. The population data is treated as a covariate in these models, and the change in population is treated as excess birth that directly enter the susceptible class.

S-2.2 Demographic and Extra-demographic Stochasticity
We translate the deterministic transmission model, defined by the the system of ordinary differential equations given above, into a stochastic model by modeling the flux between compartments to be a random process. That is, the per capita rates are constant and that the fluxes out of each compartment are independent, multinomial random variables over a small time interval of duration ∆t,. Extra-demographic stochasticity is modeled via a gamma-distributed multiplicative white noise, dW/dt, in the transmission process [8,9]. The standard deviation of this noise, β sd is also fit along with the parameters. Hence, the force of infection is given by:

S-2.3 Measurement model
Let H(t) to be the total number of new recoveries in week t, of which H F (t) are recoveries coinfected with influenza, and the remaining H U (t) are not. We assume that the weekly case reports, C P , are normally distributed. ie, where, ρ p is the reporting ratio, and σ is the standard deviation. We assume that the variance scales linearly with the mean. i.e. σ 2 = c 2 ρ p C P . The scalar, c, is also fit along with other parameters.
To allow for differences in reporting in pneumonia due to presence of influenza, we assume that the cases with influenza are ξ times more likely to be reported compared to the cases without influenza. This results in the following expansion of the above equation:

S-3 Likelihood inference framework
We utilize the freely available software package pomp [10] that implements a partially observed Markov processes [11,12,13] framework to carry out likelihood based inferences. This framework consists of following four components: (i) Data. The data consists of weekly case reports pneumonia. (ii) Covariates. We use weekly case reports of influenza as covariates for all three datasets. For datasets 2 & S-6 3, we additionally use annual population sizes as a covariate. (iii) The process model. The process model is proposed to describe the underlying epidemiological and demographic processes (described in subsection S-2.2). (iv) The measurement model. The measurement model is proposed to describe the process by which the data are reported (described in subsection S-2.3).
Consider a data set Y , consisting of observations of y(t j ), j = 1, . . . , n at n points in time. For likelihoodbased inference, we calculate the likelihood that a chosen parameter set θ explains the complete data (within the confines of the process and observation models). This likelihood function L(θ) is a product of conditional likelihoods, L tj (θ), calculated at each time t j for all n data points in time. If f (y | t, θ) is the probability of observing the data y(t) at time t, given parameters θ (measurement model), then the likelihood and log-likelihood functions are defined as follows: The Markov property of the model allows one to calculate the conditional likelihood L tj (θ) sequentially starting from t j = t 1 . For each t j , the likelihood function is: Here, x(t j ) represents the state of the complete system at time t j . Furthermore, the calculation of the filtering distribution is simplified identifying a recursive relation by applying Bayes' Theorem.
The likelihood functions are optimized using mif algorithm, which is also implemented in pomp [10] package. For details pertaining to the methods and implementation of the algorithm please refer to [11,13,10]. The range of algorithm parameters used in the inference work are shown in Table S

Influenza attributable etiological fraction of pneumonia cases
Influenza attributable etiological fraction of pneumonia cases in any time interval, is taken to be the ratio of pneumonia cases as a result of influenza to the total pneumonia cases in the given time interval. Let H(t)  be the total number of new recoveries in week t, of which H F (t) are recoveries coinfected with influenza.
With the normal reporting process, with the reporting ratio, ρ p , and the standard deviation σ, the total reported cases are C P ∼ N (ρ p H F , σ), and total reported cases of pneumonia infections with influenza are Recall, ξ is the hypothesized measure of severity impact. The influenza attributable etiological fraction of pneumonia cases during week t, E F (t) is given by: We estimate C P , and C F P , using separate MLE models for datasets 1A, 1B, and 2, which are presented in table S-2. Note that ξ = 1 in these MLEs. We use 1000 simulations of the MLE models to find the mean and the 95% confidence intervals for these estimates.

S-4.1 General features of pneumonia epidemiology
In Table S-2, we list all the parameters used in the pneumonia model, including the ones that are estimated. We estimated 16 parameters altogether: 2 for initial conditions, 6 for the shape of the seasonality, 3 for reporting processes, 3 for epidemiology of bacterial pneumonia, 1 for interaction, 1 for extra-demographic stochasticity.
In Table S-2, we also show the estimates of all other parameters corresponding to the MLE. The simulations arising from the MLE model, in comparison to the null model (with no interaction) for Dataset I are shown in Fig. S-7. Pneumonia is highly seasonal, seen in the difference of β i s-the transmission during the high season is about 7 times that during the low season. The infection is estimated to last a little less than a week (∼ 6 days). The immunity imparted by the infection is estimated to last long (∼ 20 years), although, this estimate is likely to not be robust since we only used 4 years of data. The symptomatic ratio of pneumonia is estimated to be about 3%.
A defining characteristic of the epidemiology of pneumonia is the large carriage rate. It is believed that a substantial fraction of the population can be carrying pneumococcal bacteria in the naso-pharynx without showing any symptom [15,16]. The duration of carriage can vary between weeks to months. Yet, it is unclear how the carriage affects invasive pneumonia [17,15,18]. Here, we estimate the contribution of carriage to the total force of infection, ω, in Dataset I. In Fig. S-4, we show the likelihood profile of this parameter ω. We estimate this to be between 0.047 to 0.234. This suggests that pneumonia epidemiology is primarily driven by asymptomatic carriage.

S-4.2 Timescale of the interaction
In the original model, we hypothesized that individuals currently or very recently infected (up to a week) with influenza face different hazard rate compared to those that do not. Here, we hypothesize that individuals infected up to three weeks in the past with influenza face different hazard rates. Let F 1 (t), and F 2 (t) respectively be the influenza case reports 1 week and 2 weeks in the past, i.e. F 1 (t) = F(t − 1), and F 2 (t) = F(t − 2). Let S F1 , and S F2 respectively be the sizes of the susceptible population that were infected with influenza between 1-2 weeks, and 2-3 weeks in the past. We estimate these quantities in the same way With S F , defined the same way as in the original model, and S U being the rest of the susceptible population, the susceptible compartment is now divided into 4 sub-compartments: S F , S F1 , S F2 , and S U . Let the hazard rates faced by each of the susceptible sub-compartments be φ λ, φ 1 λ, φ 2 λ, and λ, respectively.  The complete lagged-interaction model is described by the following set of equations: Our estimates for φ 1 and φ 2 , for pneumonia are shown in Fig. S-5. We find that the null models, φ 1 = 1 and φ 2 = 1 are within the 95% confidence interval, indicating that there is no evidence of susceptibility enhancement among cohorts that were infected with influenza more than 1 weeks prior.

S-4.3 The three hypotheses
Here, we explore the three different pathways of interaction in dataset 2, New York City in further detail. In particular, we are interested in understanding the tradeoffs between the each of the hypotheses. To do so, we look at likelihood surfaces, when profiled over two of the three parameters holding other to the null. Likelihood surfaces over φ-ξ plane (Fig. S-6B), φ-θ plane (Fig. S-6E) and θ-ξ plane (Fig. S-6H), show how two parameters are associated. When the model is allowed to account susceptibility impact (panels B and E), the maximum likelihood models select for only susceptibility impact. When the model is not allowed to account for susceptibility impact (panel H), the maximum likelihood model selects for pathogenesis impact (over transmission). But, the MLE value in this instance is significantly smaller that MLE values obtained in the instances where the model is allowed to account for susceptibility impact.

S-4.4 Out-of-fit Predictions in other cities
We test the ability of the models we formulated on the basis of New York data (Dataset II) to predict pneumonia cases in other major cities during the same time period (see Fig. S-2). Using (i) parameter values corresponding to the MLE (S-2) and the null model from the New York data, and (ii) population and influenza data from the respective cities, we simulate the model forward and compare the predictions to the pneumonia case data in these cities. Simulations of the out-of-fit predictions based on both of the models are shown in Fig. S-8. The MLE model shows some ability to capture inter-annual variability in the data, that the null model cannot.

S-4.5 Fraction of pneumonia cases attributable to influenza.
We calculated the fraction of pneumonia cases that are attributable to influenza in datasets 2 (New York City), dataset 1A (Illinois, before PCV), dataset 1B (Illinois, after PCV). As shown in Fig. S-9, influenza-attributable etiological fraction of pneumonia cases show variability between different datasets. In New York City, between 1920 and 1924, the fractions of pneumonia cases attributable to influenza were between 2%-7% annually. In Illinois, the fractions were relatively smaller: < 2% annually in both datasets. As expected, the effect S-12  of influenza was also highly seasonal, with most of effect observed the during the high influenza part of the season.

S-5 Sensitivity analyses
S-5.1 Susceptibility enhancement and influenza symptomatic ratio Since we estimate the symptomatic ratio of influenza along with the enhancement, we expect the uncertainty in symptomatic ratio to affect our ability to infer enhancement. To delve further, we constructed a likelihood surface with φ on one axis and symptomatic ratio of influenza, ρ F on the other, for pneumonia data in New York City (Dataset I). As seen in Fig. S-10, the peak of the likelihood surface forms a ridge, starting from (φ ≈ 25, ρ F ≈ 1.5%), to (φ ≈ 140, ρ F ≈ 5%). We estimate the symptomatic ratio to fall between 1.5% to 5%. Furthermore, the estimate of enhancement increases with this ratio. This partially explains the broad confidence interval we get for φ, and further suggests that accuracy in the measurement of influenza can help quantify the enhancement more accurately.

S-5.2 Decoupling seasonality
In the original model, we allowed for the seasonal force of infection to be composed of two parts; one that is proportional to the fraction of infectious in the population, and the other that is independent. A natural question is whether the two components have different seasonality, and importantly whether the inference of the enhancement is sensitive to this. In this dual-seasonality model, we vary the force of infection in the following way: Note that setting β 1 seas = β 2 seas = β seas , is equivalent to the original model. Hence the original model is nested inside this model, with this model adding 6 more parameters.
Shown in Fig. S-11 is the likelihood profile of enhancement, φ using this model. The likelihood profile obtained with this model remains mostly unaltered, when compared with the original model shown in the main text. The shape of the seasonality also does not change significantly. (See Fig. S-12) This suggests that the seasonality of transmission related to carriage is similar to that related to invasive pneumonia.

S-5.3 Predictions of coinfections during the 1918 pandemic.
The MLE + model, used for the prediction of incidences of pneumonia and coinfections across 34 army camps during 1918 pandemic, included changes in 3 parameters in the MLE model, namely (i) birthrate (interpreted in this scenario as turnover rate) µ, (ii) reporting ratio for influenza ρ F , and (iii) reporting ratio for pneumonia, ρ p . Here (Fig. S-13) we examine the sensitivity of the predictions as result of changes in these three parameters. The predictions are fairly robust to moderate variations in all three parameters.

S-5.4 Simulation based predictions of pneumonia excluding pneumococcal pneumonia.
We simulated models of pneumonia (as presented in the main text) and models of pneumococcal pneumonia (as presented in [2]), to generate simulation based predictions of pneumonia excluding pneumococcal pneumonia. By subtracting predictions of pneumococcal pneumonia from predictions of pneumonia, we construct simulation based predictions of pneumonia excluding pneumococcal pneumonia. The null predictions of pneumonia excluding pneumococcal pneumonia were generated by subtracting null-model predictions of pneumococcal pneumonia from null-model predictions of pneumonia. Similarly, the MLE predictions of pneumonia excluding pneumococcal pneumonia were generated by subtracting MLE-model predictions of pneumococcal pneumonia from MLE-model predictions of pneumonia. These null and MLE predictions of pneumonia excluding pneumococcal pneumonia are compared against the data. The data for pneumonia excluding pneumococcal pneumonia is created by subtracting weekly reports of pneumococcal pneumonia from pneumonia. Plotted are log-likelihoods, with 0 set at the 95% confidence interval, shown by the dashed line. The surface is constructed by first maximizing likelihoods at various sampled points in the two-dimensional grid, each consisting a pair of φ and ρF , and then fitting a smooth surface. This figure shows that the ability to infer the enhancement factor is somewhat lost by having to simultaneously infer symptomatic ratio for influenza. The lower end of the estimate for enhancement (φ ≈ 25) corresponds to symptomatic ratio of 1.5%, whereas the higher end of the estimate (φ ≈ 160) corresponds to asymptomatic ratio of 5%. This estimation was based dataset 2 (New York City).   Fig. S-14 are comparisons for Illinois data before the PCV vaccine, and shown in Fig. S-15 are the comparisons for Illinois data after the PCV vaccine. The R 2 goodness of fit for the respective null models were 0.471 and 0.697 in pre-vaccine data and the post-vaccine data, respectively. In comparison, the goodness of fit for the respective MLE models were 0.532 and 0.745 in pre-vaccine data and the post-vaccine data, respectively, and superior to the predictions from the null models. This suggests that interactions between influenza and bacterial pneumonia are not limited to pneumococcal pneumonia, without having fully inferred the interaction between influenza and non-pneumococcal pneumonia.