Seasonal synchronization of foodborne outbreaks in the United States, 1996–2017

Modern food systems represent complex dynamic networks vulnerable to foodborne infectious outbreaks difficult to track and control. Seasonal co-occurrences (alignment of seasonal peaks) and synchronization (similarity of seasonal patterns) of infections are noted, yet rarely explored due to their complexity and methodological limitations. We proposed a systematic approach to evaluate the co-occurrence of seasonal peaks using a combination of L-moments, seasonality characteristics such as the timing (phase) and intensity (amplitude) of peaks, and three metrics of serial, phase-phase, and phase-amplitude synchronization. We used public records on counts of nine foodborne infections abstracted from CDC’s FoodNet Fast online platform for the US and ten representative states from 1996 to 2017 (264 months). Based on annualized and trend-adjusted Negative Binomial Harmonic Regression (NBHR) models augmented with the δ-method, we determined that seasonal peaks of Campylobacter, Salmonella, and Shiga toxin-producing Escherichia Coli (STEC) were tightly clustered in late-July at the national and state levels. Phase-phase synchronization was observed between Cryptosporidium and Shigella, Listeria, and Salmonella (ρ = 0.51, 0.51, 0.46; p < 0.04). Later peak timing of STEC was associated with greater amplitude nationally (ρ = 0.50, p = 0.02) indicating phase-amplitude synchronization. Understanding of disease seasonal synchronization is essential for developing reliable outbreak forecasts and informing stakeholders on mitigation and preventive measures.

are correlated in time due to nonlinear interactions between elements. Many research fields provide illustrations of synchronization, including biology 23 , trade and finance 24 , mathematics 25 , and social sciences 26 . We argue that the synchronization of diseases could be established due to the spatiotemporal alignment of conditions favoring the spread of infections sharing common seasonality features. By extension, infections sharing synchronized seasonal peaks may share similar environmental and manmade drivers of infection associated with food contamination during production and distribution stages.
The synchronization across diseases in a population can be detected and measured. Surveillance systems offer a platform to present time-referenced reported records as an ongoing stream of time series and to assess the properties of these time series, namely the seasonal peak timing and amplitude of seasonal oscillations. The time series of diseases in specific geographic locations can be correlated across infections in a single location or for a single disease across multiple locations. In our earlier works, we illustrated seasonal synchronization between six reported enteric infections in Massachusetts 16 using a systematic approach for estimating the peak timing and amplitude along with their measures of uncertainty 21,27 applied to state surveillance records. We also demonstrated a possible relationship between annual peaks and amplitudes: early peaks in influenza are likely to pair with higher amplitude 22 . These models parallel more recent efforts taken by the CDC and FoodNet surveillance reporting teams 28 .
In this study, we further developed techniques for defining, characterizing, and comparing seasonal disease outbreaks. We applied the proposed methodology to monthly rates of nine infections reported by FoodNet public sites in the United States and ten surveyed states (as shown in Fig. 1) and all available years (Supplementary  Table S1). First, we demonstrated the use of high order characteristics applied to the distribution of monthly rates, specifically L-skewness and L-kurtosis, to identify infections with periodic and non-periodic outbreaks. Next, we developed annualized and trend-adjusted Negative Binomial Harmonic Regression (NBHR) models to estimate the seasonal characteristics for each infection. We applied the δ-methods to derive peak timing and amplitude estimates and their confidence intervals for each infection. We introduced three metrics of outbreak synchronization: serial synchronization based on serial cross-correlations between time series, phase-phase synchronization based on correlations between peak timing estimates, and phase-amplitude synchronization based on correlations across peak timing and amplitude estimates. These techniques allowed us to characterize seasonal infections in a standardized manner, identify possible multi-state outbreaks, and potentially enhance near-time ensemble forecasting. The proposed standardization of data review and analysis is essential for developing reliable www.nature.com/scientificreports/ outbreak forecasts and informing stakeholders on mitigation and prevention measures, and scheduling food contamination inspections.

Results
Monthly rates and sporadic outbreaks. The monthly time series of reported rates for nine foodborne infections (seven bacterial and two protozoal) are shown as a multi-panel plot of stacked time series and rotated frequency histograms for the US from 1996 to 2017 (Fig. 2). The right panel provides a time series for visual inspection of the potential periodic nature of the data. The rotated frequency histogram indicates the rightskewness of the monthly rate distribution, justifying the use of negative binomial regression models. The general summary statistics, including GLM-based average monthly rates and the L-skewness and L-kurtosis coefficients, are provided in Table 1. GLM-based monthly rates properly calculate confidence interval estimates and demonstrate differences in intensity across infections and locations. Coefficients of L-skewness and L-kurtosis identify time series distributions with stable seasonal behaviors or sporadic outbreaks typically reflected by low or high values, respectively. Nationally, the monthly rates of infections exhibited marked variability of 100-fold difference from ~ 20 to ~ 0.2 cases per month per 1,000,000 persons (cpm). Based on the average monthly rates and their variability we clustered infections in three distinct groups. Salmonella and Campylobacter infections had the highest average monthly rates of above 10.0 cpm: 12.68 [11.18, 14.37 As expected, the occurrence of sporadic outbreaks of high intensity was most notable at the state level. The stability of seasonal outbreaks was well detected by L-skewness and L-kurtosis coefficients: low coefficients were found for infections with overall high rates and stable seasonality while high coefficients were found for infections with low monthly rates and sharp sporadic outbreaks. Nationally for Salmonella, these coefficients were 0.18 and 0.05, respectively, demonstrating a stable, seasonal pattern. In NM, NY and MN, L-skewness exceeded the national estimate by ~ 1.3 times and in CO, NM, and OR L-kurtosis exceeded the national estimate by ~ 3.0 times, indicating the presence of spikes with increased intensity. For infections with overall low rates, like Yersinia, Vibrio, Cryptosporidium, and Cyclospora, the large values of L-skewness and L-kurtosis coefficients indicate a frequent occurrence of irregular spikes. Fig. 2 and supported by L-moments in Table 1, most of the infections exhibited regular periodic increases in incidence indicative of seasonality. Therefore, we estimated peak timing and amplitude based on Model 2 and results are shown in Table 2 (see Supplementary Table S3 for details). Nationwide, all infections except Yersinia exhibited summer peaks ranging from early-June (6.43-month for Cyclospora) to mid-August (8.59-month for Shigella). Four infections: Campylobacter, Vibrio, Salmonella, and STEC peaked during mid-to late-July (7.30 [7.08, 7.53 We compared peak timing estimates to identify seasonal co-occurrences and determine the potential for phase-phase synchronization. At the state level, Salmonella peaked during July for all states except GA (8.24 [8.12, 8.35]) while Campylobacter had the earliest peak in GA and latest in NM ( Table 2). The peak of Campylobacter in GA significantly (p < 0.03) preceded peaks in six states: CA, CO, MN, NM, NY, and OR. Similarly, the peak of Campylobacter in CA significantly (p < 0.022) succeeded peaks in seven states: CO, CT, MD, MN, NY, OR, and TN. Listeria had far more state-level variability between mid-June in GA to mid-September in MN. STEC also had large state-level variability (early-July in GA to mid-August in CA). Cryptosporidium had the least variability across states (from MN (7.97 [7.73, 8.22]) to CA (8.64 [6.11, 11.16]), yet its peak timing in CT significantly succeeded peaks in MN (p = 0.025) and OR (p = 0.048). Although nationally Yersinia peaked at the beginning of January, peaks were spread sporadically across states. Like Yersinia, state-specific seasonal peaks for Cyclospora were spread over 2 months.

Seasonality analysis. As shown in
Across states, Salmonella, Campylobacter, Shigella, and STEC had the least variability of amplitude estimates, with about 30-70% difference between the highest and lowest amplitudes ( Table 2). The high values of peak amplitude for Salmonella in GA and for Campylobacter in NM tended to co-occur with late peak timing. Cyclospora and Vibrio had the largest amplitude variability. The large values of peak amplitude co-occur with high values for skewness and kurtosis for Vibrio in NY and for Cyclospora in OR.
To depict the relationship between peak timing and amplitude simultaneously with annual trend and reoccurrences of seasonal changes for all infections in the US, we combined the traditional time series as a multi-panel www.nature.com/scientificreports/     www.nature.com/scientificreports/ calendar plot (Fig. 3). Based on the relationship between peak timing and amplitude, all infections exhibited summer peaks in tightly formed clusters except Vibrio and Yersinia. The heat map of monthly rates for Salmonella, Campylobacter, Shigella, STEC, and Cryptosporidium exhibited distinct seasonal changes in summertime rates.
Trend analysis. The results of trend analyses are shown in Fig. 2 and Supplementary Tables S4-S6. In   Supplementary Table S5. Overall, an adjustment for linear and non-linear trend components resulted in < 3% fluctuations in average peak timing and amplitude estimates with < 10% fluctuations for Cyclospora, Yersinia, and Cryptosporidium estimates as shown in Supplementary Table S6. This high stability in seasonality estimates irrespective of trend specifications justified the use of Model 1 for calculating annualized seasonality characteristics and conducting phase-phase and phase-amplitude synchronization analyses. It is expected that infections with strong trend and/or seasonality components have high autocorrelation, e.g. high dependency on the prior month value, which serves as the base for near-term forecasting. We plotted the correlation coefficients across lags of 1-3 months for each infection in each state and all states combined ( Fig. 4; Supplementary Table S7). Nationwide, six infections: Salmonella, Campylobacter, Shigella, STEC, Cryptosporidium and Vibrio had strong autocorrelations at 1-month lag (ρ ≥ 0.70; p < 0.001) and moderate autocorrelations at lag of 2 months (ρ ≥ 0.42; p < 0.001). At the state level, GA had the strongest autocorrelation patterns for Salmonella and Shigella. CA had the strongest patterns for Campylobacter, MN had the strongest patterns for STEC, and OR had the strongest patterns for Cryptosporidium. In general, autocorrelations across states for Listeria and Vibrio were low across most lags (ρ < 0.45) and thus, support low seasonality and trend contributions as shown in Supplementary Table S4. Synchronization analysis. It is expected that infections with similar trend and seasonality patterns have high cross-correlations, indicating potential synchronization of the shared temporal behavior. Cross-correlation estimates between diseases at − 3 to + 3 months lags are shown in Fig. 5 and Supplementary Table S8. Nationally, monthly rates of Campylobacter were strongly correlated with Salmonella at lags − 1 to + 2 (ρ = 0.59, 0.78, 0.73, 0.50; p < 0.001). Campylobacter was also moderately correlated at lags − 1 and 0 with STEC (ρ = 0.63, 0.66; p < 0.001) and Vibrio (ρ = 0.58, 0.63; p < 0.001), and with Listeria at lags − 2 to 0 (ρ = 0.60, 0.69, 0.63; p < 0.001). Salmonella was also strongly correlated from lags − 1 to + 1 with STEC and Vibrio as well as moderately correlated with Listeria at lags − 1 and 0. STEC was strongly correlated with Vibrio from lags − 1 to + 1 as well as Cryptosporidium from lags − 2 to + 1. Vibrio was similarly moderately correlated with Cryptosporidium from lags − 2 to 0. These results reaffirm strong similarities in the seasonal patterns across infections.
Nationwide, phase-phase synchronization was most pronounced between Cryptosporidium and Shigella (ρ = 0.51, p = 0.019), Listeria (ρ = 0.51, p = 0.019), and Salmonella (ρ = 0.46, p = 0.036) (Supplementary Table S9). Strong positive synchronization between Cryptosporidium and Shigella indicates concordance in their seasonal behavior; when one peaks later, the other does also (Supplementary Fig. S1). In contrast, no significant synchronization was found between Salmonella and Campylobacter (ρ = 0.07) ( Supplementary Fig. S2), indicating that the seasonal processes of Salmonella and Campylobacter peak timing are not associated despite peak co-occurrence in July. At the state level, phase-phase synchronization varies and for one pair the correlation could be significant and positive for one state and negative for another, like Campylobacter-STEC in CT and GA, indicative of discordant patterns. The strongest correlations were found between Salmonella and STEC in NM (ρ = 0.62, p = 0.018) and between Salmonella and Campylobacter in MN (ρ = 0.60, p = 0.003).
Examining phase-amplitude synchronization, we found positive correlations indicating that the magnitude of seasonal peaks is likely to increase when an infection peaks later in the year for STEC at the national level (ρ = 0.50, p = 0.019; Fig. 6), as well as in GA (ρ = 0.58, p = 0.005) (Supplementary Table S10). We also found positive correlations for Shigella in CA (ρ = 0.63, p = 0.002), GA (ρ = 0.50, p = 0.019), and MN (ρ = 0.43, p = 0.049) as well as for Cryptosporidium in CT (ρ = 0.48, p = 0.029) indicating the phase-amplitude synchronization. Table 1. GLM-based monthly average rates per 1,000,000 persons accompanied by 95% confidence intervals, L-skewness, and L-kurtosis estimates for the nine FoodNet reported infections for the United States and ten surveyed states from 1996 to 2017. Note LCI and UCI are lower and upper boundaries for the 95% confidence interval (CI), respectively; L-Skw represents L-skewness while L-Krt represents L-kurtosis estimates.

Discussion
Our results demonstrated how rich and powerful tools of time series analyses could be applied to explore the seasonality and synchronization of foodborne infections between one another and across locations. We urge food safety and public health professionals to make efforts to improve and standardize the analysis of reported infections to allow for a meaningful comparison and actionable inferences derived from this analysis 29 . Measures of L-skewness and L-kurtosis, indicating the degree of departure from a well-defined bell-shaped distribution of cases per selected time unit, could be implemented in routine surveillance system data analysis to quantify the overall degree of outbreak intensity and to distinguish between consistent and thus predictable seasonal behaviors and potentially sporadic outbreaks of foodborne infection. Standardized approaches to quantify seasonal peak timing and intensity (amplitude) along with their uncertainty measures agnostic to infection type or geographic location allow for uniform comparison of seasonal patterns common for all or almost all mandatory or voluntary reported infections. Detection of foodborne infection outbreaks relies on standardized methods for calculating and comparing infection rates and seasonality features that should be implemented with the highest possible precision. The outdated techniques based on arithmetical means produce meaningless negative values for sporadically occurring infections (as shown in Supplementary Table S2) and must be replaced with GLM-calculated estimates based on the highly skewed nature of time series rates. The commonly used aggregation of daily or weekly counts into monthly values is a substandard solution, because it leads to information loss, coarse resolution, and poor understanding of uncertainties needed for proper trend analysis 30 .
By extending the δ-method for systematically estimating the seasonality characteristics, such as peak timing and amplitude, we avoided the traps of using poorly defined seasons, which may vary geographically, climatically, and contextually. When peak timing is calculated as the month with maximal rates using multi-month periods [31][32][33][34][35][36] , this approach, though computationally straightforward, reduces the precision of estimating seasonal characteristics and neglects temporal and geographic variability [37][38][39][40][41][42][43] . Our results show that Campylobacter, Vibrio, Salmonella, and STEC peak from mid-to late-July, suggesting co-occurrence based on month of maximal rate calculations. By applying the δ-method and formal statistical testing we demonstrated that, while all infections peak during July, Campylobacter significantly precedes Vibrio, Salmonella, and STEC. Thus, we are able to identify an important feature, missed by the commonly used methods, yet valuable for disease forecasting.
The standardization of analytical tools could substantially improve our understanding of the co-occurrence of infections with respect to each other and across locations. High values of L-skewness and L-kurtosis spotted an outbreak of Salmonellosis in NY in 1996 with 5-times the amplitude than any other year following 44,45 . High L-moment estimates in GA for Yersinia indicated erratic outbreaks during December and January months and likely reflect reported outbreaks in pork chitterlings for Christmas and New Year celebrations 46,47 . High skewness and kurtosis values for Cryptosporidium and Cyclospora aligned with well documented outbreaks in NY and CA [46][47][48][49][50][51] . Unfortunately, cases reported by the FoodNet Fast platform are designated to a single state, heavily aggregated, and no information was available for multistate outbreaks. FoodNet Fast does not provide granular population catchment information within each county and the counties included in FoodNet represent only a fraction of the total state. Drawing spatial relationships for a single infection risks over-stating the association between states, especially when counties in two states share no geographic border. Improved data collection and reporting will enable modeling and forecasting of foodborne infections using complex network analyses to trace supply chain distribution patterns [52][53][54] .
This study provides evidence for potential outbreak synchronization based on several metrics that utilized the complex systems thinking. Serial synchronization examines whether two infections share similar trend and seasonality. Phase-phase and phase-amplitude synchronization evaluate shared seasonal processes between two infections peak timing or a single infection's peak timing and amplitude, respectively. Additionally, these metrics Table 2. Average peak timing in months and amplitude and their 95% confidence intervals for monthly rates per 1,000,000 persons for nine infections reported by FoodNet in the United States and ten surveyed states from 1996 to 2017. Note LCI and UCI are lower and upper boundaries for the 95% confidence interval (CI), respectively, for peak timing (PT) and amplitude (AMP) estimates. www.nature.com/scientificreports/ can help provide important information for adapting near-term forecasts to more accurately predict, plan for, and prevent seasonal foodborne outbreaks. Surveillance records with more granular temporal resolution and expanded geographic catchment areas can help improve the accuracy and precision of synchronization estimates for creating foodborne infection calendars, inspection schedules, and tracking multistate outbreaks. The public portal automatically compresses data during download, requiring individual year-by-year data extraction. Food recall reports show that a single outbreak is attributable to outbreaks in numerous states for these nine www.nature.com/scientificreports/ www.nature.com/scientificreports/ infections 44 . Failure to consider multistate outbreaks minimizes the utility of assessing cross-state synchronization of infection seasonality. Given annual fiscal losses and food waste reported annually, our proposed synchronization metrics should be considered in order to mitigate seasonal co-infections, track multi-state outbreaks, and coordinate food inspection scheduling. Further investigation is needed to evaluate how synchronization metrics can identify common manmade drivers of infection during the packaging, processing, and transporting of food products. With thousands hospitalized or dying, millions of pounds of foodstuffs recalled, and billions of dollars lost annually, methods of describing and analyzing the seasonality and synchronization of foodborne infections can lead to important health benefits and cost savings for food producers, food retailers, and public health agencies alike.

Data and methods
Data sources. FoodNet reports confirmed cases from 650 randomly sampled clinical laboratories in select counties that reach roughly 15% of the US population (Supplementary Table S1) using both culture-dependent and culture-independent methods 8,55 . FoodNet Fast provides a publicly available subset of reports for confirmed annual infections, hospitalizations, and deaths as well as the monthly prevalence of confirmed infections. These data are available for seven bacterial infections (Campylobacter, Listeria, Salmonella, Shigella, Shiga toxin-producing E.coli (STEC), Vibrio, and Yersinia) and two protozoa (Cryptosporidium and Cyclospora) in ten select states: California (CA), Colorado (CO), Connecticut (CT), Georgia (GA), Maryland (MD), Minnesota (MN), New Mexico (NM), New York (NY), Oregon (OR), and Tennessee (TN) (Supplementary Fig. S1).
For each state, we downloaded all available annual infection profiles from 1996 to 2017. For each year, we created a monthly time series by multiplying the annual total of confirmed infections by the percentage of confirmed infections for each month of that year. National (US) estimates were generated by summing all ten states' data. To draw comparisons between states, we calculated statewide population estimates by summing all mid-year (July 1st) populations of surveyed counties according to the year of their introduction into FoodNet (Supplementary Table S1). Annual county-level population estimates are made publicly available in the 1990, 2000, and 2010 US Census Bureau reports [56][57][58] . We calculated monthly rates per 1,000,000 persons by dividing monthly counts by population estimates and multiplying the product by 1,000,000. Results are presented as cases per 1,000,000 persons and abbreviated as 'cpm. ' Summary statistics for reported rates, trend, and seasonality analyses. To perform the synchronization analysis, we generated 99 individual monthly time series of reported rates (9 infections × 11 locations (each state and all states combined)). For each monthly time series of reported rates, we estimated summary statistics, including average rates and coefficients of L-skewness and L-kurtosis, which are superior on detecting spatiotemporal heterogeneity 59 . Estimates of L-skewness and L-kurtosis reflect the degree of departure of an empirical distribution from a symmetrical bell-shaped curve and the extent of extremes, respectively. Large values of L-skewness and L-kurtosis for a distribution of monthly rates are indicative of sporadic spikes, especially for infections with low overall rates. We used these estimates to identify infections with systematic periodic structures and infections with erratic temporal patterns. Infections with systematic periodic structures undergo trend, seasonality, and synchronization analyses based on annualized estimates of peak timing and amplitude. Infections with erratic temporal patterns were examined for trend and seasonality but only serial synchronization estimates were calculated.
To estimate average monthly rates from the compiled time series and adjust for left-skewed distributions, we applied a generalized linear model (GLM) with a negative binomial distribution and log-link function (Model 1). By exponentiating the model's intercept, we calculated average monthly rates, exp{β 0 }, and their 95% confidence www.nature.com/scientificreports/ interval estimates, exp{β 0 ± 1.96se}. This unadjusted model avoids biologically implausible, negative rates produced by the traditional arithmetic calculations (Supplementary Table S2). Next, we developed four Negative Binomial Harmonic Regression (NBHR) models and applied these models for each infection in each state and all 10 states combined (Models 2-5). We explored the effects of linear, quadratic, and cubic trend terms, which were added in a stepwise manner to Model 2 containing solely harmonic seasonal oscillators.  Table S1). We estimated peak timing, amplitude, and confidence intervals using the δ-methods (Supplementary Table S3) derived by MacNeill and Naumova 27 with further modifications by Alarcon-Falconi, et al. 21 for each infection in each state and all 10 states combined for the full duration of the study. Confidence intervals for peak timing and amplitude estimates are derived under the assumption of seasonal periodicity. As not all infections had consistent seasonal patterns, peak timing and confidence intervals can reach implausible values. Implausible peak timing estimates (values < 1 or > 13) occur when estimate variance exceeds 6 months for any infection or when peak timing estimates align with the beginning or end of the year (e.g. Yersinia). Implausible amplitude estimates (> 20) occur for erratic outbreaks or when estimate variance exceeds the average amplitude estimate (e.g. Cyclospora). Peak timing estimates are expressed in continuous month values from 1.0 (beginning of January) to 12.9(9) (end of December) according to the Gregorian calendar. Amplitude estimates are the midpoint of relative intensity reflecting the ratio between the disease rate at the peak (maximum rate) and the disease rate at the midpoint (median rate). Independent sample t-tests were used to determine statistically significant differences of peak timing estimates between states for the same infection and across infections within the same state.
Model goodness-of-fit was evaluated using the Akaike's Information Criterion (AIC), Bayesian Information Criterion (BIC), and Root Mean Squared Error (RMSE). In addition, we assessed the contribution of each trend component by examining regression coefficients for linear, quadratic, and cubic trend terms. Depending on the sign, the linear term indicates overall increases (β 1 > 0) or decreases (β 1 < 0) while the quadratic and cubic terms indicate acceleration (β 2 > 0) or deceleration (β 3 > 0). We calculated the contribution of each term by multiplying each coefficient by the trend-associated time unit to recover the corresponding predicted rates. The contribution of each trend component (TC) to the overall fit was estimated as follows: where TC k is the contribution of k-component (k = 1, linear; k = 2, quadratic; and k = 3, cubic).
We explored the variability of seasonality estimates with respect to trend specification. We evaluated percentage differences of peak timing and amplitude estimates of Models 3-5 compared to Model 2 as follows: where ΔP is percent difference of peak timing, P m , and amplitude, A m , estimates of m-model (m = 3,4,5) as compared to Model 2 (m = 2). Synchronization analyses. First, we calculated autocorrelations applied to monthly time series for the full duration of the study using Spearman correlations at 0-, 1-, 2-, and 3-month lags for each infection to confirm the strength of trend and seasonality components. For diseases with marked seasonality and overall trend, the strong serial synchronization reflected the similarities in temporal patterns. In some instances, prolonged periods of low incidence and occasional spikes could drive strong serial synchronization and thus be biased. For infections exhibiting well-marked irregularities, such as Cyclospora and Yersinia, serial synchronization metrics were likely to be biased, as evidenced by high values of L-skewness and L-kurtosis coefficients.
We then estimated three metrics of synchronization and compared them across infections within the same state and for each infection across states. Serial synchronization captures whether two infections or two locations share a similar temporal pattern. Phase-phase synchronization derives associations between peak timing estimates and identifies co-occurrences of outbreak timing and seasonal processes between infections or locations. Phase-amplitude synchronization examines seasonal behaviors of disease-state pairs and examines how the intensity of a seasonal peak of a foodborne illness varies in relation to its annual peak timing.
In order to conduct phase-phase and phase-amplitude synchronization analyses, we used annualized NBHR model (Model 1) estimates of seasonality characteristics: peak timing, amplitude, and their confidence intervals, using equations shown in Supplementary Table S3. Phase-phase and phase-amplitude synchronization metrics were calculated using Spearman correlations for 7 infections (excluding Yersinia and Cyclospora) in 11 locations (10 states and US national estimate) across a maximum of 22 years (varying by state). In total, we calculated 77 peak-amplitude pairs spanning a total of 1536 reporting years. Positive correlations for phasephase synchronization indicate concordance of peak timing estimates between infections or states, e.g. if one infection or state peaks earlier in the calendar year, the other does also. Negative phase-phase synchronization correlations indicate that if one infection or state peaks earlier in the year, the other tends to peak later. Positive phase-amplitude synchronization indicates the magnitude of incidence increases when an infection peaks later (3) Model 3: ln[E(Y tds )] = β 0 + β s (sin(2πωt)) + β c (cos(2πωt)) + β 1 t (4) Model 4: ln[E(Y tds )] = β 0 + β s (sin(2πωt)) + β c (cos(2πωt)) + β 1 t + β 2 t 2 (5) Model 5: ln[E(Y tds )] = β 0 + β s (sin(2πωt)) + β c (cos(2πωt)) + β 1 t + β 2 t 2 + β 3 t 3 (6) TC k = β k t k /(|β 1 t| + |β 2 t 2 | + |β 3 t 3 |) * 100%, www.nature.com/scientificreports/ in the year. Negative phase-amplitude synchronization indicates the magnitude of incidence decreases when an infection peaks later in the year. Determination of the association significance is based on the standard test for Spearman correlation at α < 0.05. All statistical analyses were conducted using STATA (SE 15.1) software. All visualizations were designed and created using R Version 3.6.2 and Tableau Desktop 2019.1 software. www.nature.com/scientificreports/