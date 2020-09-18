The progression of the COVID-19 pandemic has been monitored primarily by testing symptomatic individuals for the presence of SARS-CoV-2 RNA and counting the number of positive tests over time (ref. 1). However, in the United States and other countries, the spread of COVID-19 has commonly exceeded the testing capacity of public health systems. Moreover, test results are a lagging indicator of the pandemic’s progression (refs. 2,3), because testing is usually prompted by symptoms, which might take 2 weeks to present after infection (ref. 4), and delays occur between the appearance of symptoms, testing and the reporting of test results. Monitoring sewage in a community’s collection or treatment system has been used previously to provide early surveillance of disease prevalence at a population-wide level, notably for polio (refs. 5,6), and might be similarly beneficial for the current COVID-19 pandemic. SARS-CoV-2 RNA is present in the stool of patients with COVID-19 (refs. 7,8,9) and in raw wastewater (refs. 10,11,12), and increased RNA concentrations in raw wastewater have been recently associated with increases in reported COVID-19 cases (ref. 11). However, the utility of wastewater SARS-CoV-2 concentrations for tracking the progression of COVID-19 infections is poorly understood. In this study, we investigated how viral RNA concentrations in wastewater correlated with compiled testing and hospitalization data in a U.S. metropolitan area over ~10 weeks, corresponding to a first wave of SARS-CoV-2 infection.

When municipal raw wastewater discharges into treatment facilities, solids are settled and collected into a matrix called primary sewage sludge. We chose to analyze primary sludge rather than raw wastewater because it provides a high-solids-content, mixed sample that has been shown to contain a broad diversity of human viruses, including commonly circulating coronavirus strains (ref. 13). During the COVID-19 outbreak, from March 19, 2020, to June 1, 2020, in the New Haven, Connecticut, metropolitan area, we collected daily primary sludge samples from the wastewater treatment facility, which serves ~200,000 residents. We quantitatively compared SARS-CoV-2 RNA concentrations in sludge with publicly reported data on four other measures of the outbreak: SARS-CoV-2 positive test results by date of specimen collection; the percentage of positive SARS-CoV-2 test results (test positivity) by date of specimen collection; the number of local hospital admissions of patients with COVID-19; and SARS-CoV-2 positive test results by reporting date.

We measured SARS-CoV-2 virus RNA by quantitative reverse transcription polymerase chain reaction (qRT–PCR) using the same N1 and N2 primer sets employed in COVID-19 individual testing. Virus RNA copies ranged from 1.7 × 10³ ml⁻¹ to 4.6 × 10⁵ ml⁻¹ of primary sludge. All qRT–PCR cycle threshold (Ct) values were below 40, and 97% of all samples had a Ct value less than 38. The average Ct was 34.6 for N1 primers and 34.5 for N2 primers. Values for each replicate were reported as positive only when the human ribonuclease P (RP) internal control gene was positive. The average (s.d.) Ct value for the RP gene for all positive samples was 36.2 (1.2) for replicate 1 and 36.2 (1.3) for replicate 2. Replicated samples demonstrated similar SARS-CoV-2 RNA concentration values (Fig. 1). Concentration comparisons between replicates produced slopes of 0.99 (R² = 0.75) for N1 primers and 0.97 (R² = 0.62) for N2 primers.
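The replicate comparison above reports a slope and a coefficient of determination for each primer set. As a minimal sketch of how such agreement statistics could be computed, the Python snippet below fits an ordinary least-squares line to paired replicate concentrations. The function name `replicate_agreement` and the sample values are hypothetical, and the fit includes an intercept, which is an assumption; the text does not state the exact regression form used.

```python
import numpy as np

def replicate_agreement(rep1, rep2):
    """Return (slope, r_squared) for an OLS fit of rep2 on rep1.

    Assumption: a straight-line fit with an intercept. The source text
    does not specify whether the fit was forced through the origin.
    """
    x = np.asarray(rep1, dtype=float)
    y = np.asarray(rep2, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # degree-1 least squares
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)         # total sum of squares
    return slope, 1.0 - ss_res / ss_tot

# Hypothetical paired concentrations (gene copies per ml of sludge);
# these are illustrative numbers, not data from the study.
rep1 = np.array([2.0e3, 1.5e4, 4.0e4, 9.0e4, 2.1e5])
rep2 = np.array([2.6e3, 1.2e4, 4.6e4, 8.1e4, 2.3e5])

slope, r2 = replicate_agreement(rep1, rep2)
print(f"slope = {slope:.2f}, R^2 = {r2:.2f}")
```

A slope near 1 indicates that neither replicate is systematically higher than the other, whereas R² reflects how tightly individual pairs scatter around the fitted line.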

Fig. 1: Replicate RNA extraction and analyses for SARS-CoV-2 RNA. a, Comparison of SARS-CoV-2 RNA concentration between two replicates (Rep 1 and Rep 2) using the N1 primer set. b, Comparison of SARS-CoV-2 RNA concentration between two replicates using the N2 primer set.

All five measures traced the rise and fall of SARS-CoV-2 infections during the more than 10-week period studied (Fig. 2). However, the sludge results showed an increase during the first week (March 19–25, 2020) that was not observed in the reported testing or hospital admissions data. Applying a distributed lag measurement error time series model allowed an estimation of relationships between viral time series results and the reported testing and hospital admissions data. By modeling the epidemiological time series as a function of the sludge SARS-CoV-2 RNA data across multiple daily lags (posterior means ± 90% credible intervals), we found that the sludge results led the number of positive tests by date of specimen collection by 0–2 d, with a potential lag of 1 d (Fig. 3a,b); the percentage of positive tests by date of specimen collection by 0–2 d, with a potential lag of 1 d (Fig. 3c,d); hospital admissions by 1–4 d (Fig. 3e,f); and the number of positive tests by report date by 6–8 d (Fig. 3g,h). Performing the time series analysis with or without adjustment for testing volume did not result in differences in estimated lag times between sludge viral RNA results and number of positive tests (based on the above date of specimen collection results and date reported to the Connecticut State Department of Public Health) (Extended Data Fig. 1).
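To illustrate the idea behind a distributed lag analysis, the sketch below regresses an outcome series on the current and previous daily values of a predictor series and reports the per-lag coefficients together with their running sum. It is a deliberately simplified stand-in: it uses ordinary least squares in NumPy, not the Bayesian measurement error model described above, so it yields point estimates without posterior means or credible intervals. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def lagged_design(x, max_lag):
    """Design matrix with an intercept and columns x[t], x[t-1], ..., x[t-max_lag].

    Rows cover t = max_lag .. len(x)-1, so the first max_lag days are dropped.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    columns = [x[max_lag - lag : n - lag] for lag in range(max_lag + 1)]
    return np.column_stack([np.ones(n - max_lag)] + columns)

def fit_distributed_lag(x, y, max_lag):
    """OLS distributed lag fit (simplification, not the paper's model).

    Returns (intercept, per-lag coefficients, cumulative coefficients).
    """
    X = lagged_design(x, max_lag)
    y_trimmed = np.asarray(y, dtype=float)[max_lag:]
    coef, *_ = np.linalg.lstsq(X, y_trimmed, rcond=None)
    betas = coef[1:]
    return coef[0], betas, np.cumsum(betas)

# Synthetic example: the outcome depends on the predictor 2 and 3 days earlier.
rng = np.random.default_rng(0)
n_days = 75
x = rng.normal(size=n_days)                 # stand-in for sludge RNA signal
y = np.zeros(n_days)                        # stand-in for an outcome series
y[3:] = 1.0 + 0.5 * x[1:-2] + 0.3 * x[:-3]  # y[t] = 1 + 0.5*x[t-2] + 0.3*x[t-3]

intercept, betas, cumulative = fit_distributed_lag(x, y, max_lag=4)
for lag, (b, c) in enumerate(zip(betas, cumulative)):
    print(f"lag {lag} d: beta = {b:+.3f}, cumulative = {c:+.3f}")
```

In this convention, a large coefficient at lag k means that the outcome on a given day tracks the predictor from k days earlier, which is how a lead time of the sludge signal over a clinical indicator would show up.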

Fig. 2: Sludge SARS-CoV-2 RNA concentration time course and other COVID-19 outbreak indicators on linear (left) and log (right) scales. All data represent the cities of New Haven, Hamden, East Haven and Woodbridge, Connecticut, which are served by the East Shore Water Pollution Abatement Facility (ESWPAF). The blue vertical dashed lines indicate the first week of analysis, March 19–25, 2020. a,b, Number of positive SARS-CoV-2 test results, reported by date of specimen collection. c,d, Percentage of positive SARS-CoV-2 test results, reported by date of specimen collection. e,f, Number of COVID-19 admissions to Yale New Haven Hospital for residents of the four cities. g,h, Number of positive SARS-CoV-2 test results by public reporting date. i,j, Primary sludge SARS-CoV-2 RNA concentration (virus RNA gene copies per ml of sludge).

Fig. 3: Estimated daily distributed lag parameters describing the association between viral RNA in sludge and COVID-19 epidemiological parameters. a, Daily lags of 0–2 d and leads of 1 d are associated with the number of positive tests based on specimen collection date. b, Cumulative relationship between viral RNA in sludge and the number of positive tests based on specimen collection date. c, Daily lags of 0–2 d and leads of 1 d are associated with the percentage of positive tests based on specimen collection date. d, Cumulative relationship between viral RNA in sludge and the percentage of positive tests based on specimen collection date. e, Daily lags of 1–4 d are associated with hospitalization. f, Cumulative relationship between viral RNA in sludge and hospital admissions. g, Daily lags of sludge virus RNA data at longer time lags (6–8 d in the past) best correlate with the time series of publicly reported positive tests. h, Cumulative beta relationship between viral RNA in sludge and reported number of positive tests. Posterior means at the center of each data point and 90% credible intervals for error bars are displayed. For each lag, n = 75 daily values for positive tests by date of specimen collection (a,b), 75 daily values for percentage of positive tests by date of specimen collection (c,d), 75 daily values for hospital admission (e,f) and 75 daily values for publicly reported positive tests (g,h).

Overall, our results demonstrate that measurement of SARS-CoV-2 RNA concentrations in primary sludge provides an approach to estimate changes in COVID-19 prevalence on a population level. Sludge results were not a leading indicator compared to positive test results or percentage of positive tests by date of specimen collection. However, they led hospitalizations by 1–4 d and test results by report date by ~1 week. Thus, in communities where test reporting is delayed, sludge results, if analyzed and reported on the same day as sampling, can provide substantial advance notice of infection dynamics. In locations with rapid reporting of SARS-CoV-2 test results, the lead time afforded by sewage surveillance might be significantly reduced. The lags in test reporting have multiple causes and might vary with societal responses to the pandemic. COVID-19 arrived in the New Haven metropolitan area in early March 2020, when testing capacity was limited, and there were extended waiting times from test date to reporting date. Understanding and mitigating the causes of such lags will require additional research.

Sludge data are also susceptible to variability for multiple reasons. For example, primary sludge handling approaches are specific to particular treatment plants and could affect the levels of detectable virus. Given the uncertainties in sludge data and epidemiological data, we did not attempt to correlate absolute numbers of sludge SARS-CoV-2 RNA concentrations and COVID-19 cases.

Monitoring primary sludge is a broadly applicable strategy. Wastewater treatment plants with primary and secondary treatment are standard in many regions of the world, and treatment facilities are rapidly expanding in urban areas of lower- and middle-income countries (ref. 14). In the United States, approximately 16,000 treatment plants serve more than 250,000,000 people. In regions without primary wastewater treatment, monitoring of raw wastewater streams would be necessary. Our results indicate that jurisdictions can use primary sludge SARS-CoV-2 concentrations as an additional basis for imposing or easing infection-control restrictions, especially in locations affected by limits in clinical testing capacity or delays in test reporting.