Main

During the first few months of the COVID-19 pandemic, primary and secondary schools in the United States were closed to in-person education as part of the national response to control the spread of SARS-CoV-2 (ref. 1). This decision was guided by data extrapolated from influenza transmission models, which suggested school closures as an effective measure for reducing the basic reproductive number of respiratory viral infections1,2, and early evidence suggesting that non-pharmaceutical public health interventions, including school closures, were associated with improved SARS-CoV-2 outbreak control3,4.

Modeling studies and time series analyses from across the world differ in their assessment of the impact of reopening schools on community SARS-CoV-2 transmission5,6,7. Elementary school children are at lower risk of severe illness than other age groups and their role in driving transmission in the community is unclear8,9. However, there are multiple close interactions between individuals from separate households in a school setting; thus, interactions that occur in schools, even if each contact is lower risk, may contribute to SARS-CoV-2 spread. If children and school staff become infected at school, these transmissions may lead to subsequent transmissions to family members and other contacts, potentially resulting in increases in community transmission of SARS-CoV-2. Recently published studies about the impact of school mode on community transmission from Indiana, Texas and other states found conflicting results10,11,12, with some analyses suggesting substantial increases in case rates associated with school openings, others suggesting a small impact13,14 and still others suggesting that opening schools to in-person learning has no impact on community case rates after adjusting for community incidence15 and minimal impact on hospitalization rates when COVID-19 hospitalizations within a county are kept under 36–44 per 100,000 (ref. 16).

Thus, the association between type of school reopening mode (for example, virtual, hybrid or in-person) and community spread of SARS-CoV-2 continues to be a critical policy question. Although school closure early in the pandemic was associated with lower SARS-CoV-2 incidence3, the impact of school closures in addition to other public infection prevention measures, such as business restrictions, social distancing, masking, scaling up of testing and contact tracing is unknown. The aim of this national, retrospective cohort study was to evaluate the impact of school mode and opening to in-person education on subsequent changes in community incidence of SARS-CoV-2.

Results

A summary of the study’s main findings, major limitations and policy implications, intended for nonspecialist readers, is presented in Table 1. In total, 519 counties representing 1,050 school districts had a school opening mode available. After excluding the Pacific region of the West due to limited variation (59 out of 64 fully remote, 3 hybrid, 2 traditional), 459 counties consisting of 895 school districts were included (Fig. 1). In all counties, one school mode predominated (that is, there were no counties split evenly between remote and in-person learning).

Table 1 Policy summary
Fig. 1: Map of counties included in the analysis.

Counties with green markers opened in a fully virtual learning model, counties with blue markers opened in hybrid mode and counties with red markers opened in a fully traditional (in-person) learning mode. The size of county markers is related to county population size, with larger markers indicating larger populations. The table inset depicts the number of counties by region and school opening mode.

Among the included counties, 103 were in the Northeast, 41 in the Mountain division, 124 in the Midwest and 191 in the South (Extended Data Table 1). Traditional, full in-person schooling was the most common mode in the Midwest (48 out of 124); in the Northeast, hybrid learning models predominated (53 out of 103); and in the South and Mountain division, virtual learning was the most common (South, 96 out of 191, Mountain, 22 out of 41).

Initial school opening dates ranged from 22 July 2020 to 28 September 2020 (Extended Data Fig. 1). Significant demographic differences by region were identified (Extended Data Table 1). Notable differences in community activity and infection control policies identified between regions included higher rates of business closure and activity restrictions in the Northeast and West, increased contact tracing in the Northeast, stricter masking policies and regulations in the Northeast and West and more access to testing in the Northeast and Midwest (Table 2). Manual review of community-level mitigation policies found that masking mandates were generally implemented earlier in the Northeast versus other regions; masking mandates were least common and tended to be implemented latest in the South (5 out of 20 Southern counties had no mask mandate or a masking mandate started after school opening versus 2 out of 20 in the Midwest, 1 out of 20 in the Northeast and 0 out of 20 in the West).

Table 2 Descriptive statistics of covariates included in the regression models

Unadjusted mean SARS-CoV-2 cases per 100,000 residents per week stratified by region are shown in Fig. 2. SARS-CoV-2 case counts increased across all regions during the weeks after the start of school, regardless of school mode. The adjusted absolute differences in SARS-CoV-2 cases, which account for baseline community prevalence before school opening, between counties with hybrid and traditional school opening modes relative to counties with virtual learning models are presented in Fig. 3. In the Northeast and Midwest regions, SARS-CoV-2 case rates were not detectably different across the 3 learning modes, although there was a small increase in cases 6–9 weeks after school opening in the Midwest in counties with traditional learning; no increase was found in counties with hybrid learning modes. In the South, there was a statistically significant increase in cases in counties that opened for hybrid or traditional modes compared to virtual. In the West, there was an increase in cases in counties with a hybrid learning mode.

Fig. 2: Unadjusted mean SARS-CoV-2 cases per 100,000 residents stratified by region; week 0 denotes initial school opening.

a–d, The solid black lines indicate traditional school mode, the solid gray lines indicate hybrid school mode and the dashed gray lines indicate virtual school mode for the Northeast (a), West (b), Midwest (c) and South (d).

Fig. 3: Adjusted absolute difference between SARS-CoV-2 cases in counties with hybrid and traditional school modes relative to virtual for each week, with week 0 being the week when school started for each county, stratified by region.

a–d, Results were generated from multivariate Poisson regressions with robust s.e. for the Northeast (a), West (b), Midwest (c) and South (d). The black markers indicate traditional school mode and the gray markers indicate hybrid school mode. The whiskers for each marker depict the upper and lower bounds of the 95% CI (n = 1,512 for the Northeast, 648 for the West, 1,887 for the Midwest and 3,110 for the South). Google mobility data were not available for week 12 following school opening in the Northeast; thus, results are presented for the 11 weeks following school opening in that region (a).

After adjustment, a traditional school mode was associated with increases in the number of SARS-CoV-2 cases compared to a fully remote mode from week 4 (effect = 13.8 cases per 100,000 residents, 95% CI = 1.1–26.4) to week 6 (effect = 11.2, 95% CI = 0.1–22.3) in the Midwest. In the South, a traditional in-person mode was associated with increases in the number of SARS-CoV-2 cases during the period from week 2 after school opening (effect = 10.7 cases per 100,000 residents, 95% CI = 3.6–17.8) to week 12 after opening (effect = 10.0, 95% CI = 3.1–16.8).

In the South, a hybrid school mode was associated with increases in the number of SARS-CoV-2 cases from week 3 (effect = 9.7, 95% CI = 1.5–17.9) to week 12 (effect = 6.4, 95% CI = 0–12.7). In the West, a hybrid mode was associated with increases in cases from week 5 (effect = 19.9, 95% CI = 0.2–39.7) to week 12 (effect = 30.7, 95% CI = 3.4–58.1). There was no impact of school opening mode on subsequent COVID-19-related deaths during the entire 12-week period after school opening in any region (Fig. 4 and Extended Data Fig. 2).

Fig. 4: Adjusted absolute difference between SARS-CoV-2 deaths in counties with hybrid and traditional school modes relative to virtual for each week, with week 0 being the week when school started for each county, stratified by region.

a–d, Results for the Northeast (a), West (b), Midwest (c) and South (d) were generated from multivariate Poisson regressions with robust s.e. The black markers indicate a traditional school mode and the gray markers indicate a hybrid school mode. The whiskers for each marker depict the upper and lower bounds of the 95% CI (n = 1,512 for the Northeast, 648 for the West, 1,887 for the Midwest and 3,110 for the South). Google mobility data were not available for week 12 following school opening in the Northeast; thus, results are presented for the 11 weeks following school opening in that region (a).

Sensitivity analyses using P < 0.01 or P < 0.000347 as a threshold for statistical significance yielded similar results in the South but the impact of a hybrid learning mode in the West was nonsignificant after adjustment for multiple testing (Extended Data Figs. 3 and 4).
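The stricter of the two thresholds is the kind of value a Bonferroni correction produces: the nominal significance level divided by the number of comparisons. The following minimal sketch illustrates the arithmetic only. The function name and the count of 144 comparisons are assumptions made for illustration; the text does not state how many tests were counted.

```python
def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold for a family of n_tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical comparison count, chosen only to illustrate the calculation.
N_TESTS_ASSUMED = 144
threshold = bonferroni_threshold(0.05, N_TESTS_ASSUMED)

# A weekly estimate would count as significant only if its P value
# falls below the corrected threshold.
example_p_values = [0.03, 0.0002]  # made-up P values
significant = [p < threshold for p in example_p_values]
```

With this assumed count, the corrected threshold rounds to 0.000347, so a P value of 0.03 would no longer count as significant while 0.0002 would.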

In the adjusted models, the impact of school opening mode on SARS-CoV-2 cases stratified by age group varied between and within regions (Extended Data Figs. 5–7). Across all regions, there were no differences in 10–19 year olds. Case increases associated with in-person learning in the South and Midwest were driven by increases in cases diagnosed in ≥20 year olds. In the South, there was a statistically significant and sustained increase in cases among 0–9 year olds during the 2–10 week period after school opening.

Discussion

This national cohort study, which included nearly half of all public school student enrollment across the United States, found regional variation in the impact of school reopening policy on the community incidence of SARS-CoV-2. In the South, which tended to have more limited community-level mitigation measures and which opened during a period with relatively high community prevalence of SARS-CoV-2 cases, reopening schools for in-person learning (using either a hybrid or traditional approach) was associated with a subsequent sustained increase in community case rates of SARS-CoV-2, driven by case increases among adults and children under the age of ten. The effect of school mode in the South was robust to both sensitivity analyses with stringent cutoffs for statistical significance. In the West, opening in a hybrid school mode was associated with sustained increased community case rates; however, these findings were not statistically significant in the sensitivity analyses. In other regions, where adoption of community public health measures was more substantial and where schools opened during times of relatively low prevalence, we found no impact of school opening mode on subsequent community incidence of SARS-CoV-2. These data add to a growing body of literature about the impact of school opening policy on SARS-CoV-2 transmission and public health measures for pandemic control13,14,16,17.

Although evidence demonstrates that children, particularly elementary school children, are at low risk of severe COVID-19 (ref. 18), data are mixed about the role children may play in household and community transmission of SARS-CoV-2 (refs. 16,19,20). Our nationwide study adds to a growing body of data about the role that in-person learning plays in SARS-CoV-2 transmission in the surrounding community and is consistent with previous studies supporting broad infection prevention strategies for SARS-CoV-2 control. Additionally, our study demonstrates that, while school opening can be associated with increases in case rates in some regions, notably those that opened during times of peak SARS-CoV-2 in the community, those case increases may not translate to detectable increases in COVID-19 mortality. Policy decisions about closing schools for in-person learning must be weighed against the harms of ongoing school closures.

In our dataset, the most extreme and sustained increase in SARS-CoV-2 cases associated with school opening was in the South, where school opening was associated with a weekly increase in cases ranging from 9.8 to 21.3 per 100,000 people. Our observational study cannot fully explain the reason for the different effects identified in the South; however, in this region, infection control measures both inside and outside of school were limited. In regions with more substantive infection control efforts both inside of school settings and in the broader communities, such as the Northeast, there was no increase in community case incidence associated with opening schools and a trend toward a decrease in SARS-CoV-2 cases among children after schools opened for in-person instruction. Additional potentially contributing factors include that schools in the South opened when cases of SARS-CoV-2 in the community were relatively high, raising the question of whether the impact of school opening on community transmission rates is dependent on community prevalence, which would be consistent with the finding by Harris et al.16 that the impact of school mode on COVID-19 is dependent on SARS-CoV-2 hospitalization rates in the community. Additionally, it is possible that different climatic conditions may play a role in the trajectory of cases, independent of school mode policy. Thus, it is possible that there is a differential effect of in-person schooling on community transmission that is dependent on a variety of factors, such as mitigation strategies in schools and in surrounding communities, different weather and humidity patterns in the South compared to other regions or higher cases in the community leading to more case introductions in schools and ultimately more community transmission originating inside schools16. Additional studies are needed to delineate the impacts of these different factors more fully.

Our study adds to a growing body of literature about the impact of school closures as a policy measure for reducing the basic reproductive number of SARS-CoV-2. Early reports suggested that school closures were associated with reductions in COVID-19 deaths, although study authors acknowledged that, due to the simultaneous implementation of a variety of public health measures, the impact of school closure specifically could not be delineated fully3,5,7,11,12. Other investigations found conflicting results, with some suggesting that opening schools is associated with an increase in SARS-CoV-2 cases in the community and others suggesting minimal or no impact13,14,15,16. A meta-analysis found that studies with the lowest risk of statistical bias did not find a substantial impact of school mode on community incidence5,17. A recent study evaluating the impact of school mode on community SARS-CoV-2 cases in Texas found substantial increases associated with reopening schools for in-person learning11. However, the Texas study, unlike ours, did not control for temporal trends occurring before the start of school. Across all regions, school opening occurred against a background of increasing case counts; thus, controlling for temporal trends and other county-level factors is critical for isolating the impact of school mode from other events that may occur at the same time as changes in school learning modes.

Multiple previous studies conducted in the setting of multifaceted infection control plans demonstrated low rates of transmission in urban, suburban and rural public school settings21,22,23,24,25. Conversely, SARS-CoV-2 school clusters reported in the United States and around the world highlight that substantial in-school transmission can occur under some circumstances. A recent national survey found that, among students attending schools that adopted few or no mitigation measures, living with a student attending in-person school was associated with a higher risk of COVID-19 among family members26. However, the same study also found that the elevated risk was eliminated with the addition of more in-school mitigation measures, findings similar to those of other studies7,27,28.

Our study has several limitations. It was observational in nature and thus we are not able to determine causality, only association. The study was conducted using data collected from the 2020–2021 school year, when the more transmissible Delta variant was not the predominant strain and when vaccines were not available to adults or adolescents. Thus, it is not clear how the results of this study should be contextualized, given current transmission dynamics and vaccine availability for some public school students. Data regarding SARS-CoV-2 cases are available from the Centers for Disease Control and Prevention (CDC) stratified by decade of life but not by year of life. This has two major impacts on our study. First, cases in infants and toddlers were included in case counts for children under ten; thus, it is possible that the impact found in this age group associated with school opening was not driven by children attending elementary schools but by younger children. Second, we were not able to break down cases for middle and high school children separately. In addition, given the high rates of mild and asymptomatic cases in children, it is possible that cases in these populations potentially attributable to attendance of in-person learning were not detected.

Detailed district infection control plans were not available; thus, we were not able to measure the effectiveness of specific infection prevention measures within schools on community incidence of cases. However, the aim of this study was to evaluate the impact of school opening policy on community transmission, not to address the related but distinct question of SARS-CoV-2 transmission and prevention in schools. Elementary school children appear to spread SARS-CoV-2 less than children in older grades and may be less able to participate in remote learning29. Thus, one proposed policy option was offering traditional instruction to younger children while offering hybrid or remote options to middle and high school students. Because learning modes were very highly correlated across the three grade levels, and only eight counties in our dataset opted for this policy model, we were not able to address the potential impact of this strategy on community spread of SARS-CoV-2. We may not have been able to fully control for community infection prevention measures that may have impacted estimates. However, we attempted to mitigate the effect of this confounding in three ways. First, we stratified our analysis by region, which was highly correlated with school opening date, school opening policy and state-level infection control interventions. Second, we used multiple data sources to capture community response, including the Oxford dataset, which included detailed information about SARS-CoV-2 mitigation strategies, and the Google COVID-19 Community Mobility Reports, which reflected county-level activity. Third, we manually reviewed a subset of the counties to ensure data quality and accuracy. The use of multiple robust datasets with manual validation to ensure accuracy is a major strength of the study.
Data about school mode were available at the district level and other measures were available at a county or state level; it is possible that the conversion from district to county data may have introduced bias into our findings. However, Burbio’s validation data found that school opening mode is highly clustered and estimated a margin of error of 2.7% in their dataset30; this small margin of error would not have changed our study’s principal findings. Finally, we could not account for private schools; however, approximately 90% of elementary and secondary school children attend public schools; thus, the impact of these missing data is likely to be small.

The association between school opening mode and county incidence of SARS-CoV-2 varies by region and may be correlated with community infection prevention measures or community incidence of SARS-CoV-2 cases at the time of school opening, both of which may also be correlated with the implementation of in-school mitigation and community mitigation measures. Although results varied by region, these findings suggest that schools can open for in-person learning during the pandemic with minimal contribution to sustained community incidence of infections, provided other public safety measures are adopted.

Methods

To measure the impact of school mode on the community transmission of SARS-CoV-2, we created a retrospective cohort of school districts including the period immediately preceding and following school reopening in the United States (July–September 2020; Extended Data Fig. 1). Using multiple data sources, a longitudinal dataset at the county-week level was created and SARS-CoV-2 cases were examined. Variation in school opening date and mode was exploited to estimate the effect of initial school mode on SARS-CoV-2 cases and COVID-19 deaths. Data spanning the time from 5 weeks before official school opening in any of three modes (that is, traditional, hybrid or virtual) to 12 weeks after the start of school were included. County fixed effects were included to control for trends in case counts before school opening. In each of our statistical analyses, week 0 corresponds to the week when school began in that county.

Data sources

School model

School reopening mode data were obtained from Burbio, which includes manually validated information from 1,200 school districts across the United States, representing approximately 35,000 schools in 50 states, and 47% of student enrollment in public K-12 schools30. Districts are classified into type of school mode, including traditional, defined as students participating in in-person learning ≥4 d per week; hybrid, defined as students divided into cohorts and attending school in-person 2–3 d per week; and virtual, defined as students attending school in a fully remote mode with no live, in-person instruction. Data available in the Burbio dataset include the date the school district opened and the proportion of schools that opened in each of the three different learning modes, stratified by school type (for example, elementary, defined as kindergarten to 5th grade, middle school, defined as 6th to 8th grade and high school, defined as 9th to 12th grade). To convert these school district-level data to the county level, we first took the average school mode proportion among sampled districts within a county across the three grade levels. We then assigned the school mode for the county based on the maximum value of these averaged grade level school modes; for example, if 75% of the districts within a county were hybrid, then the entire county was considered hybrid.
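The district-to-county conversion described above can be sketched as follows. This is one plausible reading of the procedure, not the authors' code: the data layout, the function name `county_mode` and the tie-breaking rule (first mode in `MODES` wins) are assumptions introduced for illustration.

```python
MODES = ("traditional", "hybrid", "virtual")
GRADES = ("elementary", "middle", "high")


def county_mode(districts):
    """Assign one school mode to a county.

    districts: list of dicts, one per sampled district, mapping each grade
    level to a dict of {mode: proportion of schools opened in that mode}.
    Returns (assigned_mode, averaged_proportions).
    """
    if not districts:
        raise ValueError("county has no sampled districts")
    totals = dict.fromkeys(MODES, 0.0)
    for district in districts:
        for grade in GRADES:
            for mode in MODES:
                totals[mode] += district[grade].get(mode, 0.0)
    # Average over districts and over the three grade levels.
    n = len(districts) * len(GRADES)
    averages = {mode: totals[mode] / n for mode in MODES}
    # The county takes the mode with the largest averaged proportion.
    return max(MODES, key=lambda mode: averages[mode]), averages


# Hypothetical county: three fully hybrid districts and one fully virtual.
all_hybrid = {grade: {"hybrid": 1.0} for grade in GRADES}
all_virtual = {grade: {"virtual": 1.0} for grade in GRADES}
mode, averages = county_mode([all_hybrid, all_hybrid, all_hybrid, all_virtual])
```

In this made-up county the averaged hybrid proportion is 0.75, so the whole county is treated as hybrid, matching the example given in the text.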

Community incidence and COVID-19-related deaths

Incident cases of SARS-CoV-2 per day at a county level were obtained from the CDC dataset31. Data available through these sources included daily cases, decade of age and deaths by county, starting from January 2020 (refs. 6,32). Per CDC guidelines, both confirmed and probable cases and COVID-19 deaths were included. Daily incidence was converted into a weekly incidence for cases and deaths. The denominator for the outcome measures was the estimated number of residents in the year 2020 for each county by the US Census Bureau.
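The conversion from daily counts to a weekly rate per 100,000 residents can be illustrated with a short sketch. The function name, the choice to drop a trailing partial week and the example numbers are assumptions made here; the text does not describe how weeks were aligned or how incomplete weeks were handled.

```python
def weekly_rate_per_100k(daily_counts, population):
    """Sum daily counts into 7-day weeks and scale to a rate per 100,000.

    daily_counts: daily case (or death) counts, starting on the first day
    of the first week. A trailing partial week is dropped (an assumption).
    population: estimated number of county residents.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    full_weeks = len(daily_counts) // 7
    rates = []
    for week in range(full_weeks):
        weekly_total = sum(daily_counts[week * 7:(week + 1) * 7])
        rates.append(weekly_total * 100_000 / population)
    return rates


# Hypothetical county of 50,000 residents: 1 case per day in the first week,
# 2 cases per day in the second week, then 3 days of an incomplete week.
rates = weekly_rate_per_100k([1] * 7 + [2] * 7 + [5] * 3, 50_000)
```

For this made-up series the weekly rates are 14 and 28 cases per 100,000 residents, and the three leftover days are ignored.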

Community-level COVID mitigation measures

Data about community-level mitigation measures were obtained through the Oxford University dataset, which contains data about federal, state and substate policies33. To validate these data, a sample of districts (n = 20 in each of the 4 census regions) underwent manual review for the presence and type of community-level masking policy to ensure accuracy of the variables and provide insight into how state and substate mitigation measures may have differed from county-level interventions.

Community mobility data

Community mobility data were accessed from the Google COVID-19 Community Mobility Reports dataset34. These reports contain aggregated and anonymized user data through Google’s location history. User data were organized into trends over time by geography and separated into various locations. The Community Mobility data provide insight into the mobility response to COVID-19 mitigation policies34. These variables are measured as the percentage change in the time individuals spent in different locations relative to a baseline time period (3 January 2020–6 February 2020).
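As a small illustration of what a percentage change from baseline means, consider the sketch below. Google publishes these percentages directly, so this step was not part of the study; the function name and the raw activity values are hypothetical. The use of a median over the baseline period follows Google's public description of the reports rather than anything stated in this article.

```python
from statistics import median


def percent_change_from_baseline(value, baseline_values):
    """Percentage change of one day's activity relative to a baseline.

    baseline_values: activity on the matching weekday during the baseline
    period (hypothetical raw values); the baseline is taken as their median.
    """
    baseline = median(baseline_values)
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# Hypothetical workplace activity on five baseline Mondays, then one later Monday.
change = percent_change_from_baseline(80, [100, 120, 110, 90, 100])
```

Here the baseline median is 100, so an activity level of 80 corresponds to a 20% decrease, which the reports would show as −20.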

Independent variables

The key independent variables were the county school mode, dummy variables for each week and the interaction between school mode and week variables. Our analyses controlled for important covariates to minimize confounding bias in the relationship between school mode opening and outcomes. These covariates included variables from the Google Community Mobility data (retail and recreation, grocery and pharmacy, workplaces and residential) and from the Oxford policy data (workplace closings, cancelling of public events, restrictions on gatherings, closing of public transportation, COVID testing policies, COVID contact tracing and requirements to wear masks outside the home). In addition, due to the hierarchical and longitudinal nature of our data, we included county, state, week and state–week fixed effects to control for temporal trends and other county-level factors. We chose to use a fixed effects approach as opposed to a random effects or multilevel approach because of the strict assumption that random effects be uncorrelated with other independent variables included in the model. Given regional variation and correlation within regions regarding the timing of school opening, SARS-CoV-2 case counts, county infection control strategies and school mode, the cohort was stratified by US Census region (Northeast, West, Midwest and South). The Pacific division was excluded due to near uniform school mode (virtual; Fig. 1); therefore, the West region includes only the Mountain division.
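The week and mode-by-week variables described above could be constructed along the following lines. This is only a sketch: the function names, the column naming scheme and the example dates are hypothetical, and the text does not say which week, if any, served as the omitted reference category.

```python
from datetime import date

# Event weeks covered by the study window: 5 weeks before to 12 weeks after opening.
EVENT_WEEKS = range(-5, 13)
# Virtual is treated as the reference mode, so it gets no dummy columns.
IN_PERSON_MODES = ("hybrid", "traditional")


def event_week(observation_date, opening_date):
    """Week relative to school opening; week 0 covers opening day and the next 6 days."""
    return (observation_date - opening_date).days // 7


def interaction_dummies(mode, week):
    """One 0/1 indicator per (in-person mode, event week) pair for a county-week row."""
    return {
        f"{m}_x_week_{w}": int(mode == m and week == w)
        for m in IN_PERSON_MODES
        for w in EVENT_WEEKS
    }


# Hypothetical hybrid county that opened on 10 August 2020, observed two weeks later.
opening = date(2020, 8, 10)
week = event_week(date(2020, 8, 24), opening)
row = interaction_dummies("hybrid", week)
```

For this county-week, exactly one of the 36 interaction columns is switched on (`hybrid_x_week_2`); a virtual county would have all of them set to zero, which is what makes virtual the comparison group.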

Outcome variables

The primary outcome variable was change in county-level incidence of SARS-CoV-2 cases per 100,000 residents during the 12-week period after school opening to estimate any sustained impact on community spread of SARS-CoV-2 associated with in-person learning modes. Secondary outcome variables included change in COVID-19 mortality per 100,000 residents and change in incident diagnoses stratified by decade of life (0–9 years, 10–19 years and 20+ years), which was examined to determine if the school model was associated with increases in children and adolescents attending primary and secondary school or if the primary impact was on infections diagnosed in adults.

Data analysis

We used an event study framework35 with data from before and after K-12 schools opened for the 2020–2021 school year, before the Delta variant was predominant and before vaccine availability. Event studies are a commonly used36,37,38,39,40,41 extension to the standard difference in differences approach. They can be exploited to estimate the effect of the occurrence of an event on an outcome over time while taking advantage of variation in the timing of exposure to this event across groups42. In our case, the week of school opening varied from late July to late September 2020 (Extended Data Fig. 1). We estimated the effect of school opening mode on SARS-CoV-2 diagnoses and COVID-19 mortality outcomes using a multivariate Poisson regression with robust standard errors. Given the strong association between region and mitigation measures in the community and timing of school opening, models were estimated separately for each of the four regions. We report the results from the school mode–week interaction terms of these models as marginal effects, which are interpreted as the adjusted absolute effect of school mode per week on the outcome. Running analyses for multiple regions raised concerns about multiple hypothesis testing; to address this, we completed a sensitivity analysis using P < 0.01 as the cutoff for statistical significance. Additionally, to account for the analysis being calculated on a weekly basis, rather than in aggregate, a second sensitivity analysis, applying a Bonferroni-adjusted threshold of P < 0.000347 for significance, was conducted. All analyses were completed using STATA v.16.
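As a rough sketch of the specification described above, and not the authors' exact model, the event study for one region could be written as follows. The notation is an assumption introduced here for illustration: $Y_{ck}$ is the outcome for county $c$ in event week $k$, $M_c$ is the county's school mode with virtual as the reference, $X_{ck}$ collects the mobility and policy covariates, $s(c)$ is the state containing county $c$, and $\alpha_c$, $\sigma_{s(c)}$, $\delta_k$ and $\theta_{s(c),k}$ stand for the county, state, week and state–week fixed effects.

```latex
\begin{align}
\mathbb{E}\left[ Y_{ck} \mid M_c, X_{ck} \right]
  &= \exp\left( \eta_{ck} \right), \\
\eta_{ck}
  &= \sum_{m \in \{\mathrm{hybrid},\,\mathrm{traditional}\}} \sum_{j} \beta_{mj}\,
     \mathbf{1}[M_c = m]\,\mathbf{1}[k = j]
     + X_{ck}^{\top}\gamma + \alpha_c + \sigma_{s(c)} + \delta_k + \theta_{s(c),k}, \\
\Delta_{mk}
  &= \frac{1}{N}\sum_{c=1}^{N}
     \left( \exp\left( \eta_{ck} \big|_{M_c = m} \right)
          - \exp\left( \eta_{ck} \big|_{M_c = \mathrm{virtual}} \right) \right).
\end{align}
```

Under this reading, $\Delta_{mk}$ is an average marginal effect over the $N$ counties in a region: the adjusted absolute difference in the outcome between mode $m$ and virtual learning in week $k$, which corresponds to how the interaction terms are reported in the text.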

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.