As we stare into the dawn of 2021, children in the United States face a stark reality. Many have been out of classrooms since March 2020. Online educational alternatives are spotty. Grades and achievement scores are slumping, with the biggest losses for the most vulnerable students. An analysis by consultancy McKinsey & Company, based in New York City, estimates that white students could lose 4–8 months of learning in mathematics, and students of colour 6–12. It’s unclear when many US schools should reopen for in-person learning. Tragically, the country still lacks data that show what’s safe. The fragmented data collection in schools echoes the country’s slipshod approach to tracking COVID-19 in care homes and hospitals, and even basic case- and death-rate information.

Other countries have handled school decisions differently. In the United Kingdom, Public Health England enhanced its national COVID-19 surveillance when schools reopened for a mini-term in June. That month, it found, only around 70 children out of roughly 1.6 million who attended school were infected — and infection rates mostly depended on the infection prevalence in the community. Although the country has stumbled in other aspects of its response, this organized data effort generated confidence in allowing UK schools to reopen nationally in the autumn.

School officials in the United States could probably learn something from foreign data, but US-based data would be more powerful. Among many other differences, most developed countries have much lower community rates of infection than does the United States (as I write, the UK rate is about half the US rate). However, the US education system is extremely fragmented and mainly under the control of local governments. So decisions are rarely centralized, and data-collection systems are even less so. There have been several efforts to make systematic school COVID-19 dashboards, but all are run by non-governmental entities.

Regional differences in the progression of the pandemic across the country, and in the times at which schools reopened, should have provided excellent learning opportunities. A number of states in heavily affected areas of the country opened their schools first, some as early as August. By September, data from these schools could have informed actions elsewhere. However, with no systematic data collection, the learning opportunity was lost.

Take the key question of appropriate distancing in classrooms. Evidence on aerosol transmission pointed to 2 metres as the best distance, but some studies argued that 1 metre is sufficient to reduce infection risk. That would make a big difference in how many students could fit in a classroom. The information to answer this question existed, in principle. It’s true that schools with greater distancing probably take other precautions and probably have more space and better ventilation. Still, with enough reliable data, analyses might have yielded clearer answers on how well various precautions work.

The data collected by states can be hard to learn from. Several states (including Kentucky and Massachusetts) report counts of COVID-19 cases by school district, which can be useful — but they do not report information on numbers of children and staff at school, which are necessary to know rates of infection. Texas and New York both produce information on COVID-19 cases by school, alongside information on numbers of students and staff. Texas, however, does not report for schools with fewer than five cases, and because most schools have few infections, many data are lost.
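
The arithmetic behind this point can be sketched in a few lines of Python. This is a minimal illustration, not any state's actual reporting method: the school names and counts are invented, and `SchoolReport`, `rate_per_100k` and `suppress_small_counts` are hypothetical helpers. It shows that a rate needs a denominator, and that a rule hiding schools with fewer than five cases can remove most of the records.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchoolReport:
    name: str
    cases: int      # confirmed cases in the reporting period
    on_campus: int  # students and staff attending in person (0 if not reported)


def rate_per_100k(report: SchoolReport) -> Optional[float]:
    """Cases per 100,000 people on campus; None when no denominator is reported."""
    if report.on_campus <= 0:
        return None
    return 100_000 * report.cases / report.on_campus


def suppress_small_counts(reports: List[SchoolReport], threshold: int = 5) -> List[SchoolReport]:
    """Keep only schools with at least `threshold` cases (assumed suppression rule)."""
    return [r for r in reports if r.cases >= threshold]


# Invented numbers, for illustration only.
reports = [
    SchoolReport("School A", cases=1, on_campus=400),
    SchoolReport("School B", cases=0, on_campus=250),
    SchoolReport("School C", cases=6, on_campus=1200),
    SchoolReport("School D", cases=2, on_campus=800),
]

published = suppress_small_counts(reports)
share_lost = 1 - len(published) / len(reports)

for r in reports:
    print(r.name, rate_per_100k(r))
print(f"Share of schools dropped by suppression: {share_lost:.0%}")
```

In this made-up sample, three of the four schools fall below the threshold and vanish from the published data, even though their low counts are exactly the information needed to estimate typical infection rates.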

Some states report only school outbreaks, which could be defined as more than 2 cases — or more than 3, or more than 5 — in a school over 14 days. Sometimes, an outbreak is reported only if the cases were confirmed to be spread at school. Some states will test asymptomatic staff or students — but this information is not typically systematically reported or integrated with other data, such as in-person attendance rates or community testing and case rates.
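
How much the choice of definition matters can be seen in a small Python sketch. It assumes an outbreak means more than a given number of cases within any 14 consecutive days at one school; the function `has_outbreak`, the school labels and the case dates are all hypothetical and invented for illustration.

```python
from datetime import date


def has_outbreak(case_dates, more_than, window_days=14):
    """True if any span of `window_days` consecutive days holds more than `more_than` cases."""
    days = sorted(case_dates)
    start = 0
    for end in range(len(days)):
        # Move the window start forward until the window covers at most `window_days` days.
        while (days[end] - days[start]).days >= window_days:
            start += 1
        if end - start + 1 > more_than:
            return True
    return False


# Invented case dates per school, for illustration only.
schools = {
    "W": [date(2020, 10, 1), date(2020, 10, 3), date(2020, 10, 8), date(2020, 10, 14)],
    "X": [date(2020, 10, 1), date(2020, 10, 5), date(2020, 10, 12)],
    "Y": [date(2020, 10, 1), date(2020, 10, 20), date(2020, 11, 10)],
    "Z": [date(2020, 10, d) for d in range(1, 7)],
}

# Number of schools flagged under each "more than N cases" definition.
counts = {n: sum(has_outbreak(d, n) for d in schools.values()) for n in (2, 3, 5)}
print(counts)
```

With these invented dates, the same four schools yield three, two or one outbreaks depending on which threshold is applied, which is why counts from states using different definitions cannot simply be compared.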

I’m an economist studying health-related behaviours but, like many scholars during the pandemic, I pivoted my focus to try to be of use. I’ve developed the COVID-19 School Response Dashboard, which seeks to collect data from districts all across the country and to aggregate state-level data. But it relies on schools participating voluntarily, and it leaves many out, including many districts with fewer resources. The mismatched nature of data collection means such efforts cannot be fully representative. Individuals cannot compel data provision and participation as a government could.

I would like to see US president-elect Joe Biden’s administration bring together the Centers for Disease Control and Prevention and the Department of Education to develop a stronger system for tracking COVID-19 in schools. A good first step would be to set up a coordinated data-entry system and to strongly encourage schools to participate. The administration should also create centralized guidance on reopening plans and on what information about COVID-19 infections schools should report. We need to be able to identify where the virus is spreading in schools and work out what went wrong. The data we do have suggest that outbreaks in schools are not common, but they do happen. We need a way to find them systematically.

Certainty might be impossible, but decisions driven by reliable, relevant data shouldn’t be. If we can get those data, we can use them to guide school reopening.