Climate and the spread of COVID-19

Visual inspection of world maps shows that coronavirus disease 2019 (COVID-19) is less prevalent in countries closer to the equator, where heat and humidity tend to be higher. Scientists disagree on how to interpret this observation because the relationship between COVID-19 and climatic conditions may be confounded by many factors. We regress the logarithm of confirmed COVID-19 cases per million inhabitants in a country against the country’s distance from the equator, controlling for key confounding factors: air travel, vehicle concentration, urbanization, COVID-19 testing intensity, cell phone usage, income, old-age dependency ratio, and health expenditure. A one-degree increase in absolute latitude is associated with a 4.3% increase in cases per million inhabitants as of January 9, 2021 (p value < 0.001). Our results imply that a country located 1000 km closer to the equator could expect 33% fewer cases per million inhabitants. Since the change in Earth’s angle towards the sun between equinox and solstice is about 23.5°, one could expect a difference in cases per million inhabitants of 64% between two hypothetical countries whose climates differ to a similar extent as two adjacent seasons. According to our results, countries are expected to see a decline in new COVID-19 cases during summer and a resurgence during winter. However, our results do not imply that the disease will vanish during summer or will not affect countries close to the equator. Rather, the higher temperatures and more intense UV radiation in summer are likely to support public health measures to contain SARS-CoV-2.

www.nature.com/scientificreports/ temperature, humidity, and survival of the virus outside of the host that influence and determine transmission rates among humans in the 'real world' … with natural history studies, the conditions are relevant and reflect the real-world, but there is typically little control of environmental conditions and there are many confounding factors" 4 . Between May and November 2020, the European Respiratory Society published several articles discussing the hypothesis that temperature and the spread of COVID-19 are inversely related. Using data from 224 cities in China, one article published in May found no such association 6 . In August 2020, another analysis using data from China implied a non-linear relationship to the extent that temperature and COVID-19 are not associated below 7 ℃ but that a weak negative association exists above that threshold 7 . Yet another study published in November found a significant negative association between temperature and the spread of COVID-19 using global data 8 . While, in general, the evidence is mixed and the debate is still ongoing, laboratory studies found that SARS-CoV-2 is highly susceptible to heat and UV-radiation [9][10][11][12][13][14] .
To add evidence from a different perspective, we use global data to examine the relationship between climatic conditions and the spread of COVID-19 controlling for several important confounding factors. To this end, we regress the prevalence of COVID-19 (logarithmically transformed) at the country level against the latitude of a country. Latitude captures climate, because different latitudes on Earth receive different amounts of sunlight. The farther a country is located from the equator, the lower the angle of the sun's rays that reach it, the less UV radiation it receives, and the lower its temperature. Furthermore, latitude also affects humidity, because water evaporation is temperature dependent 15 .
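The geometric link between latitude and sunlight can be illustrated with a short sketch. It uses the standard relation that the sun's elevation at local solar noon is 90° minus the angular distance between a location's latitude and the solar declination. The function names are hypothetical, and the calculation ignores the atmosphere, clouds, and day length, so it is only a rough illustration and not part of our analysis.

```python
import math

def noon_sun_elevation_deg(latitude_deg, declination_deg=0.0):
    """Sun's elevation above the horizon at local solar noon, in degrees.

    declination_deg is 0 at equinox and about +/-23.5 at the solstices.
    """
    return 90.0 - abs(latitude_deg - declination_deg)

def relative_noon_irradiance(latitude_deg, declination_deg=0.0):
    """Noon irradiance on level ground relative to an overhead sun.

    ASSUMPTION: no atmosphere; only the angle of incidence is modeled.
    """
    elevation = noon_sun_elevation_deg(latitude_deg, declination_deg)
    return max(0.0, math.sin(math.radians(elevation)))

# Equinox comparison for a few illustrative latitudes (degrees north).
for lat in (0, 30, 60):
    print(lat, noon_sun_elevation_deg(lat), round(relative_noon_irradiance(lat), 3))
```

Under these simplifying assumptions, the noon sun at equinox stands lower and delivers less energy per unit of ground area the farther a location lies from the equator.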
To control for key confounders at the country level, our analysis includes (1) data on air travel 16 (to capture a possible way of transmission of SARS-CoV-2 across countries but also the remoteness of a place, which might increase the need for air travel); (2) vehicle concentration 17 and urbanization 16 (to capture differences in the transmission potential of SARS-CoV-2 within a country 18 ); (3) COVID-19 testing intensity 19,20 (to control for the vigor of a country's COVID-19 response and for COVID-19 detection bias in cross-country comparisons 21,22 ); (4) cell phone usage 16 (to control for the speed at which information on behavior change for COVID-19 prevention travels within a country 18,23 ); and (5) health expenditure (to capture differences in countries' commitment to population health), old-age dependency ratio (to capture cross-country differences in age structure and family compositions, which can affect the spread of SARS-CoV-2), and income 16 (to control for differences in economic development and in the availability of general resources to contain the spread of SARS-CoV-2 24-26 ).
Figure 1 and Table 1 show our results. In general, the farther a country is located from the equator, the more cases the country has relative to the number of inhabitants. This relationship is visible in the scatterplot in Fig. 1 and in the coefficient estimates of latitude in the different regression specifications shown in Table 1. These estimates represent semi-elasticities, i.e., percentage changes in the number of COVID-19 cases per million for one-degree changes in latitude. In the ordinary least squares (OLS) regression in which we control for all potential confounding factors, an increase in the distance from the equator by one degree of latitude is associated with an increase of the prevalence of COVID-19 by about 4.3% (Table 1, Model 4).
This result is highly significant and implies that a country that is located 1000 km closer to the equator could expect 33% fewer cases per million inhabitants, other things equal (given that a degree of latitude translates on average into a distance of 111 km). Since the change in Earth's angle towards the sun between equinox and solstice is about 23.5°, one could expect a difference in cases per million inhabitants of 64% between two hypothetical countries whose climates differ to a similar extent as two adjacent seasons.
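The conversion from a per-degree semi-elasticity to the differences quoted above can be sketched as follows. The sketch assumes that the rounded value 0.043 can be read as the slope of a log-linear model, so that a shift of d degrees changes cases by a factor of exp(-0.043 d). The function name and this reading of the coefficient are assumptions made for illustration, not a reproduction of the original calculation.

```python
import math

KM_PER_DEGREE_LATITUDE = 111.0  # average value quoted in the text

def share_fewer_cases(beta, degrees_closer):
    """Fractional drop in cases per million for a move of `degrees_closer`
    degrees toward the equator in a log-linear model with slope `beta`."""
    return 1.0 - math.exp(-beta * degrees_closer)

beta = 0.043  # ASSUMPTION: rounded semi-elasticity treated as a log-point slope

shift_1000km = share_fewer_cases(beta, 1000.0 / KM_PER_DEGREE_LATITUDE)
shift_season = share_fewer_cases(beta, 23.5)

print(f"1000 km closer to the equator: {shift_1000km:.1%} fewer cases")
print(f"23.5 degrees of latitude:      {shift_season:.1%} fewer cases")
```

With the rounded coefficient, the sketch yields roughly 32% and 64%. The small gap to the reported 33% for the 1000 km case presumably reflects rounding of the published coefficient.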

Discussion
Our results are consistent with the hypothesis that heat and sunlight reduce the spread of SARS-CoV-2 and the prevalence of COVID-19, which was also suggested by most of the previous studies examining the same hypothesis with different data and approaches [8][9][10]27,28 . However, our results do not imply that the disease will vanish during summer. Rather, the higher temperatures and more intense UV radiation in summer are likely to support public health measures to contain SARS-CoV-2 29,30 . WHO's warning that the virus spreads in all climates must still be taken seriously. At the time of revising this manuscript in January 2021, many countries in the Northern Hemisphere are experiencing a surge in COVID-19 cases, which could be explained by an easier spread of COVID-19 in winter. Our analysis has several limitations. First, while our results are consistent with the hypothesis that higher temperatures and more intense UV radiation reduce SARS-CoV-2 transmission, the precise mechanisms for such an effect remain unclear and may indeed comprise not only biological but also behavioral factors. For example, […]
Table 1. Results from ordinary least squares regressions of the logarithm of COVID-19 cases per million inhabitants in a country on the country's latitude and control variables. Column 1 contains the bivariate specification of the regression of the natural logarithm of COVID-19 cases per million inhabitants on latitude. The other columns are nested models with control variables. Models (1) through (4) are alternative specifications and the results are based on countries in which more than 100 cases were reported as of January 9, 2021. Latitude is the absolute latitude of a country in degrees; air travel refers to the number of air passengers per capita in a country; vehicle concentration is the number of registered vehicles per capita; urbanization is the percentage of the population living in cities; testing intensity is the number of tests per hundred inhabitants; cell phone usage refers to the number of cell phones per capita; income refers to the purchasing power-adjusted per-capita gross domestic product (GDP) in a country; old-age dependency ratio is the ratio of the population above the age of 65 to the working-age population; health expenditure refers to the share of GDP spent on health. Robust standard errors are used to account for heteroscedasticity. Missing values were estimated with multiple (15) imputations. CI: confidence interval.
Thus, future research should aim at uncovering how the transmission of SARS-CoV-2 is affected by changes in (1) climatic factors such as heat and humidity, (2) geographic factors such as altitude and sunlight intensity, (3) factors related to human behavior such as social interactions and pollution due to local economic activity at a more disaggregated level, and (4) the different potential of the human immune system to cope with diseases in summer as opposed to winter. Second, even though we included all countries worldwide for which data for this analysis were available, our final data set included only 117 of the world's countries, mainly for reasons of data availability and because some countries had not yet surpassed the 100 COVID-19 case threshold. Third, while we strove to control for differential testing intensity using a recently compiled and frequently updated data set 19,20 , the data on testing intensity could suffer from reporting biases and incomplete coverage of testing approaches. To the extent that testing intensity is a function of a country's income, our analysis controlling for income (Table 1, Model 4) should reduce such a bias.
The fact that column (4) in Table 1 contains a parameter estimate of latitude that is only slightly lower than the one in column (3) and still highly significant is reassuring in this regard. Furthermore, factors such as health infrastructure, socioeconomic background, and the availability of adequate health supplies may also affect the spread of COVID-19. However, these differences can be at least partially captured by controlling, as we have done, for vehicle concentration, urbanization, cell phone usage, income, the old-age dependency ratio, health expenditure, and testing intensity. Fourth, we cannot, as of yet, assess whether mutated versions of SARS-CoV-2, such as the ones that emerged in South Africa or in the UK in fall 2020, will display similar seasonal patterns of infection. Finally, the distance to the equator has the same climatic effects going north and south only when we are either around equinox or when one full year in the pandemic has passed (such that the seasonal variations average out globally because both hemispheres have passed through all four seasons during the pandemic). Thus, the date of our data set (which we updated during the final revision of this manuscript in January 2021) is comparatively well suited for our analysis, because at this point in time the COVID-19 pandemic had been spreading for approximately 1 year 31,32 . Moreover, the effect sizes we estimate stayed rather stable over time. In earlier analyses of the data in March and April 2020 33 , which is close to equinox, we also found a significant positive association between latitude and the number of cases. Since then, the semi-elasticity estimates increased slightly, which could be due to better data quality and larger numbers of observations in our updated data sets.
In sum, we show that an increase in absolute latitude by 1° is associated with a 4.3% increase in COVID-19 cases per million inhabitants. Increasing temperatures and longer sunlight exposure during summer may boost the impact of public health policies and actions to control the spread of SARS-CoV-2. Conversely, the threat of epidemic resurgence may increase during winter. However, our results do not indicate that the disease will vanish in summer, nor that countries located close to the equator will contain the disease without effective public health measures.

Methods
We estimated both the bivariate specification of the regression of the logarithm of COVID-19 cases per million inhabitants on latitude and nested models with control variables. We excluded countries in which fewer than 100 COVID-19 cases were reported as of January 9, 2021, to use only data from countries where the pandemic was spreading (a few cases could be merely imported). Our main exposure variable is the absolute latitude of a country in degrees. The control variables included: (1) air travel, measured by the number of air passengers per capita in a country; (2) vehicle concentration, measured by the number of registered vehicles per capita; (3) urbanization, measured by the percentage of the population living in cities; (4) testing intensity, measured by the number of tests per hundred inhabitants; (5) cell phone usage, measured by the number of cell phones per capita; (6) income, measured by purchasing power-adjusted per-capita gross domestic product in a country; (7) old-age dependency ratio, which is the ratio of the population above the age of 65 to the working-age population; and (8) health expenditure, which is the share of per-capita GDP spent on health. We used 2018 data for air travel, vehicle concentration, income, urbanization, cell phone usage, old-age dependency ratio, and health expenditure, because more recent data were not available in the World Development Indicators, our data source for these variables 16 . Testing intensity was based on testing data gathered for each country 19,20 . We used robust standard errors to account for heteroscedasticity. We used Stata 16 for our multivariable regression analyses. We estimated missing country covariate data in multiple (15) imputations, using the mibeta Stata command 34 .
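The bivariate specification can be illustrated with a minimal sketch in Python rather than Stata. It applies the 100-case exclusion rule, regresses log cases per million on absolute latitude, and computes a heteroscedasticity-robust standard error using the HC1 small-sample correction. The function names, the record layout, and the toy data are hypothetical and serve only as an illustration. The control variables and the multiple imputation step are omitted.

```python
import math

def prepare(countries, min_cases=100):
    """Apply the case threshold; return (absolute latitude, log cases per million)."""
    rows = []
    for c in countries:
        if c["cases"] < min_cases:
            continue  # too few cases: these could be merely imported
        per_million = c["cases"] / c["population"] * 1e6
        rows.append((abs(c["latitude"]), math.log(per_million)))
    return rows

def ols_robust(rows):
    """Bivariate OLS; returns slope, intercept, and HC1 robust SE of the slope."""
    n = len(rows)
    x_mean = sum(x for x, _ in rows) / n
    y_mean = sum(y for _, y in rows) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in rows)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in rows) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [y - intercept - slope * x for x, y in rows]
    # Sandwich estimator for the slope, scaled by n / (n - 2) (HC1).
    meat = sum((x - x_mean) ** 2 * e ** 2 for (x, _), e in zip(rows, residuals))
    se = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, intercept, se

# HYPOTHETICAL toy data: log cases rise by 0.04 per degree, plus fixed noise.
toy = []
for i, lat in enumerate([-40, -25, -10, 0, 5, 15, 30, 45, 60]):
    noise = 0.1 if i % 2 == 0 else -0.1
    per_million = math.exp(5.0 + 0.04 * abs(lat) + noise)
    toy.append({"latitude": lat, "population": 10_000_000, "cases": per_million * 10})
toy.append({"latitude": 12, "population": 50_000, "cases": 50})  # dropped by threshold

rows = prepare(toy)
slope, intercept, se = ols_robust(rows)
print(f"countries used: {len(rows)}, slope: {slope:.4f}, robust SE: {se:.4f}")
```

Taking the absolute value of latitude before fitting treats northern and southern countries symmetrically, which matches the definition of the exposure variable.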

Data availability
All data are available in the main text or the supplementary materials.