## Main

Since the beginning of the COVID-19 pandemic, governments have enacted a variety of policies to curb the spread of the new coronavirus1. Non-pharmaceutical interventions (NPIs) have encouraged or legally required people to change their behaviours in an attempt to protect themselves and others from the disease. While the stringency of measures has varied over time and across geographical regions, NPIs have been in place for well over a year in most countries1. Vaccination programmes, now underway, are projected to take many months to roll out2, and even with progress in this area, countries may struggle to achieve herd immunity3,4.

In this context, discussions of alleged ‘pandemic fatigue’ have raised worldwide concerns5. Yet, its exact meaning, existence and scope have triggered fierce debates among academics and policymakers. On the one hand, the apparent seriousness of the problem has led the World Health Organization to not only voice the concern but also propose a framework on how to ‘maintain and reinvigorate’ people’s motivation to comply with recommended protective behaviours6. On the other hand, behavioural scientists have pushed back against, for example, the British7,8,9 and Swedish10 governments’ presumption of behavioural fatigue to NPIs, pointing to an absence of evidence beyond the anecdotal. In reality, irrespective of any role of psychological tiredness and the connotations or appropriateness of the word ‘fatigue’, very little is known about people’s purported flagging observance of protective behaviours against COVID-19. The emerging academic literature has led to conflicting conclusions, with empirical studies generally focused on one or two countries11,12,13,14,15,16.

A systematic, global assessment of behavioural patterns that emerge over time while governments have in place NPIs is therefore much needed by researchers and policymakers17. Currently, when policy observance is operationalized in epidemiological models, it tends to be treated as a static variable for which broad generalizations can be made across peoples, as well as across time18,19,20. This could render some findings temporally specific and potentially misleading. Authors of the few epidemiological models that incorporate lessening adherence as a function of time argue that these more closely match post-peak dynamics in the number of cases21, and predicted strong second waves in Europe in late 2020 (ref. 22). Similarly, researchers evaluating the relative effectiveness of NPIs commonly acknowledge the unaccounted variability in compliance as a limitation of their work23.

Following the WHO’s concerns about pandemic fatigue, which motivated this study, we systematically assess the evidence for a reduction from initial levels of adherence to protective behaviours against COVID-19. If the continuance of NPIs does indeed lead to gradually depleting motivation to comply over time, as proposed by the WHO and other policymakers, then monotonic—perhaps linear or asymptotic—reductions in adherence to protective behaviours may be expected, though such patterns could also result from other causes. Drawing on three multi-country longitudinal datasets that measure: (a) self-report protective behaviours from nationally representative samples, (b) global mobile-phone mobility and (c) gradations in countries’ policy strength, we purposely focus our assessment on the pre-vaccine period, starting when countries first imposed closure and containment policies, to December 2020. During this period, NPIs were generally imposed, progressively eased, and in many places then re-imposed after a spell of gradual loosening.

The richness of these datasets enables us to examine the direction and form of protective-behaviour adherence curves over time, as well as how they have varied across different kinds of behaviours6 (physical distancing versus mask wearing). We also explore how changes in adherence associate with different individual-level (age, gender, professional status and household structure) and country-level characteristics (geographic region, income and two kinds of trust). Whilst not seeking to demonstrate causal relationships, we conclude with a discussion on the possible drivers of the observed behavioural patterns and the policy implications of our findings.

## Results

We used multi-level (mixed-effects) models to examine temporal shifts in indicators of self-report or objectively measured protective behaviours that reflect individuals’ decisions. The dataset of nationally representative survey responses included self-report protective behaviours (physical distancing and mask wearing) from 14 countries, collected on at least a monthly basis between 1 April and 24 November 2020, yielding 238,797 individual responses. The mobile-phone mobility dataset contained daily measures of actual physical distancing behaviour from 124 globally distributed countries and provides an additional month of behavioural data. The policy-strength dataset, the Oxford COVID-19 Government Response Tracker (OxCGRT)1, gathers daily data on a wide range of policies for all countries in our study. We evaluate changes in adherence as variation in the performance of protective behaviours over time since each country’s first enactment of a closure and containment regulation other than international border closures (referred to below as ‘first required measure’), in models that control for the strength of governments’ policies against COVID-19 (that is, change in adherence to protective behaviours stipulated in the policies). More information on the data sources and models is provided in Methods. Alternative model specifications, including fixed-effects models with pseudo-panels, and individual disaggregated data are provided in Supplementary Information.

### Presence and patterns of change in adherence

Our figures display monthly changes in adherence relative to the baseline period (0–30 days after first required measure), with the latter indicated by horizontal dotted lines. Controlling for the strength of policies, Fig. 1 shows the curves of self-report protective behaviours for the 14 surveyed countries, comparing two physical distancing behaviours that entail high and potentially cumulative individual costs over time (avoidance of gatherings and avoidance of going out, Fig. 1a) and one low-cost and arguably habituating protective act (mask wearing, Fig. 1b). With respect to physical distancing behaviours, Fig. 1a displays a U-shaped curve, whereby adherence has reduced gradually (by 0.68 points on a five-point scale between baseline and 150–180 days after first required measure for avoidance of gatherings (β = −0.68, P < 0.001, 95% CI −0.74 to −0.62), and for avoidance of going out, by 0.54 points on the same scale during the same period (β = −0.54, P < 0.001, 95% CI −0.59 to −0.48)), before partially rebounding in approximately October/November, depending on the timing of first measure (which varied by country). Between baseline and 210–240 days after first required measure, the overall reduction was thus smaller: 0.54 points for avoidance of gatherings (β = −0.54, P < 0.001, 95% CI −0.60 to −0.48) and 0.46 points for avoidance of going out (β = −0.46, P < 0.001, 95% CI −0.51 to −0.40). This pattern is also found in mobility data models (Fig. 2). We observed a U shape for change in the proportion of time spent in residences (a reduction of 6.80 percentage points from baseline to 150–180 days after first required measure (β = −6.80, P < 0.001, 95% CI −7.23 to −6.37) and of 5.97 percentage points to the 240–270-day period (β = −5.97, P < 0.001, 95% CI −6.44 to −5.51), Fig. 2a), and found an inverted U shape for change in retail and recreation visits (an increase of 19.33 percentage points from baseline to 150–180 days after first required measure (β = 19.33, P < 0.001, 95% CI 18.09 to 20.57) and of 15.56 percentage points to the 240–270-day period (β = 15.56, P < 0.001, 95% CI 14.23 to 16.89), Fig. 2b). In both cases, the interpretation is the same: adherence dropped gradually, only to rebound in later months.
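The period-based comparison described above can be illustrated with a minimal sketch. It bins observations into 30-day periods counted from the first required measure and reports each period's mean relative to the baseline period. The function names and the toy scores are hypothetical, the treatment of boundary days (day 30 falls in the second period here) is an assumption, and the sketch uses raw means, whereas the reported estimates come from mixed-effects models that control for policy strength.

```python
from collections import defaultdict


def period_label(days_since_first_measure, width=30):
    """Return the 30-day period containing a given day, e.g. 165 -> '150-180'."""
    start = (days_since_first_measure // width) * width
    return f"{start}-{start + width}"


def change_from_baseline(observations, width=30):
    """observations: iterable of (days_since_first_measure, adherence_score).

    Returns {period_label: period mean minus baseline-period mean}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # label -> [sum of scores, count]
    for days, score in observations:
        bucket = totals[period_label(days, width)]
        bucket[0] += score
        bucket[1] += 1
    means = {label: total / count for label, (total, count) in totals.items()}
    baseline = means[period_label(0, width)]
    return {label: value - baseline for label, value in means.items()}


# Hypothetical scores on a five-point scale (not the study's data).
toy = [(5, 4.4), (20, 4.2), (160, 3.7), (170, 3.5), (220, 3.8), (235, 3.8)]
changes = change_from_baseline(toy)
for label, delta in changes.items():
    print(label, round(delta, 2))
```

In this toy series the lowest value falls in the 150-180 period and the later period sits closer to baseline, which is the U shape described in the text.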

A completely different trend, however, emerges for mask wearing (Fig. 1b). Controlling also for the strength of facial coverings policy in the mask wearing models, we found that adherence increased progressively with the passage of time. From baseline to 150–180 days after first required measure, adherence increased by 0.73 points on a five-point scale (β = 0.73, P < 0.001, 95% CI 0.64 to 0.83); by the 210–240-day period, the increase had reached 0.99 points (β = 0.99, P < 0.001, 95% CI 0.89 to 1.09). Mask use was at least partially required in most countries only 2 months after the first required measure. While the steep early rise may be related to the rapid increase in availability following the initial mask shortages reported in some countries24, the growth trend remains over the whole period, which suggests adoption has risen more rapidly than predicted by policy strength.

Tables 1–4 systematically evaluate the form of adherence curves (Methods). Table 1 relies on the self-report, aggregated physical distancing data (that is, combining avoidance of gatherings and avoidance of going out), Tables 2 and 3 detail the results of mobility data models, and Table 4 evaluates mask use data. To assess the shape of global patterns, we included a quadratic time trend in our models. Because our time variables correlate with variation in policy strength over time, as a robustness check, we adopted a conservative approach to ensure that this shared variance could not be the sole driver of the significant coefficients of our time terms. First, we regressed our dependent variables on the policy indicators (model 1), then we used the residuals from these models as dependent variables in regressions that include the time variable (model 3). These models thus assign all variation in the dependent variable that could be captured by either the covarying policy or time variables to the policy variables only. Models in Tables 1–3 confirm a significant quadratic time term. Although some of our mask use models show a significant quadratic term (Table 4), others do not, and the marginal effect of time is always positive across these models, indicating a positive growth trend (Supplementary Fig. 1j).
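The two-stage residual check described above can be sketched as follows. This is a simplified illustration using ordinary least squares on five made-up data points, not the mixed-effects models used in the study; the `ols` helper, the variable names and all numbers are hypothetical. Stage 1 regresses adherence on policy strength alone, and stage 2 regresses the stage-1 residuals on linear and quadratic time terms, so any variance shared by policy and time is credited to policy.

```python
def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot_row] = A[pivot_row], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Toy data (hypothetical): centred time, and a policy index that is
# tightened, eased and partly re-imposed, so it correlates with time.
t_centred = [-2.0, -1.0, 0.0, 1.0, 2.0]
policy = [57.0, 61.0, 50.0, 39.0, 43.0]
adherence = [4 + 0.02 * p - 0.3 * t + 0.1 * t * t
             for p, t in zip(policy, t_centred)]

# Stage 1: adherence on policy only.
intercept, policy_slope = ols([policy], adherence)
residuals = [y - (intercept + policy_slope * p)
             for y, p in zip(adherence, policy)]

# Stage 2: residuals on linear and quadratic time terms.
t_squared = [t * t for t in t_centred]
_, linear, quadratic = ols([t_centred, t_squared], residuals)
print(round(policy_slope, 4), round(linear, 4), round(quadratic, 4))
```

In this constructed example the policy slope absorbs part of the linear time trend (the linear coefficient shrinks from −0.3 to about −0.08), while the quadratic coefficient of 0.1 survives, which is the sense in which the check is conservative.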

We did not aim to tease out causal effects, which our observational analysis cannot achieve. However, as an initial exploration of the potential drivers of the non-linear changes in adherence to physical distancing, we added a small suite of control variables in some of our models. These include an indicator of shifts in information about disease risk (the trend in the national death rate), Google searches for ‘COVID’, and mean monthly temperature interacted with hemisphere, called ‘standard controls’. We also tested measures of the number of cases instead of the trend in deaths (Methods), which we refer to as ‘alternative controls’ and report in Supplementary Information. The inclusion of standard and alternative controls reduced the magnitude of the quadratic time-term coefficient (thus partially accounting for the gradient of the curve). However, this term remained significant (model 4; Tables 1–3) even in the residual models with standard controls (Supplementary Figs. 1c and 2c and Table 1a) and in models with dummy variables for different levels of containment policies that account for potentially non-linear associations between policy strength and behaviour (Supplementary Table 1e–g). Again, while the magnitude of coefficients varied across models, the quadratic term remained significant. In the Supplementary Information we report results for the main models using two alternative time points marking the beginning of the baseline adherence month. Our main results hold when, instead of the date of enactment of the first required measure for each country, we use the date when each country first reached 50 on OxCGRT’s stringency index, and when it first reached 70 (Supplementary Fig. 1e,f and Supplementary Table 1l,m).

We further explored the substantive size of the declining and rebounding changes within the U-shaped curves for physical distancing adherence. In the portion of these curves showing gradual, non-linear declining adherence, we observe reductions of noteworthy magnitudes. On average, adherence in the case of self-report physical distancing (using the aggregated measure) fell by 15% or 0.76 standard deviations (12% and 0.61 standard deviations with standard controls) between the baseline period (the 30-day period starting when required policies were first adopted) and the period with minimum adherence (150–180 days after first required measure). For mobility measures, the same changes represented 44% of baseline adherence (0.65 standard deviations) for time spent in residences and 46% of baseline adherence (0.70 standard deviations) for retail and recreation visits. These values changed to 35% and 36% (0.52 and 0.54 standard deviations), respectively, with the inclusion of standard controls (Supplementary Table 1n).
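As a worked check of how the percentage and standard-deviation figures for self-report physical distancing relate, the sketch below converts the reported declines in standard deviations into points and into a share of baseline. It assumes that the standardization uses the baseline-period mean and s.d. reported for initial adherence in the individual-level analysis (mean 4.24, s.d. 0.82); the source does not state this explicitly, and the function name is hypothetical.

```python
BASELINE_MEAN = 4.24  # reported mean initial adherence (five-point scale)
BASELINE_SD = 0.82    # reported s.d. of initial adherence (assumed to be the s.d. used)


def decline_summary(decline_in_sd):
    """Convert a decline in s.d. units to (points, percentage of baseline mean)."""
    points = decline_in_sd * BASELINE_SD
    percent = 100 * points / BASELINE_MEAN
    return points, percent


points_raw, pct_raw = decline_summary(0.76)    # without standard controls
points_ctrl, pct_ctrl = decline_summary(0.61)  # with standard controls
print(round(pct_raw), round(pct_ctrl))
```

Under that assumption, the implied declines are roughly 0.62 and 0.50 points, which round to the 15% and 12% of baseline reported above.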

Considering the portion of the U-shaped curves after the inflection points, the rebounding adherence that we identify is substantively smaller than the prior declines. Without standard controls, rebounds represent 18%, 15% and 19% of the previously described decreases from baseline in self-report (aggregated) physical distancing, time spent in residential locations, and retail and recreation visits, respectively. With standard controls, these values are reduced, respectively, to 5%, 7% and 13% (2%, 7% and 13% with alternative controls). For mask use adherence, which presents a monotonic growth trend, the increase at the last observed period compared with baseline, unexplained by the policy indicators, reaches 48% of baseline value (54% with standard controls).
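The rebound shares can be read as the fraction of the decline from baseline that had been recovered by the last observed period. A minimal sketch of that arithmetic follows, applied to the period coefficients quoted earlier for Fig. 1 and Fig. 2. The function name is hypothetical, and because the percentages in this paragraph come from the aggregated measure and the models reported in Supplementary Information, the values computed here are not expected to match them exactly.

```python
def rebound_share(change_at_trough, change_at_end):
    """Fraction of the change from baseline at the trough that was recovered by the end."""
    return (change_at_trough - change_at_end) / change_at_trough


# Period coefficients quoted in the text (change from baseline).
examples = {
    "avoidance of gatherings": (-0.68, -0.54),
    "avoidance of going out": (-0.54, -0.46),
    "time in residences": (-6.80, -5.97),
    "retail and recreation": (19.33, 15.56),
}
for name, (trough, end) in examples.items():
    print(f"{name}: {100 * rebound_share(trough, end):.1f}% of decline recovered")
```

The same formula works for both measures of mobility because the sign of the change cancels in the ratio.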

We also explored the presence of the rebound portion of the adherence curves in marginal effects plots. In these, rebounding adherence is evident as a change in the sign of the marginal effect of time, with or without the inclusion of standard and alternative controls (Supplementary Fig. 1h,i). In summary, adherence curves during the period assessed have been dominated by a substantial, non-linear and decelerating decline, followed by a relatively small rebound that is, in part, though not fully, reduced by standard and alternative controls.
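For a model with linear and quadratic time terms, the marginal effect of time is the derivative of the fitted curve, and the rebound begins where that derivative changes sign. The sketch below illustrates this with hypothetical coefficients (they are not estimates from the study), with time measured in months since the first required measure.

```python
def marginal_effect(b_linear, b_quadratic, t):
    """Derivative of b_linear * t + b_quadratic * t**2 with respect to t."""
    return b_linear + 2 * b_quadratic * t


def turning_point(b_linear, b_quadratic):
    """Time at which the marginal effect of time equals zero."""
    return -b_linear / (2 * b_quadratic)


# Hypothetical coefficients for a U-shaped adherence curve.
b1, b2 = -0.30, 0.025
print(turning_point(b1, b2))       # month at which decline turns into rebound
print(marginal_effect(b1, b2, 3))  # negative: adherence still declining
print(marginal_effect(b1, b2, 8))  # positive: adherence rebounding
```

A negative linear term with a positive quadratic term gives a decelerating decline followed by a rebound; if the turning point lies beyond the observed period, no rebound is seen in the data.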

Besides examining the magnitude of decreases in adherence, we also evaluated the overall explanatory power of our time variables. The linear and quadratic time terms explain 10% of incremental variance of the self-report physical distancing compared with a model that includes policies and standard controls but no time variables. These time terms explain approximately 14% of the incremental variance of both mobility dependent variables (Supplementary Table 1i). Our time measures also played an important role for adherence to mask use, with an incremental explanatory power of 21%.

### Regions and income levels

We assessed the consistency of the non-linear decline in adherence by comparing countries with varying wealth and geographies. For these we rely on mobility data as the sample of 14 surveyed countries is too small for meaningful country-level comparisons. Our mobility data models confirmed the robustness of the non-linear decline for physical distancing behaviours across country income classifications, based on the World Bank’s categories, and across world regions (considering those with more than 15 countries with mobility data). These patterns were consistent in residual models and with the inclusion of standard controls (Supplementary Fig. 3a–c). The adherence curves representing Europe and high-income and upper-middle-income countries (Fig. 3) had the steepest gradients and highest peaks (in retail and recreation mobility models (Fig. 3a,b)) or nadirs (in the case of time spent in residential locations (Fig. 3c,d)). For example, between baseline and 150–180 days after first required measure, the retail and recreation adherence curve for high-income countries rose by 21.5 percentage points (β = 21.49, P < 0.001, 95% CI 19.60 to 23.38), compared with 14.7 percentage points for low-income countries (β = 14.65, P < 0.001, 95% CI 11.08 to 18.23), and, over the same period, the residential mobility curve for high-income countries dropped by 7.4 percentage points (β = −7.37, P < 0.001, 95% CI −8.03 to −6.71), versus 3.6 percentage points (β = −3.59, P < 0.001, 95% CI −4.79 to −2.40) for low-income countries. Note that our baseline measure of initial adherence in the mobility models (specifically, changes in mobility in the 30 days after the first required measure compared with the first 5 weeks of 2020) was similar across high-, upper-middle- and lower-middle-income countries. 
Low-income countries recorded the lowest initial adherence, with 16.3 percentage points less reduction in visits to sites of retail and recreation than high-income countries relative to early 2020 (β = 16.30, P < 0.001, 95% CI 7.87 to 24.72), and 3.4 percentage points less increase in time spent in residences than high-income countries relative to early 2020 (β = −3.35, P = 0.07, 95% CI −7.04 to 0.34) (Supplementary Table 6 and Supplementary Fig. 6i). Initial adherence was also higher in Latin America than in East Asia, Europe and Sub-Saharan Africa (Supplementary Table 6 and Supplementary Fig. 6j).

While a substantial non-linear decrease in adherence is present in all country groupings, the rebound is more heterogeneous across these groupings and measures of mobility. For both measures of mobility, with and without standard controls, a rebound (that is, change in the sign of the marginal effect of time) was observed for high-, upper-middle- and low-income countries, but not lower-middle-income countries (Supplementary Fig. 3d,e). Comparing different geographical regions for retail and recreation visits, we found rebounding adherence in Europe and, smaller in magnitude, in Sub-Saharan Africa (Supplementary Fig. 3d,e). However, for time spent in residences, a rebound was observed only in Europe.

### Individual-level moderators

As mobility data are only available at the aggregate population level, we rely on the survey data to assess variation associated with individual-level characteristics. Models assessing aggregated physical distancing revealed that non-linear declines in adherence and rebounds have also been widespread across societal groups within countries (Fig. 4). With these models, we first considered differences in initial adherence (during days 0–30 after first required measure) across groups. On the five-point scale, initial adherence to physical distancing (mean 4.24, s.d. 0.82) was 0.19 points higher for women than for men (β = 0.19, P < 0.001, 95% CI 0.16 to 0.22), 0.33 points higher for those aged over 60 years than for people aged 30 years and below (β = 0.33, P < 0.001, 95% CI 0.29 to 0.37), 0.06 points higher for unemployed than for employed people (β = 0.06, P = 0.003, 95% CI 0.02 to 0.10), 0.16 points higher for people with pre-existing conditions that put them at greater risk of severe disease or death from COVID-19 than for those without such conditions (β = 0.16, P < 0.001, 95% CI 0.13 to 0.19) and 0.06 points higher for people living with more than one other person compared with single-person households (β = 0.06, P = 0.002, 95% CI 0.02 to 0.09) (for models with additional controls, see Supplementary Table 7). Despite these initial differences, we found U-shaped adherence curves to emerge across genders (Fig. 4a), age groups (Fig. 4c), household sizes and employment status categories (see Supplementary Fig. 4a–g for residual models, models with standard controls and household size results, and Supplementary Fig. 6c–h for descriptives).

However, significant differences in changing adherence over time exist between those with and without pre-existing conditions (the interaction of the chronic health condition dummy variable with the linear (β = 0.017, P = 0.001, 95% CI 0.007 to 0.028) and quadratic (β = −0.006, P = 0.04, 95% CI −0.012 to −0.001) time terms, Fig. 4b), as well as between retired people and the employed (for both linear (β = 0.028, P = 0.001, 95% CI 0.011 to 0.045) and quadratic (β = −0.018, P < 0.001, 95% CI −0.026 to −0.010) time-term interactions, Fig. 4d). For instance, in the 90–120-day period after governments enacted their first required measure, people with chronic illnesses on average adhered to physical distancing 0.11 points (β = 0.114, P = 0.03, 95% CI 0.01 to 0.22) more than those without them, compared with baseline, a gap that grew to 0.19 points (β = 0.191, P = 0.001, 95% CI 0.08 to 0.30) at 180–210 days, in models controlling for age (Supplementary Table 2b). Retired people on average adhered to physical distancing 0.13 points (β = 0.134, P < 0.001, 95% CI 0.085 to 0.183) more than working people in the initial adherence period, but any subsequent differences in changes in adherence across groups did not hold in all robustness checks (Supplementary Table 2k and Fig. 4). Marginal effects plots show rebounding adherence for all demographic subgroups (Supplementary Fig. 4h,i).

### Trust as a moderator

Among various country-level moderators that have been emphasized in the literature6,25, trust is one of the most widely discussed26. We thus present the adherence curves for countries with high and low levels (above and below the median) of institutional and interpersonal (also called generalized or social) trust, for the 50 countries surveyed by the World Values Survey for which mobility data are available (Methods). These two constructs of trust correlate weakly in our sample (Pearson correlation 0.26, P = 0.067).
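The grouping step described above (splitting countries at the median of each trust measure and checking how the two measures correlate) can be sketched as follows. The country labels and scores are hypothetical, the helper names are illustrative, and assigning a score exactly equal to the median to the low group is an assumption not specified in the text.

```python
from math import sqrt
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_split(scores):
    """Label each country 'high' if its score is above the median, else 'low'."""
    cut = median(scores.values())
    return {country: ("high" if s > cut else "low") for country, s in scores.items()}


# Hypothetical trust scores for four countries (not World Values Survey data).
institutional = {"A": 0.62, "B": 0.35, "C": 0.48, "D": 0.71}
interpersonal = {"A": 0.30, "B": 0.12, "C": 0.55, "D": 0.20}

countries = sorted(institutional)
r = pearson([institutional[c] for c in countries],
            [interpersonal[c] for c in countries])
print(median_split(institutional))
print(median_split(interpersonal))
print(round(r, 2))
```

Because the two measures are only weakly correlated, a country can fall in the high group on one measure and the low group on the other, which is what allows the two kinds of trust to be examined separately.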

Drawing on mobility data, our models reveal that countries with above- (versus below-) median institutional trust scores did not meaningfully differ in initial adherence for retail and recreation visits or for time spent in residential locations (Supplementary Table 8a,c). The interaction of the binary institutional trust variable with the linear time term was not significant for either measure of mobility (retail and recreation visits: β = 0.212, P = 0.68, 95% CI −0.79 to 1.21; time spent in residential locations: β = −0.04, P = 0.85, 95% CI −0.37 to 0.31), and the same was true of the quadratic term (retail and recreation visits: β = 0.05, P = 0.83, 95% CI −0.36 to 0.45; time spent in residential locations: β = 0.017, P = 0.82, 95% CI −0.13 to 0.16; Supplementary Table 3i). These null effects for institutional trust hold for each of the individual items that compose the institutional trust index (except for trust in courts, which shows a small, albeit significant, effect indicating less rapid adherence declines among countries with higher trust in courts) as well as for trust in the healthcare system (which also shows significant interaction terms, yet only in models without the additional controls: country-income level and Gini index) (Methods and Supplementary Tables 3e–h,j–m).

However, we found differences between interpersonal trust country groups, with high-interpersonal-trust countries exhibiting a weaker decline in adherence for both measures of mobility (Extended Data Fig. 1). Even after we controlled for country-income level, Gini coefficient, institutional trust and standard controls, the interaction of the binary interpersonal trust variable with the linear time term remained significant in both cases (retail and recreation visits: β = −2.25, P = 0.001, 95% CI −3.37 to −1.13; time spent in residential locations: β = 0.641, P = 0.001, 95% CI 0.26 to 1.02), though the interaction of interpersonal trust and the quadratic time term did not (retail and recreation visits: β = 0.07, P = 0.76, 95% CI −0.39 to 0.53; time spent in residential locations: β = −0.025, P = 0.76, 95% CI −0.189 to 0.138; Supplementary Table 3i). Controlling for these additional variables, the reduction in retail and recreation visits compared with baseline (adherence in the 0–30-day period) was 3.7 percentage points (β = −3.74, P = 0.058, 95% CI −7.61 to 0.12; Supplementary Table 3d) higher in low-interpersonal-trust countries, 90–120 days after the first required policies. This difference grew to 15.5 percentage points (β = −15.45, P < 0.001, 95% CI −19.34 to −11.55) higher in the 180–210-day period compared with baseline. We found similar results when using time spent in residences as the dependent variable (Extended Data Fig. 1d). Although high-interpersonal-trust countries started out with lower levels of initial adherence for retail and recreation visits (β = 13.80, P < 0.001, 95% CI 5.51 to 22.09), as well as for time spent in residential locations (β = −5.93, P = 0.002, 95% CI −9.64 to −2.22), their weaker decline in adherence led, over the period assessed, to this difference being extinguished (Supplementary Table 8b,d). 
Whereas the minimum level of adherence reached by low-interpersonal-trust countries was 9.2 percentage points lower than baseline for time spent in residences and 24.2 percentage points lower (that is, higher mobility given policy strength) using the retail and recreation measure of mobility, these values were substantively less dramatic for their high-interpersonal-trust counterparts (5.6 and 15.1 percentage points smaller than baseline, respectively).

The result observed for interpersonal trust holds within high- and low-institutional-trust countries alike (Supplementary Table 3n and Extended Data Fig. 1). The interaction between the two types of trust and time since first required policy is not significant for either retail and recreation visits (using the linear (β = 0.34, P = 0.74, 95% CI −1.66 to 2.33) or quadratic (β = −0.13, P = 0.74, 95% CI −0.93 to 0.66) time terms), or for time spent in residential locations (using the linear (β = −0.69, P = 0.07, 95% CI −1.44 to 0.06) or quadratic (β = −0.06, P = 0.69, 95% CI −0.38 to 0.26) time terms; Supplementary Table 3n).

## Discussion

At a point in the COVID-19 pandemic when many countries are experiencing a severe resurgence of the disease, after many months of restrictive policies, our study provides robust evidence of temporal variation in adherence to protective behaviours against COVID-19. We find that observance has been heterogeneous across types of protective behaviours. Rather than a simple pattern of monotonic decline in adherence, as per our (and the WHO’s)6 original expectation, we report a monotonic increase in observance for a low-cost and habituating behaviour (mask wearing) and a U-shaped pattern (a large, decelerating decline in adherence followed by a small rebound) for high-cost and sensitizing behaviours (physical distancing).

Waning observance of physical distancing policies has occurred in all world regions, country-income levels and societal groups. It emerges from our longitudinal analyses of both individual and self-report27, as well as aggregate and objective, behavioural data. Moreover, the adherence curves for virtually all societal groups and some country groups take on a U shape, underscoring the on-average global trend. An uptick in adherence after a period of decline has been hinted at in occasional, single-country findings from the United Kingdom28 and the Netherlands15. However, these accounts either considered self-report physical distancing behaviours to constitute ‘compliance’15 (without controls for policy change over time) or relied on people’s understanding of policies by tracking changes in direct survey questions about observance of government rules27. Our study shows that a gradual reduction from initial levels of adherence to physical distancing, and a subsequent increase, although not found in every world region up to late 2020, may be more common than not.

While we find robust evidence for temporal shifts in adherence to protective behaviours against COVID-19, we can only speculate about their causes. Pandemic fatigue, a loosely defined, popular term, is only one among several plausible explanations, which could, to a lesser or greater extent, contribute to the phenomena we observe.

One possible driver of reductions in adherence to physical distancing is the economic burden that performing these behaviours imposes on the individual and society at large. This would explain why we observe a decline for costly physical distancing but not for ‘cheap’ mask wearing, although the more rapid increase in adoption relative to policy requirements observed for mask use could also have resulted from clearer and more consistent messaging from governments and international bodies24, increased availability following initial shortages and the self-reinforcement of new social norms29. But if financial considerations drive waning adherence to physical distancing, one would arguably expect larger and more rapid drops, and perhaps weaker or non-existent rebounds, in low-income nations than in wealthy ones, as well as among unemployed people compared with those in work30,31. Our findings show that these have not occurred. Even though low-income countries recorded lower initial adherence relative to wealthier counterparts, the same U-shaped pattern of adherence change, though missing for lower-middle income countries, was evident for the poorest country group as well as the richest. And, if anything, residents of better-off nations (high-income countries, of which half are European, and upper-middle-income countries) seem particularly prone to ebbing observance of physical distancing during the decline phase of the curve (Fig. 3).

Another possible explanation for the patterns that we observe is that behavioural changes reflect a response to adjustments in risk assessment. Decades of research emphasize the causal influence of risk perception on health-related behaviours32. Leading theories, such as the Health Belief Model33 and the Protection Motivation Theory34, explicitly state that risk assessments influence people’s propensity to act in a riskless or reckless manner. Thus, irrespective of government policies, learning of a reduced (or increased) number of local cases or deaths could reduce (or raise) self-protective behaviours through changes in risk perception (in terms of the likelihood of catching SARS-CoV-2, or the severity of the disease, or both). Variations in risk assessment over time and across groups (for example, those with chronic illnesses) may well be a contributing cause, as others have found35. Yet, our findings suggest that such variations are unlikely to fully account for deterioration or subsequent improvement in adherence to physical distancing. Because we lack direct measures of fluctuations in risk perception during this pandemic, we relied on the trend in the national death rate and on national case rates as informational inputs about general risk, as well as internet searches to reflect citizens’ varying interest in COVID-19. With these controls, the decline in adherence remains, although the still-present rebound is largely reduced.

Nonetheless, changing risk perceptions over time could possibly help explain the especially strong reductions in observance found in Europe during the period prior to rebounding adherence. One can speculate that this may be so if, for instance, our time-varying risk perception indicators (particularly death trends and cases) do not capture a sense of proximity of early threat, and fear, that Europeans may have felt especially keenly around the time of Italy’s initial COVID-19 outbreak, even while cases and deaths in European countries other than Italy were still low. Such dynamics may have contributed to high levels of initial adherence in the region, and in turn, created space for a large, regional drop-off in adherence thereafter. Overall, however, if changes in risk perception were the chief cause driving our results, somewhat similar patterns of self-protective behaviour should be expected for both physical distancing and mask use. The fact that mask wearing shows steadily growing adherence even when the environment was arguably perceived as relatively safe (for example, summertime in Europe) indicates that other mechanisms may also be at play. It is also possible that, over time, people may have gained confidence that they could behave in such a way as to avoid catching the virus while still going out. This could help to explain the lower raw physical distancing measures often observed in later months relative to the first and second months analysed—both periods when, typically, policies were stringent and deaths were rising.

A third possible mechanism underpinning the patterns that we observe is varying intensity of enforcement measures, holding constant the level of policy stringency. This is difficult to assess empirically in the absence of systematic enforcement data across countries and time. But assuming, for instance, that government enforcement gradually intensified for mask use, and people responded simply so as to increase their policy adherence, then changes in enforcement could at least help account for the gradual increase in mask use over time. For physical distancing policies, it would require that enforcement weakened to then rebound. Anecdotal media reports from a few countries suggest that, after many months of limited government sanctioning of rule-breakers, tougher punishments started to be given out around the period when we tend to see inflections in adherence levels36. If enforcement can, indeed, be considered as a mechanism by which rules more potently affect behaviour, then, in theory, analyses that interact policy strength and time could shed some light on its potential role. We have performed such exploratory analyses (Supplementary Tables 9a,b and Fig. 8a,b), the results of which suggest declining policy efficiency over time. But we are cautious about the assumptions37 involved in this reasoning, and of the difficulties in disentangling the effects of enforcement from other mechanisms that could be driving behavioural change. Future studies might collect data on the extent of enforcement of different policies and estimate enforcement’s explanatory power on adherence changes over time.

What is colloquially referred to as ‘pandemic fatigue’ may also underlie the patterns we find. For many months, the WHO has worried about populations’ gradually emerging demotivation driving waning adherence to recommended protective behaviours, which it describes as “a natural and expected reaction to sustained and unresolved adversity in people’s lives”6. Pandemic fatigue, while acknowledged by sceptics as intuitive8, has been much debated in terms of its meaning and scope. Behavioural scientists nonetheless broadly agree that psychological fatigue involves feelings of tiredness and physical and mental exhaustion of some sort13,38,39, which often result from lasting exposure to aversive events13,40. The outcome of psychological fatigue in the COVID-19 pandemic context would be a reduced motivation or ability to comply with protective behaviours13, which could go on to impact actual adherence39.

Our findings suggest that the possibility of widespread psychological fatigue contributing to waning adherence should not be dismissed. Life and the literature have taught us that desirable but costly healthy behaviours are easier to initiate than to sustain41. Whether adhering to a new diet42, programme of physical activity43 or medication regimen44, motivation for psychologically costly new behaviours often loses steam as time goes by. Even so, some behaviours are arguably more psychologically and physically bearable and habituating (seatbelt wearing and flossing) than others (dieting and smoking cessation45). Along the same lines, it is plausible to suspect that physical distancing is not only psychologically demanding but also cost accumulating. Over time, the psychological fatigue associated with staying at home or avoiding gatherings may build up. Even though occasional, single-country reports have questioned the evidence for emerging demotivation to adhere to physical distancing46, there is substantial evidence of the negative psychological effects of quarantine47, including reports of exhaustion48 and depression49, and evidence that psychological effects worsen as quarantine extends47,49. But unlike physical distancing, mask wearing is not only a relatively low-cost behaviour but possibly one whose cost decreases over time as people adapt to its use (the same way people have adapted to wearing seatbelts and helmets). Thus, in addition to any psychological fatigue brought on by a sustained public-health crisis50, fatigue in response to restrictive policies could help explain the differences across protective behaviours that we report.

However, the extent to which psychological fatigue contributes to the U-shaped adherence curves we observe for physical distancing is an open question. On the one hand, one could argue that a relatively safer period, with lighter restrictions, may have provided the individual with the cognitive and emotional resources needed to adhere to physical distancing more strictly in the future. We indeed observe relatively low stringency for the July–September 2020 period in many countries (Supplementary Fig. 7a), just before adherence rebounds appear. Furthermore, longitudinal survey data from the UK show that, compared with 2019, self-reported psychological distress had increased significantly by April 2020, an increase that continued into May and June before distress decreased and returned to pre-pandemic levels by September of that year51. On the other hand, there is also evidence that in-between breaks during an unpleasant (cost-accumulating) experience tend to disrupt a naturally occurring habituation process and, as a result, make the overall experience more psychologically aversive and, arguably, harder to endure, although people tend to believe otherwise52. Whether for positive or negative experiences, people either disregard or underestimate the power of hedonic adaptation53. In summary, therefore, the extent to which en masse feelings of overall tiredness or demotivation to sustain physical distancing behaviours help explain the adherence patterns that we report is difficult to assess without time-varying, direct measures of psychological fatigue. In any case, it is unlikely that a single factor will represent the dominant cause of such complex social phenomena. We suspect that multiple mechanisms may be needed to account for the direction and form of the adherence curves across the different protective behaviours that we report in this study.

Consistent with the evidence that women54 and the elderly55 are in general more risk averse54, and that those with more social contacts have greater individual concern for health risks56, we found that women, the elderly and people living with others initially adhered more and maintained greater relative adherence over time. These findings are in line with studies of compliance by gender57 and age in other contexts58, in earlier months of this pandemic59 and of single time points60. The absence of differences between genders and age groups in change in adherence is also compatible with arguments in the literature. There are both reasons to expect women’s adherence to drop off more rapidly than men’s (women’s psychological fatigue may be more intense from larger increases in childcare hours61) and, on the other hand, to reduce more slowly (women have been shown to be more likely to anticipate multiple lockdowns62 and so may have been better psychologically prepared for them11). Likewise, theory suggests that older people, who tend to be more physically and socially vulnerable63, may experience weaker resilience to restrictive policies and consequently adhere less over time. Yet as the pandemic has continued, knowledge of age-associated risk has grown64, as has older people’s pessimism regarding the severity of the disease in case of infection65, implying the opposite expectation of steadier adherence.

We found that institutional trust matters little for change in adherence to physical distancing, though it was weakly associated with adherence in some early months after containment measures were first introduced. The idea that populations with firmer bedrocks of confidence in public institutions should be more willing to physically distance in a context of restrictive containment policies is alluring25, and while it has been reported during the current pandemic66,67, these positive assessments have tended to be at the start of the first lockdown period. ‘Rallying around the flag’68 may have raised the salience or boosted69 institutional trust around the time that restrictive policies were first enacted, but this effect appears to be temporary. Over time, the absence of a compliance dividend for high-institutional-trust countries could have resulted from lower risk perceptions of COVID-19 among people who trust their government70, effectively balancing out any greater willingness to follow what the authorities say. In some countries, well-known political actors signalling a lower level of risk than that implied by the strictness of government policies may have undermined pre-pandemic trust in authorities among discerning citizens71, or confused the trusting as to what the appropriate protective behaviours are, or rendered any such compliance dividend contingent on partisanship72. Indeed, if current levels of trust in government are influenced by policy saliency and performance73, it is noteworthy that some countries with high pre-pandemic institutional trust have so far been widely perceived to have been effective in their handling of COVID-19 (for example, New Zealand), while others (for example, India) have experienced many deaths.

Despite our null findings for institutional trust, we observed a clear and robust association between interpersonal trust and change over time in adherence to physical distancing, with high-interpersonal-trust countries exhibiting weaker decline. Although one might expect low trust in strangers to encourage staying away from them, creating the public good of reducing infection spread requires collective action74. Therefore, paying the individual cost of physical distancing should only be worthwhile if one expects others to do the same. In line with this logic, greater compliance with NPIs has been found in Italian provinces75, and in counties in the United States72 where citizens harbour relatively higher levels of social trust. The initially lower adherence in countries with high social trust that we find may be due to a greater need for an adjustment period in societies where physical distancing is more incongruous76. But as the months pass, to sustain their own compliance, it appears that people need to trust that strangers will also physically distance.

Our study has a number of limitations, such as the lack of available data detailing the extent of governments’ enforcement of NPIs. Self-report survey data may be affected by changing reference points of the respondents over time. However, this is unlikely to drive our results as survey findings are in line with objective mobility data. Individual-level data on income, education, personality traits77 and psychological fatigue were not gathered in surveys, and over the course of the pandemic, questions about changing risk perceptions and levels of trust were not asked repeatedly. Moreover, while pseudo-panel approaches to working with repeated cross-sectional surveys are an established method, here allowing us to estimate temporal changes in adherence to protective behaviours that are stipulated in governments’ NPIs, a true panel would have enabled more fine-grained assessments of individual patterns and moderations. Notwithstanding, our results are consistent across a large number of robustness checks.

Our findings have a number of implications for researchers and policymakers. First, epidemic modellers should revisit and revise assumptions of sustained adherence over time. Also, retrospective assessments of the effectiveness of NPIs may underestimate certain policies if they assume higher levels of compliance over time than was empirically the case. Second, policymakers should bear in mind that changes in adherence are highly contingent on the type of protective behaviour under consideration. A low-cost, habituating behaviour such as mask use represents an almost ideal measure to reduce risks while, if sufficiently safe, in some careful ways allowing life to return to normal. As our findings reveal, people are likely to adopt it and get used to it. The same rationale is likely to apply to other low-cost, habituating behaviours, notably the use of cheap and reliable rapid testing before entering closed venues or participating in gatherings.

Third, we hope that our findings do not delay the enactment or re-imposition of containment policies where community transmission is causing cases among immunologically vulnerable people to rise, given the persuasive evidence that more rapid government action has led to fewer deaths78. Instead, we wish to encourage further discussions about (a) how to set protective behaviour policies that mitigate the decline or possibly even increase adherence over time, and (b) how to motivate steady physical distancing adherence6, with additional focus on groups whose behavioural patterns have shown the largest drop-offs in adherence. The benefits of building and reinforcing social trust suggest policies that act as carrots79, rather than merely picking up the sticks of stronger enforcement, especially in societies where interpersonal trust is high. Towards the end of the first lockdown, in Germany (a country with above-average scores for social trust among our group of high-social-trust countries), significantly more people expressed willingness to comply with a government policy limiting contact with others if the policy was communicated as voluntary rather than government enforced37. Indeed, the more social trust Germans reported, the more control averse they were. Therefore, seeking to build trust in strangers where it is low, and elevating people’s sense of autonomous choice80 and of the societal benefits of adherence where social trust is high, may help to extend people’s performance of physical distancing behaviours.

## Methods

### Data

#### Dependent variables: self-report protective behaviour

We use the publicly available ICL-YouGov Global Survey on people’s behaviours in response to COVID-19 (ref. 81). The survey consists of repeated cross-sections over time, with survey waves approximately every 2 weeks (with variation in the exact dates of data collection by country). The full dataset contains responses for 15 countries from 1 April to 24 November (and another 14 countries for a shorter period). We excluded Australia from the full-period sample because full data were not available to the authors during the analysis, thus restricting our sample of countries to Canada, Denmark, Finland, France, Germany, Italy, Japan, the Netherlands, Norway, Singapore, South Korea, Spain, Sweden and the United Kingdom. The survey rounds were conducted online and approved by the ethical review boards of Imperial College London (ICREC #20IC6020) and Columbia University (IRB-AAAT2959). Although representative of the populations of each country as compared with available national statistics on age, sex and region, those with severe COVID-19 symptoms and other hard-to-reach groups may have been under-represented in these surveys.

Our dependent variables for avoidance of going out, avoidance of gatherings and mask use drew on responses to the question: “Thinking about the last 7 days, how often have you taken the following measures to protect yourself or others from coronavirus (COVID-19)? As a reminder, please exclude any measures that you have already taken for reasons other than coronavirus (COVID-19)”. For mask use, we drew on “Worn a face mask outside your home”; for avoiding going out, we used the average of “Avoided going out in general” and “Avoided going to shops”; for avoiding gatherings, we used the average of four questions including “Avoided having guests to your home” and other questions referring to the size of social gathering avoided (not more than 2 people, between 3 and 10 people and more than 10 people). For our aggregated physical distancing dependent variable, we took the average of our avoiding going out and avoiding social gatherings variables.
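The construction of these composite variables can be sketched as follows. This is a minimal illustration, not the authors' code: the item keys are hypothetical names, and it assumes each response has already been coded on a numeric frequency scale (the coding itself is not specified in this section).

```python
from statistics import mean

# Hypothetical item keys; numeric coding of responses is an assumption.
GOING_OUT_ITEMS = ["avoided_going_out", "avoided_shops"]
GATHERING_ITEMS = [
    "avoided_guests",
    "avoided_small_gatherings",
    "avoided_medium_gatherings",
    "avoided_large_gatherings",
]


def composite_scores(resp):
    """Return the four dependent variables for one respondent."""
    going_out = mean(resp[k] for k in GOING_OUT_ITEMS)
    gatherings = mean(resp[k] for k in GATHERING_ITEMS)
    return {
        "mask_use": resp["worn_face_mask"],
        "avoid_going_out": going_out,
        "avoid_gatherings": gatherings,
        # Aggregate measure: average of the two avoidance composites.
        "physical_distancing": mean([going_out, gatherings]),
    }
```

Note that the aggregate is an average of the two composites, so the two going-out items together carry the same weight as the four gathering items.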

As the ICL-YouGov Global Survey is not a true panel but a set of repeated cross-sections, to estimate growth curve models we created a pseudo-panel82 by aggregating data per country in 10-day intervals and demographic characteristics cohorts. Due to the almost continuous data collection process of the survey, 90% of the country/10-day interval cells contained data, with a median number of observations per cell of 829.5 (first quartile 255, third quartile 2,004). In pseudo-panel models, cohorts are defined in terms of respondents’ attributes that are mutually exclusive and time invariant. In the present study, cohorts were therefore defined by covariates of interest (5-year intervals of age, gender, employment status, having a chronic illness or not and living in a single-person household versus living in a multi-person household). Usually, year of birth is used as a cohort factor rather than age, as the former is time invariant. Due to the short period covered by the survey data (less than 1 year), we use age as a proxy for birth year. All pseudo-panels considered cohorts as cells derived from an interaction of country, age groups and a second grouping that was the variable of interest of the analysis. For instance, for the moderation models of change in adherence across genders, the pseudo-panel contained 24 cohorts per period of 10 days per country (12 age ranges × 2 genders), with up to 24 intervals of 10 days each (240 days) per cohort, creating a total of up to 8,064 data points over the 14 countries (24 cohorts × 24 intervals × 14 countries). As the panel is unbalanced (not all cohorts have data collected in every 10-day period), the final database in the gender and age interaction cohorts contained 6,507 observations. The dependent variables of interest (avoidance of going out, avoidance of gathering and mask use) were averaged over each cell, as well as all independent and control variables.
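A pseudo-panel of this kind can be sketched as a simple cell-wise aggregation. The field names below are hypothetical, and anchoring the 10-day intervals on a `days_since_start` counter is an assumption made for illustration; the sketch shows the gender-by-age cohort definition only.

```python
from collections import defaultdict
from statistics import mean


def age_band(age, width=5):
    """Lower bound of the 5-year age band containing `age`."""
    return (age // width) * width


def build_pseudo_panel(rows, interval_days=10):
    """Average the outcome within country x age band x gender x period cells."""
    cells = defaultdict(list)
    for r in rows:
        period = r["days_since_start"] // interval_days
        key = (r["country"], age_band(r["age"]), r["gender"], period)
        cells[key].append(r["outcome"])
    # Cells with no respondents never appear, so the panel is unbalanced.
    return {key: mean(values) for key, values in cells.items()}


# Upper bound on cells quoted in the text:
# 12 age ranges x 2 genders x 24 intervals x 14 countries.
max_cells = 12 * 2 * 24 * 14
```

Because empty cells are simply absent from the result, the number of observations actually available can fall below the 8,064 upper bound, as the text reports.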

#### Dependent variables: mobile-phone mobility

We also evaluated change in adherence based on publicly available data from Google COVID-19 Community Mobility Reports83, using the measures of change in time spent in residences and change in retail and recreation visits. We chose these measures as they more adequately represent individual physical distancing choices than other available measures (for example, workplace-related mobility, which may be affected by employers’ decisions) and are less likely to be bounded by essential needs (as per changes in visits to pharmacies and groceries). The baseline for these indicators is the median value, for the corresponding day of the week, during the 5-week period from 3 January to 6 February 2020. The mobility data are structured as a country–day panel. We used mobility data prior to 10 December 2020 (beginning of vaccination period).
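To make the baseline definition concrete, the sketch below computes a percentage change relative to a day-of-week median over the baseline window. It is purely illustrative: the mobility reports already publish values as changes from baseline, and the raw daily series used here is a hypothetical input.

```python
from datetime import date
from statistics import median

BASELINE_START = date(2020, 1, 3)
BASELINE_END = date(2020, 2, 6)  # 35 days, so five of each weekday


def weekday_baselines(daily):
    """Median of the baseline-window values for each day of the week."""
    by_weekday = {}
    for day, value in daily.items():
        if BASELINE_START <= day <= BASELINE_END:
            by_weekday.setdefault(day.weekday(), []).append(value)
    return {wd: median(vals) for wd, vals in by_weekday.items()}


def percent_change(daily, day):
    """Change on `day` relative to the baseline for the same weekday."""
    base = weekday_baselines(daily)[day.weekday()]
    return 100.0 * (daily[day] - base) / base
```

Comparing each day with the same weekday in the baseline window keeps ordinary weekly rhythms (for example, quieter Sundays) from being read as changes in distancing.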

#### Policy control variables

To control for the effects of NPIs on behaviour, we used data from the OxCGRT1. We included indices, calculated as per the OxCGRT Stringency Index, of all individual containment policies (schools closing, workplace closing, cancel public events, restrictions to gatherings, closing public transport, stay at home requirements, restrictions to internal movement and international travel controls) as controls in our models, adding another index for facial coverings as a control in the analyses with mask use as the dependent variable.

#### Other control variables

Other covariates include weather (monthly average temperature per country, 2010–2019 (ref. 84)) interacted with hemisphere (1 for North, 0 for South), COVID-19 searches (frequency of web searches for ‘COVID’ determined using Google Trends85) and increase in deaths, a dummy variable indicating whether the 7-day moving average of the number of daily deaths in the country is rising (1) or stable or decreasing (0). For simplicity, this suite of variables is called ‘standard controls’. For robustness checks, we considered ‘alternative controls’, replacing the increase-in-deaths dummy with the 7-day moving average of the number of daily cases. We used data on COVID-19 deaths from Johns Hopkins University Center for Systems Science and Engineering data repository86. Time-invariant variables such as country income level and geographical region were used in moderation models, as the other models focus only on variance across time.
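The increase-in-deaths dummy can be sketched as below. The text does not state the comparison used to decide whether the moving average is "rising"; this sketch assumes a comparison with the previous day's moving average, which is only one possible reading.

```python
def moving_average(series, window=7):
    """Trailing moving average; None until a full window is available."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window : i + 1]) / window)
    return out


def increase_in_deaths(daily_deaths, window=7):
    """1 if the moving average rose versus the previous day, else 0.

    Assumption: 'rising' means strictly above the previous day's average.
    Days without two consecutive averages are returned as None.
    """
    ma = moving_average(daily_deaths, window)
    dummy = []
    for i, value in enumerate(ma):
        if i == 0 or value is None or ma[i - 1] is None:
            dummy.append(None)
        else:
            dummy.append(1 if value > ma[i - 1] else 0)
    return dummy
```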

#### Explanatory variables

The main independent variables of interest were measures of time since the beginning of closure and containment policies. The OxCGRT database registers policy recommendations and requirements. We counted the start of the baseline adherence month as the date of the first required measure (that is, excluding recommendations) that could reasonably be expected to have a direct influence on within-country mobility; thus we excluded international travel controls, which in any case were often enacted much earlier than other closure policies. This time measure was used in different functional forms: (i) a set of dummy variables representing 30-day intervals, which allows for a non-parametric estimation and observation of the change in adherence pattern (shown in the figures), and (ii) quadratic and linear terms (reported in the tables). Most countries experienced a rapid ramp-up of physical distancing policy strength, taking a median time of 25 days from the first required measure to reach the country’s maximum level of policy strength on the OxCGRT Stringency Index. Considering, though, that there was some variation across countries in their pace of policy ramp-up, we perform robustness checks using two alternative starting dates for the baseline adherence month: the date that each country first reached 50 or 70 on the OxCGRT stringency index (or maximum stringency in case the country did not cross that threshold in the period of analysis, which happened in a few cases; Supplementary Table 1l,m and Supplementary Fig. 1e,f).
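The timing variables described above can be sketched as follows. The policy labels and input structures are hypothetical, and the fallback of taking the first day at maximum stringency when the threshold is never reached is an assumption about how that case was handled.

```python
from datetime import date


def baseline_start(required_policy_dates):
    """Earliest required within-country measure (travel controls excluded)."""
    eligible = [
        d
        for name, d in required_policy_dates.items()
        if name != "international_travel_controls"
    ]
    return min(eligible)


def threshold_start(stringency, threshold):
    """First day the index reaches `threshold`, else first day at its maximum."""
    days = sorted(stringency)
    for d in days:
        if stringency[d] >= threshold:
            return d
    peak = max(stringency.values())
    return next(d for d in days if stringency[d] == peak)


def period_index(day, start, length=30):
    """Number of complete 30-day intervals elapsed since `start`."""
    return (day - start).days // length


def period_dummies(day, start, n_periods=8, length=30):
    """Dummies for periods 1..n_periods-1; period 0 is the reference.

    Days beyond the last period also yield all zeros, so filter them out first.
    """
    idx = period_index(day, start, length)
    return [1 if idx == p else 0 for p in range(1, n_periods)]
```

With eight 30-day periods (240 days), the first period serves as the reference category and the remaining seven enter the model as dummies.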

#### Moderators: country level

Institutional trust and interpersonal trust measures were obtained from World Values Survey data (wave 6 (ref. 87)). Interpersonal trust was measured as the proportion of people in a country answering that “most people can be trusted” to the question: “Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people?”. As per ref. 88, institutional trust was taken to be the average of responses to three items presented after the following question: “I am going to name a number of organizations. For each one, could you tell me how much confidence you have in them? [civil service, courts, and the government]”. As robustness checks, we also estimated different models considering each one of these three items as the moderator, as well as the trust in the healthcare system measure from the European Values Survey89,90.
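As a rough sketch of the two country-level trust measures, under stated assumptions: the answer strings and item keys below are hypothetical, the confidence items are assumed to be numerically coded already, and averaging within respondents before averaging across them is an assumption, since the aggregation order is not described here.

```python
from statistics import mean

TRUSTING_ANSWER = "most people can be trusted"
INSTITUTION_ITEMS = ("civil_service", "courts", "government")


def interpersonal_trust(answers):
    """Share of respondents choosing the trusting answer."""
    return sum(a == TRUSTING_ANSWER for a in answers) / len(answers)


def institutional_trust(respondents):
    """Country mean of each respondent's average across the three items."""
    return mean(mean(r[i] for i in INSTITUTION_ITEMS) for r in respondents)
```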

In all models, we treat policies as controls considering their average effects on behaviour over countries and time. In an exploratory model, we also evaluated time-varying policy effects by including an interaction between the stringency index and time since first required policy, alongside the squared time term. We also estimated models with an interaction between the linear and quadratic time trends and dummies representing periods of policy rollback (reduction of at least 10 stringency-index points from pre-rollback maximum stringency that lasts for at least 30 days) and reversal of rollback (increase of at least 10 stringency-index points from minimum stringency in rollback period that lasts for at least 30 days).
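One way to operationalize the rollback definition is sketched below. It reflects an assumed reading of the rule, namely that a rollback begins on the first day of a run of at least 30 consecutive days on which the index sits 10 or more points below the highest value seen before the run; the input is a hypothetical daily stringency series for one country.

```python
def rollback_start(stringency, drop=10.0, min_days=30):
    """Index of the first day of a rollback period, or None if there is none."""
    running_max = float("-inf")
    run_start = None
    for i, value in enumerate(stringency):
        if value <= running_max - drop:
            # Inside a candidate rollback run.
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_days:
                return run_start
        else:
            # Run broken (or never started): reset and update the maximum.
            run_start = None
            running_max = max(running_max, value)
    return None
```

A reversal of rollback could be detected with the mirror-image rule, tracking the minimum within the rollback period and looking for a sustained rise of at least 10 points.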

#### Moderators: individual level

Moderators of change in adherence at the individual level were obtained from the ICL-YouGov survey and include age, gender, having a chronic disease, living in a single-person household and employment status (employed, unemployed/not working, retired and student/others).

### Analysis

We used multi-level growth curve models91 to describe the evolution in protective behaviours over time. This type of model permits the evaluation of a given outcome over time, controlling for other time-varying covariates and accounting for the dependency of repeated measures over time. For the main analysis with ICL-YouGov survey data, we have a three-level data structure, with time-varying measures of behaviour (level 1) nested within cohorts (level 2) and countries (level 3). In the main model, we evaluate a non-linear (quadratic) growth curve model. Formally:

$$Y_{ijk} = \pi_{0jk} + \pi_{1jk}\,\mathrm{time} + \pi_{2jk}\,\mathrm{time}^2 + \sum_{p=1}^{8} \delta_p\,\mathrm{pol}_{pik} + \varepsilon_{ijk} \quad (\mathrm{level}\,1, \mathrm{time})$$
$$\begin{array}{l}\pi_{0jk} = \gamma_{00k} + r_{0jk}\\ \pi_{1jk} = \gamma_{10k} + r_{1jk}\\ \pi_{2jk} = \gamma_{20k} + r_{2jk}\end{array} \quad (\mathrm{level}\,2, \mathrm{cohort})$$
$$\begin{array}{l}\gamma_{00k} = \beta_{000} + u_{00k}\\ \gamma_{10k} = \beta_{100}\\ \gamma_{20k} = \beta_{200}\end{array} \quad (\mathrm{level}\,3, \mathrm{country})$$

where $i$ represents time since first required measure, $j$ a specific cohort in a given country and $k$ the different countries; $\mathrm{pol}_{pik}$ represents the strength of each one of the eight containment policy indices from OxCGRT in country $k$ at time $i$; and ‘time’ indicates time (in months) since first required measure (centred around the mean so that time and time$^2$ are orthogonal). The model has random intercepts across cohorts ($\pi_{0jk}$) and countries ($\gamma_{00k}$), random slopes for time ($\pi_{1jk}$) and time$^2$ ($\pi_{2jk}$) across cohorts, unconstrained covariances between random effects and a first-order auto-regressive covariance structure of the residuals. Time terms are random effects at cohort level as this is the level at which we observe the moderators of interest for survey data. However, results hold with random effects at country level (Supplementary Table 1d). The coefficients of interest are $\beta_{100}$ and $\beta_{200}$, which capture the grand mean trajectories of behaviours accounting for policy strength. Our focus of interest is level 1, which captures variation across time. Level 2 variables are used as moderators of level 1 trends. When using mobility data as the dependent variable, similar models were estimated but with two rather than three levels: time (level 1) and country (level 2). We also estimated models with 30-day period dummy variables (having 0–30 days after first required measure as the reference category) instead of the linear and quadratic trends to evaluate the shape of growth curves. Additionally, we estimated models with standard controls (COVID-19 searches, weather, hemisphere, weather × hemisphere and increase in deaths) and with alternative controls (7-day moving average of number of daily cases instead of increase in deaths).
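For readers who prefer a single expression, substituting the level 2 and level 3 equations into level 1 gives the combined form below. This is a restatement of the same model and adds no new terms:

$$Y_{ijk} = \beta_{000} + \beta_{100}\,\mathrm{time} + \beta_{200}\,\mathrm{time}^2 + \sum_{p=1}^{8} \delta_p\,\mathrm{pol}_{pik} + u_{00k} + r_{0jk} + r_{1jk}\,\mathrm{time} + r_{2jk}\,\mathrm{time}^2 + \varepsilon_{ijk}$$

Under this form, and holding the policy indices fixed, the grand mean trajectory has a stationary point where its derivative with respect to time is zero:

$$\beta_{100} + 2\beta_{200}\,\mathrm{time}^{*} = 0 \quad \Rightarrow \quad \mathrm{time}^{*} = -\frac{\beta_{100}}{2\beta_{200}}$$

This point is a minimum, that is, the bottom of a U-shaped curve, when $\beta_{200} > 0$. Because time is mean centred, $\mathrm{time}^{*}$ is expressed in months relative to the mean, and the mean must be added back to express it in months since the first required measure.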

For robustness, we also estimated the linear and quadratic trends using ordinary least squares models with dummy variables representing countries (fixed effects) and clustered standard errors at country level for mobility data (Supplementary Table 1c, Supplementary Figs. 2e, 3f), and dummy variables representing countries and cohorts and clustered standard errors at cohort and country level for survey data (Supplementary Table 1b, Supplementary Figs. 1k, 4j). Finally, we also considered alternative time periods (15 and 30 days) and age brackets (10 years) in the construction of the pseudo-panels and estimated models using individual disaggregated survey data and fixed-effects models (Supplementary Tables 1d and 2h–m, Supplementary Figs. 1l, 4k).

To evaluate the moderation of country-level (interpersonal trust and institutional trust) and individual-level (age, gender, employment status, chronic illness and single-person household) variables, each of these variables was multiplied by the linear and quadratic time terms. Additionally, to assess the heterogeneous changes in adherence, moderators were multiplied by each of the dummy variables representing 30-day time intervals after the first required measure. Moderation at country level was evaluated using mobility data only, as survey data include 14 countries (of which only 8 also had trust measures from World Values Survey), which is an insufficient sample size for robust cross-country comparisons. For detailed results of models with standard control variables, see Supplementary Tables 2a–g and 3a–n. All statistical tests were two-sided.
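The construction of the moderation terms can be sketched as follows, assuming a single numeric moderator (for example, a 0/1 indicator or a country-level trust score). The function and key names are hypothetical and the sketch covers only the linear and quadratic time terms.

```python
from statistics import mean


def centred_time_terms(months):
    """Mean-centred time and its square for each observation."""
    m = mean(months)
    return [(t - m, (t - m) ** 2) for t in months]


def moderation_terms(months, moderator):
    """Time terms plus their products with the moderator."""
    return [
        {
            "time": t,
            "time_sq": t_sq,
            "mod_x_time": moderator * t,
            "mod_x_time_sq": moderator * t_sq,
        }
        for t, t_sq in centred_time_terms(months)
    ]
```

The coefficients on the two product terms then describe how the linear and quadratic components of the adherence trajectory differ with the moderator. For evenly spaced time points, mean centring makes the linear and squared terms uncorrelated, which is the property the centring is meant to deliver.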

We analysed the mobility data until 270 days and survey data until 240 days after the first required measure because, due to the varying date of first required measure, periods beyond these limits would exclude some countries (that is, the last to implement a required measure).

### Ethical review

Since the data used were publicly available, previously collected and not traceable to individuals, the Central University Research Ethics Committee (CUREC) of the University of Oxford deemed this study exempt from review.

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.