Introduction

Large-scale vaccination programs against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) deployed by governments and health authorities in more than 200 countries1 have substantially reduced coronavirus disease 2019 (COVID-19)-associated mortality and led people worldwide to anticipate the approaching end of the pandemic.

Clinical trials, as well as studies tracking real-world data, demonstrated the pronounced effectiveness of COVID-19 vaccines against SARS-CoV-2 infection and severe COVID-192.

People at increased risk of severe COVID-19 were among the first to be involved in vaccination programs in many countries3,4. This probably helped prevent the collapse of health systems in these countries and maintain economic activity5. By the end of the second year of the pandemic, the increased number of people with vaccine-induced immunity together with the emergence of new, more contagious but less virulent strains of the virus yielded a drop in COVID-19-associated hospitalizations and deaths worldwide6. These developments reduced caution regarding COVID-19 and led to the relaxation of coronavirus-related containment measures. Consequently, the "flattening the curve" strategy, with its main emphasis on limiting population mobility and contact tracing, was replaced by less stringent measures focused on protecting vulnerable populations by encouraging vaccine uptake7,8.

Vaccines prevent severe COVID-19, although their ability to prevent infections even in highly vaccinated populations fades due to the ability of new strains to evade the immune response9,10,11,12. Vaccinated individuals account for at least one-third of all people hospitalized due to COVID-19. Thus, the threat of COVID-19 is not over13,14,15,16. Evidence is needed to target people who are at risk of severe infection with timely boosters, antiviral treatment, and preexposure prevention.

A great deal of uncertainty exists in terms of future planning of vaccination strategies alongside conflicting results in studies aimed at exploring the long-term protection gained by vaccination or booster shot regimens17,18,19,20. We aimed to evaluate the risk of breakthrough infections focusing on severe COVID-19 requiring hospital care and associated sociodemographic and health-related factors.

Methods

Design

We performed a retrospective cohort study based on individual-level data linking national routine datasets on laboratory-confirmed SARS-CoV-2 infection, COVID-19 vaccination status, and health care utilization between 19 January 2021 and 9 February 2022 in a SARS-CoV-2 infection-naive fully vaccinated cohort of 184,132 individuals. This study follows STROBE guidelines for observational research reporting21.

All research was performed in accordance with relevant guidelines and regulations. The Research Ethics Committee of the University of Tartu approved the study on 18 October 2021 (No. 351/M-8). The need for informed consent was waived by the Research Ethics Committee of the University of Tartu due to the retrospective nature of the study.

The funding agencies had no role in the study design, data collection, data analysis, interpretation, or writing of the manuscript.

Study population

Individuals aged 12 years and older not previously infected with SARS-CoV-2 who had completed a primary vaccine series against SARS-CoV-2 with at least two doses of BNT162b2 (Pfizer/BioNTech), mRNA-1273 (Moderna), or AZD1222 (Oxford/AstraZeneca), or one dose of Ad26.COV2.S (Janssen/Johnson & Johnson) vaccine, were included. Individuals who had a supplementary dose of vaccine after the primary series were considered boosted. Data are presented in the supplementary materials.

A detailed description of the study cohort and the number of participants in each group is presented in the flow diagram (see Appendix A).

Study period

The study period was from 19 January 2021 to 9 February 2022. The data were pertinent to the first two years of the COVID-19 pandemic (driven by the alpha, delta, and, to a lesser extent, the omicron variants of SARS-CoV-2)22. This period was characterized by free access to SARS-CoV-2 PCR testing and subsequent treatment for all Estonian citizens suspected to have or who had confirmed SARS-CoV-2 infection. By the end of the period, 65% of the Estonian population was fully vaccinated23.

Data sources

SARS-CoV-2 vaccination and testing data

Data on COVID-19 vaccination (dates, vaccine, and the manufacturer for each vaccination episode) and SARS-CoV-2 testing (results, dates) were retrieved from the Health and Welfare Information Systems Centre (TEHIK)24. The TEHIK maintains the e-health system in Estonia (electronic health records, EHRs) by collecting records about all contacts with health care providers, including medical care and laboratory testing records, referrals, and data on vaccinations25. According to law, all health care providers and laboratories in Estonia are obliged to report to the TEHIK, with an expected 100% coverage. Cohort membership was defined based on the data from the TEHIK.

Health status and health care utilization data

The Estonian Health Insurance Fund (EHIF) maintains records of all health care services delivered to all residents with valid health insurance (approximately 95% of all residents), including personal information (sex, age), health care-related services delivered to individuals, primary and other diagnoses on health care claims (based on the International Classification of Diseases, 10th revision (ICD-10)), treatment type (in- or outpatient), types of tests or services provided, and the date of death26.

Sociodemographic data

The Population Register is a unified database of Estonian citizens and foreign nationals living in Estonia on the basis of right of residence or residence permit and is managed and developed by the Ministry of the Interior. Population Register data were used to identify the study subjects’ location of residence (including emigration status), education, and ethnicity.

Exposures

We considered a primary series of COVID-19 vaccine and boosters as exposure in this study.

The COVID-19 primary vaccine series was defined as the vaccine administered as the first dose for a 1-dose series and the second dose for a 2-dose series. A vaccine booster was defined as any additional SARS-CoV-2 vaccine dose received after completion of the primary vaccination series against COVID-19.

Outcomes

Breakthrough infection

Breakthrough infection (BTI) was defined as a SARS-CoV-2-positive laboratory test (PCR or antigen test) occurring at least 14 days after the last vaccination and was our main outcome. The time of breakthrough infection was based on the date of the positive test collection.
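The 14-day rule can be written as a short check. The following is a minimal Python sketch, not the authors' code; the function name `is_breakthrough` and the inclusive treatment of day 14 are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional

# Minimum interval between the last vaccine dose and a positive test,
# taken from the BTI definition in the text.
MIN_DAYS_AFTER_LAST_DOSE = 14


def is_breakthrough(last_dose: date, positive_test: Optional[date]) -> bool:
    """Return True if a positive test counts as a breakthrough infection.

    Assumption: a test collected exactly 14 days after the last dose qualifies.
    """
    if positive_test is None:  # the person never tested positive
        return False
    return positive_test - last_dose >= timedelta(days=MIN_DAYS_AFTER_LAST_DOSE)
```

For example, a positive sample collected on 15 June after a last dose on 1 June would qualify, whereas one collected on 14 June would not.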

Severe BTI

A secondary outcome, severe BTI, was hospitalization related to COVID-19 three days before to 30 days after a laboratory-confirmed SARS-CoV-2 infection. We set additional criteria for COVID-19-related hospitalization, aiming to avoid misclassification related to COVID-19 cases identified by routine screening during hospitalization for other reasons. There had to be at least one of the following diagnoses in relation to hospitalization: COVID-19 (U07.1, U07.2), acute respiratory tract infections (J00–J06, J12, J15–J18, J20–J22, J46) or lower respiratory tract infections (J80–J84, J85–J86)27,28.
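The two criteria for a severe BTI (timing of the hospitalization and a qualifying diagnosis) can be combined as sketched below. This is an illustrative Python sketch under stated assumptions, not the authors' implementation: the window is applied to the admission date with inclusive bounds, diagnosis codes are assumed to be dotted ICD-10 strings such as "J18.9", and all function names are hypothetical.

```python
from datetime import date
from typing import Iterable

# Three-character ICD-10 categories listed in the text (ranges expanded).
RESPIRATORY_CATEGORIES = (
    {f"J{n:02d}" for n in range(0, 7)}      # J00-J06
    | {"J12"}
    | {f"J{n}" for n in range(15, 19)}      # J15-J18
    | {f"J{n}" for n in range(20, 23)}      # J20-J22
    | {"J46"}
    | {f"J{n}" for n in range(80, 87)}      # J80-J84 and J85-J86
)
COVID_CODES = {"U07.1", "U07.2"}


def has_qualifying_diagnosis(codes: Iterable[str]) -> bool:
    """True if any claim diagnosis is COVID-19 or a listed respiratory code."""
    for code in codes:
        code = code.strip().upper()
        if code in COVID_CODES or code[:3] in RESPIRATORY_CATEGORIES:
            return True
    return False


def is_severe_bti(positive_test: date, admission: date,
                  codes: Iterable[str]) -> bool:
    """Admission from 3 days before to 30 days after the positive test,
    with at least one qualifying diagnosis on the hospital claim."""
    offset_days = (admission - positive_test).days
    return -3 <= offset_days <= 30 and has_qualifying_diagnosis(codes)
```

Requiring a qualifying diagnosis in addition to the time window is what excludes admissions for unrelated reasons in which infection was detected only by routine screening.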

Covariates

We included possible confounders (covariates) associated with the severity of COVID-19 or associated with both SARS-CoV-2 infection and hospitalization.

Sociodemographic characteristics

The age, sex and education level of the study participants were included in the analysis.

The educational level of study participants derived from the population statistics was divided into three groups according to the International Standard Classification of Education (ISCED): low education level (basic education or below), medium education level (general secondary education or vocational education based on secondary education), and high education level (higher or tertiary education)29,30.

Prevaccination comorbidities

Based on health care claims data for the period of 24 months prior to vaccination, the Charlson Comorbidity Index (CCI) score was estimated. The CCI is a weighted index that accounts for the number of patient comorbidities and severity and has good predictive validity for the mortality of patients with COVID-1931. Comorbidity burden was categorized into three groups (CCI score = 0, CCI score 1–2, and CCI score ≥ 3) and calculated using algorithms and weights described by Quan et al.32.
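The three comorbidity groups can be expressed as a small helper. This is a minimal Python sketch with a hypothetical function name and group labels; it assumes the CCI score has already been computed (for example with the Quan et al. weights) and does not reproduce that calculation.

```python
def cci_group(score: int) -> str:
    """Map a Charlson Comorbidity Index score to the three groups in the text."""
    if score < 0:
        raise ValueError("CCI score cannot be negative")
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2"
    return ">=3"
```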

We also used the duration (measured in weeks) of hospitalization 365 days pre-vaccination as a measure of the morbidity of the study participants33.

Data on selected chronic comorbid conditions were included (defined based on ICD-10 classification diagnosis codes), considering the potential of these diseases to act as risk factors for severe COVID-19 (see Appendix B). Comorbidities were defined as any secondary or other diagnoses coded in the claim and/or diagnoses of any type on hospital or outpatient health care claims during the year preceding the index date. We applied a restriction to outpatient claims, such that a comorbid condition could be flagged during the preceding period only if it appeared two or more times at least 30 days apart. We sought to determine the contribution of high glycaemic levels to the risk of severe COVID-19. A high glycaemic level was identified based on the health care service code, which indicated a level of HbA1c greater than or equal to 7% on the last blood sampling34.
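The claim-based flagging rule can be sketched as follows. This is an illustrative Python sketch, not the authors' code. It assumes that a single hospital claim is sufficient, that the look-back window is the 365 days before the index date (index date excluded), and that "two or more times at least 30 days apart" means the earliest and latest outpatient claims in the window are at least 30 days apart. The function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable


def comorbidity_flag(index_date: date,
                     inpatient_dates: Iterable[date],
                     outpatient_dates: Iterable[date],
                     lookback_days: int = 365,
                     min_gap_days: int = 30) -> bool:
    """Flag a condition from the claim dates on which it was coded."""
    window_start = index_date - timedelta(days=lookback_days)

    def in_window(d: date) -> bool:
        return window_start <= d < index_date

    # Assumption: one hospital claim in the window is enough.
    if any(in_window(d) for d in inpatient_dates):
        return True

    # Outpatient claims: at least two, spread over 30 days or more.
    dates = sorted(d for d in outpatient_dates if in_window(d))
    return len(dates) >= 2 and (dates[-1] - dates[0]).days >= min_gap_days
```

The spacing requirement reduces the chance that a rule-out or provisional diagnosis recorded at a single outpatient episode is counted as a chronic condition.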

Follow-up and timing

All individuals in the cohort were followed longitudinally through a linked database, which allowed for continuous tracking of their vital status and study outcomes over time. Breakthrough infections were captured until 9 February 2022. Data about hospitalisation were captured until 11 March 2022. For an individual, the follow-up period started on the day after the date of the last vaccine dose received. We measured the time-to-event from the first date after the last vaccination until the primary outcome, death or the end of the study, whichever occurred first.

For the secondary outcome (severe BTI), subjects were followed until the COVID-19 hospitalization date, death, or end of the study, whichever occurred first.
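The follow-up rule (start on the day after the last dose; stop at the outcome, death, or end of the study, whichever comes first) can be sketched as below. This is a minimal Python sketch with hypothetical names. The default end date is the BTI capture date given in the text, and 11 March 2022 can be passed for the hospitalization outcome. Letting the outcome take precedence over death on the same date is an assumption.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

STUDY_END_BTI = date(2022, 2, 9)  # end of BTI capture stated in the text


def follow_up(last_dose: date,
              outcome: Optional[date] = None,
              death: Optional[date] = None,
              study_end: date = STUDY_END_BTI) -> Tuple[int, str]:
    """Return (days of follow-up, reason follow-up ended)."""
    start = last_dose + timedelta(days=1)  # follow-up starts the next day
    end, reason = study_end, "end_of_study"
    if death is not None and death < end:
        end, reason = death, "death"
    # Assumption: an outcome on the same date as death counts as the event.
    if outcome is not None and outcome <= end:
        end, reason = outcome, "event"
    return max((end - start).days, 0), reason
```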

The outcome accrual period started 14 days after the completion of the primary vaccination series (the date of the last vaccine dose) and 14 days after the booster shot for the individuals in the primary + booster dose group.

Statistical analysis

Incidence rates (IRs) of breakthrough infections were calculated per 10,000 person-days with 95% Poisson confidence intervals. We assessed the distribution of sociodemographic and prevaccination health status to determine differences between fully vaccinated individuals with and without breakthrough infections.
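An incidence rate per 10,000 person-days with a Poisson-based confidence interval can be computed as sketched below. The text does not state which interval method was used, so this Python sketch uses Byar's approximation to the exact Poisson limits as an assumption. The function name and the example inputs in the usage note are hypothetical.

```python
import math
from typing import Tuple


def incidence_rate(events: int, person_days: float,
                   per: float = 10_000, z: float = 1.96
                   ) -> Tuple[float, float, float]:
    """Rate per `per` person-days with an approximate 95% Poisson CI.

    Uses Byar's approximation to the exact Poisson limits for the event count.
    """
    if events <= 0 or person_days <= 0:
        raise ValueError("events and person_days must be positive")
    d = events
    lower = d * (1 - 1 / (9 * d) - z / (3 * math.sqrt(d))) ** 3
    upper = (d + 1) * (1 - 1 / (9 * (d + 1)) + z / (3 * math.sqrt(d + 1))) ** 3
    scale = per / person_days
    return d * scale, lower * scale, upper * scale
```

With made-up inputs of 100 events over 1,000,000 person-days, `incidence_rate(100, 1_000_000)` gives a rate of 1.0 per 10,000 person-days with limits of roughly 0.81 and 1.22.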

Sociodemographic and comorbidity status data are reported as the means and standard deviations (SDs) for continuous variables and as frequencies and proportions for binary variables.

Education level was missing for 5.5% of participants. We used k-nearest neighbour (kNN) imputation with age, sex, and education level as predictors.

To account for the potential effect of SARS-CoV-2 testing frequency, we estimated the testing propensity for each individual based on negative binomial regression model prediction, using sex, education level, and the natural spline (with four degrees of freedom) of age as covariates.

Factors associated with the outcome (breakthrough infection and related COVID-19 hospitalization) were explored using time-dependent Cox regression, with all those without BTI as a reference group for BTI as an outcome and people who did not require hospitalization as a reference group for severe COVID-19 as an outcome.

We used calendar time as the timescale and binned the follow-up period by month to account for differences in overall background infection rates and SARS-CoV-2 strains. Adjusted models included boosting status and time from the last vaccination as time-varying covariates and all other covariates (age, sex, level of education, hospitalization duration during the last year prior to vaccination, Charlson Comorbidity Index, twelve selected diseases, and SARS-CoV-2 testing intensity) as time-fixed covariates. The results are presented as hazard ratios (HRs) and adjusted hazard ratios (aHRs) with 95% confidence intervals (CIs).
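Binning follow-up by calendar month amounts to splitting each person's follow-up interval at month boundaries, so that every piece can be compared with other people at risk in the same month. The following is a minimal Python sketch of that splitting step only (not of the Cox model and not the authors' code); the function name and the half-open interval convention are assumptions.

```python
from datetime import date
from typing import List, Tuple


def split_by_calendar_month(start: date, end: date
                            ) -> List[Tuple[str, date, date, int]]:
    """Split [start, end) into pieces that each lie within one calendar month.

    Returns tuples of (month label, piece start, piece end, days at risk).
    """
    pieces = []
    cur = start
    while cur < end:
        # First day of the month following `cur`.
        if cur.month == 12:
            next_month = date(cur.year + 1, 1, 1)
        else:
            next_month = date(cur.year, cur.month + 1, 1)
        stop = min(next_month, end)
        pieces.append((cur.strftime("%Y-%m"), cur, stop, (stop - cur).days))
        cur = stop
    return pieces
```

For example, follow-up from 20 December 2021 to 10 February 2022 yields three pieces of 12, 31, and 9 days, labelled 2021-12, 2022-01, and 2022-02. Time-varying covariates such as boosting status would then be evaluated separately for each piece.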

We tested the effect of time since vaccination on BTI by dividing the follow-up period by month and comparing hazards of BTI or hospitalization with COVID-19 in every given period of time and used binned time from the last vaccination as a covariate.

All analyses were performed in R version 4.0.3. We used the tidyverse package for data pre-processing, the Epi package for creating a Lexis object of follow-up, the survival package for survival modeling, and the VIM package for kNN imputation.

Results

Study cohort

Among 184,132 individuals in the cohort, 54% (n = 99,453) were females, the mean age was 48.9 years (SD 19.9), and the majority (80.8%) had a high or medium education level. Overall, 84.7% (n = 155,994) of individuals had a CCI score of 0. In general, cardiovascular diseases accounted for one-third (34%) of all comorbid disorders in the present cohort. The most common comorbid condition among the study participants was hypertension (27.8%), followed by mood disorders (7.9%), diabetes (6.6%), heart diseases (6.2%), and chronic lung diseases (4.5%).

Over the period of the study follow-up, 392,647 SARS-CoV-2 tests were performed (mean 2.1 tests per person, SD 2.7).

Slightly less than half of the study participants (46.7%) received a booster dose of the SARS-CoV-2 vaccine on average 199 days (SD 33) after completing the primary vaccination course.

Cohort characteristics are presented in Table 1.

Table 1 Characteristics of the study cohort and individuals with SARS-CoV-2 breakthrough infections and severe breakthrough infections in Estonia for the period of 19 January 2021 to 9 February 2022.

SARS-CoV-2 breakthrough infections and COVID-19 hospitalizations

Over the follow-up of 11 months, 29,688 individuals had BTIs (IR 8.03, 95% CI 7.95–8.13 per 10,000 person-days), and 355 needed hospitalization due to COVID-19 (IR 0.093, 95% CI 0.084–0.104 per 10,000 person-days). The median follow-up time to BTI was 212 days (interquartile range, IQR 100, range 0–386 days) and 243 days to COVID-19 hospitalization (IQR 100, range 3–416 days).

Among individuals without a booster dose (n = 98,076), 24,559 (25%) had BTIs, and 327 (0.33%) were hospitalized for COVID-19. Of those who received a booster vaccine dose (n = 86,056), 6.0% (n = 5,129) had BTIs, and 0.03% (n = 28) were hospitalized for COVID-19.

Of the 355 individuals hospitalized for COVID-19, 15.2% (n = 54) died during hospitalization (crude mortality rate 15.2 per 100 hospitalized patients). Those who died during hospitalization were older (77.3 vs 66.3 years, p < 0.001), and a higher proportion of them had two or more chronic diseases (46.3% vs 18.3%, p < 0.001), cancer (31.5% vs 14.6%, p < 0.01), or renal disease (20.4% vs 10%, p < 0.05).
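The crude proportions reported above follow directly from the counts in the text. The short Python sketch below recomputes them as a consistency check; the helper name `pct` is hypothetical, and only numbers stated in the text are used.

```python
def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100 * numerator / denominator, digits)


# Counts reported in the text.
unboosted, boosted = 98_076, 86_056
bti_unboosted, bti_boosted = 24_559, 5_129
hosp_unboosted, hosp_boosted = 327, 28
deaths, hospitalized = 54, 355

bti_pct_unboosted = pct(bti_unboosted, unboosted)        # BTIs without booster
bti_pct_boosted = pct(bti_boosted, boosted)              # BTIs with booster
hosp_pct_unboosted = pct(hosp_unboosted, unboosted, 2)   # hospitalized, no booster
hosp_pct_boosted = pct(hosp_boosted, boosted, 2)         # hospitalized, boosted
in_hospital_mortality = pct(deaths, hospitalized)        # deaths among hospitalized
```

These are crude proportions that do not account for differences in follow-up time or covariates between the boosted and unboosted groups, which is why the adjusted hazard ratios are reported separately.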

Detailed data on the effect of different vaccine types and vaccine products on the incidence rates of the main outcomes are presented in the supplementary materials (see Appendix C).

Factors associated with SARS-CoV-2 breakthrough infection

BTI risk was dependent on having had a booster dose and the time since vaccination.

The risk for BTI began to increase from 2 months after the last vaccination (aHR 1.21, 95% CI 1.16–1.27), increasing gradually up to 4 months post-vaccination (aHR 1.75, 95% CI 1.66–1.83). Estimates of associations with BTI and COVID-19 hospitalization are presented in Table 2.

Table 2 Multivariable analysis of patient characteristics associated with SARS-CoV-2 breakthrough infection and severe breakthrough infection in Estonia for the period of 19 January 2021 to 9 February 2022.

Having had a booster dose of vaccine demonstrated a weak protective effect against BTI (aHR 0.95, 95% CI 0.90–0.99) (Fig. 1).

Figure 1 Time since vaccination and effect of boosting on the risk of BTI and COVID-19-associated hospitalization (aHR with 95% CI).

In addition, BTI risk was associated with sex, age, education level, number of comorbidities, and specific comorbidities. Females were more susceptible to BTI than males (aHR 1.12, 95% CI 1.09–1.15). Those aged 50 and older had lower risks for BTI than those younger than 40 years (aHR 0.75, 95% CI 0.69–0.82). We observed an inverse dose–response effect of age on the risk of BTI (decreased risk with increasing age); i.e., among people older than 70 years, there was an aHR of 0.34 (95% CI 0.29–0.39) for the 70–79 age group and an aHR of 0.31 (95% CI 0.27–0.36) for those aged 80 or older.

Only those with the highest comorbidity burden (CCI score ≥ 3) had a greater risk for breakthrough infection (aHR 1.18, 95% CI 1.05–1.33).

People with poor glycaemic control had a slightly lower risk of BTI (aHR 0.85, 95% CI 0.74–0.98). Renal diseases (aHR 1.16, 95% CI 1.01–1.34), heart diseases (aHR 1.10, 95% CI 1.02–1.18), and dementia (aHR 1.69, 95% CI 1.40–2.04) were associated with an increased risk for BTI.

Factors associated with severe breakthrough infection

The risk for severe BTI began to rise after 6 months post-vaccination, achieving an almost twofold increased risk for severe BTI in the adjusted analysis (aHR 1.74, 95% CI 1.05–2.90).

Boosted individuals had a threefold lower risk of hospitalization associated with severe BTI (aHR 0.32, 95% CI 0.19–0.54) (see Fig. 1).
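A hazard ratio below 1 can be restated as a percentage reduction in hazard using 1 − HR. The Python sketch below applies this to the booster estimate, taking the interval as 0.19 to 0.54; the function name is hypothetical, and reading the result as a reduction in risk is an approximation that assumes the hazard ratio is a reasonable stand-in for relative risk.

```python
from typing import Tuple


def reduction_with_ci(hr: float, ci_low: float, ci_high: float
                      ) -> Tuple[float, float, float]:
    """Convert a hazard ratio and its CI to percentage reduction in hazard.

    The bounds swap: the lower HR bound gives the upper bound of the reduction.
    """
    def to_pct(x: float) -> float:
        return round((1 - x) * 100, 1)

    return to_pct(hr), to_pct(ci_high), to_pct(ci_low)


booster_effect = reduction_with_ci(0.32, 0.19, 0.54)
```

For the booster estimate this gives a 68% lower hazard of hospitalization, with an interval of 46% to 81%; equivalently, 1/0.32 is about 3.1, which matches the "threefold lower" wording.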

Females had a lower risk of severe BTI, which was expressed by substantially lower risks of hospitalization (aHR 0.68, 95% CI 0.55–0.84). The risk for severe BTI began to rise from the age of 50 years, increasing gradually with every decade of life and reaching a more than sevenfold increase in risk for those aged 80 and older (aHR 7.23, 95% CI 4.36–12.02). A greater CCI score was associated with a greater risk of severe BTI. Thus, vaccinated individuals with CCI scores of 1 or 2 had an approximately twofold risk (aHR 2.09, 95% CI 1.54–2.83) of developing severe disease during BTI, while a CCI score ≥ 3 was associated with a slightly greater risk for severe BTI (aHR 2.23, 95% CI 1.43–3.50). Among comorbidities, renal diseases demonstrated the most pronounced effect on the risk of hospitalization during BTI (aHR 1.92, 95% CI 1.32–2.80); other chronic diseases associated with elevated risk were chronic lung diseases (aHR 1.72, 95% CI 1.28–2.31), cerebrovascular diseases (aHR 1.72, 95% CI 1.11–2.67), diabetes (aHR 1.71, 95% CI 1.29–2.27), cancers (aHR 1.42, 95% CI 1.04–1.94), and heart diseases (aHR 1.34, 95% CI 1.02–1.76).

Discussion

We followed a large, population-based cohort of SARS-CoV-2-vaccinated, infection-naive individuals. The study results demonstrated a remarkable and consistent protective effect of vaccination against severe COVID-19 for up to six months, with the booster dose offering an additional pronounced benefit (aHR 0.32, i.e., a 68% lower hazard of hospitalization than with the primary vaccination series alone, and up to 81% lower at the lower bound of the 95% CI). The risk of severe BTI requiring hospitalisation began to rise at 50 years of age and increased steadily with each subsequent decade of life. Sex-specific differences and an unfavorable impact of chronic diseases, especially those affecting cardiovascular, lung, or renal endothelia, were observed. To our knowledge, this is one of the few retrospective cohort studies aimed at assessing the risk factors for severe COVID-19 among people vaccinated against SARS-CoV-2.

Our analysis revealed that age had a bidirectional effect on BTI: with increasing age, we observed a decreased risk of infection and an increased risk of severe disease (see Fig. 2). This was in concordance with other studies that demonstrated a lower risk of BTI with increasing age35. Porru et al. found a similar effect of age on the risk of BTI in a large European multicentre study36. This finding may be related to the reduced number of social contacts and stricter adherence to pandemic-related restrictions among older people, as well as the expansion of the testing strategy to younger age groups during the second and third waves of the pandemic. The effect of age on the risk of BTI in our study was the opposite of that demonstrated in risk assessment studies conducted in the pre-vaccination era. Additionally, there is some variability in terms of the incidence of BTI in studies conducted during the same period of the pandemic, depending on the prevalent SARS-CoV-2 strain and setting37,38.

Figure 2 Risk for SARS-CoV-2 breakthrough infection and severe COVID-19 according to age of study participants.

Our analysis revealed that while the protective effect of vaccination against breakthrough infection waned over time, the occurrence of severe disease was delayed for at least six months. This finding of a strong protective effect of vaccination against hospitalization is in concordance with other studies, revealing an immense benefit of booster doses on overall vaccine effectiveness during the first two years of the pandemic39,40,41. Compared with a historical cohort42, the COVID-19 hospitalization rate among vaccinated people with BTIs was at least five times lower than in the prevaccination era and during the first two waves of the pandemic. The mortality rate among vaccinated individuals requiring COVID-19 hospitalization in our study was lower than that among unvaccinated people in previous studies43. That, in conjunction with decreased hospitalization risk, makes the effect obtained from the booster dose undeniably important.

We identified important individual characteristics associated with the risk of severe COVID-19 among vaccinated individuals. In our study, males were significantly more susceptible to severe BTI, similar to previous studies that included unvaccinated people44.

Studies released in the pre-vaccination era reported a close relationship between a range of cardiovascular diseases and severe COVID-19, including the pronounced influence of isolated hypertension on the risk of hospitalisation and death in a study of a historical cohort of unvaccinated individuals in Estonia42. Whether this also applies to vaccinated individuals warrants careful exploration. In our study, we did not find an association between hypertension and the risk of COVID-19-related hospitalization. However, those with ischaemic or structural heart disease were at risk for COVID-19 hospitalization. This connection is to be expected, considering the devastating effect of respiratory viruses on the endothelium of small intramyocardial and coronary heart vessels. This process, induced by the response of the innate immune system to viral replication in the host body, plays an important role in the pathogenesis of infectious heart damage, formation of microthrombosis, and coronary vasospasm, which may lead to deterioration of all vital functions in infected organisms45.

Contrary to previous studies that demonstrated the close relationship between dementia and the risk of poor SARS-CoV-2 breakthrough infection outcome, we did not find the same association46. In our study, dementia was a risk factor for breakthrough infection itself but not for severe COVID-19.

Marfella et al. reported that HbA1c ≥ 7% in the postvaccination period is associated with lower immune responses and an increased risk of SARS-CoV-2 breakthrough infections in diabetic patients47. Our study did not show an increased risk of BTI associated with poor glycaemic control in the prevaccination period; instead, we observed a slightly lower risk (aHR 0.85, 95% CI 0.74–0.98). Risk factors for severe outcome among vaccinated individuals in our study generally resembled those reported in risk assessment studies in the prevaccination era48,49.

We assume that population size, racial differences, and discrepancies in testing and hospitalization policies all have a significant impact on the results of different studies. These factors need to be carefully considered when interpreting results across studies.

Based on the data of our study, we can recommend that an additional shot of SARS-CoV-2 vaccine should be given to those at greater risk of a complicated COVID-19 course no later than 6 months after the initial vaccination course. There are people with identifiable characteristics at higher risk for severe BTIs. This can help health authorities establish and manage vaccination schedules.

Limitations and strengths

Our analysis used data derived from the nationwide and population-based universal tax-funded health care system. We believe our findings are robust given the sample size and the completeness of the data obtained from nationwide electronic health data. The findings of our research have good external validity due to universal SARS-CoV-2 PCR testing strategies and the high coverage of the study population by testing.

It is likely that the BTI incidence we describe here is an underestimate. Not all (asymptomatic, mild disease) COVID-19 cases undergo testing, which may lead to an underestimation of the true incidence among those with lower testing propensity. However, we are more confident in the estimate of severe disease from this study, as this is not open to SARS-CoV-2 testing-related misclassification.

Our inclusion of four of the most widely used vaccines for primary and booster vaccinations together increases the generalizability of our findings.

Our study was missing information on cohort members’ lifestyles (i.e., smoking, alcohol consumption, daily physical activity) and medication use. Therefore, our estimates can be affected by residual confounding. However, we attempted to minimize this by adjusting for the most well-documented risk factors for SARS-CoV-2 infection and severe COVID-19 (health, vaccination, sociodemographic characteristics). To alleviate bias related to differences in testing behavior among those with and without breakthrough infections, we used estimated individual testing intensity as a covariate in the model.

During the study period, the alpha and delta strains of SARS-CoV-2 were prevalent in Estonia22. We could not accurately assess the impact of different SARS-CoV-2 variants on the main outcomes. To reduce bias related to differences in the virulence and pathogenicity of different SARS-CoV-2 strains, we accounted for prevalent strains and background infection rates by binning the observation period for each individual by calendar month. This ensured that people at risk in any given period were similar with respect to the strain and infection rate.

Whether our results are also robust in the case of the evolving SARS-CoV-2 strains is currently unclear. Additionally, assessing the need and timing of repeated continuous boosting is an urgent research need.

Vaccination remains an effective tool for preventing severe COVID-19 and its complications, even in the face of emerging variants and breakthrough infections. Combating COVID-19 is an iterative and ongoing process, so monitoring vaccination-related protection is essential to managing the epidemic.

Health authorities should continue to encourage COVID-19 vaccination for everyone who is eligible. Considering the high ability of SARS-CoV-2 to mutate and its associated ability to evade the immune response, it is necessary to adjust government-implemented vaccination strategies based on the latest knowledge about the risk of breakthrough infections. Additionally, a more individualized approach should be taken during the planning of preventive strategies. Additional protective measures and elaborated treatment strategies might be necessary for those at risk of severe breakthrough infection.

Research should focus on exploring additional risk factors that contribute to severe COVID-19 outcomes and on how early interventions alter the risk of severe SARS-CoV-2 breakthrough infection, so that targeted strategies can be planned for those who are most vulnerable.

Conclusion

The protective effect of vaccination against severe COVID-19 remains considerable for up to six months. However, there are still vulnerable groups among vaccinated people who need additional protection due to the ongoing threat of SARS-CoV-2 infection. Reliable data on health inequalities in subgroups are essential for developing local and nationwide COVID-19 preventive measures and treatment guidelines.