Introduction

The COVID-19 pandemic continues to spread rapidly worldwide. By October 3, 2022, there were more than 618 million confirmed SARS-CoV-2 cases and more than 6.5 million deaths attributable to COVID-191. Researchers have explored potential risk factors to identify patients at high risk of infection or death and to target pharmaceutical and preventive interventions2.

Tobacco use, as a leading risk factor for death and disability due to respiratory diseases, was expected to increase the risk of SARS-CoV-2 infection and of COVID-19 disease progression and death3,4. Smokers generally have an increased risk of other respiratory infections and could be expected to have a higher risk of SARS-CoV-2 infection due to repetitive hand-to-mouth movements, more frequent mask handling5, sharing of cigarettes and vape devices6 and the creation of aerosols that might carry viruses. On the other hand, smokers might have fewer social contacts7 and be less exposed to indoor places8. Studies have also reported mixed findings on whether tobacco or nicotine could modify the expression of ACE2 receptors, which provide a cellular entry point for SARS-CoV-29.

Earlier epidemiological studies showed that smokers were underrepresented among patients hospitalized due to COVID-1910. These results could be explained by the selected nature of the samples or by information bias arising from data collected retrospectively or from electronic health records11,12. However, the most recent meta-analysis, which included more diverse samples and study designs, has confirmed these early findings, showing that current smokers had a lower risk of COVID-19 than never smokers (relative risk 0.67, 95% credible interval 0.60; 0.75)13.

A message of a protective effect of tobacco use could undermine public health efforts to curb its use and reduce the perception of harm in the general population14. Studies with general population samples, with a lower risk of selection bias, are thus urgently needed.

Finland is a pioneer in tobacco control. Since the 1970s, Finland has consistently introduced legislation to increase prices and restrict tobacco availability and marketing15. By 2018, 14% of Finns aged 15 or older were daily smokers and 2.2% were regular e-cigarette users, both indicators below the OECD average16. The prevalence of daily or occasional use of snus was 3% in 201817.

The aim of the study was to examine the association between tobacco use and the risk of having a confirmed COVID-19 case. We explored several forms of tobacco use (smoking, snus, e-cigarettes with and without nicotine and nicotine replacement therapy products) and investigated whether introducing a potential collider bias by adjusting for a mediating risk factor (i.e. body mass index, BMI) could induce a spurious association. We used data from nationally representative health surveys in Finland linked to data on confirmed COVID-19 cases, which is less subject to selection bias than voluntary-based samples18,19.

Methods

We registered the study in ClinicalTrials.gov (NCT04915781). Changes to the protocol are described in detail in the Supplementary Appendix. In brief, the main change was that we analysed the data as cross-sectional (and not as a prospective cohort study) because data from FinSote 2020 was collected after the start of the pandemic. We were not able to adjust for physical activity as a potential collider and decided to exclude alcohol use as a potential collider due to its less clear causal relationship with tobacco use. We merged daily and occasional users of forms of tobacco and nicotine other than smoking to increase statistical power. We report the study in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement20.

Setting and study design

The design is an observational study of pooled cross-sectional population health surveys in Finland linked to data on confirmed cases of COVID-19 (i.e. a positive PCR test for SARS-CoV-2, reported either by a laboratory or by a physician). The study populations were permanent residents in Finland from the FinSote surveys (2018, 2019 and 2020). We linked survey data to confirmed COVID-19 cases from the Communicable Diseases Registry until August 23, 2021, using a unique personal identifier assigned to all Finnish residents.

Data sources

We pooled data from three cross-sectional population health surveys in Finland. The FinSote 2017–2018 Survey was a nationally representative survey of the Finnish population aged 20 years and over. The sampling frame was the Population Register of Statistics Finland21. The survey was based on stratified random sampling. In 2017, 3300 people were invited to participate from each of 18 research areas (2300 adults aged 20–74 and 1000 adults aged 75+, total sample size 59,400). Data was collected between October 2, 2017, and March 3, 2018. Participants received a self-administered questionnaire in Finnish, Swedish, English and Russian, which could be returned on paper or electronically. The participation rate was 45%. We excluded participants who did not provide consent for register linkage, resulting in an analytical sample of 14,736 participants21.

FinSote 2019 was a nationally representative survey of the Finnish population aged 15 and over, which was implemented in conjunction with the European Health Information Survey (EHIS) round 3. The sampling frame was the Population Register of Statistics Finland. The survey was based on a random sample of 15,000 individuals. Participants received a self-administered questionnaire available in Finnish, Swedish and English, which could be returned on paper or electronically. All participants consented to record linkage. The participation rate was 44%, resulting in an analytical sample of 6251 participants22.

FinSote 2020 was a nationally representative survey of the Finnish population aged 20 and over. The sampling frame was the Population Information System from the Digital and Population Data Services Agency, created in January 2020 after the merger of the Population Register of Statistics Finland and local register offices23. The survey was based on a stratified random sample from each of 22 regions (2000 adults aged 20–74 and 800 adults aged 75+, total sample size 61,600). Data collection started on September 14, 2020, and finished on February 8, 2021. Participants received a self-administered questionnaire in Finnish, Swedish, English and Russian, which could be returned on paper or electronically. All participants consented to record linkage. The analytical sample was 28,199 participants, with a participation rate of approximately 46%.
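The invited sample sizes quoted for the two stratified surveys follow directly from the per-area allocations. A minimal sketch of that arithmetic is shown here; the function and variable names are ours and purely illustrative.

```python
# Invited sample size = number of areas x (invitees aged 20-74 + invitees aged 75+).
# All figures are the ones quoted in the text; the names are hypothetical.

def invited_sample(areas: int, per_area_20_74: int, per_area_75_plus: int) -> int:
    """Total number of people invited across all sampling areas."""
    return areas * (per_area_20_74 + per_area_75_plus)

finsote_2018 = invited_sample(areas=18, per_area_20_74=2300, per_area_75_plus=1000)
finsote_2020 = invited_sample(areas=22, per_area_20_74=2000, per_area_75_plus=800)

print(finsote_2018, finsote_2020)
```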

All surveys comply with the Declaration of Helsinki regarding confidentiality, anonymity, and data protection. FinSote surveys were approved by the institutional review board of the Finnish Institute for Health and Welfare (decision THL/637/6.02.01/2017). Informed consent was obtained from all participants.

Outcomes

The primary outcome was a confirmed COVID-19 case, defined here as a case with a positive SARS-CoV-2 RT-PCR, reported either by a laboratory or by a physician as a record of ICD-10 code U07.1 (which requires a positive SARS-CoV-2 RT-PCR), following national guidelines24. Reporting of COVID-19 positive cases was mandatory for laboratories and physicians throughout the study period25. We obtained COVID-19 infection data until August 23, 2021, from the Finnish National Infectious Disease Register maintained by the Finnish Institute for Health and Welfare. Testing for SARS-CoV-2 is free in Finland. Coverage has been extensive, and the total number of tests exceeded 6.4 million by August 23, 202126.

Exposure variables

The exposure of interest was tobacco use. All FinSote surveys had comparable questions on smoking. Comparable questions on snus use were available for people aged 20–74 in FinSote 2018 and 2020 and all participants in FinSote 2019. FinSote 2018 and 2020 included questions for people aged 20–74 about electronic cigarettes with and without nicotine and nicotine replacement therapy products. For smoking, we created a categorical variable with the following categories: never smokers, former smokers, current occasional smokers and current daily smokers. We categorised snus use, e-cigarettes with or without nicotine and nicotine replacement therapy products in a similar way: never users, former users and current users. More details on the specific questions and harmonization strategy can be found in the Supplementary Appendix.

Confounders

We adjusted for covariates that we considered a priori to causally precede the exposure and to be associated with the outcome27. Evidence shows clear associations between sociodemographic factors and COVID-19 incidence in Finland and other settings28,29,30. Tobacco use is also associated with demographic and social factors, including social capital and social participation31. As a result, we adjusted for sex, age, marital status, years of education, mother tongue, and participation in social activities, drawing on the directed acyclic graph shown in Fig. 1. Data from FinSote 2018 and 2019 was collected prior to the COVID-19 pandemic, while data from FinSote 2020 was collected during the pandemic.

Figure 1

Directed acyclic graph of the study. Note: Confounders X* in this study are sex, age, marital status, years of education, mother tongue and participation in social activities. Unobserved mediator-outcome confounders U* include, for example, genetic factors associated with an increased risk of obesity and SARS-CoV-2 infection but not tobacco use. Mediators and potential colliders M include BMI (observed), and other unobserved factors, such as SARS-CoV-2 testing or chronic health conditions caused by tobacco use.

We used age as a continuous variable. We defined marital status as those married, in a registered relationship or cohabiting versus those separated or divorced, widowed or single. We measured years of education as the number of years a person has attended school or studied full time altogether. We obtained information on the participant's mother tongue from national registries and categorized it into Finnish, Swedish and others. We measured participation with a question about participation in the activities of any club, association, hobby group or religious or spiritual community. We categorised participation into the following: no participation, occasional and active.

In addition, given participants living in different regions in Finland have varying risks of COVID-19 due to geographical variation in viral spread and diverging public health and social distancing measures, we included fixed effects for hospital districts (20) to account for these variations. We chose these administrative units because this is the level used to define COVID-19 epidemic phases.

Collider bias

We examined whether inducing potential collider bias (M-bias) by adjusting for other behavioural risk factors could explain earlier results11,12. We tested the effect of collider bias using BMI as an exemplary case. In the case of BMI, several unmeasured factors could induce mediator-outcome confounding and, thus, create collider bias. This includes, for example, genetic factors associated with BMI and risk of COVID-19, but not related to tobacco use. Other potential colliders, such as chronic conditions caused by tobacco use (e.g., chronic bronchitis) and SARS-CoV-2 testing or hospitalizations were not included due to lack of data.
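The mechanism described above can be illustrated with a toy simulation. This is a minimal sketch under our own assumptions, not the study's model: it uses a linear outcome rather than a Poisson model, all variable names and effect sizes are hypothetical, and the exposure is constructed to have no effect on the outcome. An unobserved factor U affects both the mediator M and the outcome Y, so adjusting for M opens a spurious path between exposure S and Y.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, seed=1):
    """Toy data: S -> M <- U -> Y, with no effect of S on Y (hypothetical values)."""
    rng = random.Random(seed)
    s = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]    # exposure
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]                   # unobserved factor
    m = [si + ui + rng.gauss(0.0, 0.5) for si, ui in zip(s, u)]   # mediator / collider
    y = [ui + rng.gauss(0.0, 1.0) for ui in u]                    # outcome
    return s, m, y

def crude_slope(s, y):
    """Coefficient of S in the regression of Y on S alone."""
    return cov(s, y) / cov(s, s)

def adjusted_slope(s, m, y):
    """Coefficient of S in the least-squares regression of Y on S and M."""
    det = cov(s, s) * cov(m, m) - cov(s, m) ** 2
    return (cov(s, y) * cov(m, m) - cov(m, y) * cov(s, m)) / det

s, m, y = simulate()
crude = crude_slope(s, y)           # close to zero: S has no effect on Y
adjusted = adjusted_slope(s, m, y)  # clearly negative: bias induced by adjusting for M
print(f"crude: {crude:.2f}, adjusted for M: {adjusted:.2f}")
```

In this toy setting the adjusted coefficient is spuriously negative even though the exposure is harmless by construction, which is the kind of artefact that adjustment for a collider can produce.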

We calculated body mass index as the self-reported weight (in kg) divided by height (in m) squared and created a continuous variable. We treated as extreme outliers those participants who reported heights outside the range observed in FINRISK 2012, a national health examination survey with standardized measurement techniques (range values 137–218). We excluded those extreme outliers from the analyses.
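The BMI definition and the outlier rule can be sketched as follows. This is an illustration only: the function names are hypothetical, and we assume that the quoted range 137–218 refers to height in centimetres, which the text does not state.

```python
from typing import Optional

# Height range quoted in the text; the centimetre unit is our assumption.
HEIGHT_RANGE_CM = (137.0, 218.0)

def is_height_outlier(height_cm: float) -> bool:
    """True if the self-reported height falls outside the reference range."""
    lower, upper = HEIGHT_RANGE_CM
    return not (lower <= height_cm <= upper)

def bmi(weight_kg: float, height_cm: float) -> Optional[float]:
    """BMI = weight (kg) / height (m)^2; None for excluded extreme outliers."""
    if is_height_outlier(height_cm):
        return None
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

print(bmi(70.0, 175.0))  # a plausible participant
print(bmi(70.0, 120.0))  # height outside the range, so excluded
```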

Risk of bias

The study was observational and we did not have a source of exogenous variation to obtain causal estimates. We relied on the conditional independence assumption to approximate causal estimates32,33, and on a correct definition of confounders to prevent collider bias. The identifying assumption of conditional independence states that after conditioning on a set of observable covariates (i.e., confounders \(X^{*}\)), exposure to tobacco was independent of potential outcomes. In other words, after controlling for confounders, exposure to tobacco was assumed to be randomly assigned. This is a strong assumption, as there might be unobserved factors \(U\) leading to residual confounding. For example, certain personality traits, such as lower risk aversion, could increase the risk of tobacco use and increase the risk of COVID-19 due to lower adherence to social distancing restrictions. Other unobserved factors, such as religion or certain hobbies (e.g. singing in a choir), might also confound the association between tobacco use and COVID-19 infection.
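The conditional independence assumption can be written compactly as follows. The potential-outcome notation \(Y_{i}(s)\), denoting whether individual \(i\) would have a confirmed COVID-19 case under tobacco exposure level \(s\), is our own shorthand and does not appear elsewhere in the paper:

$$Y_{i}(s) \perp S_{i} \mid X_{i}^{*} \quad \text{for every exposure level } s$$

Residual confounding corresponds to the case where this statement holds only after additionally conditioning on the unobserved factors \(U\).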

Statistical analyses

We used Poisson regression with robust standard errors to estimate the relative risk of a confirmed COVID-19 case34. The use of data from the year 2020 (collected after the start of the COVID-19 pandemic) precluded the use of hazard ratios as preregistered. We report analyses of FinSote 2018 and 2019 using time-to-event data in sensitivity analyses as registered in the protocol (see Supplementary Appendix).

We fitted the following Poisson model:

$$\log \mu_{i} = \beta_{0} + \beta_{1} S_{i} + \beta_{x} X_{i}^{*} + \rho_{1} R_{a} \tag{1}$$

where \(i\) denotes the individual; \(\mu_{i}\) is the expected value of the outcome for individual \(i\); \(\beta_{0}\) is the intercept; \(\beta_{1}\) is the coefficient of interest for exposure to tobacco \(S\); \(\beta_{x}\) is the vector of coefficients for the covariates \(X^{*}\) (i.e., sex, age, marital status, years of education, mother tongue and participation in social activities); and \(\rho_{1} R_{a}\) are fixed effects for region \(a\).
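Because model (1) is linear on the log scale, the relative risk follows from exponentiating the coefficient of interest. As a sketch, reading \(\mu_{i}\) as the expected outcome and treating \(S\) as a single indicator for one exposure category versus never users (a simplification of the categorical exposure), with covariates and region held fixed:

$$RR = \frac{\mu_{i}(S_{i} = 1)}{\mu_{i}(S_{i} = 0)} = \frac{\exp(\beta_{0} + \beta_{1} + \beta_{x} X_{i}^{*} + \rho_{1} R_{a})}{\exp(\beta_{0} + \beta_{x} X_{i}^{*} + \rho_{1} R_{a})} = \exp(\beta_{1})$$

For example, a relative risk of 1.68 corresponds to \(\beta_{1} = \log 1.68 \approx 0.52\).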

We tested for non-linearity in the association between continuous variables (age, years of education, and BMI) and the outcome by comparing the linear model with penalized smoothing splines35 using a likelihood ratio test with the Wald method36. The likelihood ratio test showed a better fit using penalized smoothing splines for all three continuous variables. We therefore modelled them using penalized smoothing splines. We tested the possibility of collider bias due to behavioural risk factors by assessing the change in the coefficient \(\beta_{1}\) after adjusting for BMI in the model fully adjusted for confounders.

We carried out three sensitivity analyses: (i) we conducted separate analyses including only FinSote 2018 and 2019, since this data was collected prior to the COVID-19 pandemic and is thus less subject to information bias; (ii) we repeated the analyses in (i) using Cox proportional hazard models to adhere to the registered protocol; and (iii) we repeated the main analyses excluding users of other forms of tobacco (e.g. in analyses of smoking we excluded current users of snus, e-cigarettes with nicotine and nicotine replacement therapy products). More details are provided in the Supplementary Appendix. We carried out an additional post-hoc analysis stratified by vaccination period to explore whether the association between tobacco use and COVID-19 incidence changed after the start of the rollout of COVID-19 vaccinations on December 26, 2020. We restricted these analyses to smoking and snus use due to the small number of cases for the other forms of tobacco use.

We used R version 3.6.3 for all analyses. We used the svyglm function in the survey package to fit the Poisson regression, accounting for the complex sampling design in all analyses. Annotated statistical code can be found in the Supplementary Appendix.

Results

A total of 44,199 participants with recorded smoking status were included in the study (after excluding 4987 participants, 10.1%, with incomplete data). Table S1 shows the proportions of missing data for each variable and survey year. Current daily smokers were more often male, were younger, had fewer years of education and reported less active participation in social activities than never smokers (Table 1). Tables S2-S5 show the baseline characteristics by other forms of tobacco use.
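The participant counts can be cross-checked against the analytical samples reported in the Methods. The sketch below assumes that the 4987 excluded participants are counted against the pooled analytical samples of the three surveys; the text does not say this explicitly, although the arithmetic is consistent with it. Variable names are ours.

```python
# Analytical samples per survey, as reported in the Methods.
analytical_samples = {
    "FinSote 2018": 14736,
    "FinSote 2019": 6251,
    "FinSote 2020": 28199,
}

pooled = sum(analytical_samples.values())   # pooled analytical sample
excluded = 4987                             # participants with incomplete data
included = pooled - excluded                # participants in the analysis
excluded_share = 100.0 * excluded / pooled  # percentage excluded (assumed denominator)

print(pooled, included, round(excluded_share, 1))
```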

Table 1 Baseline characteristics of 44,199 participants of FinSote 2018, 2019 and 2020 by smoking status.

In our full sample, 395 participants with complete data had a confirmed COVID-19 case. Current daily smokers had a relative risk of 1.12 of a confirmed COVID-19 case (95% CI 0.65; 1.94) in the model fully adjusted for confounders compared with never smokers (Table 2). The estimates had wide confidence intervals and were compatible with a large range of associations. The relative risks for current occasional and former smokers were 0.73 (95% CI 0.44; 1.22) and 1.06 (95% CI 0.78; 1.45), respectively, also compatible with a wide range of associations (Table 2).

Table 2 Relative risk of confirmed COVID-19 cases by tobacco use in participants of FinSote surveys.

Current snus use was associated with a 68% higher risk of a confirmed COVID-19 case (RR 1.68, 95% CI 1.02; 2.75) than never users (Table 2). The relative risk for former users was 1.09 (95% CI 0.70; 1.71). There were very few confirmed cases of COVID-19 among current users of e-cigarettes (with or without nicotine) and nicotine replacement therapy products, resulting in very imprecise estimates.
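The reported relative risks can be re-expressed as percentage changes, and an approximate standard error on the log scale can be recovered from a confidence interval, which is the quantity a later meta-analysis would typically need. The following is a minimal sketch with hypothetical helper names; it assumes the intervals are symmetric on the log scale (Wald-type), which the text does not confirm.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def percent_change(rr: float) -> float:
    """Relative risk expressed as a percentage change in risk."""
    return (rr - 1.0) * 100.0

def log_se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Approximate SE of log(RR), assuming a log-symmetric interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def ci_excludes_null(lower: float, upper: float) -> bool:
    """True if the interval does not contain a relative risk of 1."""
    return lower > 1.0 or upper < 1.0

# Estimates quoted in the text (current snus use; current daily smoking).
snus_pct = percent_change(1.68)
snus_se = log_se_from_ci(1.02, 2.75)
print(round(snus_pct), round(snus_se, 3), ci_excludes_null(1.02, 2.75))
print(ci_excludes_null(0.65, 1.94))
```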

Adjusting for BMI as a potential collider resulted in small attenuations of the point estimates but did not substantially change the results (Table 2). For example, the point estimate of the relative risk of COVID-19 for current daily smokers changed from 1.12 in the model fully adjusted for confounders to 1.11 in the model additionally adjusted for BMI.

Sensitivity analyses restricting the data to 2018 and 2019 were not informative due to very imprecise estimates (Table S6). Sensitivity analyses using Cox proportional hazard models yielded almost identical estimates to the Poisson regression models (Table S7). Excluding users of other forms of tobacco resulted in estimates similar to those of the main analyses (Tables S8 and S9). Exploratory post-hoc analyses by vaccination period showed that daily smokers had a lower risk of COVID-19 in the period before COVID-19 vaccination (fully adjusted RR 0.27, 95% CI 0.10; 0.74) and a higher risk of COVID-19 in the period after the rollout of COVID-19 vaccinations (fully adjusted RR 1.52, 95% CI 0.83; 2.79). Current snus users had a higher risk of COVID-19 during both periods, consistent with the main findings.

Discussion

We examined the association between different forms of tobacco use and the risk of having a confirmed COVID-19 case. We did not find evidence that smoking (current daily, occasional and former smoking) is associated with the risk of a COVID-19 case. Our estimates are weakly informative of an increased risk of confirmed COVID-19 among current daily smokers and former smokers and a lower risk of confirmed COVID-19 among current occasional smokers compared to never smokers. Current snus use was associated with a higher risk of having a confirmed COVID-19 case. Results for e-cigarettes (with or without nicotine) and nicotine replacement therapy products were inconclusive. Post-hoc sensitivity analyses showed that smokers had a lower risk of COVID-19 in the period before the start of COVID-19 vaccination rollout and higher risk after the start of COVID-19 vaccinations.

We did not find evidence of an association between smoking and the risk of a confirmed COVID-19 case in the whole study period. The results of the largest living systematic review and meta-analysis to date showed that current smokers had 33% lower risk of SARS-CoV-2 infection than never smokers (RR 0.67, 95% credible intervals 0.60; 0.75)13. However, only seven out of 39 studies included in the meta-analysis were carried out in random or nationally representative samples and were considered of good quality37,38,39,40,41,42,43. All these studies were seroprevalence studies in national or subnational random samples and showed a negative association between current smoking and seroprevalent SARS-CoV-2. However, they reported unadjusted42 or minimally adjusted estimates38,39,41,44 and none controlled for socioeconomic status, making residual confounding highly likely. Our post-hoc findings of the pre-vaccination period are consistent with these findings. In the post-vaccination period, weak evidence of a higher risk of COVID-19 among smokers could be explained by lower vaccine adherence among smokers.

We found that current snus users had a higher risk of confirmed COVID-19. Comparison with previous studies is limited as, to our knowledge, no study has examined the association between snus use and the risk of COVID-19. Snus is a commonly used tobacco product in Nordic countries. While snus sales are banned in Finland, imports are allowed for personal use. In Finland, snus is primarily used by young men17, with higher use among Swedish-speaking Finns and active sports players (especially ice hockey)45. Other forms of nicotine exposure have been examined in few studies. A study in the United States in adolescents and young adults aged 13–24 did not find conclusive evidence of an association between e-cigarettes and COVID-19 positive diagnosis (OR 1.9, 95% CI 0.8; 4.7)46. Dual users (i.e. of cigarettes and e-cigarettes) had 6.8 times higher odds of COVID-19 diagnosis (95% CI 2.4; 19.6)46. Another study in the United Kingdom also did not find conclusive results on the association between current e-cigarette use and diagnosed or suspected COVID-1947. Our results suggest that nicotine does not play a protective role in the risk of infection from SARS-CoV-2. While we cannot rule out the existence of biological effects of snus use, our findings suggest that social and environmental mechanisms increasing viral exposure outweigh these hypothetical protective effects.

Adjusting for BMI as a potential collider did not substantially change the results. In a large population study in the United Kingdom, adjusting for BMI, smoking, index of multiple deprivation, and comorbidities resulted in a change in direction of the effect (hazard ratios changed from 1.14 to 0.89)2. Post-hoc analyses showed that this was mainly due to adjusting for chronic respiratory disease, which is a consequence of smoking and might have induced collider bias2. We were only able to adjust for BMI, and not for chronic conditions, which might explain why we did not observe a similar change.

Our results come from a setting of relatively low COVID-19 viral spread. During the first wave in 2020, Finland introduced a nationwide closure of schools, restaurants and mass gatherings, without resorting to statutory lockdown policies. In May 2020, the country transitioned to a so-called “hybrid strategy” which included broad testing-tracing-isolation actions, targeted regional measures and vaccination rollout (which started on December 26, 2020)48. This should be considered when assessing the external validity of our findings, which are mostly generalizable to other Nordic countries and high-income countries with low levels of tobacco use and COVID-19 viral spread. In addition, as suggested by our exploratory analyses by vaccination period, it is possible that the risk of COVID-19 among tobacco users varied over time. Further studies should explore this hypothesis in more depth.

Major strengths of our study include the use of pooled nationally representative data and the relatively large sample size. The sampling frame includes all permanent residents in Finland, including people living in institutions and conscripts, which reduces the risk of selection bias. The outcome is measured using standardized techniques and case definitions, and we consider it to have a lower risk of misclassification bias. Finally, we were able to control for a larger set of confounders than previous studies, reducing (although not eliminating) the risk of residual confounding.

However, some limitations should be noted. First, there were very few COVID-19 cases in our data, resulting in low statistical power to detect a weak association between smoking and the risk of COVID-19. This reflects the fact that Finland has been relatively effective in controlling the viral spread during the COVID-19 pandemic. Up to August 23, 2021, Finland had fewer than 125,000 cases1. We consider, however, that the study is worth doing even if the results are only weakly informative, as we are exploring several other forms of tobacco and nicotine use and our estimates can be meta-analysed with future studies to obtain more precise estimates49. In addition, COVID-19 cases are likely underestimated. This underestimation was larger during the first months of the COVID-19 pandemic, when testing was restricted to health workers and hospitalized patients. As part of a change to a hybrid policy, testing was expanded significantly in May 202050. As a result, positivity rates went from 9.7% in March 2020 to 2.0% in May 2020 and remained below 4% throughout the study period51. Underestimation would only bias our results if exposure groups differed in their access to or use of testing. We consider this possibility to be low, as testing was accessible and free of charge, but we cannot rule it out. Second, part of our data was collected during the COVID-19 pandemic and, in some cases, after participants had had COVID-19, which might have influenced their responses to the questions on smoking and other behavioural risk factors. Our sensitivity analyses restricted to 2018 and 2019 were not informative due to very imprecise estimates. Third, given the time lag between exposure and outcome assessment, it is possible that some current smokers in 2018 or 2019 might have quit smoking at the start of the pandemic, introducing misclassification bias in the measurement of exposure. This bias, however, is likely to be small as the time lag is relatively short. Fourth, we were not able to adhere to our pre-registered analysis plan of analysing the data as time-to-event data. However, SARS-CoV-2 is a new pathogen and participants were considered at-risk at the start of the pandemic in Finland (i.e., February 27), creating a unique situation where all participants who did not experience the outcome have identical follow-up time. This leads to a discrete Cox model that in practice provides similar estimates to our current analysis52.

Conclusions

We did not find conclusive evidence of an association between current smoking and the risk of a COVID-19 case. Current snus users had a higher risk of a confirmed COVID-19 case. Our findings suggest that nicotine might not have a protective role in the risk of SARS-CoV-2 infection as previously hypothesized. Future research could use instrumental variable designs, such as Mendelian randomization, to obtain more robust causal effects, and explore the role of COVID-19 vaccinations on the association between tobacco use and risk of COVID-19 and adverse outcomes.