Smoking is significantly associated with increased risk of COVID-19 and other respiratory infections

Observational studies suggest smoking, cannabis use, alcohol consumption, and substance use disorders (SUDs) may impact risk for respiratory infections, including coronavirus disease 2019 (COVID-19). However, causal inference is challenging due to comorbid substance use. Using summary-level European ancestry data (>1.7 million participants), we performed single-variable and multivariable Mendelian randomization (MR) to evaluate relationships between substance use behaviors, COVID-19, and other respiratory infections. Genetic liability for smoking demonstrated the strongest associations with COVID-19 infection risk, including the risk for very severe respiratory confirmed COVID-19 (odds ratio (OR) = 2.69, 95% CI, 1.42, 5.10, P-value = 0.002), and COVID-19 infections requiring hospitalization (OR = 3.49, 95% CI, 2.23, 5.44, P-value = 3.74 × 10⁻⁸); these associations generally remained robust in models accounting for other substance use and cardiometabolic risk factors. Smoking was also strongly associated with increased risk of other respiratory infections, including asthma-related pneumonia/sepsis (OR = 3.64, 95% CI, 2.16, 6.11, P-value = 1.07 × 10⁻⁶), chronic lower respiratory diseases (OR = 2.29, 95% CI, 1.80, 2.91, P-value = 1.69 × 10⁻¹¹), and bacterial pneumonia (OR = 2.14, 95% CI, 1.42, 3.24, P-value = 2.84 × 10⁻⁴). We provide strong genetic evidence showing smoking increases the risk for COVID-19 and other respiratory infections even after accounting for other substance use behaviors and cardiometabolic diseases, which suggests that prevention programs aimed at reducing smoking may be important for the COVID-19 pandemic and have substantial public health benefits.

Since the first reported cases in Wuhan, China in December 2019 1 , coronavirus disease 2019 (COVID-19) has affected more than 200 countries and continues to be a global pandemic causing substantial worldwide morbidity and mortality 2,3 . More broadly, upper and lower respiratory infections (URIs and LRIs, respectively) and other respiratory diseases (e.g., asthma and chronic obstructive pulmonary disease (COPD)) are leading causes of yearly worldwide morbidity and mortality 4,5 . For example, the Global Burden of Disease Study estimated that LRIs caused more than two million deaths globally in 2016 4 , while approximately 2.3 million people died from COPD in 2015 5 . Respiratory infections and diseases are also a large economic burden: URIs result in more than 40 million missed days of school and work per year 6 .
Substance use behaviors (tobacco smoking, cannabis use, and alcohol consumption) are risk factors linked with adverse lung and respiratory outcomes [7][8][9] . For example, observational data have shown chronic heavy alcohol consumption to be associated with increased risk for pneumonia 7 and acute respiratory distress syndrome 10 , while cannabis smoke has been shown to contain many of the same toxins and irritants as smoke derived from tobacco 11 , but may differ from tobacco in its association with bronchitis and other respiratory infections 12 . In addition, it has been suggested that chronic alcohol abuse may compromise the ability of immune cells to destroy bacteria in the lungs, which may result in an increased vulnerability to respiratory infections like pneumonia and tuberculosis 13 .
Paralleling the COVID-19 pandemic have been increases in substance use 14 , which combined with data showing approximately 10.8% of US adults suffering from a substance use disorder (SUD) 15 and recent work using electronic health records (EHRs) to show that individuals with a SUD are at increased risk for COVID-19 16 , suggest identifying potential causal relationships between substance use, SUD and respiratory infectious diseases would have substantial public health benefits.
However, observational studies cannot be used to reliably identify causality due to limitations such as residual confounding and reverse causality 17 . For example, outcomes reached from observational studies may be subject to unmeasured confounders like comorbid disorders or underlying genetic differences that may lead to biased estimates, and consequently, may not reflect true causal relationships 18,19 . While randomized controlled trials (RCTs) are considered the "gold standard", RCTs can be both unethical and impractical 20,21 . Constructing an RCT to examine the effect of substance use on respiratory infection risk may be further complicated by other existing comorbidities.
Mendelian randomization (MR) is a genetic approach that uses genetic variants as instrumental variables to explore causal relations between exposures (e.g., alcohol consumption, tobacco smoking, cannabis use) and health outcomes (e.g., respiratory infections and diseases). This technique uses publicly available genome-wide association studies to screen for suitable genetic instrumental variables, which allows researchers to perform MR studies without the need to recruit new patients 22 . Because germline variants are randomly assorted at meiosis, MR may be considered conceptually equivalent to RCTs, though a more naturalized version 19,22 . More specifically, given that genetic instruments cannot be influenced by other confounders (e.g., lifestyle or environmental factors), MR studies are, in theory, less susceptible to confounding or reverse causality than traditional observational studies 23 . Therefore, MR is an important analytical approach to strengthen causal inference when RCTs are challenging due to methodological or ethical constraints 24 . Given the potential for confounding and limited causal inference derived from observational data, we used large, publicly available genome-wide association study (GWAS) data and two-sample MR methods to evaluate the relationships between substance use, substance use disorders (cannabis use disorder (CUD) and alcohol use disorder (AUD)) and respiratory infection and disease outcomes. Our finding that the genetic liability for smoking increases the risk for COVID-19 and several other respiratory infections, even after accounting for other substance use behaviors, builds upon recent literature identifying modifiable risk factors for COVID-19 risk 9,25,26 , and also may inform research and clinical practice given the recent increase in substance use, abuse, and use disorders paralleling the COVID-19 pandemic 14 .
Associations of substance use and SUDs with other respiratory infectious disease risk.
We further assessed the genetic relationships between substance use and respiratory infections. Tables 2 and 3 compare SVMR and MVMR results for asthma-related respiratory infections, bronchitis, and the common cold; Tables 4 and 5 compare SVMR and MVMR results for influenza and pneumonias. Supplementary Data 13-17 contain the full FinnGen results.

As with COVID-19 infection risk results, we found that the genetic liability of lifetime tobacco smoking was the substance use risk factor with the strongest associations, including results that were robust in MVMR models. Tobacco smoking, for example, was associated with increased risk of asthma-related infections and asthma-related pneumonia/sepsis (SVMR OR = 2.52, 95% CI, 1.59, 3.97, P-value = 7.29 × 10⁻⁷; accounting for substance use disorders, MVMR OR = 3.64, 95% CI, 2.16, 6.11, P-value = 1.07 × 10⁻⁶), but for neither bronchitis nor the common cold. As with the smoking-COVID-19 findings, we tested robustness of the smoking-respiratory infection risk results using additional MVMR models that accounted for cardiometabolic disorders (CAD, T2D, and obesity) with evidence for an impact on respiratory infection risk [30][31][32] .

Discussion
Using large summary-level GWAS data and complementary two-sample MR methods, we show that the genetic liability for tobacco smoking has potential causal relationships with several respiratory infection and disease outcomes, including COVID-19. These tobacco smoking-respiratory findings were supported by multivariable MR analyses accounting for alcohol and cannabis use and abuse. Together with the broad consistency between the IVW results and the estimates from the weighted median, weighted mode, and MR Egger sensitivity analyses (which fell within the IVW MR 95% confidence interval but were typically less precise), this strengthens causal inference. Further, in single variable MR, we identify potential adverse impacts of CUD on lower respiratory infections, the common cold, and several asthma-related infections, suggesting evidence for a dose-dependent impact of cannabis use where heavy cannabis use may be harmful to the respiratory system. In parallel, we find little evidence for an alcohol-respiratory infection relationship, suggesting that previous observational data may be due to confounding.
Our COVID-19 results extend recent MR studies showing adverse effects of smoking on COVID-19 risk by accounting for highly comorbid alcohol consumption, cannabis use, and SUDs. When combined with reports suggesting that smoking intensifies the severity of COVID-19 symptoms 33,34 and the risk for being admitted to an intensive care unit or requiring ventilation 34 , and with recent transcriptomics-based work showing that smoking may increase the expression of angiotensin converting enzyme 2 (ACE2), the putative receptor for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (the virus that causes COVID-19) 35 , these results suggest smoking may be an important modifiable risk factor for COVID-19.
Table footnote: Associations reported as odds ratios with 95% confidence intervals. Boldface indicates statistical significance after correction for multiple comparisons (P < 0.0025). Genetic instruments selected from five GWASs, selection threshold P < 5 × 10⁻⁸ or P < 5 × 10⁻⁶ (CUD and AUD), clumped at linkage disequilibrium (LD) r² = 0.001 (10 000 kilobase pair window); N SNPs differs across outcomes depending on number of genetic instruments found in outcome GWASs. CUD cannabis use disorder, AUD alcohol use disorder, COVID-19 coronavirus disease 2019, MR Mendelian randomization, GWAS genome-wide association study, N SNPs number of single-nucleotide polymorphisms (genetic instruments), OR odds ratio, CI confidence interval.
Our genetics-based findings support and extend the observational literature identifying tobacco smoking as a risk factor for respiratory infection and diseases 9,25,26 , and add to the recent MR literature identifying potential causal links of smoking with reduced lung function 36 , lung cancer 37 , and increased mortality due to respiratory disease 38 . Potential mechanisms by which smoking increases respiratory infection risk include structural changes to the respiratory tract and a dysregulated cellular and humoral immune response, including peribronchiolar inflammation, decreased levels of circulating immunoglobulins, and changes to pathogen adherence. For example, smoking has been shown to stimulate the release of catecholamine and corticosteroids, which may, in turn, increase circulating CD8 + lymphocytes and suppress the host defense against infections. Notably, many immunological effects related to smoking may resolve within six weeks of smoking cessation, which suggests that smoking cessation programs may have an important impact on reducing respiratory infections.
Regarding cannabis use, while we failed to find evidence of any relationships, smoking cannabis, like tobacco smoking, may prompt the onset of coughing, which could consequently increase viral transmission, or may possibly exacerbate respiratory symptoms.
As cannabis is the most used drug worldwide (an estimated 188 million recreational users), this aspect of cannabis use may have important implications for the spread of COVID-19. In contrast, the single-variable MR CUD results demonstrated adverse effects on several respiratory outcomes, but not COPD, which supports the existing literature [39][40][41] ; however, accounting for lifetime tobacco smoking attenuated the CUD results, thus highlighting the complex nature of these relationships. Further, habitual cannabis smoking may have several effects on respiratory and immune systems that may impact respiratory infection susceptibility. For example, structural abnormalities in alveolar macrophages and coincident dysregulated cytokine production and antimicrobial activity have been reported. While our study provides preliminary genetic evidence suggesting potential causal relationships between heavy cannabis use and respiratory infection, additional triangulating lines of evidence (e.g., immune monitoring studies) are required to further elucidate the CUD-respiratory infection relationship. However, given that the toxin and irritant profiles of cannabis and tobacco smoke are similar 11 , the direct route of administration via inhalation for these substances could result in dysregulated pulmonary physiology, which may, in turn, increase infection risk.
Table footnote: Associations reported as odds ratios with 95% confidence intervals. Boldface indicates statistical significance after correction for multiple comparisons (P < 0.000714). Genetic instruments selected from 5 GWASs, selection threshold P < 5 × 10⁻⁸ or P < 5 × 10⁻⁶ (CUD and AUD), clumped at linkage disequilibrium (LD) r² = 0.001 (10 000 kilobase pair window); N SNPs differs across outcomes depending on number of genetic instruments found in outcome GWASs. CUD cannabis use disorder, AUD alcohol use disorder, MR Mendelian randomization, GWAS genome-wide association study, N SNPs number of single-nucleotide polymorphisms (genetic instruments), OR odds ratio, CI confidence interval.
In contrast to our tobacco smoking findings, we failed to find genetic evidence of respiratory implications due to alcohol consumption not meeting the threshold of AUD, or binge drinking, suggesting that previous observational literature may be due to confounding from other comorbid behaviors, such as smoking, that may be the true causal risk factors for respiratory infections. For example, observational and genetic evidence have shown a strong association between alcohol consumption and smoking. It has been estimated that 85% of smokers consume alcohol [42][43][44] and alcohol drinkers are 75% more likely than abstainers to smoke 45 . Therefore, it is possible that the observational study-based alcohol-respiratory infection links may be due, instead, to tobacco smoking; however, future work will be needed to confirm this hypothesis. In addition, it is important to note that our results should not be interpreted as suggesting that alcohol does not impact overall lung health and structure, which has been previously reported 7 . Further, while we failed to find evidence that weekly alcohol consumption impacted COVID-19 risk, the Centers for Disease Control recently showed that dining at onsite locations, such as restaurants and bars, is associated with increased COVID-19 risk; since alcohol consumption may lower inhibition and increase impulsivity, individuals consuming alcohol may take social distancing less seriously, and thereby unintentionally spread the SARS-CoV-2 virus.
This study has several strengths, including the use of multiple alcohol consumption and cannabis use variables, which enabled us to evaluate various dimensions of substance use and abuse and identify possible causal relationships of substance use disorders and respiratory outcomes. In addition, our main single variable analyses included multiple MR methods, each relying on orthogonal assumptions, which provide confidence in robustness of the results and strengthen causal inference 46 . Our multivariable two-sample MR design, the most appropriate design given the strong correlation between tobacco smoking, alcohol consumption and cannabis use, yielded estimates that account for these correlated behaviors for each exposure on COVID-19 risk and other respiratory outcomes.
Another strength is our extension of MVMR to test the robustness of the main tobacco smoking findings by incorporating other potential confounders that may impact infectious disease risk (obesity, cardiovascular disease, and T2D).
This study also has limitations. A main limitation is the possibility of collider bias, especially with regard to the COVID-19 datasets 47 . Collider bias may occur when an analysis controls for, or selects the sample based upon, a collider variable that is caused by both the exposure and outcome variables, which distorts the true underlying association 48 50 ; however, as Tattan-Birch et al. discuss, both smoking and COVID-19 may cause coughing, which, during the COVID-19 pandemic, may increase the likelihood for smokers to be tested and their subsequent overrepresentation among clinical study participants testing negative for COVID-19 49 . As a result, among samples with COVID-19 tests, smoking may appear to have a protective effect 49 . While it is often not possible to ensure the absence of collider bias 47 , we aimed to design our study incorporating measures that may mitigate its impact. For example, we used the most recently released version of publicly available COVID-19 data (from January 18, 2021) 51 that may include participants more representative of the general population compared to samples collected earlier in the COVID-19 pandemic. Reassuringly, we also found similar smoking effect estimates in several respiratory-related infection outcomes, which suggests a broader impact of smoking on the respiratory system that extends to COVID-19.
In addition, as with all self-reported substance use literature, these exposures may be either under- or over-reported 52 . Because many of the datasets included UK Biobank participants, who are more educated, lead healthier lifestyles, and have fewer health problems than the UK population 53 , this discrepancy may limit the applicability of our findings to other populations. Regarding our mainly null alcohol-respiratory infection results, it is possible that alcohol may have indirect impacts on infection risk through a modified immune response 54 , or other system dysregulations that may modulate infection risk that we were not able to directly assess using MR. However, like other recent psychiatric MR studies where the exposure instruments included a relaxed statistical threshold, our binge drinking and AUD instruments comprised independent SNPs associated with the respective drinking behavior at a relaxed threshold (P-value < 5 × 10⁻⁶) due to the lack of conventionally genome-wide significant (GWS) SNPs (P-value < 5 × 10⁻⁸) 55,56 , which may impact the results. Because heavy alcohol consumption and AUD have been previously linked with acute respiratory distress syndrome 10 , one of the most severe complications of COVID-19, future studies should re-evaluate the links between heavy alcohol consumption and AUD when better powered GWAS data become available. Further, the included samples comprised primarily white individuals of European ancestry, and research has shown strong racial, ethnic, and socioeconomic disparities in COVID-19 risk and severity [57][58][59] . Therefore, we caution against generalizing these findings and urge future work to investigate these relationships using a genetics-based approach in other populations when the data become available. Another limitation is the overlap of the UKB participants between the alcohol consumption, lifetime smoking, and COVID-19 outcomes, which may bias resulting estimates 60 . However, potential bias would likely be minimal 60 , and it has also been shown that two-sample MR may be used in single samples provided the data are derived from large biobanks, e.g., the UKB and FinnGen 61 . Also, results were largely unchanged when we performed analyses using the COVID-19 endpoints excluding UKB participants, suggesting minimal bias.
In conclusion, our data provide genetic evidence of adverse relationships between smoking and many respiratory-related disease outcomes ranging from the common cold to severe COVID-19, which suggests prevention programs aimed at smoking cessation and prevention may have public health and clinical benefits.

Methods
Data sources and genetic instruments. Summary-level data for both modifiable risk factor instrument and infectious disease outcome data were derived from publicly available GWASs in populations of predominantly European ancestry (Fig. 1; Table 6; Supplementary Data 1). All GWASs have existing ethical permissions from their respective institutional review boards and include participant informed consent with rigorous quality control. For this study, we included all exposure SNPs associated at conventional genome-wide significance (GWS; P < 5 × 10⁻⁸) for smoking, alcohol and cannabis use, and at P < 5 × 10⁻⁶ for AUD and CUD due to the relatively low number of SNPs at GWS, clumped at linkage disequilibrium (LD) r² = 0.001 and a distance of 10,000 kb, using reference samples comprised of participants of European ancestry 62 .
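The exposure-specific significance thresholds described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (their analyses were run in R), the SNP identifiers and P-values are hypothetical, and the LD clumping step is deliberately omitted because it requires a reference panel.

```python
# Minimal sketch of threshold-based instrument selection (LD clumping not shown).
# All SNP records below are hypothetical and for illustration only.

GWS_THRESHOLD = 5e-8      # conventional genome-wide significance
RELAXED_THRESHOLD = 5e-6  # relaxed threshold stated in the text for AUD and CUD
RELAXED_EXPOSURES = {"AUD", "CUD"}


def p_threshold(exposure):
    """Return the selection threshold that applies to a given exposure."""
    return RELAXED_THRESHOLD if exposure in RELAXED_EXPOSURES else GWS_THRESHOLD


def select_instruments(snps):
    """Keep (rsid, exposure, p_value) records passing the exposure's threshold."""
    return [snp for snp in snps if snp[2] < p_threshold(snp[1])]


example_snps = [
    ("rs_a", "smoking", 3e-9),  # passes the GWS threshold
    ("rs_b", "smoking", 2e-7),  # fails the GWS threshold
    ("rs_c", "CUD", 2e-7),      # passes the relaxed threshold
    ("rs_d", "AUD", 1e-5),      # fails the relaxed threshold
]
selected = select_instruments(example_snps)
```

In this toy input, the same P-value of 2 × 10⁻⁷ is rejected for smoking but retained for CUD, which is the practical consequence of the relaxed threshold.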
Tobacco smoking. We included lifetime smoking instruments from the recent GWAS of a lifetime smoking index/score (which combined smoking initiation, duration, heaviness and cessation), conducted in a sample of 462 690 current, former and never smokers in the UKB (mean score value 0.359 (standard deviation (SD) = 0.694); sample: 54% female, mean age 56.7 years, 54% never smokers, 36% former smokers, and 11% current smokers) 63,64 . An SD increase in lifetime smoking index score would be equivalent to smoking 20 cigarettes per day for 15 years and stopping 17 years previously or 60 cigarettes per day for 13 years and stopping 22 years previously 63 (Supplementary Data 2).
Cannabis use. We included two cannabis-related instrument sets: cannabis use and CUD. Summary statistics for lifetime cannabis use (a yes/no variable of whether participants reported using cannabis during their lifetime) were obtained from the PGC meta-analysis GWAS of 3 cohorts (International Cannabis Consortium (35, 65,66 . CUD instruments were obtained from a recent PGC meta-analysis of three cohorts of predominantly European ancestry (PGC, Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH), and deCODE cohorts, excluding related individuals from PGC family-based cohorts; demographics not available), including 14,808 cases of cannabis abuse or dependence defined as meeting DSM-IIIR, DSM-IV, DSM-5, or ICD10 codes (depending on study cohort) criteria; the 358 534 controls were defined as anyone not meeting the criteria 67,68 (Supplementary Data 3).
Alcohol consumption. We included two instrument sets related to alcohol use: drinks per week 69 , and AUD. Drinks per week instruments were obtained from the GSCAN GWAS meta-analysis of 29 cohorts (941 280 individuals; demographics not available) of predominantly white European ancestry 69,70 . Given the varied cohort methods used to measure alcohol consumption (binned, normalized, etc.), the data was log transformed: thus, the effect estimate is measured in log transformed drinks per week 69 (Supplementary Data 4). For the AUD instrument set, we used the Psychiatric Genomics Consortium (PGC) GWAS meta-analysis of 28 cohorts (51.6% female, 8485 cases, 20,657 controls) of predominantly European ancestry 71,72 . AUD was diagnosed by either clinician rating or semi-structured interview using DSM-IV criteria including the presence of at least three of seven alcohol-related symptoms (withdrawal, drinking larger amounts/drinking for longer time, tolerance, desire or attempts to cut down drinking, giving up important activities to drink, time related to drinking, or continued alcohol consumption despite psychological and/or physical problems) 73 (Supplementary  Data 4).
For the multivariable MR (MVMR) analyses, we concatenated independent instrument sets for alcohol use, cannabis use and lifetime smoking, and also AUD, CUD, and lifetime smoking, clumping the resulting two multivariable (MV)
Obesity, coronary artery disease (CAD), and Type 2 Diabetes (T2D) have been identified as risk factors for COVID-19 [27][28][29] , and other respiratory infections [30][31][32] . Therefore, in supplementary sensitivity analyses to further test robustness of the lifetime smoking results, we concatenated independent instrument sets for lifetime smoking and, alternatively, CAD using the CARDIoGRAMplusC4D-UK Biobank CAD (Coronary ARtery DIsease Genome wide Replication and Meta-analysis (CARDIoGRAM) plus The Coronary Artery Disease (C4D) Genetics) GWAS meta-analysis 74,75 ; T2D, using a recent meta-analysis of three T2D studies, i.e., DIAbetes Genetics Replication and Meta-analysis (DIAGRAM), Genetic Epidemiology Research on Aging (GERA) and the full cohort release of UKB 76,77 ; and obesity, using GWASs from GIANT (Genetic Investigation of ANthropometric Traits) 78,79 (see Supplementary Data 1 for more information; Supplementary Data 7).
F statistics for the unconditional instruments were strong (>10, Supplementary  Data 2-4). We were unable to calculate conditional F statistics to assess the strength of the multivariable instrument sets: SVMR statistical methods recently extended to two sample MVMR are appropriate only for non-overlapping exposure summary level data sources. When overlapping, the requisite pairwise covariances between SNP associations are only determinable by using individual level data 80 .
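The text does not state how the unconditional F statistics were computed. As a hedged illustration, the Python sketch below uses a commonly applied per-SNP approximation, F ≈ (β/SE)², together with the conventional F > 10 rule of thumb mentioned above; the function names and input values are assumptions made for this example only.

```python
# Minimal sketch: approximate per-SNP instrument strength from summary statistics.
# Assumption: F is approximated as (beta / se)^2; the source does not confirm
# that this is the formula the authors used.


def approx_f_statistic(beta, se):
    """Approximate F statistic for one SNP-exposure association."""
    return (beta / se) ** 2


def is_strong_instrument(beta, se, threshold=10.0):
    """Apply the conventional F > 10 rule of thumb."""
    return approx_f_statistic(beta, se) > threshold


# Hypothetical SNP-exposure estimates (effect size, standard error).
f_strong = approx_f_statistic(0.05, 0.01)  # approximately 25
f_weak = approx_f_statistic(0.02, 0.01)    # approximately 4
```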
COVID-19 outcomes. We used summary GWAS statistics from the COVID-19 Host Genetics Initiative (COVID-19 hg) meta-analysis round 5a (18 January 2021 release date) (https://www.covid19hg.org/results) 81
Other respiratory infection and disease outcomes. We used data from FinnGen Release 5 (released to public, 11 May 2021) for additional respiratory-related outcomes 82 , including acute upper respiratory infections, asthma-related acute respiratory infections, pneumonia, influenza, bronchitis, chronic lower respiratory diseases, and acute nasopharyngitis (common cold) (N ≤ 218,792) (Fig. 1; Table 6; Supplementary Data 1). FinnGen is a public-private partnership incorporating genetic data for disease endpoints from Finnish biobanks and Finnish health registry EHRs 82 . Detailed documentation is provided on the FinnGen website (https://finngen.gitbook.io/documentation/).
Sample independence. Participant overlap in samples used to estimate genetic associations between exposures and outcomes can increase weak instrument bias (WIB) in MR analyses 60,83 , but to a lesser extent with large biobank samples (including UKB and deCODE). Given the large size of the overlapping cohorts (e.g., UKB, deCODE) (Supplementary Data 1) and the strength of the instruments in 90 , were used to facilitate identification and removal of outlier instruments to correct potential directional horizontal pleiotropy and resolve detected heterogeneity. For SVMR, we also used, alternatively, Generalized single variable Summary-data based MR (GSMR) to identify and remove instruments with heterogeneous causal estimates suspected to be invalid instruments with apparent pleiotropic effects on both exposure and outcome disease (using the recommended default HEIDI (heterogeneity in dependent instruments)-outlier threshold (0.01) to retain sufficient power to detect heterogeneity) 91 .
We used the SVMR Steiger directionality test to test the causal direction between the hypothesized exposure and outcomes 62 . We also performed a leave-one-out analysis to evaluate the potential SNPs within each instrument that may be high influence points 85 .
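To make the leave-one-out idea concrete, the following Python sketch recomputes a fixed-effect inverse-variance weighted (IVW) estimate after dropping each instrument in turn. It is a simplified illustration under stated assumptions, not the authors' implementation: the R packages they cite may use different weights or random-effects adjustments, and all effect sizes below are hypothetical.

```python
import math

# Minimal sketch of fixed-effect IVW and a leave-one-out sensitivity analysis.
# bx: SNP-exposure effects, by: SNP-outcome effects, se_y: SEs of the outcome effects.


def ivw(bx, by, se_y):
    """Fixed-effect IVW estimate and its standard error."""
    numerator = sum(x * y / (s * s) for x, y, s in zip(bx, by, se_y))
    denominator = sum(x * x / (s * s) for x, s in zip(bx, se_y))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def leave_one_out(bx, by, se_y):
    """Recompute the IVW estimate with each SNP removed in turn."""
    results = []
    for dropped in range(len(bx)):
        keep = [i for i in range(len(bx)) if i != dropped]
        estimate, se = ivw(
            [bx[i] for i in keep],
            [by[i] for i in keep],
            [se_y[i] for i in keep],
        )
        results.append((dropped, estimate, se))
    return results


# Hypothetical data in which every SNP implies the same causal effect of 2.
bx = [0.1, 0.2, 0.3]
by = [0.2, 0.4, 0.6]
se_y = [0.05, 0.05, 0.05]

full_estimate, full_se = ivw(bx, by, se_y)
loo_results = leave_one_out(bx, by, se_y)
```

A SNP whose removal shifts the estimate markedly would be flagged as a potential high-influence point; in this toy input no SNP does, because all three imply the same effect.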
For MVMR, in addition to the multivariable extension of the MR PRESSO global test, we used the multivariable extension of the MR Lasso method, which applies lasso-type penalization to the direct effects of the instruments on the outcome disease: the so-called post-lasso estimate is obtained by performing MR IVW using only those instruments identified as valid instruments (tuning parameter specified at default heterogeneity stopping rule) 89 . Analyses were carried out using TwoSampleMR, version 0.5.5 85 , and MendelianRandomization, version 0.5.0, in the R environment, version 4.0.2; the GSMR method was implemented in the GCTA (Genome-wide Complex Trait Analysis) software (https://cnsgenomics.com/software/gcta/#GSMR).
Reported results and interpretation of findings. MR IVW odds ratios (OR) with 95% CI, per unit increase in the exposures (e.g., per unit increase of log-transformed alcoholic drinks per week or lifetime smoking index), with P-values derived from two-sided tests, corrected for outlier or invalid variants, are presented in Tables 1-5. For our COVID-19 analyses, we used a two-sided α of 0.0025 (based on comparing four COVID-19 outcomes and five substance use exposures) and for the other infectious disease outcomes, a threshold of 0.00071 (based on comparing 14 FinnGen infectious respiratory diseases and five substance use exposures) as a heuristic that allows for follow-up analyses for a plausible number of findings. In assessing consistency and robustness, we looked for estimates substantially agreeing in direction and magnitude (overlapping confidence intervals) across the four complementary MR methods used. We evaluate evidence strength based upon the effect magnitude and direction, the 95% confidence interval of that effect, and the P-value.
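The two significance thresholds above are consistent with dividing a family-wise α of 0.05 by the number of outcome-exposure comparisons. The Python sketch below reproduces that arithmetic and shows how a log-odds estimate and its standard error map to an odds ratio with a 95% CI. The family-wise α of 0.05, the normal-approximation multiplier of 1.96, and the function names are assumptions introduced for this illustration.

```python
import math

# Minimal sketch of the multiple-comparison thresholds and OR reporting.
# Assumption: thresholds follow a Bonferroni-style division of alpha = 0.05.


def bonferroni_alpha(n_outcomes, n_exposures, family_alpha=0.05):
    """Per-test threshold for n_outcomes x n_exposures comparisons."""
    return family_alpha / (n_outcomes * n_exposures)


def odds_ratio_ci(log_or, se, z=1.96):
    """Convert a log-odds estimate and SE to (OR, lower, upper) for a 95% CI."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


covid_alpha = bonferroni_alpha(4, 5)     # four COVID-19 outcomes, five exposures
finngen_alpha = bonferroni_alpha(14, 5)  # 14 FinnGen outcomes, five exposures

# Hypothetical null estimate: a log-odds of 0 corresponds to an OR of 1.
null_or, null_lower, null_upper = odds_ratio_ci(0.0, 0.1)
```

Here `covid_alpha` evaluates to 0.0025 and `finngen_alpha` to roughly 0.000714, matching the thresholds stated in the text and in the table footnotes.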
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
All analyses were based upon publicly available data. Single-variable MR and multivariable MR instrument datasets for each substance use behavior required to replicate the findings of this study are available in the Supplemental Data files. Full COVID-19 GWAS summary-level data are available at https://www.covid19hg.org/results/. FinnGen data are available at https://www.finngen.fi/en; lifetime smoking at https://data.bris.ac.uk/data/dataset/10i96zb8gm0j81yz0q6ztei23d; alcohol drinks per week data at https://genome.psych.umn.edu/index.php/GSCAN; cannabis use disorder and alcohol use disorder data are available through the Psychiatric Genomics Consortium data portal: https://www.med.unc.edu/pgc/download-results/; and the cannabis use data are available through the International Cannabis Consortium at https://www.ru.nl/bsi/research/group-pages/substance-use-addiction-food-saf/vm-saf/genetics/international-cannabis-consortium-icc/. Coronary artery disease and obesity summary statistics are available through the Cardiovascular Disease Knowledge Portal: https://cvd.hugeamp.org/. Type 2 Diabetes summary-level data are available through the Type 2 Diabetes Knowledge Portal: https://t2d.hugeamp.org/.