Health effects associated with vegetable consumption: a Burden of Proof study

Previous research suggests a protective effect of vegetable consumption against chronic disease, but the quality of evidence underlying those findings remains uncertain. We applied a Bayesian meta-regression tool to estimate the mean risk function and quantify the quality of evidence for associations between vegetable consumption and ischemic heart disease (IHD), ischemic stroke, hemorrhagic stroke, type 2 diabetes and esophageal cancer. Increasing from no vegetable consumption to the theoretical minimum risk exposure level (306–372 g daily) was associated with a 23.2% decline (95% uncertainty interval, including between-study heterogeneity: 16.4–29.4) in ischemic stroke risk; a 22.9% (13.6–31.3) decline in IHD risk; a 15.9% (1.7–28.1) decline in hemorrhagic stroke risk; a 28.5% (−0.02–51.4) decline in esophageal cancer risk; and a 26.1% (−3.6–48.3) decline in type 2 diabetes risk. We found statistically significant protective effects of vegetable consumption for ischemic stroke (three stars), IHD (two stars), hemorrhagic stroke (two stars) and esophageal cancer (two stars). Including between-study heterogeneity, we did not detect a significant association with type 2 diabetes, corresponding to a one-star rating. Although current evidence supports increased efforts and policies to promote vegetable consumption, remaining uncertainties suggest the need for continued research.
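The percentage declines above can be restated on the relative risk (RR) scale. The sketch below is illustrative only and is not part of the study's analytic code; the function name `decline_to_rr` is hypothetical. It assumes the simple relation RR = 1 - decline/100, so the bounds of the uncertainty interval swap when moving to the RR scale.

```python
def decline_to_rr(decline_pct, lower_pct, upper_pct):
    """Convert a percent decline in risk (with its interval) to the RR scale.

    Assumes RR = 1 - decline / 100. A larger decline means a smaller RR,
    so the upper bound of the decline becomes the lower bound of the RR.
    """
    rr = 1 - decline_pct / 100
    rr_lower = 1 - upper_pct / 100
    rr_upper = 1 - lower_pct / 100
    return rr, rr_lower, rr_upper


# Ischemic stroke figures quoted in the abstract: 23.2% (16.4-29.4)
rr, rr_lower, rr_upper = decline_to_rr(23.2, 16.4, 29.4)
print(f"RR = {rr:.3f} ({rr_lower:.3f}-{rr_upper:.3f})")
```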

A search string was developed to identify sources published after the period covered in the Bechtold et al. 2019 study. PubMed, EMBASE, and Web of Science were searched on May 31, 2022. The search was filtered for publication dates beginning March 1, 2017, the cutoff date for the search in the Bechtold et al. 2019 study. The search string included terms for each stroke subtype, and sources were screened simultaneously for both stroke subtypes of interest. The searches returned 356 total citations, of which 2 records were ultimately extracted and included in the model. As described in the section on IHD, search results were deduplicated and screened by two reviewers, with each reviewer verifying the other's exclusions at both the title-abstract and full-text screening steps.
Search strings for stroke subtypes: ("Vegetables" [MeSH Terms] OR "Vegetable*"[Title/Abstract] OR "green leafy vegetable*" [All Fields] [Title/Abstract] OR "haemorrhage" [Title/Abstract])))).
Diabetes: We developed a PubMed search string to identify the most recent PRISMA-compliant meta-analysis on vegetable consumption and type 2 diabetes. The meta-analysis that fulfilled our criteria for exposure definition (total vegetable consumption), outcome definition (type 2 diabetes), and PRISMA compliance was Halvorsen et al. 2021. A total of 51 citations of this meta-analysis were identified and screened. From this meta-analysis, we included a total of 13 studies in the final analysis. The last date this meta-analysis covered was October 20, 2020.

Search strings for diabetes
Identified meta-analysis "Vegetables" [Mesh]
Another search string was developed to identify sources published after the period covered in Halvorsen et al. 2021. PubMed, EMBASE, and Web of Science were searched on May 31, 2022. The searches of the three databases returned 262 total hits, of which 3 studies were ultimately extracted and included in the model. As with the other risk-outcome reviews, after deduplication, two reviewers screened the results independently, with all full-text exclusions and 20% of title-abstract exclusions verified by both reviewers. The exclusion reasons by source count are reported in the PRISMA diagram (Extended Data Figure 4). We also searched the Global Health Data Exchange database to identify studies not captured by the citation search of the selected meta-analysis or by the three database searches.

("Vegetables"[Mesh] OR "Vegetables" [Title/Abstract] OR "green leafy vegetables" OR "leafy vegetables" OR "Cruciferous vegetables" OR "Fruits and Vegetables" [Title/Abstract]) AND ("Cohort" [Publication type] OR "Prospective cohort" [Title/Abstract] OR "case-cohort" OR "Follow-up" OR "Longitudinal" [Title/Abstract]) AND ("Diabetes Mellitus, Type 2" [Mesh] OR "diabetes mellitus type 2" [Title/Abstract] OR "diabetes type 2" [Title/Abstract] OR "type 2 diabetes mellitus" [Title/Abstract] OR "type 2 diabetes" [Title/Abstract] OR "noninsulin dependent diabetes" [Title/Abstract] OR "adult-onset diabetes" OR "Diabetes Mellitus" OR "T2D"[Title/Abstract])
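The string above follows a three-block pattern: exposure terms, study-design terms, and outcome terms, each joined internally with OR and combined with AND. A minimal sketch of assembling such a string is shown below. The helper names `or_block` and `build_query` are hypothetical, and the term lists are abbreviated for illustration rather than a complete copy of the search string.

```python
def or_block(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*blocks):
    """Combine concept blocks with AND (exposure AND design AND outcome)."""
    return " AND ".join(or_block(block) for block in blocks)


# Abbreviated term lists, taken from the search string above
exposure = ['"Vegetables"[Mesh]', '"Vegetables" [Title/Abstract]']
design = ['"Cohort" [Publication type]', '"Prospective cohort" [Title/Abstract]']
outcome = ['"Diabetes Mellitus, Type 2" [Mesh]', '"T2D"[Title/Abstract]']

query = build_query(exposure, design, outcome)
print(query)
```

Building the string from lists makes it easier to confirm that the parentheses balance and that each concept block is complete before the query is submitted to a database.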
Esophageal cancer: We conducted a full systematic review with no date constraint because we did not find a meta-analysis that fulfilled our selection criteria (i.e., PRISMA-compliant and matching our definitions of exposure and outcome). A search string was developed to identify all sources published with no date constraint. PubMed, EMBASE, and Web of Science were searched on June 16, 2022. The searches returned 214 citations, and after deduplication by DOI and PMID, there were 151 unique records. Each record was screened on title and abstract by one reviewer, finding 20 inclusions. As a sensitivity check, 20% of the 133 exclusions were validated by a second reviewer. One discrepancy was found and was settled by moving the record to full-text review for additional evaluation. During the full-text review of the 20 sources, 6 met inclusion criteria and 14 were excluded. Each full-text exclusion was verified by both reviewers. Exclusion reasons by source count are described in the PRISMA flow chart (Extended Data Figure 3).
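The deduplication and second-reviewer check described above can be sketched as follows. This is an illustrative sketch, not the tooling used in the review: the record layout (dictionaries with `doi` and `pmid` keys), the function names, and the fixed random seed are all assumptions.

```python
import math
import random


def deduplicate(records):
    """Keep the first record seen for each DOI or PMID.

    A record counts as a duplicate if its DOI (case-insensitive) or its
    PMID matches one already kept. Records lacking both are always kept.
    """
    seen_dois, seen_pmids = set(), set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        pmid = (rec.get("pmid") or "").strip()
        if (doi and doi in seen_dois) or (pmid and pmid in seen_pmids):
            continue
        if doi:
            seen_dois.add(doi)
        if pmid:
            seen_pmids.add(pmid)
        unique.append(rec)
    return unique


def validation_sample(exclusions, fraction=0.2, seed=0):
    """Draw a random share of exclusions for a second reviewer to verify."""
    k = math.ceil(len(exclusions) * fraction)
    return random.Random(seed).sample(exclusions, k)


records = [
    {"doi": "10.1000/abc", "pmid": "111"},
    {"doi": "10.1000/ABC", "pmid": None},   # same DOI, different case
    {"doi": None, "pmid": "111"},           # same PMID, no DOI
    {"doi": "10.1000/xyz", "pmid": "222"},
]
unique = deduplicate(records)
to_verify = validation_sample(list(range(10)))
print(len(unique), len(to_verify))
```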

Vegetable and esophageal cancer search strings: ("Vegetables"[Mesh] OR "Vegetables" [Title/Abstract] OR "green leafy vegetables" OR "leafy vegetables" OR "Cruciferous vegetables" OR "Fruits and Vegetables" [Title/Abstract]) AND ("Cohort" [Publication type] OR "Prospective cohort" [Title/Abstract] OR "case-cohort" OR "Follow-up" OR "Longitudinal" [Title/Abstract]) AND ("Esophageal cancer" [Mesh] OR "Esophageal cancer" [Title/Abstract] OR "Esophageal squamous cell carcinoma" [Title/Abstract] OR "esophageal adenocarcinoma" [Title/Abstract]
Section 1.2: Assessing data source eligibility
See Extended Data Figures 1-4 for details on identifying, screening, and assessing eligibility for records identified through our search.
Inclusion criteria
- Reported a relative risk of total vegetable consumption and at least one of the five outcomes
- Included a measure of uncertainty for the effect size measure
- Quantified the amount of vegetable consumption in the reference and alternate group
- Prospective cohort study or nested case-control or case-cohort study
Exclusion criteria
- Were an aggregate study: meta-analysis or pooled cohort
- Wrong study type: not a cohort study or case-cohort study
- Duplicate study: cohort reported in paper was also reported elsewhere
- Unmeasurable exposure: reported vegetable consumption without grams or servings equivalent, such as in aggregated "diet scores"
- No measure of interest: reported RR for change in vegetable consumption or does not report RR
- No exposure of interest: did not report any vegetable exposure or only reported a specific vegetable subtype
- No outcome of interest: reported on all-cause mortality or an outcome outside of the five outcomes of interest studied in this paper. This includes outcomes lacking specificity such as total stroke or cardiovascular disease
- Not in English
- Non-general population: study population defined by comorbidity or other traits that could interact with exposure and affect outcome
For reports that met the inclusion criteria, data were extracted for the variables listed in Supplemental Table 4.
Reported in: Methods section "conducting systematic reviews": paragraph 2; SI section 1.1
Selection process (item 8): Specify the methods used to decide whether a study met the inclusion criteria of the review, including how many reviewers screened each record and each report retrieved, whether they worked independently, and if applicable, details of automation tools used in the process.
Reported in: Methods section "conducting systematic reviews" paragraph 2-3
Data collection process (item 9): Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and if applicable, details of automation tools used in the process.
Reported in: Methods section "conducting systematic reviews" paragraph 2-3
Data items (item 10a): List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (e.g. for all measures, time points, analyses), and if not, the methods used to decide which results to collect.
Reported in: Methods section "conducting systematic reviews"
Data items (item 10b): List and define all other variables for which data were sought (e.g. participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information.
Reported in: Methods section "data" paragraph 2; full list and definitions of all variables are in Supplemental Table 5; study characteristics for each included study are also listed in Supplemental Table 3
Study risk of bias assessment (item 11): Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and if applicable, details of automation tools used in the process.
Reported in: Methods section "testing and adjusting for biases across study designs and characteristics"
Effect measures (item 12): Specify for each outcome the effect measure(s) (e.g. risk ratio, mean difference) used in the synthesis or presentation of results.
Reported in: Methods sections "overview" paragraph 2, "estimating the burden of proof risk function"
Synthesis methods (item 13a): Describe the processes used to decide which studies were eligible for each synthesis (e.g. tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)).
Reported in: Methods section "data"
Synthesis methods (item 13b): Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics, or data conversions.
Reported in: Methods section "conducting systematic reviews"
Synthesis methods (item 13c): Describe any methods used to tabulate or visually display results of individual studies and syntheses.
Reported in: Methods sections "conducting systematic reviews," "estimating the shape of the risk-outcome relationship"
Synthesis methods (item 13d): Describe any methods used to synthesize results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used.
Reported in: Methods sections "Estimating the shape of the risk-outcome relationship," "estimating the TMREL/minimum risk exposure level," and "estimating the burden of proof risk function". Software packages described in "code availability" section of the manuscript
Synthesis methods (item 13e): Describe any methods used to explore possible causes of heterogeneity among study results (e.g. subgroup analysis, meta-regression).
Reported in: Methods section "evaluating between-study heterogeneity, uncertainty, and small numbers of studies"
Synthesis methods (item 13f): Describe any sensitivity analyses conducted to assess robustness of the synthesized results.
Reported in: Methods section "sensitivity analyses"; SI section 8
Reporting bias assessment (item 14): Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases).
Reported in: Methods for detecting publication or reporting bias found in methods section "evaluating potential for publication or reporting bias"
Certainty assessment (item 15): Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome.
Reported in: Methods section "evaluating between-study heterogeneity, uncertainty, and small numbers of studies"
RESULTS
Study selection (item 16a): Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram.
Reported in: PRISMA flow diagram for each risk-outcome pair (Extended Data Figures 1-4); first paragraph of each risk-outcome pair results section + the results "overview"
Study selection (item 16b): Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded.
Reported in: N/A
Study characteristics (item 17): Cite each included study and present its characteristics.
Reported in: SI section 3, Supplemental Table 3 ("study characteristics"); citations also available for download from the online viz tools (https://vizhub.healthdata.org/burden-of-proof/)
Risk of bias in studies (item 18): Present assessments of risk of bias for each included study.
Reported in: Supplemental Table 6
Results of individual studies (item 19): For all outcomes, present, for each study: (a) summary statistics for each group (where appropriate) and (b) an effect estimate and its precision (e.g. confidence/credible interval), ideally using structured tables or plots.
Reported in: Supplemental Table 7
Results of syntheses (item 20a): For each synthesis, briefly summarise the characteristics and risk of bias among contributing studies.
Reported in: First paragraph and last sentence of each risk-outcome pair results section
Results of syntheses (item 20b): Present results of all statistical syntheses conducted. If meta-analysis was done, present for each the summary estimate and its precision (e.g. confidence/credible interval) and measures of statistical heterogeneity. If comparing groups, describe the direction of the effect.
Reported in: Second paragraph of each risk-outcome pair results section + section titled "minimum risk level of vegetable intake"; Figures 1-5
Results of syntheses (item 20c): Present results of all investigations of possible causes of heterogeneity among study results.
Reported in: All uncertainty intervals presented everywhere in the manuscript and appendices reflect between-study heterogeneity (unless specified otherwise); BPRFs, ROSs, and star ratings for each risk-outcome pair also reflect between-study heterogeneity
Results of syntheses (item 20d): Present results of all sensitivity analyses conducted to assess the robustness of the synthesized results.

Discussion
Discussion (item 23a): Provide a general interpretation of the results in the context of other evidence.
Reported in: Discussion paragraphs 2, 6-7
Discussion (item 23b): Discuss any limitations of the evidence included in the review.
Reported in: Discussion paragraph 5
Discussion (item 23c): Discuss any limitations of the review processes used.
Reported in: Discussion paragraph 5
Discussion (item 23d): Discuss implications of the results for practice, policy, and future research.
Reported in: Discussion paragraphs 6-8
OTHER INFORMATION
Registration and protocol (item 24a): Provide registration information for the review, including register name and registration number, or state that the review was not registered.
Reported in: This systematic review was not registered, as stated in paragraph 3 of the methods overview
Registration and protocol (item 24b): Indicate where the review protocol can be accessed, or state that a protocol was not prepared.
Reported in: This systematic review was not registered, as stated in paragraph 3 of the methods overview
Registration and protocol (item 24c): Describe and explain any amendments to information provided at registration or in the protocol.
Reported in: This systematic review was not registered, as stated in paragraph 3 of the methods overview
Support (item 25): Describe sources of financial or non-financial support for the review, and the role of the funders or sponsors in the review.
Reported in: "Acknowledgments" section of the manuscript
Competing interests (item 26): Declare any competing interests of review authors.
Reported in: "Competing interests" section of the manuscript
Availability of data, code and other materials (item 27): Report which of the following are publicly available and where they can be found: template data collection forms; data extracted from included studies; data used for all analyses; analytic code; any other materials used in the review.
Reported in: "Data availability" and "code availability" sections in the manuscript; data collection form template: Supplemental Table 5
Supplementary
Reported: Per the journal's request, the title does not include "systematic review". It is, however, in the title of the methods section, "conducting systematic reviews"
BACKGROUND
Objectives (item 2): Provide an explicit statement of the main objective(s) or question(s) the review addresses.

METHODS
Eligibility criteria (item 3): Specify the inclusion and exclusion criteria for the review.
Reported: Not in abstract, just main text (given word count limitations by the journal)
Information sources (item 4): Specify the information sources (e.g. databases, registers) used to identify studies and the date when each was last searched.
Reported: Not in abstract, just main text (given word count limitations by the journal)
Risk of bias (item 5): Specify the methods used to assess risk of bias in the included studies.
Reported: Not in abstract, just main text (given word count limitations by the journal)
Synthesis of results (item 6): Specify the methods used to present and synthesise results.
Reported: Yes
For all data inputs from multiple sources that are synthesized as part of the study:

Included
Item 3: Describe how the data were identified and how the data were accessed.
Reported in: Main text methods section "conducting systematic reviews"
Item 4: Specify the inclusion and exclusion criteria. Identify all ad-hoc exclusions.
Reported in: Main text methods section "conducting systematic reviews"; reasons for exclusion and number of studies excluded also provided in PRISMA flow diagram (Extended Data Figures 1-4)
Item 5: Provide information on all included data sources and their main characteristics. For each data source used, report reference information or contact name/institution, population represented, data collection method, year(s) of data collection, sex and age range, diagnostic criteria or measurement method, and sample size, as relevant.
Reported in: SI section 3, Supplemental Table 3 ("study characteristics"); citations also available for download from the online viz tools: https://vizhub.healthdata.org/burden-of-proof/
Item 6: Identify and describe any categories of input data that have potentially important biases (e.g., based on characteristics listed in item 5).
Reported in: Main text methods section "testing and adjusting for biases across study designs and characteristics" and "evaluating potential for publication or reporting bias"
For data inputs that contribute to the analysis but were not synthesized as part of the study:
Item 7: Describe and give sources for any other data inputs.
Reported in: N/A
For all data inputs:
Item 8: Provide all data inputs in a file format from which data can be efficiently extracted (e.g., a spreadsheet rather than a PDF), including all relevant metadata listed in item 5. For any data inputs that cannot be shared because of ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data.
Reported in: As stated in the Data Availability Statement, data inputs in Excel format are available for download from the online viz tools: https://vizhub.healthdata.org/burden-of-proof/
Data analysis
Item 9: Provide a conceptual overview of the data analysis method. A diagram may be helpful.
Reported in: Main text methods overview, paragraph 1
Item 10: Provide a detailed description of all steps of the analysis, including mathematical formulae. This description should cover, as relevant, data cleaning, data pre-processing, data adjustments and weighting of data sources, and mathematical or statistical model(s).

Item 11: Describe how candidate models were evaluated and how the final model(s) were selected.
Reported in: Main text methods "estimating the shape of the exposure-relative risk relationship" paragraph 2

Exposure
exp_assess_level: Level of exposure assessment: The exposure was assessed…
exp_instrument: Exposure assessment instrument: Specify the name of the exposure assessment instrument. For self-reported exposures, please specify the name of the questionnaire, e.g., International Physical Activity Questionnaire (IPAQ). If more than one instrument, specify all.
exp_assess_period: What was the frequency of exposure assessment?
exp_assess_num: If multiple, specify the number of times that exposure was assessed (excluding baseline).
exp_method_1: Please specify the method of exposure assessment. If there is more than 1, please add in the next column, labeled "exp_method_2".
exp_method_2: Please specify the method of exposure assessment. If there are more than 2, please add in the next column, labeled "exp_method_3".
exp_method_3: Please specify the method of exposure assessment.
exp_recall_period: This field describes the unit of exposure recall used in data collection ONLY for self-report. Select the correct option from the drop-down menu. If the unit is days, weeks, months, or years, please enter the number in exp_recall_period_value (next column). If the unit is 'lifetime', nothing needs to be entered in exp_recall_period_value. For example, if the study said the recall period was 4 weeks, enter 4 in exp_recall_period_value, and 'weeks' in the field exp_recall_period. If 'other' is selected, please describe in exp_recall_period_other.
exp_recall_period_value: If you entered days, weeks, months, or years in the field 'exp_recall_period', please enter the corresponding integer in this field. For example, if the study said the recall period was 4 weeks, enter 4 in exp_recall_period_value, and 'weeks' in the field exp_recall_period.
exp_recall_period_other: If 'other' was selected in exp_recall_period, please describe the exposure recall period that the study specified (e.g., recall of exposure from 12 to 18 years).
exp_type: Which form of the exposure was included in relative risk estimation analysis?
Outcome
outcome_def: Outcome definition: Provide a brief description of the outcome as reported in the study.
outcome_type: Outcome type: Please specify if the outcome definition included incidence of or mortality from a disease endpoint.
outcome_assess_1: Method of outcome assessment: Specify the method of assessment of the study outcome. If more than 1 is appropriate, enter additional methods in the next column, labeled "outcome_assess_2".
outcome_assess_2: Method of outcome assessment: Specify the method of assessment of the study outcome. If more than 2 are appropriate, enter additional methods in the next column, labeled "outcome_assess_3".
outcome_assess_3: Method of outcome assessment: Specify the method of assessment of the study outcome.
dose_response: Does the study support a dose-response relationship between the exposure and the outcome? (1=yes, 0=no)
dose_response_detail: If "1" was specified in the dose_response field, please specify in this field the type of evidence supporting the dose-response relationship. For example, "statistically significant p value for linear trend".
Cohorts
cohort_person_years_exp: Please specify the person-years of follow-up in the exposed group.
cohort_person_years_unexp: Please specify the person-years of follow-up in the unexposed group.
cohort_person_years_total: Enter the total person-years of follow-up if person-years of follow-up in exposed and unexposed are not reported.
cohort_number_events_exp: Please specify the number of events in the exposed group.
cohort_number_events_unexp: Please specify the number of events in the unexposed group.
cohort_number_events_total: Enter the total number of events/cases if number of events in exposed and unexposed are not reported.
cohort_sample_size_exp: Please specify the number of people in the exposed group if person-years of follow-up in exposed are not reported.
cohort_sample_size_unexp: Please specify the number of people in the unexposed group if person-years of follow-up in unexposed are not reported.
cohort_sample_size_total: Please specify the number of people included in the analysis if total person-years of follow-up is not reported.
cohort_dropout_rate: Dropout rate: Specify the dropout rate (%) at the end of the study. Enter on a "per 1" basis. For example: 23% is entered as .23.
cohort_dropout_assess: Specify how dropout rate was defined in the study.
cohort_exposed_def: Exposed group definition: Provide a brief description of the exposed group (i.e., the comparison group) as used in estimation of the relative risk (e.g., never smokers).
cohort_exp_unit_rr: Exposure unit (for continuous risks): Specify the unit of exposure (e.g., grams/day).
cohort_exp_level_rr: Exposure level in the exposed group (for continuous risks): Specify the mean/median level of exposure in the exposed group.
cohort_unexp_def: Unexposed group definition: Provide a brief description of the unexposed group (i.e., the comparison group) as used in estimation of the relative risk (e.g., never smokers).
cohort_unexp_unit_rr: Exposure unit (for continuous risks): Specify the unit of exposure (e.g., grams/day) for the unexposed group.
cohort_unexp_level_rr: Exposure level in the unexposed group (for continuous risks): Specify the mean/median level of exposure in the unexposed group.
cohort_exp_level_dr: Exposure level for dose-response RRs (for continuous risks): If the study reports dose-response RR, please specify the level of exposure for the reported RR.
Case-control
cc_community: Were the controls selected from the community? 1=yes, 0=no
cc_cases: Number of cases
cc_control: Number of controls
cc_exposed_def: Exposed group definition: Provide a brief description of the exposed group for which the relative risk is reported (e.g., current smokers).
cc_exp_unit_rr: Exposure unit (for continuous risks): Specify the unit of exposure (e.g., grams/day).
cc_exp_level_rr: Exposure level in the exposed group (for continuous risks): Specify the mean/median level of exposure in the exposed group.
cc_unexposed_def: Unexposed group definition: Provide a brief description of the unexposed group (i.e., the comparison group) as used in estimation of the relative risk (e.g., never smokers).
cc_unexp_unit_rr
cc_unexp_level_rr: Exposure level in the unexposed group (for continuous risks): Specify the mean/median level of exposure in the unexposed group.
cc_exp_level_dr: Exposure level for dose-response RRs (for continuous risks): If the study reports dose-response RR, please specify the level of exposure for the reported RR.
Trials
int_intervention_description: Intervention definition: Provide a brief description of the intervention as reported in the study.
int_control_description: Control definition: Provide a brief description of the control as reported in the study.
int_intervention_multi_rf: Does this intervention simultaneously target more than one risk? (1=yes, 0=no)
int_intervention_multi_rf_specify: Specify the risks that are targeted by the intervention.
int_intervention_level: Level of intervention: The intervention was implemented …
int_adhere_assess: Specify how adherence was defined in the study.
int_adhere_rate_intervention: Adherence rate in the intervention group. Enter on a "per 1" basis. For example: 23% is entered as .23.
int_adhere_rate_control: Adherence rate in the control group. Enter on a "per 1" basis. For example: 23% is entered as .23.
int_dropout_rate_intervention: Dropout rate in the intervention group: Specify the dropout rate (%) at the end of the study. Enter on a "per 1" basis. For example: 23% is entered as .23.
int_dropout_rate_control: Dropout rate in the control group: Specify the dropout rate (%) at the end of the study. Enter on a "per 1" basis. For example: 23% is entered as .23.
int_dropout_assess: Specify how dropout rate was defined in the study.
int_blinding: For interventional studies. Blinding: The trial was … (select 1)
int_exp_unit: For trials, specify the unit of exposure (e.g., mmol/l).
int_baseline_exp_int: For trials, specify the exposure level in the intervention group at baseline.
int_baseline_exp_comp: For trials, specify the exposure level in the comparison group at baseline.
int_fup_exp_int: For trials, specify the exposure level in the intervention group at the end of the follow-up time.
int_fup_exp_comp: For trials, specify the exposure level in the comparison group at the end of the follow-up time.
int_fup_exp_int_difference: For trials, please specify the difference in exposure level between baseline and follow-up time for the intervention group.
int_fup_exp_comp_difference: For trials, please specify the difference in exposure level between baseline and follow-up time for the comparison group.
int_person_years_int: Please specify the number of person-years of follow-up for the intervention group.
int_person_years_comp: Please specify the number of person-years of follow-up in the comparison group.
int_number_events_int: For trials, specify the number of cases in the intervention group at the end of follow-up.
int_number_events_comp: For trials, specify the number of cases in the control group at the end of follow-up.
int_sample_size_int_group_baseline: For trials, specify the sample size in the intervention group at baseline.
int_sample_size_comparison_group_baseline: For trials, specify the sample size in the comparison group at baseline.
int_sample_size_int_group_follow_up: For trials, specify the sample size in the intervention group at the end of the follow-up time.
int_sample_size_comparison_group_follow_up: For trials, specify the sample size in the comparison group at the end of the follow-up time.
Other
note_modeler: For modelers only; audience is modeler, not for correspondence.
note_sr: Notes related to extraction, including assumptions, data adjustment, problems with source, any other notes that may be relevant, etc.
extractor: uwnet id of person who extracted the data.
Custom
custom_exp_meas_num: If the exposure level was assessed multiple times at a given time point (e.g., systolic blood pressure), specify the number of measurements at each time point.
custom_exp_biomarker: If the exposure level was assessed via a biomarker, specify the full name of the biomarker.
custom_exp_kilometer: Specify the geographical unit of measurement in kilometers (if applicable, e.g., satellite data).
custom_exp_level_lower: If a mean/midpoint exposure level is not available, this column can be used in conjunction with custom_exp_level_upper to enter a range.
custom_exp_level_upper: If a mean/midpoint exposure level is not available, this column can be used in conjunction with custom_exp_level_lower to enter a range.
custom_unexp_level_lower: If a mean/midpoint exposure level is not available, this column can be used in conjunction with custom_outcome_level_upper to enter a range.
custom_unexp_level_upper: If a mean/midpoint exposure level is not available, this column can be used in conjunction with custom_outcome_level_lower to enter a range.
custom_prospective_lag: Specify lag time between exposure assessment and outcome.
custom_age_demographer: A binary flag to identify if ages are presented in demographer notation or not in the source. This value is currently not used to adjust any age_start or age_end values, but in the future, that is the intention; 0 = article does not use demographer notation (4 = 4.00 not 4.99); 1 = article uses demographer notation (4 = 4.99 not 4.00).
custom_bmi_menopause_free_text: Free text field for BMI team.
custom_cvd_outcome: Used for mapping CVD outcomes, free text field.
custom_dm_type: Used for documenting diabetes type.
custom_dm_case_defn: Used for documenting diabetes definitions, free text.
custom_pmid: To document PubMed ID.
custom_cvd_rep_high_risk: CVD specific, binary, if the study only includes people at high risk for CVD (1 for example if it is only among diabetics).
custom_drug_class: Class of drug being used in intervention, free text.
custom_outcome_primary: Outcome is the primary outcome of RCT (1=yes, 0=no).
custom_outcome_prespecified: Outcome is the prespecified outcome of RCT (1=yes, 0=no).
custom_multipollutant: Are any other pollutants controlled for in the model? 0=no, 1=yes
custom_pollutants_controlled: If custom_multipollutant=1, list the pollutants controlled for.
custom_PM2.5_model_type: Describe the model used for exposure.
custom_assign_method: How do researchers assign participants to exposure? (ex: by home address, by city, nearest zipcode centroid, etc.)
custom_PM2.5_def: What metric are they using to measure PM2.5 (ex: mean of annual PM2.5 averages for 35-1 year prior to study)
custom_lag: Do the authors take into account lag? If so, how?
custom_PM2.5_min: All of these have to do with the spread of the PM2.5 exposure covered by the study. Minimum
custom_PM2.5_5th: 5th percentile
custom_PM2.5_25th: 25th percentile
custom_PM2.5_50th: Median/50th percentile
custom_PM2.5_75th: 75th percentile
custom_PM2.5_95th: 95th percentile
custom_PM2.5_max: Maximum
custom_PM2.5_mean: Mean
custom_PM2.5_stddev: Standard deviation
custom_PM2.5_other_measure: Any other measures of the distribution of PM2.5 amongst participants?
custom_PM2.5_other_measure_description: If so, what are they? (ex: 10th, 90th, IQR)
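Several entry rules in the template above lend themselves to an automated check: rates are entered on a "per 1" basis, and the recall-period value or free-text description is required only for certain units. The sketch below is a hypothetical validator, not part of the extraction workflow described here. The field names come from the template, but the function names, the lowercase option strings, and the representation of a row as a dictionary are assumptions.

```python
# Recall units that require an integer in exp_recall_period_value
NUMERIC_UNITS = {"days", "weeks", "months", "years"}

# Template fields that must be entered on a "per 1" basis
RATE_FIELDS = (
    "cohort_dropout_rate",
    "int_adhere_rate_intervention",
    "int_adhere_rate_control",
    "int_dropout_rate_intervention",
    "int_dropout_rate_control",
)


def percent_to_per_one(percent):
    """Convert a percentage to the "per 1" basis (23% becomes 0.23)."""
    return percent / 100


def validate_row(row):
    """Return a list of problems found in one extracted row (a dict)."""
    problems = []
    unit = row.get("exp_recall_period")
    if unit in NUMERIC_UNITS and not isinstance(
        row.get("exp_recall_period_value"), int
    ):
        problems.append("exp_recall_period_value must be an integer")
    if unit == "other" and not row.get("exp_recall_period_other"):
        problems.append("exp_recall_period_other must describe the period")
    for field in RATE_FIELDS:
        rate = row.get(field)
        if rate is not None and not 0 <= rate <= 1:
            problems.append(f"{field} must be on a per-1 basis (0 to 1)")
    return problems


good_row = {
    "exp_recall_period": "weeks",
    "exp_recall_period_value": 4,
    "cohort_dropout_rate": percent_to_per_one(23),
}
bad_row = {"exp_recall_period": "months", "cohort_dropout_rate": 23}
print(validate_row(good_row))
print(validate_row(bad_row))
```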

Section 6: Study quality and risk of bias assessment
For each study that met the inclusion criteria, two reviewers assessed several indicators of bias during the extraction process. The full list of bias covariates assessed across all studies can be found in the extraction template (Supplementary Table 5).
To provide a broad sense of data quality, we calculated a quality score for each study used in this analysis based on the four study characteristics that were most applicable and most likely to introduce bias (see Supplementary Table 6). The overall score ranged from 0 to 5, where 0 indicated the least bias and 5 indicated the most bias.