High-dimensional characterization of post-acute sequelae of COVID-19

Abstract

The acute clinical manifestations of COVID-19 have been well characterized1,2, but the post-acute sequelae of this disease have not been comprehensively described. Here we use the national healthcare databases of the US Department of Veterans Affairs to systematically and comprehensively identify 6-month incident sequelae—including diagnoses, medication use and laboratory abnormalities—in patients with COVID-19 who survived for at least 30 days after diagnosis. We show that beyond the first 30 days of illness, people with COVID-19 exhibit a higher risk of death and use of health resources. Our high-dimensional approach identifies incident sequelae in the respiratory system, as well as several other sequelae that include nervous system and neurocognitive disorders, mental health disorders, metabolic disorders, cardiovascular disorders, gastrointestinal disorders, malaise, fatigue, musculoskeletal pain and anaemia. We show increased incident use of several therapeutic agents—including pain medications (opioids and non-opioids) as well as antidepressant, anxiolytic, antihypertensive and oral hypoglycaemic agents—as well as evidence of laboratory abnormalities in several organ systems. Our analysis of an array of prespecified outcomes reveals a risk gradient that increases according to the severity of the acute COVID-19 infection (that is, whether patients were not hospitalized, hospitalized or admitted to intensive care). Our findings show that a substantial burden of health loss that spans pulmonary and several extrapulmonary organ systems is experienced by patients who survive after the acute phase of COVID-19. These results will help to inform health system planning and the development of multidisciplinary care strategies to reduce chronic health loss among individuals with COVID-19.

Main

COVID-19 is a viral illness caused by the coronavirus SARS-CoV-2. The acute clinical manifestations of COVID-19 have been well characterized and involve both pulmonary and extrapulmonary systemic manifestations1,2. Emerging reports suggest that—for some individuals—the symptoms of COVID-19 persist beyond the acute setting. However, the post-acute sequelae of COVID-19 are not yet clear.

Here we leveraged the breadth and depth of the US Department of Veterans Affairs electronic health databases to undertake a high-dimensional approach to comprehensively identify the 6-month outcomes of incident diagnoses (from 379 diagnostic categories), incident medication use (from 380 medication classes) and incident laboratory abnormalities (from 62 laboratory tests) in people who survived for at least the first 30 days after their COVID-19 diagnosis.

Non-hospitalized patients with COVID-19

The cohort included 73,435 users of the Veterans Health Administration (VHA) with COVID-19 who survived for at least the first 30 days after their COVID-19 diagnosis and who were not hospitalized, and 4,990,835 VHA users who did not have COVID-19 and were not hospitalized (Supplementary Fig. 1a, b). The median follow-ups were 126 (81–203; for all reported median values, parenthetical ranges refer to the interquartile range) and 130 (82–205) days for patients with COVID-19 and VHA users, respectively (Extended Data Table 1a). We examined a panel of negative-outcome controls, which yielded results that were consistent with our a priori expectations (for example, hazard ratios of 1.03 (0.94–1.12; for all hazard ratios and burdens, parenthetical ranges refer to 95% confidence intervals) and 1.03 (0.95–1.12) for neoplasms and accidental injuries, respectively); the results of all the negative-outcome controls are provided in Extended Data Table 2a. Our examination of the standardized differences of all high-dimensional variables across all outcome-specific cohorts (including those that were selected and those that were not selected in the models) showed that more than 99.99% of standardized differences were <0.1 after adjustment (Supplementary Fig. 2a, b), which resulted in similar distributions of baseline characteristics in each group after adjustment (Supplementary Table 1).
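The balance check described above can be illustrated with a minimal sketch. It assumes the common definition of the standardized difference (difference in means divided by the pooled standard deviation) and omits the weighting that the adjusted analyses would apply; the function names are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Absolute standardized mean difference between two samples.

    Unweighted illustration: the difference in means divided by the
    pooled standard deviation of the two groups.
    """
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    difference = abs(mean(group_a) - mean(group_b))
    if pooled_sd == 0:
        # No spread in either group: balanced only if the means coincide.
        return 0.0 if difference == 0 else float("inf")
    return difference / pooled_sd


def is_balanced(group_a, group_b, threshold=0.1):
    """Apply the <0.1 rule of thumb for good covariate balance."""
    return standardized_difference(group_a, group_b) < threshold
```

A covariate would be flagged as imbalanced whenever `is_balanced` returns `False` for the two groups being compared.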

Beyond the first 30 days of illness, individuals with COVID-19 had an increased risk of death (hazard ratio of 1.59 (1.46–1.73)). We also estimated the adjusted excess burden of death due to COVID-19 per 1,000 persons at 6 months on the basis of the difference between the estimated incidence rate in individuals with COVID-19 and all VHA users. The excess death was estimated at 8.39 (7.09–9.58) per 1,000 patients with COVID-19 at 6 months. Individuals with COVID-19 had a higher risk of requiring outpatient care (hazard ratio of 1.20 (1.19–1.21)), at an excess burden of 33.22 (30.89–35.58; all excess burdens are given per 1,000 patients with COVID-19 at 6 months) and at a greater frequency of 0.47 (0.44–0.49) additional encounters every 30 days (Extended Data Table 2b, c).
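As a rough illustration of how a hazard ratio translates into an excess burden per 1,000 persons, the sketch below assumes proportional hazards, so that survival in the exposed group equals the baseline survival raised to the power of the hazard ratio. This is a simplification for illustration and not necessarily the estimator used in the study; the function name and the baseline survival value are hypothetical.

```python
def excess_burden_per_1000(baseline_survival, hazard_ratio):
    """Excess events per 1,000 persons at a fixed time point.

    baseline_survival: probability that a person in the reference group
        is event-free at the time point (illustrative input).
    hazard_ratio: hazard ratio of the exposed versus the reference group.
    Under proportional hazards, S_exposed = S_reference ** hazard_ratio.
    """
    if not 0.0 < baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be in (0, 1]")
    if hazard_ratio <= 0:
        raise ValueError("hazard_ratio must be positive")
    exposed_survival = baseline_survival ** hazard_ratio
    return 1000.0 * (baseline_survival - exposed_survival)


# Hypothetical baseline survival of 0.99 combined with the reported
# hazard ratio of 1.59 for death.
example_burden = excess_burden_per_1000(0.99, 1.59)
```

A hazard ratio of exactly 1 gives an excess burden of zero, and values below 1 give a negative (reduced) burden.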

We evaluated the risk of incident occurrence of 379 diagnoses (categorized according to ICD-10 codes based on Clinical Classifications Software Refined), 380 classes of medication and 62 laboratory tests beyond the first 30 days. For each of the outcomes we examined, we built a cohort who were free of the related outcome at baseline to identify the risk of incident outcome during follow-up. We found that several conditions in almost every organ system exhibited an adjusted hazard ratio that was greater than 1 and a P value lower than 6.57 × 10⁻⁵ (significance level adjusted for multiple comparisons). The adjusted hazard ratio and burden for all outcomes are presented in Fig. 1a–c and Supplementary Tables 2–4. The results for outcomes that were positively associated with COVID-19 are presented in Fig. 2a–c, Extended Data Fig. 1a–c and Supplementary Table 5, and are discussed here.
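The selection rule stated above (hazard ratio greater than 1 and P value below the adjusted threshold) can be sketched as a simple filter, followed by ranking on excess burden. The record type, the function name and the toy values are hypothetical and serve only to show the mechanics; they are not results from the study.

```python
from typing import NamedTuple

P_THRESHOLD = 6.57e-5  # significance level reported in the text


class OutcomeResult(NamedTuple):
    name: str
    hazard_ratio: float
    p_value: float
    excess_burden: float  # per 1,000 persons at 6 months


def select_sequelae(results, p_threshold=P_THRESHOLD):
    """Keep positively associated, significant outcomes, highest burden first."""
    selected = [
        r for r in results
        if r.hazard_ratio > 1 and r.p_value < p_threshold
    ]
    return sorted(selected, key=lambda r: r.excess_burden, reverse=True)


# Toy inputs (made up for illustration only).
toy_results = [
    OutcomeResult("outcome A", 1.40, 1e-9, 12.0),
    OutcomeResult("outcome B", 1.05, 0.20, 1.5),   # not significant
    OutcomeResult("outcome C", 0.90, 1e-8, -3.0),  # hazard ratio below 1
    OutcomeResult("outcome D", 1.80, 1e-12, 25.0),
]
selected_names = [r.name for r in select_sequelae(toy_results)]
```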

Fig. 1: High-dimensional identification of the incident post-acute sequelae of COVID-19.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). All VHA users served as the referent category. Post-acute sequelae were ascertained from 30 days after infection until end of follow-up. Beginning from the outside ring, the first ring represents hazard ratios for the post-acute sequelae of COVID-19. A higher bar indicates a larger hazard ratio. Hazard ratios with a point estimate larger than one that were statistically significant are shown in yellow. The second ring represents the excess burden per 1,000 patients with COVID-19 at 6 months. The colour of the cell indicates the value of the excess burden (deeper shades of red indicate a higher excess burden and deeper shades of blue indicate a greater reduced burden). The third ring represents the baseline incident rate in the control group (deeper shades of red indicate a higher incident rate). The fourth ring represents negative log of the P value; a higher bar indicates a smaller P value and yellow indicates that the value is statistically significant.
ACR, albumin/creatinine ratio; AD, antidotes; AH, antihistamine drugs; Alb, albumin; ALP, alkaline phosphatase; ALT, alanine aminotransferase; AN, antineoplastic agents; AP, antiparasitic agents; AST, aspartate aminotransferase; AU, autonomic; BNP, brain natriuretic peptide; BUN, blood urea nitrogen; CD4, CD4 cell count; CD4/8, CD4/CD8 ratio; Cl, chloride; Cr, creatinine; CRP, C-reactive protein; dBIL, direct bilirubin; Derm, dermatological; DG, diagnostic; DT, dental; GFR, glomerular filtration rate; GT, genitourinary; HbA1c, haemoglobin A1c; HCT, haematocrit; HDL, high-density-lipoprotein cholesterol; Hgb, haemoglobin; hsCRP, high-sensitivity C-reactive protein; ID, irrigation or dialysis; IM, immunological; INR, international normalized ratio; IP, intrapleural; K, potassium; LDL, low-density-lipoprotein cholesterol; MS, musculoskeletal; pBNP, pro-B natriuretic peptide; Plt, platelet; Protein, total protein; PT, prothrombin time; PTT, partial thromboplastin time; RT, rectal; TBIL, total bilirubin;  TC, total cholesterol; TG, triglycerides; TnI, troponin I; TnT, troponin T; WBC, white blood cell.

Fig. 2: Burden of post-acute sequelae of COVID-19.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). All VHA users served as the referent category. Post-acute sequelae were ascertained from 30 days after infection until end of follow-up. Sequelae were selected on the basis of having a hazard ratio of more than 1 and a P value of less than 6.57 × 10⁻⁵. Excess burdens per 1,000 patients with COVID-19 at 6 months are presented with 95% confidence intervals in parentheses. Outcomes are ranked within each domain on the basis of the excess burden, from high to low. Diagnoses are coloured on the basis of the diagnosis group, medications are coloured on the basis of their class and laboratory abnormalities are coloured on the basis of their being higher or lower than the normal range. F, female; M, male; NSAIDs, non-steroidal anti-inflammatory drugs.

Respiratory conditions

The most common excess burden at 6 months after a COVID-19 infection that did not result in a hospitalization in the first 30 days was that of respiratory conditions, which included respiratory signs and symptoms (excess burden of 28.51 (26.40–30.50)), respiratory failure, insufficiency and arrest (3.37 (2.71–3.92)), and lower respiratory disease (4.67 (3.96–5.28)). There was also evidence of a high burden of incident use of bronchodilators (22.23 (20.68–23.67)), antitussive and expectorant agents (12.83 (11.61–13.95)), anti-asthmatic agents (8.87 (7.65–9.97)) and glucocorticoids (7.65 (5.67–9.50)).

Diseases of the nervous system

An excess burden of nervous system conditions was also evident, and included nervous system signs and symptoms (14.32 (12.16–16.36)), neurocognitive disorders (3.17 (2.24–3.98)), nervous system disorders (4.85 (3.65–5.93)) and headache (4.10 (2.49–5.58)).

Mental health burden

Our results also showed an excess burden of sleep–wake disorders (14.53 (11.53–17.36)), anxiety and fear-related disorders (5.42 (3.42–7.29)), and trauma- and stress-related disorders (8.93 (6.62–11.09)). These findings were coupled with evidence of excess burden of incident use of non-opioid (19.97 (17.41–22.40)) and opioid (9.39 (7.21–11.43)) analgesic drugs, antidepressant agents (7.83 (5.19–10.30)), and benzodiazepine, sedative and anxiolytic agents (22.23 (20.68–23.67)).

Metabolic disorders

An excess burden of several metabolic disorders was evident, including disorders of lipid metabolism (12.32 (8.18–16.24)), diabetes mellitus (8.23 (6.36–9.95)) and obesity (9.53 (7.55–11.37)). There was also evidence of an excess burden of incident use of antilipaemic agents (11.56 (8.73–14.19)), oral hypoglycaemic drugs (5.39 (3.99–6.64)) and insulin (4.95 (3.87–5.90)), as well as an excess burden of elevated low-density lipoprotein cholesterol (9.48 (7.02–11.81)), total cholesterol (9.94 (6.61–13.11)), triglycerides (9.40 (6.63–12.03)) and haemoglobin A1c (10.66 (6.77–14.35)).

Poor general wellbeing

Individuals with COVID-19 exhibited an excess burden of poor general wellbeing, including malaise and fatigue (12.64 (11.24–13.93)), muscle disorders (5.73 (4.60–6.74)), musculoskeletal pain (13.89 (9.89–17.71)) and anaemia (4.79 (3.53–5.93)). These diagnoses were coupled with laboratory evidence of an excess burden of anaemia, comprising decreased haemoglobin (31.03 (28.16–33.76)), decreased haematocrit levels (30.73 (27.64–33.67)) and low serum albumin (6.44 (4.84–7.92)).

Cardiovascular conditions

There was an excess burden of cardiovascular conditions, including hypertension (15.18 (11.53–18.62)), cardiac dysrhythmias (8.41 (7.18–9.53)), circulatory signs and symptoms (6.65 (5.18–8.01)), chest pain (10.08 (8.63–11.42)), coronary atherosclerosis (4.38 (2.96–5.67)) and heart failure (3.94 (2.97–4.80)). There was also evidence of excess burden of incident use of beta blockers (9.74 (8.06–11.27)), calcium channel blockers (7.18 (5.61–8.61)), loop diuretic agents (4.72 (3.59–5.72)), thiazide diuretic agents (2.52 (1.37–3.54)), and anti-arrhythmic drugs (1.28 (0.79–1.67)).

Gastrointestinal system

There was evidence of an excess burden of the following conditions: oesophageal disorders (6.90 (4.58–9.07)), gastrointestinal disorders (3.58 (2.15–4.88)), dysphagia (2.83 (1.79–3.76)) and abdominal pain (5.73 (3.7–7.62)). These conditions were coupled with evidence for an increased use of laxatives (9.22 (6.99–11.31)), anti-emetic agents (9.22 (6.99–11.31)), histamine antagonists (4.83 (3.63–5.91)), other antacids (1.07 (0.62–1.42)) and antidiarrhoeal agents (2.87 (1.70–3.91)). Laboratory abnormalities included an increased risk of incident high levels of alanine aminotransferase (7.62 (5.20–9.90)).

Other sequelae

There was also evidence of an excess burden in incident acute pulmonary embolism (2.63 (2.25–2.92)) and use of anticoagulant drugs (16.43 (14.85–17.89)). Other conditions included excess burden of skin disorders (7.52 (5.17–9.73)), arthralgia and arthritis (5.16 (3.18–7.01)) and infections, including urinary tract infections (2.99 (1.94–3.93)) (Fig. 2a–c, Supplementary Tables 2–5).

COVID-19 requiring hospitalization versus influenza

To gain a better understanding of the spectrum of clinical manifestations in patients with COVID-19 who were hospitalized, we undertook a comparative evaluation of a cohort of hospitalized individuals with COVID-19 versus individuals who were hospitalized with seasonal influenza (a well-known and well-characterized respiratory viral illness).

This cohort included 13,654 people with COVID-19 and 13,997 people with influenza who survived for at least 30 days after hospital admission (Supplementary Fig. 3a, b). The median follow-ups were 150 (84–217) and 157 (87–220) days for patients with COVID-19 and influenza, respectively (Extended Data Table 1a). We tested a panel of negative-outcome controls, which yielded results that were consistent with our a priori expectations (for example, hazard ratio of 0.98 (0.83–1.16) and 1.02 (0.90–1.15) for neoplasms and accidental injuries, respectively); the results of all the negative-outcome controls are provided in Extended Data Table 2a. Our examination of standardized differences of all high-dimensional variables (including those that were selected and those that were not selected in the models) in all outcome-specific cohorts showed that more than 99.75% of standardized differences were <0.1 after adjustment (Supplementary Fig. 4a, b), which resulted in similar distributions of baseline characteristics in each group after adjustment (Supplementary Table 6).

Beyond the first 30 days of illness, individuals with COVID-19 who had been hospitalized for this disease had an increased risk of death (hazard ratio of 1.51 (1.30–1.76)); we estimated excess death at 28.79 (19.52–36.85) per 1,000 persons at 6 months. Individuals with COVID-19 exhibited a higher risk of requiring outpatient care (hazard ratio of 1.12 (1.08–1.17)), at an excess burden of 6.37 (4.01–9.03) and with greater frequency of 1.45 (1.28–1.63) additional encounters every 30 days (Extended Data Table 2b, c).

Compared to individuals who were hospitalized with seasonal influenza (and beyond the first 30 days of illness), patients who had been hospitalized for COVID-19 had a higher burden of a broad array of pulmonary and extrapulmonary systemic manifestations, including neurological disorders (burdens of 19.78 (12.58–26.19) and 16.16 (10.40–21.19) for nervous system disorders and neurocognitive disorders, respectively), mental health disorders (for example, a burden of 7.75 (4.72–10.10) for mental and substance-use conditions), metabolic disorders (for example, a burden of 43.53 (28.71–57.08) for disorders of lipid metabolism), cardiovascular disorders (for example, a burden of 17.92 (10.73–24.35) for circulatory signs and symptoms), gastrointestinal disorders (for example, a burden of 19.28 (12.75–25.13) for dysphagia), coagulation disorders (14.31 (10.08–17.89)), pulmonary embolism (18.31 (15.83–20.25)) and other disorders including malaise and fatigue (36.49 (28.13–44.15)) and anaemia (19.08 (10.58–26.81)) (Extended Data Figs. 2a–f, 3a–c, Supplementary Tables 7–10). Analyses of risk and the burden of clinical manifestations that additionally adjusted for the severity of the acute infection yielded consistent results in both the direction and magnitude of estimates (Extended Data Figs. 4a–f, 5a–c, Supplementary Tables 11–14). Our high-dimensional comparative evaluation of six-month outcomes in a cohort of hospitalized individuals with COVID-19 (n = 13,654) versus individuals who were hospitalized for other causes (n = 901,516) yielded consistent results (Extended Data Figs. 6a–f, 7a–c, Supplementary Tables 15–18).

Analysing risk of prespecified COVID-19 outcomes

To complement our high-dimensional approach and to gain a deeper understanding of the clinical manifestations of post-acute COVID-19 across the severity of the initial acute disease, we evaluated the risks of a panel of prespecified outcomes across the care setting of the acute phase of the disease (using whether individuals were non-hospitalized, hospitalized or admitted to intensive care, as a proxy indicator of disease severity) and benchmarked risk in these populations to a common reference group (the broader population of the Veterans Affairs Health Care System (n = 4,990,835)) (Extended Data Table 1b). Our assessment of standardized differences across the four groups showed that none of these differences was more than 0.1 after adjustment (Supplementary Fig. 5). Our results reveal (1) an increased risk of a broad array of specific clinical manifestations that include acute coronary disease, arrhythmias, acute kidney injury, chronic kidney disease, memory problems and thromboembolic disease (Fig. 3, Supplementary Tables 19, 20); (2) that this risk was evident even in individuals who were not hospitalized with COVID-19; and (3) a risk gradient that increased across the care setting of the acute COVID-19 infection from non-hospitalized individuals to those who were hospitalized, and risk was highest in patients who were admitted to intensive care (Fig. 3, Supplementary Tables 19, 20).

Fig. 3: Risks and burdens of incident prespecified high-resolution post-acute COVID-19 outcomes.

Risks and burdens were assessed at 6 months in mutually exclusive cohorts comprising non-hospitalized individuals with COVID-19, people who were hospitalized for COVID-19 and people who were admitted to intensive care for COVID-19 during the acute phase (first 30 days) of the infection. All VHA users served as the referent category. Outcomes were ascertained from day 30 after COVID-19 diagnosis until the end of follow-up. Adjusted hazard ratios and excess burdens are presented; error bars represent the 95% confidence interval. GERD, gastroesophageal reflux disease; ICU, intensive care unit.

To gain a better understanding of whether these post-acute, prespecified outcomes are unique to COVID-19 or whether they represent a general post-viral syndrome, we further conducted comparative analyses (which were adjusted as specified in Methods, including adjusting for the severity of the acute infection) of the prespecified outcomes among people who were hospitalized with COVID-19 or seasonal influenza (Extended Data Table 1a, Supplementary Table 6). Our results show an increased risk and excess burden of a broad array of symptoms as well as multiple organ involvement among people with COVID-19 (Extended Data Fig. 8, Supplementary Table 21).

Negative-exposure controls

In addition to testing negative-outcome controls (Extended Data Table 2a) and to further test the robustness of our approach, we developed and tested a pair of negative-exposure controls. We posited that exposure to influenza vaccination in odd- and even-numbered months between 1 October 2017 and 30 September 2019 should be associated with similar risks of clinical outcomes. We therefore tested associations between exposure to influenza vaccine in even- (n = 762,039) versus odd- (n = 599,981) numbered months and the full complement of 821 high-dimensional clinical outcomes considered in this study (including all diagnoses, medications and laboratory test results). We used the same data sources, cohort-building algorithm, variable definitions, analytical approach (including weighting method) and outcome specification, as well as a similar length of follow-up and interpretation method. Our results showed that none of the associations met the threshold of significance (P < 6.57 × 10⁻⁵) considered in this study (Supplementary Fig. 6, Supplementary Tables 22–24).

Discussion

Here we use a high-dimensional approach to identify the spectrum of clinical abnormalities (incident diagnoses, incident medication use and incident laboratory abnormalities) experienced by individuals with COVID-19 who survive beyond the first 30 days of illness. The results suggest that, beyond the first 30 days of illness, people with COVID-19 are at higher risk of death and are more likely to use healthcare resources, and exhibit a broad array of incident pulmonary and extrapulmonary clinical manifestations (including nervous system and neurocognitive disorders, mental health disorders, metabolic disorders, cardiovascular disorders and gastrointestinal disorders) as well as signs and symptoms related to poor general wellbeing (including malaise, fatigue, musculoskeletal pain and anaemia). We observed an increased risk of the incident use of several classes of medication, including pain medications (opioid and non-opioid), antidepressant, anxiolytic, antihypertensive, antihyperlipidaemic and oral hypoglycaemic drugs and insulin. Our analyses of prespecified outcomes complement the high-dimensional approach to identify specific post-acute sequelae with greater diagnostic resolution and reveal two key findings: (1) that the risk and associated burden of post-acute sequelae is evident even among individuals whose acute disease was not severe enough to require hospitalization (representing the majority of people with COVID-19) and (2) that the risk and associated burden increases across the severity spectrum of the acute COVID-19 infection (from non-hospitalized to hospitalized individuals, to those admitted to intensive care). 
Our comparative approach to examining post-acute sequelae in individuals who are hospitalized with COVID-19 versus individuals with seasonal influenza (using a high-dimensional approach and through examination of prespecified outcomes) suggests that there is a substantially higher burden of a broad array of post-acute sequelae in the individuals who are hospitalized with COVID-19, which provides features that differentiate post-acute COVID-19 (both in the magnitude of risk and the breadth of organ involvement) from a post-influenza viral syndrome. Our results show that individuals who survive for 30 days or more after their COVID-19 diagnosis exhibit an increased risk of death and are more likely to use health resources, as well as a substantial burden of health loss that spans the pulmonary and several extrapulmonary organ systems; this highlights the need for holistic and integrated multidisciplinary long-term care of patients with COVID-19.

The mechanism or mechanisms that underlie the post-acute manifestations of COVID-19 are not entirely clear. Some of the manifestations may be driven by a direct effect of the viral infection, and may be explained by virus persisting in immune-privileged sites, an aberrant immune response, hyperactivation of the immune system or autoimmunity3. Indirect effects—including changes in social (for example, reduced social contact and loneliness), economic (for example, loss of employment) and behavioural conditions (for example, changes in diet and exercise)—that may be differentially experienced by people with COVID-19 may also shape health outcomes, and may be drivers of some of the post-acute clinical manifestations4,5,6,7,8. A better delineation of the direct and indirect effects, and a deeper understanding of the underlying biological mechanisms and epidemiological drivers, of the multifaceted long-term consequences of COVID-19 is needed9.

To our knowledge, this is the largest study of the post-acute sequelae of COVID-19; it involves 73,435 non-hospitalized patients with COVID-19, and 4,990,835 control individuals (corresponding to 2,070,615.52 person years of follow-up), as well as 13,654 hospitalized patients with COVID-19 and 13,997 patients hospitalized with seasonal influenza (corresponding to 12,179.05 person years of follow-up). We leveraged the breadth and depth of the national healthcare databases of the US Department of Veterans Affairs (the largest nationally integrated healthcare delivery system in the US) to undertake a comprehensive high-dimensional comparative approach (relative to control groups) to identify the 6-month health outcomes and clinical manifestations in patients who survived the first 30 days of COVID-19. We further examined risk in a prespecified set of outcomes with higher diagnostic resolution across care settings to enable a deeper understanding of the clinical symptomatology and diagnoses of post-acute COVID-19 across the spectrum of severity of the acute phase of the infection.

This study has several limitations. Although our approach identifies the incident post-acute sequelae in patients with COVID-19, it does not delineate which sequelae may be direct or indirect consequences of COVID-19 infection. Because of the predominantly male composition of the Veterans Affairs population, our findings may not identify clinical features of post-acute COVID-19 that may be much more pronounced in women, or non-expressed or very rare in men. Our approach demonstrated balance for more than 1,150 variables across several data domains (diagnoses, medications and laboratory data) and yielded successful testing of negative-exposure and -outcome controls, but we cannot completely rule out residual confounding effects. Finally, as the global pandemic of COVID-19 continues to evolve, as treatment strategies improve, as new variants of the virus emerge and as vaccine availability increases, it is likely that the epidemiology and short- and long-term outcomes of COVID-19 will also change over time.

Our findings show that, beyond the first 30 days of illness, a substantial burden of health loss that spans pulmonary and several extrapulmonary organ systems is experienced by individuals who survived the acute phase of COVID-19. Our results will inform global discussions on the post-acute manifestations of COVID-19, as well as health system planning and the development of care strategies that are aimed at reducing chronic and permanent health loss and optimizing wellness among patients with COVID-19.

Methods

All eligible participants were enrolled in the study; no statistical methods were used to predetermine sample size. The experiments were not randomized, and investigators were not blinded to allocation during experiments and outcome assessment.

Setting

Cohort participants were selected from US Department of Veterans Affairs (VA) electronic healthcare databases. The VHA provides healthcare to discharged veterans of the US armed forces and operates the largest nationally integrated healthcare system in the USA, with 1,255 healthcare facilities (including 170 VA Medical Centers and 1,074 outpatient sites) located across the USA. Veterans who are enrolled with the VHA have access to the comprehensive medical benefits package of the VA (which includes inpatient hospital care, outpatient services, preventive, primary and speciality care, prescriptions, mental healthcare, home healthcare, geriatric and extended care, medical equipment, and prosthetics). The VA electronic healthcare databases are updated daily.

Cohort

The cohort was constructed from 5,808,018 participants who had encountered the VHA between 1 January 2019 and 31 December 2019. Of those who were alive on 1 March 2020 (n = 5,606,309), a COVID-19 group was selected as individuals who had a positive test for COVID-19 between 1 March 2020 and 30 November 2020 (n = 98,661). Participants without hospitalization within the first 30 days of their first positive test were further selected (n = 76,877). To examine post-acute outcomes, we then selected from the COVID-19 group those alive on the 30th day after their positive test (participants with COVID-19, n = 73,435). To generate a comparison group that had a similar distribution of length of follow-up, we then matched each participant with COVID-19 with 70 VHA users who did not have a positive test for COVID-19 without replacement. In matching, the dates of cohort enrolment for the corresponding 70 VHA users were matched with time of cohort enrolment of the participant with COVID-19—that is, the date of testing positive (control group n = 5,140,450). In the VHA user group, we similarly selected individuals who were without hospitalization and alive during the first 30 days after the date of enrolment (control group n = 4,990,835) (Supplementary Fig. 1a, b). Participants were followed until 31 January 2021.
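The matching step described above can be sketched as follows, assuming a simple random draw of controls without replacement in which each control inherits the enrolment date of the case it is matched to. The function name, identifiers and toy data are hypothetical, and the sketch ignores the eligibility checks applied in the study.

```python
import random
from datetime import date


def match_controls(case_enrolment, control_pool, ratio=70, seed=0):
    """Assign `ratio` distinct controls to every case.

    case_enrolment: dict mapping case id -> enrolment date (date of the
        positive test).
    control_pool: iterable of control ids with no positive test.
    Returns a dict mapping control id -> enrolment date copied from the
    matched case. Each control is used at most once.
    """
    pool = list(control_pool)
    needed = ratio * len(case_enrolment)
    if needed > len(pool):
        raise ValueError("control pool too small for the requested ratio")
    rng = random.Random(seed)
    drawn = rng.sample(pool, needed)  # sampling without replacement
    assignment = {}
    for i, enrolment_date in enumerate(case_enrolment.values()):
        for control_id in drawn[i * ratio:(i + 1) * ratio]:
            assignment[control_id] = enrolment_date
    return assignment


# Toy example with a 1:3 ratio instead of the 1:70 ratio used in the text.
toy_cases = {"case-1": date(2020, 4, 1), "case-2": date(2020, 7, 15)}
toy_matches = match_controls(toy_cases, range(10), ratio=3, seed=42)
```

With the 1:70 ratio reported above, 73,435 cases yield 73,435 × 70 = 5,140,450 matched controls before the 30-day restrictions are applied.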

To compare post-acute outcomes of hospitalized participants with COVID-19 and hospitalized participants with seasonal influenza, we selected 15,846 participants with COVID-19 who were admitted to a hospital within 30 days after or 5 days before their first positive test (from the 98,661 patients with a positive COVID-19 test between 1 March 2020 and 30 November 2020). Similarly, we selected 62,909 patients who had their first positive seasonal influenza test between 1 October 2016 and 29 February 2020 and who had encountered the VHA at least once in the calendar year before the test was collected. Of these patients, 14,948 were admitted to a hospital within 30 days after or 5 days before their first positive influenza test. The hospitalized cohort was further restricted to those alive at the 30th day after hospital admission (COVID-19 n = 13,654; seasonal influenza n = 14,212), where for 215 patients who were in both the hospitalized COVID-19 and seasonal influenza group, only their COVID-19 hospitalizations were used in the analyses (Supplementary Fig. 3a, b). In this cohort, participants were considered to be enrolled at the time of hospitalization. To balance the duration of follow-up in the hospitalized COVID-19 and seasonal influenza groups, each participant in the seasonal influenza group was independently randomly assigned a duration of follow-up on the basis of the distribution of length of follow-up of the participants in the hospitalized COVID-19 group who were followed from date of hospitalization to 31 January 2021.
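The follow-up balancing step can be sketched as an independent random draw, for each influenza participant, from the empirical distribution of follow-up lengths in the hospitalized COVID-19 group. This is one plausible reading of the description above; the function name and toy values are hypothetical.

```python
import random


def assign_followup(reference_followup_days, n_participants, seed=0):
    """Draw a follow-up duration for each participant.

    reference_followup_days: observed follow-up lengths (in days) of the
        reference group, here the hospitalized COVID-19 group.
    Each participant receives an independent draw, with replacement,
    from that empirical distribution.
    """
    if not reference_followup_days:
        raise ValueError("reference distribution is empty")
    rng = random.Random(seed)
    return rng.choices(list(reference_followup_days), k=n_participants)


# Toy reference distribution (illustrative values only).
toy_reference = [84, 120, 150, 190, 217]
toy_assigned = assign_followup(toy_reference, n_participants=8, seed=1)
```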

To examine high-resolution, prespecified post-acute COVID-19 outcomes across the severity spectrum of the initial acute disease, we built four mutually exclusive cohorts: VHA users without COVID-19 (n = 4,990,835), VHA users with COVID-19 who were not hospitalized (n = 73,435), VHA users who were hospitalized with COVID-19 within the first 30 days of follow-up (n = 10,068) and VHA users with COVID-19 who were admitted to intensive care within the first 30 days of follow-up (n = 3,586). Participants in these cohorts were followed up until 31 January 2021.

Data sources

Electronic health records from VA Corporate Data Warehouse (CDW) were used in this study10,11,12,13. The CDW ‘outpatient encounters’ domains provided information related to outpatient encounters and ‘inpatient encounters’ domains provided information between hospital admission and discharge14. The CDW ‘outpatient pharmacy’ domain and CDW ‘bar code medication administration’ domain were used to collect medication data, and CDW ‘patient’ domain was used to collect demographic information. The CDW ‘laboratory results’ domain was used to collect laboratory test information, and the ‘COVID-19 shared data resource’ was used to collect COVID-19 test and demographic information for patients with COVID-19. In addition, the area deprivation index—which is a composite measure of income, education, employment and housing—was obtained from the University of Wisconsin15.

Post-acute use of health resources and death

Outcomes that occurred after 30 days of cohort enrolment—including death, incident outpatient encounter and frequency of outpatient encounter—were examined in both cohorts. The frequency of outpatient encounters was computed on the basis of the number of days with outpatient encounter divided by days of follow-up after 30 days, and is reported as the number of outpatient encounters per 30 days.
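As a worked illustration of the encounter-frequency definition above, the sketch below counts distinct days with an outpatient encounter after day 30 and scales by the follow-up time after day 30. The function name, the dates and the exact boundary handling (encounters strictly after day 30 and up to the end of follow-up) are assumptions made for this example.

```python
from datetime import date, timedelta

def encounters_per_30_days(encounter_dates, enrolment, end_of_follow_up):
    """Days with an outpatient encounter after day 30, divided by days of
    follow-up after day 30, expressed per 30 days."""
    window_start = enrolment + timedelta(days=30)
    follow_up_days = (end_of_follow_up - window_start).days
    if follow_up_days <= 0:
        raise ValueError("no follow-up after day 30")
    # Several encounters on the same calendar day count as one day.
    days_with_encounter = {
        d for d in encounter_dates if window_start < d <= end_of_follow_up
    }
    return len(days_with_encounter) / follow_up_days * 30

# Hypothetical participant: 150 days of total follow-up, 120 of them after day 30.
rate = encounters_per_30_days(
    encounter_dates=[
        date(2020, 3, 10),   # within the first 30 days, ignored
        date(2020, 4, 15),
        date(2020, 4, 15),   # same day, counted once
        date(2020, 5, 20),
        date(2020, 6, 25),
        date(2020, 7, 1),
    ],
    enrolment=date(2020, 3, 1),
    end_of_follow_up=date(2020, 7, 29),
)
```

Here four distinct encounter days over 120 days of post-acute follow-up give one encounter per 30 days.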

High-dimensional post-acute clinical characteristics

Negative outcome and exposure controls

The application of negative controls in clinical epidemiology may help to detect both suspected and unsuspected sources of spurious bias, and may lessen concerns about unmeasured confounding and other latent biases16. Here we followed a previously published approach16 to examine a panel of eight negative-outcome controls (including neoplasms, accidental injuries, scars, fitting or adjustment of orthodontic or dental prosthetic device, fitting or adjustment of hearing device, fitting or adjustment of orthotics, fitting or adjustment of casts, and bandages), for which (based on current knowledge) there should be no causal relation between the exposures and risks of the negative-outcome controls. We also developed and tested a pair of negative-exposure controls (defined as exposure to influenza vaccine in odd- or even-numbered months during the period between 1 October 2017 and 30 September 2019). We posited that there should be no differences in risk of clinical outcomes associated with receipt of influenza vaccine in odd- versus even-numbered months. The negative-exposure controls were tested in all 821 high-dimensional outcomes considered in our analyses, including diagnoses, medications and laboratory test results; we used the same data sources, cohort-building algorithm, variable definitions, analytical approaches and outcome specification, as well as a similar length of follow-up and interpretation method. In the assessment of negative-outcome and negative-exposure controls, the relation of the exposure–outcome pairs may share the same potential biases with COVID-19 and the outcomes examined in this study (including biases in the underlying data, algorithms for the construction of cohorts, unmeasured confounders, misspecification of modelling algorithms, outcome ascertainment, analytical considerations, result interpretation and other latent biases)16,17.
The successful testing of negative controls reduces concerns about both suspected and unsuspected sources of spurious associations, including associations owing to unmeasured confounding, flaws in the analytical approach, differences in outcome ascertainment and other sources of bias16. In particular, the successful testing of the outcome controls may reduce concerns about biases in outcome ascertainment and unmeasured confounding between the comparison groups (for example, if there was bias in ascertainment of clinical outcomes in one arm versus another, this bias may also extend to ascertainment of neoplasms, accidental injuries or other negative-outcome controls tested in this study); the successful testing of the exposure control may reduce concerns about biases in the analytical approach and underlying data (for example, if there was bias related to the analytic approach, it may also bias the negative-exposure control).
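The negative-exposure control defined above splits vaccine recipients by the parity of the calendar month of vaccination. A minimal sketch of that assignment follows; the function name and the choice to return `None` for dates outside the stated window are assumptions for illustration.

```python
from datetime import date

# Window stated in the text for the negative-exposure controls.
WINDOW_START = date(2017, 10, 1)
WINDOW_END = date(2019, 9, 30)

def negative_exposure_group(vaccination_date):
    """Return 'odd' or 'even' by calendar month of influenza vaccination,
    or None if the date falls outside the window."""
    if not (WINDOW_START <= vaccination_date <= WINDOW_END):
        return None
    return "odd" if vaccination_date.month % 2 == 1 else "even"

november = negative_exposure_group(date(2018, 11, 5))   # month 11
october = negative_exposure_group(date(2018, 10, 5))    # month 10
outside = negative_exposure_group(date(2020, 1, 1))     # after the window
```

Because month parity should be unrelated to later clinical outcomes, a non-null association between these two groups would point to bias in the data or the analytic pipeline rather than to a real effect.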

Diagnoses

All ICD-10 diagnosis codes from cohort participants from day 30 after COVID-19 diagnosis until the end of follow-up were used to define the post-acute diagnosis outcomes. More than 70,000 ICD-10 diagnosis codes were classified into 540 diagnostic categories based on the Clinical Classifications Software Refined (CCSR) version 2021.1, which is developed as part of the Healthcare Cost and Utilization Project sponsored by the Agency for Healthcare Research and Quality18,19,20. We examined only diagnostic categories that may plausibly be considered post-acute sequelae of COVID-19 in the adult population. Some diagnostic categories—including external causes of morbidity, injury, poisoning and some other consequences of external causes, congenital malformations, deformations and chromosomal abnormalities, some conditions originating in the perinatal period or outcome from pregnancy, childbirth and the puerperium—were not examined, yielding 379 diagnostic categories.

Medication use

The prescription records of cohort participants from day 30 after COVID-19 diagnosis until the end of follow-up were used to define the post-acute medication use. We classified 3,425 medications on the basis of the VA drug classification system, into 543 medication classes21,22. After removing items in the medication group of investigational agents or prosthetics, supplies and devices, we examined 380 different medication outcomes in total.

Laboratory abnormalities

In total, 62 laboratory test abnormalities from 38 laboratory measurements were examined from day 30 after COVID-19 diagnosis until the end of follow-up. The measurements were absolute T cell count, alanine aminotransferase, aspartate aminotransferase, blood urea nitrogen, brain natriuretic peptide, C-reactive protein, carbon dioxide, CD4/CD8 ratio, direct bilirubin, estimated glomerular filtration rate, ferritin, haematocrit, haemoglobin, haemoglobin A1c, high-density-lipoprotein cholesterol, high-sensitivity C-reactive protein, international normalized ratio, low-density-lipoprotein cholesterol, microalbumin/creatinine ratio, partial thromboplastin time, platelet count, pro B natriuretic peptide, prothrombin time, serum albumin, serum alkaline phosphatase, serum calcium, serum chloride, serum creatinine, serum phosphate, serum potassium, serum sodium, serum total protein, total bilirubin, total cholesterol, total white blood cell count, triglycerides, troponin I and troponin T; they were identified on the basis of ‘Logical Observation Identifiers Names and Codes’. Each laboratory test result was classified into abnormally high or abnormally low on the basis of whether results were above the upper normal range or below the lower normal range (in instances in which a high or low result might be clinically possible for a given laboratory test). The definition of the abnormality for each laboratory test is presented in Supplementary Tables 4, 9.
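The high/low classification described above can be sketched as a comparison against a reference range, with one side of the range omitted when that direction is not examined for a given test. The function name and the reference limits below are placeholders for illustration; the definitions actually used are those in Supplementary Tables 4, 9.

```python
def classify_lab_result(value, lower=None, upper=None):
    """Classify a result as 'high', 'low' or 'normal' against a reference range.
    Pass None for a bound when that direction is not examined for the test."""
    if upper is not None and value > upper:
        return "high"
    if lower is not None and value < lower:
        return "low"
    return "normal"

# Placeholder limits, not the study's definitions.
potassium_low = classify_lab_result(3.1, lower=3.5, upper=5.0)
potassium_ok = classify_lab_result(4.2, lower=3.5, upper=5.0)
# A test for which only an abnormally high result is examined.
marker_high = classify_lab_result(0.9, upper=0.04)
marker_ok = classify_lab_result(0.01, upper=0.04)
```

Treating each direction as a separate outcome is consistent with 38 measurements yielding 62 abnormalities rather than 76, since not every test has both a high and a low outcome.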

High-resolution, prespecified post-acute COVID-19 outcomes

To identify clinical manifestations of post-acute COVID-19 with greater diagnostic resolution, we specified a list of outcomes on the basis of data from the Centers for Disease Control and Prevention and the National Institutes of Health workshop on post-acute COVID-19. Outcomes were defined on the basis of previous definitions that have been validated for use with electronic health records, and integrated information from diagnoses, medications and laboratory measurements when appropriate23,24,25,26,27,28,29. To gain a deeper understanding of the risks of these outcomes across the severity scale of the acute infection, we examined the risk across the care setting of the acute disease, a proxy indicator of clinical severity, in four mutually exclusive cohorts (VHA users (who served as the referent category); people with COVID-19; people hospitalized for COVID-19; and people admitted to intensive care for COVID-19). In addition, we estimated the risks of these prespecified outcomes in individuals hospitalized with COVID-19 and seasonal influenza. The prespecified, high-resolution outcomes included acute coronary disease, acute kidney injury, anxiety, arrhythmias, bradycardia, chest pain, chronic kidney disease, constipation, cough, depression, diarrhoea, type 2 diabetes mellitus, fatigue, gastro-oesophageal reflux disease, hair loss, headache, heart failure, hyperlipidaemia, hypoxaemia, joint pain, memory problems, muscle weakness, obesity, shortness of breath, skin rash, sleep disorder, smell disorder, stroke, tachycardia and thromboembolism. We restricted capture of incident acute coronary disease, stroke and thromboembolism to inpatient diagnoses that were not present on admission. All other prespecified outcomes that may plausibly be encountered in either the outpatient or inpatient setting were accordingly ascertained in the setting in which they first occurred.
Among individuals with COVID-19, and for each prespecified outcome, the percentages of outcomes that were ascertained from outpatient and inpatient data are presented in Supplementary Tables 19, 20.

Covariates

The predefined covariates for analyses included demographics (such as age, race (white, black and other), sex and receipt of long-term care) and proxies of healthcare use (such as number of outpatient encounters, number of hospital admissions, number of outpatient prescriptions and number of outpatient eGFR measurements in the year before enrolment). In addition, we included the area deprivation index at the residential address of patients as a summary measurement of socio-economic deprivation. We used the Sequential Organ Failure Assessment (SOFA) score to adjust for the severity of the acute infection in additional high-dimensional analyses of the hospitalized COVID-19 versus hospitalized seasonal influenza cohorts30,31. To address potential nonlinear associations, all continuous variables were adjusted as restricted cubic spline functions.

To further adjust the models, we leveraged the multidimensionality of the electronic healthcare databases of the VA to algorithmically identify covariates (potential confounders) that span multiple domains (diagnoses, pharmacy records and laboratory tests) and that showed evidence of difference in prevalence between the comparison groups24. In the COVID-19 versus VHA users cohort (and separately in the hospitalized COVID-19 versus influenza cohort), high-dimensional covariates were ascertained within one year before the date of enrolment. Within all diagnoses, medication classes and laboratory tests, we first selected variables that occurred in at least 10 patients in both groups. We then estimated the unadjusted relative risk associating each variable with membership of the COVID-19 or comparator group. The top 100 high-dimensional variables with the strongest association with group membership were used, along with predefined covariates, in the analyses.
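The selection steps above (a minimum count in both groups, an unadjusted relative risk, then the top-ranked variables) can be sketched as follows. The ranking criterion used here, the absolute log relative risk, is an assumption about what "strongest association" means; the variable names and counts are hypothetical.

```python
import math

def select_high_dimensional_covariates(counts, n_exposed, n_comparator,
                                       min_count=10, top_k=100):
    """counts maps variable name -> (patients with the variable in the exposed
    group, patients with the variable in the comparator group)."""
    ranked = []
    for name, (in_exposed, in_comparator) in counts.items():
        # Keep only variables seen in at least min_count patients in both groups.
        if in_exposed < min_count or in_comparator < min_count:
            continue
        relative_risk = (in_exposed / n_exposed) / (in_comparator / n_comparator)
        # Assumed strength measure: distance of the relative risk from 1 on the log scale.
        ranked.append((abs(math.log(relative_risk)), name))
    ranked.sort(reverse=True)
    return [name for _, name in ranked[:top_k]]

# Hypothetical baseline variables from the three data domains.
example_counts = {
    "dx_A": (50, 100),   # relative risk 5.0
    "rx_B": (5, 400),    # dropped: fewer than 10 patients in the exposed group
    "lab_C": (30, 300),  # relative risk about 1.0
    "dx_D": (20, 400),   # relative risk 0.5
}
selected = select_high_dimensional_covariates(
    example_counts, n_exposed=1000, n_comparator=10000, top_k=2
)
```

Ranking on the log scale treats a relative risk of 0.5 and of 2.0 as equally strong, which fits the stated aim of capturing differences in prevalence in either direction.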

To best estimate the risk of the set of prespecified outcomes across the intensity of care needed during the acute infection, we ascertained four sets of high-dimensional covariates in total, one for each of the four mutually exclusive groups (all VHA users, people with COVID-19, people who were hospitalized with COVID-19 and people who were admitted to intensive care with COVID-19), on the basis of the unadjusted relative risk of being in each group compared to being in the remaining three groups. High-dimensional covariates were used along with predefined covariates in the analyses32.

Statistical analyses

The characteristics of the VHA users who were not hospitalized for COVID-19, VHA users who were without COVID-19, hospitalized participants with COVID-19 and hospitalized participants with seasonal influenza are described in Extended Data Table 1a. The flow charts of the overall analytical approach are presented in Supplementary Figs. 7, 8.

We estimated the risk of health resource use and death, and the risk of each diagnosis, medication use and laboratory abnormality between individuals with COVID-19 and all VHA users, and, separately, between individuals who had been hospitalized for COVID-19 or seasonal influenza. To estimate the risk of each incident outcome, we built a cohort of participants without a history of the outcome being examined (for example, risk of insulin use was estimated within a cohort of participants without history of insulin use in the year before cohort enrolment). For each outcome-specific cohort, propensity scores based on predefined variables and high-dimensional algorithmically selected variables were estimated. The propensity scores were then used to compute the overlap weight, which is the probability of membership in the non-observed exposure group (one minus the propensity of being in the observed group)33,34. We then assessed covariate balance for all outcome models, calculating the standardized difference after application of the overlap weight for all predefined variables, 100 algorithmically selected high-dimensional variables, and all high-dimensional variables that were not selected for inclusion in the propensity score models. We present the distribution of these standardized differences for 20 randomly selected outcome-specific cohorts, and across all outcomes, and the covariate distributions in the overall cohort after adjustment.
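The overlap weight defined above, and a weighted standardized difference of the kind used to check balance, can be sketched as below. The standardized difference shown (difference in weighted means over the square root of the average of the two weighted variances) is one common definition and is assumed here; the function names and the toy data are hypothetical.

```python
import math

def overlap_weight(propensity, exposed):
    """Probability of membership in the group that was NOT observed:
    1 - e for exposed participants, e for comparators."""
    return 1.0 - propensity if exposed else propensity

def _weighted_mean_var(values, weights):
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var

def standardized_difference(values, exposed_flags, weights):
    """Weighted standardized difference of one covariate between two groups."""
    exp = [(v, w) for v, f, w in zip(values, exposed_flags, weights) if f]
    cmp_ = [(v, w) for v, f, w in zip(values, exposed_flags, weights) if not f]
    m1, v1 = _weighted_mean_var(*zip(*exp))
    m0, v0 = _weighted_mean_var(*zip(*cmp_))
    return (m1 - m0) / math.sqrt((v1 + v0) / 2.0)

# Toy example: an exposed participant with propensity 0.8 gets weight 0.2.
w_exposed = overlap_weight(0.8, exposed=True)
w_comparator = overlap_weight(0.8, exposed=False)

# Unweighted check on four hypothetical covariate values.
smd = standardized_difference(
    values=[1.0, 2.0, 3.0, 4.0],
    exposed_flags=[True, True, False, False],
    weights=[1.0, 1.0, 1.0, 1.0],
)
```

Participants whose group membership was nearly certain from their covariates receive small weights, so the weighted comparison concentrates on people who could plausibly have been in either group.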

The risks of health resource use (including outpatient encounter) and death, between individuals with COVID-19 and all VHA users and between COVID-19 hospitalization and influenza hospitalization, were estimated from a Cox survival model weighted by overlap weights, in which death was considered as a competing risk in the evaluation of health resource use. The frequency of outpatient encounter was modelled on the basis of a weighted linear regression. Hazard ratios for each of the outcomes (including incident diagnoses, incident medication use and incident laboratory abnormalities) were estimated from cause-specific hazard models weighted by overlap weights, in which occurrence of death was considered as a competing risk. Event rates per 1,000 participants at 6 months (180 days) of follow-up in each group, and the adjusted excess burden based on the differences between two groups, were estimated. Models were built only for outcomes that occurred in at least 10 participants from each group. A Bonferroni correction was applied in consideration of multiple hypothesis testing for high-dimensional outcomes. A P value of less than 6.57 × 10⁻⁵ was considered statistically significant. Results are additionally presented with a focus on identified post-acute sequelae of COVID-19, in which we selected those sequelae with a hazard ratio greater than 1 and a P value of less than 6.57 × 10⁻⁵. High-dimensional analyses of individuals who were hospitalized for COVID-19 versus seasonal influenza, which were adjusted for the severity of the acute infection (through inclusion of SOFA scores), were additionally undertaken. In addition, high-dimensional analyses were also conducted to evaluate the risk of six-month clinical outcomes in people who were hospitalized for COVID-19 versus those who were hospitalized for other causes.
Participants who were hospitalized for other causes who survived the first 30 days after hospital admission were enrolled between 1 October 2016 and 29 February 2020 (n = 901,516).
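Two of the quantities in the preceding paragraph are simple arithmetic and can be illustrated directly: a Bonferroni threshold and an excess burden per 1,000 participants. The section does not state the family-wise significance level or the number of tests; the values 0.05 and 761 below are assumptions chosen only because they reproduce the reported threshold, and the event rates are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def excess_burden_per_1000(rate_exposed, rate_comparator):
    """Difference in 6-month event rates between two groups, per 1,000 participants.
    Rates are given as proportions (for example 0.062 = 62 per 1,000)."""
    return (rate_exposed - rate_comparator) * 1000.0

# Assumed alpha and test count; 0.05 / 761 is approximately 6.57e-5.
threshold = bonferroni_threshold(alpha=0.05, n_tests=761)

# Hypothetical adjusted event rates at 180 days.
excess = excess_burden_per_1000(rate_exposed=0.062, rate_comparator=0.041)
```

A positive excess burden means more events per 1,000 people in the exposed group than in the comparator group at 6 months; a negative value means fewer.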

We examined the risk of high-resolution, prespecified outcomes across care settings of the acute phase of the disease, analysing differences in risk of clinical manifestations of post-acute COVID-19 between mutually exclusive groups of people who were positive for COVID-19 (non-hospitalized, hospitalized and admitted to intensive care), and VHA users who were not positive for COVID-19. Propensity scores for group membership were estimated in outcome-specific cohorts free of the related disease at baseline32. Standardized differences in the predefined and algorithmically selected high-dimensional covariates are presented after application of overlap weighting35. The percentage of outcomes ascertained in the COVID-19 group in an inpatient and outpatient setting are presented. We then constructed Cox survival models to analyse the risk of outcomes using overlap weighting for multiple treatments. We report hazard ratios and event rate differences between each group. We also estimated the risks of prespecified outcomes among individuals who were hospitalized with COVID-19 or seasonal influenza, which were additionally adjusted using SOFA scores.
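For more than two groups, overlap weighting can be generalized so that each participant is weighted by the harmonic-type tilting function h(x) = 1 / Σ_k (1 / e_k(x)) divided by the propensity of the group they were observed in. The sketch below follows that formulation as an assumption about the multiple-treatment weighting cited above35; the group labels and propensities are hypothetical, and this is not taken from the study code.

```python
def generalized_overlap_weight(propensities, observed_group):
    """propensities maps each group label to the estimated probability of
    belonging to that group (values sum to 1); returns the weight for a
    participant observed in observed_group."""
    tilt = 1.0 / sum(1.0 / p for p in propensities.values())
    return tilt / propensities[observed_group]

# With two groups this reduces to the usual overlap weight (1 - e for the exposed).
two_group = generalized_overlap_weight(
    {"exposed": 0.8, "comparator": 0.2}, "exposed"
)

# Hypothetical participant equally likely to be in any of four care-setting groups.
four_group = generalized_overlap_weight(
    {"no_covid": 0.25, "covid": 0.25, "hospitalized": 0.25, "icu": 0.25},
    "icu",
)
```

As in the two-group case, participants who could plausibly belong to any of the groups receive the largest weights, and those whose membership is nearly certain contribute little.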

All analyses were done using SAS Enterprise Guide version 7.1. Data visualizations were performed in R 4.0.3. The study was approved by the Institutional Review Board of the Department of Veterans Affairs St. Louis Health Care System.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The data that support the findings of this study are available from the VA. VA data are made freely available to researchers behind the VA firewall with an approved VA study protocol. More information is available at https://www.virec.research.va.gov or by contacting the VA Information Resource Center (VIReC) at VIReC@va.gov.

Code availability

SAS and R programming codes are available at https://github.com/yxie618/HDlongCOVID.

References

1. Wang, D. et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. J. Am. Med. Assoc. 323, 1061–1069 (2020).
2. Xie, Y., Bowe, B., Maddukuri, G. & Al-Aly, Z. Comparative evaluation of clinical manifestations and risk of death in patients admitted to hospital with Covid-19 and seasonal influenza: cohort study. Br. Med. J. 371, m4677 (2020).
3. Advisory Group of the British Society for Immunology. Long-term Immunological Health Consequences of COVID-19 (British Society for Immunology, 2020).
4. Figueroa, J. D. et al. Distinguishing between direct and indirect consequences of Covid-19. Br. Med. J. 369, m2377 (2020).
5. Townsend, E. COVID-19 policies in the UK and consequences for mental health. Lancet Psychiatry 7, 1014–1015 (2020).
6. Knipe, D., Evans, H., Marchant, A., Gunnell, D. & John, A. Mapping population mental health concerns related to COVID-19 and the consequences of physical distancing: a Google trends analysis. Wellcome Open Res. 5, 82 (2020).
7. Raker, E. J., Zacher, M. & Lowe, S. R. Lessons from Hurricane Katrina for predicting the indirect health consequences of the COVID-19 pandemic. Proc. Natl Acad. Sci. USA 117, 12595–12597 (2020).
8. Mahase, E. Covid-19: mental health consequences of pandemic need urgent research, paper advises. Br. Med. J. 369, m1515 (2020).
9. Del Rio, C., Collins, L. F. & Malani, P. Long-term health consequences of COVID-19. J. Am. Med. Assoc. (2020).
10. Xie, Y. et al. Proton pump inhibitors and risk of incident CKD and progression to ESRD. J. Am. Soc. Nephrol. 27, 3153–3163 (2016).
11. Xie, Y. et al. Risk of death among users of proton pump inhibitors: a longitudinal observational cohort study of United States veterans. BMJ Open 7, e015735 (2017).
12. Xie, Y. et al. Long-term kidney outcomes among users of proton pump inhibitors without intervening acute kidney injury. Kidney Int. 91, 1482–1494 (2017).
13. Xie, Y. et al. Higher blood urea nitrogen is associated with increased risk of incident diabetes mellitus. Kidney Int. 93, 741–752 (2018).
14. Vincent, B. M., Wiitala, W. L., Burns, J. A., Iwashyna, T. J. & Prescott, H. C. Using Veterans Affairs corporate data warehouse to identify 30-day hospital readmissions. Health Serv. Outcomes Res. Methodol. 18, 143–154 (2018).
15. Kind, A. J. H. & Buckingham, W. R. Making neighborhood-disadvantage metrics accessible – the neighborhood atlas. N. Engl. J. Med. 378, 2456–2458 (2018).
16. Lipsitch, M., Tchetgen Tchetgen, E. & Cohen, T. Negative controls: a tool for detecting confounding and bias in observational studies. Epidemiology 21, 383–388 (2010).
17. Shi, X., Miao, W. & Tchetgen, E. T. A selective review of negative control methods in epidemiology. Curr. Epidemiol. Rep. 7, 190–202 (2020).
18. Wei, Y. et al. Short term exposure to fine particulate matter and hospital admission risks and costs in the Medicare population: time stratified, case crossover study. Br. Med. J. 367, l6258 (2019).
19. Aubert, C. E. et al. Best definitions of multimorbidity to identify patients with high health care resource utilization. Mayo Clin. Proc. Innov. Qual. Outcomes 4, 40–49 (2020).
20. HCUP CCSR. Healthcare Cost and Utilization Project (HCUP) (Agency for Healthcare Research and Quality, 2020).
21. Olvey, E. L., Clauschee, S. & Malone, D. C. Comparison of critical drug–drug interaction listings: the Department of Veterans Affairs medical system and standard reference compendia. Clin. Pharmacol. Ther. 87, 48–51 (2010).
22. Greene, M., Steinman, M. A., McNicholl, I. R. & Valcour, V. Polypharmacy, drug–drug interactions, and potentially inappropriate medications in older adults with human immunodeficiency virus infection. J. Am. Geriatr. Soc. 62, 447–453 (2014).
23. Xie, Y. et al. Estimates of all cause mortality and cause specific mortality associated with proton pump inhibitors among US veterans: cohort study. Br. Med. J. 365, l1580 (2019).
24. Xie, Y. et al. Comparative effectiveness of SGLT2 inhibitors, GLP-1 receptor agonists, DPP-4 inhibitors, and sulfonylureas on risk of kidney outcomes: emulation of a target trial using health care databases. Diabetes Care 43, 2859–2869 (2020).
25. Xie, Y. et al. Comparative effectiveness of the sodium-glucose cotransporter 2 inhibitor empagliflozin versus other antihyperglycemics on risk of major adverse kidney events. Diabetes Care 43, 2785–2795 (2020).
26. Bowe, B. et al. Acute kidney injury in a national cohort of hospitalized US veterans with COVID-19. Clin. J. Am. Soc. Nephrol. 16, 14–25 (2021).
27. Bowe, B. et al. The 2016 global and national burden of diabetes mellitus attributable to PM2·5 air pollution. Lancet Planet. Health 2, e301–e312 (2018).
28. Bowe, B., Xie, Y., Xian, H., Balasubramanian, S. & Al-Aly, Z. Low levels of high-density lipoprotein cholesterol increase the risk of incident kidney disease and its progression. Kidney Int. 89, 886–896 (2016).
29. Bowe, B. et al. High density lipoprotein cholesterol and the risk of all-cause mortality among U.S. veterans. Clin. J. Am. Soc. Nephrol. 11, 1784–1793 (2016).
30. Vincent, J. L. et al. The SOFA (sepsis-related organ failure assessment) score to describe organ dysfunction/failure. on behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. 22, 707–710 (1996).
31. Vincent, J. L. et al. Use of the SOFA score to assess the incidence of organ dysfunction/failure in intensive care units: results of a multicenter, prospective study. Crit. Care Med. 26, 1793–1800 (1998).
32. McCaffrey, D. F. et al. A tutorial on propensity score estimation for multiple treatments using generalized boosted models. Stat. Med. 32, 3388–3414 (2013).
33. Thomas, L. E., Li, F. & Pencina, M. J. Overlap weighting: a propensity score method that mimics attributes of a randomized clinical trial. J. Am. Med. Assoc. 323, 2417–2418 (2020).
34. Li, F., Morgan, K. L. & Zaslavsky, A. M. Balancing covariates via propensity score weighting. J. Am. Stat. Assoc. 113, 390–400 (2018).
35. Li, F. & Li, F. Propensity score weighting for causal inference with multiple treatments. Ann. Appl. Stat. 13, 2389–2415 (2019).

Acknowledgements

This study used data from the VA COVID-19 Shared Data Resource. M. Cai developed the data visualization and A. K. Gibson provided technical and editorial assistance. This research was funded by the United States Department of Veterans Affairs and the Institute for Public Health at Washington University in Saint Louis (for Z.A.-A.), and two American Society of Nephrology and KidneyCure fellowship awards (for Y.X. and B.B.). The funders of this study had no role in study design; collection, analysis and interpretation of data; writing the report; and the decision to submit the report for publication. This research project was reviewed and approved by the Institutional Review Board of the Department of Veterans Affairs Saint Louis Health Care System.

Author information

Contributions

Z.A.-A., Y.X. and B.B. contributed to the development of the study concept and design. Y.X. and B.B. contributed to data acquisition. Z.A.-A., Y.X. and B.B. contributed to data analysis and interpretation. Y.X. and B.B. contributed to statistical analysis. Z.A.-A. and Y.X. drafted the manuscript. Z.A.-A., Y.X. and B.B. contributed to critical revision of the manuscript. Z.A.-A. provided administrative, technical and material support. Z.A.-A. provided supervision and mentorship. Each author contributed important intellectual content during manuscript drafting or revision, and accepts accountability for the overall work by ensuring that questions pertaining to the accuracy or integrity of any portion of the work are appropriately investigated and resolved. All authors approved the final version of the report. The corresponding author attests that all the listed authors meet the authorship criteria and that no others meeting the criteria have been omitted.

Corresponding author

Correspondence to Ziyad Al-Aly.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Paul Garner, Sachin Yende and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Risk of incident post-acute sequelae in COVID-19.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). All VHA users served as the referent category. Outcomes were ascertained from day 30 after COVID-19 diagnosis until the end of follow-up. Adjusted hazard ratios for incident sequelae that are larger than 1 and have a P value of less than 6.57 × 10⁻⁵ are presented. Hazard ratios (dots) and 95% confidence intervals (bars) are presented on a log10 scale.

Extended Data Fig. 2 High-dimensional identification of the incident post-acute sequelae in people who were hospitalized for COVID-19.

a–f, Incident diagnoses (a, d), incident medication use (b, e) and incident laboratory abnormalities (c, f). Individuals who were hospitalized with seasonal influenza served as the referent category. Post-acute sequelae were ascertained from 30 days after infection until the end of follow-up. a–c, Beginning from the outside ring, the first ring represents hazard ratios for the post-acute sequelae of COVID-19. A higher bar indicates a larger hazard ratio. Hazard ratios with a point estimate larger than one and that are statistically significant are coloured in yellow. The second ring represents excess burden per 1,000 patients with COVID-19 at 6 months. The colour of the cell indicates the value of the excess burden; deeper shades of red indicate higher excess burden and deeper shades of blue indicate greater reduced burden. The third ring represents the baseline incident rate in the control group; deeper shades of red indicate higher incident rate. The fourth ring represents negative log of the P value; a higher bar indicates a smaller P value and yellow indicates statistical significance. d–f, Sequelae were selected based on having a hazard ratio larger than one and a P value of less than 6.57 × 10⁻⁵. Excess burdens per 1,000 patients with COVID-19 at 6 months are presented. Within each domain, outcomes are ranked based on excess burden from high to low. Diagnoses are coloured based on diagnosis group, medications are coloured based on medication class, and laboratory abnormalities are coloured based on higher or lower than normal range.

Extended Data Fig. 3 Risk of incident post-acute sequelae in people who were hospitalized for COVID-19.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). People who were hospitalized with seasonal influenza served as the referent category. Outcomes were ascertained from day 30 after hospital admission until the end of follow-up. Adjusted hazard ratios for incident sequelae that are larger than 1 and have a P value of less than 6.57 × 10⁻⁵ are presented. Hazard ratios (dots) and 95% confidence intervals (bars) are presented on a log10 scale.

Extended Data Fig. 4 High-dimensional identification of the incident post-acute sequelae in people who were hospitalized for COVID-19 after additionally adjusting for severity of the acute infection.

a–f, Incident diagnoses (a, d), incident medication use (b, e) and incident laboratory abnormalities (c, f). Individuals who were hospitalized with seasonal influenza served as the referent category. Post-acute sequelae were ascertained from 30 days after infection until the end of follow-up. a–c, Beginning from the outside ring, the first ring represents hazard ratios for the post-acute sequelae of COVID-19. A higher bar indicates a larger hazard ratio. Hazard ratios with a point estimate larger than 1 and that are statistically significant are coloured in yellow. The second ring represents excess burden per 1,000 patients with COVID-19 at 6 months. The colour of the cell indicates the value of the excess burden; deeper shades of red indicate higher excess burden and deeper shades of blue indicate greater reduced burden. The third ring represents the baseline incident rate in the control group; deeper shades of red indicate higher incident rate. The fourth ring represents negative log of the P value; a higher bar indicates a smaller P value and a yellow bar indicates statistical significance. d–f, Sequelae were selected based on a hazard ratio larger than 1 and a P value of less than 6.57 × 10⁻⁵. Excess burdens per 1,000 patients with COVID-19 at 6 months are presented. Within each domain, outcomes are ranked based on excess burden from high to low. Diagnoses are coloured based on diagnosis group, medications are coloured based on medication class, and laboratory abnormalities are coloured based on higher or lower than normal range.

Extended Data Fig. 5 Risk of incident post-acute sequelae in people who were hospitalized for COVID-19 after additionally adjusting for severity of the acute infection.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). People who had been hospitalized with seasonal influenza served as the referent category. Outcomes were ascertained from day 30 after hospital admission until the end of follow-up. Adjusted hazard ratios for incident sequelae that are larger than 1 and have a P value of less than 6.57 × 10⁻⁵ are presented. Hazard ratios (dots) and 95% confidence intervals (bars) are presented on a log10 scale.

Extended Data Fig. 6 High-dimensional identification of the incident post-acute sequelae of people who were hospitalized for COVID-19.

a–f, Incident diagnoses (a, d), incident medication use (b, e) and incident laboratory abnormalities (c, f). Individuals who were hospitalized for other causes served as the referent category. Post-acute sequelae were ascertained from 30 days after infection until the end of follow-up. a–c, Beginning from the outside ring, the first ring represents hazard ratios for the post-acute sequelae of COVID-19. A higher bar indicates a larger hazard ratio. Statistically significant hazard ratios with a point estimate larger than 1 are coloured yellow. The second ring represents excess burden per 1,000 patients with COVID-19 at 6 months. The colour of the cell indicates the value of the excess burden; deeper shades of red indicate higher excess burden and deeper shades of blue indicate greater reduced burden. The third ring represents the baseline incidence rate in the control group; deeper shades of red indicate a higher incidence rate. The fourth ring represents the negative log of the P value; a higher bar indicates a smaller P value and a yellow bar indicates statistical significance. d–f, Sequelae were selected on the basis of a hazard ratio larger than 1 and a P value less than 6.57 × 10⁻⁵. Excess burdens per 1,000 patients with COVID-19 at 6 months are presented. Within each domain, outcomes are ranked based on excess burden from high to low. Diagnoses are coloured based on diagnosis group, medications are coloured based on medication class, and laboratory abnormalities are coloured based on whether they are higher or lower than the normal range.

Extended Data Fig. 7 Risk of incident post-acute sequelae in people with COVID-19 who were hospitalized for COVID-19.

a–c, Incident diagnoses (a), incident medication use (b) and incident laboratory abnormalities (c). People who had been hospitalized for other causes served as the referent category. Outcomes were ascertained from day 30 after hospital admission until the end of follow-up. Incident sequelae with an adjusted hazard ratio larger than 1 and a P value less than 6.57 × 10⁻⁵ are presented. Hazard ratios (dots) and 95% confidence intervals (bars) are presented on a log10 scale.

Extended Data Fig. 8 Risks and burdens of incident prespecified high-resolution post-acute COVID-19 outcomes at 6 months in hospitalized people with COVID-19 versus seasonal influenza.

Hospitalized people with seasonal influenza served as the referent category. Outcomes were ascertained from day 30 after hospital admission until the end of follow-up. Hazard ratios and excess burdens per 1,000 patients at 6 months are presented, each with 95% confidence intervals.

Extended Data Table 1 Characteristics of study cohorts
Extended Data Table 2 Results of negative controls, and evidence of high risk of death and health resource use

Supplementary information

Supplementary Figures

This file contains Supplementary Figures 1–8.

Reporting Summary

Supplementary Table 1

Balance of baseline variables after overlap weighting in COVID-19 vs. all VHA users.

Supplementary Table 2

Identification of the post-acute incident diagnoses in COVID-19 vs. all VHA users.

Supplementary Table 3

Identification of the post-acute incident medication use in COVID-19 vs. all VHA users.

Supplementary Table 4

Identification of the post-acute incident laboratory abnormalities in COVID-19 vs. all VHA users.

Supplementary Table 5

Burden of post-acute sequelae of COVID-19. Excess burdens were estimated vs. a comparator group of all users of the Veterans Health Administration.

Supplementary Table 6

Balance of baseline variables after overlap weighting in hospitalized COVID-19 vs. seasonal influenza.

Supplementary Table 7

Identification of the post-acute incident diagnoses in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza.

Supplementary Table 8

Identification of the post-acute incident medication use in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza.

Supplementary Table 9

Identification of the post-acute incident laboratory abnormalities in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza.

Supplementary Table 10

Burden of post-acute sequelae of COVID-19 which required hospitalization during the acute infection. Excess burdens were estimated vs. a comparator group of people with seasonal influenza which required hospitalization during the acute infection.

Supplementary Table 11

Identification of the post-acute incident diagnoses in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza after additionally adjusting for severity of the acute infection.

Supplementary Table 12

Identification of the post-acute incident medication use in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza after additionally adjusting for severity of the acute infection.

Supplementary Table 13

Identification of the post-acute incident laboratory abnormalities in people who had been hospitalized with COVID-19 vs. those who had been hospitalized with seasonal influenza after additionally adjusting for severity of the acute infection.

Supplementary Table 14

Burden of post-acute sequelae of COVID-19 which required hospitalization during the acute infection after additionally adjusting for severity of the acute infection. Excess burdens were estimated vs. a comparator group of people with seasonal influenza which required hospitalization during the acute infection.

Supplementary Table 15

Identification of the post-acute incident diagnoses in people who had been hospitalized with COVID-19 vs. those who had been hospitalized for other causes.

Supplementary Table 16

Identification of the post-acute incident medication use in people who had been hospitalized with COVID-19 vs. those who had been hospitalized for other causes.

Supplementary Table 17

Identification of the post-acute incident laboratory abnormalities in people who had been hospitalized with COVID-19 vs. those who had been hospitalized for other causes.

Supplementary Table 18

Burden of post-acute sequelae of COVID-19 which required hospitalization during the acute infection. Excess burdens were estimated vs. a comparator group of people who required hospitalization for other causes. Estimates are provided per 1,000 persons at 6 months.

Supplementary Table 19

Risks of incident prespecified high-resolution post-acute COVID-19 outcomes at 6 months in all users of the Veterans Health Administration healthcare system (referent category), people with COVID-19, people hospitalized for COVID-19, and people admitted to intensive care for COVID-19.

Supplementary Table 20

Pairwise comparison of risks of incident prespecified high-resolution post-acute COVID-19 outcomes at 6 months in all users of the Veterans Health Administration healthcare system, people with COVID-19, people hospitalized for COVID-19, and people admitted to intensive care for COVID-19.

Supplementary Table 21

Risks of incident prespecified high-resolution post-acute COVID-19 outcomes at 6 months in hospitalized people with COVID-19 vs. seasonal influenza (the referent category). Outcomes were ascertained from day 30 after COVID-19 diagnosis until the end of follow-up.

Supplementary Table 22

Measures of the association between negative exposure control (exposure to influenza vaccine in even vs. odd months) and the risks of incident diagnoses.

Supplementary Table 23

Measures of the association between negative exposure control (exposure to influenza vaccine in even vs. odd months) and the risks of incident medication use.

Supplementary Table 24

Measures of the association between negative exposure control (exposure to influenza vaccine in even vs. odd months) and the risks of incident laboratory abnormalities.

About this article

Cite this article

Al-Aly, Z., Xie, Y. & Bowe, B. High-dimensional characterization of post-acute sequelae of COVID-19. Nature 594, 259–264 (2021). https://doi.org/10.1038/s41586-021-03553-9
