Viral and host factors related to the clinical outcome of COVID-19

Zhang, Xiaonan; Tan, Yun; Ling, Yun; Lu, Gang; Liu, Feng; Yi, Zhigang; Jia, Xiaofang; Wu, Min; Shi, Bisheng; Xu, Shuibao; Chen, Jun; Wang, Wei; Chen, Bing; Jiang, Lu; Yu, Shuting; Lu, Jing; Wang, Jinzeng; Xu, Mingzhu; Yuan, Zhenghong; Zhang, Qin; Zhang, Xinxin; Zhao, Guoping; Wang, Shengyue; Chen, Saijuan; Lu, Hongzhou

doi:10.1038/s41586-020-2355-0

Download PDF

Article
Published: 20 May 2020

Viral and host factors related to the clinical outcome of COVID-19

Xiaonan Zhang ORCID: orcid.org/0000-0002-1286-7280¹^na1,
Yun Tan ORCID: orcid.org/0000-0001-8450-0392²^na1,
Yun Ling¹^na1,
Gang Lu²^na1,
Feng Liu²^na1,
Zhigang Yi^1,3^na1,
Xiaofang Jia¹,
Min Wu¹,
Bisheng Shi¹,
Shuibao Xu¹,
Jun Chen¹,
Wei Wang¹,
Bing Chen²,
Lu Jiang²,
Shuting Yu²,
Jing Lu²,
Jinzeng Wang²,
Mingzhu Xu¹,
Zhenghong Yuan³,
Qin Zhang⁴,
Xinxin Zhang⁵,
Guoping Zhao⁶,
Shengyue Wang ORCID: orcid.org/0000-0001-9589-7719²,
Saijuan Chen ORCID: orcid.org/0000-0003-3789-1284² &
…
Hongzhou Lu ORCID: orcid.org/0000-0002-8308-5534¹

Nature volume 583, pages 437–440 (2020)Cite this article

119k Accesses
595 Citations
922 Altmetric
Metrics details

Subjects

Abstract

In December 2019, coronavirus disease 2019 (COVID-19), which is caused by the new coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified in Wuhan (Hubei province, China)¹; it soon spread across the world. In this ongoing pandemic, public health concerns and the urgent need for effective therapeutic measures require a deep understanding of the epidemiology, transmissibility and pathogenesis of COVID-19. Here we analysed clinical, molecular and immunological data from 326 patients with confirmed SARS-CoV-2 infection in Shanghai. The genomic sequences of SARS-CoV-2, assembled from 112 high-quality samples together with sequences in the Global Initiative on Sharing All Influenza Data (GISAID) dataset, showed a stable evolution and suggested that there were two major lineages with differential exposure history during the early phase of the outbreak in Wuhan. Nevertheless, they exhibited similar virulence and clinical outcomes. Lymphocytopenia, especially reduced CD4⁺ and CD8⁺ T cell counts upon hospital admission, was predictive of disease progression. High levels of interleukin (IL)-6 and IL-8 during treatment were observed in patients with severe or critical disease and correlated with decreased lymphocyte count. The determinants of disease severity seemed to stem mostly from host factors such as age and lymphocytopenia (and its associated cytokine storm), whereas viral genetic variation did not significantly affect outcomes.

The next phase of SARS-CoV-2 surveillance: real-time molecular epidemiology

Article 09 September 2021

SARS-CoV-2 genomic variations associated with mortality rate of COVID-19

Article Open access 22 July 2020

Epidemiology and analysis of SARS-CoV-2 Omicron subvariants BA.1 and 2 in Taiwan

Article Open access 03 October 2023

Main

The COVID-19 outbreak was first identified in Wuhan and appeared to be linked to Huanan Seafood Wholesale Market (HSWM). The causal agent, SARS-CoV-2^1,2, is closely related to a bat coronavirus (RaTG13)², although its receptor binding domain is more similar to that of pangolin coronaviruses³. Currently, several questions remain regarding the origin, evolution and host interactions of SARS-CoV-2. First, although HSWM has been widely proposed to be the original outbreak site of SARS-CoV-2, a significant number of the initial cases did not have contact with this market⁴. This casts doubt on the idea of a singular event of zoonotic spillover to humans in the initial outbreak. Second, additional data are required to discern whether the virulence of SARS-CoV-2 has altered as a result of genomic sequence evolution during the spread of the disease. Third, although SARS-CoV-2 infection can cause life-threatening respiratory disease, most cases manifest only mild pneumonia⁵. The factors associated with disease outcome have yet to be fully characterized. We have systematically analysed key immunological parameters spanning the course of infection in patients, obtained viral genomes directly from clinical samples, and delineated factors associated with prognosis and epidemiological features.

Overview of enrolment

The basic clinical and epidemiological features of the cohort (326 patients in Shanghai between 20 January and 25 February 2020) are summarized in Extended Data Table 1. Four categories of infected case were defined. Five individuals were asymptomatic; that is, they had no obvious fever, respiratory symptoms or radiological manifestations. Most patients (293) had mild disease with fever and radiological manifestations of pneumonia. Twelve patients who had symptoms of dyspnoea and signs of expanding ground-glass opacity in the lung within 24–48 h of admission were defined as severe cases. The remaining 16 patients deteriorated into acute respiratory distress syndrome and required mechanical ventilation or extracorporeal membrane oxygenation; these patients were categorized as critical (Extended Data Table 1). As of 1 April 2020, 315 (96.63%) of the patients had been discharged, and 6 (1.84%) had died.

Nucleotide variation in viral genomes

Sequencing data from 112 samples (sputum or oropharyngeal swab) passed quality control and were used for nucleotide variation calling (Extended Data Fig. 1). Compared to the first-released genome (Wuhan-Hu-1), we identified 66 synonymous and 103 nonsynonymous variants in 9 protein-coding regions (Extended Data Fig. 2a, b). Substitution rates in most genes (ORF1ab, S, ORF3a, E, M and ORF7a) were similar (around 3.5 × 10⁻⁴ per site per year), whereas variation rates in ORF8 (9.51 × 10⁻⁴ per site per year) and N (1.05 × 10⁻³ per site per year) were higher (Extended Data Fig. 2a, b). The recurrence of variations in the viral genome is similar between samples from Shanghai and the GISAID dataset (Extended Data Fig. 2c).

Genomic phylogeny analysis

We next used the viral genomes from 94 patients (which were more than 90% complete) together with 221 sequences of SARS-CoV-2 from the GISAID database for phylogeny analysis. Two major clades were identified (Fig. 1, Extended Data Fig. 3a, b), both of which included cases diagnosed in early December 2019^1,2. Clade I included several subclades, such as those characterized by ORF3a: p.251G>V (subclade V), or S: p.614D>G (subclade G). Clade II is distinguished from clade I by two linked variations—ORF8: p.84L>S (28144T>C) and ORF1ab: p.2839S (8782C>T) (Fig. 1, Extended Data Fig. 3a). The both major clades and their subclades were found in the Shanghai cohort, suggesting that there were multiple origins of transmission into Shanghai. We did not observe significant expansion of clades or subclades in Shanghai.

**Fig. 1: Phylogenetic analysis of the assembled SARS-CoV-2 genomes.**

Additionally, the viral genomes from six patients with a clear history of contact with HSWM^1,2, the suspected initial outbreak site, were all clustered into clade I, whereas those from three patients diagnosed at the same time without a history of contact with HSWM^6,7 were clustered into clade II (Fig. 1). We analysed the sequences around nucleotides 8,782 and 28,144 of SARS-CoV-2 in samples from patients with or without a history of contact with HSWM and in the bat coronavirus Bat-SARS-CoV-RaTG13. Virus genomes found in patients without contact with HSWM were identical to Bat-SARS-CoV-RaTG13 at these two sites (Extended Data Fig. 3c).

We compared the clinical manifestations of patients infected with viruses of either clade I or clade II. We found no statistical difference in disease severity (P = 0.88), lymphocyte count (P = 0.79), CD3 T cell count (P = 0.21), C-reactive protein level (P = 0.83) or D-dimer level (P = 0.19), or in the duration of virus shedding after onset (P = 0.79) (Extended Data Table 2). Thus, these two clades of virus exhibited similar pathogenic effects despite their genome sequence variations. Likewise, we found no significant association between disease severity and the 13 most frequent genetic variations (synonymous and non-synonymous) (Extended Data Fig. 4).

Host factors associated with disease severity

A notable feature of our cohort was that some infected individuals (five cases; 1.53%) did not develop obvious symptoms even though substantial virus shedding could be detected. As shown in Extended Data Fig. 5a, an asymptomatic patient showed no obvious lesions in the lungs upon admission or five days later. By contrast, unilateral and bilateral opacity lesions were observed in patients with mild (Extended Data Fig. 5b) or critical COVID-19, and the latter deteriorated quickly over just two days (Extended Data Fig. 5c).

We further analysed the immunological and biochemical parameters of the patients (Extended Data Table 3). A prominent feature of COVID-19 was progressive lymphocytopenia, particularly in patients categorized as severe or critical (initial test result after admission, P = 6 × 10⁻⁶). Detailed analysis of lymphocyte subtypes revealed that CD3⁺ T cells were most significantly affected (P < 10⁻⁶), with CD4⁺ and CD8⁺ T cells sharing similar trends (CD4⁺ T cell, P < 10⁻⁶; CD8⁺ T cell, P = 1 × 10⁻⁵). Notably, the changes in T lymphocytes were statistically significant not only in critical cases but also in the other three categories (asymptomatic, mild and severe; CD3⁺ T cells, P = 0.013; CD8⁺ T cells, P = 0.004). By contrast, for CD19⁺ B cells, although a significant decline was found in critical cases (P = 1 × 10⁻⁵), patients in the other categories showed no obvious changes (P = 0.47). We further examined the longitudinal cell counting data for each group. It was clear that the CD3⁺ T lymphocytes exhibited a gradual decline (P < 0.05 on dayd 7, 8, 11, 14–18, 22–25, 28 and 29 after onset, Kruskal–Wallis test) as the disease deteriorated (Fig. 2a), a trend that was also seen in CD4⁺ and CD8⁺ T cells (Fig. 2b, c). However, it was not found for natural killer (NK) (CD16⁺CD56⁺) or B (CD19⁺) cells (Fig. 2d, e).

**Fig. 2: Lymphocyte numbers in patients during hospitalization.**

We next compared the clinical parameters grouped by comorbidity and found a significantly higher risk for disease progression when the disease was complicated by co-existing conditions (P = 0.01) (Extended Data Table 4), although the median age of the comorbidity group was also higher (P = 0.02). Indeed, univariate logistic regression analysis indicated that age (P < 0.0001), lymphocyte counts upon admission (P < 0.0001), comorbidities (P = 0.01) and gender (P = 0.014) (higher risk for male) were the main factors associated with disease severity (Extended Data Table 5). Multivariate analysis showed that age (P = 0.002) and lymphocytopenia (P = 0.002) were two major independent factors, whereas comorbidities did not reach statistical significance.

The levels of eleven cytokines (IFN-α, IFN-γ, IL-1β, IL-2, IL-4, IL-5, IL-6, IL-8, IL-10, IL-12 and IL-17) in serum were measured upon admission and during treatment. Among them, IL-6 (P < 10⁻⁶) and IL-8 (P = 1 × 10⁻⁵) (Extended Data Table 3) showed the most significant changes. Notably, the levels of these two cytokines were inversely correlated with lymphocyte count (Fig. 3a, b, Extended Data Table 5). Furthermore, we combined the longitudinal cytokine data of each group and plotted their fluctuation patterns against the time point from onset. We aggregated the highest IL-6 data from each patient from day 6 to day 10 after onset and compared patients classed as critical with those classed as non-critical. Patients categorized as critical showed significantly higher levels of IL-6 (P = 0.001, two-sided Mann–Whitney U test) (Fig. 3c). There was a similarly significant difference in IL-8 level when data were aggregated from day 16 to day 20 after onset (P = 0.006) (Fig. 3d). These data suggest that there is a strong link between inflammatory cytokines and the pathogenesis of SARS-CoV-2 infection.

**Fig. 3: Correlation between inflammatory cytokines and lymphocyte counts.**

Discussion

Our analysis of some recently treated patients provides further evidence that the viral genome of SARS-CoV-2 is largely stable. Consistent with recently published results⁸, we found that the observed small sequence variations divided the viral genomes into two major clades. We noted that six sequences recovered from patients with a history of contact with HSWM all fell into clade I, whereas three genomes from patients diagnosed in the same period but without exposure to HSWM were clustered into clade II. Thus, these two major haplotypes are likely to represent two lineages derived from a common ancestor that evolved independently in early December 2019 in Wuhan, only one of which (clade I) was spawned within the HSWM, where a high density of stalls, vendors and customers might have facilitated human-to-human transmission. Consistent with this idea, epidemiological investigations of the earliest cases found in Wuhan before 18 December 2019 identified two patients that were linked to HSWM and five that were not⁴. Our time-resolved phylogeny analysis suggests that the earliest zoonotic spillover event might have occurred in late November 2019, which is in agreement with a previous analysis⁸.

Nevertheless, we found no significant differences in clinical features, mutation rate or transmissibility between patients infected with clade I or II virus. Our data are in agreement with a lack of selection against either clade, as suggested⁹, but is at odds with a previous conclusion, the L/S-type classification of which was based on the same two linked polymorphisms¹⁰. The presumed difference in transmissibility might be due to sampling bias, as the early uploaded sequences in the GISAID database were recovered from a limited number of critically ill patients and duplicate assemblies from the same patients were not uncommon^1,2,11.

A recent analysis of 1,099 cases of COVID-19 in China found lymphocytopenia to be one of the most common features in laboratory tests⁵. Here, we have confirmed this observation and further shown that CD3⁺ T cells were the major cell type that was suppressed in infected patients, whereas CD19⁺ B cells and CD16⁺CD56⁺ NK cells exhibited less suppression. Indeed, lymphopenia and, in particular, reduced CD4⁺/CD8⁺ cell counts, are also a major manifestation of SARS-CoV infection¹². Furthermore, our longitudinal monitoring of major cytokines indicated that IL-6 and IL-8 were negatively correlated with lymphocyte count and that IL-6 kinetics was highly related to disease severity. At present, the relationships between virological activity, cytokine release and lymphocytopenia remain unclear. We hypothesize that the immunopathological response against SARS-CoV-2, involving a cytokine storm and loss of CD3⁺ T lymphocytes, could constitute—at least in part—an underlying mechanism for disease progression and fatality. The macrophages in the lung could serve as the first driver of the cytokine storm in the early phase of COVID-19 pneumonia¹³, and subsequent lymphocyte infiltration mobilized by the cytokines, as observed in infected patients^14,15 and Rhesus macaques¹⁶, may explain the lymphocytopenia, although probable cytokine-induced T cell depletion cannot be ruled out.

In conclusion, by closely monitoring the molecular and immunological data in 326 patients with COVID-19, we find evidence that adverse outcome is associated with depletion of CD3⁺ T lymphocytes, which is tightly linked to bursts of cytokines such as IL-6 and IL-8. Targeted sequencing of 94 individuals who were infected during late January to February indicated limited variation in the viral genome, which suggests stable evolution. Two major lineages of the virus derived from one common ancestor may have originated independently from Wuhan in December 2019 and contributed to the current pandemic, although we find no major difference in clinical manifestation or transmissibility between them. Our data provide further evidence for the respective roles played by viral and host factors in disease mechanism and underscore the importance of early intervention in therapy.

Methods

Ethics statement

This study was approved by the Shanghai Public Health Clinical Center Ethics Committee (no. YJ-2020-S015-01). Informed consent was obtained from all enrolled patients.

Patients

This study involved 326 patients, who had tested positive for SARS-CoV-2 RNA and were admitted to the Shanghai Public Health Clinical Center (the designated hospital receiving all COVID-19 cases in Shanghai) between 20 January and 25 February 2020. In addition to routine clinical tests, measurement of serum cytokines was performed on 228 patients. Their basic demographic, epidemiological and clinical characteristics are shown in Extended Data Table 1. The median age of the patients was 51 years (range 15–88) with a male:female sex ratio of 1.10:1. Among these 326 patients, 125 (38.34%) had at least one comorbidity; the most common were hypertension (76 patients), diabetes (24), coronary heart disease (13), chronic hepatitis B (10), chronic obstructive pulmonary disease (2), chronic renal disease (2) and cancer (3). Disease severity was categorized into four stages—asymptomatic, mild, severe and critical—according to the guidelines on the Diagnosis and Treatment of COVID-19 issued by the National Health Commission, China¹⁷. In brief, asymptomatic disease was defined as normal body temperature, lack of respiratory symptoms and no pulmonary radiological manifestation; mild disease as having fever, respiratory symptoms and radiological evidence of pneumonia; severe disease as meeting one of the following manifestations: respiratory rate >30/min, oxygen saturation levels (S_pO₂) <93%, arterial partial pressure of oxygen (P_aO₂)/fraction of inspired oxygen (F_iO₂)(P_aO₂/F_iO₂ ratio) ≤ 300 mm Hg or pulmonary imaging with multi-lobular lesions or lesion progression exceeding 50% within 48 h; and critical disease as one of the following: acute respiratory distress syndrome requiring mechanical ventilation, shock, or complications with other organ failure.

Nucleic acid extraction, molecular screening and genome sequencing

Swabs and sputum samples were collected for nucleic acid extraction using an automatic magnetic extraction device and accompanying kit (Shanghai Bio-Germ) and screened using a semiquantitative RT–PCR kit (Shanghai Bio-Germ) with amplification targeting the ORF1a/b and N genes. Deep sequencing was done using the nucleic acid extracted from patients confirmed as having COVID-19 by RT–PCR in Shanghai Public Health Clinical Center. We used a multiplexed amplicon strategy as described¹⁸ and the primers were synthesized as described (https://github.com/artic-network/artic-ncov2019/blob/master/primer_schemes/nCoV-2019/V1/nCoV-2019.tsv). The primers were split into 10 subpools each containing 9–10 pairs for specific amplification of 400-bp viral sequence using the remaining cDNA from the diagnostic test. The PCR amplicons were purified using AMPure DNA cleanup steps. The amplicon libraries were generated using a NanoPrep for Illumina kit (IDT) according to the manufacturer’s instructions. In brief, the procedures included end-repair, 3′ end adenylation, adaptor ligation and PCR amplification, followed by assessing DNA library quality. Amplicon sequencing was performed with established Illumina protocols on MiSeq platform (Illumina) according to a 2 × 300-bp protocol in the National Research Center for Translational Medicine (Shanghai).

Viral genomic sequence variation calling

All clean reads were mapped to the SARS-CoV-2 genome (Wuhan-Hu-1, GenBank accession number MN908947) using BWA (version 0.7.17)¹⁹. Variations were called with mpileup tools in samtools²⁰. Low-quality variations with depth lower than 10 and Qual score lower than 50 were filtered using bcftools (version 1.9).

Phylogenetic analysis

Sequencing reads were trimmed using Trimmomatic (version 0.39)²¹ to remove low-quality regions, adaptor sequences and sequencing primers. Clean reads were used to build virus genome assemblies with VirGenA (version 1.4)²². A post-assembly procedure was manually performed to remove low-quality content and potential sequencing artefacts. Ninety-four assemblies with coverage above 90% qualified for phylogeny analysis. MAFFT (version 7.453)²³ made the multi-sequence alignment after trimming off Ns on both ends of the genome sequences. The computation and visualization platform used for the phylogeny analysis was Nextstrain (version 1.15.0)²⁴. The module we selected for phylogenetic tree building was IQ-TREE (version 1.6.12)²⁵. Automatic substitution model selection was performed and the TIM+F+I model was selected to build the maximum likelihood phylogeny tree based on Bayesian information criteria (BIC) score. TreeTime (version 0.7.3)²⁶ was used for time-resolved phylogeny analysis. The resulting phylogeny tree was visualized using auspice from the Nextstrain package. All bioinformatics analyses were performed using the ASTRA supercomputing platform (Sugon) with Optane memory technology in the National Research Center for Translational Medicine (Shanghai).

Cytokine quantification and lymphocyte subset counting

A Becton Dickinson (BD) cytometric bead array (human Th1/Th2/Th17 cytokine kit and Human Inflammatory Cytokine Kit) was used quantify serum cytokines (IFNα, IFNγ, IL-1β, IL-2, IL-4, IL-5, IL-6, IL-8, IL-10, IL-12 and IL-17). CD3⁺ T, CD4⁺ T, CD8⁺ T, CD19⁺ B, and CD16⁺CD56⁺ NK cells were stained using BD Multitest 6-colour TBNK reagent in Trucount tubes and analysed using the BD FACSCanto II flow cytometer. The longitudinal plots of cytokines and cell count data were visualized using the geom_smooth tool in the ggplot2 R package.

Statistical analysis

Two sided Mann–Whitney U tests and Kruskal–Wallis tests were used to compare two and more than two groups of variables, respectively. χ² and Fisher’s exact test were used for analysing contingency tables. Spearman’s rank correlation test was used to evaluate correlations. No statistical methods were used to predetermine sample size. Investigators were not blinded to patient group during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The 94 genome sequences with over 90% coverage were deposited in GISAID (https://www.gisaid.org/) (accessions EPI_ISL_416316–EPI_ISL_416409) and the phylogeny result is accessible at http://ncov.linc.org.cn. The amplicon sequencing reads for variant calling have been deposited with NCBI Bioproject (PRJNA627662) and NODE (http://www.biosino.org/node/project/detail/OEP000877).

References

Zhu, N. et al. A novel coronavirus from patients with pneumonia in China, 2019. N. Engl. J. Med. 382, 727–733 (2020).
Article CAS PubMed PubMed Central Google Scholar
Zhou, P. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270–273 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Lam, T. T. et al. Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins. Nature (2020).
Li, Q. et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N. Engl. J. Med. 382, 1199–1207 (2020).
Article CAS PubMed PubMed Central Google Scholar
Guan, W. J. et al. Clinical characteristics of coronavirus disease 2019 in China. N. Engl. J. Med. 382, 1708–1720 (2020).
Article CAS PubMed Google Scholar
Chan, J. F. et al. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. Lancet 395, 514–523 (2020).
Article CAS PubMed PubMed Central Google Scholar
Lu, R. et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet 395, 565–574 (2020).
Article CAS PubMed PubMed Central Google Scholar
Andersen, K. G., Rambaut, A., Lipkin, W. I., Holmes, E. C. & Garry, R. F. The proximal origin of SARS-CoV-2. Nat. Med. 26, 450–452 (2020).
Article CAS PubMed Google Scholar
MacLean, O. A., Orton, R., Singer, J. B. & Robertson, D. L. Response to “On the origin and continuing evolution of SARS-CoV-2”, http://virological.org/t/response-to-on-the-origin-and-continuing-evolution-of-sars-cov-2/418 (2020).
Tang, X. et al. On the origin and continuing evolution of SARS-CoV-2. Natl Sci. Rev. https://doi.org/10.1093/nsr/nwaa036 (2020).
Ren, L. L. et al. Identification of a novel coronavirus causing severe pneumonia in human: a descriptive study. Chin. Med. J. (Engl.) 133, 1015–1024 (2020).
Article Google Scholar
Wong, R. S. et al. Haematological manifestations in patients with severe acute respiratory syndrome: retrospective analysis. Br. Med. J. 326, 1358–1362 (2003).
Article Google Scholar
Tian, S. et al. Pulmonary pathology of early-phase 2019 novel coronavirus (COVID-19) pneumonia in two patients with lung cancer. J. Thorac. Oncol. 15, 700–704 (2020).
Article CAS PubMed PubMed Central Google Scholar
Xu, Z. et al. Pathological findings of COVID-19 associated with acute respiratory distress syndrome. Lancet Respir. Med. 8, 420–422 (2020).
Article CAS PubMed PubMed Central Google Scholar
Wang, C. et al. Alveolar macrophage activation and cytokine storm in the pathogenesis of severe COVID-19. Preprint at https://doi.org/10.21203/rs.3.rs-19346/v1 (2020).
Shan, C. et al. Infection with novel coronavirus (SARS-CoV-2) causes pneumonia in the Rhesus macaques. Preprint at https://doi.org/10.21203/rs.2.25200/v1 (2020).
National Health Commission of the People’s Republic of China Diagnosis and Treatment Protocol for COVID-19 (Trial Version 7) http://en.nhc.gov.cn/2020-03/29/c_78469.htm (2020).
Quick, J. et al. Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples. Nat. Protocols 12, 1261–1276 (2017).
Article CAS PubMed Google Scholar
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Article CAS PubMed PubMed Central Google Scholar
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
PubMed PubMed Central Google Scholar
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Article CAS PubMed PubMed Central Google Scholar
Fedonin, G. G., Fantin, Y. S., Favorov, A. V., Shipulin, G. A. & Neverov, A. D. VirGenA: a reference-based assembler for variable viral genomes. Brief. Bioinform. 20, 15–25 (2019).
Article CAS PubMed Google Scholar
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Article CAS PubMed PubMed Central Google Scholar
Hadfield, J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123 (2018).
Article CAS PubMed PubMed Central Google Scholar
Nguyen, L. T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
Article CAS PubMed Google Scholar
Sagulenko, P., Puller, V. & Neher, R. A. TreeTime: maximum-likelihood phylodynamic analysis. Virus Evol. 4, vex042 (2018).
Article PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This work was supported by National Megaprojects of China for Infectious Diseases (2017ZX10103009-001, 2018ZX10305409-001-005); Double First-Class Project of Fudan University (no. IDF162005); Scientific Research for Special Subjects on 2019 Novel Coronavirus of Shanghai Public Health Clinical Center (no. 2020YJKY01); Double First-Class Project (WF510162602) from the Ministry of Education, State Key Laboratory of Medical Genomics; Overseas Expertise Introduction Project for Discipline Innovation (111 Project, B17029); National Key R&D Program of China (2019YFA0905902); and the Shanghai Guangci Translational Medical Research Development Foundation. We acknowledge all healthcare personnel involved in the diagnosis and treatment of patients in Shanghai Public Health Clinical Center and we thank Z. Chen for guidance regarding study design, interpretation of results and critical revision of the manuscript. We also thank all researchers who shared SARS-CoV-2 genome sequences in GISAID. We thank the late Y. Hu of Shanghai Public Health Clinical Center for her lifelong commitment to infectious disease research.

Author information

These authors contributed equally: Xiaonan Zhang, Yun Tan, Yun Ling, Gang Lu, Feng Liu, Zhigang Yi

Authors and Affiliations

Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
Xiaonan Zhang, Yun Ling, Zhigang Yi, Xiaofang Jia, Min Wu, Bisheng Shi, Shuibao Xu, Jun Chen, Wei Wang, Mingzhu Xu & Hongzhou Lu
National Research Center for Translational Medicine, Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, Ruijin Hospital Affiliated to Shanghai Jiao Tong University (SJTU) School of Medicine, Shanghai, China
Yun Tan, Gang Lu, Feng Liu, Bing Chen, Lu Jiang, Shuting Yu, Jing Lu, Jinzeng Wang, Shengyue Wang & Saijuan Chen
Key Laboratory of Medical Molecular Virology, Shanghai Medical College, Fudan University, Shanghai, China
Zhigang Yi & Zhenghong Yuan
Tong Ren Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Qin Zhang
Research Laboratory of Clinical Virology, Ruijin Hospital Affiliated to Shanghai Jiao Tong University (SJTU) School of Medicine, Shanghai, China
Xinxin Zhang
Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
Guoping Zhao

Authors

Xiaonan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Yun Tan
View author publications
You can also search for this author in PubMed Google Scholar
Yun Ling
View author publications
You can also search for this author in PubMed Google Scholar
Gang Lu
View author publications
You can also search for this author in PubMed Google Scholar
Feng Liu
View author publications
You can also search for this author in PubMed Google Scholar
Zhigang Yi
View author publications
You can also search for this author in PubMed Google Scholar
Xiaofang Jia
View author publications
You can also search for this author in PubMed Google Scholar
Min Wu
View author publications
You can also search for this author in PubMed Google Scholar
Bisheng Shi
View author publications
You can also search for this author in PubMed Google Scholar
Shuibao Xu
View author publications
You can also search for this author in PubMed Google Scholar
Jun Chen
View author publications
You can also search for this author in PubMed Google Scholar
Wei Wang
View author publications
You can also search for this author in PubMed Google Scholar
Bing Chen
View author publications
You can also search for this author in PubMed Google Scholar
Lu Jiang
View author publications
You can also search for this author in PubMed Google Scholar
Shuting Yu
View author publications
You can also search for this author in PubMed Google Scholar
Jing Lu
View author publications
You can also search for this author in PubMed Google Scholar
Jinzeng Wang
View author publications
You can also search for this author in PubMed Google Scholar
Mingzhu Xu
View author publications
You can also search for this author in PubMed Google Scholar
Zhenghong Yuan
View author publications
You can also search for this author in PubMed Google Scholar
Qin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Xinxin Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Guoping Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Shengyue Wang
View author publications
You can also search for this author in PubMed Google Scholar
Saijuan Chen
View author publications
You can also search for this author in PubMed Google Scholar
Hongzhou Lu
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

S.C., H.L., Xiaonan Zhang, S.W. and Z. Yuan conceived the study. Xiaonan Zhang, Y.L., Z. Yi, X.J., M.W., B.S., S.X., J.C., Q.Z. and W.W. collected patient samples and epidemiological and clinical data. Xiaonan Zhang, X.J. and M.W. performed viral RNA isolation and PCR. S.Y., J.L., L.J., G.L. and J.W. performed sequencing and sequence assembly. Xiaonan Zhang, Y.T., F.L., G.L., B.S., S.X., J.C., B.C., M.X., S.W. and S.C. carried out data acquisition, analysis and interpretation. Xiaonan Zhang, Y.T., F.L. and G.L. drafted the manuscript. S.C., S.W., Xinxin Zhang and G.Z. revised the final manuscript.

Corresponding authors

Correspondence to Shengyue Wang, Saijuan Chen or Hongzhou Lu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Luke O’Neill and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 High-throughput targeted sequencing process.

a, Design of experiment and analysis pipeline. A total of 121 samples, including sputum and oropharyngeal swab samples, from patients with COVID-19 were used for viral RNA extraction and sequencing. b, Comparison of the number of patients with COVID-19 used for sequencing with total cases in the Shanghai cohort between 20 January and 25 February 2020. c, Coverage of the SARS-CoV-2 genome per sample in four bins (>99%, 98–99%, 90–98%, and <90%). Ninety-four samples have coverage of over 90%. d, Numbers of individuals with severe or critical and mild or asymptomatic COVID-19. Blue bar, cases included in this study; red bar, cases used for variant calling or phylogeny analysis.

Extended Data Fig. 2 Characteristics of single nucleotide variations in 112 Shanghai SARS-CoV-2 samples.

a, Single nucleotide variations between the SARS-CoV-2 reference genome (Wuhan-Hu-1) and genome sequences in the Shanghai cohort, shown by vertical colour bars. Orange, A; blue, G; green, C; purple, T. The top panel shows the open reading frame of each gene. b, Summary of 169 variations in nine open reading frames. Variation counts and the ratio of synonymous to nonsynonymous mutations are plotted. Red represents synonymous; orange represents nonsynonymous. c, Frequencies of variations in Shanghai cohort and published GISAID dataset.

Extended Data Fig. 3 Phylogenetic analysis of the assembled SARS-CoV-2 genomes.

a, A total of 94 SARS-CoV-2 genome sequences and 221 published sequences (as in Fig. 1a) were used for construction of a time-resolved rectangular phylogeny tree. Clade I and clade II are marked, and variations that distinguish branches of the phylogeny tree are indicated. Each tip circle represents a single sample. The position of each sample along the x-axis corresponds to the sample collection date. Case locations are marked by colours (see key). GISAID accession identifiers are displayed alongside each tip circle. Cases with or without a history of contact with HSWM are highlighted. b, Box plot (centre, median; box, interquartile range (IQR); whiskers, 1.5× IQR) of pairwise similarity scores within clade I (n = 22,791), clade II (n = 5,050) and between clades I and II (interclade I–II, n = 21,614). Two-sided unpaired t-test. c, Alignment of sequences around nucleotides 8,782 and 28,144 using Bat-SARS-CoV-RaTG13 sequence and SARS-CoV-2 sequences recovered from patients with or without known history of exposure to HSWM.

Extended Data Fig. 4 SARS-CoV-2 variation in severe or critical and mild cases of COVID-19.

Variations of SARS-CoV-2 from patients with differing severity of COVID-19 are plotted. Blue, nonsynonymous mutations; grey, synonymous mutations. Red bar, severe or critical COVID-19; purple bar, mild COVID-19. Two-sided Fisher’s exact test; 95% confidence intervals and odds ratios are shown. P values are shown on the right.

Extended Data Fig. 5 Computed tomographic scans of three typical patients.

a, Asymptomatic; b, mild; c, critical. Days after admission to hospital are shown above scans. Computed tomography scans were performed once for each patient on a specific date during treatment. Images are representative of 5 asymptomatic, 7 mild and 13 critical cases.

Extended Data Table 1 Clinical features of enrolled patients and subgroups

Full size table

Extended Data Table 2 Clinical features of patients infected with different clades of SARS-CoV-2

Full size table

Extended Data Table 3 Immunological and biochemical parameters associated with disease severity

Full size table

Extended Data Table 4 Clinical features of patients with and without comorbidities

Full size table

Extended Data Table 5 Univariate and multivariate logistic regression analyses of factors associated with disease severity

Full size table

Supplementary information

Reporting Summary

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zhang, X., Tan, Y., Ling, Y. et al. Viral and host factors related to the clinical outcome of COVID-19. Nature 583, 437–440 (2020). https://doi.org/10.1038/s41586-020-2355-0

Download citation

Received: 14 March 2020
Accepted: 13 May 2020
Published: 20 May 2020
Issue Date: 16 July 2020
DOI: https://doi.org/10.1038/s41586-020-2355-0

This article is cited by

Aldosterone levels do not predict 28-day mortality in patients treated for COVID-19 in the intensive care unit
- Jarosław Janc
- Jędrzej Jerzy Janc
- Patrycja Leśnik
Scientific Reports (2024)
ACE2-dependent and -independent SARS-CoV-2 entries dictate viral replication and inflammatory response during infection
- Tianhao Duan
- Changsheng Xing
- Rong-Fu Wang
Nature Cell Biology (2024)
Emergence of SARS and COVID-19 and preparedness for the next emerging disease X
- Ben Hu
- Hua Guo
- Zhengli Shi
Frontiers of Medicine (2024)
Diving into the proteomic atlas of SARS-CoV-2 infected cells
- Victor C. Carregari
- Guilherme Reis-de-Oliveira
- Daniel Martins-de-Souza
Scientific Reports (2024)
Predictive markers related to local and systemic inflammation in severe COVID-19-associated ARDS: a prospective single-center analysis
- Jan Nikolaus Lieberum
- Sandra Kaiser
- Nils Schallner
BMC Infectious Diseases (2023)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Main

Overview of enrolment

Nucleotide variation in viral genomes

Genomic phylogeny analysis

Host factors associated with disease severity

Discussion

Methods

Ethics statement

Patients

Nucleic acid extraction, molecular screening and genome sequencing

Viral genomic sequence variation calling

Phylogenetic analysis

Cytokine quantification and lymphocyte subset counting

Statistical analysis

Reporting summary

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Competing interests

Additional information

Extended data figures and tables

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links