This is an unedited manuscript that has been accepted for publication. Nature Research are providing this early version of the manuscript as a service to our customers. The manuscript will undergo copyediting, typesetting and a proof review before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.

A new coronavirus associated with human respiratory disease in China

Abstract

Emerging infectious diseases, such as severe acute respiratory syndrome (SARS) and Zika virus disease, present a major threat to public health (refs 1–3). Despite intense research efforts, how, when and where new diseases appear are still the source of considerable uncertainty. A severe respiratory disease was recently reported in Wuhan, Hubei province, China. As of 25 January 2020, at least 1,975 cases had been reported since the first patient was hospitalized on 12 December 2019. Epidemiological investigations have suggested that the outbreak was associated with a seafood market in Wuhan. Here we study a single patient who was a worker at the market and who was admitted to Wuhan Central Hospital on 26 December 2019 while experiencing a severe respiratory syndrome that included fever, dizziness and a cough. Metagenomic RNA sequencing (ref. 4) of a sample of bronchoalveolar lavage fluid (BALF) from the patient identified a new RNA virus strain from the family Coronaviridae, which is designated here ‘WH-Human 1’ coronavirus (WHCV; it has also been referred to as ‘2019-nCoV’). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that the virus was most closely related (89.1% nucleotide similarity) to a group of SARS-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) that had previously been found in bats in China (ref. 5). This outbreak highlights the ongoing ability of viral spill-over from animals to cause severe disease in humans.

Data availability

Sequence reads generated in this study are available from the NCBI Sequence Read Archive (SRA) database under BioProject accession number PRJNA603194. The complete genome sequence of WHCV has been deposited in GenBank under accession number MN908947.

References

1. Drosten, C. et al. Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N. Engl. J. Med. 348, 1967–1976 (2003).

2. Wolfe, N. D., Dunavan, C. P. & Diamond, J. Origins of major human infectious diseases. Nature 447, 279–283 (2007).

3. Ventura, C. V., Maia, M., Bravo-Filho, V., Góis, A. L. & Belfort, R. Jr. Zika virus in Brazil and macular atrophy in a child with microcephaly. Lancet 387, 228 (2016).

4. Shi, M. et al. Redefining the invertebrate RNA virosphere. Nature 540, 539–543 (2016).

5. Hu, D. et al. Genomic characterization and infectivity of a novel SARS-like coronavirus in Chinese bat. Emerg. Microbes Infect. 7, 1–10 (2018).

6. Shi, M. et al. The evolutionary history of vertebrate RNA viruses. Nature 556, 197–202 (2018).

7. Yadav, P. D. et al. Nipah virus sequences from humans and bats during Nipah outbreak, Kerala, India, 2018. Emerg. Infect. Dis. 25, 1003–1006 (2019).

8. McMullan, L. K. et al. Characterisation of infectious Ebola virus from the ongoing outbreak to guide response activities in the Democratic Republic of the Congo: a phylogenetic and in vitro analysis. Lancet Infect. Dis. 19, 1023–1032 (2019).

9. Li, D., Liu, C. M., Luo, R., Sadakane, K. & Lam, T. W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).

10. Wang, W. et al. Discovery, diversity and evolution of novel coronaviruses sampled from rodents in China. Virology 474, 19–27 (2015).

11. Hu, B. et al. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus. PLoS Pathog. 13, e1006698 (2017).

12. Lin, X.-D. et al. Extensive diversity of coronaviruses in bats from China. Virology 507, 1–10 (2017).

13. Xu, L. et al. Detection and characterization of diverse alpha- and betacoronaviruses from bats in China. Virol. Sin. 31, 69–77 (2016).

14. Ren, W. et al. Difference in receptor usage between severe acute respiratory syndrome (SARS) coronavirus and SARS-like coronavirus of bat origin. J. Virol. 82, 1899–1907 (2008).

15. Li, F., Li, W., Farzan, M. & Harrison, S. C. Structure of SARS coronavirus spike receptor-binding domain complexed with receptor. Science 309, 1864–1868 (2005).

16. Hulswit, R. J. G. et al. Human coronaviruses OC43 and HKU1 bind to 9-O-acetylated sialic acids via a conserved receptor-binding site in spike protein domain A. Proc. Natl Acad. Sci. USA 116, 2681–2690 (2019).

17. Ge, X. Y. et al. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503, 535–538 (2013).

18. Yang, X. L. et al. Isolation and characterization of a novel bat coronavirus closely related to the direct progenitor of severe acute respiratory syndrome coronavirus. J. Virol. 90, 3253–3256 (2016).

19. Martin, D. P. et al. RDP3: a flexible and fast computer program for analyzing recombination. Bioinformatics 26, 2462–2463 (2010).

20. Menachery, V. D. et al. A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence. Nat. Med. 21, 1508–1513 (2015).

21. Bermingham, A. et al. Severe respiratory illness caused by a novel coronavirus, in a patient transferred to the United Kingdom from the Middle East, September 2012. Euro Surveill. 17, 20290 (2012).

22. Hamre, D. & Procknow, J. J. A new virus isolated from the human respiratory tract. Proc. Soc. Exp. Biol. Med. 121, 190–193 (1966).

23. McIntosh, K., Becker, W. B. & Chanock, R. M. Growth in suckling-mouse brain of “IBV-like” viruses from patients with upper respiratory tract disease. Proc. Natl Acad. Sci. USA 58, 2268–2273 (1967).

24. van der Hoek, L. et al. Identification of a new human coronavirus. Nat. Med. 10, 368–373 (2004).

25. Woo, P. C. et al. Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia. J. Virol. 79, 884–895 (2005).

26. Li, W. et al. Bats are natural reservoirs of SARS-like coronaviruses. Science 310, 676–679 (2005).

27. Lau, S. K. et al. Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats. Proc. Natl Acad. Sci. USA 102, 14040–14045 (2005).

28. Wang, W. et al. Discovery of a highly divergent coronavirus in the Asian house shrew from China illuminates the origin of the Alphacoronaviruses. J. Virol. 91, e00764-17 (2017).

29. Zhou, P. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature https://doi.org/10.1038/s41586-020-2012-7 (2020).

30. Gorbalenya, A. E. Severe acute respiratory syndrome-related coronavirus — the species and its viruses, a statement of the Coronavirus Study Group. Preprint at bioRxiv https://doi.org/10.1101/2020.02.07.93786 (2020).

31. WHO. WHO Director-General’s remarks at the media briefing on 2019-nCoV on 11 February 2020. https://www.who.int/dg/speeches/detail/who-director-general-s-remarks-at-the-media-briefing-on-2019-ncov-on-11-february-2020 (WHO, 11 February 2020).

32. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

33. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

34. Li, B., Ruotti, V., Stewart, R. M., Thomson, J. A. & Dewey, C. N. RNA-seq gene expression estimation with read mapping uncertainty. Bioinformatics 26, 493–500 (2010).

35. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

36. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

37. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

38. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).

39. Tamura, K. et al. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739 (2011).

40. Lole, K. S. et al. Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination. J. Virol. 73, 152–160 (1999).

41. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

42. Waterhouse, A. et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303 (2018).

43. Hwang, W. C. et al. Structural basis of neutralization by a human anti-severe acute respiratory syndrome spike protein antibody, 80R. J. Biol. Chem. 281, 34610–34616 (2006).

Acknowledgements

This study was supported by the Special National Project on investigation of basic resources of China (grant SQ2019FY010009) and the National Natural Science Foundation of China (grants 81861138003 and 31930001). E.C.H. is supported by an ARC Australian Laureate Fellowship (FL170100022).

Author information

Y.-Z.Z. conceived and designed the study. S.Z., Y.H., Z.-W.T. and M.-L.Y. performed the clinical work and sample collection. B.Y. and J.-H.T. performed the epidemiological investigation and sample collection. F.W., Z.-G.S., L.X., Y.-Y.P., Y.-L.Z., F.-H.D., Y.L., J.-J.Z. and Q.-M.W. performed the experiments. Y.-M.C., W.W., F.W., E.C.H. and Y.-Z.Z. analysed the data. Y.-Z.Z., E.C.H. and F.W. wrote the paper with input from all authors. Y.-Z.Z. led the study.

Correspondence to Yong-Zhen Zhang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Nicholas Loman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Chest radiographs of the patient.

a–d, Computed-tomography scans of the chest were obtained on the day of admission (day 6 after the onset of disease). Bilateral focal consolidation, lobar consolidation and patchy consolidation were clearly observed, especially in the lower lung. e, A chest radiograph was obtained on day 5 after admission (day 11 after the onset of disease). Bilateral diffuse patchy and fuzzy shadows were observed.

Extended Data Fig. 2 Other respiratory pathogens were not detected in the BALF sample by real-time RT–PCR.

a–e, The BALF sample was tested for the presence of influenza A virus (a), the Victoria lineage of influenza B viruses (b), the Yamagata lineage of influenza B viruses (c), human adenovirus (d) and Chlamydia pneumoniae (e). Sample 1 was the BALF sample of the patient, water was used as a negative (NEG) control and positive (POS) control samples included plasmids covering the Taqman primers and probe regions of influenza A, the Victoria and Yamagata lineages of influenza B viruses, human adenovirus and Chlamydia pneumoniae.

Extended Data Fig. 3 Mapped read count plot of the WHCV genome.

The histograms show the coverage depth per base of the WHCV genome. The mean sequencing depth of the WHCV genome was 604.21 nt.
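As an illustration of how a mean per-base depth like the one quoted above can be derived from mapped reads, here is a minimal Python sketch. It assumes a three-column, tab-separated table of reference name, position and depth (the layout written by `samtools depth`); the function name `mean_depth` and the toy input are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Iterable


def mean_depth(lines: Iterable[str], genome_length: int) -> float:
    """Mean per-base depth from 'ref<TAB>pos<TAB>depth' lines.

    Positions missing from the input are treated as depth 0, so the
    summed depth is divided by the full genome length.
    """
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        _ref, _pos, depth = line.split("\t")
        total += int(depth)
    return total / genome_length


# Toy example (hypothetical values): a 5-nt "genome" where position 4
# has no coverage and is therefore absent from the table.
toy = ["ref\t1\t10", "ref\t2\t20", "ref\t3\t30", "ref\t5\t40"]
print(mean_depth(toy, genome_length=5))  # 20.0
```

Dividing by the genome length rather than by the number of reported rows keeps uncovered positions from inflating the mean.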

Extended Data Fig. 4 Quantification of WHCV in clinical samples by real-time RT–PCR.

a, Specificity evaluation of the WHCV primers. Test samples comprised clinical samples that were positive for at least one of the following viruses: influenza A virus (09H1N1 and H3N2), influenza B virus, human adenovirus, respiratory syncytial virus, rhinovirus, parainfluenza virus type 1–4, human bocavirus, human metapneumovirus, coronavirus OC43, coronavirus NL63, coronavirus 229E and coronavirus HKU1. Only the standard plasmid of WHCV (WHCV 15,704–16,846 bp in a pLB vector) led to positive amplification (brown curve). b, Amplification curve of the DNA standard for WHCV. From left to right, the DNA concentrations were 1.8 × 10^8, 1.8 × 10^7, 1.8 × 10^6, 1.8 × 10^5, 1.8 × 10^4 and 1.8 × 10^3. c, Linear fitted curve of Ct values to concentrations of the WHCV DNA standard. d, Quantification of WHCV in the BALF sample by real-time RT–PCR. The WHCV DNA standard was used as positive control (POS); water (NEG) and blank were used as negative controls. The amplification curve of the BALF sample is shown in green.
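The standard-curve quantification in panels b to d can be sketched as follows. This is a minimal Python illustration under stated assumptions: Ct is fitted as a straight line against log10 of the standard concentration by ordinary least squares, and the fit is inverted to estimate the concentration of an unknown sample. The six concentrations are those listed in the caption (units are not stated there); the Ct values, the sample Ct and the function names `fit_standard_curve` and `quantify` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(concs: Sequence[float],
                       cts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(ct: float, slope: float, intercept: float) -> float:
    """Invert the fitted line to estimate concentration from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


# Concentrations from the caption; Ct values are made up for illustration.
standards = [1.8e8, 1.8e7, 1.8e6, 1.8e5, 1.8e4, 1.8e3]
cts = [12.0, 15.3, 18.6, 21.9, 25.2, 28.5]  # hypothetical

slope, intercept = fit_standard_curve(standards, cts)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency from the slope
sample_ct = 20.0                     # hypothetical Ct for an unknown sample
print(slope, efficiency, quantify(sample_ct, slope, intercept))
```

A slope close to -3.32 corresponds to roughly 100% amplification efficiency, which is a common sanity check on a standard curve before it is used for quantification.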

Extended Data Fig. 5 Maximum likelihood phylogenetic trees of the nucleotide sequences of the whole genome, and S and N genes of WHCV and related coronaviruses.

Numbers (>70) above or below the branches indicate percentage bootstrap values. The trees were mid-point rooted for clarity only. The scale bar represents the number of substitutions per site.

Extended Data Fig. 6 Maximum likelihood phylogenetic trees of the nucleotide sequences of the 3CL, RdRp, Hel, ExoN, NendoU and O-MT genes of WHCV and related coronaviruses.

Numbers (>70) above or below the branches indicate percentage bootstrap values. The trees were mid-point rooted for clarity only. The scale bar represents the number of substitutions per site.

Extended Data Fig. 7 Analysis of RBD of the spike protein of WHCV coronavirus.

a, Amino acid sequence alignments of RBD sequences of SARS-like CoVs. Three bat SARS-like CoVs—which could efficiently use the human ACE2 as receptor—had an RBD sequence of similar size to SARS-CoV. WHCV contains a single Val470 insertion. The key amino acid residues involved in the interaction with human ACE2 are marked by orange squares. By contrast, five bat SARS-like CoVs, including Rp3, which has previously been found not to bind to ACE2 (ref. 14), had amino acid deletions in two motifs (amino acids 433–437 and 460–472, highlighted by red boxes) compared with those of SARS-CoV (ref. 11). b, The two motifs (amino acids 433–437 and 460–472) are shown in red for the crystal structure of the RBD of the spike protein of SARS-CoV in complex with the human ACE2 receptor (PDB 2AJF). Human ACE2 is shown in blue and the RBD of the spike protein of SARS-CoV is shown in green. Important residues in human ACE2 that interact with the RBD of the spike protein of SARS-CoV are marked. c, Predicted protein structure of the RBD of the spike protein of WHCV based on target–template alignment using ProMod3 on the SWISS-MODEL server. d, Predicted structure of the RBD of the spike protein of SARS-like CoV Rs4874. e, Predicted structure of the RBD of the spike protein of SARS-like CoV Rp3. f, Crystal structure of the RBD of the spike protein of SARS-CoV (green) (PDB 2GHV). Motifs that resemble amino acids 433–437 and 460–472 of the spike protein of SARS-CoV are shown in red.

Extended Data Fig. 8 Amino acid sequence comparison of the N-terminal domain of the spike protein.

Amino acid sequence comparison of the N-terminal domain of the spike protein of WHCV, bovine coronavirus (BCoV), mouse hepatitis virus (MHV) and human coronaviruses (HCoV OC43 and HKU1) that can bind to sialic acid and the SARS-CoVs that cannot (SZ3, WH20, BJ0 and Tor2). The key residues (ref. 16) for sialic acid binding on BCoV, MHV, and HCoV OC43 and HKU1 are highlighted by orange squares.

Extended Data Fig. 9 Recombination events in WHCV.

The sequence similarity plot of WHCV, SARS-like CoVs and bat SARS-like CoVs reveals putative recombination events.
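A sequence similarity plot of this kind is typically built by sliding a window along a multiple sequence alignment and computing percent identity between the query and each reference within the window. The Python sketch below shows that idea for one query–reference pair. It is an illustration under assumptions, not the authors' tool or settings: the function name `window_similarity`, the default window and step sizes and the toy sequences are all hypothetical, and gap columns are simply skipped.

```python
from typing import List, Tuple


def window_similarity(query: str, ref: str,
                      window: int = 500, step: int = 100
                      ) -> List[Tuple[int, float]]:
    """Percent identity in sliding windows over two aligned sequences.

    Returns (window midpoint, percent identity) pairs. Columns where
    either sequence has a gap ('-') are excluded from the comparison.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        compared = matches = 0
        for a, b in zip(query[start:start + window],
                        ref[start:start + window]):
            if a == "-" or b == "-":
                continue  # skip gap columns
            compared += 1
            matches += a.upper() == b.upper()
        if compared:
            points.append((start + window // 2, 100.0 * matches / compared))
    return points


# Toy aligned pair (hypothetical): one mismatch at alignment column 7.
q = "ACGTACGTAC"
r = "ACGTACGAAC"
print(window_similarity(q, r, window=4, step=2))
# [(2, 100.0), (4, 100.0), (6, 75.0), (8, 75.0)]
```

Plotting these points for several reference sequences on the same axes gives the similarity plot; positions where the most similar reference changes are the candidate recombination breakpoints.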

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1-8.

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wu, F., Zhao, S., Yu, B. et al. A new coronavirus associated with human respiratory disease in China. Nature (2020). https://doi.org/10.1038/s41586-020-2008-3
