Discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin

Since the SARS outbreak 18 years ago, a large number of severe acute respiratory syndrome related coronaviruses (SARSr-CoV) have been discovered in their natural reservoir host, bats1-4. Previous studies indicated that some of those bat SARSr-CoVs have the potential to infect humans5-7. Here we report the identification and characterization of a novel coronavirus (nCoV-2019) which caused an epidemic of acute respiratory syndrome in humans in Wuhan, China. The epidemic, which started on December 12th, 2019, had caused 198 laboratory-confirmed infections with three fatal cases by January 20th, 2020. Full-length genome sequences were obtained from five patients at the early stage of the outbreak. They are almost identical to each other and share 79.5% sequence identity with SARS-CoV. Furthermore, nCoV-2019 was found to be 96% identical at the whole-genome level to a bat coronavirus. Pairwise protein sequence analysis of seven conserved non-structural proteins shows that this virus belongs to the species SARSr-CoV. The nCoV-2019 virus was then isolated from the bronchoalveolar lavage fluid of a critically ill patient, and it can be neutralized by sera from several patients. Importantly, we have confirmed that this novel CoV uses the same cell entry receptor, ACE2, as SARS-CoV.

Coronaviruses have caused two large-scale pandemics in the last two decades, SARS and MERS (Middle East respiratory syndrome) 8,9. It was generally believed that SARSr-CoV, mainly found in bats, might cause future disease outbreaks 10,11. Here we report on a series of unidentified pneumonia disease outbreaks in Wuhan, Hubei province, central China (Extended Data Figure 1). Starting from a local fresh seafood market, the epidemic has resulted in 198 laboratory-confirmed cases with three deaths according to authorities so far 12. Typical clinical symptoms of these patients are fever, dry cough, dyspnea, headache, and pneumonia. Disease onset may result in progressive respiratory failure due to alveolar damage, and even death. The disease was determined to be virus-induced pneumonia by clinicians according to clinical symptoms and other criteria, including raised body temperature, decreased lymphocyte and white blood cell counts (the latter sometimes normal), new pulmonary infiltrates on chest radiography, and no obvious improvement after three days of antibiotic treatment. It appears that most of the early cases had a contact history with the original seafood market, and no large-scale human-to-human transmission has been observed so far.

Samples from seven patients with severe pneumonia (six of whom are seafood market peddlers or delivery workers), who were admitted to intensive care units at the beginning of the outbreak, were sent to the WIV laboratory for pathogen diagnosis (Extended Data Table 1). As a CoV lab, we first used pan-CoV PCR primers to test these samples 13, considering that the outbreak happened in winter and in a market, the same environment as SARS. We found five samples to be PCR positive. A sample (WIV04) collected from bronchoalveolar lavage fluid (BALF) was analysed by metagenomics analysis using next-generation sequencing (NGS) to identify potential etiological agents. Of the 1582 total reads obtained after human genome filtering, 1378 (87.1%) matched sequences of SARSr-CoV (Fig. 1a). By de novo assembly and targeted PCR, we obtained a 29,891-bp CoV genome that shared 79.5% sequence identity with SARS-CoV BJ01 (GenBank accession number AY278488.2). This sequence has been submitted to GISAID (accession no. EPI_ISL_402124). Following the naming by WHO, we tentatively call it novel coronavirus 2019 (nCoV-2019). Four more full-length genome sequences of nCoV-2019 (WIV02, WIV05, WIV06, and WIV07) (GISAID accession nos. EPI_ISL_402127-402130) that were above 99.9% identical to each other were subsequently obtained from four other patients (Extended Data Table 2).
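
The read fraction quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `percent_of_reads` is hypothetical and is not part of the analysis pipeline used in this study.

```python
def percent_of_reads(matching: int, total: int) -> float:
    """Return the percentage of reads that matched, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be a positive read count")
    return round(100.0 * matching / total, 1)


# Read counts reported in the text: 1378 of 1582 filtered reads matched SARSr-CoV.
print(percent_of_reads(1378, 1582))  # 87.1
```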

The virus genome consists of six major open reading frames (ORFs) common to coronaviruses and a number of other accessory genes (Fig. 1b). Further analysis indicates that some of the nCoV-2019 genes share less than 80% nt sequence identity with SARS-CoV. However, the seven conserved replicase domains in ORF1ab that are used for CoV species classification are 94.6% identical in aa sequence between nCoV-2019 and SARS-CoV, implying that the two belong to the same species (Extended Data Table 3).
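
As an illustration of what a percent-identity figure such as the nt and aa values above means, the following minimal sketch counts matching positions in two pre-aligned sequences. It assumes that columns containing a gap in either sequence are skipped, which is one common convention and not necessarily the one used in this study; the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: alignment columns with a gap ('-') in either sequence
    are excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy sequences (not real viral data): 7 of 8 positions match.
print(percent_identity("ATGCTAGC", "ATGATAGC"))  # 87.5
```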

We then found that a short RdRp region from a bat coronavirus termed BatCoV RaTG13, which we previously detected in Rhinolophus affinis from Yunnan Province, showed high sequence identity to nCoV-2019. We carried out full-length sequencing of this RNA sample. Simplot analysis showed that nCoV-2019 was highly similar throughout the genome to RaTG13 (Fig. 1c), with 96.2% overall genome sequence identity. The phylogenetic analysis also showed that RaTG13 is the closest relative of nCoV-2019 and that the two form a distinct lineage from other SARSr-CoVs (Fig. 1d). The receptor binding protein spike (S) gene was highly divergent from other CoVs (Extended Data Figure 2), with less than 75% nt sequence identity to all previously described SARSr-CoVs except a 93.1% nt identity to RaTG13 (Extended Data Table 3).

[...] bat SARSr-CoV WIV1, which has 95% identity to SARS-CoV (Extended Data Figure 4a and 4b). From the seven patients, we found six BALF and five oral swab samples to be nCoV-2019 positive during the first sampling by qPCR and conventional PCR (Extended Data Figure 4c). However, we could no longer detect the virus in oral swabs, anal swabs, and blood from these patients during the second sampling (Fig. 2a). Based on these findings, we conclude that the disease should be transmitted through the airway, yet we cannot rule out other possibilities if the investigation is extended to include more patients.
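
The idea behind a similarity plot along the genome can be sketched as a sliding-window identity scan over two aligned sequences. This is a simplified illustration using a plain match count, not the Simplot program or its distance models; the function name `sliding_identity`, the window and step sizes, and the toy sequences are all assumptions made for the example.

```python
from typing import List, Tuple


def sliding_identity(query: str, ref: str, window: int, step: int) -> List[Tuple[int, float]]:
    """Return (window centre, percent identity) pairs along two aligned sequences.

    Simplification: identity is the fraction of identical columns in each
    window; gaps are treated like any other character.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0 or window > len(query):
        raise ValueError("window and step must be positive and fit the alignment")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = ref[start:start + window]
        matches = sum(1 for a, b in zip(q, r) if a == b)
        points.append((start + window // 2, 100.0 * matches / window))
    return points


# Toy alignment (not real viral data): one substitution at position 12.
query = "ACGT" * 5
ref = query[:12] + "C" + query[13:]
print(sliding_identity(query, ref, window=10, step=5))
# [(5, 100.0), (10, 90.0), (15, 90.0)]
```

Plotting the second element of each pair against the first gives a curve of the kind shown in a similarity plot; a real analysis would use genome-scale alignments and much larger windows.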

For serological detection of nCoV-2019, we used the previously developed bat SARSr-CoV Rp3 nucleocapsid protein (NP) as antigen in IgG and IgM ELISA tests, which showed no cross-reactivity against other human coronaviruses except SARSr-CoV 7. As a research lab, we were only able to get five serum samples from the seven virus-infected patients. We monitored viral antibody levels in one patient (ICU-06) at seven, eight, nine, and eighteen days after disease onset (Extended Data Table 2). A clear trend of increasing IgG and IgM titres (which decreased on the last day) was observed (Fig. 2b). For a second investigation, we tested for viral antibodies in five of the seven virus-positive patients around twenty days after disease onset (Extended Data Table 1 and 2). All patient samples, but not samples from healthy people, were strongly positive for viral IgG (Fig. 2b). We also found three samples to be IgM positive, indicating acute infection.

[...] in Vero E6 cells using the five IgG positive patient sera. We demonstrate that all samples were able to neutralize 120 TCID50 nCoV-2019 at a dilution of 1:40-1:80. We also show that this virus could be cross-neutralized by horse anti-SARS-CoV serum at a dilution of 1:80, further confirming the relationship between the two viruses (Extended Data Table 4).

Angiotensin converting enzyme II (ACE2) is known as the cell receptor for SARS-CoV 14. To determine whether nCoV-2019 also uses ACE2 as a cellular entry receptor, we conducted virus infectivity studies using HeLa cells expressing or not expressing ACE2 proteins from humans, Chinese horseshoe bats, civet, pig, and mouse. We show that nCoV-2019 is able to use all but mouse ACE2 as an entry receptor in the ACE2-expressing cells, but cannot enter cells without ACE2, indicating that ACE2 is likely the cell receptor of nCoV-2019 (Fig. 4). We also proved that nCoV-2019 does not use other coronavirus receptors, aminopeptidase N and dipeptidyl peptidase 4 (Extended Data Figure 6).

The study provides the first detailed report on nCoV-2019, the likely etiological agent responsible for the ongoing acute respiratory syndrome epidemic in Wuhan, central China.

The BALF sample from patient ICU-06 was spun at 8,000 g for 15 min, filtered and diluted 1:2 with DMEM supplemented with 16 μg/ml trypsin before being added to cells. After incubation at 37 °C for 1 h, the inoculum was removed and replaced with fresh culture medium containing antibiotics (below) and 16 μg/ml trypsin. The cells were incubated at 37 °C and observed daily for cytopathic effect (CPE). The culture supernatant was examined for the presence of virus by the qRT-PCR developed in this study, and cells were examined by immunofluorescence using SARSr-CoV Rp3 NP antibody made in house (1:100). Penicillin (100 units/ml) and streptomycin (15 μg/ml) were included in all tissue culture media.

The virus neutralization test was carried out in a 48-well plate. The patient serum samples were heat-inactivated by incubation at 56 °C for 30 min before use. The serum samples (5 µL) were diluted to 1:10, 1:20, 1:40 or 1:80, and then an equal volume of virus stock was added and incubated at 37 °C for 60 min in a 5% CO2 incubator. Diluted horse anti-SARS-CoV serum or serum samples from healthy people were used as controls. After incubation, 100 µL of each mixture was inoculated onto monolayer Vero E6 cells in a 48-well plate for 1 hour. Each serum sample was tested in triplicate. After removing the supernatant, the plate was washed twice with DMEM medium. Cells were incubated with DMEM supplemented with 2% FBS for 24 hours. Then the cells were fixed with 4% formaldehyde. The virus was then detected using