International travel played a significant role in the early global spread of SARS-CoV-2. Understanding transmission patterns from different regions of the world will further inform global dynamics of the pandemic. Using data from Dubai in the United Arab Emirates (UAE), a major international travel hub in the Middle East, we establish SARS-CoV-2 full genome sequences from the index and early COVID-19 patients in the UAE. The genome sequences are analysed in the context of virus introductions, chain of transmissions, and possible links to earlier strains from other regions of the world. Phylogenetic analysis showed multiple spatiotemporal introductions of SARS-CoV-2 into the UAE from Asia, Europe, and the Middle East during the early phase of the pandemic. We also provide evidence for early community-based transmission and catalogue new mutations in SARS-CoV-2 strains in the UAE. Our findings contribute to the understanding of the global transmission network of SARS-CoV-2.
In December 2019, several cases of a new respiratory illness (now called COVID-19) were reported in the city of Wuhan (Hubei province, China) and in January 2020 it was confirmed that these infections were caused by a novel coronavirus subsequently named SARS-CoV-2 (refs. 1,2). On 12 March 2020, the ongoing SARS-CoV-2 outbreak was declared a pandemic by the World Health Organization (WHO)3. As of 09 October 2020, there have been more than 36.5 million laboratory-confirmed cases of COVID-19 and more than 1,050,000 deaths in 188 countries4.
Dubai in the United Arab Emirates (UAE) is a cosmopolitan metropolis that has become a popular tourist destination and is home to one of the busiest airport hubs in the world, connecting the east with the west2,5. Currently, the UAE has reported 104,004 confirmed cases and 442 COVID-19-associated deaths (0.4% case fatality; 09 October 2020)4. In view of Dubai’s important tourism and travel connections, we attempted to characterize the full-genome sequence of SARS-CoV-2 strains from the index and early patients with COVID-19 in Dubai to gain a deeper understanding of the molecular epidemiology of the outbreak in Asia, Europe, and the Middle East.
Patient cohort and SARS-CoV-2 whole genome sequencing
The 49 patients included in this study were the earliest confirmed cases in the UAE. The period from 29 January to 18 March 2020 was specifically selected to focus on early SARS-CoV-2 viral introductions into the UAE. The index patient in the UAE was reported on 29 January 2020. Subsequently, Emirates airline suspended flights to and from 30 global destinations from 18 March 2020 and Dubai airport was closed to passenger flights on 25 March 2020; hence, cases after 18 March 2020 were expected to be more likely the result of community transmission than of imported infection. The index patient was a female Chinese tourist (aged 63 years) travelling from Wuhan with other family members to visit her son in Dubai. The family arrived in Dubai on 16 January 2020 and tested positive on 29 January 2020 (Table 1). Over the next seven weeks, there were multiple new cases among tourists and residents with travel history (44.9% had travel history from Europe) (Table 1). Nearly two-thirds (63.3%) of patients were male and 61.2% were aged between 20 and 44 years, reflecting the young age structure of the UAE population5. The majority of patients (88%) were asymptomatic or had mild symptoms and only four required intensive care with invasive ventilation (one death; Table 1).
SARS-CoV-2 whole genome sequencing was performed on all 49 COVID-19 patient samples. Only genomes with almost complete coverage (n = 25, “Methods” section) were used for phylogenetic analysis. The 25 genomes were obtained from cases with disease onset in late January (n = 1), early February (n = 1), late February (n = 6), early March (n = 8), and late March (n = 9). Of those, approximately two-thirds were male and aged between 10 and 40 years (Table 1).
To understand early viral transmission in Dubai in the global context, we performed phylogenetic analysis on the 25 novel viral genomes sequenced in this study from early patients in the UAE (Table 1; “Methods” section) along with 157 largely complete SARS-CoV-2 genomes deposited in GISAID from different countries between December 2019 and early March 2020 (refs. 6,7) (Supplementary Table S1).
Consistent with multiple independent introductions, the UAE SARS-CoV-2 isolates were distributed across the phylogenetic tree (Fig. 1). The majority (76%) clustered with clades A2a (48%) and A3 (28%), which are largely composed of isolates from COVID-19 patients in Europe and Iran, respectively. This suggests that the major introductions into the UAE during the early phase of the pandemic originated from Europe and the Middle East/Iran.
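The clade proportions quoted above can be checked with a short tally. The sketch below is illustrative only: the per-clade counts are reconstructed from the reported percentages of the 25 genomes (48% and 28%), the single B2 isolate, and the five isolates that matched no named clade. They are not read from the underlying data files, and the label `unassigned` is our own.

```python
from collections import Counter

# Counts reconstructed from the text (assumption): 48% and 28% of 25 genomes,
# one B2 isolate, and five isolates outside the named clades.
clade_counts = Counter({"A2a": 12, "A3": 7, "B2": 1, "unassigned": 5})


def clade_percentages(counts):
    """Return each clade's share of all genomes as a percentage."""
    total = sum(counts.values())
    return {clade: round(100 * n / total, 1) for clade, n in counts.items()}


percentages = clade_percentages(clade_counts)
for clade, pct in percentages.items():
    print(f"{clade}: {pct}%")
```

Under these counts, A2a and A3 together account for 19 of the 25 genomes.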
Supporting a European origin, individuals with A2a clade isolates were mostly European and/or had recent travel history to a European country, mainly Italy (n = 4), Germany (n = 3), the United Kingdom (n = 2), Spain (n = 1), and Norway (n = 1) (Table 1 and Fig. 2). Onset of symptoms reported in this group was within or after the second week of March (Table 1), suggesting that the viral infections in this group could have occurred during late February to early March. Of note, a SARS-CoV-2 isolate submitted from Mexico (GISAID ID: EPI_ISL_412972) was 100% identical to that from an Italian expatriate working in the UAE (L0881), while another submitted in Germany (GISAID ID: EPI_ISL_412912) differed by a single mutation (Fig. 1). All three individuals had a recent travel history to Italy and overlapping infection time frames (late February–early March). Within this group, isolates from patients L1758, L0484, and L2185 were identical (Fig. 2), suggesting a possible common direct source of transmission.
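Statements such as "100% identical" or "differed by a single mutation" come down to counting mismatched positions between aligned genomes. The following is a minimal sketch of that comparison, not the method used in this study. It assumes the sequences are already aligned to equal length and that gaps and ambiguous bases (`-`, `N`) should be skipped. The function name and the toy fragments are hypothetical.

```python
def count_differences(seq_a, seq_b):
    """Count mismatched positions between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in ignored and b not in ignored and a != b
    )


# Hypothetical toy fragments, not real SARS-CoV-2 sequence.
isolate_x = "ATGGCTACGTTAGC"
isolate_y = "ATGGCTACGTTAGC"  # identical to isolate_x
isolate_z = "ATGGCTATGTTAGC"  # one substitution relative to isolate_x

print(count_differences(isolate_x, isolate_y))
print(count_differences(isolate_x, isolate_z))
```

Skipping ambiguous positions avoids counting low-coverage sites as mutations, at the cost of possibly missing real differences there.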
Isolates in the A3 clade were obtained from five individuals with travel history to Iran (L2409, L6627, L0904, L0184, and L4682), one Indian resident (L0231), and one Indian tourist (L0068) (Fig. 2). Onset of symptoms for the five individuals with travel history in this group was reported to be around 21–24 February (Table 1). Patient L0231 had no travel history and reported symptom onset on 7 March, suggesting a possible community-based transmission event. Interestingly, all isolates except the one obtained from patient L4682 (the only patient in this group with severe clinical presentation) shared a common ancestral strain identical to that obtained from patient L2409. The SARS-CoV-2 isolate from L4682 had two unique missense variants in the ORF1ab gene (Supplementary Table S3), which might be worth investigating for any possible biological effect(s). Consistent with its Iranian origin, a SARS-CoV-2 sequence submitted by the University of Sydney (GISAID ID: EPI_ISL_412975) on 28 February 2020 differed by only two mutations from that of L2409, and both this Iranian male tourist and the Australian male had a recent travel history to Iran. We speculate that individuals with travel history to Iran around this time frame (L8386, L6867, and L3280), for whom a full viral genome sequence could not be obtained, were also very likely to cluster within the A3 clade.
Only one viral strain obtained from L5630, a family member of the early Chinese index patient, belonged to the B2 clade. Although we did not obtain full viral genome sequences from the other members of that Chinese family, we expect that all had a similar strain to L5630. Interestingly, our data do not suggest any transmission of this clade at least among the earliest patients (Fig. 2) included in this study which is consistent with the reported early detection and isolation of this family. This finding also supports the notion of secondary source(s) for the ongoing local transmission.
The remaining five isolates did not belong to A2a, A3, B2, or any of the clades on nextstrain.org as of 12 May 2020, suggesting earlier introduction(s). Those isolates were obtained from four Asians, two residents (L4280, L6599) and two tourists (L4184, L9766), and one Czech resident (L1014) working as an airline cabin crew member with travel history to Austria (Table 1). Consistent with the Asian predominance among this patient group and the fewer (1 or 2) mutations for most of their isolates (4 out of 5) relative to the Wuhan reference genome (Fig. 2), several early viral strains submitted in Asia clustered very closely to this group (Fig. 1). L4280 was the first sequenced patient without travel history and became infected after transporting a work colleague, L0826, to hospital. Patient L0826 reported symptom onset on 22 January, suggesting that community-based transmission started in the UAE in early-to-mid January. L6599 was an Indian expatriate living with three other Filipino and Sri Lankan expatriates (L3715, L2771, L8480) (Table 1). All four individuals had no documented recent travel history, suggesting local transmission, and although a full viral genome sequence could only be obtained from one patient (L6599), it is very likely that all have related isolates.
In aggregate, we identified 70 variants relative to the reference GenBank SARS-CoV-2 sequence NC_045512.2. The majority of these variants were missense (n = 41), with the most frequent nucleotide change being C > T (n = 33), and more than half (38/70) were localized in the ORF1ab gene (Supplementary Table S3). Notably, 2 of the 70 variants were novel, as they were not identified in the Chinese National Center for Bioinformation Database (https://bigd.big.ac.cn/ncov/variation/annotation; last accessed August 13, 2020). The novel variants were a coding missense variant and a synonymous variant in the N and ORF1ab genes, respectively. In addition, 9 variants were very rare (i.e. seen fewer than four times out of 81,625 genomes), including one missense variant (F850I) in the S gene (Supplementary Table S3).
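A tally of nucleotide changes like the C > T count reported above can be produced by comparing each aligned genome with the reference, position by position. The sketch below shows one way to do this and is not the pipeline used here. It assumes a pairwise alignment with no indels and counts only unambiguous A/C/G/T substitutions. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def substitution_spectrum(reference, sample):
    """Tally single-nucleotide changes (e.g. 'C>T') in an aligned sample."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    spectrum = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Skip gaps and ambiguity codes; count only true base substitutions.
        if ref_base in "ACGT" and alt_base in "ACGT" and ref_base != alt_base:
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum


# Hypothetical toy fragments, not real SARS-CoV-2 sequence.
reference = "ACCGTCAGCT"
sample = "ATCATTAGTT"

spectrum = substitution_spectrum(reference, sample)
print(spectrum.most_common())
```

Classifying each change as missense or synonymous would additionally require the gene coordinates and codon translation, which this sketch leaves out.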
Our findings suggest multiple independent spatiotemporal introductions of SARS-CoV-2 into the UAE where the majority of introductions (76%) were from Iran and Europe during two different time frames (mid-late February and early March, respectively). Although we show evidence for possible local transmission within the Middle Eastern/Iranian isolates, it will be important to sequence further isolates at subsequent dates to determine whether these introductions succeeded in seeding more clustering and whether such clustering was affected by proactive and vigilant public health measures, such as transitioning to online learning for schools and universities, implementing work-from-home protocols across all sectors, and nationwide disinfection campaigns.
Six isolates (24%) did not cluster with the European or Iranian groups and represented earlier introductions which did not appear to seed larger clusters in our sampled cohort. However, additional sequencing is needed to determine the extent of community transmission, especially given that our data strongly suggest that the earliest patient (early to mid-January) in the UAE could have been a secondary infection from one of those introductions.
The new SARS-CoV-2 mutations identified in the UAE warrant further investigation to explore whether they influence viral characteristics, especially pathogenicity, or provide important information for vaccine development. One of the major strengths of the study was the unbiased, representative sample of early cases in Dubai, including the index family cluster, from the only central testing lab, along with detailed demographic and clinical information. Limitations included the inability to obtain whole genome sequences from more samples, most likely because of low viral load, although we were able to deduce the origin of transmission in most of those individuals based on travel history. Regardless, this study contributes important molecular epidemiological data that can be used to further understand the global transmission network of SARS-CoV-2 (ref. 8).
Human subjects and ethics approval
Sociodemographic and clinical data were extracted from the electronic medical records of the earliest 49 patients with laboratory-confirmed SARS-CoV-2 from 29 January to 18 March 2020 using the WHO case report form. Cases were categorized into three groups based on disease severity: asymptomatic and mild cases with either no symptoms or mild non-life-threatening symptoms (e.g. dry cough, mild fever); moderate cases with symptoms (e.g. breathlessness, persistent fever) requiring hospitalization and medical attention (e.g. supplementary oxygen therapy, intravenous fluids); and severe/critical cases with advanced disease and pneumonia requiring admission to intensive care units and specialized life-support treatment (e.g. mechanical ventilation). This study was approved by the Dubai Scientific Research Ethics Committee, Dubai Health Authority (approval number #DSREC-04/2020_02). The requirement for informed consent was waived as this study was part of a public health surveillance and outbreak investigation in the UAE. Nonetheless, all patients treated at a healthcare facility in the UAE provide written consent for their deidentified data to be used for research, and this study was performed in accordance with the relevant laws and regulations that govern research in the UAE.
SARS-CoV-2 whole genome sequencing
All 49 COVID-19 patients tested positive for SARS-CoV-2 by RT-qPCR using RNA extracted from nasopharyngeal swabs following the QIAamp Viral RNA Mini or the EZ1 DSP Virus Kits (Qiagen, Hilden, Germany). RNA libraries from all samples were then prepared for shotgun transcriptomic sequencing using the TruSeq Stranded Total RNA Library kit from Illumina (San Diego, CA, USA), following manufacturer’s instructions. Libraries were sequenced using the NovaSeq SP Reagent kit (2 × 150 cycles) from Illumina (San Diego, CA, USA). Sample L5630 underwent a target enrichment approach where double stranded DNA (synthesized using the QuantiTect Reverse Transcription Kit from Qiagen, Hilden, Germany) was amplified using 26 overlapping primer sets covering most of the SARS-CoV-2 genome as recently described by our group9. PCR products were then sheared by ultra-sonication (Covaris LE220-plus series, MA, USA) and prepared for sequencing using the SureSelectXT Library Preparation kit (Agilent, CA, USA). This library was sequenced using the Illumina MiSeq Micro Reagent Kit, V2 (2 × 150 cycles).
SARS-CoV-2 genome assembly
High quality (> Q30) sequencing reads were trimmed and then aligned to the reference SARS-CoV-2 genome from Wuhan, China (GenBank accession number: NC_045512.2) using a custom-made bioinformatics pipeline (Supplementary Fig. S1). Assembled genomes with at least 20X average coverage across most nucleotide positions (56–29,797) were used for subsequent phylogenetic analysis (Supplementary Table S1). A total of 25 viral genomes (24 by shotgun and 1 by target enrichment) met this inclusion criterion and were submitted to the Global Initiative on Sharing All Influenza Data (GISAID) database under accession IDs: EPI_ISL_435119–435142 (Supplementary Table S2).
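The inclusion criterion can be illustrated with a small coverage check. This is a sketch under stated assumptions and not the custom pipeline itself. It reads "at least 20X average coverage" as the mean per-position read depth over positions 56–29,797 (1-based, inclusive). The function name, the argument names, and the toy depth profiles are hypothetical.

```python
def passes_coverage_filter(depths, start=56, end=29797, min_mean_depth=20.0):
    """Return True if mean read depth over [start, end] meets the threshold.

    depths is a list of per-position read depths in which depths[0]
    corresponds to genome position 1.
    """
    window = depths[start - 1:end]
    if len(window) < end - start + 1:
        return False  # depth profile does not span the whole window
    return sum(window) / len(window) >= min_mean_depth


# Hypothetical depth profiles for a 29,903-nt genome (toy values).
well_covered = [0] * 55 + [35] * (29797 - 55) + [0] * 106
poorly_covered = [5] * 29903

print(passes_coverage_filter(well_covered))
print(passes_coverage_filter(poorly_covered))
```

A mean can hide local dropouts, so a stricter variant would also require a minimum fraction of positions above some depth. The text does not specify such a rule, so none is applied here.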
We downloaded 157 global non-UAE sequences (Supplementary Table S2) with largely complete genomes (nucleotide positions 56–29,797) submitted to GISAID EpiCoV (https://www.epicov.org/) between December 2019 and 04 March 2020 (ref. 7). All 182 sequences, including the 25 UAE sequences generated in this study, were analysed using Nextstrain (ref. 10), which consists of the Augur v6.4.3 pipeline for multiple sequence alignment (MAFFT v7.455; ref. 11) and phylogenetic tree construction (IQ-TREE v1.6.12; ref. 12). Tree topology was assessed using the fast bootstrapping function with 1000 replicates. Tree visualization and annotation were performed in FigTree v1.4.4 (ref. 13) for Fig. 1 and in the auspice v2.13.0 tool (ref. 10) for Fig. 2. SARS-CoV-2 clade annotations were performed in auspice v2.13.0 and cross-checked with nextstrain.org as of 12 May 2020.
All data generated or analysed during this study are included in this published article (and its Supplementary Information files) and the sequences are available on the GISAID database under the corresponding accession numbers.
Ashour, H. M., Elkhatib, W. F., Rahman, M. M. & Elshabrawy, H. A. Insights into the recent 2019 novel coronavirus (SARS-CoV-2) in light of past human coronavirus outbreaks. Pathogens. 9, E186. https://doi.org/10.3390/pathogens9030186 (2020).
Uddin, M. et al. SARS-CoV-2/COVID-19: viral genomics, epidemiology, vaccines, and therapeutic interventions. Viruses 12, 526. https://doi.org/10.3390/v12050526 (2020).
World Health Organization (WHO). Coronavirus disease 2019 (COVID-19) Situation Report – 52. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200312-sitrep-52-covid-19.pdf?sfvrsn=e2bfc9c0_4 (2020).
COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU). https://coronavirus.jhu.edu/map.html.
Loney, T. et al. An analysis of the health status of the United Arab Emirates: the “Big 4” public health issues. Glob. Health Action 6, 20100. https://doi.org/10.3402/gha.v6i0.20100 (2013).
Wu, F. et al. A new coronavirus associated with human respiratory disease in China. Nature 579, 265–269. https://doi.org/10.1038/s41586-020-2008-3 (2020).
Forster, P., Forster, L., Renfrew, C. & Forster, M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proc. Natl. Acad. Sci. U S A. 117, 9241–9243. https://doi.org/10.1073/pnas.2004999117 (2020).
Alm, E. et al. Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020. Eur. Surveill. 25, 2001410. https://doi.org/10.2807/1560-7917.ES.2020.25.32.2001410 (2020).
Harilal, D. et al. SARS-CoV-2 whole genome amplification and sequencing for effective population-based surveillance and control of viral transmission. Clin. Chem. https://doi.org/10.1093/clinchem/hvaa187 (2020).
Hadfield, J. et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics 34, 4121–4123. https://doi.org/10.1093/bioinformatics/bty407 (2018).
Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066. https://doi.org/10.1093/nar/gkf436 (2002).
Chernomor, O., von Haeseler, A. & Quang Minh, B. Terrace aware data structure for phylogenomic inference from supermatrices. Syst. Biol. 65, 997–1008. https://doi.org/10.1093/sysbio/syw037 (2016).
Rambaut, A. FigTree 1.4.2 Software. Institute of Evolutionary Biology, Univ. Edinburgh.
This work was supported by internal funds from the College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences (to A.T., grant no. MBRU-CM-RG2020-04).
The authors declare no competing interests.
Tayoun, A.A., Loney, T., Khansaheb, H. et al. Multiple early introductions of SARS-CoV-2 into a global travel hub in the Middle East. Sci Rep 10, 17720 (2020). https://doi.org/10.1038/s41598-020-74666-w