The recent ancestry of Middle East respiratory syndrome coronavirus in Korea has been shaped by recombination.

Middle East respiratory syndrome coronavirus (MERS-CoV) causes severe human respiratory disease. Since 2012, cases have come mainly from Middle Eastern countries, or sporadically from other geographical regions seeded by travelers who had visited the Middle East. One such travel-associated introduction led to a MERS-CoV outbreak in Korea in May 2015, which caused more than 140 confirmed human cases in less than a month. Using 70 complete genome sequences of MERS-CoV isolates, including the most recent sequences for the Korean and Chinese isolates, we reconstructed the phylogenetic relationships of the complete genome and the individual protein-coding regions. The Korean MERS-CoV strain clustered in the previously established Hafr-Al-Batin-1_2013 clade together with two Saudi Arabian strains and one Chinese strain sampled in 2015. Although these four strains remained monophyletic in the entire protein-coding region, this clade showed different phylogenetic relationships across the genome, indicating a shared unique recombination pattern that differs from those of previously reported putative recombinant strains. Our findings suggest that the recent ancestor of the Korean strain and its related MERS-CoV strains is characterized by a unique mosaic genome pattern that differs from those of other putative recombinants.

demonstrated largely among camels and people in the Middle East 12 , and travellers who visited the region could occasionally seed the virus to long-distance destinations in Europe, Southeast Asia, and North America 9,13,14 .
The first case of the traveler-associated MERS-CoV outbreak in Korea occurred in May 2015 15 . A 68-year-old index patient traveled to four countries in the Middle East and returned to Korea on May 4 without clinical complaints. When clinical symptoms developed one week later, the index patient sought medical attention at two primary clinics and two higher-level hospitals, but MERS-CoV infection was not confirmed until May 20. In the meantime, one of the nosocomial contacts of the index patient travelled to China via Hong Kong and was diagnosed with MERS-CoV infection on May 27 in Guangdong. As of June 19, 2015, a total of 166 confirmed cases had been reported in the MERS outbreak in Korea, including one Chinese case 16 . Even though the viral spread has been mainly limited to hospital-based transmission, as seen in previous cases 17,18 , and no further confirmed cases are being reported in Korea, this represents the largest outbreak outside the Middle East region. To investigate the evolutionary history of the MERS-CoV strain (GenBank accession No. KT029139; KOR-KNIH-002_2015, KOR002) responsible for the outbreak in Korea, we analyzed 70 complete genome sequences available from NCBI, including the most recent Chinese (KT006149; China-GD01_2015, China01) and Saudi Arabian (KT026455; Riyadh-KSA-2959_2015, KSA2959 and KT026456; Riyadh-KSA-4050_2015, KSA4050) sequences (Table S1).
Recombination has been described previously in other coronavirus genomes [21][22][23] , and was also suggested to affect the evolution of MERS-CoV 24 Table 1). The 20 strains isolated before 2015 appeared to retain two recombination breakpoints in the linear ORF alignment (Fig. 3a). However, four putative recombinants in 2015 (KOR002, China01, KSA2959, and KSA4050), coinciding with the strains showing the unique relationships noted in the phylogenetic trees above (Figs 1 and 2 and S1 to S4), shared four breakpoints, which resulted in five recombinant fragments (Fig. 3b). Based on these mosaic patterns shared among the four putative recombinants in 2015, we compiled five new datasets representing each non-recombinant fragment and evaluated the phylogenetic relationships of the four putative recombinants in 2015.
In each tree (Fig. 3c-e and S5), the four putative recombinants in 2015 always grouped together and showed close relationships with their parental strains as detected in the recombination test (Table 1). A recombination analysis using a larger window size suggested similar strains as putative recombinants, especially for KOR002 and its related 2015 strains (Table S2). Consistent with the phylogenetic results above, the trees of each recombination region exhibited similar evolutionary clustering patterns according to the ORF regions they included: in the trees of recombination regions II and IV (Fig. 3d and S1B, S2, S3, and S4A), which represent ORF1ab and a large part of the S-M protein-coding regions, each clade clustered similarly to the trees in Fig. 1b and S1A, respectively. In recombination region III (positions 18,033 to 23,502; 5,470 nucleotides in length) (Fig. 3b), which comprises the C-terminal region of ORF1ab and the N-terminal region of the S protein gene, the tree pattern appeared similar to that of the S protein gene (Figs 2a and 3e), which is characterized by a much higher substitution rate than ORF1ab (Table S3). Even though we used culture medium from third-passage Vero cells for the RNA isolation of the KOR002 strain, the possibility of contamination of the original sputum sample with multiple viral clones and subsequent recombination can be excluded because the Chinese and Saudi Arabian strains related to KOR002 all exhibited similar genomic recombination patterns. Taken together, these results suggest that genetic recombination has contributed to the evolutionary dynamics of MERS-CoV genomes and that this has particularly shaped the recent MERS-CoV ancestry of the Korean outbreak.
Based on the phylogenetic clustering patterns and the recombination imprints we detected (Figs 1-3 and Table 1), one of the recombinant strains that evolved from the Hafr-Al-Batin-1 clade was introduced by air travel into Korea. We can only speculate about when the genetic recombination occurred. However, in the Riyadh area, some strains of the Jeddah-Riyadh clade already circulated before May 2015, and considering the close relationships between some of the Hafr-Al-Batin-1 and Jeddah-Riyadh clade sequences shown in the phylogenetic trees,  Fig. 3e, genetic exchange appeared to have occurred among them and affected the phylogenetic evolution of MERS-CoV lineages before the Korean traveler was infected by a productive recombinant strain in the area 24 . However, as discussed previously with regard to the emergence of severe acute respiratory syndrome coronavirus (SARS-CoV) in 2002, other evolutionary aspects, such as mutation rates and selection pressure, should be considered to understand the evolutionary dynamics of MERS-CoV 21,25,26 . Possibly different molecular clock rates of MERS-CoV in animal hosts and humans may also have to be taken into account. As shown by the genomic evolution of influenza A viruses 27 , MERS-CoV might experience different evolutionary courses in different hosts. To better understand these dynamics, the chain of MERS-CoV zoonotic transmissions should be further clarified.
Outside the Arabian Peninsula, Korea experienced the largest outbreak of MERS. Seeded by only a single patient, MERS-CoV caused more than 160 confirmed cases in less than a month, and thousands of people were confined under close monitoring. The case fatality rate (CFR) of the MERS outbreak in Korea appears relatively low (approximately 11.7%) compared with previous outbreaks in the Middle East, and no signs of community transmission have been reported. In addition, a situation assessment of the MERS outbreak in Korea issued by the WHO Global Alert and Response program stated that no significant virological change had been seen so far and that the transmission patterns are unlikely to differ from those previously reported in the Middle East 16 . However, human infections with MERS-CoV are ongoing in Middle Eastern countries, and the virus may travel anywhere from the region, as seen in the current Korean outbreak and many previous cases. To support the struggle against the relatively new MERS-CoV infection, effective medical countermeasures should be prepared, drawing on comprehensive research into epidemiology, pathogenesis, and transmission.
In conclusion, we suggest that the MERS-CoV outbreak in Korea was caused by a strain closely related to three 2015 strains from the Hafr-Al-Batin-1 clade, and that the relatively recent ancestor of these viruses exhibits a unique recombination pattern that differs from those of other putative recombinants.

Methods
Sequence preparation. In

Phylogenetic trees and evolutionary dynamics.
Phylogenetic relationships, evolutionary rates (nucleotide substitutions/site/year), and the time (year) of the most recent common ancestor (tMRCA) were estimated using a time-framed Bayesian evolutionary analysis approach via a Markov chain Monte Carlo (MCMC) inference method, implemented in the BEAST package (v1.8.2) 19 . We used the GTR+I+Γ substitution model, a lognormal relaxed molecular clock model, and a Bayesian skygrid tree prior. For the ORF4a, ORF4b, and ORF5 datasets, we used the HKY+Γ substitution model and a strict clock model. For the E coding region dataset, the substitution and molecular clock parameters (but not the tree model) were linked to those of the complete genome dataset. MCMC analyses were run for 50 million iterations, sampling every 25,000 iterations, with the first 10% discarded as burn-in. Two or three independent runs for each dataset were combined and assessed in Tracer (v1.6) 35 to ensure their convergence. The MCMC tree samples were used to summarize maximum clade credibility (MCC) trees for each dataset using TreeAnnotator (v1.8.1), which were visualized using FigTree (v1.4.2). The estimates are presented as mean values along with the lower and upper limits of the 95% highest posterior density (HPD).
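The sampling scheme and interval summaries described above can be illustrated with a minimal Python sketch. It is not the BEAST or Tracer implementation: the function names (`retained_samples`, `hpd_interval`, `summarize`) are hypothetical, the state logged at iteration 0 is ignored for simplicity, and the HPD interval is approximated as the narrowest window covering the requested fraction of sorted samples, which assumes a unimodal posterior.

```python
import math


def retained_samples(chain_length, sample_every, burnin_percent):
    """Count the logged states kept after discarding the burn-in.

    Assumption: the state logged at iteration 0 is not counted.
    """
    logged = chain_length // sample_every
    discarded = logged * burnin_percent // 100
    return logged - discarded


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval holding `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples; keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def summarize(samples, mass=0.95):
    """Return (mean, lower HPD limit, upper HPD limit)."""
    mean = sum(samples) / len(samples)
    lower, upper = hpd_interval(samples, mass)
    return mean, lower, upper


# 50 million iterations, logged every 25,000, with a 10% burn-in
print(retained_samples(50_000_000, 25_000, 10))  # 1800 states per run
```

Under these assumptions each run contributes 1,800 post-burn-in states, so combining two or three runs yields 3,600 or 5,400 states from which the mean and the 95% HPD limits would be summarized.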

Recombination analysis.
To detect putative recombinant regions in the MERS-CoV genome, we used the RDP4 program (v.4.39) 36 with the default window size (30 bp) and a larger window size of 1,000 bp, and the results obtained were confirmed by a manual bootscan method. Using the recombination breakpoints detected in the KOR002 strain with the default setting, we compiled new sequence datasets by dividing the complete genome sequences into five non-recombinant fragments. We subsequently reconstructed the phylogenetic relationships in each region using a maximum likelihood method (GTR+I+Γ, 500 bootstrap replicates) implemented in MEGA5 37 . The trees were visualized using FigTree (v1.4.2).
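Dividing an alignment at the detected breakpoints can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in RDP4 or MEGA5: the helper names (`split_by_breakpoints`, `region_length`) and the toy alignment are hypothetical, and breakpoints are assumed to be 1-based alignment positions at which a new fragment begins. Only the region III coordinates (18,033 to 23,502) come from the text; the toy breakpoints do not represent the real ones.

```python
def region_length(start, end):
    """Length of a region given 1-based inclusive coordinates."""
    return end - start + 1


def split_by_breakpoints(alignment, breakpoints):
    """Split every aligned sequence into consecutive fragments.

    alignment   -- dict mapping strain name to aligned sequence
    breakpoints -- increasing 1-based positions where a new fragment starts
    Returns a list of dicts, one per fragment, in genome order.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    total = lengths.pop()
    if list(breakpoints) != sorted(set(breakpoints)):
        raise ValueError("breakpoints must be strictly increasing")
    if breakpoints and not (1 < breakpoints[0] and breakpoints[-1] <= total):
        raise ValueError("breakpoints must fall inside the alignment")

    starts = [1] + list(breakpoints)
    ends = [b - 1 for b in breakpoints] + [total]
    return [
        {name: seq[start - 1:end] for name, seq in alignment.items()}
        for start, end in zip(starts, ends)
    ]


# Toy alignment with hypothetical strain names and breakpoints
toy = {
    "strainA": "AAAACCCCGGGGTTTTAAAA",
    "strainB": "AAAACCCCGGGGTTTTCCCC",
}
fragments = split_by_breakpoints(toy, [5, 9, 13, 17])
print(len(fragments))                 # 5 fragments from 4 breakpoints
print(region_length(18_033, 23_502))  # 5470, matching region III
```

Four breakpoints always yield five fragments, and each fragment dictionary can then be written out as a separate dataset for per-region tree reconstruction.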