Phylogeographic analysis of hemorrhagic fever with renal syndrome patients using multiplex PCR-based next generation sequencing

Emerging and re-emerging infectious diseases caused by RNA viruses pose a critical public health threat. Next generation sequencing (NGS) is a powerful technology to define genomic sequences of the viruses. Of particular interest is the use of whole genome sequencing (WGS) to perform phylogeographic analysis, that allows the detection and tracking of the emergence of viral infections. Hantaviruses, Bunyaviridae, cause hemorrhagic fever with renal syndrome (HFRS) and hantavirus pulmonary syndrome (HPS) in humans. We propose to use WGS for the phylogeographic analysis of human hantavirus infections. A novel multiplex PCR-based NGS was developed to gather whole genome sequences of Hantaan virus (HTNV) from HFRS patients and rodent hosts in endemic areas. The obtained genomes were described for the spatial and temporal links between cases and their sources. Phylogenetic analyses demonstrated geographic clustering of HTNV strains from clinical specimens with the HTNV strains circulating in rodents, suggesting the most likely site and time of infection. Recombination analysis demonstrated a genome organization compatible with recombination of the HTNV S segment. The multiplex PCR-based NGS is useful and robust to acquire viral genomic sequences and may provide important ways to define the phylogeographical association and molecular evolution of hantaviruses.

Scientific RepoRts | 6:26017 | DOI: 10.1038/srep26017 ultra-low copy and frequent variation of viral genomes obtained from specimens. However, the characterization of viral genomic sequences by NGS provides broad and deep insights into preventive and therapeutic strategies for the virus infections 11 .
Hantaviruses, family Bunyaviridae, are negative sense RNA viruses and contain tripartite RNA genomes; large (L), medium (M), and small (S) segments encoding RNA-dependent-RNA polymerase (L), membrane glycoproteins (Gn and Gc), and nucleocapsid protein (N), respectively 12 . Hantaviruses are the causative agents of hemorrhagic fever with renal syndrome (HFRS) in Eurasia and hantavirus pulmonary syndrome (HPS) in the Americas [13][14][15] . They are a significant public health threat with ~200,000 clinical cases reported annually and a fatality rate ranging from 1 to 35% worldwide 16 . Hantaviruses are transmitted to human via the respiratory route of aerosolized infectious particles from rodent excreta or bite of infected rodents. Apodemus agrarius is the reservoir of Hantaan virus (HTNV), an etiological agent of HFRS in the Republic of Korea (ROK) and China 17 . Common symptoms of HFRS include headache, myalgia, abdominal and back pain, nausea, vomiting, and diarrhea. The typical disease course consists of five phases; febrile, hypotensive, oliguric, diuretic, and convalescent 18 . The febrile phase is an initial stage defined by fever, pains, and edema during 3-5 days. The hypotensive phase lasts few hours and identified as decreased blood pressure, internal bleeding, proteinuria, and thrombocytopenia. The oliguric phase lasts for 3-7 days and is marked by renal dysfunction (reduced urine output), hypervolemia, and blood electrolyte imbalance. The diuretic (increased urine output) and convalescent phases are recovery stages that last for several weeks to months, and is characterized by progressive improvements in glomerular filtration rate, renal blood flow, and urine output control. A critical hallmark of HFRS is capillary leakage that results in edema and hemorrhage, suggesting that the vascular endothelium is damaged by cytokine storm against HTNV infection 19 . Hantavirus infections remain less understandable due to the lack of animal experimental models, difficulties in propagating the agent, and reverse genetic tools.
In this study, a method based on multiplex PCR-based NGS to achieve whole genome sequencing of HTNV L, M, and S segments from human or mammalian clinical specimens was developed. Using this method, the whole genome sequence of HTNV L, M, and S segments from four ROK and US Army military HFRS patients was obtained and compared with genome sequences from natural hosts captured from endemic areas to perform the phylogeographic analysis. Based on this analysis, the most likely site of HTNV infections was determined to be the field military areas where patients had trained during the 50 days previous to the onset of symptoms. In addition, analysis of the HTNV tripartite genomes suggested a recombination of the S segment in nature.

Results
Whole genome sequencing of HTNV using multiplex PCR-based NGS. To obtain whole genome sequence of HTNV from the patient samples, HTNV tripartite genomes were amplified by performing HTNVspecific multiplex PCR. The coverage of HTNV tripartite genomes from ROKA13-8, ROKA14-11, US8A14-2, and US8A15-1 is shown (Fig. 1). The genomic sequence of HTNV L, M, and S segments from ROKA 13-8 was recovered up to 85.9%, 98.2%, and 100% based on full-length of the prototype HTNV 76-118 tripartite genomes, respectively. For ROKA14-11, 97.2% of HTNV L segment was obtained and the consensus sequence of HTNV M and S segments was completely obtained. The coverage of HTNV L segment from US8A14-2 was 96.1% whereas the HTNV M and S segments were completely sequenced. The genomic sequence of HTNV from US8A15-1 was acquired by 87.4%, 98.2%, and 100% for L, M, and S segments, respectively.

Phylogeographic analyses of HFRS patients.
For phylogeographic analyses, the whole genome sequence of 26 HTNV strains was acquired from rodents collected in endemic areas in the ROK (Fig. 2). The HTNV strains in the phylogenetic tree consisted of 3 strains from Twin Bridge Training Area South (TBTA-S), 3 strains from Twin Bridge Training Area North (TBTA-N), 3 strains from Dagmar North (DN) in Paju, 2 strains from Fire Point 131 (FP131) in Yeoncheon, 4 strains from Nightmare Range (NR) and Rodriguez Multi-Purpose Range Complex (MPRC) in Pocheon in Gyeonggi province, and 6 strains from Cheorwon-A (Guntan-ri) and B (Jigyeong-ri and Munhye-ri), 2 strains from Hwacheon, and 3 strains from Yanggu in Gangwon province. The genomic sequence of HTNV tripartite genomes from A. agrarius formed geographic clusters providing a platform for the phylogeographic analysis.
Patient ROKA13-8, assigned to Hwacheon, conducted military training in Cheorwon (  Table 2. On December 1, the Soldier was confirmed to be positive for HTNV infection by IFA and RT-PCR. Phylogenetically, the L, M, and S segments of ROKA14-11 associated with the HTNV strains from TBTA-S. Patient US8A14-2 exhibited the onset of clinical symptoms on December 17, 2014 and was admitted to the Brian Allgood Army Community Hospital (BAACH) with diarrhea, vomiting, fever, and rapid pulse on December 21. The Soldier was diagnosed for HFRS while referring also myalgia, headache, vomiting, diarrhea, epigastric pain, petechiae on his thigh, splenomegaly, and kidney inflammation. Laboratory diagnosis confirmed that the Soldier was positive for HTNV by IFA and RT-PCR on December 29. According to the medical record, the Soldier was a humvee driver and tactically moved to Rodriguez MPRC range on

Recombination analysis of HTNV.
To examine the genetic exchange of HTNV S segment, HTNV tripartite genomes were sequentially aligned and analyzed using the RDP4 program (Fig. 3). The L and M segment genomic sequences of US8A15-1 showed high homology with HTNV strains, Apodemus agrarius (Aa) 04-722 and Aa09-410 collected from Pocheon. The partial sequence (coordinated to 577-1,578nt) of S segment of US8A15-1 was highly homologous to Aa14-272 identified from Hwacheon. P-value of the analyses was from 5.803E-5 to 7.013E-11 and the RDP recombination consensus score (RDPRCS) of US8A15-1, approximately 0.659, was relatively high when compared to HTNV strains in Pocheon (RDPRCS = 0.011) and Hwacheon (RDPRCS = 0.330). These results indicated that UA8A15-1 may be a recombinant due to the exchange of partial S segment in nature.

Discussion
In 2009, a previous study showed a phylogeographic link between four US Soldiers diagnosed with HFRS due to exposure at military training sites and A. agrarius positive for HTNV based on a partial sequence of HTNV M segment 20 . This analysis provided mechanisms to identify the location where the soldiers most likely acquired HTNV infections and under what conditions. The analysis of 354 bp-partial sequence of HTNV M segment between HFRS patients and rodent hosts may be insufficient to better define the phylogeographic link for future cases because of the low resolution. A recent study demonstrated that whole genome sequencing of Lassa virus enhanced the resolution of phylogenetic analysis for the molecular diversity and genomic characteristic compared to the partial sequences 21 . The extension of HTNV tripartite genomic sequences was limited because of the lack of NGS technology and ultra-low copy of viral RNA in clinical samples. Notably, whole genome sequences of HTNV was obtained from HFRS patient sera by multiplex PCR-based NGS using the primer set specific for HTNV L, M, and S segments. This attempt yielded enhanced levels of viral reads throughout the NGS, demonstrating the complete open reading frame genomic sequences of HTNV S and M segments. The multiplex PCR-based NGS is robust and efficient for acquiring the whole genome sequence of HTNV from HFRS patients as well as wild rodent hosts.
To establish the phylogeographic map, HTNV tripartite genome sequences from A. agrarius captured in the endemic and military training areas near the demilitarized zone were completely acquired using RT-PCR. Compared with the genomic HTNV sequences of patients and the rodent hosts, the site where HFRS patients acquired HTNV infection was suggested by the phylogeographic analyses (Fig. 4). ROKA13-8 and ROKA14-11 (Korean army HFRS patients) formed a cluster with HTNV strains from Cheorwon-B and TBTA-S, respectively. Patient ROKA13-8 was assigned to Hwacheon, but conducted military training in Cheorwon. ROKA13-8 was phylogenetically associated with HTNV strains in Cheorwon-B suggesting the Soldier was most likely infected with HTNV during the training in Cheorwon. This result also suggested that patient ROKA14-11 conducted military activity and was the most likely infected with HTNV near TBTA-S in Paju. Phylogenetic analysis of US Army HFRS patient US8A14-2 showed a close relationship with HTNV strains from DN in Paju. This result and clinical records suggested the Soldier was the most likely infected with HTNV at DN between November 17 and 20, fell within the normal incubation period of HTNV. The US Army HFRS patient US8A15-1 conducted military training at MPRC, Pocheon. The phylogenetic association of US8A15-1 with HTNV strains in Pocheon identified the most likely site infection. The phylogenetic and geographic analyses of HTNV whole genome sequences from the patients and natural reservoirs provide an epidemiological diagnostic and surveillance system for annual endemic HTNV infections, identifying the most likely site of infection and under what conditions the infection was acquired. These data contribute to the development of disease risk analyses and preventive measures by changing the environment that reduces rodent habitat and training site avoidance during periods of highest risk 22 .
Recombination is a molecular genetic mechanism that confers the genetic diversity in RNA viruses in nature 23 . The genetic event of bunyaviruses was observed in nature and in vitro 24,25 . The frequency of the recombination of hantavirus tripartite genomes may be associated with the function of proteins encoded on the L, M, and S segments. Diverse recombinants generated by the exchange of different genetic information could be present. For instance, Seoul virus, Andes virus, Tula virus, and Pummala virus generated the S segment recombinants, whereas the M segment recombination was observed in HTNV, China 26 . In this study, the phylogenetic tree demonstrated L and M segments of US8A15-1 formed clusters with HTNV in Pocheon but the partial sequence of S segment of the HTNV was closely associated to that of HTNV in Hwacheon. This observation suggests that  US8A15-1 appears to be an S segment recombination of HTNV with 0.659 of RDPRCS and P-value < 0.05. In addition, the exchange of genetic components may be a critical factor in the pathogenicity of the viral infections 27 . Given our limited knowledge of genetic events in nature, whether the S segment recombination is a virulence determinant of hantavirus-associated diseases remains to be investigated.
In conclusion, multiplex PCR-based NGS elicited whole genome sequences of HTNV from HFRS patients. In combination with the whole genomic sequences of HTNV from natural reservoirs, A. agrarius, phylogeographic analyses may be useful for genomic-based diagnosis and surveillance of hantavirus-borne outbreaks. The novel platform of NGS-based diagnosis provides a potential tool to develop risk analyses and preventive and therapeutic strategies against emerging viral diseases. Indirect Immunofluorescence Antibody test. Serum was diluted 1:32 and placed into duplicate wells of acetone-fixed Vero E6 cells infected with HTNV, and then incubated at 37 °C for 30 min. After the wells were washed three times with Phosphate-Buffered Saline (PBS), fluorescein isothiocyanate-conjugated goat antibody to mouse immunoglobulin G (IgG) antibodies (ICN Pharmaceuticals, Laval, Canada) was added to the wells. The wells were incubated at 37 °C for 30 min, washed three times with PBS, and then examined under a fluorescence microscope. The detection of virus-specific fluorescence was indicative of HTNV infection.

RNA extraction and RT-PCR.
A. agrarius lung tissues were mechanically homogenized using TissueLyser (MP Biomedicals, Santa Ana, USA), and human sera were added with TRI Reagent Solution (Ambion, Austin, USA). Total RNA was extracted using the Hybrid R TM Kit (GeneAll, Seoul, Republic of Korea) according to the manufacturer's specifications. cDNA was generated using M-MLV (Promega, Madison, USA) with a random hexamer or OSM55, followed by nested PCR performed in 25 μ l reaction mixture containing 250 μ M dNTP, 2 mM MgCl 2 , 1U of Super-Therm Taq DNA polymerase (JMR Holdings, London, UK), 1.5 μ l of cDNA template and 5 pM of each primer 28 . Initial denaturation was performed at 94 °C for 4 min, followed by 6 cycles of denaturation at 94 °C for 30 sec, annealing at 37 °C for 30 sec, elongation at 72 °C for 1 min, followed by 30 cycles of denaturation at 94 °C for 30 sec, annealing at 42 °C for 30 sec, elongation at 72 °C for 1 min and then elongation at 72 °C for 5 min using ProFlex PCR System (Life technology, Carlsbad, USA).
Genomic sequencing of HTNV from A. agrarius. Viral cDNA from lung tissues of A. agrarius was synthesized with random hexamers or OSM55 (5′ -TAGTAGTAGACTCC-3′ ). Conventional nested PCR was performed using Solg TM 2X h-Taq PCR Smart mix (Solgent, Daejeon, Republic of Korea). The PCR program is an initial denaturation at 95 °C for 15 min, followed by 40 cycles of denaturation at 95 °C for 20 s, annealing at 50 °C for 40 s, and elongation at 72 °C for 1 min, and then a cycle of 72 °C for 3 min.
Multiplex PCR. Multiplex PCR primers were designed using BioEdit Sequence Alignment Editor (Version 7.1.11).
Each primer for the multiplex PCR of HTNV L, M, and S segments was prepared and mixed. cDNA synthesized above was amplified twice for HTNV using the primer mixtures and Solg TM 2X Uh-Taq PCR Smart mix (Solgent) according to the manufacturer's instruction. The first PCR was performed in 25 μ l reaction mixture containing 12.5 μl 2X Uh pre-mix, 1.0 μ l of cDNA template, 1.0 μ l of each primer mixture, and 10.5 μ l distilled water (DW). Initial denaturation was performed by a cycle of 95 °C for 15 min, then 40 cycles of 95 °C for 20 sec, 50 °C for 40 sec, 72 °C for 1 min, and a cycle of final elongation at 72 °C for 3 min. The second PCR was performed in 25 μ l reaction mixture containing 12.5 μ l 2X Uh pre-mix, 1.0 μ l of the first PCR product, 1.0 μ l of each primer mixture, and 10.5 μ l DW. Initial   Recombination analysis. Sequence sets to HTNV L, M, and S segments were aligned and analyzed using the Recombination Detection Program 4 (RDP4) software package 30 . The recombination event evaluated by RDP4 was considered significant if it satisfied at least 2 criteria when the P-value (p) < 0.05 and the RDP recombination consensus score (RDPRCS) was > 0.6 31 . When p < 0.05 and the RDPRCS was between 0.4 and 0.6, the recombination event was possible. An RDPRCS < 0.4 with p < 0.05 indicated the rejection of the recombination event. The whole genome sequences of HTNV L, M, and S segments from recombinants, parents, and in-and outgroups were used to generate ML analysis for inferring evolutionary trees in MEGA 5.2, respectively.