Molecular epidemiology of hepatitis C virus in Cambodia during 2016–2017

In Cambodia, few epidemiological data on hepatitis C virus (HCV) are available, and all previous studies were limited to small or specific populations. In the present study, we characterized HCV genetic diversity based on demographic data, clinical data, and phylogenetic analysis of HCV non-structural 5B (NS5B) sequences from a large cohort of patients (n = 3,133) coming from most parts of Cambodia between September 2016 and December 2017. The phylogenetic analysis revealed that HCV genotypes 1 and 6 were predominant, in equal proportions (46% each). The remaining sequences were genotype 2 (4.3%) and unclassified variants (3.6%). Within genotype 1, subtype 1b was the most prevalent, accounting for 94%. Within genotype 6, we observed a high degree of diversity; the most common subtypes were 6e (44%) and 6r (23%). This diversity points to a longstanding history of HCV in Cambodia. No geographic specificity of viral genotype was observed. Risk of HCV infection was mainly associated with a history of invasive medical procedures (64.7%), having a partner with HCV (19.5%), and blood transfusion (9.9%). These factors were comparable across HCV genotypes. Together, these features define the specificity of HCV epidemiology in Cambodia.


Results
Study population. Table 1 describes the sociodemographic and clinical characteristics of the 3,133 patients included in the study. The average age and standard deviation (SD) of the study population was 55.0 (SD, 11.2) years (range,  and 59% were female (Table 1). Forty-four percent (n = 1,374/3,133) of the patients were from Phnom Penh, the capital city of Cambodia, and its vicinity (n = 1,006 from Phnom Penh, n = 368 from Kandal Province), and a greater proportion were from outside the capital (56%, n = 1,758/3,131). Nearly all patients were Cambodian (99.9%) and most (94.2%) were naïve to HCV treatment, including interferon and ribavirin.
HCV genotype and subtype distribution. Preliminary phylogenetic analysis of HCV non-structural 5B (NS5B) sequences obtained from 3,133 patients revealed the presence of HCV genotypes 1, 2, and 6, as well as unclassified variants. HCV genotypes 1 and 6 were predominant, found in 1,444 (46.1%) and 1,442 (46.0%) samples, respectively. A small number of samples were identified as genotype 2, accounting for 4.3% (134/3,133), followed by unclassified variants (3.6%; 113/3,133). Subsequently, maximum likelihood (ML) phylogenetic trees were constructed separately for HCV genotype 6 sequences and for non-6 sequences (including genotypes 1 and 2). Figure 1 shows the phylogenetic patterns of HCV sequences within each genotype.
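The percentages above follow directly from the reported counts. As a quick arithmetic check, the minimal Python sketch below recomputes each genotype's share of the 3,133 sequences. The dictionary and function names are illustrative assumptions and are not part of the study's analysis pipeline.

```python
# Genotype counts as reported in the text (NS5B sequences, n = 3,133).
genotype_counts = {
    "genotype 1": 1444,
    "genotype 6": 1442,
    "genotype 2": 134,
    "unclassified": 113,
}


def genotype_percentages(counts):
    """Return each genotype's share of the total, in percent, rounded to 0.1."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = genotype_percentages(genotype_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The four counts sum to 3,133, so the shares are mutually consistent with the cohort size.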
Geographical distribution of HCV genotype. Patients included in the study came from all provinces of Cambodia, although the number of patients varied from one province to another. Figure 2 shows the distribution of HCV genotypes by geographical area. In provinces with a high number of participants, the three viral genotypes detected in the present study were distributed similarly, except in Kampong Chhnang Province, where the proportion of HCV genotype 1 was much higher than in other provinces (63% GT1 and 29% GT6 among n = 305 in Kampong Chhnang Province, compared with 44% GT1 and 48% GT6 among n = 2,826 patients from all other provinces).

HCV transmission risk factors and HCV genotypes. Patients were interviewed about risk factors for HCV infection, including a history of invasive medical procedures, blood transfusion, and drug use (Table 1). The majority of patients had a history of invasive medical procedures (64.7%). Having a partner with known HCV infection and a history of blood transfusion were also common (19.5% and 9.9%, respectively). The distribution of risk factors associated with HCV infection was comparable across genotypes.

Clinical features of HCV infection and HCV genotypes.
We found that the proportion of patients with a high HCV RNA viral load (VL) of greater than or equal to 800,000 IU/mL was highest among patients infected with HCV genotype 6. Other clinical features were similar across genotypes, except for the fibroscan value. Patients infected with HCV genotype 1 had the highest level of liver stiffness, as shown by a higher fibroscan value (median: 10.4 kPa) than in those infected with HCV genotype 2 (8.6 kPa), genotype 6 (9.7 kPa), and unclassified variants (9.2 kPa) (p < 0.05). Additionally, the distribution of fibrosis stage varied across genotypes: HCV genotype 1 had the greatest proportion of patients with fibrosis stage ≥ F3 (54%), followed by HCV genotype 6 (51%), with HCV genotype 2 having the smallest proportion (44%), although the differences were not statistically significant (Table 1).

HCV genotype and advanced fibrosis.
To evaluate the association between HCV genotype and advanced fibrosis (defined as a fibroscan value ≥ 20 kPa), we calculated the odds of having advanced fibrosis for the 3,099 patients with fibroscan data. In univariate analysis, we found that HCV genotype 6 had lower odds of advanced fibrosis than HCV genotype 1, and this was consistently observed in the multivariable analysis after adjusting for age, sex, and body-mass-index (adjusted odds ratio (aOR) [95% confidence interval (CI)]: 0.7 [0.6-0.9], p < 0.01).

Discussion
Here, we report a large dataset on HCV molecular epidemiology in Cambodia, based on demographic data, clinical data, and HCV sequences from 3,133 patients presenting from most parts of Cambodia between September 2016 and December 2017. This study revealed the circulation of HCV genotypes 1, 2, and 6 and unclassified variants, with a predominance of genotypes 1 and 6, accounting for 46.1% and 46.0% of the study population, respectively. Our finding is in agreement with previous studies conducted in different populations and regions in Cambodia 15,16,18,20, at least with respect to the viral genotypes detected. However, a quantitative comparison of the genotype proportions reported in these studies is difficult, as all previous studies were based on small sample sizes, restricted geographical areas, and/or specific populations. For instance, HCV genotype 1 was the most prevalent strain (68%) in a study of 28 HIV and HCV co-infected patients followed up at Calmette Hospital in Phnom Penh, while HCV genotype 6 (25%) and genotype 2 (7%) were less frequent 16. On the other hand, HCV genotype 6 was predominant among 25 HCV-infected immigrant workers from Cambodia in Thailand 18 and among 11 HCV viremic adults living in Siem Reap province 20. In a recent study, De Weggheleire et al. reported a predominance of both genotypes 1 (52.8%) and 6 (41.4%) among 87 HIV and HCV co-infected patients from Phnom Penh 15.
When analyzing each genotype in depth, we observed a low level of genetic diversity among the HCV genotype 1 and 2 sequences present in Cambodia. Overall, the predominant subtype was 1b, followed by small percentages of subtypes 2a, 1a, 2m, 2b, and 2i. The frequency of subtype 1b found in our study was comparable with the previous report by De Weggheleire et al. 15.
Within genotype 6, there was high genetic diversity, with subtype 6e the most common, accounting for 43.5%. Subtype 6e is commonly detected in neighboring countries, and it has been suggested that HCV subtype 6e originated from Vietnam 23. HCV subtype 6r was the second most common subtype, found in 22.9% of HCV genotype 6 sequences. In contrast to subtype 6e, HCV subtype 6r has been described as specific to Cambodia, as it has been detected only in this country or among its native population living abroad 18,24. Nevertheless, further molecular evolutionary analysis of this subtype is needed to confirm this hypothesis. Other subtypes were also detected at low frequencies: 6q, 6p, 6a, 6xf, 6s, 6o, 6l, 6u, 6h, 6n, 6t, 6xb, 6xc, and unclassified genotype 6 variants. The great genetic diversity of HCV genotype 6 has also been described in neighboring Laos 13, Thailand 25, and Vietnam 23. This characteristic may indicate that HCV genotype 6 has circulated, adapted, and evolved over a long period of time in Cambodia, as previously suggested for other countries in the region 13,23.
HCV genotype 3 circulates commonly among intravenous drug users through needle and syringe sharing 26 . Unsurprisingly, this genotype is completely absent in our study. This may be explained by a low number of intravenous drug users in the current study. As mentioned in Table 1, 18 out of 3,120 patients were drug users.
Among the 3,133 HCV sequences included in the analysis, 113 (3.6%) remained unclassified and may represent new HCV variants. Further investigation of these viruses, by analyzing other genes or full-length genome sequences, could be relevant.
All of the HCV genotypes detected were homogeneously distributed in the provinces where substantial numbers of patients were included, except Kampong Chhnang Province, where the frequency of HCV genotype 1 was higher than in other provinces. This observation suggests that, overall, there is no geographic specificity of genotype. However, we were not able to explain the high frequency of HCV genotype 1 in Kampong Chhnang Province. It is important to note that our data were biased and not representative of the whole country, as the patients included in the present study were self-referred to receive diagnosis and/or treatment for HCV at a facility located in the capital city.

www.nature.com/scientificreports

Figure 1. Phylogenetic trees of HCV NS5B sequences. Phylogenetic trees were inferred using the maximum likelihood (ML) method based on the GTR + Γ + I (for non-6 HCV genotypes) and JC + Γ (for HCV genotype 6) models of nucleotide substitution, with HCV genotype 8 as the outgroup.
(A) ML phylogenetic tree for HCV NS5B sequences from 1,691 Cambodian patients with non-6 HCV genotypes (indicated in sky blue) and 285 GenBank reference sequences indicated in different colours (subtype 1a: orange; 1b: red; 2: green; 2a: magenta; 2b: purple; 2i: navy; 2m: golden red; 3: teal; 4: pink; 5: maroon; 6: yellow; 7: dark orange; and 8: dark purple).
(B) ML phylogenetic tree for HCV NS5B sequences from 1,442 Cambodian patients with HCV genotype 6 (indicated in sky blue) and 285 GenBank reference sequences indicated in different colours (genotype 1: red; 2: green; 3: teal; 4: pink; 5: yellow; 6a: salmon; 6e: magenta; 6h: medium purple; 6l: dark slate grey; 6n: turquoise; 6o: cyan; 6p: dark green; 6q: coral; 6r: dark golden red; 6s: rosy brown; 6t: olive; 6u: purple; 6xb: deep pink; 6xc: maroon; 6xf: brown; other subtypes within genotype 6, including 6b, 6c, 6d, 6f, 6g, 6i, 6k, 6m, 6v, 6w, 6xa, 6xd, and 6xe: blue; 7: dark orange; and 8: dark blue).

Risk factors associated with HCV infection, such as exposure to invasive medical procedures, blood transfusion, having a partner with HCV, and a history of drug use, were comparable for all HCV genotypes circulating in Cambodia. Among these factors, past experience of invasive medical procedures was reported by more than 60% of the patients included in the present analysis, and many infections are likely to be iatrogenic, since Cambodia is known as a country with a high rate of medical injections, such as therapeutic injections and intravenous infusions 22,27,28. Recently, we reconstructed a likely transmission history of a massive iatrogenic HIV outbreak that occurred in Roka (a rural commune in Cambodia) between 2014 and 2015. We showed that unsafe injections most likely led to this large outbreak, which was also associated with long-standing HCV transmission with multiple and independent sources of introduction 22.
Very few studies have assessed the natural history and clinical features of HCV genotype 6 compared with other genotypes. A cross-sectional study of 308 Southeast Asians in California, USA, found no significant differences in virological and clinical characteristics between HCV genotype 6 and other genotypes 29. In a study conducted in Hong Kong, Seto et al. compared the natural history of 138 HCV genotype 1 patients (median age: 50 years) with that of 78 HCV genotype 6 patients (median age: 46.5 years) after a median follow-up period of over 5 years. In that study, both genotypes had comparable liver biochemistry and HCV RNA viral load, and similar rates of development of cirrhotic complications and mortality 30. The findings of these studies suggest that viral genotype is not the main discriminating factor of disease outcome. In our study, HCV genotype 6 appeared to be associated with a higher viral load than other genotypes. However, we were not able to further explore the association between disease progression and viral genotype, since our analysis was cross-sectional and many important factors that affect the natural history of HCV, such as age at the time of initial infection, duration of infection, and host factors, were not available.
In conclusion, we showed that the molecular epidemiology of HCV in Cambodia is dominated by two genotypes present in similar proportions: genotypes 1 and 6. The most prevalent viral subtype was 1b, followed by 6e and 6r. The route of transmission of HCV in Cambodia could be predominantly linked to invasive medical procedures, including unsafe injection practices. This epidemiological profile is specific to Cambodia. Further investigation is needed to better understand the evolution of HCV strains in Cambodia.

Methods
Study design. Between September 2016 and December 2017, 3,352 adult patients with chronic HCV infection, 18 years of age or older, visiting the HCV clinic of MSF were recruited to participate in the study. Among them, 3,133 patients gave their written consent to be included. MSF's HCV service was open to any patient seeking diagnosis and/or treatment for HCV and received patients from Phnom Penh and any of the other 23 provinces in Cambodia.
The current study was a prospective and anonymous analysis of data collected between September 19, 2016 and December 6, 2017 from HCV-infected adult patients who were enrolled in a cohort at the MSF clinics. The treatment regimen consisted of a combination of sofosbuvir (NS5B inhibitor) and daclatasvir (or ledipasvir; NS5A inhibitors) for 12 weeks, as recommended by the current AASLD/IDSA HCV guideline 31. The REDCap electronic database (Research Electronic Data Capture; Vanderbilt University, USA) 32, hosted at Epicentre, Médecins Sans Frontières (Paris, France), was used to store demographic data (gender, age, place of residence, etc.) and clinical data at enrollment for all patients. Blood samples were collected for assessment of HCV infection.
All HCV-infected adults (≥18 years) whose demographic data, clinical data, and HCV sequences were available were eligible for the present analysis.
Ethics statement. All patients included in the present study provided written informed consent for the use of their demographic, clinical, and biological data. The study protocol was approved by the Cambodian National Ethics Committee for Health Research. All methods were performed in accordance with relevant guidelines and regulations.
Assessment of HCV infection and genotyping. All specimens that tested positive for HCV antibodies by SD Bioline HCV (Standard Diagnostics, Inc., Rest-of-World regulatory version, Kyonggi-do, Korea) were assessed for HCV RNA viral load using the COBAS AmpliPrep/Cobas TaqMan HCV Quantitative Test, v2.0 platform (Roche) according to the manufacturer's instructions.
HCV genotype and subtype were determined for all samples with detectable HCV RNA viral load (>1.2 log10 IU/mL) based on phylogenetic analysis of the HCV non-structural 5B (NS5B) genome region (371 bp), which was amplified using a semi-nested RT-PCR, as described previously 33. The amplifications of the NS5B gene were performed at the Institut Pasteur du Cambodge (Phnom Penh, Cambodia). All PCR-amplified fragments were sent to a commercial sequencing facility (Macrogen, Inc., Seoul, South Korea) for sequencing with the Big Dye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems). Chromatograms were sent back electronically to the Institut Pasteur du Cambodge for verification by visual inspection using CEQ 2000 (Beckman Coulter) software. Viral sequences were aligned with reference sequences for HCV subtypes available in the GenBank database (Supplementary Table 1). Phylogenetic trees were constructed using the maximum likelihood (ML) method based on the GTR + Γ + I 34 (for non-6 HCV genotypes) and JC + Γ 35 (for HCV genotype 6) models of nucleotide substitution, as recommended by the Find Best DNA/Protein Models program implemented in the MEGA7 software 36.
Statistical analysis. For the descriptive analysis of patients in the cohort, mean, standard deviation (SD), and ANOVA were used to describe normally distributed variables, and median, inter-quartile range (IQR), and the Kruskal-Wallis test were used to describe non-normally distributed continuous variables. Proportions and chi-squared tests were used to describe categorical variables. To compare virological and clinical features of different HCV genotypes, HCV RNA viral load at baseline, liver stiffness by transient elastography using Fibroscan, transaminase values, and comorbidities of all patients were analyzed.
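To illustrate the chi-squared comparison of categorical variables mentioned above, the following minimal Python sketch computes the Pearson chi-squared statistic (without continuity correction) for a contingency table. The function name and the example counts are hypothetical and are not study data; the study itself used STATA, which this sketch does not reproduce.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT study data): rows = two genotypes,
# columns = risk factor reported / not reported.
example_table = [[10, 20], [30, 40]]
chi2, dof = chi_squared_statistic(example_table)
print(f"chi-squared = {chi2:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared against the chi-squared distribution with the returned degrees of freedom to obtain a p-value, a step that a statistics package would normally handle.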
To examine whether HCV genotypes were associated with advanced fibrosis (≥20 kPa), univariate and multivariate logistic regression analyses were performed using the following variables: gender, age (<40; ≥40 and <50; ≥50 and <60; ≥60 and <70; ≥70 years old), and body-mass-index (<20; ≥20 and <30; ≥30 kg/m²), with HCV genotype 1 as the reference group.
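For a single binary exposure, the univariate odds ratio described above reduces to a 2 x 2 table calculation. The sketch below, in Python, computes an unadjusted odds ratio with a Wald 95% confidence interval. The function name and the counts are hypothetical and are not study data, and the adjusted estimates in this study came from multivariable logistic regression in STATA, which this sketch does not reproduce.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_noncases,
                       reference_cases, reference_noncases, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2 x 2 table."""
    odds_ratio = (exposed_cases * reference_noncases) / (
        exposed_noncases * reference_cases
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_noncases
        + 1 / reference_cases + 1 / reference_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (NOT study data): advanced fibrosis yes / no in an
# "exposed" genotype group versus the reference genotype group.
or_value, ci_lower, ci_upper = odds_ratio_with_ci(10, 40, 20, 30)
print(f"OR = {or_value:.3f} (95% CI {ci_lower:.3f}-{ci_upper:.3f})")
```

An odds ratio below 1 with a confidence interval that excludes 1 corresponds to lower odds of the outcome in the exposed group than in the reference group.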
All statistical tests were two-sided with an alpha of 0.05 and were performed using Stata version 13.1 software (StataCorp LP, College Station, Texas, USA, 2016).
Geographic HCV mapping. During the initial assessment of HCV infection, the province of residence was collected for each patient. Mapping was performed by importing Cambodian shapefiles from Open Development Cambodia for the Economic Census of Cambodia 2011 (Ministry of Planning, National Institute of Statistics) and genotype data from this study into QGIS version 2.16.0 (Development Team -Open Source Geospatial Foundation Project, 2016) 37 .