Introduction

Hepatitis C virus (HCV), one of the causative agents of chronic liver disease, remains a global public health issue despite the availability of direct-acting antivirals (DAAs). In 2015, the World Health Organization (WHO) estimated that 71 million people worldwide were living with chronic hepatitis C, about 10 million of them in Southeast Asia1. HCV is a bloodborne virus that is commonly transmitted through blood exposure, including blood transfusion, sharing of drug-injecting devices, and reuse of contaminated medical equipment, in particular syringes and needles. Exhibiting a high degree of genetic diversity2, HCV is classified into eight genotypes (1–8)3,4 and subdivided into 87 subtypes named in alphabetical order (1a – 1n, 2a – 2u, 3a – 3k, 4a – 4w, 5a, 6a – 6xf, 7a – 7b, and 8a)4,5.

An understanding of HCV molecular epidemiology is important for surveillance of transmission dynamics and supports an appropriate public health response. In the era of DAAs, knowledge of HCV genetic diversity may also be useful for the treatment and management of HCV-infected patients, particularly in cases of virological failure. Most described hepatitis C infections and treatment outcomes are associated with HCV genotype 1, which is globally distributed and well conserved6,7. In contrast, highly diverse HCV lineages are observed in highly endemic areas. For instance, in West Africa, endemic strains belonging to genotypes 1, 2, and 5 are highly prevalent8,9,10. Regional patterns of endemic diversity have been described for genotype 3 in the Indian subcontinent11, genotype 4 in North Africa and the Middle East6, genotype 5 in West9,10 and South Africa12, and genotype 6 in China and Southeast Asia6,13.

In Cambodia, a Southeast Asian country, data on HCV prevalence and molecular epidemiology are poorly documented. Previous studies were performed on small samples of specific populations, including blood donors14, human immunodeficiency virus (HIV) and HCV co-infected adults15,16,17, Cambodian immigrants in foreign countries18, and children19, and used different age categories19,20. Consequently, the prevalence of HCV in the general population is unknown. Reported prevalence ranges between 2.8% and 14.7% in rural areas depending on the study site14,19,20,21 and between 5.5% and 10.4% among people with HIV in hospital-based programs in Phnom Penh, the capital city of Cambodia15,16,17. In terms of HCV genotype, data for Cambodia are remarkably scarce compared with other countries in the region. Available sequences from previous studies revealed co-circulation of HCV genotypes 1 and 6 (refs. 16,18,22).

In the present study, we report the epidemiology of HCV genotypes in Cambodia based on a large dataset obtained from a Ministry of Health (MoH)-integrated HCV program.

Results

Study population

Table 1 describes the sociodemographic and clinical characteristics of the 3,133 patients included in the study. The mean age of the study population was 55.0 years (standard deviation (SD), 11.2; range, 18–87) and 59% were female (Table 1). Forty-four percent (n = 1,374/3,133) of the patients were from Phnom Penh, the capital city of Cambodia, and its vicinity (n = 1,006 from Phnom Penh, n = 368 from Kandal Province), and a greater proportion were from outside the capital (56%, n = 1,758/3,131). Nearly all patients were Cambodian (99.9%) and most (94.2%) were naïve to HCV treatment, including interferon and ribavirin.

Table 1 Sociodemographic and clinical characteristics of patients.

HCV genotype and subtype distribution

Preliminary phylogenetic analysis performed on HCV non-structural 5B (NS5B) sequences obtained from 3,133 patients revealed the presence of HCV genotypes 1, 2, and 6 and an unassigned genotype. HCV genotypes 1 and 6 were predominant, found in 1,444 (46.1%) and 1,442 (46.0%) samples, respectively. A small number of samples were identified as genotype 2, accounting for 4.3% (134/3,133), followed by the unclassified genotype (3.6%; 113/3,133). Subsequently, maximum likelihood (ML) phylogenetic trees were constructed separately for HCV genotype 6 sequences and for non-6 sequences, the latter including both genotypes 1 and 2. Figure 1 shows the phylogenetic patterns of HCV sequences within each genotype.

Figure 1

Phylogenetic trees of HCV NS5B sequences. Phylogenetic trees were inferred using the maximum likelihood (ML) method based on the GTR + Γ + I (for HCV non-6 genotypes) and JC + Γ (for HCV genotype 6) models of nucleotide substitution, with HCV genotype 8 as the outgroup. (A) ML phylogenetic tree for HCV NS5B sequences from 1,691 Cambodian patients with HCV non-6 genotypes (indicated in sky blue) and 285 GenBank reference sequences indicated in different colours (subtype 1a: orange; 1b: red; 2: green; 2a: magenta; 2b: purple; 2i: navy; 2m: golden red; 3: teal; 4: pink; 5: maroon; 6: yellow; 7: dark orange; and 8: dark purple). (B) ML phylogenetic tree for HCV NS5B sequences from 1,442 Cambodian patients with HCV genotype 6 (indicated in sky blue) and 285 GenBank reference sequences indicated in different colours (genotype 1: red; 2: green; 3: teal; 4: pink; 5: yellow; 6a: salmon; 6e: magenta; 6h: medium purple; 6l: dark slate grey; 6n: turquoise; 6o: cyan; 6p: dark green; 6q: coral; 6r: dark golden red; 6s: rosy brown; 6t: olive; 6u: purple; 6xb: deep pink; 6xc: maroon; 6xf: brown; other subtypes within genotype 6, including 6b, 6c, 6d, 6f, 6g, 6i, 6k, 6m, 6v, 6w, 6xa, 6xd, and 6xe: blue; 7: dark orange; and 8: dark blue).

Among non-genotype 6 viruses, the viral sequences were well conserved. Within genotype 1, two viral subtypes were identified: 1b was the predominant subtype with a frequency of 94% (n = 1,357), followed by subtype 1a at a lower frequency (6%; n = 87) (Fig. 1A). Within genotype 2 strains, 125 (93.3%) sequences fell into the 2a subtype cluster, 6 (4.5%) were subtype 2m, 2 (1.5%) were subtype 2b, and only 1 sequence (0.7%) was subtype 2i.

High genetic diversity was observed within genotype 6 (Fig. 1B). The most common subtypes were 6e and 6r, accounting for 43.5% (n = 628) and 22.9% (n = 331), respectively. Seventy (4.9%) sequences did not cluster with any known subtype of HCV genotype 6. The remaining sequences belonged to subtypes 6q 7.8% (n = 113), 6p 5.5% (n = 80), 6a 4.9% (n = 70), 6xf 4.4% (n = 63), 6s 4.2% (n = 61), 6o 0.7% (n = 10), 6l 0.4% (n = 5), 6u 0.3% (n = 4), 6h 0.2% (n = 3), 6n 0.07% (n = 1), 6t 0.07% (n = 1), 6xb 0.07% (n = 1) and 6xc 0.07% (n = 1).
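As a quick arithmetic check on the counts reported in this section, the following Python sketch tallies the genotype counts and the genotype 6 subtype counts and recomputes their shares. The variable and function names are illustrative only and are not part of the study's analysis. The counts are copied from the text, and a recomputed percentage may differ from the reported one in the last digit depending on how it was rounded.

```python
# Counts copied from the Results text; all names here are illustrative.
GENOTYPE_COUNTS = {"1": 1444, "6": 1442, "2": 134, "unclassified": 113}

GT6_SUBTYPE_COUNTS = {
    "6e": 628, "6r": 331, "6q": 113, "6p": 80, "6a": 70, "6xf": 63,
    "6s": 61, "6o": 10, "6l": 5, "6u": 4, "6h": 3, "6n": 1, "6t": 1,
    "6xb": 1, "6xc": 1, "unassigned": 70,
}


def percentages(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Genotype shares among all sequenced patients.
    for name, pct in percentages(GENOTYPE_COUNTS).items():
        print(f"genotype {name}: {pct:.1f}%")
    # Subtype shares within genotype 6.
    for name, pct in percentages(GT6_SUBTYPE_COUNTS).items():
        print(f"{name}: {pct:.2f}%")
```

The two dictionaries sum to 3,133 and 1,442 respectively, which matches the totals given for the study population and for genotype 6.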

Geographical distribution of HCV genotype

Patients included in the study came from all provinces across Cambodia. However, the number varied from one province to another. Figure 2 shows the distribution of HCV genotypes according to geographical area. When focusing on provinces with a high number of participants, we observed that the three viral genotypes detected in the present study were distributed similarly in these provinces, except in Kampong Chhnang province, where the proportion of HCV genotype 1 was much higher than in other provinces (63% GT1 and 29% GT6 among n = 305 in Kampong Chhnang Province compared to 44% GT1 and 48% GT6 among n = 2,826 patients from all other provinces).

Figure 2

Geographical distribution of HCV genotypes by province in Cambodia. Circles of different sizes represent the study population size of each province.

HCV transmission risk factors and HCV genotypes

Patients were interviewed for risk factors of HCV infection including a history of invasive medical procedure, blood transfusion and drug use (Table 1). The majority of patients had a history of invasive medical procedures (64.7%). Having partners with known HCV infection and history of blood transfusion were also common (19.5% and 9.9%, respectively). The distribution of risk factors associated with HCV infection was comparable across genotypes.

Clinical features of HCV infection and HCV genotypes

We found that the frequency of patients with a high HCV RNA viral load (VL) of greater than or equal to 800,000 IU/mL was highest among patients infected with HCV genotype 6. Otherwise, patients shared similar clinical features across genotypes, except for the Fibroscan value. Patients infected with HCV genotype 1 had the highest level of liver stiffness, as shown by a higher Fibroscan value (median: 10.4 kPa) than in those infected with HCV genotype 2 (8.6 kPa), genotype 6 (9.7 kPa), or the unclassified genotype (9.2 kPa) (p < 0.05). Additionally, the distribution of fibrosis stage differed across genotypes: HCV genotype 1 had the greatest proportion of patients with fibrosis stage ≥ F3 (54%), followed by HCV genotype 6 (51%), and HCV genotype 2 had the smallest proportion (44%), although the differences were not statistically significant (Table 1).

HCV genotype and advanced fibrosis

To evaluate the association between HCV genotype and advanced fibrosis (defined as fibroscan value ≥ 20 kPa), we calculated the odds of having advanced fibrosis for 3,099 patients with fibroscan data. In univariate analysis, we found that HCV genotype 6 had lower odds of advanced fibrosis compared to HCV genotype 1 and this was consistently observed in the multivariable analysis after adjusting for age, sex, and body-mass-index (adjusted odds ratio (aOR) [95% confidence interval (CI)]: 0.7 [0.6–0.9], p < 0.01) (Table 2).

Table 2 Crude (cOR) and adjusted odds ratio (aOR) for advanced fibrosis of ≥20 kPa on Fibroscan (n = 3099).

Discussion

Here, we report a large dataset on HCV molecular epidemiology in Cambodia, based on demographic and clinical data and HCV sequences from 3,133 patients presenting from most parts of Cambodia between September 2016 and December 2017. This study revealed circulation of HCV genotypes 1, 2, and 6 and unclassified variants, with a predominance of genotypes 1 and 6, accounting for 46.1% and 46.0% of the study population, respectively. Our finding is in agreement with previous studies conducted in different populations and regions in Cambodia15,16,18,20, at least for the viral genotypes. However, quantitative comparison of the genotype proportions reported in these studies is difficult, as all previous studies were based on small sample sizes, restricted geographical areas, and/or specific populations. For instance, HCV genotype 1 was the most prevalent strain (68%) in a study of 28 HIV and HCV co-infected patients followed up at Calmette Hospital in Phnom Penh, while HCV genotype 6 (25%) and genotype 2 (7%) were less frequent16. On the other hand, HCV genotype 6 was predominant among 25 HCV-infected immigrant workers from Cambodia in Thailand18 and among 11 HCV viremic adults living in Siem Reap province20. In a recent study, De Weggheleire et al. reported a predominance of both genotypes 1 (52.8%) and 6 (41.4%) among 87 HIV and HCV co-infected patients from Phnom Penh15.

When analyzing each genotype in depth, we observed a low level of genetic diversity among the HCV genotypes 1 and 2 present in Cambodia. Overall, the predominant subtype was 1b, followed by small percentages of subtypes 2a, 1a, 2m, 2b, and 2i. The frequency of subtype 1b found in our study was comparable with the previous report by De Weggheleire et al.15.

Within genotype 6, there was high genetic diversity, with subtype 6e the most common, accounting for 43.5%. Subtype 6e is commonly detected in neighboring countries and it has been suggested that HCV subtype 6e originated from Vietnam23. HCV subtype 6r was the second most common subtype, found in 22.9% of HCV genotype 6 sequences. In contrast to subtype 6e, it has been stated that HCV subtype 6r is specific to Cambodia, as it has been detected only in this country or among its native population living abroad18,24. Nevertheless, further molecular evolutionary analysis of this subtype is needed to confirm this hypothesis. Other subtypes were detected at low frequencies: 6q, 6p, 6a, 6xf, 6s, 6o, 6l, 6u, 6h, 6n, 6t, 6xb, 6xc, and unclassified genotype 6 variants. The great genetic diversity of HCV genotype 6 has also been described in neighboring Laos13, Thailand25, and Vietnam23. This characteristic may indicate that HCV genotype 6 has circulated, adapted and evolved for a long period of time in Cambodia, as previously suggested for other countries in the region13,23.

HCV genotype 3 circulates commonly among intravenous drug users through needle and syringe sharing26. Unsurprisingly, this genotype is completely absent in our study. This may be explained by a low number of intravenous drug users in the current study. As mentioned in Table 1, 18 out of 3,120 patients were drug users.

Among the 3,133 HCV sequences included in the analysis, 113 (3.6%) remained unclassified and may represent new HCV variants. Further investigation of these viruses by analyzing other genes or full-length genome sequences would therefore be relevant.

All of the HCV genotypes detected were homogeneously distributed across the provinces from which substantial numbers of patients were included, except Kampong Chhnang province, where the frequency of HCV genotype 1 was higher than in other provinces. This observation suggests that there is no geographic specificity of the corresponding genotypes. However, we were not able to explain the high frequency of HCV genotype 1 in Kampong Chhnang province. It is important to note that our data were subject to bias and were not representative of the whole country, as patients included in the present study were self-referred to receive diagnosis and/or treatment for HCV at a facility located in the capital city.

Risk factors associated with HCV infection, such as exposure to invasive medical procedures, blood transfusion, a partner with HCV and a history of drug use, were comparable for all HCV genotypes circulating in Cambodia. Among these factors, past experience of invasive medical procedures was reported by more than 60% of patients included in the present analysis, and many cases are likely to be iatrogenic, since Cambodia is known as a country with a high rate of medical injections such as therapeutic injections and intravenous infusions22,27,28. Recently, we reconstructed a likely transmission history of a massive iatrogenic HIV outbreak that occurred in Roka (a rural commune in Cambodia) between 2014 and 2015. We showed that unsafe injections most likely led to this large outbreak, which was also associated with long-standing HCV transmission with multiple and independent sources of introduction22.

Very few studies have assessed the natural history and clinical features of HCV genotype 6 compared to other genotypes. A cross-sectional study of 308 Southeast Asians in California, USA found no significant differences in virological and clinical characteristics between HCV genotype 6 and other genotypes29. In a study conducted in Hong Kong, Seto et al. compared the natural history of 138 HCV genotype 1 patients (median age: 50 years old) with 78 HCV genotype 6 patients (median age: 46.5 years old) after a median follow-up period of over 5 years. In that study, both genotypes had comparable liver biochemistry and HCV RNA viral load, and similar rates of development of cirrhotic complications and mortality30. The findings of these studies suggested that viral genotype is not the main discriminating factor of disease outcome. In our study, HCV genotype 6 appeared to be associated with a higher viral load than other genotypes. However, we were not able to further explore the association between disease progression and viral genotype, since our analysis was cross-sectional and many significant factors that affect the natural history of HCV, such as age at the time of initial infection, duration of infection and host factors, were not available.

In conclusion, we showed that the molecular epidemiology of HCV in Cambodia is predominantly associated with two genotypes present in similar proportions: genotypes 1 and 6. The most prevalent viral subtype was 1b, followed by 6e and 6r. The route of transmission of HCV in Cambodia could be predominantly linked to invasive medical procedures, including unsafe injection practices. This epidemiological characteristic is specific to Cambodia. Further investigation is needed to better understand the evolution of HCV strains in Cambodia.

Methods

Study setting

In collaboration with the MoH of Cambodia, Médecins Sans Frontières (MSF; Doctors Without Borders) launched an HCV program inside the Hepato-Gastro Department of Preah Kossamak Hospital, a 254-bed national hospital in Phnom Penh, in September 2016, aimed at developing a simplified care model adapted to the Cambodian context. MSF provided free testing and treatment services to patients seeking care either through the out-patient services or through MSF’s screening programs for at-risk patient groups, which include patients enrolled in HIV care, HIV patients who are female entertainment workers (FEW), transgender (TG) or men who have sex with men (MSM), and injection drug users receiving needle-exchange and other support services from various collaborating non-governmental organizations. MSF operated two sites within the hospital: one dedicated to screening and diagnosis (Screening Site) and another to HCV treatment (Treatment Site).

Study design

Between September 2016 and December 2017, 3,352 adult chronic HCV patients, 18 years of age or older, visiting the MSF HCV clinic were invited to participate in the study. Among them, 3,133 patients gave their written consent to be included. MSF’s HCV service was open to any patient seeking diagnosis and/or treatment for HCV and served patients from Phnom Penh and any of the other 23 provinces in Cambodia.

The current study was a prospective and anonymous analysis of data collected between September 19, 2016 and December 6, 2017 from HCV-infected adult patients who were enrolled in a cohort at the MSF clinics. The treatment regimen used a combination of sofosbuvir (NS5B inhibitor) and daclatasvir or ledipasvir (NS5A inhibitors) for 12 weeks, as recommended by the current AASLD/IDSA HCV guideline31. The REDCap electronic database (Research Electronic Data Capture; Vanderbilt University, USA)32, which is hosted at Epicentre, Médecins Sans Frontières (Paris, France), was used to store demographic (gender, age, place of residence, etc.) and clinical data for all patients at enrollment. Blood samples were collected for assessment of HCV infection.

All HCV-infected adults (≥18 years) whose demographic data, clinical data and HCV sequences were available were eligible for inclusion in the present analysis.

Ethics statement

All patients included in the present study provided written informed consent for the use of their demographic, clinical, and biological data. The study protocol was approved by the Cambodian National Ethics committee for Health Research. All methods were performed in accordance with relevant guidelines and regulations.

Assessment of HCV infection and genotyping

All specimens that tested positive for HCV antibodies by SD Bioline HCV (Standard Diagnostics, Inc., Rest-of-World regulatory version, Kyonggi-do, Korea) underwent HCV RNA viral load testing using the COBAS AmpliPrep/COBAS TaqMan HCV Quantitative Test, v2.0 platform (Roche) according to the manufacturer’s instructions.

HCV genotype and subtype were determined for all samples with detectable HCV RNA viral load (>1.2 log10 IU/mL) based on phylogenetic analysis of the HCV non-structural 5B (NS5B) genome region (371 bp), which was amplified using a semi-nested RT-PCR, as described previously33. The amplifications of the NS5B gene were performed at the Institut Pasteur du Cambodge (Phnom Penh, Cambodia). All PCR-amplified fragments were sent to a commercial sequencing facility (Macrogen, Inc., Seoul, South Korea) for sequencing using the Big Dye Terminator v3.1 Cycle Sequencing kit (Applied Biosystems). Chromatograms were sent back electronically to the Institut Pasteur du Cambodge for verification by visual inspection using CEQ 2000 (Beckman Coulter) software. Viral sequences were aligned with reference sequences for HCV subtypes available in the GenBank database (Supplementary Table 1). Phylogenetic trees were constructed using the maximum likelihood (ML) method based on the GTR + Γ + I34 (for HCV non-6 genotypes) and JC + Γ35 (for HCV genotype 6) models of nucleotide substitution, as recommended by the Find Best DNA/Protein Models program implemented in the MEGA7 software36.
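To make the simpler of the two substitution models more concrete, here is a minimal Python sketch of the Jukes-Cantor (JC) distance between two aligned sequences, with an optional gamma shape parameter for among-site rate variation (JC + Γ). It illustrates the distance formula associated with the model only and is not the MEGA7 maximum likelihood procedure used in the study. The function names, the toy sequences in the usage note, and any shape value passed in are assumptions made for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jc_distance(p, gamma_shape=None):
    """Jukes-Cantor distance (expected substitutions per site) from p.

    Without gamma_shape:  d = -(3/4) * ln(1 - 4p/3)
    With gamma_shape a:   d = (3/4) * a * ((1 - 4p/3) ** (-1/a) - 1)
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC correction")
    base = 1.0 - 4.0 * p / 3.0
    if gamma_shape is None:
        return -0.75 * math.log(base)
    if gamma_shape <= 0:
        raise ValueError("gamma_shape must be positive")
    return 0.75 * gamma_shape * (base ** (-1.0 / gamma_shape) - 1.0)
```

For example, `jc_distance(p_distance("ACGTACGT", "ACGTACGA"))` corrects an observed difference of one site in eight for unobserved multiple substitutions. A smaller gamma shape implies stronger rate variation among sites and yields a larger corrected distance for the same observed difference.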

Statistical analysis

For the descriptive analysis of patients in the cohort, mean, standard deviation (SD) and ANOVA were used for normally distributed continuous variables, and median, interquartile range (IQR) and the Kruskal-Wallis test were used for non-normally distributed continuous variables. Proportions and chi-squared tests were used for categorical variables. To compare virological and clinical features of different HCV genotypes, HCV RNA viral load at baseline, liver stiffness by transient elastography using Fibroscan, transaminase values, and comorbidities of all patients were analyzed. To examine whether HCV genotypes were associated with advanced fibrosis (≥20 kPa), univariate and multivariable logistic regression analyses were performed using the following variables: gender, age (<40; ≥40 and <50; ≥50 and <60; ≥60 and <70; ≥70 years old) and body-mass-index (<20; ≥20 and <30; ≥30 kg/m2), with HCV genotype 1 as the reference group.

All statistical tests were two-sided at an alpha level of 0.05 and were performed using Stata version 13.1 software (StataCorp LP, College Station, Texas, USA, 2016).
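The covariate coding and the crude odds ratio described in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Stata analysis itself: the function names and band labels are hypothetical, the confidence interval uses the Woolf (log-based) approximation, and the 2 × 2 counts in the usage note are invented for demonstration and are not study data. Adjusted odds ratios require fitting a multivariable logistic regression, which this sketch does not attempt.

```python
import math


def age_band(age_years):
    """Age category in years, following the bands listed in the text."""
    if age_years < 40:
        return "<40"
    if age_years < 50:
        return "40-<50"
    if age_years < 60:
        return "50-<60"
    if age_years < 70:
        return "60-<70"
    return ">=70"


def bmi_band(bmi):
    """Body-mass-index category in kg/m2, following the text."""
    if bmi < 20:
        return "<20"
    if bmi < 30:
        return "20-<30"
    return ">=30"


def has_advanced_fibrosis(stiffness_kpa):
    """Advanced fibrosis as defined in the text: Fibroscan >= 20 kPa."""
    return stiffness_kpa >= 20.0


def crude_odds_ratio(grp_cases, grp_noncases, ref_cases, ref_noncases, z=1.96):
    """Crude odds ratio of a group versus the reference group.

    Returns (odds_ratio, ci_low, ci_high) using the Woolf approximation:
    SE(ln OR) = sqrt(1/a + 1/b + 1/c + 1/d). All four counts must be > 0.
    """
    cells = (grp_cases, grp_noncases, ref_cases, ref_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (grp_cases * ref_noncases) / (grp_noncases * ref_cases)
    se_log = math.sqrt(sum(1.0 / n for n in cells))
    ci_low = math.exp(math.log(odds_ratio) - z * se_log)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, ci_low, ci_high
```

With hypothetical counts, `crude_odds_ratio(20, 80, 30, 70)` compares a group with 20 cases and 80 non-cases against a reference group with 30 cases and 70 non-cases. An odds ratio below 1 with an interval that excludes 1 would indicate lower odds of advanced fibrosis than in the reference group, here HCV genotype 1.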

Geographic HCV mapping

During the initial assessment of HCV infection, the province of residence was collected for each patient. Mapping was performed by importing Cambodian shapefiles from Open Development Cambodia for the Economic Census of Cambodia 2011 (Ministry of Planning, National Institute of Statistics) and genotype data from this study into QGIS version 2.16.0 (Development Team - Open Source Geospatial Foundation Project, 2016)37.

Nucleotide sequence accession number

All HCV NS5B sequences included in the present study were submitted to GenBank and registered under accession numbers: MK436248 - MK436360 (unassigned genotype), MK436361 - MK437938 (genotype non-6), and MK437939 - MK439380 (HCV genotype 6).