Pretreatment drug resistance in a large countrywide Ethiopian HIV-1C cohort: a comparison of Sanger and high-throughput sequencing

Baseline plasma samples of 490 randomly selected antiretroviral therapy (ART) naïve patients from seven hospitals participating in the first nationwide Ethiopian HIV-1 cohort were analysed for surveillance drug resistance mutations (sDRM) by population-based Sanger sequencing (PBSS). Next generation sequencing (NGS) was also used in a subset of 109 baseline samples. Treatment outcome after 6 and 12 months was assessed by on-treatment (OT) and intention-to-treat (ITT) analyses. Transmitted drug resistance (TDR) was detected in 3.9% (18/461) of successfully sequenced samples by PBSS. However, NGS detected sDRM more often (24%; 26/109) than PBSS (6%; 7/109) (p = 0.0001), and major integrase strand transfer inhibitor (INSTI) DRMs were also found in minor viral variants from five patients. Patients with sDRM had more frequent treatment failure in both OT and ITT analyses. The high rate of TDR by NGS and the identification of preexisting INSTI DRMs in minor viral variants of wild-type HIV-1 subtype C from Ethiopian patients underscore the importance of TDR surveillance in low- and middle-income countries (LMIC) and show the added value of high-throughput NGS in such studies.

Outcomes of ART and sequencing. Of the 490 patients, 408 (83.3%) were still on treatment at month six; the remainder were either lost to follow-up (LTFU; n = 33) or dead (n = 49) (Fig. 1). Plasma HIV-1 RNA viral load (VL) was not tested in 20 subjects; among those with VL data, 316 (81.4%) had undetectable viremia and 72 (18.6%) had detectable viremia. At month 12, 383 (78.2%) of the 490 subjects were still on treatment. VL was not tested in 114 subjects; among those with an available VL, 228 (84.8%) had undetectable viremia and 41 (15.2%) had detectable viremia. Between month six and month 12, 11 patients died and 14 were LTFU. For PBSS, a pol sequence was obtained in 461 of 490 (94%) samples at baseline, 47 of 51 (92%) at month six and 30 of 33 (91%) at month 12 (Fig. 1). For NGS, a result was obtained for all 109 samples, with the number of contigs per sample ranging from one to ten. Of these, the best contig was selected. Three samples gave a fragmented contig, which was rectified manually. Another sample showed a large deletion in the RNaseH and integrase region. All sequences were classified as HIV-1C by the three subtyping tools listed in the methodology section, except one that was classified as CRF02_AG.
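The on-treatment percentages above can be re-derived from the reported counts. The following is a minimal sketch, assuming that patients without a VL result are excluded from the denominator; the function name and the dictionary keys are illustrative and do not come from the study.

```python
def on_treatment_summary(on_art, vl_not_tested, undetectable, detectable):
    """Summarise virological outcome among patients on ART with a VL result."""
    tested = on_art - vl_not_tested
    # Guard against transcription errors: the two outcome groups must add up.
    if undetectable + detectable != tested:
        raise ValueError("outcome counts do not reconcile with tested patients")
    return {
        "tested": tested,
        "undetectable_pct": round(100 * undetectable / tested, 1),
        "detectable_pct": round(100 * detectable / tested, 1),
    }

# Counts as reported in the text for month six and month 12.
month6 = on_treatment_summary(on_art=408, vl_not_tested=20,
                              undetectable=316, detectable=72)
month12 = on_treatment_summary(on_art=383, vl_not_tested=114,
                               undetectable=228, detectable=41)

# Attrition between the two visits equals deaths plus losses to follow-up.
attrition = 408 - 383
print(month6, month12, attrition)
```

Under this assumption the counts reconcile with the reported percentages, and the drop from 408 to 383 patients matches the 11 deaths and 14 losses to follow-up.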
Baseline sDRM detected by PBSS and impact on treatment outcome. At baseline, 18 (3.9%) of the 461 patients with sequence data had sDRM (NRTI: n = 9; NNRTI: n = 7; PI: n = 2) (Fig. 1; Table 2). None of the patients had sDRM to more than one drug class. Three patients had two mutations of the same drug class. There was no correlation between the presence of sDRM and study site, sex, or CD4+ T-cell count (data not shown), but the odds of having sDRM decreased significantly as the participant's age increased (OR: 0.93; 95% CI: 0.87-0.99) and increased with higher baseline viral load (OR: 2.67; 95% CI: 1.25-5.71).
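To give a sense of the sampling uncertainty around the 3.9% estimate, a confidence interval can be computed from the counts 18/461. The sketch below uses the Wilson score interval, which is an illustrative choice and not a method or a result reported in this study.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Counts from the text: 18 patients with sDRM among 461 sequenced.
prevalence = 18 / 461
low, high = wilson_interval(18, 461)
print(f"{100 * prevalence:.1f}% ({100 * low:.1f}%-{100 * high:.1f}%)")
```

The interval is asymmetric around the point estimate, as expected for a proportion close to zero.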
The INSTI DRMs detected by NGS were E138K (n = 2; 1.3% and 2.1%, respectively), Q148R (n = 1; 1.6%), Q148H (n = 1; 1.5%), and T66I (n = 1; 22.3%) (Table 3). These patients came from all study sites except Jimma and the Army unit (the study sites are described in the methodology section). No clustering was found among the viral strains with INSTI DRM (Fig. 2). PBSS failed to detect six of 14 (42.9%) sDRM, derived from four different patients, even though the NGS assay showed that these sDRM made up more than 20% of the viral population (Table 3). Patients who failed ART with >150 copies/ml at month six and/or 12 more frequently had one or more NRTI, NNRTI and/or PI sDRM by NGS at baseline than the virologic suppressors (OR: 6.4; 95% CI: 1.6-26.4, adjusted for NRTI regimens and CD4 cell counts) (Table 3). This also held true when only NRTI and/or NNRTI sDRM were considered (20/71 versus 3/38; p < 0.05). Next, we checked whether sDRM detected by NGS at baseline were found by PBSS at virological treatment failure (>1000 copies/ml) at month six and/or 12 (Table 4). Among 16 patients who failed at month six, only six of 25 NRTI or NNRTI sDRM detected by NGS at baseline appeared at month six. All these sDRM were detected at a high proportion at baseline (T215S: 99

Discussion
The present study is the first countrywide representative survey of transmitted drug resistance (TDR), based on the first large national ART cohort study in Ethiopia 13,14. Analysing 461 pol sequences by PBSS, we found a low frequency, 3.9%, of treatment-naïve patients with sDRM. In a selected subset of 109 patients, additional DRMs were found by NGS, including major INSTI DRMs in five patients. Patients with TDR failed therapy more frequently in both OT and ITT analyses, suggesting a clinical impact of these mutations.
By PBSS, NRTI and NNRTI sDRM were found as expected, but non-polymorphic accessory PI sDRM were also found in two patients, despite the infrequent use of PIs in Ethiopia in 2009-2011. An inclusion criterion in the Advanced Clinical Monitoring (ACM) cohort was self-reported absence of earlier ART use. If this was correctly self-reported, the prevalence of TDR was 3.9%, and no regional difference within Ethiopia was observed. However, it should be emphasized that the patients were recruited in 2009-2011 and that the present situation regarding pretreatment drug resistance (PDR) may have changed. TDR, primarily NNRTI TDR, has increased over time in LMIC in sub-Saharan Africa (SSA) 9. In addition, it should be noted that our patients had low CD4 cell counts at the start of ART and were most likely not newly infected. The TDR rate might therefore be underestimated in our study, since some drug-resistant variants frequently disappear from the major viral population after a period without ART. The increase in SSA has been steepest in East Africa, where the prevalence rose to 7.4% eight to nine years after the rollout of ART. An update until 2016, now including all PDR, confirms this trend: the predicted prevalence of NNRTI PDR for 2016 was 11% (95% CI 7.5-15.9) in Southern Africa and 15.5% (95% CI 7.7-28.8) in Eastern Africa 9. Data from Ethiopia were, however, not included in these reports. Smaller regional studies using PBSS have reported low frequencies, 3.3% in 2003 11 and 0% in 2005 15, which increased in later studies to 5.6% in 2008 12 and 7.2% in 2010 16. However, our nationwide data from 2009-2011, from a larger number of patients, did not suggest an increasing trend of TDR in Ethiopia up to then.
A higher number of sDRM was identified by NGS, which is in line with our earlier report of a high detection rate (6.5%) of NNRTI TDR in Addis Ababa, 2009-2010, using a sensitive allele-specific PCR 10. Thus, additional DRMs were detected in 17 patients selected for the NGS assay. Of these, mutated viral populations representing more than 20% were found in four patients, which our PBSS assay should have been able to detect. Although the selection of these patients was biased, the discrepancy between PBSS and NGS suggests that NGS facilitates detection of HIV-1 sDRMs in LMICs and reveals a higher prevalence of PDR at the same or lower cost if high-throughput approaches are used 17.
In a study of a small number of patients (n = 45) from Gondar, Ethiopia, recruited in 2008 and analysed by PBSS, no major INSTI DRM was found 18. Interestingly, in our study major INSTI mutations (T66I, E138K, Q148R, and Q148H) were found in five patients, albeit at a low abundance. At the time when the study started (2009-2011), to our knowledge, no patient in Ethiopia had been treated with an INSTI, and these drugs are still not an integrated part of the Ethiopian ART regimens. It cannot be excluded that INSTI DRMs have been introduced into Ethiopia through patients who have been treated outside the country. However, our phylogenetic analysis showed no clustering of the strains with INSTI DRMs, and the patients came from five different study sites all over Ethiopia. It therefore seems unlikely that these strains were transmitted from INSTI-treated subjects. Also, we found no evidence of cross-contamination with INSTI-resistant strains in our laboratory, which is strictly separated from the clinical diagnostic laboratory. One possibility is that wild-type HIV-1C strains in Ethiopia may harbor INSTI DRMs at low abundance. All of the identified DRMs, alone or in combination, are associated with resistance to raltegravir and/or elvitegravir. Recently, use of dolutegravir (DTG) has been planned in some African countries, as a fixed-dose combination given once daily. Importantly, the INSTI DRM E138K contributes to reduced susceptibility to DTG in combination with other INSTI DRMs. Also, Q148R and Q148H are associated with low-level or intermediate resistance to DTG, which should be administered twice daily if these DRMs are present. A similar pattern is found for cabotegravir and bictegravir 19,20. Our finding therefore warrants expanded analysis of minor quasispecies with regard to INSTI DRMs in different African patient populations, in order to identify how often such low-abundance DRMs can be found.
The impact of preexisting INSTI DRMs on clinical treatment response has been discussed earlier 21,22. For example, the E157Q mutation has been reported in 1.7% and 5.6% of viral sequences from ART-naïve patients, depending on subtype 23, and has been suggested to affect treatment response 24. On the other hand, low-abundance INSTI DRMs were not shown to have an impact on treatment outcome 21,22. However, these latter studies used allele-specific PCR, which detects a significantly lower proportion of mutated virus than our NGS method, which has a 1% cut-off. Therefore, a potential clinical impact of our findings still remains to be evaluated.
Patients with baseline pol sDRM failed ART more frequently at month six. Additional sDRM were also identified pre-ART with the NGS assay. A dose-effect association has been described for low-abundance NNRTI-resistant mutants, and a 5% threshold of mutant frequency has been suggested to be clinically relevant 25. In our study, the number of patients with NGS sDRM was too small to allow identification of a threshold. However, among 16 patients who failed at month six, only six of 25 NRTI or NNRTI sDRM detected by NGS at baseline were detected at month six by PBSS. All six of these sDRM had been detected at a high proportion at baseline. The lack of detection of the minor sDRM in the follow-up samples could indicate that they had limited or no impact on the emergence of drug resistance at the follow-up time points. However, further study is recommended to assess their impact as secondary mutations in the emergence of DRM.
In conclusion, we have analysed TDR in the largest nationwide Ethiopian cohort so far and found that in 2009-2011 the rate was still low, 3.9%, using PBSS, but that TDR before treatment was associated with a poorer treatment outcome. Our NGS results also showed that the rate of 3.9% is an underestimate, although we could not confirm that the low-abundance DRMs had a clinical impact. Interestingly, we identified preexisting INSTI DRMs in wild-type HIV-1C from treatment-naïve patients. Our data show the importance of surveillance for TDR and PDR in LMIC and suggest an added value of using high-throughput NGS in such studies.

Material and Methods
Study Population. From October 2009 to December 2011, a total of 874 ART-naïve patients were recruited to the Advanced Clinical Monitoring (ACM) of ART in Ethiopia cohort and started ART as per the national guideline 4. The subjects were from seven hospitals 13,14 distributed geographically all over the country: Tikur Anbessa Specialized Hospital in Addis Ababa (Central region); Gondar (Northwest); Jimma (West); Mekelle (North); Harrar (East); Hawassa (South); and the Army unit, located in Addis Ababa, which provides services to mobile military staff (Fig. 3). Our study was conducted on 490 subjects (age ≥ 14 years), randomly selected after stratifying by study site (70 from each site), who were followed until the end of 2013 (Table 1). The following fixed-dose combinations (FDCs) were given: TDF + 3TC + EFV (n = 222), TDF + 3TC + NVP (n = 39), ZDV + 3TC + EFV (n = 60), ZDV + 3TC + NVP (n = 144), stavudine (d4T) + 3TC + EFV (n = 15), d4T + 3TC + EFV (n = 9), and abacavir (ABC) + 3TC + EFV (n = 1). Clinical and routine laboratory tests were performed at the study sites. Ten ml of whole blood was collected and processed for each patient at baseline, month six and month 12. Plasma samples were transported on dry ice and stored centrally at −80 °C at the Ethiopian Health and Nutrition Research Institute (EHNRI). Quantification of VL was performed by NucliSENS easyQ® HIV-1 Nucleic Acid Sequence-Based Amplification (NASBA) (BioMérieux Diagnostics), with a detection limit of 150 HIV-1 RNA copies/ml. CD4 T-cell counts were determined at the hospital laboratories on BD FACSCalibur machines (Becton Dickinson, San Jose, USA). Data were entered into a site database and later uploaded to the central database at EHNRI, from which the following data were extracted: sex, age, WHO clinical stage, ART regimen, CD4 cell count and VL.
Population-based Sanger sequencing (PBSS). PBSS was attempted on 490 baseline samples as well as on 51 and 33 samples with VL ≥1000 copies/ml at month six and 12, respectively. HIV RNA was extracted from 140 µl plasma using the QIAamp® RNA extraction mini-kit (Qiagen, Hilden, Germany). cDNA synthesis was done using RevertAid H-minus reagents (Life Technologies, Paisley, UK). The first-round PCR was done using the JA203F-C (forward) and JA206R-C (reverse) primer pair, followed by the second-round PCR using the JA204F-C (forward) and JA205R-C (reverse) primer pair 26. The amplified fragments were purified (QIAquick PCR Purification Kit, Qiagen, Hilden, Germany) and sequenced with the JA204F-C and JA205R-C PCR primers plus PR2R (5′-GGATTTTCAGGCCCAATTTTTG-3′) and RT07 (5′-AAGCCAGGAATGGATGGCCCA-3′). This method has been used extensively at our laboratory. Positive PCR reactions are obtained in 100% of plasma samples containing the equivalent of 500 HIV-1 RNA copies per PCR reaction. In practice, the assay gives positive results in the vast majority of plasma samples containing >500 copies/ml, with a sensitivity to detect variants making up 20% of the viral population. A comparison between the original assay 26 and our slightly modified assay, with primers specifically designed for HIV-1C, has shown equivalent results.
Next generation sequencing (NGS). NGS was performed on 109 baseline samples: from all patients who had viremia at month six and/or 12 (n = 71) and from randomly selected patients with undetectable viremia (n = 38), as described 27. In brief, fragment I (HXB2: 790-5096), covering Gag-pol, was amplified, gel purified, and fragmented on the Covaris S200, followed by library preparation using the NEBNext Ultra™ DNA Library Prep Kit. Forty-eight libraries were then pooled at equimolar concentration (10 nM each) and run on an Illumina HiSeq 2500. The FASTQ files were demultiplexed, and a consensus sequence was created for each sample, followed by realignment with that consensus sequence as input. Variant calling was performed at the amino acid (AA) level. Only AA positions with a coverage of 5000× were considered to have passed quality control. Based on the error calculated for PCR and NGS, any mutation present at >1% was considered. The WHO list of DRMs for surveillance of TDR was used to interpret sDRM for NRTIs, NNRTIs and PIs, and the Stanford drug resistance summaries were used for INSTIs (hivdb.stanford.edu).
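The two cut-offs described above (5000× coverage per amino acid position and a mutation frequency above 1%) can be expressed as a simple filter. This is a minimal sketch and not the pipeline used in the study: the class, the function names and the example calls are hypothetical, coverage is assumed to mean at least 5000 reads, and the 20% figure is the PBSS sensitivity quoted in the PBSS section.

```python
from dataclasses import dataclass

MIN_COVERAGE = 5000      # reads per amino acid position (assumed to be a minimum)
MIN_FREQUENCY = 0.01     # mutations present at more than 1% are considered
PBSS_SENSITIVITY = 0.20  # PBSS is stated to detect variants at about 20%

@dataclass
class AminoAcidCall:
    mutation: str     # mutation label, e.g. "T66I"
    coverage: int     # read depth at the position
    frequency: float  # fraction of reads carrying the mutation

def passes_ngs_filter(call):
    """Apply the coverage and frequency cut-offs to one amino acid call."""
    return call.coverage >= MIN_COVERAGE and call.frequency > MIN_FREQUENCY

def detectable_by_pbss(call):
    """Rough expectation of whether Sanger sequencing would see the variant."""
    return call.frequency >= PBSS_SENSITIVITY

# Illustrative calls only; coverage values are invented for the example.
calls = [
    AminoAcidCall("T66I", coverage=12000, frequency=0.223),
    AminoAcidCall("Q148R", coverage=9000, frequency=0.016),
    AminoAcidCall("M184V", coverage=3000, frequency=0.050),   # fails coverage
    AminoAcidCall("K103N", coverage=15000, frequency=0.004),  # below 1%
]

reported = [c.mutation for c in calls if passes_ngs_filter(c)]
minority_only = [c.mutation for c in calls
                 if passes_ngs_filter(c) and not detectable_by_pbss(c)]
print(reported, minority_only)
```

Separating the NGS filter from the PBSS expectation makes it easy to list variants that only a deep-sequencing assay would be expected to report.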

HIV-1 subtyping and phylogenetic analysis. Subtyping was done by the Recombinant Identification Program (http://www.hiv.lanl.gov/content/sequence/RIP/RIP.html), REGA HIV Subtyping Tool v3 (http://dbpartners.stanford.edu:8080/RegaSubtyping/stanford-hiv/typingtool) and COMET HIV-1 (http://comet.retrovirology.lu). Maximum likelihood phylogenetic analysis was performed using Molecular Evolutionary Genetics Analysis version 7.0 (MEGA 7) software.
Treatment outcome measures. The outcomes at month six and 12 were analysed by both on-treatment (OT) and intention-to-treat (ITT) approaches. In the OT analysis, two VL cut-offs were used to define virological treatment failure: >150 copies/ml and >1000 copies/ml. For ITT, treatment failure was defined as failure to attain undetectable viremia (either <150 copies/ml or <1000 copies/ml), LTFU or death.
Statistical analysis. Descriptive statistics (mean, median, standard deviation and percentiles for numerical variables; frequencies and percentages for categorical variables) were used to summarize sociodemographic, clinical, immunological, and virological parameters. Prevalence and types of DRM at baseline were investigated for their possible relationship with sociodemographic and clinical characteristics using the t-test (for continuous variables) and the Chi-square or Fisher's exact test (for categorical variables). The impact of pretreatment sDRM (RTI, PI) detected by the PBSS and NGS assays on virologic treatment outcome at month six and 12 was assessed using a multivariable model testing for different confounding factors, including gender, age, WHO clinical stage, functional status, TB, CD4 cell count, baseline VL, and NRTI regimens. A P-value < 0.05 was considered statistically significant. Data analysis was performed using Stata software version 14 (Stata Corp., College Station, Texas, USA).
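The OT and ITT failure definitions given under treatment outcome measures can be sketched as a per-patient classification rule. This is a minimal illustration under stated assumptions: the status labels and function name are hypothetical, a patient on treatment without a VL result is treated as not evaluable, and a VL exactly equal to the cut-off is counted as suppressed.

```python
def classify_outcome(status, viral_load, cutoff):
    """Classify one patient at one visit.

    status: "on_treatment", "ltfu" or "dead" (hypothetical labels).
    viral_load: copies/ml, or None if no VL result is available.
    cutoff: 150 or 1000 copies/ml, the two thresholds used in the text.
    Returns (ot, itt), each "failure", "success" or None (not evaluable).
    """
    if status in ("ltfu", "dead"):
        # Not part of the on-treatment analysis, but a failure by ITT.
        return None, "failure"
    if viral_load is None:
        # Assumption: untested patients are excluded from both analyses.
        return None, None
    result = "failure" if viral_load > cutoff else "success"
    return result, result

# The same patient can fail at one cut-off and be suppressed at the other.
low_level = (classify_outcome("on_treatment", 500, 150),
             classify_outcome("on_treatment", 500, 1000))
died = classify_outcome("dead", None, 150)
untested = classify_outcome("on_treatment", None, 150)
print(low_level, died, untested)
```

Returning the two classifications together keeps the difference between the analyses explicit: deaths and losses to follow-up count only towards ITT failure.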
Ethical approval and informed consent. Scientific and ethical approvals were obtained from the National Research Ethics Review Committee in Ethiopia (3.10|528|06) and the Institutional Review Board (IRB) of EHNRI (Reference No. E.H.N.R.I 6.13/163). Written informed consent was obtained from all patients. All the methods were performed in accordance with approved institutional guidelines.
The sequences generated by Sanger sequencing in this study are deposited in GenBank [accession numbers: MG009597-MG010057].