Detection of Gag C-terminal mutations among HIV-1 non-B subtypes in a subset of Cameroonian patients

Response to ritonavir-boosted protease inhibitor (PI/r)-based regimens is associated with some Gag mutations in HIV-1 subtype B. Data are limited on Gag mutations and their covariation with protease mutations among HIV-1 non-B clades at PI/r-based treatment failure. We therefore characterized Gag mutations present in isolates from HIV-1 infected individuals treated with a PI/r-based regimen (n = 143) and compared them with those obtained from individuals not treated with PI/r (ART-naïve [n = 101] or reverse transcriptase inhibitor (RTI)-treated [n = 118]). The most frequent HIV-1 subtypes were CRF02_AG (54.69%), A (13.53%), D (6.35%) and G (4.69%). Eighteen Gag mutations showed a significantly higher prevalence in PI/r-treated isolates than in ART-naïve isolates (p < 0.05): group 1 (prevalence < 1% in drug-naïve): L449F, D480N, L483Q, Y484P, T487V; group 2 (prevalence 1–5% in drug-naïve): S462L, I479G, I479K, D480E; group 3 (prevalence ≥ 5% in drug-naïve): P453L, E460A, R464G, S465F, V467E, Q474P, I479R, E482G, T487A. Five Gag mutations (L449F, P453L, D480E, S465F, Y484P) correlated positively (Phi ≥ 0.2, p < 0.05) with protease-resistance mutations. At PI/r failure, no significant difference in viremia or CD4 count was observed between patients with and without these associated Gag mutations. This analysis suggests that some Gag mutations show an increased frequency in patients failing PIs among HIV-1 non-B clades.

www.nature.com/scientificreports/ region. Mutated residues in HIV protease are classified as either major or minor resistance mutations, according to their effect on ART clinical outcomes 8 . Following the Stanford algorithm (mutation list), minor resistance mutations (L10F, V11I, K20TV, L23I, L33F, K43T, F53L, Q58E, A71IL, G73STCA, T74P, N83D, and L89V) 9 are assumed to have ancillary roles such as compensating for the lower efficiency of proteolysis caused by major mutations; major resistance mutations (V32I, M46IL, I47VA, G48VM, I50VL, I54VTALM, L76V, V82ATFS, I84V, N88S, L90M) 9 tend to confer high levels of resistance to one or multiple PI/r and develop early in patient treatment 10 . Specifically, the emergence of HIV resistance to PI/r requires a stepwise accumulation of primary and compensatory mutations in the viral protease. Additionally, selected Gag (group-specific antigen) mutations have recently been shown to compensate for PR resistance mutations; the emergence of such Gag-specific drug-selected mutations may contribute to poor treatment outcomes on PI/r-containing regimens in subtype B 11,12 . Mechanisms of resistance may differ between B and non-B subtypes, yet studies on non-B subtypes are limited.
The HIV protease is essential for cleaving the Gag and Gag-pol polyproteins into their functional protein products, leading to the assembly of a mature infectious virus particle [13][14][15] . The protease has five cleavage sites on the Gag gene: the first cleavage separates the nucleocapsid protein (NC) from the capsid protein (CA) downstream of the 14 amino acid binding peptide called spacer peptide 1 (SP1); the capsid is subsequently separated from the matrix protein (MA), which remains associated with the virion membrane [13][14][15] ; this event is almost simultaneous with the release of the C-terminal p6 Gag protein, downstream of another linker peptide located between NC and p6, termed SP2 (spacer peptide 2, formerly termed p1); finally, the two linker peptides SP1 and SP2 are trimmed from the CA and NC proteins, respectively [13][14][15] . Of note, mutations in Gag were reported to contribute substantially to PI/r resistance besides compensating for fitness loss 11,12 . These potential effects are essentially driven by the C-terminal region, with little contribution from MA, CA and SP1 11 .
Some adherent patients failing treatment on PI/r-based ART do not harbor any major protease mutations, suggesting that neighboring genes such as Gag may contribute to the poor treatment response 16 . This calls for further investigation of the Gag gene to support a successful scale-up of PI/r-based ART in resource-limited settings (RLS) like Cameroon. Of note, the role of drug resistance mutations in HIV protease has been studied extensively, whereas mutations in its substrate Gag have not been thoroughly characterized.
A better understanding of HIV-1 Gag gene mutations and their co-variation with protease mutations among patients failing on PI/r-based regimen might be of great clinical relevance, especially as failure under PI without resistance is common.
We therefore sought to determine P7 (NC)-P6 HIV-1 Gag gene mutations selected under PI/r pressure and their covariations with protease mutations among HIV-1 non-B clades.
The overall prevalence of at least one protease drug resistance mutation among PI/r-experienced patients was 19.5% (n = 28). The most frequent mutations were M46I (21; 14.69%), I84V (11; 7.69%) and I54V (11; 7.69%) (Fig. 2).

P7 (NC)-P6 Gag mutations associated with PI/r exposure.
By evaluating the last 56 amino acids of Gag sequences derived from 101 drug-naïve patients and 143 patients on a PI/r-containing regimen, we identified 18 mutations associated with PI/r exposure, based on the assumption that these mutations occurred with different frequencies in ART-naïve patients compared to patients on a PI/r-based regimen.
These mutations were grouped into three classes, based on their prevalence in isolates from treatment naïve and PI/r treated individuals (Fig. 3).
Class I included five mutations (L449F, D480N, L483Q, Y484P, T487V) that were completely absent or occurred with a frequency of < 1% in isolates from drug-naïve patients and showed a significant increase in isolates from patients on PI/r-based regimen (5.59-11.18%). Class II included four mutations (S462L, I479G, I479K, D480E) already present in isolates from drug naïve patients at a frequency between 1 and 5% but with a significant increase in isolates from patients on PI/r-based regimen (5.59-37.76%).

Gag P7(NC)-P6 mutations according to HIV-1 viral subtypes.
Of the 18 mutations significantly associated with PI/r exposure, seven differed significantly in frequency among HIV-1 subtypes. In class 1, mutations L483Q, Y484P and T487V showed a significantly higher prevalence among subtype A1 infected individuals than among other subtypes (p < 0.001, Table 2). Of note, all class 1 mutations with significantly varying frequencies were found only in subtype A1 and in the "other" category (mainly CRF11_cpx, which has a portion of subtype A in the Gag region). In class 2, I479K was significantly more frequent in the "other" category and D480E in subtype G (p < 0.001). In class 3, S465F was significantly more frequent in the "other" category and E482G in CRF02_AG (p < 0.001).

Covariation of Gag mutations with protease mutations.
Another goal of our study was to assess the covariation of HIV Gag P7(NC)-P6 mutations with other mutations observed in the protease gene of 143 PI/r-treated patients, focusing on PI/r major and/or accessory drug resistance mutations according to the Stanford list of mutations 9 .
To identify significant patterns of pairwise correlations between Gag P7 (NC)-P6 mutations and protease mutations observed in isolates from PI/r-treated patients, we calculated the binomial correlation coefficient (phi) and its statistical significance for each pair of mutations (Table 3).
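The phi coefficient referred to above can be computed directly from the 2x2 table of mutation co-occurrence. The following is a minimal sketch in Python, not the authors' implementation; the function name `phi_coefficient` and the example counts are hypothetical and are not taken from the study data.

```python
import math

def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient for two binary variables summarized in a 2x2 table.

    n11: isolates carrying both mutations
    n10: isolates carrying only the first mutation
    n01: isolates carrying only the second mutation
    n00: isolates carrying neither mutation
    """
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = math.sqrt(row1 * row0 * col1 * col0)
    if denom == 0:
        # Undefined when one of the mutations is always or never present.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical counts for illustration only (they sum to 143 isolates).
example_phi = phi_coefficient(n11=9, n10=13, n01=12, n00=109)
```

A value of 1 indicates that the two mutations always occur together, -1 that they never do, and values near 0 indicate independence.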

Clusters of correlated mutations.
Because pairwise analysis suggested that Gag mutations are associated with specific evolutionary pathways of known resistance-conferring mutations, we performed average linkage hierarchical agglomerative cluster analysis 17 to investigate this hypothesis in more detail. The dendrogram (Fig. 4) shows that Gag mutations L449F and P453L were significantly correlated with PI/r major resistance mutations. Specifically, P453L clustered (bootstrap value = 0.33) with major PI/r resistance mutations M46I and I84V (covariation frequency: 40.9% and 31.8%, respectively). Likewise, another cluster was formed by L449F and L90M (bootstrap value = 0.75, covariation frequency: 25.0%).
Association of Gag P7(NC)-P6 mutations positively correlated to PI/r major resistance mutations with viral load and CD4 cell count at failure.
A further step in our study was to assess the difference in HIV viral load and CD4 cell count at the time of genotypic testing between patients harboring the Gag P7(NC)-P6 mutations positively correlated to PI/r major resistance mutations and those with wild-type amino acids. Although median viremia was slightly higher in the presence of two Gag mutations (L449F and Y484P) than in individuals with wild-type residues, no significant difference in viremia was found. Similarly, even though the median CD4 cell count was lower in all patients with Gag mutations than in those with wild-type residues, no significant difference in CD4 cell count was found between the two groups (Table 4).

Table 3. Significantly correlated pairs of HIV-1 Gag mutations with protease major or accessory resistance mutations. a The frequency was determined in 143 isolates from PI/r-treated patients. b Percentages were calculated for patients containing each specific mutation. c All P values for covariation were significant at a false discovery rate of 0.05. Mutations in bold represent major protease resistance mutations.

Discussion
In this study we identified eighteen HIV-1 Gag mutations which are significantly associated with exposure to PIs. These findings suggest that HIV-1 Gag P7(NC)-P6 mutations are associated with PI/r regimens among HIV-1 non-B subtypes. Some HIV-1 Gag P7 (NC)-P6 mutations (L449F, D480N, L483Q, Y484P, T487V) which were rare or completely absent in isolates from ART-naïve patients had a significantly increased frequency among PI/r-treated isolates at virological failure. This suggests that these mutations might be selected under PI/r pressure. These mutations have also been documented by previous studies as being associated with PI/r regimens 11,16,18 . Moreover, class I mutations (e.g., L449F and Y484P) occurred principally in combination with several major PI/r resistance mutations, suggesting that they emerge after prolonged PI/r exposure, when the virus has already accumulated a large number of PI/r resistance mutations. In this regard, a previous study demonstrated that the emergence of the protease major resistance mutation I50V requires prior changes in the Gag gene at position L449 in vivo, and that these changes reduce sensitivity to amprenavir and improve viral fitness in vitro 18 . The same observation was made in another study where this mutation was present exclusively among individuals failing PI/r-based treatment 19 . Protease mutation L90M had the strongest correlation with L449F, which is confirmed in the dendrogram. The major protease mutations I50V and L90M could therefore serve as sentinels for L449F in the Gag gene. The Gag Y484P mutation was shown to be associated with darunavir exposure and was classified as a novel Gag C-terminus mutation associated with PI/r regimens 20 .
We have also observed that some mutations (class II: S462L, I479G, I479K and D480E) which were already moderately present (1-5%) in isolates from ART-naïve patients significantly increased in prevalence (positive association) in isolates from patients on PI/r regimens. Mutations S462L and I479K were previously identified as Gag polymorphisms associated with PI/r exposure 21 . Also, Gag mutation D480E, which is recognized as a PI-exposure associated mutation 21 , significantly correlated with two major PI/r resistance mutations (I54V and V82A). On the other hand, given that some studies have shown that Gag mutations alone are capable of inducing resistance to darunavir 16 , the role of the novel mutation I479G, which did not significantly correlate with any major or accessory PI/r resistance mutation, deserves to be investigated.
Among the mutations already present in isolates from ART-naïve patients at a frequency of ≥ 5% but with a significant increase in isolates from PI/r treated patients (Fig. 2), only one mutation significantly correlated with major PI/r resistance mutations. More precisely, Gag mutation P453L positively correlated with seven major protease resistance mutations (L33F, M46I, I54L, I54V, V82A, V82T, I84V). Several studies have shown a positive correlation of Gag P453L mutation with some protease major resistance mutations such as I50V and I84V 11,12,22 . The involvement of this mutation in PI/r resistance has been demonstrated in vitro, and it has been implicated in the restoration of viral fitness 18 . The strong correlation of this mutation with PI resistance mutations was confirmed by the dendrogram, in which the P453L Gag mutation clustered with two major protease resistance mutations (M46I and I84V) (Fig. 4). This mutation, although classified as a PI/r-associated mutation 21 , should be investigated further to better understand its likely involvement in PI/r resistance. Among the mutations which did not correlate with any major or accessory protease resistance mutations, E460A was described as repeatedly associated with therapy failure 23 , but its role in viral fitness and/or drug resistance has not yet been proven. Gag mutations R464G, S465F, V467E, Q474P, I479R, and E482G were also previously identified as polymorphisms associated with PI/r exposure 21 . Thus, the potential role of these Gag mutations in the pathways to the development of resistance still needs to be confirmed.
Furthermore, by comparing the resistance mutation profile of RTI-treated patients with that of ART-naïve patients, we found certain Gag mutations (L449F, I479R, E482G and Y484P) that were significantly associated with RTI treatment and were also significantly associated with PI/r treatment. This suggests that some Gag mutations in patients failing PI/r may have been previously induced during RTI treatment, showing a possible interactive role of RTI on emerging Gag mutants. As previously reported by Soldi et al. in a setting where subtype F is highly prevalent, the presence of NRTI mutations was associated with some PI/r resistance mutations (i.e. I50LV) in patients failing PI/r treatment 24 . Indeed, some studies revealed that some inserts in the P6 region of the Gag gene may favor virus escape from nucleoside reverse transcriptase inhibitors (NRTI) through greater accumulation of resistance mutations, leading to a high level of resistance to this drug class 25 . Of the 18 Gag mutations associated with PI/r exposure, fifteen were not significantly more frequent among patients on RTI than among naïve patients, and ten were significantly associated with PI/r exposure when PI/r-treated patients were compared with RTI-treated patients. This supports the view that these mutations are primarily selected under PI/r exposure.
In this study, we observed that the distribution of certain Gag mutations differed significantly according to the HIV-1 non-B viral subtype. Non-B subtypes seem to harbor more than two substitutions in the P2/NC Gag cleavage site compared with subtype B 26 . Data on the distribution of Gag mutations potentially associated with drug susceptibility among non-B subtypes are limited. Our analysis showed that subtypes A and G, although not present in large numbers, seem to be most affected (Table 2). The association of class 1 Gag mutations (< 1% variability among drug-naïve individuals, significantly increased with PI treatment) with some of these non-B subtypes deserves further investigation. Of note, some differences in the susceptibility to PIs of certain non-B subtypes such as CRF02_AG and G when compared to other subtypes were previously documented in the literature 27 .
Among Gag mutations which positively correlated with major protease drug resistance mutations, we did not find any significant association with a worse virologic or immunologic outcome. However, individuals with Gag mutations L449F and Y484P had higher median viremia than individuals with wild-type residues. Of note, L449F was described to contribute to full recovery of viral fitness in protease inhibitor resistance 13,28 .
The Gag gene mutations significantly associated with PI/r regimens in this non-B subtype population are similar to those described in several studies conducted among B subtypes 28 . The selection of Gag gene mutations related to PI/r exposure would therefore be similar in B and non-B subtypes in this C-terminal Gag region. These findings support the need for a cohort study to monitor the emergence of these mutations over time and the dynamics of immuno-virological parameters, in order to generate clinical confirmation supported by advanced in vitro analyses (phenotyping via viral culture). Because we analyzed the non-B subtypes together, without a pairwise comparison to subtype B, some informative subtype-specific patterns that would have reinforced our findings may have been obscured.

Conclusion
In summary, we have found that some Gag P7 (NC)-P6 mutations are associated with an increased frequency at PI/r failure among HIV-1 non-B isolates. In particular, mutations L449F, P453L and Y484P have shown a significant correlation with several major and/or accessory protease resistance mutations. The potential implication of novel Gag P7 (NC)-P6 mutations in treatment failure under PI/r-based regimen deserves to be further investigated in each non-B subtype and using cohort studies adding to in vitro experiments. These mutations could have clinical implications, since the current level of potential PI/r drug resistance might be underestimated.

Materials and methods
Specimens used for analysis. The study was performed on plasma samples of people living with HIV (PLHIV) failing their treatment. Samples were collected from January 2018 through December 2020 in Cameroon for routine clinical monitoring of HIV genotypic drug resistance at the Virology Laboratory of the Chantal BIYA International Reference Centre for research on the HIV/AIDS prevention and management (CIRCB) in Yaoundé, Cameroon. Patients were either naïve to antiretrovirals or treated with a PI/r-based regimen or a reverse transcriptase inhibitor (RTI) based regimen. Only patients with a clearly documented treatment history (available in their medical record) were enrolled; all participants on treatment were experiencing virological failure (i.e. a sustained plasma viral load > 1000 copies/ml after enhanced adherence counseling); and these participants on ART were considered to be adherent, based on self-reporting and enhanced adherence support according to national guidelines 29 .
HIV sequencing. For the sequences obtained, HIV-1 P7 (NC)-P6 Gag and protease genotyping was performed as previously described by Teto et al. 30 . Briefly, after viral RNA extraction from plasma samples, RNA was reverse-transcribed and amplified. From positive amplicons, DNA sequencing was performed in both sense and antisense directions using eight overlapping sequence-specific primers. Sequences were obtained after capillary electrophoresis on an Applied Biosystems™ 3500 genetic analyzer (Applied Biosystems™, USA), and sequences of approximately 168 nucleotides of the Gag gene (P7 (NC)-P6) and 297 nucleotides of the protease region were assembled and manually edited using SeqScape v.2.6 for the P7 (NC)-P6 Gag gene and RECall (CDC, Atlanta) for the protease gene. Regarding protease mutations, we focused our attention on PI/r major and accessory drug resistance mutations according to the Stanford list of mutations (https://hivdb.stanford.edu/hivdb/by-mutations/).
In our analysis, the Cochran rule, which is a conventional criterion for the chi-squared test to be valid, was fully satisfied. In fact, in each contingency table performed with our data set, 72% of the expected frequencies exceed 5, and all the expected frequencies exceed 1. In addition, in those few cases where the expected frequency in a single cell of the contingency table was less than 5, the significance was also confirmed by using the Monte Carlo significance test procedure 32 .
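The expected frequencies that the Cochran rule refers to follow from the row and column totals of each contingency table. As an illustrative sketch only (the helper names `expected_counts` and `cochran_summary` and the example table are hypothetical, not study data), they could be computed as follows in Python:

```python
def expected_counts(table):
    """Expected cell counts under independence for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def cochran_summary(table):
    """Return (share of expected counts above 5, smallest expected count)."""
    cells = [e for row in expected_counts(table) for e in row]
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return share_above_5, min(cells)

# Hypothetical 2x2 table, for illustration only:
# rows = PI/r-treated, ART-naive; columns = mutation present, mutation absent.
share, smallest = cochran_summary([[12, 131], [1, 100]])
```

The two returned values correspond to the two quantities quoted in the text: the proportion of expected frequencies exceeding 5 and the minimum expected frequency.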
We used the Benjamini-Hochberg method 33 to identify results that were statistically significant in the presence of multiple-hypothesis testing. A false discovery rate of 0.05 was used to determine statistical significance.
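For readers unfamiliar with the Benjamini-Hochberg step-up procedure, a minimal Python sketch is given here. It is a generic illustration under the standard definition of the procedure, not the authors' code; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values for illustration only.
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20], fdr=0.05)
```

Because the procedure is step-up, a p-value that misses its own threshold is still rejected when a larger p-value further down the ranking meets its threshold.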
Mutation covariation. In the set of 143 PI/r-treated patients, we exhaustively analyzed patterns of pairwise interactions among Gag mutations associated with PI/r treatment and protease mutations. Specifically, for each pair of mutations and corresponding wild-type residues, Fisher's exact test was performed to assess whether co-occurrence of the mutated residues differed significantly from what would be expected under an independence assumption. Again, the Benjamini-Hochberg method was used to correct for multiple testing, here at a false discovery rate of 0.01. Samples having a mixture of two or more mutations at a given pair of positions were ignored in calculating the covariation, since it is not possible to identify whether these mutations are indeed located in the same viral genome.
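The pairwise test described above can be illustrated with a small, self-contained Python sketch of a two-sided Fisher's exact test on a 2x2 co-occurrence table. This assumes the common two-sided definition (summing the probabilities of all tables no more likely than the observed one); the text does not state which alternative the authors used, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(n11, n10, n01, n00):
    """Two-sided Fisher's exact test p-value for a 2x2 table.

    n11: both mutations present; n10 / n01: only one present; n00: neither.
    """
    row1 = n11 + n10
    col1 = n11 + n01
    total = n11 + n10 + n01 + n00
    denom = comb(total, col1)

    def prob(k):
        # Hypergeometric probability of k co-occurrences with fixed margins.
        return comb(row1, k) * comb(total - row1, col1 - k) / denom

    lo = max(0, row1 + col1 - total)
    hi = min(row1, col1)
    p_obs = prob(n11)
    # Sum over all tables at least as extreme as the observed one
    # (small tolerance guards against floating-point ties).
    p_value = sum(p for p in (prob(k) for k in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)
```

The resulting p-values for all mutation pairs would then be passed to the multiple-testing correction.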
Cluster analysis. To analyze the covariation structure of mutations in more detail, we performed average linkage hierarchical agglomerative clustering, as described elsewhere 34 . Hierarchical clustering methods, which under different names are also widely used in phylogenetic tree building, rely on a matrix of pairwise dissimilarities between entities. Briefly, in average linkage clustering, clusters of increasing size are formed starting from one-element groups by iteratively joining the two clusters with the minimum average inter-cluster distance between pairs of mutations. The distance between a pair of mutations was derived from the phi correlation coefficient, which is a measure of the association between two binary random variables, with 1 and − 1 representing maximal positive and negative association, respectively. This similarity measure was transformed into a distance by mapping phi = 1 to distance 0 and phi = − 1 to distance 1, with linear interpolation in between. The distance between different mutations at a single position was left undefined, as such pairs never co-occur in a single sequence (except in mixtures) and would lead to distorted dendrograms owing to their great distance. Groups are thereby associated into hierarchical clusters of decreasingly strong association. To assess the stability of the resulting dendrogram, confidence values for all subtrees in the dendrogram were computed by 100 replications of the clustering procedure on sequence sets bootstrapped from the original 143 sequences 34 . For instance, a bootstrap value of 1 simply means that out of 100 runs, all 100 had these two mutations (or groups of mutations) most closely linked. In this dendrogram, only Gag gene mutations significantly associated with PI/r exposure and major/accessory protease resistance mutations were considered.
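The distance transformation and the average linkage step described above can be sketched as follows in Python. This is a simplified illustration, not the authors' implementation: it assumes every pairwise distance is defined (the undefined same-position pairs mentioned above are not handled), it omits the bootstrap step, and the labels `m1` to `m3` and their phi values are hypothetical.

```python
def phi_to_distance(phi):
    """Map phi = 1 to distance 0 and phi = -1 to distance 1, linearly in between."""
    return (1.0 - phi) / 2.0

def average_linkage(names, dist):
    """Agglomerative clustering with average linkage.

    names: list of item labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns the merge history as (cluster_a, cluster_b, average_distance) tuples.
    """
    clusters = [(name,) for name in names]
    merges = []

    def avg(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average pairwise distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j], avg(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical phi values between three placeholder mutations.
example_phi = {("m1", "m2"): 0.8, ("m1", "m3"): 0.2, ("m2", "m3"): 0.0}
example_dist = {frozenset(pair): phi_to_distance(phi) for pair, phi in example_phi.items()}
merges = average_linkage(["m1", "m2", "m3"], example_dist)
```

In this toy example the two most strongly correlated placeholders are joined first, and the third joins at the average of its distances to both, which is the behavior the dendrogram heights reflect.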
Ethical approval and informed consent. This study was conducted in accordance with the Declaration of Helsinki. The study protocol was approved by the Cameroon National Ethics Committee; all subjects gave written informed consent/assent for inclusion before participating in the study.