Predicted resistance to broadly neutralizing antibodies (bnAbs) and associated HIV-1 envelope characteristics among seroconverting adults in Botswana

We used HIV-1C sequences to predict (in silico) resistance to 33 known broadly neutralizing antibodies (bnAbs) and evaluate the different HIV-1 Env characteristics that may affect virus neutralization. We analyzed proviral sequences from adults with documented HIV-1 seroconversion (N = 140) in Botswana (2013–2018). HIV-1 env sequences were used to predict bnAb resistance using bNAb-ReP, to determine the number of potential N-linked glycosylation sites (PNGS) and to evaluate Env variable region characteristics (VC). We also assessed the presence of signature mutations that may affect bnAb sensitivity in vitro. Predicted bnAb resistance varied within our cohort. 3BNC117 showed high predicted resistance (72%) compared to intermediate levels of resistance to VRC01 (57%). We predict low resistance to PGDM1400 and 10-1074 and no resistance to 4E10. No difference was observed in the frequency of PNGS by bnAb susceptibility patterns except for a higher number of PNGS in V3 bnAb resistant strains. Associations of VC were observed for V1, V4 and V5 loop length and net charge. We also observed few mutations that have been reported to confer bnAb resistance in vitro. Our results support use of sequence data and machine learning tools to predict the best bnAbs to use within populations.

bnAb binding efficiency 9. Several studies have evaluated HIV-1 Env characteristics to determine barriers to the ability of various bnAbs to neutralize different HIV subtypes 9,10. Few studies have looked into how this diversity has evolved over time to show vast differences, from within-host differences (over 20% amino acid differences) to population-scale differences, especially in settings of highly prevalent viruses such as HIV-1 subtype C (HIV-1C) 11,12.
For resource-limited settings such as Botswana, it is difficult to carry out in vitro assays to detect neutralization capacity, and direct clinical trial data in the country to study bnAb efficacy is limited. However, machine learning prediction models/tools in silico can determine potential resistance or susceptibility of bnAbs in this setting 13,14. Correlation of genotypic viral sequence data with actual in vitro data helps us determine the strength of these prediction models 15. Because HIV-1 strains vary in their sensitivity to antibodies, accurate prediction of the sensitivity can be used to find the best bnAb combinations that can potentially be used to neutralize regional HIV strains 13. In this analysis, we used proviral HIV-1C sequences from adults with documented HIV-1 seroconversion to predict (in silico) resistance to 33 known bnAbs and evaluate the different HIV-1 env sequence characteristics that may affect virus neutralization in Botswana.

HIV Env gp120 variable loop length and net charge (V1-V5)
We analyzed V1-V5 variable loop lengths (including hypervariable regions) and net charge for all gp120 sequences. Differences in envelope characteristics, stratified by predicted resistance/sensitivity, were assessed using the Wilcoxon rank-sum test. Although these regions are characterized by extreme variability in HIV-1C variants, few differences were observed in V1-V4 loop lengths, except for VRC26.25, where resistant strains had shorter V1 loops (and V1 hypervariable loop regions, data not shown) (p < 0.01). Similarly, NIH45-46 resistant strains had shorter V4 loop lengths (p = 0.04). The most significant differences were observed for V5 loop length, where strains resistant to CD4 bnAb 2G12, DH270.5 and FP interface 35O22 had shorter V5 loops (p < 0.01, p < 0.01 and p = 0.02, respectively). Boxplots comparing medians, with p values, are provided in Fig. S1.
Net charge was generally similar between predicted resistant and sensitive strains. V1 charge distributions differed significantly for VRC26.25 (p = 0.01), PGT121 (p < 0.01) and VRC01 (p = 0.01). A strong association was also observed for 2F5 resistant strains (p < 0.01); however, this was the only significant observation made for this bnAb across all sequence characteristics (Fig. S1). V4 charge distributions varied only for the CD4 binding bnAbs HJ16 and VRC03 (p = 0.02 for both).
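The loop characteristics compared above can be illustrated with a minimal sketch. It assumes each variable loop has already been extracted from the alignment as a string, that net charge is computed as the count of K and R residues minus the count of D and E residues (the charge convention is not stated in the text and is an assumption here), and that resistance calls are available as booleans. The function names are hypothetical.

```python
from statistics import median

POSITIVE = set("KR")  # assumed positively charged residues
NEGATIVE = set("DE")  # assumed negatively charged residues


def loop_length(aligned_loop: str) -> int:
    """Number of residues in a loop, ignoring alignment gap characters."""
    return sum(1 for aa in aligned_loop if aa not in "-.")


def net_charge(aligned_loop: str) -> int:
    """Net charge as (K + R) minus (D + E); gaps and other residues count as 0."""
    seq = aligned_loop.upper()
    return sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)


def summarize_by_prediction(loops, resistant_flags):
    """Median loop length and net charge for resistant vs. sensitive strains."""
    groups = {True: [], False: []}
    for loop, flag in zip(loops, resistant_flags):
        groups[bool(flag)].append((loop_length(loop), net_charge(loop)))
    return {
        ("resistant" if flag else "sensitive"): {
            "median_length": median(length for length, _ in values),
            "median_charge": median(charge for _, charge in values),
        }
        for flag, values in groups.items()
        if values  # skip a group with no sequences
    }
```

The per-group values produced this way could then be compared with a rank-sum test, for example `scipy.stats.ranksums`, which is one possible choice and not necessarily the software used in this analysis.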

Signature amino acid positions and mutations
The presence of signature amino acid mutations at key positions was evaluated across all alignments with high-level predicted resistance (70-100%), excluding FP interface bnAbs. We identified signature mutations that confer resistance to a subset of HIV-1C bnAbs from Bricault et al. 16, who identified signature patterns by correlating each amino acid and PNGS across several HIV env alignments from different HIV subtypes based on

V2 apex mutations
Signature amino acid positions associated with proven in vitro resistance to CH01 and PGT145 were evaluated by correlating frequency of mutation with predicted resistance (Table S1a). We observed strong associations across key signature sites for CH01 and PGT145. The common loss of K169 and K171, which is associated with some V2 apex binding bnAbs, was not observed in any of the resistant strains. E164 was observed and associated with resistance to both CH01 and PGT145. N160 was associated with resistance to most V2 apex binding bnAbs but was observed in sequences predicted to be sensitive to PGDM1400 (data not shown). We did not observe the K169E/T mutation at high frequencies among sequences with high predicted resistance.

V3 mutations
Most V3 signature sites are generally well conserved (Table S1b). Positions 332 and 334, which are usually associated with resistance to most V3 binding bnAbs, were associated with resistant strains; 100% of strains resistant to 2G12 and PGT135 had an asparagine (N) at position 334. A few sites had several variations, but these were all significantly associated with resistance: 295, 336, 413, 640, 774.

CD4 mutations
The majority of CD4 resistance mutations are observed around the CD4 binding site. We observed the T234N mutation, a non-contact mutation that appeared as the consensus amino acid and was present in 126/140 (90%) of our sequences, of which 93 were resistant to 3BNC117 and 91 to b12. We did not observe the G458Y mutation, and D99Y was observed at very low (< 5%) frequencies. At position 371, amino acids I (consensus) and V (mutation) were observed in the majority of sequences; this differs from the T371K mutation usually observed for subtype C viruses. We also observed a few other mutations that were associated with high levels of resistance across most bnAbs: S364H/P and G471A.
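Counts of the kind reported above (for example, how many sequences carry a given residue at a signature position, and how many of those are predicted resistant) can be tallied as in the following sketch. It assumes the sequences are aligned so that a single column index corresponds to the position of interest; mapping HXB2 numbering to an alignment column is left out. The function name `residue_tally` and its arguments are hypothetical.

```python
def residue_tally(aligned_seqs, resistant_flags, column, residue):
    """Count sequences carrying `residue` at `column`, overall and among resistant strains."""
    # One entry per carrier sequence, holding that sequence's resistance call.
    carriers = [
        bool(flag)
        for seq, flag in zip(aligned_seqs, resistant_flags)
        if seq[column].upper() == residue.upper()
    ]
    n_total = len(aligned_seqs)
    n_carriers = len(carriers)
    return {
        "carriers": n_carriers,
        "percent_of_all": 100.0 * n_carriers / n_total if n_total else 0.0,
        "carriers_resistant": sum(carriers),
    }
```

Calling the function once per bnAb, with that bnAb's resistance calls, gives the number of carrier sequences predicted resistant to each antibody.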

MPER mutations
We evaluated signature amino acid positions associated with 2F5 resistance. Overall, all positions considered had strong associations with resistance, including positions where wild-type amino acids were more frequent than mutations. We did not record mutations that are associated with complete resistance to MPER binding bnAbs (N671T, W672L, W680G and F673L) among the 2F5 resistant strains. We did, however, observe the K683R mutation in 25/140 (18%) of the sequences, 24 of which were from strains resistant to 2F5 (Table S1d). Of note, this mutation was also observed among all the 4E10 sensitive strains (data not shown).

Discussion
We report, for the first time, prediction of resistance to 33 different bnAbs using HIV-1 proviral sequences from adults with HIV-1 seroconversion in Botswana, using a readily available in silico approach (bNAb-ReP) 15. There are limited data on the diversity of HIV-1 env and how it relates to the probability of successful use of bnAbs as immunotherapies in sub-Saharan African countries. Here we report on the variability of the HIV Env characteristics that may affect susceptibility and potential use of these bnAbs in Botswana, including signature mutations that may facilitate bnAb escape in the setting of HIV-1 subtype C. Generally, predicted bnAb resistance varied among different bnAbs even if they were of the same binding class, indicating variability in the potential neutralization capabilities of different bnAbs in the HIV-1C setting. For instance, we observe high rates of predicted resistance to 2F5 and no resistance to 4E10 among the two evaluated MPER binding site bnAbs. Similarly, the prediction of resistance also varied among other classes, including V2 apex and V3 binding bnAbs. Consistent with other studies that have analyzed phenotypic neutralization of HIV-1C envelopes 9,17, we observed diversity in the predicted response to bnAbs that target V2. Recently, Mandizvo et al. 17 demonstrated differences in neutralization breadth of different bnAbs: 2G12 (23%) compared to 10-1074 (80%) among V3 binding bnAbs, as well as 3BNC117 (59%) compared to VRC07-LS (100%) among CD4 binding bnAbs, in single env genomes from nine South African individuals followed longitudinally post-acute HIV-1 infection. The authors also describe that transmitted founder viruses were observed to be more resistant to VRC01 and sensitive to PGDM1400, but a reverse observation is seen for more chronic infections, and we observed similar results for adults with documented seroconversion. In Botswana, several studies have evaluated VRC01 and 10-1074 bnAbs in both adult and infant populations, showing moderate to good neutralization breadth, especially for 10-1074 18,19; we demonstrate that the prediction model of choice was able to predict similar results in the setting of adults with HIV-1 seroconversion, further highlighting the potential use of machine learning and artificial intelligence to determine bnAbs to use in different clinical settings. Furthermore, the results observed in our study show moderate to low levels of resistance to bnAbs used in germline targeting vaccines, such as VRC01, VRC26.25, PG9, PGT121 and PGT128. These results could support the use of these bnAbs as templates for potential vaccine development 20,21 within the sub-Saharan African region, with caution against ongoing epitope changes and bnAb recognition and binding site altering mutations due to high viral diversity of the predominant HIV-1 subtypes.
Assessing N-linked glycans found in PNG sites in HIV-1 is important as they take up an average of 50% of the total mass of gp120 and are used by the virus to escape bnAb neutralization and increase viral pathogenesis 22. Our data reveal high conservation of signature PNG sites across HIV-1C sequences, and higher levels of PNG sites were mostly associated with V3 binding bnAbs. We observed highly conserved PNG sites: N88, N156, N160, N197, N241, N262, N276, N289, N301, N386, N448. These data agree with results from Sutar et al. 9, who evaluated the geospatial differences in HIV-1C genetic signatures and what attributes differentiate region-specific HIV-1C with virus neutralization to key bnAbs using full-length sequences across 37 different countries (including Botswana) retrieved from the Los Alamos National Laboratory HIV database. They reported that N88, N156, N160, N197, N276, N301 and N386 were highly conserved in all countries with over 70% abundance, where N301 was the most conserved (89-100%). Similarly, in our cohort, N301 conservation was 95%, with N262 as the most conserved (100%) PNG site, which has been previously predicted to interact with PGT151 23. Furthermore, Sutar et al. 9 also observed significant differences between India and South Africa at PNG sites N130, N295, N392 and N448, which represent C1, C2, V4 and C4; these sites are responsible for the integrity of the mannose patch as well as associations with bnAbs such as 2G12, VRC-PG05 and PGT135 22. We observe low conservation at PNG site N295, which is one of the sites where glycans that interact with V3 bnAbs are often positive for neutralization; this supports our finding of significant differences between resistant and sensitive strains within the V3 binding bnAbs. However, the extreme variability of V3 positions 336-442 may lead to inconclusive observations in terms of bnAb sensitivity 16. We also observe fewer PNGS in most V3 resistant strains compared to sensitive strains. This is in contrast with what has been described in other studies but could be attributed to the effects of evolutionary pressure that lead to high diversity of HIV-1C. Naturally, glycan holes can create vulnerable sites on the virus that are more susceptible to antibody binding and neutralization, inducing autologous neutralizing responses 24,25. The V3 loop glycan shield is the most conserved of the variable regions and this evolutionary pressure can therefore cause mutations in the glycosylation sites as a defense against glycan shielding, which in turn would lead to poorer antibody recognition.
Traditionally, variable region characteristics of HIV Env have been used to inform HIV vaccine designs, where higher variation requires a robust and potent intervention. Our results show minor differences in variable region characteristics by predicted bnAb resistance; the differences most likely to affect bnAb sensitivity were observed in V1, V4 and V5. The variations observed in these sites have been reported to affect the neutralization breadth of the majority of bnAbs; this has been the main driver for development of bnAb cocktails targeting different regions of the env gene for treatment and prevention of HIV-1 26-28.
The presence of mutations in the majority of sequences with high levels of predicted bnAb resistance has been a general observation in several other studies 5,16,17. Our results are similar to those of studies done both in vitro and in silico 29, where introduction of several mutations into the env backbone of most HIV-1 pseudoviruses led to reduced potency and neutralization breadth of several bnAbs. We also report T234N, a glycosylation site mutation that has been associated with resistance to several CD4 binding bnAbs 16,30,31, including strong associations with HIV-1 vertical transmission 32.
Our study had several limitations. We used the bNAb-ReP tool to predict HIV-1 resistance/sensitivity to 33 different known bnAbs. Although the tool has been shown to have a strong correlation between in silico predictions and in vitro data through the CATNAP database (96% prediction accuracy) 15, several newer algorithms have been deposited on GitHub which may yield more accurate results by updating the tools with other predictors selected a priori over time 13. Although this is a limitation, it also highlights the need to develop within-country, regional and subtype-specific algorithms for such predictions, with the hope of getting predictions that can be used across all subtypes and are easier to perform by end-users in resource-limited countries. We only evaluated a subset of mutations for a subset of bnAbs of interest, and future studies will include HIV-1C-specific mutations that correlate with phenotypic data demonstrating actual resistance caused by these mutations. Another limitation is the lack of phenotypic data from Botswana or from our current cohort of interest, as we are unable to correlate genotypic prediction data with in vitro neutralization, which might allow us to draw firmer conclusions. The strengths of our analyses include the ability to use a readily available prediction tool with sequences from recent seroconversions in Botswana. The predictions of resistance and sensitivity/susceptibility shown in our results are in line with what has already been published by teams that used phenotypic assays within the region to detect resistance to several bnAbs.

Figure 1. Predicted bnAb resistance based on bNAb-ReP predictions from HIV-1 gp160 proviral sequences from seroconverters in Botswana. The bNAb-ReP algorithm was used to predict resistance/sensitivity of 140 HIV-1 gp120 sequences aligned to HXB2 and the consensus HIV-1C reference, from proviral strains of adults with documented seroconversion in Botswana. The probability gradient indicates probable sensitivity/resistance, where values lower than 0.5 indicate probable resistance (dark orange being most resistant) and vice versa. Percentage (%) resistance indicates the percentage of resistant sequences using a bNAb-ReP probability cutoff of 0.5.
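As an illustration of how the percentage resistance in Figure 1 follows from the per-sequence probabilities, the sketch below applies the 0.5 cutoff described in the caption. It assumes the probabilities for one bnAb are available as a plain list of floats, and it treats a value of exactly 0.5 as not resistant, which is an assumption because the caption only specifies values lower than 0.5. The function name is hypothetical and is not part of bNAb-ReP.

```python
def percent_resistant(probabilities, cutoff=0.5):
    """Percentage of sequences whose probability falls below the cutoff."""
    if not probabilities:
        raise ValueError("at least one probability is required")
    # Values strictly below the cutoff are called resistant (assumed tie rule).
    n_resistant = sum(1 for p in probabilities if p < cutoff)
    return 100.0 * n_resistant / len(probabilities)
```

For example, `percent_resistant([0.1, 0.4, 0.6, 0.9])` returns 50.0, since two of the four values fall below 0.5.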

Figure 2. Frequency of gp120 N-glycosylation sites. This figure shows differences in the overall number of PNGS across all sequences. (A) I-V represent V2 apex binding antibodies, V3 binding antibodies, CD4 binding antibodies, FP interface binding antibodies and MPER binding antibodies, respectively. These proportions were plotted for strains with predicted resistance (1/blue) and those with predicted susceptibility (0/pink). P values indicate differences in proportions calculated using the Wilcoxon rank-sum test. (B) Percentage abundance of signature PNG sites (N88-N463). The logo section of the graph shows the proportions of dominant amino acids from all consensus proviral sequences at each PNG site; the size of the letter represents the proportion of sequences with that amino acid among all sequences. These sites are divided by their specific PNG domains.
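Counting PNGS in a sequence can be sketched as follows, using the standard N-linked glycosylation sequon N-X-S/T, where X is any residue except proline. This is a minimal illustration and not necessarily the tool used to produce Figure 2; the function names are hypothetical, and the returned positions are 0-based indices in the ungapped sequence, not HXB2 numbering.

```python
import re

# N followed by any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. "NNTS") each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_pngs(aligned_seq: str):
    """Return 0-based positions of N-X-[S/T] sequons in the ungapped sequence."""
    ungapped = aligned_seq.upper().replace("-", "")
    return [match.start() for match in SEQUON.finditer(ungapped)]


def count_pngs(aligned_seq: str) -> int:
    """Number of potential N-linked glycosylation sites in one sequence."""
    return len(find_pngs(aligned_seq))
```

Gaps are removed before scanning so that a sequon interrupted by an alignment gap is still detected.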

Table 1. Baseline demographics of adults with documented HIV-1 seroconversion in the Botswana Combination Prevention Project (BCPP). This table describes the baseline characteristics of the adults with documented seroconversion from the BCPP study. ART, antiretroviral treatment; CD4, cluster of differentiation 4 T-cells; IQR, interquartile range.