Viral and Host Characteristics of Recent and Established HIV-1 Infections in Kisumu based on a Multiassay Approach

Integrated approaches provide better understanding of HIV/AIDS epidemics. We optimised a multiassay algorithm (MAA) and assessed HIV incidence, correlates of recent infections, viral diversity, and transmission clusters among participants screened for the Kisumu Incidence Cohort Study (KICoS1) (2007–2009). We performed BED-CEIA, limiting antigen (LAg) avidity, Biorad avidity, and viral load (VL) tests on HIV-positive samples. Genotypic analyses focused on the HIV-1 pol gene. Correlates of testing recent by MAA were assessed using a logistic regression model. Overall, 133 (12%, 95% CI: 10.2–14.1) participants were HIV-positive, of whom 11 tested recent by MAA (BED-CEIA OD-n < 0.8 + LAg avidity OD-n < 1.5 + VL > 1000 copies/mL), giving an incidence of 1.46% (95% CI: 0.58–2.35) per year. This MAA-based incidence was similar to longitudinal KICoS1 incidence. Correlates of testing recent included sexually transmitted infection (STI) treatment history (OR = 3.94, 95% CI: 1.03–15.07) and syphilis seropositivity (OR = 10.15, 95% CI: 1.51–68.22). We observed HIV-1 subtypes A (63%), D (15%), C (3%) and G (1%), recombinants (18%), two monophyletic dyads, and intrinsic viral mutations (V81I, V81I/V, V108I/V and K101Q). Viral diversity mirrored known patterns in this region, while resistance mutations reflected likely non-exposure to antiretroviral drugs. Management of STIs may help address ongoing HIV transmission in this region.

infection 5,20 . However, changes in such biomarkers over time often vary per person 21 , hence the use of incidence assays at population rather than individual level.
Apart from incidence testing, techniques like genetic sequencing provide insights into vital viral properties that could boost understanding of HIV epidemics. HIV-1 exhibits high diversity, with nine distinct subtypes (A-D, F-H, J and K) and several circulating recombinant forms (CRFs) which show distinct geographical distribution 22 . In Kenya, subtype A is predominant; however, subtypes C, D, G and CRFs are also common [23][24][25] . Viral diversity remains a challenge to HIV diagnosis, treatment, and other biomedical interventions 26 . Hence knowledge and continuous monitoring of circulating viral strains is an indispensable requisite for effective HIV prevention.
Current technological advances have made multi-pronged assessment integrating serological and molecular viral data, together with host's clinical, demographic and socio-behavioral indices, easily attainable. This could provide highly useful and timely information on which to base formulation and review of transmission prevention initiatives. In this paper, we aimed to optimise an MAA strategy and utilise viral and host data to characterise MAA-identified recent and established HIV infections among participants screened for Kisumu HIV Incidence Cohort Study (KICoS) 27 .

Results
General characteristics of study participants. The median age of the 1106 participants was 21 years (Table 1). Briefly, fewer participants (286/1053, 27.2%) had acquired post-secondary education and a majority (833/1099, 75.8%) had never been married. While most participants reported having had no previous treatment for sexually transmitted infections (STIs), a majority (66.2%) of the HIV-infected individuals had HSV-2 co-infection. We noticed a reduced rate of reporting, especially pertaining to sexual behaviour questions, but a majority (168/235, 71.5%) of those who responded reported having received money in exchange for sex, while condom use during sex in the previous three months was low, especially among HIV-positive individuals (43/110, 39.1%) (Table 1).

Recent and established HIV-1 infections.
Of the HIV-positive samples (n = 133), a total of 8 samples were excluded from the MAA evaluation due to missing Biorad and BED-CEIA results (5 of which also lacked LAg avidity results) occasioned by sample depletion in preceding tests, resulting in 125 samples with complete incidence assay and VL results.
Considering single assays, Biorad avidity had the highest proportion of recent infections compared to the other two assays i.e. 20.0% versus 15.2% and 13.2% for BED-CEIA and LAg avidity respectively. Similarly, there was a trend where MAAs with single incidence assay plus VL generally had higher estimates of recent infections compared to MAAs with more than one incidence assay (Table 2). Narrowing our focus on the MAAs in the last category reported above, we reviewed the incidence estimates against the KICoS1 incidence (1.4%) estimated prospectively and selected the MAA in which recent infections were defined by BED-CEIA OD-n < 0.8 + LAg OD-n < 1.5 + VL > 1000 copies/mL, giving 11 (8.8%, 95% CI: 5.0-15.1) recent infections and incidence of 1.46% (95% CI: 0.58-2.35) per year.
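The proportion of recent infections and its confidence interval can be checked with a few lines of arithmetic. The sketch below uses the Wilson score interval; this choice is an assumption, since the text does not state which interval method was used, although it reproduces the reported bounds for 11/125 to one decimal place. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 11 MAA-recent samples out of 125 with complete results (counts from the text)
low, high = wilson_interval(11, 125)
print(f"{11 / 125:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

The interval covers only the proportion of recent infections among HIV-positive samples; converting it to an annual incidence additionally requires the mean duration of recency, which is not reproduced here.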
Of the 125 samples used in evaluating the MAAs, 114 (91.2%) (12 recent and 102 established) were concordantly classified by BED-CEIA and LAg avidity assays (Kappa score of 0.635 (P = 0.001, 95% CI: 0.429-0.841) and Pearson's phi coefficient (φ ) of 0.638 (P = 0.001)) (Supplementary Table S1). One of these 12 recent infections, with VL of 761 copies/mL, was reclassified as established based on the MAA's criteria. We show the general overview of the classification of all the 125 samples by all the five parameters and by the selected MAA ( Supplementary Fig. S1).
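The selected MAA is a conjunction of three thresholds, which a short sketch can make explicit. The cut-offs are those stated in the text; the `Sample` class, the function name and the OD-n values in the example are hypothetical and serve only to illustrate the reclassification of a serologically recent sample with a VL of 761 copies/mL.

```python
from dataclasses import dataclass

# Cut-offs stated in the text for the selected MAA
BED_ODN_CUTOFF = 0.8      # BED-CEIA normalised optical density
LAG_ODN_CUTOFF = 1.5      # LAg avidity normalised optical density
VL_CUTOFF = 1000          # viral load, copies/mL


@dataclass
class Sample:
    bed_odn: float
    lag_odn: float
    viral_load: float


def classify_maa(sample: Sample) -> str:
    """Return 'recent' only if all three MAA criteria are met."""
    is_recent = (
        sample.bed_odn < BED_ODN_CUTOFF
        and sample.lag_odn < LAG_ODN_CUTOFF
        and sample.viral_load > VL_CUTOFF
    )
    return "recent" if is_recent else "established"


# Hypothetical OD-n values; recent on both serological assays, but VL is 761
print(classify_maa(Sample(bed_odn=0.5, lag_odn=0.9, viral_load=761)))
```

Requiring all three criteria trades sensitivity for specificity: a sample that looks recent on both serological assays is still treated as established when its viral load is at or below the cut-off.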
We further analysed the performance of the three incidence assays on the 114 samples that were classified as established by the MAA. Biorad avidity had the highest rate of misclassification of samples as recent, followed by BED-CEIA (Table 3). In addition, mean VL values were lower among samples classified as recent infections than among those classified as established, for all three incidence assays. However, the differences were not statistically significant (Table 3). Although A, D and AD were the subtypes misclassified as recent, statistical analysis showed that overall misclassification by the three incidence assays was not significantly linked with viral diversity (Chi-square p-values > 0.05) (Supplementary Fig. S2).

Characteristics of individuals with multiassay algorithm-determined recent infections. Among individuals who tested as recently infected on the BED-CEIA + LAg + VL MAA, 7/11 (63.6%) were females, a similar number were < 25 years old, and a majority (8/11, 72.7%) had never married. The most prevalent HIV subtype was A (6/11, 54.5%), followed by C and AD (both with 2/11, 18.2%), and lastly D (1/11, 9.1%) (Supplementary Table S2). In the regression model for HIV recent infection versus HIV negatives, after controlling for age, gender, sex for gifts, history of STI treatment, syphilis and HSV-2, only history of STI treatment and syphilis sero-positivity explained the recent infections in this population (Table 4). Persons reporting past treatment for STI were nearly 4 times more likely to be recently infected than those never treated for STI (OR = 3.94; 95% CI: 1.03-15.07), while those who tested positive for syphilis were ten times more likely to be recently infected compared to the syphilis negative (OR = 10.15; 95% CI: 1.51-68.22).
[...] insufficient plasma quantities, 5 with viral load < 1000 copies/mL (the amplification sensitivity threshold for sequencing), 9 with undetectable viral load, and 5 that failed amplification.
HIV transmission network analysis. Evaluation of evolutionary relatedness among 98 samples (after removing two sequences with sequence quality issues) revealed two monophyletic dyads (Fig. 2) of HIV-1 subtype A and AD. Each dyad had an older male and a younger female. One dyad had single individuals while individuals in the other dyad were both in married relationships. All the individuals had MAA-classified established infections characterised by higher VL in males compared to females. Discordant variable outcomes within dyads, for instance condom use, imply the possibility of transmission networks being larger than captured by this study. Additionally, the small size of the dyads could be due to limited sample size and not an indication of transmission pairs (Supplementary Table S4).
Viral resistance mutations analysis. Observed viral mutations included V81I and V81I/V in the protease gene, and V108I/V and K101Q in the reverse transcriptase gene, all of which were intrinsic mutations not associated with ARV exposure. Four of the five individuals bearing the mutations were females, while two individuals (one male) had recent infections (Table 5).

Discussion
This is one of the few studies in Africa to apply a multifactorial strategy together with recent approaches and recommendations for cross-sectional HIV incidence testing algorithms 16,17 . We optimised an MAA that identified potential recent HIV infections in the study population. We then analysed host factors and viral molecular properties characterising this population. From our analysis, single incidence assays or MAAs with a single incidence assay plus one or two non-serological biomarkers seemed to give poor outcomes compared to MAAs integrating at least two incidence assays and one or more non-serological biomarkers. This pattern generally corroborates the findings from previous evaluations of single incidence assays and MAAs in different setups 8,12,17,28,29 . Single incidence assays in our study gave more than twofold higher incidence estimates compared to the longitudinal KICoS1 incidence. This emphasises the inappropriateness of using the current incidence assays singly in cross-sectional incidence studies 9,13 . The selected MAA comprised BED-CEIA, a relatively inexpensive and widely used assay 28 , plus LAg avidity, an assay with commendable performance in different epidemics, and VL, a test that is currently recommended for incidence MAAs 9 . The concordance between BED-CEIA and LAg avidity assays was statistically significant, with modest kappa and Pearson's phi values. With expansion of capacity for dried blood spot specimens and point-of-care VL testing, such an MAA will be even more attainable in resource-limited settings.
In [...] 30 reported for Nyanza, these statistics collectively emphasise the large HIV burden in this region, hence the need for well-designed prevention strategies. As commonly observed in other parts of sub-Saharan Africa, HIV was more prevalent in females than in males in this study 32,33 , potentially due to disproportionate social and biological factors influencing vulnerability as previously reported 34 , and possibly an imbalance in health-seeking behaviours.
According to Kenya's census projections from 2000 to 2020, the total population of persons aged 15-34 years in Kisumu in 2007 was 233,570 35 [...]

Table 3. HIV viral load characteristics of samples classified as established by the MAA (N = 114) categorised by three single incidence assays, Kisumu Incidence Cohort Study (KICoS): 2007-2009. Note: There were no statistically significant differences in HIV viral load between recent and established infections (t-test).
[...] and diagnosed STI infections could be among key factors driving new HIV transmission in this younger population. This is consistent with HIV risk factor analysis reported previously 34 . Two recently published studies also identified history of STI as a risk factor for recent HIV infections in both rural western Kenya and the country as a whole 16,25 . Prompt diagnosis and treatment of STIs, accompanied by risk reduction counselling, remain vital to the success of HIV prevention initiatives in this setting. Additional efforts are needed to share risk knowledge with younger adults who are sexually active and to promote early HIV testing. The overall distribution of HIV-1 subtypes in our study was similar to patterns earlier reported for this region 24,25 . Although slight changes were observed between recent and established infections, where some subtypes were apparently lower (A and D) or not found (AC, AG and G), while AD and C apparently increased among recent infections, the variations were not statistically significant, implying a mature epidemic. Additionally, Kenya is bordered by five countries with variable distribution of HIV-1 subtypes. Subtypes C and AC dominate in Somalia and Ethiopia, C and D in Sudan, while A and D are the most common subtypes in Uganda. Tanzania has subtypes A, C, D, AC, AD and CRF_CD in varying proportions 37 . Owing to the ongoing regional integration among East African states, frequent transfer of different viral subtypes between states is highly likely, hence continuous monitoring of HIV strains remains an important consideration when carrying out biological investigations in this region.
Although only polymorphic drug-resistance mutations, which have a low effect on HIV therapy, were observed in this study, such polymorphisms could lead to rapid treatment failure and development of drug resistant HIV-1 variants following initiation of therapy 38 . For instance, V82I polymorphism in subtype G contributes to emergence of I82M/T/S resistance after protease inhibitor based treatment failure 39 . The pattern of resistance mutations in this study could be a reflection of lack of prior ARV exposure. Nevertheless, with the increase in ARV use and consequent primary drug resistance mutations in Kenya as well as southern and eastern Africa regions, the importance of frequent drug resistance surveillance cannot be overstated 40,41 .
Finally, the small sample size of recent infections, and the convenience sampling method employed to screen participants for KICoS1, may have affected the statistical power of various variables. We also lacked professional panels to generate local incidence assays window periods. Although we utilised longitudinal KICoS1 incidence to validate the derived cross-sectional incidences, the possibility of misclassification by the MAA cannot be completely ruled out. These factors may reduce the representativeness of our findings.

Conclusion
In summary, our study presents an MAA that estimated cross-sectional HIV incidence in close agreement with longitudinal incidence (1.46% per year versus 1.4%), with a mean recency of infection below one year. This offers important insights on the performance of MAAs in local African epidemics. This MAA allowed us to demonstrate the possibility of comprehensive evaluations covering key groups in the HIV epidemic, i.e. the HIV negative, recent and established infections. This study showed that current or past STIs could be independent factors for new HIV infection in this population. We observed limited viral resistance mutations, four pure HIV-1 subtypes (A, C, D, and G) plus a number of recombinant viruses, and existence of transmission clusters, consistent with previous molecular surveys in this region. Application of our strategy in larger cross-sectional studies will enable a more in-depth assessment with definite outcomes that will support progressive approaches for tackling the spread of HIV.

[...] 27 . Demographic and behavioural information was collected at screening via Audio Computer Assisted Self Interview (ACASI), followed by medical examination and testing for common sexually transmitted infections (STIs). This was termed KICoS1, followed by later design [...]
[...] The BED-CEIA and LAg avidity assays were performed according to the manufacturer's instructions as previously described 5,20,43 . Normalised optical density (OD-n) < 0.8 and < 1.5 represented recent infections on BED-CEIA and LAg avidity respectively, while values above the cut-offs were considered established. Similarly, an avidity index (AI) < 30% represented recent infection on the Biorad avidity assay.
HIV genetic sequencing. Protease (1-99 amino acids) and part of reverse transcriptase (1-250 amino acids) regions of HIV-1 were sequenced by a broadly sensitive in-house assay as previously described 44 . Briefly, HIV-1 RNA was extracted using the QIAamp Viral RNA Mini Kit following the manufacturer's instructions (Qiagen Inc, Chatsworth, CA). Using primers spanning the target pol region, RT-PCR and nested PCR were conducted sequentially, followed by Big Dye Terminator sequencing and resolution using an ABI 3100 Genetic Analyser (Applied Biosystems, Foster City, CA, USA). The sequences were assembled with Sequencher v.3.1 (Genecodes, Ann Arbor, MI) and quality checks were done using the sequence quality assessment tool (SQUAT).
To assess genetic diversity, sequences were analysed by REGA HIV-1 subtyping tool v.3.0, and further compared with NCBI-BLAST and MEGA v.7.0. Sequences showing ambiguous subtyping were selected for recombination analysis using SimPlot software v3.5.5 in a 400 base pair (bp) sliding window with 20 bp increments.
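The sliding-window idea behind the recombination analysis can be sketched as follows, using the 400 bp window and 20 bp increment stated above. This is a simplified illustration rather than SimPlot's own computation: it scores plain percent identity between two pre-aligned sequences, and the function names and toy sequences are hypothetical.

```python
def sliding_windows(length: int, window: int = 400, step: int = 20):
    """Yield (start, end) coordinates of each window along an alignment."""
    for start in range(0, max(length - window, 0) + 1, step):
        yield start, min(start + window, length)


def identity(a: str, b: str) -> float:
    """Fraction of identical positions, ignoring alignment gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return sum(x == y for x, y in pairs) / len(pairs)


def similarity_profile(query: str, reference: str, window: int = 400, step: int = 20):
    """Return (window midpoint, identity) pairs for two aligned sequences."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start + (end - start) // 2, identity(query[start:end], reference[start:end]))
        for start, end in sliding_windows(len(query), window, step)
    ]


# Toy query that matches the reference in its first half only
query = "A" * 500 + "C" * 500
reference = "A" * 1000
profile = similarity_profile(query, reference)
print(profile[0], profile[15], profile[-1])
```

Plotting such profiles against several subtype references shows where the best-matching reference changes along the sequence, which is the signal used to suspect a recombination breakpoint.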
We also investigated the potential existence of transmission clusters by evaluating evolutionary relatedness between the sequences in MEGA v.7. We used the pairwise Tamura-Nei 93 (TN93) model, assuming a gamma distribution (shape parameter = 0.3305). Potential transmission clusters were defined as ≥ 2 sequences with ≤ 1.5% genetic distance and high bootstrap values (> 95%) from 1000 re-samplings 45,46 . The trees were rooted using a subtype K reference sequence (Los Alamos Database accession number AJ249239_CM_K).
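The distance criterion in this cluster definition can be illustrated with a minimal sketch. It applies only the ≤ 1.5% threshold and groups sequences by single linkage; as simplifying assumptions, it substitutes an uncorrected p-distance for the gamma-corrected TN93 distance and does not evaluate bootstrap support, so it yields candidate clusters only. All names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites (stand-in for TN93)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def candidate_clusters(seqs: dict[str, str], max_distance: float = 0.015) -> list[set[str]]:
    """Group sequences linked by pairwise distance <= max_distance."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= max_distance:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # A cluster needs at least two sequences
    return [members for members in groups.values() if len(members) >= 2]


base = "ACGT" * 50                      # 200 nt toy alignment
seqs = {
    "s1": base,
    "s2": "TT" + base[2:],              # 2 differences from s1 (1.0%)
    "s3": base[:100] + "CGTA" * 25,     # 100 differences from s1 (50%)
}
print(candidate_clusters(seqs))
```

In a full analysis, each candidate group would still need to be confirmed as a monophyletic clade with bootstrap support above 95% before being reported as a potential transmission cluster.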
HIV viral resistance mutations were assessed by the algorithm in the Stanford University HIV Drug Resistance Database and categorised according to the International AIDS Society-USA Drug Resistance Mutations Group December 2010 updates 47 .
Statistical methods. From the incidence assay results, we evaluated the performance of each assay singly and in various MAAs with and without VL (cut-off of > 1000 copies/mL) 48 . Samples with missing values for any of the four parameters (three incidence assays or VL) were excluded from this analysis. We derived mean duration of recency (w) and 95% confidence intervals (95% CI) for the three assays from a previous publication 49 , and estimated percent incidence by individual assays and MAAs as previously described 50 , assuming data were missing at random to adjust for the samples missing incidence test data. We transformed the five parameters' data into binary values according to their respective cut-offs and generated heat maps using PermutMatrix-1.9.3 to evaluate the classification of samples by the five parameters. We considered the published incidence of 1.4%, derived from the longitudinal phase of KICoS1 42 , as a guide to select a suitable MAA for subsequent analyses. Kappa coefficient and Pearson's phi coefficient (φ), with their respective 95% CIs and p-values, were calculated to measure agreement between incidence assays. We further sought to characterise misclassification by the three incidence assays based on the optimised MAA.
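Both agreement statistics named above follow directly from a 2 x 2 cross-classification of two assays. In the sketch below, the concordant counts (12 recent and 102 established out of 125) are taken from the Results; the split of the 11 discordant samples into 7 and 4 is an assumption made for illustration, since the text reports only their total. The function name is hypothetical, and p-values and confidence intervals are not computed.

```python
import math


def kappa_and_phi(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Cohen's kappa and Pearson's phi for a 2x2 table.

    a: recent on both assays      b: recent on assay 1 only
    c: recent on assay 2 only     d: established on both assays
    """
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return kappa, phi


# a and d from the text; b = 7 and c = 4 are an assumed split of the 11 discordant samples
kappa, phi = kappa_and_phi(a=12, b=7, c=4, d=102)
print(f"kappa = {kappa:.3f}, phi = {phi:.3f}")
```

Kappa corrects the raw 91.2% agreement for the agreement expected by chance, which is high here because most samples are established on both assays.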
To assess factors potentially associated with the MAA-identified recent infections, we fitted two models using logistic regression for both bivariate and multivariate analyses. One model assessed recent infections versus HIV negatives and another model assessed recent infections versus established infections. All variables with bivariate p-value ≤ 0.2 or set a priori were included in the final multivariable models. In the multivariable models, covariates were added one by one to assess their individual effect on the outcome, while controlling for other covariates as potential confounders. We used the likelihood ratio test to select the best models. Chi-square and t-test