Patterns and risk of HIV-1 transmission network among men who have sex with men in Guangxi, China

The prevalence of HIV-1 in Guangxi is very high, and the rate of HIV-1 infection among men who have sex with men (MSM) has been increasing. Therefore, it is necessary to explore the patterns and risk factors of HIV transmission in Guangxi. For this purpose, individuals diagnosed with HIV-1 during 2013–2018 in Guangxi were recruited. Phylogenetic relationship, transmission clusters, and genotypic drug resistance analyses were performed based on HIV-1 pol sequences. Related factors were analysed to assess for their association with HIV-1 transmission. CRF07_BC (50.4%) and CRF01_AE (33.4%) were found to be the predominant subtypes. The analysed 1633 sequences (50.15%, Guangxi; 49.85%, other provinces) were segregated into 80 clusters (size per cluster, 2–704). We found that 75.3% of the individuals were in three clusters (size ˃ 100), and 73.8% were high-risk spreaders (links ≥ 4). Infection time, marital status, and subtype were significantly associated with HIV-1 transmission. Additionally, 80.2% of recent infections were linked to long-term infections, and 46.2% were linked to other provinces. A low level of transmitted drug resistance was detected (4.8%). Our findings indicated superclusters and high-risk HIV-1 spreaders among the MSM in Guangxi. Effective strategies blocking the route of transmission should be developed.

Region Center for Disease Prevention and Control, and all the methods in this study were performed in accordance with the approved guidelines. All the participants provided their written informed consent to participate in the study.
Study subjects. All the MSM who were diagnosed with HIV-1 at voluntary counselling and testing clinics in Guangxi, China from January 2013 to December 2018, and who had not been treated with antiviral therapy were enrolled in the study (in total, 955 individuals). A total of 732 sequences were successfully sequenced. In addition, 887 individuals (835 heterosexuals, 45 intravenous drug users, and 7 mother-to-child cases) diagnosed with HIV-1 in Guangxi were included in the study. We performed a BLAST search by using the sequences and selected the top 5 sequences showing the highest homology from publicly available HIV databases at Los Alamos National Laboratory. A total of 1119 reference sequences were selected for analysis after excluding repeated sequences. Overall, 2738 sequences were included for subsequent analysis 9 .
HIV-1 LAg-Avidity enzyme immunoassay (EIA). These individuals who had been treated with antiretroviral therapy (ART) and CD4 + T cell < 200 were excluded for HIV-1 LAg-Avidity EIA. An HIV-1 LAg-Avidity EIA was performed according to the manufacturer's instructions (Sedia, Portland, OR, USA). Test specimens were initially run as single samples. If the normalized OD (ODn) was > 2.0, the specimen was classified as a longterm infection. Specimens with an ODn < 2.0 were tested again in triplicate to confirm the obtained values. In confirmatory testing, specimens with an ODn < 1.5 were classified as recent infections.
Phylogenetic analysis. All sequences were edited by using Sequencher5.1 software and aligned by using Network construction. These aligned sequences were entered into the HyPhy software to calculate their genetic distances, and Tamura-Nei93 pairwise genetic distances were calculated for all pairs of sequences. Two sequences showing a genetic distance of ≤ 1.5% were identified as potential transmission partners. The network data were processed using Cytoscape v3.5.1 software to construct the transmission network. We determined the characteristics of the network, including nodes (individuals in the network), edges (the link between two nodes, representing the potential transmission relationship between the two individuals), degrees (the number of edges linking one node to other nodes), network sizes (the number of individuals in a cluster), and clusters (groups of linked sequences) 12 . The network data was visualized using a custom R script with the network package in R version 4.0.2 software 13 .

Statistical analysis.
To explore factors associated with a potential transmission, the edge between the two nodes was calculated. Comparisons among individuals were based on the following factors: year diagnosed, domicile region, age, marital status, educational level, ethnicity, HIV subtype, and when infection occurred. Significant differences in categorical variables were analyzed using chi-square or Fisher's exact tests. P values < 0.05 indicated statistical significance. All statistical analyses were performed using the SPSS 21.0 software (IBM, Chicago, IL, USA).

Results
Demographic characteristics of the study participants. Among

Identification and characteristics of transmission networks. An HIV-1 transmission network was
constructed and is shown in Fig. 2A 2B). Links (degrees) between each (node) for the 611 Guangxi MSM subjects (nodes) ranged from 1 to 508 and 543 (88.9%) had more than 2 degrees, indicating that each subject was linked to two or more subjects. Additionally, 451 (73.8%) subjects had more than 4 degrees and were considered high-risk spreaders and more likely to spread HIV-1 to other uninfected people. One subject was linked to 508 subjects, classifying this individual as a super spreader (Fig. 2C).
To explore factors associated with a potential transmission, we investigated the influence of the year diagnosed, residence region, age, marital status, educational level, ethnicity, subtype, and when infection occurred on the number of links in the network. A total of 723 Guangxi MSM, newly diagnosed between 2013 and 2018,  www.nature.com/scientificreports/ were included in the analysis. Chi-square test (Table 1) showed that there was a significant difference in the number of links between marital status (P = 0.009), subtype (P < 0.001), and when infection occurred (P < 0.001). Single subjects had more edges than divorced or widowed subjects. Among the different subtypes, CRF55_01B occurred most frequently, followed by CRF07_BC and CRF01_AE. Recent infection was more common than long-term infection.

Risk factors of recently infected individuals.
To explore the potential transmission patterns between the recently infected subjects and those with long-term infection in the networks according to geographic locations, 106 (of 469, 22.6%) recent infections were identified by performing an LAg-Avidity EIA, of which 26 (24.5%) were linked to other recent infections, and 85 (80.2%) were linked to the long-term infections. 70 (66.0%) were interlinked in Guangxi, 49 (46.2%) were linked to another province (Fig. 4A). To better understand the potential risk of HIV-1 transmission between recent infection and long-term infection, we compared the links between the two. The results showed that recently infected individuals had significantly more links than those with long-term infection (mean of the links, 23.06 vs 1.89), and most of the recent infections had more links with other provinces and long-term infections (Fig. 4B). There was a significant difference in the number of links between recent infection and long-term infection (Table 1). and NNRTIs. The highest rate of drug resistance was observed for subtype B (55.6%), followed by CRF55_01B (6.0%), CRF01_AE (4.9%), and CRF07_BC (3.5%). Eleven (29.7%) individuals harboured drug resistance mutations that were included in the transmission networks, and three were interlinked, indicating potential transmitted drug resistance (Fig. 5). These individuals with drug resistance in the transmission networks all corresponded to long-term infection.

Discussion
Our findings show that CRF07_BC and CRF01_AE were the major HIV-1 subtypes among the MSM in Guangxi, which agrees with the results of previous studies [14][15][16][17][18] . CRF07_BC was first found among IDUs in Guangxi in 2002 19 , and gradually formed a large-scale epidemic in the area in recent years. In 2013, CRF01_AE (62.0%),  www.nature.com/scientificreports/ CRF07_BC (25.0%), and CRF08_BC (6.5%) remained as the major strains 20 , with CRF07_BC accounting for 50.4% of cases in MSM in this study. These results suggest that CRF07_BC spread much easier among MSM and became a predominant strain in the region. CRF55_01B was first identified in Chinese MSM in 2012 21 , and a recent paper has reported that CRF55_01B was identified in the MSM in 7 provinces with prevalence rates of 1.5-12.5% 22 . In that study, CRF55_01B accounted for 11.4% in MSM, with CRF55_01B widely disseminated among the MSM in the region. In addition to the major CRFs, we identified 2 CRFs (CRF67_01B, CRF68_01B) that were imported from outside of the province and 2 URFs (01_AE/CRF07_BC, CRF01_AE/C). A previous study reported that higher rates of dual-variant and multiple-variant HIV infection were found in MSM compared to in heterosexual individuals in the same populations 23,24 , increasing the probability of HIV-1 recombination. There is a growing concern that URFs may be prone to be formed in MSM because of the high-risk behaviour features of this population, including multiple sex partners, low rates of condom use, and anal intercourse. It is challenging to trace the most probable infection route. We used novel molecular methods to explore the probable HIV-1 transmission network and risk associated with potential transmission. We showed that most HIV-1 infections were associated with cases outside of the province, and the main infection route was MSM. Moreover, the HIV-1 transmission network is concentrated; 3 networks contained more than 100 individuals, with the largest network containing 704 individuals. The finding reflects the existence of an HIV-1 super transmission network in the MSM in Guangxi and the characteristics of HIV-1 agglutinative transmission, which differs from those in other provinces of China 9,17 . Further analysis showed that 88.9% of individuals had more than 2°, meaning each subject was linked to at least two subjects; 451 (73.8%) subjects had more than 4° and were considered high-risk spreaders and more likely to spread HIV-1. One individual had 508° and was considered a super spreader. The finding reveals the existence of HIV-1 multiple partners and a super transmission network among MSM in Guangxi. Previous studies showed that individuals with more links in the network have a higher probability of spreading the virus to others because of their high viral load and high rate of partner change; therefore, these individuals may function as super-spreaders 25,26 . Chi-square test shows that marital status, subtype, and infection time were factors associated with potential transmission in networks, with a single status, CRF55_01B, and recent infection more likely to be associated with the virus spreading. This finding reveals why CRF55_01B has been widely disseminated among the MSM in the region in recent years. Thus, additional molecular epidemiology surveillance of CRF55_01B is required. www.nature.com/scientificreports/ It is very important to analyze the geographic dimension of HIV-1 transmission among MSM within the province and throughout the country. Guangxi is located in southwest China and is adjacent to Vietnam, Yunnan, and Guangdong. This area of China is heavily affected by HIV-1 7 . Nanning, the capital and largest city of Guangxi, is the provincial center in terms of the economy, culture, science, education, and tourism. Our findings revealed that Nanning is a major region of HIV-1 transmission, and the virus has spread to 13 other cities in Guangxi. Further analysis revealed that Nanning accounted for the dominant proportion of infections, not only in terms of local transmission (74.4%) but also in terms of cross-regional HIV-1 transmission (73.5%) based on the provincial transmission network, highlighting its remarkable role in the intra-provincial spread of HIV-1. The major province of cross-regional HIV-1 transmission with Guangxi is Guangdong, accounting for 57.5% of cases. Guangdong, a prosperous Chinese province adjacent to Guangxi, and it takes approximately 3 h to travel between Guangxi and Guangdong by high-speed railway. In the future, Guangdong is expected to become a major source of HIV-1 transmission. We also found that although most recent infections were associated with long-term infections, recent infections were closely related to potential transmission among networks, which is consistent with the results of previous studies 18,27 . This suggests that early HIV-1 infection plays an important role in transmission, emphasizing the need for early diagnosis and timely antiretroviral treatment.
Our findings revealed an estimated prevalence rate of drug-resistant HIV-1 strains of 4.8% among antiretroviral treatment-naïve MSM in Guangxi, representing a low level of TDR, which is consistent with the results of a previous study in the region 28 and those of the studies in other regions of China 9,29,30 . However, the rate of drug resistance remains close to a moderate level according to the World Health Organization categorization method. Although it is not necessary to make large adjustments to current antiretroviral therapy regimens, further studies are essential for monitoring TDR.
This study has several limitations. First, most of the sequences from the other provinces were downloaded from the internet, which may not fully reflect the actual situation in the corresponding provinces. However, the results provide direction, which may strengthen regional cooperation. This is especially important in Guangdong, which is the province with the most cross-regional HIV-1 transmission with Guangxi. Second, although our findings show that MSM is the largest factor in HIV transmission, the sample sizes for other routes of infection, including among heterosexuals, were low. Third, a potential sampling bias; we could analyze only the samples that had been diagnosed, but those that had been infected but not diagnosed could not be included in the analysis. Another limitation is that LAg-Avidity cannot be completely accurate to distinguish recent infection and longterm infection, including viral load will be good to define recent infection. In the future, more samples will be included in the genetic transmission network.
We showed that network analysis based on a molecular approach can be used to trace the most probable HIV-1 transmission routes and infer additional features of an HIV-1 epidemic. Additionally, we determined the characteristics and risk of HIV-1 transmission among the MSM in Guangxi. These results can be used to design and implement evidence-based interventions. Further studies are required to determine how to combine the results of network analysis with public health approaches to enable effective HIV-1 intervention.

Data availability
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.