Peripheral blood mononuclear cell transcriptomes reveal an over-representation of down-regulated genes associated with immunity in HIV-exposed uninfected infants

HIV-exposed uninfected (HEU) infants are disproportionately at a higher risk of morbidity and mortality, as compared to HIV-unexposed uninfected (HUU) infants. Here, we used transcriptional profiling of peripheral blood mononuclear cells to determine immunological signatures of in utero HIV exposure. We identified 262 differentially expressed genes (DEGs) in HEU compared to HUU infants. Weighted gene co-expression network analysis (WGCNA) identified six modules that had significant associations with clinical traits. Functional enrichment analysis on both DEGs and the six significantly associated modules revealed an enrichment of G-protein coupled receptors and the immune system, specifically affecting neutrophil function and antibacterial responses. Additionally, malaria pathogenicity genes (thrombospondin 1-(THBS 1), interleukin 6 (IL6), and arginine decarboxylase 2 (ADC2)) were down-regulated. Of interest, the down-regulated immunity genes were positively correlated to the expression of epigenetic factors of the histone family and high-mobility group protein B2 (HMGB2), suggesting their role in the dysregulation of the HEU transcriptional landscape. Overall, we show that genes primarily associated with neutrophil mediated immunity were repressed in the HEU infants. Our results suggest that this could be a contributing factor to the increased susceptibility to bacterial infections associated with higher morbidity and mortality commonly reported in HEU infants.


Differential expression of PBMC genes between HEU and HUU infants.
To investigate gene transcription profiles, we sequenced mRNA extracted from PBMCs sampled at twelve (n = 18) and twenty-four months (n = 14) after birth from HEU infants and in HUU infants (n = 15). After quality control filtering, 47 transcriptomes with an average read depth of 30 million per sample were obtained ( Supplementary Fig. S1). Differential gene expression analysis revealed a total of 262 differentially expressed genes (DEGs) of which approximately two thirds (188) were up-regulated, while a third (74) were down-regulated ( Supplementary  Fig. S2 & Table S2). The top 25 upregulated and downregulated DEGs are shown in Fig. 1a. Due to the age range around the targeted 12-and 24-months age groups, we analysed within these two populations genes showing significant transcriptional differences probably attributed to the infant's age, henceforth referred to as HEU-DEGs. Comparisons between DEGs and HEU-DEGs showed an overlap of only 5% ( Supplementary  Fig. S3). Therefore, we concluded that DEGs were not related to the HEU infant's age, but rather were due to differences between populations (HEU vs. HUU).
Among the DEGs, ODC1 was the most divergent gene (BH adjusted p-value = 1.49e-21) followed by H1FX (Fig. 2). Interestingly, the top most statistically significant up-regulated genes consisted primarily of non-coding genes (ENSG00000213073), antisense genes (ENSG00000273599 and ENSG00000255031), and a pseudogene (ENSG00000266777) (Fig. 2). Other DEGs comprised genes encoding novel transcripts (57/262) (see Supplementary Table S2). Hierarchical clustering of DEGs identified groups of co-expressed genes. Key among the up-regulated cluster of genes were those coding for G-protein coupled receptors (GPCRs) (HCAR2, HCAR3, FFAR2), TNFSF14 and CISH. The down-regulated gene clusters consisted of: five epigenetic factors (H1FX, HMGB2, HIST2H2AC, HIST1H1C, H1F0), genes that function in transcriptional regulation (two noncoding RNAs; RNU6-595P, ATP1B3-AS1), genes involved in the malaria pathway (THBS1, CYP1B1, SDC2) and genes involved in neutrophil-mediated immunity (LCN2, CAMP, HP, MMP8, BPI, LTF) ( Fig. 2  General overrepresentation of the G-protein coupled receptor family in up-regulated DEGs. To investigate potential molecular and biochemical pathways in gene sets of either up-regulated or down-regulated DEGs, we performed enrichment analysis using STRING 24,25 . In the analysis of the up-regulated gene set, two-thirds of all DEGs surprisingly collapsed into one GO term of plasma membrane, particularly the hydroxycarboxylic acid-binding receptors that belong to the G-protein-coupled receptor family and tumor necrosis factor that is involved in systemic inflammation (Supplementary Table S3). Of interest, the down-regulated gene set was associated with 237 GO terms and KEGG pathways. The enriched terms covered diverse biological pathways such as early growth response, osteogenesis, cellular response to stress and external stimuli, response to the oxygen-containing compounds DNA synthesis and immune system processes (Fig. 3 & Supplementary  Table S3). The enriched GO terms related to immune system processes included; neutrophil mediated immunity, defense response to bacterium, defense response to gram-negative bacteria, myeloid and leukocyte activation involved in immune responses, humoral immune responses, negative regulation of cytokine production, leukocyte degranulation and inflammatory responses. www.nature.com/scientificreports www.nature.com/scientificreports/ Altered immune responses towards malaria in HEU infants. Malaria disease was the single KEGG pathway identified and this was linked to IL6, syndecan-2 (SDC2) and thrombospondin-1 (THBS1) DEGS that were enriched in the down-regulated gene set. Malaria is endemic in the study region, and thus our data suggests immune responses to the disease may be affected in HEU infants. To test this hypothesis, we analysed antibody responses to Plasmodium falciparum apical membrane antigen (PfAMA) 1, a leading malaria vaccine candidate, and responses to Plasmodium schizont extract, a crude marker of exposure to malaria parasites during the blood stage infection, in the larger cohort from where the HEU and HUU infants included in the transcriptomic analysis had been sampled. A comparison between the HEU and HUU infants' antibody titers confirmed the  www.nature.com/scientificreports www.nature.com/scientificreports/ hypothesis. HEU infants residing in a high malaria transmission region, had significantly lower antibodies specific for Plasmodium schizont extract (Fig. 4a) and PfAMA1 (Fig. 4b) compared to HUU infants living in the same region. In contrast there were no differences in antibody levels between HEU and HUU infants residing in the low malaria transmission settings, which were similar to those observed in the HEU infants in the high malaria transmission region. Taken together, the antibody responses measured in infants in the larger cohort, support the transcriptional findings in the subset of infants included in the transcriptomic analysis, for the malaria pathway. The low antibody titers in HEU infants corresponds to the low expression of genes coding for host proteins such as THBS1, a receptor required for parasite sequestration and avoidance of splenic destruction. In addition to the direct impact of daily cotrimoxazole prophylaxis, it is possible that lower THBS1, contributes to adequate splenic clearance of parasites leading to a significant reduction in parasite biomass, further reducing exposure and potentially diminishing antimalarial antibody responses.
Weighted gene co-expression analysis reveals significant modules in relation to HIV exposure and neutrophils. To identify groups of genes with similar expression patterns, we performed unsigned WGCNA. Co-expression analysis of 8000 genes from a total of about 17,000 genes resulted in 12 modules of co-expressed genes ( Fig. 5A) whereby the grey module contains unassigned genes. Five modules of clinical interest had significant association between the module eigengenes and clinical traits associated with HIV exposure (Fig. 5B). Modules green (cor = 0.6, P-value = 7e-06) and black (cor = 0.5, P-value = 3e-04) were significant within the HIV exposure trait (HEU vs HUU) whereas modules brown (cor = 0.63, P-value = 2e-06), purple (cor = 0.56, P-value = 4e-05) and green/yellow (cor = 0.52, P-value = 2e-04) were significant within the hematocrit, neutrophils and eosinophils traits, respectively (Fig. 5B).
Eigengene network analysis resulted in similar expression patterns between the neutrophil trait and the brown, purple and green/yellow modules, whereas the HEU vs. HUU trait had similar expression patterns to the green and black modules (Fig. 5C).

Enrichment analysis of DEGs and WGCNA reveals an overrepresentation of neutrophil mediated immunity.
Of the five significant modules, modules brown, purple and green/yellow were enriched in immunological GO terms (Fig. 5D & supplementary Table S4). Of interest was the purple module that also had neutrophil mediated immunity as an enriched term (Bonferroni corrected P-value = 7.18e-04) (Supplementary Table S4). Further analysis of individual DEGs linked to neutrophil-mediated immunity revealed a general trend of down-regulated genes in HEU infants (Fig. 5E). The most significant change was observed in the gene coding for resistin (RETN), that plays a crucial role in neutrophil activation and the formation of neutrophil extracellular traps (NETs) 26 .
Epigenetic factors have been shown to be regulators of global transcription through chromatin remodelling. In our study H1FX and HMGB2 genes, were top amongst the significantly down-regulated genes in HEU infants. Interestingly, on performing a correlation analysis between the 262 DEGs and individual genes from the chromatin hub, a strong positive correlation with all the downregulated genes and a strong negative association with the upregulated genes was observed (Fig. 6). This observation suggests a potential role for the chromatin-related genes in modulating the dysregulation of the HEU infants' transcriptome. www.nature.com/scientificreports www.nature.com/scientificreports/

Discussion
Overall, the transcriptome analysis draws attention to a specific immune system process that is downregulated in HEU infants, neutrophil mediated immunity, that could be a potential marker of increased susceptibility to infection and resulting morbidity and mortality. www.nature.com/scientificreports www.nature.com/scientificreports/ It is speculated that viral particles may traverse the placental membrane, most likely due to the trans-placental diffusion of HIV soluble proteins as opposed to live and replication-competent HIV. Exposure to HIV soluble proteins may trigger immune responses to HIV. Previous reports showed that more than 20% of HEU infants possess immunological memory to HIV despite not being infected 27 , suggesting that early exposure may cause intrinsic immunological triggers. In this study, we observed the upregulation of non-coding RNA (ncRNA) ATP1B3-AS1, in HEU infants. ATP1B3-AS1 is an antisense transcript of ATP1B3 gene, which belongs to the Na + / K + ATPase family that catalyzes the hydrolysis of ATP. The ATP1B3 gene has recently been shown to be a cofactor of the bone marrow stromal cell antigen 2 (BST-2) protein. BST-2 is highly expressed on the surface of HIV infected cells and has been demonstrated to stop the release of newly produced virions by blocking virus escape from infected cells. Knockdown of ATP1B3 enhances the expression of BST-2, in turn modulating the restriction of HIV-1 virion production. The high expression of the antisense transcript, ATP1B3-AS1, may therefore be an indicator of exposure to viral particles and the priming of BST-2 dependent anti-HIV responses in HEU infants as previously suggested 28 .
The high expression of genes primarily associated with the plasma membrane identified a number of receptor genes including those on the surface of immune cells. The high transcription levels of TNFSF14, CISH, FFAR2, HCAR2, and HCAR3 in HEU infants, highlights a role for immune processes involving cells such as dendritic cells (DCs), neutrophils and macrophages. TNFSF14 is expressed on activated T lymphocytes, preferentially CD8+ 29 , and the upregulation of TNFSF14 could be as a result of increased CD8+ activation as previously reported in HEU infants 17,30 . CISH is expressed during the development of DCs and regulates DC-mediated cytotoxic T-lymphocyte activation 31,32 . Its expression is associated with the negative regulation of cytokine signaling, controlling inflammation during infection, and susceptibility to infectious diseases such as bacteria and malaria 33 . High expression of CISH in HEU infants may similarly predispose them to severe bacterial infection and may be potentially associated with the high inflammation observed in HEU infants 34 . HCAR2 and HCAR3, encode for receptors expressed on various immune cells such as neutrophils and macrophages, which have been shown to have anti-inflammatory effects [35][36][37][38] . FFAR2, on the other hand, regulates the differentiation and activation of macrophages and leukocytes 39 .
Further analysis of the downregulated genes revealed a high concentration of DEGs related to immune processes associated with neutrophils. Our data has demonstrated the down-regulation of several genes coding for proteins involved in chromatin structure and regulation of the global transcription program. The strong co-regulation of expression between the DEGs and the chromatin genes, implies that they both contribute to the observed transcriptome changes in HEU infants and identified biological pathways. However, we acknowledge that there are likely to be additional players to the chromatin genes that may contribute to the observed changes. Indeed, several non-coding RNA species including LUCAT1 and LINC01531 were among the DEGs that could potentially be involved in transcriptional regulation.
HEU infants have been reported to have normal neutrophil counts with an impaired neutrophil function 40 . Neutrophils function in both the innate and adaptive immune system 41,42 attacking microorganisms by phagocytosis, degranulation or via the generation of NETs [43][44][45] . Genes primarily associated with neutrophil mediated immunity (LCN2, CAMP, HP, RETN, MMP8, BPI, LTF) were repressed in the HEU infants and could be a contributing factor to the increased susceptibility of HEU infants to bacterial infections [46][47][48] . RETN, found in specific neutrophil granules and cell membranes 49 plays a role in the formation of NETs and the activation of neutrophils 26 , whereas BPI is bactericidal and stimulates phagocytosis by neutrophils through the complement system 50 . LCN2 encodes for lipocalin 2 (neutrophil gelatinase-associated lipocalin -NGAL), a protein mostly released by neutrophils. This protein is known for its bacteriostatic role 46 and enhancement of phagocytosis by macrophages 51 . The downregulation of HP which encodes for an acute phase bacteriostatic protein that is released during inflammation and injury by activated neutrophils 52 , may also interfere with neutrophil function. CAMP encodes for cathelicidin antimicrobial peptides which not only have direct antimicrobial 44,53 properties, but they also affect neutrophil activity through neutrophil recruitment 54,55 and inhibition of neutrophil apoptosis 56,57 . www.nature.com/scientificreports www.nature.com/scientificreports/ It is worth noting that the HEU infants in this study were on daily cotrimoxazole prophylaxis in line with the recommended policy to protect them from Pneumocystis jirovecii pneumonia and they had regular clinical visits, which ruled out on-going febrile infections during blood sample collection. Whether daily cotrimoxazole is beneficial and reduces invasive bacterial infections 58 is unclear as evidence for this is weak and current guidelines have been debated 59,60 . Some studies have reported little to no effect of cotrimoxazole on mortality, LRTIs and diarrheal infections [60][61][62] , while others have observed a significant reduction in hospitalizations 63 . Cotrimoxazole not only acts as an antimicrobial agent, it may also directly interfere with the host's immune system 64 . Therefore, we cannot rule out the immunomodulatory role of cotrimoxazole on the gene expression profiles observed in HEU infants and a repeat of the analysis post cotrimoxazole prophylaxis, that is, after two years of life, may be important for future work. Daily cotrimoxazole may also explain the lower expression profiles for genes associated with malaria pathogenesis on the KEGG pathway. Although not the primary aim, daily cotrimoxazole does protect against malarial disease as previously shown [65][66][67] . In the wider cohort, HEU infants residing in a high malaria transmission area had lower antibody levels to Plasmodium schizont extract and PfAMA1 compared to the community controls living in the same region, implying that they had experienced fewer malaria episodes, in agreement with the gene expression profile.
To our knowledge, this is the first study that has used a transcriptomic approach to describe the gene expression profile of PBMCs from HEU infants. Our data confirm previous reports indicating reduced immune responses in HEU infants 4,7,13,14,20 , as well as revealing new insights into the potential pathways involved in the global dysregulation of immune development in this group of infants. We observed changes in genes involved in innate immune responses, specifically neutrophil function. It is possible that HEU infants have an ineffective neutrophil response that would predispose them to severe bacterial infections and subsequently increased morbidity and mortality, and further studies are warranted to address this.

Materials and Methods
All methods were performed in accordance with the Kenya Medical Research Institute Scientific and Ethics Review Unit guidelines and regulations for handling and analyzing human-derived biological samples. The Kenya Medical Research Institute Scientific and Ethics Review Unit provided ethical approval (SSC2085) and written informed consent was obtained from mothers of the infants recruited into the study.
Study site and participants. This study used samples previously collected from a longitudinal cohort of HEU infants recruited between 2011 and 2012 at the comprehensive care and research clinic (CCRC), Kilifi County hospital, previously described 22,23 . HIV care was provided as per the World Health Organization recommendation at the time of the study. Briefly, HEU infants born to mothers, not on HAART were prescribed nevirapine prophylaxis at birth. Treatment was continued until one week after the complete cessation of breastfeeding, while for those with mothers on HAART, nevirapine prophylaxis was stopped at six weeks of life. Infant's HIV-1 status was determined by polymerase chain reaction (PCR) at six weeks of age or at first contact, and repeated by a rapid antibody test at nine and 18 months of age. Infants suffering from an acute infection or malnutrition at the time of recruitment were excluded from the study. The infants were followed up longitudinally every three months until they were 24 months of age, with an upper age limit of up to 30 months being used to cover late follow-up visits. All HEU infants were given prophylactic cotrimoxazole during the first 18 months of life. Age-matched community control HUU infants were recruited from three localities within the catchment area of Kilifi County with varying malaria endemicity and sampled at a single time point 20,22 . From the parent study, we used samples from 19 HEU infants that contributed to the 12 and 24-month time points. However, one and five infant samples were not available at the 12 and 24-month time points, respectively. Also, we obtained control samples from 15 HUU infants specifically from the low malaria transmission region, to avoid any malaria interference on gene expression profiles. Due to ethical reasons, we could not get blood samples from healthy HUU infants more than once. Samples from selected age-matched infants at 12 months or 24 months were therefore analyzed. In total 34 individuals (19 HEU infants matched to 15 HUU infants) were used for RNA sequencing and the gene expression profiling reported here. During the parent study, 5 ml whole blood was drawn and processed within 4 hours. From this, 300 µl of whole blood was used for RNA isolation by first depleting the red blood cells through lysis and storing the PBMCs at −80 °C until ready for RNA isolation.
Bioinformatics analysis. mRNA library preparation, sequencing and quality control. The frozen PBMCs were thawed in bulk and RNA extracted as per the Qiagen RNeasy kit protocol. The quality of the extracted RNA was confirmed using Agilent Bioanalyzer and messenger RNA (mRNA) enrichment done. mRNA sequencing libraries for 200 bp short-inserts were prepared according to the manufacturer's recommendations. RNA sequencing was done using the Hi-Seq Illumina platform using nine lanes of single-ended reads at a read length of 50 base-pairs (50 bp). RNA extraction, Library preparation, and sequencing were performed at the Beijing Genomics Institute (BGI), China.
Genome assembly and abundance estimation. High quality sequenced data was obtained through quality control which involved removing the adapter sequences, low quality reads and unknown sequences. Using Tophat2 v2.1.051 68 and bowtie2 v2.2.552 69 software, the high quality reads were aligned against the human reference genome GRCh38vs86 obtained from Ensembl 70 using the file transfer protocol (FTP). The estimated abundance levels were then quantified at the exon level using HTSeq v0.6.154 71 . The output of abundance estimation was in form of text files for each sample. Text files from all samples were combined into a count matrix for differential expression analysis using Bioconductor packages in R. www.nature.com/scientificreports www.nature.com/scientificreports/ Differential expression analysis. The quantified abundance estimates, in the form of a count matrix, were normalized and analyzed for differential gene expression using the geometric mean method and negative binomial distribution incorporated in the DESeq2 v1.22.2 package 72 . A threshold of Padj < 0.05, after adjusting for multiple testing using Benjamini-Hochberg, and an absolute log2 fold change of 1 was used to identify differentially expressed genes (DEGs). Hierarchical clustering analysis was then done on the DEGs.
Unsigned weighted gene co-expression networks. Unsigned weighted gene network analysis was performed using the Weighted gene co-expression network analysis (WGCNA) R package 73 . As input data, variance stabilizing transformation was applied to the expression dataset using the variance stabilization transformation function in the DESeq2 package. Genes with read counts less than 10 and "NA" as a p-value were filtered out and using a heuristic cut-off, the top 8000 most variant genes were selected. Correlational analysis of the selected genes was done using the Pearson correlation co-efficient to create a similarity matrix. Power adjacency function (β) was used for soft thresholding to convert the similarity matrix into an adjacency matrix. Scale free topology criterion was used to obtain the power adjacency function and linear regression model fitting index R 2 used to choose the optimum power for our analysis, β = 20 (supplementary Fig. S4). An unsigned topological overlap matrix with a minimum module size of 30 and a merge cut height of 0.25 was used to create co-expression gene modules from the adjacency matrix. The gene modules were assigned colors and visualized using a hierarchical clustering dendogram.
Module-trait relationship. The module-trait relationship was determined by relating eigengenes from each module to clinical traits using pearson correlation. Associations were deemed significant if the eigengene module membership had a p-value of ≤0.001 and the eigengene-trait absolute cor was ≥0.25. The significant modules were further analysed for functional enrichment.
Functional enrichment analysis. We conducted enrichment analysis on the significant DEGs using the Search Tool for Retrieval of Interacting Genes/Proteins (STRING) v11.0 software 24 . DEGs were divided into two gene sets, up-regulated and down-regulated, as input and an evidence score of 0.7 selected to increase stringency in the analysis. Enrichment analysis was performed on the significant modules using the GOenrichmentanalysis function of WGCNA R package and visualized using Cytoscape 74 .
Analysis of antimalarial antibody responses. During the parent study, antibody levels against crude Plasmodium falciparum schizont extracts and PfAMA1 had been measured using previously reported protocols 75,76 . Plasma samples from HEU (n = 31) and HUU (n = 57) infants were analysed at 18 months of age (to rule out any maternal antibody influences). Samples were classified based on malaria exposure, that is, HEU and HUU infants residing in high malaria transmission region (n = 10 and n = 21, respectively) or HEU and HUU infants residing in low malaria incidence region in Kilifi (n = 21 and n = 36, respectively). Of the infants whose antibody levels had been determined, HEU (n = 6) and HUU (n = 5) were infants included in the current transcriptome analysis. Only HUU infants from the low malaria transmission region were included in the transcriptome analysis to avoid potential impact of malaria exposure on gene expression profiles. To determine antibody titers, plates pre-coated with 20 ng/µl of recombinant-PfAMA1 or crude Plasmodium falciparum schizont extract were probed with sera from HEU infants or controls (HUU) and IgG antibody levels against PfAMA1 or schizont extract measured in optical density (OD) values. Hyperimmune serum serially diluted and included on each plate was assigned antibody arbitrary units and used to convert OD values of the samples into relative arbitrary units. The arbitrary units were then log transformed to normalize the antibody titres and the non-parametric Mann-Whitney test used to determine differences in median antibody titres between HEU and HUU infants in low, and similarly between HEU and HUU infants in high malaria exposure regions. These data were used to test the observed KEGG pathway enriched in the down-regulated gene set in HEU infants.

Data availability
The datasets generated during the current study are available from the corresponding author on reasonable request and will be made available in a public repository.