HIV silencing and cell survival signatures in infected T cell reservoirs

Clark, Iain C.; Mudvari, Prakriti; Thaploo, Shravan; Smith, Samuel; Abu-Laban, Mohammad; Hamouda, Mehdi; Theberge, Marc; Shah, Sakshi; Ko, Sung Hee; Pérez, Liliana; Bunis, Daniel G.; Lee, James S.; Kilam, Divya; Zakaria, Saami; Choi, Sally; Darko, Samuel; Henry, Amy R.; Wheeler, Michael A.; Hoh, Rebecca; Butrus, Salwan; Deeks, Steven G.; Quintana, Francisco J.; Douek, Daniel C.; Abate, Adam R.; Boritz, Eli A.

doi:10.1038/s41586-022-05556-6

Article
Open access
Published: 04 January 2023

Nature volume 614, pages 318–325 (2023)

Abstract

Rare CD4 T cells that contain HIV under antiretroviral therapy represent an important barrier to HIV cure^{1,2,3}, but the infeasibility of isolating and characterizing these cells in their natural state has led to uncertainty about whether they possess distinctive attributes that HIV cure-directed therapies might exploit. Here we address this challenge using a microfluidic technology that isolates the transcriptomes of HIV-infected cells based solely on the detection of HIV DNA. HIV-DNA⁺ memory CD4 T cells in the blood from people receiving antiretroviral therapy showed inhibition of six transcriptomic pathways, including death receptor signalling, necroptosis signalling and antiproliferative Gα12/13 signalling. Moreover, two groups of genes identified by network co-expression analysis were significantly associated with HIV-DNA⁺ cells. These genes (n = 145) accounted for just 0.81% of the measured transcriptome and included negative regulators of HIV transcription that were higher in HIV-DNA⁺ cells, positive regulators of HIV transcription that were lower in HIV-DNA⁺ cells, and other genes involved in RNA processing, negative regulation of mRNA translation, and regulation of cell state and fate. These findings reveal that HIV-infected memory CD4 T cells under antiretroviral therapy are a distinctive population with host gene expression patterns that favour HIV silencing, cell survival and cell proliferation, with important implications for the development of HIV cure strategies.

Main

Understanding how HIV persists during antiretroviral therapy (ART) can advance the search for a safe and scalable HIV cure. A central example of this is the latent reservoir concept, in which some HIV proviruses are thought to persist by maintaining a quiescent state that spares their host cells from virus- or immune-mediated killing². Evidence supporting this concept includes the presence of rare memory CD4 T cells in ex vivo samples that inducibly express HIV^{1,3,4}, as well as data from culture models demonstrating molecular blocks to HIV transcription, particularly in resting cells^{5,6,7,8,9,10,11}. These and other findings have prompted the development of latency-reversing agents (LRAs) that can induce HIV transcription with the goal of exposing infected cells to elimination in vivo. However, the lack of a demonstrable reduction in reservoir size in clinical trials of LRAs^{12,13,14,15,16} has emphasized how much remains unknown about the barriers to an HIV cure. Of particular importance is the long-standing uncertainty about the biology of HIV-infected CD4 T cell reservoirs. As cells containing quiescent viruses in the blood and tissues have not been identifiable without substantial manipulation, it has been impossible to establish whether these rare cells have special attributes that favour HIV latency or otherwise help to account for HIV persistence under ART. Studies attempting to circumvent this obstacle by detecting HIV enrichment in phenotypic, functional or anatomic CD4 T cell subsets^{17,18,19,20,21,22,23,24,25,26,27} (in some cases using advanced single-cell analyses^{28,29}) have found low levels of infected cells across subsets and emphasized the heterogeneity of the infected cell pool. Thus, the identification of distinctive biological signatures among HIV-infected CD4 T cells under ART has emerged as a central challenge in HIV cure research.

To help address this challenge, we developed a custom microfluidic technology that enables the unbiased detection and gene expression profiling of HIV-infected cells directly ex vivo. The technology, termed focused interrogation of cells by nucleic acid detection and sequencing (FIND-seq)³⁰, separates millions of single cells within water-in-oil droplets for immediate lysis, followed by polyadenylated RNA sequence recovery and then sorting according to HIV DNA detection. This approach isolates whole transcriptomes from cells containing quiescent viruses without the need for in vitro latency reversal, thereby capturing a transcriptome-wide profile of these cells in their natural state. Here we used FIND-seq in people with HIV receiving long-term ART to analyse host gene expression patterns of memory CD4 T cells containing HIV gag DNA—a marker of the HIV-infected cell reservoir that encompasses both intact and defective virus sequences³¹. Our results reveal distinctive transcriptomic signatures that help to explain HIV-infected CD4 T cell persistence despite the suppression of virus replication, highlighting important opportunities for further progress towards an HIV cure.

HIV-DNA⁺ cell transcriptome sorting

FIND-seq uses three microfluidic devices to isolate polyadenylated RNA sequences from HIV-DNA⁺ cells (Fig. 1a–c). The first device loads millions of single cells into water-in-oil droplets with a strongly denaturing lysis buffer and molten agarose covalently conjugated to oligo-dT (Fig. 1a). After encapsulation, the agarose in each single-cell droplet is cooled to form a hydrogel that retains high-molecular-mass DNA as well as polyadenylated RNA. This approach maintains compartmentalization among cells during oil removal, incubations, washes and reagent exchanges, therefore enabling optimized cell lysis, mRNA reverse-transcription and subsequent PCR while preventing interference between steps (Extended Data Fig. 1a–d). The second device reinjects washed hydrogels containing single-cell transcriptome cDNA and genomic DNA into a second emulsion for HIV gag DNA detection (Fig. 1b). The third device uses an accurate dielectrophoretic sorter³² to separate droplets on the basis of their fluorescence (Fig. 1c) for subsequent whole-transcriptomic analysis (Fig. 1d and Extended Data Fig. 1e). Using dilutions of latently infected human J-Lat T cells in uninfected human Jurkat T cells, FIND-seq droplet cytometry detected HIV-DNA⁺ cells with an estimated sensitivity of 50% and a per-droplet false-positive rate of 1 in 300,000 (Fig. 1e). Transcriptome sequencing in HIV-DNA⁺ droplets sorted from a 1:1 mixture of J-Lat and mouse cells revealed >99% human sequences (Extended Data Fig. 1f,g). These findings demonstrate that FIND-seq accurately detects rare HIV-DNA⁺ cells and isolates the transcriptomes from these cells.

**Fig. 1: Whole-transcriptomic analysis of HIV-DNA⁺ cells using FIND-seq.**

Transcriptome sequencing after FIND-seq

We tested whether FIND-seq-sorted transcriptomes accurately represent the cells from which they are sorted by using mixtures of J-Lat T cells and Raji human B cells (Extended Data Fig. 2a). We cultured J-Lat and Raji cell lines separately and performed RNA sequencing (RNA-seq) analysis of each using standard protocols. At the same time, a 1:100 mixture of J-Lat and Raji cells was analysed using FIND-seq (Extended Data Fig. 2b). Gene expression differences between J-Lat and Raji cells after standard processing were highly correlated with differences between HIV-DNA⁺ and HIV-DNA⁻ cells after FIND-seq processing (R = 0.47, P = 2.2 × 10⁻¹⁶; Extended Data Fig. 2c). Furthermore, differential expression between J-Lat and Raji cells analysed using FIND-seq identified canonical T cell and B cell genes (Extended Data Fig. 2d) and agreed with published findings (Extended Data Fig. 2e). These results demonstrate that FIND-seq can be used to study the transcriptomic signatures of rare HIV-DNA⁺ cells.

FIND-seq of HIV-DNA⁺ cells ex vivo

To define gene expression patterns of HIV-DNA⁺ memory CD4 T cells under ART, we applied FIND-seq to magnetically purified memory CD4 T cell samples from five people with HIV receiving long-term ART that was initiated during chronic infection (Supplementary Table 1). Droplet cytometry data acquired during sorting demonstrated between 534 and 2,153 HIV-DNA⁺ cells per million (Extended Data Fig. 3a), consistent with previous studies using quantitative PCR analysis of extracted DNA^{19,20}. False-positive frequencies of HIV-DNA⁺ memory CD4 T cells measured in three HIV-uninfected control participants ranged between 7 and 19 per million (Extended Data Fig. 3b). To maximize sorted transcriptome cDNA quantity and therefore reduce the need for extensive whole-transcriptome amplification (WTA) that could skew gene abundance in the sequencing libraries, we collected all droplets after HIV detection PCR in aliquots of 100 cell-equivalents. Sorting resulted in different numbers of aliquots collected across participants owing to the different frequencies of HIV-DNA⁺ cells (Extended Data Fig. 3c). After WTA and sequencing, we used a prospective curation process to select only those samples with a high library quality for further analysis (Methods). This resulted in a set of 22 curated samples from three people with HIV (Supplementary Table 2 and Extended Data Fig. 4).
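As a rough illustration of what the reported frequencies imply for the purity of positive calls, the following sketch compares the HIV-DNA⁺ frequencies in people with HIV with the false-positive frequencies in uninfected controls. It assumes that false positives arise at the same per-cell rate in both groups, which the text does not state, and the function name is hypothetical.

```python
def expected_true_fraction(observed_per_million, false_pos_per_million):
    """Fraction of positive calls expected to be true positives.

    Assumption (not from the paper): false positives occur at the same
    per-cell rate as measured in HIV-uninfected control samples.
    """
    if observed_per_million <= 0:
        raise ValueError("observed frequency must be positive")
    true_calls = max(observed_per_million - false_pos_per_million, 0.0)
    return true_calls / observed_per_million


# Ranges reported in the text (events per million memory CD4 T cells).
worst = expected_true_fraction(534, 19)   # lowest signal, highest background
best = expected_true_fraction(2153, 7)    # highest signal, lowest background
print(f"worst case: {worst:.3f}, best case: {best:.3f}")
```

Under this assumption, true positives would make up roughly 96% or more of the positive calls across the reported ranges.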

Host transcriptomes of HIV-DNA⁺ cells

Using the curated dataset (Supplementary Table 3), we first compared host gene expression between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells at the global level. Unsupervised clustering revealed partial segregation between HIV-DNA⁺ and HIV-DNA⁻ cell transcriptomes (Fig. 2a), and the use of Euclidean distance as a summary measure of transcriptomic relatedness demonstrated that distances between HIV-DNA⁺ and HIV-DNA⁻ cell samples were significantly greater than distances among HIV-DNA⁻ cell samples (P = 8.0 × 10⁻⁴; Fig. 2b). However, we also observed sample clustering by participant (Fig. 2a) as well as significantly greater Euclidean distances among HIV-DNA⁺ cell samples than among HIV-DNA⁻ cell samples (P = 2.7 × 10⁻⁵; Fig. 2b). We conclude that the whole-transcriptome clustering analysis suggested distinctive host gene expression by HIV-DNA⁺ memory CD4 T cells, but also indicated that transcriptomic differences among populations of HIV-DNA⁺ cells and across study participants are substantial sources of variation in the dataset.
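The distance comparison described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the expression vectors are invented toy values, the helper names are hypothetical, and no significance test is applied because this section does not name the test used.

```python
import math
from itertools import combinations, product


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def between_group_distances(group_a, group_b):
    """All pairwise distances between samples of two different groups."""
    return [euclidean(a, b) for a, b in product(group_a, group_b)]


def within_group_distances(group):
    """All pairwise distances among samples of one group."""
    return [euclidean(a, b) for a, b in combinations(group, 2)]


def mean(values):
    return sum(values) / len(values)


# Toy data: three samples per group, three genes per sample (invented values).
hiv_pos = [[5.0, 1.0, 2.0], [6.0, 0.5, 2.5], [4.0, 2.0, 1.0]]
hiv_neg = [[2.0, 2.0, 2.0], [2.1, 2.2, 1.9], [1.9, 1.8, 2.1]]

between = between_group_distances(hiv_pos, hiv_neg)
within_neg = within_group_distances(hiv_neg)
within_pos = within_group_distances(hiv_pos)

print(mean(between), mean(within_pos), mean(within_neg))
```

In the toy data, as in the pattern reported in the text, between-group distances and distances among HIV-DNA⁺ samples both exceed the distances among HIV-DNA⁻ samples.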

**Fig. 2: Host transcriptomic pathways in HIV-DNA⁺ memory CD4 T cells under ART.**

Host gene differential expression

To identify individual genes and transcriptomic pathways that were characteristic of HIV-DNA⁺ memory CD4 T cells, we performed differential gene expression (DGE) analysis using two distinct approaches (Supplementary Table 4). Using a combined approach that analysed participants as biological replicates, we identified 2,776 differentially expressed genes (DEGs; absolute fold change > 1.5, FDR ≤ 0.05) (Extended Data Fig. 5a). Pathway enrichment analysis on the basis of these DEGs yielded several cancer- and cell-cycle-related pathways (Fig. 2c), suggesting differences between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells related to cell proliferation and survival. Notably, a comparison of DEG lists defined for each of the participants separately revealed only 11 DEGs common to all three participants (Extended Data Fig. 5b–d). However, pathway enrichment analysis using participant-specific DEG lists (absolute fold change ≥ 2, P ≤ 0.01) identified six pathways that shared concordant direction across participants (Fig. 2d and Supplementary Table 5). All six concordant pathways showed z-activation scores of <0, indicating pathway inhibition in HIV-DNA⁺ cells relative to HIV-DNA⁻ cells. Notably, these inhibited pathways in HIV-DNA⁺ cells included death receptor signalling, necroptosis signalling and the anti-proliferative Gα12/13 signalling pathway³³. Inferences of pathway inhibition arose from both decreased expression of pathway activators and increased expression of pathway inhibitors in HIV-DNA⁺ cells and depended on differential expression of distinct pathway genes in different participants (Fig. 2e). We conclude that although many individual DEGs distinguishing HIV-DNA⁺ cells from HIV-DNA⁻ cells differed between the participants, higher-order analysis revealed that inhibition of cell death and anti-proliferative signalling are shared attributes of HIV-DNA⁺ memory CD4 T cells under ART.
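The DEG thresholds in the combined analysis (absolute fold change > 1.5, FDR ≤ 0.05) can be illustrated with a short sketch. The Benjamini–Hochberg procedure is used here as an assumed FDR method, since this section does not state which correction was applied; the gene names and values are invented.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, log2_fc, pvalues, min_abs_fc=1.5, max_fdr=0.05):
    """Keep genes passing both the fold-change and the FDR threshold."""
    fdr = benjamini_hochberg(pvalues)
    threshold = math.log2(min_abs_fc)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fc, fdr)
        if abs(lfc) > threshold and q <= max_fdr
    ]


# Invented example values, not data from the study.
genes = ["geneA", "geneB", "geneC", "geneD"]
log2_fc = [1.2, -0.9, 0.3, 2.0]
pvalues = [0.001, 0.01, 0.0005, 0.2]

degs = call_degs(genes, log2_fc, pvalues)
print(degs)
```

Here geneC fails the fold-change threshold despite a small p-value, and geneD fails the FDR threshold despite a large fold change, so only geneA and geneB are retained.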

Analysis of co-expressed gene signatures

We anticipated that pooled sequencing from diverse HIV-DNA⁺ memory CD4 T cells under ART could dilute signals from infected cell subpopulations, thereby limiting the detection of informative features of HIV-infected cells in conventional DGE analysis. To identify transcriptomic signatures of HIV-DNA⁺ cells as groups of genes, we used weighted gene co-expression network analysis (WGCNA) to define gene modules on the basis of correlation patterns across samples (Supplementary Table 6). Within the curated set of 22 samples that together expressed 17,898 different genes, this process produced 28 distinct gene modules of varying sizes (Fig. 3a). Correlating module gene expression patterns with cell infection status (that is, HIV-DNA⁺ versus HIV-DNA⁻) identified significant correlations for module 5 (60 genes, R = 0.46, P = 0.03) and module 28 (85 genes, R = 0.78, P = 2 × 10⁻⁵) (Fig. 3a). Thus, unsupervised clustering using WGCNA revealed two groups of genes, together comprising 145 genes or only 0.81% of the measured transcriptome, that distinguished HIV-DNA⁺ from HIV-DNA⁻ memory CD4 T cells in ART-treated people with HIV.
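Two parts of this analysis lend themselves to a small worked example: the fraction of the measured transcriptome covered by the two modules, and the correlation of a module with infection status. The sketch below is a simplification. WGCNA summarizes each module by its eigengene (the first principal component); here the mean of z-scored expression is used as a stand-in, and the expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def zscore(values):
    """Standardize one gene's expression across samples."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]


def module_summary(module_expr):
    """Mean z-scored expression per sample (stand-in for an eigengene).

    module_expr is a list of per-gene lists (genes x samples).
    """
    z = [zscore(gene) for gene in module_expr]
    n_samples = len(module_expr[0])
    return [sum(gene[j] for gene in z) / len(z) for j in range(n_samples)]


# Share of the measured transcriptome, from counts given in the text.
module_fraction = (60 + 85) / 17898
print(f"{100 * module_fraction:.2f}% of measured genes")

# Invented toy module: two genes across four samples.
module_expr = [[1.0, 2.0, 5.0, 6.0], [0.0, 1.0, 4.0, 4.0]]
infection_status = [0, 0, 1, 1]  # 0 = HIV-DNA-negative, 1 = HIV-DNA-positive
r = pearson(module_summary(module_expr), infection_status)
print(f"module-trait correlation: {r:.2f}")
```

Correlating a continuous module summary with a binary status variable in this way is equivalent to a point-biserial correlation.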

**Fig. 3: Co-expressed gene signatures in HIV-DNA⁺ memory CD4 T cells under ART.**

To characterize the differences between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells reflected by these modules, we analysed the module gene lists using Gene Ontology (GO). In both modules, we found statistically significant enrichment (adjusted P ≤ 0.05) for genes related to the regulation of gene expression at the transcriptional and post-transcriptional levels (Fig. 3b). Module 28 was enriched for GO terms related to mRNA splicing and processing. Module 5 was enriched for genes involved in mRNA degradation by nonsense-mediated decay, which has been linked to negative post-transcriptional regulation of HIV gene expression in vitro³⁴. Moreover, module 5 was enriched for terms related to cell survival, activation and proliferation, including regulation of death receptor signalling, regulation of calcineurin–NFAT signalling and DNA-damage checkpoint regulation. We conclude that GO analysis of WGCNA module genes identified transcriptional and post-transcriptional gene regulation as well as several cell state regulatory processes as distinguishing attributes of HIV-DNA⁺ memory CD4 T cells under ART.

Furthermore, we examined the transcriptomic differences between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells by inspecting a filtered list of the 44 genes in WGCNA modules 5 and 28 that showed at least twofold average difference between HIV-DNA⁺ and HIV-DNA⁻ cell populations and a concordant direction between populations across the participants (Fig. 3c, Extended Data Table 1 and Supplementary Table 6). We noted that 8 out of 44 genes were previously implicated in the regulation of HIV transcription. Four genes were linked to negative regulation of HIV transcription through histone modification (EHMT1³⁵, RBBP4³⁶ and MTA1³⁷) or promoter-proximal pausing of RNA polymerase II (CTR9³⁸), and were higher in HIV-DNA⁺ cells. The remaining four genes were linked to positive regulation of HIV transcriptional initiation (GTF2I³⁹ and MAPKAPK3⁴⁰) or elongation (NCOA1⁴¹ and SNW1⁴²), and were lower in HIV-DNA⁺ cells. We conclude that host gene expression signatures of HIV-DNA⁺ memory CD4 T cells under ART were relatively non-permissive for HIV transcription.

We next examined the remaining 36 genes from the filtered module 5 and 28 gene lists. Ten of these genes encoded RNA-processing factors. In module 5, these included higher levels in HIV-DNA⁺ cells of antiviral defence factor NCBP1⁴³ and post-splicing complex component RNPS1⁴⁴, both of which have been linked to nonsense-mediated decay. Module 5 also included higher levels in HIV-DNA⁺ cells of G3BP2, a stress granule factor in a gene family that has been implicated in cytoplasmic sequestration and translational inhibition of HIV mRNAs⁴⁵. mRNA-processing factors in module 28 included higher levels in HIV-DNA⁺ cells of PRRC2A, a reader of N⁶-methyladenosine RNA modifications that can be induced by HIV infection in vitro⁴⁶, and the splicing regulator SRPK. Among the additional 26 genes, we noted that module 28 included USP19 and LRRFIP2, which can inhibit apoptosis⁴⁷ or pyroptosis⁴⁸ and were higher in HIV-DNA⁺ cells, and TLN1, which is required for antigen-driven T cell proliferation mediated through immunological synapses⁴⁹ and was also higher in HIV-DNA⁺ cells. Finally, we noted multiple module 28 genes involved in the DNA-damage response and mitochondrial function. We conclude that the transcriptomic signatures of HIV-DNA⁺ memory CD4 T cells under ART suggest that these cells have the capacity for post-transcriptional HIV silencing, and are also consistent with DGE-based indications of increased cell survival and proliferation.

Enrichment of signatures in cell subsets

To investigate the origins of HIV-DNA⁺ memory CD4 T cell transcriptomic signatures identified by co-expression network analysis, we compared these signatures with the transcriptomes of defined CD4 T cell subsets. We isolated circulating naive and memory CD4 T cell subsets from nine ART-treated people with HIV (Supplementary Table 1) using fluorescence-activated cell sorting (FACS) (Extended Data Fig. 6), defined subset gene expression using RNA-seq and finally used gene set enrichment analysis (GSEA) to compare gene expression signatures in the sorted memory subsets (defined by expression relative to the naive subset) against co-expression network analysis signatures of HIV-DNA⁺ cells (Extended Data Table 2). This revealed significant enrichment of the module 5 signature in memory CD4 T cells of the CD27⁺CCR7⁺CD45RO⁺CXCR5⁺CCR6⁻ peripheral T follicular helper (T_FH) phenotype⁵⁰. No significant enrichment was observed for the module 5 signature in any other subset, or for the module 28 signature in any of the subsets. We conclude that, taken together, the transcriptomic signatures of HIV-DNA⁺ memory CD4 T cells under ART did not map to defined CD4 T cell subsets, although the module 5 signature showed partial similarity to the signature of CCR6⁻ peripheral T_FH cells in ART-treated people with HIV.

HIV RNA expression analysis

Finally, we used the curated set of 22 samples to analyse HIV transcriptional patterns in HIV-DNA⁺ memory CD4 T cells under ART by aligning transcriptome sequence reads to a reference HIV genome (Fig. 4). We found that some HIV-DNA⁺ cell samples showed hundreds of HIV reads (Fig. 4a), including one sample from participant 2510 with two distinct virus sequences (Fig. 4b,c) that suggested processive HIV transcripts from at least two cells in the sorted aliquot of 100 cells. Nevertheless, HIV read percentages for all HIV-DNA⁺ cell samples were <0.05% (Fig. 4a), which is 100-fold lower than previously reported for HIV-expressing cells sequenced after in vitro stimulation⁵¹. These findings are consistent with latent infection and/or HIV sequence defects that limit virus transcription in HIV-DNA⁺ cells. HIV genome coverage patterns of mapped reads were notable for isolated peaks interspersed with areas of no coverage (Fig. 4d), suggesting atypical transcription start sites⁵², transcripts from proviruses with deletion mutations and/or chance sampling variations. Spliced transcripts were not detected even by manually inspecting and mapping individual mates of read pairs using BLAST. The use of assembly-based tools to produce contigs from reads that did not initially map to the human reference yielded no HIV contigs from 5/6 HIV-DNA⁺ cell samples and did not substantially increase mapped HIV read counts in the remaining sample (not shown). We conclude that polyadenylated RNA-seq in HIV-DNA⁺ memory CD4 T cells from ART-treated people with HIV did not reveal either full-length genomic HIV transcripts or spliced HIV messages encoding accessory proteins.
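The HIV read percentage used above is a simple ratio of HIV-mapped reads to total reads. The sketch below shows the arithmetic with invented read counts. The function name is hypothetical, and the comparator of about 5% is only what the text's own figures imply (0.05% multiplied by 100), not a value quoted from the cited study.

```python
def hiv_read_percentage(hiv_reads, total_reads):
    """Percentage of sequenced reads that map to the HIV reference."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * hiv_reads / total_reads


# Invented counts for illustration; not data from the study.
pct = hiv_read_percentage(hiv_reads=300, total_reads=2_000_000)
below_reported_ceiling = pct < 0.05  # ceiling reported for all HIV-DNA+ samples

# A 100-fold difference from 0.05% corresponds to about 5% of reads.
implied_comparator_pct = 0.05 * 100

print(f"{pct:.3f}% HIV reads; below 0.05%: {below_reported_ceiling}")
```

This shows that a sample can contain hundreds of HIV reads and still fall well under the 0.05% ceiling when the library holds millions of reads.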

**Fig. 4: HIV RNA sequences in memory CD4 T cells under ART.**

Discussion

The absence of evidence for HIV reservoir size reduction in ‘shock and kill’ clinical trials has bred uncertainty about the role of therapeutic HIV latency reversal and the use of the latent reservoir concept. Meanwhile, attempts to understand the mechanisms of HIV persistence under ART by identifying distinctive attributes of HIV-infected CD4 T cells have faced major technical obstacles. Using microfluidic technology developed to study HIV-DNA⁺ memory CD4 T cells under ART in their natural state, we identified host gene expression signatures in these rare cells that were intrinsically non-permissive for the transcription of the virus. This supports the concept that these cells are a latent reservoir and links HIV transcriptional quiescence in vivo to host gene expression patterns that are specific to infected cells. Furthermore, host transcriptomic signatures of HIV-DNA⁺ memory CD4 T cells under ART indicated that the persistence of these cells may involve additional mechanisms beyond HIV transcriptional silencing, including post-transcriptional HIV silencing, resistance to cell death and resistance to anti-proliferative signalling. These findings are consistent with incomplete latency reversal by early LRAs^{53,54,55,56,57,58} and the persistence of infected cells observed even after cell stimulation both in vitro⁵⁹ and in vivo^{12,13,14,15,16}. Overall, our results reveal a host cell transcriptomic signature whose further elucidation may lead to the development of new HIV cure strategies.

The origins of the gene expression patterns that we identified in this study will require further investigation. In part, these patterns may arise progressively under ART through the selective elimination of cells that do not express them. Selection for an HIV-silencing signature may occur among cells that are competent to express toxic virus gene products in vivo, while selection for cell survival and proliferation could apply to the entire HIV-DNA⁺ cell pool. Importantly, this selection model implies that there are pre-existing differences among CD4 T cells in the expression of HIV silencing, cell survival and cell proliferation signatures that did not trace in their entirety to a single memory CD4 T cell subset. These signatures may therefore reflect mixed contributions from multiple subsets, each with modest enrichment for the virus, perhaps exemplified by our partial mapping of one co-expressed module signature to peripheral T_FH cells. At the same time, it is also possible that some gene expression patterns of HIV-infected memory CD4 T cells are a consequence of HIV infection in these cells. Cellular transcriptomic reprogramming could represent a host response to HIV integration or other life cycle steps, as suggested by co-expressed module signature genes encoding virus-induced and DNA-damage response factors. Alternatively, although we detected little evidence of polyadenylated HIV RNA expression in HIV-DNA⁺ cells, it remains possible that components of infecting HIV virions, or HIV expression products whose transcripts went undetected in our sequencing (owing to transient expression or limited method sensitivity), might actively reprogram host cell gene expression. Future studies elucidating such mechanisms may yield new targets for HIV cure strategies.

Our findings in this study have several limitations. First, owing to technical challenges, we sorted and sequenced pools of HIV-DNA⁺ cell transcriptomes without distinguishing between intact and defective HIV genomes³¹. As a result, technical advances in FIND-seq and/or new technologies will be required to define how the transcriptomic signatures identified here are distributed among individual cells. Analysis of HIV-DNA⁺ cells at the single-cell level will avoid dilution of signatures from reservoir subpopulations, thereby refining and extending the findings from this study. Single-cell transcriptomic analyses that distinguish between intact and defective HIV may also clarify whether HIV silencing signatures arise strictly by selection within translation-competent reservoirs, or whether these signatures can arise even when the infecting virus genome has acquired lethal mutations during reverse transcription. Second, although many of the transcriptomic signature genes identified here have well-defined roles in regulating HIV gene expression, cell survival or cell proliferation, the roles of other genes in HIV persistence will require further study. Those signature genes that have RNA-processing functions but have not previously been linked with HIV replication will be of particular interest, as some of these could contribute to post-transcriptional regulation of HIV gene expression while others might serve only as markers of infected cells. Third, our findings address neither the durability of transcriptomic signature expression within each infected cell nor the distribution of cells expressing signature genes across diverse tissue compartments, raising important questions about reservoir cell dynamics that impact the development of HIV cure strategies. Fourth, as our study included a small number of participants, it is possible that larger FIND-seq studies performed in diverse participant populations and incorporating technical improvements to increase the recovery of high-quality data will reveal signatures that were not detected here. Finally, it is important to acknowledge that the barriers to HIV cure under ART may include virus reservoirs outside the memory CD4 T cell pool^{60,61,62}.

Notwithstanding these limitations, our findings highlight two parallel but complementary paths in translational and basic research towards an HIV cure. The first is an increased emphasis on in vivo studies targeting the full range of mechanisms that both maintain HIV quiescence and prevent the death of HIV-infected cells. The approaches taken may include synergistic combinations of LRAs targeting diverse HIV transcriptional and translational blocks, paired with therapies that potentiate physiological CD4 T cell death. However, as the complexity of therapeutic combinations increases, their potential for significant toxicity may become a growing concern. Thus, the second path forward is an ongoing effort to define gene expression patterns within HIV-infected cellular reservoirs and to understand their mechanistic basis. The intent is for these approaches to reveal how HIV silencing, cell survival and cell proliferation programs come to be expressed among the diverse memory CD4 T cells present in vivo, therefore generating additional insights that may be translated to effective and safe HIV-cure-directed therapies.

Methods

Study participants

Recruitment of study participants with HIV was performed in compliance with relevant ethical regulations under the IRB-approved SCOPE protocol (NCT00187512) at San Francisco General Hospital. Participants were enrolled from the SCOPE cohort on the basis of sample availability at the time of study, without use of sample size calculations, blinding or randomization. Demographic and clinical laboratory data were collected at San Francisco General Hospital and are reported in Supplementary Table 1. All of the participants provided informed consent before study. Prescreening of participant samples to ensure adequate numbers of HIV-DNA⁺ memory CD4 T cells for FIND-seq analysis was performed in parallel sample aliquots using fluorescence-assisted clonal amplification⁶³.

Cell lines

Jurkat human T cells (TIB-152, ATCC), HIV-DNA⁺ J-Lat full-length human T cells (clone 6.3, ARP-9846)⁶⁴ and Raji human B cells (CCL-86, ATCC) were cultured in Gibco RPMI Medium 1640 (Thermo Fisher Scientific, 11875093) with penicillin and streptomycin (Thermo Fisher Scientific, 15140122) and 10% fetal bovine serum (FBS). Mouse fibroblasts (NIH/3T3, CRL-1658, ATCC) were cultured in Dulbecco’s modified Eagle’s medium (DMEM) with penicillin and streptomycin (Thermo Fisher Scientific, 15140122) and 10% FBS. Before use, 3T3 cells were dissociated using 0.25% trypsin-EDTA (Thermo Fisher Scientific, 25200-072) and neutralized in DMEM with 10% FBS. Cell lines were used without authentication or mycoplasma contamination testing.

Fabrication of microfluidic devices

Standard photolithography techniques were used to fabricate microfluidic devices at the Harvard Medical School Microfluidics Facility. Silicon wafers were spin-coated with SU-8 2025/2050 photoresist (Kayaku Advanced Materials) and ultraviolet-patterned using a mask aligner. After developing, the wafers were baked overnight and used as master moulds for soft-lithography. In brief, the PDMS prepolymer and curing agent were mixed by hand at a ratio of 10 to 1 (Momentive, RTV615), degassed for at least 1 h, poured onto the mould and degassed until no bubbles remained. PDMS was baked overnight at 65 °C before holes were punched using a 0.75 mm biopsy punch and bonded to a glass slide (75 × 50 × 1.0 mm, Thermo Fisher Scientific, 12–550C) with a plasma bonder (Technics Plasma Etcher 500-II). Bonded devices were made hydrophobic with Aquapel with a 30 s contact time, flushed with HFE-7500, purged with air and baked for at least 1 h before use.

Cell line validation studies

Cells were washed twice with Hanks’ balanced salt solution (HBSS, no calcium, no magnesium, Thermo Fisher Scientific, 14170112) and then counted, mixed (mouse:human 1:1; J-Lat:Raji 1:100), and resuspended in HBSS containing 18% OptiPrep Density Gradient Medium (Sigma-Aldrich) for FIND-seq. For standard RNA-seq studies performed in parallel, aliquots of 5 × 10⁴ cells were lysed in RNAzol RT (Molecular Research Center) and stored at −80 °C until subsequent total RNA extraction according to the manufacturer’s instructions. Whole-transcriptome cDNA was then generated from total RNA by reverse transcription using 6 mM MgCl₂, 1 M betaine, 7.5% PEG-8000, 1 mM dNTP, 2 U µl⁻¹ Maxima H-minus reverse transcriptase (Thermo Fisher Scientific, EP0753), 0.5 U µl⁻¹ RNase inhibitor (Lucigen, NxGen) and 2 µM SMART TSO (AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG). This cDNA was purified using AMPure XP beads (Beckman Coulter), and was then processed for WTA by PCR, with library preparation as previously described⁶⁵. FIND-seq sample processing and library preparation were performed as described below. The correlation between the DGE results from standard RNA-seq and FIND-seq was analysed using stat_cor (method = “pearson”) in R (v.4.1.0). The results from the J-Lat:Raji mixing study were compared with published transcriptomic signatures of CD4 T cells and B cells⁶⁶ using GSEA.

PBMC processing for FIND-seq

Approximately 20–30 million cryopreserved peripheral blood mononuclear cells (PBMCs) from each study participant were used for FIND-seq. Cryopreserved PBMC suspensions were thawed in a 37 °C water bath, washed in prewarmed RPMI with 10% FBS, and sedimented by centrifugation at 300 rpm (Sorvall Legend XT). Untouched memory CD4 T cells were then isolated by magnetic-column-based negative selection (Miltenyi, 130-091-893). Cells were counted manually with a haemocytometer using Trypan blue, and aliquots of 5 × 10⁴ cells were lysed and stored in RNAzol RT.

FIND-seq

FIND-seq was performed as described previously³⁰. In brief, four syringes were prepared for microfluidic cell encapsulation: lysis buffer, agarose, cells and oil. The lysis buffer consisted of 20 mM Tris-HCl pH 7.5, 1,000 mM LiCl, 1% LiDS, 10 mM EDTA, 10 mM DTT and 0.4 µg µl⁻¹ proteinase K. Conjugated agarose-dT was heated to 95 °C for 1 h before use and was kept heated throughout the run using a custom syringe heater. A 10 ml syringe was loaded with oil (Bio-Rad, 186–3005) for droplet generation. All of the syringes were connected to the microfluidic device using PE/2 tubing (Scientific Commodities, BB31695-PE/2). To make droplets, pumps were run at 600 μl h⁻¹ (cell mixture), 1,200 μl h⁻¹ (agarose), 600 μl h⁻¹ (lysis buffer), and 5,000 μl h⁻¹ (oil) using a bubble-triggered drop generator⁶⁷. Air was controlled to break the jet and generate 53–55 µm droplets. After lysis at 55 °C for 2 h, droplets were cooled at 4 °C overnight to allow agarose gelation. Solid agarose microspheres (beads) were removed from the oil using a drop-breaking procedure. All of the steps were performed at 4 °C to prevent dissociation of mRNA from the poly(T) oligonucleotides. The beads were removed from the oil and washed five times. For each wash, the beads were incubated in wash buffer for 5 min on ice, centrifuged at 4,700 rpm for 10 min and aspirated before the next wash. Beads were first washed in wash buffer 1 containing 20 mM Tris-HCl pH 7.5, 500 mM LiCl, 0.1% LiDS and 0.1 mM EDTA. Next, the beads were washed twice with wash buffer 2 containing 20 mM Tris-HCl pH 7.5 and 500 mM NaCl. Finally, the beads were washed twice in 5× reverse transcription buffer containing 250 mM Tris-HCl pH 8.3, 375 mM KCl, 15 mM MgCl₂ and 50 mM DTT and filtered with a 100 µm cell strainer. 
The beads were resuspended in reverse transcription master mix to a final concentration of 6 mM MgCl₂, 1 M betaine, 7.5% PEG-8000, 1 mM dNTP, 2 U µl⁻¹ Maxima H-minus reverse transcriptase (Thermo Fisher Scientific, EP0753), 0.5 U µl⁻¹ RNase inhibitor (Lucigen, NxGen) and 2 µM SMART TSO (AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG). Reverse transcription was completed at 25 °C for 30 min, followed by 90 min at 42 °C. The tubes were mixed continuously with an inverter during all incubations. After reverse transcription, the beads were washed five times with 0.1% Pluronic in RNase/DNase-free water.

After reverse transcription, the cell occupancy of agarose beads was quantified by microscopy and successful reverse transcription was checked using WTA before continuing with bead reinjection and sorting. Agarose beads containing cellular genomes and transcriptomes were reinjected into droplets to perform single-cell HIV detection PCR. Beads were mixed with PCR reagents to achieve a final concentration of 1× TaqPath Mastermix (Thermo Fisher Scientific, A30866), PEG-6000 (0.5% (w/v)), Tween-20 (0.5% (w/v)), F-127 Pluronic (0.5% (w/v)), BSA (0.1 mg ml⁻¹), HIV gag forward primer (CACTGTGTTTAGCATGGTGTTT, 900 nM), HIV gag reverse primer (TCAGCCCAGAAGTAATACCCATGT, 900 nM) and HIV gag hydrolysis probe (CY5-ATTATCAGAAGGAGCCACCCCACAAGA-3′ Iowa Black RQ, 250 nM)⁶⁸. To generate the final 1× reaction mixture concentration, beads were soaked in 2× PCR master mix on a shaker for 30 min in the dark. Next, the beads were centrifuged and loaded into a 3 ml syringe. The remaining 1× PCR master mix (supernatant) was loaded into a separate 3 ml syringe. Finally, the beads and 1× PCR master mix were reinjected in the microfluidic device to encapsulate the beads into 70 µm droplets⁶⁹. Agarose beads were re-encapsulated in droplets with about 70% loading, which is not accounted for in the detection efficiency calculation. Droplets were collected in 40 µl aliquots in PCR strips and thermocycled as follows: 88 °C for 10 min; then 55 cycles of 88 °C for 30 s and 60 °C for 1 min. After thermocycling, droplets were transferred into a 3 ml syringe for microfluidic sorting.

HIV-DNA⁺ and HIV-DNA⁻ droplets were sorted on the basis of the HIV PCR signal using a concentric sorter as previously described³². For HIV-DNA⁻-sorted samples, we sorted 100 cell equivalents based on the number of genomes per hydrogel bead determined previously, collecting a mixture of HIV-DNA⁻ cell droplets and cell-free droplets. For HIV-DNA⁺-sorted samples, we sorted aliquots of 100 droplets. The sorter was run with the following flow rates: 180 μl h⁻¹ cell droplets, 6,000 μl h⁻¹ bias oil (HFE-7500), 250 μl h⁻¹ spacer oil (HFE-7500) and 3,500 μl h⁻¹ extra spacer oil (HFE-7500). To sort, the 2 M NaCl on-chip electrode was polarized using a high-voltage amplifier at 1,200 V, 4,000 Hz for 15 cycles with 120 μs delay. We sorted into 1.5 ml Eppendorf tubes, removed all but 20 µl of the oil, added 50 µl of distilled nuclease-free water and centrifuged the sample at 20,000g for 5 min, and then stored the samples at −80 °C.

Before performing WTA on sorted HIV-DNA⁺ droplets in each participant, we determined the WTA cycle number that was required to amplify transcriptome cDNA from 100 cells in that participant. Accordingly, we first performed WTA on HIV-DNA⁻-sorted sample aliquots. Sorted HIV-DNA⁻ sample aliquots (frozen at −80 °C) were heated to 60 °C on a heat block for 10 min, mixed carefully by pipette and centrifuged at 20,000g for 5 min. The aqueous layer was then transferred to PCR strips and a WTA PCR reaction was performed using the 1× KAPA HiFi Master mix (Roche, KK2601) and 0.4 μM Smart-seq2 primer (AAGCAGTGGTATCAACGCAGAGT). Sorted material was thermocycled as follows: 95 °C for 3 min; then 18–22 cycles of 98 °C for 15 s, 67 °C for 20 s and 68 °C for 4 min; then 72 °C for 5 min, with a 4 °C terminal hold. The WTA was performed at three different cycle numbers (18, 20 and 22 cycles). All reactions were subsequently purified using a 1.2:1 ratio of AMPure XP beads (Beckman Coulter), with the final elution performed in 20 µl of nuclease-free water. After WTA, the DNA yield was quantified using the Qubit 4 Fluorometer and DNA size distribution was assayed using a Bioanalyzer 2100 with High Sensitivity DNA chip. On the basis of these results, the HIV-DNA⁺-sorted samples were processed as above using the minimal cycle number required to achieve a concentration of greater than 2 ng µl⁻¹ in 20 µl of elution volume.
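The cycle-number selection rule can be sketched as follows. The function name and the pilot readings are hypothetical; only the strict 2 ng µl⁻¹ threshold and the three tested cycle numbers come from the text.

```python
def minimal_cycle_number(yields_ng_per_ul, threshold_ng_per_ul=2.0):
    """Return the smallest cycle number whose yield exceeds the threshold.

    yields_ng_per_ul maps WTA cycle number -> measured concentration (ng/µl)
    in the 20 µl elution. Returns None if no tested cycle number passes.
    """
    passing = [cycles for cycles, conc in yields_ng_per_ul.items()
               if conc > threshold_ng_per_ul]
    return min(passing) if passing else None


# Hypothetical pilot readings from HIV-DNA-negative aliquots of one participant.
pilot = {18: 0.6, 20: 2.4, 22: 8.9}
chosen = minimal_cycle_number(pilot)
print(chosen)
```

Returning None rather than the highest tested cycle number is a design choice of this sketch: it leaves the decision about untested cycle numbers to the operator.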

Sequencing and read preprocessing

Libraries were prepared from transcriptome material sorted by FIND-seq using the Nextera XT Library Preparation Kit with v2 indexes. Individual sample libraries were combined at equimolar amounts to produce a single library pool. The library was quantified using the KAPA SYBR FAST Universal qPCR Kit. The library concentration and fragment size distribution were confirmed using the Agilent Bioanalyzer 2100 with High Sensitivity DNA chip. The library was diluted and denatured in accordance with the Illumina MiSeq System Denature and Dilute Libraries Guide (document 15039740). Cell line libraries were sequenced on the Illumina MiSeq system in 2 × 75 bp runs, and the selected libraries were subsequently sequenced again on the Illumina HiSeq 4000 system in a 2 × 75 bp run, operated using the Illumina HiSeq Control Software (HCS) v.3.4.0. For samples from people with HIV, libraries were first pooled and run on the Illumina MiSeq system in a 2 × 75 bp run, then rebalanced and run on the Illumina HiSeq 4000 system in a 2 × 75 bp run. Raw sequencing data were converted to fastq format using the bcl2fastq2 script (v.2.20) from Illumina and the reads were demultiplexed using sample-specific indexes. The resulting fastq files were trimmed for quality, ambiguity and presence of read-through adapters using the ‘Trim reads’ tool with the default settings in CLC Genomics Workbench (GWB) v.21.0.3. The quality of the raw and trimmed reads was assessed using QC tools in GWB.

Participant sample data quality filtering

Owing to the abundance of HIV-DNA⁻ cells in samples from ART-treated people with HIV, HIV-DNA⁻ cells were sorted in multiple replicates. Sequencing data were generated from 53 HIV-DNA⁺ and HIV-DNA⁻ cell samples sorted by FIND-seq from 5 people with HIV. A prospective curation approach was used to exclude low-quality samples from downstream transcriptomic analysis. HIV-DNA⁻ sample quality was assessed according to the following parameters: (1) the total number of reads sequenced; (2) the percentage of intergenic and intronic reads; (3) the proportion of ribosomal RNA (rRNA) reads; and (4) the exonic fragment count (Supplementary Table 2). Samples that had a paired-end read count of less than 10⁶ and had >35% mapped intergenic reads were excluded. Furthermore, within each participant, HIV-DNA⁻ samples that differed qualitatively from other replicates by having lower exonic reads or higher rRNA content were removed. If all HIV-DNA⁻ samples were removed for a participant, that participant was excluded from further analysis. After the removal of 31 FIND-seq-sorted samples in this curation process, 22 HIV-DNA⁺ and HIV-DNA⁻ samples belonging to participants 2208, 2510 and 3209 remained (Supplementary Table 2).
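The numerical part of this curation can be sketched as below. The sketch reads the threshold rule literally (a sample is excluded when both conditions hold), uses hypothetical sample records, and does not attempt to encode the qualitative comparison between replicates, which relied on inspection.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    name: str              # hypothetical sample label
    read_pairs: int        # paired-end read count
    pct_intergenic: float  # percentage of mapped reads that are intergenic


def fails_threshold_filter(sample, min_read_pairs=1_000_000,
                           max_pct_intergenic=35.0):
    """True if the sample has < 10^6 read pairs and > 35% intergenic reads."""
    return (sample.read_pairs < min_read_pairs
            and sample.pct_intergenic > max_pct_intergenic)


# Hypothetical records, not values from Supplementary Table 2.
samples = [
    SampleQC("neg_rep1", 4_200_000, 22.0),
    SampleQC("neg_rep2", 600_000, 48.5),
    SampleQC("pos_rep1", 2_900_000, 31.0),
]
kept = [s.name for s in samples if not fails_threshold_filter(s)]
print(kept)
```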

Analysis pipeline testing

The transcriptomes of primary cell samples generated by FIND-seq showed high proportions of intronic and intergenic reads (Extended Data Fig. 4). We therefore performed a second, deeper sequencing of libraries from the J-Lat:Raji cell mixing study and tested whether bioinformatics pipelines that address coverage bias and/or genomic DNA contamination might mitigate the effects of these patterns on the gene expression results. In total, we evaluated nine different pipelines using control data from the J-Lat:Raji cell line mixing study. The details of each pipeline are found below; the default options and parameters were used for all tools unless otherwise noted. Reads were mapped against the GRCh38 (ENSEMBL v.100) reference with coding gene annotations only for all pipelines tested.

CLC Genomics Workbench

CLC Genomics Workbench (GWB) v.20 and v.21 (https://digitalinsights.qiagen.com/) were tested using the default settings for mapping and abundance estimation using the RNA-seq analysis tool. For DGE analysis in GWB v.21, the option to filter average expression before FDR correction was selected.

3′ tag counting

Raw reads were preprocessed and mapped using GWB v.21. As in a previous study⁷⁰, reads were mapped to the region within 1,500 bp from the 3′ end of the gene and expression values were calculated in GWB. Analysis of DGE was also performed in GWB.

Salmon with positional bias correction

Salmon v.1.3.0 was implemented as it includes an algorithm for transcript expression quantification that incorporates bias modelling to account for position specific and other biases that are commonly seen in RNA-seq data⁷¹. Read mapping generated from GWB v.20 was used as the input. Post-quantification analysis of DGE was performed using EdgeR (v.3.32.1)⁷² and DESeq2 (v.1.30.1)⁷³.

SeqMonk DNA contamination correction

We considered that relatively high intergenic read proportions in sorted samples might be due to library incorporation of the genomic DNA retained with each cell during FIND-seq. We therefore used the SeqMonk expression quantification (http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) pipeline v.1.47.2, which estimates and corrects count data for each transcript using the density of intergenic reads. Read mapping previously processed in GWB v.20 was used as the input. Analysis of DGE was performed in DESeq2. Expression quantification and DGE with or without DNA contamination correction (SeqMonk) were evaluated, and each was tested with or without automatic independent filtering (DESeq2).

Selection of the analysis pipeline

For each pipeline, transcriptome accuracy was assessed by comparing J-Lat:Raji FIND-seq mixing study DGE results with the DGE detected between J-Lat cells and the unsorted J-Lat:Raji mixture in standard RNA-seq. DEGs were considered as those with an absolute fold change of ≥1.5 and FDR ≤ 0.05. DEGs identified in standard RNA-seq but not in FIND-seq were considered to be false negatives (FN); those identified only after FIND-seq as false positives (FP); and those identified in both FIND-seq and standard RNA-seq as true positives (TP). Based on this, the sensitivity (or recall) as TP/(TP + FN) and positive predictive value (PPV) as TP/(TP + FP) for each analysis process were calculated (Supplementary Table 7).
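The sensitivity and PPV definitions above amount to simple set arithmetic on the two DEG lists. A minimal sketch follows; the gene labels are hypothetical placeholders.

```python
def sensitivity_and_ppv(reference_degs, test_degs):
    """Compare a test DEG list (FIND-seq) with a reference list (standard RNA-seq).

    Returns (sensitivity, PPV), where sensitivity = TP/(TP + FN) and
    PPV = TP/(TP + FP). A ratio with a zero denominator is returned as NaN.
    """
    reference_degs, test_degs = set(reference_degs), set(test_degs)
    tp = len(reference_degs & test_degs)  # found by both assays
    fn = len(reference_degs - test_degs)  # found by standard RNA-seq only
    fp = len(test_degs - reference_degs)  # found by FIND-seq only
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Hypothetical DEG lists.
rnaseq = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
findseq = {"GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"}
sens, ppv = sensitivity_and_ppv(rnaseq, findseq)
print(sens, ppv)
```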

GWB v.20 and v.21 yielded the highest combination of sensitivity and PPV. Pipelines that corrected for coverage bias and DNA contamination did not increase the sensitivity, and in several cases showed lower PPV. Although GWB v.20 had a higher PPV than GWB v.21, there were developments in the GWB v.21 transcriptome analysis pipeline that were anticipated to reduce noise in primary cell samples. Thus, the pipeline in GWB v.21 was selected for the analysis of participant samples.

DGE between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells

As described above, transcriptome data from FIND-seq-sorted material contained higher proportions of intronic and intergenic sequences than the standard RNA-seq data. These non-exonic sequences were also abundant in material that was subjected to only the hydrogel encapsulation and cDNA synthesis steps of FIND-seq, consistent with the requisite co-retention of cell genomic DNA with transcriptome material and with efficient nuclear lysis and capture of immature transcripts in our hydrogel-based workflow. Accordingly, after curating the participant samples on the basis of quality, differential expression using only exonic reads was performed (Supplementary Table 3). Using GWB v.21, a combined analysis was performed using the Wald test with Benjamini–Hochberg multiple-testing correction by defining DEGs between HIV-DNA⁺ and HIV-DNA⁻ samples using data from the three participants as biological replicates, while controlling for any interparticipant differences in expression. Moreover, a participant-specific analysis was performed by determining DEGs within each participant separately (Supplementary Table 4). The default settings for all other parameters for the differential expression for RNA-seq tool were used except for Filter on average expression for FDR correction, which was enabled for all analyses. Unless otherwise noted, cut-offs for statistical significance of DEGs were absolute fold change of ≥1.5 and FDR ≤ 0.05.
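The multiple-testing correction and DEG cut-offs can be sketched as follows. This is not the GWB implementation: the Wald test and the filter on average expression are not reproduced, the P values are hypothetical, and expressing the fold-change cut-off on the log₂ scale is an assumption made to avoid ambiguity about signed fold-change conventions.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (FDR), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_deg(log2_fold_change, fdr, min_fold_change=1.5, max_fdr=0.05):
    """Apply the cut-offs: absolute fold change >= 1.5 and FDR <= 0.05."""
    return abs(log2_fold_change) >= math.log2(min_fold_change) and fdr <= max_fdr


# Hypothetical raw P values for four genes.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(fdr)
```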

Euclidean distance calculation

Pairwise Euclidean distances between the curated samples were calculated using the dist function in R (v.4.1.0) using a matrix of counts per million mapped reads (CPM) gene expression values as input. For each sample of a given HIV DNA status group (that is, HIV-DNA⁺ or HIV-DNA⁻), average intragroup and intergroup distances to all other curated samples were calculated, with values plotted in GraphPad Prism (v.9.3.1). Statistical significance of distance differences between groups was calculated using Mann–Whitney U-tests.
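The intragroup and intergroup averaging can be sketched in Python as below (the analysis itself used the dist function in R). The sample names and two-gene CPM vectors are hypothetical, and the Mann–Whitney comparison is not included.

```python
import math


def _mean(values):
    return sum(values) / len(values) if values else float("nan")


def mean_group_distances(profiles, groups):
    """Average Euclidean distance from each sample to same-group and other-group samples.

    profiles: dict sample name -> list of CPM values (same gene order in all samples)
    groups:   dict sample name -> group label (for example "pos" or "neg")
    Returns dict sample name -> (mean intragroup distance, mean intergroup distance).
    """
    result = {}
    for name, vec in profiles.items():
        intra, inter = [], []
        for other, other_vec in profiles.items():
            if other == name:
                continue
            d = math.dist(vec, other_vec)
            (intra if groups[other] == groups[name] else inter).append(d)
        result[name] = (_mean(intra), _mean(inter))
    return result


# Hypothetical CPM vectors for two genes in four samples.
profiles = {"pos1": [0, 0], "pos2": [3, 4], "neg1": [6, 8], "neg2": [0, 8]}
groups = {"pos1": "pos", "pos2": "pos", "neg1": "neg", "neg2": "neg"}
distances = mean_group_distances(profiles, groups)
print(distances)
```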

Transcriptomic pathway expression differences between HIV-DNA⁺ and HIV-DNA⁻ cells

Ingenuity Pathway Analysis (Qiagen, summer release 2021) was used to identify enriched biological pathways (Supplementary Table 5) on the basis of DEG lists. For the combined analysis considering samples from different participants as biological replicates, DEGs with an absolute fold change of ≥1.5 and FDR ≤ 0.05 were used. For the participant-specific analysis, DEGs with a fold change of ≥2 and raw P ≤ 0.01 were used and pathways regulated in the same direction for all three participants were identified.

The directionality of enrichment of pathways for each analysis was determined from the z-score, which is calculated in Ingenuity Pathway Analysis to represent predicted relative pathway activity. The z-score for each pathway was calculated using the list of genes annotated to that pathway and meeting criteria for statistically significant differential expression between HIV-DNA⁺ and HIV-DNA⁻ cells. A simplified z-score was calculated as follows: Z = (N⁺ − N⁻)/√N, where N⁺ and N⁻ are the numbers of genes of which the direction of regulation is concordant or discordant, respectively, with predictions from the literature, and N = N⁺ + N⁻. A positive z-score implies activation of a pathway, whereas a negative z-score implies inhibition. Statistical significance of the enrichment of a pathway was determined using a right-tailed Fisher’s exact test as described previously⁷⁴. Networks of pathways identified as inhibited across participants and their corresponding genes were plotted using ClusterProfiler (v.4.1.1)⁷⁵.
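The simplified z-score can be written as a short function. The sketch assumes that N is the total number of differentially expressed genes in the pathway with a literature-predicted direction (concordant plus discordant); the gene counts in the example are hypothetical.

```python
import math


def simplified_z_score(n_concordant, n_discordant):
    """Simplified pathway z-score: (N+ - N-) / sqrt(N), assuming N = N+ + N-."""
    n = n_concordant + n_discordant
    if n == 0:
        raise ValueError("z-score is undefined when no genes have a predicted direction")
    return (n_concordant - n_discordant) / math.sqrt(n)


# Hypothetical pathway with 2 concordant and 7 discordant genes.
z = simplified_z_score(2, 7)
print(f"z = {z:.2f}")  # negative, which would be read as predicted inhibition
```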

WGCNA

Weighted gene co-expression network analysis⁷⁶ was performed in R using the WGCNA package (v.1.70) with a gene expression matrix of CPM values. Genes detected in <2 samples were excluded from analysis. The one-step automatic method was used for network construction and module detection. A soft thresholding power (β) of 6 was selected based on approximate scale-free topology using the function pickSoftThreshold. The co-expression network was built with a minimum module size of 30, reassignThreshold of 0 and mergeCutHeight of 0.25. The default values were used for the other parameters. Co-expressed modules of genes that correlated with HIV-DNA⁺ and HIV-DNA⁻ status were identified. Modules that were correlated with the traits with P ≤ 0.05 were considered to be significant. GO enrichment analysis for the genes belonging to the two WGCNA modules significantly correlated with cell HIV DNA status was performed using Enrichr (29 March 2021 release)^77,78. Enrichment analysis was performed using a Fisher’s exact test with Benjamini–Hochberg multiple-testing correction.
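Two of the steps above, the gene filter and soft thresholding with β = 6, can be illustrated with a minimal sketch. It is not the WGCNA package: it assumes that "detected" means CPM > 0 and that the network is unsigned (adjacency = |correlation|^β), it omits topological overlap, module detection and module merging, and it uses hypothetical gene labels and values.

```python
import math


def detected_in_enough_samples(cpm_values, min_samples=2):
    """Keep a gene only if it has CPM > 0 in at least min_samples samples."""
    return sum(1 for v in cpm_values if v > 0) >= min_samples


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either vector has zero variance."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def soft_threshold(correlation, beta=6):
    """Unsigned soft-thresholded adjacency for one gene pair."""
    return abs(correlation) ** beta


def adjacency(expr, beta=6):
    """expr: dict gene -> list of CPM values. Returns dict (gene1, gene2) -> adjacency."""
    genes = [g for g, values in expr.items() if detected_in_enough_samples(values)]
    adj = {}
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            adj[(g1, g2)] = soft_threshold(pearson_r(expr[g1], expr[g2]), beta)
    return adj


# Hypothetical expression matrix; GENE_C is detected in one sample and is dropped.
expr = {"GENE_A": [1, 2, 3, 4], "GENE_B": [2, 4, 6, 8], "GENE_C": [0, 0, 5, 0]}
adj = adjacency(expr)
print(adj)
```

Raising the correlation to a power suppresses weak correlations much more than strong ones, which is why a correlation of 0.5 contributes an adjacency of only about 0.016 at β = 6.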

Analysis of HIV reads

To identify sequence reads representing HIV RNA, we created a combined human (GRCh38, ENSEMBL v.100) and HIV (GenBank: KT284371) reference. The HIV sequence for this reference was derived from the clade B representative in the 2016 LANL HIV sequence compendium, with deletions in the LTR regions replaced by the corresponding sequence and annotations from HXB2CG (GenBank: K03455 M38432), and with masking of the gag amplicon detected in FIND-seq. Reads were aligned to the combined reference using the Map reads to reference tool with the default settings in GWB (v.21). Counts were obtained for reads extracted from mapping to the combined reference. Mapped reads were visualized using GWB and Integrated Genome Viewer (v.2.11.9).

The frequencies of sequence variants in HIV reads compared to the reference sequence were examined to assess the presence of multiple virus sequences. To do this, a consensus of aligned sequences was generated and reads mapping to the HIV genome were extracted. These reads were then mapped against the consensus reference sequence. The resulting mapping was improved by local realignment in areas containing insertions and deletions (indels). Variants were then identified using the ‘low frequency’ variant caller in GWB v.21 with a minimum coverage of 2, minimum count of 1, inclusion of broken reads and without relative read direction filter applied. The default options for the other parameters were used. The list of variants obtained was manually inspected and filtered to remove (1) those with a frequency above 50% (thus representing the predominant sequence rather than a minor variant) and (2) those with read count = 1 or that represented presumptive technical insertions in homopolymeric regions.
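The two exclusion rules applied to the variant list can be sketched as below. In the study this step was a manual inspection; the record fields, including the homopolymer flag, are hypothetical stand-ins for what a reviewer would judge by eye, and the example variants are not study data.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    position: int          # coordinate on the consensus reference
    kind: str              # "SNV", "insertion" or "deletion"
    frequency_pct: float   # variant frequency among covering reads
    read_count: int        # number of reads supporting the variant
    in_homopolymer: bool   # hypothetical flag set during inspection


def keep_minor_variant(v):
    """Apply the exclusion rules described for the low-frequency variant list."""
    if v.frequency_pct > 50:
        return False  # predominant sequence rather than a minor variant
    if v.read_count == 1:
        return False  # supported by a single read
    if v.kind == "insertion" and v.in_homopolymer:
        return False  # presumptive technical insertion
    return True


# Hypothetical variant calls.
variants = [
    Variant(812, "SNV", 12.5, 3, False),
    Variant(1040, "SNV", 85.0, 17, False),
    Variant(1533, "insertion", 20.0, 4, True),
    Variant(2210, "SNV", 33.3, 1, False),
]
kept_positions = [v.position for v in variants if keep_minor_variant(v)]
print(kept_positions)
```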

Moreover, the Sequences from HIV Easily Reconstructed (SHIVER)⁷⁹ pipeline (v.1.5.8) was tested to create a hybrid reference from de novo assembled contigs of HIV reads for individual samples and closely matched reference sequences. In brief, reads were mapped to the GRCh38 (ENSEMBL v.100) reference using the Map reads to reference tool in GWB v.21 with stringent settings, with the length fraction and similarity fraction parameters set to 0.8. Unmapped reads were then collected and paired reads among them were processed using the de novo assembly tool in GWB (v.21) with the default settings. We also tested the iterative virus assembler (IVA; v.1.0.11) to perform de novo assembly from the unmapped reads using the default settings, but did not recover HIV contigs using this tool. Contig sequences obtained from GWB (v.21) were exported in fasta format and were processed using the SHIVER pipeline with the default settings. A clade B HIV genome obtained from the 2016 LANL sequence compendium was used as a reference.

Enrichment analysis of WGCNA modules in defined CD4 T cell subsets

Viably cryopreserved PBMCs from ART-treated people with HIV were thawed and stained for FACS with LIVE/DEAD Aqua stain (Molecular Probes) and the following antibodies (with the indicated dilutions): CXCR5-Alexa Fluor 488 (1:7; BD), CCR5-Cy7PE (1:10; BD), CD27-Cy5PE (1:10; Beckman Coulter), CD45RO-PE-Texas Red (1:12; Beckman Coulter), CD14-PE (1:80; BD), CD11c-PE (1:40; BD), CD3-H7APC (1:5; BD), CCR7-Alexa Fluor 700 (1:8; BD), CD20-APC (1:5; BD), CD56-APC (1:10; BD), T cell receptor gamma delta (TCR-γδ)-APC (1:5; BD), PD1-Brilliant Violet 711 (1:10; BioLegend), CD8-Qdot 655 (1:200; Invitrogen), CD4-Qdot 605 (1:200; Invitrogen), CD57-Qdot 585 (1:50; Invitrogen) and CCR6-Brilliant Violet 421 (1:10; BD). Stained samples were sorted into CD4 T cell subsets using the FACSAria (BD) system by first gating for single cells that were CD3⁺, Aqua^low and negative for CD11c, CD14, CD20, CD56 and TCR-γδ. The remaining events that were CD4⁺ and CD8⁻ were then collected as naive (CD27⁺CD45RO⁻) or memory CD4 T cell subsets (see memory subset definitions in Extended Data Table 2). Sorted cell subsets were processed for total RNA extraction and whole-transcriptome sequencing as described previously⁶³. The resulting data were processed using the standard pipeline in GWB v.21 using the human reference (GRCh38, ENSEMBL v. 100) with only the coding gene annotations. The resulting CPM values were exported and provided as an input to GSEA (v.4.2.3)^80,81. Enrichment of module 5 and 28 signatures (separated into genes upregulated and downregulated between HIV-DNA⁺ and HIV-DNA⁻ cells) was identified in transcriptome data from each memory CD4 T cell subset (with data from the naive CD4 T cell subset serving as a reference). GSEA was run using the default settings for all of the parameters.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Transcriptome sequencing data from human study participants were deposited with controlled access in the database of Genotypes and Phenotypes (dbGaP; phs003095.v1.p1). Transcriptome sequencing data from cell line experiments were deposited in the NCBI Sequencing Read Archive (SRA; accessions PRJNA819479 and PRJNA893817). Gene sets M3077 and M3076 analysed in Extended Data Fig. 2 are available online (https://www.gsea-msigdb.org/). Source data are provided with this paper.

References

Finzi, D. et al. Identification of a reservoir for HIV-1 in patients on highly active antiretroviral therapy. Science 278, 1295–1300 (1997).
Siliciano, J. D. & Siliciano, R. F. In vivo dynamics of the latent reservoir for HIV-1: new insights and implications for cure. Annu. Rev. Pathol. 17, 271–294 (2022).
Wong, J. K. et al. Recovery of replication-competent HIV despite prolonged suppression of plasma viremia. Science 278, 1291–1295 (1997).
Procopio, F. A. et al. A novel assay to measure the magnitude of the inducible viral reservoir in HIV-infected individuals. EBioMedicine 2, 874–883 (2015).
Barboric, M. et al. Tat competes with HEXIM1 to increase the active pool of P-TEFb for HIV-1 transcription. Nucleic Acids Res. 35, 2003–2012 (2007).
Bosque, A. & Planelles, V. Induction of HIV-1 latency and reactivation in primary memory CD4⁺ T cells. Blood 113, 58–65 (2009).
Ghose, R., Liou, L. Y., Herrmann, C. H. & Rice, A. P. Induction of TAK (cyclin T1/P-TEFb) in purified resting CD4⁺ T lymphocytes by combination of cytokines. J. Virol. 75, 11336–11343 (2001).
Kinoshita, S. et al. The T cell activation factor NF-ATc positively regulates HIV-1 replication and gene expression in T cells. Immunity 6, 235–244 (1997).
Nabel, G. & Baltimore, D. An inducible transcription factor activates expression of human immunodeficiency virus in T cells. Nature 326, 711–713 (1987).
Sedore, S. C. et al. Manipulation of P-TEFb control machinery by HIV: recruitment of P-TEFb from the large form by Tat and binding of HEXIM1 to TAR. Nucleic Acids Res. 35, 4347–4358 (2007).
Van Lint, C., Emiliani, S., Ott, M. & Verdin, E. Transcriptional activation and chromatin remodeling of the HIV-1 promoter in response to histone acetylation. EMBO J. 15, 1112–1120 (1996).
Archin, N. M. et al. Interval dosing with the HDAC inhibitor vorinostat effectively reverses HIV latency. J. Clin. Invest. 127, 3126–3135 (2017).
Archin, N. M. et al. Administration of vorinostat disrupts HIV-1 latency in patients on antiretroviral therapy. Nature 487, 482–485 (2012).
Elliott, J. H. et al. Short-term administration of disulfiram for reversal of latent HIV infection: a phase 2 dose-escalation study. Lancet HIV 2, e520–e529 (2015).
Rasmussen, T. A. et al. Panobinostat, a histone deacetylase inhibitor, for latent-virus reactivation in HIV-infected patients on suppressive antiretroviral therapy: a phase 1/2, single group, clinical trial. Lancet HIV 1, e13–e21 (2014).
Sogaard, O. S. et al. The depsipeptide romidepsin reverses HIV-1 latency in vivo. PLoS Pathog. 11, e1005142 (2015).
Banga, R. et al. PD-1⁺ and follicular helper T cells are responsible for persistent HIV-1 transcription in treated aviremic individuals. Nat. Med. 22, 754–761 (2016).
Banga, R. et al. Blood CXCR3⁺ CD4 T cells are enriched in inducible replication competent HIV in aviremic antiretroviral therapy-treated individuals. Front. Immunol. 9, 144 (2018).
Brenchley, J. M. et al. T-cell subsets that harbor human immunodeficiency virus (HIV) in vivo: implications for HIV pathogenesis. J. Virol. 78, 1160–1168 (2004).
Chomont, N. et al. HIV reservoir size and persistence are driven by T cell survival and homeostatic proliferation. Nat. Med. 15, 893–900 (2009).
Douek, D. C. et al. HIV preferentially infects HIV-specific CD4⁺ T cells. Nature 417, 95–98 (2002).
Gosselin, A. et al. Peripheral blood CCR4⁺CCR6⁺ and CXCR3⁺CCR6⁺CD4⁺ T cells are highly permissive to HIV-1 infection. J. Immunol. 184, 1604–1616 (2010).
Hiener, B. et al. Identification of genetically intact HIV-1 proviruses in specific CD4⁺ T cells from effectively treated participants. Cell Rep. 21, 813–822 (2017).
Lee, G. Q. et al. Clonal expansion of genome-intact HIV-1 in functionally polarized Th1 CD4⁺ T cells. J. Clin. Invest. 127, 2689–2696 (2017).
Mendoza, P. et al. Antigen-responsive CD4⁺ T cell clones contribute to the HIV-1 latent reservoir. J. Exp. Med. 217, e20200051 (2020).
Simonetti, F. R. et al. Antigen-driven clonal selection shapes the persistence of HIV-1-infected CD4⁺ T cells in vivo. J. Clin. Invest. 131, e145254 (2021).
Yukl, S. A. et al. Differences in HIV burden and immune activation within the gut of HIV-positive patients receiving suppressive antiretroviral therapy. J. Infect. Dis. 202, 1553–1561 (2010).
Collora, J. A. et al. Single-cell multiomics reveals persistence of HIV-1 in expanded cytotoxic T cell clones. Immunity 55, 1013–1031 (2022).
Weymar, G. H. J. et al. Distinct gene expression by expanded clones of quiescent memory CD4⁺ T cells harboring intact latent HIV-1 proviruses. Cell Rep. 40, 111311 (2022).
Clark, I. C. et al. Identification of astrocyte regulators by nucleic acid cytometry. Nature https://doi.org/10.1038/s41586-022-05613-0 (2023).
Ho, Y. C. et al. Replication-competent noninduced proviruses in the latent reservoir increase barrier to HIV-1 cure. Cell 155, 540–551 (2013).
Clark, I. C., Thakur, R. & Abate, A. R. Concentric electrodes improve microfluidic droplet sorting. Lab Chip 18, 710–713 (2018).
Herroeder, S. et al. Guanine nucleotide-binding proteins of the G12 family shape immune functions by controlling CD4⁺ T cell adhesiveness and motility. Immunity 30, 708–720 (2009).
Rao, S. et al. Host mRNA decay proteins influence HIV-1 replication and viral gene expression in primary monocyte-derived macrophages. Retrovirology 16, 3 (2019).
Ding, D. et al. Involvement of histone methyltransferase GLP in HIV-1 latency through catalysis of H3K9 dimethylation. Virology 440, 182–189 (2013).
Wang, J. et al. Retinoblastoma binding protein 4 represses HIV-1 long terminal repeat-mediated transcription by recruiting NR2F1 and histone deacetylase. Acta Biochim. Biophys. Sin. 51, 934–944 (2019).
Cismasiu, V. B. et al. BCL11B is a general transcriptional repressor of the HIV-1 long terminal repeat in T lymphocytes through recruitment of the NuRD complex. Virology 380, 173–181 (2008).
Gao, R. et al. Competition between PAF1 and MLL1/COMPASS confers the opposing function of LEDGF/p75 in HIV latency and proviral reactivation. Sci. Adv. 6, eaaz8411 (2020).
Malcolm, T., Kam, J., Pour, P. S. & Sadowski, I. Specific interaction of TFII-I with an upstream element on the HIV-1 LTR regulates induction of latent provirus. FEBS Lett. 582, 3903–3908 (2008).
Yang, X., Chen, Y. & Gabuzda, D. ERK MAP kinase links cytokine signals to activation of latent HIV-1 infection by stimulating a cooperative interaction of AP-1 and NF-κB. J. Biol. Chem. 274, 27981–27988 (1999).
Kino, T., Slobodskaya, O., Pavlakis, G. N. & Chrousos, G. P. Nuclear receptor coactivator p160 proteins enhance the HIV-1 long terminal repeat promoter by bridging promoter-bound factors and the Tat-P-TEFb complex. J. Biol. Chem. 277, 2396–2405 (2002).
Bres, V., Gomes, N., Pickle, L. & Jones, K. A. A human splicing factor, SKIP, associates with P-TEFb and enhances transcription elongation by HIV-1 Tat. Genes Dev. 19, 1211–1226 (2005).
Gebhardt, A. et al. The alternative cap-binding complex is required for antiviral defense in vivo. PLoS Pathog. 15, e1008155 (2019).
Lykke-Andersen, J., Shu, M. D. & Steitz, J. A. Communication of the position of exon-exon junctions to the mRNA surveillance machinery by the protein RNPS1. Science 293, 1836–1839 (2001).
Cobos Jimenez, V. et al. G3BP1 restricts HIV-1 replication in macrophages and T-cells by sequestering viral RNA. Virology 486, 94–104 (2015).
Csosz, E. et al. Analysis of networks of host proteins in the early time points following HIV transduction. BMC Bioinform. 20, 398 (2019).
Mei, Y., Hahn, A. A., Hu, S. & Yang, X. The USP19 deubiquitinase regulates the stability of c-IAP1 and c-IAP2. J. Biol. Chem. 286, 35380–35387 (2011).
Jin, J. et al. LRRFIP2 negatively regulates NLRP3 inflammasome activation in macrophages by promoting Flightless-I-mediated caspase-1 inhibition. Nat. Commun. 4, 2075 (2013).
Wernimont, S. A. et al. Contact-dependent T cell activation and T cell stopping require talin1. J. Immunol. 187, 6256–6267 (2011).
Pallikkuth, S. et al. Peripheral T follicular helper cells are the major HIV reservoir within central memory CD4 T cells in peripheral blood from chronically HIV-infected individuals on combination antiretroviral therapy. J. Virol. 90, 2718–2728 (2015).
Cohn, L. B. et al. Clonal CD4⁺ T cells in the HIV-1 latent reservoir display a distinct gene profile upon reactivation. Nat. Med. 24, 604–609 (2018).
Kuniholm, J. et al. Intragenic proviral elements support transcription of defective HIV-1 proviruses. PLoS Pathog. 17, e1009982 (2021).
Blazkova, J. et al. Effect of histone deacetylase inhibitors on HIV production in latently infected, resting CD4⁺ T cells from infected individuals receiving effective antiretroviral therapy. J. Infect. Dis. 206, 765–769 (2012).
Falcinelli, S. D. et al. Combined noncanonical NF-kappaB agonism and targeted BET bromodomain inhibition reverse HIV latency ex vivo. J. Clin. Invest. 132, e157281 (2022).
Grau-Exposito, J. et al. Latency reversal agents affect differently the latent reservoir present in distinct CD4⁺ T subpopulations. PLoS Pathog. 15, e1007991 (2019).
Sannier, G. et al. Combined single-cell transcriptional, translational, and genomic profiling reveals HIV-1 reservoir diversity. Cell Rep. 36, 109643 (2021).
Yukl, S. A. et al. HIV latency in isolated patient CD4⁺ T cells may be due to blocks in HIV transcriptional elongation, completion, and splicing. Sci. Transl. Med. 10, eaap9927 (2018).
Baxter, A. E. et al. Single-cell characterization of viral translation-competent reservoirs in HIV-infected individuals. Cell Host Microbe 20, 368–380 (2016).
Ren, Y. et al. BCL-2 antagonism sensitizes cytotoxic T cell-resistant HIV reservoirs to elimination ex vivo. J. Clin. Invest. 130, 2542–2559 (2020).
Cochrane, C. R. et al. Intact HIV proviruses persist in the brain despite viral suppression with ART. Ann. Neurol. 92, 532–544 (2022).
Heesters, B. A. et al. Follicular dendritic cells retain infectious HIV in cycling endosomes. PLoS Pathog. 11, e1005285 (2015).
Pinzone, M. R. et al. Naive infection predicts reservoir diversity and is a formidable hurdle to HIV eradication. JCI Insight 6, e150794 (2021).
Boritz, E. A. et al. Multiple origins of virus persistence during natural control of HIV infection. Cell 166, 1004–1015 (2016).
Jordan, A., Bisgrove, D. & Verdin, E. HIV reproducibly establishes a latent infection after acute infection of T cells in vitro. EMBO J. 22, 1868–1877 (2003).
Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
Hutcheson, J. et al. Combined deficiency of proapoptotic regulators Bim and Fas results in the early onset of systemic autoimmunity. Immunity 28, 206–217 (2008).
Yan, Z. H., Clark, I. C. & Abate, A. R. Rapid encapsulation of cell and polymer solutions with bubble-triggered droplet generation. Macromol. Chem. Phys. 218, 1600297 (2017).
Pasternak, A. O. et al. Highly sensitive methods based on seminested real-time reverse transcription-PCR for quantitation of human immunodeficiency virus type 1 unspliced and multiply spliced RNA and proviral DNA. J. Clin. Microbiol. 46, 2206–2211 (2008).
Clark, I. C. & Abate, A. R. Microfluidic bead encapsulation above 20 kHz with triggered drop formation. Lab Chip 18, 3598–3605 (2018).
Sigurgeirsson, B., Emanuelsson, O. & Lundeberg, J. Sequencing degraded RNA addressed by 3′ tag counting. PLoS ONE 9, e91851 (2014).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Kramer, A., Green, J., Pollard, J. Jr & Tugendreich, S. Causal analysis approaches in Ingenuity Pathway Analysis. Bioinformatics 30, 523–530 (2014).
Wu, T. et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation 2, 100141 (2021).
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 9, 559 (2008).
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 14, 128 (2013).
Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
Wymant, C. et al. Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver. Virus Evol. 4, vey007 (2018).
Mootha, V. K. et al. PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267–273 (2003).
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

Acknowledgements

We thank the participants in this study and N. Morgan for discussions. J-Lat full-length cells, clone 6.3, contributed by E. Verdin, were obtained through the NIH HIV Reagent Program, Division of AIDS, NIAID. This work was supported by NIH U01 AI129206-01 (to E.A.B. and A.R.A.), the CZ Biohub (to A.R.A.), the American Foundation for AIDS Research investment grant 109537-61-RGRL (to E.A.B.), the Delaney AIDS Research Enterprise (DARE) to Find a Cure 1UM1AI126611-01 (to S.G.D.), the NIH Office of AIDS Research Strategic Fund (to E.A.B.), the Intramural AIDS Targeted Antiviral Program (IATAP; to D.C.D.), the NIH Intramural Research Program (to D.C.D. and E.A.B.) and NIH R01AI149699 (to F.J.Q. and A.R.A.). I.C.C. is supported by a transition grant from the NIH (K22AI152644).

Author information

These authors contributed equally: Adam R. Abate, Eli A. Boritz

Authors and Affiliations

Department of Bioengineering and Therapeutic Sciences, School of Pharmacy, University of California, San Francisco, San Francisco, CA, USA
Iain C. Clark & Adam R. Abate
Ann Romney Center for Neurologic Diseases, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Iain C. Clark, Shravan Thaploo, Michael A. Wheeler & Francisco J. Quintana
Department of Bioengineering, California Institute for Quantitative Biosciences, QB3, University of California, Berkeley, Berkeley, CA, USA
Iain C. Clark & Sakshi Shah
Virus Persistence and Dynamics Section, Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
Prakriti Mudvari, Samuel Smith, Mohammad Abu-Laban, Mehdi Hamouda, Marc Theberge, Sung Hee Ko, Liliana Pérez, Divya Kilam, Saami Zakaria, Sally Choi & Eli A. Boritz
Human Immunology Section, Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
Daniel G. Bunis, James S. Lee, Samuel Darko, Amy R. Henry & Daniel C. Douek
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Michael A. Wheeler & Francisco J. Quintana
Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
Rebecca Hoh & Steven G. Deeks
Department of Chemical and Biomolecular Engineering, California Institute for Quantitative Biosciences, QB3, University of California, Berkeley, Berkeley, CA, USA
Salwan Butrus

Authors

Iain C. Clark
Prakriti Mudvari
Shravan Thaploo
Samuel Smith
Mohammad Abu-Laban
Mehdi Hamouda
Marc Theberge
Sakshi Shah
Sung Hee Ko
Liliana Pérez
Daniel G. Bunis
James S. Lee
Divya Kilam
Saami Zakaria
Sally Choi
Samuel Darko
Amy R. Henry
Michael A. Wheeler
Rebecca Hoh
Salwan Butrus
Steven G. Deeks
Francisco J. Quintana
Daniel C. Douek
Adam R. Abate
Eli A. Boritz

Contributions

Conceptualization: D.C.D., A.R.A. and E.A.B. Methodology: I.C.C., A.R.A. and E.A.B. FIND-seq analysis of HIV-DNA⁺ cells: I.C.C., S.T., M.A.W. and S. Shah. Molecular optimization studies: I.C.C., S.T., S. Smith, M.H., S.H.K., D.G.B., J.S.L., D.K., S.Z., S.C. and S.D. Library preparation and sequencing: S. Smith and A.R.H. Flow cytometry analysis of CD4 T cell subsets: M.A.-L., M.T. and L.P. Bioinformatic analysis: P.M., I.C.C. and S.B. Resources: I.C.C., R.H., S.G.D., F.J.Q., D.C.D., A.R.A. and E.A.B. Manuscript preparation: I.C.C., P.M. and E.A.B. Supervision: F.J.Q., D.C.D., A.R.A. and E.A.B.

Corresponding authors

Correspondence to Adam R. Abate or Eli A. Boritz.

Ethics declarations

Competing interests

I.C.C., A.R.A. and E.A.B. have prepared a provisional patent application for submission related to the technology used in this study.

Peer review

Peer review information

Nature thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 FIND-seq workflow details and sorted transcriptome purity.

(a–d) Transcriptome recovery and HIV gag DNA detection steps in FIND-seq, including (a,b) capture of single-cell genomes and transcriptomes in agarose, (c) reverse-transcription of polyadenylated RNA in agarose following oil removal and washes, generating single-cell agarose hydrogel beads with retained genomic DNA and covalently linked whole transcriptome cDNA that incorporates a template-switch oligonucleotide (TSO) for subsequent WTA, and (d) HIV detection PCR using gag primers and hydrolysis probe after hydrogel re-encapsulation on Device 2 (Fig. 1). (e) Diagram of whole transcriptome amplification and library preparation from material sorted by FIND-seq, with representative microelectrophoresis traces. (f,g) Purity of transcriptome material sorted by FIND-seq. (f) HIV DNA⁺ J-Lat human T cells and HIV DNA⁻ 3T3 mouse fibroblasts were mixed and subjected to FIND-seq followed by whole transcriptome sequencing. (g) Percentages of transcriptome reads from pure input cell populations, the 1:1 mixture of 3T3 and J-Lat, and HIV DNA⁺ cell transcriptomes sorted by FIND-seq that aligned unambiguously to human (red) or mouse (black) references. Results from a single experiment are shown.

Extended Data Fig. 2 Validity of HIV DNA⁺ cell transcriptome sequencing after FIND-seq.

(a) HIV DNA⁺ J-Lat T cells and HIV DNA⁻ Raji B cells were mixed at a 1:100 ratio and then subjected to FIND-seq and whole transcriptome sequencing. In parallel, extracted RNA samples from J-Lat cells and Raji cells were subjected to standard RNA-seq processing for comparison. (b) Droplet cytometry plot representing sorting of HIV DNA⁺ cells from the J-Lat:Raji mixture. (c) Correlation of gene expression differences between HIV DNA⁺ cells and HIV DNA⁻ cells (determined after FIND-seq) with gene expression differences between J-Lat cells and Raji cells (determined after standard RNA-seq). Points are coloured according to concordance of statistical significance in differential gene expression (DGE) results between FIND-seq and standard RNA-seq. Purple denotes genes significantly different (p < 0.05) in both FIND-seq and standard RNA-seq; grey denotes genes not significantly different in either FIND-seq or standard RNA-seq; blue denotes genes with discordant significance between standard and FIND-seq samples. Pearson’s R and p values were calculated in R v4.1.0. (d) Volcano plot of genes that were differentially expressed between HIV DNA⁺ and HIV DNA⁻ cells after FIND-seq, coloured by expected direction of change. Blue: differentially expressed genes found in gene set GSE10325 (M3077: CD4 T cell vs B cell down), red: differentially expressed genes found in GSE10325 (M3076: CD4 T cell vs B cell up). (e) Gene set enrichment pre-ranked analysis (GSEA) comparing transcriptomic differences between HIV DNA⁺ and HIV DNA⁻ cell samples sequenced from the J-Lat:Raji mixture after FIND-seq to previously reported differences between CD4 T cells and B cells (CD4 TCELL VS BCELL UP, M3076; CD4 TCELL VS BCELL DN, M3077).
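The colour coding in panel c amounts to a three-way classification of each gene by whether its differential expression is significant in both assays, in neither, or in only one. A minimal Python sketch of that rule follows for illustration. The function name `concordance_colour` and its arguments are hypothetical and are not taken from the authors' analysis code; the 0.05 threshold and the colour names are those stated in the legend.

```python
def concordance_colour(p_find_seq: float, p_rna_seq: float, alpha: float = 0.05) -> str:
    """Classify one gene by concordance of significance between two assays.

    p_find_seq and p_rna_seq are the differential-expression p values for the
    same gene from FIND-seq and from standard RNA-seq (hypothetical inputs).
    """
    significant_find_seq = p_find_seq < alpha
    significant_rna_seq = p_rna_seq < alpha
    if significant_find_seq and significant_rna_seq:
        return "purple"  # significant in both assays
    if not significant_find_seq and not significant_rna_seq:
        return "grey"    # significant in neither assay
    return "blue"        # discordant: significant in exactly one assay
```

For example, a gene with p = 0.01 after FIND-seq and p = 0.20 after standard RNA-seq would fall in the discordant (blue) class under this rule.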

Extended Data Fig. 3 FIND-seq droplet cytometry of HIV DNA⁺ memory CD4 T cells ex vivo.

(a) Droplet cytometry plots from the sorting of HIV DNA⁺ and HIV DNA⁻ memory CD4 T cells from five ART-treated people with HIV (PWH). Two PWH whose transcriptome data were excluded from later analyses on the basis of data quality are indicated with * (see “Participant sample data quality filtering” in Methods). (b) Negative control droplet cytometry plots of memory CD4 T cells from three HIV-uninfected participants. (c) Numbers of HIV DNA⁺ cells per million memory CD4 T cells, as measured by droplet cytometry during FIND-seq of samples from PWH and HIV-uninfected participants. n = 1 measurement for each of 5 PWH and each of 3 HIV-uninfected participants. Bars indicate median values. Total numbers of HIV DNA⁺ cells collected for each PWH are indicated in the table.

Extended Data Fig. 4 Transcriptome sequence composition and yield after FIND-seq from PWH.

(a) Percentages of exonic, intergenic, and intronic reads, and total yields of sequencing reads, in curated samples. (b) Numbers of mapped exonic reads and genes detected in curated samples. (c) Rank abundance plots of genes in each sample, grouped by participant. Red bars and traces indicate HIV DNA⁺ samples, and black traces indicate HIV DNA⁻ samples.

Extended Data Fig. 5 Host gene expression by HIV DNA⁺ and HIV DNA⁻ memory CD4 T cells under ART.

(a) Volcano plot of DGE between HIV DNA⁺ and HIV DNA⁻ cells from DGE analysis that considered samples from participants as biological replicates. Genes showing fold change >1.5 and FDR ≤0.05 between HIV DNA⁺ and HIV DNA⁻ cells are highlighted in red. (b) DGE between HIV DNA⁺ and HIV DNA⁻ cells, analysed separately in each participant. Genes with absolute fold change >1.5 and p ≤0.1 are highlighted in red. Labels indicate DEGs that were common to all three participants. (c) Overlap of DEGs (absolute fold change ≥1.5, p ≤0.1) between HIV DNA⁺ and HIV DNA⁻ cells after separate analysis of each participant. (d) RNA expression of DEGs that were common to all participants. Each plotted point indicates the expression level of the given gene in a single sorted sample (n = 16 biologically independent HIV DNA⁻ and 6 biologically independent HIV DNA⁺ samples). Box plots indicate the median with the lower and upper hinges corresponding to the 25th and 75th percentiles and whiskers corresponding to 1.5 × the interquartile range.
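The per-participant highlighting in panel b and the overlap in panel c can be thought of as a threshold filter followed by a set intersection. The Python sketch below illustrates this under stated assumptions: results are assumed to arrive as log2 fold changes, so the absolute fold change is recovered as 2 raised to the absolute log2 value; the strict ">1.5" cut-off from panel b is used (panel c states "≥1.5"); and the names `GeneResult`, `is_highlighted` and `common_degs` are hypothetical and do not come from the authors' pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class GeneResult:
    gene: str
    log2_fold_change: float  # assumed input scale; not specified in the legend
    p_value: float


def is_highlighted(result: GeneResult,
                   min_abs_fold_change: float = 1.5,
                   max_p: float = 0.1) -> bool:
    """Apply the panel b thresholds to one gene from one participant."""
    abs_fold_change = 2.0 ** abs(result.log2_fold_change)
    return abs_fold_change > min_abs_fold_change and result.p_value <= max_p


def common_degs(per_participant: Dict[str, List[GeneResult]]) -> Set[str]:
    """Return genes that pass the thresholds in every participant."""
    passing = [
        {r.gene for r in results if is_highlighted(r)}
        for results in per_participant.values()
    ]
    return set.intersection(*passing) if passing else set()
```

This sketch intersects on gene identity only; whether the direction of change must also agree across participants is not stated in the legend and is not enforced here.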

Extended Data Fig. 6 Fluorescence-activated cell sorting of circulating CD4 T cell subsets in ART-treated PWH.

Leukocytes (a) not part of multi-cell conjugates (b) that were viable and stained with the T cell marker CD3 (c) but not lineage markers CD20, CD56, TCR-γδ, CD14, or CD11c (d) and were CD4⁺ and CD8⁻ (e) were identified by CD27 and CD45RO staining as phenotypically naïve (f, top-left gate, CD27⁺CD45RO⁻ population) or memory (f, top-right and bottom gates) CD4 T cells. CD27⁺ memory CD4 T cells (f, top-right gate) were further separated into three populations by CXCR5 and CCR7 expression (g). Each of these three populations was then collected in two subsets defined by CCR6 expression (h–j). CD27⁻ memory CD4 T cells (f, bottom gate) were collected in three subsets defined by CCR6 and CD57 expression (k). The sorting strategy yielded purified naïve and 9 subsets of memory CD4 T cells. The marker expression patterns of the sorted memory CD4 T cell subsets are shown in Extended Data Table 2. Percentages of all events on each plot falling within the indicated gates are indicated. Results are shown for participant ID 2013.

Extended Data Table 1 Transcriptomic Signature Genes of HIV DNA⁺ Memory CD4 T Cells under ART

Extended Data Table 2 Enrichment of Signature Genes within Sorted Memory CD4 T Cell Subsets from ART-Treated PWH

Supplementary information

Reporting Summary

Supplementary Table 1

Clinical characteristics of study participants with HIV. a, Participants studied by FIND-seq. b, Participants studied using FACS analysis.

Supplementary Table 2

Sample curation. a, Sequencing characteristics of all of the samples, including the numbers of sequencing reads, percentages of reads mapped and genomic distributions of mapped reads. b, Sequencing characteristics of FIND-seq-sorted samples that passed a prospective process of curation for library quality (described in the Methods).

Supplementary Table 3

Whole-transcriptome gene expression for curated samples. Raw counts of all genes for all 22 curated samples are shown, determined using GWB v.21 as described in the Methods.

Supplementary Table 4

DGE analysis between HIV-DNA⁺ and HIV-DNA⁻ memory CD4 T cells. The analysis was restricted to protein-coding genes. a, DEGs identified by analysing samples from all of the participants as biological replicates. b, DEGs from participant 2208. c, DEGs from participant 2510. d, DEGs from participant 3209. Statistical significance of differential expression was calculated using Wald Test in GWB v.21.0.3 with Benjamini–Hochberg multiple-testing correction.
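As an illustration of the Benjamini–Hochberg multiple-testing correction named above, the following is a generic textbook sketch in Python. It is not the GWB implementation, and the function name `benjamini_hochberg` is hypothetical. Each p value is scaled by the number of tests divided by its rank, and a running minimum taken from the largest p value downwards keeps the adjusted values monotone.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

A gene would then be called significant at a given false discovery rate if its adjusted value is at or below that rate.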

Supplementary Table 5

Pathway analysis of DEGs. Pathways derived from significant DEGs (defined by absolute fold change ≥ 2, P ≤ 0.01) from each participant that were shared and had concordant direction among all three participants. a, Summary of pathway z-scores and predicted direction of regulation across participants. b–g, Pathway genes that were differentially expressed in each participant for Gα12/13 signalling (b), NGF signalling (c), Th1 pathway (d), necroptosis signalling (e), dendritic cell maturation (f) and death receptor signalling (g). Significance of differential expression for each gene was calculated using Wald Test in GWB v.21.0.3 with Benjamini–Hochberg multiple-testing correction. Right-tailed Fisher’s exact test was used for statistical testing of pathway enrichment.
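A right-tailed Fisher's exact test for pathway enrichment asks how likely it is to see at least the observed number of pathway genes among the DEGs if the DEGs were drawn at random from the background. The Python sketch below computes this as a hypergeometric upper tail using only the standard library. It is an illustration under assumptions, not the pathway software's implementation: the function name `fisher_right_tail` and its argument names are hypothetical, and the choice of background gene set is left to the caller.

```python
from math import comb


def fisher_right_tail(overlap: int, n_degs: int, n_pathway: int, n_background: int) -> float:
    """P(X >= overlap) when n_degs genes are drawn without replacement from
    n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_degs)
    upper = min(n_degs, n_pathway)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

As a toy check with invented counts, a background of 10 genes, a 3-gene pathway and 3 DEGs that all fall in the pathway give a probability of 1/120, because only one of the 120 possible 3-gene draws consists entirely of pathway genes.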

Supplementary Table 6

Listing and GO analysis of HIV-DNA⁺ memory CD4 T cell signature genes. Gene modules 5 and 28 were defined and correlated with cell HIV DNA status using WGCNA as described in the Methods. a, Differential expression results of module 5 genes in each participant and for all participants combined. b, GO analysis of module 5 genes. c, Differential expression results of module 28 genes in each participant and for all of the participants combined. d, GO analysis of module 28 genes. Gene co-expression modules were identified using WGCNA v.1.7.0 with weighted Pearson correlation. GO enrichment analysis was performed in Enrichr using Fisher’s exact test with Benjamini–Hochberg multiple-testing correction.

Supplementary Table 7

Selection of analysis pipeline. The sensitivity and positive-predictive value of gene expression quantification and differential expression analysis performed using GWB v.20, GWB v.21 or other pipelines that address coverage bias and/or genomic DNA contamination were tested using J-Lat-Raji cell mixing study data as described in the Methods.

Source data

Source Data Fig. 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Clark, I.C., Mudvari, P., Thaploo, S. et al. HIV silencing and cell survival signatures in infected T cell reservoirs. Nature 614, 318–325 (2023). https://doi.org/10.1038/s41586-022-05556-6

Received: 21 March 2022
Accepted: 11 November 2022
Published: 04 January 2023
Issue Date: 09 February 2023
DOI: https://doi.org/10.1038/s41586-022-05556-6
