Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution

Stuart, Tim; Hao, Stephanie; Zhang, Bingjie; Mekerishvili, Levan; Landau, Dan A.; Maniatis, Silas; Satija, Rahul; Raimondi, Ivan

doi:10.1038/s41587-022-01588-5

Published: 19 December 2022

Nature Biotechnology volume 41, pages 806–812 (2023)

Abstract

Chromatin states are functionally defined by a complex combination of histone modifications, transcription factor binding, DNA accessibility and other factors. Current methods for defining chromatin states cannot measure more than one aspect in a single experiment at single-cell resolution. Here we introduce nanobody-tethered transposition followed by sequencing (NTT-seq), an assay capable of measuring the genome-wide presence of up to three histone modifications and protein–DNA binding sites at single-cell resolution. NTT-seq uses recombinant Tn5 transposase fused to a set of secondary nanobodies (nb). Each nb–Tn5 fusion protein specifically binds to different immunoglobulin-G antibodies, enabling a mixture of primary antibodies binding different epitopes to be used in a single experiment. We apply bulk-cell and single-cell NTT-seq to generate high-resolution multimodal maps of chromatin states in cell culture and in human immune cells. We also extend NTT-seq to enable simultaneous profiling of cell surface protein expression and multimodal chromatin states to study cells of the immune system.

Main

Several related methods were recently developed that enable individual aspects of chromatin state to be measured at single-cell resolution via an antibody-guided DNA tagmentation reaction^1,2,3. However, chromatin states are characterized by combinations of factors at an individual locus⁴, including histone post-translational modifications and the binding of non-histone proteins to the DNA. For example, promoters are commonly marked by both H3K27ac and H3K4me2, whereas enhancers are marked by H3K27ac but typically lack H3K4me2. Furthermore, active and poised enhancers are both marked by H3K4me1 and can be distinguished by the presence of H3K27ac⁵. Therefore, multimodal single-cell chromatin profiling methods are required to fully characterize chromatin states in heterogeneous tissues.

Most single-cell chromatin profiling methods employ protein-A/G fused to Tn5 transposase^1,2,3,6,7. Protein-A/G binds to IgG antibodies, enabling Tn5 to be directed to regions of the genome where an IgG antibody is bound and inserting adapters for DNA sequencing. As protein-A/G binds to IgG antibodies from different species with high affinity, such methods are difficult to perform in an antibody-multiplexed design aiming to measure multiple histone modifications in a single experiment. Current approaches for multimodal chromatin profiling using protein-A/G, such as MulTI-Tag, involve complex experimental workflows with multiple wash and incubation steps⁷. Such methods have not been demonstrated to work with complex tissues^6,7, thus limiting their broader application. We reasoned that the use of small single polypeptide chain antibodies (nanobodies) that specifically bind IgG from different species or different IgG subtypes in place of protein-A/G may enable the multiplexing of primary antibodies to facilitate a multimodal chromatin assay⁸. Nanobodies bind strongly to their target epitope with dissociation constants (K_d) in the high picomolar scale, whereas protein-A/G has K_d in the low nanomolar scale^9,10. Furthermore, nanobodies are stable under a broad temperature and pH range. We hypothesized that a nanobody–Tn5 (nb–Tn5) fusion would form a stable and specific protein–protein complex with a target primary IgG antibody.

In this study, we engineered a set of nb–Tn5 fusion proteins and applied them in a multiplexed chromatin-profiling assay, measuring up to three distinct chromatin targets genome-wide simultaneously in single cells. We demonstrate the accuracy of multiplexed chromatin data obtained with this assay in cultured cells and in human immune cells from the bone marrow and peripheral blood.

Results

We engineered and produced four different recombinant nb–Tn5 fusion proteins, specific for IgG antibodies from different species or IgG subtypes (Fig. 1a and Extended Data Fig. 1a). This included anti-mouse and anti-rabbit IgG nanobodies as well as isotype-specific nanobodies for mouse IgG1 and IgG2a. Loading nb–Tn5 fusion proteins with barcoded DNA adaptor sequences enables the identity of individual nb–Tn5 fusion proteins that generated the sequenced DNA fragment to be determined through DNA sequencing.

**Fig. 1: Bulk-cell NTT-seq enables simultaneous profiling of multiple chromatin marks.**

We tested each recombinant nb–Tn5 fusion in a bulk-cell nanobody-tethered transposition followed by sequencing (NTT-seq) experiment and obtained an NTT-seq library only when the nb–Tn5 matched the target antibody, whereas the incubation of nb–Tn5 with the unmatched antibody resulted in no library amplification via polymerase chain reaction (PCR) (Extended Data Fig. 1b). Motivated by this result, we performed multiplexed NTT-seq aiming to profile multiple different chromatin features in a single experiment. In our protocol, extracted nuclei are stained in a single step using primary antibodies for multiple epitopes simultaneously; the excess antibody is washed; and nuclei are incubated with a mixture of adapter-barcoded¹¹ nb–Tn5s, with each nb–Tn5 recognizing a specific IgG antibody. Subsequently, nb–Tn5s are activated by adding Mg²⁺, resulting in the tagmentation of genomic DNA in proximity of the primary antibody. The released DNA fragments harbor specific barcodes enabling the assignment of sequenced fragments to an individual nb–Tn5 and its associated primary antibody (Fig. 1b).
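The barcode-based assignment described above can be illustrated with a short sketch. This is not the authors' pipeline: the barcode sequences, the one-mismatch tolerance and the function names are hypothetical placeholders chosen for illustration (the real adapter sequences are listed in Extended Data Table 2 and are not reproduced here).

```python
from collections import Counter

# Hypothetical 8-bp adapter barcodes, one per nb-Tn5 / primary antibody pair.
# These are placeholders, not the sequences used in the study.
BARCODE_TO_TARGET = {
    "ACGTACGT": "H3K27me3",
    "TGCATGCA": "H3K27ac",
    "GATTACAG": "RNAPII",
}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_target(barcode, max_mismatches=1):
    """Return the target for a read barcode, or None if unassigned or ambiguous.

    The mismatch tolerance is an assumption made for this sketch.
    """
    hits = [
        target
        for known, target in BARCODE_TO_TARGET.items()
        if len(known) == len(barcode) and hamming(known, barcode) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def count_targets(barcodes):
    """Tally how many fragments were assigned to each target."""
    return Counter(assign_target(bc) for bc in barcodes)
```

Calling `count_targets` on the barcodes extracted from a set of reads gives a per-target fragment tally, with `None` collecting reads whose barcode matched no known adapter.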

To test the targeting specificity of our species-specific nb–Tn5 fusion proteins, we used antibodies for H3K27me3 and H3K27ac in bulk human peripheral blood mononuclear cells (PBMCs), as these marks do not co-occur in the genome¹². Multiplexed NTT-seq resulted in libraries with nearly identical genomic distributions for each separate mark to matched NTT-seq performed on the same cells for each histone mark separately (Fig. 1c). The enrichment of sequenced fragments falling in H3K27me3 and H3K27ac peaks was approximately the same across the multiplexed and non-multiplexed experiments (Fig. 1d,e) and showed mutual exclusivity (Fig. 1f,g and Extended Data Fig. 1c). This suggests that multiplexed NTT-seq results in highly accurate localization of chromatin marks genome-wide. Then, we tested our isotype-specific nb–Tn5 profiling of three primary antibodies in a single experiment, repeating similar experiments using K562 cells stained with mouse IgG1 antibody against H3K27me3 and mouse IgG2a antibody against H3K27ac and including an additional rabbit IgG antibody for RNA polymerase II (RNAPII) with phosphorylated serine 2 and serine 5 (elongating RNAPII, enriched on actively transcribed genes)¹³. In comparison with a control experiment in which each of the three targets was profiled individually, multiplexed NTT-seq again produced similar target enrichment specificity in peaks (Fig. 1h–j and Extended Data Fig. 1d), demonstrating the ability to profile three targets simultaneously as well as the ability to profile non-histone proteins.

Encouraged by the results obtained in bulk cells, we next applied NTT-seq to characterize multimodal chromatin states at single-cell resolution using the 10x Genomics scATAC-seq kit (Fig. 2a). We profiled H3K27me3, H3K27ac and elongating RNAPII in a mixture of 8,617 K562 and HEK293 cells. We obtained, on average, 743 (s.d. 699) fragments for H3K27me3, 382 (s.d. 282) fragments for H3K27ac and 542 (s.d. 350) fragments for RNAPII per cell, outperforming the recently developed multiCUT&Tag method⁶ in terms of sensitivity and specificity (Extended Data Fig. 2a–c and Extended Data Table 1). We projected cells into a low-dimensional space using latent semantic indexing (LSI) and uniform manifold approximation and projection (UMAP)^14,15 and clustered cells using a weighted combination of all three data modalities¹⁶ (Fig. 2b). We identified two groups of cells corresponding to K562 and HEK293 cells. The genomic distribution of reads for each mark obtained in the multiplexed single-cell experiment was highly similar to data from the same cell lines where each feature was profiled individually in bulk (Fig. 2c and Extended Data Fig. 2b). Examining the distribution of fragments at ATAC¹⁷, H3K27me3, H3K27ac and RNAPII peaks further showed the co-occupancy of RNAPII and H3K27ac in open chromatin regions, whereas the signal for H3K27me3 was mutually exclusive with the other profiled marks (Fig. 2d,e). Furthermore, multiplexed single-cell-derived signals were highly correlated with bulk-cell signal for each assay profiled individually (Fig. 2d). Using a combination of cellular modalities provided the strongest separation of the two cell types in low-dimension space. When constructing a neighbor graph, we observed a higher fraction of a cell’s neighbors belonging to the same cell type as that cell when using multiple modalities (Fig. 2f). 
This highlights the value of multimodal chromatin data in measuring cellular states, and, together, these results show that NTT-seq is an effective method for profiling multiple chromatin modalities at single-cell resolution.
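The neighbor-graph comparison above (the fraction of a cell's neighbors that share its cell type) can be sketched as follows. This is a minimal illustration rather than the weighted multimodal method used in the study: it assumes a precomputed low-dimensional embedding, plain Euclidean distance and a brute-force neighbor search, and the function name and the default `k` are hypothetical.

```python
import math


def knn_same_label_fraction(embedding, labels, k=10):
    """For each cell, the fraction of its k nearest neighbors with the same label.

    embedding: list of coordinate tuples, one per cell.
    labels: list of cell type labels, in the same order.
    Brute-force O(n^2) search, suitable only for small examples.
    """
    fractions = []
    for i, point in enumerate(embedding):
        # Distance from this cell to every other cell (the cell itself is excluded).
        dists = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embedding)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(labels[j] == labels[i] for j in neighbors)
        fractions.append(same / len(neighbors))
    return fractions
```

Running this once per embedding (for example one built from a single mark and one built from all marks combined) and comparing the mean fractions gives a simple measure of how well each representation separates the cell types.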

**Fig. 2: NTT-seq provides accurate single-cell multimodal chromatin profiles.**

We next sought to extend the NTT-seq method to enable simultaneous measurement of cell surface protein expression alongside multimodal chromatin states at single-cell resolution. Building on the recently developed CUT&Tag-pro method¹⁸, we stained a population of mobilized PBMCs with an oligonucleotide-conjugated panel of 173 antibodies targeting immune-relevant cell surface proteins. Cells were then crosslinked, permeabilized and incubated with antibodies against H3K27me3 and H3K27ac, and our standard NTT-seq protocol was followed to generate single-cell libraries. This resulted in a dataset of 4,684 cells with a mean of 2,854 H3K27me3 and 412 H3K27ac fragments per cell (s.d. 2,953 and 356, respectively), with similar sensitivity and specificity to PBMC scCUT&Tag¹⁹ (Extended Data Fig. 3a). We further quantified 690 antibody-derived tag (ADT) counts per cell (s.d. 613), achieving a sensitivity similar to the recently demonstrated scCUT&Tag-pro method (Extended Data Fig. 3b)¹⁸. We clustered cells using a weighted combination of each modality¹⁶ and annotated cell clusters based on their patterns of protein expression (Fig. 3a). Protein expression patterns were concordant with cell clusters determined from a chromatin-based clustering, and we observed uniform expression of CD3 in T cells, mutually exclusive expression of CD4 and CD8, expression of CD14 in monocytes, CD19 in B cells and IL2RB in natural killer (NK) cells (Fig. 3b). Pseudobulk H3K27me3 and H3K27ac NTT-seq profiles were highly correlated with individual single-cell CUT&Tag-pro¹⁸ profiles for human PBMCs for the same histone marks (Fig. 3c). Consistent with our previous results, we also observed an extremely low coefficient of determination (R² = 0.00028) between H3K27me3 and H3K27ac levels within peaks (Fig. 3d), further supporting the accuracy of multiplexed NTT-seq single-cell profiles when applied to complex tissues.
We observed consistency between chromatin states and protein expression patterns for each cell type, supporting accurate cell surface protein quantification. For example, the PAX5 locus was repressed in non-B cells with low CD19 protein expression and active in B cells with high CD19 expression (Fig. 3e). Similarly, the CD33 locus was active in monocytes with high CD33 protein expression and repressed in B cells with low CD33 expression. To evaluate the accuracy of our cell type classifications and multimodal chromatin landscapes measured by NTT-seq, we compared the results of our single-cell NTT-seq experiment with FACS-sorted ChIP-seq profiles for CD14 monocytes, CD34⁺ common myeloid progenitors (CMPs) and B cells previously published by the ENCODE consortium¹⁷. Pseudobulk profiles generated from our NTT-seq cell types recapitulated the expected cell-type-specific ENCODE ChIP-seq profiles (Extended Data Fig. 3c). To evaluate the reproducibility of single-cell chromatin profiles measured by single-cell NTT-seq (scNTT-seq), we generated a second scNTT-seq dataset measuring H3K27me3 and H3K27ac in human PBMCs (Extended Data Fig. 3d). This dataset achieved a similar level of sensitivity and specificity (Extended Data Fig. 3e,f and Extended Data Table 1) and was highly correlated with the genome-wide chromatin profiles obtained in our first PBMC dataset (Extended Data Fig. 3g), supporting the reproducibility of the assay.

**Fig. 3: Application of multiplexed scNTT-seq to human tissues.**

Although cell surface protein expression information provides a powerful method of studying immune cells, these methods are of limited value outside of the immunology field. To test whether a low-dimensional structure similar to that obtained using protein expression could be learned using the chromatin data alone, we compared the neighbor graphs obtained using protein expression data to those obtained using individual or combined chromatin modalities. Although individual chromatin marks were unable to faithfully recapitulate the low-dimensional structure observed when including protein expression data, the combination of H3K27me3 and H3K27ac modalities provided a similar low-dimensional neighbor structure (Fig. 3f). This, again, highlights the unique power of multimodal chromatin data in resolving cellular states and indicates that multiplexed NTT-seq may be a powerful method capable of characterizing heterogeneous tissues without the need for cell surface protein measurements.

We next sought to apply NTT-seq in a complex tissue that contains differentiating cells to capture chromatin remodeling dynamics that shape cellular identity. We profiled H3K27me3 and H3K27ac in human bone marrow mononuclear cells (BMMCs) (Fig. 3g). This resulted in 5,236 cells with a mean of 1,217 and 326 fragments per cell for H3K27me3 and H3K27ac, respectively (Fig. 3h and Extended Data Table 1). We annotated cell clusters by combining label transfer from an annotated BMMC scATAC-seq dataset^20,21 (using the H3K27ac assay) with manual annotation based on the presence of active and repressive histone marks at key marker genes for each cell type. We identified the expected cell types present in the immune system, including hematopoietic stem and progenitor cells (HSPCs) (Fig. 3g). Consistent with results obtained using cells in culture and PBMCs, we observed mutual exclusivity between H3K27ac and H3K27me3 across regions of the genome for BMMCs and a mean fraction of fragments in ENCODE peaks of 0.18 and 0.26 for H3K27me3 and H3K27ac, respectively (Extended Data Fig. 4a,b). To study how multimodal chromatin states may change during cell development, we ordered cells belonging to the B cell lineage, including HSPCs, common lymphoid progenitors (CLPs), pre-B, B and plasma cells along a developmental pseudotime trajectory using Monocle 3 (ref. ²²) (Fig. 3i). Although the H3K27ac data were more sparse than the H3K27me3 data, combining data from both modalities enabled the identification of a trajectory with the expected ordering of cells, leading from HSPCs through CLPs, pre-B and B cells to plasma cells. To identify regions of the genome that changed their H3K27me3 and H3K27ac state across this trajectory, we quantified fragment counts for each cell in 10-kb bins spanning the entire genome for each chromatin modality.
We identified genome bins with signal correlated with pseudotime (Pearson correlation >0.2, Bonferroni-corrected P < 1 × 10⁻⁸) and identified a set of 514 regions with opposing relationships between H3K27me3 and H3K27ac signal (>0.5 difference in Pearson correlation between the marks). Sorting these regions by the point at which they reached maximal H3K27me3 signal revealed an ordered sequence of sites that became repressed or activated during B cell development (Fig. 3j). The genome bin with the strongest gain in H3K27ac and loss of H3K27me3 signals across pseudotime was located at the promoter of PAX5, which encodes a B-cell-specific transcription factor (H3K27me3 r = −0.70, H3K27ac r = 0.53). Of the 514 dynamic sites, 87 displayed dynamic H3K27me3 and H3K27ac states across the B cell trajectory but were static in their DNA accessibility profile (|r| < 0.05, Bonferroni-corrected P > 0.01), as quantified in an existing BMMC scATAC-seq dataset²⁰. This suggests that additional chromatin state dynamics can be identified using multimodal epigenomic data generated by scNTT-seq. Further experimental analysis will be required to fully characterize the function of these chromatin-dynamic sites in B cell development. To systematically assess the cell-type-specific expression pattern of genes located near genomic bins that were repressed or activated along the B cell pseudotime trajectory, we examined a published single-cell RNA sequencing (scRNA-seq) dataset for healthy human BMMCs. We identified the closest gene to each pseudotime-correlated genome bin and classified these as activated (positive correlation between H3K27ac and pseudotime) or repressed (positive correlation between H3K27me3 and pseudotime).
Examining the expression of repressed and activated genes in the scRNA-seq dataset revealed concordant patterns of gene expression, with chromatin-activated genes becoming expressed later in B cell development and repressed genes being expressed in HSPCs but turned off later in B cell development (P < 2.2 × 10⁻¹⁶, t-test; Fig. 3k).
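The selection of bins with opposing H3K27me3 and H3K27ac trends along pseudotime can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' code: it computes the Pearson correlation per bin, treats the 0.2 threshold as applying to the absolute correlation of at least one mark, requires the two correlations to have opposite signs and to differ by more than 0.5, and omits the Bonferroni-corrected significance test. The function names and the dictionary-based input layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def opposing_bins(pseudotime, me3_by_bin, ac_by_bin, min_abs_r=0.2, min_diff=0.5):
    """Return {bin: (r_H3K27me3, r_H3K27ac)} for bins with opposing trends.

    me3_by_bin / ac_by_bin map a bin name to per-cell counts, ordered like
    pseudotime. The significance test described in the text is not applied here.
    """
    selected = {}
    for bin_name in me3_by_bin.keys() & ac_by_bin.keys():
        r_me3 = pearson(pseudotime, me3_by_bin[bin_name])
        r_ac = pearson(pseudotime, ac_by_bin[bin_name])
        correlated = max(abs(r_me3), abs(r_ac)) > min_abs_r
        opposing = r_me3 * r_ac < 0 and abs(r_me3 - r_ac) > min_diff
        if correlated and opposing:
            selected[bin_name] = (r_me3, r_ac)
    return selected
```

A bin that loses H3K27me3 while gaining H3K27ac along the trajectory would be returned with a negative first value and a positive second value, matching the pattern reported for the PAX5 promoter.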

Discussion

Together, these analyses demonstrate that NTT-seq datasets provide accurate multimodal chromatin landscapes at single-cell resolution; contain sufficient information to identify major cell types and states in primary human tissues; provide profiles that reflect high-quality bulk ChIP-seq data¹⁷; and can be generated in conjunction with accurate cell surface protein expression measurements. Existing multimodal chromatin technologies require complex experimental workflows and have not been demonstrated to work with complex tissue samples^6,7 or are strictly limited in the chromatin states that they can measure²³. NTT-seq overcomes both of these limitations, providing a streamlined experimental workflow applicable to complex tissues.

Current limitations of this method, as well as other tagmentation-based chromatin profiling methods, include the need to perform tagmentation in high salt conditions to avoid open chromatin bias¹. This may preclude the measurement of some DNA-binding proteins, including some transcription factors. Furthermore, the small number of currently available secondary nanobodies limits the number of different marks that can be profiled simultaneously.

We anticipate that future reagent development and protocol improvements will enable us to scale NTT-seq to profiling of more than three marks simultaneously, and we are actively working on the generation of additional nb–Tn5s targeting antibodies raised in different species, such as goat, rat, sheep and guinea pig, and multiple IgG isotypes within the same species. This will expand the portfolio of reagents for multimodal chromatin profiling. The application of computational integration methods^18,21 may also enable composite profiles for many aspects of chromatin states to be generated in silico, beyond what is feasible to measure in a single experiment. Moreover, we anticipate that the use of dual-barcoded nb–Tn5 can be implemented in our protocol to investigate intra-locus interactions between different chromatin features, such as bivalent promoters or enhancers. We think that the simplicity with which NTT-seq achieves simultaneous profiling of chromatin features makes this approach particularly appealing and could represent the standard for multifactorial chromatin mapping in the future.

Conclusions

In this study, we developed a novel multifactorial chromatin-profiling method, NTT-seq, capable of measuring the genome-wide distribution of up to three different chromatin marks in bulk-cell and single-cell samples. NTT-seq uses a set of engineered nb–Tn5 fusion proteins to guide Tn5 transposition to specific sites in the genome, where sequence-barcoded DNA sequencing adaptors are inserted by Tn5. Our results demonstrate the high accuracy of multiplexed chromatin profiles obtained by NTT-seq in comparison to non-multiplexed CUT&Tag or ChIP-seq experiments; compatibility with simultaneous cell surface protein expression measurement; and the application of NTT-seq to human tissues.

Methods

Cell culture

K562 cells were acquired from the American Type Culture Collection (CCL-243). HEK293FT cells were acquired from Thermo Fisher Scientific (R70007). HEK293FT cells were maintained at 37 °C and 5% CO₂ in D10 medium (DMEM with high glucose and stabilized L-glutamine (Caisson, DML23) supplemented with 10% FBS (Thermo Fisher Scientific, 16000044)). K562 cells were maintained at 37 °C and 5% CO₂ in R10 medium (RPMI with stabilized L-glutamine (Thermo Fisher Scientific, 11875119) supplemented with 10% FBS).

Primary cells acquisition and processing

Fresh mobilized PBMCs used for scNTT-seq with cell surface protein measurement were isolated within 48 hours of blood collection using a Ficoll (Thermo Fisher Scientific, 45-001-750) gradient according to the manufacturer’s recommendations and cryopreserved. Isolated mononuclear cells were thawed and stained according to standard procedures, beginning with resuspension in staining buffer (BioLegend, 420201) and incubation with Human TruStain FcX (10 minutes at 4 °C; BioLegend, 422302) to block Fc receptor-mediated binding. Cells were then stained with a CD34-PE-Vio770 antibody (20 minutes at 4 °C; Miltenyi Biotec, clone AC136, 130-113-180) and DAPI (Invitrogen, D1306). The samples were then sorted for DAPI⁻, CD34⁺ cells using a BD Influx cell sorter. Live CD34⁺ and CD34⁻ cells were mixed 1:10 and processed with NTT-seq. BMMCs and PBMCs profiled by scNTT-seq without cell surface protein measurement were purchased from AllCells. After thawing into DMEM with 10% FBS, the cells were spun down at 4 °C for 5 minutes at 400g and washed twice with PBS with 2% BSA. After centrifugation, the cell pellet was resuspended in staining buffer (2% BSA and 0.01% Tween in PBS).

Cloning of nb–Tn5 plasmid constructs

Previously published sequences coding for secondary nanobodies⁸ were synthesized as a gene fragment (Integrated DNA Technologies (IDT)) flanked by restriction enzyme sites NcoI and EcoRI. To replace protein A with a nanobody, 3×Flag-pA-Tn5-Fl (Addgene, 124601) and gene fragments were digested with NcoI and EcoRI for 1 hour at 37 °C, ligated overnight at 16 °C and subsequently transformed into competent cells (New England Biolabs (NEB), C2992H).

nb–Tn5 transposase production

The pTXB1-nbTn5 vector was transformed into BL21(DE3)-competent Escherichia coli cells (NEB, C2527), and nb–Tn5 was produced via intein purification with an affinity chitin-binding tag²⁴. Then, 400 ml of Luria broth (LB) culture was grown at 37 °C to optical density (OD₆₀₀) = 0.6. nb–Tn5 expression was then induced with 0.25 mM isopropyl-β-d-thiogalactopyranoside (IPTG) at 22 °C for 6 hours. After induction, cells were pelleted and then frozen at −80 °C overnight. Cells were then lysed by sonication in 100 ml of HEGX (20 mM HEPES-KOH pH 7.5, 0.8 M NaCl, 1 mM EDTA, 10% glycerol, 0.2% Triton X-100) with a protease inhibitor cocktail (Roche, 04693132001). The lysate was pelleted at 30,000g for 20 minutes at 4 °C. The supernatant was transferred to a new tube, and 3 µl of neutralized 8.5% polyethylenimine (Sigma-Aldrich, P3143) was added dropwise to each 100 µl of bacterial extract, gently mixed and centrifuged at 30,000g for 30 minutes at 4 °C to precipitate DNA. The supernatant was loaded on four 2-ml chitin columns (NEB, S6651S). Columns were washed with 10 ml of HEGX, and then 1.5 ml of HEGX containing 100 mM DTT was added to the column with incubation for 48 hours at 4 °C to allow cleavage of nb–Tn5 from the intein tag. nb–Tn5 was eluted directly into two 30-kDa molecular weight cutoff (MWCO) spin columns (Millipore, UFC903008) by the addition of 2 ml of HEGX. Protein was dialyzed in five dialysis steps using 15 ml of 2× dialysis buffer (100 mM HEPES-KOH pH 7.2, 0.2 M NaCl, 0.2 mM EDTA, 2 mM DTT, 20% glycerol) and concentrated to 1 ml by centrifugation at 5,000g. The protein concentrate was transferred to a new tube and mixed with an equal volume of 100% glycerol. nb–Tn5 aliquots were stored at −80 °C.

Transposome assembly

We obtained barcoded Tn5 adaptors from IDT, as described by Amini et al.¹¹, with 8-bp barcode sequences designed using FreeBarcodes²⁵. To produce mosaic-end, double-stranded (MEDS) oligos, we annealed each barcoded T5 tagmentation oligo with the pMENT common oligo (100 µM each) in TE buffer as follows: 95 °C for 5 minutes and then cooling at 0.2 °C per second to 4 °C (bcMEDS-A). The same process was used to anneal a single T7 tagmentation oligo with the pMENT common oligo (MEDS-B; Extended Data Table 2). bcMEDS-A and MEDS-B were mixed 1:1, and 6 µl was transferred to a new tube and mixed with 10 µl of nb–Tn5 enzyme, followed by 1 hour at room temperature to allow for transposome assembly. Adapter sequences are shown in Extended Data Table 2.

Antibodies

Antibodies used were H3K27ac (1:50, Active Motif, 39133), H3K27ac (1:50, Active Motif, 91193), H3K27ac (1:50, Abcam, ab4729), H3K27me3 (1:50, Active Motif, 61017) and Phospho-Rpb1 CTD (Ser2/Ser5) (1:50, Cell Signaling Technology, 13546). For NTT-seq with surface markers readout on primary cells, the TotalSeq-A conjugated Human Universal Cocktail version 1.0 panel was obtained from BioLegend (399907).

NTT-seq

We performed NTT-seq using similar methods to those described previously by Kaya-Okur et al.¹ (https://doi.org/10.17504/protocols.io.bcuhiwt6), described in detail below.

Antibody staining

For NTT-seq with surface markers readout on primary cells, 1 million thawed PBMCs were resuspended in 200 µl of staining buffer (2% BSA and 0.01% Tween in PBS) and incubated for 15 minutes with 20 µl of Fc receptor block (TruStain FcX, BioLegend) on ice. Cells were then washed three times with 1 ml of staining buffer and pooled together. The panel of oligo-conjugated antibodies was added to the cells and incubated for 30 minutes on ice. After staining, cells were washed three times with 1 ml of staining buffer and resuspended in 100 µl of staining buffer. After the final wash, cells were resuspended in 200 µl of PBS ready for fixation.

Fixation and permeabilization

For human cell lines, nuclei were extracted as previously described²⁶ and resuspended in 150 µl of PBS. Then, 16% methanol-free formaldehyde (Thermo Fisher Scientific, PI28906) was added for fixation (final concentration: 0.1%) at room temperature for 3 minutes. The cross-linking reaction was stopped by the addition of 12 µl of 1.25 M glycine solution. Subsequently, nuclei were washed once with 150 µl of antibody buffer (20 mM HEPES pH 7.6, 150 mM NaCl, 2 mM EDTA, 0.5 mM spermidine, 1% BSA, 1× protease inhibitors).

For NTT-seq on PBMCs and BMMCs, 16% methanol-free formaldehyde (Thermo Fisher Scientific, PI28906) was added for fixation (final concentration: 0.1%) at room temperature for 5 minutes. The cross-linking reaction was stopped by the addition of 12 µl of 1.25 M glycine solution. Subsequently, cells were washed twice with PBS. The permeabilization was performed by adding isotonic lysis buffer (20 mM Tris-HCl pH 7.4, 150 mM NaCl, 3 mM MgCl₂, 0.1% NP40, 0.1% Tween-20, 1% BSA, 1× protease inhibitors) on ice for 7 minutes. Subsequently, 1 ml of cold wash buffer (20 mM HEPES pH 7.6, 150 mM NaCl, 0.5 mM spermidine, 1× protease inhibitors) was added, and cells were centrifuged at 800g for 5 minutes at 4 °C.
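The fixation steps above give the stock (16%) and final (0.1%) formaldehyde concentrations but not the volume of stock added. As a worked example, the sketch below solves the dilution equation for that volume, assuming the stock is added directly to a sample of known volume (150 µl for the nuclei suspension described above). The function name is hypothetical, and the protocol may instead use an intermediate dilution.

```python
def stock_volume_ul(sample_volume_ul, stock_pct=16.0, final_pct=0.1):
    """Volume of stock (µl) to add so that the mixture reaches final_pct.

    Solves stock_pct * v = final_pct * (sample_volume_ul + v) for v,
    which accounts for the volume contributed by the stock itself.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final_pct must be positive and below stock_pct")
    return final_pct * sample_volume_ul / (stock_pct - final_pct)


# Example: 150 µl of nuclei in PBS, 16% stock, 0.1% final concentration.
volume = stock_volume_ul(150.0)
print(f"Add about {volume:.2f} µl of 16% formaldehyde")
```

For a 150 µl sample this comes to slightly under 1 µl of stock.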

Tagmentation

Nuclei or permeabilized cells were directly suspended with 150 µl of antibody buffer (20 mM HEPES pH 7.6, 150 mM NaCl, 2 mM EDTA, 0.5 mM spermidine, 1% BSA, 1× protease inhibitors) with a cocktail of primary antibodies and incubated overnight on a rotator at 4 °C. The next day, cells were washed twice with 150 μl of wash buffer to remove the remaining antibodies. The cells were then resuspended in 150 μl of high salt wash buffer (20 mM HEPES pH 7.6, 300 mM NaCl, 0.5 mM spermidine, 1× protease inhibitors) with 2.5 µl of nb–Tn5 for each target of interest and incubated for 1 hour on a rotator at room temperature. The cells were then washed twice with high salt wash buffer and resuspended in 50 μl of tagmentation buffer (20 mM HEPES pH 7.6, 300 mM NaCl, 0.5 mM spermidine, 10 mM MgCl₂, 1× protease inhibitors). The samples were incubated for 1 hour at 37 °C. Tagmentation steps were performed in 0.2-ml tubes to minimize cell loss.

NTT-seq bulk

To stop tagmentation, 1 µl of 0.5 M EDTA, 1 µl of 10% SDS and 0.25 µl of 20 mg ml⁻¹ proteinase K were added to the sample and incubated at 55 °C for 1 hour. DNA was extracted with ChIP DNA Clean & Concentrator kit (Zymo Research, D5201) following manufacturer instructions. To amplify libraries, 21 µl of DNA was mixed with 2 µl of a universal i5 and a uniquely barcoded i7 primer, using a different barcode for each sample. A volume of 25 µl of NEBNext HiFi 2× PCR Master mix was added and mixed. The sample was placed in a thermocycler with a heated lid using the following cycling conditions: 72 °C for 5 minutes (gap filling); 98 °C for 30 seconds; 14 cycles of 98 °C for 10 seconds and 63 °C for 30 seconds; final extension at 72 °C for 1 minute and hold at 8 °C. Post-PCR clean-up was performed by adding 1.1× volume of AMPure XP beads (Beckman Coulter), and libraries were incubated with beads for 15 minutes at room temperature, washed twice gently in 80% ethanol and eluted in 30 µl of 10 mM Tris pH 8.0.

NTT-seq single-cell encapsulation, PCR and library construction

After tagmentation, cells were centrifuged for 5 minutes at 1,000g, and the supernatant was discarded. Cells were resuspended with 30 µl of 1× Diluted Nuclei Buffer (10x Genomics, 2000207), counted and diluted to a concentration based on the targeted cell number. The transposed cell mix was prepared as follows: 7 µl of ATAC buffer and 8 µl of cells in 1× Diluted Nuclei Buffer. All remaining steps were performed according to the 10x Chromium Single Cell ATAC protocol. For NTT-seq with surface markers readout on primary cells, the library construction method was adapted from ASAP-seq²⁷. In brief, 0.5 μl of 1 μM bridge oligo A (TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNNNNNNVTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT/3InvdT/) was added to the barcoding mix. Linear amplification was performed using the following PCR program: 40 °C for 5 minutes, 72 °C for 5 minutes, 98 °C for 30 seconds; 12 cycles of 98 °C for 10 seconds, 59 °C for 30 seconds and 72 °C for 1 minute; ending with hold at 15 °C. The remaining steps were performed according to the 10x Genomics scATAC-seq protocol (version 1.1), with the following additional modifications:

ADTs: during silane bead elution (Step 3.1s), beads were eluted in 43.5 μl of elution solution I. The extra 3 μl was used for the surface protein tags library. During SPRI cleanup (Step 3.2d), the supernatant was saved, and the short DNA derived from antibody oligos was purified with 2× SPRI beads. The eluted DNA was combined with the 3 µl left aside after the silane purification to be used as input for protein tag amplification. PCR was set up to generate the protein tag library with KAPA HiFi Master Mix (P5 and RPI-x primers): 95 °C for 3 minutes; 14–16 cycles of 95 °C for 20 seconds, 60 °C for 30 seconds and 72 °C for 20 seconds; followed by 72 °C for 5 minutes and ending with hold at 4 °C.

RPI-x primer: CAAGCAGAAGACGGCATACGAGATxxxxxxxxGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA

P5 primer: AATGATACGGCGACCACCGAGATCTACAC

Sequencing

The final libraries were sequenced on a NextSeq 550 using custom primers (Extended Data Table 2) with the following strategy: i5: 38 bp, i7: 8 bp, read1: 60 bp, read2: 60 bp (for PBMC single-cell NTT-seq without cell surface proteins, read1: 50 bp, read2: 50 bp).

Bulk-cell data analysis

Bulk-cell data for the cell culture and PBMC datasets were mapped to the hg38 analysis set using bwa-mem2 with default parameters²⁸. Output BAM files were sorted and indexed using samtools²⁹, and bigWig files were created using the DeepTools bamCoverage function with the --normalizeUsing BPM option set. Fragment files were created using Sinto (https://github.com/timoast/sinto), which uses the Pysam and htslib packages²⁹. Multi-NTT-seq heat maps were generated in DeepTools³⁰. ChIP-seq peak coordinates for H3K27me3 and H3K27ac for bulk PBMCs, and for H3K27me3, H3K27ac and RNAPII serine-2 and serine-5 phosphate for K562 cells, were downloaded from ENCODE¹⁷. We counted sequenced DNA fragments falling within each peak region for each bulk-cell PBMC or K562 cell NTT-seq dataset using custom R code and the scanTabix function in Rsamtools, and we normalized counts according to the total number of mapped reads for each dataset (counts per million mapped reads normalization). The coefficient of determination (R²) between peak counts across pairs of experiments was computed using the lm function in R.
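
The count normalization and R² comparison described above can be sketched in a few lines. This is an illustrative Python sketch, not the custom R code used in the study; the function names and toy counts are hypothetical. For a regression with a single predictor, the R² of the fitted line equals the squared Pearson correlation, which is what the second function computes.

```python
def counts_per_million(peak_counts, total_mapped_reads):
    """Scale raw per-peak fragment counts to counts per million mapped reads."""
    return [c * 1_000_000 / total_mapped_reads for c in peak_counts]


def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: fragment counts in four peaks for two experiments
# sequenced to different depths.
exp_a = counts_per_million([10, 20, 30, 40], total_mapped_reads=1_000_000)
exp_b = counts_per_million([5, 10, 15, 20], total_mapped_reads=500_000)
print(r_squared(exp_a, exp_b))
```

Normalizing by depth before comparison keeps a deeper-sequenced experiment from dominating the scale of the scatterplot, although R² itself is unaffected by a constant rescaling of either axis.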

Single-cell data analysis

Cell culture dataset

Read mapping

Reads were mapped to the hg38 analysis set using bwa-mem2 (ref. ²⁸) with default parameters; the output was sorted and indexed using samtools²⁹; and the resulting BAM file was used to create a fragment file using the Sinto package (https://github.com/timoast/sinto). We ran the Sinto fragments command with the --barcode_regex ‘[^:]*’ parameter set to extract cell barcodes from the read name. Output files were coordinate-sorted, bgzip-compressed and indexed using tabix³¹, and the resulting fragment files were used as input to downstream analyses.
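
To make the barcode pattern concrete, the following Python sketch applies the same regular expression to a read name. It illustrates only what the pattern matches (the leading run of characters before the first colon) and is not a description of Sinto's internals; the read name shown is hypothetical.

```python
import re

# Matches the longest leading run of characters that are not a colon.
BARCODE_REGEX = re.compile(r"[^:]*")


def extract_barcode(read_name):
    """Return the part of a read name that precedes the first colon."""
    match = BARCODE_REGEX.match(read_name)
    return match.group(0) if match else None


# Hypothetical read name with the cell barcode prepended before the first colon.
print(extract_barcode("AAACGAAAGAGCGAAA:NB501234:1:HXXXX:1:11101:1000:2000"))
```

This pattern assumes the cell barcode was written to the start of each read name, followed by a colon, before alignment.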

Quantification, quality control and dimension reduction

Genomic regions were quantified using the AggregateTiles function in Signac¹⁴ with binsize = 10,000 and min_counts = 1, using the hg38 genome. Cells with <10,000 total counts, >75 H3K27ac counts, >150 H3K27me3 counts and >100 RNAPII counts were retained for further analysis. Each assay was processed by performing TF-IDF normalization on the count matrix for the assay, followed by LSI using the RunTFIDF and RunSVD functions in Signac with default parameters. Two-dimensional visualizations were created for each assay using UMAP, using LSI dimensions 2–10 for each assay. Weighted nearest neighbor (WNN) analysis was performed using the FindMultiModalNeighbors function in Seurat, with reduction.list = list(‘lsi.k27ac’, ‘lsi.k27me’, ‘lsi.pol2’) and dims = list(2:10, 2:10, 2:10) to use LSI dimensions 2–10 for each assay. Cell clustering was performed using the resulting WNN graph using the Smart Local Moving community detection algorithm³² by running the FindClusters function in Seurat, with algorithm = 3, graph.name = ‘wsnn’ and resolution = 0.05. This resulted in two cell clusters, which were assigned as HEK or K562 based on their correlation with bulk-cell chromatin data for HEK and K562 cells.
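
The TF-IDF normalization step mentioned above can be illustrated with a small pure-Python sketch. The text does not state which TF-IDF variant or scale factor was applied, so the log-scaled formula and the scale factor of 10,000 used here are assumptions chosen for illustration; the function name is hypothetical.

```python
import math


def tf_idf(counts, scale_factor=10_000):
    """Log-scaled TF-IDF for a features x cells count matrix (list of rows).

    Assumed formula: log(1 + TF * IDF * scale_factor), where TF is the
    fraction of a cell's counts in a feature and IDF is the number of cells
    divided by the feature's total count.
    """
    n_features = len(counts)
    n_cells = len(counts[0])
    cell_totals = [sum(counts[i][j] for i in range(n_features)) for j in range(n_cells)]
    feature_totals = [sum(row) for row in counts]
    result = []
    for i, row in enumerate(counts):
        # Inverse document frequency: rarer features receive a larger weight.
        idf = n_cells / feature_totals[i] if feature_totals[i] else 0.0
        out_row = []
        for j, c in enumerate(row):
            # Term frequency: share of this cell's counts found in this feature.
            tf = c / cell_totals[j] if cell_totals[j] else 0.0
            out_row.append(math.log1p(tf * idf * scale_factor))
        result.append(out_row)
    return result


# Toy matrix: two genomic bins (rows) by two cells (columns).
normalized = tf_idf([[1, 0], [1, 2]])
print(normalized)
```

The normalized matrix would then be passed to a truncated singular value decomposition to obtain the LSI dimensions; that step is omitted here.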

Specificity analysis

K562 cell bulk ChIP-seq peaks for H3K27ac, H3K27me3 and RNA Pol2 Ser-2 and Ser-5 phosphate were downloaded from ENCODE¹⁷. Because the fraction of reads in peaks metric can be sensitive to the peak set used, we opted to use previously reported ENCODE peaks throughout our analysis as much as possible. Ser-2 and Ser-5 phosphate peaks were combined using the reduce function from the GenomicRanges R package. Fragment counts for K562 cells in the bulk-cell and single-cell dataset were quantified for each peak using the scanTabix function in the Rsamtools R package, with counts normalized according to the total sequencing depth for each dataset. To assess the targeting specificity in single-cell NTT-seq, we computed the coefficient of determination (R²) between peak counts for each pair of assays and between bulk-cell and single-cell data for the same assay. We visualized relative peak counts for each assay for each peak by creating a ternary plot using the ggtern R package³³. To assess the low-dimensional neighbor structure obtained using each assay or combinations of assays, we computed the fraction of k-nearest neighbors for each cell i that belonged to the same cell type classification as cell i (k = 50 for single-modality neighborhoods and variable k per cell for multimodal neighbor graph due to the WNN method).
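
The neighbor-based metric at the end of this paragraph can be sketched as follows. This is a minimal Python illustration that assumes the neighbor lists have already been computed; the function name and the toy data are hypothetical. Because each cell carries its own neighbor list, the same function covers both a fixed k and a variable k per cell.

```python
def neighbor_purity(neighbors, labels):
    """For each cell, the fraction of its neighbors that share its label.

    neighbors: neighbors[i] holds the indices of the neighbors of cell i
    labels:    one cell type label per cell
    """
    fractions = []
    for i, nbrs in enumerate(neighbors):
        same = sum(1 for j in nbrs if labels[j] == labels[i])
        fractions.append(same / len(nbrs))
    return fractions


# Toy example with four cells; cell 2 has a single neighbor (variable k).
labels = ["K562", "K562", "HEK", "HEK"]
neighbors = [[1, 2], [0, 3], [3], [2, 1]]
print(neighbor_purity(neighbors, labels))
```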

multi-CUT&Tag comparison

To create a fragment file for the published multi-CUT&Tag dataset, raw sequencing data from Gopalan et al.⁶ were downloaded from the National Center for Biotechnology Information Sequence Read Archive and split into separate FASTQ files according to their Tn5 barcode using a custom Python script. Reads were mapped to the hg38 genome using bwa-mem2, and fragment files were created as described above for the NTT-seq datasets. Code to reproduce this analysis is available on GitHub: https://github.com/timoast/multi-ct. We ran the CountFragments function in Signac to count the total number of fragments per cell for each multi-CUT&Tag assay and retained cells with >200 total counts for further analysis, as described in the original publication⁶. For mixed-barcode fragments, we added ½ count to the total of each of the two assays matching the pair of Tn5 barcodes. To compute the targeting specificity, we downloaded published ENCODE ChIP-seq peaks for H3K27me3 and H3K27ac for mouse embryonic stem cells (ENCFF008XKX and ENCFF360VIS) and computed the fraction of fragments in peak regions using the scanTabix function in the Rsamtools R package, normalizing counts according to the total sequencing depth for the dataset. We also computed the R² between H3K27me3 and H3K27ac as described above, using the ENCODE peak regions.
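
The handling of mixed-barcode fragments can be made concrete with a short sketch. This Python example is illustrative only; the input layout (one tuple per fragment giving the cell barcode and the assay assigned to each fragment end) is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def count_fragments(fragments):
    """Tally per-cell, per-assay fragment counts.

    fragments: iterable of (cell_barcode, assay_at_end_1, assay_at_end_2).
    A fragment whose two ends carry different Tn5 barcodes contributes
    half a count to each of the two assays.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for cell, assay_1, assay_2 in fragments:
        if assay_1 == assay_2:
            totals[cell][assay_1] += 1.0
        else:
            totals[cell][assay_1] += 0.5
            totals[cell][assay_2] += 0.5
    return {cell: dict(per_assay) for cell, per_assay in totals.items()}


toy = [
    ("c1", "H3K27ac", "H3K27ac"),
    ("c1", "H3K27ac", "H3K27me3"),  # mixed-barcode fragment
    ("c1", "H3K27me3", "H3K27me3"),
]
print(count_fragments(toy))
```

Splitting the count keeps each fragment's total contribution at one, so per-cell totals are not inflated by mixed fragments.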

PBMC datasets

Read mapping

Genomic reads were mapped and processed as described above for the cell culture single-cell dataset. ADT reads were processed using Alevin³⁴. We first created a salmon index³⁵ for the BioLegend TotalSeq-A antibody panel, with the --features -k7 parameters. We quantified counts for each ADT barcode using the salmon alevin command with the following parameters: --naiveEqclass, --keepCBFraction 0.8, --bc-geometry 1[1-16], --umi-geometry 2[1-10], --read-geometry 2[71-85].
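
The geometry parameters above specify which read and which base positions hold the cell barcode, the UMI and the antibody tag. The Python sketch below shows the corresponding slicing, reading each geometry string as a read number followed by a 1-based inclusive range; it illustrates the layout only and is not part of the Alevin workflow. The function name and example reads are hypothetical.

```python
def extract_by_geometry(read1, read2):
    """Slice barcode, UMI and feature sequence using 1-based inclusive ranges."""

    def span(seq, start, end):
        # Convert a 1-based inclusive range to a Python slice.
        return seq[start - 1:end]

    return {
        "cell_barcode": span(read1, 1, 16),  # read 1, bases 1-16
        "umi": span(read2, 1, 10),           # read 2, bases 1-10
        "feature": span(read2, 71, 85),      # read 2, bases 71-85
    }


# Hypothetical reads built so that each segment is easy to recognize.
read1 = "A" * 16 + "C" * 4
read2 = "G" * 10 + "T" * 60 + "ACGTACGTACGTACG" + "N" * 5
print(extract_by_geometry(read1, read2))
```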

Quantification, quality control, and dimension reduction

Genomic bins were quantified using the AggregateTiles function in Signac, with binsize = 5,000 and min_counts = 1 to quantify 5-kb bins genome-wide, retaining bins with at least one count. We retained cells with <40,000 and >300 H3K27me3 counts, <10,000 and >100 H3K27ac counts and <10,000 and >100 ADT counts. We normalized the ADT data with a centered log ratio transformation using the NormalizeData function in Seurat, with normalization.method = ‘CLR’ and margin = 2. We reduced the dimensionality of the ADT assay by first scaling and centering the protein expression values and then running principal component analysis (PCA) (ScaleData and RunPCA functions in Seurat). We computed a two-dimensional UMAP visualization using the first 40 principal components (PCs) and clustered cells using the Louvain community detection algorithm. We identified and removed two low-quality clusters containing higher overall ADT counts as well as higher counts for naive IgG antibodies included in the staining panel. After removing low-quality ADT clusters, we reduced the dimensionality of the H3K27me3 and H3K27ac assays using LSI (FindTopFeatures, RunTFIDF and RunSVD functions in Signac) and created two-dimensional UMAPs using LSI dimensions 2–30 for each chromatin assay. To construct a low-dimensional representation using all three data modalities, we ran the WNN algorithm, using the first 40 ADT PCs and LSI dimensions 2–30 for H3K27me3 and H3K27ac (FindMultiModalNeighbors function in Seurat). We clustered cells on the WNN graph using the Smart Local Moving algorithm³² (FindClusters function in Seurat with algorithm = 3 and resolution = 1). Cell clusters were manually annotated as cell types using the protein expression information. To compare the low-dimensional structure obtained using individual chromatin modalities or combinations of modalities, we computed for each cell i the fraction of neighboring cells annotated as the same cell type as cell i. We repeated this computation using neighbor graphs built from single data modalities or from weighted combinations of modalities computed using the WNN method.
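
The centered log ratio step used for the ADT data can be illustrated with a short sketch. The exact formula applied by NormalizeData is not given in the text, so the log1p-based variant below is an assumption, shown here for the counts of a single cell across all antibodies (the per-cell direction implied by margin = 2). The function name is hypothetical.

```python
import math


def clr(values):
    """Centered log ratio transform of one cell's ADT counts.

    Assumed variant: each count is divided by exp(mean of log1p over all
    antibodies, with zero counts contributing 0), then passed through log1p.
    """
    log_sum = sum(math.log1p(v) for v in values if v > 0)
    denominator = math.exp(log_sum / len(values))
    return [math.log1p(v / denominator) for v in values]


# Toy ADT counts for one cell across four antibodies.
print(clr([120, 3, 0, 45]))
```

Dividing by a per-cell reference level makes values comparable between cells that differ in total ADT recovery.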

ENCODE data comparison

Peaks and genomic coverage bigWig files for H3K27me3 and H3K27ac ChIP-seq published by the ENCODE consortium¹⁷ for B cells, CD34⁺ CMPs and CD14⁺ monocytes were downloaded from the ENCODE website (https://www.encodeproject.org/). We created bigWig files for each corresponding cell type identified in the single-cell multiplexed NTT-seq PBMC dataset by writing sequenced fragments for those cells to a separate BED file, creating a bedGraph file using the bedtools genomecov command³⁶ and creating a bigWig file using the UCSC bedGraphToBigWig tool. We computed the genomic coverage for NTT-seq datasets and ChIP-seq datasets within H3K27me3 and H3K27ac regions using the DeepTools multiBigwigSummary function³⁰ with the --outRawCounts option set to write the per-region coverage values to a text file. We computed the correlation between peak region coverage in NTT-seq and ENCODE ChIP-seq datasets using the cor function in R with method = ‘spearman’. We computed the fraction of fragments per cell falling in ENCODE H3K27me3 and H3K27ac ChIP-seq peak regions for PBMCs for each assay as described above.
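
Spearman correlation, as used above to compare peak region coverage, is the Pearson correlation of the ranks of the two variables. The Python sketch below spells this out with average ranks for ties; it is an illustration under that standard definition and not the R code used in the study. The function names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy coverage values for five peak regions in two datasets.
print(spearman([0.1, 2.0, 5.5, 3.2, 9.0], [1.0, 4.0, 30.0, 8.0, 50.0]))
```

Working on ranks makes the comparison insensitive to the different dynamic ranges of ChIP-seq and NTT-seq coverage.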

CUT&Tag-pro data comparison

Processed CUT&Tag-pro H3K27me3 and H3K27ac datasets for human PBMCs were downloaded from Zenodo: https://zenodo.org/record/5504061. We compared the number of ADT counts in NTT-seq and scCUT&Tag-pro datasets by extracting the total number of ADT counts per cell from the scCUT&Tag-pro and NTT-seq Seurat objects and plotting the distribution of total ADT counts per cell for each dataset. We created bigWig files for each scCUT&Tag-pro dataset by first creating a bedGraph file using the bedtools genomecov function and then creating a bigWig file using the UCSC bedGraphToBigWig function. We computed the coverage for scCUT&Tag-pro datasets within H3K27me3 and H3K27ac PBMC ENCODE peaks using the multiBigwigSummary function in DeepTools as described above for the ENCODE data comparison.

BMMC dataset

Read mapping

Raw genomic reads were mapped and processed as described above for the cell culture single-cell dataset.

Quantification, quality control and dimension reduction

Genomic bins were quantified using the AggregateTiles function in Signac, with binsize = 5,000 and min_counts = 1 to quantify 5-kb bins genome-wide, retaining bins with at least one count. We retained cells with <10,000 and >100 H3K27me3 counts and <10,000 and >75 H3K27ac counts for further analysis. We normalized the counts and reduced dimensionality for each assay by running the RunTFIDF, RunSVD and RunUMAP functions in Signac and Seurat for each assay. We computed a WNN graph for H3K27me3 and H3K27ac using the FindMultiModalNeighbors function in Seurat, with reduction = list(‘lsi.me3’, ‘lsi.ac’) and dims.list = list(2:50, 2:80) to use LSI dimensions 2–50 and 2–80 for H3K27me3 and H3K27ac, respectively. A two-dimensional UMAP was created using the WNN graph by running the RunUMAP function in Seurat with nn.name = ‘weighted.nn’ to use the pre-computed neighbor graph. We clustered cells using the WNN graph using the Smart Local Moving community detection algorithm³² (FindClusters function in Seurat with algorithm = 3, resolution = 3 and graph.name = ‘wsnn’). We computed the fraction of fragments per cell falling in ENCODE PBMC H3K27me3 and H3K27ac ChIP-seq peak regions for each assay as described above.

Cell annotation

To annotate cell types, we performed label transfer²¹ using the H3K27ac assay and a previously published scATAC-seq dataset containing healthy human bone marrow cells²⁰. As the original publication mapped reads to the hg19 genome, we re-processed the original reads using the 10x Genomics cellranger-atac version 2 software with default parameters, aligning to the hg38 genome. Code to reproduce this analysis is available on GitHub: https://github.com/timoast/MPAL-hg38. To transfer cell type labels from the scATAC-seq dataset to our multimodal NTT-seq dataset, we quantified scATAC-seq peaks using the H3K27ac assay and then performed TF-IDF normalization on the resulting count matrix using the IDF value from the scATAC-seq dataset. We performed LSI on the scATAC-seq BMMC dataset using the RunTFIDF and RunSVD functions in Signac with default parameters. We next ran the FindTransferAnchors function in Seurat, with reduction = ‘lsiproject’, dims = 2:30 and reference.reduction = ‘lsi’ to project the query data onto the reference scATAC-seq LSI using dimensions 2–30, and we found anchors between the reference and query dataset. We ran TransferData with weight.reduction = bmmc_ntt[[‘lsi.me3’]] and dims = 2:50 to weight anchors using LSI dimensions 2–50 from the H3K27me3 assay. We used these unsupervised cell type predictions as a guide when assigning cell clusters to cell types.

Trajectory analysis

We subsetted the BMMC dataset to contain cells annotated as HSPC, GMP/CMP, pre-B, B or plasma cells. Using the subset object, we constructed a new UMAP dimension reduction by running FindTopFeatures, RunTFIDF and RunSVD in Signac, followed by RunUMAP in Seurat with reduction = ‘lsi’, for each assay. We then constructed a joint low-dimensional space using the WNN method by running the FindMultiModalNeighbors function in Seurat. We converted the Seurat object containing these cells to a SingleCellExperiment object using the as.cell_data_set function in the SeuratWrappers package (https://github.com/satijalab/seurat-wrappers). We next ran Monocle 3 (ref. ²²) using the pre-computed UMAP dimension reduction constructed using both chromatin modalities by running the cluster_cells, learn_graph and order_cells functions, setting the HSPC cells as the root of the trajectory. To find genomic features in each assay whose signal depended on pseudotime state, we quantified fragment counts for each cell in each 10-kb genome bin for the H3K27me3 and H3K27ac assays. To reduce the sparsity of the measured signal, we averaged counts for each genomic region across the cell’s 50 nearest neighbors, defined using the H3K27me3 neighbor graph with LSI dimensions 2–20 and normalized the fragment counts by the total neighbor-averaged counts per cell. For each genomic region, we computed the Pearson correlation between the signal in the genomic region and the cell’s position in pseudotime. To find regions that underwent coordinated activation or repression, we selected regions with a Pearson correlation >0.2 or <−0.2 and a difference in Pearson correlation between the H3K27me3 and H3K27ac assays greater than 0.5 (for example, −0.25 correlation for H3K27me3 and +0.25 for H3K27ac). To display genomic regions in a heat map representation, we ordered cells based on their pseudotime rank and ordered genomic regions based on the position in pseudotime showing maximal H3K27me3 signal. For the purpose of visualization, we smoothed the signal for each genomic region by applying a rolling sum function with cells ordered based on pseudotime, summing the signal over 100-cell windows. This was performed using the roll_sum function in the RcppRoll R package (version 0.3.0).
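
The rolling-sum smoothing used for the heat map can be sketched as below. This Python version returns one value per full window and applies no padding; how window alignment and edges were handled in the original analysis is not stated, so that choice is an assumption. The function here is a stand-alone illustration and not the RcppRoll implementation.

```python
def roll_sum(values, window):
    """Sum over every full window of consecutive values (no padding)."""
    if window > len(values):
        return []
    current = sum(values[:window])
    out = [current]
    for i in range(window, len(values)):
        # Slide the window by one: add the new value, drop the oldest.
        current += values[i] - values[i - window]
        out.append(current)
    return out


# Toy signal for one genomic region, with cells already ordered by pseudotime.
signal = [0, 1, 0, 0, 2, 1, 0, 3]
print(roll_sum(signal, window=3))
```

In the analysis described above the window would be 100 cells; a window of 3 is used here only to keep the example readable.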

We used the ClosestFeature function in Signac to identify the closest gene to each genomic region correlated with pseudotime. Genomic regions where the closest gene was >50,000 bp away were removed (21 genes for H3K27me3 and seven genes for H3K27ac). To examine the gene expression patterns of these genes, we downloaded a previously integrated and annotated scRNA-seq dataset for the human bone marrow, produced as part of the HuBMAP consortium (https://zenodo.org/record/5521512)²⁰,³⁷,³⁸. We subset the scRNA-seq object to contain the same cell states that we examined in the NTT-seq data (HSC, LMPP, CLP, pro-B, pre-B, transitional B, naive B, mature B and plasma) and computed a gene module score for the active and repressed genes using the AddModuleScore function in Seurat.

To compare changes in scATAC-seq signal across the B cell developmental trajectory, we also downloaded a previously published BMMC scATAC-seq dataset²⁰ and subset the cells belonging to the B cell trajectory using the published cell type annotations provided by the original authors. We quantified the same set of genomic regions used in the scNTT-seq BMMC analysis and created a similar B cell developmental trajectory by assigning a numeric value to each B cell type according to its relative position along the known developmental trajectory (1 = HSC, 2 = CMP/LMPP, 3 = CLP, 4 = B and 5 = plasma) and computed the Pearson correlation between each genomic region and the B cell trajectory.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The datasets generated in this study are available from the National Center for Biotechnology Information Gene Expression Omnibus (GSE212588)³⁹ and Sequence Read Archive (SRP395379)⁴⁰. Processed single-cell R objects are available from Zenodo (https://zenodo.org/record/7102159)⁴¹. Data collected from PBMCs with cell surface protein expression are available from dbGaP (phs003068.v1.p1). nb–Tn5 fusion plasmids developed in this study are available from Addgene (184285, 184286, 184287 and 184288). The following publicly available datasets were used in this study: GSE195725, GSE157910, GSE139369 and GSM5227096.

Code availability

Signac 1.7.0 (ref. ¹⁴) and Seurat 4.1.0 (ref. ¹⁶) were used for all analysis and are available from CRAN: https://cran.r-project.org/package=Signac and https://cran.r-project.org/package=Seurat. Code to reproduce analyses is available on GitHub: https://github.com/stuart-lab/nanobody (ref. ⁴²). NTT-seq resources can be found at https://ntt-seq.com.

References

Kaya-Okur, H. S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).
Carter, B. et al. Mapping histone modifications in low cell number and single cells using antibody-guided chromatin tagmentation (ACT-seq). Nat. Commun. 10, 3747 (2019).
Wang, Q. et al. CoBATCH for high-throughput single-cell epigenomic profiling. Mol. Cell 76, 206–216 (2019).
Janssen, S. M. & Lorincz, M. C. Interplay between chromatin marks in development and disease. Nat. Rev. Genet. 23, 137–153 (2022).
Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
Gopalan, S., Wang, Y., Harper, N. W., Garber, M. & Fazzio, T. G. Simultaneous profiling of multiple chromatin proteins in the same cells. Mol. Cell 81, 4736–4746 (2021).
Meers, M. P., Llagas, G., Janssens, D. H., Codomo, C. A. & Henikoff, S. Multifactorial profiling of epigenetic landscapes at single-cell resolution using MulTI-Tag. Nat. Biotechnol. https://doi.org/10.1038/s41587-022-01522-9 (2022).
Pleiner, T., Bates, M. & Görlich, D. A toolbox of anti-mouse and anti-rabbit IgG secondary nanobodies. J. Cell Biol. 217, 1143–1154 (2018).
Saha, K., Bender, F. & Gizeli, E. Comparative study of IgG binding to proteins G and A: nonequilibrium kinetic and binding constant determination with the acoustic waveguide device. Anal. Chem. 75, 835–842 (2003).
Hassanzadeh-Ghassabeh, G., Devoogdt, N., De Pauw, P., Vincke, C. & Muyldermans, S. Nanobodies and their potential applications. Nanomedicine 8, 1013–1026 (2013).
Amini, S. et al. Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing. Nat. Genet. 46, 1343–1349 (2014).
Tie, F. et al. CBP-mediated acetylation of histone H3 lysine 27 antagonizes Drosophila Polycomb silencing. Development 136, 3131–3141 (2009).
Zaborowska, J., Egloff, S. & Murphy, S. The pol II CTD: new twists in the tail. Nat. Struct. Mol. Biol. 23, 771–777 (2016).
Stuart, T., Srivastava, A., Madad, S., Lareau, C. A. & Satija, R. Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341 (2021).
Becht, E. et al. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. https://doi.org/10.1038/nbt.4314 (2018).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Zhang, B. et al. Characterizing cellular heterogeneity in chromatin state with scCUT&Tag-pro. Nat. Biotechnol. 40, 1220–1230 (2022).
Wu, S. J. et al. Single-cell CUT&Tag analysis of chromatin modifications in differentiation and tumor progression. Nat. Biotechnol. 39, 819–824 (2021).
Granja, J. M. et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat. Biotechnol. 37, 1458–1465 (2019).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
Tedesco, M. et al. Chromatin velocity reveals epigenetic dynamics by single-cell profiling of heterochromatin and euchromatin. Nat. Biotechnol. 40, 235–244 (2022).
Picelli, S. et al. Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 24, 2033–2040 (2014).
Hawkins, J. A., Jones, S. K. Jr, Finkelstein, I. J. & Press, W. H. Indel-correcting DNA barcodes for high-throughput sequencing. Proc. Natl Acad. Sci. USA 115, E6217–E6226 (2018).
Kaya-Okur, H. S., Janssens, D. H., Henikoff, J. G., Ahmad, K. & Henikoff, S. Efficient low-cost chromatin profiling with CUT&Tag. Nat. Protoc. 15, 3264–3283 (2020).
Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).
Vasimuddin, M., Misra, S., Li, H. & Aluru, S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. in: 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 314–324 (2019).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. DeepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).
Li, H. Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27, 718–719 (2011).
Waltman, L. & van Eck, N. J. A smart local moving algorithm for large-scale modularity-based community detection. Eur. Phys. J. B 86, 471 (2013).
Hamilton, N. E. & Ferry, M. ggtern: ternary diagrams using ggplot2. J. Stat. Softw. 87, 1–17 (2018).
Srivastava, A., Malik, L., Smith, T., Sudbery, I. & Patro, R. Alevin efficiently estimates accurate gene abundances from dscRNA-seq data. Genome Biol. 20, 65 (2019).
Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Oetjen, K. A. et al. Human bone marrow assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry. JCI Insight 3, e124928 (2018).
HuBMAP Consortium. The human body at cellular resolution: the NIH Human biomolecular atlas program. Nature 574, 187–192 (2019).
Stuart, T. et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. Gene Expression Omnibus. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE212588 (2022).
Stuart, T. et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. Sequence Read Archive. https://www.ncbi.nlm.nih.gov/sra/?term=SRP395379 (2022).
Stuart, T. et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. https://zenodo.org/record/7102159 (2022).
Stuart, T. et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. https://github.com/stuart-lab/nanobody (2022).

Acknowledgements

This work was supported by the National Institutes of Health (grants K99HG011489 to T.S. and RM1HG011014-01 to R.S. and D.L.). B.Z. is a postdoctoral fellow of the Jane Coffin Childs Memorial Fund for Medical Research. This investigation has been aided by a grant from the Jane Coffin Childs Memorial Fund for Medical Research. We thank members of the Satija and NYGC Technology Innovation laboratories for feedback on the manuscript.

Author information

Authors and Affiliations

Center for Genomics and Systems Biology, New York University, New York, NY, USA
Tim Stuart, Bingjie Zhang & Rahul Satija
New York Genome Center, New York, NY, USA
Tim Stuart, Bingjie Zhang, Levan Mekerishvili, Dan A. Landau & Rahul Satija
Technology Innovation Lab, New York Genome Center, New York, NY, USA
Stephanie Hao, Silas Maniatis & Ivan Raimondi
Weill Cornell Medicine, New York, NY, USA
Levan Mekerishvili, Dan A. Landau & Ivan Raimondi

Authors

Tim Stuart
Stephanie Hao
Bingjie Zhang
Levan Mekerishvili
Dan A. Landau
Silas Maniatis
Rahul Satija
Ivan Raimondi

Contributions

I.R. conceived the study. I.R., S.H., B.Z. and L.M. performed experiments. T.S. and I.R. performed computational analysis. D.A.L., S.M. and R.S. supervised the study. T.S. and I.R. wrote the manuscript, with input from all authors.

Corresponding author

Correspondence to Ivan Raimondi.

Ethics declarations

Competing interests

In the past 3 years, R.S. has worked as a consultant for Bristol Myers Squibb, Regeneron and Kallyope and served as a Scientific Advisory Board member for ImmunAI, Resolve Biosciences, NanoString and the NYC Pandemic Response Lab. I.R. and S.M. have filed a patent application based on this work (US provisional application no. 63/276,533). The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks Junyue Cao, Andrew Adey and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Table 1 Quality metrics of datasets generated in this work

Extended Data Table 2 nb–Tn5 adapter and custom oligo sequences

Extended Data Fig. 1 Design and evaluation of nb-Tn5.

a Nanobody-Tn5 fusion protein plasmid map schematic showing position of Tn5 and secondary nanobody sequences. b Agarose DNA gel showing size-separation of PCR-amplified DNA sequencing library products for different combinations of nb-Tn5 and primary IgG antibody. Rabbit Ab: rabbit primary IgG antibody; Mouse Ab: mouse primary IgG antibody; IgG1 Ab: mouse IgG subtype 1 primary antibody; IgG2a Ab: mouse IgG subtype 2a primary antibody; rTn5: anti-rabbit IgG secondary nanobody-Tn5 fusion; mTn5: anti-mouse IgG secondary nanobody-Tn5 fusion; G1T: anti-mouse IgG1 secondary nanobody-Tn5 fusion; G2aT: anti-mouse IgG2a secondary nanobody-Tn5 fusion. The gel shows the expected library amplification product (bands between 200 and 1,000 bp) in lanes where the nb-Tn5 fusion matches the primary IgG antibody (rabbit Ab + rTn5; mouse Ab + mTn5; IgG1 Ab + G1T; IgG2a Ab + G2aT). Replicates were not performed. c Scatterplots showing normalized fragment counts for H3K27me3 and H3K27ac peaks defined by ENCODE¹⁷ for bulk multiplexed and non-multiplexed NTT-seq experiments in human PBMCs. Peaks are colored according to their chromatin modality (red: H3K27me3 peak, yellow: H3K27ac peak). The coefficient of determination (R²) between experiments is shown above each scatterplot. d Scatterplots showing normalized fragment counts for H3K27me3, H3K27ac, and RNAPII peaks defined by ENCODE¹⁷ for bulk multiplexed and non-multiplexed NTT-seq experiments in K562 cells. Peaks are colored according to their chromatin modality (red: H3K27me3; yellow: H3K27ac; blue: RNAPII).

Extended Data Fig. 2 Data sensitivity comparison across multimodal chromatin profiling methods.

a Total reads and fragment counts per cell for multiCUT&Tag⁶ and scNTT-seq. Read and fragment counts on y-axis are on a log10 scale. multiCUT&Tag profiled only two marks, H3K27ac and H3K27me3, and so does not have RNAPII counts. Box-plot lower and upper hinges represent first and third quartiles. Upper/lower whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Data beyond the whiskers are plotted as single points. b Fraction of fragments falling in ENCODE peak regions for H3K27me3 and H3K27ac marks, for multiCUT&Tag (red) and scNTT-seq (blue). Box-plots constructed as for panel a. c Scatterplot showing the normalized insertion counts in H3K27me3 and H3K27ac ENCODE peak regions for the multiCUT&Tag mESC single-cell dataset. d Multimodal genome browser view of a representative genomic locus, for K562 cells. Top three tracks show H3K27ac, H3K27me3, and RNAPII profiled simultaneously in a single-cell experiment. Lower three tracks show H3K27ac, H3K27me3, and RNAPII profiled individually in bulk-cell NTT-seq experiments using K562 cells.

Extended Data Fig. 3 Sensitivity and reproducibility of scNTT-seq.

a Total read and fragment counts per cell and fraction of fragments in peaks (FRiP) for scCUT&Tag and scNTT-seq PBMC datasets. Box-plot lower and upper hinges represent first and third quartiles. Upper/lower whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Data beyond the whiskers are plotted as single points. b Comparison of total unique antibody-derived tag (ADT) counts sequenced per cell for CUT&Tag-pro¹⁸ and scNTT-seq. c Spearman correlation between H3K27me3 counts (top) or H3K27ac counts (bottom) for cells profiled using multiplexed single-cell NTT-seq, or FACS-sorted bulk ChIP-seq profiled by ENCODE¹⁷. d Two-dimensional UMAP projection and clustering for a second PBMC scNTT-seq replicate profiling H3K27me3 and H3K27ac. UMAP representation was constructed using both modalities, using the weighted nearest neighbors (WNN) method. e Scatterplots showing the number of fragment counts per H3K27me3 and H3K27ac ENCODE peak region for each assay profiled in the second PBMC scNTT-seq replicate dataset. f Total read and fragment count and FRiP distributions for H3K27me3 and H3K27ac assays profiled in the second PBMC scNTT-seq replicate dataset. g Pearson correlation between H3K27me3 and H3K27ac marks across PBMC scNTT-seq replicate datasets.
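
The fraction of fragments in peaks (FRiP) reported per cell in the caption above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the function name `frip_per_cell`, the tuple layouts, the half-open coordinate convention, and the rule that any overlap with a peak counts are all assumptions.

```python
from collections import defaultdict


def frip_per_cell(fragments, peaks):
    """Return {barcode: fraction of that cell's fragments overlapping a peak}.

    fragments: iterable of (chrom, start, end, barcode)  (hypothetical layout)
    peaks:     iterable of (chrom, start, end)
    Coordinates are assumed to be half-open intervals [start, end).
    """
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        peaks_by_chrom[chrom].append((start, end))

    total = defaultdict(int)
    in_peaks = defaultdict(int)
    for chrom, start, end, barcode in fragments:
        total[barcode] += 1
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < p_end and p_start < end
               for p_start, p_end in peaks_by_chrom.get(chrom, ())):
            in_peaks[barcode] += 1

    return {barcode: in_peaks[barcode] / total[barcode] for barcode in total}
```

The linear scan over peaks keeps the sketch short; for genome-scale peak sets an interval index or sorted binary search would be the more practical choice.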

Extended Data Fig. 4 Accuracy of scNTT-seq applied to human BMMCs.

a Scatterplot showing the number of counts per H3K27me3 and H3K27ac peak for each assay, for BMMCs profiled using single-cell multiplexed NTT-seq. Peaks are colored according to their assay (red: H3K27me3 peaks; yellow: H3K27ac peaks). b Fraction of fragments in ENCODE peaks per cell, for H3K27ac and H3K27me3 marks. Box-plot lower and upper hinges represent first and third quartiles. Upper/lower whiskers extend to the largest/smallest value no further than 1.5x the interquartile range. Data beyond the whiskers are plotted as single points.

Supplementary information

Reporting Summary

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Stuart, T., Hao, S., Zhang, B. et al. Nanobody-tethered transposition enables multifactorial chromatin profiling at single-cell resolution. Nat Biotechnol 41, 806–812 (2023). https://doi.org/10.1038/s41587-022-01588-5

Received: 08 March 2022
Accepted: 24 October 2022
Published: 19 December 2022
Issue Date: June 2023
DOI: https://doi.org/10.1038/s41587-022-01588-5

This article is cited by

Scalable single-cell profiling of chromatin modifications with sciCUT&Tag
- Derek H. Janssens
- Jacob E. Greene
- Steven Henikoff
Nature Protocols (2024)
Droplet-based single-cell joint profiling of histone modifications and transcriptomes
- Yang Xie
- Chenxu Zhu
- Bing Ren
Nature Structural & Molecular Biology (2023)
Beyond assembly: the increasing flexibility of single-molecule sequencing technology
- Paul W. Hook
- Winston Timp
Nature Reviews Genetics (2023)
Multimodal chromatin profiling using nanobody-based single-cell CUT&Tag
- Marek Bartosovic
- Gonçalo Castelo-Branco
Nature Biotechnology (2023)
SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks
- Carmen Bravo González-Blas
- Seppe De Winter
- Stein Aerts
Nature Methods (2023)
