Combinatorial single-cell profiling of major chromatin types with MAbID

Gene expression programs result from the collective activity of numerous regulatory factors. Studying their cooperative mode of action is imperative to understand gene regulation, but simultaneously measuring these factors within one sample has been challenging. Here we introduce Multiplexing Antibodies by barcode Identification (MAbID), a method for combinatorial genomic profiling of histone modifications and chromatin-binding proteins. MAbID employs antibody–DNA conjugates to integrate barcodes at the genomic location of the epitope, enabling combined incubation of multiple antibodies to reveal the distributions of many epigenetic markers simultaneously. We used MAbID to profile major chromatin types and multiplexed measurements without loss of individual data quality. Moreover, we obtained joint measurements of six epitopes in single cells of mouse bone marrow and during mouse in vitro differentiation, capturing associated changes in multifactorial chromatin states. Thus, MAbID holds the potential to gain unique insights into the interplay between gene regulatory mechanisms, especially for low-input samples and in single cells.

Gene expression programs result from the collective activity of numerous regulatory factors.Studying their cooperative mode of action is imperative to understand gene regulation, but simultaneously measuring these factors within one sample has been challenging.Here we introduce Multiplexing Antibodies by barcode Identification (MAbID), a method for combinatorial genomic profiling of histone modifications and chromatin-binding proteins.MAbID employs antibody-DNA conjugates to integrate barcodes at the genomic location of the epitope, enabling combined incubation of multiple antibodies to reveal the distributions of many epigenetic markers simultaneously.We used MAbID to profile major chromatin types and multiplexed measurements without loss of individual data quality.Moreover, we obtained joint measurements of six epitopes in single cells of mouse bone marrow and during mouse in vitro differentiation, capturing associated changes in multifactorial chromatin states.Thus, MAbID holds the potential to gain unique insights into the interplay between gene regulatory mechanisms, especially for low-input samples and in single cells.
Gene regulation involves the coordinated activity of many factors at different genomic scales.At a large scale, chromosomes reside in distinct nuclear territories 1,2 that are further partitioned into compartments of similar chromatin states 3,4 .At a local scale, DNA methylation 5 , histone post-translational modifications (PTMs) 6 and chromatin remodeling complexes 7 synergistically modulate transcriptional activity.The interplay between these factors ultimately determines cellular identity and function.To understand the mechanisms governing gene expression, technologies capable of simultaneously measuring multiple gene regulatory states are required.
Recently, methods have been developed to profile up to three histone PTMs in the same cell [21][22][23][24][25][26] .Such multifactorial strategies hold great potential for dissecting the underlying coordination and mechanistic basis that govern gene regulation.These methodologies generally build on antibody detection followed by Tn5-mediated tagmentation 27 and have thus far been implemented to measure three epitopes per cell, all residing in active chromatin or facultative heterochromatin [21][22][23][24][25][26] .Whether this strategy can be extended to profile increasingly complex sets of epitopes remains unresolved, especially including those of constitutive and inaccessible heterochromatin.
Here we present Multiplexing Antibodies by barcode Identification (MAbID), a method that employs standard restriction-digestion and ligation steps.With MAbID, combined measurements of epitopes across all major chromatin types can be obtained, including facultative and constitutive heterochromatin.We used secondary or primary antibody-DNA conjugates to generate low-input and single-cell readouts for up to six epitopes simultaneously.We demonstrate that with MAbID Article https://doi.org/10.1038/s41592-023-02090-9(Fig. 1b and Extended Data Fig. 2a).Genome-wide MAbID signal correlates best with publicly available ChIP-seq and CUT&Tag data of matching histone PTMs, with mean Pearson's correlation coefficients ranging from 0.24 to 0.50 for active chromatin types and 0.24 to 0.46 for heterochromatin types (Fig. 1c and Extended Data Fig. 2b).On a local scale, MAbID profiles show expected patterns of enrichment and similarity to ChIP-seq or DamID datasets (Fig. 1d and Extended Data Fig. 2c).
To further explore the specificity of MAbID, we calculated FRiP (Fraction of Reads in Peaks) scores over ChromHMM domains or lamina-associated domains (LADs).For non-normalized MAbID data, FRiP scores are lower with respect to public data from other approaches (Extended Data Fig. 2d).However, the enrichment of normalized data (Signal Enrichment in Peaks, SEiP score) is in a similar range to measurements obtained with orthogonal methods (Extended Data Fig. 2d).All MAbID samples display signal enrichment over the expected genomic regions, irrespective of chromatin type (Fig. 1e).Moreover, for histone PTMs associated with active gene expression, the signal scales according to the transcriptional activity of genes (Fig. 1f).
Next, we determined the resolution of MAbID by quantifying the signal distribution of H3K4me3 over transcription start sites (TSSs) and H3K27me3 over Polycomb-group domains (ChromHMM) compared with corresponding ChIP-seq datasets (Extended Data Fig. 2e).H3K4me3 signal decays to 50% (compared with 100% at the TSS) at 3-4-kilobase (kb) distance from the top of the peak.For H3K27me3, this distance corresponds to 7-8 kb around the domain border.Compared with ChIP-seq, MAbID signal thus generally extends an additional 1-2 kb in either direction.Finally, we investigated the compatibility of MAbID with profiling of other chromatin-binding proteins.We focused on the zinc-finger transcription factor CTCF, and SUZ12, a subunit of the Polycomb Repressive Complex 2. Both proteins display the expected enrichment of signal over publicly available ChIP-seq peaks or ChromHMM domains (Extended Data Fig. 2f).In summary, these results show that MAbID can accurately profile diverse epigenetic modifications and chromatin-binding proteins in as little as 1,000 cells.

Multiplexing MAbID measurements in one sample
MAbID is designed to multiplex several antibodies in the same sample to profile many epigenetic landscapes simultaneously.To test this, we performed MAbID with four antibodies of different host-origin, along with uniquely barcoded secondary antibody-DNA conjugates specific for each host (Fig. 2a and Supplementary Table 1).We first assessed differences in data quality between individual and multiplexed measurements.All samples group on epitope, with high concordance between biological replicates (Fig. 2b).Importantly, the yield and read-statistics are comparable between samples incubated with a single antibody or all four antibodies simultaneously (Extended Data Fig. 3a,b).Moreover, genome-wide mean Pearson's correlation coefficients with public ChIP-seq data are generally independent of the number of multiplexed antibodies (Fig. 2c and Extended Data Fig. 3c).Finally, we assessed the local distribution of signal for single and combined samples, which differences in single-cell chromatin states between closely related cell types can be detected and that the method is suitable to classify different cell types obtained from primary mouse bone marrow (BM).We anticipate that MAbID will provide an approach to obtain insights into the multifaceted regulation of gene expression in dynamic and complex biological systems.

MAbID enables genomic profiling of diverse chromatin states
To multiplex measurements of several chromatin states within one sample, we devised a strategy to uniquely barcode antibodies and map epitope positions on chromatin through specific restriction-ligation steps.To this end, we covalently conjugated antibodies to DNA adapters using click-chemistry [28][29][30] (Extended Data Fig. 1a,b).The MAbID protocol (Fig. 1a) involves: (1) collection of ∼250,000 cells, nuclei isolation and mild fixation; (2) incubation with primary antibodies followed by incubation with barcoded secondary antibody-DNA conjugates; (3) fluorescence-activated cell sorting (FACS); (4) genomic digestion with MseI, recognizing TTAA sequence motifs; (5) dephosphorylation of the digested genome to prevent self-ligation of the genome and enhance integration of the antibody-adapter; (6) NdeI digestion of the antibody-adapter, leaving an MseI-compatible overhang with a 5′ phosphate; and (7) proximity ligation of the antibody-adapter to the genome to mark the genomic position of the epitope.The protocol continues with (8) lysis and protein degradation followed by ( 9) NotI digestion to enable subsequent ligation of a sample-adapter with a unique sample barcode (Extended Data Fig. 1b).The sample-adapter enables pooling of multiple samples for linear amplification and subsequent Illumina library preparation 13 .
We first benchmarked MAbID against public datasets by using biological replicates of 1,000 K562 nuclei and performing individual measurements of several histone PTMs, RNA Polymerase II and the Lamin B1 protein.We generated secondary antibody-DNA conjugates and employed these in combination with different primary antibodies (Supplementary Table 1) to compare the quality of multiple genomic profiles in parallel.On average, 78.9% of the reads contained the correct sequence structure consisting of antibody and sample barcodes, and 97.7% of the uniquely aligned reads are located at expected TTAA sequence motifs (Extended Data Fig. 1c).A control sample in which the primary antibody was omitted during the first incubation step serves as an input (mock immunoprecipitation, IP) dataset for normalization.The control signal largely mirrors chromatin accessibility, as compared with publicly available assay for transposase-accessible chromatin with sequencing (ATAC-seq) data (Extended Data Fig. 1d).This necessitates normalization of the MAbID profiles to the control to effectively remove the accessibility component (Extended Data Fig. 1d-f).
Visualization of the normalized data by uniform manifold approximation and projection (UMAP) shows good concordance between biological replicates and consistent grouping of the 1,000-cell MAbID samples with corresponding chromatin immunoprecipitation followed by sequencing (ChIP-seq) datasets obtained from millions of cells  show matching profiles on both broad and more narrow genomic scales (Fig. 2d).Compared with publicly available ChIP-seq data, the signal enrichments are highly similar and located at the expected genomic regions (Fig. 2e and Extended Data Fig. 3d).
In combined experiments, we unexpectedly observed relatively high correlation between histone H3 and H3K27me3 ChIP-seq data (Extended Data Fig. 3c).This similarity is also apparent upon comparison of H3 genomic profiles for individual and combined measurements (Extended Data Fig. 3d).We anticipate that this is caused by cross-reactivity of the anti-sheep secondary antibody-DNA conjugate with the primary rabbit H3K27me3 IgG and therefore excluded H3 from subsequent analysis.To rule out the possibility of cross-reactivity between other secondary antibody-DNA conjugates, we performed multiplexed MAbID experiments, with combinations in which one of the three antibodies was excluded.The data are comparable between the different combinations, verifying that the signal is indeed specific to the corresponding IgG (Extended Data Fig. 3e).
We also noted that MAbID signal with the Polymerase II CTD Ser5P antibody is unexpectedly enriched over the gene body, in addition to the expected enrichment at the TSS 31,32 (Extended Data Fig. 3f).This may indicate that this antibody has broader affinity for other CTD phospho-modifications.Regardless, the signal scales with gene expression output, indicating that the antibody marks transcriptional activity (Extended Data Fig. 3f).Together, these results show that MAbID enables robust identification of the genomic localization of several epitopes in a multiplexed assay.H3K36me3 ChIP-seq domains Genomic barcoding can be tailored to the epitope of interest Next, we sought to increase the modularity of MAbID by including another pair of restriction enzymes in addition to MseI and NdeI.This allows tailoring to the epitope of interest by increasing the theoretical resolution and potentially enhancing signal.We paired MboI with BglII to target GATC sequence motifs, because of (1) the ability of MboI to digest cross-linked chromatin 33,34 , (2) the difference in genomic distribution to the TTAA motif (Extended Data Fig. 4a) and (3) the high fraction of mappable genome across chromatin types (Extended Data Fig. 4b).This extension of the method enables mixing of secondary antibody-DNA conjugates targeting both motifs in a single reaction.We tested this strategy with a combination of three antibodies in single or combined (each with TTAA or GATC) measurements.Based on motif enrichment, secondary antibody-DNA conjugates were created with NdeI-compatible adapters (TTAA) for H3K36me3 and BglII-compatible adapters (GATC) for H3K27me3 and Pol II CTD Ser5P (Extended Data Fig. 4a).Additionally, a sample was included in which each epitope was targeted by both types of secondary antibody-DNA conjugate (both TTAA and GATC), to increase the theoretical number of potential ligation events.Samples consistently group on epitope and display the expected enrichment of signal, regardless of the motif or experimental setting (Extended Data Fig. 4c,d).The signal resolution increases 1-2 kb in the combined (TTAA and GATC) sample compared with the individual samples, as measured by the decay of H3K27me3 signal over Polycombgroup ChromHMM domain borders (Extended Data Fig. 4e).Moreover, the overall complexity and read numbers are similar across samples (Extended Data Fig. 4f,g).These outcomes underscore the robustness and flexibility of MAbID, while the modular design creates the opportunity to tailor experiments to the genomic distribution of the target.In the following experiments, we match restriction enzyme pairs with the epitope of interest.

Primary antibody conjugates increase multiplexing potential
To increase the number of multiplexed measurements, we explored the potential of directly conjugating the antibody-adapter to primary antibodies.This is more challenging, because (1) only one primary antibody-DNA conjugate can bind per epitope and (2) conjugation could potentially affect epitope binding of monoclonal primary antibodies.Nevertheless, implementation of primary antibody-DNA conjugates eliminates the dependency on antibody host-origins and vastly increases the potential number of multiplexed measurements.We selected antibodies against a variety of chromatin types and conjugated each to a uniquely barcoded antibody-adapter, with slight modifications to account for differences in storage buffer compositions (Extended Data Fig. 5a).
We performed MAbID in biological replicates of 1,000 K562 nuclei (Fig. 3a).Genomic profiles largely overlap with those from publicly available ChIP-seq data and display overall similarity to corresponding MAbID samples obtained with secondary antibody-DNA conjugates (Fig. 3b,c).The MAbID signal amplitudes and signal-to-noise ratios are somewhat lower compared with MAbID performed with secondary antibodies.This is most apparent for H3K4me3 and H3K36me3, presumably relating to the narrow genomic windows of enrichment for these epitopes.Nevertheless, UMAP embedding shows all samples clustering on chromatin type and their respective secondary antibody-DNA conjugate sample (Fig. 3d).Moreover, the yield and read-statistics are as expected and the biological replicates correlate well (Extended Data Fig. 5b,c).We examined the epitope-specificity of the primary antibody-DNA conjugates further by comparing the signal enrichment with secondary antibody-DNA conjugate samples (Fig. 3e).The signal distribution over respective ChromHMM or public ChIP-seq domains is highly comparable, validating that the antibodies maintain specificity after conjugation.Collectively, these results confirm that MAbID performed with primary antibody-DNA conjugates generates specific genomic profiles for different chromatin types.
Finally, we tested the performance of primary antibody-DNA conjugates in a multiplexed setting, by selecting six epitopes encompassing a comprehensive set of chromatin types.K562 cells were incubated with the mix of primary antibody-DNA conjugates and sorted as samples of 100 nuclei in 384-well plates.The multiplexed MAbID samples group with the previously generated 1,000-cell individual samples, verifying the similarity between the sample types (Extended Data Fig. 5d).Genome-wide correlations with publicly available ChIP-seq data are comparable between individual and combined measurements (Extended Data Fig. 5e).Thus, MAbID with primary antibody-DNA conjugates potentiates profiling of an increasingly complex set of histone PTMs and chromatin-binding proteins.

scMAbID measures epigenetic states at single-cell resolution
We previously optimized single-cell genomic profiling methods using 384-well plates and liquid-handling robots.We therefore integrated these protocols with MAbID to generate multiplexed epigenomic measurements in single cells (scMAbID).To investigate the ability of scMAbID to discern chromatin states between different cell types, we differentiated mouse embryonic stem cells (mESCs) towards early neural progenitor cells (early NPCs) 35 .Both cell types were incubated with the mix of six primary antibody-DNA conjugates targeting a range of chromatin types (Fig. 4a).In parallel, human K562 cells were processed in the same plates to benchmark scMAbID against the bulk datasets (Fig. 4a).The human or mouse origin was assigned in silico with a median number of misannotated reads below 0.4%, indicating that the cell of origin can be robustly assigned (Extended Data Fig. 6a).
A total of 1,956 K562 cells, 1,424 mESCs and 1,424 early NPCs were sequenced and 1,248, 674 and 849 cells passed the quality thresholds, respectively (Extended Data Fig. 6b).The median number of unique counts per cell after filtering is 2,715 for K562 cells, 2,281 for mESCs and 2,842 for early NPCs, and the median count per epitope ranges between 119 and 706 per cell (Fig. 4b).These numbers are in a similar range to those reported by other methods measuring two or three histone PTMs simultaneously 21,23,24,26 (Extended Data Fig. 6c,d).nano-CUT&Tag 23 is the notable exception, substantially outperforming all other methods in terms of read counts (Extended Data Fig. 6c,d).scMAbID is the only plate-based protocol, resulting in lower throughput (Extended Data Fig. 6d).However, the recovery of cells after sequencing is equal to the other approaches and the combination with FACS provides opportunities for selecting cells of interest (Extended Data Fig. 6d).
To verify the specificity of scMAbID data, the unique reads of K562 cells were combined to generate in silico populations (ISPs).scMAbID ISP profiles display a comparable distribution to matching bulk MAbID, ChIP-seq or DamID datasets (Fig. 4c).The correspondence between scMAbID ISP and public reference datasets is moderately lower than observed for bulk MAbID, which is possibly related to the lower read numbers obtained in single-cell measurements.Nevertheless, UMAP visualization shows grouping of scMAbID ISP samples with their respective 1,000-cell counterparts, illustrating the genome-wide similarity between these datasets (Fig. 4d).
Next, we calculated FRiP scores for each epitope in single cells using public reference domains.High FRiP scores are observed for all epitopes over their corresponding domain, while these are considerably lower for unrelated chromatin types (Fig. 4e).We also compared scMAbID FRiP scores with those calculated with publicly available MulTI-Tag 26 and NTT-seq 24 datasets obtained in K562 cells (Extended Data Fig. 6e).NTT-seq moderately outperforms scMAbID, especially on active chromatin types, but overall FRiP scores are on a comparable scale (Extended Data Fig. 6e).scMAbID FRiP scores for H3K9me3 and Lamin B1 in LADs are markedly higher than expected for a random distribution (Extended Data Fig. 6e).Since these epitopes were not Article https://doi.org/10.1038/s41592-023-02090-9measured with the other multifactorial approaches, direct comparisons in constitutive heterochromatin were unattainable.
Finally, we sought to determine if the epitope-specific information from each individual cell enables separation of samples by chromatin state.We took all epitope measurements passing a threshold of 150 unique counts per cell (n = 6,729) and embedded these within UMAP space.Cells consistently separate based on chromatin type, with similar types mixing together (Fig. 4f).This is independent of read depth per cell or epitope (Extended Data Fig. 6f).Collectively, these results validate that scMAbID generates specific epigenetic profiles at single-cell resolution in multiplexed experiments of six epitopes.

Integrating measurements of multifactorial chromatin states
Next, we explored the ability of scMAbID to discern between mESCs and early NPCs based on multiplexed epigenomic profiles.mESC scMAbID ISP samples display similar genomic distributions to those observed with ChIP-seq or bulk MAbID (Extended Data Fig. 7a).For both cell types, FRiP scores of single-cell epitope measurements are higher for the corresponding chromatin types with respect to other regions (Extended Data Fig. 7b).Moreover, the single-cell epitope measurements with the highest depth (mESCs, n = 1,800; early NPCs, n = 1,800) mainly separate on their respective chromatin type (Extended Data Fig. 7c).Overall, these results validate the epitope-specific scMAbID measurements in mESCs and early NPCs.We noticed that while UMAP embedding is mostly driven by chromatin signature, mESCs and early NPCs do separate within the same chromatin type (Extended Data Fig. 7c).We wondered whether integration of all multiplexed measurements would improve separation.Therefore, we computed the combined epigenomic information for each cell by calculating cell-similarity ( Jaccard) matrices per epitope and performing dimensionality reduction on the summed matrices 16 .Upon cluster assignment, 97.8% of the mESCs and 70.2% of early NPCs are assigned to their cellular origin based on integrated chromatin signatures (Fig. 4g,h and Extended Data Fig. 7d).This confirms that multifactorial chromatin profiles contain sufficient information to accurately separate closely related cell types.Interestingly, 29.8% of early NPCs are annotated as mESCs, presumably because these cells failed to differentiate or maintained a more pluripotent state (Fig. 4h).We subsequently labeled the clusters as 'pluripotent' and 'differentiated'.We next assessed the contribution of each modality to the assignment of cellular states.To examine this, we used the information gain metric 36 to systematically determine the accuracy of cluster assignments with reduced sets of epitopes.The information gain metric increases with the inclusion of additional epitopes, underlining the added value of multiplexing measurements (Extended Data Fig. 7e).Unsurprisingly, H3K27me3, H3K4me1 and H3K27ac contributed most to cluster assignment, as these are reported to be valuable predictors of cell type and developmental stage (Extended Data Fig. 7f) 16,20 .

scMAbID captures epigenetic transitions upon X inactivation
Female mESCs undergo X-chromosome inactivation (XCI) 37 upon differentiation.This involves major changes in the distribution of several histone PTMs.Our female mESCs are of hybrid origin (Cast/ EiJ × 129SvJae), which enables the separation of parental alleles based on single-nucleotide polymorphisms.We therefore addressed whether multiplexed scMAbID data could be utilized to assign cells and alleles that underwent XCI and identify their multifactorial epigenetic signatures.
The inactive X-allele (Xi) is associated with a marked increase in H3K27me3 levels 38,39 .Therefore, we calculated the allelic ratio of unique H3K27me3 counts on the X-chromosome to identify cells that underwent XCI and their respective Xi-allele.As expected, XCI cells predominantly reside in the differentiated cluster (Fig. 4i and Extended Data Fig. 7g).Overall, cells of the differentiated cluster display increased H3K27me3 levels across the X-chromosome and over Hox gene clusters (Extended Data Fig. 7h).Interestingly, only 4.0% of early NPCs labeled  Article https://doi.org/10.1038/s41592-023-02090-9 as pluripotent have undergone XCI, compared with 27.0% of early NPCs assigned to the differentiated cluster, implying that these cells indeed failed to exit pluripotency.Lastly, we determined the occupancy of H3K4me1 and H3K9me3 on the Xi compared with the active X-allele.The levels of H3K4me1, marking enhancer regions, are decreased on the Xi as expected during early stages of XCI 40,41 (Fig. 4j).H3K9me3 levels on the Xi are increased, in line with previous reports showing that this mark increases coincidently with the accumulation of Xist RNA on the inactive X-allele 40,42 (Fig. 4j).These results highlight the potential of MAbID to capture single-cell multifactorial dynamics in chromatin states along differentiation trajectories.

Identifying gene expression signatures in mouse BM
As a further application, we investigated the performance of scMAbID on primary tissue.To this end, we isolated primary BM cells from mice and performed ethanol fixations to preserve overall cell structure.Cells were incubated with a combination of six primary antibody-DNA conjugates, with each antibody conjugated to both TTAA-and GATC-compatible antibody-adapters to maximize genomic coverage.Subsequently, cells were stained with five fluorescently labeled antibodies against BM cell-surface markers to isolate cell types by FACS, including granulocytes and erythroblasts from the myeloid lineage as well as B cells, T cells and natural killer cells from the lymphoid lineage (Fig. 5a and Extended Data Fig. 8a).In parallel, K562 cells are processed in each well as an internal control.The unique counts per cell and epitope are similar across cell types, even though these are lower compared with our previous observations for scMAbID experiments (Extended Data Fig. 8b).We retained 3,433 BM cells (of 4,862 sequenced cells, 70.6% recovery) after quality control, ranging from 471 to 969 cells per cell type (Extended Data Fig. 8b).
To assess the biological information in the BM scMAbID dataset, we integrated all epitope measurements per cell, as done before 16 .First, BM scMAbID ISP samples were generated by summing integrated-epitope matrices per 384-well plate for each cell type.UMAP visualizations of these samples show consistent separation between the myeloid and lymphoid lineages, as well as grouping on cell type (Fig. 5b).For single-cell integrated-epitope samples, we used three-dimensional diffusion map embedding 43,44 to preserve global structures (Fig. 5c).The diffusion components are primarily driven by lineage and although grouping on cell types is evident, it is less strong than in the ISP-based analysis (Fig. 5c and Extended Data Fig. 8c).Nonetheless, these results affirm that scMAbID can be used to obtain specific multifactorial chromatin states from primary cells.
Next, we intersected scMAbID measurements with cell typespecific gene expression programs.To achieve this, we identified genes unique to (1) each lineage and (2) each cell type based on public sortChIC data 45 .The specificity of these marker gene sets was validated by Gene Ontology (GO) enrichment analysis (Extended Data Fig. 8d).Because T cells were not part of the reference data, we compared these with the closely related natural killer cells.We then calculated scMA-bID signal over the top 50 marker genes of each lineage or cell type.We focused on enhancer epitopes H3K4me1 and H3K27ac because of the relative high epitope counts.The scMAbID ISP signals for these epitopes are strongly enriched over lineage-and cell type-specific genes (Fig. 5d,e).The lower enrichment for erythroblasts is likely related to the more lenient gating strategy for this cell type, resulting in a less pure population (Fig. 5e and Extended Data Fig. 8a).T cells show a high signal enrichment over natural killer cell marker genes, as expected (Fig. 5e).Moreover, average single-cell epitope counts across marker genes show lineage and cell type specificity, generally being higher for active chromatin types compared with inactive chromatin types (Extended Data Fig. 8e,f).
Finally, we examined the potential of scMAbID data for unbiased identification of differentially expressed genes between BM cell types.
We conducted Wilcoxon rank-sum tests on the single-cell scMAbID enhancer epitope counts over all genes, by combining count numbers for H3K4me1 and H3K27ac.As a result, we obtained a small selection of significant genes, many of which are reportedly expressed in specific BM cell types (Supplementary  52,53), which were also identified in the marker gene sets of the reference data (Extended Data Fig. 8g).Even though these genes are only detected in a small fraction of cells, the enhancer epitope-count numbers are evidently highest in the expected cell type (Extended Data Fig. 8g).Together, these results show the promise of scMAbID to study cell type-specific gene expression programs in primary tissues.

Discussion
Recent advancements of single-cell multi-omics strategies empower deeper analyses of gene regulation, but multiplexing measurements of several epigenetic modifications remains challenging [9][10][11]16,19 . We inroduced MAbID, a method for combined single-cell profiling of histone PTMs and chromatin-binding proteins, to obtain joint readouts of an unprecedented six epitopes encompassing all major chromatin types.
Other recent methods generating combined measurements employ Tn5 transposase to map epitope positions on chromatin 21,[23][24][25][26] .While this strategy yields high-quality single-cell profiles across different chromatin types 16,54 , it remains unclear how the intrinsic affinity for open chromatin 55 will affect combined measurements, especially in profiling constitutive heterochromatin.With Tn5-independent MAbID, we successfully performed multifactorial profiling of such epitopes located at inaccessible chromatin in combination with epitopes residing in active chromatin.
Further improvement of MAbID can be achieved by reducing background signal in accessible chromatin regions (Extended Data Fig. 1d).While the background can be corrected for by normalization, it would be desirable to further reduce this through optimization of blocking reagents, antibody titrations or improved restriction-digestion and ligation steps.Since MAbID employs restriction-ligation reactions, the signal distribution and resolution is limited to certain sequence motifs.To enhance this, additional sets of restriction enzymes could be included, such as the presented MboI/BglII restriction pair, or more sequence-unbiased genome-digestion enzymes could be incorporated, such as MNase 56 .
To our knowledge, MAbID is the first method to profile a combination of more than three epitopes, even though there is no theoretical or technical limitation towards combining more measurements for Tn5-based multifactorial approaches.As multiplexing measurements did not influence MAbID data quality, we expect that increasing the number of epitopes above six should be feasible.In this regard, the efficiency of the conjugation procedure is critical in obtaining high-quality data.Especially for monoclonal antibodies, it is imperative to validate the binding potency and specificity towards the epitope after conjugation.
MAbID's plate-based protocol provides the opportunity to select specific cells from a larger population by FACS, which can be a powerful strategy to enrich for rare cell types.We validated this by sorting five discrete cell types from mouse BM, using fluorescently labeled antibodies against cell-surface markers.This approach can not only reduce sequencing costs, but also allows for the addition of labeled control cells during antibody incubation, improving the efficiency and thereby enabling the protocol to work with increasingly low cell numbers.On the other hand, plate-based assays have limited throughput, which can be resolved through future implementation of combinatorial-indexing strategies 57 .
A common challenge for all current multifactorial methods, including MAbID, is the low coverage obtained from single cells 21,23,24,26 .This sparsity hampers studying relationships between epitopes, such as investigating co-occupancy.A recent study by Gopalan et al. 21tackled this in bulk samples by capturing reads containing two epitope-specific Article https://doi.org/10.1038/s41592-023-02090-9barcodes.Such a strategy could be incorporated in MAbID through a few modifications of the protocol, such as PCR-based amplification.
We anticipate that MAbID, as an orthogonal method to the existing tagmentation-based approaches, will contribute to the advancement of the single-cell multi-omics field to study the combined epigenetic landscapes of complex biological systems in integrated experiments.

Mouse BM isolation and ethanol fixation
All mice used in this study were bred and maintained in the Hubrecht Institute Animal Facility.Experimental procedures were approved by the Animal Experimentation Committee of the Royal Netherlands Academy of Arts and Sciences and performed according to guidelines.C57BL/6NCrl genotype mice were used for BM isolations, four female littermates of 9 weeks old.To isolate BM cells, tibia and femur bones from the hindlegs were removed.The top of the bone was removed and marrow was flushed out using a syringe with HBSS buffer (Gibco, 14025092).Cells were isolated from the marrow by pipetting and poured through a 70-µm cell strainer (Greiner, 542070) before diluting in 25 ml of HBSS buffer, followed by centrifuging for 10 min at 300g at 4 °C.Supernatant was removed, and cells were resuspended in 10 ml of PBS and centrifuged at 500g for 5 min.After removing supernatant, cells were counted using a TC20 Automated Cell Counter (BioRad, 1450102) and diluted in 300 µl of PBS per 1 × 10 6 cells.Per 300 µl of PBS, 700 µl of ice-cold ethanol (100%, Boom, 84028185) was added dropwise while vortexing, to reach a final 70% ethanol concentration.Cells were fixed for 1 h at −20 °C.Next, cells were washed with wash buffer 1 (WB1; 20 mM HEPES pH 7.5 (Gibco, 15630-056), 150 mM NaCl, 66.6 µg ml −1 Spermidine (Sigma, S2626), 1 × cOmplete protease inhibitor cocktail (Roche, 11697498001), 0.05% Tween20 (Sigma, P9416), 2 mM EDTA).Cells were stored in WB1 supplemented with 10% dimethylsulfoxide (Calbiochem, 317275) and frozen at −80 °C.

Antibodies
For antibodies, see Supplementary Table 1.
Primary antibody-DNA conjugates.Primary antibody-DNA conjugations were performed as described in the previous section, 'Secondary antibody conjugates', with minor modifications.Primary antibodies were first cleaned using the Abcam Antibody Purification Kit (Protein A) (Abcam, ab102784) following manufacturer's instructions (performing overnight incubation at 4 °C in the spin cartridge on a rotor at 8 r.p.m.).All elution phases were taken.Purified antibodies were concentrated using an Amicon Ultra-0.5 NMWL 10-kDa centrifugal filter, after which 350 µl of 100 mM NaHCO 3 was added and concentrated again to exchange buffers.Concentrated antibody was measured on the Nanodrop 2000.Subsequent steps were performed as described from the DBCO-PEG4-NHS incubation onwards.

Cell collection and fixation
K562 cells, mESCs and early NPCs.Cells were collected (∼10 × 10 6 cells) and washed with PBS.All centrifugation steps were at 200g for 4 min at 4 °C.Cells were fixed in 1% formaldehyde (Sigma, F8875) in PBS for 5 min, before adding 125 mM final concentration of glycine (Sigma, 50046) and placing on ice.Samples were kept cold for all subsequent steps and incubations performed on a tube roller.Cells were washed three times with PBS before resuspension in WB1 (20 mM HEPES pH 7.5 (Gibco, 15630-056), 150 mM NaCl, 66.6 µg ml −1 Spermidine (Sigma, S2626), 1 × cOmplete protease inhibitor cocktail (Roche, 11697498001), 0.05% saponin (Sigma, 47036), 2 mM EDTA) and transferred to a protein LoBind Eppendorf tube (Eppendorf, EP0030108116-100EA).Cells were permeabilized for 30 min at 4 °C.BSA (Sigma, A2153) was added to 5 mg ml −1 final concentration and incubated for 60 min at 4 °C.Permeabilized nuclei were used for antibody incubation.Note that formaldehyde fixation (using saponin-containing wash buffers) can be replaced with ethanol fixation (using Tween20-containing wash buffers) to preserve the cellular membrane and enable immunostainings for cell-surface markers.See the section 'Mouse BM isolation and ethanol fixation'.

Antibody incubations
All centrifugation steps were at 200g for 4 min at 4 °C and incubations were performed on a tube roller.See Supplementary Table 1 for antibodies and concentrations.
Secondary antibody-DNA conjugates.Permeabilized nuclei (or cells) were counted on a TC20 Automated Cell Counter.Nuclei were diluted to ∼2.5 × 10 6 cells per ml in WB1, and 200 µl (∼500,000 nuclei) was used for each primary antibody incubation.Primary antibody (unconjugated) was added and nuclei were incubated overnight at 4 °C.A control sample without primary antibody was taken along.Next, nuclei were washed two times with WB2 and resuspended in 200 µl of WB2 containing Hoechst 34580 at 1 µg ml −1 .Secondary antibody conjugated to an ABBC adapter (see the 'Antibody-DNA conjugation' section) was added (2 µg ml −1 ) and incubated for 1 h at 4 °C.Finally, nuclei were washed two times with WB2 and resuspended in 500 µl of WB2 before FACS.
BM antibody-fluorophore conjugate incubations.Hoechst staining was omitted for BM cells.Following primary antibody-DNA conjugate incubation, BM cells were washed once with WB2 (0.05% Tween20 instead of saponin) and resuspended in 400 µl of WB2 containing 5% Blocking Rat Serum (Sigma, R9759) per 1 × 10 6 cells.Cells were incubated with a set of commercial antibody-fluorophore conjugates against BM surface markers.Incubations were performed for 30 min at 4 °C.Samples were kept dark from this point onwards.Finally, cells were washed once with WB2 and resuspended in 1 ml of WB2 before FACS.

FACS
Nuclei (or cells) were pipetted through a Cell Strainer Snap Cap into a Falcon 5-ml Round Bottom Polypropylene Test Tube (Fisher Scientific, 10314791) before sorting on a BD Influx, BD FACsJazz or Beckman Coulter Cytoflex SRT cell sorter.Nuclei were sorted in G1/S cell-cycle phase, based on Hoechst.For BM cells, gates were set for each of the selected cell types using individually stained samples (Extended Data Fig. 8a) and all cell-cycle phases were included.For 1,000-cell samples, nuclei were sorted into a PCR tube strip containing 5 µl of 1 × CutSmart buffer (NEB, B7204S) per well, volume after sorting ∼7.5 µl per tube.For samples with 100 cells or fewer, nuclei or cells were sorted into 384-well PCR plates (BioRad, HSP3831) containing 200 nl of 1 × CutSmart buffer and 5 µl of mineral oil (Sigma, M8410) per well.Plates were sealed with aluminum covers (Greiner, 676090).
Since the majority of the analyses were performed in R, we created an R package (mabidR) to load, normalize and analyze the datasets.Technical and biological replicates were merged after verification that separate datasets were of high quality and in agreement.Genome browser tracks for bulk MAbID data represent positive log 2 (observed/ expected) values (log 2 (O/E)) of merged replicate datasets.

Accessibility normalization
Normalization of the data was performed by calculating reads per kilobase million values (RPKM) for both samples and control (see equation 1) and calculating the fold change over control with a pseudocount value of 1 (see equation 2).Raw counts of control and H3K27me3 data were compared with TTAA motif coverage and public ATAC-seq signal at 5-kb resolution.Pearson's r correlation coefficients were calculated between ATAC-seq and H3K27me3 MAbID signal (raw and normalized).Furthermore, we aligned control, Lamin B1 and H3K4me3 MAbID signals over LADs (10-kb resolution) and active TSS regions (5-kb resolution) to ascertain that the normalization was neither too lenient nor too harsh.

Correlation analyses
Pearson's r correlation coefficients between bulk normalized MAbID, ENCODE ChIP-seq (signal P value), CUT&Tag (counts) and DamID (Dam-only normalized) datasets were calculated at 5-kb resolution.Negative values (due to either biological or technical reasons) were omitted.Bins with low variability (μ ± 2σ) or in blacklisted regions were omitted.

FRiP and SEiP
To evaluate the sensitivity and specificity of MAbID, we calculated the FRiP.The input regions (peaks) were derived from the 18-state K562 ChromHMM model 61,62 and public LAD calls 67 .All peaks underwent filtering to exclude small peaks (∼5 kb, active TSS > 1.5 kb), and states were merged in the cases of Enhancers (Enh*), Polycomb domains (ReprPC*), Active TSS (TssA & TssFlnk*) and Transcription (Tx*).We allowed for at most 100,000 entries.
Considering that MAbID data are typically normalized, we devised the SEiP metric which incorporates both positive and negative noninteger values.Unlike FRiP, SEiP employs the average signal over peaks instead of the read-sum.To obtain the expected background distributions, we employed a 1,500-fold randomization strategy using randomizeRegions from the regioneR package 75 .This background distribution was then used to scale the observed SEiP scores: To compare the resulting metrics, we referenced public datasets of ChIP-seq, CUT&Tag and DamID at a resolution of 5 kb.

Signal enrichment
Enrichment computations were performed using computeMatrix from Deeptools v. 3.5.1 (ref.76).Polycomb-group domains were generated by merging 200-bp regions of the ChromHMM states 16-17, allowing a gap of 10 kb and filtering the resulting regions on a minimal size of 100 kb.Expression-based stratifications of gene bodies were made by splitting RNA-seq TPM-values on [0, 33.4,66.7, 100] percentiles, resulting in low/mid/high categories, respectively.

Dimensionality reduction analyses
The input for the UMAP analyses was log 2 (O/E) for the bulk and ISP approaches, and log 1 p(UMI counts) for single-cell samples, Z-score normalized before principle component analysis (PCA).To limit method-specific accessibility biases dominating dimensionality reductions, one public dataset (DamID or ChIP-seq) was included per chromatin type.The cross-epitope K562 UMAP contained epitope samples with more than 150 UMIs, belonging to a cell with more than 800 UMI counts.Only bins with counts for more than ten cells were used.For the mouse version, we kept the top 300 highest-depth samples for each epitope per cell type.
Pan-epitope mouse UMAPs were generated as described in Zhu et al. 16 : for each 100-kb [bin × cell] UMI-matrix per epitope, we computed Jaccard-distances ( D cell i,cell j = 1 − Jaccard cell i,cell j ).We rescaled each D-matrix to have values between 0 and 1 and summed these, whereafter PCA and UMAP were performed as above.
BM scMAbID datasets were filtered to include cells with at least 256 counts across epitopes.Pan-epitope ISP UMAPs were generated by first summing the per-cell counts per cell type and plate at 20-kb resolution and concatenating the vectors.We normalized for sequencing depth per plate with Seurat's SampleUMI function 77 .PCA was performed on the log-transformed normalized data, followed by UMAP on principle components 1-18.
BM scMAbID diffusion embedding was performed using Destiny 44 on the epitope-concatenated 1-Mb count matrix.Bins with fewer than 20 counts were removed to limit computational time.PCA was performed on the scaled log-transformed and depth-normalized matrix, followed by Destiny on principle components 1-4.

Benchmarking scMAbID
To assess the signal across different single-cell methods, the FRiP was calculated on applicable states of the 18-state K562 ChromHMM dataset 62 .LAD-annotations 67 were included as a high-quality constitutive heterochromatin annotation.Counts in peaks were tallied using Signac's 70 FeatureMatrix().

Information gain
Information gain was calculated by subtracting the weighted entropies of each cluster from the complete entropy.Entropy is defined as −1 × ∑ f log 2 ( f ), where f is the vector of cluster frequencies.

BM marker gene identification and analysis
As a reference for defining BM marker genes, publicly available H3K4me3 single-cell sortChIC 45 data were downloaded.Counts per million (CPM) were calculated for each cell type and each promoter region (2 kb upstream to 500 bp downstream of TSS).Per cell type, the Extended Data Fig. 6 | scMAbID data of human and mouse single cells with six multiplexed measurements.a) Percentages of reads per well mapping to the invalid reference genome per library index (mean = 0.38%).Calculations are based on wells containing cells of only one origin.b) Quality thresholds of scMAbID.Dot plot for K562, mESC and early NPC cells shows total number of unique counts per cell versus minimal number of unique counts per epitope per cell, with density plots indicating number of samples.Samples passing quality thresholds are highlighted in green.Table shows the yield (% of reads from total) for valid (at motif) or unique (based on UMI) counts for each species.c) Violin plots comparing the number of counts/reads per cell and per epitope/modality between scMAbID (K562, n = 1248; mESC/early NPC, n = 1523), Multi-CUT&Tag 21 (mESC/mTSC, n = 1949), MulTI-Tag 26 (K562, n = 376; H1 hESC, n = 373), nano-CUT&Tag 23 (mouse brain, n = 4434) and NTT-seq 24 (K562, n = 6900).Boxplots inside violin-plots show the minima, maxima, interquartile range (box bounds), and median (white dot).N, number of cells.d) Table comparing metrics of scMAbID, Multi-CUT&Tag 21 , MulTI-Tag 26 , nano-CUT&Tag 23 and NTT-seq 24 .Input cells staining -Number of cells used for antibody incubations; Input cells loaded -number of cells sorted or loaded on a 10x Chromium/ICELL8 chip; Recovered cells -number of cells reported after quality filtering; % Recoverypercentage of 'Recovered cells' from 'Input cells loaded'; Reads per cell -total reads/counts/fragments per cell; Reads per epitope -range of reads/counts/ fragments per epitope per cell.Metrics are based on the reported values of each manuscript as well as the calculated read counts from the public datasets.e) FRiP scores per single K562 cell for scMAbID (n = 1248), MulTI-Tag 26 (n = 376) and NTTseq 24 (n = 6900).FRiP scores are calculated across different ChromHMM and LAD (DamID) domains.f) UMAP of K562 scMAbID samples.Each dot represents one epitope measurement.Colouring is based on 1) the epitope, 2) total depth per cell or 3) depth per epitope.Only cells passing the threshold of 150 unique counts per epitope were included (n = 6729).k, thousands.
Extended Data Fig. 7 | Applying scMAbID to a mouse in vitro neural differentiation system.ISP, in silico population.a) Genome browser tracks comparing mESC scMAbID ISP (n = 674) samples of H3K4me1, H3K27me3 and H3K9me3 with bulk MAbID and ChIP-seq.Genes and ENCODE domain calls are indicated.Y-axis reflects positive log 2 (counts/control) for MAbID and fold change (IP/input) for ChIP-seq.b) FRiP scores of each scMAbID epitope measurement per mESC and early NPC cell.FRiP scores are calculated across different ChromHMM and LAD domains.Plotting order from back to front -H3K4me1, H3K27ac, H3K9me3, Lamin B1, H3K27me3, H3K36me3.c) UMAP of mESC and early NPC scMAbID samples.Per cell type, the top 300 highest depth measurements per epitope are included (mESC, n = 1800; early NPC, n = 1800).Colouring is based on 1) epitope, 2) total depth per cell, 3) depth per epitope or 4) cell type of origin.k, thousands.d) UMAP of integrated mouse scMAbID samples (mESC, n = 674; early NPC, n = 849).Each dot represents one cell, colouring based on 1) Leiden algorithm cluster assignment, 2) total depth per cell (log 10 transformed) or 3) cell type of origin.e) Information Gain (IG) for different numbers of epitopes included for data integration, calculated by comparing each resulting clustering to the gold standard (cells assigned as mESC pluripotent and early NPC differentiated using six integrated epitopes).Whiskers denote Standard Error of the Mean.Unique epitope-combinations, n = 6, 15, 20, 15, 6 and 1 respectively.f) Table showing sets of epitopes with the three highest IG values of (e) per number of included epitopes, colouring based on epitopes.g) Barplot with percentage of cells that have undergone XCI in the pluripotent or differentiated clusters, coloured on assigned inactive X-chromosome allele (Xi).Percentage values are indicated per bar (%).h) Genome browser tracks comparing scMAbID ISP H3K27me3 samples of pluripotent (n = 912) and differentiated (n = 611) clusters, along with mESC H3K27me3 ChIP-seq.Genomic regions are the X-chromosome and regions around Hox clusters A, B and C, highlighted with green, orange and red respectively.Y-axis reflects positive log 2 (counts/control) for MAbID and fold change (IP/input) for ChIP-seq.
Extended Data Fig. 8 | scMAbID data of primary mouse bone marrow cells with combined measurement of six epitopes.a) FACS gates to sort five specific cell types.Cells were generally gated on FSC and SSC values (gates P1 to P2) to obtain high quality single-cell samples, before passing through several gates to select each cell type.Title of each plot indicates the gate through which events passed previously.Percentages indicate events falling into the indicated gate compared to total events per plot.Granulocytes were selected as GR1 + , B cells as GR -CD19 + NK1 -, NK cells as GR -CD19 -NK1 + , T cells as GR -CD19 -NK1 -CD3 + and Erythroblasts were more leniently selected as GR -CD19 -NK1 -CD3 -TER119 high .b) Violin plots of total number of unique scMAbID counts per cell and per epitope for all cell types (Granulocytes, n = 844 cells; Erythroblasts, n = 655; B cells, n = 969; T cells, n = 471, NK cells, n = 494) after filtering on quality metrics.Boxplots inside violin-plots show the minima, maxima, interquartile range (box bounds), and median (white dot).N, number of cells.c) 3D Diffusion maps of single-cell scMAbID samples, with integrated epitope-measurements for each cell (n = 3433).Colouring is based on the lineage (left), cell type (middle) or depth (right, log 10 of the total depth per cell).DC, Diffusion component.d) GO-term analysis of differentially expressed marker genes for each lineage and cell type, based on publicly available sortChIC 45 data.The six terms with highest observed/expected are shown.e) Bargraph of the number of scMAbID counts per cell over the top 50 marker genes per lineage for each epitope.Active chromatin types and inactive chromatin types are stacked as individual bars for scMAbID counts of the myeloid-or lymphoid-lineage cells (x-axis).The title of each panel reflects the lineage-specific marker gene set.f) Bargraph as (e), per cell type instead of lineage.g) Count violins showing the number of combined scMAbID H3K4me1 and H3K27ac counts per cell over the Trem1 46,47 , Bank1 48,49 , Pik3cd 50,51 and Ccr2 52,53 genes.

Fig. 1 |
Fig. 1 | Genomic profiling of a broad range of epigenetic markers with MAbID.a, Schematic representation of the MAbID procedure.b, UMAP embedding of MAbID, ChIP-seq and DamID samples, colored on epitope with encircled chromatin types.One reference dataset is included per chromatin type.Selected Ab indicates the primary antibody used hereafter; alternative Ab represents a different primary antibody against the same epitope.c, UMAP as in b, colored by correlation with ChIP-seq samples of H3K36me3, H3K27me3 and H3K9me3.Color-scale represents the Pearson's r correlation coefficient of MAbID with ChIP-seq samples.d, Genome browser tracks of MAbID, ChIP-seq or DamID samples.Genes and ENCODE domain calls (LAD, lamina-associated domain; EnhA1, Active enhancer 1; ReprPC, repressed Polycomb) are indicated.The y axis reflects positive log 2 (counts/control) for MAbID and DamID and fold change (IP/input) for ChIP-seq.e, MAbID signal enrichment of Lamin B1 around LADs, H3K27me3 around Polycomb-group domains, as well as H3K4me3 and H3K4me1 around respective ChIP-seq peaks.Top, average enrichment; bottom, signal per genomic region (sorted on MAbID signal).The number n represents the number of genomic regions included per heatmap and the data range is indicated underneath.f, MAbID signal enrichment of H3K4me3 and H3K36me3 around active genes.Genes were stratified on expression level and categorized in percentiles.Top, average enrichment per percentile-group; bottom, signal per set of genes, ordered from high to low.n = 7,553 genes included per heatmap; the data range is indicated underneath.Ab, antibody; gDNA, genomic DNA; IP, immunoprecipitation; Mb, megabases; TES, transcription end site.

Fig. 3 |
Fig. 3 | Expanding MAbID with primary antibody-DNA conjugates.a, Schematic showing nuclei incubation with primary antibody-DNA conjugates.b, Genome browser tracks comparing MAbID samples using primary antibody-DNA conjugates with ChIP-seq or DamID samples.Genes (+, forward; −, reverse) and ENCODE/ChromHMM domain calls are indicated.The y axis reflects positive log 2 (counts/control) for MAbID and DamID and fold change (IP/input) for ChIP-seq.c, Genome browser tracks comparing MAbID samples using primary antibody-DNA conjugates or secondary antibody-DNA conjugates (in combination with a primary antibody).Genes and ENCODE/ChromHMM domain calls are indicated.Scaling is based on the minimum and maximum value per sample and the y axis reflects positive log 2 (counts/control) values.d, UMAP of MAbID replicates using primary antibody-DNA conjugates and a MAbID sample of merged replicates using secondary antibody-DNA conjugates (in combination with a primary antibody).Coloring is based on the epitope; chromatin types are encircled.e, MAbID signal enrichment of Lamin B1, H3K27me3 and H3K36me3 around the same domains/peaks from ChromHMM or ChIP-seq data, comparing MAbID samples using primary or secondary antibody-DNA conjugates.Top, average enrichment; bottom, signal per genomic region (sorted on MAbID signal).The number n represents the number of genomic regions included per heatmap and the data range is indicated underneath.LAD regions, 4D Nucleome; Polycomb-group domains, ChromHMM; H3K36me3, ChIP-seq domain calls. /doi.org/10.1038/s41592-023-02090-9

Fig. 5 |
Fig. 5 | Multifactorial chromatin profiling in primary mouse BM distinguishes cell type-specific gene expression programs.a, Schematic representation of the scMAbID experiment, incorporating BM cell-surface marker stainings.Cell images were created with BioRender.com.b, UMAP of BM scMAbID ISP samples, with integrated-epitope measurements summed per cell type, in which each dot represents the combined samples of one plate (plates, n = 13).Coloring is based on the lineage (top) or cell type (bottom).c, Three-dimensional diffusion maps of single-cell scMAbID samples, with integrated-epitope measurements for each cell (n = 3,433).Coloring is based on the lineage (top) or cell type (bottom).DC, diffusion component.d, Matrix visualizing the counts of scMAbID ISP enhancer (H3K4me1 + H3K27ac) samples per lineage over the top 50 most differentially expressed marker genes per lineage.scMAbID H3K4me1 and H3K27ac counts from each lineage were summed over all marker genes per set and normalized for the control dataset.Values reflect log 2 (counts/control).e, Matrix visualizing the counts of scMAbID ISP enhancer (H3K4me1 + H3K27ac) samples per cell type over the top 50 most differentially expressed marker genes per cell type.scMAbID H3K4me1 and H3K27ac counts from each cell type were summed over all marker genes per set and normalized for the control dataset.Values reflect log 2 (counts/control).NK, natural killer.

Extended Data Fig. 1 |
Overview of the MAbID method.a) Representative gel electrophoresis analysis of secondary antibody-DNA conjugates targeting primary IgGs of different species-of-origin.Conjugates were separated on a native polyacrylamide gel with a 4-12% gradient.Unconjugated antibody-adapter was loaded as control.Antibody-DNA conjugate preparations were repeated at least 5 times with similar results.b) Designs of the antibody-adapter and sample-adapter.Top strand of the antibody-adapter has a 5' azide modification (N 3 ) for coupling to the antibody.Fork in the double-stranded sample-adapter was created by adding 6 nt non-complementary sequences on each strand.UMI, Unique Molecular Identifier; nt, nucleotide; bp, basepair.c) Barplot of read counts (M, million) retained per computational processing step.Different segments represent separate samples used (BC 1-18), identified by the combined presence of the sample (SBC) and antibody-barcode (ABBC) within the read.Demux, demultiplexing of reads on combined barcodes; aln, alignment of reads to the genome; motif, reads mapping to the TTAA sequence motif.d) i; Genome browser tracks comparing K562 MAbID samples of H3K27me3 (Raw and Normalized) and the control with TTAA motif coverage, ATAC-seq, CUT&Tag and ChIP-seq.'Control' represents a combination of depth-normalized samples in which primary antibody was omitted, 'Raw' is the depth-normalized H3K27me3 signal, 'Normalized' is the raw sample normalized over the control.Y-axis reflects positive log 2 (counts/control) for MAbID and fold change (IP/input) for ChIP-seq and CUT&Tag.ii; Comparison of MAbID and ATAC-seq per bin of 'Raw' and 'Normalized' data, for the region in (i) and genome-wide.Pearson's r correlation coefficients between MAbID and ATAC-seq data indicated.e) i; Genome browser tracks of H3K27me3 Raw and Normalized MAbID data with ChIP-seq.Called domains are indicated below.ii; Overlap (in Mb) between H3K27me3 domains called on MAbID Raw, MAbID Normalized and ChIP-seq data.JI, Jaccard Index.f) Average signal enrichment of MAbID samples (Raw, Control or Normalized) of Lamin B1 or H3K4me3 using secondary antibody-DNA conjugates, around LADs or TSS of active genes.Y-axis reflects counts/million (for control and raw) or log 2 (counts/control) (for normalized).Extended Data Fig. 2 | Benchmarking MAbID to state-of-the-art methods.a) Boxplots showing the Pearson's r correlation coefficients between corresponding chromosomes of MAbID replicates.Asterisks (*) denote the correlation coefficients between MAbID samples using different primary antibodies against the same epitope.Boxplot represents the median (crossing line), minima, maxima, and interquartile range (bounds of box).n = 23 chromosomes.b) Heatmap of genome-wide Pearson's r correlation coefficients of MAbID with ChIP-seq (left) or CUT&Tag (right).c) Genome browser tracks of MAbID with ChIP-seq or DamID for active chromatin types on a narrow genomic scale (left) or inactive chromatin types on a broad genomic scale (right).Genes (+, forward; -, reverse) and ENCODE/ChromHMM domain calls (LAD, Laminaassociated domain; EnhA1, Active enhancer 1; ReprPC, Repressed PolyComb) are indicated.Y-axis reflects positive log 2 (counts/control) values for MAbID and DamID and fold change (IP/input) for ChIP-seq.d) FRiP (Fraction of Reads in Peaks) and scaled SEiP (Signal Enrichment in Peaks) scores comparing MAbID samples with corresponding ChIP-seq, CUT&Tag and DamID samples.Scores are calculated over different ChromHMM states (Enhancers, Transcription (state 10, TranscriptionElongation), Active TSS, Polycomb domains) and LADs.e) Average signal enrichment of MAbID or ChIP-seq for H3K27me3 or H3K4me3 around Polycomb-group domain borders (PcG, ChromHMM, left) or TSS of active genes (right) respectively.Y-axis reflects Z-score normalized values of log 2 (counts/ control) for MAbID and fold change (IP/input) for ChIP-seq.Top of the signal is called at the curve's inflection point, indicated by ^.Gray box highlights the 50% decay distance, noted in the top left corner.Dashed lines reflect linear steps of 4 kb distance from the top.f) MAbID signal enrichment of CTCF around ChIP-seq peaks (ENCODE) and SUZ12 around Polycomb-group domains (ChromHMM).Top -average enrichment, bottom -signal per genomic region (sorted on MAbID signal).N represents the number of genomic regions included per heatmap and the data range is indicated underneath.Extended Data Fig. 3 | Individual and multiplexed MAbID samples have similar data quality.a) Table of number of demultiplexed reads (demux), valid reads (aligned at TTAA motif) and the resulting yield (% of valid in demux) for individual (single) or combined (multi) samples.M; million.b) Barplot showing percentage of reads lost or retained in the different computational processing steps comparing individual (single) or combined (multi) measurements.Low mapq, low mapping quality; Non-motif, not aligned at TTAA sequence motif; Valid, reads passing quality thresholds aligning at a TTAA sequence motif; NS, non-significant difference based on two-sided Wilcoxon signed-rank test.c) Heatmap showing the genome-wide Pearson's r correlation coefficient of MAbID replicates of individual (single) or combined (multi) measurements with ChIPseq.d) Genome browser tracks comparing MAbID samples of individual (single) or combined (multi) measurements with ChIP-seq.Genes (+, forward; -, reverse) and ENCODE/ChromHMM domain calls (LAD, Lamina-associated domain; EnhA1, Active enhancer 1; ReprPC, Repressed PolyComb) are indicated.Y-axis reflects positive log 2 (counts/control) values for MAbID and fold change (IP/input) for ChIP-seq.e) Average MAbID signal enrichment of H3K27me3, H3K36me3 and Pol II CTD Ser5P around the same domains/peaks called on ChIP-seq data, comparing MAbID samples generated with different combinations of primary antibodies and corresponding secondary antibody-DNA conjugates.Y-axis reflects Z-score normalized values of log 2 (counts/ control).f) MAbID signal enrichment of individual (single) or combined (multi) measurements of Pol II CTD Ser5P and H3K36me3 around active genes.Genes were stratified on expression level by setting the top 50% as the "high"-group and the bottom 50% as the "low"-group.Top -average enrichment per group, bottom -signal per set of genes, ordered from high to low.Heatmaps are split for the individual (top) or combined (bottom) staining.N = 7553 genes included per heatmap, the data range is indicated underneath.Extended Data Fig. 4 | MAbID is customizable to the genomic context of the epitope of interest.a) Barplots showing the number of TTAA or GATC sequence motifs in the human genome distributed over ChromHMM states.Observed/ Expected (O/E) is the log 2 transformation of the number of motifs observed compared to the expected number based on the proportion of the genome per state.GATC-bias shows the difference between the O/Es for the GATC versus the TTAA motif.b) Barplot showing the genomic coverage (%) of unmappable bins across the different ChromHMM states and the whole genome (black, percentages indicated) for the TTAA motif, GATC motif or blacklisted regions 72,73 .c) UMAP embedding of MAbID replicates of individual (single) or combined (multi) measurements using different antibody-adapter types.Colouring on the targeted sequence motif, used primary antibodies are encircled.TTAA, NdeI-compatible antibody-adapter; GATC, BglII-compatible antibody-adapter; TTAA or GATC, NdeI-or BglII compatible antibody-adapter; TTAA and GATC, both types of antibody-adapter -each category indicating the antibody-adapter per epitope.d) Average MAbID signal enrichment of H3K27me3, H3K36me3 and Pol II CTD Ser5P around the same domains/peaks called on ChIP-seq data, comparing MAbID samples from combined or individual measurements using different types of antibody-adapters.Y-axis reflects Z-score normalized log 2 (counts/ control).e) Average signal enrichment of H3K27me3 MAbID samples around Polycomb-group domain borders (PcG, ChromHMM) for individual (TTAA; GATC) or combined (TTAA and GATC) measurements.Y-axis reflects Z-score normalized values of log 2 (counts/control).Vertical dashed lines -top and bottom of the signal; x -50% decay distance, noted in the top left corner (kb).Bottom right values reflect number of unique counts per sample.f) Table of demultiplexed reads (demux), valid reads (aligned at TTAA or GATC motif) and the resulting yield (% of valid in demux) for samples using different types of antibody-adapters in individual or combined samples.M; million.g) Barplot showing percentage of demultiplexed (demux) and valid (aligned at motif) reads per total number of sequenced reads per sample, comparing different antibodyadapter types and individual versus combined measurements.Extended Data Fig. 5 | MAbID with primary antibody-DNA conjugates generates specific genomic profiles in individual and combined measurements.a) Representative gel electrophoresis analysis of primary antibody-DNA conjugates targeting different epitopes.Conjugates were separated on a native polyacrylamide gel with a 4-12% gradient.Unconjugated antibody-adapter was loaded as control.Red and blue dots indicate the type of antibody-adapter used.Antibody-DNA conjugate preparations were repeated at least 5 times with similar results.b) Barplot of read counts (M, million) retained per computational processing step.Different segments represent separate samples used (BC 1-26), identified by the combined presence of the sample (SBC) and antibody-barcode (ABBC) within the read.Demux, demultiplexing of reads on combined barcodes; aln, alignment of reads to the genome; motif, reads mapping to the TTAA or GATC sequence motif.c) Boxplots showing the Pearson's r correlation coefficient between corresponding chromosomes of MAbID replicates using primary antibody-DNA conjugates.Boxplot represents the median (crossing line), minima, maxima, and interquartile range (bounds of box).n = 23 chromosomes.d) UMAP of MAbID samples from combined (multi) or individual (single, two replicates) measurements with primary antibody-DNA conjugates.Colouring is based on the epitope, chromatin types are encircled.e) Heatmap showing the genome-wide Pearson's r correlation coefficient of MAbID replicates using primary antibody-DNA conjugates for individual (single) or combined (multi) measurements with different ChIP-seq or DamID (for Lamin B1) samples.