Simultaneous single-cell three-dimensional genome and gene expression profiling uncovers dynamic enhancer connectivity underlying olfactory receptor choice

The simultaneous measurement of three-dimensional (3D) genome structure and gene expression of individual cells is critical for understanding a genome’s structure–function relationship, yet this is challenging for existing methods. Here we present ‘Linking mRNA to Chromatin Architecture (LiMCA)’, which jointly profiles the 3D genome and transcriptome with exceptional sensitivity and from low-input materials. Combining LiMCA and our high-resolution scATAC-seq assay, METATAC, we successfully characterized chromatin accessibility, as well as paired 3D genome structures and gene expression information, of individual developing olfactory sensory neurons. We expanded the repertoire of known olfactory receptor (OR) enhancers and discovered unexpected rules of their dynamics: OR genes and their enhancers are most accessible during early differentiation. Furthermore, we revealed the dynamic spatial relationship between ORs and enhancers behind stepwise OR expression. These findings offer valuable insights into how 3D connectivity of ORs and enhancers dynamically orchestrate the ‘one neuron–one receptor’ selection process.

The simultaneous measurement of three-dimensional (3D) genome structure and gene expression of individual cells is critical for understanding a genome's structure-function relationship, yet this is challenging for existing methods.Here we present 'Linking mRNA to Chromatin Architecture (LiMCA)', which jointly profiles the 3D genome and transcriptome with exceptional sensitivity and from low-input materials.Combining LiMCA and our high-resolution scATAC-seq assay, METATAC, we successfully characterized chromatin accessibility, as well as paired 3D genome structures and gene expression information, of individual developing olfactory sensory neurons.We expanded the repertoire of known olfactory receptor (OR) enhancers and discovered unexpected rules of their dynamics: OR genes and their enhancers are most accessible during early differentiation.Furthermore, we revealed the dynamic spatial relationship between ORs and enhancers behind stepwise OR expression.These findings offer valuable insights into how 3D connectivity of ORs and enhancers dynamically orchestrate the 'one neuron-one receptor' selection process.
Three-dimensional (3D) genome organization lays the physical foundation for gene expression and gene regulation [1][2][3][4][5][6] .Understanding the intricate relationship between genome architecture and gene expression necessitates the development of advanced techniques to simultaneously measure these two modalities with high sensitivity from the same cell [7][8][9][10][11][12] .Existing methods have severe limitations.Currently, imaging-based methods can only measure a limited number of genomic loci (1,000-3,660, namely every 1-3 Mb) and transcripts (70-1,000 genes) and therefore lack a genome-wide view [7][8][9][10][11] .The published sequencing-based methods, HiRES, had limited sensitivity (~0.3 million contacts per cell) because genomic DNA was damaged during reverse transcription, captured only nuclear RNAs because the cytoplasm was destroyed during the procedure and only detected the 3′ end of the transcript 12 .In addition, HiRES must be performed with a large number of cells, prohibiting analysis of low-input samples.
Here we report Linking mRNA to Chromatin Architecture (LiMCA), a sequencing-based method that simultaneously profiles single-cell 3D genome structure and full-length transcript information.In particular, LiMCA physically separates the nucleus and the cytoplasm of the same cell for measuring the 3D genome and transcriptome, respectively, and therefore does not compromise the detection sensitivity and performance of each modality.

Article
https://doi.org/10.1038/s41592-024-02239-0exhibited high concordance with a bulk in situ Hi-C contact map across various resolutions ranging from compartments to topologically associating domains and chromatin loops (Fig. 1b,c and Extended Data Fig. 1d-h).Furthermore, the gene expression profile of ensemble LiMCA displayed a high correlation with bulk RNA-seq data generated from the same cell line (Fig. 1d).Therefore, we concluded that LiMCA faithfully captures both the genome architecture and gene expression.
To examine the robustness of LiMCA, we further applied it to three different human cell lines (K562, eHAP and BJ) as well as mouse olfactory epithelium.This additional cell line dataset further validated our technique (Extended Data Fig. 1i and Extended Data Fig. 2a).Subsequently, we performed a comprehensive comparative analysis against published datasets, including HiRES 12 (single-cell joint Hi-C-RNA), Dip-C 18,26,28 (scHi-C) and single-cell RNA sequencing (scRNA-seq) data.Our results demonstrated that LiMCA detected a substantially higher number (2-5 folds) of contacts than HiRES and tissue datasets obtained through Dip-C (Fig. 1e, left).Furthermore, LiMCA exhibited a comparable number of genes detected when compared to Smart-seq, while surpassing the number of genes identified by HiRES and droplet-based scRNA-seq methods (Fig. 1e, right, and Extended Data Fig. 1c).Notably, LiMCA not only offers enhanced sensitivity but also provides full-length transcript information, in contrast to HiRES, which solely captured the 3′ end of genes.Therefore, LiMCA is capable of measuring both chromatin interactions and gene expression at high sensitivity and consistently performs well across diverse cell types.
We then performed clustering based on chromatin interaction (scA/B value; Methods) or gene expression profiles offered by LiMCA to evaluate its accuracy in distinguishing cell types.We found that both modalities clearly separated the four cell types (Fig. 1f and Extended Data Fig. 2b).To confirm accuracy, we calculated scA/B values for cell-type-specific marker genes, which showed specific enrichment in corresponding cell types (Extended Data Fig. 2c,d), consistent with our previous work 28 .Hi-C 'structural typing' identified an additional cluster containing cells from all four cell types, which belongs to the metaphase (Extended Data Fig. 2e-g).This is in line with the knowledge that the chromosome undergoes a homogeneous folding state during mitosis irrespective of cell type 29 .Furthermore, LiMCA accurately detected cell-type-specific chromatin loops (Extended Data Fig. 2h,i).Thus, we established a single-cell multi-omics assay that simultaneously measures genome-wide chromatin interactions and transcriptome-wide gene expression in hundreds of single cells.

Relationship between gene expression and 3D genome structure
With our previously developed Dip-C algorithm 26 , we showed that about 31% of GM12878 cells (68 of 220, with root-mean-square-deviation (r.m.s.d.) < 1.5) and 52% of eHAP cells (22 of 42, haploid cells) faithfully yielded 3D genome structures at a high resolution of 20 kb (Fig. 1a, Extended Data Fig. 1j and Methods).The pairwise 3D distance matrix obtained from individual single-cell structures exhibited a strong agreement with the ensemble and bulk contact maps (Extended Data Fig. 3a).With such high-resolution structures, we were able to pinpoint the spatial position of expressed genes in the nucleus.
To investigate the relationship between gene expression and chromatin structure, we focused on the NFKB1 gene, a critical transcription factor for B cell development and function.We sorted GM12878 cells into two groups based on NFKB1 expression level and compared ensemble contact maps.Our findings revealed that highly expressed NFKB1 interacts more frequently with an upstream enhancer (Fig. 1g).This observation was further validated by downsample and random sample control analysis (Extended Data Fig. 3c-h).Similar results were observed for other genes analyzed (Extended Data Fig. 3i,j), demonstrating that gene expression dynamics are associated with changes in chromatin structure.
To demonstrate the biological insights that LiMCA can generate, we applied LiMCA to the mouse olfactory system.Understanding how the 'one neuron-one receptor' paradigm is established during olfactory sensory neuron (OSN) development is a long-standing pursuit of the field.There are more than 1,000 olfactory receptor (OR) genes, which are presented as gene clusters distributed across 18 chromosomes in the mouse genome 13 ; however, each mature OSN expresses only one OR gene out of such a large repertoire in a monoallelic and seemingly stochastic manner 14 .Recent bulk and single-cell chromosome conformation capture (3C/Hi-C) studies showed that OSNs establish strong and specific inter-chromosomal interactions between OR gene clusters, which are heterochromatically modified to assure the complete silencing of OR genes 15,16 .Such OR-OR gene cluster interactions bring multi-chromosomal OR enhancers (termed the 'Greek Islands' (GIs)) together to form a super-enhancer hub 17,18 , which was proposed to activate the singular chosen OR gene, forging the 'silence all and activate one' model.
However, this model fails to address several unresolved issues.First, during OSN development, progenitors transiently express random sets of OR genes 19,20 .Additionally, the onset of multigenic OR expression precedes the formation of repressive OR-OR gene compartments.Furthermore, each OSN forms multiple enhancer aggregates, which means that simply being associated with enhancer hubs is insufficient to fully account for the singular OR gene.Unfortunately, existing bulk and single-cell techniques are unable to resolve these mysteries due to the lack of OR expression information and an inability to isolate a population expressing a random set of OR genes.Ideally, a technique that can simultaneously measure OR gene expression and 3D genome organization in the same cells would be necessary to elucidate how OR gene selection process is initiated and proceeded.
Using LiMCA and in combination with single-cell chromatin accessibility and a gene expression landscape of the developing OSNs, we gained an unprecedented view of how the accessibility of OR enhancers is regulated and how the association with multi-chromosomal enhancers regulates the stepwise OR gene selection from multigenic OR activation to singular OR gene determination.

Development of LiMCA
To enable simultaneous measurement of transcriptional activity and chromatin architecture in the same cell with high sensitivity, we employed a strategy utilizing physical separation of cytoplasm (mRNA) and nucleus (chromatin).This procedure has been used in single-cell multi-omics technologies [21][22][23] .Specifically, the separated cytoplasm was subjected to Smart-seq2 amplification for transcriptome analysis 24 , while the nucleus was proceeded to conventional chromosome conformation capture procedure 25 that included crosslinking, restriction enzyme digestion and proximity ligation (Fig. 1a and Methods).To further increase chromatin contact detection in single cells, we adopted our high-coverage transposon-based whole-genome amplification (WGA) method, META 26 , to amplify the resulting nucleus.Then the messenger RNA library and 3C library were sequenced and integrated to obtain both modalities (Fig. 1a).
To evaluate whether LiMCA precisely captures high-order genome structure, we performed a proof-of-concept experiment on GM12878, a well-studied female human lymphoblastoid cell line with an extensively characterized genome structure 27 .LiMCA detected a median of 1.08 million unique chromatin contacts per cell (n = 220, s.d.= 470,000, minimum = 130,000, maximum = 2.79 million) (Supplementary Table 1), which is comparable to our previously developed high-sensitivity single-cell Hi-C method, Dip-C 26 (Extended Data Fig. 1a).The composition of contacts is similar to Dip-C, with a greater proportion of short-range (<20 kb) and lower long-range (>20 kb) intra-chromosomal contacts.Ensemble chromatin interaction profiles (merged from 220 individual cells, referred to as ensemble LiMCA) Article https://doi.org/10.1038/s41592-024-02239-0 The positioning of genes within the nucleus, such as the radial positioning and the association with nuclear landmarks, is known to influence their expression 30 .To examine how gene positioning influences gene expression in single cells, we explored the spatial distribution of expressed genes within the nucleus.Our analysis revealed that expressed genes have a higher density in the nuclear interior and Article https://doi.org/10.1038/s41592-024-02239-0more neighbors at a given 3D distance than randomly selected controls (Fig. 1h,i and Extended Data Fig. 3k,l).Though population average radial position negatively correlated with expression level (Extended Data Fig. 3o), this is not observed in single cells (Extended Data Fig. 3m,n)., followed by a single OR gene outcompeting others during OSN maturation; however, our technique allows for simultaneous probing of both OR gene expression and 3D chromatin structure, providing an unprecedented insight into this complex process.We created a joint 3D genome and gene expression multi-omics atlas of the developing OSNs with LiMCA, consisting of 411 cells from the mouse main olfactory epithelium (MOE) across six time points (postnatal day 3, 7, 14, 28, 60 and 120) (Fig. 2a,b).We obtained an average of 650,000 unique chromatin contacts per cell (s.d.= 298,000, minimum = 119,000 and maximum = 2.8 million) (Supplementary Table 2), of which 224 (54%) have high-quality 20-kb resolution 3D structures (r.m.s.d.≤ 1.5; Supplementary Table 2).For gene expression, we detected a median of 4,528 genes per cell (Supplementary Table 2).
Upon embedding based on gene expression (Fig. 2c), we identified four clusters in RNA embedding: non-neuronal, progenitors, immature OSNs and mature OSNs (Fig. 2c, left, and Extended Data Fig. 4a).When examining the Hi-C embedding, we observed that the progenitors in RNA embedding were split into two distinct clusters, referred to as pro-genitor1 and progenitor2 (Fig. 2c, right, and Extended Data Fig. 5a,b).We further validated the separation of progenitor1 and progenitor2 by integrating our published mouse MOE data 18 , excluding the potential influence of mouse lines or contact numbers (Extended Data Fig. 5c,d).
We observed a continuous trajectory in OSN genesis, from progenitors to immature OSNs and finally to mature OSNs (Extended Data Fig. 4c,d).As expected, our LiMCA profiles recapitulated known characteristic chromatin reorganization during OSN maturation, including gradually increased chromosomal intermingling, OR-OR gene interaction and enhancer-enhancer interactions (Extended Data Fig. 5e-i).With the expression profiles of OR genes, we were able to reveal the spatial relationship between expressed OR genes and OR enhancers (Fig. 2d).
To comprehensively understand the underpinning chromatin state of OR enhancers along OSN development, we additionally generated a single-cell chromatin accessibility and gene expression atlas of the developing mouse MOE with our high-sensitivity METATAC 31 and droplet-based scRNA-seq, consisting of 11,880 cells and 73,577 cells (Fig. 2e and Extended Data Fig. 8b,c), respectively.We utilized the scRNA-seq atlas as a reference to annotate the cell types in our METATAC atlas (Extended Data Fig. 6g).The atlas allowed us to capture the dynamics of chromatin accessibility and gene expression throughout OSN development.For assay for transposase-accessible chromatin (ATAC), we detected a median of 66,000 ATAC fragments per cell (Extended Data Fig. 6b), and the gene expression yielded a median of 3,346 genes (8,651 unique molecular identifiers (UMIs)) per cell (Extended Data Fig. 8a).
Our datasets validated the changes in cell type composition between multi-potent progenitor cells and developing OSNs during the first postnatal month of development (Extended Data Fig. 6e and 8h).Notably, our dataset precisely recapitulated known cell types in MOE and their marker genes (Fig. 2c, Extended Data Fig. 6c-g and Extended Data Fig. 8b-d).Specifically, both of our single-cell chromatin accessibility and gene expression atlases captured the continuous developmental trajectory of OSNs from globose basal cell (GBC) to immediate neuronal precursor (INP), then to immature OSN (iOSN) and mature OSN (mOSN) (Fig. 2e and Extended Data Fig. 8b).Our high-resolution single-cell chromatin accessibility atlas offers a new opportunity to understand the epigenetic regulatory mechanism underlying multiple lineage specification of MOE.

Chromatin accessibility dynamics of OR enhancers
Using our high-resolution single-cell chromatin accessibility atlas, we identified 27 new enhancers (Fig. 2f and Supplementary Table 3) according to previous definitions 32,33 , which were located within OR gene clusters, exhibited ATAC peaks in mOSN, co-bound by LHX2 and EBF (Fig. 2f and Extended Data Fig. 7c-e) and contained the characteristic composite motif of LHX2 and EBF (Fig. 2g,h).The comprehensive characterization of OR enhancers proves that almost all OR gene clusters harbor at least one enhancer, implying the critical role of cis-enhancer in the regulation of OR gene expression.The absence of identified enhancers in certain small clusters may be due to the low abundance of OSNs expressing these specific OR genes.
We then analyzed the chromatin accessibility dynamics of OR genes and OR enhancers during OSN differentiation.Using our METATAC dataset, we performed pseudotime analysis to trace the developmental lineage from the GBC stage to mOSNs (Fig. 2e and Methods).Our findings revealed that OR genes initially had a closed state at the GBC stage, followed by a pervasive accessibility state at the late INP stage, corresponding to multigenic OR expression.During OSN maturation, OR genes returned to a fully inaccessible state (Fig. 2i, bottom), even more closed than non-OSN cell types (Extended Data Fig. 7b), indicating robust OR gene repression.OR enhancers were completely inaccessible at GBCs but rapidly reached peak accessibility at the late INP stage before decreasing to a lower level as OSNs matured to mOSN (Fig. 2i, top).Analysis of master transcription factors (TFs) of OR enhancers with our single-cell gene expression atlas showed that Lhx2/Ebf expression followed similar dynamics as OR enhancers along OSN development (Extended Data Fig. 8g).To further determine the temporal relationship between Lhx2 expression and OR-enhancer accessibility, we integrated METATAC and scRNA-seq data by extracting the continuous developmental trajectory from GBC to mOSN (Fig. 2j, Extended Data Fig. 8i and Methods).The integrated pseudotime analysis confirmed that Lhx2/Ebf expression clearly precedes OR-enhancer activation (Fig. 2k,l).These results suggest that the accessibility change of OR enhancers is elicited by LHX2, consistent with Lhx2 knockout eliminating GI accessibility in mOSNs 33 .
Overall, our study reveals that LHX2-activated OR enhancers reach their highest accessibility at the late INP stage, creating a highly activated environment for OR gene expression and explaining the multigenic OR activation at this stage.As OSN maturation, multiple OR genes initially activated in progenitors are silenced, leaving only one active OR gene.At the same time, the accessibility of OR genes and OR enhancers decreases as OSNs further matured, ensuring singular OR gene expression and silencing of excess ORs in mOSNs.

Spatial relationship between active OR genes and enhancers
With the paired 3D genome structure and OR gene expression profiles within the same cell, we explored the spatial relationship between OR enhancers and expressed OR genes to understand how OR gene is activated and selected.The presence of truncated and nonfunctional OR transcripts necessitates the utilization of full-length transcript information, a feature uniquely provided by LiMCA as opposed to HiRES.This capability plays a crucial role in accurately discerning genuine OR gene expression (Extended Data Fig. 9).According to OR gene expression profiles, we classified developing OSNs (progenitor, iOSN and mOSN) into three stages (Fig. 3a, Extended Data Fig. 10a-c and Supplementary Table  with multiple lowly expressed OR genes and without dominant ones; the silencing stage (stage 2) with one dominant OR gene and several weakly expressed OR genes; and the singular OR determination stage (stage 3) with only one highly expressed OR gene.We hypothesized that these three stages represent the stepwise OR gene expression starting with multigenic OR activation followed by one OR gene outcompeting the others and finally becoming the singularly determined one.
We then investigated the 3D connectivity between expressed OR genes and their enhancers at different stages.For this analysis, we included the newly identified OR enhancers.First, we focused on the progenitor stage where the expression of OR genes begins, which is rarely studied in previous research due to the lack of an available technique.At the activation stage, we observed that most activated OR genes have nearby enhancers (median distance to nearest enhancer was 2.29 radii of particle), which are predominantly cis-enhancers from the same chromosome (Fig. 3b,c and Extended Data Fig. 10d).This is consistent with weak inter-chromosomal (trans) OR gene interactions at this stage.This supports the importance and sufficiency of cis-enhancers for OR gene activation, preceding the establishment of inter-chromosomal OR-OR gene cluster interactions.Furthermore, this concept is supported by previous reports that deletion of specific GIs only downregulates the expression of limited numbers of nearby OR genes belonging to the same OR gene cluster [34][35][36] .
After multigenic OR gene activation, one specific OR gene outcompetes other OR genes and becomes the 'winner' before achieving singular OR gene expression.Nevertheless, how the association with GIs contributes to its dominance remains unclear.When analyzing OSNs undergoing OR gene silencing (stage 2), we observed that the dominant OR gene typically associates with a greater number of proximal enhancers compared to these OR genes undergoing silencing (Fig. 3d,f and Extended Data Fig. 10e).Specifically, within a proximity of 150 nm, 11 cells with the prevailing OR gene associate with more enhancers than silencing ones versus four cells showing the opposite trend; in the case of 300 nm, this is 16 cells versus 4 cells.Moreover, our contact map-based analysis further confirms this finding by illustrating that the dominant OR gene displays more specific and stronger interactions with trans-GIs (Extended Data Fig. 10i-k).These results suggest that an increased number of enhancers provide the associated OR gene with more transcriptional sources, thus contributing to its competitive advantage.This suggests a potential positive feedback mechanism between enhanced enhancer connectivity and higher expression levels.
Previous bulk 4C/Hi-C study on fluorescence-activated cell sorting (FACS)-purified OSNs expressing a specific OR gene suggests that the active OR gene interacts frequently with trans-and long-range enhancers 17,32 .Single-cell Hi-C on OSNs showed that each OSN harbors multiple enhancer aggregates and proposed that the active OR gene presumably resides in the largest enhancer aggregates according to the bulk observations 18 .To validate whether the finally chosen OR gene is associated with the largest number of enhancers, we inspected OSNs expressing a singular OR gene; however, we found that the ultimately selected OR genes are typically not located in the largest enhancer aggregates (Fig. 3e,g and Extended Data Fig. 10h).This result refutes the previous speculation that the finally determined OR gene is linked to the largest enhancer aggregate.
Through our investigation into the regulation of OR gene expression, we have developed a comprehensive understanding of how OR enhancers are associated with this process (Fig. 3h).During the GBC stage, both OR genes and OR enhancers are inaccessible, resulting in no OR activation.Subsequently, LHX2 and EBF induce the OR enhancers to become highly accessible, which serve as cis-enhancers and lead to multigenic OR activation.As this process continues, one OR gene associates with multiple enhancers to become the dominant one, while the rest of the OR genes gradually turn off.Ultimately, only a small set of OR enhancers are retained to support singular OR gene expression.

Discussion
In this study, we developed a single-cell multi-omics profiling method that enables the efficient and accurate measurement of both 3D genome structure and gene expression.The throughput of this method could be increased with the help of an automated liquid handler or microwell system equipped with liquid-dispensing capabilities in the future.By applying this assay to the developing OSNs and in combination with the single-cell chromatin accessibility and gene expression atlases, we have comprehensively investigated the regulation of OR expression.We have gained an unprecedented understanding of the stepwise process that governs OR gene determination and the dynamic changes in accessibility of OR enhancers at various stages of OR gene expression.Our multi-omics dataset provides valuable insight into the previously unexplored mechanisms before the establishment of the 'one neuron-one receptor' rule.
It remains unclear how OR gene expression occurs during OSN development.At the progenitor stage, multiple ORs are randomly activated, giving rise to two potential scenarios.The first scenario suggests that all but one of the activated OR genes become inactivated.Alternatively, in the second scenario, all initially activated OR genes are silenced, followed by a random reactivation process where one specific OR gene is chosen for final determination.Our hypothesis holds true if the finally selected OR gene is among those activated at the progenitor stage.However, these possibilities cannot be distinguished by current studies.This still needs to be explored in future research to fully understand the mechanism of OR gene determination.
Our finding uncovers that the active OR gene in mOSNs is usually not situated within the largest enhancer aggregate; however, this observation does not conflict with bulk Hi-C observations that demonstrated active OR genes interact most frequently with trans-and long-range cis-enhancers.The limitation of bulk Hi-C experiments is their inability to capture variability at the single-cell level.It is important to note that the highest contact frequency with GIs does not necessarily require that the active OR gene always interacts with the greatest number of enhancers in individual cells.Instead, the active OR gene interacts with a limited number but different sets of enhancers in individual OSN cells, which also explains the population-based observations.
To further reconcile why the dominant OR gene does not reside within the largest enhancer hub, one plausible explanation lies in the concept of OR 'zone' identity.OR gene selection is biased to predetermined sets of OR genes along the dorsoventral axis of MOE, referred to as 'zones' (refs.37-39).A recent study found that dorsal receptors form the strongest interactions across all zones, which are heterochromatic 40 .To investigate whether the ORs residing within the largest enhancer hub display a bias toward dorsal zone identity, we performed an analysis of OR zone identity on stage 2 and stage 3 OSNs harboring a dominant OR gene.Our findings revealed a significant difference in zone identity between the dominant OR gene and the OR genes located within the largest or second-largest enhancer hub (Extended Data Fig. 10l).Indeed, the largest enhancer hub typically encompasses more dorsal OR genes, which indicates that the largest enhancer hub tends to be inactive.This result potentially resolves the puzzle of why the active OR gene is not situated within the largest enhancer hub.
Previous studies proposed that intergenic OR enhancers facilitate specific and strong inter-chromosomal interactions among OR gene clusters across 18 chromosomes 17 .We observed that both the OR enhancers and their associated transcription factor, LHX2, exhibit peak activity during the INP stage.Notably, the inter-chromosomal contacts between OR-OR gene clusters are relatively weak at this stage.This suggests that additional mechanisms, such as the accumulation of repressive histone modifications on OR gene clusters during the maturation of OSNs 15 , as well as their interaction with HP1 proteins, may govern the compartmentalization of ORs 16 .Heterochromatic protein-guided phase separation could be the potential driving force for the formation of OR gene heterochromatic aggregate, which plays a central role in B compartment formation 41 .

Article
https://doi.org/10.1038/s41592-024-02239-0 The presence of multiple OR-enhancer aggregates in each OSN suggests that simply being associated with an OR-enhancer hub is insufficient for OR activation 42 .
Our findings reveal a noteworthy pattern: OR enhancers undergo reduced accessibility along the course of OSN development, demonstrating that only a subset of enhancers remain active in mature OSNs.Consequently, it can be inferred that only the OR gene interacting with an active multi-chromosomal enhancer hub is expressed, while others remain silenced.
When used, adherent cells (for example, K562 and eHAP) were washed twice in 1× PBS and 0.25% Trypsin-EDTA was added (Thermo Fisher Scientific, cat.no.25200072) and incubated at 37 °C for 5 min, then diluted with complete culture medium to stop trypsinization.Cells were collected by centrifuge at 350g for 5 min and resuspended in 1× PBS.All cell lines were maintained at 37 °C with 5% CO 2 at a recommended density.

Dissociation of single cells from the mouse olfactory epithelium.
The MOE was dissected and minced into small pieces, then dissociated into a single-cell suspension with the Papain Dissociation System (Worthington Biochemical, cat.no.LK003150) at 37 °C for 15 min during incubation, with titration every 5 min with a wide-bore pipette tip according to previously described methods 20 .Then the suspension was filtered with a 30-μm strainer (MACS) and washed twice with ice-cold 1× PBS.

Droplet scRNA-seq
A scRNA-seq library was prepared according to the 10x Genomics guidance using the Single-cell Gene Expression 5′ RNA-seq kit v.1.1 (CG000331_ChromiumNextGEMSingleCell5-v2_UserGuide_RevE).In brief, a dissociated single-cell suspension was stained with 7-AAD, then subjected to flow cytometry to sort viable cells.We used about 40,000 cells as input for each reaction.A total of three 10x runs were generated.For the P4-P7 sample, cells from mice at postnatal day 4 and postnatal day 7 were pooled together to load on the same channel.For P14 and P28 samples, cells were from mice at postnatal day 14 and day 28, respectively.One male and one female mouse were used at each time point.

LiMCA protocol
Our method was based on nucleus-cytoplasm physical separation, such as Trio-seq 21 .The nucleus was submitted to chromosome conformation capture processing and the cytoplasm mRNA was amplified according to the Smart-seq2 procedure 24 .A detailed step-by-step protocol is presented elsewhere 44 .
Single-cell nucleus-cytoplasm separation.In brief, viable cells were picked into a single tube containing 7 μl soft cell lysis buffer (25 mM Tris, pH 8.3, 30 mM NaCl, 0.45% IGEPAL-CA630 and 1 U μl −1 SUPERaseIn), incubated on ice for 30 min, followed by vortexing for 1 min.Samples were centrifuged at 500g for 5 min at 4 °C, then 5 μl supernatant was carefully placed into a new tube.The supernatant was used for Smart-seq2 reverse transcription and amplification 24 .The nuclei were used for chromosome conformation capture.
Integration of METATAC and scRNA-seq profiles.We used the Seurat CCA-based algorithm to integrate the METATAC and 10x scRNA-seq data of MOE.According to the cell typing of previous single-assay analyses, we used GBCs, early/late INPs, iOSNs and mOSNs from the METATAC dataset, and the same group of cells as those used in scRNA-seq pseudotime analysis from the scRNA-seq dataset.ATAC fragments of these cells and the ArchR peaks associated with these cell types were extracted and analyzed using Signac (v.1.11.0).The gene activities of scRNA-seq variable genes were calculated by the GeneActivity function and normalized.The FindTransferAnchors function was used to perform a canonical correlation analysis and identify the anchors between the two assays.According to the anchors, pseudo-transcriptomes of ATAC cells are imputed and merged with the scRNA-seq dataset.Standard principal-component analysis (PCA) and UMAP embedding of Seurat were performed on the resulting integrated dataset.Similar to the processing of transcriptome dataset, pseudotime analysis on the co-embedding space was performed by slingshot.
METATAC pseudotime analysis.We used the 'addTrajectory' function of ArchR with default parameters to reconstruct the trajectory of MOE development in the UMAP embedding space and assign pseudotime values for GBCs, early/late INPs, iOSNs and mOSNs.

Identification and validation of candidate OR enhancers
We utilized the Lhx2 and Ebf ChIP-seq data in mOSNs and previously defined GIs from Monahan et al. 17 .We first interrogated in our peak set if there were peaks in OR gene clusters following the criteria of GIs (overlap with both Lhx2 and Ebf ChIP-seq peaks) but not identified as GIs previously.DNA sequences of the resulting 27 peaks and 63 previously identified GIs were extracted from the mm10 genome and subjected to the XSTREME online server (https://meme-suite. org/meme/tools/xstreme) for de novo motif discovery with default parameters.The resulting motif with the most significant E-value corresponded to the desired composite motif.We used fimo (v.5.3.3) with a q value threshold of 10 −3 to further identify the location of the motif within GIs and candidate peaks, and determined the q values of all matched sites.GIs and identified regulatory peaks were visualized in a Circos plot with ChIP-seq and scATAC-seq (grouped by cell types) tracks of OR clusters using the R packages circlize (v.0.4.12) and ggplot2 (v.3.3.3).
ATAC footprinting was performed by the getFootprints function of ArchR with default parameters on the GIs and candidate peaks centered at the composite motif.Figures were generated by the plot-Footprints function of ArchR with the Tn5 insertion bias normalized by subtraction.
Comparison of composite motif score.We kept all results from FIMO scanning (without the q value < 0.1 filtering) and compared the distribution of FIMO motif scores of different groups (ATAC peaks outside OR gene clusters, ATAC peaks within OR gene clusters, candidate OR enhancers and GIs) using Kolmogorov-Smirnov tests.
Cell filtering.Barcodes with UMI counts over 1,000 and fewer than 25,000, number of detected genes over 200 and mitochondrial counts less than 10% were considered as high-quality cells.
Embedding and clustering.Cells from three batches (P4/P7, P14 and P28) that passed QC were merged, log-normalized and integrated using the Seurat IntegrateData function with anchors identified by the FindIntegrationAnchors function to correct batch effects.The integrated dataset was scaled and embedded using PCA followed by UMAP, using the ScaleData, RunPCA(npcs = 30), RunUMAP functions of Seurat, respectively.K-nearest neighbors of cells were identified using the Seurat FindNeighbors function and Louvain clustering was performed with a resolution of 1.0 using the FindClusters function.Then, cell types were annotated manually according to their marker genes identified by FindAllMarkers with parameters 'min.pct= 0.25, logfc.threshold= 0.25, only.pos= TRUE'.Then, we removed all cells except for GBCs, INPs, iOSNs and mOSNs.The remaining subset was scaled by SCTransform(vst.flavor= 'v2'), followed by RunPCA(npcs = 30), RunUMAP(dims = 1:15), and Find-Neighbors and FindClusters(resolution = 0.5).Six out of the identified 17 subclusters, which mainly consisted of mOSNs, could not be aligned well on the trajectory from GBCs to mOSNs and therefore were removed.
Pseudotime analysis.We used slingshot (v.2.2.1) to construct a trajectory on the UMAP of the subset after removing the outlier subclusters, and assigned pseudotime values for each cell from GBCs to mOSNs.

Calculation of correlation between METATAC and scRNA-seq.
To get the correlation between scATAC-seq and scRNA-seq data, we used the variable genes identified by Seurat and calculated the gene score matrix and the log-normalized UMI count matrix from the ATAC and RNA datasets, respectively.We calculated the mean scores or log-normalized counts for each cell type.The Pearson correlation coefficients between each pair of ATAC and RNA clusters over the variable genes were calculated using numpy (v.1.20.3) and visualized by pheatmap (v.1.0.12).

LiMCA data preprocessing
The RNA data and Hi-C data were preprocessed separately.

RNA data preprocessing.
For RNA data, we followed the Smart-seq2 processing workflow documented in the Human Cell Atlas Data Portal (https://broadinstitute.github.io/warp/docs/Pipelines/Smart-seq2_Single_Sample_Pipeline/README/).In brief, sequencing reads were mapped to transcriptomic references of hg38 (GRCh38) and mm10 (GRCm38) genome assembly for human and mouse data, respectively, using the hisat2 package.We then used RSEM to quantify the RNA reads and generate a gene count and fragments per kilobase of transcript per million mapped reads (FPKM) matrix.
Single-cell Hi-C data preprocessing.Single-cell Hi-C reads were processed as previously described 18 .In brief, contact pairs and contact maps were generated from raw sequencing reads using the hickit pipeline (https://github.com/lh3/hickit).The contact pairs files generated with hickit were then transformed to a Dip-C format for further analysis with Dip-C 'dip-c/scripts/hickit_pairs_to_con.sh' script (https://github.com/tanlongzhi/dip-c).As the human eHAP cell line contains the Philadelphia chromosome (t(9;22)(q34;q11)) and reconstructions of 3D genome structures are sensitive to chromosomal structural variations, for eHAP Hi-C data, we extracted the exact breakpoints, generated a customized hg38 genome reference accordingly and mapped Hi-C reads to it.
Haplotype imputation of contacts.We used the Dip-C method to determine the haplotypes of contacts 18 .In brief, for each read of a contact pair (a leg), we assigned a haplotype if the read segment overlapped with a phased SNP and had a base quality >20.We then performed haplotype imputations of contacts.Specifically, contacts with known haplotypes were used to vote the haplotype of contacts with unknown haplotype; if the majority of voted haplotypes are consistent, then the haplotype of the contact is assigned confidently.

Criteria for cell exclusion
For human cell line data, cells with <100,000 unique contacts (5 of 389 cells) were excluded.For mouse OSN data, only 3 of 411 cells had <100,000 unique contacts, and so we kept all cells.

LiMCA RNA data analysis
Cell embedding and clustering.The Seurat package (v.4.2.0) was used for QC and downstream analysis of RNA count matrices.We filtered cells with fewer than 200 genes detected and genes expressed in fewer than three cells.The filtered count matrices were then normalized using the NormalizeData (for human data) or SCTransform (for mouse data) function.We then performed PCA and UMAP embeddings and Louvain clustering with the following parameters: RunPCA(dims = 1:20), RunUMAP(dims = 1:15), FindNeighbors(dims = 1:10) and FindClusters(resolution = 0.4) for human data; RunPCA(dims = 1:15), RunUMAP(dims = 1:10), FindNeighbors(dims = 1:10) and FindClusters(resolution = 0.4) for mouse data.For each human cell, we calculated cell-cycle phase scores based on known cell-cycle markers and annotated cell-cycle phase using the CellCycleScoring function.Marker genes were identified using the FindMarkers function.
Determination of active ORs in single neurons.For olfactory receptor expression detection, the expression matrix was not filtered with the parameter 'genes expressed in fewer than three cells'.Genuine OR expression was determined by two criteria.First, we filtered out OR genes with an expression level <10 FPKM, as previously described 20 .Second, only OR genes with >90% of the exon region covered were kept, which is necessary to exclude map artifacts and truncated OR transcripts.For OR genes with multiple exons, it was required to detect splicing junctions.This was further confirmed by visual inspection in the Integrative Genomics Viewer (v.2.16.2).

Pseudotime analysis.
Monocle3 was used to construct the continuous developmental trajectory from progenitors to mOSN; pseudotime values were assigned to individual cells.

Analysis of contact maps
Calculation of scA/B values.The scA/B values were calculated from the single-cell contact map with the 'dip-c color2' function (with parameters '-b1,000,000 -H -c color/mm10.cpg.1m.txt').The sex of the mouse MOE cells was confirmed by dissection of adult mice and inferred by analyzing the copy number of sex chromosomes for newborn mice.

Structural cell typing.
We only retained autosomal bins that were present in all cells.The raw single-cell A/B values were rank-normalized to 0-1 in each cell with the scipy rankdata function.Then the rank-normalized scA/B value matrix was used for PCA and UMAP embedding analysis using the Python sklearn and UMAP https://doi.org/10.1038/s41592-024-02239-0packages with the following parameters: PCA(n_components = 20) and UMAP(n_neighbors = 10).

Virtual 4C analysis
We grouped cells into sets with high (above the median) or low (below the median) expression of specific genes such as NFKB1 and performed virtual 4C analysis.For each set of cells, we combined contact maps to generate a pseudobulk contact matrix.Contacts with the gene locus were extracted and normalized by the total number of contacts with the locus.

Single-cell chromatin looping analysis
We first identified cell-type-specific chromatin loops at a 10-kb resolution from published bulk Hi-C or Micro-C maps with the diff_mustache.py script from the mustache package.We then iteratively compared one cell type to others and retained calls unique in that cell type as its cell type-specific chromatin loops.The coolpuppy package (v.0.9.7) was used to generate pileups of merged single-cell maps for each set of loops.We calculated the total contact count for each set of chromatin loops in individual cells using 'dip-c ard' function and combined the counts into a matrix.We noted that the number of chromatin loops was inconsistent across groups due to varying sequencing depths.To eliminate this impact, for each group of loops, we normalized the contact counts by the number of contacts used for loop calling.

Mouse olfactory cell type annotation
For mouse olfactory data, we annotated cell types in two steps.In the first step, based on RNA counts of known marker genes of mouse olfactory epithelium, we manually annotated the four clusters into OSN progenitors, iOSNs, mOSNs and non-neuronal cells.In the second step, we visualized the above-mentioned cell type assignment on the UMAP plot of scA/B values derived from Hi-C data.Based on structural cell typing, the progenitor cluster in RNA embedding was segregated into two discrete clusters that we named progenitor1 and progenitor2, respectively.These two progenitor clusters did not overlap with each other on the UMAP plot of RNA data, further confirming our assignment.

3D genome structure analysis
Generation of 3D genomes.Single-cell 3D genome structures were reconstructed on haplotype-imputed contact maps using the hickit package (with parameters -M -i impute.pairs.gz-Sr1m -c1 -r10m -c2 -b4m -b1m -b200k -D5 -b50k -D5 -b20k).Then, the 3dg files were converted to Dip-C format (with scripts/hickit_3dg_to_3dg_rescale_unit.sh).The transformed 3dg files were further cleaned with the 'dip-c clean3' function to remove repetitive regions.For eHAP, no homolog imputation was needed due to it being a haplotype cell line.For K562 and BJ cells, reconstructions were impractical due to gross chromosomal aberrations or lack of phased SNP information.
3D genome structure alignment and uncertainty estimation.For each single cell, three independent replicate structures were generated.Then, the 'dip-c align' function was used to calculate the median and root-mean-square (r.m.s.) of r.m.s.d. of the single-cell 3D genome structures over all 20-kb particles from three independent replicates.This calculation involved two steps; first, the r.m.s.d. was calculated for each 20-kb particle over three replicate pairs (1-2, 1-3 and 2-3), followed by calculating the median or r.m.s.value over all 20-kb particles.Only r.m.s.-r.m.s.d.< 1.5 cells were considered as low uncertainty and kept for downstream 3D genome structure analysis.The Y chromosome was excluded from further analysis due to its short genomic length and low mappability.
3D genome structure visualization.The 3dg files were transformed into a PyMol-compatible cif format with the 'dip-c color' function and visualized by PyMol (https://pymol.org/2/).Spatial analysis of active genes.For this analysis, we considered expressed genes with ≥1 FPKM as active genes.The radial positions were calculated using 'dip-c color -C' and normalized by setting the genome-wide median to 1. First, we extracted genomic loci with active genes from 20-kb 3D genome structures and counted the number of active genes of each locus.We then calculated the total number of genes for different radial distances using 'dip-c color -R'.
To characterize active gene clustering for each cell, we extracted the midpoints of all active genes into a .legfile and then generated a .3dgfile with the 'dip-c pos' function.We calculated the number of active genes within 3 particle radii from each active gene with 'dip-c color -r 3'.To investigate whether there was a radial preference or prominent clustering of active genes, we used random controls to evaluate background levels.
Specifically, in each single cell, we counted the total number of active genes and randomly sampled the same number of genes from all genes included in the gene annotation files (GENCODE v.M25) regardless of their expression levels.The above analysis was then performed on randomly sampled genes and we compared the distribution between active genes and random controls with a two-sided Mann-Whitney U-test.We used the Dip-C name_color_x_y_z_to_cif.sh script to convert the .3dgfiles of active genes to mmcif-formatted files, which were then used for PyMol visualization.Note that for each active gene, both parental alleles were included in the analysis because most genes express both alleles similarly.

OR-GI 3D structure analysis
3D structure visualization.The 3D position of OR genes and GIs was located from the whole-cell 3D genome structure using the 'dip-c pos' function by providing the corresponding OR genes or GI leg file.Then the OR and GI 3dg files were transformed to cif files for visualization using PyMol.GIs were colored according to chromosomes.

OR-GI spatial relationship analysis.
Pairwise distances between ORs or between ORs and GIs were calculated using 'dip-c pd'.The 'dip-c net-work_around.py'script was then used to record GIs or ORs within 2.5 or 5 particle radii from each OR.In each cell, we extracted the number of GIs from ORs that were actively expressed.OR and GI aggregates were identified in single cells as previously described 18 .For analysis of 3D genome structures, we only retained cells in which the allele of the dominant OR (the active OR with the highest expression level) could be determined based on heterozygous SNPs, as OR expression is monoallelic.The second-dominant OR in single cells was defined as the active OR that had the next highest expression level, while requiring determined allelic information.Ensemble contact maps comparing IKZF2-high and (bottom left) and IKZF2-low (top right).Right: 4 C visualization of interactions between enhancer and the expressed gene, viewpoint centered at downstream enhancer.The green dot/line shows the position of candidate enhancer and the yellow dots/lines represent the position of TSS and TTS.j, The same as i but for MYC.k, Expressed genes spatial distribution of a representative GM12878 cell, genes are colored by gene number within 300 nm.l, Dotplot showing the median gene cluster size within 300 nm for expressed genes and randomly selected genes, the same cell was connected with line (n = 68).Two-sided Wilcoxon signed-rank test for paired data.m, Scatter-plot of expression level versus the normalized radial position at 20 kb resolution of a representative GM12878 cells.n, Correlation distribution of normalized radial position versus expression level among single cells.o, The same as i, but for mean expression level versus median radial position across cell population at 1 Mb resolution.Extended Data Fig. 10 | The spatial relationship between expressed OR genes and their enhancers.a, The same as Fig. 3a, cells are colored according to cell identity.b, Pie chats summarize the OR gene expression information.c, OR FPKM values within single cells of each stage.Each color represents an individual OR, except black, which represents all OR ranked below 8. d, Composition of cis or trans GIs within 150 nm or 300 nm from expressed ORs (considering all expressed ORs). e, Same as Fig. 3f but for the number of GIs within 2.5 particle radii (150 nm) from each expressed OR in single cells.f, Summary of the number of nearby cis-and trans-GIs for each OR in different stages.g, Same as f but grouped by different cell types.h, Left: Same as Fig. 3g but for within 2.5 particle radii (150 nm).Two-sided Wilcoxon signed-rank test for paired sample was used, *** indicates P < 0.01.Right: pie chart depicting the number of cells with active OR residing the largest GI aggregate, the second largest GI aggregate, and others.i, 3D surface plot showing the normalized interaction strength between active OR or inactive ORs (100 randomly selected OR genes) and inter-chromosomal OR enhancers for bulk Hi-C data on OSNs expressing Olfr1507 (left) and Olfr16 (right).Data from ref. 17. j, Boxplot showing the quantification of OR-enhancer interactions for the highest expressed ORs and second highest expressed ORs (stage1 and stage 2 OSNs) and randomly selected inactive ORs, for the random OR control, 10 independent sampling was performed.P values are from twosided one-sample t-tests.k, 3D surface plot showing the normalized interaction strength between active ORs and inter-chromosomal OR enhancers (left) or between randomly selected inactive ORs and inter-chromosomal OR enhancers (right) of LiMCA data.l, Boxplot showing the zone index of the highest expressed ORs and ORs residing within the largest or second largest enhancer hub, only stage 2 and stage 3 OSNs harboring a dominant OR were analyzed, the same cell was connected with a line.P values are from two-sided Wilcoxon signed-rank tests.

Fig. 2 |
Fig. 2 | Multi-omics profiling of the developing OSNs at single-cell resolution.a, Schematics of overall experimental design.b, Summary of number of single cells of the multimodal atlas at each developmental stage.ND, not done.c, UMAP visualization of single cells based on gene expression (left) and chromatin structure (scA/B compartment values) (right) of the LiMCA multi-omics dataset.d, The 3D positioning of OR genes and enhancers in a progenitor cell expressing multiple ORs.Enhancers and ORs are shown as spheres with radii of 0.9 and 0.27 particle radii (54 and 18 nm), respectively.Expressed ORs were displayed with radii of 1.2 particle radii (72 nm).e, Single-cell chromatin accessibility atlas of developing MOE.HBC, horizontal basal cell.OEC, olfactory ensheathing cell; OB, olfactory bulb; SUS, sustentacular cell; MV, microvillous cells.f, Circos plot showing 27 newly identified and 63 known OR enhancers; five sectors from interior to periphery were ensemble

Fig. 3 |
Fig. 3 | Stepwise OR determination observed with single-cell joint profiling of chromatin architecture and gene expression.a, Cells were classified into three stages based on the total OR expression level and the ratio of OR with highest expression level of the developing OSNs (progenitor, iOSN and mOSN).b, The 3D positioning of ORs and enhancers in a representative cell at multigenic OR activation stage with expressed ORs depicted in detail, revealing that cis-enhancer activates their expression.c, Histogram summarizing the percentage of cis-enhancers and trans-enhancers within 150 nm of expressed ORs for three OR expression stages.d, The 3D positioning of ORs and enhancers in a representative cell at silencing stage with the dominant and silencing OR depicted in detail.e, The 3D positioning of ORs and enhancers in a representative cell at singular OR activation stage with the selected OR depicted in detail.f, Number of enhancers within 300 nm of active dominant OR and secondhighest expressed OR of the same cell (connected with a line).Statistical significance is labeled.A two-sided Wilcoxon signed-rank test was used.g, Number of OR enhancers of the largest enhancer aggregate, second-largest enhancer aggregate and active OR-residing enhancer aggregate of the same cell within 5 particle radii (300 nm) (connected with a line, n = 18).A two-sided Wilcoxon signed-rank test for paired data was used, ***P < 0.01.h, Illustration showing the stepwise OR gene determinations and their coordination with OR enhancers, the accessibility of OR enhancers and OR genes, and the expression of Lhx2 along OSN development is shown below. Articlehttps://doi.org/10.1038/s41592-024-02239-0 4 ): the multigenic OR activation stage (stage 1) P = 0.012 Multigenic OR activation stage ORs and GIs are highly accessible Silencing stage One OR outcompetes the rest Singular Article https://doi.org/10.1038/s41592-024-02239-0

Extended Data Fig. 1 |
Validation of LiMCA.a, Comparison between LiMCA and Dip-C, scatter-plot for contacts number versus contact ratio (left) and reads number versus contact number (right).b, Violin plot showing the proportion of cis and trans contacts.c, Scatter-plot showing reads number versus detected genes for RNA.d, Insulation score of ensemble LiMCA is high concordant to bulk Hi-C, calculated at 50 kb resolution.e, Contact maps comparison between ensemble LiMCA and bulk Hi-C at 1 Mb resolution, all chromosomes are shown.f, Two selected regions showing ensemble can detect chromatin loops, RNA-seq tracks are shown below.g, Venn diagram showing chromatin loops detected by ensemble LiMCA and bulk Hi-C, loops are called with HICCUPS.h, Downsample analysis showing the relationship between number of detected chromatin loops (top panel) or precision rate of detected chromatin loops and downsampled cell number (bottom panel).Each cell number were independently sampled 5 times.i, Heatmap showing the correlation of A/B compartment score (first eigen value) (left) and insulation score (left) between ensemble LiMCA and in situ Hi-C.j, A imputed contact map of a representative GM12878 cells and the reconstructed 3D structure at 20 kb resolution (Top).Four chromosomes with expressed genes projected.Extended Data Fig. 2 | LiMCA accurately detects cell-type-specific gene expression and chromatin structures.a, Ensemble contacts maps of four cell lines at 1 Mb resolution, translocations are highlighted with red arrow.b, UMAP showing four represented markers for each cell type and one maker gene of G2M phase.c, Expression of top cell-type-specific marker genes for each cell type, top 20 of each cell type are shown.d, Mean scA/B value of cell-type-specific marker genes among single cells.For each cell types, the top 200 marker genes were identified from the paired transcriptome data.e, Scatter-plot showing the mitotic contact band (2-12 Mb) ratio versus short-range contacts (< 2 Mb) ratio.f, UMAP visualizing the mitotic band ratio of cell line embedding.g, Represented contact maps of cells in metaphase cluster.h, Pile-up of cell-type-specific chromatin loops using ensemble interaction profiles from each cell type.i, Heatmap showing the enrichment of cell-type-specific chromatin loops among single cells.Extended Data Fig. 3 | The relationship between 3D genome organization and expression.a, Comparison of pairwise 3D distance matrix measured, ensemble LiMCA and bulk Hi-C contact map.b, Histogram of expression level distribution for NFKB1 (left), IKZF2 (middle) and MYC (right).c, Histogram showing the distribution of detected gene numbers (left), RNA counts (middle) and contact numbers between NFKB-high and NFKB1-low group.n.s., no significance, two-sided Mann-Whitney U rank test.d, Differential contact matrix around NFKB1 (chr4: 102.3-102.8Mb) between NFKB1-high and low groups.e, Downsample analysis for NFKB1 gene.Left: contact matrices around NFKB1 (chr4: 102.3-102.8Mb), representing ensemble Hi-C data from NFKB-high (top left) and NFKB1-low (bottom right).Middle: Differential matrix between NFKBhigh and NFKB1-low group.Right: Normalized contact frequency plot, centered at NFKB1 upstream enhancer.f, The same as e, but cells are randomly grouped.g-h, The same as c, showing downsampled groups (g) and randomly assigned groups (h).n.s., no significance, two-sided Mann-Whitney U rank test.i, Left:

Fig. 4 |
Gene expression of the single-cell LiMCA multi-omics atlas recapitulate the continuous OSN genesis and marker gene expression.a, Expression of top cell-type-specific marker genes.The top 20 maker genes are plotted.b, UMAP projection of known maker genes for OSN progenitors, iOSNs and mOSNs.c, The same as Fig. 2c (left), colored by mouse age and pseudotime.d, Heatmap showing the continuous gene expression change along OSNs genesis using pseudotime.e, Scatter-plot showing the dynamic gene expression of known maker genes.Extended Data Fig. 5 | LiMCA recaptures the characteristic 3D genome reorganization along OSNs development.a, The same as Fig. 2c, The same cell was connected in the two embeddings.b, The same as Fig. 2d (right), cells were colored by mouse age.c, UMAP showing the embedding of integrated MOE cells of this study and published MOE data (Tan et al. 18 , Nat.Struct.Mol.Biol.2019), cells are colored by cell label in each dataset.d, The same as c, cells are labeled by different mouse crosses.e, UMAP visualization of inter-chromosomal contact ratio, long-range (> 20 kb) intra-chromosomal ratio and short-range (< 20 kb) intra-chromosomal ratio (top), and the boxplot quantification of these values, the black horizontal line and the box represent the median and quartiles, respectively (bottom).The whiskers indicates minima and maxima.(non-neuronal, n = 22; progenitor1, n = 47; progenitor2, n = 111; iOSN, n = 85; mOSN, n = 146).f, Pile-up of interactions between ORs and GIs of ensemble interaction profiles from each cluster (top), and bulk Hi-C datasets from Monahan et al. 17 , (2019) (bottom).g, Boxplot showing the gradual increasing of OR-OR, GI-GI interaction strength along OSN development.The box horizontal line and the box represent the median and quartiles, respectively.(non-neuronal, n = 22; progenitor1, n = 47; progenitor2, n = 111; iOSN, n = 85; mOSN, n = 146).h, Contact maps of ensemble interaction profiles of each cell cluster at regions of chr2: 30-95 Mb and chr9: 3-50 Mb, OR gene clusters are indicated.i, Contact maps of ensemble interaction profiles of each cell cluster of chromosome 2 (left).Extended Data Fig. 6 | Quality control and overview of MOE METATAC results.a, FACS gating strategy to sort single nuclei for METATAC experiment.b, Quality control metrics for METATAC dataset, from left to right are Distribution of ratio of decontaminated fragments in peaks (FRiP), number of decontaminated fragments, percentage of mitochondrial fragments, fragment sizes distribution, and TSS enrichment for METATAC cells of four batches, respectively.Numbers of cells are 4090, 3146, 2850, 1794, for P3, P4, P14, and P28, respectively.The box horizontal line and the box represent the median and quartiles, respectively, and the whiskers extends 1.5*interquartile range.c-d, UMAP of MOE METATAC cells colored by (c) cell source batches, and (d) gene scores of marker genes.e, Proportion of cell types associated with MOE development for cells from P3, P7, P14, and P28 mice.f, Tracks of METATAC signals normalized by number of reads in TSS near the promoter of marker genes of HBCs, GBCs, INPs, and OSNs.g, Heatmap shows the correlation coefficients between cell clusters of MOE 10x scRNA-seq and METATAC.Extended Data Fig. 7 | METATAC identifies new candidate OR enhancers.a, TF-binding motifs enriched in HBCs, GBCs, INPs, and OSNs.b, Numbers of detected OR genes in various cell types are shown as a boxplot (n = 11,880, for each cluster information is store at source data).An OR gene is considered to be detected if it has a gene score > 0 as calculated by ArchR.The box horizontal line and the box represent the median and quartiles, respectively, and the whiskers extends 1.5*interquartile range.c-d, Lhx2 and Ebf ChIP-seq signals (c) and METATAC signals of different MOE cell development stages (d) at previously defined 63 Greek Islands and 27 candidate regulatory peaks identified in this study.Aggregated signal of all GIs or peaks are shown above or below the heatmap, respectively.e, METATAC footprints at composite motif sites in GIs or identified peaks.Aggregated normalized METATAC insertions and the Tn5 bias corrected are shown.Extended Data Fig. 8 | Quality control and overview of MOE scRNAseq results.a, Number of detected genes, UMI counts, and percentage of mitochondrial UMIs for cells from P4/P7 (N = 28,303), P14 (N = 24,116), and P28 (N = 21,158) mice.b-c, UMAP of scRNA-seq data colored by origin batch (b) and cell types (c).The box horizontal line and the box represent the median and quartiles, respectively, and the whiles extends 1.5*interquartile range.d, Heatmap shows the expression of marker genes of each cell type.For cell types with more than 1,000 cells, top 1,000 representative cells with the highest UMI counts are shown.e, UMAP of the subset of scRNA-seq dataset including GBCs, INPs, iOSNs, and mOSNs, colored by previously identified cell types (left), new clusters (middle), and pseudotime (right).f-g, Dynamics of the expression of Omp (f), and Lhx2, Ebf1, Ebf2 (g) during the development of MOE.h, Proportion of cell types associated with MOE development for cells from P4/P7, P14, and P28 mice in the scRNA-seq dataset.i, UMAP of the co-embedded METATAC and scRNA-seq data from GBC to mOSN, colored by RNA clusters (left) and METATAC clusters (right).Extended Data Fig. 9 | Identification of genuine OR expression with LiMCA full-length transcript information.Genome browser views of mRNA read coverage profiles for different OR genes in example single cells.Read bars in the top panel indicates heterozygous SNP sites of each cross.mRNA coverage, junctions, and reads are shown below.Gene annotations are taken from GENCODE.Coverage is set to logarithmic scale.The first two columns show genuine OR expression and the last two columns show false OR expression.Red shades highlight the incomplete read covrages.The identified OR-expressing alleles are represented by ♀ (maternal) or ♂ (paternal), respectively.

A multi-omics atlas of the developing OSNs
Article https://doi.org/10.1038/s41592-024-02239-041.Larson, A. G. et al.Liquid droplet formation by HP1α suggests a role for phase separation in heterochromatin.Nature 547, 236-240 (2017).42.Pourmorady, A. & Lomvardas, S. Olfactory receptor choice: a case study for gene regulation in a multi-enhancer system.Curr.Opin.Genet.Dev.72, 101-109 (2022).and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.To view a copy of this licence, visit http://creativecommons.