Staged developmental mapping and X chromosome transcriptional dynamics during mouse spermatogenesis

Ernst, Christina; Eling, Nils; Martinez-Jimenez, Celia P.; Marioni, John C.; Odom, Duncan T.

doi:10.1038/s41467-019-09182-1

Published: 19 March 2019

Nature Communications volume 10, Article number: 1251 (2019)

Abstract

Male gametes are generated through a specialised differentiation pathway involving a series of developmental transitions that are poorly characterised at the molecular level. Here, we use droplet-based single-cell RNA-Sequencing to profile spermatogenesis in adult animals and at multiple stages during juvenile development. By exploiting the first wave of spermatogenesis, we both precisely stage germ cell development and enrich for rare somatic cell-types and spermatogonia. To capture the full complexity of spermatogenesis including cells that have low transcriptional activity, we apply a statistical tool that identifies previously uncharacterised populations of leptotene and zygotene spermatocytes. Focusing on post-meiotic events, we characterise the temporal dynamics of X chromosome re-activation and profile the associated chromatin state using CUT&RUN. This identifies a set of genes strongly repressed by H3K9me3 in spermatocytes, which then undergo extensive chromatin remodelling post-meiosis, thus acquiring an active chromatin state and spermatid-specific expression.

Introduction

Spermatogenesis is a tightly regulated developmental process that occurs in the epithelium of seminiferous tubules in the testis and ensures the continuous production of mature sperm cells. In the mouse, this unidirectional differentiation process initiates with the division of spermatogonial stem cells (SSC) to form a pair or connected chain of undifferentiated spermatogonia (A_paired and A_aligned)¹. These cells then undergo spermatogonial differentiation, involving six transit-amplifying mitotic divisions generating A_1–4, Intermediate, and B spermatogonia to give rise to pre-leptotene spermatocytes (pL) and initiate meiosis².

Meiosis consists of two consecutive cell divisions without an intermediate S phase to produce haploid cells and includes programmed DNA double strand break (DSB) formation, homologous recombination, and chromosome synapsis³. To accommodate these events, prophase of meiosis I is extremely prolonged, lasting several days in males, and is divided into leptonema (L), zygonema (Z), pachynema (P) and diplonema (D). Following the two consecutive cell divisions, haploid cells known as round spermatids (RS) are produced, which then undergo a complex differentiation programme called spermiogenesis to form mature spermatozoa⁴.

Spermatogenesis is tightly orchestrated, with tubules periodically cycling through 12 epithelial stages defined by the combination of germ cells present⁴. The completion of one cycle takes 8.6 days in the mouse, and the overall differentiation process from spermatogonia to mature spermatozoa requires ~35 days⁵. Thus, four to five generations of germ cells are present within a tubule at any given time, making the isolation and molecular characterisation of individual sub-stages during spermatogenesis difficult.

We use droplet-based single-cell RNA-Sequencing (scRNA-Seq) to elucidate the transcriptional dynamics of germ cell development in the adult testis. To confidently identify and label cell populations throughout the developmental trajectory, we profile cells from the first wave of spermatogenesis, where cells have only progressed to a defined developmental stage. This allows us to unambiguously identify the most mature cell-type by comparison with the adult and to characterise the dynamic differentiation processes of somatic cells and spermatogonia that are enriched in juvenile testes.

Transcriptional complexity varies widely across germ cell development. For instance, early meiotic spermatocytes have characteristically low RNA synthesis rates, and are thus excluded by standard analysis protocols. To overcome this, we apply a statistical method that recovers thousands of cells with low transcript count that were originally classified as empty droplets⁶, revealing molecular signatures for leptotene and zygotene spermatocytes.

Finally, we focus our attention on the inactivation and reactivation of the X chromosome, which is subject to transcriptional silencing as a consequence of asynapsis⁷. By combining bulk and single-cell RNA-Seq approaches, we find that de novo gene activation shows an unexpected diversity of temporal expression patterns in post-meiotic spermatids. Profiling the associated chromatin landscapes of X chromosome re-activation, we reveal that de novo escape genes carry high levels of repressive H3K9me3 in spermatocytes prior to re-activation. Overall, our study presents an in-depth characterisation of mouse spermatogenesis and provides insights into the epigenetic control of X chromosome reactivation in post-meiotic spermatids.

Results

Single-cell RNA-Seq of adult spermatogenesis

Adult testes show a high degree of cellular heterogeneity due to the continuous production of male gametes within seminiferous tubules (Fig. 1a). Based on the combination of cell-types, the seminiferous epithelium is classified into 12 distinct stages in mouse, and tubules of all stages exist in adult testes (Fig. 1a, b, Supplementary Fig. 1a).

To characterise the transcriptional programme underlying mouse spermatogenesis, we used multiple functional genomics approaches in combination with matched histology to profile specifically staged juvenile (postnatal days (P) 5–35) and adult (8–9 weeks) C57BL/6J (B6) mice. We generated unbiased droplet-based scRNA-Seq data for eight developmental time-points using single-cell suspensions from whole testis. Two timepoints (P5 and adult) were profiled in replicates, confirming consistent sampling of cell-types and the absence of batch effects (Supplementary Fig. 2a, b). Additionally, we generated whole-tissue bulk RNA sequencing for 15 time-points across juvenile development in duplicates, and profiled the chromatin state of purified cell populations using CUT&RUN (Cleavage Under Targets & Release Using Nuclease) in juvenile animals at P24, P26 and P28 in duplicates (Fig. 1c, Methods)⁸. After quality control and filtering, we retained a total of 53,510 single cells, 30 bulk RNA-Seq libraries and 32 CUT&RUN libraries (Methods, Supplementary Data 1).

To analyse our single-cell data, we first integrated all scRNA-Seq libraries using a mutual nearest neighbour (mnn) mapping approach⁹ before performing graph-based clustering (Methods, Supplementary Fig. 2c, d). To allow consistent visualisation throughout this study, we performed dimensionality reduction using t-distributed Stochastic Neighbour Embedding (tSNE) of all integrated single-cell transcriptomes. Focusing on cells isolated from adult B6 testis, we identified the major cell populations based on known marker genes, including spermatogonia (Dmrt1 expression¹⁰), spermatocytes (Piwil1¹¹), round and elongating spermatids (Tex21 and Tnp1, respectively¹²), as well as the main somatic cell-types, Sertoli (Cldn11¹³) and Leydig cells (Fabp3¹⁴) (Fig. 1d). Clustering across all cells identified 8 sub-stages in spermatocyte and 11 sub-stages within the spermatid population (Fig. 1e, Supplementary Fig. 2e), for which we identified marker genes that highlight the dynamic gene expression changes occurring throughout adult spermatogenesis (Supplementary Data 2).
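The core idea of mutual nearest neighbour matching can be illustrated with a minimal sketch. This is not the implementation used in this study (see Methods and ref. 9); the function names, the Euclidean distance, and the toy two-dimensional coordinates are assumptions chosen purely for illustration. A pair of cells from two batches is retained only if each cell is among the other's nearest neighbours.

```python
import math

def k_nearest(query, reference, k):
    """For each query point, return the set of indices of its k nearest reference points."""
    result = []
    for q in query:
        order = sorted(range(len(reference)), key=lambda j: math.dist(q, reference[j]))
        result.append(set(order[:k]))
    return result

def mutual_nearest_pairs(batch_a, batch_b, k=1):
    """Return (i, j) pairs where cell i of batch A and cell j of batch B
    are each within the other's k nearest neighbours (hypothetical helper)."""
    a_to_b = k_nearest(batch_a, batch_b, k)
    b_to_a = k_nearest(batch_b, batch_a, k)
    return [(i, j)
            for i in range(len(batch_a))
            for j in sorted(a_to_b[i])
            if i in b_to_a[j]]

# Toy coordinates (made up): two batches sharing two cell states.
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(0.5, 0.2), (5.3, 4.9), (20.0, 20.0)]
pairs = mutual_nearest_pairs(batch_a, batch_b, k=1)
print(pairs)
```

In this toy example the third cell of each batch has no mutual partner and is left unpaired, which is the property that makes such pairs usable as anchors when cell-type composition differs between batches.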

Developmental mapping of the first wave of spermatogenesis

Historically, sub-staging of cell-types within the testis was based on changes in nuclear or cellular morphology (Supplementary Fig. 1b)^4,5. Previous attempts to complement morphology with molecular signatures have focused on FACS-based and sedimentation assays, whose resolution was insufficient to distinguish between sub-cell-types^15,16,17.

To link our computationally-defined cell clusters with morphologically-defined sub-cell-types, we exploited the first wave of spermatogenesis where cells have only progressed up to a defined stage. Starting around P4, spermatogonia begin to differentiate, forming the first generation of spermatocytes as early as P10, RS by P20, and completing the first wave with the production of mature spermatozoa between P30 and P35 (Fig. 2a)^18,19.

To capture these key developmental transitions, we sampled seven time-points between P5 and P35 to generate additional scRNA-Seq libraries (Fig. 2a). The population of developing germ cells was strongly enriched at the expected developmental stage, as quantified by the percentage of cells in each cluster (Fig. 2b, c, Supplementary Fig. 3a, Methods).
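As a small illustration of the quantification mentioned above, the percentage of cells per cluster at a given time-point can be computed from cluster labels as follows. The labels and counts in this sketch are made up, and the helper name `cluster_percentages` is hypothetical.

```python
from collections import Counter

def cluster_percentages(labels):
    """Return the percentage of cells assigned to each cluster label."""
    counts = Counter(labels)
    total = len(labels)
    return {cluster: 100.0 * n / total for cluster, n in counts.items()}

# Toy labels for a single time-point (not real data).
toy_timepoint = ["spermatogonia"] * 2 + ["pachytene"] * 6 + ["sertoli"] * 2
percentages = cluster_percentages(toy_timepoint)
print(percentages)
```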

The earliest time-points, P5 and P10, consist almost entirely of somatic cell-types and spermatogonia, whereas P15 is enriched for early and mid-pachytene spermatocytes. By P20, we detect an enrichment across all spermatocyte stages, as well as a group of early RS, which was validated with matched histology showing a large number of tubules in late stages IX–XII (Supplementary Fig. 3b) and the first occurrence of early RS¹⁸. At P25, we observed cells matched to our first nine clusters of spermatids, which we labelled according to morphologically-defined spermatid sub-stages S1–S9 (Fig. 2c, Supplementary Fig. 3a) as spermatids reach the elongating state between P24 and P26¹⁹. The two late time-points (P30 and P35) showed a relatively even distribution of cells across all groups, closely resembling the adult.

To further validate the identity of the cell clusters, we used bulk RNA-Seq from testis samples collected every 2 days during the first wave of spermatogenesis between P6 and P34 (Fig. 2a). We matched the bulk and scRNA-Seq data by using a probabilistic classification of the bulk samples based on cluster-specific marker genes obtained from the adult scRNA-seq data (Supplementary Fig. 3c, Methods). This confirmed that between P6 and P14, spermatogonia and somatic cells show the highest contribution to the transcriptomic profile. Between P16 and P20, bulk RNA-Seq samples display a spermatocyte-specific gene expression signature, after which spermatids become the transcriptionally dominant cell-type. By P26, spermatids reach the elongating state, where transcription ceases due to the beginning of the histone-to-protamine transition²⁰. Following this transition, changes in RNA content are mostly due to degradation; thus, bulk transcriptional profiles can only be classified up to S7/S8.

Finally, we validated our approach by using publicly available scRNA-Seq data of highly purified germ cell populations²¹. After synchronising the first wave of spermatogenesis by manipulating retinoic acid (RA) synthesis, Chen et al. isolated cells from 20 developmental stages, including mitotic, meiotic, and post-meiotic cells. Mapping these cells onto our adult trajectory using the mnn approach (Methods) confirmed our previously-assigned cluster identities, with meiotic and post-meiotic cells mapping at the expected positions, ordered along our developmental trajectory (Fig. 2d).

In sum, by using newly generated and publicly available data of the first wave of spermatogenesis in combination with histological analyses, we assigned transcriptional profiles to specific, morphologically-defined germ cell-types in the adult.

Somatic cell differentiation in postnatal testes

We further exploited the first spermatogenic wave to analyse somatic support cells, which make up a large proportion of juvenile testes. At P5 and P10 the majority of captured cells were of somatic origin (Fig. 2c) and contained a substantial number of cells with expression profiles similar to adult Leydig and Sertoli cells, as well as a large population of Fetal Leydig cells (FLCs) based on Dlk1 expression²². In addition, we detected cells forming the basal lamina, such as peritubular myoid cells (PTM, Acta2²³) and vascular endothelial cells (Tm4sf1²⁴), as well as testicular macrophages (Cd14²⁵), for all of which we identified specific marker genes (Supplementary Fig. 4a–c, Supplementary Data 3).

We then performed differential expression (DE) analysis between P5 and P10 to capture gene expression signatures associated with differentiation of somatic cell-types (Supplementary Fig. 4d, Supplementary Data 4, Methods). Immature Sertoli cells proliferate for a short period during postnatal development before reaching a terminally differentiated state around P15 in the mouse²⁶. We captured this transition between P5 and P10, reflected by the increased expression of Cldn11 in P10 Sertoli cells (Supplementary Data 4). CLDN11 is critical for the formation of tight junctions between neighbouring Sertoli cells, which separate the seminiferous tubule into basal and apical regions and establish the blood-testis barrier¹³. We also detect high proportions of spermatocyte-specific genes in P10 Sertoli cells (Supplementary Fig. 4d, Supplementary Data 4), which is consistent with their role in the phagocytic removal of apoptotic germ cells²⁷. This process can result in the acquisition of germ cell-specific transcripts, as recently demonstrated for the spermatid-specific gene Prm2²⁸. In contrast, at P5 we detect highly specific expression of Ptgds (Supplementary Fig. 4d; Supplementary Data 4), which is upstream of the prostaglandin D2 (PGD2) signalling pathway that stimulates both the transcription and the nuclear translocation of SOX9 to induce Sertoli cell differentiation²⁹.

FLCs are involved in androgen production and regulation of Sertoli cell differentiation, and are gradually replaced by adult Leydig cells (ALCs) during postnatal development²². They broadly cluster in two populations, one of which is only detected at P5 (Supplementary Fig. 4a, b) and characterised by high levels of Stmn1 and Lgals1 (Supplementary Fig. 4c, Supplementary Data 3). DE of the second cluster of FLCs between P5 and P10 shows a higher expression of Dlk1 and Inhba at P5 (Supplementary Fig. 4d, Supplementary Data 4); the latter is necessary for stimulating Sertoli cell differentiation³⁰. In contrast, adult-like Leydig cells at P10 display specific expression of Hsd3b6 (Supplementary Fig. 4d, Supplementary Data 4), which is a marker for ALCs and involved in steroid synthesis²².

In sum, our single-cell expression analysis of early postnatal testes provided a molecular characterisation of developing somatic support cells and captured the transcriptional heterogeneity associated with differentiation.

Cellular heterogeneity during spermatogonial differentiation

Compared to the adult, spermatogonia are relatively enriched during juvenile development, which facilitates the identification of sub-populations. SSCs originate during the first wave of spermatogenesis from pro-spermatogonia (or gonocytes) that migrate towards the periphery of the seminiferous tubules shortly after birth and undergo differentiation³¹. In addition to generating SSCs, pro-spermatogonia can also differentiate directly into type A spermatogonia and initiate the first wave of spermatogenesis, a feature that appears to be specific to the first wave in mice and can result in subtle differences between the first and subsequent waves^26,32. To capture this fate transition, we clustered germ cells from P5 and broadly identified pro-spermatogonia and spermatogonia (Supplementary Fig. 5a). Marker genes for pro-spermatogonia included Eif2s2, which has been linked to testicular germ cell tumours that originate from pro-spermatogonia^33,34 (Supplementary Data 5, Supplementary Fig. 5a).

The spermatogonia could be further split into three different sub-populations, two of which closely resembled the expression profile of undifferentiated spermatogonia (Etv5, Zbtb16) including cells expressing Gfra1, associated with stem cell function³⁵. The third population expressed markers for spermatogonial differentiation such as Stra8 (Stimulated by retinoic acid 8), in accord with pro-spermatogonia directly transitioning into differentiating spermatogonia³² (Supplementary Fig. 5a).

To capture the full spectrum of spermatogonial differentiation, we combined cells annotated as spermatogonia from P10 and P15 to obtain 1165 transcriptional profiles (Fig. 3a, b) before ordering them along their differentiation time-course (Fig. 3a, c, Supplementary Fig. 5b, Methods). We detect two clusters (Undiff 1 and Undiff 2) corresponding to undifferentiated spermatogonia (A_undiff) based on expression of Zbtb16 and Sdc4 (Fig. 3b, c; Supplementary Data 6)³⁶, including a small number of cells that express SSC markers, Gfra1 and Id4³⁵ (Supplementary Fig. 5c, Methods).

Based on the expression of Stra8, we can map the point at which spermatogonial differentiation is induced (A_aligned-to-A₁ transition), thus marking the transition to differentiating spermatogonia (A_diff) (Fig. 3b, c)³⁷. A_diff, which comprise A_1–4, Intermediate and B spermatogonia, are highly proliferative and express Dmrtb1 at late stages; Dmrtb1 mediates the mitosis-to-meiosis transition and quickly disappears in pre-leptotene spermatocytes (pL)¹⁰. This latter population shows a second increase in Stra8 expression, necessary for meiosis initiation (Fig. 3b, c)^37,38.

To confirm our labelling, we mapped the RA-synchronised cells²¹ onto our spermatogonial sub-populations (Methods). Indeed, RA-synchronised A₁ spermatogonia mapped to our A_Undiff-to-A_Diff population and RA-synchronised Intermediate spermatogonia (In) matched the transition between our A_Diff and In_B populations. Following the trajectory, RA-synchronised B spermatogonia and G1 pre-leptotene spermatocytes matched our In_B population, followed by RA-synchronised early-to-late pre-leptotene spermatocytes (epL-lpL) that matched our pL population (Fig. 3b). This confirmed our assigned cell-type identities for spermatogonial subpopulations, and revealed an under-representation of late differentiating spermatogonia and preleptotene spermatocytes in our analysis.

Identification of leptotene and zygotene spermatocytes

The transition between differentiating spermatogonia and spermatocytes is a gradual process that occurs in stage VI tubules when B spermatogonia divide into pre-leptotene spermatocytes³⁸. Despite the enrichment for early germ cells in our juvenile samples, we observed few cells representing late differentiating spermatogonia and pre-leptotene spermatocytes (Fig. 3b), as well as a lack of leptotene and zygotene spermatocytes bridging the spermatogonia and spermatocyte populations (Fig. 2d).

RNA synthesis gradually declines in differentiating spermatogonia, reaching very low levels in leptotene and zygotene spermatocytes^39,40. This presents a challenge in droplet-based scRNA-Seq approaches, because cells with low transcriptional complexity are likely classified as empty droplets (Methods). To capture these transcriptionally quiescent cells, we applied a computational method (EmptyDrops) that distinguishes between droplets capturing genuine cells containing small amounts of mRNA versus empty droplets containing only ambient mRNA⁶ (Methods). Applying this approach to P15, a time-point naturally enriched for early meiotic spermatocytes, recovered 9792 additional cells, compared to the CellRanger pipeline (Supplementary Data 1). This identified an otherwise-inconspicuous population of cells connecting spermatogonia and spermatocytes at the predicted position in the trajectory (Fig. 4a, b). Examining the total number of genes expressed in these cells confirmed a low complexity transcriptome, in accord with a transcriptionally quiescent state (Fig. 4b).
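The logic of testing a barcode against an ambient RNA profile can be sketched with a simplified Monte Carlo test. This is not the EmptyDrops implementation (which, among other differences, estimates the ambient profile from low-count barcodes and uses a Dirichlet-multinomial model); the plain multinomial model, the three-gene ambient profile, the function names, and the toy counts are all assumptions for illustration. A barcode whose counts are much less likely under the ambient profile than simulated ambient draws of the same total receives a small p-value and would be retained as a putative cell.

```python
import math
import random

def multinomial_loglik(counts, probs):
    """Log-likelihood of a count vector under a multinomial with the given gene proportions."""
    total = sum(counts)
    loglik = math.lgamma(total + 1)
    for c, p in zip(counts, probs):
        loglik += c * math.log(p) - math.lgamma(c + 1)
    return loglik

def ambient_p_value(counts, ambient, n_iter=999, seed=0):
    """Monte Carlo p-value for the null that `counts` were drawn from the ambient profile."""
    rng = random.Random(seed)
    total = sum(counts)
    observed = multinomial_loglik(counts, ambient)
    genes = range(len(ambient))
    at_least_as_extreme = 0
    for _ in range(n_iter):
        # Simulate an empty droplet with the same total count.
        sim = [0] * len(ambient)
        for g in rng.choices(genes, weights=ambient, k=total):
            sim[g] += 1
        if multinomial_loglik(sim, ambient) <= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_iter + 1)

ambient = [0.7, 0.2, 0.1]                      # hypothetical ambient gene proportions
p_ambient_like = ambient_p_value([14, 4, 2], ambient)
p_cell_like = ambient_p_value([2, 3, 15], ambient)
print(p_ambient_like, p_cell_like)
```

Both toy barcodes contain only 20 counts, yet the second deviates strongly from the ambient profile. This is the property that allows low-complexity cells to be separated from empty droplets without relying on a total-count threshold.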

Unsupervised clustering identified two main sub-populations within the recovered cells that, when compared with the RA-synchronised cells, resembled (pre-)leptotene spermatocytes (L) and zygotene spermatocytes (Z) (Fig. 4b). Marker genes include components of the meiotic cohesin complex (Smc3, Smc1b and Rec8 - L), the synaptonemal complex (Syce2 - L; Sycp1/2/3 and Syce3 - Z), as well as genes involved in DNA DSB formation and repair (Prdm9 and Brca2 - L; H2afx - Z) (Fig. 4c, Supplementary Data 7). This recapitulates the known biological processes of early meiotic prophase^41,42,43, thus confirming that, despite low transcriptional complexity, we obtain high-quality transcriptomes for early spermatocytes.

Repeating the bulk RNA-Seq classification for time-points P6–P20 using marker genes from these newly identified cell-types confirmed the presence of leptotene spermatocytes as early as P10 and P12 (Fig. 4d). Among the top markers for (pre-)leptotene spermatocytes was the testis-specific protease Prss50, which remained expressed throughout zygonema (Fig. 4c)⁴⁴. We confirmed the specific expression of this gene in early meiotic cells by performing single-molecule RNA in situ hybridisation (ISH) using the RNAScope technology (Methods), which also confirmed an enrichment for these cell-types at P10 and P15 (Fig. 4e, Supplementary Fig. 6a–d).

Applying EmptyDrops to the different time-points consistently increased the number of early spermatocytes from P15 to Adult and also recovered FLCs at P15 and P20, consistent with these cells undergoing gradual atrophy (Supplementary Data 1, Supplementary Fig. 6e–g). Additionally, we detect an increase in transcriptionally quiescent late spermatids, particularly at P35 (Supplementary Fig. 6g).

High-resolution characterisation of male meiosis

The mitotic expansion of spermatogonia produces large numbers of spermatocytes that undergo meiotic cell division. To characterise transcriptional changes throughout the prolonged prophase of meiosis I, we first ordered spermatocytes along their differentiation trajectory, which revealed a strong increase in the number of genes expressed (Fig. 5a, Methods). The highest number of genes was observed in diplotene spermatocytes (D), the latest cell-type in prophase I with active RNA synthesis⁴⁰.

To profile increasing or decreasing transcription throughout meiotic prophase, we correlated each gene’s normalised expression level to the number of genes expressed (Supplementary Data 8, Methods). As expected, known marker genes for early meiotic processes, such as Hormad1, decreased in expression during prophase I. In contrast, we found increasing expression for Pou5f2, which we identified as a marker gene for diplotene spermatocytes (Fig. 5b, Supplementary Data 2). RNA ISH for Pou5f2 in adult testes confirmed the specific expression during late prophase, with signal intensity being highest in Stage IX–XI tubules (Fig. 5c, Supplementary Fig. 7a, b).
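The correlation-based ranking described above can be sketched as follows. The choice of Pearson correlation, the function name, and the toy values are assumptions for illustration; the statistic actually used is described in Methods. Genes whose expression falls as more genes are detected per cell obtain a negative coefficient, and genes that rise obtain a positive one.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy data: four cells ordered along prophase I (values are made up).
genes_detected = [1000, 2000, 3000, 4000]      # number of genes expressed per cell
early_marker = [4.0, 3.0, 2.0, 1.0]            # normalised expression, decreasing
late_marker = [0.5, 1.0, 1.5, 2.0]             # normalised expression, increasing

r_early = pearson(early_marker, genes_detected)
r_late = pearson(late_marker, genes_detected)
print(r_early, r_late)
```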

Visualising the top ten marker genes, we observed distinct temporal expression patterns, with the majority of early pachytene 1 (eP1) markers associated with fertility phenotypes (Fig. 5d). Overall, we observed an enrichment for fertility genes in our full list of marker genes for all germ sub-cell-types (Fisher’s Exact test, p < 2.2 × 10⁻¹⁶) (Supplementary Data 2, Methods).
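An enrichment test of this kind can be sketched as a one-sided Fisher's exact test on a 2 × 2 table of marker status against fertility annotation. The counts below are made up, the function name is hypothetical, and the one-sided alternative is an assumption; the exact test configuration is given in Methods.

```python
from math import comb

def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for enrichment.

    Table layout (rows: marker / non-marker; columns: fertility gene / other):
        a  b
        c  d
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(total, row1)

# Toy counts (not from the study): 8 of 10 markers are fertility genes,
# compared with 2 of 10 non-marker genes.
p_value = fisher_enrichment_p(8, 2, 2, 8)
print(p_value)
```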

Meiosis culminates in metaphase, where the spindle checkpoint typically eliminates aneuploid cells. However, whether initiation of the spindle checkpoint results in gene expression perturbation is currently unknown. We therefore used an aneuploid mouse line that carries one copy of human chromosome 21 (Tc1 mice), causing frequent congression defects and an arrest at metaphase I during male meiosis⁴⁵.

We profiled spermatogenesis in adult Tc1 mice and matched litter-mate controls (Tc0 mice) and processed the scRNA-Seq data together with all single cells from adult and juvenile B6 animals (Methods). As expected, Tc1 mice showed an enrichment across spermatocytes, whereas post-meiotic cell-types were reduced (Fig. 5e)⁴⁵. However, the presence of the human chromosome resulted in gene expression differences in fewer than ten mouse genes in any given cell-type (Supplementary Fig. 7c, Supplementary Data 9). This implies that the sub-fertile phenotype of Tc1 males is not driven by transcriptional differences, but rather caused by activation of the spindle checkpoint independent of transcription. Together, these results demonstrate that the meiotic gene expression programme is robust both to aneuploidy and to variation in the cell-type composition within tubules.

Transcriptional dynamics during spermiogenesis

During spermiogenesis, chromatin condensation packages the haploid genome into the confined space of the sperm nucleus (Fig. 6a)¹¹. To dissect the gradual chromatin remodelling during spermatid differentiation, we examined the expression of histone variants, transition proteins, and protamines (Fig. 6a, b).

Histone variants are highly expressed in early RS, including H3.3, which is encoded by two genomic copies (H3f3a and H3f3b) with distinct expression patterns across spermatogenesis (Fig. 6b, Supplementary Fig. 8a). Both copies cause male fertility phenotypes upon perturbation; however, the phenotypes associated with the more dynamically regulated paralog H3f3b are more severe⁴⁶.

We also detected up-regulation of canonical histones, including Hist1h2bp and Hist1h4a specifically during early and mid-spermiogenesis (Fig. 6b, Supplementary Fig. 8b). Canonical histones are typically transcribed only during S phase in a replication-dependent manner⁴⁷, thus the atypical expression during spermiogenesis could suggest important roles as replacement histones during chromatin remodelling.

Testis-specific histone variants showed highest expression in elongating spermatids, particularly from S6 onwards. Hils1 and H1fnt decreased towards the late stages, similarly to the transition proteins Tnp1 and Tnp2⁴⁸. In contrast, Hypm, H2afb1 and H2bl1 (1700024P04Rik) were highly enriched until the end of differentiation, similar to protamines, suggesting a role for these histone variants during the final genome condensation (Fig. 6b).

Chromatin condensation results in transcriptional shutdown and is reflected by the gradual decline of expressed genes after S7 in our data (Fig. 6c). To fuel the drastic morphological changes associated with elongation in the absence of transcription, spermatids store large amounts of mRNAs in the chromatoid body, a perinuclear RNA granule⁴⁹. Difficulties in purifying late spermatids have hindered the characterisation of stored mRNAs, which likely have vital roles during late stages of spermiogenesis.

By correlating normalised gene expression against the number of genes expressed, we can identify genes that decrease (Gene Set 1–4) or increase (Gene Set 5–9) in relative expression after transcriptional shutdown (Fig. 6d). Gradually decreasing expression could be indicative of RNA degradation rates, whereas transcripts that increase in relative expression after transcriptional shutdown are likely protected from degradation. The latter include well-known spermiogenesis-specific genes involved in chromatin condensation and sperm motility, and thus present a resource for identifying spermiogenesis-related proteins with potential roles in fertility (Fig. 6d, Supplementary Data 10).

Meiotic silencing dynamics of sex chromosomes

Asynapsis results in the transcriptional silencing of sex chromosomes during male meiosis, a process termed meiotic sex chromosome inactivation (MSCI), and is followed by partial transcriptional reactivation in spermatids⁷ (Fig. 7a). We evaluated the inactivation and re-activation status of the sex chromosomes by plotting the ratio of gene expression from the sex chromosomes compared to all autosomes (Fig. 7b; Methods). As described by Sangrithi et al.,⁵⁰ the X chromosome is partially upregulated in spermatogonia (X:A ratio < 1), followed by transcriptional silencing in spermatocytes.
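A per-cell X:A ratio of the kind plotted here can be sketched as the mean expression of X-linked genes divided by the mean expression of autosomal genes. The exact definition used in this study is given in Methods; the use of means, the function name, the placeholder gene names, and the toy values are assumptions for illustration.

```python
from statistics import mean

def x_to_autosome_ratio(expression, chromosome_of):
    """Ratio of mean X-linked expression to mean autosomal expression for one cell.

    `expression` maps gene -> normalised expression;
    `chromosome_of` maps gene -> chromosome name (autosomes are numbered).
    """
    x_values = [v for g, v in expression.items() if chromosome_of[g] == "X"]
    # Numbered chromosomes only, so X, Y and mitochondrial genes are excluded.
    autosomal = [v for g, v in expression.items() if chromosome_of[g].isdigit()]
    return mean(x_values) / mean(autosomal)

# Placeholder genes and made-up expression values.
chromosome_of = {"x_gene_1": "X", "x_gene_2": "X", "auto_gene_1": "1", "auto_gene_2": "2"}
spermatogonium = {"x_gene_1": 3.0, "x_gene_2": 5.0, "auto_gene_1": 4.0, "auto_gene_2": 6.0}
pachytene = {"x_gene_1": 0.5, "x_gene_2": 0.5, "auto_gene_1": 5.0, "auto_gene_2": 5.0}

ratio_spermatogonium = x_to_autosome_ratio(spermatogonium, chromosome_of)
ratio_pachytene = x_to_autosome_ratio(pachytene, chromosome_of)
print(ratio_spermatogonium, ratio_pachytene)
```

In this toy example the ratio falls from 0.8 to 0.1, mimicking the direction of change expected when the X chromosome is silenced during meiosis.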

Using the more refined transcriptional profile for early meiotic prophase obtained using EmptyDrops (Fig. 4), we observed a sharp drop in X expression at the zygotene-to-pachytene transition, consistent with the onset of MSCI (Supplementary Fig. 9a). Throughout spermiogenesis, expression from the sex chromosomes gradually increases, reaching X:A ratios comparable to spermatogonia, therefore suggesting a substantial post-meiotic and temporally controlled reactivation (Fig. 7b, Supplementary Fig. 9b). Several genes are known to be re- or de novo activated in spermatids⁵¹, often dependent on RNF8 (Ring finger protein 8) and/or SCML2 (Sex comb on midleg-like 2)⁵²; however, the precise timing and order of transcriptional reactivation during spermiogenesis have not yet been explored.

Exploiting the time-course of whole-testis bulk RNA-Seq during the first spermatogenic wave allowed the sensitive detection of spermatid-specific escape genes by performing DE analysis between early (<P20) and late (>P20) time-points (Fig. 2a, Supplementary Fig. 3c). These X-linked de novo activated escape genes (n = 128) are exclusively expressed in spermatids, include previously annotated escape genes such as Cypt1 and Akap4, and are enriched for RNF8- or SCML2-dependent genes (Fisher’s Exact test: RNF8-targets, p-value < 5 × 10⁻¹²; SCML2-targets, p-value < 2 × 10⁻⁹) (Fig. 7c; Supplementary Data 11)⁵².

De novo activated escape genes showed a broad range of temporal expression patterns across our scRNA-Seq dataset (Fig. 7d). The earliest expression, directly following meiosis lasting until stage S4, was observed for three members of the Ssxb multi-copy gene family (Ssxb1, Ssxb2, and Ssxb3). We confirmed this expression pattern via RNA ISH for Ssxb1 and quantified the expression across tubules, with the highest signal in epithelial stages I–IV (Fig. 7e, f), closely resembling our scRNA-Seq data. Other multi-copy genes that showed a spermatid-specific expression pattern included Rhox11, Mageb5 and Slxl1 (Supplementary Fig. 9c, d). However, no other gene family showed early reactivation similar to Ssxb, which suggests that this gene family may have distinct functions in post-meiotic X reactivation.

Epigenetic changes underlying de novo escape gene activation

To reveal the epigenetic changes underlying de novo activation of spermatid-specific escape genes, we used CUT&RUN optimised for low cell numbers⁸ to profile the chromatin landscape of spermatocytes and spermatids from P24, P26 and P28 animals (Fig. 8a; Supplementary Fig. 10a; Methods). We assayed trimethylation of histone H3 on lysine 4 (H3K4me3) as a proxy for promoter activity, acetylation of lysine 27 (H3K27ac) which has been linked to RNF8-mediated reactivation of escape genes⁵², as well as repressive trimethylation of lysine 9 (H3K9me3), which is associated with the sex body in early and late spermatocytes and enriched in post-meiotic sex chromatin (PMSC)^53,54. By profiling the enrichment of H3K9me3 across all chromosomes, we confirmed high levels of H3K9me3 on the X chromosome in spermatids⁵⁵. In addition, we reveal that H3K9me3 accumulation begins earlier in meiosis, showing an enrichment of this repressive mark on the X chromosome already in spermatocytes (Fig. 8b, Supplementary Fig. 10c).

On autosomes, H3K9me3 is enriched in pericentromeric regions of constitutive heterochromatin and across tissue-specific gene clusters (Supplementary Fig. 10b). In contrast, H3K9me3 is more evenly distributed across the X chromosome in spermatocytes and spermatids (Supplementary Fig. 10c, d). Nevertheless, we detected broad regions with particularly high H3K9me3 scattered across the X chromosome (Supplementary Fig. 11a) and profiled the enrichment of repetitive elements within these regions compared to the rest of the X chromosome (Methods). Of the top enriched repeat elements, the majority were LTR elements from numerous families (Supplementary Fig. 11b), including RLTR10B which is aberrantly activated upon loss of the H3K9me3-deposition machinery⁵⁶.

Among the regions with highest enrichment for H3K9me3 was the promoter of Akap4, a well-known X-linked escape gene, which prompted us to compare the chromatin dynamics at promoters of de novo escape genes (spermatid-specific genes) versus the promoters of all other expressed X-chromosome genes (non-spermatid specific genes) (Fig. 7c; Supplementary Data 11).

Spermatid-specific genes showed lower levels for H3K4me3 in spermatocytes (Wilcoxon-Mann-Whitney: p-value < 2.2 × 10⁻¹⁶), followed by elevated H3K4me3 signal in spermatids, reflecting their increased expression post-meiosis (Fig. 8c, Supplementary Fig. 12a). Globally, a similar pattern was observed for H3K27ac. However, a subset of spermatid-specific genes, including Akap4 (see below), showed elevated levels of H3K27ac already in spermatocytes, consistent with its role in escape gene activation⁵² (Fig. 8d, Supplementary Fig. 12b, d). We confirmed these patterns for H3K4me3 and H3K27ac with publicly available ChIP-Seq data from Hammoud et al.⁵⁷ and revealed significantly lower levels of H3K4me1 at spermatid-specific genes in both spermatids and spermatocytes (Supplementary Fig. 12e). In contrast, the promoters of spermatid-specific escape genes showed a strong enrichment for H3K9me3 in spermatocytes, suggesting a targeted repression for a subset of genes during meiosis (Wilcoxon-Mann-Whitney: p-value < 3.7 × 10⁻¹¹) (Fig. 8e, Supplementary Fig. 12c).

The chromatin remodelling associated with escape gene activation is exemplified by the epigenetic changes occurring around Akap4 and Cypt1 (Fig. 8f, Supplementary Fig. 12d). In spermatocytes, the promoters of these two spermatid-specific genes have high levels of H3K9me3, which decreases in spermatids, while H3K4me3 levels are strongly increased. For Akap4, a direct target of RNF8, we observe a bivalent chromatin state in spermatocytes enriched for both H3K9me3 and H3K27ac, supporting the RNF8-mediated accumulation of H3K27ac in spermatocytes⁵².

The particular enrichment of H3K9me3 at de novo escape genes supports recent findings by Hirota et al.⁵⁶ identifying SETDB1 as the histone methyltransferase responsible for H3K9me3 enrichment across the sex chromosomes during meiosis. Loss of SETDB1 results in MSCI failure and germ cell apoptosis accompanied by the aberrant expression of spermatid-specific genes in spermatocytes, suggesting that our observed high levels of H3K9me3 are necessary to prevent premature transcription of spermatid-specific genes.

Discussion

The testes are among the most proliferative tissues in the adult body and ensure fertility via the continuous production of millions of sperm per day. In contrast to most developmental differentiation processes, which require the profiling of cellular populations at several time-points^58,59, spermatogenesis occurs continuously, with all intermediate cell-types present in the adult. This provides a powerful opportunity to capture and profile an entire differentiation process by profiling the transcriptomes of thousands of single-cells at a single time-point; similar approaches have been used by complementary studies dissecting spermatogenesis in both human and mouse^21,28,60,61,62,63,64,65.

We identified key developmental transitions within the differentiation trajectory by profiling the first wave of spermatogenesis, where development is naturally truncated, facilitating the identification of the most mature cell-types. Profiling juvenile animals also naturally enriched for cell-types under-represented in adults, including spermatogonia. We obtained more than 1100 spermatogonial transcriptomes, allowing the identification of specific cell clusters within this heterogeneous cell population thus greatly improving the resolution over previous studies that only studied adult testes⁶³. Furthermore, our approach enriched for and captured the differentiation of somatic cell-types, thus providing a valuable resource for understanding tissue homoeostasis.

Droplet-based scRNA-Seq can profile large numbers of cells simultaneously^66,67, capturing a wide range of transcriptional complexity. This presents a major computational challenge in distinguishing between (i) droplets containing transcriptionally inactive cells versus (ii) empty droplets containing (background) ambient RNA. By using a stringent default threshold, we identified the majority of somatic and germ cell-types in testes, similar to recent scRNA-Seq studies in mouse and human^28,63,65. In addition, we applied a statistical method to identify cells with diverse transcriptional complexity⁶, and were able to detect transcriptionally quiescent leptotene/zygotene spermatocytes. This allowed us to bridge the developmental transition between spermatogonia and spermatocytes, providing a more complete view of the continuum of germ cell differentiation.

The transcriptional silencing of the sex chromosomes during meiosis, and their subsequent partial re-activation post-meiosis, is essential for male fertility. Failure of meiotic sex chromosome inactivation results in expression of spermatocyte-lethal genes, as demonstrated for two Y chromosome-encoded genes, zinc finger protein Y-linked (Zfy) 1 and 2⁶⁸. Our discovery that H3K9me3 is enriched during meiosis at spermatid-specific genes on the X chromosome suggests a stronger, targeted repression in spermatocytes. The deposition of H3K9me3 is specific to MSCI in males, and is not observed during meiotic silencing of unpaired chromosomes (MSUC)⁶⁹. Furthermore, males display stronger meiotic silencing of the X chromosome compared with the unpaired X chromosome in XO oocytes⁶⁹. The more robust silencing in males is linked to SETDB1-mediated deposition of H3K9me3 and is essential for meiotic silencing, causing premature expression of spermatid genes when perturbed⁵⁶. Thus, our finding that spermatid-specific genes are particularly enriched for H3K9me3 in spermatocytes suggests that their targeted repression is necessary for male fertility.

Such a requirement could arise from the opposing evolutionary forces acting on the X chromosome⁷⁰. Due to its hemizygosity in males, the X chromosome is expected to be enriched for male-specific genes. In contrast, meiotic silencing allows pachytene-lethal genes to survive on the X chromosome, since their deleterious effect will be masked by MSCI, similarly to Zfy1/2 on the Y chromosome⁶⁸. Our study thus raises interesting questions about how H3K9me3 is targeted to specific genes on the X chromosome in spermatocytes, and how transcription is reactivated in post-meiotic spermatids.

Methods

Mouse material

All animals were housed in the Biological Resources Unit (BRU) in the Cancer Research UK – Cambridge Institute under Home Office Licences PPL 70/7535 until February 2018 and PPL P9855D13B from March 2018. C57BL/6J animals were purchased from Charles River UK Ltd (Margate, United Kingdom) and the Tc1 mouse line was obtained from Fisher and Tybulewicz⁷¹ and maintained by breeding female Tc1 mice to male (129S8 x C57BL/6J) F1 mice. Littermates that did not inherit human chromosome 21 in these crosses were used as control animals (Tc0).

Fluorescence-activated cell sorting of spermatogenic cells

Spermatogenic cell populations were isolated from adult mouse testes as described in Ernst et al.⁴⁵. In brief, the albuginea was removed and tissue was incubated in dissociation buffer containing 25 mg/ml Collagenase A, 25 mg/ml Dispase II and 2.5 mg/ml DNase I for 30 min at 37 °C. Enzymatic digestion was quenched with Dulbecco’s Modified Eagle Medium (DMEM, Gibco) supplemented with 10% fetal calf serum (FCS, 10270106, Gibco). Cells were resuspended at a concentration of 1 million cells per ml and stained with Hoechst 33342 (H3570, ThermoFisher Scientific) at a final concentration of 5 µg/ml for 45 min at 37 °C. Cells were resuspended in PBS containing 1% FCS and 2 mM EDTA and propidium iodide was added to a final concentration of 1 µg/ml prior to sorting.

Cells were sorted on an Aria IIu cell sorter (Becton Dickinson) using a 100 µm nozzle. Hoechst was excited with a UV laser at 355 nm and fluorescence was recorded with a 450/50 filter (Hoechst blue) and 635LP filter (Hoechst red). Primary spermatocytes (4N) and round spermatids (1N) were sorted and collected in PBS containing 1% FCS and 2 mM EDTA.

Total RNA-Seq from bulk samples

Testes from prepubertal mice ranging between postnatal day 6 and 35 were flash frozen or directly used for RNA extraction using Trizol (Thermo Fisher, 15596026) following manufacturer’s instructions. Purified RNA was DNase-treated using the TURBO DNA-free Kit according to manufacturer’s instructions (Thermo Fisher, AM1907) and RNA quality was assessed using the Agilent Tapestation RNA Screentape. Eight hundred nanograms of DNA-depleted RNA were used for RNA-Seq library preparation using the TruSeq Stranded Total RNA Library Kit with Ribo-Zero Gold for cytoplasmic and mitochondrial ribosomal RNA removal according to manufacturer’s instructions (Illumina, RS-122-2303). Libraries were then sequenced on Illumina HiSeq2500 using a paired-end 125 bp run.

10X Genomics single-cell RNA-Seq

Mouse testes were enzymatically dissociated as described above and 34 µl of single-cell suspension at a concentration of ~297,000 cells/ml was loaded into one channel of the Chromium™ Single Cell A Chip (10X Genomics®), aiming for a recovery of 4000–5000 cells. The Chromium Single Cell 3′ Library & Gel Bead Kit v2 (10X Genomics®, 120237) was used for single-cell barcoding, cDNA synthesis and library preparation, following manufacturer’s instructions according to the Single Cell 3′ Reagent Kits User Guide Version 2, Revision D. Libraries were sequenced on Illumina HiSeq2500 using a paired-end run sequencing 26 bp on read 1 and 98 bp on read 2. Information about libraries in which individual samples were sequenced is available in Supplementary Data 1.

Histology

Testes were fixed in neutral buffered formalin (NBF) for 24 h, transferred to 70% ethanol, machine processed and paraffin embedded. Formalin-fixed paraffin-embedded (FFPE) sections of 3 µm thickness were used for all histological stains and immunohistochemistry (IHC).

For Periodic Acid Schiff (PAS) stainings slides were dewaxed, washed in water and placed in 0.5% Periodic Acid (Sigma P0430) for 5 min. After three washes in ultra-pure water, slides were placed in Schiff reagent (Thermo Fisher Scientific, J/7300/PB08) for 15–30 min in a closed container and washed again three times in ultra-pure water. Counterstain was performed using Mayers Haematoxylin (Thermo Fisher Scientific, LAMB/170-D) for 40 s followed by rinsing in tap water, dehydration and mounting.

IHC was performed on FFPE sections using the Bond™ Polymer Refine Kit (DS9800, Leica Microsystems) on the automated Bond Platform. Anti-phospho-Histone H3 (Ser10) (pH3) antibody (Upstate, 06-570, 1:200 dilution) was used with DAB Enhancer (Leica Microsystems, AR9432) and heat-induced epitope retrieval was performed for 10 min at 100 °C on the Bond platform with sodium citrate. All slides were scanned using Aperio XT (Leica Biosystems) and pH3 intensities were quantified using the Aperio eSlide Manager (Leica Biosystems).

RNA in situ hybridisation using RNAscope®

Detection of transcripts for mouse genes Prss50, Pou5f2 and Ssxb1 was performed in single-plex assays on FFPE sections using Advanced Cell Diagnostics (ACD) RNAscope® 2.5 LS Reagent Kit-RED (Cat No. 322150), RNAscope® 2.5 LS Probe Mm-Prss50 (Cat No. 557338), RNAscope® 2.5 LS Probe Mm-Pou5f2 (Cat No. 557328), and RNAscope® 2.5 LS Probe Mm-Ssxb1 (Cat No. 557348) (ACD, Hayward, CA, USA).

Briefly, sections were cut at 3 µm thickness and baked for 1 h at 60 °C before loading onto a Bond RX instrument (Leica Biosystems). Slides were deparaffinised and rehydrated on board before pre-treatments using Epitope Retrieval Solution 2 (Cat No. AR9640, Leica Biosystems) at 88 °C for 10 min, and ACD Enzyme from the LS Reagent kit at 40 °C for 15 min. Probe hybridisation and signal amplification were performed according to manufacturer’s instructions. Fast red detection of mouse Prss50/Pou5f2/Ssxb1 was performed on the Bond RX using the Bond Polymer Refine Red Detection Kit (Leica Biosystems, Cat No. DS9390) according to ACD protocol. Slides were then removed from the Bond RX, heated at 60 °C for 1 h, dipped in Xylene and mounted using EcoMount Mounting Medium (Biocare Medical, CA, USA, Cat No. EM897L).

The slides were imaged on the Aperio AT2 (Leica Biosystems) to create whole slide images and were captured at ×40 magnification with a resolution of 0.25 microns per pixel. Quantitative image analysis was performed on the HALO Image Analysis Platform Version 2.3.2089.18 (Indica Labs). Image registration was used to synchronise serial sections and PAS stainings were used to stage seminiferous tubules according to their epithelial stage across tissue sections. Signal intensity of RNAscope® stainings was quantified across annotation layers containing tubules of the same epithelial stage using the RNA ISH v. 1.5 Module (Indica Labs). Average signal intensity across all tubules of the same epithelial stage is reported in the form of dots per µm².

Low cell number chromatin profiling using CUT&RUN

In situ chromatin profiling of FACS-purified spermatogenic cell populations using Cleavage Under Targets and Release Using Nuclease, CUT&RUN, was performed according to Skene et al.⁸ with minor modifications. In brief, spermatocytes and spermatids from P24, P26 and P28 animals were sorted as described above and collected in PBS. Cells were spun down at 600 × g for 3 min in a swinging-bucket rotor and washed twice with 1.5 ml Wash buffer (20 mM HEPES-KOH (pH 7.5), 150 mM NaCl, 0.5 mM Spermidine and 1X cOmplete™ EDTA-free protease inhibitor cocktail (04693159001, Roche)). During the cell washes, concanavalin A-coated magnetic beads (Bangs Laboratories, cat. No. BP531) (10 µl per condition) were washed twice in 1.5 ml binding buffer (20 mM HEPES-KOH (pH 7.5), 10 mM KCl, 1 mM CaCl₂, 1 mM MnCl₂) and resuspended in 10 µl binding buffer per condition. Cells were then mixed with beads and rotated for 10 min at room temperature (RT) and samples were split into aliquots according to the number of antibodies profiled per cell-type. We used 20,000–30,000 spermatocytes and 40,000–60,000 spermatids per chromatin mark.

Cells were then collected on magnetic beads and resuspended in 50 µl antibody buffer (Wash buffer with 0.05% Digitonin and 2 mM EDTA) containing one of the following antibodies in 1:100 dilution: H3K4me3 (Millipore 05-1339 CMA304, Lot2780484), H3K27ac (Abcam ab4729, GR3211741-1) and H3K9me3 (Abcam, ab8898, Lot GR306402-1). Cells were incubated with antibodies for 10 min at RT and then washed once with 1 ml Digitonin buffer (Wash buffer with 0.05% Digitonin). For the mouse anti-H3K4me3 antibody, samples were incubated with a 1:100 dilution in Digitonin buffer of secondary rabbit anti-mouse antibody (Invitrogen, A27033, Lot RG240909) for 10 min at RT and then washed once with 1 mL Digitonin buffer. Samples were then incubated with 700 ng/ml ProteinA-MNase fusion protein (kindly provided by Steven Henikoff) for 10 min at room temperature followed by two washes with 1 ml Digitonin buffer. Cells were then resuspended in 100 µl Digitonin buffer and cooled down to 4 °C before addition of CaCl₂ to a final concentration of 2 mM. Targeted digestion was performed for 30 min on ice until 100 µl of 2X STOP buffer (340 mM NaCl, 20 mM EDTA, 4 mM EGTA, 0.02% Digitonin, 250 mg RNase A, 250 µg Glycogen, 15 pg/ml yeast spike-in DNA (kindly provided by Steven Henikoff)) were added. Cells were then incubated at 37 °C for 10 min to release cleaved chromatin fragments, spun down for 5 min at 16,000 × g at 4 °C and collected on magnet. Supernatant containing the cleaved chromatin fragments was then transferred and cleaned up using the Zymo Clean & Concentrator Kit.

Library preparation was performed using the ThruPLEX® DNA-Seq Library Preparation Kit (R400407, Rubicon Genomics) with a modified Library Amplification programme: Extension and cleavage for 3 min at 72 °C followed by 2 min at 85 °C, denaturation for 2 min at 98 °C followed by four cycles of 20 s at 98 °C, 20 s at 67 °C and 40 s at 72 °C for the addition of indexes. Amplification was then performed for 12–14 cycles of 20 s at 98 °C and 15 s at 72 °C. Double-size selection of libraries was performed using Agencourt AMPure XP Beads (Beckman Coulter, A63880) according to manufacturer’s instructions. Average library size was tested on Agilent 4200 Tapestation using a DNA1000 High Sensitivity Screentape and quantification was performed using the KAPA Library Quantification Kit (Kapa Biosystems). CUT&RUN libraries were sequenced on a HiSeq2500 using a paired-end 125 bp run.

Read alignment and counting of 10X genomics scRNA-Seq data

To generate a genomic reference for sequence alignment, the full Mus musculus genome (GRCm38) was concatenated with the sequence of the human chromosome 21 (taken from GRCh38). Similarly, the genomic annotation for Mus musculus (GRCm38.88) was merged with the annotation for human chromosome 21 (taken from GRCh38.88). The Cell Ranger v1.3.1 mkref function with default settings was used to process the genomic sequence and the annotation file for read alignment. To obtain gene-specific transcript counts, the Cell Ranger v1.3.1 count function with default settings was used to align and count unique molecular identifiers (UMIs) per sample.

Quality control of Cell Ranger filtered cells

The Cell Ranger v1.3.1 software retains cells with similar UMI distributions⁷². We used this default threshold to obtain high-quality cells with large numbers of UMIs. After merging all samples, we filtered out cells that expressed <1000 genes. Furthermore, we excluded cells with more than 10% of reads mapping to the mitochondrial genome. These filtered data were used for all analyses except those presented in Fig. 4, Supplementary Figure 9a and Supplementary Data 7, where the EmptyDrops filtered cells (below) were utilised.
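The two filters described above can be sketched as a per-cell check. This is a minimal illustration in Python and not the code used in the study: the function name `passes_qc`, the dictionary layout, and the `mt-` gene-symbol prefix used to recognise mitochondrial genes are assumptions made for the example, and the mitochondrial fraction is computed here from UMI counts.

```python
def passes_qc(counts, min_genes=1000, max_mito_fraction=0.10, mito_prefix="mt-"):
    """Return True if one cell passes both filters.

    counts: dict mapping gene symbol -> UMI count for a single cell
            (hypothetical layout chosen for this sketch).
    """
    total = sum(counts.values())
    if total == 0:
        return False
    # Number of genes with at least one UMI in this cell.
    genes_detected = sum(1 for c in counts.values() if c > 0)
    # Fraction of counts assigned to mitochondrial genes (assumed "mt-" prefix).
    mito = sum(c for gene, c in counts.items() if gene.startswith(mito_prefix))
    return genes_detected >= min_genes and mito / total <= max_mito_fraction
```

Calling the same function with `min_genes=500` mirrors the lower gene threshold applied to the EmptyDrops filtered cells.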

Quality control of EmptyDrops filtered cells

Using the Cell Ranger default threshold leads to the exclusion of cells with lower transcriptional complexity. We therefore used the EmptyDrops function provided in the DropletUtils Bioconductor package⁶ to statistically distinguish empty droplets from genuine cells (controlling the FDR to 1%). After merging true cells across all samples, we filtered out cells with <500 genes expressed. Furthermore, we excluded cells with more than 10% of reads mapping to mitochondrial genes.

Normalisation of scRNA-Seq data

The transcriptomes of quality filtered cells were normalised using the scran package⁷³. Cells with similar transcriptomic complexity were pre-clustered using a graph-based approach (as implemented in the quickCluster function with the maximum cluster size set to 2000 cells). Size factors were calculated within each cluster before being scaled between clusters using the computeSumFactors function. Throughout this paper, the log₂-transformed, normalised counts (after adding one pseudocount) are displayed. For down-stream analysis, we removed genes that were not detected in any cell.
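The transformation applied after size-factor estimation can be written out explicitly. The sketch below assumes the size factor for a cell has already been estimated (the pooling and pre-clustering steps of scran are not reproduced here); the function name `log_normalise` is hypothetical.

```python
import math

def log_normalise(counts, size_factor):
    """log2-transform the counts of one cell after size-factor scaling.

    counts: list of raw UMI counts for one cell.
    size_factor: positive scaling factor for that cell (assumed precomputed).
    """
    if size_factor <= 0:
        raise ValueError("size factor must be positive")
    # Divide by the size factor, add one pseudocount, then take log2.
    return [math.log2(c / size_factor + 1) for c in counts]
```

For instance, with a size factor of 1, raw counts of 0, 3 and 7 map to 0, 2 and 3 on the log₂ scale.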

Detection of highly variable genes

To detect the top 1000 most variable genes across all tested cells, we first fitted a smooth loess regression trend between the variance of the log₂-transformed normalised counts and the abundance of each endogenous gene using the trendVar function in scran without fitting a parametric curve prior to smooth trend fitting. Next, we used the output of the trendVar function together with the log₂-normalised counts to compute the biological variation for each gene using the decomposeVar function in scran with default settings⁷⁴. Genes are ordered based on their biological variation and the top 1000 most variable genes are selected.

Computational mapping of single cells across samples

We first confirmed that the processing of samples across independent batches did not introduce technical batch effects by visualising replicates of P5 and adult B6 (Supplementary Fig. 2a, b). However, when visualising all samples (across sampled time-points and genetic backgrounds, Supplementary Data 1), we observe a biological sample effect (Supplementary Fig. 2c). To remove these sample-specific effects (from here on also named batch-effects), we used the mnnCorrect function implemented in the scran package⁹ (Supplementary Fig. 2d). To identify the set of input genes for mnnCorrect, we computed the top 1000 genes with highest biological variation across all cells within each sample. Subsequently, we used the combineVar function (using default settings) implemented in scran to combine the results of variance decompositions (results of the decomposeVar function) across all samples. The top 1000 genes with highest biological variation after merging were used as informative genes for batch-correction. The ordering of datasets as input into the mnnCorrect function is relevant as the first dataset is used as a reference and should ideally contain the majority of cell-types. Batch correction was performed across (i) all samples (using the CellRanger threshold or the EmptyDrops approach; using adult B6 as reference), (ii) P10 and P15 spermatogonia (using P10 spermatogonia as reference) or (iii) P5 and P10 somatic cell-types (using P10 somatic cell-types as reference) using mnnCorrect with the following parameters: cos.norm.in = TRUE, cos.norm.out = TRUE, sigma = 0.1.

Clustering of batch-corrected single-cell transcriptomes

The full set of CellRanger selected batch-corrected transcriptomes (explained above) were clustered using an iterative graph-based approach.

First, to define broad clusters, we constructed a shared nearest-neighbour (SNN) graph⁷⁵ considering five shared nearest neighbours using the buildSNNGraph function in scran with the following parameters: d = 50, type = “rank”, transposed = FALSE, pc.approx = TRUE, rand.seed = NA. In the next step, a multi-level modularity optimisation algorithm was used to find community structure in the graph⁷⁶ as implemented in the cluster_louvain function of the igraph R package (no edge weights were provided). Broad clusters were annotated based on the expression of known marker genes and grouped into somatic and germ cells.

Having grouped cells into somatic or germ cell categories, we re-processed the non-batch-corrected count matrix (separately for somatic and germ cells) by performing (i) batch correction across all samples as described above and (ii) graph-based clustering as described above. When clustering the somatic cells, we constructed the graph using 10 SNN while 5 SNN were used when clustering the germ cells. We annotated clusters based on known marker genes, the mapping of juvenile samples across the germ cell trajectory and the mapping of RA-synchronised cells as described below. Cells in small clusters that show unclear identities (co-expression of otherwise cell-type specific marker genes indicating possible doublets) were excluded from down-stream analysis.

Clustering of the P15 sample after EmptyDrops filtering was performed on the log₂-transformed, normalised counts using 10 shared nearest neighbours and the same strategy as explained above. Clustering of the batch-corrected counts of P10 and P15 spermatogonia was performed using 15 shared nearest neighbours. Clustering of the log₂-normalised counts of P5 spermatogonia was performed using 15 shared nearest neighbours.

Dimensionality reduction and hierarchical clustering

For visualisation, tSNE was computed on the batch-corrected counts of all samples using the R package Rtsne. For this, an initial principal component analysis (PCA) was calculated using the prcomp_irlba function as implemented in the irlba R package. The first 50 PCs were used as input to compute the tSNE (with the following parameters: pca = FALSE, perplexity = 350). Throughout this study, we visualise subsets of this tSNE except in Fig. 2d, and in Supplementary Figure 2a, b, where the plot was generated using log₂-transformed normalised counts of both adult B6 or both P5 samples.

PCA was computed either on the log₂-transformed, normalised counts of the top 1000 most highly variable genes or batch-corrected counts of the scRNA-Seq data using the base R prcomp function or the prcomp_irlba function as described above.

DE testing and marker gene extraction

DE testing across multiple pairwise comparisons was used to identify cluster-specific marker genes in the adult B6 samples, somatic cells of juvenile P5 and P10 samples, spermatogonia of P10 and P15 animals and cells detected in P15 sample after EmptyDrops filtering. To detect cluster-specific marker genes, the findMarkers function implemented in scran was applied to the log₂-transformed normalised counts while providing the cluster labels. In cases where DE testing was performed across cells from multiple samples, we supplied the findMarkers function with sample labels as blocking factors to account for sample-specific effects. Group-specific marker genes are defined as genes with a log₂-fold change in expression between the group of interest and all other groups as well as a false discovery rate < 0.1. We also used the findMarkers function to detect genes differentially expressed between all spermatocytes and spermatids from adult B6 (Supplementary Data 8).

To detect differentially expressed genes between Tc1 and Tc0 animals and for somatic cell-types between P5 and P10 samples, we summed counts within each cell cluster and each batch to form pseudo-bulk samples. We used the Bioconductor package edgeR⁷⁷ to perform DE analysis. For this, we first calculated normalisation factors using the calcNormFactors function. Next, we estimated dispersion across all pseudo-bulk samples using the estimateDisp function while providing a design matrix containing the factors to be tested. We then fitted a quasi-likelihood negative binomial generalised log-linear model to the count data while providing the design matrix using the glmQLFit function with the following extra parameter: robust = TRUE. DE testing was performed between the conditions using the glmTreat function with the following parameters: coef = 2, lfc = 0.5 (testing an absolute log-fold change in mean expression >0.5). The false discovery rate was controlled to 10%. This approach avoids confounding batch effects between the two genotypes⁷⁸. Results are presented by plotting the log₂-fold change in expression between Tc1 and Tc0 animals versus the log₂-transformed counts per million, averaged across both conditions (Supplementary Fig. 7c).
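The pseudo-bulk step, in which counts are summed within each cluster and batch before model fitting, can be illustrated as follows. This is a sketch under stated assumptions: the input layout (a sequence of `(cluster, batch, counts)` tuples) and the function name `pseudo_bulk` are hypothetical, and the subsequent edgeR modelling is not reproduced.

```python
from collections import defaultdict

def pseudo_bulk(cells):
    """Sum per-gene counts over all cells sharing a (cluster, batch) pair.

    cells: iterable of (cluster, batch, counts) tuples, where counts is a
           dict mapping gene symbol -> UMI count (assumed layout).
    Returns a dict keyed by (cluster, batch) with summed gene counts.
    """
    summed = defaultdict(lambda: defaultdict(int))
    for cluster, batch, counts in cells:
        for gene, count in counts.items():
            summed[(cluster, batch)][gene] += count
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {key: dict(genes) for key, genes in summed.items()}
```

Each resulting (cluster, batch) profile then plays the role of one sample in a bulk-style count matrix.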

Differential cell-proportion testing between samples

To robustly test for differences in cell-type proportions between Tc0 (n = 3) and Tc1 (n = 4) animals, we counted the number of cells allocated to each cell-type within each batch. edgeR was used to perform differential proportion testing using a similar principle to the approach described in the previous section. We first constructed a DGEList object using the number of cells in each germ cell group per sample and providing the total number of cells per sample as a lib.size argument. We next ran the estimateDisp function while providing a design matrix containing the factors to be tested. As described above, the glmQLFit function was used with the following parameter: robust = TRUE, and the glmQLFTest function was called to test differential cell proportions between Tc1 and Tc0 samples for each cell-type. The false discovery rate was controlled to 10%.

Ordering cells along their developmental trajectory

To order cells along their developmental trajectory, we fitted a principal curve⁷⁹ to a set of principal components (computed on the top 1000 highly variable genes) using the principal.curve function implemented in the princurve R package. The principal curve was fitted to the first 3 PCs after performing PCA on (i) the batch-corrected data of P10 and P15 spermatogonia or (ii) the log₂-normalised counts of spermatocytes or spermatids of adult B6 samples, and to the first 10 PCs after performing PCA on the log₂-normalised counts of EmptyDrops filtered germ cells at P15. This approach allows us to order cells along the principal curve. The directionality of the curve was inferred using prior information based on the cluster annotation.

We compared the robustness of the principal curve ordering of cells to cell ordering after computing the pseudotime of cells using monocle⁸⁰. For this, we first constructed a CellDataSet (as implemented in monocle) using the batch-corrected counts of P10 and P15 spermatogonia. To avoid additional normalisation, we set the size factors to 1. We next computed a low dimensional representation of the cells using the reduceDimension function with default settings. Finally, we ordered cells based on the pseudotime computed using the orderCells function with default settings. The ordering of cells obtained by fitting a principal curve is highly correlated with the ordering obtained using monocle (Supplementary Fig. 5b).

Correlation analysis

To correlate log₂-transformed normalised gene expression to the number of genes expressed, we used the correlatePairs function implemented in scran⁷⁴. We first constructed an empirical null distribution (n = 100,000) using the correlateNull function implemented in scran supplying the number of cells in the dataset. Next, we tested the observed Spearman’s rho for each gene (excluding lowly expressed genes; averaged log₂-transformed normalised counts > 0.1) against this null distribution. We consider genes with rho < −0.3 and a Benjamini-Hochberg corrected empirical p-value < 0.1 as negatively correlated and genes with rho > 0.3 and a Benjamini-Hochberg corrected empirical p-value < 0.1 as positively correlated.
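The final classification step combines a multiple-testing correction with the rho thresholds stated above. The sketch below shows a standard Benjamini-Hochberg adjustment followed by the threshold rule; it assumes that empirical p-values and Spearman's rho have already been computed, and the function names `benjamini_hochberg` and `classify` are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(rho, adjusted_p, rho_cutoff=0.3, fdr=0.1):
    """Label a gene as positively, negatively or not correlated."""
    if adjusted_p < fdr and rho > rho_cutoff:
        return "positive"
    if adjusted_p < fdr and rho < -rho_cutoff:
        return "negative"
    return "none"
```

A gene is therefore only reported when both the effect-size threshold and the corrected significance threshold are met.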

Calculating the stem cell and progenitor score

To separate SSCs from progenitor cells among the group of spermatogonia at P10 and P15, we examined the expression of known SSC marker genes: Id4, Gfra1, Lhx1, Egr2, Etv5, Nanos2, Ret, Eomes as well as progenitor markers: Neurog3, Rarg, Nanos3, Lin28a, Upp1³⁵. For each cell, we calculated the fraction of SSC markers and progenitor markers expressed (>0 counts). The colour scale in Supplementary Fig. 5c indicates the fraction of SSC marker genes versus the fraction of progenitor marker genes expressed.
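The per-cell score is simply the fraction of markers in each list with non-zero counts. A minimal sketch follows, using the marker lists given above; the function name `marker_fraction` and the dictionary layout of a cell are assumptions for illustration.

```python
SSC_MARKERS = ["Id4", "Gfra1", "Lhx1", "Egr2", "Etv5", "Nanos2", "Ret", "Eomes"]
PROGENITOR_MARKERS = ["Neurog3", "Rarg", "Nanos3", "Lin28a", "Upp1"]

def marker_fraction(counts, markers):
    """Fraction of `markers` detected (>0 counts) in one cell.

    counts: dict mapping gene symbol -> count for a single cell
            (hypothetical layout); missing genes are treated as zero.
    """
    expressed = sum(1 for gene in markers if counts.get(gene, 0) > 0)
    return expressed / len(markers)
```

For a cell expressing Id4, Gfra1 and Neurog3 only, the SSC fraction is 2/8 and the progenitor fraction is 1/5.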

Computing the sex chromosome to autosome ratio

To compute the ratio in expression between chromosome 9, chromosome X or chromosome Y and all autosomes, we selected genes that were expressed in more than 30% of spermatogonia or 30% of spermatids, the cell-types with detectable X chromosome expression. For each cell, the mean expression across these genes per chromosome was calculated. Mean expression of the chromosomes of interest (9, X and Y) was divided by mean expression of the autosomes.
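The ratio described above can be sketched for a single cell as follows. This is an illustration only: the function name, the input layout, and in particular the choice to pool all autosomal genes into one mean (rather than averaging per-chromosome means) are assumptions, since the text does not specify the latter. The gene set is assumed to have been pre-filtered for expression as described.

```python
from statistics import mean

def chromosome_to_autosome_ratio(expression, gene_chromosome, chromosome):
    """Ratio of mean expression on one chromosome to mean autosomal expression.

    expression: dict gene -> normalised expression in one cell (pre-filtered).
    gene_chromosome: dict gene -> chromosome name such as "9", "X" or "Y".
    chromosome: the chromosome of interest.
    """
    non_autosomal = {"X", "Y", "MT"}
    target = [v for g, v in expression.items() if gene_chromosome[g] == chromosome]
    # Assumption: autosomal genes are pooled into a single mean.
    autosomal = [v for g, v in expression.items()
                 if gene_chromosome[g] not in non_autosomal]
    return mean(target) / mean(autosomal)
```

Because chromosome 9 is itself an autosome, its ratio serves as an autosomal reference against which the X and Y ratios can be compared.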

Analysis of RA-synchronised scRNA-Seq data

This section describes the computational analysis of retinoic acid (RA)-synchronised germ cells that were captured and sequenced by Chen et al.²¹. The raw count data can be obtained from Gene Expression Omnibus under the accession number GSE107644.

After downloading the raw data, we merged all samples into one dataset and removed 2 cells that had extreme numbers of detected genes (<1,250 or >12,000). Cells with similar transcriptomic complexity were pre-clustered using a graph-based approach (as implemented in the quickCluster function in scran while restricting the maximum cluster size to 1000 cells). Size factors were calculated within each cluster before being scaled between clusters using the computeSumFactors function.

As described in the “Computational mapping of single cells across samples” section, the mnnCorrect function implemented in scran was used to combine data from RA-synchronised cells and germ cell data generated in this study. The data generated in our study were used as mapping reference. Groups of RA-synchronised cells were labelled as follows: A1: A₁ spermatogonia; ln: Intermediate spermatogonia; TypeBS: Type B spermatogonia in S-phase; TypeBG2M: Type B spermatogonia in G2/M phase of cell cycle; G1: Spermatocytes (SC) in G1 phase of cell cycle; L: Leptotene SC; Z: Zygotene SC; ePL: early Pre-Leptotene SC; mPL: mid Pre-Leptotene SC; lPL: late Pre-Leptotene SC; eP: early Pachytene SC; mP: mid Pachytene SC; lP: late Pachytene SC; D: diplotene SC; MI: Metaphase I; MII: Metaphase II; RS1o2: RS 1–2; RS3o4: RS 3–4; RS5o6: RS 5–6; RS7o8: RS 7–8.

These cells were mapped as follows: Fig. 2: all RA-synchronised cells to all germ cells from adult B6; Fig. 3: A1, ln, TypeBS, G1, TypeBG2M, ePL, mPL, lPL RA-synchronised cells to spermatogonia from P10 and P15 time-points; Fig. 4: A1, ln, TypeBS, G1, TypeBG2M, ePL, mPL, lPL, L, Z, eP, mP, lP RA-synchronised cells to EmptyDrops filtered germ cells from the P15 time-point.

Read alignment and counting of bulk RNA-Seq data

Sequencing reads were aligned against the Mus musculus genome (GRCm38) using the STAR aligner v2.5.3⁸¹ with default settings. Gene-level transcript counts were obtained using HTSeq version 0.9.1⁸² with the -s option set to “reverse” and using the GRCm38.88 genomic annotation file.

Quality control and normalisation of bulk RNA-Seq data

We visualised several features of the aligned and counted data (number of intronic/exonic reads, number of multi-mapping reads, low-quality reads and total library size) and did not detect any low-quality RNA-Seq libraries. Next, we used the size factor normalisation approach implemented in DESeq2⁸³ for data normalisation. For down-stream analysis and visualisation, lowly expressed genes (averaged counts < 10) were excluded.
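
The size-factor step can be illustrated with a minimal pure-Python sketch of the median-of-ratios idea that underlies the default size factors in DESeq2. This is not the DESeq2 API: the function names and toy counts are hypothetical. Applying the "averaged counts < 10" filter to normalised rather than raw counts is also an assumption, since the text does not say which was used.

```python
import math
from statistics import median

def median_of_ratios_size_factors(counts):
    """counts: list of samples, each a list of gene counts (same gene order)."""
    n_genes = len(counts[0])
    # Log geometric mean per gene; genes with a zero in any sample are skipped
    log_geo_means = []
    for g in range(n_genes):
        values = [sample[g] for sample in counts]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)
    factors = []
    for sample in counts:
        log_ratios = [math.log(sample[g]) - lgm
                      for g, lgm in enumerate(log_geo_means) if lgm is not None]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalise_and_filter(counts, min_mean=10):
    """Divide counts by size factors and keep genes with mean >= min_mean."""
    factors = median_of_ratios_size_factors(counts)
    normalised = [[c / f for c in sample] for sample, f in zip(counts, factors)]
    n_genes = len(counts[0])
    keep = [g for g in range(n_genes)
            if sum(s[g] for s in normalised) / len(normalised) >= min_mean]
    return factors, normalised, keep
```

For two toy libraries in which the second is sequenced exactly twice as deeply as the first, the second size factor comes out at twice the first, and only genes whose normalised mean reaches the threshold are retained.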

Probabilistic classification of bulk samples

We used a regression approach to link the bulk samples to the transcriptomic profiles of single cells. Using the top 50 cluster-specific marker genes for spermatogonia, all spermatocyte groups, all spermatid groups, Sertoli and Leydig cells, we trained a random forest classifier (implemented in the randomForest R package⁸⁴) on 2000 cells isolated from adult B6 testes. Model testing was performed on the remaining 1215 cells isolated from adult B6 testes. Prior to training and testing, log₂-transformed, normalised counts were scaled by computing the Z score for each gene. Probabilistic prediction was performed using the Z score of log₂-transformed, normalised bulk RNA-Seq reads of the input genes.

A similar approach was taken when classifying bulk RNA-Seq data of early juvenile time-points (P6-P20) based on marker gene expression across cell-types identified from the EmptyDrops filtered cells at P15. Here, we trained the random forest on 4000 cells from P15 and followed the same approach as described above.

DE analysis between time-points

DE analysis between cells present in bulk samples before post-natal day 20 and after day 20 was performed using edgeR. The glmTreat function was used for testing against a minimum absolute log₂-fold change threshold of 2. Spermatid-specific genes were identified as those with a log₂-fold change > 5 in samples after day 20 compared to samples before day 20 (controlling the FDR at 10%).

Read alignment of CUT&RUN data

Paired-end reads were aligned to the Mus musculus genome (GRCm38) using Bowtie2 with the following settings: --local --very-sensitive-local --no-unal -q --phred33. Due to a multiplexing error, one library of an H3K27ac sample was sequenced ~10 times deeper than the rest of the samples. We therefore sub-sampled the reads of this library to 10%. This sample is marked in Supplementary Data 1.

CUT&RUN read counting in specified regions

Paired-end reads were counted in specified regions using the regionCounts function implemented in the csaw Bioconductor package⁸⁵. For this, duplicated reads, reads with a Phred quality score below 10, read pairs mapped more than 1000 bp apart and reads mapping to blacklisted regions (available at: http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/mm10.blacklist.bed.gz) were removed. Regions of interest were: promoters (obtained using the promoters function of the GenomicFeatures package), 1000 bp windows across the chromosome (using the windowCounts function of csaw) and whole chromosomes.

Scale normalisation of counted reads

Counts per region were normalised based on library size (counts per million, CPM) for promoter regions and 1000 bp windows; additionally, when considering entire chromosomes, the length of the chromosome was accounted for by computing the Fragments per Kilobase per Million mapped reads (FPKMs). For visualisation purposes, CPM and FPKMs were log-transformed after adding a pseudo-count of 1.
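
The two scale normalisations can be written out as a short sketch. The function names are hypothetical, "library size" is taken here to mean the total number of counted reads or fragments in the library, and a base-2 logarithm is assumed because the text does not specify the base.

```python
import math

def cpm(count, library_size):
    """Counts per million: scale a region count by library size."""
    return count / library_size * 1e6

def fpkm(count, library_size, region_length_bp):
    """Fragments per kilobase per million: also scale by region length."""
    return count / (region_length_bp / 1e3) / (library_size / 1e6)

def log_transform(value, pseudo_count=1.0):
    """Log-transform after adding a pseudo-count (base 2 is an assumption)."""
    return math.log2(value + pseudo_count)

# Toy values: a promoter with 50 reads in a library of 2 million reads,
# and a 150 Mb chromosome with 3000 fragments in the same library
promoter_cpm = cpm(50, 2_000_000)
chromosome_fpkm = fpkm(3000, 2_000_000, 150_000_000)
promoter_log_cpm = log_transform(promoter_cpm)
```

CPM suffices for promoters and fixed-width windows because these regions have comparable lengths, whereas whole chromosomes differ substantially in length and therefore need the additional per-kilobase scaling.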

Regions with high H3K9me3 counts

To visualise regions with the highest H3K9me3 signal, we merged the top 1000 windows (1000 bp width) using the mergeWindows function of the csaw package with a tolerance of 1500 bp. For visualisation purposes, we performed this analysis for one replicate of P26 spermatocytes and one replicate of P26 spermatids.
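
The selection and merging of high-signal windows can be sketched as follows. This is a simplified stand-in for the csaw routine rather than its actual implementation. It assumes windows on a single chromosome given as half-open (start, end) coordinates, and it measures the gap between neighbouring windows as the next start minus the previous end, which may differ from the exact convention in csaw. All function names are hypothetical.

```python
def top_windows(window_counts, n=1000):
    """Return the n windows with the highest counts.

    window_counts: dict mapping (start, end) -> read count
    """
    ranked = sorted(window_counts, key=window_counts.get, reverse=True)
    return ranked[:n]

def merge_windows(windows, tol=1500):
    """Merge windows whose gap to the previous region is at most tol bp."""
    merged = []
    for start, end in sorted(windows):
        if merged and start - merged[-1][1] <= tol:
            # Close enough to the previous region: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]

# Toy example: keep the 4 highest windows, then merge neighbours
toy_counts = {
    (0, 1000): 50, (1000, 2000): 40, (3000, 4000): 45,
    (10000, 11000): 60, (20000, 21000): 1,
}
regions = merge_windows(top_windows(toy_counts, n=4), tol=1500)
```

In the toy data, the first three selected windows collapse into one region because each gap is within the tolerance, while the distant fourth window remains separate.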

Enrichment of repeats within the H3K9me3 high regions

To find repeats that are enriched in regions that showed high H3K9me3 signal (see above) relative to the rest of the X chromosome, we computed the fraction of H3K9me3 high regions (in bases) that were annotated as belonging to a family of repetitive elements. This analysis was performed using one replicate of P26 spermatocytes using the countOverlaps function implemented in the GenomicRanges R package. Repeat locations were obtained from RepeatMasker (mm10-4.0.5–Repeat Library 20140131⁸⁶) and simple, telomeric and centromeric repeats were removed. Enrichment for each repeat family inside the bins compared to the whole X chromosome was performed using a Fisher’s Exact test as implemented in the fisher.test function in R.
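
For readers unfamiliar with the test, the sketch below computes a two-sided Fisher's Exact test p-value for a 2×2 table in pure Python. It sums the probabilities of all tables that are no more likely than the observed one, which is the usual two-sided definition. The table layout is an assumption about how the enrichment could be framed, not a description of the authors' code. Here a and b would be the bases inside H3K9me3-high regions that are and are not annotated to a repeat family, and c and d the same for the rest of the X chromosome.

```python
import math

def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    log_denominator = _log_comb(total, col1)

    def log_prob(x):
        # Hypergeometric log-probability of seeing x in the top-left cell
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denominator

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = log_prob(a)
    # Sum over all tables that are no more probable than the observed table
    p_value = sum(
        math.exp(lp)
        for lp in (log_prob(x) for x in range(lowest, highest + 1))
        if lp <= observed + 1e-7
    )
    return min(1.0, p_value)

toy_p = fisher_exact_two_sided(3, 1, 1, 3)
```

Working on the log scale avoids overflow for large counts, although the loop over all possible tables becomes slow when the margins are very large.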

Processing of Hammoud et al. ChIP-Seq data

This section describes the analysis of ChIP-Seq data generated by Hammoud et al.⁵⁷. We obtained the raw fastq files of mouse ChIP-Seq data directly from Gene Expression Omnibus under the accession key: GSE49624.

Similar to the CUT&RUN data, single-end reads were aligned to the Mus musculus genome (GRCm38) using Bowtie2 with the following settings: --local --very-sensitive-local --no-unal -q --phred33.

Single-end reads in promoter regions (obtained using the promoters function of the GenomicFeatures package) were counted using the regionCounts function implemented in the csaw Bioconductor package⁸⁵. For this, duplicated reads, reads with a Phred quality score below 10, and reads mapping to blacklisted regions were removed.

Counts per promoter were normalised based on library size (counts per million, CPM). For visualisation purposes, CPM per promoter were log-transformed after adding a pseudo-count of 1.

Statistical analysis for CUT&RUN and ChIP-Seq data

To test for differences in histone mark deposition, we performed two-sample Wilcoxon Mann-Whitney tests between the CPMs measured in promoters of spermatid-specific genes and CPMs measured in promoters of non-spermatid-specific genes. Only promoters of genes that were detected as expressed (averaged counts ≥ 10) across all bulk samples were selected for testing and visualisation.
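
The rank-based comparison can be illustrated with a small sketch that computes the Mann-Whitney U statistic and a two-sided p-value from the normal approximation. It is deliberately simplified: ties receive average ranks, but no tie or continuity correction is applied to the variance, so p-values will differ somewhat from those of statistical packages, especially for small samples. The function name and toy CPM values are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided normal-approximation p-value)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the block of tied values starting at position i
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += average_rank * sum(
            1 for k in range(i, j) if pooled[k][1] == 0
        )
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value

# Toy log-CPM values for two hypothetical promoter groups
u_stat, p_approx = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

U counts the pairs in which a value from the first group exceeds a value from the second, with ties counted as one half, so a U of zero means the first group lies entirely below the second.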

Gene annotation

We obtained genes with a known fertility phenotype collected by Matzuk and Lamb⁸⁷ (Original manuscript: Supplementary Table 1). We tested for enrichment of these fertility-associated genes among all cell-type specific marker genes (Supplementary Table 2) using Fisher’s Exact test. To visualise histone variants and canonical histones, we used the annotation found in El Kennani et al.⁸⁸. Targets of Rnf8 and Scml2 were taken from Adams et al.⁵².

Multi-copy gene analysis

To analyse multi-copy gene families, we used the annotation from Mueller et al.⁵¹, Supplementary Table 1. The cDNA sequence of these genes was obtained from Ensembl (www.ensembl.org) and the BLAST tool was used to identify sequences of genes with high similarity (>90%). For each multi-copy gene family, normalised counts for all X-chromosomal genes with high sequence similarity were summed.

Visualisation of gene- and promoter-level information

To visualise gene-level transcript counts we either plot the log₂-transformed normalised counts or the Z score of the log₂-transformed, normalised transcript counts. The Z score is computed as \(z_{ij} = \frac{x_{ij} - \mu_{i}}{\sigma_{i}}\), where \(x_{ij}\) is the log₂-transformed, normalised count for gene i in cell j, \(\mu_{i}\) is the mean of gene i across all cells and \(\sigma_{i}\) is the standard deviation of gene i across all cells.
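
A minimal sketch of the per-gene Z score is given below. The function name and toy values are hypothetical. The sample standard deviation (dividing by n − 1) is an assumption, as the text does not state which estimator was used, and returning zeros for a gene with no variation is a convenience choice of this sketch.

```python
from statistics import mean, stdev

def z_scores(values):
    """Z-score one gene's log2-transformed, normalised counts across cells."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1): an assumption
    if sigma == 0:
        # Constant gene: no variation to scale by
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Toy log2 expression of one gene across three cells
scaled = z_scores([1.0, 2.0, 3.0])
```

Scaling each gene separately puts lowly and highly expressed genes on a common scale, which is useful for heatmaps and for comparing single-cell and bulk profiles.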

Distributions of log₂-transformed, normalised expression counts as well as CUT&RUN log-transformed CPM in promoters are displayed in the form of boxplots. The centre line marks the median, the lower and upper hinges correspond to the 25th and 75th percentiles, and the whiskers extend to the smallest and largest values within 1.5 times the interquartile range of the hinges. Values beyond the whiskers are plotted as dots.
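
The boxplot summary can be made concrete with a short sketch that derives the plotted quantities from a list of values. The function name is hypothetical, and the use of linear-interpolation quantiles (the "inclusive" method of Python's statistics module, available from Python 3.8) is an assumption about how the percentiles are defined.

```python
from statistics import quantiles

def boxplot_stats(values):
    """Median, hinges, whisker ends and outliers for one boxplot."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points inside the fences
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "median": med,
        "lower_hinge": q1,
        "upper_hinge": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

In the toy data the value 100 lies beyond the upper fence, so it would be drawn as a dot while the upper whisker stops at 8.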

Ethics statement

This investigation was approved by the Animal Welfare and Ethics Review Board and followed the Cambridge Institute guidelines for the use of animals in experimental studies under Home Office licences PPL 70/7535 until February 2018 and PPL P9855D13B from March 2018. All animal experimentation was carried out in accordance with the Animals (Scientific Procedures) Act 1986 (United Kingdom) and conformed to the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines developed by the National Centre for the Replacement, Refinement and Reduction of Animals in research (NC3Rs).

Shiny server

To visualise the different samples named in this study, we set up a shiny app which can be accessed via: https://marionilab.cruk.cam.ac.uk/SpermatoShiny.

Reporting Summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The authors declare that all data supporting the findings of this study are available within the article and its supplementary information files or from the corresponding author upon reasonable request. Data have been deposited in the ArrayExpress database under accession code E-MTAB-6946 for scRNA-Seq data, E-MTAB-6934 for bulk RNA-Seq data and E-MTAB-6932 for CUT&RUN data. The R code to reproduce the full analysis and all figures can be obtained from: https://github.com/MarioniLab/Spermatogenesis2018. The source data underlying Fig. 7f and Supplementary Figs 6a, c and 7a are provided as a Source data file. A reporting summary for this Article is available as a Supplementary Information file.

References

1. Oakberg, E. F. Spermatogonial stem-cell renewal in the mouse. Anat. Rec. 169, 515–531 (1971).
2. de Rooij, D. G. & Russell, L. D. All you wanted to know about spermatogonia but were afraid to ask. J. Androl. 21, 776–798 (2000).
3. Soh, Y. Q. S. et al. Meioc maintains an extended meiotic prophase I in mice. PLoS Genet. 13, e1006704 (2017).
4. Oakberg, E. F. A description of spermiogenesis in the mouse and its use in analysis of the cycle of the seminiferous epithelium and germ cell renewal. Am. J. Anat. 99, 391–413 (1956).
5. Oakberg, E. F. Duration of spermatogenesis in the mouse and timing of stages of the cycle of the seminiferous epithelium. Am. J. Anat. 99, 507–516 (1956).
6. Lun, A. et al. Distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Preprint at https://www.biorxiv.org/content/10.1101/234872v2 (2018).
7. Turner, J. M. A. Meiotic sex chromosome inactivation. Development 134, 1823–1831 (2007).
8. Skene, P. J., Henikoff, J. G. & Henikoff, S. Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019 (2018).
9. Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).
10. Zhang, T., Murphy, M. W., Gearhart, M. D., Bardwell, V. J. & Zarkower, D. The mammalian Doublesex homolog DMRT6 coordinates the transition between mitotic and meiotic developmental programs during spermatogenesis. Development 141, 3662–3671 (2014).
11. Ernst, C., Odom, D. T. & Kutter, C. The emergence of piRNAs against transposon invasion to preserve mammalian genome integrity. Nat. Commun. 8, 1411 (2017).
12. Fujii, T. et al. Use of stepwise subtraction to comprehensively isolate mouse genes whose transcription is up‐regulated during spermiogenesis. EMBO Rep. 3, 367–372 (2002).
13. Neill, J. D. & Knobil, E. Knobil and Neill’s Physiology of Reproduction. (Elsevier, Amsterdam, Netherlands, 2015).
14. Oresti, G. M., García-López, J., Aveldaño, M. I. & Mazo, J. del. Cell-type-specific regulation of genes involved in testicular lipid metabolism: fatty acid-binding proteins, diacylglycerol acyltransferases, and perilipin 2. Reproduction 146, 471–480 (2013).
15. Bastos, H. et al. Flow cytometric characterization of viable meiotic and postmeiotic cells by Hoechst 33342 in mouse spermatogenesis. Cytom. A 65, 40–49 (2005).
16. Romrell, L. J., Bellvé, A. R. & Fawcett, D. W. Separation of mouse spermatogenic cells by sedimentation velocity. A morphological characterization. Dev. Biol. 49, 119–131 (1976).
17. Soumillon, M. et al. Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep. 3, 2179–2190 (2013).
18. Bellvé, A. R. et al. Spermatogenic cells of the prepuberal mouse. Isolation and morphological characterization. J. Cell Biol. 74, 68–85 (1977).
19. Janca, F. C., Jost, L. K. & Evenson, D. P. Mouse testicular and sperm cell development characterized from birth to adulthood by dual parameter flow cytometry. Biol. Reprod. 34, 613–623 (1986).
20. Steger, K. Transcriptional and translational regulation of gene expression in haploid spermatids. Anat. Embryol. 199, 471–487 (1999).
21. Chen, Y. et al. Single-cell RNA-seq uncovers dynamic processes and critical regulators in mouse spermatogenesis. Cell Res. 28, 879–896 (2018).
22. Kaftanovskaya, E. M., Lopez, C., Ferguson, L., Myhr, C. & Agoulnik, A. I. Genetic ablation of androgen receptor signaling in fetal Leydig cell lineage affects Leydig cell functions in adult testis. FASEB J. 29, 2327–2337 (2015).
23. Cool, J., Carmona, F. D., Szucsik, J. C. & Capel, B. Peritubular myoid cells are not the migrating population required for testis cord formation in the XY gonad. Sex. Dev. 2, 128–133 (2008).
24. Shih, S.-C. et al. The L6 protein TM4SF1 is critical for endothelial cell function and tumor angiogenesis. Cancer Res. 69, 3272–3277 (2009).
25. Kitchens, R. L. Role of CD14 in cellular recognition of bacterial lipopolysaccharides. Chem. Immunol. 74, 61–82 (2000).
26. Oatley, J. & Griswold, M. The Biology of Mammalian Spermatogonia, https://doi.org/10.1007/978-1-4939-7505-1 (Springer-Verlag, Berlin, Germany, 2017).
27. Chemes, H. The phagocytic function of Sertoli cells: a morphological, biochemical, and endocrinological study of lysosomes and acid phosphatase localization in the rat testis. Endocrinology 119, 1673–1681 (1986).
28. Green, C. D. et al. A Comprehensive Roadmap of Murine Spermatogenesis Defined by Single-Cell RNA-Seq. Dev. Cell 46, 651–667 (2018).
29. Moniot, B. et al. The PGD2 pathway, independently of FGF9, amplifies SOX9 activity in Sertoli cells during male sexual differentiation. Development 136, 1813–1821 (2009).
30. Archambeault, D. R. & Yao, H. H.-C. Activin A, a product of fetal Leydig cells, is a unique paracrine regulator of Sertoli cell proliferation and fetal testis cord expansion. Proc. Natl Acad. Sci. USA 107, 10526–10531 (2010).
31. Culty, M. Gonocytes, the forgotten cells of the germ cell lineage. Birth Defects Res. Part C. Embryo Today Rev. 87, 1–26 (2009).
32. Yoshida, S. et al. The first round of mouse spermatogenesis is a distinctive program that lacks the self-renewing spermatogonia stage. Development 133, 1495–1505 (2006).
33. Heaney, J. D., Michelson, M. V., Youngren, K. K., Lam, M.-Y. J. & Nadeau, J. H. Deletion of eIF2beta suppresses testicular cancer incidence and causes recessive lethality in agouti-yellow mice. Hum. Mol. Genet. 18, 1395–1404 (2009).
34. Skakkebaek, N. E., Berthelsen, J. G., Giwercman, A. & Müller, J. Carcinoma-in-situ of the testis: possible origin from gonocytes and precursor of all types of germ cell tumours except spermatocytoma. Int. J. Androl. 10, 19–28 (1987).
35. La, H. M. et al. Identification of dynamic undifferentiated cell states within the male germline. Nat. Commun. 9, 2819 (2018).
36. Hamra, F. K. et al. Defining the spermatogonial stem cell. Dev. Biol. 269, 393–410 (2004).
37. Endo, T. et al. Periodic retinoic acid-STRA8 signaling intersects with periodic germ-cell competencies to regulate spermatogenesis. Proc. Natl Acad. Sci. USA 112, E2347–E2356 (2015).
38. Anderson, E. L. et al. Stra8 and its inducer, retinoic acid, regulate meiotic initiation in both spermatogenesis and oogenesis in mice. Proc. Natl Acad. Sci. USA 105, 14976–14980 (2008).
39. Kierszenbaum, A. L. & Tres, L. L. Nucleolar and perichromosomal RNA synthesis during meiotic prophase in the mouse testis. J. Cell Biol. 60, 39–53 (1974).
40. Monesi, V. Ribonucleic acid synthesis during mitosis and meiosis in the mouse testis. J. Cell Biol. 22, 521–532 (1964).
41. Daniel, K. et al. Meiotic homologue alignment and its quality surveillance are controlled by mouse HORMAD1. Nat. Cell Biol. 13, 599–610 (2011).
42. Mahadevaiah, S. K. et al. Recombinational DNA double-strand breaks in mice precede synapsis. Nat. Genet. 27, 271–276 (2001).
43. de Vries, F. A. T. et al. Mouse Sycp1 functions in synaptonemal complex assembly, meiotic recombination, and XY body formation. Genes Dev. 19, 1376–1389 (2005).
44. Sleutels, F. et al. The male germ cell gene regulator CTCFL is functionally different from CTCF and binds CTCF-like consensus sites in a nucleosome composition-dependent manner. Epigenetics Chromatin 5, 8 (2012).
45. Ernst, C. et al. Successful transmission and transcriptional deployment of a human chromosome via mouse male meiosis. eLife 5, e20235 (2016).
46. Tang, M. C. W. et al. Contribution of the Two Genes Encoding Histone Variant H3.3 to Viability and Fertility in Mice. PLoS Genet. 11, e1004964 (2015).
47. Marzluff, W. F., Gongidi, P., Woods, K. R., Jin, J. & Maltais, L. J. The Human and Mouse Replication-Dependent Histone Genes. Genomics 80, 487–498 (2002).
48. Zhao, M., Shirley, C. R., Mounsey, S. & Meistrich, M. L. Nucleoprotein transitions during spermiogenesis in mice with transition nuclear protein Tnp1 and Tnp2 mutations. Biol. Reprod. 71, 1016–1025 (2004).
49. Kotaja, N. & Sassone-Corsi, P. Opinion: The chromatoid body: a germ-cell-specific RNA-processing centre. Nat. Rev. Mol. Cell Biol. 8, 85–90 (2007).
50. Sangrithi, M. N. et al. Non-Canonical and Sexually Dimorphic X Dosage Compensation States in the Mouse and Human Germline. Dev. Cell 40, 289–301.e3 (2017).
51. Mueller, J. L. et al. The mouse X chromosome is enriched for multicopy testis genes showing postmeiotic expression. Nat. Genet. 40, 794–799 (2008).
52. Adams, S. R. et al. RNF8 and SCML2 cooperate to regulate ubiquitination and H3K27 acetylation for escape gene activation on the sex chromosomes. PLoS Genet. 14, e1007233 (2018).
53. Greaves, I. K., Rangasamy, D., Devoy, M., Graves, J. A. M. & Tremethick, D. J. The X and Y Chromosomes Assemble into H2A.Z, Containing Facultative Heterochromatin, following Meiosis. Mol. Cell. Biol. 26, 5394–5405 (2006).
54. Tachibana, M., Nozaki, M., Takeda, N. & Shinkai, Y. Functional dynamics of H3K9 methylation during meiotic prophase progression. EMBO J. 26, 3346–3359 (2007).
55. Moretti, C., Vaiman, D., Tores, F. & Cocquet, J. Expression and epigenomic landscape of the sex chromosomes in mouse post-meiotic male germ cells. Epigenetics Chromatin 9, 47 (2016).
56. Hirota, T. et al. SETDB1 links the meiotic DNA damage response to sex chromosome silencing in mice. Dev. Cell 47, 645–659.e6 (2018).
57. Hammoud, S. S. et al. Chromatin and transcription transitions of mammalian adult germline stem cells and spermatogenesis. Cell Stem Cell 15, 239–253 (2014).
58. Kernfeld, E. M. et al. A Single-Cell Transcriptomic Atlas of Thymus Organogenesis Resolves Cell Types and Developmental Maturation. Immunity 48, 1258–1270 (2018).
59. Scialdone, A. et al. Resolving early mesoderm diversification through single cell expression profiling. Nature 535, 289–293 (2016).
60. Guo, J. et al. The adult human testis transcriptional cell atlas. Cell Res. 28, 1141 (2018).
61. Hermann, B. P. et al. The mammalian spermatogenesis single-cell transcriptome, from spermatogonial stem cells to spermatids. Cell Rep. 25, 1650–1667.e8 (2018).
62. Jung, M. et al. Unified single-cell analysis of testis gene regulation and pathology in 5 mouse strains. Preprint at https://www.biorxiv.org/content/10.1101/393769v1 (2018).
63. Lukassen, S., Bosch, E., Ekici, A. B. & Winterpacht, A. Characterization of germ cell differentiation in the male mouse through single-cell RNA sequencing. Sci. Rep. 8, 6521 (2018).
64. Wang, M. et al. Single-cell RNA sequencing analysis reveals sequential cell fate transition during human spermatogenesis. Cell Stem Cell 23, 599–614.e4 (2018).
65. Xia, B. et al. Widespread transcriptional scanning in testes modulates gene evolution rates. Preprint at https://www.biorxiv.org/content/10.1101/282129v2 (2018).
66. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
67. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
68. Royo, H. et al. Evidence that meiotic sex chromosome inactivation is essential for male fertility. Curr. Biol. 20, 2117–2123 (2010).
69. Cloutier, J. M., Mahadevaiah, S. K., ElInati, E., Tóth, A. & Turner, J. Mammalian meiotic silencing exhibits sexually dimorphic features. Chromosoma 125, 215–226 (2016).
70. Rice, W. R. Sexually antagonistic genes: experimental evidence. Science 256, 1436–1439 (1992).
71. O’Doherty, A. et al. An aneuploid mouse strain carrying human chromosome 21 with down syndrome phenotypes. Science 309, 2033–2037 (2005).
72. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
73. Lun, A. T. L., Bach, K. & Marioni, J. C. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 17, 75 (2016).
74. Lun, A. T. L., McCarthy, D. J. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5, 2122 (2016).
75. Xu, C. & Su, Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974–1980 (2015).
76. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).
77. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
78. Lun, A. T. L. & Marioni, J. C. Overcoming confounding plate effects in differential expression analyses of single-cell RNA-seq data. Biostatistics 18, 451–464 (2017).
79. Hastie, T. & Stuetzle, W. Principal Curves. J. Am. Stat. Assoc. 84, 502–516 (1989).
80. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).
81. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
82. Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
83. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
84. Liaw, A. & Wiener, M. Classification and regression by randomForest. R News 2, 18–22 (2002).
85. Lun, A. T. L. & Smyth, G. K. csaw: a Bioconductor package for differential binding analysis of ChIP-seq data using sliding windows. Nucleic Acids Res. 44, e45 (2016).
86. Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-4.0. 2013–2015. (2015).
87. Matzuk, M. M. & Lamb, D. J. Genetic dissection of mammalian fertility pathways. Nat. Cell Biol. 4, S33–S40 (2002).
88. El Kennani, S. et al. MS_HistoneDB, a manually curated resource for proteomic analysis of human and mouse histones. Epigenetics Chromatin 10, 2 (2017).

Acknowledgements

We thank the CRUK-CI core facilities, including Genomics (Paul Coupland and Katarzyna Kania), Flow Cytometry (Jelena Markovic-Djuric and Richard Grenfell) and Histopathology (Julia Jones and Beverley Wilson) cores, and the Biological Resources Unit for technical assistance. We thank Michael Morgan for technical help and Aaron Lun for providing help on the CUT&RUN analysis. We thank Steven Henikoff for kindly providing purified proteinA-MNase protein as well as yeast-spike DNA for CUT&RUN experiments. This research was supported by European Molecular Biology Laboratory (N.E., J.C.M.), Cancer Research UK (C.E., C.P.M.J., D.T.O., J.C.M.), the Wellcome Sanger Institute (C.P.M.J., J.C.M.), the Wellcome Trust (C.E., J.C.M.-grant 105031/Z/14/Z) and the European Research Council (D.T.O.-grant 615584).

Author information

These authors contributed equally: Christina Ernst, Nils Eling.

Authors and Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute, (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Christina Ernst, Nils Eling & John C. Marioni
University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge, CB2 0RE, UK
Christina Ernst, Nils Eling, Celia P. Martinez-Jimenez, John C. Marioni & Duncan T. Odom
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
Celia P. Martinez-Jimenez & John C. Marioni
German Cancer Research Center (DKFZ), Division Signaling and Functional Genomics, 69120, Heidelberg, Germany
Duncan T. Odom

Contributions

C.E., N.E., J.C.M. and D.T.O. designed experiments; C.E. performed all experiments presented in the manuscript, performed imaging analysis and interpreted the data; N.E. performed computational analysis and interpreted the data; C.P.M.J. performed preliminary experiments and provided technical assistance; C.E., N.E., J.C.M. and D.T.O. wrote the manuscript. All authors commented on and approved the manuscript.

Corresponding authors

Correspondence to John C. Marioni or Duncan T. Odom.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Journal peer review information: Nature Communications thanks Dirk de Rooij and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Supplementary Data 11

Description of Additional Supplementary Files

Reporting Summary

Peer Review File

Source Data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ernst, C., Eling, N., Martinez-Jimenez, C.P. et al. Staged developmental mapping and X chromosome transcriptional dynamics during mouse spermatogenesis. Nat Commun 10, 1251 (2019). https://doi.org/10.1038/s41467-019-09182-1

Received: 07 December 2018
Accepted: 15 February 2019
Published: 19 March 2019
DOI: https://doi.org/10.1038/s41467-019-09182-1

This article is cited by

The dynamic genetic determinants of increased transcriptional divergence in spermatids
- Jasper Panten
- Tobias Heinen
- Duncan T. Odom
Nature Communications (2024)

RNA polymerase II pausing is essential during spermatogenesis for appropriate gene expression and completion of meiosis
- Emily G. Kaye
- Kavyashree Basavaraju
- Prabhakara P. Reddi
Nature Communications (2024)

Gene-knockout by iSTOP enables rapid reproductive disease modeling and phenotyping in germ cells of the founder generation
- Yaling Wang
- Jingwen Chen
- Lingbo Wang
Science China Life Sciences (2024)

An organism-wide atlas of hormonal signaling based on the mouse lemur single-cell transcriptome
- Shixuan Liu
- Camille Ezran
- James E. Ferrell
Nature Communications (2024)

FAAP100 is required for the resolution of transcription-replication conflicts in primordial germ cells
- Weiwei Xu
- Yajuan Yang
- Zi-Jiang Chen
BMC Biology (2023)

