TBPL2/TFIIA complex establishes the maternal transcriptome through oocyte-specific promoter usage

During oocyte growth, transcription is required to create RNA and protein reserves to achieve maternal competence. During this period, the general transcription factor TATA binding protein (TBP) is replaced by its paralogue, TBPL2 (TBP2 or TRF3), which is essential for RNA polymerase II transcription. We show that in oocytes TBPL2 does not assemble into a canonical TFIID complex. Our transcript analyses demonstrate that TBPL2 mediates transcription of oocyte-expressed genes, including mRNA survey genes, as well as specific endogenous retroviral elements. Transcription start site (TSS) mapping indicates that TBPL2 has a strong preference for TATA-like motif in core promoters driving sharp TSS selection, in contrast with canonical TBP/TFIID-driven TATA-less promoters that have broader TSS architecture. Thus, we show a role for the TBPL2/TFIIA complex in the establishment of the oocyte transcriptome by using a specific TSS recognition code.

Regulation of transcription initiation by RNA polymerase II (Pol II) is central to all developmental processes. Pol II transcription requires the stepwise assembly of multiprotein complexes called general transcription factors (GTFs) and Pol II 1 . The evolutionarily conserved TFIID complex plays a major role in transcription initiation as it is the first GTF to initiate the assembly of the pre-initiation complex (PIC) by recognising the core promoter 2 . TFIID is a large multiprotein complex composed of the TATA box-binding protein (TBP) and 13 TBP-associated factors (TAFs) in metazoa 3 . The model suggesting that transcription is always regulated by the same transcription complexes has been challenged in metazoans by the discovery of cell-type-specific complexes containing specialised GTF-, TBP- or TAF-paralogs 4 . Two TBP paralogues have been described in vertebrates: TBPL1 (TBP-like factor; TLF, also known as TRF2) has been identified in all metazoan species 5-10 , while TBPL2 (also known as TRF3 or TBP2) has only been described in vertebrates 11,12 . Remarkably, while Tbpl1 and Tbpl2 mutants display embryonic phenotypes in non-mammalian species 7-10,12,13 , Tbpl1 and Tbpl2 loss of function in mouse results in male and female sterility, respectively 14-16 , suggesting that in mammals, these two TBP-like proteins are involved in cell-specific transcription. While TBPL2 shares a high degree of identity (92%) within the conserved saddle-shaped C-terminal DNA-binding core domain of TBP 17 , the C-terminus of TBPL1 is more distant with only 42% identity 12 . As a consequence TBPL2, but not TBPL1, is able to bind canonical TATA box sequences in vitro 5,12,18 . The N-terminal domains of the three vertebrate TBP-related factors do not show any conservation. All three vertebrate TBP-related factors can interact with the GTFs TFIIA and TFIIB, and can mediate Pol II transcription initiation in vitro 12,13,18-20 .
However, how alternative initiation complexes form, how they regulate cell-type-specific transcription and how they recognise promoter sequences remain unknown.
Mapping of transcription start sites (TSSs) at single-nucleotide resolution by Cap Analysis of Gene Expression (CAGE) revealed two main modes of TSS usage 21 . Transcription initiation within a narrow region, called "sharp" (or focused) TSS-type, is common in highly active, tissue-specific gene promoters containing TATA boxes. In contrast, transcription initiation at multiple positions within a region of about 100 bp, called "broad" TSS promoter architecture 21 , is more characteristic of ubiquitously expressed and developmentally regulated genes (reviewed in ref. 22 ). During the zebrafish maternal to zygotic transition, two TSS-defining grammars were described to coexist in core promoters of constitutively expressed genes, enabling their expression in the two regulatory environments 23 . Maternally active promoters in zebrafish tend to be sharp, with TATA-like, AT-rich (W-box) upstream elements guiding TSS selection, while embryonically active broad promoter architectures of the same genes appear to be regulated by nucleosome positioning. Although a number of germ cell-specific, as well as somatic transcriptional regulators, have been well characterised during folliculogenesis (reviewed in ref. 24 ), the exact actors and mechanisms required for setting up the oocyte-specific transcriptome have not yet been identified in vertebrates.
Female germ cells develop during oogenesis leading to the formation of a highly differentiated and specialised cell, the oocyte. In females, oocytes enter meiosis during embryonic life. Quiescent primordial follicles composed of meiotically arrested oocytes at the late diplotene stage and surrounded by granulosa cells are formed perinatally in mice (reviewed in ref. 24 ). Shortly after birth, some primordial follicles enter folliculogenesis and undertake a growth phase during which an oocyte-specific transcriptome is set up, and oocytes increase their size until the pre-antral follicular stage 25 . A remarkable feature of oocytes is the very high expression of retrotransposons driven by Pol II transcription. These elements are interspersed repetitive elements that can be mobile in the genome. One of the three major classes of retrotransposons in mammals is the long terminal repeat (LTR) retrotransposons derived from retroviruses, also known as endogenous retroviruses (ERVs), which are subdivided into three sub-classes: ERV1, ERVK and endogenous retrovirus-like ERVL-MaLR (mammalian apparent LTR retrotransposons) (reviewed in ref. 26 ). Transcription of mobile elements in specific cell types depends on the presence of a competent promoter recognition transcription machinery and/or the epigenetic status of the loci where these elements have been incorporated. Remarkably, MaLRs encode no known proteins, but MaLR-dependent transcription is key in initiating synchronous developmentally regulated transcription to reprogramme the oocyte genome during growth 27 .
Remarkably, during oocyte growth, TBP protein is absent and replaced by TBPL2 28 . Indeed, TBP is only expressed up to the primordial follicular oocytes and becomes undetectable at all subsequent stages during oocyte growth. In contrast, TBPL2 is highly expressed in the growing oocytes, suggesting that TBPL2 is replacing TBP for its transcription initiating functions during folliculogenesis 28 . In agreement with its oocyte-specific expression, a crucial role of TBPL2 for oogenesis was demonstrated in Tbpl2 −/− females, which show sterility due to a defect in secondary follicle production 16,29 . In the absence of TBPL2, immunofluorescent staining experiments showed that elongating Pol II and histone H3K4me3 methylation signals were abolished between the primary and secondary follicle stage oocytes, suggesting that Pol II transcription was impaired 16 . Initially, TBPL2/TRF3 was suggested to be expressed during muscle differentiation 30 , but this observation was later invalidated 16,29 . Altogether, the available data suggested that TBPL2 is playing a specialised role during mouse oocyte development. However, how TBPL2 regulates oocyte-specific transcription and what the composition of the associated transcription machinery is remained unknown.
Here, we demonstrate that in oocytes TBPL2 does not assemble into a canonical TFIID complex, while it stably associates with TFIIA. The observation that the oocyte-specific deletion of Taf7, a TFIID-specific TAF, does not influence oocyte growth and maturation, corroborates the lack of TFIID in growing oocytes. Our transcriptomics analyses in wild-type and Tbpl2 −/− oocytes show that TBPL2 mediates transcription of oocyte-expressed genes, including mRNA destabilisation factor genes, as well as MaLR ERVs. Our transcription start site (TSS) mapping from wild-type and Tbpl2 −/− growing oocytes demonstrates that TBPL2 has a strong preference for TATA-like motif in gene core promoters driving specific sharp TSS selection. This is in marked contrast with TBP/TFIID-driven TATA-less gene promoters in preceding stages that have broad TSS architecture. Our results show a role for the TBPL2-TFIIA transcription machinery in a major transition of the oocyte transcriptome mirroring the maternal to zygotic transition that occurs after fertilisation, completing a full germline cycle.
To further analyse the requirement of TFIID during oocyte growth, we carried out a conditional depletion of the TFIID-specific Taf7 gene during oocyte growth using the Zp3-Cre transgenic line 33 (Supplementary Fig. 1c-g). Remarkably, TAF7 is only detected in the cytoplasm of growing oocytes (Supplementary Fig. 1c). The oocyte-specific deletion of Taf7 did not affect the presence of secondary and antral follicles and the numbers of collected mature oocytes after superovulation (Fig. 1g, h and Supplementary Fig. 1f). The lack of phenotype is not due to an inefficient deletion of Taf7, as TAF7 immunolocalization is impaired (Supplementary Fig. 1d, e), and as oocyte-specific Taf7 mutant females are severely hypofertile (Supplementary Fig. 1g). The observations that TBP is not expressed in growing oocytes, and that the oocyte-specific deletion of Taf7 abolishes the cytoplasmic localisation of TAF7, but does not influence oocyte growth, show that canonical TFIID does not assemble in the nuclei of growing oocytes. Thus, our results together demonstrate that during oocyte growth a stable TBPL2-TFIIA complex forms, and may function differently from TBP/TFIID.
In order to further characterise the composition of the TBPL2-TFIIA complex, we took advantage of NIH3T3 cells artificially overexpressing TBPL2 (NIH3T3-II10 cells 28 ). In this context where TBP and TAFs are present, TFIID is efficiently pulled down by an anti-TBP IP, but no interaction with TFIIA could be detected (Fig. 2b). Interestingly, the anti-TBPL2 IP showed that the artificially expressed TBPL2 can incorporate into TFIID-like complexes, as TAFs were co-IP-ed (Fig. 2a), albeit with much lower stoichiometry (NSAF values) than that of TBP (Fig. 2b). In contrast, strong interactions with TFIIA-αβ and TFIIA-γ were detected, suggesting that the TBPL2-TFIIA complex can be formed in the NIH3T3-II10 cells and that TBPL2, in contrast to TBP, has the intrinsic ability to interact with TFIIA. Remarkably, in spite of the high similarity between the core domains of TBP and TBPL2, no interaction with Pol I-associated SL1 (TAF1A-D) and Pol III-associated TFIIIB (BRF1) complexes 34 could be detected in the anti-TBPL2 IPs either in NIH3T3-II10 cells or in ovary WCEs (Fig. 2a and Supplementary Data 1). In contrast, in the same extracts TBP associates with these Pol I and Pol III complexes (Fig. 2b and Supplementary Data 2), suggesting that TBPL2 is not involved in Pol I and Pol III transcription initiation in the growing oocytes.
To analyse whether TBPL2 associates with TFIID TAFs and TFIIA in the same complex, we performed a gel filtration analysis of NIH3T3-II10 WCE. The profile indicated that most of the TBPL2 and TFIIA could be found in the same fractions (22-26) eluting around 150-200 kDa, while TBPL2 protein was below the detection threshold of the western blot assay in the TAF6-containing fractions 9-15 (Fig. 2c). To verify that TBPL2 and TFIIA are part of the same complex in fractions 22-26, we IP-ed TBPL2 from these pooled fractions and subjected them to mass spectrometric analysis. Our data confirmed that in these fractions eluting around 170 kDa, TBPL2 and TFIIA form a stable complex that does not contain any TAFs (Fig. 2d and Supplementary Data 4). Thus, all these experiments together demonstrate that TBPL2/TFIIA form a stable complex in oocytes, where TBP is not expressed and TBPL2/TFIIA is the only promoter-recognising transcription complex that could direct Pol II transcription initiation (see the summary of all the IPs in Fig. 2e).
TBPL2-dependent oocyte transcriptome. To characterise the growing oocyte-specific transcriptome and its dependence on TBPL2, we performed a transcriptomic analysis of wild-type (WT) and Tbpl2 −/− oocytes isolated from primary (P7) and secondary (P14) follicles (Figs. 3, 4 and Supplementary Fig. 2, Supplementary Data 5). We observed the downregulation of a high number of oocyte-specific genes, out of which Bmp15 and Gdf9 served as internal controls 35,36 , as they were already described to be regulated by TBPL2 16 (Fig. 3a, b and Supplementary Fig. 2a). The principal component analysis showed that the four distinct RNA samples clustered in individual groups and that the main explanation for the variance is the genotype, and then the stage (Supplementary Fig. 2b). Comparison of the RNA-level fold changes between mutant and WT oocytes showed that in Tbpl2 −/− , there is a massive downregulation of the most highly expressed transcripts, both at P7 and P14 (Supplementary Fig. 2c). The Pearson correlation between the P7 and P14 fold change datasets for transcripts expressed above 100 normalised reads was close to 0.8 (Supplementary Fig. 2c), indicating that Tbpl2 loss of function similarly altered RNA levels at P7 and P14 stages. We, therefore, focused on the P14 stage for the rest of the study.
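The stage comparison described above (a Pearson correlation between P7 and P14 fold changes, restricted to transcripts above 100 normalised reads) can be illustrated with a minimal Python sketch. The record layout, the function names and the choice of expression value used for the filter are assumptions made for illustration only; the text does not specify how the filter was implemented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def fold_change_correlation(records, min_reads=100.0):
    """Correlate P7 and P14 log2 fold changes for sufficiently expressed transcripts.

    records: iterable of (normalised_reads, log2fc_p7, log2fc_p14) tuples.
    This layout is hypothetical; only transcripts with normalised_reads
    strictly above min_reads are kept, mirroring the threshold in the text.
    """
    kept = [(p7, p14) for reads, p7, p14 in records if reads > min_reads]
    return pearson([p7 for p7, _ in kept], [p14 for _, p14 in kept])
```

Filtering before correlating keeps weakly expressed transcripts, whose fold changes are noisy, from dominating the coefficient.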
In WT P14 oocytes, transcripts corresponding to 10791 genes were detected. Importantly, many of these detected transcripts have been transcribed at earlier stages and are stored in growing oocytes 37 . As there is no Pol II transcription in Tbpl2 −/− growing oocytes 16 , RNAs detected in the Tbpl2 −/− mutant oocytes represent mRNAs transcribed by a TBP/TFIID-dependent mechanism and deposited into the growing oocytes independently of TBPL2 activity at earlier stages, i.e., at the primordial follicular stage, where TBP is still expressed. The proportion of genes (1396) upregulated following Tbpl2 deletion (Fig. 3c) can be explained in two non-mutually exclusive ways: (i) by the normalisation to the library size, which results in a slight overestimation of upregulated transcripts and underestimation of downregulated transcripts, and/or (ii) by transcript buffering mechanisms due to mRNA stabilisation 38 . Validation of the upregulation of some candidate transcript levels (Supplementary Fig. 2d, e) strongly supports the latter hypothesis (but see also the next paragraph).
Nevertheless, we detected 1802 significantly downregulated transcripts in the Tbpl2 −/− oocytes (Fig. 3c). The downregulation of key genes known to be expressed during oocyte growth, such as Bmp15, Eloc, Fgf8, Gdf9 and Zar1 35,36,39 , was confirmed by RT-qPCR (Supplementary Fig. 2f, g). These results suggest that TBPL2 has an important role in gene expression in the growing oocytes. Gene Ontology (GO) analyses of the biological process of the identified downregulated categories of genes (Supplementary Data 6) indicated that many genes involved in meiosis II and distinct cell cycle processes were significantly downregulated (Supplementary Fig. 2h). The most enriched molecular function GO category was "poly(A)-specific ribonuclease activity" containing many genes coding for factors or subunits of complexes contributing to deadenylation/decapping/decay activity in eukaryotes (Fig. 3d) (i.e., CCR4-NOT, PAN2/PAN3 40 ; DCP1A/DCP2 41 or BTG4 39 ). In good agreement with the transcriptome analyses, transcripts coding for these "poly(A)-specific ribonuclease activity" factors were significantly downregulated in Tbpl2 −/− mutant P14 oocytes when tested by RT-qPCR (Fig. 3e and Supplementary Fig. 2i). Thus, in P14 oocytes TBPL2 is regulating the transcription of many genes coding for factors, which are in turn crucial in regulating the stability and translation of the mRNA stock deposited during early oogenesis, as well as transcription of meiosis II- and cell cycle-related genes to prepare the growing oocytes for the upcoming meiotic cell division.
A remarkable feature of oocytes is the very high expression of retrotransposons driven by Pol II transcription (see "Introduction"). As expected, in WT P7 and P14 oocytes, the expression of ERVs was found to be the most abundant 27,42 (Supplementary Fig. 3a-c). Importantly, the transcription of the vast majority of MaLR elements was the most affected in Tbpl2 −/− mutant oocytes at P7 and P14 (Fig. 4). Among them, three highly expressed members, MT-int, MTA_Mm and MTA_Mm-int, were dramatically downregulated in P7 and P14 Tbpl2 −/− mutant oocytes ( Supplementary Fig. 3d, e). As in P14 oocytes, TBPL2 depletion is reducing transcription more than fourfold from MaLR ERVs, which often serve as promoters for neighbouring genes 27,42 , TBPL2 could seriously deregulate oocyte-specific transcription and consequent genome activation.
This demonstrates that TBPL2 is orchestrating the de novo restructuration of the maternal transcriptome and that TBPL2 is crucial for indirectly silencing the translation of the earlier deposited TBP-dependent transcripts.
In contrast, only about 1/3rd of the TBPL2-independent TSS clusters contained WW-enriched motifs at a similar position (Fig. 5b, red arrowhead), as would be expected from promoters that lack maternal promoter code determinants 23,44 . As TATA boxes are often associated with tissue-specific gene promoters, we investigated whether the above observed WW motif densities correspond to TATA boxes using the TBP position weight matrix (PWM) from the JASPAR database as a reference. To this end, the presence of TATA boxes was analysed in the TSS clusters of the two datasets and revealed that TBPL2-dependent TSS clusters were enriched in high-quality TATA boxes, including a clear increase in the proportion of canonical TATA boxes, when compared to TBPL2-independent TSS clusters (Fig. 5c). Genome browser view snapshots indicate that TSS clusters in P14 WT oocytes tend to be sharp and are associated with TATA-like motifs (Supplementary Fig. 4a, b). Analysis of the global distribution of the number of TSSs and of the width of the TSS clusters in the above-defined two categories confirmed that TBPL2-dependent TSS clusters are sharper compared to the TBPL2-independent TSS clusters (Supplementary Fig. 4c, d).
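To make the notion of TSS cluster width concrete, the following Python sketch computes an interquantile width from per-position CAGE tag counts, a measure commonly used to separate sharp from broad clusters. The 10%-90% quantiles, the 10 bp cutoff and all function names are assumptions for illustration; the text does not state which width definition or threshold was used.

```python
def interquantile_width(tag_counts, q_low=0.1, q_high=0.9):
    """Width (bp) of the region holding the central share of CAGE tags.

    tag_counts: dict mapping genomic position -> tag count for one TSS cluster.
    The width spans from the position where the cumulative tag count first
    reaches q_low of the total to the position where it first reaches q_high.
    """
    positions = sorted(p for p, c in tag_counts.items() if c > 0)
    total = sum(tag_counts[p] for p in positions)
    if total == 0:
        raise ValueError("cluster contains no tags")
    cumulative = 0
    low_pos = None
    for pos in positions:
        cumulative += tag_counts[pos]
        if low_pos is None and cumulative >= q_low * total:
            low_pos = pos
        if cumulative >= q_high * total:
            return pos - low_pos + 1
    return positions[-1] - low_pos + 1  # fallback for floating-point rounding

def classify_cluster(tag_counts, sharp_max_width=10):
    """Label a cluster 'sharp' or 'broad' using a hypothetical width cutoff."""
    width = interquantile_width(tag_counts)
    return "sharp" if width <= sharp_max_width else "broad"
```

A cluster with nearly all tags on one nucleotide gets a width of 1 bp, whereas evenly spread initiation over tens of nucleotides gives a width close to the cluster length.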
In order to test whether TBPL2 controls transcription initiation from maternal promoter code determinants, we grouped the expression profiles corresponding to each consensus TSS cluster, to characterise promoter activity profiles among datasets by performing self-organising maps (SOMs) 45 (Supplementary Fig. 4e). We then focussed on the two most distinct SOM groups: the downregulated promoters (blue group, containing 9442 consensus TSS clusters) (Fig. 5d) and the upregulated promoters (red group, with 6900 consensus TSS clusters) (Fig. 5e). Motif analyses of these two categories of promoters in their −35/+5 regions relative to the different dominant TSSs indicated that only the core promoters associated with TBPL2-dependent dominant TSSs belonging to the downregulated gene promoters contain a well-defined 7 bp long TATA box-like motif (W-box) in their −31 to −24 regions (Fig. 5f, g and Supplementary Fig. 4f-i). Importantly, W-box-associated TSS architecture usage distribution for these TBPL2-dependent dominant TSSs was sharp (Supplementary Fig. 4j, l), as expected for motif-dependent transcriptional initiation 23,44 . In contrast, TBPL2-independent TSSs belonging to the upregulated promoters exert a much broader TSS pattern (Supplementary Fig. 4k, m). Interestingly, GO analyses of the genes associated with the downregulated promoters revealed a strong association with deadenylation/decapping/decay activity (Supplementary Fig. 4n-p, Supplementary Data 7), further confirming our initial RNA-seq analysis observations (Fig. 3).
Importantly, TSS architecture analyses of the TBPL2-dependent MaLR ERV TSSs indicated that the majority of MaLR core promoters contain a high-quality TATA box motif (median of the TATA box PWM match is 85%, Fig. 5h-j). These observations together demonstrate that the TBPL2/TFIIA complex drives transcription initiation primarily from core promoters that contain a TATA box-like motif and directs sharp transcription initiation from the corresponding promoter regions to overhaul the growing oocyte transcriptome.
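A PWM match percentage such as the one quoted above is usually obtained by rescaling a log-odds score between the worst and best scores the matrix can give. The Python sketch below shows that calculation under stated assumptions: the count matrix is a made-up 6 bp TATAAA-like toy, not the JASPAR TBP matrix, and the pseudocount, uniform background and function names are illustrative choices rather than the settings used in the study.

```python
import math

# Hypothetical toy count matrix for a TATAAA-like motif.
# This is NOT the JASPAR TBP matrix; it only illustrates the arithmetic.
TOY_COUNTS = {
    "A": [2, 18, 1, 17, 16, 15],
    "C": [1, 0, 1, 0, 1, 1],
    "G": [1, 0, 1, 0, 1, 2],
    "T": [16, 2, 17, 3, 2, 2],
}

def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert a count matrix into a list of per-position log2-odds columns."""
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def percent_match(pwm, kmer):
    """Score of kmer rescaled to 0-100% between the matrix minimum and maximum."""
    score = sum(col[base] for col, base in zip(pwm, kmer))
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    return 100.0 * (score - lowest) / (highest - lowest)

def best_match(pwm, sequence):
    """Return (percent, offset, kmer) of the best-scoring window, or None."""
    k = len(pwm)
    sequence = sequence.upper()
    best = None
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if any(base not in "ACGT" for base in kmer):
            continue  # skip windows containing N or other ambiguity codes
        pct = percent_match(pwm, kmer)
        if best is None or pct > best[0]:
            best = (pct, i, kmer)
    return best
```

Scanning the region upstream of each dominant TSS with `best_match` and taking the median of the returned percentages across promoters would give a summary comparable in spirit to the median match reported above.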
In addition, we observed that TSS usage can shift within the promoter of individual genes depending on the genetic background (Supplementary Fig. 4b). To get more insights into these promoter architecture differences, we identified 6429 shifting promoters genome-wide by comparing TBPL2-dependent to TBPL2-independent TSS data. These results are consistent with TSS shifts between TBP/TFIID-dependent somatic-like and maternal promoter codes occurring either in 5′ or 3′ directions (Fig. 6a and Supplementary Fig. 4q) 44 . WW motif analysis indicated that on each shifting promoter, TBPL2-dependent dominant TSSs are associated with WW motifs, while TBPL2-independent dominant TSSs are not (Fig. 6b). In addition, the TATA box PWM match analyses indicated that these WW motifs are enriched in TATA box-like elements compared to the corresponding TBPL2-independent shifting TSSs (Fig. 6c). Thus, our experiments provide a direct demonstration that TBP/TFIID and TBPL2/TFIIA machineries recognise two distinct sequences co-existing in promoters of the same genes, with TBPL2 directing a stronger WW/TATA box-dependent sharp TSS selection in them.
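The WW motif analysis referred to above can be pictured as a per-position dinucleotide frequency over promoter sequences aligned on their dominant TSS. The Python sketch below is one simple way to compute such a profile; the function name and the input layout (equal-length sequences, already aligned) are assumptions for illustration and not a description of the pipeline used in the study.

```python
def ww_frequency(sequences):
    """Fraction of sequences carrying a WW dinucleotide at each position.

    sequences: equal-length DNA strings aligned on the dominant TSS.
    W stands for A or T, so position i counts as WW when both base i and
    base i + 1 are A or T. Returns a list of length len(sequence) - 1.
    """
    if not sequences:
        raise ValueError("no sequences given")
    seqs = [s.upper() for s in sequences]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must have equal length")
    frequencies = []
    for i in range(length - 1):
        hits = sum(1 for s in seqs if s[i] in "AT" and s[i + 1] in "AT")
        frequencies.append(hits / len(seqs))
    return frequencies
```

Plotting this profile separately for TBPL2-dependent and TBPL2-independent dominant TSSs would show whether a WW peak sits at a fixed distance upstream of one set but not the other.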

Discussion
In this study, we show that a unique basal transcription machinery composed of TBPL2 associated with TFIIA is controlling transcription initiation during oocyte growth, orchestrating a transcriptome change prior to fertilisation using an oocyte-specific TSS usage.
TBPL2 expression in mice is limited to the oocytes and in its absence, oocytes fail to grow and Tbpl2 −/− mouse females are sterile 16,28 . In a mirroring situation, TBPL1 (TRF2) expression is enriched during spermatogenesis, and male germ cells lacking TBPL1 are blocked at the transition from late-round spermatids to early elongating spermatids 14 . A parallel between TBPL2 and TBPL1 is that both TBP-type factors form endogenous stable complexes with TFIIA. The beginning of TBPL2 accumulation in the oocyte nuclei or TBPL1 accumulation in male germ cell nuclei coincides with the phase of meiosis I 15,28,46 . It is thus conceivable that TBPL2-TFIIA in oocytes or TBPL1-TFIIA during spermatogenesis are involved in the control of gene expression in a meiotic context to set up the corresponding transcriptome. Interestingly, both transcription complexes seem to function in a compacted chromatin environment in which TBP/TFIID probably cannot. However, while TBPL2 and TBP show contrasting expression patterns in the oocytes 28 , TBPL1 and TBP are co-expressed in spermatids 46,47 and it has been suggested that TBPL1 is a testis-specific subunit of TFIIA that is recruited to PIC containing TFIID and might not primarily act independently of TFIID/TBP to control gene expression in round spermatids 48 . While TBPL1 forms a complex also with the TFIIA-αβ paralogue, ALF, in testis 48-50 , TBPL2 does not stably associate with ALF, in spite of the fact that it is expressed in oocytes 50 . TBP-like factors are bipartite proteins with variable N-terminal domains and relatively well-conserved shared C-terminal domains (core domains) forming a saddle-like structure with a concave surface that is known to bind to DNA 17 . Interestingly, TBPL1 has a very short N-terminal domain 5,18 , suggesting that it lost some abilities to interact with partners. Our data suggest that despite their very high similarity (92% identity between the core domains of TBP and TBPL2; reviewed in ref. 51 ), TBP and TBPL2 display different properties as they seem to recognise different DNA sequences to regulate gene promoters with different promoter architectures. Our IP-MS analyses from ovary WCE indicate that contrary to TBP, TBPL2 does not interact with TAFs in growing oocytes. Our analyses in the NIH3T3-II10 cells that overexpress TBPL2 showed that TBPL2 can interact with TAFs in this artificial situation, albeit with less affinity compared to TFIIA, or TBP-TAFs interactions. Our transcriptomic data indicate that all Taf mRNAs, except Taf7l, are detected in growing oocytes (Supplementary Data 5). However, whether they are also expressed in oocytes at the protein level is not yet known, except for TAF4B that has been detected in female neonate oocytes 52 .
Nevertheless, our data suggest that TAF7 is expressed, but localised to the cytoplasm. It is conceivable that, similarly to Tbp mRNA that is transcribed, but not translated in oocytes 53 , Taf mRNA translations (other than Taf7) are also inhibited and thus, the canonical TFIID, or its building blocks, cannot be assembled, and as a result, the canonical TFIID is not present in the nuclei of growing oocytes. Another reason why TBPL2 does not interact with TAFs or ALF, but rather interacts with TFIIA could be its N-terminal domain that is very different from that of TBP (only 23% identity 51 ).
TBPL2 proteins from different vertebrates show a high degree of similarity in their C-terminal core domains amongst themselves, but display very little conservation in their N-terminal domains 12 . It is interesting to note that TBPL2 deficiency leads to embryonic phenotypes in Xenopus 13 and zebrafish 12 , because, contrary to the mouse, TBPL2 is still present in the embryo after fertilisation and thus may act in parallel with TBP in the transcription of specific embryonic genes 10,54 . The molecular mechanism by which TBPL2 controls the transcription of these specific sets of genes in frogs and in fish has not been studied. On the contrary, TBPL2 in mammals is only expressed in growing oocytes and the only phenotype that can be observed in mammals is female sterility 16,29 .
LTR retrotransposons, also known as ERVs, constitute ~10% of the mouse genome (reviewed in ref. 55 ). While their expression is generally suppressed by DNA methylation and/or repressive histone modifications, a subset of ERV subfamilies retains transcriptional activity in specific cell types 56 . ERVs are especially active in germ cells and early embryos (reviewed in ref. 26 ). Indeed, many genome-wide transcripts are initiated in LTRs, for example in MaLRs in mouse oocytes, which constitute ~5% of the genome 57 . Members of the MT subfamily of MaLRs are particularly active in oocytes and hundreds of MT LTRs have been co-opted as oocyte-specific gene promoters 27,58 . As LTR-initiated transcription units also shape the oocyte methylome, it will be important to analyse how TBPL2 influences DNA methylation in oocytes.
Oocytes display remarkable post-transcriptional regulatory mechanisms that control mRNA stability and translation. Interestingly, TBPL2 is regulating the activity of several deadenylation/decapping/decay complexes and in the absence of TBPL2, we observed apparent stabilisation of a significant number of transcripts, suggesting that in wild-type oocytes TBPL2 is indirectly inhibiting the translation of mRNAs, and/or inducing the degradation of the mRNAs, previously transcribed by TFIID/TBP-driven Pol II and deposited in the primordial follicular oocytes (Fig. 7). To put in place the growing oocyte-specific maternal transcriptome TBPL2 is controlling the production of new mRNAs using a maternal-specific TSS grammar, as most of these transcripts will remain in the oocyte after transcriptional quiescence. Remarkably, as TBPL2 does not interact with Pol I and Pol III transcription machineries in the growing oocytes, this strongly suggests that rRNA and tRNA are deposited very early during oogenesis in amounts sufficient for the initiation of development. Therefore, it seems that TBPL2 contributes to establishing a novel TBPL2-dependent growing oocyte transcriptome and consequent proteome required for further development and oocyte competence for fertilisation (Fig. 7). The indirect regulation of previously deposited mRNAs by a global transcription regulator resembles the well-characterised maternal to zygotic transition (MZT), during which clearance of the inherited transcriptome is mediated by de novo gene products generated by newly activated transcription machinery (reviewed in ref. 59 ). At hundreds of gene promoters, two distinct TSS-defining "grammars" coexist in close proximity genome-wide and are differentially utilised either by TBPL2/TFIIA in primary/secondary follicular oocytes, or by TBP/TFIID in primordial follicular oocytes or in the fertilised embryo.
This again shows a striking parallel to MZT 23 , where multiple layers of information are embedded in the same promoter sequence, each representing a different type of regulatory grammar interpreted by dedicated transcription machinery depending on the cellular environment.

Methods
Cell lines and cell culture. The NIH3T3-II10 line overexpressing TBPL2 and the control NIH3T3-K2 have already been described 28 and were maintained in high glucose DMEM supplemented with 10% of new-born calf serum at 37°C in 5% CO 2 .
Whole-cell extracts. NIH3T3-II10 and NIH3T3-K2 cells cultured in 15-cm dishes were washed twice with 1× PBS and subsequently harvested by scraping on ice. Harvested cells were centrifuged at 1000 × g at 4°C for 5 min and then resuspended in one packed cell volume of whole-cell extraction buffer (20 mM Tris-HCl pH 7.5, 2 mM DTT, 20% glycerol, 400 mM KCl, 1× protease inhibitor cocktail (PIC, Roche)). Cell lysates were frozen in liquid nitrogen and thawed on ice three times, followed by centrifugation at 20,817 × g, at 4°C for 15 min. The supernatant was collected, and protein concentration was measured by Bradford protein assay (Bio-Rad). The cell extracts were used directly for immunoprecipitation and western blot, or stored at −80°C.
Ovaries collected from postnatal day 14 (P14) CD1 and C57BL/6N female mice were homogenised in whole-cell extraction buffer [20 mM Tris-HCl pH 7.5, 2 mM DTT, 20% glycerol, 400 mM KCl, 5× PIC (Roche)]. Cell lysates were frozen in liquid nitrogen and thawed on ice three times, followed by centrifugation at 20,817 × g, at 4°C for 15 min. The supernatant extracts were used directly for immunoprecipitation.
Antibodies and antibody purification. The antibodies are listed in Supplementary Table 1. The IGBMC antibody facility raised the anti-TBPL2 polyclonal 3024 serum against the CPDEHGSELNLNSNSSPDPQ peptide (amino acids 111-129) coupled to ovalbumin and injected into one 2-month-old female New Zealand rabbit. The resulting serum was affinity purified by using the Sulfolink Coupling Gel (Pierce) following the manufacturer's recommendations. Proteins were eluted with 0.1 M glycine pH 2.8 and neutralised with 1.5 M Tris-HCl pH 8.8. Immunoprecipitations from whole-cell extracts of NIH3T3-II10 and NIH3T3-K2 cells followed the same procedure with protein G Sepharose beads (GE Healthcare): 18 µg of rabbit anti-TBPL2 (3024) and 15 µg of anti-TBP per IP.
Mass spectrometry analyses and NSAF calculations. Samples were TCA precipitated, reduced, alkylated, and digested with LysC and Trypsin at 37°C overnight. After C18 desalting, samples were analysed using an Ultimate 3000 nano-RSLC (Thermo Scientific, San Jose, CA) coupled in line with a linear trap Quadrupole (LTQ)-Orbitrap ELITE mass spectrometer via a nano-electrospray ionisation source (Thermo Scientific). Peptide mixtures were loaded on a C18 Acclaim PepMap100 trap column (75-μm inner diameter × 2 cm, 3 μm, 100 Å; Thermo Fisher Scientific) for 3.5 min at 5 μL/min with 2% acetonitrile (ACN), 0.1% formic acid (FA) in H 2 O and then separated on a C18 Accucore nano-column (75-μm inner diameter × 50 cm, 2.6 μm, 150 Å; Thermo Fisher Scientific) with a 240-min linear gradient from 5% to 50% buffer B (A: 0.1% FA in H 2 O/B: 80% ACN, 0.08% FA in H 2 O) followed by 10 min at 99% B. The total duration was set to 280 min at a flow rate of 200 nL/min. Proteins were identified by database searching using SequestHT with Proteome Discoverer 1.4 software (Thermo Fisher Scientific) against a combined Mus musculus database generated using Uniprot [https://www.uniprot.org/uniprot/?query=proteome:UP000000589&sort=score] (Swissprot, release 2015_11, 16730 entries) to which five protein sequences of interest (TrEMBL entries: TAF4, ATXN7L2, TADA2B, BTAF1 and SUPT3) were added. Precursor and fragment mass tolerances were set at 7 ppm and 0.5 Da, respectively, and up to two missed cleavages were allowed. Oxidation (M) was set as variable modification and Carbamidomethylation (C) as fixed modification. Peptides were filtered with a false discovery rate (FDR) at 5%, rank 1 and proteins were identified with one unique peptide. Normalised spectral abundance factors (NSAF) 31 were calculated using custom R scripts (R software version 3.5.3). Only proteins detected in at least two out of three of the technical or biological replicates were considered for further analyses.
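For readers unfamiliar with NSAF, the standard definition divides each protein's spectral count by its length and then normalises by the sum of these ratios over all proteins in the sample. The Python sketch below illustrates that arithmetic together with the "two out of three replicates" filter mentioned above. It is not the authors' custom R script; the function names and input layouts are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalised spectral abundance factor for each protein in one sample.

    spectral_counts: dict protein -> spectral count in the sample.
    lengths: dict protein -> protein length in amino acids.
    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j), so values sum to 1.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra in sample")
    return {p: value / total for p, value in saf.items()}

def detected_in_enough_replicates(replicate_counts, min_replicates=2):
    """Keep proteins with a non-zero count in at least min_replicates replicates.

    replicate_counts: dict protein -> list of spectral counts, one per replicate.
    """
    return {
        protein for protein, counts in replicate_counts.items()
        if sum(1 for c in counts if c > 0) >= min_replicates
    }
```

Dividing by length corrects for the fact that longer proteins yield more peptides, which is what makes NSAF values usable as a rough stoichiometry estimate within one IP.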
Gel filtration. A Superose 6 (10/300) column was equilibrated with buffer consisting of 25 mM Tris-HCl pH 7.9, 5 mM MgCl2, 150 mM KCl, 5% glycerol, 1 mM DTT and 1× PIC (Roche). Five hundred μL of whole-cell extracts containing ∼5 mg of protein were injected into an ÄKTA avant chromatography system (Cytiva) and run at 0.4 mL/min. Protein detection was performed by absorbance at 280 nm and 260 nm. Five hundred μL fractions were collected and analysed by western blot and IP-MS.
Animal experimentation. Animal experiments were carried out according to the animal welfare regulations and guidelines of the French Ministry of Agriculture, and procedures were approved by the French Ministry for Higher Education and Research ethical committee C2EA-17 (project n°2018031209153651). The Tg(Zp3-Cre), Taf7 flox and Tbpl2 mouse lines have already been described 16,33,60 .
Superovulation. Five units of pregnant mare serum (PMS) were injected intraperitoneally into 4-week-old female mice between 2 and 4 pm. After 44-46 h, GV oocytes were collected from the ovaries by puncturing with needles.
Oocyte collection. After dissection, ovaries were freed from adhering tissues in 1× PBS. Series of six ovaries were digested in 500 µL of 2 mg/mL collagenase (SIGMA), 0.025% trypsin (SIGMA) and 0.5 mg/mL type IV-S hyaluronidase (SIGMA), on a ThermoMixer (Eppendorf) with gentle agitation for 20 min. The digestion was then stopped by the addition of 1 mL of αMEM containing 5% FBS, pre-warmed to 37°C. The oocytes were then size-selected under a binocular microscope.
RNA preparation. Pools of 100-200 collected oocytes were washed through several M2 drops, and total RNA was isolated using the NucleoSpin RNA XS kit (Macherey-Nagel) according to the user manual. RNA quality and quantity were evaluated using a Bioanalyzer. Between 5 and 10 ng of RNA was obtained from each pool of oocytes.
Reads were preprocessed to remove adapter, poly(A) and low-quality sequences (Phred quality score below 20). After this preprocessing, reads shorter than 40 bases were discarded from further analysis. These preprocessing steps were performed using cutadapt version 1.10 61 . Reads were mapped to spike sequences using Bowtie2 version 2.2.8 62 , and reads mapping to spike sequences were removed from further analysis. The remaining reads were then mapped onto the mm10 assembly of the Mus musculus genome using STAR version 2.7.0f 63 . Gene expression quantification was performed from uniquely aligned reads using htseq-count version 0.9.1 64 , with annotations from Ensembl version 96 and "union" mode. Read counts were normalised across samples with the median-of-ratios method to make them comparable between samples, and differential gene expression analysis was performed using DESeq2 version 1.22.2 65 . All the figures were generated using R software version 3.5.3.
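The median-of-ratios normalisation mentioned above can be illustrated with a short Python sketch. It follows the general published description of the method (per-gene geometric mean as a pseudo-reference, then the per-sample median of the count-to-reference ratios as the size factor). It is not the DESeq2 implementation, and the function names and input layout are hypothetical.

```python
import math
from statistics import median
from typing import Dict, List


def size_factors(counts: Dict[str, List[float]]) -> List[float]:
    """Median-of-ratios size factors; `counts` maps gene -> raw counts per sample."""
    n_samples = len(next(iter(counts.values())))
    log_ratios: List[List[float]] = [[] for _ in range(n_samples)]
    for row in counts.values():
        # Genes with a zero count in any sample have no finite log geometric
        # mean and are left out of the size-factor estimate.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # Size factor of a sample: median ratio to the pseudo-reference.
    return [math.exp(median(r)) for r in log_ratios]


def normalise(counts: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Divide every raw count by the size factor of its sample."""
    factors = size_factors(counts)
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}
```

Taking the median rather than the total read count makes the size factor robust to a small number of very highly expressed genes. The sketch assumes that at least one gene has non-zero counts in every sample.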

RT-qPCR. Complementary DNA was prepared using random hexamer oligonucleotides and SuperScript IV Reverse Transcriptase (Invitrogen) and amplified using LightCycler® 480 SYBR Green I Master (Roche) on a LightCycler® 480 II (Roche). Primers used for qPCR analysis are listed in Supplementary Table 2.
Repeat element analyses. Data were processed as already described 66 .
SLIC-CAGE analyses. Twenty-eight and 13 ng of total RNA isolated from P14 oocytes (biological replicate 1 and replicate 2, ~500-1000 oocytes pooled for each replicate) and 15 ng of total RNA isolated from P14 Tbpl2 −/− mutant oocytes (approximately 550 pooled oocytes) were used for SLIC-CAGE TSS mapping 43 . Briefly, 5 µg of the carrier RNA mix was added to each sample prior to reverse transcription, followed by the cap-trapping steps designed to isolate capped RNA polymerase II transcripts. The carrier was degraded from the final library prior to sequencing using homing endonucleases. The target library derived from the oocyte RNA polymerase II transcripts was PCR-amplified (15 cycles for P14 WT, 16 cycles for P14 Tbpl2 −/− mutant) and purified using AMPure beads (Beckman Coulter) to remove short PCR artifacts (<200 bp, size selection using a 0.8× AMPure beads to sample ratio). The libraries were sequenced on the HiSeq2500 Illumina platform in single-end, 50 bp mode (Genomics Facility, MRC, LMS).
Sequenced SLIC-CAGE reads were mapped to the reference M. musculus genome (mm10 assembly) using Bowtie2 62 with parameters that allow zero mismatches per seed sequence (22 nucleotides). Uniquely mapped reads were kept for downstream analyses using the CAGEr Bioconductor package (version 1.20.0) 68 and custom R/Bioconductor scripts. BAM files were imported into R using the CAGEr package, where the mismatching additional G, if added through the template-free activity of the reverse transcriptase, was removed. Data from the same samples sequenced on different lanes, and from biological replicates, were merged prior to final analyses.
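The removal of the additional 5′ G and the notion of a dominant TSS can be illustrated with a simplified Python sketch. It assumes plus-strand, ungapped alignments with 0-based coordinates, and it removes the first base only when it is a G that mismatches the reference. The function names are hypothetical, and the logic is a deliberate simplification rather than the procedure implemented in CAGEr.

```python
from collections import Counter
from typing import Iterable, Tuple


def strip_extra_g(read: str, ref: str, start: int) -> Tuple[str, int]:
    """Return (read, TSS position) after removing a non-templated 5' G, if present.

    `start` is the 0-based reference position of the first read base
    (plus strand, ungapped alignment assumed).
    """
    if read and read[0] == "G" and ref[start] != "G":
        # The leading G is absent from the genome, so it is treated as added
        # by the reverse transcriptase; the TSS is the next base.
        return read[1:], start + 1
    return read, start


def dominant_tss(tss_positions: Iterable[int]) -> int:
    """Most frequently used TSS position; ties go to the smaller coordinate."""
    counts = Counter(tss_positions)
    return min(counts, key=lambda pos: (-counts[pos], pos))
```

In this sketch a leading G that matches the genome is kept, because it cannot be distinguished from a templated base on sequence alone.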
TSSs corresponding to the MaLR ERVs were identified after annotation using HOMER (version 4.10) 69 .
Sequence analyses were performed using the Bioconductor R package seqPattern (version 1.14) and custom R scripts. WW dinucleotide enrichment was computed using plotPatternDensityMap on −250/+250 regions centred on the dominant TSSs. TATA box position weight matrix (PWM) match analyses were performed using the motifScanScores function applied to the −35/−20 sequences relative to the dominant TSSs, using the TBP PWM provided in the seqPattern package (derived from the JASPAR database). The distribution of the best match for each sequence was then plotted. Sequence logos were created using the Bioconductor R package seqLogo.
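The best-match PWM scan and the WW dinucleotide measure can be illustrated with a small Python sketch. The matrix below is a toy TATA-like matrix with made-up values, not the JASPAR TBP matrix shipped with seqPattern. Expressing the score as a percentage between the minimum and maximum attainable scores is an assumption made for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy TATAWA-like scoring matrix, one dict per motif position.
# Values are illustrative only and are NOT the JASPAR TBP matrix.
TOY_PWM: List[Dict[str, float]] = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
]


def best_pwm_match(seq: str, pwm: List[Dict[str, float]]) -> Tuple[float, int]:
    """Return (percent score, offset) of the best-scoring window in `seq`.

    `seq` must contain only A, C, G and T and be at least as long as the motif.
    """
    width = len(pwm)
    max_score = sum(max(col.values()) for col in pwm)
    min_score = sum(min(col.values()) for col in pwm)
    best_pct, best_offset = -1.0, -1
    for i in range(len(seq) - width + 1):
        raw = sum(col[base] for col, base in zip(pwm, seq[i:i + width]))
        # Rescale the raw score to 0-100 between the worst and best possible match.
        pct = 100.0 * (raw - min_score) / (max_score - min_score)
        if pct > best_pct:
            best_pct, best_offset = pct, i
    return best_pct, best_offset


def ww_fraction(seq: str) -> float:
    """Fraction of overlapping dinucleotides that are WW (W = A or T)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return 0.0
    return sum(a in "AT" and b in "AT" for a, b in pairs) / len(pairs)
```

Scanning only the window upstream of each dominant TSS and keeping the single best score per promoter yields one value per sequence, which is what allows the distributions from different promoter sets to be compared.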
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The datasets generated during the current study are available in the following repositories: proteomic data, ProteomeXchange PRIDE database with accession PXD0316347; RNA-seq data, Gene Expression Omnibus database with accession GSE140090; and SLIC-CAGE data, ArrayExpress with accession E-MTAB-8866. Source data are provided with this paper.

Code availability
RNA-seq data were analysed using the Bioconductor package DESeq2, and SLIC-CAGE data were analysed using the Bioconductor package CAGEr. All custom code is available upon request.