A class I odorant receptor enhancer shares a functional motif with class II enhancers

In the mouse, 129 functional class I odorant receptor (OR) genes reside in a large gene cluster spanning ~ 3 megabases on chromosome 7. The J element, a long-range cis-regulatory element, governs the singular expression of class I OR genes by exerting its effect over the whole cluster. To elucidate the molecular mechanisms underlying the class I-specific enhancer activity of the J element, we analyzed the J element sequence to determine the functional region and essential motif. The 430-bp core J element, which is highly conserved in mammalian species from the platypus to humans, contains a class I-specific conserved motif of AAACTTTTC, multiple homeodomain sites, and a neighboring O/E-like site, as in class II OR enhancers. A series of transgenic reporter assays demonstrated that the class I-specific motif is not essential, but the 330-bp core J-H/O containing the homeodomain and O/E-like sites is necessary and sufficient for class I-specific enhancer activity. Further motif analysis revealed that one of the homeodomain sequences is the Greek Islands composite motif of adjacent homeodomain and O/E-like sequences, and mutations in the composite motif abolished or severely reduced class I enhancer activity. Our results demonstrate that class I and class II enhancers share a functional motif for their enhancer activity.

In the main olfactory epithelium (MOE), olfactory sensory neurons (OSNs) detect chemical stimuli in the external environment by expressing odorant receptor (OR) genes 1 . ORs, G protein-coupled receptors with a putative seven-transmembrane structure, evolved to adapt to species-specific chemical environments, resulting in the establishment and diversification of the largest gene family in vertebrate genomes 2 . Mammalian OR genes are classified into two classes, class I and class II, based on the homology of their deduced amino acid sequences 3 . Class I ORs resemble the OR family first identified in fish and frogs 4,5 , whereas class II ORs are specific to terrestrial animals 6 . It has been presumed that class I ORs detect hydrophilic odorants and class II ORs detect hydrophobic odorants 7 .
Class I and class II genes have different genomic organization 8,9 . There are 129 functional class I genes in the mouse genome, all of which are embedded in a single huge cluster within a ~ 3 Mb genomic region on chromosome 7, forming one of the largest gene clusters in the genome. In contrast, the ~ 950 class II genes are distributed throughout almost all chromosomes. Each olfactory sensory neuron expresses a single functional allele of a single OR gene from the repertoire of class I or class II ORs [10][11][12][13][14] . This one neuron-one receptor rule is established during OSN differentiation by the two sequential steps of OR class choice and the expression of a single OR gene from the corresponding class of OR repertoires.
The OR class choice is regulated by the zinc finger transcription factor Bcl11b (also known as Ctip2), which functions as a binary switch to select OR class 14 . In the absence of Bcl11b, the class I gene is selected by default, whereas class I gene expression is suppressed in the presence of Bcl11b, leading to selection of the class II gene. The singular OR gene expression can be further divided into two processes, the selection of a single OR allele and the subsequent maintenance of transcription of that allele. For the selection of a single OR allele, Magklara et al. demonstrated that all OR genes are epigenetically silenced prior to OR selection, and that a single OR allele escapes stochastically from heterochromatic silencing 15 . Subsequently, the expressed OR protein elicits a negative-feedback signal to prevent the activation of additional OR genes 16,17 .
In addition to epigenetic regulation, it has been demonstrated that cis-regulatory elements/enhancers are involved in the transcriptional activation of a single OR allele 13,16,[18][19][20] within the linked cluster. A few cis-regulatory elements have been identified experimentally in mouse class II genes from ~ 60 candidate elements. The H, P, and Lipsi elements control 7-10 class II genes of the linked clusters within a ~ 200 kb genomic range 16,[18][19][20][21] . Together with the cis-regulatory effect, trans-interaction among the class II enhancers (Greek Islands) has been demonstrated to play an important role in the formation of an OR compartment and an interchromosomal enhancer hub to express a single OR gene, in which intergenic class II enhancers form a super-enhancer associating with a single active OR gene 22 . Recently, the J element, a cis-regulatory element of the mouse class I genes, was identified; it exhibits unique features with respect to its extraordinarily long-range regulation and is evolutionarily conserved in mammalian species 13 . The deletion of the J element was shown to result in a significant decrease in the mRNA levels of 75 class I genes over the whole 3 Mb cluster. Intriguingly, the J element regulates a much larger number of genes, over a greater genomic distance, than not only class II enhancers (comparing the cis-effect) but also any other known enhancer element. For example, the largest number of genes that a single element has been found to regulate is ~ 30, in the cluster control region of the protocadherin-β and -γ genes 23 , and the longest genomic distance across which an enhancer has been found to act is ~ 1.3 Mb, for the 3′ enhancer of the proto-oncogene Myc 24 .
In the class II enhancers, the H and P elements contain conserved sequence motifs of multiple homeodomain sites and neighboring O/E-like sites 18 . Mutation analysis of the core H element demonstrated that these conserved motifs are essential for enhancer activity. This motif organization of homeodomain and O/E-like sites was also found in the 430-bp core J element, which is conserved in mammalian species from the platypus to humans 13 . Recently, deletion of the 912-bp region containing the core J element by genome editing was shown to result in a massive decrease in class I gene expression, suggesting that the core J element plays an important role in class I gene expression 25 . Interestingly, the core J element contains a class I-specific novel conserved motif (5′-AAA CTT TTC-3′). In this study, to elucidate the molecular mechanisms underlying the class I-specific enhancer activity of the J element, we analyzed the core J sequence to identify the essential region and motifs for class I OSN-specific enhancer activity.

Results
The class I-specific motif is not essential for the enhancer activity. We previously demonstrated that the J-gVenus transgene, in which the 3.8-kb NcoI fragment including the J element was placed upstream of the 0.9-kb Olfr544 promoter region and the gapVenus reporter gene, exhibited reproducible and robust expression of gapVenus specifically in class I OSNs (Fig. 1A) 13 . Because the Olfr544 promoter region itself could not activate reporter gene expression 13,26 , and was replaceable by the SV40 minimal promoter for the class I-specific gene expression of the J element ( Fig. 1), we concluded that the J element is responsible for expression of the transgene. Within the J element, the 430-bp region (the core J element) that corresponds to the highest homology region between the mouse and human J elements contains the novel conserved motif of AAA CTT TTC in addition to the cluster of multiple homeodomain and neighboring O/E-like sites, as in the class II enhancers 13 (Fig. 1A). To identify the minimum requirement for the enhancer activity of the J element and function of the conserved motifs, we constructed a deletion series of transgenes based on the J-gVenus construct, and generated transgenic mice (Fig. 1B).
First, we constructed the J-ΔCore transgene by deleting the core J sequence from the J-gVenus transgene and generated transgenic mice using the Tol2 system. None of the nine founders carrying the J-ΔCore transgene showed gapVenus expression, whereas five out of six founders/lines of the J-gVenus Tg mice demonstrated the class I OSN-specific gene expression pattern, indicating that the core J is essential for the enhancer activity of the J element ( Fig. 1B,C). This result is comparable to the result of deletion of the 912-bp region including the core J element in vivo in a previous study 25 .
As the AAA CTT TTC motif is conserved through mammalian genomes from the platypus to humans and is specific to the class I enhancer 13 , it is possible that this motif is responsible for class I OSN-specific enhancer activity. To test the function of the class I-specific conserved motif, we deleted a region containing the class I-specific motif to retain the homeodomain and O/E-like sites to generate the J-ΔMotif transgene. Four out of five founders exhibited gapVenus fluorescence in both the MOE and olfactory bulb (OB) similar to the J-gVenus Tg mice, indicating that the novel conserved motif of AAA CTT TTC specific to the class I enhancer is not essential for the enhancer activity of the J element ( Fig. 1B,C).
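As an illustration of how the class I-specific motif could be located in a candidate sequence, the following minimal sketch scans both strands for exact matches to AAACTTTTC. The function names and the toy sequence are hypothetical and are not taken from the J element; the actual motif analysis in this study may have used different tools and criteria.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "AAACTTTTC") -> list:
    """List (start, strand) for every exact motif match on either strand.

    Start positions are 0-based and always refer to the given (forward) sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence with one forward and one reverse-strand match.
toy_sequence = "GGAAACTTTTCGGTTGAAAAGTTTCC"
print(find_motif(toy_sequence))
```

Exact matching is the simplest choice here; a position weight matrix would be needed to tolerate degenerate positions.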

The 330-bp core J-H/O is necessary and sufficient for class I-specific enhancer activity.
The results of the J-ΔMotif transgene reporter assay together with that of the J-ΔCore transgene suggested that the remaining 330-bp region containing the homeodomain and O/E-like sites designated as the core J-H/O is important for the enhancer activity of the J element. To test this, we deleted this region and generated transgenic mice carrying the J-ΔCore-H/O transgene. None of the seven founders of the J-ΔCore-H/O Tg mouse line exhibited gapVenus fluorescence, indicating that the 330-bp core J-H/O is necessary for the class I-specific enhancer activity (Fig. 1B,C).
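The founder counts reported for the deletion series can be summarized as fractions of expressing founders. The sketch below only restates the numbers given in the text; the dictionary layout and function name are illustrative assumptions, and no statistical test is implied.

```python
# (expressing founders, total founders) as reported in the text above.
founder_counts = {
    "J-gVenus": (5, 6),
    "J-ΔCore": (0, 9),
    "J-ΔMotif": (4, 5),
    "J-ΔCore-H/O": (0, 7),
}


def expressing_fraction(expressing: int, total: int) -> float:
    """Fraction of founders showing gapVenus expression."""
    return expressing / total


for construct, (expressing, total) in founder_counts.items():
    fraction = expressing_fraction(expressing, total)
    print(f"{construct}: {expressing}/{total} founders ({fraction:.0%})")
```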
We further examined whether the core J-H/O is sufficient for enhancer activity by constructing a CoreJ-H/O transgene, in which the 330-bp core J-H/O element was placed upstream of the Olfr544 promoter region and the gapVenus reporter gene (Fig. 2A). All six founders of the CoreJ-H/O Tg line showed a class I OSN-specific gene expression pattern similar to that of the J-gVenus Tg mice in both the dorsal MOE and the dorsal OB (Fig. 2B). To confirm the class I OSN-specific enhancer activity, we analyzed the co-expression of gapVenus with class I or class II OR genes in CoreJ-H/O Tg mice by two-color ISH (Fig. 2C). Quantification analysis of 2115 OSNs from three independent mice showed that gapVenus-expressing OSNs predominantly co-labeled with class I genes but not with class II genes (class I, 54/1058, co-expression rate = 5.1%; class II, 5/1057, co-expression rate = 0.47%;   (Fig. 3A). Recently, genome-wide searches for intergenic OR enhancers uncovered ~ 60 candidate enhancer elements (Greek Islands) 20,30 . Sequence analysis of the Greek Islands revealed a novel motif, the composite of adjacent homeodomain and O/E-like sequences, which plays an important role in class II enhancer functions by recruiting Lhx2 and O/Es 30 . Interestingly, one of the four homeodomain sites in the core J-H/O region, HD2, was found to be the composite motif (Fig. 3B). To examine whether this composite motif is important for the class I enhancer activity, we introduced point mutations into the motif by replacing AA with GG in the homeodomain sequence and CCC with AGG in the O/E-like sequence in the core J-H/O transgene to generate a mutated core J-H/O (mCoreJ-H/O) transgene (Fig. 3C). We obtained six founders of the mCoreJ-H/O Tg mouse line. Fluorescent signals of gapVenus were not detected in four founders and were barely detectable in two founders (Fig. 3D).
These results indicate that the composite motif found in the core J-H/O plays a central role in the enhancer activity, and suggest that the remaining homeodomain and O/E-like sites may also contribute to the class I-specific enhancer activity.
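The co-expression rates quoted for the CoreJ-H/O Tg mice follow directly from the reported cell counts. The short sketch below recomputes them as a sanity check; the variable names are illustrative, and only the counts stated in the text are used.

```python
# (co-labeled OSNs, gapVenus-positive OSNs counted) for each probe mix.
coexpression_counts = {
    "class I": (54, 1058),
    "class II": (5, 1057),
}

# Co-expression rate in percent for each probe mix.
coexpression_rates = {
    probe_mix: 100 * colabeled / counted
    for probe_mix, (colabeled, counted) in coexpression_counts.items()
}

# Total OSNs counted across both probe mixes.
total_osns = sum(counted for _, counted in coexpression_counts.values())

for probe_mix, rate in coexpression_rates.items():
    print(f"{probe_mix}: {rate:.2f}%")
print(f"total OSNs counted: {total_osns}")
```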

Discussion
In this study, we characterized the functional J element, and demonstrated that the 330-bp core J-H/O element is necessary and sufficient to drive class I OSN-specific gene expression. The length of the core J-H/O element is similar to that of the core H (187 bp) and P (317 bp) elements 12,18 . In addition, the motif organization of homeodomain and neighboring O/E-like sites is shared among them, suggesting that these features are common in class I and class II, and are important for their function as OR enhancers. Lhx2 and O/Es bind to the homeodomain and O/E-like sites, respectively, in the core J-H/O element as well as class II enhancer/promoters 27,29,30 , suggesting that Lhx2 and O/Es play critical roles in class I gene expression as well as in class II. However, while a knockout mutation in Lhx2 precluded expression of class II genes, most class I genes were still expressed, though the mRNA levels and the number of expressing OSNs decreased, suggesting the existence of both Lhx2-dependent and Lhx2-independent mechanisms for class I gene expression. Indeed, it has been reported that not only Lhx2 but also Emx2 binds to the olfactory homeodomain sites, and targeted deletion of the Emx2 gene altered the frequency of expression of OR genes, including class I 29,31 .
Motif analysis identified that one of the four homeodomain sites in the core J-H/O element (HD2 in Fig. 3) is the Greek Islands composite motif, a critical motif for some class II enhancer activities 30 . Our mutagenesis study demonstrated that the composite motif in the core J-H/O element is also important for enhancing class I OSN-specific gene expression as in the class II enhancers. However, the mutations in the composite motif did not abolish the enhancer activity completely; two out of the six founders exhibited very weak but specific enhancer activity in class I-OSNs, suggesting that the remaining three homeodomain sites (HD1, 3, and 4 in Fig. 3) and an O/E-like site may contribute to the enhancer activity of the J element. Because multiple homeodomain sites are frequently found in class II enhancers/promoters 20,30,32-34 , they may cooperate to regulate enhancer function. For example, HD1 (AAA CTT TTA ATG A) in the core J-H/O element is similar to the extended homeodomain sequence of AAC TTT TTA ATG A found in the H and P elements, and in the P3 promoter 35 . Although the extended homeodomain sequence is distinct from the composite motif, tandem repeats of the extended homeodomain sequence markedly increase transgene expression 35,36 . It is possible that the extended homeodomain sequence of HD1 cooperates to regulate the enhancer activity with the composite motif in HD2.
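The similarity between HD1 and the extended homeodomain sequence can be made concrete with a position-by-position comparison. This is a minimal sketch assuming an ungapped alignment of the two 13-nt sequences quoted above (written without the spacing); the function name is hypothetical.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


hd1 = "AAACTTTTAATGA"          # HD1 in the core J-H/O element
extended_hd = "AACTTTTTAATGA"  # extended homeodomain sequence (H, P, P3)

differences = mismatch_positions(hd1, extended_hd)
print(f"{len(differences)} mismatches out of {len(hd1)} positions at {differences}")
```

Under this ungapped comparison the two sequences differ at two adjacent positions and share the remaining eleven, including the TTAATGA portion.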
Because the class I-specific conserved motif of AAA CTT TTC in the J element was perfectly conserved in mammalian species from the platypus to humans 13 , we expected a critical role in the class I-specific enhancer activity of the J element. Contrary to our expectations, however, a series of transgenic reporter assays showed that the class I-specific conserved motif is not required for class I OSN-specific transcriptional activation. What is the function of the class I-specific motif? The major difference between the J-element and class II enhancers is the scale of action. Class II enhancers regulate the expression of 7 to 10 genes within approximately 200 kb of genomic distance. In contrast, the J element regulates the expression of 75 genes over a genomic region of approximately 3 Mb. The 780-bp ZRS, a cis-regulatory element responsible for the spatiotemporal control of sonic hedgehog (Shh) at a distance of ~ 800 kb in the limb bud, is composed of two distinct domains 37 . Deletion of the 302-bp on the 3′ side of the ZRS in vivo abolished the expression of Shh. However, in Tg mice carrying the reporter transgene of the other part of the ZRS, that is, the 3′ side deletion of the ZRS, the endogenous expression pattern of the reporter gene is replicated. As in the ZRS, it is possible that the class I-specific motif is responsible for the ultra-long-range action of the J element across the entire cluster, and the composite motif, together with other homeodomain sites, plays an important role as an OR enhancer. This possibility will be examined by deleting the AAA CTT TTC motifs from the J element by genome editing, which should reveal its function in vivo.
Overall, we identified that the core J-H/O element is necessary and sufficient for class I OSN-specific enhancer activity and that the composite motif plays a central role in the enhancer activity, as in the class II enhancers. Thus, the activation mechanism of each enhancer uses a common sequence motif, and the functional motif sequences of class I and class II enhancers do not by themselves define class specificity. What mechanisms determine class specificity? Recently, we demonstrated that Bcl11b, a zinc finger transcription factor, determines the OR class to be expressed in mouse OSNs, and that the OR class choice is established at the level of OR enhancer activation 14 . In the absence of Bcl11b, the class I enhancer is activated throughout the MOE, whereas the class II enhancer is suppressed. The class I enhancer activity is suppressed in the presence of Bcl11b, which in turn permits the activation of class II enhancers, resulting in the expression of class II genes. Because the depletion of Bcl11b, even after the terminal differentiation into neurons, can switch the enhancer activation from class II to class I in class II-characteristic OSNs, i.e., Acsm4 (also known as O-MACS) and NQO1-negative ventral OSNs, class specificity is determined by the absence (class I) or the presence (class II) of Bcl11b. This raises a new question: how does Bcl11b suppress the J element? Because the class I-specific conserved motif of AAA CTT TTC is inconsistent with the GC-rich consensus sequence of Bcl11b binding 38,39 and the alignment of the core J sequences in eleven mammalian species did not reveal any conserved motifs other than the class I-specific conserved motif, the composite motif, and homeodomain and O/E-like motifs 13 , Bcl11b may suppress the J element indirectly. Because Bcl11b recruits the nucleosome remodeling deacetylase (NuRD) complex 40 , the suppressive effects of Bcl11b on the J element may involve epigenetic modifications and chromatin remodeling.
[Figure 3: representative composite motif sequences (adjacent homeodomain and O/E-like sequences).]

In summary, in this study we identified the functional J element, and through examination of the effects of deletion and point mutations, demonstrated that class I and class II enhancers share the functional motif for their enhancer activities.

Methods
Animals. All mice were housed under standard conditions with a 12 h light/dark cycle, and access to food and water ad libitum. Mutant and wild-type mice of both sexes at 4-6 weeks of age were used for the experiments. All mouse studies were approved by the Institutional Animal Experiment Committee of the Tokyo Institute of Technology and were performed in accordance with institutional and governmental guidelines and the ARRIVE guidelines. J-gVenus transgenic (Tg) mice were generated as described previously 13 .
A deletion series of J-gVenus Tg mice, CoreJ-H/O, and mutated-CoreJ-H/O (mCoreJ-H/O) Tg mice were generated using the Tol2 cytoplasmic microinjection method, which is suitable for the founder assays of Tg mice because of its high efficiency of transgene integrations into multiple integration sites (usually 6-8 integration sites/founder) and a single copy of transgene for each integration site 26,41 . DNA solution containing 20 ng per μL circular Tol2-transgene plasmid and 25 ng per μL transposase mRNA was injected into the cytoplasm of B6C3F1 mouse zygotes (Japan SLC, Inc.). To obtain B6C3F1 zygotes, B6C3F1 female mice over four weeks old were treated with superovulation, and then mated with adult B6C3F1 male mice. An Olympus IX-71 microscope equipped with a micromanipulator transgenic system (Narishige) and FemtoJet system (Eppendorf) was used for microinjection. Injected eggs were transferred to the oviducts of pseudopregnant female ICRs (over six weeks old, Japan SLC, Inc.). The founders were screened by PCR with the Venus primer set of 5′-GCA AGC TGA CCC TGA AGC TG-3′ and 5′-TTG CTC AGG GCG GAC TGG TA-3′ 26 .

Analysis of whole-mount specimens. Fluorescent images of gapVenus signals in whole-mount specimens were taken with an Olympus SZX10 fluorescent stereomicroscope with a DP71 digital CCD camera and a Leica SPE confocal microscope. Confocal images were collected as z-stacks and projected onto a single image for display. Images were adjusted and merged using Adobe Photoshop CC2018.

Two-color in situ hybridization. Probes for Olfr78, Olfr544, Olfr552, Olfr578, Olfr672, Olfr692, Olfr19, Olfr54, Olfr73, Olfr151, Olfr521, Olfr878, and Venus (EGFP) were prepared as previously described 13 .
Briefly, all riboprobes were synthesized by in vitro transcription using T3, T7, or Sp6 RNA polymerase (Roche, 11031163001; 10881767001; 10810274001) with hapten-labeled UTP of digoxigenin (DIG) (Roche, 11277073910) or fluorescein (FLU) (Roche, 11685619910) in the presence of an RNase inhibitor, RNasin (Promega, N2111). The following mixed probes were used for two-color in situ hybridization (ISH); class I mix: Olfr78, Olfr544, Olfr552, Olfr578, Olfr672, and Olfr692; and class II mix: Olfr19, Olfr54, Olfr73, Olfr151, Olfr521, and Olfr878.
Mice were transcardially perfused with 4% paraformaldehyde in PBS. Dissected MOE tissues were post-fixed overnight at 4 °C. The tissues were then decalcified in 0.45 M EDTA in PBS for at least 2 days. After cryoprotection with 15% and 30% sucrose in PBS, tissue samples were embedded in FSC 22 Frozen Section Media (Leica Biosystems, 3801481), and sectioned coronally at 12 μm thickness using a cryostat (Microm HM505E). Sections were collected on MAS-coated glass slides (Matsunami, S9441).

Motif analysis. Chromatin immunoprecipitation-sequencing (ChIP-seq) data for Lhx2 and O/E proteins (Ebfs) were retrieved from GSE93570 30 . The representative composite motif sequences described in Fig. 3

Statistical analysis. Statistical analysis and graphical representation were performed using Microsoft Excel. No randomization method was used, and no statistical methods were used to predetermine the sample size. The sample sizes in this study were generally similar to those used by other studies in the field. Quantification of the number of OSNs co-labeled with OR mix probes and EGFP probe was performed blind to exclude experimenter bias.