In female (XX) mammals, one of the two X chromosomes is inactivated to ensure an equal dose of X-linked genes with males (XY)1. X-chromosome inactivation in eutherian mammals is mediated by the non-coding RNA Xist2. Xist is not found in metatherians3 (marsupials), and how X-chromosome inactivation is initiated in these mammals has been the subject of speculation for decades4. Using the marsupial Monodelphis domestica, here we identify Rsx (RNA-on-the-silent X), an RNA that has properties consistent with a role in X-chromosome inactivation. Rsx is a large, repeat-rich RNA that is expressed only in females and is transcribed from, and coats, the inactive X chromosome. In female germ cells, in which both X chromosomes are active, Rsx is silenced, linking Rsx expression to X-chromosome inactivation and reactivation. Integration of an Rsx transgene on an autosome in mouse embryonic stem cells leads to gene silencing in cis. Our findings permit comparative studies of X-chromosome inactivation in mammals and pose questions about the mechanisms by which X-chromosome inactivation is achieved in eutherians.
X-chromosome dosage-compensation mechanisms vary between metazoans5. In metatherians, X-chromosome inactivation (XCI) is imprinted, affecting the paternal X chromosome6, but the factors that drive XCI in these mammals are unknown4. The metatherian and eutherian female inactive X chromosomes share common epigenetic features7,8,9, suggesting that XCI in these mammals proceeds by a similar mechanism. Notably, the metatherian inactive X chromosome is enriched for histone H3 Lys 27 trimethylation (H3K27me3)7,8,9,10. In eutherians, this H3K27me3 enrichment is Xist-dependent11. We therefore proposed that an unidentified X-linked RNA initiates XCI in metatherians7. Xist RNA is expressed in female but not male somatic tissues, coats the inactive X chromosome, and is expressed from the inactive X chromosome12,13,14,15. We posited that a regulator of XCI in metatherians would also exhibit these unusual properties.
We analysed XCI in the female brain of the short-tailed opossum M. domestica. Using RNA fluorescence in situ hybridization (FISH), we studied the expression of the X-linked gene Hprt1 with a bacterial artificial chromosome (BAC), VM18-839J22, containing Hprt1 plus 49 kilobases (kb) of upstream and 82 kb of downstream sequence, and in which no other known genes mapped (Fig. 1a). RNA FISH signals usually appear as pinpoint dots. However, the RNA signal detected resembled a cloud (Fig. 1a and Supplementary Fig. 1) that was reminiscent of the Xist RNA cloud seen in female mouse (Fig. 1a) and human cells15. We observed the same RNA cloud using a modified form of the BAC carrying an Hprt1 deletion (Fig. 1a). The RNA therefore originated from another, uncharacterized gene located within the genomic region defined by VM18-839J22. RNA FISH using other BACs narrowed down this region to 82 kb downstream of Hprt1 (Fig. 1a). We identified the RNA using reverse transcription PCR (RT–PCR) on female brain complementary DNA with primers located along this critical region (Fig. 1b and Supplementary Table 1), revealing a transcription unit spanning 47 kb (Fig. 1b).
We then investigated whether the RNA exhibited other Xist-like features. First, we looked for evidence of sexually dimorphic expression. No RNA clouds were detected in male opossum brain by VM18-839J22 RNA FISH (Fig. 1b), demonstrating that in this tissue expression of the RNA was female-specific. Consistent with this, RT–PCR on male brain cDNA revealed no expression of the 47-kb transcript previously identified in females (Fig. 1b). RT–PCR on a broad array of tissues, representing derivatives of endoderm, mesoderm and ectoderm, from both males and females revealed expression of the RNA in all female but no male tissues examined (Fig. 1b).
Next, we established whether the RNA coats the inactive X chromosome. We combined VM18-839J22 RNA FISH on female brain cells with immunostaining for the inactive X chromosome marker H3K27me3. We observed colocalization of RNA clouds and H3K27me3 signals (Fig. 1c), demonstrating inactive X chromosome coating.
To determine whether the RNA was expressed from the inactive X chromosome, we performed dual RNA–DNA FISH using the VM18-839J22 BAC. No RNA signal was seen colocalizing with the DNA signal on the active X chromosome (Fig. 1d). By contrast, an RNA signal was observed colocalizing with the DNA signal on the inactive X chromosome (Fig. 1d). This RNA signal was brighter than others in the surrounding cloud, a feature characteristic of a site of nascent RNA synthesis. Thus, the RNA is expressed only from the inactive X chromosome. This must be the paternal X chromosome, as this is always chosen for inactivation6. In summary, like Xist, the RNA that we identified is female-specific, coats the inactive X and is transcribed only from the inactive X chromosome. We call the RNA Rsx (RNA-on-the-silent X).
To characterize Rsx further, we performed RNA-sequencing (RNA-seq) on female opossum brain (Fig. 2a). This confirmed that the Rsx gene generates a precursor RNA of 47 kb (University of California Santa Cruz (UCSC) monDom5 coordinates: chrX 35,605,415–35,651,609) transcribed antisense relative to Hprt1. Split RNA reads indicated that Rsx encodes a spliced RNA consisting of four exons: this was confirmed by RT–PCR (Fig. 2a and Supplementary Table 1). The RNA-seq data predicted that the mature Rsx RNA is large, approximating 27 kb, with 25 kb of sequence deriving from a single exon. Northern blots confirmed that Rsx RNA was large, exceeding the 17 kb mouse Xist RNA in size, and validated the strandedness, female-specificity and broadness of Rsx expression (Fig. 2b). The level of Rsx expression varied between female tissues, an observation also noted for Xist (Supplementary Fig. 2). 3′ rapid amplification of complementary DNA ends (RACE) demonstrated that Rsx transcripts are polyadenylated.
Sequence comparisons showed that Rsx and Xist are not homologous. Nevertheless, Rsx exhibited features reminiscent of Xist. Notably, it was highly enriched in tandem repeats biased towards the 5′ end of the RNA (Fig. 2c) and exhibiting high GC content. The Rsx repeats included two highly conserved and similar motifs with the potential to form stem–loop structures (Fig. 2c and Supplementary Fig. 3). RNA FISH using an oligonucleotide probe recognizing one of these repeats gave a cloud signal indistinguishable from that seen using the VM18-839J22 BAC, confirming that the repeats are included in the RNA that coats the inactive X chromosome (Fig. 2c). The longest open reading frame (ORF) found for Rsx constituted less than 5% of the total RNA length, and was located in the repeat region, suggesting that Rsx functions as a non-coding RNA. We conclude that the Rsx and Xist RNAs display similar features.
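The coding-potential argument above rests on simple arithmetic: for a mature RNA of about 27 kb, an ORF occupying less than 5% of the transcript is shorter than roughly 1.35 kb. The text does not state how ORFs were called, so the following is only a minimal sketch under stated assumptions: ORFs are taken to run from an ATG to the first in-frame stop codon, only the three sense-strand frames are scanned, and the function names `longest_orf` and `orf_fraction` are hypothetical.

```python
# Illustrative sketch only; the ORF definition here is an assumption, not the authors' method.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> int:
    """Return the length (nt, stop codon included) of the longest
    ATG-to-stop ORF in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # close the ORF and look for the next ATG
    return best


def orf_fraction(seq: str) -> float:
    """Fraction of the transcript occupied by its longest ORF."""
    return longest_orf(seq) / len(seq) if seq else 0.0
```

Applied to a full-length transcript sequence, `orf_fraction` returns the quantity compared against the 5% figure; ORFs lacking a stop codon before the end of the sequence are ignored in this sketch.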
RNA-seq has been used previously to identify new transcripts16. We speculated that analysis of RNA-seq data alone would identify Rsx as a candidate XCI RNA. An RNA with a role in XCI would be X-linked and expressed only in females, and should therefore be evident in a comparison of female and male transcriptomes. To identify X-linked genes with sexually dimorphic expression levels, we compared the numbers of reads mapping to each region of the X in the female with that in the male brain and expressed this as a female:male ratio (Supplementary Table 2 and Methods). When all transcribed regions on the X chromosome were examined, Rsx was an outlier, with a female:male ratio exceeding the second-ranked RNA by threefold (Fig. 2d). We repeated this RNA-seq approach on liver, in which the level of Rsx expression is low (Fig. 2b). In this analysis, Rsx appeared second (Supplementary Table 2). Thus, RNA-seq can be used as a preliminary discovery tool to identify RNAs involved in dosage compensation.
To investigate a link between Rsx RNA and XCI, we examined Rsx expression in the female germ line. In mice, Xist is expressed in somatic tissues but is silenced during oocyte development. This is accompanied by a loss of H3K27me3 from the inactive X chromosome and by X-chromosome reactivation17,18,19. Similar to other somatic cells, supporting cells in the ovary displayed Rsx clouds (Fig. 3a) and XCI, as shown by X chromosome H3K27me3 enrichment (Fig. 3b) and monoallelic expression of the X chromosome gene Msn (Fig. 3c). However, in germ cells, identified by HORMAD1 immunostaining20, Rsx clouds were absent (Fig. 3a). Consistent with a relationship between Rsx expression and XCI, most meiotic cells had two active X chromosomes, with no X chromosome H3K27me3 accumulation (Fig. 3b), and biallelic Msn expression (Fig. 3c). Rsx expression is therefore linked to X-chromosome inactivation and reactivation.
We next carried out experiments to address whether Rsx induces gene silencing. Xist transgenes function as ectopic X-inactivation centres in mouse embryonic stem (ES) cells, with Xist RNA coating the transgenic chromosome and inducing gene silencing in cis21,22. We generated an XX ES cell line, 303.2, carrying a single-copy chromosome 18-integrated transgene expressing full-length Rsx RNA (Fig. 4a).
We performed RNA FISH for Rsx and three chromosome 18 genes, Ndfip1, Prrc1 and Synpo, mapping near the transgene integration site, in differentiated 303.2 ES cells. We observed coating of the transgenic chromosome by Rsx RNA (Fig. 4b). Although Ndfip1, Prrc1 and Synpo were biallelically expressed in control ES cells (Fig. 4b, c), all three genes were silenced in more than half of the 303.2 ES cells (Fig. 4b, c). Silencing also occurred in undifferentiated 303.2 cells, albeit in a lower proportion than seen after differentiation (Fig. 4c). We conclude that Rsx expression can induce gene silencing in cis.
Finally, we looked for evidence of Rsx conservation among metatherians. Metatherians are divided into the South American and Australasian groups, which diverged 75–80 Myr ago. We identified expressed sequence tags (ESTs) with homology to opossum Rsx in two Australasian marsupials, the brushtail possum and tammar wallaby (Supplementary Table 3). In support of a role in XCI, RT–PCR demonstrated that in both organisms these ESTs were expressed only in females (Supplementary Fig. 4 and Supplementary Table 1). Rsx therefore originated before the major American–Australasian metatherian evolutionary split, indicating a common mechanism of XCI in all metatherians.
Here we identify Rsx, an RNA with properties suggestive of a role in metatherian XCI. Our findings indicate that RNA-mediated dosage-compensation mechanisms are widespread in the mammals. In eutherians, Xist is one of many non-coding RNAs expressed at the onset of XCI and colocated in the X-inactivation centre23. Our work identifies a candidate X-inactivation centre on the metatherian X chromosome and provides a point of focus for the identification of further RNAs that regulate XCI and ensure that it is imprinted. These studies will deepen our understanding not only of XCI, but also of the evolution and mechanisms controlling genomic imprinting in mammals.
Our results raise questions about the epigenetics of eutherian XCI. Xist RNA has been proposed to spread along the X chromosome by long interspersed elements (LINEs), which are abundant on the X chromosome24. Genomic analyses have concluded that the opossum X chromosome is not enriched in LINEs25, yet Rsx RNA can still coat the inactive X chromosome. Thus, other factors, such as nuclear matrix and scaffold proteins26, may be more important than LINEs as primary determinants of Rsx, and potentially Xist, spreading. In addition, a study has found that in mice, imprinted inactivation of some X-linked genes proceeds in the absence of Xist27 (but see also ref. 28). It is therefore essential to determine whether an Rsx orthologue is present in eutherians and, if so, whether it is expressed and contributes to imprinted XCI in these mammals (Supplementary Discussion).
Previous work has shown that epigenetic features of XCI are conserved between metatherians and eutherians7,8,9. Although Rsx and Xist are not homologous, arising independently during evolution, they exhibit similarities in secondary sequence features. Thus, many aspects of the XCI pathway seem to have evolved convergently in metatherians and eutherians. With this in mind, it will be interesting to establish whether Rsx and Xist can replace one another in the XCI process. These experiments, as well as those that directly test the role of Rsx in metatherian XCI, would benefit from a genetically manipulable in vitro method for studying XCI in metatherians, akin to the ES cell system used in eutherians. The application of somatic cell reprogramming techniques29 to marsupials should now make this achievable.
RNA FISH, DNA FISH and immunostaining were performed as described elsewhere30. RNA-seq was performed on an Illumina HiSeq 2000 sequencer. Repeats were predicted using CisFinder.
Material for this study was acquired from opossums maintained at the Southwest Foundation for Biomedical Research in San Antonio (germ-cell studies) and from opossums maintained at the MRC NIMR (all other studies) according to UK Home Office regulations. Tammar wallabies of Kangaroo Island, South Australia origin were maintained in a breeding colony in open grassy enclosures. Husbandry, handling and experiments were in accordance with the National Health and Medical Research Council of Australia/Commonwealth Scientific and Industrial Research Organization/Australian Research Council (2004) guidelines, and all sampling techniques and collection of tissues were approved by the University of Melbourne Animal Experimentation Ethics Committees. Material from the brushtail possums was obtained from adult male and female possums housed at the Landcare Research Captive Animal Facility, Lincoln, New Zealand, following the Landcare Research code of ethical conduct and in accordance with part 6 of the New Zealand Animal Welfare Act 1999.
RNA FISH, DNA FISH and immunostaining
Combined RNA FISH, DNA FISH and immunostaining was carried out exactly as previously described30. All RNA FISH was carried out on primary cells collected immediately post-mortem. Germ cells were collected at both 14 days post-partum (d.p.p.) and 17 d.p.p.
VM18-839J22 BAC was electroporated into modified DH10B strain SW102 cells followed by selection using chloramphenicol. BAC-containing SW102 cells were grown to competency and then electroporated with a recombineering construct that had been generated from a kanamycin-resistance template using primers listed in Supplementary Table 1. Kanamycin-resistant colonies were subsequently picked and the Hprt1 deletion was verified using primers flanking the deletion site.
RNA was sequenced using a strand-specific protocol31 with the exception that total RNA was used instead of the polyA+ fraction. In brief, 3.5 μg of total RNA was reverse-transcribed using Superscript II (Invitrogen) in the presence of 10 pmol of T18VN primer, 250 ng random hexamers (Promega), 120 ng actinomycin D (Sigma), 40 U RNasin (Promega) and 0.5 mM dNTP in a total volume of 20 μl 1× reverse transcription buffer (Invitrogen). The reaction mixture was purified using the QIAquick PCR purification kit (Qiagen) and second-strand synthesis was performed using the SuperScript double-stranded cDNA synthesis kit (Invitrogen) as recommended by the manufacturer with the exception that 200 μM dTTP was replaced with 400 μM dUTP. Double-stranded cDNA was fragmented using a Bioruptor (Diagenode) (15 min, low power, 30 s on, 30 s off) in a 50 μl volume. Sequencing libraries were prepared from fragmented cDNA using the NEBNext DNA sample preparation kit (New England Biolabs) and Illumina PE adapters (PE-102-1003). Before the final PCR amplification, samples were treated with uracil-N-glycosylase (Applied Biosystems). Sequencing was performed on an Illumina HiSeq 2000 sequencer in the NIDDK Genomics core in 2× 50-base-pair (bp) paired-end mode. After initial processing with the Illumina pipeline, quality-filtered sequencing reads were aligned to the monDom5 version of the opossum genome assembly using BWA32. We generated in total 96 Mb reads for the female brain sample and 56 Mb reads for the male brain sample. Data were further processed using Samtools33 and visualized with IGV34.
Sequencing of Rsx RNA
RNA-seq gives an overall predicted size of the mature Rsx RNA as 26,800 bp. Note that the predicted transcription start site differs when using RNA-seq or 5′ RACE (coordinates in Supplementary Table 1). Split reads span the two sequence gaps on the right, indicating that these gaps contain only intronic sequence (Fig. 2a). We sequenced 8 kb of the gaps located within this intron, and no RNA-seq reads mapped to this sequence, suggesting that further exons have not been missed. A third gap resides in the middle of the third exon of Rsx. PCR shows that this gap is 5 kb, rather than the 8 kb given in the monDom5 version of the genome assembly. We sequenced 2.2 kb of this gap, and encountered a short and highly repetitive unit that precludes sequencing of the remaining 2.8 kb.
Female and male transcriptome analysis
RNA-seq data were converted into a total of 12,582 transcription units (top 500 listed in Supplementary Table 2). Closely mapping and overlapping reads were amalgamated into ‘blocks’, in which adjacent reads were amalgamated if there was no more than a ‘coalescence distance’ between them, measured between their nearest inside edges. Initially this was done with a 250-nucleotide coalescence distance, and the male and female samples were analysed separately. A count was kept of the number of reads amalgamated into each block. To compare identical loci, the sex-specific blocks were then amalgamated with each other using a longer coalescence distance of 2 kb. These second-level amalgamated blocks were then analysed for the ratio of female:male reads in each block. To prevent divide-by-zero errors, and to suppress probable false positives in which only small numbers of reads are involved, we added an arbitrary amount of noise (in this case 10 reads) to both female and male read counts for each block, before calculating the ratio. Blocks were then ranked by this noise-adjusted ratio. Note that the ratio of 150 for Rsx cited in the text is the average of three different regions of the RNA; individual ratios were 218, 205 and 28 (see also Supplementary Table 2).
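The block-building and ranking procedure described above can be sketched in a few lines. This is an illustrative reconstruction rather than the analysis code itself: reads are assumed to be given as (start, end) coordinates on a single chromosome, and the function names `coalesce`, `merge_sexes` and `rank_by_ratio` are hypothetical. The 250-nucleotide and 2-kb coalescence distances and the 10-read noise term are taken from the text.

```python
from typing import List, Tuple

Interval = Tuple[int, int]            # (start, end) of one mapped read
Block = Tuple[int, int, int]          # (start, end, read_count)
SexBlock = Tuple[int, int, int, int]  # (start, end, female_reads, male_reads)


def coalesce(reads: List[Interval], distance: int = 250) -> List[Block]:
    """First level: merge reads of one sample whose nearest inside
    edges are no more than `distance` apart, counting reads per block."""
    blocks: List[List[int]] = []
    for start, end in sorted(reads):
        if blocks and start - blocks[-1][1] <= distance:
            blocks[-1][1] = max(blocks[-1][1], end)
            blocks[-1][2] += 1
        else:
            blocks.append([start, end, 1])
    return [(b[0], b[1], b[2]) for b in blocks]


def merge_sexes(female: List[Block], male: List[Block],
                distance: int = 2000) -> List[SexBlock]:
    """Second level: merge female and male blocks into shared loci,
    keeping a separate read count for each sex."""
    tagged = [(s, e, n, 0) for s, e, n in female] + \
             [(s, e, n, 1) for s, e, n in male]
    merged: List[List[int]] = []
    for s, e, n, sex in sorted(tagged):
        if merged and s - merged[-1][1] <= distance:
            merged[-1][1] = max(merged[-1][1], e)
            merged[-1][2 + sex] += n
        else:
            row = [s, e, 0, 0]
            row[2 + sex] = n
            merged.append(row)
    return [(r[0], r[1], r[2], r[3]) for r in merged]


def rank_by_ratio(blocks: List[SexBlock],
                  noise: int = 10) -> List[Tuple[int, int, float]]:
    """Rank loci by the noise-adjusted female:male read ratio."""
    scored = [(s, e, (f + noise) / (m + noise)) for s, e, f, m in blocks]
    return sorted(scored, key=lambda r: r[2], reverse=True)
```

Adding the same constant to both counts means that a locus with no reads in either sex scores 1 rather than raising an error, and that a locus with a handful of female reads and no male reads scores only slightly above 1. As a check on the figures quoted above, (218 + 205 + 28) / 3 is approximately 150.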
For repeat prediction, the 5′ region of the Rsx transcript was searched for over-represented motifs using CisFinder35. Motifs were then mapped back to the primary transcript using pairwise BLAST.
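The idea of searching a region for over-represented motifs can be illustrated with a simple k-mer comparison. This is not the CisFinder algorithm and is not intended to reproduce its output; it is a swapped-in, simplified stand-in that scores each k-mer in a test sequence by its frequency relative to a background sequence. The function names, the choice of k, and the pseudocount are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def enriched_kmers(test: str, background: str, k: int = 6,
                   pseudo: float = 1.0, top: int = 5) -> List[Tuple[str, float]]:
    """Return the k-mers most enriched in `test` relative to `background`.

    The score is the ratio of pseudocount-adjusted frequencies; the
    pseudocount keeps k-mers absent from the background from dividing by zero.
    """
    t = kmer_counts(test, k)
    b = kmer_counts(background, k)
    n_test = max(sum(t.values()), 1)
    n_back = max(sum(b.values()), 1)
    scores = {
        kmer: ((count + pseudo) / n_test) / ((b.get(kmer, 0) + pseudo) / n_back)
        for kmer, count in t.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

In this sketch the 5′ region of a transcript would be passed as `test` and the remainder of the transcript, or a shuffled control, as `background`; the choice of background is itself an assumption that strongly affects which motifs score highly.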
RNA extraction, RT–PCR, cloning, northern blotting, RACE and quantitative PCR
RNA was extracted using Trizol (GIBCO BRL) according to the manufacturer’s instructions. For RT–PCR, 2 μg of total RNA was treated with DNase (Promega) for 1 h at 37 °C, before random hexameric reverse transcription using Superscript II (Gibco BRL) for 1.5 h at 42 °C. PCR primers (Supplementary Table 1) were designed using Primer3 and all PCRs were carried out using the parameters: 1 cycle: 94 °C 3 min; 35 cycles: 96 °C 10 s, 56 °C 30 s, 72 °C 30 s; 1 cycle: 72 °C 10 min. PCR products spanning exon–exon boundaries (Supplementary Table 1) were cloned into TOPO TA cloning vector (Clontech) for subsequent sequencing.
For northern blot analysis, 10 μg of total RNA, extracted as described earlier, was electrophoresed, together with RNA size markers, on a 0.8% agarose gel containing 1.9 M formaldehyde. RNA was transferred to Hybond-N membranes (Amersham) using 20× saline-sodium citrate (SSC), and the membrane was hybridized overnight at 60 °C for α32P-labelled oligonucleotide probes and 42 °C for γ32P-labelled oligonucleotide probes. For α32P-labelled probe experiments, filters were then washed at 60 °C for 20 min in 2× SSC, 0.1% SDS then for 30 min in 0.5× SSC, 0.1% SDS and finally for 10 min in 0.2× SSC, 0.1% SDS. For γ32P-labelled probe experiments, filters were washed at 42 °C for 10 min in 6× saline sodium phosphate EDTA (SSPE) buffer, 0.1% SDS, then for 5 min in 2× SSC, 0.1% SDS. Note, the reason that Rsx is expressed at varying levels in different tissues is not clear, but may reflect differing requirements for this RNA in X-gene silencing in different cell types. The same phenomenon of variable expression is also observed for Xist (Supplementary Fig. 2) so is not peculiar to this RNA.
5′ RACE was performed using the 5′ RACE system (Invitrogen) and 3′ RACE with the SMARTer RACE cDNA amplification kit (Clontech), in both cases using 1 μg of total RNA from the female opossum brain.
Quantitative PCR for copy number was carried out as previously described36.
ES cell derivation and differentiation
ES cell lines were established from a mouse transgenic for VM18-303M7 BAC, according to published protocols37. In brief, blastocysts were flushed from the uterus at 3.5 d.p.c. and cultured in 2-inhibitor/leukemia inhibitory factor (2i/LIF) medium for 10 days without feeders, during which time inner cell mass outgrowth occurred. This was dissociated in trypsin and passaged to generate ES cells. ES cells were differentiated using retinoic acid for three days exactly as described38.
Image acquisition was performed with an Olympus IX70 inverted microscope with a 100 W mercury arc lamp and a 100×/1.35 UPLAN APO oil immersion objective. Each fluorochrome image was captured separately as a 12-bit source image with a computer-assisted (SoftWoRx) liquid-cooled charge-coupled device (Photometrics CH350L; Kodak KAF1400 sensor, 1317 × 1035 pixels).
Gene Expression Omnibus
RNA-seq data are available from the Gene Expression Omnibus under accession number GSE36861; Rsx accession number JQ937282.
We thank D. Bell and R. Lovell-Badge for advice on the characterisation of Rsx, J. Cloutier and G. Polikiewicz for help with germ-cell preparations and quantitative PCR, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Genomics core (National Institutes of Health, NIH) for RNA sequencing, A. Toth for the HORMAD1 antibody, the Biological and Procedural Services units at the National Institute for Medical Research (NIMR) for animal husbandry and Rsx transgenesis, and J. Cocquet, L. Reynard, H. Byers and members of the Turner and P. Burgoyne laboratories for reading of the manuscript. This work was supported by the Medical Research Council (MRC) (U117588498, U117597141, U117581331, U117597137), the NIH (HD60858), the Robert J. Kleberg Jr and Helen C. Kleberg Foundation, the New Zealand Foundation for Research, Science and Technology, Possum Biocontrol (C10X0501), the Australian National Health and Medical Research Council (1010453) and the NIDDK (NIH) Intramural Research Program.