INTRODUCTION

In diploid eukaryotic organisms, the maternally and paternally derived copies of each gene are usually assumed to be simultaneously expressed at similar levels. However, in some cases only one allele is transcribed, while the other allele is transcriptionally silent. These monoallelically expressed genes belong to three separate classes. In the first class, parent-of-origin imprinting, monoallelic expression is determined by marks placed during gametogenesis, which lead to imprinting either in specific tissues or throughout the entire organism (Reik and Walter, 2001). All cells in which a given gene is imprinted have the same active allele, which is determined solely by the parent of origin of the allele. The remaining two classes of genes both fall into the category of random monoallelic expression: X-inactivated genes, for which there is coordination across the X chromosome (Lyon, 1986), and autosomal genes subject to random monoallelic expression (Gimelbrant et al, 2007). In both types of random monoallelic expression, the initial random choice between alleles is followed by a stable mitotic transmission of monoallelic expression. This review will consider the various types of monoallelic expression, all of which have the potential to impact the functioning of the nervous system and other parts of the body (Figure 1).

Figure 1
Types of epigenetically driven monoallelic expression. For the autosomal genes subject to random monoallelic expression, landmark gene classes are shown along with the respective dates when their monoallelic expression was discovered.

IMPRINTING

Whether it is called parent-of-origin imprinting, genomic imprinting, or imprinting, this type of monoallelic expression is determined by different epigenetic marks placed during gametogenesis in the male and female germline. The fertilized egg thus has different marks on the copies of imprinted genes that came from the paternal and maternal contributions. The differential marking leads to monoallelic expression, sometimes just in specific tissues, but more commonly throughout the entire organism; some imprinted genes are expressed only from the maternal allele, while other imprinted genes are expressed only from the paternal allele. All cells in which a given gene is imprinted have the same active allele, which is determined solely by the parent of origin of the allele. Imprinted genes encode proteins with a wide range of cellular functions. It is notable that a variety of epigenetic marks, including DNA methylation and modifications of histones, have been associated with imprinted loci (Ferguson-Smith, 2011).

How many genes are subject to imprinting? Estimates of the number of imprinted genes are in the range of 100 in mice and around 50 in humans. Most imprinted human genes are also imprinted in the mouse, but because more genes are imprinted in the mouse, not all mouse imprinted genes are imprinted in humans. Note that there were widely discussed reports in 2010 that used RNA-Seq and claimed that there are over 1000 imprinted genes in the mouse brain (Gregg et al, 2010a, 2010b). However, technical and analytic mistakes (which led to the over-exuberant tallying of imprinted genes) were explicitly pointed out in work that re-analyzed the primary data from the manuscripts by Gregg et al and also analyzed independently obtained data sets (DeVeale et al, 2012). An earlier manuscript also reported data inconsistent with the work by Gregg et al (Wang et al, 2008).

History of Searches for Imprinted Genes

Early use of the term imprinting to refer to loss of genetic information notwithstanding, imprinting is now commonly accepted as describing parent-of-origin-dependent monoallelic expression. The first described case was a plant example, wherein it was shown that the R locus involved in maize kernel coloration has allele-specific expression determined by parent of origin of the allele rather than the DNA sequence (Kermicle, 1970).

Mouse experiments published in the 1970s also revealed parent-of-origin effects. One study analyzed a deletion at the Tme locus, which showed early lethality when maternally inherited, but not when paternally inherited (Johnson, 1974). Other studies analyzed Robertsonian and reciprocal translocations that allowed manipulation of the parent of origin of distinct chromosomal regions. Areas of uniparental disomy were created and it was shown that for certain chromosomal regions, or entire chromosomes, even though the dosage of all areas of the genome was diploid, there were phenotypic effects of the uniparental disomy (Cattanach and Kirk, 1985). In all, 13 sub-chromosomal regions were eventually defined by these experiments and, not surprisingly, these regions harbor most imprinted genes of the mouse. Nuclear transplantation experiments performed in the 1980s showed that both the maternal and the paternal genomes were needed to construct a viable embryo (McGrath and Solter, 1984; Surani et al, 1984). Attempts at the construction of gynogenetic and androgenetic mouse embryos failed; although both types of embryos were diploid, neither could develop into a viable embryo.

The appreciation that the maternal and paternal DNA contributions to the developing embryo contained information specific to their parental origins was immediately fascinating to biologists and led to efforts in numerous groups around the world to identify imprinted genes and to study the mechanisms underlying the phenomenon.

The first imprinted genes, found in 1991, were Igf2 and Igf2r, which encodes a receptor for Igf2. Soon after, a non-coding RNA, H19, was found within 100 kb of the Igf2 gene. H19 is expressed from the maternal allele, while Igf2 is expressed from the paternal allele. As more genes were found, mechanistic studies began to reveal that differentially methylated regions characterized imprinted loci. The differential methylation occurs because different patterns of methylation are placed on genes in the male vs the female germline during gametogenesis. There are also differentially methylated regions that are not based directly on marks placed during gametogenesis, but rather are placed during early embryonic development in response to other marks that came from the respective germlines. Most, but not all, orthologs that are imprinted in humans are also imprinted in mice (Moore and Oakey, 2011). Efforts to understand the role of histone tail modifications in control of imprinted genes noted that the discrepancy between mouse and human imprinted gene sets is mirrored by a discrepancy in allele-specific histone tail marks.

Neurological disease-relevant examples of imprinting include Prader–Willi syndrome and Angelman syndrome. Both of these syndromes are complex and have neurological and behavioral features. They are due to genetic perturbations at the same genomic location (15q11–q13) (Buiting et al, 1995; Nicholls et al, 1989). Which syndrome is present depends on parent-of-origin effects. There is an imprinting control region that impacts both maternally expressed and paternally expressed genes in the 15q11–q13 region.

Genomics approaches will continue to look for more imprinted genes as well as explore mechanisms involved in imprinting. A recent paper examined the entire mouse genome for allele-specific DNA methylation associated with an imprinting-type pattern as well as patterns dictated by sequence differences (Xie et al, 2012).

Imprinting mechanisms are a potential target for therapies to ameliorate pathophysiology. An interesting recent example that shows the potential for an epigenetic-based therapy is the discovery, via chemical screening, that topoisomerase inhibitors can improve neural function by unsilencing the dormant allele of the Ube3a gene, a gene within the Prader–Willi/Angelman locus (Huang et al, 2012). Molecular mechanisms underlying imprinting, involving DNA methylation and other epigenetic marks, were reviewed recently (Abramowitz and Bartolomei, 2011).

CIS-REGULATORY POLYMORPHISM

Cis-regulatory elements are DNA sequences that regulate the level, timing, and location of gene expression. They can be found within and near the promoter as well as at large distances from the start site of transcription. Polymorphisms in cis-regulatory elements (Pastinen, 2010) can lead to differences in levels of expression between the two alleles that can be extreme (>10-fold difference) or can be more subtle. Even subtle differences are potentially important as a mechanism having an impact on genotype–phenotype correlation. Cis-regulatory polymorphisms are often more difficult to identify clearly than coding region polymorphisms. Sometimes interspecies conservation can aid in identifying important cis-regulatory elements, but often it is found that elements in one species may not be conserved (at least at the same relative position), even in closely related species.

COPY NUMBER VARIATION

Aneuploidy is also a potential mechanism for yielding monoallelic expression. Although it is not as interesting from an epigenetic point of view, it can certainly be effective in causing monoallelic expression. For example, if one of the two copies of a chromosome or portion of a chromosome is deleted, then the expression will be monoallelic for genes within the impacted genomic region because all expression emanates from the one remaining allele. It is important to note that aneuploidy can be found in mosaic form, such that the aneuploidy is only observed in certain tissues, in a fraction of cells within a tissue, or in a fraction of cells distributed among a number of tissues. Thus, it is possible for a given individual to have a normal genotype, as ascertained from a peripheral tissue (eg, blood cells or cheek cells), but to have aneuploidy in the brain or certain brain regions, or in some cells within a given region.

Copy number variation can occur for smaller regions that would not be large enough to be considered examples of aneuploidy. If a single gene undergoes a deletion in its entirety or a deletion that causes loss of expression from one of the two alleles, then this is an obvious cause of monoallelic expression. Like the aneuploidy discussed in the prior paragraph, deletion mechanisms are less interesting from an epigenetic point of view, but certainly can be a mechanism that leads to monoallelic expression.

X-LINKED RANDOM MONOALLELIC EXPRESSION

For X-inactivated genes, there is coordination across the X chromosome of monoallelic expression. Early events in the setting up of this monoallelic expression include interactions between a number of cis-regulatory elements and non-coding RNAs in a region called the X-inactivation center. The Xist non-coding RNA ultimately becomes fixed in its transcription at only one of the two X chromosomes and indeed coats most of the chromosome (Clemson et al, 1996). Other epigenetic marks associated with the inactive X include late replication, DNA methylation at CpG islands, hypoacetylation of histones, and unusual histone subunit deposition (Jeon et al, 2012). Recently, it was discovered that there is allele-specific DNA methylation on the active X, in areas outside of CpG islands and somewhat focused on gene bodies (Hellman and Chess, 2007). In the case of X inactivation, a random choice between the two X chromosomes is made in individual cells early in female development (Lyon, 1986), and this initial choice is followed by a stable mitotic transmission of monoallelic expression. There are genes that escape X inactivation in humans, and the extent of escape can vary among individuals (Carrel and Willard, 2005).

While X inactivation is random, there is sometimes skewing of X inactivation, due either to sequence polymorphism leading to primary skewing or to mutation in one or another gene on the X leading to secondary skewing through differential growth/survival of cells (Jeon et al, 2012). The extent to which skewing of X inactivation plays a role in brain phenotypes is difficult to assess, but undoubtedly the potential for such a role is there. Rett syndrome, a neurodevelopmental disorder, is caused by mutations in the X-linked MeCP2 gene and is a disease where skewed X inactivation has been considered (Agarwal et al, 2011).

AUTOSOMAL RANDOM MONOALLELIC EXPRESSION

This section deals with an emerging class of autosomal genes with properties reminiscent of X inactivation. In the 1990s it started to emerge that a class of autosomal mammalian genes might exist that shares similarities with the genes subject to X-chromosome inactivation (Bix and Locksley, 1998; Chess et al, 1994; Gimelbrant et al, 2005; Held et al, 1995; Hollander et al, 1998; Rhoades et al, 2000; Riviere et al, 1998). A defining feature of these autosomal genes is that they (like X-inactivated genes) are monoallelically expressed in a random manner. For some of these genes, half the cells express the maternal allele and half the cells express the paternal allele. Other genes also falling into the randomly monoallelically expressed class have some cells with biallelic expression in addition to the cells with monoallelic expression. Here, for the most part, we are talking about an all-or-none pattern such that the non-expressed alleles appear to be completely, or almost completely, silent in those cells where they are not expressed. Notably, autosomal random monoallelic expression can impact biological function as it leads to three distinct expression states for each gene: expression of both alleles in some cells and expression of either the maternal allele or the paternal allele in other cells. Thus, random monoallelic expression can lead to unique cell identity for individual cells.
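The counting argument behind the three expression states can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that the allelic state of each gene is chosen independently of every other gene; the state labels and the function name are hypothetical.

```python
from itertools import product

# The three per-gene expression states described in the text.
STATES = ("maternal-only", "paternal-only", "biallelic")


def possible_cell_states(n_genes):
    """Upper bound on distinct allelic expression profiles for n_genes,
    assuming (for illustration only) that each gene's state is independent."""
    return len(STATES) ** n_genes


# Enumerate every profile explicitly for two genes: 3 * 3 = 9 combinations.
two_gene_profiles = list(product(STATES, repeat=2))
```

Under this assumption, 10 such genes already allow up to 3**10 = 59,049 profiles. The real diversity would be lower if allelic choices are correlated between genes or if not every state occurs for every gene.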

The first autosomal genes discovered to be subject to random monoallelic expression were the genes encoding antigen receptors on B and T lymphocytes: immunoglobulins and T-cell receptors, respectively (Pernis et al, 1965). Allelic exclusion was discovered for immunoglobulins in the 1960s by B Pernis and co-workers (Pernis et al, 1965), before the isolation of the immunoglobulin genes. Allelic exclusion was subsequently shown to be due to DNA rearrangement, first for immunoglobulin genes (Hozumi and Tonegawa, 1976) (for which S Tonegawa won the Nobel Prize) and later for T-cell receptor genes. The immunoglobulin and T-cell receptor genes remain the only known mammalian genes subject to DNA rearrangement.

Until the mid-1990s, it was thought that the immunoglobulin and T-cell receptor genes were special cases and allelic exclusion of other genes was not entertained as a mechanism. Then, in 1994, a report of random monoallelic expression of odorant receptor genes, a family with over 1000 members (in the mouse genome), led to the idea that autosomal random monoallelic expression may affect additional autosomal genes (Chess et al, 1994). Over the ensuing decade or so, the class of autosomal randomly monoallelically expressed genes was expanded with the addition of a handful of examples that generally are involved in chemosensory or immune system functions. Genes found to be subject to random monoallelic expression included pheromone receptor genes (which are similar to odorant receptor genes), interleukin genes, and genes encoding receptors on natural killer cells (Bix and Locksley, 1998; Held et al, 1995; Hollander et al, 1998; Rhoades et al, 2000; Riviere et al, 1998; Rodriguez et al, 1999).

Recent improvements in technology have led to an appreciation of the widespread nature of random monoallelic expression across the genome—beyond chemosensory and immune system genes. Around the turn of the century, the initial sequencing of the human genome was complete. The ensuing extensive characterization of common human polymorphisms, along with the development of arrays capable of determining the genotype at hundreds of thousands of loci in a single experiment, set the stage for genome-scale analyses of random monoallelic expression. In 2007, a genome-wide survey that analyzed clonal human cell lines (as well as freshly isolated tissues) revealed a surprisingly large extent of random autosomal monoallelic expression: upwards of 5% beyond the few percent of the genome encoding odorant receptors (Gimelbrant et al, 2007). See Box 1 for outline of the genome-wide survey.
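As a rough illustration of how allele-specific calls might be made from per-SNP measurements in clonal cell lines, the following Python sketch classifies a heterozygous SNP from the signal attributed to each allele. The function names, the form of the inputs, and the 0.1/0.9 and minimum-signal thresholds are assumptions made for this example; they are not the procedure or cutoffs used by Gimelbrant et al (2007).

```python
def classify_allelic_expression(maternal, paternal, low=0.1, high=0.9, min_total=20):
    """Classify expression at a heterozygous SNP in one clonal cell line.

    maternal, paternal: transcript-derived signal (e.g. counts) per allele.
    low, high, min_total: hypothetical thresholds chosen for illustration.
    """
    total = maternal + paternal
    if total < min_total:
        return "uninformative"  # too little signal to make a call
    maternal_fraction = maternal / total
    if maternal_fraction >= high:
        return "monoallelic-maternal"
    if maternal_fraction <= low:
        return "monoallelic-paternal"
    return "biallelic"


def is_random_monoallelic(calls):
    """Flag a gene as a candidate for random monoallelic expression.

    calls: classifications of the same gene in independent clones from one
    individual. The gene is flagged when at least one clone is monoallelic
    and the clones do not all agree, which separates it from imprinting or
    a cis-acting defect (where every clone would show the same allele).
    """
    informative = {c for c in calls if c != "uninformative"}
    has_mono = any(c.startswith("monoallelic") for c in informative)
    return has_mono and len(informative) > 1
```

Comparing clones from the same individual is the key design choice in this sketch: a single clone showing monoallelic expression cannot distinguish a random choice from a fixed one.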

While the earlier-known autosomal genes subject to random monoallelic expression were for the most part involved in the immune system or chemosensory systems, the genes uncovered in the genomic survey are widely distributed across functionalities. Also, most gene ontology categories are represented, indicating that these genes have wide-ranging functions. In addition to variation in the types of proteins they encode, they also vary in their respective expression patterns and levels of expression. Some randomly monoallelically expressed genes are abundantly expressed, while others are expressed at low levels. Unsurprisingly, tumor suppressor genes are not randomly monoallelically expressed.

Even though the functions of genes subject to random monoallelic expression are widely varied, there is an excess of genes encoding cell surface molecules. When the Gene Ontology (GO) category ‘transmembrane receptor’ is examined using GOstat, the expected value for the monoallelically expressed genes in the gene list from Gimelbrant et al (2007) would have been ∼3%; however, the observed value was much higher: 8.8%. This over-representation of cell surface molecules suggests that random monoallelic expression may be involved in generating unique cellular identity for individual cells. Such cells might otherwise have been indistinguishable, given their shared developmental histories, locations, and gene expression programs.
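The size of this over-representation can be put in perspective with a short Python sketch that uses the ∼3% expected and 8.8% observed values quoted above. The gene-list size of 500 is a hypothetical number chosen only to make the arithmetic concrete, and the exact binomial tail is a stand-in for whatever statistic GOstat actually computes.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


expected_fraction = 0.03   # approximate expected value quoted in the text
observed_fraction = 0.088  # observed value quoted in the text
fold_enrichment = observed_fraction / expected_fraction

# Hypothetical list size, for illustration only; not taken from the study.
n_genes = 500
k_observed = round(observed_fraction * n_genes)
p_value = binomial_upper_tail(k_observed, n_genes, expected_fraction)

print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"P(X >= {k_observed} of {n_genes}) = {p_value:.2e}")
```

The fold enrichment (roughly threefold) follows directly from the two percentages; the tail probability depends entirely on the assumed list size and should not be read as a result from the study.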

Mechanisms

The exploration of potential mechanisms regulating autosomal randomly monoallelically expressed genes has largely been restricted to characterizing expression from individual loci, but such explorations are beginning to include genome-scale studies. Current questions include the extent to which various known mechanisms of regulating gene expression, including epigenetic marks and non-coding RNAs, impact allele-specific expression.

Autosomal randomly monoallelically expressed genes share with X-linked genes and imprinted genes the unusual property of asynchronous replication. Whereas most genes have both alleles replicated relatively synchronously at a defined portion of the S phase, the monoallelically expressed genes have one allele replicating before the other (Chess et al, 1994; Kitsberg et al, 1993; Morishima et al, 1962). An important consideration is that the asynchronous replication reflects differential epigenetic marking of the two alleles, and that the differential marking can regulate gene expression even when the cells become postmitotic, as is the case for neurons.

For the randomly monoallelically expressed genes, the asynchronous replication is coordinated along the entire length of the chromosome, rendering the alleles of all the randomly monoallelically expressed genes, scattered across a chosen chromosome, earlier replicating than the alleles on the homologous chromosome (Ensminger and Chess, 2004; Singh et al, 2003). This coordination between autosome pairs is present even though the randomly monoallelically expressed genes are interspersed among synchronously replicating biallelically expressed genes. Chromosome-scale coordination of replication asynchrony is also observed for the X chromosome. As is the case with X inactivation, the choice occurs early in development (Simon et al, 1999) and then is inherited in a stable manner by daughter cells. Indeed, the asynchronous replication is established before the activation of transcription of either allele (for most genes) and is present in all cell types irrespective of whether or not the gene has been or will ever be expressed from a particular cell lineage. Another important point to note about the asynchronous replication is that the early replicating allele is not necessarily the expressed allele (Gimelbrant et al, 2007). Rather, the asynchrony of replication is a characteristic of genes that are subject to random monoallelic expression and it remains to be determined what epigenetic marks are present on early- vs late-replicating alleles. This is being approached presently by genome-scale epigenetic analyses. A noteworthy exception where asynchronous replication appears to preferentially mark the early-replicating allele for activation is immunoglobulin gene rearrangement (Mostoslavsky et al, 2001).

The extraordinary diversity of neuronal function has spurred much interest over the years in the idea that DNA rearrangement (which generates an inordinately large repertoire of immunoglobulins and T-cell receptors) could play a role in neural diversity. Much of this interest has focused on genes that are randomly monoallelically expressed. An early paper following on the discovery of RAG1 and RAG2, the genes encoding the recombinase for immunoglobulin and T-cell receptor genes, suggested brain expression of RAG1 (Chun et al, 1991). Thereafter, a widely discussed paper claimed to have found evidence of brain-specific DNA rearrangement through the use of transgenes in which the lacZ reporter gene was under the control of a promoter whose orientation would have to be flipped by VDJ recombination to be active (Matsuoka et al, 1991). The following year, this finding was shown to be an artifact (Abeliovich et al, 1992) by the laboratory of S Tonegawa, who had been the initial discoverer of DNA rearrangement in lymphocytes.

The possibility that odorant receptor genes use a DNA rearrangement mechanism was appealing because each neuron expresses only one allele of 1 of >1000 receptor genes. This idea was ruled out by experiments that involved creating cloned mice from single olfactory neurons and demonstrating that the cloned mice each had a full repertoire of odorant receptor gene expression (Eggan et al, 2004; Li et al, 2004).

The possibility of DNA rearrangement was also suggested by initial analyses of the protocadherin locus, where it was shown that different protocadherins had a shared 3′ end but very different 5′ ends (similar to what is seen in immunoglobulin genes and T-cell receptor genes) (Wu and Maniatis, 1999). However, despite furious efforts to detect rearrangement in a number of top molecular biology laboratories, no rearrangement was found. The patterns of expression of the mouse gene also argued against rearrangement (Tang et al, 1998). Subsequently, it was established that the mechanism involves alternative promoter usage followed by splicing to the common 3′ end (Tasic et al, 2002). The protocadherin genes are subject to random monoallelic expression in that different 5′ variable regions are expressed from the maternal and paternal alleles in a given cell (Chess, 2005; Esumi et al, 2005).

Feedback Loops

If expression from one allele leads to repression of the other allele, this represents an appealing possible mechanism for maintenance of random monoallelic expression. In the case of allelic exclusion of the immunoglobulin and T-cell receptor genes, once a DNA rearrangement has occurred and is functional, feedback prevents rearrangements on the other allele. Feedback is thus a potential mechanism for any randomly monoallelically expressed gene.

A number of studies support the idea that negative feedback (mediated through the coding region sequences of the odorant receptor genes) plays a role in odorant receptor gene regulation (Fleischmann et al, 2008; Nguyen et al, 2007, 2010). The functional consequences of early mis-expression of an odorant receptor for the organization of the olfactory neurons have also been explored in studies that further supported the negative feedback model (Fleischmann et al, 2008; Nguyen et al, 2010). It is also interesting to note that the signal-transduction pathway that mediates odorant responsiveness does not appear to be required for the odorant receptor negative feedback, as odorant receptor gene regulation appears unperturbed in a mouse in which the key cyclic-nucleotide-gated channel involved in signal transduction has been knocked out (Brunet et al, 1996). A number of other transgenic experiments in the olfactory system can also be interpreted in the context of a model that includes negative feedback (Ebrahimi and Chess, 2000; Ebrahimi et al, 2000; Fleischmann et al, 2008; Lewcock and Reed, 2004; Nguyen et al, 2007; Qasba and Reed, 1998; Serizawa et al, 2000, 2003; Shykind et al, 2004).

Stability of Monoallelic Expression

Both for X inactivation and for autosomal genes subject to random monoallelic expression, if switching were a prominent component, then clonal cell lines would not have revealed monoallelic expression (Gimelbrant et al, 2007). In some cases, clonal cell lines have been carried in continuous cell culture for over a year and still maintained allele-specific expression (Gimelbrant et al, 2005). An in vivo demonstration of mitotic stability comes from experiments performed on small patches of tissue from the human placenta (Gimelbrant et al, 2007). Given the observed stability of allele-specific expression in a given clone (Gimelbrant et al, 2005, 2007), in vivo clonal cell expansion can lead to growth of patches of tissue with distinct allele-specific expression patterns; the size of the patches would be expected to depend on the stage in development at which the allelic choice is made for each developing tissue.
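The dependence of patch size on the timing of allelic choice can be sketched with simple arithmetic. The Python example below assumes an idealized lineage in which every cell divides and none die or migrate after the choice is made; the function name and the division counts are hypothetical and chosen only to show the scale of the effect.

```python
def clone_size(divisions_after_choice):
    """Number of descendants of one progenitor after the given number of
    divisions, assuming every cell divides and none die (an idealization)."""
    return 2 ** divisions_after_choice


# Hypothetical division counts: an early choice leaves many divisions to go,
# a late choice leaves few.
for stage, divisions in [("early choice", 20), ("late choice", 5)]:
    print(f"{stage}: about {clone_size(divisions):,} cells per clonal patch")
```

Under these assumptions, a choice made 20 divisions before the tissue is complete gives patches of about a million cells, while a choice made 5 divisions before completion gives patches of 32 cells. Cell death, unequal proliferation, and cell mixing would all change the real numbers.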

Functional Consequences

With the large number of genes involved, autosomal random monoallelic expression can contribute to phenotypic differences among individual cells, and in turn among individuals within a species. One clear example of the functional importance of random monoallelic expression is the DNA rearrangement mechanism of generating antigen receptor diversity. Proper functioning of the immune system generally requires that there be a single antigen receptor per cell to avoid individual lymphocytes having dual specificities.

Another clear example where random monoallelic expression is useful is in the olfactory system. The odorant receptor both dictates the olfactory sensitivity of the neuron choosing it and is involved in axon guidance. If an individual neuron were to express both alleles of a functionally heterozygous odorant receptor gene, this could lead to confusion in the neuronal wiring.

For other genes, random monoallelic expression is still a mechanism for generating cellular diversity, even if such diversity is not as extensive as that mediated by DNA rearrangement. When there is functional heterozygosity, the chance to generate diversity by having independent expression of the two alleles is readily apparent. Moreover, even in the absence of heterozygosity, the ability of cells to express either one or two alleles can lead to differences in levels of expression that can also contribute to cellular diversity. Studies of the expression patterns of randomly monoallelically expressed genes in relevant tissues (eg, neuronal genes in the brain) will shed light on the role of random monoallelic expression in a variety of tissue contexts.

FUTURE RESEARCH DIRECTIONS

All different types of monoallelic expression have the potential to impact genotype–phenotype correlation. Therefore, these mechanisms are important to consider when studying diseases, including diseases of the nervous system.

The expansion in the number of known randomly monoallelically expressed autosomal genes is remarkable and has the potential to impact our understanding of normal nervous system function as well as neuropathology. Instead of being mostly restricted to the immune system and chemosensory systems, it is now apparent that random monoallelic expression impacts a wide variety of genes. The known mechanisms involved in establishing and maintaining random monoallelic expression are diverse, and it will be interesting to see the extent to which different mechanisms, as they are uncovered, will provide unifying concepts beyond the asynchronous replication that appears to be associated with diverse genes that have very different overall gene-regulatory mechanisms. Studies of the potential impact of random monoallelic expression on genotype–phenotype correlation will also be an important area to watch. Finally, as mechanisms and the impact on phenotype are understood, it will be of interest to consider the potential effect of random monoallelic expression on the establishment and continued evolution of gene families.