Introduction

Among psychiatric disorders, schizophrenia is one of the most common and debilitating illnesses. While epidemiological studies have established inheritance as an etiological factor, its molecular etiology remains enigmatic despite extensive research (Baron 2001; Owen et al. 2000). Studies that suggested linkage or association of particular loci with schizophrenia have been followed by conflicting results (Brzustowicz et al. 2000; Levinson et al. 2002; Sklar et al. 2001; Wei and Hemmings 2000). This somewhat frustrating observation led us to an alternative explanation: the genes implicated so far may merely be effector genes whose activity is incidentally or stochastically modulated by other factors. Those factors could be environmental or genetic, for instance viral infection, stress, and so-called modifier genes. DNA methylation could be one of the best candidates among factors that affect the expression of the effector genes in various ways (Morgan et al. 1999; Petronis 2001; Razin 1998; Wolffe 2000).

Mammalian genomes show dynamic changes of DNA methylation patterns during early embryogenesis, including genome-wide demethylation followed by global de novo methylation (Hsieh 2000; Razin and Shemer 1995). Although DNA methylation has not been considered to be transmitted from parents to offspring, several lines of evidence suggest the inheritance of methylation (Morgan et al. 1999; Rakyan et al. 2002; Roemer et al. 1997; Sutherland et al. 2000). Once established, the pattern of genomic methylation is stably maintained through cell generations. Nevertheless, various factors can affect DNA methylation. Methylation levels decline with age (Drinkwater et al. 1989; Wilson et al. 1987). Various hormones have an impact on gene expression and the local pattern of gene methylation (Thomassin et al. 2001; Yokomori et al. 1995). Hypercortisolism has been reported in schizophrenia and is potentially related to stress (Elman et al. 1998; Lammers et al. 1995).

One of the major functions of DNA methylation is the silencing of transposable elements (Miura et al. 2001; O’Neill et al. 1998; Yoder et al. 1997). The major constituents of transposable elements in mammalian genomes are retrotransposons, including endogenous retroviruses (Smit 1999). The normally transcriptionally silent mouse endogenous retrovirus IAP was markedly activated in the DNA methyltransferase 1 (Dnmt1)-knockout mouse (Walsh et al. 1998). Human genomes also contain numerous and diverse human endogenous retroviruses (HERVs). HERVs have long terminal repeat (LTR) sequences at both ends that can activate the transcription not only of their own genes but also of unrelated genes at a distance (Kowalski et al. 1999). Most HERVs are defective, and their activity is usually strictly suppressed, possibly by methylation-mediated epigenetic regulation systems (Walsh et al. 1998; Yoder et al. 1997). However, transcriptional activation of HERVs has often been observed in various tumors, and retroviral sequences have been isolated from patients with type-1 diabetes, multiple sclerosis, and schizophrenia (Conrad et al. 1997; Karlsson et al. 2001; Perron et al. 1997). In addition, HERV expression can be induced by environmental stimuli (Stauffer et al. 2001; Sutkowski et al. 2001). If individuals whose genomes are suboptimally methylated, owing to polymorphisms of multiple genes encoding proteins that establish or maintain methylation, are exposed to such stimuli, dormant HERVs could be activated and interfere with the normal regulation of genes essential to brain functions and neurodevelopment. Highly controlled brain functions might be disturbed by dysregulation of those genes.

In the present study, we aimed to identify dormant HERVs that retain enhancer/promoter activity in the brain and that may exert hazardous effects when susceptible individuals are exposed to intolerable stimuli.

Materials and methods

RNA

RNA was prepared from Tera 1 and NTera 2 teratocarcinoma cell lines using an ISOGEN RNA extraction kit (NIPPON GENE, Tokyo, Japan) or from a fetal brain using the standard guanidine/CsCl method.

EST database screening

With reference to Tristem’s work (Tristem 2000), we retrieved several HERV sequences (HERV-H, E, F, W, and HML6 families) from the GenBank database and used them as queries for BLAST searches of the EST database. Among the ESTs detected and displayed on screen, we collected 40–267 ESTs from each family. These ESTs were aligned automatically (Higgins, default conditions) and manually using DNASIS software (Takara Bio Inc., Shiga, Japan) and were classified into groups with identical sequences. Because EST sequences are inaccurate, it was often hard to obtain 100% homology between ESTs expected to derive from the same locus. In principle, we considered ESTs to originate from the same locus when they showed more than 98% homology with each other or when BLAST searches of the ‘nr’ and/or ‘htgs’ databases pointed to the same most likely locus. Thus, each group is taken to contain ESTs expressed from one particular HERV locus, and the number of ESTs in a group was expected to indicate the expression level of that locus. We then selected HERV loci with multiple ESTs derived from cDNA libraries of brain-related tumors or normal neuronal tissues in order to examine their transcriptional states in a fetal brain by RT-PCR and sequencing analyses. The EST data used for the compilation were collected between November 2000 and May 2002.
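The grouping criterion described above (more than 98% homology between ESTs) can be illustrated with a minimal Python sketch. It is not the DNASIS procedure used in this study: it assumes the sequences have already been aligned to a common coordinate system, and the function names `percent_identity` and `group_ests` are hypothetical. ESTs are joined by single linkage, so an EST enters a group as soon as it exceeds the threshold with any member.

```python
from itertools import combinations
from typing import Dict, List, Set


def percent_identity(a: str, b: str) -> float:
    """Percent identity over the shared length of two pre-aligned sequences.

    Assumption: both strings are already aligned; gap characters ('-')
    are counted as mismatches.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / n


def group_ests(ests: Dict[str, str], threshold: float = 98.0) -> List[Set[str]]:
    """Group EST names by single linkage at identity > threshold (percent)."""
    parent = {name: name for name in ests}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ests, 2):
        if percent_identity(ests[a], ests[b]) > threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in ests:
        groups.setdefault(find(name), set()).add(name)
    # Largest groups first: group size serves as a rough expression proxy.
    return sorted(groups.values(), key=len, reverse=True)
```

Calling `group_ests` on a dictionary that maps EST identifiers to aligned sequences returns the groups ordered by size. The second criterion in the text (agreement of the most likely locus in BLAST searches) is not modeled here.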

Reverse transcription (RT) and RT-PCR

Prior to the reverse transcription reaction, total RNA (50 µg) was treated with 60 U RNase-free DNase I (Roche Diagnostics, Mannheim, Germany) for 40 min at 37°C in order to remove traces of genomic DNA contaminating the RNA samples. Complementary DNA (cDNA) was synthesized in a 20-µl reaction volume from 3 µg (Tera 1 and NTera 2) or 20 µg (fetal brain) of total RNA, as previously described (Jinno et al. 1995). A parallel reaction was carried out in the absence of reverse transcriptase to monitor amplification from residual DNA. One microliter of the undiluted or 4- to 8-fold diluted RT products was subjected to PCR in 10 µl of the standard reaction buffer. Following denaturation for 3 min at 94°C, 18–38 cycles of PCR were carried out, each consisting of denaturation for 25 s at 94°C, annealing for 25 s at 50°C (HML6), 53°C (HERV-F), or 55°C (HERV-H and HERV-K), and extension for 25 s at 72°C. PCR of genomic DNA was carried out under the same thermal cycler conditions as RT-PCR. We designed primers within the range of the EST sequences of each group, taking care to amplify as many homologous HERV loci as possible evenly. The primer sequences used are as follows:

(HERV-K)

P1, 5’-CTCTAGGGTGAAGKTACGCTCGA

P2, 5’-CCCCAGAGAGTATGCTGCTTGC

P3, 5’-AGAGCAYGGGRTTGGRGGTAA

(HERV-H)

LTR-H1S, 5’-GACCCCACTGRAAATYGGACTG

LTR-H1A, 5’-TCCACTGTGAGAGTTACYYGAAGCT

(HERV-H)

H-gagS, 5’-TCTCTGKGCTTGCCTCCTTYACTA

H-gagA, 5’-TGGGGTTGGKACTGAGGGGAC

(HERV-F)

LTR-F1S, 5’-AACATCAGCCTAATGGCTMATGTCAG

LTR-F1A, 5’-GAATAAACCTGAGAGGGKCTTCTKG

(HML6)

HML6–1S, 5’-TGTGTGACCATGGAAYRGGAGAC

HML6–1A, 5’-CCCATGTTATGGGGKTKGATGTC

HML6–2S, 5’-CCCTGGTCTCCTGCAGTACCCT

HML6–2A, 5’-AGCAAAGATSACATGCTTCTGARGAA
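Several of the primers listed above contain IUPAC ambiguity codes (K, Y, R, M, S), so each one is a mixture of oligonucleotides. As an illustration only, the following Python sketch enumerates the sequences a degenerate primer stands for and counts them. The helper names `degeneracy` and `expand` are hypothetical and were not part of the original methods.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def expand(primer: str) -> List[str]:
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]
```

For example, P1 contains a single K (G or T) and therefore represents two sequences, whereas P3 contains one Y and two R positions and represents eight.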

Identification of potentially active HERV loci

The rationale of this method has been described elsewhere (Sugimoto et al. 2001). RT-PCR products were cloned into the pGEM-T vector (Promega, Madison, WI, USA), and 20 or 30 clones from each were sequenced and classified into groups with identical sequences. We considered a putative locus to be potentially active when the same sequence was represented in more than three of the 20 or 30 clones analyzed for each HERV family. The corresponding loci were subsequently ascertained by BLAST search using the sequence of an RT-PCR clone as a query. To avoid biased amplification, we confirmed random amplification of genomic loci by analyzing the sequences of 9–12 genomic PCR clones for every primer set except the HERV-K primers before performing RT-PCR. Because the HERV-K primer sets span long introns, genomic PCR was omitted for them. Sequences were determined on both strands using an ABI Prism 310 (Applied Biosystems Japan, Tokyo, Japan).
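The clone-counting criterion stated above (the same sequence in more than three of the 20 or 30 clones) can be written as a short Python sketch. It is illustrative rather than the procedure actually used: the function name `potentially_active` is hypothetical, and identical sequences are assumed to have been reduced to identical strings beforehand.

```python
from collections import Counter
from typing import Dict, List, Tuple


def potentially_active(
    clones: List[str], min_clones: int = 3
) -> Dict[str, Tuple[int, float]]:
    """Return sequences represented in more than `min_clones` clones.

    Each value is (clone count, percentage of all clones analyzed).
    """
    counts = Counter(clones)
    total = len(clones)
    return {
        seq: (n, 100.0 * n / total)
        for seq, n in counts.items()
        if n > min_clones  # strict: "more than three clones"
    }
```

With 20 clones in which one sequence occurs nine times, a second four times, and a third three times, only the first two pass the threshold, at 45% and 20% of the clones respectively.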

Results

dbEST screening

We previously developed a simple method to identify potentially transcriptionally active HERV-K loci, even when their expression levels are far below the detection limit of Northern blot hybridization (Sugimoto et al. 2001). Before applying this method, we performed EST database screening as described above in order to efficiently identify HERV loci with transcriptional potential in the brain. A total of 600 HERV-ESTs were collected from the screening for five HERV families. Among them, 111 were derived from neuronal tissues or cell lines. As partly shown in Table 1, however, tumor- or tumor-cell-line-derived HERV-ESTs predominated, and only 18 HERV-ESTs were prepared from normal neuronal tissues when 16 primitive neuroectoderm-derived ESTs were excluded. Therefore, taking advantage of teratocarcinoma cell lines, we examined whether information obtained from tumors is useful for identifying HERVs that retain transcriptional activity in normal brains.

Table 1 Identification of human endogenous retrovirus (HERV) loci with transcriptional potential in the brain. Acc. no. GenBank accession number, EST expressed sequence tag, NA not applicable

Activated HERV-K loci in teratocarcinoma cell lines

In the previous study, we examined HERV-K expression in the teratocarcinoma cell line PA-1, established from a female germline cell tumor (Sugimoto et al. 2001). We extended that study using the teratocarcinoma cell lines Tera 1 and NTera 2, established from male germline cell tumors. In teratocarcinoma cell lines, several splice variants are abundantly expressed, and the major transcripts are 1.5 kb and 1.8 kb in size (Löwer et al. 1993). HERV-K genomes can be classified as type 1 or type 2, depending on whether 292 bp at the pol-env boundary are deleted or present (Barbulescu et al. 1999; Löwer et al. 1993; Ono et al. 1986). The size difference between the 1.5-kb and 1.8-kb mRNAs results from this 292-bp sequence. We first tried to amplify both transcripts simultaneously using a primer set designed from the region immediately downstream of the 5’ LTR and from the 3’ LTR region (P1 and P3 in Fig. 1). The RT-PCR yielded the expected 325-bp product from the 1.5-kb mRNA, while multiple bands appeared around the expected 601-bp product from the 1.8-kb mRNA. Instead of the 601-bp product, a 289-bp product from the 1.8-kb mRNA was obtained in a separate PCR using a type-2-specific primer (P2 in Fig. 1) and the same 3’ LTR primer. Sequencing analyses of the RT-PCR products revealed that HERV-K101 and HERV-K102 were exclusively represented in clones with the 325-bp insert from the Tera 1 and NTera 2 cell lines. HERV-K108 was the sole major locus represented among the type-2 clones from both cell lines (Table 2). Thus, we confirmed the previous finding that only a few particular loci are activated in teratocarcinoma cell lines among numerous highly homologous HERV-K loci.

Fig. 1

Structures of the typical human endogenous retrovirus (HERV)-K genome, cDNAs from the GenBank database, and RT-PCR products. The HERV-K provirus is drawn below the size-indicating line. Open boxes at both ends denote the 5’ and 3’ long terminal repeats (LTRs). A 292-bp sequence indicated by a bracket is deleted in the type-1 HERV-K genome. The next three rows of open boxes are cDNAs from the GenBank database and indicate the location of exons. X82271 and BC011367 are drawn to indicate the splicing patterns of the 1.8-kb and 1.5-kb mRNAs, respectively. Exons of the RT-PCR products are shown as filled boxes. Arrows labeled P1, P2, and P3 indicate the locations of the primers used for the RT-PCR. Sizes of the RT-PCR products are shown in parentheses

Table 2 Activated human endogenous retrovirus (HERV)-K loci in teratocarcinoma cell lines. Acc. no. GenBank accession number, NA not applicable

HERV-K101 was also activated in PA-1 cells in the previous study. On the other hand, HERV-K102 and HERV-K108 were activated only in the teratocarcinoma cell lines of male germline cell origin. HERV-K108 was also identified as a potentially active locus in the testis in the previous study (Sugimoto et al. 2001). Taken together, the activated HERV-K loci in teratocarcinoma cell lines seemed to fall into two categories: one common to all teratocarcinoma cell lines and the other dependent on their male or female origin. We interpreted these results to mean that some of the loci activated in tumors are potentially transcriptionally active in the original normal tissues, and that information obtained from tumors is therefore partly informative.

Identification of HERV loci retaining the transcriptional potential in the brain

Based on the above interpretation, we used HERV-ESTs obtained from brain tumors or tumor cell lines, as well as those from normal brain, for the grouping analysis. From the dbEST screening, we extracted five HERV loci that are expected to retain transcriptional activity in the brain: two distinct loci from the HERV-H family, two from the HERV-HML6 family, and one from the HERV-F family. In addition, we chose two HERV-K loci because of their activation in teratocarcinoma cell lines, which can differentiate into neuroepithelial cells upon retinoic acid induction (Table 1).

Their transcriptional potential was examined by RT-PCR using fetal brain RNA and by subsequent sequencing analysis of the amplified and cloned products. Degenerate primers were designed within the overlapping ESTs. The absence of amplification bias was confirmed by analyzing the sequences of genomic PCR clones. The absence of amplification from residual DNA in the RNA specimens was confirmed by monitoring the PCR products of the reverse transcriptase–minus RT reaction (Fig. 2). Among these seven loci, we identified three as transcriptionally active loci in the brain: HERV-K102, LTR-F1 (HERV-F), and HML6–1 (HERV-HML6). These were represented in 45% (or 90% when clones with a one-base mismatch were included), 20%, and 70% of the clones analyzed, respectively (Table 1). On the other hand, we could not confirm the transcriptional potential of the other four loci in a normal brain. Interestingly, an unexpected locus, HERV-H [HSN28H9], was repeatedly represented in three of 20 clones analyzed in the two independent PCRs that examined the two distinct expected loci of the HERV-H family, LTR-H1 and H-gag1 (Table 1). Although the frequency (the ratio of the number of clones with an identical sequence to the total number of clones analyzed) was low, coincident detection of the same locus may indicate that this locus retains transcriptional activity among the numerous defective proviruses of the largest HERV family (Tristem 2000). Close inspection of the four loci regarded as active revealed that LTR-F1 is a solitary LTR in which the transcription of a gene ends, and that the transcription of HML6–1 starts upstream of the 5’ LTR. Therefore, these two loci may not be considered transcriptionally active HERVs.

Fig. 2

Profiles of RT-PCR products. RT+ and RT- indicate the reverse transcription reaction with and without reverse transcriptase, respectively. Numbers on each lane indicate PCR cycles, and M shows a size marker. Sizes of the markers are indicated at the left side of each panel. RT-PCR products were electrophoresed in 1.5% agarose gels or 4% acrylamide gels

In summary, the present study identified one or possibly two HERV loci, HERV-K102 on chromosome 1q21-q22 and possibly the HERV-H locus on 22q12, as potentially transcriptionally active HERVs in the brain. These HERV loci could be candidates for the further analyses discussed below, as well as for methylation and expression analyses in normal and schizophrenic brains.

Discussion

Epidemiological studies have firmly established inheritance as an etiological factor in schizophrenia. Whereas powerful molecular technologies and the completion of the human genome sequencing project have resulted in the successful identification of genes for monogenic diseases and even for polygenic diseases such as type-1 diabetes, the molecular etiology of schizophrenia remains enigmatic despite extensive research (Baron 2001; Owen et al. 2000). Most molecular studies of inherited diseases are based on DNA sequence, while research on DNA methylation has demonstrated the involvement of epigenetic mechanisms in a wide range of normal and pathological cellular phenomena. The partial mitotic and meiotic stability of DNA methylation might better explain the features of schizophrenia and the current state of schizophrenia research. Genomic methylation levels decline with age, and age-dependent demethylation of repetitive sequences such as satellite DNAs and endogenous retroviruses has frequently been observed (Hornsby et al. 1992; Howlett et al. 1989; Ono et al. 1989; Suzuki et al. 2002). HERVs might play a role in the pathogenesis of schizophrenia as mediators linking methylation and the effector genes.

HERVs are one of the major classes of retrotransposons and constitute 8% of the human genome (International Human Genome Sequencing Consortium 2001). Most HERVs are defective remnants, so finding HERVs that retain transcriptional activity is a substantial task. We began with EST database searches for efficient identification of such HERVs. Most brain-related HERV-ESTs were derived from tumor tissues or tumor cell lines. Based on the findings of our preliminary experiments using teratocarcinoma cell lines, we used those HERV-ESTs and extracted five HERV loci for experimental confirmation. We finally identified two HERV loci, HERV-K102 on 1q21-q22 and HERV-H [HSN28H9] on 22q12, that are expected to retain enhancer/promoter activity in the brain. These two HERV loci were not extracted from the dbEST screening. Although it may be premature to evaluate the present strategy for finding HERVs with transcriptional potential, more efficient and comprehensive search strategies are urgently needed.

The two HERV loci, HERV-K102 and HERV-H [HSN28H9], overlap with or lie within the regions of the schizophrenia susceptibility loci SCZD9 (1q21-q22; OMIM 604906) and SCZD4 (22q11-q13; OMIM 600850). Inspection of the surrounding sequences provided interesting suggestions. Numerous characterized and uncharacterized genes are present near HERV-K102. The genes encoding rho/rac guanine nucleotide exchange factor 2 (ARHGEF2), RAS oncogene-related protein (RAB25), ephrin-A1 (EFNA1), ephrin-A3 (EFNA3), ephrin-A4 (EFNA4), and the uncharacterized G-protein coupled receptor-like gene (LOC128227) may be relevant to neuronal functions, neurodevelopment, and neuropsychiatric disorders (Skaper et al. 2001). Activation of HERV-K102 might affect the expression of these genes by acting as an enhancer. Also nearby is the RNA-specific adenosine deaminase gene (ADAR), which is involved in the RNA editing of glutamate receptors (GluR-B, GluR-C, and GluR-D) and the serotonin receptor 5-HT2C by targeting double-stranded regions of their mRNAs (Reenan 2001).

Of particular interest is the presence of genes encoding chromatin-interacting proteins: the ASH1 gene and the SET domain-bifurcated 1 gene (SETDB1). Drosophila Ash1 is an epigenetic activator with histone lysine methyltransferase activity (Beisel et al. 2002). The SET domain is a highly conserved amino acid motif found in chromatin remodeling proteins (Jenuwein 2001). HERV-H [HSN28H9] on chromosome 22q12 is located 4 kb downstream of the Synapsin III gene (SYN3) in the reverse orientation. Synapsin III may be implicated in synaptogenesis and the regulation of neurotransmitter release. Reduction of synapsins has been observed in the hippocampal tissue of patients with bipolar disorder and schizophrenia (Vawter et al. 2002). HERV-H [HSN28H9] could affect the mRNA level of SYN3 through its readthrough antisense RNA (Hannon 2002). Whether this mechanism operates can be tested, at least in vitro.

Conflicting results may be attributed to various factors, such as experimental designs and the involvement of multiple genes with modest effects. If DNA methylation is involved in the pathogenesis of schizophrenia, approaches other than sequence-based methods may be required. Furthermore, careful selection of the genes to be analyzed becomes all the more important because methylation analyses are at present more laborious than DNA typing. Our study presents two such candidate HERV loci and also suggests possible mechanisms that can be tested by in vitro experiments. HERVs were classified into at least 22 families in a recent study (Tristem 2000). The largest HERV family contains several hundred copies of the full-length proviral sequence. In addition, there are ten to tens of thousands of solitary LTRs in the human genome (Löwer et al. 1996). A considerable number of HERVs are therefore likely to exist that exert hazardous effects on highly controlled brain functions when the epigenetic gene silencing systems are compromised by various factors.