Original Article | Open Access | Published:

Genome-wide study of association and interaction with maternal cytomegalovirus infection suggests new schizophrenia loci

Molecular Psychiatry volume 19, pages 325–333 (2014)


Genetic and environmental components as well as their interaction contribute to the risk of schizophrenia, making it highly relevant to include environmental factors in genetic studies of schizophrenia. This study comprises genome-wide association (GWA) and follow-up analyses of all individuals born in Denmark since 1981 and diagnosed with schizophrenia as well as controls from the same birth cohort. Furthermore, we present the first genome-wide interaction survey of single nucleotide polymorphisms (SNPs) and maternal cytomegalovirus (CMV) infection. The GWA analysis included 888 cases and 882 controls, and the follow-up investigation of the top GWA results was performed in independent Danish (1396 cases and 1803 controls) and German-Dutch (1169 cases, 3714 controls) samples. The SNPs most strongly associated in the single-marker analysis of the combined Danish samples were rs4757144 in ARNTL (P=3.78 × 10−6) and rs8057927 in CDH13 (P=1.39 × 10−5). Both genes have previously been linked to schizophrenia or other psychiatric disorders. The strongest associated SNP in the combined analysis, including Danish and German-Dutch samples, was rs12922317 in RUNDC2A (P=9.04 × 10−7). A region-based analysis summarizing independent signals in segments of 100 kb identified a new region-based genome-wide significant locus overlapping the gene ZEB1 (P=7.0 × 10−7). This signal was replicated in the follow-up analysis (P=2.3 × 10−2). Significant interaction with maternal CMV infection was found for rs7902091 (PSNP × CMV=7.3 × 10−7) in CTNNA3, a gene not previously implicated in schizophrenia, stressing the importance of including environmental factors in genetic studies.


Schizophrenia is a severe life-long mental disorder, which affects approximately 1% of the population worldwide. Several studies have documented a strong genetic component in the etiology and the heritability is estimated to be around 80%.1 Besides the genetic component, environmental factors as well as gene–environment interactions are believed to contribute to the disease risk.2 Numerous linkage, candidate gene studies and genome-wide association (GWA) studies have been performed in order to elucidate the genetic architecture of the disease.3, 4, 5, 6, 7, 8 These studies have implicated several genes in disease risk but seldom unambiguously across different studies and populations. In the GWA studies, only a few loci have passed the generally accepted level of P<5 × 10−8 for genome-wide significance.4, 5, 9, 10, 11 From the GWA studies, it can be concluded that only moderate levels of association of common variants with schizophrenia can be expected, and recent results suggest that a high number of common susceptibility variants of small effect are involved, collectively capturing around 30% of the genetic risk.12, 13 The remaining genetic risk could involve de novo mutations, rare variants and gene–environment interactions.14, 15, 16, 17, 18 Due to the low-effect sizes of the common risk variants, the heritability that can be accounted for by those identified so far has been estimated to be between 1% and 2%.19, 20

It is well documented that the environment has an important role in the development of schizophrenia.21 Especially early in life the susceptibility to environmental risk factors may be increased, supported by several studies demonstrating an association of maternal infection with increased risk of the child developing schizophrenia later in life.22, 23, 24, 25 It has also been reported that interaction between genetic variation in the offspring and markers of maternal infection (maternal antibodies) may influence the risk of schizophrenia,26 stressing the importance of taking environmental factors into account in genetic studies.

The Danish population is, in general, considered ethnically homogeneous with only recent immigration of non-Caucasian individuals, which makes it suitable for genetic studies. In Denmark, all newborn babies (around 65,000 each year) are screened for metabolic diseases and since 1981 the surplus of the analyzed blood spot samples has been stored in the Danish Newborn Screening Biobank (DNSB).27 Coupling with information from the Danish Psychiatric Central Research Register28 allows for the unique opportunity to obtain DNA from all individuals who have been diagnosed with schizophrenia since 1981. Furthermore, as the DNSB samples are obtained from babies before they are able to produce their own antibodies and hence the antibodies in the blood reflect the mother’s antibodies, it is possible to investigate how maternal infections and their interactions with genetic variations in the offspring influence the risk of schizophrenia. Infection with cytomegalovirus (CMV), a neurotropic virus of the herpesvirus family, has been associated with schizophrenia in several studies, and interactions between selected genes and CMV have been reported.29, 30

Here we report the results of a GWA study and follow-up investigation of all Danish individuals born since 1981 and diagnosed with schizophrenia, including single variant as well as regional analyses. In addition, as the first genome-wide gene–environment interaction study in schizophrenia, we examine interaction between single nucleotide polymorphisms (SNPs) and maternal CMV infection (maternal anti-CMV immunoglobulin G (IgG) antibody titer).

Materials and methods

Study design and power calculation

A two-stage design was applied in this study. In stage 1, a GWA analysis of 888 cases and 882 controls was performed and, in stage 2, a follow-up analysis of the most strongly associated SNPs was performed on an independent sample consisting of 1396 cases and 1803 controls. Combined association analysis of both samples achieves a power of 80% to detect a disease allele with a frequency of 0.36 and odds ratio (OR) of 1.35 assuming prevalence of 0.01 at a significance level of 5 × 10−8.31 The SNPs most strongly associated with schizophrenia were analyzed further by combining stage 1 and stage 2 individuals with a German-Dutch sample, a sample genetically closely related to the Danish.32

Study subjects and phenotype definition

Stage 1: It was possible to identify the samples of interest based on the unique personal identification number (CPR-number), which is assigned to all live-born babies in Denmark. This number is stored in the Danish Civil Registration System (DCRS)33 and is used in all the contacts with the public sector. In this study, information from the DCRS was linked with the information stored in the nationwide Danish Psychiatric Central Register28 in order to identify all individuals born in 1981 and onwards that in 2006 had been diagnosed with schizophrenia according to ICD-10-DCR (The ICD-10 Classification of Mental and Behavioural Disorders Diagnostic Criteria for Research; F20). For each schizophrenia case, one matched control individual was randomly selected with the same gender, date of birth and age and with no history of schizophrenia on the date of first diagnosis of schizophrenia of the case. Using this procedure, 915 cases and 915 controls were identified and subsequently dried blood spots from the individuals were obtained from the DNSB.

Stage 2: All individuals born in 1981 or later and diagnosed with schizophrenia according to ICD-10-DCR, F20 between 2006 and 2010, together with matched controls, were identified as described above. In all, 1149 cases and 1303 controls were identified and subsequently blood spots were obtained from the DNSB. Furthermore, a sample of 247 schizophrenia cases fulfilling ICD-10 criteria and 500 controls was included as described previously.7, 26 The cases and controls were Danish Caucasians.

German-Dutch replication sample: A total of 1169 schizophrenia cases (464 German and 705 Dutch) and 3714 ethnically matched controls (1272 German and 2442 Dutch) were used in the replication analysis (details on criteria for inclusion can be found in Rietschel et al.5). Descriptive data for the three samples can be found in Supplementary Table S1. This study has been approved by the Danish Data Protection Agency and the local ethics committees in Denmark and abroad.

Genotyping and quality control (QC)

Stage 1: Sufficient biological material was available for 909 cases and 899 controls. DNA was extracted from the dried blood spots using Extract-N-Amp Blood PCR kit (Sigma Aldrich, Seelze, Germany) and subsequently whole genome-amplified in triplicates using the RepliG kit (Qiagen, Venlo, The Netherlands).34 The three separate reactions were pooled before genotyping, which was done using the Illumina Human 610-quad beadchip (San Diego, CA, USA). In all, 1774 individuals (892 cases, 882 controls) with gender in concordance with the register information were successfully genotyped with a call rate >0.97. Stringent QC was applied to data from samples with a call rate >0.97. The QC excluded SNPs with a call rate <0.99, SNPs with a deviation from Hardy–Weinberg equilibrium (P<0.0001 in controls) and a minor allele frequency (MAF) <0.0015. Furthermore, test for relatedness, estimation of individual heterozygosity and test for non-random missingness of SNPs between cases and controls were conducted (Supplementary Table S2). After QC, 1770 individuals (882 controls, 888 cases) and 541,148 SNPs were left for further analysis.

Stage 2: DNA from the 1149 cases and 1303 controls obtained from the DNSB was extracted and whole genome-amplified using the kits described above. DNA was isolated from blood samples from the additional 247 cases and 500 controls following standard procedures. In all, 193 follow-up SNPs (Supplementary Table S3) were genotyped as well as five SNPs on the sex-chromosomes, using the Sequenom MassARRAY genotyping platform (Sequenom, San Diego, CA, USA) following the protocol described in Nyegaard et al.7 It was checked that the estimated gender, based on the genotypic information, was in concordance with the gender given in the DCRS. In order to exclude the presence of identical samples, an identity by state analysis was performed using the software Graphical Relationship Representation.35 After QC, 3142 individuals (1370 cases, 1772 controls) with a call rate >0.8 were genotyped for 168 SNPs (Supplementary Table S3). The SNPs had a call rate >0.9, no significant deviation from Hardy–Weinberg equilibrium (P>0.0001 in controls) and a MAF>0.0015.

German-Dutch replication sample: The individuals were genotyped using the Illumina HumanHap550v3 BeadArray (Illumina). After QC genotypes for 475,427 SNPs were available in 1169 cases and 3714 controls (for details on genotyping and QC, see Rietschel et al.5).
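The stage 1 SNP filters described above (call rate, Hardy–Weinberg equilibrium and MAF) can be illustrated with a minimal Python sketch. This is not the PLINK procedure used in the study: the function name `snp_qc` and its arguments are hypothetical, the Hardy–Weinberg P-value uses a 1-df χ2 approximation (an assumption; an exact test may have been used in practice), and the genotype counts are assumed to come from the group in which the test is applied (controls, for the Hardy–Weinberg check).

```python
import math

def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_call_rate=0.99, min_maf=0.0015, hwe_alpha=1e-4):
    """Return (passes, call_rate, maf, hwe_p) for one SNP from genotype counts.

    Thresholds default to the stage 1 values quoted in the text.
    Hypothetical helper for illustration only.
    """
    n_called = n_aa + n_ab + n_bb
    call_rate = n_called / (n_called + n_missing)

    # Allele frequency of allele A and the minor allele frequency.
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)

    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1 - freq_a),
                n_called * (1 - freq_a) ** 2)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square distribution with 1 df.
    hwe_p = math.erfc(math.sqrt(chi2 / 2))

    passes = (call_rate >= min_call_rate
              and maf >= min_maf
              and hwe_p >= hwe_alpha)
    return passes, call_rate, maf, hwe_p
```

For example, `snp_qc(250, 500, 250, 0)` describes a SNP in perfect Hardy–Weinberg proportions with no missing calls and passes all three filters, whereas a SNP with no heterozygotes, such as `snp_qc(500, 0, 500, 0)`, fails the Hardy–Weinberg check.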

Antibody measurements

Measurements of type-specific IgG antibodies to CMV were obtained by enzyme immunoassay36 for a subsample of stage 1 individuals (488 cases (216 females and 272 males) and equally many controls). The blood spots stored in the DNSB were taken when the neonates were 2–7 days old. At that age a child has not yet produced any significant amount of IgG antibodies, but while in utero maternal IgG antibodies are transferred across the placenta to the fetus. Hence, the antibodies measured can be assumed to be mainly maternal.37 The measurements were dichotomized at 0.2 optical density units, yielding a prevalence consistent with those measured in European populations.38

Statistical analysis

All analyses were performed using the software PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/)39 unless otherwise stated.

Population stratification

In order to minimize the effect of spurious association originating from population stratification, association analysis was performed using logistic regression with principal component one as covariate, derived from principal component analysis40 (Supplementary Figure S2). This lowered the genomic inflation factor from 1.047 to 1.013. A deviation of λ from 1 can be expected under polygenic inheritance even when there is no population structure,41 so in order to avoid unnecessary correction only principal component one was used to correct for population stratification (more information is provided in the Supplementary Information).
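The genomic inflation factor λ mentioned above is conventionally computed as the median of the observed 1-df χ2 association statistics divided by the median of the χ2 distribution with 1 df (about 0.455). The sketch below assumes this conventional definition; the function name is hypothetical and the conversion from P-values to χ2 statistics assumes two-sided 1-df tests.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(p_values):
    """Estimate lambda from a list of two-sided 1-df association P-values.

    Hypothetical helper; each P-value must lie in (0, 1].
    """
    std_normal = NormalDist()
    # A two-sided P-value p corresponds to z = Phi^{-1}(p / 2), chi2 = z^2.
    chi2_stats = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

Under the null hypothesis the median P-value is 0.5, which gives λ close to 1; values above 1 indicate that the test statistics are larger than expected, whether from stratification or from polygenic signal.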

Single-marker association analysis

GWA analysis was performed using logistic regression with principal component one as a covariate applying an additive genetic model. The SNPs demonstrating the strongest association with schizophrenia in the GWA analysis (2843 SNPs with a P-value <0.005), were further evaluated in a meta-analysis by including data from a German-Dutch replication sample.5 The meta-analysis implemented in PLINK was used, and a fixed effect model was considered.

Two types of analysis were applied in order to identify follow-up SNPs for genotyping in the stage 2 sample: (1) based on the GWA analysis, a set of 100 SNPs were identified using a top-down approach, and (2) based on the meta-analysis, a set of 100 SNPs were identified using a top-down approach. Association with schizophrenia was analyzed in the stage 2 sample by logistic regression using an additive genetic model. A binomial sign test was performed in order to test for evidence of directionally consistent replication.
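A binomial sign test of directional consistency asks how likely it is, if effect directions in the follow-up sample were random (probability 0.5 each), to see at least the observed number of SNPs with the same direction of effect as in the discovery sample. The sketch below assumes a one-sided test, which is not stated in the text; the function name and the example counts are hypothetical.

```python
from math import comb

def sign_test_p(n_consistent, n_total):
    """One-sided P(X >= n_consistent) for X ~ Binomial(n_total, 0.5)."""
    tail = sum(comb(n_total, k) for k in range(n_consistent, n_total + 1))
    return tail / 2 ** n_total

# Hypothetical example: 8 of 10 SNPs share the discovery direction.
print(sign_test_p(8, 10))  # 0.0546875
```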

A meta-analysis (as described above) was used to test for association of follow-up SNPs with schizophrenia in two combined data sets: (1) the combined stage 1 and the stage 2 samples, and (2) the combined stage 1, stage 2 and German-Dutch replication sample (referred to as extended meta-analysis from now), which in total included 3453 cases and 6399 controls.
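A fixed-effect meta-analysis is commonly implemented by inverse-variance weighting of the per-study log odds ratios. The following sketch assumes that this standard formulation matches the PLINK analysis described above; the function name and the example inputs are hypothetical.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study log odds ratios; standard_errors: their standard errors.
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p

# Hypothetical example: two studies with identical estimates.
beta, se, p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])
print(math.exp(beta), se, p)
```

Combining two equally precise studies leaves the pooled estimate unchanged while shrinking its standard error by a factor of √2, which is why a SNP can gain significance in the combined analysis without its odds ratio changing.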

Region-wise association analysis

All chromosomes were divided into overlapping regions of 100 kb, each overlapping its neighboring regions by 50 kb. For each region, a combined P-value was calculated by Fisher’s method:42 $X = -2\sum_{i=1}^{k}\log_e(p_i)$, where $k$ is the number of SNPs in the region and $p_i$ is the P-value for each SNP calculated by a standard χ2 test, using an additive model. The P-value for each region was calculated by a permutation test shuffling the case–control status (see Supplementary Material for details).
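The region statistic and its empirical P-value can be sketched as follows. Under independence of the SNP tests, Fisher's statistic follows a χ2 distribution with 2k degrees of freedom; SNPs within a 100 kb window are typically in linkage disequilibrium, which is a reason to use permutation instead. All function names are hypothetical, and the permuted statistics are assumed to have been obtained by recomputing the region statistic after shuffling case–control labels; the "+1" correction in the empirical P-value is a common convention and not necessarily the one used in the study.

```python
import math

def fisher_statistic(p_values):
    """Fisher's combined statistic X = -2 * sum(ln p_i)."""
    return -2.0 * sum(math.log(p) for p in p_values)

def chi2_sf_even_df(x, k):
    """Survival function of a chi-square with 2k df (closed form).

    Valid reference distribution only if the k tests are independent.
    """
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total

def permutation_p(observed_stat, permuted_stats):
    """Empirical P-value: share of permuted statistics >= the observed one."""
    hits = sum(1 for s in permuted_stats if s >= observed_stat)
    return (hits + 1) / (len(permuted_stats) + 1)
```

With a single SNP in a region (k = 1) the analytical P-value equals the SNP's own P-value, which provides a quick sanity check of the formula.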

SNP × maternal CMV infection interaction analysis

The two-step method of Murcray et al.43 was applied in order to test whether genetic variation interacts with maternal CMV infection in influencing the risk of schizophrenia in the offspring. First, the full set of SNPs was screened for association with maternal CMV infection in the combined sample of cases and controls at a significance level of 0.05. Second, conditional logistic regression including an interaction term was performed on the m SNPs selected in step 1, using Stata 10.0 (StataCorp LP, College Station, TX, USA). Because the tests performed in steps 1 and 2 are (asymptotically) independent,43 Bonferroni correction for the m tests in step 2 preserves the family-wise error rate.
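The logic of the two-step procedure can be sketched as follows. The sketch takes precomputed P-values as input and only illustrates the screening and the Bonferroni step; it does not implement the conditional logistic regression itself. The function name, the dictionaries and the SNP identifiers in the example are hypothetical, and the family-wise α of 0.05 in step 2 is an assumption.

```python
def two_step_interaction(step1_p, step2_p, alpha_screen=0.05, alpha=0.05):
    """Two-step gene-environment interaction screen.

    step1_p: SNP -> P-value for association with the exposure (pooled sample).
    step2_p: SNP -> P-value of the SNP x exposure interaction term.
    Returns (selected SNPs, Bonferroni threshold, significant SNPs).
    """
    # Step 1: keep only SNPs nominally associated with the exposure.
    selected = [snp for snp, p in step1_p.items() if p < alpha_screen]
    if not selected:
        return [], None, []
    # Step 2: Bonferroni correction for the m selected SNPs only.
    threshold = alpha / len(selected)
    hits = [snp for snp in selected if step2_p[snp] < threshold]
    return selected, threshold, hits

# Hypothetical example with three SNPs.
step1 = {"snpA": 0.01, "snpB": 0.30, "snpC": 0.04}
step2 = {"snpA": 0.001, "snpB": 0.0001, "snpC": 0.2}
print(two_step_interaction(step1, step2))
```

In the example, snpB has the smallest interaction P-value but is never tested because it fails the screen; the gain is that the remaining SNPs face a correction for two tests rather than three.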

Results

Single-marker association analysis

In the GWA analysis 26,863 SNPs demonstrated an association with schizophrenia with a P-value <0.05 (Figure 1a). In all, 54 SNPs showed P-values <1 × 10−4, and the SNP demonstrating the strongest association with schizophrenia was rs2836518 (P=1.32 × 10−5), located on chromosome 21q22 in the intron of ERG (list of all SNPs with P-values <1 × 10−4 can be found in Supplementary Table S5).

Figure 1

(a) Manhattan plot of genome-wide association (GWA) analysis. The blue line indicates P=1 × 10−4. (b) Regional association plot of rs4757144 located in ARNTL. (c) Regional association plot of rs8057927 located in CDH13. (d) Regional association plot of rs12922317 located in RUNDC2A. The P-values in green are from the GWA analysis, P-values marked in blue are from the combined analysis of Danish individuals, P-values marked in purple are from the extended meta-analysis. The linkage disequilibrium (LD; r2) between the SNP in focus and its flanking markers genotyped in the GWA study are demonstrated in red (high LD) to white (low LD). The recombination rate is plotted in blue according to HapMap (CEU).

In the stage 2 sample, 165 SNPs were successfully genotyped, of which nine demonstrated a directionally consistent association at nominal significance (Table 1). At the experiment level, the associations of the SNPs genotyped in stage 2 demonstrated significant, directionally consistent evidence of replication (binomial sign test P<0.0096). The SNP showing the strongest association in the stage 2 sample was rs4757144 (P=0.0059), located in the intron of ARNTL on chromosome 11p15.

Table 1: P-values from association analysis of stage 1, stage 2 and German-Dutch samples and combined analysis of stage 1+stage 2 and stage 1+stage 2+German Dutch samples. The odds ratio (OR) is given and minor allele frequency (MAF) in cases and controls is also given

In the combined analysis of stage 1 and stage 2, four SNPs showed P-values <1 × 10−4. The SNP demonstrating the strongest association in this analysis was also the ARNTL SNP rs4757144 (P=3.78 × 10−6; Figure 1b). The other three markers were rs8057927 located in the intron of CDH13 on chromosome 16q23 (P=1.39 × 10−5; Figure 1c), rs2121783 located in the intron of FOXP1 on 3p13 (P=8.86 × 10−5) and rs3123688 on 10p11 located between ZEB1 and ZNF438 upstream transcription start for both genes (P=9.05 × 10−5; Table 1).

In the extended meta-analysis, including both the Danish and German-Dutch individuals, four SNPs demonstrated P-values <1 × 10−5: rs12922317 located in the intron of RUNDC2A on 16p13 (P=9.04 × 10−7; Figure 1d), rs8057927 located in the intron of CDH13 on 16q23 (P=1.20 × 10−6), rs6485671 located upstream of CREB3L1 on 11p11 (P=5.08 × 10−6) and rs4757144 in the intron of ARNTL (P=5.35 × 10−6; Table 1). Additional results of the combined analyses can be found in Supplementary Table S6.

Region-wise association analysis

In total, 55,561 overlapping regions were tested for association with schizophrenia (Figure 2), requiring a region-based genome-wide significance level of P=9.0 × 10−7 if Bonferroni correction is applied. One region at chromosome 10p11:31,566,070–31,666,070 (hg18) overlapping ZEB1 was genome-wide significant (P=7.0 × 10−7; Figure 2, Supplementary Table S7). Five genotyped SNPs were located in this region (rs1314004, rs7083727, rs1314013, rs12242798 and rs3123688). Two of the SNPs are in high linkage disequilibrium (rs3123688 and rs12242798, r2=0.642374). The five SNPs were genotyped in the stage 2 sample. However, rs12242798 failed genotyping and was therefore imputed using the software MaCH 1.0 (http://www.sph.umich.edu/csg/abecasis/MACH/)44 with HapMap phase III, Release #2, CEU, as reference population. The resulting genotypes were imputed with good quality (quality=0.99, Rsq=0.34). Significant region-wise association was found in the stage 2 sample (P=0.023), thereby establishing formal replication of this locus in the Danish population.
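As a worked check of the region-based threshold, and assuming a family-wise significance level of 0.05 (the conventional choice, not stated explicitly here), Bonferroni correction over the 55,561 tested regions gives:

```latex
\alpha_{\text{region}} = \frac{0.05}{55\,561} \approx 9.0 \times 10^{-7},
\qquad
P_{\text{ZEB1 region}} = 7.0 \times 10^{-7} < \alpha_{\text{region}}
```

This agrees with the significance line drawn at P=9.0 × 10−7 in Figure 2a.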

Figure 2

(a) Manhattan plot of region-wise association analysis. The blue line indicates genome-wide significance (P=9.0 × 10−7). (b) Regional association plot of rs3123688 located upstream of ZEB1. The P-values in green are from the genome-wide association (GWA) analysis and P-values marked in blue are from the combined analysis of Danish individuals. The vertical lines represent the region-wise P-values. The linkage disequilibrium (LD; r2) between the single nucleotide polymorphism in focus and its flanking markers genotyped in the GWA study are demonstrated in red (high LD) to white (low LD). The recombination rate is plotted in blue according to HapMap (CEU).

SNP × maternal CMV infection interaction analysis

Of the case mothers, 73.2% were CMV positive while that was the case for 70.7% of the control mothers, corresponding to an OR of 1.13 (0.85−1.50), P=0.39, for CMV with respect to schizophrenia in the offspring. A total of 29,082 SNPs passed step 1 inducing a Bonferroni significance level of P=1.72 × 10−6 at step 2. A single SNP, rs7902091 (MAF 0.16 in cases and 0.15 in controls) located in an intron of CTNNA3 on chromosome 10q21, demonstrated experiment-wide significant interaction with maternal CMV infection, with an interaction P-value of 7.3 × 10−7 and interaction OR of 5.3 under an additive genetic model (Figure 3, Supplementary Table S8). On its own, rs7902091 showed no association with schizophrenia (OR=1.04, P=0.67). For non-carriers of the minor allele, the risk of schizophrenia from maternal CMV was not observed (OR=0.72, P=0.11) whereas for carriers the risk increased to OR=5.0 (P=3.8 × 10−2). Furthermore, a neighboring SNP, rs7919083 located 2206 bp from rs7902091, demonstrated a relatively low interaction P-value (P=5 × 10−4) practically independent of rs7902091 (r2=0.08).
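The two headline numbers in this paragraph can be reproduced with simple arithmetic. The sketch below computes a crude (unmatched) odds ratio from the two seroprevalences and the Bonferroni threshold for the SNPs passing step 1; it assumes a family-wise α of 0.05, and the function names are hypothetical. The reported OR may have been estimated from a matched analysis, so agreement with the crude value is only a consistency check.

```python
def odds_ratio(p_exposed_cases, p_exposed_controls):
    """Crude odds ratio from exposure proportions in cases and controls."""
    odds_cases = p_exposed_cases / (1 - p_exposed_cases)
    odds_controls = p_exposed_controls / (1 - p_exposed_controls)
    return odds_cases / odds_controls

def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level preserving a family-wise alpha."""
    return alpha / n_tests

print(round(odds_ratio(0.732, 0.707), 2))    # 1.13
print(f"{bonferroni_threshold(29082):.2e}")  # 1.72e-06
```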

Figure 3

Regional association plot of rs7902091. P-values of interaction between single nucleotide polymorphisms (SNPs) and maternal cytomegalovirus infection. The linkage disequilibrium (LD; r2) between the SNP in focus and its flanking markers genotyped in the genome-wide association study are demonstrated in red (high LD) to white (low LD). The recombination rate is plotted in blue according to HapMap (CEU).

Discussion

Here we report the results of a GWA study of schizophrenia using cases from a complete Danish birth cohort and follow-up investigations in additional samples, applying single variant and regional analyses, the latter identifying a novel locus at ZEB1. Moreover, conducting the first genome-wide gene-environment interaction survey in psychiatric disorders, we report significant interaction between CTNNA3 and maternal CMV infection.

In the single variant analysis, none of the analyzed SNPs passed the widely accepted genome-wide significance threshold of P=5 × 10−8. However, the results highlighted a number of loci with the strongest signals located on 10p11, 11p15, 16q23 and 16p13 (Table 1). The SNP rs4757144, located in an intron of the circadian rhythm-associated gene ARNTL on 11p15, demonstrated the strongest association in the combined analysis of Danish stage 1 and stage 2 individuals and was the fourth most associated SNP in the extended meta-analysis. ARNTL is expressed in several regions of the human brain.45 Circadian-rhythm abnormalities in schizophrenia patients have been reported.46, 47, 48, 49 Several candidate gene studies have investigated the involvement of circadian genes in schizophrenia and other psychiatric disorders, with some suggesting the involvement of ARNTL in disease risk50, 51, 52 and others not.53 The SNP rs4757144 reported in this study is in high linkage disequilibrium (r2>0.8, HapMap release 22) with two SNPs previously reported to be associated with schizophrenia (rs1982350; ref. 51) and bipolar disorder (rs4757142, ref. 51; rs1982350, ref. 52). ARNTL was also one of the top four candidate genes associated with bipolar disorder identified by convergent functional genomics data mining of existing GWAS data sets.54

The second most associated SNP in both the combined analysis of the Danish samples and in the extended meta-analysis was rs8057927 in the intron of CDH13 on 16q23. CDH13 encodes cadherin-13, a member of the cadherin super family of molecules that mediates Ca2+-dependent cell–cell adhesion in solid tissue.55, 56, 57 CDH13 is expressed in several parts of the adult human brain58 and appears to have a negative role in neural cell proliferation in the developing nervous system.58, 59, 60 The implication of CDH13 in other psychiatric disorders has been suggested. GWA studies of attention deficit/hyperactivity disorder identified CDH13 as one of the most associated genes,61, 62, 63, 64 and a meta-analysis of attention deficit/hyperactivity disorder linkage scans identified the region with CDH13 as the only genome-wide significant.65 GWA studies have also indicated the involvement of CDH13 in depression66 and autism,67 and a recent study implicated CNVs encompassing CDH13 in autism susceptibility.68 The important role of cadherin-13 during brain development and in maintaining neural circuitry together with the reports of involvement of CDH13 in other psychiatric disorders therefore support our result, which, for the first time, suggests the involvement of CDH13 in schizophrenia (discussion of other top hits from the extended meta-analysis can be found in the Supplementary Material).

One SNP (rs10828623) out of the 10 most associated SNPs in the combined analysis of Danish stage 1 and stage 2 individuals was nominally significantly associated with schizophrenia in the data from the Psychiatric Genomics Consortium (PGC; P=0.002),11 resulting in a P-value=4.54 × 10−6 in the combined analysis of Danish stage 1 and 2 individuals and the PGC samples. Because the German-Dutch sample was included in the PGC data, only the Danish samples were combined with the PGC data; thus, there was no overlap between the discovery and follow-up samples. The limited replication could be due to genetic heterogeneity between the Danish and PGC samples, reducing the power to detect variants with small effects. The single-marker loci demonstrating the strongest association in this study may therefore be valid only for Danish, German and Dutch populations.

Applying a regional analysis, summarizing independent signals in relatively small segments of overlapping regions of 100 kb, we found region-based genome-wide significant association at a region on 10p11 containing ZEB1. The applied significance level was based on Bonferroni correction of the total number of analyzed regions, which is analogous to how the conventional GWAS threshold for single-SNP association of 5 × 10−8 is deduced but which in this case is conservative due to the regions being 50% overlapping and therefore far from independent. This approach was able to identify a novel risk locus even though it was performed using a small sample compared with recent GWA studies, indicating that aggregating P-values in this fashion can be a powerful approach. This is supported by the accumulating observations of independent association signals from closely positioned SNPs (see for example, Steinberg et al.4 and Ripke et al.11). Moreover, the region showed significant association in the Danish stage 2 sample, providing independent replication of this locus in the Danish population. No replication was attempted in the German-Dutch GWA data set or the PGC data as not all SNPs (or proxies) in this region were present in these data sets. ZEB1 encodes an E-box binding zinc finger transcription factor, which is widely expressed in the central nervous system and has an important role in development of the brain69 and neuronal differentiation.70 The associated region includes the promoter of ZEB1 and could therefore be involved in or linked to variants involved in regulation of expression. This is intriguing as it has been demonstrated that the expression of ZEB1 is regulated by another transcription factor protein, TCF4,71 which is one of the best validated schizophrenia susceptibility genes. 
Two independent SNPs in this gene have passed the threshold for genome-wide significant association,4, 5, 9 and several studies have found TCF4 SNPs demonstrating close to genome-wide significance.5, 11, 12 Notably, studies have also found strong evidence for the involvement of both TCF4 and ZEB1 genetic variants in another disorder, namely Fuchs’ corneal dystrophy.72, 73 Interestingly, ZEB1 is involved in regulation of cadherin-13 expression by physically binding to the E2-box in the promoter region of CDH13, decreasing the expression of the gene;74 however, this finding still needs to be confirmed in nerve cells. Our results together with previous findings could therefore indicate that ZEB1, TCF4 and cadherin-13 are elements of a common pathway involved in schizophrenia.

The interaction analysis of SNPs with maternal CMV infection found a significant interaction at CTNNA3, using an efficient two-step method where only SNPs passing step 1 were tested for interaction.43 Thus, only SNPs showing nominal association with CMV infection in the pooled sample of cases and controls were tested for interaction. This amounted to around 29,000 SNPs distributed across the genome. The interacting SNP was rs7902091 located in an intron of CTNNA3, just upstream the gene LRRTM3, which is nested within CTNNA3. CTNNA3 encodes catenin alpha-3, which is predominantly expressed in heart and testis but expression of the gene in the brain has also been demonstrated.75 Catenin alpha-3 mediates cell–cell adhesion by functioning as a link between cadherin-based cell–cell adhesion complexes and the cytoskeleton.76, 77 Biologically the interaction of rs7902091 in CTNNA3 with maternal CMV makes sense, because CMV during infection may disrupt cell-to-cell connections by disconnecting the cadherin–catenin–actin complex within endothelial cells,78 and in a study of human CMV in transgenic Drosophila, expression of the regulatory virus genes caused abnormal embryonic development by interfering with cell-to-cell adherens junctions through an effect on catenins.79 The interaction observed suggests that the region around rs7902091 in concert with maternal CMV infection may have a role in the etiology of schizophrenia. However, this should be replicated in additional studies. It is noteworthy, though, that neighboring SNPs (in particular rs7919083) showed a low interaction P-value independently of rs7902091, supporting the involvement of this locus. 
CTNNA3 and its nested gene LRRTM3 (encoding the leucine-rich repeat transmembrane neuronal protein 3) have both previously been associated with Alzheimer’s disease80, 81, 82, 83, 84 and with autism spectrum disorder.67, 85 In relation to Alzheimer’s disease, CTNNA3 has been observed to have a stronger effect in females than in males.82 We therefore performed a secondary analysis of gender differences in the interaction of CTNNA3 and CMV. The results are shown in the Supplementary Material (Supplementary Table S3).

As in any other observational study involving environmental factors, confounding cannot be ruled out. With the apparent risk from CMV being turned on and off by the presence or absence of the variant, any confounder of the CMV–schizophrenia association would also confound the interaction result. For instance, the prevalence of CMV infection has been reported to correlate with the prevalence of other infections, including other members of the herpesvirus family,86 and with socioeconomic status,87 both of which could be potential confounders. However, regardless of whether the CMV–schizophrenia association can be explained by confounding in part or completely, there is still interaction at the CTNNA3 locus identifying sub-populations of different risk profiles for schizophrenia.

We have reported the first GWA study and follow-up analysis of all the Danish individuals born since 1981 and diagnosed with schizophrenia up to 2010 and controls from the same birth cohort. Furthermore, we have followed up in an additional sample from a genetically related population. The results support the findings from other GWA studies, suggesting the involvement of many common variants each contributing only slightly to disease risk. Applying a region-wise analysis, a new risk locus (at ZEB1) was identified and replicated. Several other plausible susceptibility loci were also suggested. This is also the first genome-wide study analyzing how maternal CMV infection interacts with the genotype of the progeny in affecting the risk of schizophrenia, identifying a significant interaction at CTNNA3, a gene not previously implicated in schizophrenia. The result stresses the importance of including environmental factors in the evaluation of disease risk. Moreover, this is, to our knowledge, the first significant gene–environment interaction identified in a genome-wide survey of a psychiatric disorder. Future studies should confirm the associations with schizophrenia of the genomic regions demonstrating the strongest signals in this study, as well as expand the inclusion of environmental factors when identifying genetic risk variants. The unique samples from the DNSB together with information from Danish register systems make it possible to perform genetic studies with inclusion of a wealth of potential environmental risk factors. Future studies of the Danish population could therefore provide valuable insight into how gene–environment interactions influence the risk of schizophrenia.


  1. Twin studies of schizophrenia: from bow-and-arrow concordances to star wars Mx and functional genomics. Am J Med Genet 2000; 97: 12–17.
  2. Gene-environment interactions in schizophrenia: review of epidemiological findings and future directions. Schizophr Bull 2008; 34: 1066–1082.
  3. et al. Gene variants associated with schizophrenia in a Norwegian genome-wide study are replicated in a large European cohort. J Psychiatr Res 2010; 44: 748–753.
  4. et al. Common variants at VRK2 and TCF4 conferring risk of schizophrenia. Hum Mol Genet 2011; 20: 4076–4081.
  5. et al. Association between genetic variation in a region on chromosome 11 and schizophrenia in large samples from Europe. Mol Psychiatry 2012; 17: 906–917.
  6. et al. A genome-wide investigation of SNPs and CNVs in schizophrenia. PLoS Genet 2009; 5: e1000373.
  7. et al. Support of association between BRD1 and both schizophrenia and bipolar affective disorder. Am J Med Genet B Neuropsychiatr Genet 2010; 153B: 582–591.
  8. et al. Genomewide linkage scan of schizophrenia in a large multicenter pedigree sample using single nucleotide polymorphisms. Mol Psychiatry 2009; 14: 786–795.
  9. et al. Common variants conferring risk of schizophrenia. Nature 2009; 460: 744–747.
  10. et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature 2009; 460: 753–757.
  11. et al. Genome-wide association study identifies five new schizophrenia loci. Nat Genet 2011; 43: 969–976.
  12. et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009; 460: 748–752.
  13. et al. Estimating the proportion of variation in susceptibility to schizophrenia captured by common SNPs. Nat Genet 2012; 44: 247–250.
  14. et al. Large recurrent microdeletions associated with schizophrenia. Nature 2008; 455: 232–236.
  15. et al. Genome-wide analysis shows increased frequency of copy number variation deletions in Dutch schizophrenia patients. Biol Psychiatry 2011; 70: 655–662.
  16. et al. Deletion 17q12 is a recurrent copy number variant that confers high risk of autism and schizophrenia. Am J Hum Genet 2010; 87: 618–630.
  17. et al. Sequencing of DISC1 pathway genes reveals increased burden of rare missense variants in schizophrenia patients from a Northern Swedish population. PLoS One 2011; 6: e23450.
  18. et al. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nat Genet 2011; 43: 864–868.
  19. Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet Epidemiol 2011; 35: 310–317.
  20. ‘The missing genes: what happened to the heritability of psychiatric disorders?’. Mol Psychiatry 2011; 16: 362–364.
  21. Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch Gen Psychiatry 2003; 60: 1187–1192.
  22. Maternal exposure to herpes simplex virus and risk of psychosis among adult offspring. Biol Psychiatry 2008; 63: 809–815.
  23. et al. A Danish National Birth Cohort study of maternal HSV-2 antibodies as a risk factor for schizophrenia in their offspring. Schizophr Res 2010; 122: 257–263.
  24. et al. Serologic evidence of prenatal influenza in the etiology of schizophrenia. Arch Gen Psychiatry 2004; 61: 774–780.
  25. Maternal exposure to toxoplasmosis and risk of schizophrenia in adult offspring. Am J Psychiatry 2005; 162: 767–773.
  26. et al. Association of GRIN1 and GRIN2A-D with schizophrenia and genetic interaction with maternal herpes simplex virus-2 infection affecting disease risk. Am J Med Genet B Neuropsychiatr Genet 2011; 156B: 913–922.
  27. Storage policies and use of the Danish Newborn Screening Biobank. J Inherit Metab Dis 2007; 30: 530–536.
  28. The Danish Psychiatric Central Research Register. Scand J Public Health 2011; 39: 54–57.
  29. et al. Are exposure to cytomegalovirus and genetic variation on chromosome 6p joint risk factors for schizophrenia? Ann Med 2007; 39: 145–153.
  30. et al. Polymorphisms in MICB are associated with human herpes virus seropositivity and schizophrenia risk. Schizophr Res 2007; 94: 342–353.
  31. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet 2006; 38: 209–213.
  32. et al. Genes mirror geography within Europe. Nature 2008; 456: 98–101.
  33. The Danish Civil Registration System. A cohort of eight million persons. Dan Med Bull 2006; 53: 441–449.
  34. et al. Genome-wide scans using archived neonatal dried blood spot samples. BMC Genomics 2009; 10: 297.
  35. GRR: graphical representation of relationship errors. Bioinformatics 2001; 17: 742–743.
  36. group TCS. Maternal exposure to herpes simplex virus and risk of psychosis among adult offspring. Biol Psychiatry 2008; 63: 809–815.
  37. Neonatal immunology. Semin Perinatol 1998; 22: 2–14.
  38. Epidemiological impact and disease burden of congenital cytomegalovirus infection in Europe. Euro Surveill 2009; 14: 26–32.
  39. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  40. Population structure and eigenanalysis. PLoS Genet 2006; 2: e190.
  41. et al. Genomic inflation factors under polygenic inheritance. Eur J Hum Genet 2011; 19: 807–812.
  42. Statistical Methods for Research Workers 13th edn, Oliver and Boyd: London, 1925.
  43. Gene-environment interaction in genome-wide association studies. Am J Epidemiol 2009; 169: 219–226.
  44. Mach 1.0: rapid haplotype reconstruction and missing genotype inference. Am J Hum Genet 2006; S79: 2290.
  45. Circadian clock gene expression in brain regions of Alzheimer's disease patients and control subjects. J Biol Rhythms 2011; 26: 160–170.
  46. The suitability of actigraphy, diary data, and urinary melatonin profiles for quantitative assessment of sleep disturbances in schizophrenia: a case report. Chronobiol Int 2006; 23: 485–495.
  47. Electroencephalographic sleep in schizophrenia: a critical review. Compr Psychiatry 1990; 31: 34–47.
  48. Influence of sleep-wake and circadian rhythm disturbances in psychiatric disorders. J Psychiatry Neurosci 2000; 25: 446–458.
  49. et al. Circadian rhythm of tryptophan, serotonin, melatonin, and pituitary hormones in schizophrenia. Biol Psychiatry 1994; 35: 151–163.
  50. et al. Suggestive evidence for association of the circadian genes PERIOD3 and ARNTL with bipolar disorder. Am J Med Genet B Neuropsychiatr Genet 2006; 141B: 234–241.
  51. et al. Association study of eight circadian genes with bipolar I disorder, schizoaffective disorder and schizophrenia. Genes Brain Behav 2006; 5: 150–157.
  52. et al. Differential association of circadian genes with mood disorders: CRY1 and NPAS2 are associated with unipolar major depression and CLOCK and VIP with bipolar disorder. Neuropsychopharmacology 2010; 35: 1279–1289.
  53. et al. Association study of 21 circadian genes with bipolar I disorder, schizoaffective disorder, and schizophrenia. Bipolar Disord 2009; 11: 701–710.
  54. et al. Convergent functional genomics of genome-wide association data for bipolar disorder: comprehensive identification of candidate genes, pathways and mechanisms. Am J Med Genet B Neuropsychiatr Genet 2009; 150B: 155–181.
  55. Regulation of cadherin-mediated adhesion in morphogenesis. Nat Rev Mol Cell Biol 2005; 6: 622–634.
  56. Morphogenetic roles of classic cadherins. Curr Opin Cell Biol 1995; 7: 619–627.
  57. The cadherin superfamily in neuronal connections and interactions. Nat Rev Neurosci 2007; 8: 11–20.
  58. et al. Expression of T-cadherin (CDH13, H-Cadherin) in human brain and its characteristics as a negative growth regulator of epidermal growth factor in neuroblastoma cells. J Neurochem 2000; 74: 1489–1497.
  59. Inhibition of motor axon growth by T-cadherin substrata. Development 1996; 122: 3163–3171.
  60. et al. T-cadherin structures reveal a novel adhesive binding mechanism. Nat Struct Mol Biol 2010; 17: 339–347.
  61. et al. Genome-wide association scan of quantitative traits for attention deficit hyperactivity disorder identifies novel associations and confirms candidate gene associations. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 1345–1354.
  62. et al. Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm 2008; 115: 1573–1585.
  63. et al. Genome-wide association scan of attention deficit hyperactivity disorder. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 1337–1344.
  64. et al. Case-control genome-wide association study of attention-deficit/hyperactivity disorder. J Am Acad Child Adolesc Psychiatry 2010; 49: 906–920.
  65. et al. Meta-analysis of genome-wide linkage scans of attention deficit hyperactivity disorder. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 1392–1398.
  66. et al. Genome-wide association scan of trait depression. Biol Psychiatry 2010; 68: 811–817.
  67. et al. Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature 2009; 459: 528–533.
  68. et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 2011; 70: 863–885.
  69. Complementary expression pattern of Zfhx1 genes Sip1 and deltaEF1 in the mouse embryo and their genetic interaction revealed by compound mutants. Dev Dyn 2006; 235: 1941–1952.
  70. Transcriptional inhibition of REST by NeuroD2 during neuronal differentiation. Mol Cell Neurosci 2010; 44: 178–189.
  71. et al. The class I bHLH factors E2-2A and E2-2B regulate EMT. J Cell Sci 2009; 122: 1014–1024.
  72. et al. E2-2 protein and Fuchs's corneal dystrophy. N Engl J Med 2010; 363: 1016–1024.
  73. et al. Missense mutations in TCF8 cause late-onset Fuchs corneal dystrophy and interact with FCD4 on chromosome 9p. Am J Hum Genet 2010; 86: 45–53.
  74. Zeb1-mediated T-cadherin repression increases the invasive potential of gallbladder cancer. FEBS Lett 2009; 583: 430–436.
  75. et al. Alpha-T-catenin is expressed in human brain and interacts with the Wnt signaling pathway but is not responsible for linkage to chromosome 10 in Alzheimer's disease. Neuromolecular Med 2004; 5: 133–146.
  76. Molecular and functional analysis of cadherin-based adherens junctions. Annu Rev Cell Dev Biol 1997; 13: 119–146.
  77. et al. alphaT-catenin: a novel tissue-specific beta-catenin-binding protein mediating strong cell-cell adhesion. J Cell Sci 2001; 114: 3177–3188.
  78. Cytomegalovirus-induced transendothelial cell migration. A closer look at intercellular communication mechanisms. Intervirology 1999; 42: 350–356.
  79. Human cytomegalovirus immediate-early-gene expression disrupts embryogenesis in transgenic Drosophila. Transgenic Res 2008; 17: 105–119.
  80. et al. LRRTM3 promotes processing of amyloid-precursor protein by BACE1 and is a positional candidate gene for late-onset Alzheimer's disease. Proc Natl Acad Sci USA 2006; 103: 17967–17972.
  81. et al. Effect of heterogeneity on the chromosome 10 risk in late-onset Alzheimer disease. Hum Mutat 2007; 28: 1065–1073.
  82. et al. Genetic association of CTNNA3 with late-onset Alzheimer’s disease in females. Hum Mol Genet 2007; 16: 2854–2869.
  83. et al. Association analysis of 528 intra-genic SNPs in a region of chromosome 10 linked to late onset Alzheimer's disease. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 727–731.
  84. An association analysis of Alzheimer disease candidate genes detects an ancestral risk haplotype clade in ACE and putative multilocus association between ACE, A2M, and LRRTM3. Am J Med Genet B Neuropsychiatr Genet 2009; 150B: 721–735.
  85. et al. Polymorphisms in leucine-rich repeat genes are associated with autism spectrum disorder susceptibility in populations of European ancestry. Mol Autism 2010; 1: 7.
  86. Association of cytomegalovirus and herpes simplex virus infections of the cervix in four clinic populations. Sex Transm Dis 1985; 12: 224–228.
  87. Seroprevalence of cytomegalovirus infection in the United States, 1988–1994. Clin Infect Dis 2006; 43: 1143–1151.



The study was supported by grants from The Danish Strategic Research Council, H. Lundbeck A/S, The Faculty of Health Sciences at Aarhus University, Lundbeck Foundation, The Stanley Research Foundation and The Villum Kann Rasmussen Foundation. The German SCZ GWAS sample was supported by the German Federal Ministry of Education and Research (BMBF), within the context of the National Genome Research Network 2 (NGFN-2), the National Genome Research Network plus (NGFNplus) and the Integrated Genome Research Network (IG) MooDS (Grant 01GS08144 to SC and MMN, Grant 01GS08147 to MR). The Dutch GWAS sample from Utrecht was sponsored by NIMH funding, R01 MH078075.

Author contributions

Direction of study: ADB. Overall idea and strategy: ADB, OM and PBM. Collection and ascertainment of Danish study subjects: OM, MN, PBM, CBP and ADB. Sample preparation, laboratory experiments and genotyping: DD, MH, DMH, MN, AH, TØ, CW, MD and RHY. Statistical analyses: DD, JG, JP and ADB. German-Dutch GWA data: SC, MM, MMN, MR, GROUP, AU and RAO. Writing of manuscript: DD and ADB. Commenting and editing manuscript: All authors.

Author information


  1. Department of Biomedicine and Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark

    • A D Børglum, D Demontis, J Grove, J Pallesen, A Hedemand & M Nyegaard
  2. Centre for Psychiatric Research, Aarhus University Hospital, Risskov, Denmark

    • A D Børglum & O Mors
  3. The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus and Copenhagen, Denmark

    • A D Børglum, D Demontis, J Grove, J Pallesen, C B Pedersen, A Hedemand, M Nyegaard, M Nordentoft, P B Mortensen & O Mors
  4. Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark

    • J Grove & C Wiuf
  5. Section of Neonatal Screening and Hormones, Statens Serum Institute, Copenhagen, Denmark

    • M V Hollegaard & D M Hougaard
  6. National Centre for Register-based Research, Aarhus University, Aarhus, Denmark

    • C B Pedersen & P B Mortensen
  7. Department of Genomics, Life and Brain Center, University of Bonn, Bonn, Germany

    • M Mattheisen, M M Nöthen & S Cichon
  8. Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA

    • M Mattheisen
  9. Institute for Genomic Mathematics, University of Bonn, Bonn, Germany

    • M Mattheisen
  10. For a full list of members, see Appendix

  11. Department of Internal Medicine, Erasmus Medical Center Rotterdam, Rotterdam, The Netherlands

    • A Uitterlinden
  12. Department of Molecular Medicine, Aarhus University Hospital, Skejby, Denmark

    • T Ørntoft
  13. Department of Mathematical Science, University of Copenhagen, Copenhagen, Denmark

    • C Wiuf
  14. Synaptic transmission, H. Lundbeck A/S, Valby, Denmark

    • M Didriksen
  15. Psychiatric Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark

    • M Nordentoft
  16. Institute of Human Genetics, University of Bonn, Bonn, Germany

    • M M Nöthen & S Cichon
  17. German Center for Neurodegenerative Disorders (DZNE), Bonn, Germany

    • M M Nöthen
  18. Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, University of Heidelberg, Mannheim, Germany

    • M Rietschel
  19. Department of Medical Genetics and Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, Utrecht, The Netherlands

    • R A Ophoff
  20. Institute of Neuroscience and Medicine (INM-1), Research Center Juelich, Juelich, Germany

    • S Cichon
  21. Stanley Division of Developmental Neurovirology, Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD, USA

    • R H Yolken


  1. GROUP investigators


Competing interests

The authors declare no conflict of interest.

Corresponding author

Correspondence to A D Børglum.


Genetic Risk and Outcome in Psychosis (GROUP Investigators): René S Kahn1, Don H Linszen2, Jim van Os3, Durk Wiersma4, Richard Bruggeman4, Wiepke Cahn1, Lieuwe de Haan2, Lydia Krabbendam3, Inez Myin-Germeys3.

1Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, Utrecht, The Netherlands; 2Department of Psychiatry, Academic Medical Center University of Amsterdam, Amsterdam, The Netherlands; 3Maastricht University Medical Center, South Limburg Mental Health Research and Teaching Network, Maastricht, The Netherlands; 4Department of Psychiatry, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)
