Introduction

Building on the enormous progress in genome analysis, a growing number of genome-wide association studies (GWAS) on mental disease have been conducted in recent years, some of them including >100 000 individuals.1, 2 Although the information obtained from these huge endeavors is of fundamental interest for population genetics, it has not yet been possible to elucidate the genetic roots of neuropsychiatric disease-relevant behavioral phenotypes. Even though the largest schizophrenia GWAS to date extracted 108 genome-wide significant loci,2 their roles for individuals suffering from mental disease have remained largely obscure. The loci in aggregate are not suitable to predict any disease-relevant phenotypes apart from the schizophrenia diagnosis.3 Although perhaps disappointing, this fact is readily explained by the extreme, multilayered heterogeneity of individuals gathered under the end point diagnoses of highly complex diseases (‘mixed bags’; that is, every schizophrenic subject has a different combination of genetic and environmental reasons for being schizophrenic), not to mention the clinical diagnostic uncertainty and total lack of objective measures.

Polygenic risk scores, which consider in a stepwise manner the less significant single-nucleotide polymorphisms (SNPs) below the Bonferroni threshold, and complex pathway analyses based on the identified risk genotypes raised hopes of making more sense of the GWAS finds, but the value of these scores turned out to be rather limited for complex disease and their reproducibility often questionable (for example, refs 4, 5, 6). In contrast, by exploring distinct SNPs derived from the schizophrenia GWAS2 information, general principles of interest for normal and abnormal brain development, contributing to disease pathophysiology, may be extracted.7 Indisputably, GWAS methodology can yield interesting results for specific mechanistic insight upon analysis of narrow and well-defined phenotypes.8, 9, 10

The present paper proposes an entirely different way to use schizophrenia GWAS hits for understanding at least some behavioral aspects of mental illness: an approach beginning with individuals. We named this approach the ‘OTTO project’, where ‘OTTO’ (old Germanic=heritage) stands for an individual, male or female, with a particular load of genetic factors (here: GWAS hits) that—together with his personally experienced environmental input—contributed to the development of his discrete mental illness. Ideally, OTTO’s illness is characterized by a particularly strong behavioral phenotype, which facilitates the here introduced procedure. We hypothesize that OTTO has some ‘similars’, that is, individuals who share with him not only the clinical end point diagnosis (schizophrenia) but, most importantly, his set of genetic (GWAS-derived) risk factors and possibly his discernible phenotype. Each OTTO together with his similars constitutes a subgroup, which we termed for this approach ‘squad’, whose members are characterized by a certain mental disease-relevant genotype–phenotype constellation. Exploring OTTO to uncover this constellation will, according to our hypothesis, open new avenues in the search for biological subgroups of mental diseases.

The OTTO approach detailed here to provide proof of concept has been grounded on two important assumptions: (1) The latest schizophrenia GWAS hits of the PGC (Psychiatric Genomics Consortium)2 are pertinent for mental disease. (2) At least in some cases, these GWAS hits alone will assist in defining or approximating mental disease-relevant genotype–phenotype associations. At the same time, we recognize several constraints of our OTTO approach as currently implemented: (i) The biological significance of most SNPs derived from GWAS is not yet clear. Many SNPs are intergenic or intronic and thus still lack any viable path to mechanistic insight. (ii) The GRAS (Göttingen Research Association for Schizophrenia) data set we use to evaluate OTTO is very large compared with other deeply phenotyped samples, yet it is relatively small for genetic studies. (iii) Replication by other laboratories will pose a challenge owing to the present uniqueness of the GRAS-type deep phenotyping database and may require the use of phenotype proxies as exemplified here. (iv) A homozygous risk assumption is made to facilitate our newly applied procedures (purposely skipping potential heterozygous risk at this point and accepting loss of power).

To sum up, the present study has been designed to raise in the first part the awareness of the enormous heterogeneity of the schizophrenia GWAS subject matrix, which urgently calls for alternative strategies to unravel biological disease subgroups. Building on this ground in the second and third part, the ‘OTTO project’ as an individual-based approach to categorize GWAS finds in mental illness is introduced: Several variations of the OTTO idea are presented and translated into first data sets, providing examples of promising phenotype-to-genotype or genotype-to-phenotype approaches, all starting from individuals. Genotypes (SNPs) extracted in this way from GWAS findings apparently contribute in aggregate to autistic or affective behavioral continua. Most importantly, their accumulation yields encouraging signals of replicability in independent samples, an absolutely essential prerequisite for this kind of approach.

Materials and methods

Subjects

All parts of the present study comply with the Helsinki Declaration and were approved by the Ethics Committees of the Universities of Göttingen and Greifswald, Germany or of collaborating centers. All subjects and/or their authorized legal representatives gave written informed consent.

GRAS schizophrenia patients

The GRAS data collection was initiated in 2005 and now comprises 1106 genotyped and deeply phenotyped patients diagnosed with schizophrenia or schizoaffective disorder according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR),11 recruited in 23 collaborating centers across Germany.12, 13

GRAS healthy controls

Healthy controls were voluntary blood donors (total: N=1259) from the Department of Transfusion Medicine, University Medical Center Göttingen, Germany, who largely fulfill health criteria, as assessed by predonation screening.12

GRAS disease controls

The extended GRAS data collection of mental disease controls (here: N=65) consists of male and female patients diagnosed with psychiatric disorders other than schizophrenia according to DSM-IV-TR.11

ASD subjects

Starting in 2011, the GRAS data collection has been expanded to include autism spectrum disorder (ASD) subjects (here: N=81 males). Prerequisites for the present study were male gender, an intelligence quotient of at least 75 and expressive language skills allowing semistructured interviewing.14

General population sample

The replication sample of the general population comprises 2400 subjects from a follow-up (SHIP-LEGEND) of the Study of Health in Pomerania (SHIP), conducted in North-East Germany. SHIP investigates common risk factors, subclinical disorders and manifest diseases in the general population.15

Phenotyping

Phenotyping of all subjects included in the present study has been described in great detail.3, 12, 13, 14, 15, 16 Only subjects with data available on all respective items were included in the composite score-based analyses, explaining the slightly variable N numbers.

Positive and Negative Syndrome Scale Autism Severity Score

The Positive and Negative Syndrome Scale (PANSS) Autism Severity Score (PAUSS) is a cross-validated measure capturing the continuous nature of autistic behaviors.14 It represents the sum of six items of the negative subscale and two items of the general subscale of the PANSS.17

Affective Composite Score

Depression, anxiety (PANSS general psychopathology subscale items 6 and 2; range 1–7),17 suicidality (score 0–3: sum of dichotomous items suicidal ideation, suicide plans and suicide attempts) and Brief Symptom Inventory items paranoid ideation, depressivity, somatic concern and anxiousness18 were integrated into the Affective Composite Score (ACS), calculated as sum of the single item values after Blom transformation of data.19 Current depressive symptoms were assessed in the disease controls and the general population sample (SHIP) using the Beck Depression Inventory (BDI), which is a 21-item self-report questionnaire with high reliability and validity.20 Self-reported chronic stress was measured in the general population with the 12-item Trier Inventory for the Assessment of Chronic Stress (TICS).21
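
The rank-based standardization behind the ACS can be illustrated with a minimal sketch. It assumes Blom's standard formula, in which a value with rank r among n observations is mapped to the standard normal quantile of (r − 3/8)/(n + 1/4). The function names, the input layout and the use of average ranks for ties are illustrative assumptions, not details taken from the study.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def blom_transform(values):
    """Map values to normal scores via Blom's formula (r - 3/8) / (n + 1/4)."""
    n = len(values)
    inverse_cdf = NormalDist().inv_cdf
    return [inverse_cdf((r - 0.375) / (n + 0.25)) for r in average_ranks(values)]


def affective_composite(items):
    """Sum Blom-transformed item values per subject.

    items: dict mapping a (hypothetical) item name to a list of raw values,
    one per subject, with subjects in the same order for every item.
    """
    transformed = [blom_transform(v) for v in items.values()]
    return [sum(per_subject) for per_subject in zip(*transformed)]
```

Because only ranks enter the transformation, the order of the data is preserved while items measured on different scales (for example, 1–7 and 0–3) become comparable before summation.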

Genotyping

Genotyping was performed using a semicustom Axiom myDesign genotyping array (Affymetrix, Santa Clara, CA, USA), based on a CEU marker backbone including 518 722 SNPs and a custom marker set including 102 537 SNPs. The array was designed using the Axiom Design Center (www.affymetrix.com), applying diverse selection criteria. Genotyping was performed by Affymetrix on a GeneTitan platform. A SNP call rate >97% and several other quality-control steps were applied. These steps were performed using either the Genotyping Console (GTC) software (Affymetrix) or R. In a subsequent step, markers on the X, Y and mitochondrial chromosomes and those with Hardy–Weinberg equilibrium P<1 × 10⁻¹⁰ were removed, leaving 589 921 SNPs available for analyses.3, 16, 22 Of the 1106 GRAS patients, 46 individuals had to be excluded from further analysis owing to relatedness, genotyping problems and genetic population outlier status (based on 10 principal components) (N=1060; 709 males and 351 females; mean age 39.47±12.64 years; range 17–79). Similarly, of the 1259 healthy controls, 93 individuals had to be excluded from further analysis for the same reasons (N=1166; 731 males and 435 females; mean age 37.49±13.27 years; range 18–69). For the present analyses, we selected those SNPs directly genotyped in our GRAS data set with genome-wide significance (P<5 × 10⁻⁸) in the most recent schizophrenia GWAS.2 Pruning based on linkage disequilibrium (r² threshold of 0.05) as implemented in PLINK23 was carried out. This procedure left 97 GWAS-derived SNPs ready for analyses. Genotyping of the disease controls (for the present study: N=65 with BDI available; 38 males and 27 females; mean age 48.10±14.57 years; range=22–80) and the autism sample (for the present study, high-functioning male ASD subjects: N=81; mean age 31.78±10.83 years; range=16–63) was carried out analogously.
The general population sample (from SHIP; N=2400; mean age 55.89±14.27 years; range=29–89) was genotyped using the Affymetrix Genome-Wide Human SNP array 6.0. Hybridization of genomic DNA was carried out in accordance with the manufacturer’s standard recommendations. The overall genotyping efficiency was 98.6%. Imputation of genotypes in the SHIP cohort was performed with the software IMPUTE v0.5.0 against the 1000 Genomes (phase 1v3) reference panel using 869 224 genotyped SNPs.15 Of the 4 SNPs in the schizoaffective model, 3 were imputed with a high imputation quality of >0.97.

OTTO I–IV

From the GRAS database, four schizophrenic subjects, referred to as OTTOs I–IV, with widely contrasting risk genotype constellation, that is, with lowest possible homozygous risk SNP overlap among each other, were selected (Figure 1).

Figure 1

Enormous heterogeneity of the schizophrenia genome-wide association studies (GWAS) subject matrix and closeness of disease and health regarding GWAS hits. (a, b) Cumulative distributions of schizophrenia risk single-nucleotide polymorphisms (SNPs) in the GRAS (Göttingen Research Association for Schizophrenia) sample illustrate the genetic heterogeneity of GWAS on end point diagnosis. (c) Odds ratios of the 97 schizophrenia GWAS SNPs determined in 1060 schizophrenic/schizoaffective and 1166 healthy subjects of the GRAS data collection. (d) Presentation of 4 OTTOs with contrasting homozygous GWAS risk SNP constellation (compare Table 1) and their schizophrenia and healthy similars sharing 85, 75, 65 and 55% homozygous risk SNPs with the respective OTTO; P-values/odds ratio (95% confidence interval) from Fisher’s exact test comparing schizophrenia with healthy similars of the 65% and 55% overlap squads.

‘OTTO cousins’

From the raw genotype data (1106 schizophrenia patients), the full SNP data set was pruned using minor allele frequency (0.05) and linkage disequilibrium between markers (r² threshold of 0.05) as inclusion criteria, leaving 33 311 markers for relatedness ascertainment. This was carried out with PLINK by calculating genome-wide pairwise identity-by-descent estimates.23 Six subject pairs with a PI-HAT value between 0.26 and 0.54 were identified; such pairs are normally excluded owing to relatedness (see above). In the ‘cousin approach’, however, 3 ‘OTTO cousin pairs’ with a PI-HAT value of around 0.25 were further analyzed.
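
To make the PI-HAT criterion concrete, the following minimal sketch shows how the value relates to the pairwise identity-by-descent probabilities that PLINK reports: PI-HAT equals P(IBD=2) plus half of P(IBD=1). The tuple layout, the function names and the example bounds are hypothetical and serve only as an illustration; the actual selection of pairs in this study was done with PLINK.

```python
def pi_hat(z1, z2):
    """Proportion of the genome shared identical by descent.

    z1: estimated P(IBD=1), z2: estimated P(IBD=2).
    """
    return z2 + 0.5 * z1


def pairs_in_range(pairs, low, high):
    """Select subject pairs whose PI-HAT lies in [low, high].

    pairs: iterable of (id_a, id_b, z0, z1, z2) tuples (hypothetical layout).
    Returns a list of (id_a, id_b, pi_hat) tuples.
    """
    selected = []
    for id_a, id_b, _z0, z1, z2 in pairs:
        value = pi_hat(z1, z2)
        if low <= value <= high:
            selected.append((id_a, id_b, value))
    return selected
```

With bounds such as 0.26 and 0.54, a filter of this kind would return the pairs that are normally excluded for relatedness; a narrow window around 0.25 would instead pick out candidate cousin pairs.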

PGAS for SNP pruning and selection

All 97 GWAS risk SNPs ‘survived’ linkage disequilibrium pruning (r² threshold of 0.05), an important prerequisite for excluding redundant information in the phenotype-based genetic association study (PGAS) approach pursued here. The GWAS risk SNPs of OTTOs were individually screened by a PGAS approach using the adequate phenotype measure, here PAUSS or ACS. Single SNPs with a tendency of one homozygous genotype being associated with the phenotype of interest went into the accumulation models (standard operating procedure (SOP) explained in great detail in Stepniak et al.;16 Figure 2).

Figure 2

Individual-based approach to genome-wide association studies (GWAS) finds: OTTO workflow diagram to exemplify the applied novel procedures. (a) Selection of OTTOs based on (1) a phenotype-to-genotype approach starting from an individual with an extreme (here: autistic) phenotype and (2) a genotype-to-phenotype approach starting from cousin pairs with a shared severe (here: affective) phenotype. (b) Number of homozygous GWAS risk single-nucleotide polymorphisms (SNPs; of all 97 available after linkage disequilibrium pruning) carried by the autistic OTTO (26 SNPs) and the affective OTTOs (9 SNPs), respectively. (c) These GWAS risk SNPs are then individually screened for a tendency of association with the adequate phenotype measure (bar graph inset for illustration), here PAUSS (PANSS Autism Severity Score) and ACS (Affective Composite Score), respectively, using the GRAS (Göttingen Research Association for Schizophrenia) sample: PGAS (phenotype-based genetic association study) standard operating procedure (SOP). (d) Number of homozygous SNPs showing such tendency. (e) These SNPs go into the respective accumulation model and are evaluated in the discovery sample (GRAS). (f) Indispensable is the replication of any promising accumulation model in independent samples. In case of phenotype unavailability, a proxy phenotype has to be used.

Statistical analysis

Genotypic data were compiled by recording the number of risk alleles carried at each of the 97 SNPs. Distribution functions of shared SNPs were obtained by computing the number of shared SNPs for each possible pair of individuals (1060 schizophrenic and 1166 healthy individuals of the GRAS data collection). These counts were used first to compute histograms of frequencies of shared SNPs and then to compute the reverse of the cumulative distribution function (CDF) of occurrences, where the reverse CDF is defined as 1−CDF. The 95% confidence intervals (CIs), computed with the Wald method, together with a Kolmogorov–Smirnov test confirm that both distributions are indistinguishable from each other (Figures 1a and b). The Mann–Whitney U-test or t-test was used for group comparisons, and the Jonckheere–Terpstra trend test was applied to test for trends. The Spearman rank correlation coefficient was used to assess the strength of association between two non-normally distributed variables. Cronbach’s α was calculated as a measure of internal consistency. For the ACS, rank-based Blom transformation19 was applied to standardize all individual measures by transforming them into standard normally distributed surrogates prior to sum score computation. This maintained the order of the data but removed skewness from variable distributions. All P-values in discovery sample analyses are two-sided; in replications, they are one-sided. Statistical analyses were performed with SPSS for Windows version 17.0 (IBM-Deutschland, Munich, Germany) and STATA MP version 13.1 (StataCorp, College Station, TX, USA). Case–control analyses of SNP genotypes were performed using PLINK 1.07.23 Fisher’s exact test was used to compare schizophrenia and healthy similars regarding the number of homozygous risk SNPs shared with the genetically defined OTTOs.
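
The pairwise sharing analysis described above can be sketched as follows. The sketch assumes that each subject is coded as a list of risk allele counts (0, 1 or 2) per SNP and that a SNP counts as shared when both subjects are homozygous for the risk allele; the function names and the coding are illustrative assumptions, and confidence intervals as well as the Kolmogorov–Smirnov test are omitted.

```python
from itertools import combinations


def homozygous_risk_set(risk_allele_counts):
    """Indices of SNPs at which the subject carries two risk alleles."""
    return {i for i, count in enumerate(risk_allele_counts) if count == 2}


def pairwise_shared_counts(genotypes):
    """Number of shared homozygous risk SNPs for every pair of subjects.

    genotypes: list of per-subject lists of risk allele counts (assumed coding).
    """
    risk_sets = [homozygous_risk_set(g) for g in genotypes]
    return [len(a & b) for a, b in combinations(risk_sets, 2)]


def reverse_cdf(counts, max_count):
    """Reverse CDF, 1 - CDF(k) = P(shared > k), for k = 0 .. max_count."""
    n = len(counts)
    return [sum(1 for c in counts if c > k) / n for k in range(max_count + 1)]
```

Computing the two reverse CDFs separately for schizophrenic and healthy subjects and plotting them against k would give curves of the kind summarized in Figures 1a and b.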

Results

Raising the awareness of the enormous heterogeneity of the schizophrenia GWAS subject matrix and the closeness of health and disease

Before starting the OTTO approach, we aimed at documenting (i) the tremendous heterogeneity of the GWAS risk SNP distribution among schizophrenic individuals and controls, and (ii) the closeness of health and disease. For this, we explored the distribution of shared homozygous SNPs of the available 97 GWAS-derived risk SNPs2 in individuals of the GRAS data collection, schizophrenic and healthy. As illustrated in Figure 1a, the number of any obtained combination of homozygous schizophrenia risk SNPs (of 97) shared among 50% of subjects is 13.0 for schizophrenic (N=1060) and 12.0 for healthy subjects (N=1166). Less than 10% of all GRAS subjects have any combination of 18 homozygous risk SNPs (of 97) in common. Including heterozygous in addition to homozygous risk SNPs, the numbers derived from any combination of the 97 risk SNPs reach 56.0 versus 54.0 shared genotypes between 50% of subjects, schizophrenic or healthy, respectively (Figure 1b). These data stress the enormous heterogeneity of the GWAS subject matrix used for identifying common schizophrenia risk alleles and the remarkable genetic similarity between health and disease regarding the GWAS hit (risk SNP) distribution.

These facts are underlined by the overall low odds ratios (ORs) known for all 97 SNPs in the PGC schizophrenia GWAS data set2 and reproduced here in the GRAS sample (Figure 1c), confirming poor class separation between healthy and schizophrenic GRAS subjects. To verify whether these data might be separable with other methods, we applied a linear projection technique that incorporates class information, linear discriminant analysis,24 and a highly non-linear tree-based method, the random forest classifier.25, 26 The results of both approaches again highlight that schizophrenic and healthy GRAS individuals cannot be separated using the genotypic data (all P-values >0.25) (Supplementary Figures S1 and S2).

Risk SNP overlaps of four contrasting individual OTTOs with their respective schizophrenic and healthy similars in the GRAS database

We now moved on to a first approach based on individual OTTOs. For this, we chose four schizophrenic GRAS subjects with widely contrasting risk genotype constellations. In other words, we selected four OTTOs (I–IV) with the lowest possible homozygous risk SNP overlap among each other. As shown in Figure 1d and Table 1, these 4 OTTOs carry between 19 and 30 homozygous risk genotypes each. When searching for schizophrenia similars of all 4 OTTOs by stepwise lowering of the risk SNP overlap threshold (starting from 85%), only 1 other individual was found among 1059 GRAS schizophrenic patients with 75% overlap, in this case with OTTO-II; lowering the overlap to 65% resulted in 2–21 schizophrenic individuals similar to OTTOs I–IV. For healthy similars, the situation up to an overlap in risk SNPs of 65% is broadly comparable. Only OTTO-I similars are significantly less frequent among healthy than among schizophrenic subjects (P=0.048; OR=3.7; 95% CI: 1.01–13.46), but a tendency is seen for all OTTOs. At an overlap of 55%, distinctly more similars are found, with 3/4 healthy squads (those of OTTO-I, OTTO-II and OTTO-III) being smaller than the schizophrenic squads. This finding may be of potential interest with respect to genetic schizophrenia subgroups but needs replication and further refinement in larger samples. The small numbers of OTTO I–IV similars in the GRAS sample with 65% shared GWAS risk SNPs again make the vast heterogeneity of the GWAS subject matrix, as based on end point diagnoses, very obvious. A genotype-to-phenotype approach with OTTOs I–IV as originally planned was therefore not deemed reasonable with our sample.
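
The search for similars and the subsequent group comparison can be sketched as follows. The sketch assumes that overlap is the fraction of an OTTO's homozygous risk SNPs that a subject also carries in homozygosity, that a threshold such as 65% is applied as "at least", and that the Fisher table contrasts similars with non-similars in the schizophrenic and healthy groups. These conventions and all function names are illustrative assumptions rather than the documented procedure.

```python
from math import comb


def overlap_fraction(otto_set, subject_set):
    """Fraction of OTTO's homozygous risk SNPs also homozygous in the subject.

    otto_set must be non-empty (each OTTO here carries 19-30 such SNPs).
    """
    return len(otto_set & subject_set) / len(otto_set)


def find_similars(otto_set, subjects, threshold):
    """IDs of subjects whose overlap with OTTO is at least `threshold`.

    subjects: dict mapping subject ID to a set of homozygous risk SNP indices.
    """
    return [sid for sid, snps in subjects.items()
            if overlap_fraction(otto_set, snps) >= threshold]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))
```

In such a layout, a and c would be the numbers of schizophrenic and healthy similars, and b and d the remaining subjects of each group.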

Table 1 Overview of the 97 schizophrenia GWAS risk SNPs and their distribution among OTTOs

Phenotype-to-genotype approach: the autistic OTTO

We thus performed a reverse approach, that is, a direct phenotype-to-genotype approach, starting with an individual exhibiting an extreme phenotype that is not expected to be as prominently affected by antipsychotic medication as, for instance, the severity of positive symptoms. For this, a male individual of the GRAS sample was selected, characterized by an unusually severe autistic phenotype as determined by PAUSS (Figures 2 and 3a and b).14 In fact, previous work underscores the convergence of schizophrenia-negative symptoms and autistic phenotypes. PAUSS evolved as a dimensional measure capturing the continuous nature of autistic behaviors.14 Schizophrenia patients predominantly suffering from negative symptoms obtain high scores also on the Autism Diagnostic Observation Schedule.27 Importantly, the here selected patient had been diagnosed with ASD in childhood (that is, before any antipsychotic medication) and was later switched to a schizophrenia diagnosis. Figure 3b illustrates the distinct phenotype of this ‘autistic OTTO’, contrasting him from the average of GRAS schizophrenic males. The autistic OTTO is homozygous for the risk allele at 26/97 GWAS risk loci. To identify among the 26 those markers that likely co-modulate autistic features, we applied our PGAS SOP, using PAUSS in the male GRAS sample16 (Figure 2). A total of seven homozygous risk SNPs, termed ‘pro-autistic genotypes’, individually showing a tendency of an association with PAUSS were extracted and included in the accumulation model (overview of the seven genes in Figure 3c). Interestingly, 5/7 identified ‘pro-autistic’ loci are within the extended major histocompatibility complex region (Table 1). An accumulation score, representing the sum of homozygous pro-autistic genotypes of the seven selected SNPs, was calculated for each male individual of discovery and replication samples. 
In the discovery sample, a significant trend of higher numbers of risk SNPs with higher PAUSS values emerged (Figure 3d). Extreme group comparison (0–1 versus 5–7 risk SNPs) was significant for PAUSS (Figure 3d) and revealed trends of higher values with higher-risk SNP numbers for all single PAUSS items (Figure 3e). As both the selection of PAUSS-associated SNPs of the autistic OTTO and the accumulation model were based on the male GRAS (that is, discovery) sample, ‘circularity’ of the procedure has to be admitted and the significant values obtained should thus be regarded as ‘exploratory significance estimates’. Replicability is therefore the most crucial issue. Importantly, the discovery sample result was reproducible in a small independent sample of male individuals diagnosed with ASD (Figure 3f). As an important control, we screened all 71 (97−26) risk SNPs, which were absent (or heterozygous) in the autistic OTTO, for association with PAUSS using our PGAS SOP. We found 27/71 SNPs that showed the association pattern. Expectedly, their accumulation in the GRAS males yielded a highly significant Jonckheere–Terpstra trend (P=0.0000002). However, replication in the ASD sample completely failed (P=0.919), underlining the necessity to start from an individual (OTTO) for pruning of phenotype-irrelevant SNPs. These data encourage the use of a GWAS-derived risk SNP ‘package’ of a subject with extreme autistic traits for predicting the severity of autistic features in the sense of a behavioral continuum, as shown here in a schizophrenic and an independent ASD replication sample. Especially the replication is absolutely essential for this type of analysis as outlined above.
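
The accumulation model can be illustrated with a minimal sketch: each subject receives a score equal to the number of selected SNPs carried as homozygous risk genotype, subjects are grouped by score, and a trend across ordered groups is summarized by the Jonckheere–Terpstra statistic. The data layout and function names are hypothetical, only the test statistic (not its P-value) is computed, and the correction for medication applied in the study is not included.

```python
def accumulation_score(risk_allele_counts, selected_snps):
    """Number of selected SNPs at which the subject is homozygous for risk."""
    return sum(1 for i in selected_snps if risk_allele_counts[i] == 2)


def extreme_groups(scores, phenotypes, low_max=1, high_min=5):
    """Split phenotype values into low-score and high-score extreme groups.

    Defaults mirror the 0-1 versus 5-7 contrast of the seven-SNP model.
    """
    low = [p for s, p in zip(scores, phenotypes) if s <= low_max]
    high = [p for s, p in zip(scores, phenotypes) if s >= high_min]
    return low, high


def jonckheere_terpstra_statistic(groups):
    """Jonckheere-Terpstra statistic for groups ordered by increasing score.

    For every pair of groups i < j, counts the pairs (x in i, y in j)
    with x < y, adding 0.5 for each tie.
    """
    statistic = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    if x < y:
                        statistic += 1.0
                    elif x == y:
                        statistic += 0.5
    return statistic
```

A large statistic relative to its null expectation indicates that phenotype values tend to increase with the number of accumulated risk genotypes, which is the pattern reported for PAUSS in Figure 3d.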

Figure 3

Example of a ‘bottom-up’ phenotype-to-genotype approach starting from a schizophrenia patient with an extreme phenotype, the ‘autistic OTTO’. (a) Quantification of autistic symptoms using the PAUSS (PANSS Autism Severity Score). Note the high intercorrelation of PAUSS items and the high internal consistency of the scale (Spearman rank correlation coefficients; Cronbach’s α). (b) Illustration of the extreme phenotype of the autistic OTTO compared with the mean PAUSS (±s.e.m.) of male schizophrenia patients (GRAS (Göttingen Research Association for Schizophrenia)). (c) List of 7 selected ‘pro-autistic’ single-nucleotide polymorphisms (SNPs) of the 26 genome-wide association studies (GWAS) risk SNPs carried in homozygosity by the autistic OTTO. Pruning/selection was carried out according to their phenotypical relevance following the standard operating procedure described in the Materials and methods section. (d) An accumulation of the seven risk SNPs is associated with PAUSS in the discovery sample; two-sided P-values for Jonckheere–Terpstra trend test and Mann–Whitney U-test of extreme group comparison (data corrected by linear regression analysis for antipsychotic medication using chlorpromazine equivalent doses); mean±s.e.m. (e) Separate presentation of extreme group results for all PAUSS items shows a consistent tendency of higher values upon higher genetic risk. (f) Effect of risk SNP accumulation on PAUSS and extreme group contrast are replicated in an independent sample of individuals diagnosed with autism spectrum disorder (ASD); one-sided P-values for Jonckheere–Terpstra trend test and Mann–Whitney U-test of extreme group comparison; mean±s.e.m.

Genotype-to-phenotype approach using cousin pairs: the affective OTTOs

We next reasoned that another way to define OTTOs—via additional risk genotype pruning—could be achieved by using pairs of subjects who share a certain proportion of their genome as well as the end point diagnosis. This should yield better information on true risk genotypes. We thus employed real genetic cousins, all diagnosed with schizophrenia, as OTTO cousin pairs for a genotype-to-phenotype approach (Figures 2 and 4a). Those SNPs for which the schizophrenia risk allele was present in homozygosity in both members of each pair were extracted (9, 11 and 18 SNPs for pairs A1/2, B1/2 and C1/2, respectively) and used to define ‘OTTO cousin similars’. As for the autistic OTTO, schizophrenic and healthy similars were identified according to the amount of shared risk SNPs with each OTTO cousin pair, that is, 85% or 75% homozygous risk SNPs. As shown in Figure 4a, the number of individuals with risk SNP overlap 85% reaches values (34, 46 and 53) that can be used for a first exploratory genotype-to-phenotype approach. For two of the OTTO cousin pairs, even significant differences between numbers of schizophrenic and healthy similars are seen, potentially indicating signals of genetic schizophrenia subgroups. As an example of our genotype-to-phenotype approach, the OTTO-A1/2 pair (9 SNPs, 53 similars with 85% overlap) is further pursued here owing to a remarkably prominent, shared phenotype of these cousins.

Figure 4

Example of a ‘bottom-up’ genotype-to-phenotype approach starting from schizoaffective OTTO cousin pairs. (a) Presentation of three OTTO cousin pairs in schizophrenic GRAS (Göttingen Research Association for Schizophrenia) subjects with shared homozygous risk alleles. Schizophrenia and healthy similars of each OTTO cousin pair sharing 85% and 75% homozygous risk single-nucleotide polymorphisms (SNPs) are shown; odds ratio (OR) and P-value from Fisher’s exact test comparing respective squads of schizophrenia and healthy similars. (b) Schizoaffective OTTO-A1/A2 pair and list of the four selected ‘pro-affective’ SNPs out of the nine genome-wide association studies (GWAS) risk SNPs shared in homozygosity. Pruning/selection was performed according to the standard operating procedure described in the Materials and methods section. (c) Quantification of affective symptoms using the Affective Composite Score (ACS): High intercorrelation of ACS items and high internal scale consistency (Spearman rank correlation; Cronbach’s α). (d) Extreme phenotypes of affective OTTOs compared with ACS of schizoaffective and schizophrenic GRAS patients (mean±s.e.m.). (e) Accumulation of the four risk SNPs associated with ACS in the discovery sample; two-sided P-values for Jonckheere–Terpstra trend test and Mann–Whitney U-test of extreme group comparison (data corrected by linear regression analysis for antipsychotic medication using chlorpromazine equivalent doses); mean±s.e.m. (f) Separate presentation of extreme group results for all ACS items shows a consistent tendency of higher values upon higher genetic risk. (g) Replication I: Effect of risk SNP accumulation on Beck Depression Inventory (BDI) as proxy of ACS and extreme group contrast are replicated in an independent sample of disease controls; one-sided P-values for Jonckheere–Terpstra trend test and t-test of extreme group comparison; mean±s.e.m. (h) Replication II: BDI and Trier Inventory for the Assessment of Chronic Stress (TICS) as proxies of ACS for extreme group contrast in healthy controls (Study of Health in Pomerania); one-sided P-values for t-test of extreme group comparison; mean±s.e.m.

To investigate the potential symptomatic relevance of the SNPs overlapping between OTTO-A1 and OTTO-A2, we first evaluated the clinical syndrome of the two female cousins based on the GRAS examination book13 and all medical reports and discharge letters. OTTO-A1 was first diagnosed in 1998 with bipolar disorder, depressive episode with psychotic symptoms, and later switched to the diagnosis of schizoaffective disorder according to DSM-IV-TR.11 She had attempted suicide once. Her older cousin, OTTO-A2, was also first diagnosed with bipolar disorder, in 1978 (manic episode without overt psychotic symptoms), and received the diagnosis of schizoaffective disorder in 2006. OTTO-A2 had attempted suicide seven times. Thus schizoaffective and bipolar symptoms as well as suicidality emerged as overlapping clinical characteristics of the two cousins (Figure 4b). We therefore generated, based on the most prominent symptoms of OTTO-A1/2, an ACS, as described in the Materials and methods section (Figure 4c). Applying the ACS, 4/9 risk SNPs of OTTO-A1/2 were selected using our PGAS SOP (Figure 4b). As illustrated in Figure 4d, the phenotype (ACS) of the two cousins stood out compared with the mean ACS of all schizoaffective and schizophrenia subjects of GRAS, again featuring an extreme phenotype suitable as discovery matrix. We next explored a stepwise accumulation of the four selected ACS-associated risk SNPs in the GRAS discovery sample and obtained ‘exploratory significance estimates’ for overall trend as well as for extreme group comparison (Figure 4e). Again, extreme groups of 0–1 versus 4 homozygous risk SNPs tended to differ in all ACS single items (Figure 4f). GRAS disease controls and individuals of the general population served as replication samples I and II. For both, the BDI20 had to be used as ACS proxy (owing to missing ACS information). In the small replication sample I, both trend test and extreme group comparison were significant (Figure 4g).
Even in healthy individuals of SHIP (replication sample II), extreme groups were still borderline significant regarding BDI and the highly inter-correlated (r=0.77) TICS21 (Figure 4h). Again, control screening of the 88 (97−9) risk SNPs, which were absent (or heterozygous) in the affective OTTOs, for association with ACS using our PGAS SOP, led to 36 SNPs with the association pattern. Even though their accumulation in the GRAS sample resulted in a highly significant Jonckheere–Terpstra trend (P=0.00018) owing to the above-mentioned circularity, replication in the disease controls failed completely (P=0.973). Thus both individual-based approaches, the ‘extreme autistic OTTO’ and the ‘affective OTTO cousin pair’, led to small but replicable signals of phenotype–genotype associations.

Discussion

The present work has been designed as a first individual-based approach to categorize GWAS finds in mental illness. Referred to as the ‘OTTO project’, it follows a bottom-up approach to define biological subgroups of complex mental disease. The OTTO project builds on deeply phenotyped schizophrenic individuals who all carry a certain number of GWAS risk genotypes (as do all healthy individuals) and searches for genetic (GWAS risk SNPs) as well as phenotypic similars among subjects with the same end point diagnosis. The first part of this work was intended to exemplify the huge heterogeneity of the schizophrenia GWAS subject matrix. This heterogeneity urgently calls for alternative strategies to unravel biological disease subgroups or mental disease-relevant behavioral phenotypes. The second and third parts applied the OTTO idea to first data and showed examples of promising phenotype-to-genotype and genotype-to-phenotype approaches, starting from individuals.

In fact, applying the OTTO procedure, one can now begin to delineate the contribution of normal genotypes (here: GWAS risk SNPs) to (disease) phenotypes.28 Many behavioral traits constitute a continuum, as exemplified in the present study for autistic (PAUSS) or affective symptoms (ACS). Health and disease genotypes are remarkably close, illustrated here by the cumulative distribution of shared GWAS risk SNPs. ‘Disease borders’ according to classification systems are also beginning to vanish: a recent multivariate sibling study of the Swedish population, starting from >3 million individuals, reported on a general genetic factor of psychopathology, indicating that common psychiatric disorders share the same genetic origin.29 Moreover, a proportion of GWAS risk SNPs may not even be related to the disease phenotype, as suggested by the necessity of pruning individual risk SNP ‘packages’ (to yield 7/26 or 4/9) for overt phenotype relevance. They may mark differences between schizophrenia cases and controls that are not associated with schizophrenia at all (for example, cigarette consumption or social status). Clearly, the mechanistic (molecular–cellular) contribution of OTTO’s identified risk SNPs to the phenotype remains to be elucidated.

Nevertheless, first hints toward biological (genetic) roots of behavioral features or of mental disease-relevant phenotypes may be obtained using our model approach. Such hints have been demonstrated here for an autistic phenotype, deduced from an individual OTTO with extreme autistic features, and for an affective phenotype, derived from an OTTO cousin pair. Even though the signals are still small, they were reproducible in independent replications, including a general population sample. Ideally, biological subgroups derived from OTTOs may be used for GWAS against healthy controls or against samples with crude end point diagnosis (schizophrenia): the better the biological subgroup definition, the greater the chance to harvest more specific, disease-relevant genetic (GWAS) hits. This will, however, require deep phenotyping of large numbers of individuals to allow PGAS.16, 28 The term ‘phenome-wide association studies’ was also recently coined for exploiting readouts of phenotypic data from electronic health records, which may be very helpful but are likely too crude for neuropsychiatric diseases.30

Not to be forgotten is the tremendous influence of the environment, mediated by, for example, epigenetic modulation of the inherited substrate.3 This suggests novel scientific endeavors for estimating the environmental impact on individuals (OTTOs) by ‘deep phenotyping’ of the environment as well. For influences such as urbanicity, air pollution or constant noise exposure, deep environmental phenotyping may be realizable. First approaches in this direction, considering individual stress load, are already underway.31 Deep phenotyping of individuals and environment to complement and start making sense of genetic (GWAS) information is ultimately inevitable but extremely labor intensive and has been neither appreciated nor well funded in recent years. Given numbers of deeply phenotyped subjects amounting to at least 5–10 times the GRAS collection, the OTTO model will yield many discernible disease-relevant phenotypes and ultimately deliver a superior subject matrix for GWAS.

To conclude, the model approach suggested here, starting with an ‘extreme phenotype OTTO’ and in particular with an ‘OTTO cousin pair’, may aid in revealing phenotypical contributions of a certain GWAS risk SNP constellation. We note, however, that while for OTTO and his squad the identified risk SNPs are among the relevant genetic underpinnings of their phenotype, other individuals with autistic or affective traits may have totally different genetic reasons for showing this particular phenotype (‘many ways lead to Rome’). Nevertheless, complementary to the top-down approach of GWAS in complex diseases, the OTTO project provides a bottom-up model exploiting GWAS results for the identification of distinct genetic roots of behavioral continua, as relevant for mental disease subcategories.