Functional neuroimaging effects of recently discovered genetic risk loci for schizophrenia and polygenic risk profile in five RDoC subdomains

Recently, 125 loci with genome-wide support for association with schizophrenia were identified. We investigated the impact of these variants and their accumulated genetic risk on brain activation in five neurocognitive domains of the Research Domain Criteria (working memory, reward processing, episodic memory, social cognition and emotion processing). In 578 healthy subjects we tested for association (i) of a polygenic risk profile score (RPS) including all single-nucleotide polymorphisms (SNPs) reaching genome-wide significance in the recent genome-wide association studies (GWAS) meta-analysis and (ii) of all independent genome-wide significant loci separately that showed sufficient distribution of all allelic groups in our sample (105 SNPs). The RPS was nominally associated with perigenual anterior cingulate and posterior cingulate/precuneus activation during episodic memory (PFWE(ROI)=0.047) and social cognition (PFWE(ROI)=0.025), respectively. Single SNP analyses revealed that rs9607782, located near EP300, was significantly associated with amygdala recruitment during emotion processing (PFWE(ROI)=1.63 × 10−4, surpassing Bonferroni correction for the number of SNPs). Importantly, this association was replicable in an independent sample (N=150; PFWE(ROI)<0.025). Other SNP effects previously associated with imaging phenotypes were nominally significant, but did not withstand correction for the number of SNPs tested. To assess whether there was true signal within our data, we repeated single SNP analyses with 105 randomly chosen non-schizophrenia-associated variants, observing fewer significant results and lower association probabilities. Applying stringent methodological procedures, we found preliminary evidence for the notion that genetic risk for schizophrenia conferred by rs9607782 may be mediated by amygdala function. We critically evaluate the potential caveats of the methodological approaches employed and offer suggestions for future studies.


INTRODUCTION
Schizophrenia, a severe and often chronic disease that affects 1% of the population, has one of the highest heritability estimates in psychiatry (80%). 1 Genome-wide association studies (GWAS) have been uncovering an increasing number of common variants underlying disease susceptibility, promising valuable insights into pathogenic biological pathways. The largest GWAS to date, including 36 989 cases and 113 075 controls, identified 125 genetic loci (of which 108 were independent) associated with schizophrenia. 2 It has been suggested that investigating important neurocognitive domains implemented within specific brain circuits could be a promising way forward for biological psychiatry. The Research Domain Criteria (RDoC) approach 3,4 postulates five major domains (negative valence, positive valence, cognition, social processes and arousal/regulation) containing several subdomains, many of which are relevant to schizophrenia. Following the RDoC rationale, in order to uncover biological mechanisms underlying mental illness, these domains warrant investigation at different units of analysis, including genetics and brain circuits. In recent years, imaging genetics studies have investigated the impact of genetic risk variants on domain-related brain circuits (for overviews see refs 5-7). In our own previous work we were able to identify potential intermediate phenotypes 8,9 in five RDoC subdomains (working memory (WM), episodic memory, reward processing (RP), social cognition and emotion processing) that were modulated by risk variants within schizophrenia-associated genes (for example, CACNA1C, ZNF804A). 10-13 Importantly, several of these findings were successfully replicated in independent samples. 14-16 However, all of these studies, as well as the overwhelming majority of all other published imaging genetics investigations (see Stein et al. 17 for a notable exception, though using structural neuroimaging), have been performed with single genetic variants and usually with only one neurocognitive paradigm.
A crucial next step is to comprehensively investigate the impact of the 108 independent schizophrenia-associated loci uncovered recently. 2 One approach complementing single single-nucleotide polymorphism (SNP) analyses is the use of polygenic risk profiles. As the accumulation of genetic variants is known to form a substantial proportion of genetic susceptibility to psychiatric disease, 18 Purcell et al. 19 proposed the use of a polygenic risk profile score (RPS), a sum across risk alleles of multiple SNPs weighted by their effect size in an independent study. Employing RPS is considered a feasible approach to investigate the combined genetic impact of multiple variants in small samples. 20 However, investigating the linear combination of several hundreds or thousands of variants may come at the expense of losing specificity; that is, RPS may obscure information conveyed by single genes or subsets of genes. Therefore, we decided to employ two complementary exploratory analyses. We investigated a range of promising intermediate phenotypes evoking activation of dedicated brain circuits relevant for schizophrenia (WM, episodic memory, RP, social cognition and emotion processing). These five neurocognitive subdomains cover four of the five general RDoC domains, that is, negative valence, positive valence, cognition and social processes. First, we tested for association with an RPS including all SNPs reaching genome-wide significance (P < 5 × 10−8; combined risk of 125 SNPs) in the recent GWAS meta-analysis. Second, we tested the effects of all genome-wide significant single variants that showed sufficient distribution of all allelic groups within our sample in order to identify contributions of specific variants. Results were assessed within task-specific target areas located within widespread brain circuits (Supplementary Figure S2).
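The weighted sum that defines an RPS can be illustrated with a minimal sketch. It is not the pipeline used in this study (those details are in the Supplementary Material); the function name, the SNP identifiers and the weights below are hypothetical, and the weights are assumed to be log odds ratios taken from an independent discovery GWAS.

```python
from typing import Mapping


def risk_profile_score(
    risk_allele_counts: Mapping[str, float],
    log_odds_ratios: Mapping[str, float],
) -> float:
    """Sum of risk-allele counts (0-2 per SNP) weighted by discovery effect size.

    Both arguments are keyed by SNP identifier. Every SNP that carries a
    weight must also have a count (or imputed dosage) for the subject.
    """
    score = 0.0
    for snp, weight in log_odds_ratios.items():
        count = risk_allele_counts[snp]  # KeyError if the SNP was not genotyped
        if not 0.0 <= count <= 2.0:
            raise ValueError(f"allele count for {snp} must lie in [0, 2]")
        score += weight * count
    return score


# Hypothetical subject with three SNPs; identifiers and weights are illustrative only.
counts = {"snpA": 2, "snpB": 1, "snpC": 0}
weights = {"snpA": 0.10, "snpB": 0.05, "snpC": 0.20}
print(risk_profile_score(counts, weights))
```

Because the score is a single linear combination, two subjects can reach the same value through entirely different sets of risk alleles.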

MATERIALS AND METHODS

Subjects
A total of 578 German volunteers who had never suffered from a psychiatric disorder (evidenced by SCID-I) 21 were recruited at Mannheim, Berlin and Bonn as part of an ongoing study on neurogenetic mechanisms of unipolar depression, bipolar disorder and schizophrenia (http://www.ngfn.de/en/schizophrenie.html; http://www.sys-med.de/en/consortia/integrament/). N = 333 participants had no lifetime family history of schizophrenia or an affective disorder, and n = 245 subjects had at least one first-degree relative affected by schizophrenia (n = 72), bipolar disorder (n = 71) or depression (n = 102). Affected index patients did not suffer from any other psychotic or affective disorder, and the investigated relatives had no family history of multiple different psychiatric diagnoses (for example, cases of both affective and psychotic disorders). All subjects had grandparents of European origin. Following application of exclusion criteria, n = 472-509 subjects were included in the analyses of the respective tasks (see Supplementary Material for details). N = 150 controls recruited as part of a study on the neurogenetic mechanisms of alcohol dependence in Berlin and Bonn (http://www.ngfn.de/en/alkoholabh__ngigkeit.html) 22 served as replication sample. None of the participants had ever suffered from a psychiatric disorder according to the Structured Clinical Interview of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) Axis-I Disorders (SCID-I). 21 Demographic characteristics of the respective subsamples are given in Supplementary Tables S1 and S2. The study was approved by local ethics committees of the universities of Heidelberg, Berlin and Bonn. All participants gave written informed consent to the study according to the Declaration of Helsinki.

DNA extraction and genotyping
Ethylenediaminetetraacetic acid anticoagulated venous blood samples were collected from all individuals. Lymphocyte DNA was isolated using the Chemagic Magnetic Separation Module I (Chemagen, Baesweiler, Germany) according to the manufacturer's recommendations. A genome-wide data set was generated at the Department of Genomics, Life & Brain Center, University of Bonn using Illumina's Human610Quad, Human660W-Quad and Infinium PsychArray-24 BeadChips (Illumina, San Diego, CA, USA). Quality control and imputation were performed with standard parameters used by the Psychiatric Genetics Consortium (PGC) Statistical Analyses Group, and RPS were calculated using methods described by Purcell et al. 19 (see Supplementary Material for details).

Functional imaging tasks
During functional magnetic resonance imaging (fMRI) subjects completed an associative episodic memory (EM) task requiring encoding, recall and recognition of face-profession pairs. The WM n-back task required continuous updating and retrieval of elements held in short-term memory. The RP monetary incentive delay task allowed the study of anticipation of monetary gains or losses. The Theory of Mind (ToM) task consisted of cartoon stories requiring subjects to take the protagonist's perspective and judge changes in his/her affective states. The face-matching task (FMT) operationalized subjects' implicit processing of negative emotions while viewing faces showing either fearful or angry expressions (for detailed descriptions see refs 10-13 and Supplementary Material). All tasks were previously shown to robustly activate target structures and to possess excellent test-retest reliability in between-group designs. 12,16,23,24

Imaging parameters
Blood-oxygen-level dependent fMRI was performed on three Siemens Trio 3 T MR-Tomographs at the Life and Brain Center of the University of Bonn, the Central Institute of Mental Health Mannheim, and the Charité-Universitätsmedizin Berlin. At all sites, identical sequences and scanner protocols were employed (EM: 33 slices, axially tilted (−30°), slice thickness 2.4 mm + 0.6 mm gap, field of view (FOV) 192 mm, repetition time (TR) 1.96 s, echo time (TE) 30 ms, flip angle 80°; all other fMRI tasks: 28 slices, slice thickness 4 mm + 1 mm gap, FOV 192 mm, TR 2 s, TE 30 ms, flip angle 80°). Quality-control measurements were conducted at all sites on every day of data collection according to a multicenter quality-assurance protocol, revealing stable parameters over time. 25 To account for any variance related to differences across sites, the site was used as a covariate for all statistical analyses.

Functional image processing
Image processing and statistical analyses were conducted using statistical parametric mapping methods as implemented in SPM8 (http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). Scans were subjected to a strict quality assessment before inclusion into further analyses. Following data-quality assessment and application of general exclusion criteria (see above), n = 472-509 subjects were included in the respective analyses (see Supplementary Figure 1 and Supplementary Methods for details). During pre-processing, images were realigned to a mean image (movement parameters confined to < 3 mm translation and < 1.7° rotation between volumes), slice-time-corrected, spatially normalized to a standard stereotactic space (a brain template created by the Montreal Neurological Institute) with a voxel size of 3 × 3 × 3 mm and smoothed with a 9 mm full width at half maximum Gaussian filter. A first-level fixed-effects model was computed for each participant and task. Regressors were created from the time course of the experimental conditions of interest per task and convolved with a canonical hemodynamic response function. Movement parameters, and for the ToM task also instructions and button presses, were included in the first-level models as covariates of no interest. For each subject, individual contrast images of the task effect (WM: 2-back > 0-back; EM: memory > control; RP: anticipation of monetary win/loss > anticipation of neutral outcomes; ToM: mentalizing > control; FMT: faces > shapes) were subsequently entered into group statistics.

Statistical group analyses
RPS: To test for genetic association with the intermediate phenotypes, the respective individual contrast images were analyzed using second-level multiple regression models for each task, including the RPS as regressor of interest, and age, sex, site, subgroup (no familial liability for psychiatric disorders, affected first-degree relative of patients with schizophrenia, bipolar disorder or depression), chip used for genotyping and the first three principal components (derived from a statistically independent set of common SNPs
For the WM, EM and FMT tasks, ROIs were defined a priori using anatomical labels provided by the Wake Forest University Pick Atlas (www.fmri.wfubmc.edu/downloads). These were the bilateral dorsolateral prefrontal cortex (BA46 and BA9 with subtraction of medial voxels of BA9, see Esslinger et al. 11 for details) for the WM task, the left and right hippocampus and perigenual anterior cingulate cortex (pgACC) for EM tasks and the bilateral amygdala, as well as the pgACC for the FMT (Supplementary Figure S2). As the AAL atlas does not provide anatomical labels for the ventral striatum, a spherical ROI was created for the RP task using a voxel located in the center of the ventral striatum (x = ±9, y = 11, z = −8) surrounded by a sphere of 10 mm. For the ToM task, four functional ROIs were created for those regions previously shown to be aberrantly activated in patients with schizophrenia (for example, Sugranyes et al. 26; Walter et al. 27) and to be genetically modulated. 13,16 As these regions were not covered by specific AAL ROIs, masks for the medial prefrontal cortex, bilateral temporal parietal junction (TPJ) and posterior cingulate cortex/precuneus (PCC/Pcu) were created based on coordinates reported by a meta-analysis on brain areas involved in ToM 28 using the toolbox TWURoi (see Supplementary Material for details).
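The construction of a spherical ROI around a peak coordinate can be sketched as follows. This is only an illustration, not the toolbox procedure used in the study. It assumes that "10 mm" denotes the radius, that voxel centres lie on a regular 3 mm grid aligned with the origin, and the function name is hypothetical.

```python
import math
from itertools import product


def sphere_voxels(center_mm, radius_mm=10.0, voxel_mm=3.0):
    """Return the grid points (in mm) whose distance to center_mm is <= radius_mm.

    Assumption: voxel centres sit at integer multiples of voxel_mm on each axis.
    """
    axes = []
    for c in center_mm:
        lo = math.ceil((c - radius_mm) / voxel_mm)
        hi = math.floor((c + radius_mm) / voxel_mm)
        axes.append([i * voxel_mm for i in range(lo, hi + 1)])
    r2 = radius_mm ** 2
    cx, cy, cz = center_mm
    return [
        (x, y, z)
        for x, y, z in product(*axes)
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r2
    ]


# Left and right ventral striatum centres as given in the text.
left_roi = sphere_voxels((-9.0, 11.0, -8.0))
right_roi = sphere_voxels((9.0, 11.0, -8.0))
print(len(left_roi), len(right_roi))
```

In practice the resulting coordinates would be written into a binary mask image in the same space as the normalized contrast images.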

Single SNP fMRI analyses
Comparable to RPS analyses, second-level multiple regression models were computed for each SNP and each task with number of minor alleles as regressor of interest, and age, sex, site, subgroup, genotyping chip and the first three principal components of potential population stratification as nuisance covariates. We excluded SNPs with fewer than 10 subjects in one allelic group, resulting in 105 independent analyses. Similar to RPS analyses, we extracted FWE-corrected P-values of the maximally activated voxel within each ROI for each SNP from an undirected test. These significance values were then Bonferroni-corrected for the number of autosomal SNPs tested, that is, 105 (P < 4.76 × 10−4).
We used an undirected F-test in all analyses, as, for the investigated RPS and the vast majority of SNPs, we had no hypotheses about the directionality of effects. Associations with the intermediate phenotype could be represented by reduced or increased activation, being either indicative of dysfunction, inefficiency or compensatory resilience mechanisms.
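The per-ROI significance threshold follows directly from dividing the nominal level by the number of SNPs tested. The short sketch below reproduces that arithmetic; the helper name is hypothetical, and the P-value checked against the threshold is the one reported for rs9607782 in the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise level at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


N_SNPS = 105  # SNPs with sufficient allelic distribution in the sample
threshold = bonferroni_threshold(0.05, N_SNPS)

# ROI-level FWE-corrected P-value reported for rs9607782 (amygdala, FMT).
p_rs9607782 = 1.63e-4
print(f"threshold = {threshold:.2e}")  # threshold = 4.76e-04
print("survives correction:", p_rs9607782 < threshold)  # survives correction: True
```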

Replication analysis
We replicated the effect of rs9607782 on amygdala activation in an independent control sample using the identical FMT (x = 30, y = −4, z = −11, F = 10.94, Z = 3.04, PFWE(ROI) = 0.025; Figure 3). In both samples amygdala activity decreased with increasing number of risk alleles for schizophrenia.

Analyses of LD-independent non-schizophrenia-associated SNPs (n = 105)
In order to assess the adequacy of correcting for the number of SNPs tested per ROI, we determined the number of significant effects of 105 common variants not significantly associated with schizophrenia on all neuroimaging phenotypes. To this end, we generated a random set of 10 000 SNPs across the whole genome. From this set we randomly selected 105 variants with a P-value of > 0.2 for association with schizophrenia and a minor allele frequency of > 10%/< 90% in the PGC_SCZ52 data set. We observed a median of three (range 0-7) significant hits applying a threshold of PFWE(ROI) < 0.05 across the respective ROIs. None of these associations withstood correction for the number of variants tested per ROI (Supplementary Figure S3). The difference between the number of statistically meaningful associations per ROI using schizophrenia-related and -unrelated SNPs fell marginally short of significance. However, overall the P-values observed with schizophrenia-associated variants were significantly lower than among associations with unrelated variants (P < 0.005 after 10 000 permutations; one-tailed t-test; see Supplementary Material for details).
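A one-tailed permutation comparison of two sets of P-values can be sketched as below. This is not the procedure from the Supplementary Material: as a simplifying assumption it uses the difference in mean −log10(P) as the test statistic rather than a t statistic, and the function name and input values are hypothetical.

```python
import math
import random
from statistics import mean


def permutation_p_value(risk_p, null_p, n_perm=10_000, seed=0):
    """One-tailed permutation test: are risk-SNP P-values smaller than null-SNP P-values?

    The statistic is mean(-log10 P) of the risk set minus that of the null set;
    labels are shuffled n_perm times to build the reference distribution.
    """
    rng = random.Random(seed)
    a = [-math.log10(p) for p in risk_p]
    b = [-math.log10(p) for p in null_p]
    observed = mean(a) - mean(b)
    pooled = a + b
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Illustrative inputs only; these are not values from the study.
risk_set = [0.001, 0.004, 0.02, 0.03, 0.2]
null_set = [0.15, 0.3, 0.45, 0.6, 0.9]
print(permutation_p_value(risk_set, null_set, n_perm=2_000))
```

Shuffling labels makes no distributional assumption about the P-values, at the cost of treating the individual tests as exchangeable.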

DISCUSSION
We investigated the impact of genetic risk for schizophrenia on a range of functional imaging phenotypes reliably activating distributed brain networks covering five RDoC subdomains with well-established relevance to the disease. We analyzed the effects of (i) a polygenic RPS and (ii) 105 single SNPs genome-wide significantly associated with schizophrenia, in light of the benefits and drawbacks of both approaches. RPS offer assessment of accumulated genetic risk in a limited number of tests, but prohibit conclusions regarding specific contributions of SNPs and might conceal effects of some risk variants when combined with irrelevant signals. Single SNP analyses, on the other hand, have the disadvantage that the larger number of tests (i) heightens the risk of false-positive results if not correctly accounted for (type I error) and (ii) increases the likelihood of finding only those effects whose effect size is overestimated (winner's curse). The analysis of the polygenic RPS revealed two significant associations at a standard neuroimaging significance level (PFWE(ROI) < 0.05). The RPS predicted pgACC activity during EM recognition. pgACC recruitment was previously found to be modulated by genetic risk for schizophrenia during active EM retrieval. Furthermore, we found an association between RPS and PCC/Pcu activity during ToM. Activation of the PCC/Pcu, one of the crucial mentalizing areas, has been associated twice with a risk variant within ZNF804A (rs1344706) in high LD with a SNP included in the RPS (rs11693094; R² = 0.84). However, it must be emphasized that these results would not withstand multiple comparison correction for the total number of ROI analyses across all tasks (P < 0.0025, that is, P < 0.05 FWE across 20 ROI analyses). With such a stringent threshold none of the above-mentioned results would remain significant and, in fact, one false-positive finding would be expected in 20 independent tests (at P = 0.05).
As we investigated 105 independent SNPs, we strictly corrected for this number within each ROI. Only one association withstood correction for multiple testing: rs9607782 was associated with right amygdala activity during implicit emotion processing. To the best of our knowledge, this SNP was not previously investigated in imaging genetics studies. Rs9607782 is located in an LD block with EP300, L3MBTL2, CHADL and RANGAP, and is in high LD (R² = 0.77) with a missense mutation in EP300 (rs20551), suggesting functional relevance. Besides association with schizophrenia, mutations in the EP300 gene are responsible for the Rubinstein-Taybi syndrome, a developmental disorder that includes intellectual disability, impulsivity, distractibility and mood instability. 29 EP300 encodes p300, a protein that functions as histone-acetyltransferase and is expressed in the brain in limbic and cortical regions. In mice, inhibition of p300 activity significantly impairs fear memory consolidation and associated neural plasticity in the lateral amygdala. 30 P300 inhibition further induced enhanced anxiety and mild cognitive impairment in a water maze task. 31 Altered emotional significance detection and maladaptive appraisal have been associated with amygdala dysfunction in schizophrenia before. 32-34 Thus, previous evidence supports the potential relevance of the observed association, particularly as we were able to replicate our result in an independent sample (Figure 3). This finding taps into the RDoC domains social processes and negative valence. Importantly, the fMRI paradigm used to measure this potential intermediate phenotype is well qualified as an RDoC subdomain as it fulfills required criteria for good psychometric properties, 24 association with psychopathology (association with schizophrenia was repeatedly shown on the behavioral and the brain level) 35,36 and heritability (with heritability estimates for emotion identification ranging between 0.21 and 0.43). 37,38
Other associations, although not surviving correction for multiple comparisons, include SNPs previously found to be related to imaging phenotypes from all RDoC domains assessed (see Table 1). For example, rs2007044 within CACNA1C was associated with hippocampal activity during EM, 10,14,15,39 but also with bilateral ventral striatal activation during RP. Rs11693094 within ZNF804A was associated with activity of ToM areas (medial prefrontal cortex, left temporal parietal junction) as shown before 13,16 and also with striatal activity. In addition, there were associations of variants within TCF4 (previously found to have an impact on hippocampal volume) 40 with hippocampal and pgACC activity during EM as well as with amygdala recruitment during implicit emotion processing. Rs1702294 within MIR137 (a gene previously found to have an impact on frontal-mediotemporal connectivity) 41,42 was associated with hippocampal activity, and rs2905426 within NCAN (previously associated with cortical thickness and folding in schizophrenia) 43,44 was associated with ventral striatal activation during RP.
All of these effects would stand out as significant hits in association studies on imaging phenotypes that typically apply a threshold of PFWE(ROI) < 0.05. However, an increasing number of statistical tests always comes at the expense of an increasing risk of type I error if not corrected for properly. In total, we found a median of 4 and up to 8 significant hits per ROI, which again, given a significance threshold of P < 0.05, is compatible with chance findings in 105 tests. In fact, these numbers correspond well with those found by Sullivan, who observed similar numbers in simulated genetic data containing no valid associations. 45 In his data, an uncorrected significance level of P < 0.05 yielded a proportion of false-positive findings of 96.8% (at least one positive finding in 968 of 1000 simulations). When Sullivan corrected for the number of SNPs he tested (which is in accordance with what we did) he still found a false-positive proportion of 31.4%. 45 Thus, accounting for the number of SNPs may still be insufficient and additional correction for the number of ROIs could be necessary. Indeed, Sullivan reported that correction for the total number of independent tests resulted in the appropriate false-positive proportion of 5%. Application of such strict correction for our single SNP analyses (in our case P < 2.3 × 10−5) would result in no finding being interpretable as statistically significant. On the other hand, the combination of FWE correction across ROIs with Bonferroni correction for the number of tests could be too conservative for fMRI data, taking into account that task-related ROI activation may be highly correlated, that is, is not truly independent. By testing associations of 720 SNPs not associated with schizophrenia with functional brain correlates of WM and implicit emotion processing, Meyer-Lindenberg et al. 46 showed that the false-positive proportion was no higher than 1.0-4.1% applying FWE correction alone. If this applies to our data, those SNPs surpassing the standard threshold of PFWE(ROI) < 0.05 may be true findings. Hence, to evaluate whether our findings contain true signal, we applied a similar strategy by testing the association of 105 SNPs not associated with schizophrenia or any mental function with our imaging phenotypes. Whereas the statistical difference for the total number of statistically meaningful associations among schizophrenia-related and -unrelated variants fell short of the significance threshold, we observed significantly higher probabilities for associations of disease-related than -unrelated variants. These results at least tentatively suggest further true signal in our data. Nevertheless, we believe that in view of the large number of analyses we carried out in total, independent replication is essential and should be made mandatory in imaging genetics research, just as in psychiatric genetics research.
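The chance expectations discussed above can be made concrete with a small calculation. The sketch assumes fully independent tests and uniformly distributed P-values under the null, which the text itself notes is unlikely to hold exactly for correlated ROI activations; the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of null tests falling below alpha."""
    return alpha * n_tests


N_SNPS = 105  # SNPs tested per ROI
N_ROIS = 20   # ROI analyses across all tasks

# Uncorrected testing of 105 SNPs within a single ROI at P < 0.05.
print(expected_false_positives(0.05, N_SNPS))       # about 5 hits expected by chance
print(familywise_error_rate(0.05, N_SNPS) > 0.99)   # True

# Bonferroni threshold for all SNP-by-ROI tests, close to the value quoted in the text.
print(f"{0.05 / (N_SNPS * N_ROIS):.2e}")
```

Under these assumptions roughly five nominal hits per ROI are expected from noise alone, which is the scale of the median of 4 and maximum of 8 reported above.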
Our findings point to several issues relevant to the field of imaging genetics. It is foreseeable that larger genetic studies will discover more genome-wide significant genetic variants associated with psychiatric disorders. A brute force approach, that is, testing and correcting for all significantly associated variants known, will require a magnitude of statistical power difficult to achieve in functional imaging. There are three principal ways to respond to this conundrum. First, increase sample sizes (the mantra of genetics), for example, through consortial data accumulation worldwide. This is a reasonable path that has already been taken by the ENIGMA consortium, 47 to which we have also contributed. 17,48 However, costs and data availability limit this path and, in the foreseeable future, (relatively) large sample sizes will likely be restricted to structural and resting state data. The aggregation of task-related functional data will be even more difficult because of demands on harmonization of experimental procedures and psychometric task properties. Second, restrict analyses to theory-driven a priori hypotheses. This is formally correct but the possibility remains that 'suitable' hypotheses are established post hoc after extensive data mining. Hence, this is difficult to control without prior registration of hypotheses, in particular as more and more groups are performing genome-wide genotyping. That said, exploratory analyses should still be accepted if labeled appropriately and substantiated by replication studies. Third, and most forward-looking, the use of data reduction, feature selection and multivariate methods. 49 Data reduction methods might include polygenic RPSs, gene set enrichment analyses or pathway analyses on the genetic side, and network analyses, independent component analyses, or graph theory on the imaging side. In addition, multivariate machine-learning methods might be useful to map high-dimensional genetic and neuroimaging data sets; this, however, also requires relatively large data sets.
Apart from these three principal approaches we would like to encourage the field to publish replication studies and in particular negative findings to enable future meta-analytical approaches. In addition, standards should be developed calling on imaging genetic studies to additionally publish whole-brain-effect sizes to facilitate replication and meta-analysis.
Our results suggest that risk scores aggregated across multiple independent loci may be less helpful than hoped for intermediate phenotype characterization. There is some evidence that brain effects are not as linear as associations with a (linear) polygenic RPS would imply. 50 Further, it should be emphasized that comparable polygenic scores do not necessarily include the same combination of risk variants. Rather, each individual will have a unique combination of risk variants, and it is thus not reasonable to assume that similar scores will have an impact on the same neural region or circuit. A possible solution would be genotyping a large sample of individuals and deeply phenotyping (using, for example, neuroimaging) a subset at the upper and lower tail of the distribution. This does not speak against the usefulness of RPS in general, but in intermediate phenotype research, more hypothesis-driven approaches to RPS definition may prove more fruitful.
Coming back to our own study presented here, we note that our data are limited by the combined analyses of subjects with and without familial liability for psychiatric disorders. Still, all subjects had a negative lifetime history of psychiatric disorders as evidenced by a respective diagnostic interview, and we accounted for subgroup status in all of our analyses.
Please note also that, in order to limit our variables on the imaging side, we restricted our analyses to regional effects using a ROI approach and did not test for other possible measures that focus on connectivity 11 or network parameters. 51,52
In summary, although it unquestionably is a big challenge to acquire enough functional neuroimaging data for strict correction of both high-dimensional data sets (that is, imaging and genetics) using a brute force approach, we argue for the continued acquisition, analysis and publication of imaging genetics results. These should include theory-driven restricted analyses, as well as stringent significance level adjustment and independent replication. Implementing these methodological requirements, our most robust finding was an association of a variant near EP300 with amygdala function during implicit emotion processing, a finding that is supported by preclinical and clinical findings and points to a disease-relevant mechanism potentially mediating aberrant anxiety processing and/or fear memory.

CONFLICT OF INTEREST
AML received consultancy fees from Astra Zeneca, Elsevier, F Hoffmann-La Roche, Gerson Lehrman Group, Lundbeck foundation, Outcome Europe Sárl, Outcome Sciences, Roche Pharma, Servier International and Thieme Verlag, and lecture fees, including the travel fees, from Abbott, Astra Zeneca, Aula Médica Congresos, BASF, Groupo Ferrer International, Janssen-Cilag, Lilly Deutschland, LVR Klinikum Düsseldorf, Servier Deutschland and Otsuka Pharmaceuticals. HW received a speaker honorarium from Servier. The remaining authors declare no conflict of interest.