Progress in genome-wide association studies of schizophrenia in Han Chinese populations

Since 2006, genome-wide association studies of schizophrenia have identified numerous novel risk loci for this disease. However, there remains a geographical imbalance in genome-wide association studies, which to date have focused primarily on Western populations. During the last 6 years, genome-wide association studies in Han Chinese populations have identified both susceptibility loci shared across ethnicities and genes unique to Han Chinese populations. Here, we review recent progress in genome-wide association studies of schizophrenia in Han Chinese populations. Researchers have identified and replicated shared susceptibility genes, such as those within the major histocompatibility complex, microRNA 137 (MIR137), zinc finger protein 804A (ZNF804A), vaccinia related kinase 2 (VRK2), and arsenite methyltransferase (AS3MT), across both European and East Asian populations. Several copy number variations identified in European populations have also been validated in the Han Chinese, including duplications at 16p11.2, 15q11.2-13.1, 7q11.23, and VIPR2 and deletions at 22q11.2, 1q21.1-q21.2, and NRXN1. However, these studies have also identified some potential confounding factors, such as genetic heterogeneity and the effects of natural selection on tetraspanin 18 (TSPAN18) or zinc finger protein 323 (ZNF323), which may explain the population differences in genome-wide association studies. In the future, genome-wide association studies in Han Chinese populations should include meta-analyses or mega-analyses with enlarged sample sizes across populations, deep sequencing, precision medicine treatment, and functional exploration of the risk genes for schizophrenia.


INTRODUCTION
Schizophrenia is a common, severe psychiatric disorder that affects 1% of the world population, with high heritability in the range of 64-81% and a complex genetic architecture. [1][2][3] Schizophrenia is often characterized as a heterogeneous disorder, and its genetic basis has long been explored using family-based or twin-based studies. 3 In early studies, linkage and candidate gene association analyses implicated numerous putative chromosomal risk loci for this disease. However, inconsistent findings resulting from limited biological information on single loci and inadequate sample sizes sharply hampered the progress of these studies.
Since 2005, the genome-wide association study (GWAS) strategy has rapidly prompted additional studies examining the biological mechanisms of other complex diseases. The success of GWASs has typically been evaluated based on the number of susceptibility genes or risk loci discovered, which represents just one achievement of the GWAS approach. It is much more meaningful to replicate or validate the findings of various GWASs and to explain the clinical implications of specific associated single nucleotide polymorphisms (SNPs). The accumulating set of associated genes reported by GWASs may shed light on the pathophysiological mechanisms contributing to specific disease phenotypes and on potential therapeutic targets. However, until recently, most large-scale GWASs focused primarily on Western populations, although other studies have examined more diverse populations. Here, we review the progress in GWASs of schizophrenia in Han Chinese populations. Figure 1 shows the flow chart of this review, which aims to unravel causal genes of schizophrenia from GWAS risk loci across multiple ethnicities.

PROGRESS IN GWASS OF SCHIZOPHRENIA MAINLY IN EUROPEAN POPULATIONS
Common polymorphisms associated with schizophrenia in European populations
In 2006, Mah et al. reported the first GWAS of schizophrenia, which included 320 patients of European descent and 325 matched controls. They revealed that the semaphorin receptor plexin A2 (PLXNA2) may be a susceptibility locus for this disease in people of European ancestry. 4 To date, myriad risk loci for schizophrenia have been identified using GWAS approaches. 5 [...] 16,726 additional subjects, which included individuals from China and Japan, as well as Ashkenazi Jews and outbred European populations. The Chinese sample comprised 1034 cases and 1034 normal controls. 6 In 2009, three GWASs from the ISC, SGENE, and MGS consortia individually implicated the major histocompatibility complex (MHC) region located on chromosome 6p22.1, transcription factor 4 (TCF4), and neurogranin (NRGN) as key susceptibility genes for schizophrenia at a significance level of P < 5 × 10⁻⁸. [7][8][9] These findings also further supported the immune system abnormality and neurodevelopmental hypotheses of this disease. Subsequently, several independent studies also replicated the MHC findings for schizophrenia across various populations. Moreover, meta-analyses (combining GWAS results) or mega-analyses (combining GWAS data), which dramatically increase the statistical power of a GWAS by pooling subjects, have played important roles in promoting worldwide GWASs of complex diseases into the new so-called "big data era". 22 In 2011, Steinberg et al. conducted a meta-analysis in a large replication sample, which added vaccinia-related kinase 2 (VRK2) to the associated loci and replicated the MHC region and TCF4 associations. 10 In recent years, the largest and most encouraging schizophrenia GWASs have come from the Psychiatric Genomics Consortium (PGC). To date, the PGC (http://pgc.unc.edu) has involved more than 900 investigators from 40 countries, with an open-source dataset representing more than 400,000 human participants. 22 [...] 11 One year later, the PGC reported 13 novel schizophrenia loci (CACNA1C, CACNB2, ZNF323, etc.) from a much larger multi-stage GWAS (5001 cases and 6243 controls from Sweden, followed by meta-analysis with a previous GWAS of 8832 cases and 12,067 controls, and finally by replication in independent samples comprising 7413 cases, 19,762 controls, and 581 parent-offspring trios). 12 Recently, the PGC2 successfully identified 128 linkage disequilibrium (LD)-independent SNPs in 108 distinct loci using GWASs of 17,836 cases and 33,859 controls of European ancestry. 13 The susceptibility genes include some potential therapeutic targets (such as DRD2; glutamate metabotropic receptor 3, GRM3) and genes involved in neurodevelopment, glutamatergic neurotransmission (for example, GRIN2A, GRIA1, and SRR), neuronal calcium signaling (for example, CACNA1C, CACNB2, and CAMKK2), and synaptic function and plasticity (for example, KCTD13, CNTN4, and MEF2C). 13 These findings have provided new insights into the pathogenesis of schizophrenia. Table 1 summarizes the progress in GWASs of schizophrenia mainly in European populations (Shifman et [...] 5,6 The plan for PGC3 is to further enlarge the GWAS sample size to 100,000 cases; in addition, at least 20,000 subjects will be sequenced for 200 genes, and network or pathway analyses, the PsychENCODE project, etc. will be performed. 22
Rare copy number variations (CNVs) of schizophrenia in European populations
Accumulating evidence suggests that rare, large CNVs contribute to vulnerability to schizophrenia. To date, most of the evidence on CNVs has been reported from studies in populations with European ancestry. The 22q11.2 deletion was the first CNV reported to be implicated in schizophrenia, and the prevalence of the 22q11.2 deletion in patients with schizophrenia is about 0.3%. 23 [...] 25 have been identified to be associated with an increased risk of schizophrenia. 34
Functional exploration on GWAS-identified loci of schizophrenia in European populations
Moreover, one of the striking aspects of GWAS-based efforts is the exploration of the potential functional implications of risk genes for schizophrenia. Last year, two other studies based on large-scale GWASs of schizophrenia were also convincing. Sekar et al. reported that complex variations of the complement component 4 (C4) gene may increase the genetic risk of schizophrenia and further explored the potential function contributing to the mechanism of schizophrenia. 19 Another study identified a human-specific arsenite methyltransferase (AS3MT) isoform that may be one of the molecular risk factors in the 10q24.32 schizophrenia-associated locus. 20 However, the vast majority of susceptibility loci were identified in samples of European ancestry. 4-14, 19, 20 The associated variants identified in European populations might not be associated with schizophrenia in other ancestry groups because of underlying genetic heterogeneity. Therefore, large-scale studies in non-European populations are necessary not only to investigate whether the previously identified loci can be generalized to non-European populations but also to identify new schizophrenia susceptibility loci. To date, European populations account for most of the findings, reflecting relatively larger sample sizes, better collaborative mechanisms, etc. For example, the PGC3 groups are working on the meta-analyses across East Asian [...] Table 2). Most of these studies have focused on the replication or verification of the susceptibility genes for schizophrenia in Han Chinese. To identify new common genetic risk factors, two research groups independently conducted GWASs in Han Chinese populations in 2011. 15,16 Yue et al. used 479 cases with schizophrenia and 1599 healthy control subjects, followed by 4027 individuals with schizophrenia and 5603 controls as a replication sample, particularly individuals of Northern Han Chinese descent. These researchers identified a previously unknown variation in a region of chromosome 11p11.2 as a novel susceptibility locus for schizophrenia in Han Chinese populations, and they also replicated the chromosome 6p22.1 finding from European populations, which includes the MHC region. 15 Shi et al. compared the genomes of 3750 schizophrenia patients with those of 6468 controls comprising three subgroups (Northern, Central, and Southern populations). This study revealed two significantly associated regions, 8p12 (Wolf-Hirschhorn syndrome candidate 1-like 1, WHSC1L1; LSM1 homolog, mRNA degradation associated, LSM1) and 1q24.2 (mitochondrial pyruvate carrier 2, BRP44). The association of these two regions with schizophrenia was validated in another sample of 4383 schizophrenia cases and 4539 control subjects. 16 Both studies also supported previous findings from European populations indicating that the region of chromosome 6 associated with the MHC 7-9 may be involved in schizophrenia in Chinese populations as well. 35,36 The replication of the MHC locus and genes in this region is of great potential significance, suggesting vital etiological immune system mechanisms for this disease that need further consideration (Cyranoski D, Nature News 2011; http://www.nature.com/news).
However, in 2013, Ma et al. did not replicate the findings of the two above-mentioned GWASs in a sample of 976 unrelated schizophrenia cases and 1043 control subjects from Central China. 37 Other studies have also reported results that differ from those of the above-mentioned GWASs in Han Chinese or Europeans. [38][39][40][41][42][43][44][45][46][47][48][49][50][51][52][53] The differences may reflect limited statistical power, small sample sizes, or high heterogeneity across subpopulations. Overall, attempts to replicate the findings of the two above-mentioned GWASs in Han Chinese have so far yielded inconsistent results.
Some other recent studies have revealed surprising differences in GWAS-identified susceptibility genes or loci between European and East Asian populations. Luo et al. (2014) examined 30 GWAS-identified significant risk SNPs in Europeans and 10 SNPs in Han Chinese individuals; however, none of them was shared between these two geographically distinct populations. 54 Liu et al. explored the evolutionary history of the tetraspanin 18 (TSPAN18) gene and observed the potential effects of recent Darwinian positive selection on the protective allele of rs11038172 in East Asians. 55 These researchers deduced that natural selection may help to explain the strong genetic heterogeneity in schizophrenia risk and the previously inconsistent association results for schizophrenia in Europeans and East Asians.
Guan et al. assessed the initial GWAS and replicated 26 genetic variants in an independent sample of 1471 cases with schizophrenia and 1528 controls, identifying common variants on 17q25 and gene-gene interactions conferring risk of schizophrenia in Han Chinese. These authors also used an expression dataset to link tubulin-folding cofactor D (TBCD) and zinc finger protein 750 (ZNF750) mutations to disease susceptibility and to transcript levels in human brain tissues. 53
Comparison of GWAS-identified common polymorphisms of schizophrenia between Han Chinese and European populations
Notably, other factors may contribute to the differences in GWAS-identified significant genes or loci for schizophrenia among subpopulations, such as clinical heterogeneity, relatively small sample sizes, different genotyping methods, etc. In fact, Yu et al. recently completed a meta-analysis based on GWAS data from China in a relatively large sample (GWAS: 4384 cases and 5770 controls; replication: 4339 cases and 7043 controls). 56 These authors further used the PGC2 GWAS data to evaluate polygenic risk scores (PRS), comparing the overall patterns of the results from the PGC schizophrenia analysis (discovery sample) with the results from independent Chinese analyses (target sample). Genome-wide Complex Trait Analysis (GCTA) was also used to estimate the percentage of phenotypic variance explained by common SNPs in a Chinese population. Both the PGC and Yu's studies identified several chromosomal loci, including 2p16.1 (vaccinia related kinase 2, VRK2), 6p22.1 (gamma-aminobutyric acid type B receptor subunit 1, GABBR1), and 10q24.32 (arsenite methyltransferase, AS3MT; ADP ribosylation factor-like GTPase 3, ARL3), that might be susceptibility loci for schizophrenia in both European and Han Chinese subpopulations. 56 Functional exploration suggested that VRK2 and ARL3 may play important roles in neurodevelopment. 56 In the absence of a reference GWAS dataset in Han Chinese, Yu et al. used the PGC2 dataset to calculate the PRS and found that the PGC-derived PRS was valid in the Chinese sample (R² = 1.7-5.7%). 56 Meta-analyses or mega-analyses that include much larger samples of high clinical quality (at least 50,000-100,000 pairs of schizophrenia patients and controls) across European and Asian populations or subpopulations may help to explain the differences between populations. Table 3 shows the comparison of single-marker association results for schizophrenia between the Chinese population and the PGC2 sample. Researchers have identified shared susceptibility genes, such as MHC, MIR137, ZNF804A, VRK2, and AS3MT, across European and East Asian populations.
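The polygenic risk score idea described above (effect sizes taken from a discovery GWAS and applied to genotypes in an independent target sample) can be illustrated with a minimal sketch. It is not the pipeline used in the cited study: the function name, SNP identifiers, effect sizes, and genotypes below are all hypothetical, and steps that real analyses typically include (LD clumping, P-value thresholding, strand alignment, and regression of case status on the score to obtain R²) are omitted.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP id -> effect-allele count (0, 1, or 2) in the target sample
    weights: dict mapping SNP id -> log odds ratio estimated in the discovery GWAS
    Only SNPs present in both inputs contribute to the score.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical discovery-sample effect sizes (log odds ratios); not real PGC2 values.
discovery_log_or = {
    "snp_a": math.log(1.10),
    "snp_b": math.log(0.92),
    "snp_c": math.log(1.05),
}

# Hypothetical target-sample genotypes coded as effect-allele counts.
target_genotypes = {
    "case_1": {"snp_a": 2, "snp_b": 0, "snp_c": 1},
    "control_1": {"snp_a": 0, "snp_b": 2, "snp_c": 0},
}

scores = {
    person: polygenic_risk_score(genotype, discovery_log_or)
    for person, genotype in target_genotypes.items()
}
```

In a cross-population setting such as the one discussed here, the proportion of variance explained by such a score in the target sample gives a rough indication of how well discovery-sample effects transfer to the target population.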
In further analyses, Liu et al. used GWAS datasets from three distinct populations (European Americans, Han Chinese, and African Americans) to explore potentially shared pathways. 57 These authors found that five pathways (serotonergic synapse, ubiquitin-mediated proteolysis, hedgehog signaling, adipocytokine [...] are associated with schizophrenia, and found a significant increase of deletions in cases over controls. 59
Progress on other genetic markers of schizophrenia in Han Chinese
miRNA or methylation studies have also been reported in the Han Chinese population. For example, Wei et al. reported that the upregulation of miR-130b and miR-193a-3p may be a state-independent biomarker for schizophrenia. 60 Zhang et al. showed that the dysregulation of miRNA systems undermines the inhibitory effects of miRNAs, resulting in the abnormal upregulation of genome transcription in the development of schizophrenia. 61 The same group also reported that transcription factors and microRNAs (miRNAs) may contribute to the development of schizophrenia and might also be relevant to the clinical treatment of the disease. 62 Based on GWAS datasets, Luo et al. verified the association of calcium/calmodulin (CAM)-dependent protein kinase kinase 2 (CAMKK2) with schizophrenia and subsequently observed that the T allele of rs1063843 is associated with a lower expression level of CAMKK2 in this disease. 54 Luo et al. further explored the GWAS and expression data and observed that zinc finger protein 323 (ZNF323), a novel risk gene for schizophrenia, may have brain expression quantitative trait locus effects. Moreover, ZNF323 showed positive selection based on a compensatory advantage in pulmonary function. 63 Based on GWAS datasets, researchers deduced that genetic markers of human evolution may be enriched in schizophrenia, which may help to explain the differences in GWAS findings among ethnicities or subpopulations. 64 In addition, studies on clinical phenotypes or endophenotypes could offer further clues to the mechanisms of complex diseases. Ma et al. suggested that the GWAS-identified risk gene methionine sulfoxide reductase A (MSRA) may be associated with fluid intelligence in schizophrenia. 65 Fluid intelligence is the capacity to reason and solve novel problems independent of any previously acquired knowledge, and it is usually assessed with Cattell's Culture-Free Intelligence Test (CCFIT). Wang et al. reported that reduced gray matter (GM) volume was associated with polymorphisms in thromboxane A synthase 1 (TBXAS1), phosphatidylinositol-4-phosphate 3-kinase catalytic subunit type 2 gamma (PIK3C2G), and heparan sulfate-glucosamine 3-sulfotransferase 5 (HS3ST5) in first-episode, treatment-naïve patients with schizophrenia. 66 Liu et al. evaluated the effects of polygenic risk on cortical gyrification and provided some indications of how individual differences in genetic risk for schizophrenia relate to cortical morphology and brain development in the Han Chinese population. 67 Furthermore, some GWASs have been conducted to assess treatment response or side effects in Han Chinese populations.
Some GWASs with limited sample sizes point to a potential trend toward precision medicine in antipsychotic therapy. 68,69 Liou et al. suggested that rs28362691 in nuclear factor kappa B subunit 1 (NFKB1) might be involved in the development of treatment-refractory schizophrenia in Han Chinese individuals. 68 Yu et al. reported that polymorphisms in protein tyrosine phosphatase, receptor type D (PTPRD) and glutamine-fructose-6-phosphate transaminase 2 (GFPT2) were associated with the weight gain effects of atypical antipsychotic medications. 69 Both PTPRD and GFPT2 have been reported as susceptibility genes for diabetes, suggesting a novel mechanism for antipsychotic-induced weight gain. Thus, additional large-scale pharmacogenomics studies should be completed.

DISCUSSION
Sample sizes for GWASs and for pharmacogenomics
For international GWASs with schizophrenia as the primary phenotype, the sample size has grown from 320 cases in the first GWAS to 35,476 cases in the PGC2, 4,11 and will reach up to 100,000 cases in the future PGC3. 22
In a genome-wide pharmacogenomics study, McClay et al. reported on 738 subjects with DSM-IV schizophrenia who took part in the Clinical Antipsychotic Trials of Intervention Effectiveness. 70 To date, few genome-wide pharmacogenomics studies in the Han Chinese population have been published. Most pharmacogenetic studies focus on selected candidate genes or pathways. Yu et al. used 534 patients in the discovery sample to explore antipsychotic-induced weight gain. 69 Yue et al. have just completed a pharmacogenomics study including 3003 patients with schizophrenia treated with 7 antipsychotic drugs for 6 weeks in the Han Chinese population (unpublished data).
Genetic architecture of schizophrenia in Chinese populations
Population structure can lead to inaccurate associations. Therefore, it is critical to understand the genetic structure of a population. The Han Chinese are the largest ethnic group in the world, composing 20% of the world population and constituting more than 90% of China's population. However, the Han Chinese have been largely underrepresented in most GWASs. Regarding modern human origins, the "Out-of-Africa" hypothesis has been supported by genetic and archeological evidence; however, the scenario of the colonization of East Asia needs further clarification. Genetic evidence has provided some indication that humans may have migrated into East Asia along one of two routes: along the coast to South and Southeast Asia (the southern route), or through the Middle East to Central Asia (the northern route). As for Han Chinese substructure, Xu et al. reported that the Han Chinese population was intricately sub-structured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. 71 Moreover, using over 350,000 SNPs in over 6000 Han Chinese samples from ten provinces of China, Chen et al. explored Chinese population structure across geographical locations and examined the potential magnitude of Chinese population stratification. They revealed a one-dimensional "north-south" population structure and a close correlation between geography and the genetic structure of the Han Chinese. 72 Indeed, in the 1000 Genomes Project, the Chinese population was categorized under the generic terms "Southern Han" and "Northern Han". Therefore, genetic association studies of the Han Chinese population should ensure that homogeneous samples are recruited or analyzed. For example, to reduce the effects of population structure, Yu et al. performed a stratified GWAS of schizophrenia in the Han Chinese population based on the North-South sub-population structure, and then combined the GWAS results by meta-analysis. 69 Population structure should be assessed before association testing using multidimensional scaling analysis or principal component analysis, and the resulting components should be included as covariates. Moreover, the genomic inflation factor or linkage disequilibrium score regression can also be used to examine the effects of population structure on association results.
[...] (Lou, 1983). 74 Additionally, we previously estimated the heritability of schizophrenia attributable to common GWAS SNPs. Assuming a population risk of 0.01, we estimated using GCTA software that 40.2% (s.e. of 0.02) of the total variation was explained by common SNPs across the genome. 56 This suggests that the variance in liability estimated by GCTA accounts for approximately 51.3% of the observed heritability (0.40/0.78). However, this SNP-based heritability estimate is still imprecise, and a future twin meta-analysis could better estimate the heritability of schizophrenia in the Chinese population. Nevertheless, this analysis revealed that common SNPs also make substantial contributions to the risk for schizophrenia in the Han Chinese population, and that there are additional schizophrenia susceptibility loci yet to be discovered, including both common and rare variants.
The "National 686 Project" for the management of severe psychiatric disorders, launched by the central government of China, has enrolled 5.4 million patients with severe psychiatric disorders, including schizophrenia, mental retardation, and epilepsy with psychiatric syndromes. The first guideline on improving mental health services was released on 19 Jan 2017, according to 22 ministries and ministerial-level departments led by the National Health and Family Planning Commission (China Daily).
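The population-structure workflow described earlier in this section (association testing within North and South strata, combination of the stratum results by meta-analysis, and a check of the genomic inflation factor) can be sketched as follows. This is an illustration under stated assumptions rather than the method of the cited studies: fixed-effect inverse-variance weighting is one common way to combine strata and is assumed here, and the function names, SNP identifiers, and per-stratum estimates are hypothetical.

```python
import math
from statistics import NormalDist, median


def inverse_variance_meta(estimates):
    """Fixed-effect meta-analysis of (beta, standard_error) pairs for one SNP."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se


def genomic_inflation_factor(z_scores):
    """Median observed 1-df chi-square statistic divided by its expected median under the null."""
    expected_median = NormalDist().inv_cdf(0.75) ** 2  # about 0.455
    observed_median = median(z * z for z in z_scores)
    return observed_median / expected_median


# Hypothetical per-stratum results: SNP id -> (log odds ratio, standard error).
north = {"snp_a": (0.12, 0.04), "snp_b": (-0.02, 0.05), "snp_c": (0.03, 0.04)}
south = {"snp_a": (0.08, 0.05), "snp_b": (0.01, 0.06), "snp_c": (-0.04, 0.05)}

# Combine the two strata SNP by SNP, then compute test statistics.
combined = {snp: inverse_variance_meta([north[snp], south[snp]]) for snp in north}
z_scores = [beta / se for beta, se in combined.values()]

# With only three SNPs this value is not meaningful; real analyses use genome-wide SNPs.
lambda_gc = genomic_inflation_factor(z_scores)
```

An inflation factor close to 1 is usually read as little residual confounding, whereas clearly larger values suggest that stratification (or polygenicity) is inflating the test statistics.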
All of the above findings suggest that multi-omics strategies will be helpful in the future for the exploration of the genetic background or mechanism for schizophrenia in China.
To coordinate research efforts in psychiatric genetics in China, a group of Chinese and foreign investigators has established an annual "Summit on Chinese Psychiatric Genetics" to present their latest research and to discuss the current state and future directions of this field. The Summit on Chinese Psychiatric Genetics has been held four times across China, and a session at the XXIV World Congress of Psychiatric Genetics was held in Israel in Oct 2016 (www.wcpg2016.org). Well-known non-governmental organizations, such as the Beijing Genomics Institute (http://www.genomics.cn/), CapitalBio Corporation (http://www.capitalbio.com/), and other biotech companies, have collaborated on research into psychiatric disorders with multiple research institutes in China.
Furthermore, Chinese researchers have also been establishing collaborations with international fellows and scholars working on meta-analyses or mega-analyses in much larger samples across European and Asian populations. Thus, GWAS findings will further contribute to the exploration of the mechanisms of schizophrenia and of potential treatment targets for this complex disease.

CONCLUSIONS
Alongside the rapid development of international GWASs of schizophrenia, an expanded list of common risk loci has been identified in Han Chinese populations. As sample sizes and statistical power increased, many more shared susceptibility genes or loci were identified among European and East Asian populations. Moreover, there were susceptibility loci unique to Han Chinese. In conclusion, although many confounding factors may influence the results of GWASs of schizophrenia between European and Han Chinese populations or sub-populations of Han, several common or rare variations that overlap across multiple ethnicities have also been identified. Researchers have identified shared susceptibility genes, such as MHC, MIR137, ZNF804A, VRK2, and AS3MT, across European and East Asian populations. Several CNVs identified in European populations have also been validated in Han Chinese, including duplications at 16p11.2, 15q11.2-13.1, 7q11.23, and VIPR2 and deletions at 22q11.2, 1q21.1-q21.2, and NRXN1.
In the future, it should be helpful to use meta-analyses or mega-analyses that include much larger sample sizes (at least 50,000-100,000 cases of schizophrenia and a considerable number of controls) across European and Asian populations or subpopulations. Moreover, other directions may include multi-omics methods, deep sequencing of the genome, precision medicine, and further functional exploration of risk genes to explain the mechanism of the disease.

AUTHOR CONTRIBUTIONS
W.Y. drafted the first version of the manuscript. X.Y. and D.Z. critically revised the manuscript for important intellectual content. All authors approved the final version for publication. We also thank Prof. Lynn E. DeLisi for her kind help with this manuscript.