Sex differences in DNA methylation of the cord blood are related to sex-bias psychiatric diseases

Sex differences in the prevalence of psychiatric disorders are well documented, with exposure to stress during gestation differentially impacting females and males. We explored sex-specific DNA methylation in the cord blood of 39 females and 32 males born at term and with appropriate weight at birth regarding their potential connection to psychiatric outcomes. Mothers were interviewed to gather information about environmental factors (gestational exposure) that could interfere with the methylation profiles in the newborns. Bisulphite converted DNA was hybridized to Illumina HumanMethylation450 BeadChips. Excluding XYS probes, there were 2,332 differentially methylated CpG sites (DMSs) between sexes, which were enriched within brain modules of co-methylated CpGs during brain development and also differentially methylated in the brains of boys and girls. Genes associated with the DMSs were enriched for neurodevelopmental disorders, particularly for CpG sites found differentially methylated in brain tissue between patients with schizophrenia and controls. Moreover, the DMS had an overlap of 890 (38%) CpG sites with a cohort submitted to toxic exposition during gestation. This study supports the evidences that sex differences in DNA methylation of autosomes act as a primary driver of sex differences that are found in psychiatric outcomes.

bereavement, unwanted pregnancies, military invasion and natural disasters increased the risk for the offspring to develop neurodevelopmental diseases later in life, such as schizophrenia, depression and anxiety [17][18][19][20] .
The relation between psychiatric diseases and adverse experience in early life were included in the Developmental Origins of Health and Disease (DOHaD) hypothesis. Most DOHaD diseases have a sex-biased susceptibility 21,22 , suggesting that the response to stress is different in males and females. Accordingly, when exposed to stress, males suffer from behavioral problems earlier in development -at one year of age 23 , whilst females show the effects at later life periods 24,25 . In addition, there is a clear temporal specificity and sexually dichotomous influence of the effects of prenatal stress [26][27][28] . Males seem to be more vulnerable to develop schizophrenic, autism and attention-deficit/hyperactivity disorder (ADHD) symptoms in response to perinatal stress, all these were associated with intra uterus growth restriction (IUGR) 29,30 .
Animal models also contributed to association between methylation alterations and different response to stress exposition in females and males. For instance, intrauterine exposure to Bisphenol A (BPA), an estrogenic endocrine disruptor, seems to induce DNA methylation changes in male cortex and in female hypothalamus as result of expression changes in ESR1 and DNMT1 31 . In mice, treatment with valproic acid during gestation induced sex-specific morphological alterations at a discreet region of the midbrain, including a higher cell counts in males compared to females 14 . In mice, exposure to stress in the prenatal period caused elevation of DNMT1 expression levels in females' placenta, but not males. Contrariwise, males showed response to maladaptive behavioral stress together with change in methylation and expression of the glucocorticoid receptor and corticotropin-releasing factor in the fetal brain 28 .
A correlation between changes in methylation and gene expression were suggested to be involved with the differential brain development 32 . Thus, differences in the regulation of gene expression between females and males could explain the variances in specific brain structure 8,27,28,32,33 , cognitive abilities and stress-coping strategies 28 Some studies already evaluated sex differences at methylation levels between females and males in the cord blood 34 , prefrontal cortex 32,35 and pancreatic isolates 35 , nevertheless, none considered exposition to stress. Whilst it is known that the developing brain has enhanced sex-biased sensitivity to environmental stimuli at specific periods of gestation that contributes to vulnerability to psychiatric diseases, the possibility that DNA methylation differences between females and males could be involved with psychiatric diseases, independently of exposition to stress, remains to be addressed.
To unveil the contribution of stress exposition and the sexual antagonism effects in DNA methylation as well as their association with psychiatric vulnerabilities, we conducted a genome-wide methylation analysis of cord-blood in a well-characterized cohort of children born at term and of adequate weight/size at birth. Adjustments for cell heterogeneity and stress exposure co-variables were performed followed by the identification of differentially methylated CpG sites (DMSs) between females and males. These methylation differences were explored in other cohorts, including brain development, neurodevelopmental disorders and a cohort where children were exposed to stress during gestation.

Results
General characteristics of the study population. We used a natural cohort of newborns collected at the University Hospital of São Paulo Medical School, where, typically, only mothers with a not at risk pregnancy do have labor. We randomly collected cord blood samples from 96 individuals. Then, we excluded six pre-term and two post-term newborns, as well as 11 small and seven large for gestational age newborns, since those presentations have already been associated with psychiatric outcomes and stress exposition 36 . The final cohort included 71 cases, comprised of 39 females and 32 males, all with adequate weigh/size for gestational age and born at term. Moreover, to maximize differences in methylation probably related to sexes differences, we corrected for other exposure variables that could interfere with the analysis (Table 1, Supplemental Table 1) explored by this study (females versus males). Although smoking were different between females and males (p = 0.034), only psychiatric disease in father's family presented a significant value as given by the sva package analysis (Supplemental Figure 1) and was corrected accordingly. In addition, data were adjusted for batch effects and cell type composition. Only as a proof of concept, the sex of each sample was confirmed by methylation levels of sexual chromosomes. The study design was included as a chart showing the analyses flow (Supplemental Figure 2).

Methylation differences between sexes correlate with gene expression.
To characterize methylation changes in the cord blood of females and males, methylation levels were derived for 464,964 CpG sites, excluding XYS probes. Supervised comparison between sexes identified 2,332 differentially methylated CpG sites (DMSs) with females presenting twice the number of hypermethylated sites compared to males (adjP < 0.05, 1,546 and 786 CpG sites more methylated in females and males, respectively), with 1,499 CpG sites associated to 1,113 genes ( Fig. 1; Supplemental Table S2). Upon clustering of the 2,332 CpG sites (Euclidian distance with average linkage), two clusters formed, completely separating both sexes (Supplemental Figure 3).
To evaluate if DMSs were associated with differentially expressed genes between females and males, we used an expression dataset of 2,500 samples from 13 tissues 37 . Excluding genes located in the sexual chromosomes, the 1,113 genes were enrichment among differentially expressed genes between females and males in five tissues, four being brain regions (anterior cingulate cortex, frontal cortex, cerebellum and hippocampus), and the fifth being heart (Supplemental Table 3).
Considering that the 1,113 genes have a sex-biased expression, we used MiSig/WebGestalt to verify if there was an enrichment of binding sites for estrogen and androgen receptors and binding sites located at 2 Kb up or downstream from transcriptional start site (TSS) of each of gene. This analysis resulted in 105 genes regulated by estrogen receptor alpha or beta (five motifs, adjP < 0.05), 24 genes regulated by androgen receptor (three motifs, adjP < 0.05), and ten genes regulated by both transcriptional factors. Then, to explore more distant motifs, we used Opossum 38 that searched for transcriptional factors target sites located at 5 Kb or 10 Kb up or downstream from TSS of 838 (out of 1,113) genes, revealing no enrichment for ESR1 or AR binding sites (Supplemental Table 4).
Differentially methylated CpG sites are not randomly distributed within the genome. Females have twice the number of methylated CpGs sites than males, with a non-random distribution in the genome. CpG sites more frequently methylated in females were more often located near transcriptional start sites (TSS1500) and within CpG islands, S_shores and S_shelves. Conversely, CpG sites more frequently methylated in males were enriched within gene bodies and 3′ UTR, and were associated with open sea areas (Supplemental Table 5, chi-square test, p < 0.05). Based on the predefined regions, the differentially methylated regions (DMRs) between females and males were distributed as follow: 171 CpG islands (98 and 73 hyper-and hypomethylated in females), 131 promoters (80 and 51 hyper-and hypomethylated in females), 127 genes (64 and 63 hyper-and hypomethylated in females) and 333 genome-wide tiling regions (169 and 164 hyper-and hypomethylated in females) (Supplemental Figure 4, Supplemental Table 6, adjP < 0.05). The top DMRs (adjP < 0.05 and at least 10 CpG sites) are shown in Table 2. Differentiated methylated sites between males and females are related to genes associated with normal brain development and neurodevelopment disorders. To gather a global view of the role of these genes, the 1,113 genes were annotated into gene ontology, revealing over-representation of genes related to system development, neurogenesis and neuron differentiation, striated muscle cell differentiation, and single organism behavioral and synaptic transmission. In addition, we found enrichment of genes belonging to the metabolic pathway (60 genes), the MAPK signaling pathway (23 genes), and focal adhesion (19 genes), among others (adjP < 0.01), including the insulin signaling pathway (12 genes). Finally, we observed an enrichment of genes related to neuropsychiatric disorders, stress, cancer, congenital disorders, musculoskeletal including cardiovascular diseases, endocrinological and female and male genitourinary diseases (Supplemental Table 7). To test if the 1,113 genes were associated with brain development, we searched in the Allen Brain Atlas and found that 930 genes were expressed in at least one of the 26 brain regions (8 th post-conception week to 40 years of age). Next, the comparison between the 2,332 DMS and the 7% of CpGs that changes the methylation during brain development 39 showed an enrichment, represented by 238 CpGs (p < 0.00001, Supplemental Figure 5). Further, we compared to the 521 CpG sites that, in addition to being modulated during brain development, were also differentially methylated between brains of females and males, resulting in 195 common CpG sites (p < 0.00001, Supplemental Figure 5). Finally, extracting data from the only study in schizophrenia that had used the 450 K Beadchip arrays with publicly available data 40 we searched for enrichment between 2,332 DMS between females and males and the 4,641 differentially methylated CpG sites in brain tissue from schizophrenic cases and controls. The analysis reported an overlap of 116 CpG sites between both studies, indicating an enrichment for CpG sites related to schizophrenia among CpG sites that differ between females and males in cord blood (p = 0.0139, Supplemental Figure 5). Differentiated methylated sites between females and males are enriched for psychiatric disorders in a stress exposition model. To further explore the enrichment of DMSs between sexes after correction for stress exposition among genes associated with psychiatric disorders, we compared our DMSs to the differentially methylated CpG sites in the cord blood between females and males exposed to stress (toxic agents) during gestation from an independent cohort. A study from CHAMACOS (a longitudinal birth cohort study of the effects of exposure to pesticides and environmental chemicals on the health and development of Mexican-American children living in the agricultural region of Salinas Valley, CA) assessed the methylation changes related to sex differences in the cord blood of females and males, taking into account batch effects and cell composition. Excluding XYS CpG sites, they found 3,031 DMS (25.7% of the total) located in the autosomes, with most of them (82%) more methylated in females 34 . We compared these to our 2,332 DMS and found 890 (38%) common DMSs, all presenting the same pattern regarding hypo-and hypermethylation.
To assess whether there was an enrichment of genes associated with psychiatric diseases in both data sets, CHAMACOS' and ours, we used only those DMSs related to genes. Considering that one gene may have more than one CpG site, and different CpG sites may have opposing methylation differences (hypo-or hypermethylated), depending on their location, we excluded genes represented by more than one CpG site. This analysis resulted in 472 common genes between CHAMACOS' and ours, 1,309 and 500 genes belonging exclusively to CHAMACOS' or our datasets, respectively. Next, an enrichment analysis for diseases were performed with full datasets followed by the enrichment analysis with common and exclusive genes of the two cohorts that showed that, considering all diseases, most of the psychiatric diseases were enriched in all datasets, although there were differences in the ranking as exemplified by the top ten diseases (Supplemental Table 8, diseases and  schizophrenia).
These gene sets were tested for enrichment in modules of co-expressed genes along human neocortical development 41 (Supplemental Table 9). M8, M13 and M17 were enriched within the full datasets and among common genes. M8 is comprised of genes more expressed in the beginning of gestation; genes from M13 present a decreasing expression until postconception week (PCW) 16, from which point present an increasing pattern; Figure 1. Characterization of the differentially methylated CpG sites between females and males. Volcano plot of the differential methylation analysis with X and Y axes displaying, respectively, the delta-beta values and the log10 of adjusted p-values for each CpG site. CpGs more methylated in females and males are represented in pink and blue, respectively.  and genes from M17 increase after PCW 16 until birth, from which point it seems to reach a plateau. Genes exclusively from this study (n = 500) were enriched for M9 (p = 0.0009), which contains genes highly expressed in the beginning of gestation that decrease expression levels at around 12 th week whereas genes exclusively from CHAMACOS' (n = 1,309) were enriched for M15 (p = 0.0246), that presents an increasing level of expression during the whole neocortical development. Genes found only by CHAMACOS study (total or exclusive genes) were enriched for M1, that present a decreasing expression until week 12 th , from which point present a constant increasing expression levels (Fig. 2).

Discussion
Differences underlying sexes have been associated with distinct incidence and severity of a group of psychiatry diseases, although the mechanisms are not well understood, moreover, differential methylation has been involved with this sex bias prevalence of psychiatry disorders. The immune system response during certain periods of development has sex-specific effects on regulation of various neurotransmitters important for the stress response throughout life, with consequences for physical and psychiatric disorders in later life 36 . Besides, there is a clear sex dimorphism in structures during brain development that were implicated to a sex-bias vulnerability to psychiatric disorders independent of stress exposition 42 . The potential underlying mechanisms driving these sex differences regarding stress responses and their relevance to disease were deeply discussed. The authors concluded that the continued focus and appreciation for how sex differences in stress responses may predict disease risk and resiliency is critical for developing preventive strategies and treatments 43 . In this study, we conducted a genome-wide DNA methylation profile of cord blood of a well-characterized cohort composed of babies born at term and with adequate weight, after correcting for stress exposition factors that could interfere with DNA methylation. We showed that differential methylation between females and males found in the cord blood can represent gene expression differences in tissues with sex dimorphism. The DMS were equally distributed across all 22 autosomes, as previously described 34 , but their locations were not random. Whilst females had more methylated CpGs associated with repressed gene expression, males had more methylated CpGs in the open sea, which is more often positively correlated with expression 44,45 .
Pathways enriched for the 1,113 genes related to the DMS between sexes included metabolic and MAPK pathways. Changes in the metabolism of 1-carbon and methyl donor availability in the fetal liver, a major metabolic organ, were suggested to have a sex influence in a subset of genes 46 . Several pathways related to muscle were over-represented, such as the hypertrophic cardiomyopathy and vascular smooth muscle contraction pathways. The development and establishment of musculature is thought to be a major difference between adult males and females. At molecular level, these differences seem to be present already at birth. Accordingly, cardiovascular and musculoskeletal disorders were enriched in the diseases analyses.
The enrichment analysis revealed a possible role of our gene set to the brain as CpG sites and genes are modulated during brain development on top of being differently methylated between sexes 39 . Furthermore, these genes have been associated to psychiatry diseases as demonstrated by the enrichment of the 2,332 DMSs between females and males among the 4,641 differentially methylated CpG sites in brain tissue between patients with schizophrenia and controls 40 . Among the common genes, SHANK genes participate in synaptic formation, including in SHANK2 and SHANK3, in which mutations were described in a wide spectrum of neurodevelopmental and neuropsychiatric diseases 47 .
When analyzing the near and distant motifs for sex hormones, we suggested that: (1) the differential methylation found in CpG sites could interfere with the binding in motifs close to TSS of transcriptional factors classically related to females and males and; (2) other factors that are not exclusively associated with the regulation performed by sexual hormones are contributing to differentially expressed and methylated genes between females and males. Also, most of the DMS (66%) showed higher DNA methylation in females compared to males. It has been reported that females have higher methylation levels than males, although most studies used the X and Y chromosomes in their analyses, where a substantial portion of the DMSs are located 34 . Importantly some of these studies were performed in tissues other than cord blood, such as brain or pancreas, and with older subjects, that present changes in methylation because of the aging process 35,48,49 .
X dosage and the presence of Y chromosome exerts a role for the methylation patterns on the autosomes. One X was associated to hypomethylation on the autosomes but the presence of Y changes this pattern, suggesting the existence of an interaction between X and Y dosages 50 , meaning that, in addition to the organization effects of gonadal hormones, the sex chromosomes contribute to the differential methylation between females and males and these differences might contribute the sex bias psychiatric vulnerabilities, agreeing with a previous published discussion 43 .
The importance of considering the sex differences relies on the different response from females and males to stress to maintain homeostasis, which may have different consequences for the long-term responses to perturbations from the environment 33 . Prenatal stress can induce long-term neurodevelopmental diseases, particular those related to the hypothalamic-pituitary-adrenal (HPA) response to stress 51 . This exposure to maternal stress may alter the stress-vulnerability of the embryo leading to an increased risk of psychopathology 36 . The cohort from CHAMACOS was subjected to toxic stress (e.g. exposure to pesticides) 34 whilst our study excluded factors already associated with psychiatric vulnerabilities (not born at term and not adequate size/weight for gestational age); and corrected for stress factors covariables that potentially interfere with DNA methylation. Psychiatric disorders were associated to genes from both studies. Nevertheless, comparing both data, we found 890 overlapping CpG sites, with our methylation differences (delta-beta) being smaller than theirs were, suggesting that exposition to stress might increase these differences. Comparing both datasets to the co-expressed gene modules during neocortical development revealed that some modules were similarly enriched. In addition, genes from these modules (M8: SOX1, CDKN2C, TRIP10, SALL1, EFNA4, PARD3B, CDH23, PRDM16, NEK9, C1orf61, HSPB1,  Figure 2. Enrichment analysis for the modules from Parikshak et al. 41 , that identified genes co-expressed during cortical development. Graphs show the distribution of genes related to the studies in enriched modules (p < 0.05). Scatter smooth of module eigengene values was defined by Parikshak et al. 41 at different stages of neocortical development. Different periods of lifetime are scaled from light to dark blue colors: early foetal, early-mid foetal, late midfoetal, late foetal, neonatal/early infancy and late infancy. The red line indicates time of birth. Sets of genes enriched for the modules are displayed below of each graph.
We are aware that this study has limitations. First, we controlled for a limited number of stress factors and cannot guarantee that the methylation analyses were not influenced by unexplored stress exposition factors. Second, we could not measure contamination of maternal cells and thus, did not correct for it. Third, comparisons between data from cord blood and brain were not performed with samples collect from the same individuals. Fourth, we used only exposure to pesticides and environmental chemicals as a model toxic stress. Finally, we found only one study with publicly available data that used the H450K to evaluate DNA methylation in the brain of patients with schizophrenia representing psychiatric diseases found enriched in our analyses.
This study corroborated previous findings from the literature that showed that methylation differences between females and males in the cord blood methylation are associated with brain developing tissues. We describe there that genes related to this differential methylation are enriched for genes previously associated with psychiatric diseases, even after correcting the methylation analysis for stress exposition covariables, in additional to the technical effects. These differences are not solely explained by gonadal hormone effects, but also by sex chromosomes effects on the autosomes. Moreover, they present specific pattern of gene co-expression organized in modules of neocortical development, which implies temporal specific effects.

Material and Methods
Sample collection. Samples came from the Western region of Sao Paulo city, where 6,200 women were enrolled between 26 th and 34 th week gestation between 2012-2014 (Regiao Oeste Cohort, ROC). For this study, a questionnaire was administered to mothers to evaluate levels of stress and toxic exposures during gestation (smoke, use of drugs). These children are being followed up to gather additional information related to neurodevelopmental diseases and behavioral phenotypes as a continuum of this study; this will not be explored in here due to the relatively small time-lapsed. We collected blood from the umbilical cord of 96 women that delivered at the University Hospital of Sao Paulo. All women gave written informed consent to participate in the study. The study itself, as well as the use of samples, were conducted with ethical approval granted by the Hospital's Internal review board -CEPHU under protocol number 1076/10, with all experiments performed in compliance with the Helsinki Declaration.
Sample preparation and quality control. DNA was isolated from umbilical cord blood samples using QIAamp DNA Blood Midi Kit (Qiagen). NanoDrop (Thermo Fisher Scientific) and Qubit (Thermo Fisher Scientific) was used to assess DNA purity and quantity. DNA quality was checked by electrophoresis in 0.8% agarose gels followed by bisulfite conversion whereby 1 ug DNA was treated using the EZ DNA Mehylation kit (Zymo Research Corp), following manufacture recommendations.
High quality bisulfite-converted DNA from 96 samples were hybridized in the Human Methylation 450 BeadChip microarrays (HM450K, Illumina), following the Illumina Infinium HD methylation protocol. Both experiments were processed by Deoxi Biotecnologia (www.deoxi.com) according to manufacturer's instructions. Raw data were extracted by the iScan SQ scanner (Illumina) using GenomeStudio software (v.2011.1), with the methylation module v.1.9.0 (Illumina), into IDAT files, which were used for further analyses. The HM450K platform interrogates DNA methylation levels of 485,577 loci distributed across the genome at single-nucleotide resolution. Probes were annotated according regarding their nearest genes using FDb.InfiniumMethylation.hg19 package with an additional annotation for those probes that were located at more than one gene, Hg19 coordinates from UCSC were selected. This resulted in 20,461 unique entrezID genes. The methylation levels for each CpG probe were provided in beta-values (0 indicating unmethylated CpGs, and 1, fully methylated CpGs). All analyses were performed in R environment using Bioconductor packages (http://www.bioconductor.org). All samples had high quality data passing QC using standard parameters.
Genome-wide methylation analysis. The RnBeads package 56 was applied to the dataset. Unreliable measurements were identified using Greedycut (p > 0.05) that removed 1,478 probes. Probes located in SNPs (n = 4,776) as well those located in specific contexts (n = 5,926 -non-CpG sites as well as additional SNPs) were also excluded. Samples were retained only if > 99% of sites assayed had detection p > 0.05. The background was corrected using noob method, which is based on a normal-exponential convolution using out-of-band probes 57 . Signal intensities values from probes type I and II were normalized using SWAN method that adjusts the intensities based on a quantile approach 58 . Probes located in the sexual chromosomes were filtered out remaining 464,964 probes for analyses.
To quantify methylation differences resultant from cellular composition of the blood, a purified dataset containing six blood cell types was used as ref. 59 to estimate the blood cell type contribution for each cord blood sample. Based on the estimated percentages of each cell type, a projection of the methylation levels for each cord blood sample was used in the limma based analysis of differential DNA methylation, as implemented by RnBeads 56 . Variables from the questionnaire were tested for association with the main group comparison (female versus male) indicating that only psychiatric disease in the father family was acting as co-variables. Moreover, adjustment for technical effects (e.g. batch effects) and cell type composition were performed before the differential methylation analyses.
In silico functionally exploration of data. Functional and disease enrichment analysis were performed in WebGestalt 61 using the whole genome as background. Significant values were considered to be those features comprised by at least 10 genes and adjusted p-value (adjP) < 0.01 adjusted by the multiple test as given by Benjamini-Hochberg procedure.
CpGs were compared to differentially methylated CpG sites found by three studies with publicly available data 34,40,62 using all probes from 450 K BeadChip as background. In addition, genes associated to DMSs were compared to 17 modules comprised of co-expressed genes during human cortical development 41 . For both cases, we applied the Modular Single-set Enrichment Test (MSET) 62 , which assess enrichment for a gene or CpG site (here named as feature) list of interest within a set of features using a randomization testing. First, the probability of our list of features being significantly represented in each module were compared to 10,000 simulated sets of features generated randomly from the 450 K Beadchip array (for DMS) or RefSeq (for genes) as background. Then, a p-value was calculated based on the number of simulated randomized sets that contains equal or higher number of features belonging to our list of features in the list generated by each of the three studies. We considered as significant, those comparisons with a p-value < 0.05. For the comparison with modules of genes co-expressed during neocortex development, the module eigengene values from the original study 41 were plotted in a scatter smooth plot for the significant modules.

Data access. The HumanMethylation450 BeadChip data set from this study is available in NCBI Gene
Expression Omnibus (GEO) under accession number GSE85042.