The impact of methodology on the reproducibility and rigor of DNA methylation data

Epigenetic modifications are crucial for normal development and implicated in disease pathogenesis. While epigenetics continues to be a burgeoning research area in neuroscience, unaddressed issues related to data reproducibility across laboratories remain. Separating meaningful experimental changes from background variability is a challenge in epigenomic studies. Here we show that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. We examined genome-wide DNA methylation and gene expression profiles of hippocampal tissues from wild-type rats housed in three independent laboratories using nearly identical conditions. Reduced-representation bisulfite sequencing and RNA-seq respectively identified 3852 differentially methylated and 1075 differentially expressed genes between laboratories, even in the absence of experimental intervention. Difficult-to-match factors such as animal vendors and a subset of husbandry and tissue extraction procedures produced quantifiable variations between wild-type animals across the three laboratories. Our study demonstrates that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. This is particularly meaningful for neurological studies in animal models, in which baseline parameters between experimental groups are difficult to control. To enhance scientific rigor, we conclude that strict adherence to protocols is necessary for the execution and interpretation of epigenetic studies and that protocol-sensitive epigenetic changes, amongst naive animals, may confound experimental results.

Detailed epigenome analyses applied to translational disease models typically find dozens to thousands of potential epigenetic modifications at gene regions, and the process of identifying causal factors is unavoidably challenging and time-consuming. Adding further to this complexity are observations of major differences in epigenetic signatures among different models of the same disease. For example, a recent study examined genomewide DNA methylation levels without matching experimental protocols in three different animal models of epileptogenesis, each performed in a different laboratory, and found no meaningful common changes in DNA methylation associated across the three models, which led the authors to conclude that there was no mechanistic overlap among models 10 . However, one protocol-related contributing factor that has not been adequately considered in epigenetic studies is the comparison of differences between control and experimental tissue within a laboratory or between laboratories.
To begin to address whether baseline experimental analysis of DNA methylation can be influenced by interlaboratory protocol-related confounds, we sought to compare DNA methylation marks in control wild-type tissue collected from three different laboratories. Hippocampal tissues were harvested and examined for DNA methylation and associated gene expression differences across the three laboratories ( Fig. 1), minimizing protocol differences, and matching variables such as vendor, age, rat strain, and tissue processing method for analysis.

Results
Experimental and environmental factors. SAS-Sprague Dawley male rats were purchased from vendors (Charles River and Envigo) nearest to our three project sites; Legacy Research Institute in Portland, Oregon (Site #1), Trinity College in Hartford, Connecticut (Site #2), and the University of Alabama at Birmingham in Birmingham (UAB), Alabama (Site #3). We identified factors that are typically easy to match, factors that may not always be considered, and factors that are challenging to match (Table 1). A total of 28 factors were collected in four major areas including "before-each-laboratory", "at-each-laboratory", "up-to-sacrifice", and "dissection". Three factors (10.7%; distance from breeding site, caging shape and size, and chow vendor) were all unique at each site, while 25 (89.2%) factors were shared by two or more sites. Based on the number of matched variables, Site #1 and Site #2 were the most similar to each other, sharing 22 (78.6%) factors, while Site #3 shared the least numbers of factors with Site #1 (14; 50.0%) and Site #2 (15; 53.6%).
Genome-wide profiling of DNA methylation and gene expression. Rats were 8.0 to 8.3 weeks of age at the time of arrival from the vendors. Entire hippocampi (from both hemispheres) were harvested from rats (n = 5-6) 18 to 27 days after arrival. The average body weights measured before animals were killed were 328.0 ± 15.9 g (n = 5), 310.8 ± 16.6 g (n = 6), and 337.9 ± 15.0 g (n = 6) for Sites #1, #2, and #3, respectively. Overall,  We obtained approximately 120 million 50-bp single-end reads per sample for DNA methylation profiling using reduced representation bisulfite sequencing (RRBS; Fig. 2A) and 66 million 50-bp paired-end reads per sample for gene expression profiling using RNA-Seq (Fig. 2B). The obtained sequencing reads were of high quality. A total of 8,779,630 CpG sites, corresponding to 38,185 genes, approximately 95% of 40,189 genes in the Rn6 rat genome annotation, were measured in at least one sample of the current RRBS dataset, and each sample included an average of 2,133,011 measured CpG sites (minimum 1,756,129 and maximum 2,646,863; Supplementary Table S1). A Principal Component Analysis (PCA) plot on the top 10% most variant CpGs (Fig. 2C) from the RRBS dataset illustrated that methylation profiles from Sites #1 and #2 were more similar, while samples from Site #3 were more divergent in terms of genome-wide methylation changes. Gene expression profiles (Fig. 2D) showed a higher congruence across all three sites.
Hippocampal markers and cell-type abundance. As differences in DNA methylation and gene expression across sites were observed, we decided to examine the possibility that the differences originated from individual variance in surgical procedures; the result being possible differences in regions of the hippocampus www.nature.com/scientificreports/ being studied. We collected 184 region-specific gene expression markers (Supplementary Table S2), covering CA1, CA2, CA4, dentate gyrus, dorsal, and ventral regions in the hippocampus, from the Hipposeq database 11 to examine the expression profiles in our dataset. Figure 3 illustrates the gene expression patterns of these marker genes, where no outstanding association between hippocampal regions was identified across samples. We also examined the average expression levels of known cell-type-specific neuronal marker genes, which were very comparable across three experimental sites (Supplementary Table S3). Cell-type abundance analysis using CIBERSORT 12 on the expression data revealed that the majority of the cells were neurons, astrocytes, and oligodendrocytes as shown in Supplementary Fig. 1, and their compositions were not significantly different across the three project sites (Kruskal-Wallis P value > 0.05 for each cell type; Supplementary Table S4). These results suggest that there were no systematic variances in surgical procedures between experimental sites and collected cell-type compositions.
Differential methylation and gene expression analysis. Both DNA methylation and gene expression data were analyzed in a pairwise fashion comparing samples from each site to both of the others. Differentially methylated CpGs (DMCs) were those with a methylation difference of > 25% and a false discovery rate adjusted p value < 0.01 by methylKit. Approximately 6.0% of DMCs were located in promoter regions, 34.6% in introns,   Fig. 2). Each DMC was mapped to a gene with the shortest distance from its transcript starting site and differentially methylated genes (DMG) were identified as having at least one mapped DMC. We also examined the correlation between the methylation levels of DMCs and body weight before animals were killed to assess the effect of different body weight on methylation. While the majority of the Pearson correlations were not significant, a large portion of the DMCs between Site #1 and Site #3 showed a statistically significant correlation with body weight (Supplementary Fig. 3).
Differentially expressed genes (DEGs) were identified using DESeq2 and genes with a Benjamini-Hochberg adjusted p value < 0.05. Based on the total number of DMGs and DEGs, the comparison between Site #1 and Site #2 had the smallest numbers of DMGs (n = 1,49) and DEGs (n = 58), suggesting that these two sites had the most similar DNA methylation and gene expression profiles among the three sites, which is also supported by the hierarchical clustering of samples ( Supplementary Fig. 4). The comparison between Site #2 and Site #3 was the   (Fig. 2E,F). We also performed an analysis of differentially methylation regions (DMRs) in addition to DMGs using methylKit, which identified 24-77 DMRs between sites (Supplementary Tables S11-S13).
Overlap between DEGs and DMGs. Genome-wide DNA methylation studies commonly report changes in DNA methylation in the absence of gene expression data. When combined with gene expression data, investigators often focus on the alterations in DNA methylation that can be inversely correlated with gene expression changes. To understand the relationship between epigenetic regulation and transcriptomic changes, we examined the overlap between DMGs and DEGs. Approximately 53 to 59% of differential CpG sites showed an inverse relationship between the direction of methylation and expression changes such as increased methylation with down-regulated gene expression or decreased methylation with up-regulated gene expression. This is reflective of the nuances associated with the position of DNA methylation changes in relation to gene expression and strongly suggests that even in the minority of instances where DMG show a change in gene expression, the directionality of that gene expression change cannot and should not be inferred based on whether DNA methylation is increased or decreased at a particular site.
Functional-level similarity. To infer the significance of DNA methylation changes in the absence of definitive overlap, we next identified and compared overrepresented biological functions; the purpose was to identify pathway-level functional changes that may be related to experimental variables of interest. To assess the functional-level similarity between DMGs and DEGs identified as divergent between study sites, and attempt to determine if these represented true epigenetic changes associated with experimental variables, enrichment analysis was performed using a hypergeometric test with our in-house R analysis package richR (http:// github. com/ hurlab/ richR) in terms of Gene Ontology (GO) terms 13,14 and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways 15 . The complete lists of significant GO terms and KEGG pathways overrepresented in DMGs and DEGs are given in Supplementary Tables S14-S25. Supplementary Table S26 summarized the numbers of significant biological functions identified in each gene set. DMGs have over 400 significant GO terms with adjusted p < 0.05 but very few significant KEGG pathways. Heatmaps were generated with top significant functions, where colored cells indicated significant enrichment within the corresponding dataset (Fig. 4). The top enriched GO terms were very similar across all three DMG sets (Fig. 4A), while very few GO terms were enriched partially due to the small numbers of DEGs. KEGG analysis revealed different sets of pathways were enriched in these gene sets. "Pathways in cancer" was identified in all three DMG sets, but no apparent theme was identified.

Discussion
Each of the four major categories of experimental/environmental factors used in the present study identified noticeable differences in the methylome and transcriptome. In the "before-each-laboratory" category, the animal vendors were different; Charles River Laboratory for Sites #1 and #2 and Harlan (Envigo) for Site #3. Although we purchased Sprague-Dawley rats from both vendors, there could be vendor-specific genetic differences. Studies have shown that many animal models of the same strains could have phenotypic as well as genetic variances according to the sources (vendors) [16][17][18] . In addition to possible genetic differences, these two vendors used different chows (Purina 5L79 at Charles River and Teklad Global 18% Protein Rodent Diet at Envigo). Although we did not examine specifically the possible impact of animal chow on the epigenome, other studies have reported differences in animal phenotypes resulting from changes in nutrition; DNA methylation was labile in response to nutritional influence 19,20 . Sites #1 and #2 used the same animal vendor as well as the same branded chow (LabDiet 5001) at their laboratories, while Site#3 used a different animal vendor and a different selection of chow (NIH open formula rat sterilizable diet). These differences are well aligned with the more outstanding differences in methylome and transcriptome between Site #3 and other sites.
Other noticeable factors include days-from-arrival-to-sacrifice in the "up-to-sacrifice" category and type-ofbuffer and the use of air bubbles in the "dissection" category. Rats were killed and hippocampi dissected 27 days This difference may be associated with different body weights, which also showed significant differences across the project sites and correlated with different methylation levels of DMCs ( Supplementary Fig. 3).
There is a possibility that the delay in sacrificing by 8 to 9 days along with changes in body weight affected the DNA methylome; however, it is not clear the degree to which the changes were attributable to delay. As for the different factors in the "dissection" category, Site #3 used artificial cerebrospinal fluid as their choice of a buffer solution with 95% O 2 and 5% CO 2 bubbled, while the other two sites used 0.9% saline. Although hyperoxia may result in genome-wide changes in DNA-methylation 23 , the effect of relatively short-term exposure during dissection on methylation change would not be expected to be substantial due to the time it generally takes to see methylation changes in culture.
Changes in methylation and gene expression can be either protocol-induced variations or could be considered epigenetic noise. Accordingly, we examined changes in pathways and biological functions using GO and KEGG analyses. No GO term changes related to DNA methylation were significantly enriched. However, one GO term that was changed was histone deacetylation; it was significantly enriched among the DEGs between Site #1 (Legacy) and Site #3 (UAB). Six DEGs, including Per1, Per2, Rbm14, Bcl6, Sfpq, and Prkd2, were included in this set, suggesting a potential difference in another epigenetic marker of histone deacetylation is taking place.
Nearly identical protocols (Sites #1 and #2) resulted in a close match of DNA methylation and RNA-seq profiles, whereas seemingly minor differences induced major changes in the DNA methylome and transcriptome. Although it is impossible to gauge the extent of their individual or combined effects on DNA methylation changes in this study, some of these factors have been implicated in modulating epigenetic changes 20 .
We also examined the relationship between the significant changes in DNA methylation and significant expression change in the nearest gene, based on the assumed inverse correlation between methylation and gene expression. Little overlap was observed up to 5% of DMGs with DEGs, approximately 59% of them showed an inverse relationship (increased methylation with down-regulated gene expression or decreased methylation with up-regulated gene expression), suggesting that the directionality of gene expression change cannot and should not be inferred based solely on whether DNA methylation is increased or decreased at a particular site. Associating DNA methylation to gene expression is very challenging and DNA methylation often has a strong influence through most distal effects as at enhancer elements or CTCF binding sites 24,25 ; therefore, our current overlap analysis has room for improvement. www.nature.com/scientificreports/ Bioinformatics approaches for addressing the batch differences in certain high-throughput data analysis are available by using either tools such as ComBat 26 , SVA 27 , and removeBatchEffect 28 , or including the batch information as a covariate. However, caution is needed when using batch correction methods as they may bias the data in unpredicted ways 29 . Our study demonstrates that seemingly minor experimental variations, even under normal baseline conditions, can have a significant impact on epigenome outcome measures and data interpretation. This is particularly meaningful for neurological studies in animal models, in which baseline parameters between experimental groups are difficult to control 10 . Therefore, strict guidelines for the execution and interpretation of epigenetic studies, possibly including additional controls in experimental design to adjust protocol-sensitive epigenetic, are needed to enhance scientific rigor, and these data identify protocol-sensitive epigenetic changes that may confound experimental results.

Methods
Animals. SAS Sprague Dawley (SD) male rats were purchased from the nearest vendors (Charles River and Envigo) from our three project sites, including Legacy Research (Site #1), Trinity College (Site #2), and the University of Alabama at Birmingham (Site #3). Animal handling was approved by the Institutional Animal Care and Use Committee (IACUC) at each of three sites and summarized as listed in Table 1. The investigation conformed to the National Research Council of the National Academies Guide for the Care and Use of Laboratory Animals 30 and complied with the ARRIVE guidelines.
RRBS and RNA-Seq. Whole hippocampi from each animal were surgically dissected and flash-frozen in liquid N 2 and stored at -80 °C before being shipped to the University of North Dakota (UND), where samples were collected, de-identified, and stored at -80 °C for deep sequencing analysis. Once collected, all samples were processed using Qiagen's AllPrep DNA/RNA Mini Kit (Germantown, MD; Product ID: 80,204) to individually extract RNA and DNA from each flash-frozen sample. All RNA and DNA samples were stored at -80 °C before being sent frozen in dry ice to the University of Michigan Genomics and Epigenomics Core for deep sequencing.
The RRBS procedure was adapted as previously described and performed at the University of Michigan Epigenetics Core facility 31 . Briefly, genomic DNA quantity was measured using a Qubit fluorometer (ThermoFisher Scientific, Waltham, MA) and the quality assessed using TapeStation (Agilent, Santa Clara, CA). Genomic DNA was digested with Msp1 restriction enzyme and purified using phenol:chloroform extraction and ethanol precipitation. Following Msp1 digestion, genomic DNA underwent blunt-end digestion, phosphorylation, and ligation of adapters with methylated cytosines. Ligated fragments, processed for size selection using agarose gel electrophoresis, were bisulfite converted, amplified by PCR, then cleaned using AMPure XP beads (Beckman Coulter Life Sciences, Indianapolis, IN). Libraries were quantified using Qubit fluorometric quantification (ThermoFisher Scientific, Waltham, MA), analyzed using a TapeStation system (Agilent, Santa Clara, CA), then sequenced on an Illumina Hi-Seq 4000 platform (Illumina, San Diego, CA).
Total RNA isolated from individual microglia preparations was evaluated using a TapeStation system (Agilent, Santa Clara, CA). The NEBNext Ultra II RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) was used to construct the sequencing library. Resultant cDNA was commercially sequenced with a paired-end read length of 50 bases using an Illumina Hi-Seq 4000 platform (Illumina, San Diego, CA).
RRBS and RNA-Seq data processing. Quality control assessment of RRBS data was completed using FastQC v0.11.5 32 . Raw Sequencing reads were cleaned using Trim Galore v0.5.0, to remove reads with adapter contamination and reads with a Phred quality score less than 30 33 . Cleaned reads were mapped to an in silico bisulfite-converted rat reference genome rn6 using Bismark v0.20.0 and Bowtie2 v2.3.4.2 34,35 . CpG sites on X and Y chromosomes were excluded. PCA was performed on the most variant 10% of the measured CpG sites. methylKit v1.8.1 36 was used to count uniquely mapped reads and assess changes in methylation between each site. Differentially methylated CpGs (DMCs) were defined as a 25% or greater difference in cytosine methylation levels between sites with an adjusted p value < 0.01, which were then aggregated into differentially methylated genes (DMGs) based on the unique gene identifiers. DMCs were annotated based on genes and CpG island features using gene bodies and 2, 5, and 10 kb regions upstream from transcription start sites using the genomation R package 37 . Annotation of murine CpG islands was obtained from the University of California, Santa Cruz, CA (UCSC, https:// genome. ucsc. edu/ cgi-bin/ hgGat eway? db= rn6). The gene annotation was obtained from Ensembl and NCBI gene databases 38 .
Quality control assessment of RNA-Seq data was completed using the FastQC v0.11.5 32 . Raw sequencing reads were cleaned using Trimmomatic 39 to remove reads with adapters or poly-N sequences as well as reads with quality scores < 30. Cleaned reads were mapped to the rat reference genome rn6 using HISAT2 40 . Genes with zero expression across samples were omitted from differential expression analysis. featureCounts was used to assign mapped reads to unique genomic features 41 . PCA was performed to gain insights into the association between samples. Differentially expressed genes (DEGs) were identified using the DESeq2 R package with a significance cutoff of < 0.05 adjusted p value 42 .
Hippocampus region-specific expression markers. To assess the possibility of imbalance in the dissected hippocampal regions, which might have resulted in the observed methylation and expression differences, we examined the expression levels of region-specific hippocampal markers. We compiled 187 region-specific markers from the Hipposeq, a comprehensive RNA-Seq database of gene expression in hippocampal principal neurons 11  www.nature.com/scientificreports/ Cell-type composition analysis. To assess the cell-type composition difference of the samples across three sites, the expression levels of selected known neuronal cell-type-specific markers for microglia, astrocytes, neurons, oligodendrocytes, as well as those are known to be expressed in various cell types. These marker genes were compiled in our recent study 43 on alpha-synuclein-associated differential methylation signatures in microglia, which employed two high-throughput expression profiling studies in rodent brains 44,45 . We also employed CIBERSORT 12 to infer the cell type abundance based on gene expression-and marker genes. Brain cell-typespecific signatures of 903 genes were obtained from a study in human brains 46 , which included astrocytes, endothelial, fetal quiescent, fetal replicating, microglia, neurons, oligodendrocytes, oligodendrocyte progenitor cells (OPC). These human gene signatures were mapped to rat genes based on the Ensembl Genes database v104 annotation using the biomaRt Bioconductor package 47 . CIBERSOFT function available in IOBR R package was used to estimate the abundances of the member cell type from the RNA-Seq data. Kruskal-Wallis test was used to examine the significant difference in the cell-type composition across the samples.

Functional enrichment analysis.
To identify significantly over-represented biological functions, a functional enrichment analysis of both DMGs and DEGs was conducted using our in-house enrichment analysis R package richR (http:// github. com/ hurlab/ richR). Gene Ontology (GO) 48 and Kyoto Encyclopedia of Genes and Genomes (KEGG) 49 were used as the target annotation sources of biological functions and pathways. A Benjamini-Hochberg adjusted p value of < 0.05 was used as the significance cutoff. VennDetail Bioconductor package 50 was used to examine the overlap at the biological functions/pathways as well as at the gene level (DEGs and DMGs).

Data availability
The raw sequencing data have been deposited into the NCBI Gene Expression Omnibus database (Accession ID: GSE164833). Analysis scripts used in the current study are freely available at our GitHub site (https:// github. com/ hurlab/ Proto colMa tters). All other data generated or analyzed during this study are included in this article and its supplementary materials. www.nature.com/scientificreports/ Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.