Pubertal development in healthy children is mirrored by DNA methylation patterns in peripheral blood

Puberty marks numerous physiological processes which are initiated by central activation of the hypothalamic–pituitary–gonadal axis, followed by development of secondary sexual characteristics. To a large extent, pubertal timing is heritable, but current knowledge of genetic polymorphisms only explains few months in the large inter-individual variation in the timing of puberty. We have analysed longitudinal genome-wide changes in DNA methylation in peripheral blood samples (n = 102) obtained from 51 healthy children before and after pubertal onset. We show that changes in single methylation sites are tightly associated with physiological pubertal transition and altered reproductive hormone levels. These methylation sites cluster in and around genes enriched for biological functions related to pubertal development. Importantly, we identified that methylation of the genomic region containing the promoter of TRIP6 was co-ordinately regulated as a function of pubertal development. In accordance, immunohistochemistry identified TRIP6 in adult, but not pre-pubertal, testicular Leydig cells and circulating TRIP6 levels doubled during puberty. Using elastic net prediction models, methylation patterns predicted pubertal development more accurately than chronological age. We demonstrate for the first time that pubertal attainment of secondary sexual characteristics is mirrored by changes in DNA methylation patterns in peripheral blood. Thus, modulations of the epigenome seem involved in regulation of the individual pubertal timing.

methylation patterns has previously been investigated 19 but the results were rather inconclusive due to study design and technical aspects.
In this longitudinal study of healthy Danish boys and girls with repeated blood sampling, we investigated the relationship between individual changes in genome-wide DNA methylation patterns and pubertal development. We identified several CpGs and CpG islands that closely followed pubertal transition suggesting important novel regulators of pubertal onset.

Results
From our longitudinal cohort (the COPENHAGEN puberty cohort) of healthy boys and girls, followed clinically during pubertal development, we included 22 girls and 32 boys who all had blood samples drawn for DNA isolation before and after onset of puberty. Pubertal onset was identified by thorough clinical investigation and defined as a testicular volume of ≥ 4 ml, or Tanner breast stage ≥ B2 for boys and girls, respectively. The mean age between two examinations where this threshold was reached was used as their age of pubertal onset. Pre-pubertal blood samples were drawn at a mean age of 9.1 and 8.3 years corresponding to − 2.5 and − 1.8 years prior to pubertal onset for boys and girls, respectively. Post-pubertal samples were drawn at a mean age of 15.1 and 14 years and 3.5 and 4 years after pubertal onset for boys and girls, respectively (Fig. 1a, Table 1). Two girls and one boy were excluded for technical reasons. We observed a slight tendency towards lower DNA methylation levels in open sea and in shores and shelf of CpG islands when pre-pubertal were compared to post-pubertal children and significantly (p = 0.001) higher DNA methylation levels of CpG islands in girls compared to boys (Supplementary Figure 1).
Identification of differentially methylated CpGs associated with pubertal timing. Each time point for DNA collection was assigned a specific pubertal age (i.e. years from pubertal onset). In this way, we identified CpGs that were associated specifically with pubertal development. A modified surrogate variant analysis (SmartSVA) was applied to correct for confounding effects (e.g. possible differences in blood composition) and reduced the genomic inflation to 1.2 (Fig. 1b). When boys and girls were analysed together (corrected for age and sex) 457 CpGs with a false discovery rate (FDR) below 0.05 were identified. They were evenly distributed across all autosomes ( Fig. 1c and Supplementary Table 1). Unsupervised hierarchical clustering of the most significant CpGs (94 CpGs at FDR < 0.001) resulted in nearly complete separation of children according to whether samples were obtained in pre-pubertal vs. post-pubertal states (Fig. 1d). When boys and girls were analysed separately, only the data from the boys resulted in significant CpGs (data not shown).
The core list of the 94 CpGs associated with pubertal age was used to search for significantly enriched gene ontology terms (biological process, p < 0.05) and subsequently plotted in a semantic space that group similar terms (Fig. 1e). This revealed clusters related to e.g. regulation of androgen receptor activity, anatomical morphogenesis, muscle cell development, and pancreatic juice secretion (Supplementary Tables 2 and 3).

Differentially methylated CpGs associated with circulating reproductive hormone levels.
Pubertal transition entails central activation of the HPG axis and hence increases in circulating reproductive hormones; a process which is markedly different between boys and girls. We therefore compared changes in DNA methylation to changes in reproductive hormones independently in boys and girls. Using a similar SmartSVA method as applied for pubertal age (above) to control for genomic inflation, a FDR cut-off of 0.05 identified no significant CpGs among the girls (that start to cycle post-pubertally). Among the boys, we identified 999, 492, 403, 282, and 218 CpGs that were associated with changes in testosterone, follicle-stimulating hormone (FSH), luteinizing hormone (LH), anti-Müllerian hormone (AMH), and Inhibin B, respectively (Supplementary Table 4, Fig. 2a). A binary tree obtained from a sequence of union operations between the CpGs revealed that LH CpGs were more distant than the others and that Inhibin B, AMH, and FSH were more related (Fig. 2b). As for pubertal age, a core subset of 133 CpGs associated with at least 3 of the 5 reproductive hormones (Supplementary Table 5) separated children into pre-and post-pubertal in an unsupervised hierarchical clustering (Supplementary Figure 2). Importantly, the core sets originating from CpGs associated with pubertal age and reproductive hormones, respectively, showed a highly significant overlap (p-value: < 0.0001; Fig. 2c). 52 CpGs showed association to both reproductive hormones and pubertal age (Supplementary Table 6) and included 5 (10%) and 6 (12%) CpGs in the proximity of the genes TRIP6 and KCNAB3, respectively. Genomic regions with differential methylation. We searched for genomic regions (in a window of 1000 bases) where DNA methylation in a co-ordinated manner was associated with pubertal age. This identified genomic regions that were differentially methylated in boys and girls individually (data not shown) and in both genders collectively (Supplementary Table 7). The most significant region for both genders, when analysed individually (Supplementary Figure 3) as well as collectively (Fig. 3a), was a region on chromosome 7 situated between SLC12A9 and TRIP6 (hg19 coordinates chr7:100463206-100464771). This region contains the last exon and 3′ UTR of SLC12A9 and the promoter of TRIP6. It furthermore contains both a CpG island and several potential binding sites for transcription factors (Fig. 3a). Since the region was upstream of TRIP6 and included several potential transcription factor binding sites, we analysed the expression of TRIP6 by immunohistochemistry in a target tissue for pubertal transition, i.e. testicular tissue from healthy pre-and post-pubertal (adult) males. This revealed a prominent and highly specific expression in adult testosterone producing Leydig cells (Fig. 3b), whereas TRIP6 expression in pre-pubertal Leydig cells was absent (Fig. 3c). TRIP6 staining was also found in the less mature Sertoli cells in the pre-pubertal sample but not in the mature adult testis. Ovarian tissue showed a weak staining in oocytes and granulosa cells, whereas the theca cells were negative for TRIP6 staining (Supplementary Figure 4). In addition to expression in target tissues, we also analysed the circulating levels of TRIP6 in a subset of pre-, mid-(around onset) and post-pubertal children (20 boys and 18 girls). This showed a significant increase in the circulating levels as children progressed through puberty (Fig. 3d). TRIP6 levels doubled in boys from 3.4 (95% CI: 2.5-4.2) ng/ml pre-pubertally to 6.9 (4.5-9.3) ng/ml post-pubertally. Even though variation was observed, the increase was highly significant (p-value: 0.005). Similar data was obtained in girls where the mean pre-pubertal levels were 4.4 (3.4-5.4) ng/ml and significantly increased (p-value: 0.004) to 8.3 (6.1-10.5) ng/ml in post-pubertal girls. In general, TRIP6 levels appeared to be higher in girls than in boys, and the TRIP6 promoter also seemed to be more differentially regulated in girls ( Fig. 3d and Supplementary Figure 3).
Moreover, changes in methylation levels of the same genomic region between SLC12A9 and TRIP6 were significantly associated with testosterone (boys), FSH (boys and girls), and LH (boys and girls) levels. This was however not the case for Inhibin B (boys and girls), AMH (boys and girls), or estradiol (girls; data not shown).
In addition, a region in the proximity of KCNAB3 was identified as the third most significant region (Supplementary Table 7) to be differentially methylated during pubertal transition. (a) From the COPENHAGEN puberty study of children followed longitudinally, selected pre-and post-pubertal samples from 31 boys and 20 girls were included in the study. Time relative to onset of puberty as measured by a testicular volume of 4 ml or breast tanner stage B2 or above was calculated and defined as the pubertal age. (b) Genomic inflation of the genome-wide DNA methylation patterns correlating with pubertal age were corrected by applying an improved surrogate variant analysis (SmartSVA) resulting in a genomic inflation factor of 1.2 and a qq-plot leaving 457 significant CpG sites at a FDR of 0.05 (red). (c) Manhattan plot of the significant CpG sites reveal a distribution across all autosomes. (d) Unsupervised hierarchical centroid clustering of CpGs associated with pubertal age. The resulting dendrograms were colour coded according to their height and divided samples into two major groups that nearly uniquely represented pre-and post-pubertal boys and girls. (e) Gene ontology analysis of all significant CpGs revealed several ontologies (p-value < 0.05) that could be related to pubertal development. Biological process gene ontologies were plotted in a sematic space, using REVIGO, that groups related ontologies together.
Scientific RepoRts | 6:28657 | DOI: 10.1038/srep28657 Prediction of pubertal age from DNA methylation patterns. DNA methylation patterns have been shown to predict biological aging, and with our longitudinal data on pubertal age (Fig. 4a), we therefore tried to predict the pubertal age of boys and girls based on our methylation data. Normalised and filtered probes (n = 467,873) were used as predictors in an elastic net regression model (similar to the algorithms earlier used in prediction of biological aging). The prediction error (standard deviation of residual error) was estimated using 10-fold cross validation and corrected for chronological age. Using the minimal lambda penalisation value (0.10 and 0.13), the pubertal age was predicted with a residual error of 0.39 and 0.74 year, equalling five and nine months in boys and girls, respectively (Fig. 4b,c). In comparison, the residual error obtained from our data on predicted biological aging (Fig. 4d) was 0.94 year (11 months; lambda = 0.06). When boys and girls were combined into one analysis, the residual error was 0.6 year (seven months) for prediction of pubertal age.

Discussion
To our knowledge, this is the first human longitudinal study describing individual changes in specific epigenetic profiles associated with pubertal onset. We identified a number of CpGs and CPG islands, suggesting novel key players for pubertal development.
Our study cannot reveal the cause and effect of differentially regulated CPGs. However, a downstream biological consequence of a change in a single CpG is less likely and, as such, these may represent surrogate markers of pubertal transition. It is more likely that coordinated changes of DNA methylation in genomic regions have a downstream biological effect. Consequently, we searched for such regions in our data and identified a potential regulatory region upstream of TRIP6 that was co-ordinately regulated during pubertal transition. Not much is known about human TRIP6 (Thyroid Hormone Receptor Interactor 6, 7q22), which according to ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/; RRID:SCR_006169) does not contain any SNPs that are clearly associated to a clinical phenotype and no TRIP6 variants were identified in recent large association studies on age at menarche 4,20 . It contains three LIM zinc-binding domains and interacts with TR-beta only in the presence of thyroid hormone 21 . As a member of the zyxin family, TRIP6 has also been described to be involved in actin cytoskeleton rearrangements and was recently shown to be required for F-actin organization and dendritic morphology of hippocampal neurons 22 . TRIP6 is also involved in nuclear factor kappa B and JNK signalling by LPA2 receptor and TRAF6 binding 23 and modulates glucocorticoid receptor transcriptional activity 24 . According to FANTOM5 (RRID:SCR_002678) data, TRIP6 is found expressed in many tissues but is, in particular, found in reproductive tissues. Detailed investigation of gene expression profiles in the GEO database (RRID:SCR_004584) provides evidence of differential TRIP6 expression during rodent development of the mammary gland (GDS2721 and GDS2360) 25,26 , gonads (GDS4503) 27 , and hypothalamus (GDS2862) as well as in the hypothalamic hamartomas of patients with central precocious puberty (GDS3110) 28 . Furthermore, we found TRIP6 to be highly and specifically expressed in human testicular Leydig cells (producing testosterone) in healthy adult men but not in pre-pubertal boys. In contrast to mature Sertoli cells, TRIP6 was expressed in immature Sertoli cells in pre-pubertal testes. It is well known that thyroid hormones play an important role in regulation of both brain and testicular development. In the testis, the most potent thyroid hormone, triiodothyronine (T3), induce androgen receptor expression in immature Sertoli cells 29 that together with FSH drives Sertoli cell proliferation and consequently induction of spermatogenesis during onset of puberty. Importantly, T3 also induces both StAR and LHR expression in Leydig cells 30 and T3 therefore seems to be a key player in the pubertal transition of the testis. We speculate that TRIP6 may be involved in the T3 activation of steroidogenesis in Leydig cells during pubertal onset in boys, which is reflected by promoter demethylation of TRIP6 in white blood cells. Similarly in females, T3 increases FSH-induced antral follicle growth in vitro and regulates granulosa cell proliferation 31,32 . Doubling of circulating levels of TRIP6 in pre-to post-pubertal children even further substantiates that TRIP6 is induced during pubertal transition. To our knowledge, only the closely related zyxin-family members zyxin and Lpp, and not Trip6, have been knocked out in mice 33,34 . Drosophila contains only one TRIP6 orthologue, zyxin, and null mutants yields smaller flies, which however could be related to an effect on actin rearrangements 35 . Another interesting, and nearly as significant region as the TRIP6 promoter, was found in front of KCNAB3 (Voltage-gated potassium channel subunit beta-3). However, very little is currently known about KCNAB3, which very interestingly, is thought to function primarily in the brain 36,37 . Mice lacking Kcnab3 weigh less than control littermates probably due to a higher basal metabolic rates but it is unknown whether pubertal development was affected 38 .
Many parameters have been shown to influence DNA methylation in peripheral blood. Some of the identified pubertal CpGs could potentially be changed due to e.g. biological aging or changes in the cellular composition of blood. However, the age at pubertal onset varied substantially and importantly also the age of the children when blood samples were drawn (5.6-11.3 and 12.2-16.4 years of age, pre-and post-pubertally, respectively). Moreover, the regression analysis was adjusted for age at blood sampling by including it as a non-penalized covariate in the prediction model. Similarly, boys and girls entered puberty at different ages and sex. Consequently, sex was also included as a specific covariate. We tried to analyse boys and girls separately, and the overall picture was that only significant CpGs could be identified among the boys. We speculate that this is partly due to the lower number (and hence lower power) of girls (20 girls and 40 samples) included in the analysis compared to boys (31 boys and 62 samples), but also due to reproductive hormone cyclicity in post-pubertal girls. It is evident that e.g. differential methylation of the TRIP6 promoter is regulated also in girls when analysed alone (Supplementary Figure 3), indicating that similar pubertal methylation profiles are found in both sexes. A significant proportion (34 out of 94), but not all, of the CpGs associated with pubertal age in boys and girls were also identified when boys were analysed alone, indicating added statistical power by inclusion of girls in the analysis. Moreover, it should also be considered that the Tanner staging of boys and girls may not resemble the same pubertal readout and the same "absolute stage" (breast tissue vs. testicular volume). Altogether, our results indicate that future studies should attempt to include a larger amount of children, which should provide statistical power to analyse boys and girls separately. Isolation of specific subsets of lymphocytes may in addition provide further power to the analysis. In general we observed higher methylation levels of CpG islands in girls compared to boys, which is in accordance to earlier reports on sex differences of overall DNA methylation levels 39 .
We did not have cell counts on the different lymphocyte populations in the samples, but we believe that this potential bias was efficiently corrected for by applying the SmartSVA method. The SmartSVA method was, in our study, superior to the EWASher method as EWASher, besides reducing the genomic inflation, also reduced the significant signals substantially. As white blood cells cannot be regarded as a target tissue for pubertal development, we speculate that the identified changes in epigenetic profiles associated with pubertal transition may be traces of the regulation that takes place in target tissues. Indeed, the gene ontology analysis identified many terms that would normally be regarded as processes taking place during pubertal transition in target tissues. In particular, terms related to morphogenesis and HPG activation were identified. Recently, Finucane et al. 40 reported gene ontology analysis of GWAS data on "age at menarche" and found enrichments of "Adrenal or pancreas" and "CNS", which fit very well with the terms found in our gene ontology analysis. A link between epigenetic events in blood cells and epigenetic regulation in reproductive target tissue is further substantiated by the above-mentioned changes in TRIP6 promoter methylation and coordinated changes in expression in Leydig cells during pubertal transition as well as circulating TRIP6 levels.
Many studies have investigated the relationship between DNA methylation patterns and biological aging 7-12 , which currently is the best predictor of chronological age. It has been reported that overall DNA methylation levels decrease by age 41 , which is the same tendency observed in our data. Individuals that show a progressed biological aging compared to their chronological age have a greater risk of all-cause mortality 11 . Furthermore, cancer tissues show accelerated biological aging 8 . In our study, we were able to predict pubertal age even better than biological aging. Many factors including the paired design, the age intervals investigated and other technical issues could influence the performance of the prediction accuracy, but our results clearly indicate that it is possible to accurately predict the pubertal development independently of age with a good accuracy. It would be interesting to study children with precocious or delayed puberty to investigate whether these conditions are reflected in DNA methylation patterns. Likewise, it would be intriguing to investigate patients with altered reproductive hormone levels. The described tight link between pubertal development and DNA methylation may, however, also imply that disrupting environmental factors affecting the pubertal methylome could be involved in the declining age at pubertal onset observed in the western world.
In summary, pubertal transition was reflected by changes in the patterns of DNA methylation in peripheral white blood cells. Changes in methylation of single CpGs were associated to both the stage of pubertal development (pubertal age) as well as changes in circulating hormones and could be used to predict the pubertal age. Methylation levels of genomic regions, including the promoter of TRIP6, systematically changed during pubertal transition. Epigenetic activation of TRIP6 was reflected by post-pubertal expression in reproductive target tissue as well as increasing circulating levels. Our human data is the first to indicate that peripheral blood is a valuable surrogate tissue for assessment of sexual maturity during pubertal transition.

Methods
Study population and clinical examination. The COPENHAGEN Puberty Study (ClinicalTrials.gov ID: NCT01411527) is a combined cross sectional and longitudinal population-based cohort study of healthy Danish children and adolescents. The study population has been described in detail previously [42][43][44] and has been approved by the local Danish ethical committee (KF 01 282214; V200.1996/90) and the Danish Data Protection Agency (2010-41-5042). The study was carried out in accordance with the approved guidelines and written informed consent was obtained from all children and parents. In this study, we included 32 boys and 22 girls. The clinical evaluations were performed by trained physicians and included pubertal staging of breast development according to Tanner's classification evaluated by palpation 45 . Testicular volume was determined using orchidometry as described by Marshall and Tanner 45 . A breast Tanner stage of two or above and a testicular volume of 4 mL or more were considered to be markers of pubertal onset in girls and boys, respectively. The age of pubertal onset was approximated using the date exactly between two visits where the girls advanced from B1 to B2 (or more) and the boys' testicular volume increased to 4 ml or above. In this study, we used the estimated date of onset to determine the pubertal age, defined as the difference between the date of examination relative to the age at onset.
Serum levels of FSH and LH were measured by a two-sided time-resolved fluoroimmunoassay (Delfia, Wallac, Inc., Turku, Finland) with detection limits of 0.06 and 0.05 IU/L, respectively. Total serum estradiol was measured by a RIA (Pantex Corp., Immunodiagnostic Systems Limited, Santa Monica, CA, USA) with a detection limit of 18 pmol/l. Serum AMH levels were determined using the Beckman Coulter enzyme immunometric assay generation I (Immunotech, Beckman Coulter Ltd., USA) with a detection limit of 2.0 pmol/L. Serum inhibin B was measured using a double antibody immunometric assay (Serotec, Oxford, UK) with an LOD of 20 pg/ml. Serum testosterone levels were measured using RIA (DPC Coat-A-Count; Diagnostic Products Corp. Los Angeles, CA, USA) with a detection limit of 0.23 nmol/l. Samples with missing values (AMH (boys, n = 1; girls, n = 2), FSH (boys, n = 1; girls, n = 1), Inhibin B (boys, n = 1; girls, n = 6), Testosterone (boys, n = 1), estradiol (girls, n = 1), LH (boys, n = 1; girls, n = 1)) were substituted with the average from the gender and age group. Study design. Blood samples (n = 108) obtained at well-defined time points pre-and post-pubertally were selected from our longitudinal cohort. The cohort was not originally designed to include pre-and post-pubertal sampling of DNA and we were therefor only able to include a subset of the complete cohort. However, any child with DNA purified from blood samples before and after pubertal onset was included. These blood samples therefore consisted of paired samples from 32 boys and 22 girls. Clinical data including reproductive hormone values at the pre-and post-pubertal time points where blood sampling for DNA isolation was performed is outlined in Table 1.
Infinium 450 K array analysis. DNA from the 108 blood samples were purified using standard techniques as described before 6 and quantified by Qubit (Life Technologies Europe BV, Naerum, Denmark) measurements. The DNA was bisulfite treated, hybridized to Infinium HumanMethylation450 BeadChips (Illumina, San Diego, CA, USA), and scanned using standard protocols by NXT-Dx (Gent, Belgium). Probe intensities were extracted using GenomeStudio V2011.1 (Illumina, San Diego, CA, USA; RRID:SCR_010973). In order to minimize systematic bias samples were randomly distributed onto the BeadChips, which holds 12 samples on each BeadChip. Each BeadChip interrogates 485,577 cytosine positions (CpG sites) in the human genome covering 99% of all RefSeq genes. 365,934 sites are located within known gene regions (promoter, gene body, and untranslated regions) and 119,830 are intergenic sites 46 . Data analysis. In GenomeStudio the data was background subtracted and normalized to internal controls.
IDAT-files generated by GenomeStudio were imported into the minfi (RRID:SCR_012830) package 47 in RStudio (RRID:SCR_000432) version 0.99.489. The methylation level at each CpG site was calculated as a beta (β ) value that ranges from zero (no methylation) to one (complete methylation). The quality of the data was checked using the ShinyMethyl package. A principal component analysis (PCA) was performed using the FactoMineR package 48 on the 65 SNPs included on the BeadChip. In the PCA plot the paired samples (pre-and post-pubertal samples) clustered closely together except for three individuals (two girls and one boy) and all six samples were excluded from the subsequent analysis.
Data from the remaining 102 samples from 20 girls and 31 boys were normalized using a Subset quantile Within-Array Normalization (SWAN; RRID:SCR_003455) procedure 49 and probes containing SNPs in the CpG or extension sites were removed.
It is well known that the cell type composition of blood changes with age and can be a significant confounding factor 50 . Several different algorithms that have been developed to adjust for such confounding effects and we therefore tried to apply two recently developed methods. The EWASher algorithm 51 was applied using Anaconda with Python 2.7 (RRID:SCR_008394) and an improved Surrogate Variable Analysis (SmartSVA, https:// github.com/ehsanbehnam/SmartSVA) 52 was applied in RStudio. With correlation to pubertal age we found that the EWASher method reduced the genomic inflation factor from 8.9 to 1.2 (calculated by the GenABEL (RRID:SCR_001842) package 53 ) but only left very few significant CpGs (Supplementary Figure 5). The SmartSVA method on the other hand both reduced the genomic inflation factor to 1.2 and left 457 significantly associated CpGs at a FDR of 0.05 or less. The SmartSVA method was therefore used in subsequent analyses. The R package CpGAssoc (RRID:SCR_000320) 54 was used to analyse differential methylation at single CpG sites where the results from the modified SVA analysis were included as a covariate. The cpg.assoc function calculates the association of each CpG site to the variable in question and reports effect sizes, standard errors and multiple-testing-adjusted P-values (as well as unadjusted). To account for multiple testing a false discovery rate (FDR) procedure using the Benjamini-Hochberg method 55 was applied where values below 0.05 were considered significant. In the analysis of pubertal age, chronological age and sex were also included as covariates in the modified SVA and the analysis was performed as a paired analysis.
Differentially methylated regions were investigated using the DMRcate package 56 . DMRcate identifies and ranks the most differentially methylated regions across the genome based on tunable kernel smoothing method. A bandwidth of 1000 nucleotides (lambda = 1000) and a scaling factor of 2 (C = 2) were used as recommended by the authors of the DMRcate package 56 and results were corrected for multiple testing by using the Benjamini-Hochberg method 55 . Annotation was done according to hg19.
Heat maps were drawn with the Heat plus package in RStudio and venn diagrams and trees were produced with the InteractiVenn web-application 57 .
Scientific RepoRts | 6:28657 | DOI: 10.1038/srep28657 Gene ontology analysis of differentially methylated CpGs was performed with the gometh function within the missMethyl package in R. The gometh function adapt the goseq method of Young et al. 58 and accounts for the number of probes per gene and subsequently performs a modified hypergeometric test. Significant gene ontology terms (p-value < 0.05) were then imported into REVIGO (RRID:SCR_005825) for visualization in a semantic space 59 .
Prediction models. To predict pubertal age and chronological age, the elastic net regression model, which estimated the most predictive of the 467,873 DNA methylation probes, was fitted using the glmnet package 60 in R. The prediction error (standard deviation of residual error from lambda.min) was estimated using 10-fold cross validation. Age at sampling was included as a non-penalized covariate in the regression model to adjust for the age-dependent sampling when pubertal age was analysed. Analysis of biological aging was performed unadjusted.
Immunohistochemistry. Immunohistochemistry was performed with microwave de-masking in citrate buffer and as described before 61 . Tissue was obtained from the bio bank at the Department of Growth and Reproduction and the primary antibody bought from Abnova (Walnut, CA, USA, Cat. nr. PAB29450) and used in a 1:50 dilution. As negative controls the primary antibody was omitted (Supplementary Figure 4). In total, seven normal adult testicular samples, two pre-pubertal testicular samples, and three ovarian samples were investigated by TRIP6 immunohistochemistry. 2) in girls. The protocol from the manufacturer was followed and the samples were measured as single measurements whereas the standards were measured in duplicates. The manufacturer reports that the inter-and intra-assay CV is less than 15%, the sensitivity is 0.1 ng/ml, the detection rage is 0.625-20 ng/ml and that no cross-reactivity were observed. All samples were within or diluted to fit the detection range. The standard curve was used to convert intensities to concentration and the data analysed by a Wilcoxon Signed Rank test.