Introduction

Relationships have been identified between adult stature and geographical location; in particular these concern latitude, with adult height increasing as distance from the equator increases.1, 2, 3 This phenomenon is seen in Europe: people from Northern Europe are taller than people from Southern Europe.4, 5 In addition, genetic studies have indicated that the height differences between these closely related populations can be partially explained by multiple minor genetic differences.4 This raises the possibility of a relation between genotype and geography that influences height.

There are few published data in children regarding geographical height gradients.1, 3 A latitudinal gradient has been shown for height in Japanese children from North to South Japan. This may relate to a difference in day length as this is an important predictor of height in early and late adolescence.3

Human growth is a complex process, with variations in growth rate occurring in the short-term over weeks and also seasonally.6, 7 Growth rate is greatest during the summer, suggesting a relationship with day length.8, 9, 10 However, little is known regarding growth rate in children with short stature living at different latitudes, and there are few published data analysing growth rate with respect to number of daylight hours. One report from the National Cooperative Growth Study in children with growth hormone deficiency (GHD) on recombinant human growth hormone (r-hGH) therapy has shown a ‘seasonal’ variation in growth at all latitudes, with summer annualised height velocity (HV) being greater than winter HV.11 This difference was greatest in the first year of therapy but persisted in the subsequent years. It also increased with distance from the equator and correlated with number of daylight hours across different latitudes. Most recently, growth rate in GHD children in response to r-hGH treatment has been considered in terms of genetic background. Carriage of specific genetic markers has been associated with both high and low growth responses to r-hGH in GHD children.12

The aims of the present study were (A) to assess whether living at different latitudes with different numbers of summer daylight hours impacted on annual HV in children with GHD treated with r-hGH, (B) to investigate the possible interaction between summer daylight and growth-related genetic markers on HV12 and (C) to use the difference in gene expression profile associations to identify pathways and hence mechanisms associated with this interaction.

Subjects and methods

Study design

Children were enrolled from the PREDICT long-term follow-up prospective study, which was conducted at 41 sites in 14 countries worldwide (NCT00699855; Merck Serono SA—Geneva). In the current study, we have analysed the data on first-year response to r-hGH from the PREDICT long-term follow-up study, which uses a pharmacogenomic approach to evaluate the association of single-nucleotide polymorphisms (SNPs) in growth and metabolism genes with long-term changes in growth whilst on r-hGH therapy.12

The following data had been collected: location of the study centre, ethnicity, gender, birth weight, parental heights, peak serum GH response to stimulation testing, age, height, weight and r-hGH dose at baseline and height and weight after 1 year of treatment. Absolute latitude was assumed to be that of the study site. For each patient, annualised HV (cm per year) at the first year of r-hGH therapy was assessed. To look at mechanisms underlying the latitude effect, the average number of daylight hours during summer at each centre was derived from Geographical Information System (GIS) and mesh climatic data.13 The number of daylight hours depends both on date and latitude. At the sites in the Northern hemisphere, the three months with the longest days, in terms of daylight hours, were classed as summer (May 6 to August 6).11 At the sites in the Southern hemisphere, the summer period was considered to extend from November 6 to February 5.
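The dependence of daylight hours on date and latitude can be illustrated with a standard astronomical approximation. The sketch below is not the GIS-based method used in the study: it assumes a simple cosine model of solar declination, ignores atmospheric refraction, and uses hypothetical function names and an arbitrary reference year. It averages day length over the summer windows defined above.

```python
import math
from datetime import date, timedelta


def day_length_hours(latitude_deg, day_of_year):
    """Approximate hours between sunrise and sunset (no refraction correction)."""
    # Solar declination (radians) from a simple cosine approximation.
    declination = math.radians(-23.44) * math.cos(
        2 * math.pi * (day_of_year + 10) / 365
    )
    # Cosine of the sunrise hour angle; clamped so that polar day and
    # polar night return 24 h and 0 h instead of raising a domain error.
    x = -math.tan(math.radians(latitude_deg)) * math.tan(declination)
    x = max(-1.0, min(1.0, x))
    return 24.0 * math.acos(x) / math.pi


def mean_summer_daylight(latitude_deg, year=2011):
    """Mean day length over the summer window used in the text.

    Northern hemisphere: May 6 to August 6.
    Southern hemisphere: November 6 to February 5 of the following year.
    The default year is an arbitrary assumption for illustration.
    """
    if latitude_deg >= 0:
        start, end = date(year, 5, 6), date(year, 8, 6)
    else:
        start, end = date(year, 11, 6), date(year + 1, 2, 5)
    n_days = (end - start).days + 1
    hours = [
        day_length_hours(
            latitude_deg, (start + timedelta(days=i)).timetuple().tm_yday
        )
        for i in range(n_days)
    ]
    return sum(hours) / len(hours)
```

Under this approximation the summer mean rises with absolute latitude in both hemispheres, which is the gradient the SDE variable is intended to capture.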

Patients were categorised into three groups based on the distribution of their locations and the corresponding summer daylight exposure (SDE): high (>75th percentile of the cohort: 15.5 h), intermediate (between the 25th and 75th percentiles: >14.3 to <15.5 h) and low (<25th percentile: 14.3 h) exposure (Figure 1a). Basal gene expression was correlated with SDE and HV (cm per year) using rank regression. Network models were constructed from the gene expression overlap between these two regressions, and highly connected regions were assessed to define biological functions.
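The grouping rule can be written down compactly. This is a minimal sketch using the 25th and 75th percentile values quoted in the text (14.3 h and 15.5 h); the function name is hypothetical, and assigning values that fall exactly on a cut-off to the intermediate group is an assumption, since the text does not specify it.

```python
LOW_CUTOFF_H = 14.3   # 25th percentile of the cohort, as quoted in the text
HIGH_CUTOFF_H = 15.5  # 75th percentile of the cohort, as quoted in the text


def sde_group(summer_daylight_h):
    """Assign a site's mean summer daylight (hours) to an SDE group."""
    if summer_daylight_h > HIGH_CUTOFF_H:
        return "high"
    if summer_daylight_h < LOW_CUTOFF_H:
        return "low"
    # Values on a cut-off are placed here by assumption.
    return "intermediate"
```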

Figure 1

(a) The absolute latitude at each study site extracted from the GIS.13 Patients were divided into three groups according to the summer daylight exposure (SDE) at their site: high (>75th percentile: 15.5 h; n=22), intermediate (between the 25th and 75th percentiles: >14.3 to <15.5 h; n=73) and low (<25th percentile: 14.3 h; n=23) SDE. (b) Distribution of the summer daylight exposure groups according to the latitude of the centres: four centres had the lowest SDE and were nearest to the equator (low latitude), 10 sites had the highest SDE and were farthest from the equator (high latitude) and 14 sites were in the intermediate latitude group. Number of patients at each site is shown. The three groups are coloured differently.

The growth response was then analysed both by SDE and also by carriage/non-carriage of SNPs previously associated with a high growth response.12

This study was conducted in compliance with ethical principles based on the Declaration of Helsinki, the International Conference on Harmonisation Tripartite Guideline for Good Clinical Practice, and all applicable regulatory requirements.

Patients

One hundred and eighteen patients (75 boys, mean age±s.d. scores (SDS): 8.9±3.3; 43 girls, 8.5±3.1) with GHD from a per-protocol population at 1 year were studied. Seven patients were excluded from the population: for three patients, no information on geographical location was available; for four patients, the country of origin did not match that of the study site. All patients were naïve to r-hGH therapy and pre-pubertal at the start of treatment. The diagnosis of GHD was based on two different stimulation tests with a peak GH <10 μg l−1.12 The median peak GH value was 4.1 μg l−1. Patients with GHD received r-hGH at an average dose of 0.035 mg kg−1 per day. Compliance was monitored by recall in the last month of the study and was estimated at an average of ~90% of patients. The great majority of the cohort had isolated GHD; however, a small number of children (<7%) had additional hormone deficiencies (thyroid-stimulating hormone, adrenocorticotropic hormone) and were treated with replacement therapies during the study. This group was considered too small to undertake sub-analyses. During the year of r-hGH treatment, 31 patients entered puberty (Tanner stage 2).

Country of origin

Data were submitted from 28 study centres in 14 countries (Figure 1b). The population included 77% Europeans and 33% patients originating from countries across the world including Australia, Canada, Argentina, Taiwan and Korea.

Growth parameters

For each patient, annualised height gain and annualised HV (cm per year) at the first-year visit (12±3 months after the start of treatment) were calculated, converted to SDS14 and expressed as delta (Δ) height SDS and HV SDS. Body mass index (BMI) was calculated as the weight in kilograms divided by the square of the height in metres, converted to SDS using published reference values15 and expressed as ΔBMI SDS.

Target height (TH) was calculated as previously shown and converted to SDS.16 In addition, to take into account the patient’s genetic potential for growth, distance to TH SDS, which is the difference between the child’s height SDS and TH SDS,17 was calculated.
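The growth parameters above reduce to a few arithmetic steps. The sketch below is illustrative only: the function names are hypothetical, the 365.25-day year is an assumption, and the SDS function is a plain z-score, whereas the published references cited in the text may use age- and sex-specific tables or other transformations. The target height formula itself is not reproduced because the text refers to a previous publication for it.

```python
def annualised_hv(height_start_cm, height_end_cm, interval_days):
    """Height velocity in cm per year from two measurements."""
    # Scale the observed gain to a full year (365.25 days assumed).
    return (height_end_cm - height_start_cm) / (interval_days / 365.25)


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def sds(value, ref_mean, ref_sd):
    """Standard deviation score as a simple z-score (an assumption)."""
    return (value - ref_mean) / ref_sd


def distance_to_th_sds(height_sds, th_sds):
    """Child's height SDS minus target height SDS, as defined in the text."""
    return height_sds - th_sds
```

For example, a child measured at 120.0 cm and again at 128.0 cm exactly one year later has an annualised HV of 8.0 cm per year, and a height SDS of -2.5 with a TH SDS of -1.0 gives a distance to TH of -1.5 SDS.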

Genetic markers associated with height change in children with GHD

Analysis was performed using seven polymorphisms within five different genes previously associated with high growth response.12 These included the gene coding for the major GH-dependent carrier of insulin-like growth factor I in the circulation, IGFBP-3; a signalling molecule, GRB10; the growth factor TGF-α; the tumour suppressor TP53; and CYP19A1, a P450 cytochrome enzyme with aromatase activity. For each polymorphism, the difference in growth between alleles or genotypes was >1 cm over the first year, representing ~20% of first-year increment in growth.12 For all the SNPs, the analysis was conducted using full genotypes based on the presence or absence of the major allele (dominant model), and the presence or absence of the minor allele (recessive model).

Statistical analysis

All auxological data were expressed as median and inter-quartile ranges [median (Q1,Q3)]. Uncorrected P-values <0.05 were considered statistically significant. Statistical analysis was performed using the Statistical Package for Social Science program, version 20.0 software for Windows (SPSS, Chicago, IL, USA). Differences in continuous variables were examined for unpaired samples by the Kruskal–Wallis test, whereas differences in categorical variables were assessed by Fisher’s exact test. Correlations between variables were assessed by Pearson’s correlation coefficient. Partial least squares regression (PLSR) was applied to overcome multi-co-linearity between variables.18 By using PLSR, the ‘variable important for projection’ coefficients were computed and a value of <0.8 was considered to be small and not contributing significantly to the prediction model.19 To examine which variables had a major impact on the prediction of HV, independent variables were used, including latitude and summer daylight, GH peak, r-hGH dose, BW SDS, baseline BMI SDS, distance to TH SDS, age and gender; HV was used as the dependent variable.

To investigate the impact of carriage of the growth-related SNPs and summer daylight on HV, a generalised linear model was used, with carriage/non-carriage of the growth-related SNPs and SDE modelled as fixed effects together with their interaction term. HV (cm per year) was considered as the dependent variable with covariates for multiple other variables: gender, GH peak, r-hGH dose, BW SDS, baseline age, BMI and distance to TH SDS. Significance of each term was tested using a type III ANOVA.
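One way to write such a model explicitly is given below. This is a sketch under stated assumptions rather than the exact specification fitted in SPSS: the indicator coding, the choice of the low SDE group as the reference level and the symbols themselves are all assumptions made for illustration.

```latex
% G_i    : 1 if patient i carries the SNP allele of interest, 0 otherwise
% D_{is} : 1 if patient i belongs to SDE group s (low SDE as reference)
% z_i    : covariates (gender, GH peak, r-hGH dose, BW SDS, baseline age,
%          BMI, distance to TH SDS)
\mathrm{HV}_i = \beta_0 + \beta_G G_i
  + \sum_{s \in \{\mathrm{int},\,\mathrm{high}\}} \beta_s D_{is}
  + \sum_{s \in \{\mathrm{int},\,\mathrm{high}\}} \beta_{Gs}\, G_i D_{is}
  + \boldsymbol{\gamma}^{\top} \mathbf{z}_i + \varepsilon_i
```

In this notation the gene–environment interaction corresponds to the joint null hypothesis that every \beta_{Gs} is zero, that is, that the carrier versus non-carrier difference in HV is the same in every SDE group.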

Transcriptome analysis

Gene expression profiling was performed at baseline on whole-blood RNA extracted centrally by qLAB (Edinburgh, UK) using the PAXgene 96 blood RNA kit (Qiagen, Crawley, UK). Reduction of globin messenger RNA was undertaken using the Ambion GLOBIN Clear Human Kit (Life Technologies, Paisley, UK). Complementary RNA was generated using the Two-Cycle Eukaryotic Target Labelling Kit (Affymetrix, Santa Clara, CA, USA) and a final quality check performed before hybridisation to Affymetrix GeneChip Human Genome U133 Plus 2.0 Arrays. Arrays were then scanned on an Affymetrix GeneChip 7G scanner and assessed for quality against internal and hybridisation controls. All analyses were performed centrally by the Bioinformatics Group at Merck Serono.

Processing and normalisation of gene expression data were performed using a Robust Multi-array Average background correction modified for probe sequence with quantile normalisation and median polish (Partek Genomics Suite, version 6.3, St Louis, MO, USA). Confounding effects due to variations in cell populations and outliers were examined by cross-validation using principal component analysis and iso-map multidimensional scaling (Qlucore Omics Explorer 2.2, Qlucore, Lund, Sweden).
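The quantile normalisation step can be illustrated in isolation. The following is a minimal sketch of the general technique, not the Partek implementation: background correction and median polish are omitted, the function name is hypothetical, and ties are broken by probe order, which is a simplifying assumption.

```python
def quantile_normalise(samples):
    """Force every sample (array) to share the same value distribution.

    `samples` is a list of equal-length lists, one list of probe
    intensities per array.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must have the same number of probes")

    # Mean intensity at each rank across all samples.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_probes)
    ]

    normalised = []
    for s in samples:
        # Probe indices ordered by intensity; ties keep probe order.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalised.append(out)
    return normalised
```

After normalisation each array contains the same set of values (the rank means), so differences between arrays reflect only the ordering of probes within each array.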

Correlations of basal gene expression with summer daylight and HV were assessed using rank regression with and without all the variables included in the PLSR model as confounding factors (Qlucore Omics Explorer 2.2).

Network analysis of transcriptomic data

Network analysis is directed towards the identification and prioritisation of key functional elements within interactome models. An interactome model of all known protein–protein interactions between the differentially expressed genes as ‘seeds’ and their inferred immediate neighbours was calculated using BioGRID database (3.2.114).20 Network processing was performed using Cytoscape 2.8.3.21

The ModuLand plugin for Cytoscape 2.8.3 was used to determine overlapping modules within the network and to identify hierarchical structure within the model, thus enabling key network elements to be identified and biological functions to be prioritised.22, 23 Network modules were prioritised for further investigation by their centrality property and the most central set of 10 genes within each module was used to assess associated biological pathways using the geneontology.org database.24 The network structure observed with community modelling in ModuLand was confirmed by cluster analysis using the ClusterOne algorithm.25 Cluster robustness was tested by random sample removal.26

Causal Network Analysis was performed within the overlap of associated gene expression between SDE and HV. Causal Network Analysis identifies upstream molecules up to three steps distant that control the expression of the genes in the data set, and thus provides insight into information flow within the network.27

Results

Clinical characteristics varied over 1 year of r-hGH therapy

Subjects were classified into high, intermediate and low SDE groups (see Methods). At the start of treatment, the age, height and BMI SDS, peak GH and administered dose of r-hGH were not significantly different between these groups (Table 1). TH SDS and distance to TH SDS were greater in the high SDE group than in the intermediate and low SDE groups (P<0.05).

Table 1 Clinical characteristics of the study population and response to r-hGH defined by SDE groups

After 1 year of r-hGH therapy, GHD patients from locations with higher SDE had a greater 1-year growth response than those from locations with intermediate and lower SDE (P=0.019 for HV (cm per year); P=0.024 for HV SDS; P=0.017 for Δ height SDS) (Table 1).

Relationships between summer daylight and growth rate

When the relationship between HV and summer daylight was evaluated, there was a significant correlation between HV (cm per year) and summer daylight (r=0.256, P=0.006; Figure 2).

Figure 2

Correlation between HV (cm per year) and summer daylight. The variables are expressed as natural logarithm (Ln).

To identify which variables (baseline characteristics, latitude and SDE) had the greatest effect on HV, a PLSR analysis was used, which accounts for multi-co-linearity between variables. Four variables had a variable important for projection value >0.8 – in order: GH peak, baseline age, summer daylight and distance to TH (Figure 3a). The effect of the other variables, including latitude, was smaller and not considered significant. Inclusion of SNP data into the PLSR analysis confirmed that genotypes as well as summer daylight had a significant impact on HV (Figure 3b).

Figure 3

The importance of each variable to the prediction of HV (cm per year) was assessed by partial least squares regression (PLSR). Variables were plotted according to their importance in the prediction of HV. A cutoff (dashed line) of 0.8 has been used to identify ‘important’ variables. High values indicate that the variable has high impact in the prediction of HV. Panel (a) shows the effect of latitude, summer daylight and the main clinical variables; and panel (b) includes the effect of genotypes.

Interactions between carriage/non-carriage of growth-related SNPs and SDE: gene–environment interaction

To test the relationship between genetic and environmental factors, a generalised linear model was fitted. The effect of genotype (carriage vs non-carriage of the SNP) and group (high vs intermediate vs low SDE) on 1-year HV (cm per year) was investigated (Table 2). There was no difference in genotype frequency between the three summer daylight groups. HV was affected by a significant interaction between the carriage of a high growth response SNP and summer daylight for SNPs within GRB10, IGFBP-3, TGF-α, CYP19A1 and TP53 (interaction P-value <0.05 for each gene) (Table 2). HV SDS was also tested as a dependent variable in the generalised linear model and results were similar (data not shown).

Table 2 Generalised linear models showing differences in HV (cm per year) by genotype and SDE group

Directional differences in growth response

For each SNP, the direction of the impact on growth response in each SDE group was analysed (Figure 4). The difference in HV (in centimetres) between carriers and non-carriers for IGFBP-3, TGF-α and TP53 SNPs was greatest in those exposed to the highest number of summer daylight hours, which corresponds to the higher latitudes (Figure 4a). In contrast, for GRB10 and CYP19A1, the difference in HV between carriers and non-carriers was higher in those exposed to the lowest number of summer daylight hours, corresponding to lower latitudes (Figure 4b).

Figure 4

Delta HV (ΔHV, cm per year) between carriers and non-carriers for each SNP by summer daylight group is shown. This relates to the results from the generalised linear model, evaluating a carriage (carriage vs non-carriage of the SNP) and group (high vs intermediate vs low summer daylight exposure (SDE)) effect on 1-year growth velocity (cm per year). P*=significant (<0.05). In (a) the ΔHV is greatest in the group with the highest number of SDE hours (for IGFBP-3, TGF-α and TP53). The difference in HV varies among SDE groups, from 1.6–2.4 cm per year at higher SDE to 0.0–0.7 cm per year at lower SDE. In (b) the ΔHV is greatest in the group with the lowest number of SDE hours (for GRB10 and CYP19A1). The difference in HV ranged from 1.7–2.4 cm per year at lower SDE to 0.2–1.8 cm per year at higher SDE.

Network analysis of transcriptomic data

The expression of 1868 and 4098 genes was correlated with SDE and HV, respectively, with no covariates; 397 genes were present in both data sets (overlap, P=0.0015) (Figure 5a) [n=60, Gene Expression Omnibus GSE72439].

Figure 5

Overlap of gene expression associated with both HV and SDE and subsequent network analysis of the common genes [n=60 GHD patients]. (a) Venn diagram of overlap (397, P=0.0015 hypergeometric test) between genes correlated with SDE (1868) and HV (4098) (no covariates, rank regression P<0.05). (b) Overlapping gene set correlated with SDE and HV using the same covariates as used for the partial least squares regression (gender, GH peak, r-hGH dose, BW SDS, baseline age, BMI and distance to TH SDS) (Supplementary Table S1) was used to generate an interactome model. Clusters of related genes were identified within the interactome model using the ModuLand algorithm and a network of the cluster modules was generated (shown) where the different coloured octagons represent clusters and the gene name is the most central gene element within that cluster (Supplementary Table S2). (c) Biological pathways associated with the overlap between the clusters were identified using the Geneontology.org database (hypergeometric test with a Benjamini–Hochberg false discovery rate (FDR) modified P-value).
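The two statistical procedures named in this legend, the hypergeometric test for overlap and the Benjamini–Hochberg adjustment, can be sketched as follows. This is a generic illustration with hypothetical function names, not the software used in the study. The size of the background gene universe is not stated in the text, so it is left as a required argument and the reported P-value is not recomputed here.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_p_value(universe, set_a, set_b, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of shared genes when `set_b` genes are drawn at
    random from `universe` genes, of which `set_a` are marked.
    """
    lowest = max(overlap, 0, set_a + set_b - universe)
    highest = min(set_a, set_b)
    if lowest > highest:
        return 0.0
    log_terms = [
        _log_comb(set_a, k)
        + _log_comb(universe - set_a, set_b - k)
        - _log_comb(universe, set_b)
        for k in range(lowest, highest + 1)
    ]
    # Sum in log space to avoid overflow with thousands of genes.
    peak = max(log_terms)
    return min(1.0, exp(peak) * sum(exp(t - peak) for t in log_terms))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

As a small check of the first function, drawing 3 genes from a universe of 10 in which 4 are marked gives a probability of 1/3 of seeing at least 2 marked genes.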

The subset of common genes (143 unique genes with GH peak, r-hGH dose, BW SDS, baseline BMI SDS, distance to TH SDS, age and gender as covariates (Supplementary Table S1)) was used to generate an interactome model on which network analysis was performed (Figure 5b).

Network clusters were identified as markers of biological function23, 28 and robustness was confirmed using random sample removal (Supplementary Figure S1). The biological functions associated with the clusters were centred on chromatin remodelling (PHF21A), signal transduction (TLK2) and transcriptional regulation (MLLT10 and MSX1) (Supplementary Table S2 and Figure 5b). Biological functions represented by the network clusters included ‘gene expression’ (P=2.5 × 10−12), ‘metabolic process’ (P=5.4 × 10−09) and ‘circadian rhythm’ (P=2.6 × 10−03; Supplementary Table S2 and Figure 5c).

The causal network analysis (Figure 6a) of upstream regulation identified a set of transcriptional regulators with similar action in relation to both summer daylight and HV (using hierarchical clustering, Supplementary Table S3). The transcription factor NANOG had positively correlated expression with both SDE and HV and was the primary target of the regulators within the causal networks (Figure 6b).

Figure 6

Causal analysis and mechanistic modelling of the subset of genes common to both HV and SDE. (a) Causal network analysis takes the genes with altered expression (examples numbered 1–5, green (low expression) and red (high expression)) and identifies upstream molecules up to three steps distant. This approach provides insight into information flow within the network using the known literature to identify network edges linking to upstream regulators (a–c) and master regulators (A), for which there is statistical evidence (Fisher’s exact test) to support a corresponding causal relationship (within Ingenuity Pathway Analysis software). The most significant causal edges between regulators are then used to construct networks downstream of a ‘master’ regulator to indicate possible mechanisms. (b) Regulators of gene expression with matched action in both HV and SDE were identified by causal network analysis and hierarchical clustering of results (Supplementary Table S3). These data were mapped onto the clusters identified within the network model of the overlap of gene expression and implicated NANOG as a prime target of regulation. Grey=opposing correlated expression with HV and SDE, green=negatively correlated with both HV and SDE, red=positively correlated with both HV and SDE, uncoloured=inferred interaction, orange=predicted activated regulator, blue=predicted inhibited regulator.

The expression of NANOG correlated positively (R=0.27, P=0.03) with the expression of IGFBP-3, a gene where the rs3110697 SNP has the greatest effect on HV at higher levels of SDE (Figures 4a and 7a). The expression of NANOG also correlated negatively (R=–0.24, P=0.05) with the expression of GRB10, a gene where the rs1024531, rs12536500 and rs933360 SNPs have the greatest effect on HV at lower levels of SDE (Figures 4b and 7b).

Figure 7

Correlation of NANOG expression with gene expression from genes with single-nucleotide polymorphisms associated with both height velocity and summer daylight exposure. NANOG gene expression probeset 220184_at correlated against: (a) IGFBP-3, a gene where the rs3110697 SNP has a greater effect on height velocity at higher levels of SDE (Figure 4a), using gene expression probeset 210095_s_at and (b) GRB10, a gene where the rs1024531, rs12536500 and rs933360 SNPs have a reduced effect on height velocity at higher levels of SDE (Figure 4b), using gene expression probeset 215248_at. Analysis performed with the same covariates as used for the PLSR (gender, GH peak, r-hGH dose, BW SDS, baseline age, BMI and distance to TH SDS). Red (high) to green (low) gradation of colour represents level of expression of NANOG.

Distance from light-modulated pathways

To address the relationship of light modulation to gene expression within the overlap data set we considered three pathways (i) circadian rhythm, (ii) melatonin and (iii) vitamin D signalling.

To assess how close these pathways were to the network of genes identified, we calculated the ‘shortest path’ between all primary network regulators identified by causal network analysis and the melatonin, vitamin D and circadian rhythm pathways (Supplementary Table S4). The melatonin pathway was significantly closer to the network model of the overlap of SDE (Supplementary Figure S2A) and HV (Supplementary Figure S2B) gene expression (P<0.01) than either vitamin D or circadian rhythm pathways.
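The ‘shortest path’ measure can be illustrated with a breadth-first search over an interaction graph. This is a minimal sketch of the general idea only: the function name is hypothetical, the toy graph uses placeholder node names rather than real interactions, and the statistical comparison of distances between pathways is not reproduced.

```python
from collections import deque


def shortest_path_length(graph, source, targets):
    """Fewest edges from `source` to any node in `targets`.

    `graph` maps each node to a list of its neighbours.
    Returns None when no target is reachable.
    """
    targets = set(targets)
    if source in targets:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in seen:
                continue
            if neighbour in targets:
                return distance + 1
            seen.add(neighbour)
            queue.append((neighbour, distance + 1))
    return None


# Placeholder network: names are illustrative, not real gene interactions.
toy_graph = {
    "regulator_1": ["node_a"],
    "node_a": ["regulator_1", "pathway_x_member"],
    "pathway_x_member": ["node_a"],
    "regulator_2": [],
}
```

In this toy network `regulator_1` is two steps from the pathway member, whereas `regulator_2` has no connecting path; averaging such distances over all regulators gives one number per pathway that can then be compared.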

Discussion

In the present study we observed that growth response is influenced by both the genetic background and geographical location. This is the first study, to the authors’ knowledge, which has shown an interaction between growth-related polymorphisms and SDE in influencing growth response in patients treated with r-hGH therapy.

First-year growth velocity was significantly greater at locations with longer SDE, which correspond to higher latitudes (Table 1). It is important to consider whether there are any confounding factors influencing this relationship. First, those patients in locations with longer SDE had a lower baseline and ΔBMI SDS over the year, but neither was significant. If BMI was assumed to be an index of food intake and used as a marker of nutritional status, as previously shown,29 then a lower baseline and ΔBMI SDS should be associated with a poorer growth rate rather than a higher rate.30

Second, GH peak levels in GH stimulation tests were used in this study to define GHD severity. Such tests are recognised to have low-diagnostic specificity for GHD.31 However, all patients underwent two different stimulation tests, and both tests needed to have generated a peak GH level <10 μg l−1 for the child to be included in the study. It is possible that at a later date some of these patients may retest with a normal peak GH level but that at the time of the study they were all ‘biochemically’ GHD. Importantly, across the SDE groups GH peak levels were not significantly different and therefore any inadequacy in the GH testing would apply equally across the groups (Table 1).

Third, another potential bias was the onset of puberty. Thirty-one patients entered puberty (Tanner stage 2) over the year of r-hGH treatment. However, there was no difference in the proportion of patients entering puberty across the three SDE groups (Table 1), indicating that any pubertal influence on growth would occur in all three groups. In addition, the pubertal acceleration in growth does not occur at the start of puberty in boys and does not reach maximal impact until Tanner breast stage 3 in girls. Therefore, puberty will have had a minimal impact on growth rate in the first year of treatment.32 In fact children living at higher latitudes/higher SDE had the lowest percentage of patients entering puberty, yet the highest growth rate (Table 1). Overall this indicates that differences in nutritional intake, severity of GH deficiency and puberty onset in our cohort of GHD children cannot account for the geographical gradient in the growth patterns and that other factors are involved.

In the examination of mechanisms related to the latitude effect, a positive correlation was found between HV and summer daylight and this was further supported by the PLSR analysis showing that summer daylight, but not latitude, had a significant linear effect on growth rate. Several potential mechanisms explaining the association between daylight and height in children in Northern vs Southern Japan have been described by Yokoya et al.3 Climatic variables of temperature, solar radiation and day length were analysed. It was shown that day length, but not other climate variables, was the primary predictor for the geographical gradient in body height. Differences in melatonin secretion due to variation in day length were proposed to explain the geographical variation in height,33 as melatonin inhibits sexual and skeletal maturation.34 The link between day length and melatonin secretion resides in the eye, a light-sensitive organ whose function is to maintain circadian and seasonal rhythms,35 as facilitated by retinal communication via neural tracts with the pineal gland.35, 36 A circadian clock has been found to be primarily mediated by melanopsin-containing retinal ganglion cells,36, 37, 38 which are intrinsically blue-light sensitive.39 A number of studies have also provided support for a link between seasonal changes in daylight and physiological alterations, including human growth,40, 41 with the existence of a seasonal variability in growth patterns in normal children. Growth appears to speed up during times of greatest daylight exposure and slow down during periods of darkness.7, 8, 9, 11 The same phenomenon has been demonstrated in impaired growth with the exogenous administration of r-hGH. A report from the National Cooperative Growth Study has shown the existence of a seasonal variability in children with GHD on r-hGH therapy. Growth rate was greatest during summer and correlated with different numbers of daylight hours.11

Melatonin is related to the glucocorticoid receptor (GR) pathway and there is a well-established link between the circadian cortisol cycle and growth rate.42, 43 Therefore, modulation of the GR pathway is a possible mechanism by which daylight could modulate growth response. We previously found that the expression of several genes involved in the GR pathway was correlated with both 1 month insulin-like growth factor I generation44 and 1 year growth response to r-hGH.12 We also showed that variation in the gene expression of GR pathway members relates to phase of growth in normal children.45 It is likely that a complex interaction between melatonin, GR pathways and growth mechanisms is generating these seasonal differences.

Gene expression (transcriptomic) data were studied on whole blood to explain biological mechanisms related to the interaction between SDE and HV. The transcriptome of whole blood has a substantial overlap with other human tissues.45, 46 A number of studies have characterised mononuclear cells in whole blood as a growth-responsive tissue and an appropriate model to study GH action.44, 47, 48 There is also a significant overlap between the regulation of lymphoid cell function and the classic growth pathways, as mononuclear blood cells have been shown to share important growth-related genes involved in both T/B cell proliferation and the regulation of bone development.49, 50, 51, 52 We are, therefore, confident that analysis of gene expression from whole blood gives us an opportunity to define potential mechanisms for the SDE effect. The analysis of correlated gene expression profiles between SDE and HV (Figure 5c) has suggested a role of the circadian clock pathway. Mapping gene expression correlated with SDE and HV onto the canonical circadian rhythm pathway showed gene expression clustering around CREB activity (Supplementary Figure S2). In relation to light-sensitive biological pathways, further analysis of network properties showed that the melatonin pathway was closer to SDE and HV correlated gene expression than the vitamin D and circadian rhythm pathways, but no distinct network modules were identified to support a direct involvement of these pathways.

Causal network analysis identified NANOG as a primary target for both SDE- and HV-related regulatory pathways. NANOG has been implicated in the development of circadian oscillator action,53 shown to reduce adipogenesis54 and enhance bone growth.55 MSX1 was the most central gene in a transcription factor related network cluster (Figure 6b) and has been recently found to induce light-controlled cell growth and tail development in vivo in vertebrate models.56 Both NANOG and MSX1 are associated with the regulation of growth-related pathways, including the Wnt, MAP/ERK and CREB pathways. Mutations in the MAP/ERK signalling pathways have been implicated in the aetiology of human RASopathies, dysmorphic syndromes presenting with short stature57 and the CREB pathway is involved in somatic growth and bone development,58 along with the hormonal hypothalamic-pituitary regulation. CREB-mutant mice have been found to have reduced postnatal growth consistent with dwarfism caused by GHD, owing to a reduction of GHRH expression.59 This network analysis has established a mechanistic hypothesis for the role of NANOG in differential growth response at different latitudes.

The present study has some limitations, such as the lack of 25-hydroxyvitamin D measurement as a reflection of ultraviolet light exposure and the diversity of children enrolled at any one site. However, to reduce any bias, all the analyses were corrected for distance to TH and BMI, and only ethnicity consistent with the country of origin was considered.

It is unlikely that environmental variables alone can explain the geographical gradient in growth, as its inter-individual variation also depends on the child’s genetic constitution. The relationship between mid-parental height and the final height of offspring has been shown to explain 40% of the sex- and age-adjusted height variance in normal growth.60 We found greater TH at higher latitudes, reflecting the fact that body size and adult stature are associated with latitude.2, 5, 61, 62 To assess whether environmental or genetic factors had the greatest impact on growth response, a PLSR analysis was performed that accounts for co-linearity between the data and allows distinct contributions to the variance within the data to be identified. SDE and distance to TH were demonstrated to have similar effects in the PLSR analysis (Figure 3a), which included variables that were utilised in previous GHD-specific models of growth.63 These previous models have incorporated surrogate genetic markers, such as parental heights, together with identified clinical or biochemical factors that influence growth response to GH, yet they explain only 40–61% of GH responsiveness over the first year of therapy.63 These models have not incorporated any primary genetic information. This implies that further parameters, such as specific genetic markers, could be included to improve current prediction models.
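As an illustration of why a partial least squares approach tolerates co-linear predictors, the following is a minimal sketch of a single-component PLS1 fit in plain Python. It is not the analysis pipeline used in this study; the function names (`center`, `pls1_one_component`) and the two predictor columns are hypothetical, and the numbers are invented so that the second predictor is an exact multiple of the first, a situation in which ordinary least squares has no unique solution.

```python
import math


def center(values):
    """Subtract the mean from every value in a column."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def pls1_one_component(columns, y):
    """Fit a one-component PLS1 model on mean-centred data.

    columns: list of predictor columns (each a list of floats)
    y: response values
    Returns (weights, coefficients), both on the centred scale.
    """
    xc = [center(col) for col in columns]
    yc = center(y)

    # Weight of each predictor is proportional to its covariance with y.
    cov = [sum(x * r for x, r in zip(col, yc)) for col in xc]
    norm = math.sqrt(sum(c * c for c in cov))
    if norm == 0:
        raise ValueError("predictors are uncorrelated with the response")
    weights = [c / norm for c in cov]

    # Latent score: one weighted combination of the predictors per subject.
    n = len(yc)
    scores = [sum(weights[j] * xc[j][i] for j in range(len(xc))) for i in range(n)]

    # Regress the response on the single latent score.
    q = sum(t * r for t, r in zip(scores, yc)) / sum(t * t for t in scores)
    coefficients = [w * q for w in weights]
    return weights, coefficients


# Hypothetical, perfectly co-linear predictors (NOT study data).
predictor_a = [1.0, 2.0, 3.0, 4.0]
predictor_b = [2.0, 4.0, 6.0, 8.0]   # exactly 2 * predictor_a
response = [3.0, 6.0, 9.0, 12.0]     # equals predictor_a + predictor_b

weights, coefficients = pls1_one_component([predictor_a, predictor_b], response)
print(weights)       # relative contribution of each predictor to the latent score
print(coefficients)  # approximately [0.6, 1.2]
```

Because both predictors are projected onto one shared latent score, the fit remains defined despite the exact co-linearity, and the weights indicate how much each variable contributes to that score. A full analysis would extract several components and use cross-validation to choose their number.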

Recently, the main genes involved in the determination of human height have been identified in normal individuals.64 Analysis of 183,727 individuals by the Genetic Investigation of ANthropometric Traits (GIANT) consortium has identified 180 loci that influence adult height, which could also be potential candidates for growth and markers of GH responsiveness. The PREDICT long-term follow-up study has also highlighted genes involved in growth response and has revealed that carriage of a range of genetic markers is associated with change in insulin-like growth factor I over the first month of r-hGH treatment44 and with HV over the first year of treatment.12 The inclusion of these genetic markers in the PLSR analysis confirmed that they had a significant impact on HV, along with distance to TH, implying that genetic contribution and carriage of specific growth-related SNPs are important in determining growth response in children treated with r-hGH (Figure 4).

This study also showed an interaction between the carriage of high growth response SNPs and SDE within five genes that are involved in growth pathways and are also known to affect adult height.64 These genes are GRB10, IGFBP-3, TGF-α and TP53, which are involved in the IGF-1 system and cell growth, and CYP19A1, which is involved in oestrogen synthesis. To assess the latitudinal effect we used different SDE groups, showing a gene–environment interaction that leads to a differential growth response for children carrying the same SNP at different latitudes. Specifically, for IGFBP-3, TGF-α and TP53 the difference in HV between carriers and non-carriers was increased at longer summer daylight hours. In contrast, for GRB10 and CYP19A1 we showed the converse. Gene–environment interactions causing variation in the phenotypic effect have been shown in several conditions,28 but in human growth such an interaction has only been hypothesised.65 This study suggests the existence of an interaction between environmental and genetic factors in human growth.
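The comparison described above, in which the carrier versus non-carrier difference in HV is contrasted between SDE groups, can be sketched as follows. This is an illustrative calculation only, not the statistical model used in the study: the function name `carrier_effect_by_group`, the group labels and all HV values are hypothetical, and no adjustment for covariates is made.

```python
from statistics import mean


def carrier_effect_by_group(records):
    """Return {sde_group: mean HV of carriers minus mean HV of non-carriers}.

    records: iterable of (sde_group, is_carrier, hv) tuples.
    Groups lacking either carriers or non-carriers are skipped.
    """
    cells = {}
    for sde_group, is_carrier, hv in records:
        cell = cells.setdefault(sde_group, {True: [], False: []})
        cell[bool(is_carrier)].append(hv)

    effects = {}
    for group, cell in cells.items():
        if cell[True] and cell[False]:
            effects[group] = mean(cell[True]) - mean(cell[False])
    return effects


# Hypothetical HV values in cm/year (NOT study data).
records = [
    ("short_sde", True, 8.0), ("short_sde", True, 8.4),
    ("short_sde", False, 7.8), ("short_sde", False, 8.2),
    ("long_sde", True, 9.6), ("long_sde", True, 10.0),
    ("long_sde", False, 8.0), ("long_sde", False, 8.4),
]

effects = carrier_effect_by_group(records)

# A non-zero contrast means the SNP effect differs between SDE groups,
# which is the signature of a gene-environment interaction.
interaction_contrast = effects["long_sde"] - effects["short_sde"]
print(effects)
print(interaction_contrast)
```

In this invented example the carrier effect is larger in the long-SDE group, the pattern reported for IGFBP-3, TGF-α and TP53; a negative contrast would correspond to the converse pattern reported for GRB10 and CYP19A1.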

In conclusion, the present report suggests that growth response in GHD children involves a complex gene–environment interaction. The growth response to r-hGH appears to be related both to daylight exposure and to gene polymorphisms, with the magnitude and direction of the interaction being gene dependent. In addition, the gene expression data suggest that pathways related to the circadian clock, and in particular the transcriptional regulator NANOG, may contribute to mechanisms that link this gene–environment interaction.