Abstract
Recombinant human growth hormone (r-hGH) is used as a therapeutic agent for disorders of growth including growth hormone deficiency (GHD) and Turner syndrome (TS). Treatment is costly and current methods to model response are inexact. GHD (n = 71) and TS patients (n = 43) were recruited to study response to r-hGH over 5 years. Analysis was performed using 1219 genetic markers and baseline (pre-treatment) blood transcriptome. Random forest was used to determine predictive value of transcriptomic data associated with growth response. No genetic marker passed the stringency criteria for prediction. However, we identified an identical set of genes in both GHD and TS whose expression could be used to classify therapeutic response to r-hGH with a high accuracy (AUC > 0.9). Combining transcriptomic markers with clinical phenotype was shown to significantly reduce predictive error. This work could be translated into a single genomic test linked to a prediction algorithm to improve clinical management. Trial registration numbers: NCT00256126 and NCT00699855.
Similar content being viewed by others
Introduction
Recombinant human growth hormone (r-hGH) is used as a therapeutic agent for a range of disorders of growth impairment including growth hormone deficiency (GHD) and Turner syndrome (TS). Treatment is costly between £6000 and £24000 per centimetre (cm) gained in final height [1]. Therapy is not always successful in patients and there are currently no genomic markers for predicting positive or negative responses. Prediction models up to four years of therapy have been defined using clinical measurements [2] but have been difficult to implement in practise. Whilst an understanding of the pharmacogenetic background has been established [3, 4], such approaches are of limited predictive value due to the influence of covariates related to the child’s developmental stage, disease severity and geographical location [5, 6]. The pre-treatment blood transcriptome has been previously shown to relate to first year response to r-hGH therapy [7]; however, little is known about the predictive value of this association and its relationship to longer term response to therapy. The transcriptome represents a level of ‘omic’ data that reflects genetic information, developmental stage in relation to age [8] along with the impact of the local environment [6] and, therefore, has potential to classify response to r-hGH.
Response to r-hGH in the first year of therapy is considered to be a primary marker of growth response. Prediction of first year growth has been shown to be dependent on GHD severity, age, distance to target height, body weight, dose of r-hGH, birth weight and, as defined by regression models, can account for 61% in GHD [9] and 46% in TS [10, 11] of the variation within the data. Clinical markers such as distance to target height are surrogate genetic variables and this implies that an effective level of genomic prediction is hypothesised to be possible if developmental [8, 12] and environmental covariates [13] of growth response can be taken into account.
Transcriptomic data have been used extensively in cancer tissues both to sub-type the tumour [14,15,16] and to predict response to therapies [17, 18]. In contrast, in this study, we have used peripheral blood gene expression profiling as the source for gene expression profiles, and show that these patterns can be used to predict response to r-hGH in each year of treatment up to 5 years in two different growth disorders that account for ~60% of GH prescriptions in the USA, Europe and the UK [19, 20].
Methods
Patients
The PREDICT Long-term Follow-up study (multicentre, open-label, prospective, phase IV) and the pharmacogenetics of the first year of r-hGH treatment have been described extensively previously [7, 21]. Briefly, pre-pubertal children with GHD and TS were enroled. A diagnosis of GHD was reached following two pharmacological stimulation tests with a peak GH concentration of <10 µg/L. Prior to enrolment in the study, none of the children had received GH therapy. Children with GHD due to central nervous system tumours or radiotherapy were excluded but children born small for gestational age were not (15% of the cohort [22]). The majority of GHD patients had isolated GHD but five were treated with both thyroxine and hydrocortisone replacement, three with just thyroxine and one with just hydrocortisone [22]. The diagnosis of TS was based on karyotype.
This PREDICT study was conducted in compliance with ethical principles based on the Declaration of Helsinki, the International Conference on Harmonization Tripartite Guideline for Good Clinical Practice and all applicable regulatory requirements. The PREDICT (NCT00256126) and PREDICT Long-term Follow-up (NCT00699855) studies were approved by the Scotland Medical Research and Ethics Committee (reference 05/MRE10/61) and the North West Research Ethics Committee (reference 08/H1010/77), respectively. Informed consent was obtained from parents for all study participants.
Genetic analysis
A total of 1219 genetic markers were used in the analysis, 1217 Illumina-genotyped single-nucleotide polymorphisms (SNPs) corresponding to a candidate list of 103 genes and 2 TaqMan-genotyped SNPs in the IGFBP3 promoter. All genes selected are known to be involved in growth regulation and GH action as previously described [5, 7].
A Kruskal–Wallis rank sum test was applied on the following three genetic models (a) genotypic (AA, AB, BB); (b) dominant (AA/AB + BB) and (c) recessive (AA + AB/BB). For non-pseudoautosomal X chromosome markers, GHD boys and TS girls were analysed as having only two homozygote categories (AA/BB). Adjustment for multiple testing was performed using Bonferroni correction with two different parameters as the number of independent tests, the number of linkage disequilibrium (LD) blocks in the gene in which the SNP is contained and the total number of LD blocks present in all genes (768 in GHD and 563 in TS). Filtering criterion for prediction was defined as a false discovery rate (FDR) modified p value < 0.05 unmodified for LD blocks.
Transcriptome analysis
Transcriptomic profiling was carried out on whole blood RNA as described previously [7] using Affymetrix GeneChip Human Genome U133 plus 2.0 Arrays. For background correction, the Robust Multichip Average was applied with quantile normalisation and a mean probe set summarisation using Qlucore Omics Explorer 2.3 ([QOE] Qlucore, Lund, Sweden). The data set generated was subject to quality control to investigate the presence of outliers and further confounding effects.
Baseline gene expression associations with height velocity in each year of growth response were determined using rank regression with microarray batch, age, body mass index (BMI) at baseline as covariates (eliminated factor function in QOE) for both GHD and TS patients along with gender and peak GH test response (average of two provocative tests) for the GHD patients. Over the study, a number of children either entered puberty spontaneously or received exogenous sex steroids for pubertal induction. We therefore introduced a further normalisation for Tanner stage to the analysis to account for the proportion of children entering puberty in each year of the study.
Generation of network models
Network analysis allows the identification and prioritisation of key functional elements within interactome models. To derive an interactome model differentially expressed genes were used as ‘seeds’ and all known protein:protein interactions between the seeds and their inferred immediate neighbours were calculated to generate a biological network using the output of the Biogrid model of the human Interactome (3.3.122) [23]. Network generation and processing were performed using Cytoscape 2.8.3 [24].
Analysis of gene network models
Clustering and ‘community structure’ of modules within biological networks arise from variation in connectivity within the network and are known to be associated with function [25, 26]. To rank these functional components within interactome models, we used the ModuLand plugin for Cytoscape 2.8.3 to determine overlapping modules and to identify hierarchical structure using the centrality property thus enabling the identification of key network elements [27]. The central core unit of each network module (metanode) was defined as the ten most central genes. A list of the unique genes in each metanode was generated and used as a model of the functional core of the associated network for further comparison. Network topology was analysed using the CytoHubba plugin for Cytoscape [28]. The String database was used to assess the integrity and connectivity of gene modules [29].
Analysis of epigenomic data
Previously published methylation profiles of six GHD patients were used to assess the relationship of changes in DNA methylation in relation to response to r-hGH [30]. The data from GSE57107 were re-analysed in Qlucore Omics Explorer 3.3 and a median-based gene level summary of methylation was determined (n = 20,618). The relationship between gene level DNA methylation and response to r-hGH was determined using rank regression.
Classification of growth response
All analysis was performed using the statistical software R 3.3.2 [31]. The relationship of baseline gene expression to potential predictive value (classification of low and high quartiles of response) was performed using Discriminant Analysis of Principal Components (DAPC) [32], Partial Least Squares Discriminant Analysis (PLS-DA) (mixOmics 6.1.1R package [33]) and random forest (RF) with 1000 trees [34]. Class size imbalance was corrected for using Synthetic Minority Oversampling Technique (SMOTE) [35]. Feature selection from RF data was performed using the BORUTA algorithm [36]. The area under the curve of the receiver operating characteristic (AUC) was used to present the probability of a randomly selected sample being classified correctly.
In RFs, about one third of the cases are left out of each iteration and can be used as a test set to perform cross-validation and to get an estimate of the test set error, the out of bag (oob) error estimate. The oob error estimate is recognised as being unbiased [34].
We used RF to investigate whether blood transcriptomic data from GHD and TS patients provided additional value for prediction of response to r-hGH based on baseline patient auxology (age, weight SDS, birthweight SDS and distance to target height SDS in both TS and GHD with the addition of peak GH value for GH provocation test in GHD). These analyses were performed by defining the predictive value of baseline clinical phenotype alone and these data were then compared to baseline clinical phenotype in addition to blood transcriptomic markers.
Statistics
Analyses were performed to determine genetic associations with response to r-hGH using the Kruskal–Wallis rank-sum test with Bonferroni corrections for FDR. This study was powered to detect an effect size of >6 cm/year as previously described [7].
Transcriptomic data were subjected to dimensional scaling using Principal Components Analysis and Iso-map multidimensional scaling [37] and used to demonstrate data homogeneity (Qlucore Omics Explorer 3.3) along with outliers using cross-validation. Unsupervised analysis of transcriptome data was performed using a projection score to select optimal variable subsets by variance filtering [38].
Transcriptomic associations with response to r-hGH were performed using rank regression (p < 0.01) and modified for the listed covariates. This was done by fitting a linear model with the factors to be eliminated as predictors, and retaining only the residuals (i.e. subtracting the part explained by the predictors). When a nominal factor was used as covariate (such as gender), this is equivalent to mean-centring each variable over each subgroup defined by the factor. To use covariates to adjust analysis with the downstream statistical test to relate gene expression and response to r-hGH a General Linear Model was set up, where the null hypothesis is that the data can be modelled by the covariates, and the alternative hypothesis is that the data can be modelled by covariates and the response to r-hGH.
The significance of gene set overlaps derived from the network analysis was determined using the hypergeometric test. Analyses were performed in the stated software or using R [31].
Study approval
The PREDICT (NCT00256126) and PREDICT Long-term Follow-up (NCT00699855) studies were approved by the Scotland Medical Research and Ethics Committee (reference 05/MRE10/61) and the North West Research Ethics Committee (reference 08/H1010/77), respectively. Informed consent was obtained from parents for all study participants.
Results
Growth response of patients over 5 years of r-hGH treatment
The auxology of the PREDICT study has been previously described at baseline and after 1 year [7] and after 3 years [5] of therapy with r-hGH. Height velocities as a measure of response to r-hGH at each year in GHD and TS are shown in Table 1A. As expected, first year growth response is the largest with a decline in subsequent years to a maintenance growth rate [39].
Genetic associations were not robust enough to be used to predict changes in growth rate over the 5 years of the study
The association between SNP carriage and growth response was assessed for 1096 and 792 growth-related candidate genes, in GHD and TS, respectively, which passed the filtering criterion. Whilst 113 SNPs were associated with growth response endpoints with an FDR p value < 0.05 modified by the number of blocks of LD, none of these were deemed to pass the stringency criteria required for predictive value (Table S1A–E).
Unsupervised and supervised analysis demonstrates that GHD and TS blood transcriptome at baseline can be used to classify response to r-hGH therapy over 5 years of treatment
We first demonstrated that a fundamental relationship existed between the baseline blood transcriptome and response to r-hGH over the 5 years of the study (GHD n = 50, TS n = 22) using DAPC on the unsupervised baseline transcriptome (GHD = 8875, TS = 8455 gene probe sets). These analyses showed clear segregation of the low response and the high response quartiles of response to r-hGH thus demonstrating the utility of blood transcriptome to differentiate response groups (Fig. 1). PLS-DA of the unsupervised baseline transcriptome demonstrated similar findings (Fig. 2).
GHD and TS blood transcriptome at baseline can be used to classify response to r-hGH therapy year-on-year over 5 years of treatment
Baseline gene expression associated with height velocity at each year of the 5 years after the start of treatment with r-hGH was defined using rank regression (p < 0.01) (Table S2) with a range of covariates—microarray batch, age, BMI at baseline for both GHD and TS patients along with gender and peak GH test response in GHD. Tanner stage was added as a further covariate to account for the pubertal status of the patients (Fig. S1). There was no difference in auxology at baseline between each group of patients at each year of the study (Tables 1B and S2). First classification of low and high responding quartile groups of patients was assessed by PLS-DA using unmodified class sizes (Table S3A, B): clear separation of the quartiles was observed (example of first year GHD response, Fig. 3). We also examined classification of growth response using RF with oversampling by SMOTE to correct for uneven class size (GHD, Table S3A and TS, Table S3B). These data show clear classification of good and poor responders: at each year of the study all PLS-DA AUCs were between 73 and 98% and all RF AUCs were between 78 and 98% in both conditions.
Interactome network models of response to r-hGH
There was a limited overlap between GHD and TS whole blood transcriptomic markers related to growth response at each year of the study (Table S2). We therefore generated interactome network models including inferred interactions to assess whether GHD and TS growth response-associated gene expression was related by affecting the function of similar network modules, albeit in different ways.
Interactome network models of gene expression associated with height velocity at each year of the study were generated. The hierarchy of overlapping modules of genes was identified in each network using the network topology parameter of ‘centrality’ (Table S4). Network centrality is a measurement that is known to be related to gene function within networks; the more central a gene is, the more capable it is of influencing other genes within the network [40].
The gene level summary of SNP associations with change in height and height velocity measurements with FDR < 0.05 (Table S1C, D) were mapped onto the network models (Table S4). Most of the genetic associations with change in height and height velocity were also present within the network models—15/25 SNPs in GHD and 9/12 in TS (Table S5), implying that these genes have a functional role in network action.
Network models associated with height velocity in each year in both GHD and TS demonstrated significant overlaps (hypergeometric test, p < 0.01) (Fig. 4). These observations imply that whilst associated gene expression may be different between GHD and TS, common network elements are being affected in the two conditions.
The overlap between network models formed a discrete interactome element shared between GHD and TS (Fig. 5). When this network was partitioned into genes related to each year of response to r-hGH (coloured Fig. 5A), it was determined that the genes associated with year 3 formed a less distinct cluster within the network (Fig. 5B). This observation is in alignment with a partition between early (years 1 and 2) and later (years 4 and 5) response to r-hGH as would be expected clinically.
The facts that (i) genetic associations with growth response map to the network models derived from transcriptomic data and that (ii) the network connectivity of the central modules changes over the duration of the study imply that the network models are robust and account for the effect of development on related phenotype (Table S5).
The identification of core sets of genes that can classify response to r-hGH in both GHD and TS
The overlap between network models was used to select a common set of genes at each year of therapy present in both GHD and TS. Genes within this common list were selected for growth response classification if they had previously been identified as significantly associated with height velocity by rank regression in either GHD or TS (p < 0.05) (Fig. 5C and Table S2).
Classification of both high and low r-hGH response quartiles against the remaining patients was shown using PLS-DA (no oversampling) and RF (using SMOTE oversampling). All AUCs for classification were between 74 and 96% (Table S6).
Further confidence in the findings was provided by assessing the predictive quality of the gene probe sets using BORUTA to define the limits of the noise in the analysis using a 100-fold permutation of the data (e.g. first year growth response Fig. 6 and Table S7).
The core sets of genes with expression in whole blood that can classify response to r-hGH in both GHD and TS are associated with differential genomic methylation
Changes in genomic methylation in response to short-term treatment with r-hGH (4 days) have been demonstrated in children with range of conditions that manifest short stature [30]. Using the data provided by this previously published study, we examined the epigenome at baseline (prior to r-hGH treatment) in relation to growth response (measured by knemometry) in GHD patients (n = 6) and found that using a gene level summary of DNA methylation (20,618 genes) 497 had methylation associated with growth response to r-hGH (rank regression, p < 0.01) (Fig. 7). The majority of associated genes (425/497) were hypermethylated at lower rates of growth response.
We took the core sets of genes previously identified as classifiers of response to r-hGH in both GHD and TS and mapped gene level methylation present in the six GHD patients with knemometry measurements. The majority of genes (57/71) were correlated with growth response (|r| > 0.3) these were evenly distributed between positive (n = 27/71) and negative (n = 30/71) correlations (year 1 data shown in Fig. 7B).
Transcriptomic markers combined with phenotype lead to better growth response prediction
It is known that the baseline phenotype of GHD and TS patients can be used to predict response to r-hGH [2, 41,42,43]. We can use clinical parameters alone to achieve good classification of response to r-hGH in our current data sets (GHD range: 0.86–0.94, TS range: 0.84–0.91) (Table 2). However, we also found that including the blood transcriptome markers increased predictive value at each year (GHD range: 0.95–0.97, TS range: 0.92–0.95). These data reveal an average increase of 7% (p = 0.0031) and 4% (p = 0.0365) (prediction of low quartile) along with 4% (p = 0.0179) and 4% (p = 0.0097) (prediction of high quartile) in GHD and TS, respectively (Table 2).
Importantly, we also noted a significant decrease of error rate in the prediction of growth response at each year when blood transcriptome markers were combined with clinical phenotype markers. Error rates decreased by an average of 5% (p = 0.0084) and 5% (p = 0.0400) (prediction of low quartile) along with 5% (p = 0.0252) and 5% (p = 0.0067) (prediction of high quartile) in GHD and TS, respectively (Table 2). The reduction observed amounted to an average halving of the error rate seen when predicting response to r-hGH using clinical phenotype markers alone.
Discussion
This study aimed to identify for the first time the genomic associations that classify response to r-hGH therapy from 1 year up to 5 years of treatment with r-hGH in children with TS and GHD.
Our previous analysis has shown limited utility of genetic associations derived from a candidate set of growth-related genes in the prediction of response to r-hGh in GHD and TS after 1 year of therapy [5, 7, 44]. Hence, genetic data do not appear to be powerful enough on their own to be used in prediction and clinical management. Recently a genome-wide association study (GWAS) on r-hGH response over the first year of treatment has been published [45]. Inevitably this work uses small numbers and did not find any evidence of genome-wide significance in primary analysis. After secondary analysis including replication there was no evidence of association with either previously published first year response genes [7] or with polygenic predicted height score; however, several loci were identified as possibly significant. The transcriptomic analysis presented in our manuscript covers the whole genome and the associated network modelling links genes with differential expression to the entire genomic background. In a rare disease, such as GHD, a major problem with genetic investigation is that patient numbers will never be sufficient for a fully powered analysis with GWAS as the size of the set of genetic variants will be several orders of magnitude higher than the number of the expressed genes. Analysis of the transcriptome in this situation suffers from issues in the interpretation of tissue-specific action but is much more likely to achieve robust findings. In our study, we have used the blood transcriptome as a marker of the action of r-hGH and, therefore, present the basis of a future test using routine clinical sampling.
The whole-blood transcriptomic profile of GHD and TS patients has been shown to be associated with first year growth response to r-hGH [7] and to correlate with the interaction between GHRd3 and GHD severity [44]. We therefore reasoned that there may be value in using transcriptomic data to classify growth response, as it reflects both a child’s genetic profile and the complex clinical phenotypes arising from changes in physical development during childhood, as well as variation in the severity of the underlying condition. By normalising gene expression for phenotype, including pubertal stage, we were able to show that whole blood transcriptomic data, associated with height velocity at each year of the study, could be used to classify both the low and high quartiles of growth response, with ‘area under the curve’ up to 97%, providing the basis for a predictive test.
Little overlap between GHD and TS was observed between the gene expression data that was associated with each year of growth response. We therefore investigated whether GHD and TS were interacting with similar functional units of genes using network models [46]. We generated network models of growth response (as determined by height velocity) at each of the 5 years of treatment using baseline gene expression. Functional modules of genes within these models were ranked according to their network centrality. The measure of network centrality is known to be associated with mechanism [26, 47] and we used this measure to define the functional hierarchy of the modules of genes whose expression was linked to r-hGH response at each year of therapy.
We demonstrated robustness of the network modules identified by mapping the genetic associations identified in this study to the network models. This process highlighted genes previously identified as associated with growth response after 1 year of therapy (GRB10, SOS1 and INPPL1 in GHD) [7] along with 1 month change in serum IGF-I associated with r-hGH therapy (CDK4) [21]. It was also noted that three genes were present (INPPL1 and SOS1 in GHD and PTPN1 in TS) out of the four genes identified within the PREDICT validation study as having replicated an association with first year growth response when controlled for co-variates [5].
A significant overlap between the core network gene modules between GHD and TS was identified. We then used gene expression changes associated with growth response within these network elements to identify genes common to both conditions and show that their expression could be used to classify growth response.
The major strength of this study is to have identified predictive markers and common genomic mechanisms related to early and later growth in two different growth disorders. Our findings are also supported by the demonstration of differential methylation in these shared genes, associated with response to r-hGH in another study [48]. Importantly, we have defined sets of gene expression with predictive value in two conditions where the number of genes [17,18,19,20,21,22,23,24,25,26] is smaller than the number of patients in the group (33–70) (Tables S2 and S6); this indicates that the findings are not a consequence of overfitting [49].
In this study we have compared the use of baseline patient auxology to blood transcriptome in predicting response to r-hGH. Linear models based on baseline patient auxology can account for ~40–60% of the variance observed [9, 10]. Using RF, we found no significant difference in the AUC of baseline auxology alone compared to using blood transcriptome alone in either GHD or TS (all ~90%). It should be noted that this comparison was with the transcriptome shared between GHD and TS and if the full blood transcriptome is used then the average AUC is significantly higher than that derived from baseline auxology (average AUC ~90% compared to ~95%). We recognise that further work would need to be done to refine a smaller number of genes and therefore minimise the risk of overfitting when using the full blood transcriptome. However, we did identify a significant boost to prediction between 4 and 7% when the transcriptomic signature shared between GHD and TS was combined with the baseline patient auxology. Importantly, the gain in prediction was combined with an average halving of the error rate, a feature that represents a major clinical advance in the prediction of response to r-hGH.
We propose that the work presented in this manuscript represents a step towards individualised prediction of response to r-hGH. To proceed towards a clinical test, further validation would be required using the effect sizes we have defined and we would need to compare the group findings established in our study with the absolute individual gene expression provided by RNAseq. It would also be possible to link the transcriptomic findings to the complex genetic background of response to r-hGH using the definition of expressed quantitative trait loci potentially providing a route to a genetic risk score refined by the transcriptomic response.
This work has led to three novel findings relevant to growth studies, and potentially to other therapeutic areas in paediatrics. First, this study has demonstrated the utility of whole blood transcriptome in the classification of growth response in GHD and TS, derived from a baseline blood sample, which is straightforward to obtain in any child. This technique may be of particular use in conditions with marked variability in response to r-hGH such as the short child born small for gestational age. Second, network analysis provides a novel approach that can be used to identify genomic features that are likely to have high predictive value. Finally, a set of common genes in GHD and TS identified by a network approach can be used to classify growth response in both conditions, providing the opportunity to develop a test to inform clinical management.
Data availability
All transcriptomic data are available from Gene Expression Omnibus (GEO)—GSE72439.
References
Excellence NIfHaC. Human growth hormone (somatropin) for the treatment of growth failure in children. 2010.
Ranke MB, Lindberg A, Brosz M, Kaspers S, Loftus J, Wollmann H, et al. Accurate long-term prediction of height during the first four years of growth hormone treatment in prepubertal children with growth hormone deficiency or Turner Syndrome. Horm Res Paediatr. 2012;78:8–17.
Stevens A, De Leonibus C, Hanson D, Whatmore A, Murray P, Donn R, et al. Pediatric perspective on pharmacogenomics. Pharmacogenomics. 2013;14:1889–905.
Stevens A, De Leonibus C, Whatmore A, Hanson D, Murray P, Chatelain P, et al. Pharmacogenomics related to growth disorders. Horm Res Paediatr. 2013;80:477–90.
Stevens A, Murray P, Wojcik J, Raelson J, Koledova E, Chatelain P, et al. Validating genetic markers of response to recombinant human growth hormone in children with growth hormone deficiency and Turner syndrome: the PREDICT validation study. Eur J Endocrinol. 2016;175:633–43.
De Leonibus C, Chatelain P, Knight C, Clayton P, Stevens A. Effect of summer daylight exposure and genetic background on growth in growth hormone-deficient children. Pharmacogenomics J. 2016;16:540–50.
Clayton P, Chatelain P, Tato L, Yoo HW, Ambler GR, Belgorosky A, et al. A pharmacogenomic approach to the treatment of children with GH deficiency or Turner syndrome. Eur J Endocrinol. 2013;169:277–89.
Stevens A, Hanson D, Whatmore A, Destenaves B, Chatelain P, Clayton P. Human growth is associated with distinct patterns of gene expression in evolutionarily conserved networks. BMC Genom. 2013;14:547.
Ranke MB, Lindberg A, Chatelain P, Wilton P, Cutfield W, Albertsson-Wikland K, et al. Derivation and validation of a mathematical model for predicting the response to exogenous recombinant human growth hormone (GH) in prepubertal children with idiopathic GH deficiency. KIGS International Board. Kabi Pharmacia International Growth Study. J Clin Endocrinol Metab. 1999;84:1174–83.
Ranke MB, Lindberg A, Chatelain P, Wilton P, Cutfield W, Albertsson-Wikland K, et al. Prediction of long-term response to recombinant human growth hormone in Turner syndrome: development and validation of mathematical models. KIGS International Board. Kabi International Growth Study. J Clin Endocrinol Metab. 2000;85:4212–8.
Ranke MB, Lindberg A, Chatelain P, Wilton P, Cutfield W, Albertsson-Wikland K, et al. Predicting the response to recombinant human growth hormone in Turner syndrome: KIGS models. KIGS International Board. Kabi International Growth Study. Acta Paediatr (Oslo, Nor: 1992) Suppl. 1999;88:122–5.
Stevens A, Bonshek C, Whatmore A, Butcher I, Hanson D, De Leonibus C, et al. Insights into the pathophysiology of catch-up compared with non-catch-up growth in children born small for gestational age: an integrated analysis of metabolic and transcriptomic data. Pharmacogenomics J. 2014;14:376–84.
De Leonibus C, Chatelain P, Knight C, Clayton P, Stevens A. Effect of summer daylight exposure and genetic background on growth in growth hormone-deficient children. Pharmacogenomics J. 2015;16:540–50.
Gerber T, Willscher E, Loeffler-Wirth H, Hopp L, Schadendorf D, Schartl M, et al. Mapping heterogeneity in patient-derived melanoma cultures by single-cell RNA-seq. Oncotarget. 2017;8:846–62.
Grigoroiu M, Tagett R, Draghici S, Dima S, Nastase A, Florea R, et al. Gene-expression profiling in non-small cell lung cancer with invasion of mediastinal lymph nodes for prognosis evaluation. Cancer Genom Proteom. 2015;12:231–42.
Cavalli FMG, Remke M, Rampasek L, Peacock J, Shih DJH, Luu B, et al. Intertumoral heterogeneity within medulloblastoma subgroups. Cancer Cell. 2017;31:737–54.e6.
Bhutani M, Zhang Q, Friend R, Voorhees PM, Druhan LJ, Barlogie B, et al. Investigation of a gene signature to predict response to immunomodulatory derivatives for patients with multiple myeloma: an exploratory, retrospective study using microarray datasets from prospective clinical trials. Lancet Haematol. 2017;4:e443–e51.
Kamel HFM, Al-Amodi H. Exploitation of gene expression and cancer biomarkers in paving the path to era of personalized medicine. Genom Proteom Bioinform. 2017;15:220–35.
Sävendahl L, Polak M, Backeljauw P, Blair J, Miller BS, Rohrer TR, et al. Treatment of children with GH in the United States and Europe: long-term follow-up from NordiNet® IOS and ANSWER program. J Clin Endocrinol Metab. 2019;104:4730–42.
Shepherd S, Saraff V, Shaw N, Banerjee I, Patel L. Growth hormone prescribing patterns in the UK, 2013–2016. Arch Dis Child. 2019;104:583–7.
Stevens A, Clayton P, Tato L, Yoo HW, Rodriguez-Arnao MD, Skorodok J, et al. Pharmacogenomics of insulin-like growth factor-I generation during GH treatment in children with GH deficiency or Turner syndrome. Pharmacogenomics J. 2014;14:54–62.
Murray PG, Stevens A, De Leonibus C, Koledova E, Chatelain P, Clayton PE. Transcriptomics and machine learning predict diagnosis and severity of growth hormone deficiency. JCI Insight. 2018;3:1–14.
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, et al. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 2015;43:D470–8.
Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 2011;27:431–2.
Yu H, Kim PM, Sprecher E, Trifonov V, Gerstein M. The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput Biol. 2007;3:e59.
Sun J, Zhao Z. A comparative study of cancer proteins in the human protein-protein interaction network. BMC Genom. 2010;11:S5.
Szalay-Beko M, Palotai R, Szappanos B, Kovacs IA, Papp B, Csermely P. ModuLand plug-in for Cytoscape: determination of hierarchical layers of overlapping network modules and community centrality. Bioinformatics. 2012;28:2202–4.
Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8:S11.
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43:D447–52.
Kolarova J, Ammerpohl O, Gutwein J, Welzel M, Baus I, Riepe FG, et al. In vivo investigations of the effect of short- and long-term recombinant growth hormone treatment on DNA-methylation in humans. PLoS ONE. 2015;10:e0120463.
RCoreTeam. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016. https://www.R-project.org/.
Jombart T, Devillard S, Balloux F. Discriminant analysis of principal components: a new method for the analysis of genetically structured populations. BMC Genet. 2010;11:94.
Rohart F, Gautier B, Singh A, Le, Cao KA. mixOmics: an R package for ‘omics feature selection and multiple data integration. PLoS Comput Biol. 2017;13:e1005752.
Liaw A, Wiener M. Classification and regression by RandomForest. R News. 2002;2:18–22.
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–57.
Kursa MB, Rudnicki WR. Feature selection with the Boruta package. J Stat Softw. 2010;36:1–13.
Nilsson J, Fioretos T, Höglund M, Fontes M. Approximate geodesic distances reveal biologically relevant structures in microarray data. Bioinformatics. 2004;20:874–80.
Fontes M, Soneson C. The projection score–an evaluation criterion for variable subset selection in PCA visualization. BMC Bioinform. 2011;12:307.
Ranke MB, Lindberg A. Predicting growth in response to growth hormone treatment. Growth Horm IGF Res: Off J Growth Horm Res Soc Int IGF Res Soc. 2009;19:1–11.
Tuo Y, An N, Zhang M. Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods. Mol Med Rep. 2018;17:4281–90.
Ranke MB, Wit JM. Growth hormone—past, present and future. Nat Rev Endocrinol. 2018;14:285–300.
Wit JM, Ranke MB, Albertsson-Wikland K, Carrascosa A, Rosenfeld RG, Van Buuren S, et al. Personalized approach to growth hormone treatment: clinical use of growth prediction models. Horm Res Paediatr. 2013;79:257–70.
Ranke MB, Schweizer R, Martin DD, Ehehalt S, Schwarze CP, Serra F, et al. Analyses from a centre of short- and long-term growth in Turner’s syndrome on standard growth hormone doses confirm growth prediction algorithms and show normal IGF-I levels. Horm Res Paediatr. 2012;77:214–21.
Valsesia A, Chatelain P, Stevens A, Peterkova VA, Belgorosky A, Maghnie M, et al. GH deficiency status combined with GH receptor polymorphism affects response to GH in children. Eur J Endocrinol. 2015;173:777–89.
Dauber A, Meng Y, Audi L, Vedantam S, Weaver B, Carrascosa A, et al. A genome-wide pharmacogenetic study of growth hormone responsiveness. J Clin Endocrinol Metab. 2020;105:1–12.
Stevens A, De Leonibus C, Hanson D, Dowsey AW, Whatmore A, Meyer S, et al. Network analysis: a new approach to study endocrine disorders. J Mol Endocrinol. 2014;52:R79–93.
Kovacs IA, Palotai R, Szalay MS, Csermely P. Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics. PLoS ONE. 2010;5:1–14.
Kolarova J, Ammerpohl O, Gutwein J, Welzel M, Baus I, Riepe FG, et al. In vivo investigations of the effect of short- and long-term recombinant growth hormone treatment on DNA-methylation in humans. PLoS ONE. 2015;10:e0120463.
Chicco D. Ten quick tips for machine learning in computational biology. BioData Min. 2017;10:35.
Acknowledgements
The following were PREDICT local centre principal investigators who enroled patients into the study: S-W Yang, H-W Yoo, T-J Wang, S Cabrol, J Weill, R Pfaeffle, F Antoniazzi, A Pilotta, D Veimo, V Peterkova, A Ferrandez Longas, I Gonzalez, S Quinteiro, M Rodriguez Arnao, L Bath, M Colle, Y Brusquet, J-P Salles, J-W Hou.
Funding
This work was supported by Merck KGaA, Darmstadt, Germany.
Author information
Authors and Affiliations
Contributions
P Cl and P Ch conceived and designed the PREDICT project and this study. Data analysis and methodology development were undertaken by AS, PM, TG and CDL. The manuscript was written by AS and PM and revised by EK, P Ch and P Cl. EK, GA, KK, GB, MM, SZ, EB, JS, DY, AB, J-PLS, RC, EV-H, LH, JD and CD all contributed to the clinical data analysis. All authors reviewed the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
AS and PM have received speaker honoraria from Merck KGaA, Darmstadt, Germany. P Ch has received investigator research support, consultant and speaker honoraria from Merck KGaA, Darmstadt, Germany. P Cl had received research investigator support and speaker honoraria from Merck KGaA, Darmstadt, Germany. EK is an employee of Merck KGaA, Darmstadt, Germany.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Stevens, A., Murray, P., De Leonibus, C. et al. Gene expression signatures predict response to therapy with growth hormone. Pharmacogenomics J 21, 594–607 (2021). https://doi.org/10.1038/s41397-021-00237-5
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41397-021-00237-5