Introduction

Height is a highly heritable and classic polygenic trait. Two main approaches are used to discover genes involved in growth regulation. The first approach is to carry out genome-wide association studies (GWAS) for common variants in large populations of individuals. This has led to the discovery of at least 180 loci associated with adult height. However, the contribution of each locus is small, each locus contains various genes, and together the loci explain only about 10% of the phenotypic variation.1 Alternatively, when all single-nucleotide polymorphisms (SNPs) identified in a GWAS approach are used simultaneously as predictors, up to 40% of the variance in height can be explained.2 The second approach is to perform genetic studies in patients with extremely short or tall stature, and search for causative variants.3 With this approach one can either test for gene defects that were previously described or that appear plausible based on observations in knockout mice (candidate gene approach), or perform a genome-wide analysis for copy number variants (CNVs) or whole-exome sequencing for mutations. The candidate gene approach has led to the detection of a substantial number of genes that are involved in monogenic defects associated with short or tall stature, such as IGF1, STAT5B, IGFALS and IGF1R,4, 5, 6, 7, 8, 9, 10 but by design it does not identify novel genes involved in growth regulation.

In two previous papers from our group,11, 12 we have described the results of a candidate gene approach in children with short stature, either associated with a low birth size (small for gestational age, SGA)13 or with a normal birth size (idiopathic short stature).14 In this article, we describe the results of a genome-wide analysis for CNVs using SNP arrays in short children, in an effort to identify novel gene variants associated with short stature.

Patients and Methods

Patients

We studied 191 patients from 173 unrelated families with short stature (≤−2 SD score, SDS) of unknown origin, either born with a normal birth size or born SGA. Between 2008 and 2011, their DNA was sent to our laboratory for analysis because of short stature. Twenty-nine patients were excluded from the present analysis: 8 because of a height SDS >−2.0, 15 because of insufficient or low quality DNA or no parental consent, and 6 cases belonging to one family that was described separately, with a heterozygous IGF1 mutation and an additional 435.7 kb deletion (arr 3q26.1(162 681 814–163 117 547) × 1).6 This resulted in an analyzable group of 162 patients from 149 families. Height SDS was calculated for Dutch population references,15 except for one patient (I.6/II.2) for whom the reference for children of Turkish ethnicity was used.16 With consent of the medical ethical committee of the Leiden University Medical Center, clinical data were collected and anonymized for all patients.

SNP arrays

In 103 cases, the Affymetrix GeneChip Human Mapping 262K NspI or 238K StyI arrays (Affymetrix, Santa Clara, CA, USA) were used, containing 262 262 and 238 304 25-mer oligonucleotides, respectively, with an average spacing of approximately 12 kb per array. An amount of 250 ng DNA was processed according to the manufacturer’s protocol. Detection of SNP copy number was performed using copy number analyzer for GeneChip (CNAG) version 2.0.17

In 54 cases, the Illumina HumanHap300 BeadChip (Illumina Inc., San Diego, CA, USA) was used, containing 317 000 TagSNPs, with an average spacing of approximately 9 kb, and in 5 cases the Illumina HumanCNV370 BeadChip (Illumina Inc., Eindhoven, The Netherlands), containing 317 000 TagSNPs and 52 000 non-polymorphic markers for specifically targeting nearly 14 000 known CNVs. This array has an average spacing of approximately 7.7 kb. A total of 750 ng DNA was processed according to the manufacturer’s protocol. SNP copy number (log R ratio) and B-allele frequency were assessed using Beadstudio Data Analysis Software Version 3.2 (Illumina Inc., The Netherlands).

Evaluation of CNVs

Deletions of at least five adjacent SNPs and a minimum region of 150 kb and duplications of at least seven adjacent SNPs and a minimum region of 200 kb were evaluated,18 except for three families in which a prominent duplication smaller than 200 kb (although consisting of ≥10 adjacent SNP probes) was observed. The CNVs were classified into four groups: (I) known pathogenic CNVs (known microdeletion or microduplication syndromes); (II) potentially pathogenic CNVs, not described in the Database of Genomic Variants (DGV; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Canada, http://projects.tcag.ca/variation/); (III) CNVs not described in the DGV, but not containing any protein-coding genes; and (IV) known polymorphic CNVs described in the DGV or observed in our in-house reference set, whereby at least three individuals must have been reported with the same rearrangement. Type IV CNVs were not further evaluated. All type II CNVs were assessed with Ensembl (Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, http://www.ensembl.org: Ensembl release 63–June 2011) and the DECIPHER database (Wellcome Trust Genome Campus) for gene and microRNA (miRNA) content and similar cases, respectively. If DNA from the parents was available, segregation analysis was performed by SNP array. Finally, data of all patients with potentially pathogenic CNVs were added to the DECIPHER database.
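The size filter and four-group classification described above can be sketched as follows. This is only an illustration, not the pipeline used in the study: the `Cnv` record, its field names and the boolean annotation flags are hypothetical, the manual exception made for three families with a prominent duplication smaller than 200 kb is not modelled, and giving type I precedence over type IV is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Cnv:
    """Hypothetical record for one CNV call (field names are illustrative)."""
    kind: str                       # "deletion" or "duplication"
    n_snps: int                     # adjacent SNP probes supporting the call
    length_kb: float                # size of the affected region in kb
    known_syndrome: bool = False    # known microdeletion/microduplication syndrome
    polymorphic: bool = False       # in DGV or in-house set (>= 3 individuals)
    has_coding_genes: bool = True   # contains at least one protein-coding gene


# Minimum (adjacent SNPs, size in kb) per CNV kind, as stated in the text.
MIN_CALL = {"deletion": (5, 150.0), "duplication": (7, 200.0)}


def passes_filter(cnv: Cnv) -> bool:
    """Apply the 'at least N adjacent SNPs and at least M kb' rule."""
    min_snps, min_kb = MIN_CALL[cnv.kind]
    return cnv.n_snps >= min_snps and cnv.length_kb >= min_kb


def classify(cnv: Cnv) -> str:
    """Return the CNV group (I-IV); precedence of I over IV is assumed."""
    if cnv.known_syndrome:
        return "I"
    if cnv.polymorphic:
        return "IV"
    return "II" if cnv.has_coding_genes else "III"
```

For example, a deletion without protein-coding genes that is absent from the DGV would be returned as type III by `classify`, whereas a duplication supported by only six probes would be dropped by `passes_filter` regardless of its size.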

The type I CNVs were confirmed with multiplex ligation-dependent probe amplification (MLPA), using Salsa MLPA P018 probemix for SHOX and P217 for IGF1R analysis (MRC-Holland, Amsterdam, The Netherlands). Amplification products were identified and quantified by capillary electrophoresis on an ABI 3130 genetic analyzer (Applied Biosystems, Nieuwerkerk aan de IJssel, The Netherlands). Fragment analysis was performed using GeneMarker (SoftGenetics, State College, PA, USA). Thresholds for deletions and duplications were set at 0.75 and 1.25, respectively.19
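As an illustration of how the MLPA thresholds are applied, the sketch below classifies a single probe from its normalized ratio. It assumes that the thresholds refer to the probe signal relative to reference samples (about 1.0 for two copies, about 0.5 for a heterozygous deletion and about 1.5 for a heterozygous duplication) and that the boundary values themselves count as normal; the function name is hypothetical and this is not the GeneMarker implementation.

```python
DELETION_THRESHOLD = 0.75      # ratios below this value suggest a deletion
DUPLICATION_THRESHOLD = 1.25   # ratios above this value suggest a duplication


def call_mlpa_probe(ratio: float) -> str:
    """Classify one MLPA probe from its normalized ratio (assumed convention)."""
    if ratio < DELETION_THRESHOLD:
        return "deletion"
    if ratio > DUPLICATION_THRESHOLD:
        return "duplication"
    return "normal"
```

In practice a call would normally rest on several adjacent probes pointing in the same direction rather than on a single probe.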

Bioinformatics approach

We checked for all CNVs whether they were located in one of the chromosomal regions associated with height in GWAS.1 For genes in deleted or duplicated regions in cases with de novo or segregating CNVs, we used three additional approaches. First, the rodent homologs were checked for three criteria: (1) higher expression in 1-week-old mouse growth plate than in 1-week-old mouse lung, kidney and heart; (2) spatial regulation: significant difference between zones in the 1-week-old rat growth plate; and (3) temporal regulation: significant difference between 3 and 12 weeks of age in the rat growth plate using previously established mRNA expression profiles.20, 21 Second, associations were investigated for mouse growth plate-related phenotypes. Third, associations with human growth plate-related phenotypes were investigated. For details, see Lui et al.21

Results

Copy number variants

An organization chart illustrating the identified CNVs is shown in Figure 1. In the 162 patients belonging to 149 unrelated families, a total of 49 CNVs were found in 40 families (43 patients).

Figure 1

Organization chart illustrating the identified CNVs. The 149 unrelated families (162 patients) divided in the different subcategories are depicted in bold. A total of 49 CNVs were found in 40 families (43 patients).

In six families (4.0%, six patients), a type I CNV was observed and in two of them an additional de novo type II CNV. Table 1 shows the clinical and genetic findings of these six patients, including two microdeletions (I.1 and I.2) and two microduplications (I.3 and I.4) containing SHOX, and two terminal 15q deletions containing IGF1R (I.5/II.1/mi.3 and I.6/II.2). All these CNVs were confirmed with MLPA.

Table 1 Type I CNVs

One or more type II CNVs (n=40) were found in 33 unrelated families (22.1%, 36 patients). Five of these potentially pathogenic CNVs contained miRNAs in addition to protein-coding genes (Table 2). In 24 families (27 patients), segregation analysis could be performed, which identified a total of five de novo CNVs (Table 3) and nine CNVs segregating with a height below −1.5 SDS of a carrier family member (Table 4). For 19 CNVs, the lack of segregation with short stature makes a causative role of the CNV unlikely (Supplementary Table 1). In nine patients (nine CNVs), no information on segregation could be obtained (Supplementary Table 2). In two unrelated patients (cases II.24 and II.25), a similar CNV (a deletion containing DCAF12L2, alias WDR40C) in the X-chromosome was identified, but both children inherited the deletion from a normal parent.

Table 2 MicroRNAs
Table 3 De novo type II CNVs
Table 4 Type II CNVs segregating with short stature

In one family (0.7%, one patient), a type III CNV was found encompassing a 192.3 kb deletion of chromosome 13 (arr 13q31.1(86 733 645–86 925 974) × 1). The girl (case III.1) was born SGA, had poor food intake and severe postnatal growth failure (length −8.2 SDS at 2.5 years). Screening for IGF1 and the IGF1R for mutations or deletions was negative. The function of this region is unknown.
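The reported size of this deletion follows directly from the array coordinates. As a minimal sketch, the helper below (a hypothetical function, not part of any array software) extracts the start and end positions from the notation used in this article and returns the span in kb; it assumes the span is simply end minus start.

```python
import re


def cnv_span_kb(arr: str) -> float:
    """Return the span in kb of a call written like 'arr 13q31.1(start–end) × 1'."""
    inside = re.search(r"\(([^)]*)\)", arr).group(1)
    # Split on an en dash or hyphen, then drop the digit-grouping spaces.
    start, end = (int(re.sub(r"\D", "", part)) for part in re.split(r"[–-]", inside))
    return (end - start) / 1000


print(round(cnv_span_kb("arr 13q31.1(86 733 645–86 925 974) × 1"), 1))   # 192.3
print(round(cnv_span_kb("arr 3q26.1(162 681 814–163 117 547) × 1"), 1))  # 435.7
```

The second call uses the 3q26.1 deletion mentioned under Patients and reproduces the 435.7 kb quoted there.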

No potentially pathogenic CNVs (only type IV or no CNVs) were found in 109 families (73.2%, 119 patients).
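The family and patient counts reported in this section can be checked for internal consistency with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable and function names are illustrative.

```python
TOTAL_FAMILIES = 149
TOTAL_PATIENTS = 162

# (families, patients) per category, as reported in the Results.
groups = {
    "type I": (6, 6),
    "type II": (33, 36),
    "type III": (1, 1),
    "no potentially pathogenic CNV": (109, 119),
}


def family_percentages(groups, total_families):
    """Percentage of families per category, rounded to one decimal."""
    return {name: round(100 * fam / total_families, 1)
            for name, (fam, _) in groups.items()}


percentages = family_percentages(groups, TOTAL_FAMILIES)
families_sum = sum(fam for fam, _ in groups.values())
patients_sum = sum(pat for _, pat in groups.values())

# CNV total: 6 type I, 40 type II in the 33 type II families, 2 additional
# de novo type II CNVs in type I families, and 1 type III.
cnv_total = 6 + 40 + 2 + 1

print(percentages)
print(families_sum, patients_sum, cnv_total)
```

The categories add up to 149 families and 162 patients, and the CNV counts add up to the 49 CNVs shown in Figure 1.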

Bioinformatics approach

Five CNVs encountered in our study are close to the loci associated with height in GWAS.1 Four of these CNVs were de novo or segregating with short stature, including loci close to ADAMTS17 (case II.5), PRKG2/BMP3 (cases II.11 and II.13), PAPPA (cases II.11 and II.13) and TULP4 (case II.7). However, none of the deletions included genes tightly linked (r²>0.5) to a GWAS SNP implicated in human height variation. The fifth CNV is close to the MKL2 locus (case II.37/mi.4) but did not segregate with short stature (Supplementary Table 1).

We reasoned that some of the identified CNVs might cause short stature because they contain genes that are expressed and function in the growth plate. We therefore used existing expression microarray data to identify genes that show greater expression in mouse growth plate than in soft tissues, temporal regulation in rat growth plate or spatial regulation in rat growth plate. Within de novo CNVs, this approach implicated five genes (Aldh1a3, Fam3c, Furin, Lrrk1 and Chsy1), and within segregating CNVs, it implicated seven genes (Col14a1, Dscc1, Enpp2, Ezr, Prelid2, Taf2 and Trim32; Table 5). This information, in combination with other bioinformatic data, was used to formulate the arguments for and against an association of these genes with short stature (summarized in Tables 3 and 4). Potential candidate genes in de novo CNVs associated with short stature (Table 3) include FURIN, DOCK8 and/or KANK1, NLRP3, FAM3C, SLC13A1, ADAMTS17, ALDH1A3, LRRK1 and CHSY1. Potential candidate genes in CNVs segregating with short stature (Table 4) include FHIT, PTPRG, TULP4, EZR, ENPP2, TAF2, COL14A1, DSCC1, LPPR1, ZNF675, C4orf22 (or PRKG2/BMP3), PRELID2, and ASTN2 and TRIM32 (or PAPPA).

Table 5 Bioinformatic approach (mouse GP vs soft tissues expression, and spatial and temporal regulation of gene expression in the rat GP)

For the CNVs for which insufficient information was available about segregation with short stature, the in silico analysis provided support for four potential candidate genes (TBL1X, ROBO2, CHD8 and TOX4), as well as a candidate region (distal part of common 22q11 deletion syndrome) (Supplementary Table 2).

Discussion

Whole-genome SNP array analysis in 162 patients with short stature from 149 unrelated families (Figure 1) led to the detection of type I CNVs known to cause short stature (involving SHOX or IGF1R) in six families (in two of them combined with type II CNVs), and 40 potentially pathogenic CNVs (type II) in 33 families. Out of the total of 42 type II CNVs, 5 were de novo and 9 others were associated with short stature in their families. In one severely short child, a deletion without protein-coding genes was found, and in five CNVs six miRNAs were encountered.

A recent genome-wide association analysis of CNVs and stature showed that children with short stature had a greater global burden of lower frequency and rare deletions and a greater average CNV length than controls.22 There were no significant associations with tall stature. These observations suggest that CNVs might contribute to genetic variation in stature in the general population. These authors also identified three preliminary candidate regions as having significant associations with stature: a duplication at 11q11 and deletions at 14q11.2 and 17q21.31. In our analysis, these regions all display common CNVs, which have often been observed in our in-house database and in the DGV (type IV CNVs).

The two patients carrying a heterozygous deletion containing the SHOX gene had disproportionate short stature, but no Madelung deformity. Case I.1 (sitting height/height (SH/H) ratio +3.7 SDS) inherited the deletion from her mother, who also had disproportionate short stature (height −1.8 SDS, SH/H ratio +4.2 SDS). Case I.2 (SH/H ratio +3.8 SDS) carries, in addition to a de novo SHOX haploinsufficiency, a heterozygous unclassified variant in the IGFALS gene (c.1555C>T, p.Arg519Trp) inherited from her father (height −1.1 SDS). IGFALS sequencing was performed because of low circulating IGF-I and IGFBP-3 despite elevated GH secretion. Although the referring physician had not suspected Leri–Weill syndrome, in retrospect the increased SH/H ratio would have been sufficient reason to test directly for SHOX defects. The two patients in whom a duplication of the SHOX gene including surrounding genes was observed (de novo and inherited from a parent of normal stature, respectively) had an SH/H ratio of approximately +1.9 SDS. Along with others, we have recently reported that a phenotype similar to Leri–Weill syndrome (including short stature) can be associated with SHOX duplication.11, 23, 24

In two patients, a heterozygous deletion on chromosome 15 containing the IGF1R gene was identified, a well-established cause of short stature.11, 25, 26 In both patients, an additional de novo CNV was present (Table 3). In case I.5/II.1/mi.3, this was a duplication in 15q26.1q26.2 (located upstream of the deleted area). Although this patient’s growth failure is similar to that of other patients with IGF1R defects,26 duplication of FURIN may have an additional role. In case I.6/II.2, who was considerably shorter than is usual for IGF1R deletions,26 the terminal 15q deletion was combined with a terminal 9p24.3p24.2 duplication, suggesting the presence of an unbalanced reciprocal translocation. We suspect that one of the parents is a carrier of a balanced 9;15 translocation, but unfortunately parental chromosomes were not available for testing. The presence of two patients in the DECIPHER database with a similar 9p duplication and short stature suggests that there may be an association between the genes DOCK8 and KANK1, and stature.

Bioinformatics analysis of the three other cases with de novo type II CNVs led to several candidate genes (Table 3). In case II.3, a duplication of NLRP3 may be associated with short stature. The CNV in case II.4 (who, besides short stature, also has mental retardation, behavioral problems, strabismus and various dysmorphic features) suggests that FAM3C and SLC13A1 deletions may be associated with short stature, particularly because of the expression data of Fam3c in the murine growth plate and the dwarfism and skeletal deformities in Texel sheep and mice with loss-of-function of Slc13a1.27, 28

Case II.5, with a terminal de novo 15q deletion located 1.5 Mb downstream of IGF1R and 244 kb downstream of the ADAMTS17 locus on the reverse strand, had a normal birth size, but showed proportionate progressive growth failure (SH/H ratio +1.58 SDS) with a normal head circumference. Clinical characteristics included slight frontal bossing of the skull, a high-pitched voice, slight abdominal adiposity and delayed bone age. GH secretion and circulating IGF-I were normal, but IGFBP-3 was low (−2 SDS). Several arguments are in favor of a role of ADAMTS17 in growth regulation (for summary, see Table 3), including: (1) significant association with height in population GWAS;1 (2) a short child with a similar terminal deletion in the DECIPHER database; (3) significant association with size in a GWAS in the domestic dog;29 (4) human mutations in ADAMTS17 causing the acromelic chondrodysplasia Weill–Marchesani-like syndrome (OMIM #277600 and #608328);30, 31, 32, 33 and (5) association of members of the ADAMTSL/ADAMTS family with the modulation of fibrillin-1 function.31, 33 Unfortunately, expression of the rodent homolog of ADAMTS17 could not be investigated, because the gene was not represented on the microarrays used. Besides ADAMTS17, this deletion contains three other genes, ALDH1A3, LRRK1 and CHSY1, that might be implicated in short stature.

Nine CNVs in six families (five families with one index patient each, and one family consisting of a mother and her two sons) segregated with a height of <−1.5 SDS of a carrier family member (Table 4). The 3p duplication that case II.6 (height −2.0 SDS) inherited from his father (−1.8 SDS) contains FHIT and the first part of PTPRG. Both genes are considered tumor suppressors.34, 35 The 6q duplication that case II.7 inherited from his mother is located near (97 kb downstream of) a locus (TULP4) associated with height.1 One of the duplicated genes (ENPP2) in case II.8 encodes a lysophospholipase D that produces lysophosphatidic acid, which induces cell proliferation.36 The mouse homologs of TAF2, COL14A1 and DSCC1 are differentially expressed in the growth plate. In case II.9, the 9q deletion containing part of LPPR1 (also known as PRG3) did not fully segregate with short stature in the family, but the observation that Prg1 knockout mice are smaller than wild-type littermates37 suggests a role for this gene in height regulation. The 19p deletion that case II.10 inherited from his father includes ZNF675, associated with osteoclast differentiation.38 Out of the four CNVs in cases II.11, II.12 and II.13 (the short members of one family), C4orf22, ASTN2 and TRIM32 are located close to loci (374 kb upstream of PRKG2/BMP3 and 289 kb downstream of PAPPA, respectively) associated with height,1 suggesting that the 4q and/or 9q deletion are associated with stature.

Four out of nine patients in whom no segregation analysis could be performed (Supplementary Table 2) carry a CNV suggestive of an association with short stature. Case II.14 carries a duplication of TBL1X (alias TBL1), encoding transducin beta-like protein 1 (TBL1), which is required for Wnt-beta-catenin-mediated transcription.39 Case II.17, described previously,12 carries a duplication of 3p12.3 containing part of ROBO2, encoding a receptor for SLIT2 and probably SLIT1, thought to function in axon guidance and cell migration.40 Case II.21 (born SGA, length −3.7 SDS and head circumference −3.1 SDS) presented with clinodactyly, a protruded tongue and delayed bone age. A search in the DECIPHER database revealed two patients with (partially) overlapping duplications, one of whom was short (patient #258583) and one of whom was not (patient #258497). Out of the six genes outside the overlapping region with patient #258497, CHD8 and TOX4 appear to be potential candidate genes.41, 42 Case II.22/mi.5 has a 22q deletion containing only the distal part of the common 22q11 deletion syndrome (velocardiofacial/DiGeorge syndrome). His mother does not carry the deletion, and DNA from the father is not available. In eight patients in the DECIPHER database with overlapping deletions, short stature was observed. The common deleted region contains PI4KA, SERPIND1, SNAP29, CRKL, AIFM3, LZTR1, THAP7 and P2RX6.

In conclusion, whole-genome SNP array analysis in this exploratory study on 162 patients with short stature belonging to 149 unrelated families identified 6 CNVs in 6 families (4%) for which the association with short stature is virtually certain, and 40 CNVs in 33 families (22.1%) with possible pathogenicity. Several of the deleted or duplicated genes may be considered as potential candidate genes for growth disorders, including four genes associated with height in the GWAS (ADAMTS17, PRKG2/BMP3, PAPPA and TULP4). Future studies are needed to support the role of these and other genes in longitudinal growth regulation.