Mammalian sex development is a coordinated series of events resulting in the determination and differentiation of a functional reproductive system with distinct sexual characteristics. Alterations to these events often result in a broad class of rare and heterogeneous conditions referred to as differences in sex development (DSD). DSD comprises congenital conditions in which development of the gonads and other sex-related structures is atypical in relation to sex chromosome complement.1 DSD has historically been diagnosed using conventional cytogenetics (karyotype and FISH), phenotypic, and biochemical assays, while the molecular genetic alterations underlying the disease remain elusive in a significant portion of DSD conditions.2 Traditional karyotype analysis of G-banded chromosomes is used for the cytogenetic characterization of DSD but is limited to detecting large-scale abnormalities greater than 5 Mb in size.3 Chromosomal microarray (CMA) allows for the detection of copy-number variations (CNVs) at a much higher resolution (>100×) than traditional karyotyping.4

Fibroblast growth factors (FGFs) are a large class of signaling molecules that act directly on cells to regulate diverse responses such as mitogenesis, differentiation, migration, and cell survival.5 FGF signaling is ubiquitous in the mammalian embryo and plays key roles in the maturation of many organ systems, including testis determination.6 Disruptions to FGF ligands, receptors, and intracellular signaling cascades have been implicated in a range of DSD, including hypospadias,7 hypogonadotropic hypogonadism,8,9 and other XX or XY DSD conditions.10 Retrospective investigation11 of a DSD patient set with no definitive molecular genetic diagnoses, using research criteria for calling CNVs involving FGF-related genes, provides a rich opportunity for identifying novel regions associated with specific disorders.

Here we investigated CNVs smaller than 50 kb (referred to as ‘small CNVs’ hereafter) in 83 FGF-related genes (Supplementary Table 1) among 52 patients with heterogeneous DSD and no clear molecular genetic diagnosis.11 Eight of these genes are located on chromosome 17, and none on chromosomes 18, 21, or Y. About 47% (39/83) of these genes are smaller than 50 kb. Using high-resolution genome-wide chromosomal microarray, we found small CNVs in ~31% of the investigated genes, a select number of which are hypothesized to disrupt normal molecular function. To our knowledge, this is the first pathway-based analysis of small CNVs in DSD. This CMA approach will further our understanding of the role of FGF signaling genes in DSD etiology.

The study population consists of 52 unrelated patients with atypical sex development (Supplementary Table 3). CMA was performed on patient DNA using the Genome-Wide CytoScan HD and SNP6.0 arrays, and CNV calling was performed using Chromosome Analysis Suite and Genotyping Console, respectively (Affymetrix, Inc., Santa Clara, CA, USA). Patients with negative CMA results (n=32) as well as those with pathogenic gains/losses (n=8) or large genomic imbalances of uncertain clinical significance (n=13) were included in the study (Supplementary Table 3). The study population was evaluated for CNVs smaller than 50 kb to investigate whether previously uncharacterized small deletions/duplications in 83 FGF-related genes might contribute to the DSD phenotype (Supplementary Tables 1 and 3).
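The size and gene-overlap screen described above can be illustrated with a minimal sketch. It is not the vendor software's calling procedure; the `Interval` type, the function names, and all coordinates and labels below are hypothetical placeholders chosen only to show the logic of keeping calls under 50 kb that intersect a gene of interest.

```python
from typing import NamedTuple

SMALL_CNV_MAX_BP = 50_000  # "small CNV" cutoff used in this study (< 50 kb)

class Interval(NamedTuple):
    chrom: str
    start: int   # inclusive coordinate
    end: int     # inclusive coordinate
    label: str   # CNV call ID or gene symbol (placeholders here)

def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

def small_cnvs_in_genes(cnvs, genes, max_bp=SMALL_CNV_MAX_BP):
    """Return (cnv label, gene label, size in bp) for small CNVs overlapping a gene."""
    hits = []
    for cnv in cnvs:
        size = cnv.end - cnv.start + 1
        if size >= max_bp:
            continue  # not a small CNV under the study's definition
        for gene in genes:
            if overlaps(cnv, gene):
                hits.append((cnv.label, gene.label, size))
    return hits

# Hypothetical gene intervals and CNV calls (not real coordinates).
genes = [
    Interval("chr10", 1_000, 120_000, "GENE_A"),
    Interval("chr3", 500_000, 600_000, "GENE_B"),
]
cnvs = [
    Interval("chr10", 50_000, 58_000, "loss_1"),    # small, inside GENE_A
    Interval("chr10", 100_000, 400_000, "loss_2"),  # overlaps GENE_A but too large
    Interval("chr3", 700_000, 705_000, "gain_1"),   # small but outside any gene
]
hits = small_cnvs_in_genes(cnvs, genes)
print(hits)
```

With a gene list of 83 entries a nested loop is adequate; an interval tree would only matter for genome-wide gene sets.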

Investigation of FGF-related genes in patients with DSD revealed CNVs in 26 genes in the FGF pathway (Supplementary Table 2). The mean number of CNVs in FGF-related genes per patient was 2.44. CNVs were distributed roughly evenly between exonic and intronic regions. The number of CNVs in FGF-related genes per patient ranged from zero to eight, with seven individuals (see Note in Supplementary Table 3) lacking a CNV in the investigated genes. The co-occurrence of multiple CNVs in the pathway suggests a possible combinatorial role for these genes in altering the FGF-dependent sex developmental pathway (Supplementary Figure 1).12 All of the called losses were heterozygous at the given locus. Given the ubiquitous importance of most of the signaling machinery, a high proportion, 80% (16/20), of the genes with a detected deletion are likely to exhibit haploinsufficiency.13,14
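As an illustration of how the per-patient figures above (mean, range, and number of patients without a call) can be tallied from a list of calls, the following sketch may help. The function name, patient IDs, and gene labels are hypothetical and do not correspond to the study data.

```python
from collections import Counter
from statistics import mean

def burden_summary(patient_ids, cnv_calls):
    """Summarise CNV counts per patient.

    patient_ids: every patient in the cohort, including those with no calls.
    cnv_calls:   iterable of (patient_id, gene_label) pairs, one per CNV call.
    """
    counts = Counter(pid for pid, _gene in cnv_calls)
    # Patients absent from the call list contribute a count of zero.
    per_patient = [counts.get(pid, 0) for pid in patient_ids]
    return {
        "mean": mean(per_patient),
        "min": min(per_patient),
        "max": max(per_patient),
        "patients_without_cnv": sum(1 for n in per_patient if n == 0),
    }

# Hypothetical four-patient cohort with six calls.
patient_ids = ["P1", "P2", "P3", "P4"]
cnv_calls = [
    ("P1", "GENE_A"), ("P1", "GENE_B"),
    ("P2", "GENE_A"),
    ("P4", "GENE_C"), ("P4", "GENE_A"), ("P4", "GENE_B"),
]
summary = burden_summary(patient_ids, cnv_calls)
print(summary)
```

Passing the full cohort list separately from the calls matters here: patients with no CNV in the investigated genes must still count toward the mean.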

The majority of these CNVs were <10 kb (~57%); however, a minority of large-scale deletion and duplication events >1 Mb (~14%) were found, all located on the X chromosome. The total genome-wide CNV burden per patient across all genes was ~132 (data not shown). Although we focus only on CNVs in FGF-related genes, many other genomic regions in each patient carry small deletions and duplications that could be contributing to the patient phenotype.

The DECIPHER database13 has documented 102 patients with CNVs encompassing the investigated FGF-related genes and phenotypes associated with DSD (Supplementary Table 4). These variants ranged from small (45.68 kb) to large (95.6 Mb), and most are still much larger than the majority of CNVs found in the DSD-specific patient set. Smaller regions within these intervals could partially account for the phenotype in the rare disease patients found in DECIPHER. These findings suggest that small CNVs remain under-represented in DECIPHER.11

CNVs were found throughout the FGF molecular pathway (Table 1), including in FGF receptors, ligands, intracellular signaling cascades, and regulatory molecules. Deletions/duplications were investigated using research criteria under which alterations involving as little as 1 marker and 1 kb could be reported; however, all CNVs called contained >4 markers (Supplementary Table 3). All detected CNVs were statistically enriched in the DSD patient population compared with healthy individuals15 (P value <0.05; Fisher’s exact test).
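For readers unfamiliar with the enrichment test named above, a minimal sketch of a one-sided Fisher's exact test on a 2×2 carrier table follows. Whether the original analysis was one- or two-sided is not stated, so the one-sided form is an assumption, and the counts in the example are hypothetical rather than taken from the study or from DGV.

```python
from math import comb

def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact P value for the 2x2 table

                  carriers   non-carriers
        patients      a           b
        controls      c           d

    i.e. the hypergeometric probability of seeing a or more carriers
    among the patients when the row and column totals are held fixed.
    """
    patients = a + b
    carriers = a + c
    total = a + b + c + d
    denom = comb(total, carriers)
    upper = min(patients, carriers)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    tail = sum(
        comb(patients, k) * comb(total - patients, carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom

# Hypothetical counts: 5 of 52 patients vs 1 of 1000 controls carry the CNV.
p_value = fisher_exact_greater(5, 47, 1, 999)
print(f"one-sided P = {p_value:.3g}")
```

An exact test is the natural choice for this setting because the carrier counts per CNV are small, where chi-squared approximations are unreliable.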

Table 1 Deletions and duplications within genomic regions of FGF-related genes in patients with DSD

Two examples of CNVs found in patients that might be functionally contributing to disease are highlighted in Figure 1. These examples are located in exonic regions of genes known to be essential to gonadal development. Many other CNVs were found in exonic and intronic regions; however, their relevance has not been explored.

Figure 1

Genomic location of deletions in FGFR2 and PIK3CB.19 (a) Schematic of microdeletion found in patient D38. Yellow bars highlight the single-copy deletion surrounding FGFR2 exons 14 and 15. The deletion was found in two patients in the DECIPHER database (ID: 296424 & 265067)13 but not in the normal population in the DGV database.15,25 (b) Location of the deletion mapped to the FGFR2 protein structure schematic. The deletion encompasses the entirety of the tyrosine kinase 2 domain and part of the tyrosine kinase 1 domain. (c) Schematic of microdeletion found in patient D13. Yellow bars highlight the single-copy deletion surrounding PIK3CB exons 4 through 10. The deletion was not found in the normal population in the DGV database.15,25 The PIK3CB gene has not been categorized as disease causing. (d) Location of the deletion mapped to the PIK3CB protein structure schematic. The deletion encompasses the entirety of the C2 domain and part of the helical and RBD domains. (Red bars indicate deletions, blue bars indicate duplications, green bar indicates OMIM disease-causing gene, gray bar indicates OMIM non-disease-causing gene.) ABD, acidic binding domain; DGV, Database of Genomic Variants; OMIM, Online Mendelian Inheritance in Man; RBD, Ras-binding domain.

Figures 1a and b highlight a deletion of exons encoding the tyrosine kinase 2 domain of FGFR2 (~116 kb).15 Heterozygous single-nucleotide activating mutations in this region have been linked to a number of disorders of bone patterning and growth.16 Recent studies in mouse models highlight the functional role of FGFR2 in development of the bi-potential embryonic gonad, with homozygous removal of FGFR2 leading to a variety of phenotypes including hypoplastic testes and ovotestes.10 Further, FGFR2 is an essential regulator of adrenal cortex development, pointing to a potential pathogenesis partially dependent on the production of androgenic hormones.17 Heterozygous copy-number loss of the complete FGFR2 gene and single-nucleotide point mutations have been suggested as candidate abnormalities in patients with specific DSD.7,18 It is possible that a heterozygous deletion of the tyrosine kinase 2 domain of FGFR2 contributes to the DSD phenotype present in patient D38.

Figures 1c and d highlight a deletion of exons in PIK3CB (107 kb)19 that covers part of the predicted RBD, the PI3K-type C2 domain, and part of the PIK helical domain of the p110β protein.20 PIK3CB encodes p110β, a catalytic subunit of the PI3K family of kinases.

Recently, p110β has been found to be required for normal mouse testicular development. Expression of an inactivated (kinase-dead) form of p110β results in testicular hypotrophy, impaired spermatogenesis, and increased levels of follicle-stimulating hormone.21 Further experiments have highlighted the role of p110β in Sertoli cell development and regulation of the androgen receptor target gene Rhox5.22 p110β has not yet been investigated as a potential mediator of disease in patients with DSD through whole-genome interrogation.

The RBD has been proposed as a binding domain for Ras-GDP and Ras-GTP, the molecular switches that activate the kinase function of the p110β subunit.23 The C2 domain has been proposed to be involved in targeting p110β to the plasma membrane.24 It is possible that a heterozygous mutant p110β isoform with impaired Ras-binding activity and plasma membrane targeting contributes to the DSD phenotype present in patient D13.

Here we present CNV analysis of a heterogeneous population of patients with DSD, finding a significant enrichment of alterations in many members of the FGF signaling pathway. These findings suggest that small CNVs involving FGF-related genes, especially those smaller than 50 kb, should be further investigated as candidate variants contributing to DSD phenotypes in humans.