Whole exome sequencing identified sixty-five coding mutations in four neuroblastoma tumors

Neuroblastoma is a pediatric tumor characterized by histologic heterogeneity, and accounts for ~15% of childhood deaths from cancer. The five-year survival for patients with high-risk stage 4 disease has not improved in two decades. We used whole exome sequencing (WES) to identify mutations present in three independent high-risk stage 4 neuroblastoma tumors (COA/UAB-3, COA/UAB -6 and COA/UAB -8) and a stage 3 tumor (COA/UAB-14). Among the four tumors WES analysis identified forty-three mutations that had not been reported previously, one of which was present in two of the four tumors. WES analysis also corroborated twenty-two mutations that were reported previously. No single mutation occurred in all four tumors or in all stage 4 tumors. Three of the four tumors harbored genes with CADD scores ≥20, indicative of mutations associated with human pathologies. The average depth of coverage ranged from 39.68 to 90.27, with >99% sequences mapping to the genome. In summary, WES identified sixty-five coding mutations including forty-three mutations not reported previously in primary neuroblastoma tumors. The three stage 4 tumors contained mutations in genes encoding protein products that regulate immune function or cell adhesion and tumor cell metastasis.


WES identified 43 mutations not reported previously in four neuroblastoma tumors. WES
analysis revealed that each tumor harbored between seven and twenty-five mutations ( Table 2). The average of 16 mutations per tumor is consistent with previous reports of 12-18 mutations per tumor 13,14 . The four tumors harbored 43 mutations not previously observed in NB tumors in the dbSNP database (version 138), as well as 22 mutations reported previously to be present in other tumor types 15 . Those 43 mutations are in bold in Tables 2-6. In Tables 3-6, 'p' in the third column of each Table identifies the amino acid substitution and position; 'c' in this column identifies the nucleotide substitution and position. While no mutation was common to all four tumors, one of the mutations in the RHPN2 gene was present in two of the four tumors examined: the mutation in this gene (Rhophilin, Rho GTPase Binding Protein 2) was present at nucleotide 217 (G > A encoding Val73Met) in COA/UAB-3 and COA/UAB-8 tumors (Tables 3-5). RHPN2 contributes to actin cytoskeleton organization, an organelle that regulates cell motility 16,17 . A second mutation introducing a start site of RHPN2 gene also occurred in tumors COA/UAB-3 and COA/UAB-8. The location of the introduced start site at the intron-exon boundary suggests that this mutation is unlikely to alter the protein product in tumors COA/UAB-3 and COA/UAB-8. A genome-wide association study (GWAS) found that a region containing RHPN2 has been associated with increased susceptibility to colorectal cancer 18 .
Genes encoding MUC4 and ADAM21 also contained mutations in two of the four tumors, but at different loci. Mucin 4 (MUC4), a transmembrane mucin expressed predominantly by normal epithelial cells, is involved in cell differentiation, inhibition of cell adhesion, and cell migration [19][20][21] . MUC4 protein is thought to contribute  Table 2. Summary of variants (mutations) types for all mutations identified in four neuroblastoma tumors. 1 Mutation of a single nucleotide, resulting in an amino acid change in the encoded protein; may affect phenotype 66 . 2 Mutation that occurs in a coding region, at start site. 3 Mutation that changes nucleotides in genomic loci where splicing takes place. 4 Mutation that generates a new translation initiation codon in the 5′UTR, or that results in the loss of an initiation codon. Start site loss may result in the loss of protein product. 5 Mutation that changes the sequence of a codon to create or remove a stop codon (UAA, UAG, UGA).
to tumor metastasis by limiting the adhesion of tumor cells to primary tumor sites. The mutations identified in this gene include the previously reported 4936 G > A encoding Ala1646Thr in COA/UAB-3 and a not reported mutation at nucleotide 4837 (C > G encoding His1613Asp) in COA/UAB-8. The previously reported mutation at nucleotide 119 (C > T encoding Pro40Leu) of the ADAM21 gene was also present in two of the four tumors (COA/UAB-3 and COA/UAB-6). ADAM21 (A Disintegrin And Metallopeptidase Domain 21) contributes to cell-cell and cell-matrix adhesion and neurogenesis 22,23 .
Each of the three genes (RHPN2, MUC4 or ADAM21) that harbored mutations in more than one tumor has a regulatory role in cell adhesion and motility, cell functions essential to the metastatic process 16,19,[23][24][25] .
A majority of mutations were nonsynonymous coding mutations, indicating that the genes in which these mutations were present encoded proteins containing amino acid substitutions (Table 2). Additional mutations identified were those that introduced ATG start sites or the splice site acceptor sites at an intron-exon boundary. Among the type of mutations, a majority was found to be missense mutations (Tables 3-6). While some of the mutated proteins contribute to common functions, the wide range of functions affected by mutated genes was diverse as has been seen in previous studies 13,14,26,27 . Further, we retrieved the somatic motifs for each variant from the reference sequence, converted into a matrix to estimate the somatic mutational signature and plotted in Fig. 1. The probability bars (UAB-3: purple, UAB-6: blue, UAB-8: green and UAB-14: yellow) from the 6 substitution subtypes (C > A, C > G, C > T, T > A, T > C, or T > G) are shown in Fig. 1.
Three of the four tumors harbored genes that had CADD score greater than 20. Tables 3-6 detail genes that harbor mutations, identified by WES. Three of the four tumors had genes with elevated CADD (Combined Annotation Dependent Depletion) scores, a scoring system designed to predict the potential pathogenicity of nucleotide mutations, deletions, or insertions. Scores of ≥20 for a given gene have been associated with specific human pathologies 28 . Genes that were mutated in any of the four tumors analyzed and that had CADD scores ≥20 are described briefly below.
COA/UAB-3. Five of the nineteen mutated genes were designated as CADD ≥ 20: TCEB3 and TOE1 on chromosome 1, WDR35 and COL4A4 on chromosome 2, and STK11 on chromosome 19. TOE1 is a target of EGR1 (Early Growth Response 1), and inhibits cell growth 29 . Mutations in the TOE1 gene have been associated with hepatic and pancreatic malignancies, but the sample number supporting this association is relatively small 15 . Mutations in the WDR35 gene have been observed in patients with Sensenbrenner syndrome, also known as cranioectodermal dysplasia 30 . Mutations in COL4A4 have been linked to thin basement membrane disease 31 . Mutations in STK11 have been associated with Peutz-Jeghers syndrome, a disease characterized by development of hamartomatous polyps in the gastrointestinal tract 32 . Patients with Peutz-Jeghers syndrome have a ~15-fold higher risk of developing intestinal cancer than the normal population 33 .  Ingenuity Pathway Analysis (IPA) identified pathways and physiological systems, development and function, and function associated with network using proteins encoded by mutated genes. We next used Ingenuity Pathway Analysis (IPA) to identify pathways (Tables 7-10), physiological systems, and functions likely to be affected by variant proteins encoded by mutated genes (Tables 11-14). P-values indicate the greater or less likelihood that a given protein is involved in a specific pathway. P-values < 0.05 indicate a likely association between indicated proteins and pathways 46 . The range of p-values in Tables 11-14 reflects the likelihood that proteins of interest were related to specific functional subcategories in the broader functional    Tables 15-18. IPA determined that biological functions associated with proteins mutated in the stage 3 COA/UAB-14 tumor included nervous system development and function (p < 0.049), reproductive system development and function (p < 0.048), and musculoskeletal development and function (p < 0.043) (Table 14)-all early developmental processes. In contrast, IPA of genes mutated in the three stage 4 high-risk tumors (Tables 11-13) indicate the potential involvement of cellular functions more closely related to cell-mediated immune response, hematologic development and function, immune cell trafficking, and cell adhesion or motility. Detailed findings by IPA for each tumor are as follows.
COA/UAB-3. IPA data indicated that the 19 mutations identified in this tumor were likely to involve proteins that contribute to ERK5 signaling (p = 0.049), PXR/RXR (p = 0.0521), and GPCR signaling (p = 0.0551) ( Table 7). ERK5, extracellular-signal-regulated 5, is a member of the MAPK (mitogen-activated protein kinase) family. This pathway is activated by epidermal growth factors which are reported to play key roles in cell proliferation and differentiation 47 . The pregnane X receptor (PXR) is predominantly expressed in the liver and intestine, is usually activated by PXR in conjunction with the retinoid X receptor (RXR), and contributes to drug metabolism by inducing the family of cytochrome P450 enzymes 48,49 . Table 11 shows that the most affected physiological system and development and function in this tumor includes cell-mediated immune response (p < 0.0452), embryonic development (p < 0.0482), hematologic system development and function (p < 0.049), hematopoiesis (p < 0.046), and immune cell trafficking (p < 0.0448).
Nine genes in which mutations occurred in this tumor contribute to cell morphology, cellular assembly and organization, and neurological disease: ACADVL, CLIC5, COL4A4, ITM2B, RHPN2, SNX21, TCEB3, TOE1, and WDR35 (Table 15). Nine mutated genes are associated with nervous system development and function, connective tissue disorders, and cell-to-cell signaling or interaction: ADAM21, FOXO3, GIPR, MAEL, MUC4, RNASE4, SELL, STK11, and TAS1R2 (Table 15).   (Table 4). IPA analysis demonstrated the likely involvement of the corresponding mutant gene products as components of the following pathways: eicosanoids (p = 0.000989), prostanoid (p = 0.0067), and protein kinase A signaling (p = 0.0253) pathways ( Table 8). The eicosanoid pathway is involved in inflammation and immune-related functions, including cyclooxygenase synthesis and metabolism 50 . Prostanoids are the subclass of eicosanoids to which prostaglandins belong. Protein kinase A signaling pathway involves classic endocrine signaling and function to mediate the effect of cAMP 51 . Key physiological systems, functions and development affected by these pathways include cell-mediated immune response (p < 0.00224), hematological system development and function (p < 0.00224), immune cell trafficking (p < 0.0478), and nervous system development and function (p < 0.05) ( Table 12).

Figure 1.
Somatic mutational signature profiling. The somatic motifs for each variant were retrieved from the reference sequence and converted into a matrix. Non-negative Matrix Factorization (NMF) was used to estimate the somatic signature and then plotted. We used SomaticSignatures package to extract the somatic motifs of these samples.  Table 7. Pathways identified by IPA to be associated with proteins encoded by mutated genes in COA/UAB-3.  Table 8. Pathways identified by IPA to be associated with proteins encoded by mutated genes in COA/UAB-6.

Pathways affected by variant gene products p-value Ratio
RhoA signaling 0.0359 1/122 (0.008) Table 9. Pathways identified by IPA to be associated with proteins encoded by mutated genes in COA/UAB-8.
In summary, WES identified a total of sixty-five mutations in one stage 3 and three stage 4 NB tumors. No affected gene or associated cell function was common to all four tumors. The three stage 4 tumors each had mutations in genes encoding aspects of immune function or response. Genes encoding proteins of diverse function were affected, possibly reflecting the phenotypic heterogeneity that has been observed by other methods of analysis for this tumor type.

Discussion
In our current study, we performed WES analysis of specimens from four primary NB tumors. Three of the four tumors were designated stage 4 and high-risk. Two of the four had amplified MYCN. WES identified 43 mutations not reported previously in these tumors. No single mutation was common to all four tumors. Two of those mutations and one of the previously reported mutations in RHPN2, was identical in tumors COA/UAB-3 and COA/UAB-8 (RHPN2, p.Val73Met/c.217 G > A; p.Arg255Gln/c.764 G > A). Each of the stage 4 tumors harbored mutations in genes encoding proteins that directly affect immune function. The mutation frequency in our study  Table 10. Pathways identified by IPA to be associated with proteins encoded by mutated genes in COA/UAB-14.

Systems affected by variant gene products p-value (range) Molecules
Cell  was similar to those reported in other studies 13,14,26,27 . Of note, we used patient-derived white blood cells as the control to exclude germline mutations which may not contribute to the pathologic process.
Among successful utilization of WES to identify mutations in NB, a recent paper by Pugh et al. described genetic variations of 240 high-risk NB specimens, and identified genes with significant somatic mutation frequencies (mutation frequencies of < 9.2%) including ALK, PTPN11, ATRX, MYCN and NRAS which percentages regarded as too low to be identified in a study in which fewer than hundreds of tumors were analyzed 14,26,56,57 . ALK has been reported as a major familial NB predisposition gene among high risk NB patients 58 . ALK is also a known oncogene in other tumor types such as anaplastic large cell lymphoma 59 . While we observed no ALK mutations in our study, this finding is consistent with the low percent of tumors affected (9.2%) 14 . PTPN11 (mutation frequency of 2.9%), is a member of the protein tyrosine phosphatase family, and alterations of this gene may contribute to NB transformation 14 . We did not detect this mutation in any of 4 NB sequenced. Pugh, et al. also identified mutations in the MYCN gene. While MYCN amplification is a well-documented prognostic indicator for poor outcome in NB, the functional significance of mutations in this gene remains elusive.
Another recent study by Lasorsa et al. identified and discussed somatic mutations that may affect cancer progression in NB 26 . WES analysis of 17 high-risk tumors identified 22 mutated genes implicated in cancer progression. In this study, authors also found similar low rates of mutations reported by us and others 13,14,26,27 . Interestingly, Lasorsa et al. proposed that CHD9 and PTK2 (FAK) comprise driver genes associated with aggressive NB, although only 2-4% of tumor specimens examined harbored mutations in these genes. The authors proposed further that loss of CHD9, chromatin related mesenchymal modulator, leads to NB tumor progression as seen in other cancer types. The somatic mutations found in PTK2 localized adjacent to two functional phosphorylation sites (Tyr576 and Tyr861), mutations that activate FAK protein. FAK activation regulates invasive and migratory properties by altering cytoskeletal function and cell adhesion [60][61][62] . Similarly, we found a mutation in RHPN2 in two of the three stage 4 tumors. RHPN2 regulates cell invasion and migration by activating RhoA, a master regulator of cell motility 16 . Work is ongoing to evaluate whether RHPN2 supports NB cell metastasis, and to examine the hypothesis that inhibition of tumor cell motility comprises a therapeutic approach in high-risk NB. Determining functional correlations for the mutations identified is the priority for the next study to strengthen current findings. Further, we acknowledge that our sample number is too small to designate any mutation as a NB driver mutation, which is considered as a limitation of the current study.
In summary, we identified sixty-five mutations among four NB tumors using WES, a sequencing method to identify genetic aberrations. Current work focuses on comparing expression profiles and phenotypes of these NB tumors with WES analyses. If genomic characteristics of NB tumors reflect tumor cell phenotype and sensitivity or resistance to specific therapeutic regimens, the observed genomic diversity suggests that personalized approaches to therapy may be necessary to improve clinical outcome for patients with high-risk stage 4 NB.

Methods
Ethics Statement: Human Subjects. This study included human subjects. All procedures were approved by the University of Alabama at Birmingham Institutional Review Board (IRB) in accordance with the guiding

Systems affected by variant gene products p-value(range) Molecules
Tissue development 0.000597-0.00119 MUC4 Table 13. Physiological systems or functions identified by IPA to be associated with proteins encoded by mutated genes in COA/UAB-8. Nervous system development and function, connective tissue disorders, cell-cell signaling Table 15. Diseases and functions associated with networks identified by IPA to be affected by observed mutations in COA/UAB-3.

Systems affected by variant gene products p-value Molecules
Nervous system development and function 0.00105-0.049 UTRN, AR,GRIN2C, RNF20, SUFU Reproductive system development and function 0.00105-0.048 AR Skeletal and muscular system development and function 0.00105-0.043 AR, UTRN, SUFU, DOCK5 WBCs mapped to specific genomic regions. Over 84% of reads were properly paired, indicating that both forward and reverse reads were correctly oriented. Percent duplication ranged from 9.12% to 34.34%. These parameters indicate the reliability of the WES data presented (Table S2) 63 . Table S3 shows allele fractions of variants not reported previously in each tumor.

ID Molecules in Network
Diseases and functions associated with this network  Somatic mutation signature profiling. The SomaticSignatures package was used to extract the somatic motifs of these samples. In brief, the somatic motifs for each variant were retrieved from the reference sequence and converted into a matrix 64 . Non-negative Matrix Factorization (NMF) was used to estimate the somatic signature and then plotted.

Data analysis and statistics.
To call variants (SNPs, INDELs), the raw fastq reads from the exome capture was aligned to UCSC's high19 reference genome using Burrow-Wheeler Aligner (BWA) 65 . Variants were identified using Broad's Genome Analysis Toolkit (GATK) and following Broad's Best Practices for Variant Detection protocol 66,67 . Briefly, the aligned file from BWA was realigned and recalibrated using GATK. Following base recalibration, MuTect was used to identify somatic point mutations between the tumor and normal sample 68 .
Once the variants were identified, SnpEff was then used for annotation 69 .