Introduction

Colorectal cancer (CRC), one of the most prevalent cancers worldwide, is responsible for nearly 700 000 deaths annually.1 Similar to other cancers, CRC develops as a result of accumulated genetic modifications that alter normal cellular function and disrupt cell signaling. Three core cellular processes (cell survival, cell fate and genome maintenance) are orchestrated through a network of signaling pathways; disruption of this signaling via genetic mutations confers a selective growth advantage to the cell and eventually results in cancer development.2 These mutations may be inherited or arise spontaneously due to the interplay of numerous environmental factors. In CRC, inherited gene mutations account for roughly 5%–10% of cases and are associated with syndromes such as familial adenomatous polyposis and hereditary nonpolyposis CRC.3 APC is one such gene that has been widely implicated in the development of CRCs: nearly 100% of individuals with specific inherited mutations in this gene will eventually develop CRC.4, 5 Spontaneous mutations in APC and other genes such as KRAS and TP53 are also known to contribute to the development and progression of the disease.6

Unlike many other cancers, CRCs can be prevented in an estimated 60% of patients through regular surveillance of individuals over the age of 50 years.7 Despite this, many individuals do not have access to or forgo the recommended screening,8 and the widespread incidence of CRC necessitates continued effort to improve patient treatment options. One strategy that is gaining popularity for cancer treatment is targeted therapy: the use of drugs that specifically target disrupted molecular pathways with more effectiveness and fewer side effects than generalized cancer treatments. For optimal results, this practice requires individual DNA sequencing to identify specific gene mutations that contribute to cancer progression or interfere with drug effectiveness. For example, KRAS mutations, which are found in a large percentage of rectal cancers, have been found to confer resistance to epidermal growth factor receptor (EGFR) inhibitors, a class of tyrosine kinase inhibitors or monoclonal antibodies designed to slow or halt uncontrolled cell growth.9, 10 Therefore, testing CRC patients for KRAS mutations is recommended before administering EGFR inhibitors, to avoid ineffective treatments with unnecessary toxicity.11

A variety of methods are currently used in the clinical setting to identify gene mutations, including high-resolution melting and commercially available kits such as DxS and SNaPshot.12 Conventional Sanger sequencing and next-generation sequencing (NGS) platforms, such as Illumina and 454 pyrosequencing, have also been used to identify genetic anomalies in rectal cancers.13 Although these NGS platforms yield more data and more information on specific mutations than ready-made kits and high-resolution melting, they are costly and time-consuming, and are generally not practical for widespread clinical use. Even Sanger sequencing has limited sensitivity and often fails to detect mutations when the variant frequency is below 10%,14 which is especially problematic in highly heterogeneous colorectal tumors.15

Recent NGS technological advancements are making personalized DNA sequencing an affordable option with quick turnaround time that may help clinicians to improve patient treatments. Specifically, the Ion Personal Genome Machine (PGM) is a relatively inexpensive benchtop sequencing platform that uses semiconductor sequencing and AmpliSeq cancer panels to rapidly identify mutations in a defined or customizable set of known oncogenes and tumor suppressor genes.16 This study aims to demonstrate the utility of the Ion PGM and AmpliSeq cancer panel to identify genetic mutations in 91 rectal cancer patients.

Materials and methods

Ethics statement and patient information

The study was approved by the Human Research Ethics Committee of Shanxi Provincial People’s Hospital, China. The institutional ethics committee waived the need for consent for formalin-fixed, paraffin-embedded tumor samples from the tumor tissue bank at the hospital’s Department of Pathology. All samples and medical data used in this study have been irreversibly anonymized. A total of 91 formalin-fixed, paraffin-embedded tumor samples from rectal cancer patients were analyzed. Patients were 31–82 years of age, with a median age of 59 years (Table 1).

Table 1 Clinical features of 91 rectal cancer patients

Sample DNA preparation

The 91 rectal tumor samples used in this study were obtained from the People’s Hospital of Shan Xi Province. Paraffin sections (3- to 5-μm thick) were extracted from formalin-fixed, paraffin-embedded samples and were deparaffinized in xylene, then DNA was isolated using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA) as per the manufacturer’s instructions. Tumor content rate for each formalin-fixed, paraffin-embedded section was determined to be 50% or greater.

Ion PGM library preparation and sequencing

An Ion Torrent adapter-ligated library was constructed using the Ion AmpliSeq Library Kit 2.0 (Life Technologies, Carlsbad, CA, USA, Part 4475345 Rev. A) as per the manufacturer’s protocol and as in our previous publications.17, 18 The Personalized Cancer Mutation Panel used in this study targets 737 mutational hotspot regions to detect mutations in the following 45 tumor suppressor genes and oncogenes: ABL1, AKT1, ALK, APC, ATM, BRAF, CDH1, CDKN2A, CSF1R, CTNNB1, EGFR, ERBB2, ERBB4, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GNAS, HNF1A, HRAS, IDH1, JAK3, KDR, KIT, KRAS, MET, MLH1, MPL, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, SMO, SRC, STK11, TP53 and VHL.

Variant calling

Data from the PGM runs were initially processed with the Ion Torrent platform-specific pipeline software Torrent Suite (Life Technologies) to generate sequence reads, trim adapter sequences, and filter and remove poor signal-profile reads. Initial variant calling from the Ion AmpliSeq sequencing data was generated using Torrent Suite Software v3.0 with a plug-in ‘variant caller v3.0’ program. Four filtering steps were used to eliminate erroneous base calling and generate the final variant calling (Supplementary Figure 1): (1) define average total coverage depth as >100, each variant coverage >20, a variant frequency of each sample >5% and a P-value <0.01; (2) visually examine mutations using Integrative Genomics Viewer software (http://www.broadinstitute.org/igv/) or Samtools software (http://samtools.sourceforge.net) and filter out possible DNA strand-specific errors; (3) set variants within 737 mutational hotspots, according to the manufacturer’s instructions; and (4) eliminate variants in amplicon AMPL339432 (PIK3CA, exon13, chr3:178938822–178938906), which is not uniquely matched in the human genome.
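The automated parts of the four filtering steps can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the `Variant` record, its field names and the `passes_filters` helper are hypothetical, the coverage threshold is applied to the depth at the variant position (an assumption, as the text describes an average total coverage depth), and the visual inspection in step 2 is represented only by a flag recorded by the reviewer.

```python
from dataclasses import dataclass

# Thresholds taken from filtering step 1 in the text.
MIN_TOTAL_COVERAGE = 100      # total coverage depth must exceed this
MIN_VARIANT_COVERAGE = 20     # variant-supporting reads must exceed this
MIN_VARIANT_FREQUENCY = 0.05  # variant frequency must exceed 5%
MAX_P_VALUE = 0.01            # P-value must be below this

# Step 4: amplicon that is not uniquely matched in the human genome.
EXCLUDED_AMPLICONS = frozenset({"AMPL339432"})


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one called variant (field names are assumptions)."""
    gene: str
    amplicon: str
    total_coverage: int            # reads covering the position (assumed meaning)
    variant_coverage: int          # reads supporting the variant allele
    p_value: float
    in_hotspot: bool               # step 3: inside a panel hotspot region
    strand_error_suspected: bool   # step 2: outcome of manual review

    @property
    def frequency(self) -> float:
        """Fraction of covering reads that support the variant."""
        if self.total_coverage == 0:
            return 0.0
        return self.variant_coverage / self.total_coverage


def passes_filters(v: Variant) -> bool:
    """Return True only if the variant survives all four filtering steps."""
    quality_ok = (
        v.total_coverage > MIN_TOTAL_COVERAGE
        and v.variant_coverage > MIN_VARIANT_COVERAGE
        and v.frequency > MIN_VARIANT_FREQUENCY
        and v.p_value < MAX_P_VALUE
    )
    return (
        quality_ok
        and not v.strand_error_suspected
        and v.in_hotspot
        and v.amplicon not in EXCLUDED_AMPLICONS
    )
```

Under these assumptions, a call with 300 variant reads out of 1500 (20%) passes, whereas a call with 30 variant reads out of 1000 (3%) is rejected by the frequency threshold.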

Sequence coverage

For the 91 samples analyzed, the mean read length of each sequence was 78 bp and the average sequence per sample was ~19 Mb. With normalization to 300 000 reads per specimen, there was an average of 1571 reads per amplicon (range: 47–3730; Figure 1a), where 179/189 (94.7%) of amplicons averaged at least 100 reads and 168/189 (88.9%) of amplicons averaged at least 300 reads (Figure 1b).
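The normalization and depth summary described above can be illustrated with a short Python sketch. It assumes simple proportional scaling of per-amplicon read counts to a fixed total, which the text does not specify; the function names and the toy amplicon labels are hypothetical.

```python
from typing import Dict, Iterable


def normalize_counts(counts: Dict[str, int],
                     target_total: int = 300_000) -> Dict[str, float]:
    """Scale per-amplicon read counts so that they sum to target_total.

    Proportional scaling is an assumption; the study only states that
    specimens were normalized to 300 000 reads.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    scale = target_total / total
    return {amplicon: n * scale for amplicon, n in counts.items()}


def fraction_at_least(depths: Iterable[float], threshold: float) -> float:
    """Fraction of amplicons whose depth is at least the threshold."""
    values = list(depths)
    if not values:
        raise ValueError("no amplicons given")
    return sum(1 for d in values if d >= threshold) / len(values)


# Toy example with two hypothetical amplicons, scaled to 400 reads.
normalized = normalize_counts({"amp_a": 10, "amp_b": 30}, target_total=400)
share_300 = fraction_at_least(normalized.values(), 300)
```

The reported percentages follow the same arithmetic: 179 of 189 amplicons is 94.7% and 168 of 189 is 88.9%.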

Figure 1

Sequence read distribution across 189 amplicons generated from 91 formalin-fixed, paraffin-embedded (FFPE) specimens, normalized to 300 000 reads per sample. (a) Distribution of average coverage of each amplicon. Data are shown as mean±s.d. (b) Number of amplicons with a given read depth, sorted in bins of 100 reads. Blue bars represent the number of target amplicons within each read-depth bin; the red line represents the corresponding percentage of target amplicons.

Somatic mutations

In order to distinguish between germline and somatic mutations, detected mutations were compared with variants in the 1000 Genomes Project19 and 6500 exomes of the National Heart, Lung and Blood Institute Exome Sequencing Project.20
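One way to picture this comparison is as a lookup of each detected variant against the set of variants catalogued in the population databases. The sketch below is an assumption about how such a comparison could be done, not the study's procedure: it treats any call that is also present in a population database as likely germline, and the `Call` type, the function name and the coordinates in the usage example are placeholders rather than real variants.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Call(NamedTuple):
    """Hypothetical minimal description of a variant call."""
    chrom: str
    pos: int
    ref: str
    alt: str


def partition_by_population_db(
    calls: Iterable[Call],
    known_variants: Iterable[Call],
) -> Tuple[List[Call], List[Call]]:
    """Split calls into (likely germline, candidate somatic).

    A call is treated as likely germline if the same chromosome, position,
    reference allele and alternate allele appear in the population set
    (an assumed criterion).
    """
    known = set(known_variants)
    likely_germline: List[Call] = []
    candidate_somatic: List[Call] = []
    for call in calls:
        if call in known:
            likely_germline.append(call)
        else:
            candidate_somatic.append(call)
    return likely_germline, candidate_somatic


# Placeholder coordinates for illustration only.
population = [Call("chr1", 100, "A", "G")]
detected = [Call("chr1", 100, "A", "G"), Call("chr2", 200, "C", "T")]
germline, somatic = partition_by_population_db(detected, population)
```

Here the first call matches the population set and is set aside, and the second remains a candidate somatic mutation.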

Bioinformatical validation

We used the COSMIC database (version 64),21 MyCancerGenome database (http://www.mycancergenome.org/) and additional publications to assess the frequencies of recurring mutations in rectal cancer (Supplementary Table 1).

Results

To examine the clinical utility of targeted sequencing, the Ion PGM and AmpliSeq Cancer Panel were used to identify mutations at 737 mutational hotspot regions in 45 oncogenes and tumor suppressor genes in 91 rectal cancer samples from Chinese patients. Analysis revealed that 75 of the 91 samples (82.4%) contained frequent mutations in KRAS (58.2%), TP53 (28.6%), APC (16.5%), PIK3CA (14.3%), FBXW7 (9.9%) and/or NRAS (9.9%), and less frequent mutations in SMAD4 (3.3%), BRAF (2.2%), CTNNB1 (1.1%) and/or ERBB2 (1.1%). Single mutations were found in 37 patients (40.7%; Table 2), double mutations in 24 patients (26.4%; Table 3) and 14 patients (15.4%) had 3 or more mutations (Table 4). There was no significant difference in mutation rates between females and males for any of the genes, except TP53 (65.4% vs 34.6%) and PIK3CA (84.6% vs 15.4%).

Table 2 Single point mutations detected in 91 rectal cancer samples
Table 3 Double combination mutations detected in 91 rectal cancer samples
Table 4 Three or more combination mutations detected in 91 rectal cancer samples

Fifteen of our 91 samples (16.5%) had one or more mutations in the Wnt signaling pathway, where 14 (15.4%) had APC mutations. A total of 16 APC mutations were identified in the 15 samples, all of which were nonsense mutations that introduced a premature stop codon and were located in exon 15 (p.R876*, p.R1114*, p.Q1291*, p.Q1294*, p.E1322*, p.Q1367*, p.Q1378* and p.R1450*). Interestingly, these APC mutations only occurred in combination with mutations in other genes, where 13/14 (92.9%) patients with APC mutations also harbored a mutation in the RAS signaling pathway (Tables 3 and 4). p.R1450* was the most common APC mutation, accounting for 37.5% of the mutations in this gene. In addition, one sample (1.1%) had a CTNNB1 mutation located in exon 3 (p.D32G).

Nine of the 91 samples (9.9%) had a mutation in FBXW7 found at known mutational hotspots in exon 4 (p.R278*), exon 8 (p.R465C), exon 9 (p.R479Q and p.R505C) and exon 10 (p.S582L). Nearly half of all FBXW7 mutations are found at residues 465 and 479 (Akhoondi et al.22); surprisingly, only 2 of our detected mutations were at these sites. Seven of the nine mutations (77.8%) were found in combination with KRAS mutations, similar to a previous report in CRCs.23

Sixty-three of the 91 (69.2%) samples had 1 or more mutations in the RAS signaling pathway, which includes the oncogenes BRAF, KRAS and NRAS. The majority of RAS mutations were found in KRAS, where 53 samples (58.2%) harbored a mutation in exon 2 (p.G12A/C/D/S/V and p.G13D/R), exon 3 (p.Q61R) or exon 4 (p.A146T). Nine samples (9.9%) contained NRAS mutations, also found in exon 2 (p.G12C/D) or exon 3 (p.Q61H/K/R). Our analysis revealed one NRAS mutation, p.Q61H, that has yet to be reported in CRCs in the COSMIC database.24 Two samples (2.2%) contained BRAF mutations, both in exon 15 (p.N581S and p.V600E).

Fourteen of the 91 samples (15.4%) had a mutation in PIK3CA or ERBB2 in the phosphatidylinositol 3-kinase (PI3K) signaling pathway. Thirteen samples (14.3%) contained PIK3CA mutations found in exon 1 (p.R88Q), exon 9 (p.E542K, p.E545K and p.Q546E), or exon 20 (p.M1043I, p.H1047R and p.G1049R) and one sample (1.1%) contained a mutation in exon 21 of ERBB2 (p.V842I). Eleven of these 14 samples (78.6%) also contained additional mutations in the RAS pathway and others.

Three of the 91 samples (3.3%) had a mutation in SMAD4, located in exon 8 (p.R361H) or exon 10 (p.R445*). SMAD4 mutations occur as a relatively late event, with increasing incidence found with progressive disease stage.25 Accordingly, the three samples in our study with SMAD4 mutations were at stage 3b (Table 4) and all occurred in combination with other mutations.

Twenty-six of the 91 (28.6%) samples harbored mutations in TP53, all at known hotspots in exon 5 (p.V173L, p.R175H, p.C176F and p.H179R), exon 6 (p.R196* and p.R213*), exon 7 (p.S241F) and exon 8 (p.R273C, p.V274F and p.P278S). Twenty-one of the 26 (80.8%) TP53 mutations occurred in combination with mutations in other genes and nearly all (90.5%) combined with mutations in the RAS pathway.

Discussion

All of the mutated genes identified in our study have been previously classified as driver genes, in which mutations confer a selective growth advantage to the cells harboring them.2 In colorectal and various other cancers with only one driver mutation, the majority of tumors have this mutation in an oncogene, whereas tumors with multiple driver mutations contain a combination of oncogene and tumor suppressor gene mutations.2 Accordingly, in our study, of the 37 samples with a single mutation, 30 (81.1%) harbored the mutation in an oncogene (CTNNB1, KRAS, NRAS or PIK3CA; Table 2), and of the 38 samples with 2 or more mutations, 35 (92.1%) revealed combination mutations in both oncogenes and tumor suppressor genes (Tables 3 and 4). Of the samples with multiple mutations, the most common combinations occurred with KRAS and TP53 (16/38, 42.1%), and KRAS and APC (12/38, 31.6%).
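The grouping used in this comparison can be expressed as a small Python sketch. It is illustrative only: the `classify_sample` helper and its labels are hypothetical, and the gene roles follow the descriptions given in this article, except for ERBB2, which is assigned to the oncogenes here as an assumption based on its usual classification.

```python
from typing import Iterable

# Roles as described in the text; ERBB2 is placed here by assumption.
ONCOGENES = frozenset({"BRAF", "CTNNB1", "ERBB2", "KRAS", "NRAS", "PIK3CA"})
TUMOR_SUPPRESSORS = frozenset({"APC", "FBXW7", "SMAD4", "TP53"})


def classify_sample(mutated_genes: Iterable[str]) -> str:
    """Label a sample by mutation count and by the roles of the mutated genes.

    mutated_genes holds one gene symbol per detected mutation, so a gene
    may appear more than once.
    """
    genes = list(mutated_genes)
    if not genes:
        return "no mutation"
    unknown = set(genes) - ONCOGENES - TUMOR_SUPPRESSORS
    if unknown:
        raise ValueError(f"unclassified genes: {sorted(unknown)}")
    has_oncogene = any(g in ONCOGENES for g in genes)
    has_suppressor = any(g in TUMOR_SUPPRESSORS for g in genes)
    count = "single" if len(genes) == 1 else "multiple"
    if has_oncogene and has_suppressor:
        kind = "oncogene + tumor suppressor"
    elif has_oncogene:
        kind = "oncogene only"
    else:
        kind = "tumor suppressor only"
    return f"{count}: {kind}"


single_label = classify_sample(["KRAS"])
combined_label = classify_sample(["KRAS", "TP53"])
```

Counting samples per label and dividing by the group size reproduces the proportions quoted above, for example 30/37 (81.1%) and 35/38 (92.1%).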

A recent comprehensive exome-sequencing analysis of CRCs found mutations in each of the genes described in our study, with significant mutation rates observed in APC, KRAS, PIK3CA, SMAD4 and TP53.26 That study found 16% of the samples to be hypermutated (mutation rates of >12 per 10⁶ bases) and the remaining non-hypermutated samples to contain <8.24 mutations per 10⁶ bases. Interestingly, with the exception of BRAF, all of the mutated genes in our study were shown to be associated with non-hypermutated tumors.

Altered signaling pathways in CRCs

CRC tumorigenesis has been described to follow a sequential pathway from normal mucosa to benign adenoma, then severe dysplasia and finally carcinoma. This process is driven by genetic mutations, both inherited and somatic, which arise in the various signaling pathways, and may follow a sequential accumulation of genetic changes: APC mutations occur early in the transition from normal epithelia to early adenoma, followed by KRAS mutations in the transition to intermediate adenoma, followed by TP53 mutations before the progression to carcinoma.27 Additional mutations may occur at various stages of disease. However, as not all CRCs have been found to contain mutations in APC, KRAS and TP53, alternate pathways leading to CRC have been suggested.6 Our study revealed gene mutations at different frequencies in multiple pathways disrupting each of the core cellular processes: Wnt and Notch signaling affecting cell fate determination, cell survival alteration through mutations in the RAS, PI3K and transforming growth factor-β (TGF-β) pathways, and TP53 mutations affecting genome maintenance through faulty DNA-damage repair (Figure 2). As these various mutations may have prognostic value or may help determine treatment options, it is important for clinicians to have an accessible genotyping tool to uncover multiple common mutations simultaneously, which may be achieved through targeted sequencing using the Ion PGM and AmpliSeq Cancer Panel as in this study.

Figure 2

Proportion of gene mutations in different signaling pathways in 91 rectal cancer samples. Outer circle: core cellular processes; middle circle: signaling pathways; inner circle: proportional gene mutations within each signaling pathway, where blue represents tumor suppressor genes and pink represents oncogenes.

Cell fate: Wnt and Notch signaling pathway mutations

APC mutations are early and common events in CRC development, where roughly 40%–65% of patients are found to have mutations in this gene,24 whereas CTNNB1 mutations are less common and have been reported in 6%–9% of CRCs.24 The APC mutation rate detected in our study is lower than other reports, which may reflect our relatively small sample size and genetic variations in different populations. For example, Ling et al.28 used exome capture DNA sequencing to analyze CRCs from Chinese patients and found roughly 30% of samples contained APC mutations, whereas another exome-sequencing study found the mutation frequency of APC in American CRC patients to be 51%–81%.26 In addition, because of our stringent filtering process, several mutations detected were eliminated as false positives owing to low quality (P-value >0.01, coverage <100, variant coverage <20, variant frequency <5%), owing to homopolymers, or because they were not within the hotspot loci defined by the AmpliSeq Cancer Panel. In addition, as only mutational hotspot loci were targeted in our sequencing panel, it is probable that other mutations exist in our study group that were not detected. Over 75% of APC’s coding region exists in exon 15, which most commonly harbors both germline and somatic mutations.29 This exon also contains the mutation cluster region between codons 1286 and 1513, a region that contains 60%–90% of APC mutations.6, 30 Accordingly, 13/16 (81.3%) of the mutations identified in our study were within these codons.
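Whether a given protein-level change falls inside the mutation cluster region can be checked with a few lines of Python. The sketch below is a simplified illustration: the helper names are hypothetical, and the pattern handles only the simple substitution and stop-gain notation used in this article (for example p.R1450*), not the full range of variant nomenclature.

```python
import re

# Matches simple changes such as "p.R1450*" or "p.D32G" only.
_PROTEIN_CHANGE = re.compile(r"^p\.([A-Z])(\d+)([A-Z*])$")

# Mutation cluster region boundaries as given in the text.
MCR_START = 1286
MCR_END = 1513


def codon_of(change: str) -> int:
    """Return the codon number from a simple protein-level change."""
    match = _PROTEIN_CHANGE.match(change)
    if match is None:
        raise ValueError(f"unsupported notation: {change!r}")
    return int(match.group(2))


def is_nonsense(change: str) -> bool:
    """True if the change introduces a stop codon (written with '*')."""
    codon_of(change)  # validates the notation
    return change.endswith("*")


def in_mutation_cluster_region(change: str) -> bool:
    """True if the codon lies between codons 1286 and 1513, inclusive."""
    return MCR_START <= codon_of(change) <= MCR_END


# The eight distinct APC changes listed in the Results.
apc_changes = ["p.R876*", "p.R1114*", "p.Q1291*", "p.Q1294*",
               "p.E1322*", "p.Q1367*", "p.Q1378*", "p.R1450*"]
distinct_in_mcr = sum(in_mutation_cluster_region(c) for c in apc_changes)
```

Applied to the eight distinct APC changes, six fall within the region. The 13/16 figure quoted above counts every detected mutation, including recurrent ones such as p.R1450*, so the two counts are not expected to match.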

APC is a tumor suppressor and negative regulator of β-catenin, which is encoded for by CTNNB1. β-Catenin is critical in the Wnt signaling pathway that coordinates the expansion and differentiation of intestinal crypt stem cells.31 The APC protein mediates cytoplasmic β-catenin degradation through ubiquitination and a dysfunctional APC protein fails to downregulate the Wnt pathway and leaves β-catenin stabilized.29 Accumulation of β-catenin in the cytoplasm leads to cell proliferation and eventually adenoma formation.32 Most APC mutations result in a C-terminal truncation of the protein and functional loss of both alleles is required for tumorigenesis. Therapeutic options are currently being explored that target the Wnt-dependent and Wnt-independent functions of APC by targeting downstream components of Wnt signaling or inhibiting signaling regulated by APC. However, as loss of APC may cause resistance to certain drugs such as cisplatin through enhanced DNA repair, it is important to identify APC mutations before drug administration.33

An estimated 9%–18% of CRCs contain FBXW7 mutations.22, 24 FBXW7 is a tumor suppressor gene crucial in proteolytic mediation of a number of important regulatory proteins involved in cell fate determination and cell division, including cyclin E1, c-Myc and Notch, among others.34 Roughly 75% of FBXW7 mutations disrupt substrate-binding domains, thus impairing cyclin E degradation or increasing levels of activated mammalian target of rapamycin, whereas most of the remaining mutations are nonsense mutations that result in a truncated protein product.23, 35 FBXW7 mutations also induce p53 activity, as they cause genomic instability, and impaired growth regulation resulting from FBXW7 mutations contributes to the progression of CRC.22, 36, 37 A recent clinical study found that CRC patients with low FBXW7 expression have a significantly poorer prognosis than those with high FBXW7 expression;38 however, another study found CRC patients with FBXW7 mutations R465H/C or R479Q had better 5-year overall survival than those with other mutant types.39

Cell survival: RAS, PI3K and TGF-β signaling pathway mutations

RAS mutations have been found commonly in CRCs: 35% have KRAS mutations, 5%–8% have NRAS mutations and 5%–17% have BRAF mutations.24 RAS proteins are critical in signaling pathways that regulate cell proliferation and differentiation, cell cycle regulation and angiogenesis.40 Dysregulated cell growth and blood vessel formation caused by activating RAS mutations significantly contribute to tumorigenesis.41 KRAS and NRAS mutations are most commonly isolated in codons 12, 13 and 61, and lead to constitutive activation of RAS.41 These mutations cause RAS to have impaired GTPase activity, leading to an accumulation of active, GTP-bound RAS proteins and upregulated RAS function, and subsequently continuous cell proliferation.32, 40 BRAF is a protein kinase downstream of RAS and nearly all BRAF mutations in CRCs are localized to the V600 residue, resulting in constitutive activation of the RAF/MEK/ERK pathway.42 Different clinical studies have found both KRAS and BRAF mutations to be significantly associated with poorer overall survival compared with CRC patients with wild-type status for both genes.43, 44

Consistent with our findings, PIK3CA mutations have been found in roughly 12%–20% of CRCs, whereas ERBB2 mutations are much less common and are found in only 3% of CRC patients.24 The PI3K pathway coordinates multiple cellular functions including migration, proliferation and cell survival, and is also important in oncogenesis. PI3Ks are lipid kinases that phosphorylate and activate a variety of downstream targets responsible for maintaining proper cellular functioning.45 PIK3CA mutations in the helical and kinase domains, such as those found in our study, activate the PI3K–AKT signaling pathway, dysregulate target gene phosphorylation and contribute to oncogenicity.45, 46, 47 Recent clinical studies have found that CRC patients with PIK3CA mutations had decreased relapse-free survival and increased disease-related mortality compared with patients without PIK3CA mutations.48, 49

SMAD4 mutations have been found in roughly 15% of CRCs24 and SMAD4 loss is associated with a worse prognosis and decreased disease-free and overall survival for patients with both early and advanced CRCs.50, 51 The SMAD4 protein acts as a critical tumor suppressor and downstream regulator of the TGF-β signaling pathway. On TGF-β receptor binding and dimerization, R-SMAD is phosphorylated and binds to SMAD4, and this complex enters the nucleus to regulate apoptosis and cell cycle.32 SMAD4 mutations leading to a dysfunctional protein may interfere with proper signaling and gene transcription of target genes critical in cell cycle regulation. Thus, cells with SMAD4 loss may become resistant to growth control and apoptosis normally mediated through TGF-β.52

In addition to prognostic value, mutations in each of the aforementioned genes may be useful in guiding patient treatment, suggesting the clinical benefit of individual tumor sequencing. Preclinical studies have found RAS mutations to cause resistance to PI3K inhibitors, even in the presence of PIK3CA mutations.53, 54 Conversely, loss of SMAD4 has been associated with sensitivity to multiple EGFR family inhibitors;55 however, SMAD4 loss and subsequent PI3K/Akt pathway activation leads to resistance to 5-fluorouracil, a chemotherapy that is widely given to CRC patients.56, 57 Metastatic CRC patients with KRAS mutations do not respond to EGFR inhibitors or monoclonal antibodies.58 As such, it is now recommended that patient KRAS mutation status be confirmed before administration of the EGFR-targeting monoclonal antibodies cetuximab and panitumumab, as these treatments have been shown to have no benefit or cause severe toxicities to CRC patients with these mutations.10, 11, 59 Despite this standard, a universal method of KRAS mutation detection has yet to be established. Some effort has been made to compare various methods of detecting KRAS mutations in a clinical setting, including high-resolution melting, Sanger sequencing and commercially available kits such as DxS and SNaPshot.12, 60 Our study supports that the Ion PGM and AmpliSeq Cancer Panel may be clinically useful in not only detecting KRAS and other RAS mutations but also in simultaneously uncovering relevant mutations in other genes such as BRAF, PIK3CA and SMAD4.

Genome maintenance: TP53 mutations

TP53 is an important tumor suppressor gene and its protein product has multiple important biological functions, including DNA repair, cell cycle arrest and apoptosis.61, 62 The DNA-binding domain encoded by exons 4–8 is critical for the transcription factor functions of p53.63 Mutations within TP53 exons 5–8 are the most common in CRC,32 and these mutations prevent sequence-specific DNA binding and lead to defective p53-dependent transcription, cell cycle arrest and apoptosis.64, 65 p53 inactivation is an important step in CRC development, specifically at the transition from adenoma to carcinoma.32 The value of TP53 mutations as prognostic indicators in CRCs is still under debate: some have found p53 overexpression to significantly correlate with reduced disease-free survival and a much higher incidence of recurrence in rectal cancer patients, and specific mutations have been associated with worse patient prognosis,66, 67, 68 whereas others have found TP53 mutations to have little to no prognostic value.69

TP53 mutations have been reported in 40%–60% of CRCs,24, 69 which is higher than the mutation rate detected in our study. Our relatively low rate of TP53 mutations may again reflect our small sample size, population differences or mutations undetected due to their position outside of our targeted mutational hotspot loci, and follow-up studies with larger sample sizes would be beneficial.

In conclusion, all of the mutated genes in our study have been found to have implications in sensitivity, resistance, or both, to a variety of clinical and preclinical drugs.55 Gene mutations can have significant effects within signaling pathways, which may affect other signaling pathways. No anticancer drug is entirely effective and the complex interplay of genetic factors may contribute to a patient’s response to the treatment. The intricate interaction of genes within and across the signaling pathways highlights the importance of individual cancer DNA sequencing for a variety of genes to maximize treatment benefits. Some NGS methods may provide a fully comprehensive set of mutation data, but may be expensive and time-consuming, and may provide more data than clinicians need based on the drugs currently approved for treatment. More affordable commercial mutation detection kits may only analyze one or two genes, may have limited sensitivity, or may not be able to distinguish between mutation types.12, 70 With the Ion PGM and AmpliSeq Cancer Panel we were able to analyze 45 genes simultaneously at a low cost, with rapid turnaround time. Because of the relatively affordable cost and reduced assay time, such technology may be used to direct patient treatments for CRC in the near future.