Introduction

DNA sequencing is one of the main concerns of medical research today. The combination of chain-termination sequencing by Sanger et al.1 and the polymerase chain reaction (PCR) by Mullis et al.2 made possible many landmark achievements such as the completion of the Human Genome Project (HGP), which provided a reference for investigating the genetic alterations underlying associated phenotypes.3, 4, 5

In Sanger sequencing, a sequencing reaction requires all four standard deoxynucleotides (dNTPs) and the four chain-terminating dideoxynucleotides (ddNTPs), each labeled with a different fluorescent dye, together with a DNA primer, a single-stranded template and a DNA polymerase enzyme. Eventually, newly developed technologies replaced the traditional methods for whole-genome sequencing (WGS) and whole-exome sequencing (WES), at a lower sequencing cost per genome/exome. These revolutionary technologies, known as next-generation sequencing (NGS), hold promise for clinical use to improve human health, although their cost, the ethical issues raised by the genetic data they produce and the need for user-friendly software to analyze the raw sequence still have to be addressed. The current NGS platforms include those produced by 454 Life Sciences, Illumina, Applied BioSystems, Helicos Biosciences, Danaher Motion, Oxford Nanopore Technologies and Pacific Biosciences. NGS has influenced all fields of biological research; here, we review the recent literature on applications of WES in medical research and how it is improving our understanding of the genetic mechanisms of disease.

Value of WES in medicine

The human genome comprises ∼3 × 10^9 bases of coding and noncoding sequence. About 3 × 10^7 base pairs (30 Mb; 1%) of the genome are coding sequences. The genome assembly from the Genome Reference Consortium (GRCh37.p10, Feb 2009) in Ensembl (http://asia.ensembl.org/Homo_sapiens/Info/StatsTable?db=core) includes ∼20 000 protein-coding genes, pseudogenes and noncoding genes. The HUGO Gene Nomenclature Committee has approved more than 19 000 protein-coding genes, 8000 pseudogenes, 4000 noncoding RNA genes (http://www.genenames.org/cgi-bin/hgnc_stats) and almost 34 000 gene symbols.6 In fact, less than 10% (∼3 Mb) of the whole-genome sequence is characterized, and the clinical knowledge acquired from the genome sequence is especially insufficient.7 In addition, it is estimated that 85% of disease-causing mutations are located in coding and functional regions of the genome.8, 9 For this reason, sequencing of the complete coding regions (the exome) has the potential to uncover the causes of a large number of rare, mostly monogenic, genetic disorders as well as predisposing variants in common diseases and cancers. Although causal variations in coding and noncoding sequences lead to a clinical phenotype, epigenetic modifications are, to an extent, another critical determinant of the phenotype.10

Details of the WES strategy have been recently reviewed.9, 11 In general, an individual human genome carries about 3.5 million SNPs. Multistep filtering against public databases such as dbSNP, the 1000 Genomes Project12 and HapMap (www.hapmap.org) narrows the identified variants down to a small number. Depending on the disease being studied and its mode of inheritance, a specific strategy can then be applied to reach the likely causal variants; the functional and biological significance of each variant follows a gradient that ranges from normal to pathological effect.13, 14 Notably, after variant calling and gene annotation, each individual exome contains about 10 000 nonsynonymous variants, depending on ethnicity and calling methods. A normal individual has been estimated to carry 50–100 heterozygous mutations that can cause a recessive Mendelian disorder when homozygous.12
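
The multistep filtering described above can be illustrated with a minimal Python sketch. It is not the pipeline of any cited study: the `Variant` record, the `known_sites` set (standing in for dbSNP, 1000 Genomes and HapMap entries) and the reduction of inheritance mode to a single genotype check are all simplifying assumptions made for illustration. In particular, compound heterozygosity in recessive disease is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical minimal record; real pipelines carry many more fields.
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str    # e.g. "missense", "nonsense", "synonymous"
    genotype: str  # "het" or "hom"


def site_key(v):
    """Identify a variant by position and alleles."""
    return (v.chrom, v.pos, v.ref, v.alt)


def filter_candidates(variants, known_sites, inheritance="recessive"):
    """Apply three illustrative filters in sequence."""
    # Step 1: keep variants predicted to change the protein.
    functional = [v for v in variants if v.effect != "synonymous"]
    # Step 2: drop variants already catalogued in public databases
    # (known_sites is an assumed stand-in for those databases).
    novel = [v for v in functional if site_key(v) not in known_sites]
    # Step 3: keep genotypes compatible with the assumed inheritance mode.
    wanted = "hom" if inheritance == "recessive" else "het"
    return [v for v in novel if v.genotype == wanted]


# Made-up example calls, not real data.
calls = [
    Variant("1", 100, "A", "G", "missense", "hom"),
    Variant("1", 200, "C", "T", "synonymous", "hom"),
    Variant("2", 300, "G", "A", "nonsense", "hom"),
    Variant("3", 400, "T", "C", "missense", "het"),
]
known = {("2", 300, "G", "A")}
recessive_hits = filter_candidates(calls, known)
dominant_hits = filter_candidates(calls, known, inheritance="dominant")
```

Each step only removes variants, so the order of the filters changes the intermediate counts but not the final candidate list.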

Identifying disease-causing variants brings research into clinical practice. The first group of classified variants comprises disease-causing variants with a large pathogenic effect (high penetrance), seen mostly in single-gene disorders. These variants are mainly rare. Some variants are found in only a handful of individuals with rare or uncommon diseases; because of incomplete penetrance, they are categorized with less certainty as likely disease-causing variants. An NGS approach can verify these variants, which helps in the management of individuals who carry them. Another group comprises variants of lower penetrance that occur at higher frequency in cases than in controls, as identified by genome-wide association studies. These variants can be detected with DNA chip genotyping and NGS approaches. WES can identify these variants and is used in the clinical management of individuals; for example, in familial hypercholesterolemia, dietary management of individuals carrying the causal variants is lifesaving. High-penetrance variants detected by WES are important in the diagnosis of patients and healthy carriers.

There are functional variants, including insertions, deletions, nonsense variants, splice variants and copy-number variations, that have no established association with disease but are suspected to cause one. They may be novel, and WES of different individuals carrying the variants may give a clue to their effect. Finally, a large number of variants of unknown clinical significance are located in introns and intergenic regions.13 They may be benign or pathogenic.

WES in research

NGS has improved our understanding of the genetic pathology of diseases. The first report of selective sequencing of the whole exome was published by Ng et al.15 in 2009. They reported the targeted capture and massively parallel sequencing of the exomes of 12 humans, including eight individuals previously characterized by the HapMap and Human Genome Structural Variation projects and four unrelated individuals affected by a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS; MIM193700), which is known to be caused by mutations in the MYH3 gene. Rare and common variants were identified. After assessment of the quality of the exome data, 13 347 variants appeared to be novel. Subsequent filtering of these variants against dbSNP (v129) or those found in the HapMap samples identified MYH3 as the disease-causing gene in FSS patients. Consequently, WES could help to define causal variants more easily. Studies of this kind are performed to uncover the genetic basis of genetic disorders, which could lead to better treatment and management of patients. A PubMed search limited to Title/Abstract on ‘whole-exome sequencing’ revealed that more than 150 articles have been published about gene discovery in various disorders.14 The WES literature is growing exponentially.

WES is charting a new way in diagnostic and clinical genetics

In the era of genetic medicine, most newly developed technologies address diagnosis and the development of new therapies. Medical sequencing projects face the challenge of identifying the causes of rare genetic disorders and cancers. Different genes and causal variants have been discovered by WGS and WES. The aim is to provide efficient and effective genetic data for the best treatment, although accurate, fast and cost-effective diagnosis of patients remains a major concern. Similarly, preclinical individuals at risk of developing the disease can be identified, whereas prognostic evaluation and selection of the appropriate therapy apply to individuals whose sequence has been defined by NGS technology.

The first report that pinpointed an exact diagnosis established by WES was published by Choi et al.16 in 2009. WES on a patient referred for Bartter syndrome (MIM241200), a rare inherited disorder characterized by hypokalemia, showed that the patient had a novel homozygous mutation in the SLC26A3 gene; mutations in this gene are known to cause congenital chloride-losing diarrhea (CLD; MIM214700). Clinical re-evaluation of the patient, who had been misdiagnosed with Bartter syndrome, established the correct diagnosis as CLD. Another study showed that WES helped in the diagnosis of a patient with Leber congenital amaurosis who had a mutation in the PEX1 gene, which is associated with peroxisome biogenesis disorders.17 A patient misdiagnosed with Hermansky–Pudlak syndrome type 2 was considered for homozygosity mapping and WES because of phenotypic features including oculocutaneous albinism (OCA) and neutropenia. No mutation was found in the related AP3B1 gene; subsequent WES identified two disease-causing genes, namely SLC45A2 (related to OCA) and G6PC3 (related to neutropenia).18

Indeed, for diagnostic purposes, sequencing many known disease-associated genes at once via NGS (in the case of genetically heterogeneous disorders) is probably sufficient without resorting to WES.19 In a survey by Shen et al.,20 524 candidate nuclear-encoded mitochondrial genes were sequenced for molecular diagnosis. Because of the genetic heterogeneity of inherited cardiomyopathies, Meder and colleagues21 used targeted sequencing of about 30 selected genes. Jones and colleagues22 analyzed 24 genes of congenital disorders of glycosylation by means of targeted NGS. With this targeted gene approach, greater sequencing depth is achieved at a relatively low cost and the time to diagnosis is improved.23

Disorders with phenotypic/genetic heterogeneity or patients with overlapping symptoms are difficult to diagnose. The clinical differential diagnostic tests may be lengthy and costly.24 WES provides insights into genetic diagnosis of the challenging cases. For example, a phenotypically and genetically heterogeneous disorder named neuroacanthocytosis (NA) syndrome includes chorea-acanthocytosis (ChAc), X-linked McLeod syndrome (MLS), Huntington’s disease-like 2 (HDL2) and pantothenate kinase-associated neurodegeneration (PKAN). WES of the genes made the diagnosis easier.25

WES has useful applications in disease treatment, screening and prenatal diagnosis. A 15-month-old boy with an immune deficiency underwent exome sequencing and was diagnosed with Crohn disease due to a mutation in the X-linked inhibitor of apoptosis gene. For the management of the patient, a hematopoietic progenitor cell transplant was performed, with an excellent outcome.26

Neonatal diabetes mellitus (NDM) has been studied by WES; patients carrying a mutation in the KCNJ11 or ABCC8 genes can be treated with oral sulfonylurea drugs instead of insulin therapy.27

Prenatal diagnosis using fetal DNA in maternal serum demonstrated the applicability of WES in finding aneuploidies with a noninvasive method.28, 29, 30 In a survey by Bell et al.,31 carrier testing for 448 severe recessive childhood diseases was carried out by NGS. The expense of carrier testing with such a method is lower than the cost of treatment and management of children born with such disorders.

Preclinical application of WES was recognized by characterization of mutations in genes causing phenotypically similar disorders. Jiao and colleagues32 studied 10 nonfamilial pancreatic neuroendocrine tumors with WES. The frequent gene mutations were used for prognosis.

Our understanding of the genetic and epigenetic profiles of a disease facilitates the identification of mutations for diagnosis, prognosis and new therapeutic approaches in health-care programs.7 WES has advanced biomedical research. Massively parallel sequencing allows a disease and the genes involved in it to be characterized. WES is less challenging than WGS, although costs are expected to fall. More or less all NGS strategies (WES, RNAseq, targeted sequencing, WGS, …) can be used in genetic diagnosis, depending on the disease and the available information; for example, targeted NGS works for X-linked mental retardation, but WGS may be appropriate for mental retardation and congenital malformation due to structural variations. The complexity of the genome may affect disease progression, as in cancers; therefore, different approaches may be used. As mentioned above, WES has been applied in different areas of research and diagnostics, in brief: diagnosis (prenatal diagnosis (PND), preimplantation genetic diagnosis (PGD) and carrier/mutation detection in heterogeneous disorders such as hearing loss, which has many causal genes), prognosis of preclinical individuals, newborn screening procedures and treatment. Certainly, gene discovery in disorders of unknown cause and detection of predisposing SNPs in common disorders clarify the genetic basis and molecular mechanisms of disorders. The interactions of molecules in cells (the interactome) are revealed, and the genes in biological networks are determined. In particular, WES could improve health care by influencing disease management and drug discovery, eventually enabling personalized medicine tailored to the health of each individual (Figure 1). These applications hold for disorders ranging from rare Mendelian and heterogeneous disorders to polygenic and complex diseases such as common disorders and cancers. Here, we focus on the use of WES in different categories of disease.

Figure 1

WES and the impact of its genetic consequences on public health. Variant calling of WES data is applied for various purposes in different disorders: diagnosis (prenatal diagnosis, PND; preimplantation genetic diagnosis, PGD; and mutation detection in heterogeneous diseases such as hearing loss, using selected genes for faster diagnosis), screening procedures and research. Consequently, WES benefits the treatment and management of patients, gene discovery, SNP detection for drug effects and the elucidation of disease mechanisms and gene networks. All of these contribute to health improvement, including disease management and personalized medicine. A full color version of this figure is available at the Journal of Human Genetics journal online.

Usage of WES in human disorders

Monogenic disorders

Currently, over 6000 presumably monogenic disorders have been described, but for nearly two-thirds of these the molecular basis has not been reported (OMIM Statistics). Understanding the pathogenetic mechanism of a disease mostly depends on finding the causative gene and the variants associated with the phenotype. When a newly identified variant is found in a single patient or a small family, a clearly defined genetic diagnosis can hardly be established on the basis of the variant finding alone. Once its absence in population controls has been verified, the presence of the same or other variants in the same gene in other patients or families with the same disease is usually used to confirm the new pathogenic variant(s). If the disorder is extremely rare, it is hard to find more patients. Further functional experiments are then crucial to validate the pathologic impact of the newly determined variants; if the mutated gene has a defined role in a well-known pathway related to the disease, biochemical confirmatory experiments are acceptable. Identification of novel genes causing rare monogenic disorders is crucial for understanding the biological pathways as well as for treatment or therapeutic management. Recent publications show that WES is a powerful tool for finding the causal genes of Mendelian disorders;14 here we focus only on its application to genetically heterogeneous single-gene disorders.

Heterogeneous monogenic phenotypes

There are many genetically heterogeneous conditions such as hearing loss, intellectual disabilities, autism spectrum disorders and retinitis pigmentosa. WES has been successfully used to identify the causative variants in several heterogeneous conditions. Here, we briefly mention some examples.

Hearing loss

Hearing loss (HL) is the most prevalent sensory defect in humans, affecting 1.86 in 1000 newborns worldwide; half of these cases are due to genetic causes.33 It has been estimated that about 1% of human genes (200 to 250 genes) may be involved in hereditary HL.34 To date, more than 60 genes have been reported to cause nonsyndromic HL.35 The picture becomes more complicated when we consider that more than 400 syndromes related to HL have been described in OMIM. Identification of novel genes using classical gene discovery methods is usually laborious and time consuming. WES is a rapid way to get a full picture of protein-coding sequence variation. For example, Walsh et al.36 described the application of WES in conjunction with homozygosity mapping to define the causative variant for recessive deafness in a consanguineous family. Using the regions of homozygosity as evidence, they quickly identified the causative mutation, p.R127X, in GPSM2, which maps to the DFNB82 locus.36, 37 In 2012, Diaz-Horta et al.,38 after excluding mutations in GJB2, the most common autosomal-recessive nonsyndromic HL (ARNSHL) gene, performed WES on samples from 20 unrelated multiplex consanguineous families with ARNSHL. They found 12 homozygous mutations in known deafness genes in 12 families. Thus, rare causative mutations in known ARNSHL genes were reliably determined via WES. More than 15 genes associated with syndromic or nonsyndromic HL have been identified using WES (Table 1).
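
Combining homozygosity mapping with WES, as in the consanguineous-family example above, amounts to keeping only homozygous exome variants that fall inside mapped regions of homozygosity. The Python sketch below illustrates that intersection under stated assumptions: the dictionary-based variant records, the `(chrom, start, end)` region tuples and all coordinates are hypothetical and are not taken from the cited studies.

```python
def in_any_region(chrom, pos, regions):
    """Return True if (chrom, pos) lies inside any (chrom, start, end) region."""
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def homozygous_candidates(variants, regions):
    """Keep homozygous variants located inside regions of homozygosity."""
    return [
        v for v in variants
        if v["genotype"] == "hom" and in_any_region(v["chrom"], v["pos"], regions)
    ]


# Made-up coordinates for illustration only.
roh_regions = [("1", 1_000, 5_000), ("7", 20_000, 30_000)]
exome_variants = [
    {"chrom": "1", "pos": 2_500, "genotype": "hom"},   # inside a region
    {"chrom": "1", "pos": 2_600, "genotype": "het"},   # inside, but heterozygous
    {"chrom": "7", "pos": 40_000, "genotype": "hom"},  # outside all regions
]
candidates = homozygous_candidates(exome_variants, roh_regions)
```

The linear scan over regions is adequate for the handful of regions typical of a single family; an interval index would be the natural replacement for larger inputs.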

Table 1 The reported new genes in HL identified by WES

Intellectual disabilities

Intellectual disability (ID), also referred to as cognitive impairment or mental retardation, is usually defined by a considerably below-average score on tests of mental capability or intelligence and by limitations in the ability to function in areas of daily life,39 ranging from mild to profound. It appears in a nonsyndromic form or is associated with other clinical features in syndromic forms. ID affects up to 3% of the population in industrial countries.40, 41 Cognition results from sequential and simultaneous collaboration among a complex network of neurons and brain-expressed genes at the cellular and molecular levels. In addition to non-genetic factors, ID can have genetic bases including point mutations of single genes, large cytogenetic abnormalities and epigenetic alterations. More than 60 X-linked genes and seven autosomal genes for nonsyndromic intellectual disability have been recognized so far.42, 43 Over thirty studies have used WES to identify the causal variants in various affected families, including those with sporadic and familial nonsyndromic and syndromic forms of intellectual disability (Table 2). For instance, Caliskan et al.43 identified the causal mutation, p.P182L in the TECR gene, in a consanguineous family with five affected individuals. De novo mutations are involved in many sporadic cases of ID; 45 and 7 out of 74 studied cases are due to de novo and dominant mutations, respectively (Table 2). Accordingly, WES could be an effective tool for identifying the genetic basis of heterogeneous disorders such as ID.

Table 2 The identified new genes in ID by means of WES

Movement disorders

Movement disorders, a group of neurological conditions, cause abnormal voluntary or involuntary movements or slow, reduced movements; examples include ataxias and dystonias. Using exome sequencing in two families with primary torsion dystonia, Fuchs et al.44 identified a nonsense and a missense causative mutation in the GNAL gene. In a study of a large family affected by essential tremor, exome sequencing indicated that a FUS nonsense mutation causes the disease.45 Rosewich et al.46 employed WES on three proband-parent trios and found three heterozygous de novo missense mutations in alternating hemiplegia of childhood (AHC); they concluded that mutation analysis of the ATP1A3 gene in AHC patients could provide an accurate genetic diagnosis.46 Similarly, WES of four siblings with pontocerebellar hypoplasia type 1 showed mutations in the EXOSC3 gene.47 In addition, congenital mirror movements, involuntary movements of one side of the body that mirror intentional movements on the opposite side, are genetically heterogeneous and frequently show autosomal-dominant inheritance. By combining genome-wide linkage analysis and WES, Depienne et al.48 identified heterozygous mutations introducing premature termination codons in the RAD51 gene in two families affected with congenital mirror movements. Using a combined approach of genome-wide linkage analysis, WES and a systematic validation procedure in a German family with restless legs syndrome (RLS), Weissbach et al.49 found variants in four genes (FAT2, ATRN, WWC2 and PCDHA3) that may cause RLS in this family. RLS is one of the most common movement disorders in Europe and the United States.
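
The proband-parent trio design mentioned above rests on a simple genotype comparison: a heterozygous variant in the child that is absent from both parents is a candidate de novo mutation. The following Python sketch shows only that core comparison. Genotypes are encoded as alternate-allele counts (0, 1 or 2), the site labels are hypothetical, and the read-depth and quality checks that real analyses need to exclude sequencing artifacts are deliberately omitted.

```python
def is_de_novo(child, mother, father):
    """Heterozygous in the child and absent from both parents.

    Genotypes are counts of the alternate allele: 0, 1 or 2.
    """
    return child == 1 and mother == 0 and father == 0


def de_novo_sites(trio_calls):
    """Return the sites whose (child, mother, father) genotypes look de novo."""
    return [
        site
        for site, (child, mother, father) in trio_calls.items()
        if is_de_novo(child, mother, father)
    ]


# Hypothetical site labels mapped to (child, mother, father) genotypes.
trio = {
    "site_A": (1, 0, 0),  # candidate de novo
    "site_B": (1, 1, 0),  # inherited from the mother
    "site_C": (2, 1, 1),  # homozygous, both parents are carriers
    "site_D": (0, 0, 0),  # reference in all three
}
candidates = de_novo_sites(trio)
```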

Monogenic types of diabetes

To identify the causative variants in a patient suffering from permanent neonatal diabetes mellitus (NDM), a rare monogenic form of non-autoimmune diabetes, Bonnefond et al.27 applied WES to a patient in whom their previous studies had found no mutation in the KCNJ11, ABCC8 and INS genes and no abnormalities in the chromosome 6q24 region. A de novo non-synonymous mutation, p.Q485H, was found in the ABCC8 gene.

Maturity-onset diabetes of the young (MODY), a heterogeneous type of diabetes showing an autosomal-dominant mode of inheritance, has an onset before the age of 25 years; it is due to a primary deficit in pancreatic beta-cell function. The genetic causes of about 30% of MODY cases are still unknown (MODY-X). A four-generation MODY-X family was studied by WES to find the genetic cause of the condition. After co-segregation analysis within the family, Bonnefond et al.27 found that a mutation (p.Glu227Lys) in the KCNJ11 gene segregated with the disease. Thus, in addition to its role in neonatal diabetes mellitus, KCNJ11 has a role in MODY (MODY13).

Common diseases and complex disorders

The molecular genetics of complex phenotypes, including common diseases and multigenic traits, has developed gradually; the common disease-common variant (CD-CV) hypothesis has been the main driving force for genome-wide association studies (GWAS). However, it turns out that common alleles account for only a fraction of complex traits. Thus, a rare variant-common disease (RV-CD) hypothesis is now emerging.50, 51 WES can identify rare and novel as well as common genetic variants in coding regions associated with complex and common traits.

Cardiovascular disease

Cardiovascular disease (CVD) is the major cause of death in the United States.52 On the basis of etiology, there are two forms of CVD: the rare form, which is monogenic (Mendelian; it includes structural cardiomyopathies, channelopathies and familial dyslipidemias), and the common form, which has polygenic/multifactorial causes. Recent advances have improved our understanding of many cardiovascular diseases. Defining the role of genes associated with Mendelian forms of cardiovascular disease provides insights into other types of heart disease and their management (prevention/treatment). WES has been used to reveal genetic variants in affected individuals with rare forms of heart disease.

Cardiomyopathies may present as hypertrophic cardiomyopathy (HCM) or dilated cardiomyopathy (DCM). A family history of the disease can be traced in the HCM form,53 while about 20–35% of DCM cases may show a familial history of the disease.54 HCM is mainly due to mutations in genes encoding sarcomere proteins; in contrast, DCM has been shown to have a more heterogeneous etiology.55, 56 Meder et al.21 established comprehensive, cost-efficient genetic screening in patients with hereditary DCM or HCM using an array-based subgenomic enrichment system. They found two microdeletions and four point mutations in six patients. WES provides a new way to diagnose cardiomyopathies. Several new genes have been discovered in cardiomyopathies (Table 3). WES has identified the causal genes of familial combined hypolipidemia (ANGPTL3),57 severe hypercholesterolemia (ABCG)58 and familial dilated cardiomyopathy (BAG3).59 WES promises to revolutionize genetic research on CVDs in the near future. With this technology, identification of rare SNPs conferring susceptibility to CVDs is also possible.

Table 3 Reported WES approaches in cardiovascular diseases

Hypertension

One billion people suffer from hypertension worldwide. To date, a few genes causing Mendelian forms of hypertension have been described. More recently, Boyden et al.60 studied 52 unrelated patients suffering from pseudohypoaldosteronism type II (PHAII), a rare Mendelian syndrome characterized by hypertension, hyperkalaemia and metabolic acidosis; using WES, they identified mutations in kelch-like 3 (KLHL3) or cullin 3 (CUL3) in PHAII patients. Austin et al.,61 using WES, studied a three-generation family with multiple members affected by pulmonary arterial hypertension and found a frameshift mutation in the caveolin-1 (CAV1) gene. Familial hyperkalemic hypertension (FHHt), a Mendelian form of arterial hypertension, is partly explained by WNK1 and WNK4 gene mutations. By means of WES, KLHL3 was identified as a third gene responsible for FHHt;62 elucidating the role of genes involved in the signaling pathway that regulates ion homeostasis, one of the impacts of WES, could provide molecular insights into the mechanisms controlling blood pressure.

Obesity and diabetes

The prevalence of obesity is growing rapidly; 11% of adults aged 20 and over were obese in 2008 according to the WHO. Obesity is a major risk factor for type 2 diabetes (T2D) and usually confers a high predisposition to cardiovascular disorders and hypertension.63 Molecular investigation of obesity is crucial for understanding the mechanisms that regulate adiposity, the pathophysiology of obesity and the molecular regulation of appetite. Designing strategies for obesity management (prevention and therapy) requires defining the molecular factors that cause the disease. Mendelian obesity accounts for at least 5% of severely obese individuals.63 WES facilitates the identification of new highly penetrant SNPs. So far, the literature on using WES to detect underlying or predisposing SNPs in obese populations is limited. As the cost of WES is rapidly decreasing, sequencing 1000 obese people, along the lines of the 1000 Genomes Project, has been proposed; management of obesity and associated disorders could then rest on a better understanding of the disease.

Lehne et al.64 investigated the distribution of association signals with respect to protein-coding genes for seven diseases (Crohn’s disease, type 1 and type 2 diabetes, rheumatoid arthritis, hypertension, coronary artery disease and bipolar disorder) from the Wellcome Trust Case Control Consortium. Their study showed a consistently stronger association signal in coding than in noncoding regions for these disorders. Albrechtsen et al.65 applied WES to identify novel associations of coding polymorphisms at minor allele frequencies (MAFs) >1% with common metabolic phenotypes. In stage 1 of their study, WES was performed in 1000 cases with type 2 diabetes, BMI >27.5 kg m−2 and hypertension and in 1000 controls, and 16 192 SNPs were selected that were associated with case–control status, belonged to four selected annotation categories or came from loci reported to associate with metabolic traits. In stage 2, these variants were genotyped in 15 989 Danes to search for association with 12 metabolic phenotypes. Finally, in stage 3, SNPs with potential associations were genotyped in a further 63 896 Europeans. Robust associations were eventually identified for coding SNPs in three genes: CD300LG (fasting HDL cholesterol: MAF 3.5%), COBLL1 (type 2 diabetes: MAF 12.5%) and MACF1 (type 2 diabetes: MAF 23.4%). On the basis of these results, they concluded that coding SNPs with MAF above 1% do not appear to have large effect sizes on metabolic phenotypes. WES, therefore, is an appropriate approach to find causal and predisposing SNPs (Figure 1).
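
The minor allele frequency (MAF) values quoted above are derived from genotype counts. As a minimal illustration, and not a reproduction of the cited study's analysis, the Python sketch below computes MAF for a biallelic SNP from the numbers of reference-homozygous, heterozygous and alternate-homozygous individuals. The function names and the example counts are assumptions chosen for illustration; only the 1% threshold comes from the text.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """MAF of a biallelic SNP from diploid genotype counts."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    # Each heterozygote carries one alternate allele, each alt homozygote two.
    alt_freq = (n_het + 2 * n_alt_hom) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def passes_maf_threshold(maf, threshold=0.01):
    """True if the SNP exceeds the MAF threshold (1% by default)."""
    return maf > threshold


# Made-up counts: 900 ref/ref, 90 ref/alt, 10 alt/alt individuals.
example_maf = minor_allele_frequency(900, 90, 10)
```

With these made-up counts there are 110 alternate alleles among 2000, giving a MAF of 5.5%.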

Cancer

Accumulation of genetic alterations during life can lead to a malignant neoplasm, or cancer. Because human cancers are heterogeneous, multiple gene tests should be applied; WES makes this possible in a single assay. WES could provide novel insights into cancer mechanisms;66 comparing the DNA sequence of cancer cells with that of normal cells could help to reach an in-depth understanding of cancer. Sequence variations may influence the predisposition to cancer development. Using WES, it is feasible to check germline and somatic mutations in cancer (Table 4). Table 4 shows some of the identified genes involved in cancer development. Chang et al.67 applied an exome-sequencing technology using Roche Nimblegen capture paired with 454 sequencing to determine variations and mutations in eight commonly used cancer cell lines; they showed that this technology identifies sequence variations with 95% concordance with an Affymetrix SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) performed on the same cell lines. These researchers concluded that WES can be a reliable and cost-effective way to identify variations in cancer genomes.
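
A concordance figure such as the one reported above is, in essence, the fraction of sites genotyped by both platforms at which the two calls agree. The Python sketch below shows one simple way to compute it; the exact definition used in the cited study is not given in the text, so this formula, the dictionary inputs and the placeholder site names are assumptions for illustration.

```python
def genotype_concordance(wes_calls, array_calls):
    """Fraction of shared sites where both platforms report the same genotype."""
    shared = wes_calls.keys() & array_calls.keys()
    if not shared:
        raise ValueError("no sites genotyped on both platforms")
    matches = sum(wes_calls[site] == array_calls[site] for site in shared)
    return matches / len(shared)


# Placeholder site names and genotypes, not real data.
wes = {"snp1": "AA", "snp2": "AG", "snp3": "GG", "snp4": "CT"}
array = {"snp1": "AA", "snp2": "AG", "snp3": "AG", "snp5": "TT"}
concordance = genotype_concordance(wes, array)
```

Sites present on only one platform (here `snp4` and `snp5`) are excluded from the denominator, so the result measures agreement rather than coverage.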

Table 4 Some of cancers studied by WES

In 2010, Summerer et al.68 reported an automated exome-capture method for a subset of 115 cancer-related genes using microfluidic DNA arrays. WES was also applied to a series of primary clear cell renal cell carcinomas (ccRCC);69 that study identified PBRM1, a gene encoding a component of the SWI/SNF chromatin remodeling complex.

Recently, it has been reported that mutations in SF3B1 gene encoding a spliceosomal protein were identified in a distinct form of myelodysplastic syndromes (MDS) with ring sideroblasts. Mutations of this gene are associated with a good prognosis.70 In addition to complexity of variations in different types of cancers, mutations in oncogenes and tumor suppressor genes that initiate tumorigenesis should be considered. To confirm variations in tumor suppressor genes leading to loss of heterozygosity, comparison of the identified variants in diseased tissues with normal tissues from the same individual is required.
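
The comparison of diseased and normal tissue from the same individual described above can be pictured as a set operation on the two variant lists: variants seen only in the tumor are candidate somatic mutations, while those shared with the normal tissue are germline. The Python sketch below shows this partition. It is a deliberate simplification with hypothetical variant tuples; real tumor-normal analyses also model allele fractions, tumor purity and sequencing error, none of which is represented here.

```python
def partition_tumor_normal(tumor_variants, normal_variants):
    """Split variants by their presence in tumor and matched normal tissue."""
    tumor, normal = set(tumor_variants), set(normal_variants)
    return {
        "somatic": tumor - normal,      # tumor only: candidate somatic mutations
        "germline": tumor & normal,     # present in both tissues
        "normal_only": normal - tumor,  # seen in normal but not in tumor
    }


# Hypothetical (chrom, pos, ref, alt) tuples for illustration.
tumor_calls = [("1", 100, "A", "G"), ("2", 200, "C", "T"), ("3", 300, "G", "A")]
normal_calls = [("1", 100, "A", "G"), ("4", 400, "T", "C")]
groups = partition_tumor_normal(tumor_calls, normal_calls)
```

Variants in the `normal_only` group would need follow-up with allele-level data before any conclusion about loss of heterozygosity could be drawn.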

Finding the driver mutations is one of the major concerns in genomic analysis of cancer. Thus, several studies have attempted to identify driver mutations within the exome in various types of cancer including leukemias, myelomas and solid tumors.71, 72, 73

WES is providing a detailed understanding of cancer pathways and unraveling the molecular mechanisms of cancer. Wang et al.74 applied WES to 22 individuals with gastric cancer and identified previously unreported mutated genes and pathway alterations; in particular, they found genes involved in chromatin remodeling. In addition, WES is being applied for therapeutic aims, for example, identifying relevant pharmacogenetic variants and targeted gene–disease–drug interactions.75, 76

In brief, applications of WES in cancers are (1) somatic mutation detection, (2) driver mutation detection, (3) mutation network reconstruction and (4) identification of predisposing variants; even intra-family WES approaches to uncover cancer predisposition genes need to be considered for some cancers.77

Human variome and ClinSeq projects

The human genome has about four million DNA sequence variants. The important task is to characterize the clinical significance of these differences. In the coming years, the application of NGS technology will produce a huge amount of data on gene variation. The challenge is how to bring these data into clinical practice. These explosively growing data should be managed in an organized manner so that clinicians and researchers can easily find the information they need on previously characterized variations. In the era of personalized medicine, collecting variants according to the ethnicity and origin of patients would provide a valuable solution for disease management in the near future.78, 79, 80 The Human Variome Project (http://www.humanvariomeproject.org) will systematically collect mutations that cause human disease on a population basis. Gathering and characterizing all mutations and their effects would shed light on the management of genetic disorders.

NGS technology would help to identify not only the causal variants in genotype–phenotype correlation studies but also the modifier genes, by comparing alleles among family members. The clinical phenotype is the result of several alleles with different pathogenic effects, which makes the interpretation of variants difficult.

Most studies that aim to identify causal variants are based on a small number of samples. Much more information is needed before genetic information can be relied on in the clinic. The ClinSeq project aims to provide sufficient genomic coverage, a sufficient number of subjects and the associated clinical data. By contrast, a single-gene study evaluates only one gene together with the clinical phenotype, whereas the 1000 Genomes Project evaluates a large number of samples without phenotype evaluation.

The ClinSeq project was organized to use WGS data from massively parallel sequencing of patients with different diseases to determine the genotype–phenotype association of variants. Identification of variants and knowledge of genotype–phenotype correlations could be used to change the clinical management of patients.81 For example, pharmacological treatment is useful for patients affected by familial hypercholesterolemia who are diet dependent.82 The pilot study started with large-scale medical sequencing of 1000 participants for atherosclerotic heart disease and aims to build a model relating genotype to phenotype.

The goal is to use genetic data in clinical practice and thereby improve our understanding of the genetic basis of disease in health care, diagnosis and therapy. Preliminary data from the ClinSeq study showed that results should be interpreted and returned to the individuals. Which variants are reported should depend on the disease. High-penetrance variants that reliably predict the disease are particularly important to report, as they may be used in the management and treatment of the disease.

Concerns in WES

Massively parallel DNA sequencing systems have increased sequencing throughput compared with the classical sequencing method. The lower cost of WES compared with WGS has led to its application in clinical genetics. As WES covers only ∼1% of the genome and is limited to coding and splice-site variants in annotated genes, it is suitable for gene discovery in highly penetrant Mendelian diseases. In addition, the exon capture step may introduce technical biases, which limits the use of WES in detecting copy-number variants and in genomic regions where capture is less efficient.

In the search for causative variants, the availability of data across the genome makes the clinical phenotype less important when large families with multiple affected individuals, or multiple pedigrees with the same genetically homogeneous disorder, are available. If only simplex cases or small families are studied, detailed clinical evaluations and laboratory tests play an important role in narrowing down the candidate causative variants. Aldahmesh et al.,83 for example, reported two individuals showing a more severe neurologic phenotype of Sjögren-Larsson syndrome (SLS; MIM 270200), which is characterized by ichthyosis, seizures, intellectual disability and spasticity. However, fibroblast fatty aldehyde dehydrogenase (FALDH) activity, whose deficiency is the hallmark of SLS, was found to be normal in their patients. Biallelic ELOVL4 mutations were found in these pseudo-SLS patients. Thus, careful differential diagnosis of the condition may accelerate the search for causal variants. Conversely, as discussed earlier, WES is an excellent tool for reaching the correct diagnosis even when the initial diagnosis is incorrect.
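The family-based narrowing of candidates described above can be sketched in a few lines of Python. This is only an illustrative simplification, not a method taken from the cited studies: it assumes each person's variant calls are already available as a set of (chromosome, position, reference allele, alternate allele) tuples, and it ignores zygosity, inheritance model, penetrance and population frequency. The function name and all sample identifiers and coordinates are hypothetical.

```python
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def shared_candidates(
    affected: Dict[str, Set[Variant]],
    unaffected: Dict[str, Set[Variant]],
) -> Set[Variant]:
    """Keep variants carried by every affected relative and by no unaffected one.

    Simplifying assumption: presence/absence only (no zygosity or penetrance).
    """
    if not affected:
        return set()
    # Variants present in all affected individuals.
    candidates = set.intersection(*affected.values())
    # Remove anything also seen in an unaffected relative.
    for variants in unaffected.values():
        candidates -= variants
    return candidates


if __name__ == "__main__":
    # Hypothetical variant calls for two affected and one unaffected relative.
    v1 = ("chr1", 12345, "A", "G")
    v2 = ("chr6", 67890, "C", "T")
    v3 = ("chrX", 13579, "G", "A")
    affected = {"affected_1": {v1, v2, v3}, "affected_2": {v1, v2}}
    unaffected = {"unaffected_1": {v2}}
    print(sorted(shared_candidates(affected, unaffected)))
```

With more affected relatives the intersection shrinks quickly, which is why large pedigrees reduce the reliance on detailed phenotyping; with a simplex case the candidate set stays large and clinical or laboratory findings are needed to prioritize it.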

Ethics is one of the most important issues in medical genetics. Informed consent, which is a guiding principle of medical research, may not be so meaningful here, because information derived from DNA does not belong only to the participant but also reveals properties of relatives and sometimes of the population of origin. As WES yields DNA variants on a massive scale, using it in a clinical setting requires more detailed legal and ethical standards. How many relatives of the participant should be informed? How much information is enough to say that a participant is informed? Who should assure the patient and relatives, and how, that their information will not be misused by employers and insurance companies? And who guarantees that their information will be kept secure in future research, for the sake of their offspring?

The ethical issues of privacy, confidentiality and return of results are major concerns and need further evaluation by professionals. Employment and insurance issues should also be resolved; a comprehensive agenda that considers all possible consequences of WES needs to be established before WES can be used to improve human health.

Briefly, according to Tabor et al.,84 the informed consent for NGS analysis provides participants with a description of the overall experiment, an explanation of the risks of data sharing, the privacy of the data and the plan for returning results. The consent describes the return of results in groups: the causal variant related to the condition under study, overall variants and variants associated with common diseases. Participants may choose whether or not to receive the research results. It is suggested that a framework for releasing information to participants and their families should depend on their preferences, which may change over time.
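As a rough illustration of what a preference-dependent release framework might look like in software, the following Python sketch records a participant's choice for each result group and lets it change over time. It is a hypothetical data structure, not the framework proposed by Tabor et al.; the class name, field names and group labels are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

# Result groups mentioned in the consent description above (labels are hypothetical).
RESULT_GROUPS = ("causal_variant", "overall_variants", "common_disease_variants")


@dataclass
class ConsentRecord:
    participant_id: str
    # Receiving results is optional, so every group starts as "not requested".
    preferences: Dict[str, bool] = field(
        default_factory=lambda: {group: False for group in RESULT_GROUPS}
    )
    # Audit trail of (date, group, choice) so changes over time are traceable.
    history: List[Tuple[date, str, bool]] = field(default_factory=list)

    def update(self, group: str, wants_results: bool, when: date) -> None:
        """Record a new preference for one result group."""
        if group not in RESULT_GROUPS:
            raise ValueError(f"unknown result group: {group}")
        self.preferences[group] = wants_results
        self.history.append((when, group, wants_results))

    def releasable(self) -> List[str]:
        """Result groups the participant currently wishes to receive."""
        return [group for group in RESULT_GROUPS if self.preferences[group]]


if __name__ == "__main__":
    record = ConsentRecord("participant_001")  # hypothetical identifier
    record.update("causal_variant", True, date(2013, 1, 15))
    print(record.releasable())
```

Keeping the history alongside the current preferences is one way to reflect the point that consent is not a single event: a later opt-out simply appends a new entry and changes what may be released from then on.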

Conclusion

Understanding genetic variants provides valuable insights into human disease for prevention strategies, diagnostic applications and therapeutic methods. Systematic resequencing of human DNA (large-scale medical sequencing) is revolutionizing medicine.82 As whole-genome sequencing is still costly, WES is a temporary alternative for obtaining a picture of the coding genome. Sequence data production, which is growing exponentially, should be managed and organized so that the information is available to researchers and clinicians in a systematic framework. Furthermore, more user-friendly bioinformatics tools are needed to improve current knowledge of genetic disease management, including counseling, prevention, treatment and drug response. Given this impact of DNA sequencing, a better-understood personalized medicine is expected in the coming years.

WES is making its way into research, diagnostic and clinical settings; it has been used for variant detection in both common and rare diseases as well as for SNP association studies and pharmacogenetics. For clinical and diagnostic applications of NGS, it is recommended to focus on specific human genes within the relevant pathways for faster diagnosis. This selection could be based on our knowledge of the different developmental biology pathways (interactome) related to the disease. In addition, targeted capture is suitable in a diagnostic setting, especially for diseases that can be caused by numerous genes.