Introduction

There are two general approaches to utilizing next-generation sequencing (NGS)-based tests in clinical settings. If the clinical presentations of a patient suggest one of the genetically heterogeneous conditions, a phenotype-specific panel (e.g. a hearing-loss panel) test could be ordered; if the clinical evaluation of a patient reveals complex or uncharacteristic presentation, whole-exome sequencing would be ordered. The diagnostic yields vary, depending on several variables:

1. The percentage of patients with a genetic condition that can be explained in terms of the genes currently known

2. The types and distributions of pathogenic variants associated with these genes and the quality of the sequencing to detect them

3. The accuracy and completeness of clinical evaluation by clinical geneticists

4. The capabilities of personnel, including medical directors, genomic scientists, and genetic counselors, who can accurately recognize clinically relevant variants and properly evaluate the pathogenicity of the variants

While the design of the panels and exome can easily be shared across laboratories and countries and the sequencing quality can be very similar, the consistency and accuracy of patient clinical evaluation and variant interpretation are largely dependent on physicians’ and specialists’ personal knowledge and experience. NGS-based tests, both panel and exome tests, have been proven to be a powerful diagnostic tool in countries with both technologies and specialists1,2,3,4,5 and have been proposed as first-tier tests for children with suspected monogenic disorders.6,7,8,9,10 China adopted the NGS technology rapidly, but the lack of well-trained specialists and the high cost associated with NGS-based tests are limiting the clinical utilization of this technology. Currently, there are only a handful of clinical medical geneticists who were trained abroad and are now working in a few top hospitals in China. There are no genetic counselors with training or experience at a level equivalent to that in the United States and other Western countries. As a consequence, genetic testing based on a previous clinical diagnosis is not a routine practice. An NGS-based genomic-first approach provided a unique opportunity for countries such as China to provide a routine molecular diagnostic service for patients without prior screening or extensive clinical evaluation by well-trained specialists. Yet the high cost associated with regular exome sequencing created a significant disparity among patients, who had to pay for the test out of their own pockets. Because of this situation in China, we sought to use a subexome approach by limiting the test to the medical exomes that target only the known Mendelian disease genes and offering it only to the proband in the family. 
In this study, we intended to evaluate the performance of such practice, based on the experience of one such pediatric hospital which is an early adopter of the use of a large, comprehensive disease panel for routine molecular diagnostic service. We provide evidence showing that this approach can overcome most of the current limitations in China. Proband-only medical exome sequencing (POMES) can provide quality and cost-effective service for a large number of patients with a wide range of genetic conditions. We further demonstrate the clinical utility of POMES in making a positive impact on patient management despite the limited knowledge of medical genetics on the part of ordering physicians.

Materials and methods

Patients

This study involved 1,323 patients who were referred for genetic testing at Shanghai Children’s Medical Center (SCMC) from April 2015 to December 2016. POMES was primarily ordered by physicians managing patients from a diverse range of specialty outpatient clinics and inpatient wards. The test and data interpretation were performed by the molecular diagnosis laboratory at SCMC. This study was approved by the SCMC institutional review board.

Target sequencing and variant evaluation

Genomic DNA was isolated from peripheral blood samples of patients and their family members, when available, by using the Gentra Puregene Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol. NGS was performed only on probands. The target regions were captured by the ClearSeq Inherited Disease panel (cat No.5190–7519, Agilent Technologies, Santa Clara, CA) kit, which contains 2,742 confirmed disease-causing genes (Supplementary Table S1 online).

NGS was performed using HiSeq X Ten (Illumina, San Diego, CA) according to the manufacturer’s protocol. Paired-end reads were aligned to the GRCh37/hg19 human reference sequence. BAM and VCF files were generated by NextGENe software (SoftGenetics, State College, PA). Sequencing quality information is provided in Supplementary Table S2.

Variants were annotated and filtered by Ingenuity Variant Analysis (https://variants.ingenuity.com). Common variants were filtered based on their frequencies in the databases of the Exome Aggregation Consortium (ExAC) (http://exac.broadinstitute.org), the Exome Sequencing Project (https://esp.gs.washington.edu), or 1000G (http://www.1000genomes.org), and an internal database. We first analyzed the variants associated with patients’ phenotypes (usually described in a few words by the specialists ordering NGS). If no candidate variant was found, we further analyzed all genes for putative disease-causing variants in case the phenotype description was not accurate. Rare phenotype-related variants were classified by following the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines.11
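The two-pass analysis described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Variant` record, the 1% `MAX_POPULATION_AF` cutoff, and the function names are hypothetical, because the text does not give the frequency threshold used or the configuration of Ingenuity Variant Analysis.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: the study does not state the frequency threshold used.
MAX_POPULATION_AF = 0.01


@dataclass
class Variant:
    gene: str
    # Allele frequency per database (e.g. "ExAC", "ESP", "1000G", "internal").
    # A database with no record of the variant is simply absent from the dict.
    frequencies: dict = field(default_factory=dict)


def is_rare(variant, cutoff=MAX_POPULATION_AF):
    """Keep a variant only if it is below the cutoff in every database."""
    return all(af < cutoff for af in variant.frequencies.values())


def candidate_variants(variants, phenotype_genes):
    """First pass: rare variants in phenotype-related genes.
    Second pass (only if the first pass is empty): rare variants in any gene."""
    rare = [v for v in variants if is_rare(v)]
    related = [v for v in rare if v.gene in phenotype_genes]
    return related if related else rare
```

The fallback to all genes mirrors the case in which the short phenotype description supplied by the ordering physician is inaccurate; candidates from either pass would still need classification under the ACMG/AMP guidelines.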

All putative disease-causing variants detected by NGS were confirmed by Sanger sequencing. Family members were also examined by polymerase chain reaction and Sanger sequencing to determine the origin and phase of the variants and, occasionally, for segregation analysis when multiple affected family members were available.

Questionnaire

A questionnaire (Supplementary File 1) was designed and sent to each physician who ordered NGS from April 2015 to April 2016. In the question set we investigated the ordering physician’s medical genetics background, their ability to order genetic testing, and their understanding of the diagnostic test results. We also analyzed the clinical correlations of the molecular diagnostic findings with patients’ clinical phenotypes and assessed the clinical utility of the diagnostic results, as well as physicians’ attitudes to POMES.

Results

Patient demographics

The average age of patients tested by POMES was 5.25 ± 0.30 years. There were 1,323 patients: 781 boys and 542 girls (Figure 1a). POMES tests were ordered by 136 individual physicians from 34 specialty clinics (i.e. outpatients, 725) or wards (i.e. inpatients, 598). Most outpatients were from a clinic of pediatric medicine (451 patients), a clinic of cardiovascular disease (106 patients), and a clinic of developmental and behavioral pediatrics (103 patients) (Figure 1b). Most of the inpatients came from cardiology (181 patients), endocrinology (151 patients), neurology (63 patients), respiratory disease (49 patients), PICU (45 patients), nephrology (35 patients), gastroenterology (32 patients), and NICU (28 patients) wards (Figure 1c). The majority of those patients had had no previous genetic testing, and the ordering physicians did not provide distinct clinical diagnoses. The phenotypes varied widely. The composition of the patient cohort represented a fair sampling of patients from a typical tertiary pediatric hospital in China.

Figure 1

Characteristics of the patient population and the variants detected. (a) The age and sex distribution of patients tested by proband-only medical exome sequencing (POMES). (b) The composition of the cohort of 725 outpatients. (c) The composition of the cohort of 598 inpatients. (d) The classification of 961 variants following the American College of Medical Genetics and Genomics/Association for Molecular Pathology guidelines. (e) The composition of 512 pathogenic or likely pathogenic variants. ENT, ear, nose, and throat; HEENT, head, eye, ear, nose, and throat.

SCMC has the largest pediatric heart center in Asia, and the largest fraction of patients tested by POMES had heart disorders: 173 patients with dilated cardiomyopathy, 39 patients with hypertrophic cardiomyopathy, five patients with arrhythmogenic right ventricular cardiomyopathy, 25 patients with Marfan syndrome, and 81 patients with other cardiovascular diseases. Patients with diseases of other categories were distributed as follows: neuromuscular disease (266 patients), endocrine diseases and inborn errors of metabolism (106 patients), short stature (91 patients), multiple malformation (91 patients), hematology and immunological diseases (77 patients), disorders of sex development (63 patients), and rare nephrotic diseases (57 patients). There were 162 patients who had complex presentations with multisystem involvement.

Characterization of POMES variants

We identified 961 rare nonsynonymous variants in phenotype-related genes in 756 patients. Of all these variants, 381 (~40%) were novel. Following the ACMG/AMP guidelines, 319 were classified as pathogenic variants, 193 were classified as likely pathogenic variants, 421 were classified as variants of unknown significance, and 28 were classified as likely benign or benign variants (Figure 1d). Of the 512 pathogenic and likely pathogenic (P/LP) variants, 233 were missense variants, 107 were frameshift variants, 99 were nonsense variants, 43 were ±1 or 2 splice-site variants, 10 were in-frame insertion/deletion variants, 8 were intronic variants, 8 were copy-number variations, and four were initiation codon mutations (Figure 1e). 210 out of the 512 P/LP variants (41%) were variants reported for the first time (Supplementary Table S3).12
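As a consistency check on the counts reported above, the tallies can be recomputed in a few lines of Python. The dictionary keys and variable names are illustrative labels chosen here; the numbers are taken directly from the text.

```python
# ACMG/AMP classes of the rare nonsynonymous variants, as reported in the text.
classification = {
    "pathogenic": 319,
    "likely pathogenic": 193,
    "unknown significance": 421,
    "likely benign or benign": 28,
}

# Variant types among the pathogenic / likely pathogenic (P/LP) variants.
plp_types = {
    "missense": 233,
    "frameshift": 107,
    "nonsense": 99,
    "splice site (±1 or 2)": 43,
    "in-frame insertion/deletion": 10,
    "intronic": 8,
    "copy-number variation": 8,
    "initiation codon": 4,
}


def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_variants = sum(classification.values())
plp_total = classification["pathogenic"] + classification["likely pathogenic"]

novel_all = percent(381, total_variants)   # novel variants among all variants
novel_plp = percent(210, plp_total)        # first-reported variants among P/LP
```

Both breakdowns sum to the stated totals (961 variants overall, 512 P/LP variants), and the two novelty fractions agree with the approximate percentages quoted in the text.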

For the interpretation of the 961 variants, we assessed the frequency of use for each ACMG/AMP line of evidence, in a manner similar to that described by Richards et al.11 As depicted in Figure 2, PM2, PP3, and PP4 were most frequently used as supporting evidence for pathogenicity. The profile is very similar to what was reported by Richards et al.11 But these lines of evidence are also commonly applicable to variants that are not P/LP; thus, their contribution to the suggestion of pathogenicity lacked specificity, whereas the second tier of frequently utilized lines of evidence, PVS1, PS2, PM1, and PS3, were engaged predominantly for P/LP variants.

Figure 2

Frequency of use for each American College of Medical Genetics and Genomics/Association for Molecular Pathology line of evidence.

Diagnostic yield

We identified 512 pathogenic and likely pathogenic variants in 410 of the 1,323 patients (31.0%). Of these 410, definitive molecular diagnoses of 216 distinct disorders were reached for 381 (Supplementary Table S3). The remaining 29 patients each carried one P/LP variant and one variant of unknown significance (or only one P/LP variant) for a recessive condition or one P/LP variant for a dominant condition that was not highly consistent with the patient’s phenotype. Further evidence (segregation or functional evidence or follow-up phenotyping) is required to reach definitive diagnoses for these patients. The overall diagnostic rate for this nonselected patient population was 28.8% (Table 1). Over 30% of the diagnoses were of skin diseases (mostly albinism and café-au-lait spots), nephrotic diseases, skeletal diseases, endocrine diseases and congenital errors of metabolism, gastroenteric disease, short stature, and multiple malformations. We failed to uncover any positive cases of disorders in the subgroups of polycystic kidney, autoimmune disease, tachycardia, arrhythmogenic right ventricular cardiomyopathy, and bronchiectasis. This was due partly to the relatively small cohort size (each subgroup had fewer than 10 patients). We also analyzed the diagnostic yields for patients from different clinics and wards (Figure 3). Patients from PICUs had the highest molecular diagnostic rate, of over 35% (n = 45). Patients from NICUs had the lowest diagnostic rate, of 10% (n = 28).
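The overall diagnostic rate follows directly from the patient counts given above. The short Python sketch below recomputes it; the function and variable names are illustrative only.

```python
def diagnostic_rate(diagnosed, tested):
    """Percentage of tested patients with a definitive molecular diagnosis."""
    return round(100 * diagnosed / tested, 1)


tested = 1323      # patients who received POMES
with_plp = 410     # patients in whom P/LP variants were identified
definitive = 381   # patients with a definitive molecular diagnosis

# Patients with P/LP findings that still need further evidence.
pending = with_plp - definitive
overall_rate = diagnostic_rate(definitive, tested)

print(f"{pending} patients await further evidence; overall rate {overall_rate}%")
```

The same helper could be applied to each clinic or ward once the per-department counts of diagnosed and tested patients are available.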

Table 1 Diagnosis rate of each subgroup of pretest phenotype

Figure 3

Diagnostic rates of patients from different clinics or wards.

Turnaround time and cost analysis of POMES

We tracked the turnaround time of our POMES test for 381 patients. It ranged from 8 days to 154 days. The average turnaround time was 57 days, with 67% of patients receiving reports within 70 days. Only one outlier case took more than 4 months. This was a patient with seizure disorder; the initial report was issued in 76 days, reporting a variant in CACNA1H that had previously been reported in the literature.13 However, extensive clinical correlation analysis did not support this conclusion, so we reanalyzed the data and eventually identified a pathogenic variant in the DYNC1H1 gene that explained the patient’s condition. This case demonstrated the importance of posttest clinical correlation analysis and the interactions between laboratory specialists and ordering physicians. The cost of running a sample is about US$170, which includes library construction (~$120), proband NGS (~$35), and trio Sanger sequencing (~$15). Patients were charged $360 (the additional ~$200 covers the labor/management and data-analysis costs). These results showed that our POMES test was affordable and reasonably efficient. To assess the cost-effectiveness of POMES versus single-gene tests for discrete conditions in making clinical diagnoses, we reviewed the positive cases and classified these conditions into two categories: (i) single-gene disorders in which the gene is large (e.g. NF1 for neurofibromatosis type 1, DMD for Duchenne muscular dystrophy/Becker muscular dystrophy, and FBN1 for Marfan syndrome) and (ii) conditions associated with multiple disease-causing genes (e.g. ectodermal dysplasia, albinism, congenital adrenal cortical hyperplasia, methylmalonic aciduria, and mucopolysaccharidosis). We calculated the cost of using the single-gene test approach for all these cases (Supplementary Table S4). The results showed that the total cost of using POMES for 54 cases was $19,440, whereas the cost of using the single-gene approach for these cases was $24,279 (including Sanger sequencing, multiplex ligation-dependent probe amplification for DMD and the labor/management and data analysis costs).
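The cost comparison can be retraced with simple arithmetic. The Python sketch below uses the approximate dollar figures quoted in the text; the variable names are illustrative, and the itemized single-gene costs from Supplementary Table S4 are not reproduced here, so only the reported total is used.

```python
# Approximate per-sample running costs in US dollars, as itemized in the text.
running_costs = {
    "library construction": 120,
    "proband NGS": 35,
    "trio Sanger sequencing": 15,
}
charge_per_patient = 360

cost_per_sample = sum(running_costs.values())
# Remainder of the charge, covering labor/management and data analysis.
overhead = charge_per_patient - cost_per_sample

cases = 54
pomes_total = cases * charge_per_patient
single_gene_total = 24279  # reported total for the single-gene approach

savings = single_gene_total - pomes_total
savings_percent = round(100 * savings / single_gene_total, 1)

print(f"POMES: ${pomes_total:,}; single-gene: ${single_gene_total:,}; "
      f"saved ${savings:,} ({savings_percent}%)")
```

Under these figures, POMES costs about a fifth less than the single-gene approach for the same 54 cases, while also covering genes that a single-gene test would not examine.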

Follow-up survey

Targeting patients with positive diagnostic findings, we sent 193 questionnaire forms to 35 ordering physicians; 164 valid questionnaires were received from 29 physicians. First we assessed the medical-genetics training experience of the ordering physicians (Figures 4a–d). We found that 65.5% of ordering physicians (19/29) had no secondary medical-genetics education and 13.8% (4/29) had no medical-genetics education at all. This reflects the current status of medical-genetics training of physicians in China. However, 86.2% of the ordering physicians (25/29) said that they will order more molecular diagnostic tests in the future.

Figure 4

Characteristics of the ordering physicians. (a–d) Medical-genetics education background, training experience, and work experience of the 29 physicians in our survey. (e) Genetic counseling information conveyed to patients with positive reports.

Before POMES testing, less than a third (54/164) of patients were believed to have a monogenic disorder based on the judgment of the ordering physicians. Physicians were not sure what to order (Sanger sequencing, panel sequencing, exome sequencing, microarray, or something else) for 40.2% (66/164) of their patients. According to our survey of 29 physicians from 17 wards or clinics, the correlation between pretest assessments of whether the patient is likely to have a monogenic disorder and the outcomes of molecular tests did not increase significantly (sometimes it even decreased) among physicians with higher educational degrees, professional rank, and experience, or a more extensive medical-genetics background (Supplementary Table S5). One possible explanation is that the likelihood of having a monogenic disorder varies among patients from different medical departments. We analyzed three departments with high diagnostic rates in our study (a PICU, a gastroenterology ward, and a cardiovascular disease clinic), and the overall correlation between pretest assessment and molecular result was 40%. The correlation for three departments with low diagnostic rates (a developmental and behavioral pediatrics clinic, a respiratory disease ward, and a cardiology ward) was only 26%.

After testing, physicians reported that 83.5% (137/164) of patients had phenotypes matching the disease revealed by the POMES test; 99.4% (163/164) of patients had at least partially matching phenotypes. In addition, 75.6% (124/164) of laboratory reports were understood without difficulty by ordering physicians, whereas further consultations were needed for the remaining cases, which mostly involved rare diseases with which the physicians were unfamiliar. Because there is no genetic counselor to serve this function, laboratory genomics scientists assisted with the interactions with ordering physicians. Extensive interactions were required for 33 (20.1%) patients who had complex phenotypes. In 28.1% of cases, physicians were asked to provide additional phenotype information before (22.6%, 37/164) or after (5.5%, 9/164) the sequencing, so that a positive diagnosis was eventually reached.

We further analyzed the clinical correlations between clinical diagnosis/clinically observed phenotypes and molecular findings. For the 164 cases surveyed, 76.2% (125) of patients had either no clinical diagnosis at all (32) or only very preliminary information (93) regarding clinical diagnosis. The definitive diagnosis depended mainly on the molecular evidence and subsequent clinical correlation. Fewer than a quarter of patients (22.6%, 37/164) had had distinct clinical diagnoses prior to testing, and in these cases molecular evidence helped to determine the specific subtype of the disease. In only two patients who had received a definite clinical diagnosis did molecular testing confirm the clinical diagnosis and identify specific variants.

Effect on clinical management

The positive diagnostic results from POMES affected clinical management widely, prompting the provision of appropriate genetic counseling, referral for systemic evaluation, and the offer of a novel treatment or a change of treatment. Genetic counseling for patients was provided by ordering physicians once the reports were issued. For physicians, the mode of inheritance and prognosis were the information most commonly conveyed to patients, followed by the mechanism of pathogenesis, the phenotypic spectrum, and recurrence risk (Figure 4e). After receiving a molecular diagnosis, 28% (46/164) of patients had organs or systems examined to which attention had not initially been paid, and 45.1% (74) of patients were provided with novel clinical management options based on the molecular findings. However, not all of these patients received novel treatment; only 88% (65/74) did. For the 9 patients who did not receive novel treatment, the reasons given were “no treatment available in this hospital or in China” (7) and “the new treatment is too expensive” (2). Examples of impacts on patients’ clinical management following POMES testing are provided below.

Two siblings (patient MES-174 a & b) were referred to an endocrine clinic for short stature. Growth-hormone (GH) stimulation testing revealed a partial GH deficiency. GH replacement therapy was initiated after initial evaluation. POMES detected pathogenic compound heterozygous variants in RECQL3 (BLM), which led to the diagnosis of Bloom syndrome for the siblings. GH was then withdrawn, because it is contraindicated in patients with chromosomal breakage syndromes. The diagnosis also affected medical management for the siblings because of the risks of diabetes mellitus and neoplasia. The patients were counseled regarding proper protection from the sun.

A 9-year-old girl (patient MES-552) was referred to our hospital for unexplained syncope. A preliminary diagnosis of partial anomalous pulmonary venous connection was made after cardiac magnetic resonance imaging and cardiac catheterization. After detection of a pathogenic variant in TNNT2, the diagnosis was changed to hypertrophic cardiomyopathy and therefore surgery was not warranted.

Three boys (patients MES-113, MES-528, and MES-1089) were referred to a pediatrician for disorders of sex development. Sequencing revealed pathogenic variants in KAL1, leading to a diagnosis of Kallmann syndrome. Hormone-replacement therapy was initiated.

Discussion

The practice of medical genetics in China is different from that in the United States and other Western countries, mainly for the following reasons.

1. Medical genetics was formally recognized as an independent medical discipline in China less than 2 years ago, as opposed to a history of more than 25 years in the United States.

2. Only a very limited number of medical schools offer specialized training in medical genetics. As a consequence, as confirmed by our survey, only a small fraction of physicians have had postgraduate training in the field. Most hospitals do not have independent clinics for patients with genetic disorders.

3. Trained genetic counselors do not exist in China as they do in the United States and other developed countries. Genetic counseling is often a side service offered by doctors trained in other specialties, such as gynecology and fetal medicine.

Hence, most patients with genetic conditions do not receive a proper evaluation by a clinical geneticist, only a small percentage receive a clinical diagnosis, and the majority remain undiagnosed for life. But things are changing rapidly in China as NGS technology is rapidly being adopted as a diagnostic tool and is becoming more accessible for Chinese patients. However, the practice of NGS-based molecular diagnostic testing in China will still differ from that in developed countries, for the reasons listed above. In addition, there is no medical insurance coverage for genetic testing in China; patients must pay for tests out of their own pocket. Thus, the high cost associated with diagnostic exome sequencing is a prohibiting factor for the widespread use of NGS-based tests. Accordingly, we developed a strategy of sequencing only the known disease genes in probands as a first-tier test. This POMES approach does not rely on extensive clinical evaluation and diagnosis. It is cost-effective and affordable for most families. Most importantly, it offers a high diagnostic rate for a wide range of unselected conditions. Our report here demonstrates the clinical validity and utility of this practice. Because of POMES, many patients now receive specific diagnoses within a short period of time and the findings impact their clinical management and expected outcomes. POMES is playing an important role in equalizing the diagnostic opportunities for Chinese patients with suspected genetic conditions with those of patients in Western countries.

Our study provides empirical evidence to support the clinical utility of POMES in spite of the lack of well-trained medical genetics professionals in China. According to our survey, 76.2% of our diagnosed patients did not receive a clinical diagnosis before the testing. In these cases, POMES demonstrated its ultimate utility of facilitating diagnosis with minimal dependency on clinical expertise. This NGS-based approach enabled physicians to reach a definite diagnosis despite their limited knowledge of rare genetic diseases. It is worth pointing out that this approach is particularly applicable to diagnostic testing when patients present with observable phenotypes; the more detailed the clinical phenotype available through interaction with the ordering physician, the greater the likelihood of a confident diagnostic result and a higher yield. The lack of sufficient phenotyping or of known characteristic presentations in NICU patients might explain the low diagnostic yield of this approach for this cohort. Our data support the proposition that POMES is suitable for the molecular diagnosis of the pediatric patient population but is probably not ideal for neonatal testing.

Trio testing is a desirable approach, mainly because it can easily identify de novo variants in the proband that constitute strong supportive, albeit not sufficient, evidence for pathogenicity, and can also determine the configuration of variants in recessive genes. The proband-only strategy seems to have eliminated these two important benefits of trio testing. Yet we wanted to assess the net impact on diagnostic yield of the loss of these two benefits. The null variants (nonsense, frameshift, and canonical splicing variants) detected in a proband will not be missed by our analysis even if we do not have prior knowledge of their de novo status. Some of those variants that are clinically relevant to the patient’s presentation turned out to be de novo after parental Sanger sequencing (Supplementary Table S3). Previously reported pathogenic missense variants will also be evaluated by our analysis. The clinically relevant ones that are at risk of being missed are the novel missense variants, whose de novo statuses were not known to us without trio testing. We reviewed the major publications on exome sequencing since 2012 (Supplementary Table S6); the fraction of de novo missense variants in the cohorts was between 18.1 and 50%. We reported 82 P/LP missense variants that were de novo after parental Sanger sequencing in our study (Supplementary Table S3). This comprised 21.5% of our total P/LP variants detected. Although we could have missed some de novo missense variants relevant to a patient’s condition, this number suggests that we are not missing a significant number of those variants by testing only probands by NGS and following up with parental Sanger sequencing for selected variants. Similarly, for variants in recessive disease genes, we followed up with parental Sanger sequencing when at least one of the variants was either a null variant or a previously reported pathogenic variant. Compound heterozygotes with two novel missense variants are at risk of not being selected for parental testing. The data show that the proband-only test probably missed a significant fraction of those variants (P = 0.0077). The average proportion of compound heterozygous missense variants encountered in previous studies is 6.8%, whereas it was only 2.9% in our study. It is unlikely that functional and strong cosegregation evidence exists for those novel missense variants. Supporting evidence is most likely to come from PM1, PM2, and PM5. We should pay attention to two novel deleterious missense variants in a phenotypically relevant gene. The actual impact on the diagnostic yield due to the underdetection of such compound heterozygous variants has yet to be evaluated.
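The follow-up rule for recessive disease genes described above, and the blind spot it creates, can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `Call` record, its field names, and the consequence labels are hypothetical, and phenotype relevance and the other ACMG/AMP lines of evidence are not modeled.

```python
from dataclasses import dataclass

# Null variant types named in the text: nonsense, frameshift, canonical splicing.
# The string labels themselves are assumptions made for this sketch.
NULL_CONSEQUENCES = {"nonsense", "frameshift", "canonical_splice"}


@dataclass
class Call:
    consequence: str                    # e.g. "missense", "nonsense"
    reported_pathogenic: bool = False   # previously reported as pathogenic


def is_anchor(call):
    """True for a null variant or a previously reported pathogenic variant."""
    return call.consequence in NULL_CONSEQUENCES or call.reported_pathogenic


def follow_up_recessive(calls):
    """Select a recessive gene for parental Sanger sequencing when at least
    one of the proband's variants in that gene is an anchor variant."""
    return any(is_anchor(call) for call in calls)
```

With this rule, `follow_up_recessive([Call("missense"), Call("frameshift")])` returns `True`, whereas a pair of novel missense variants, `follow_up_recessive([Call("missense"), Call("missense")])`, returns `False`. The second case is the compound heterozygous configuration that the proband-only strategy is at risk of missing.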

Each year, 259 (by OMIM)–281 (by Orphanet) new disease genes are being discovered.14 This is the major deficiency of our test, which targets only known disease genes. Frequent updates of the medical exome, including newly discovered disease genes, will significantly improve the diagnostic yield. Certainly our approach lacks the possibility of discovering or contributing to the discovery of new disease genes, even though it is possible to identify disease-causing variants for novel phenotypes in known disease genes. Our retrospective data demonstrated the utility of this practice by showing a reasonable diagnostic rate for unselected patient populations. This practice remains meaningful for countries with limited financial and clinical resources until WES and WGS become affordable for developing countries and clinical resources are adequate.

The Saudi Mendeliome Group15 previously demonstrated the success of a similar strategy by utilizing broadly designed panels instead of WES in a population enriched for consanguinity. In that population, the majority (~72%) of clinically relevant variants are detected in genes responsible for recessive disorders. As expected in our Chinese outbred population, variants in autosomal recessive (AR) genes constitute a much smaller proportion (31%), whereas the proportions of variants in autosomal dominant (AD; 53%) and X-linked (XL; ~15%) genes are much larger than those reported for the inbred population (24% and 4%, respectively). POMES proved that targeting known disease genes is clinically valid for outbred populations as well. Moreover, by using a medical exome that covers much larger genomic regions than panels, we are able to detect copy-number variations (both intergenic and intragenic) from the sequencing data (eight positive cases were due to pathogenic copy-number variations in our study). Since the ordering physicians do not need to choose a panel, the test is less dependent on clinical expertise and offers a better opportunity to identify clinically relevant variants in patients with atypical presentation or variants associated with novel phenotypes.