Molecular diagnosis of non-syndromic hearing loss patients using a stepwise approach

Hearing loss is one of the most common birth disorders in humans, with an estimated prevalence of 1–3 in every 1000 newborns. This study investigates the molecular etiology of a hearing loss cohort using a stepwise strategy to effectively diagnose patients and address the challenges posed by the genetic heterogeneity and variable mutation spectrum of hearing loss. To target known pathogenic variants, multiplex PCR plus next-generation sequencing was applied in the first step; patients who did not receive a diagnosis from this step were referred for exome sequencing. A total of 92 unrelated patients with nonsyndromic hearing loss were enrolled in the study. In total, 64% (59/92) of the patients were molecularly diagnosed, 44 of them in the first step by multiplex PCR plus sequencing. Exome sequencing resulted in eleven diagnoses (23%, 11/48) and four probable diagnoses (8%, 4/48) among the 48 patients who were not diagnosed in the first step. The rate of secondary findings from exome sequencing in our cohort was 3% (2/58). This research presents a molecular diagnosis spectrum of 92 non-syndromic hearing loss patients and demonstrates the benefits of using a stepwise diagnostic approach in the genetic testing of nonsyndromic hearing loss.


Forty-four diagnoses by multiplex PCR.
In the first step, the patients were tested via multiplex PCR.
The tests yielded a positive result in 44 of the 92 patients, while eight were inconclusive and 40 negative (Fig. 1). The genotypes of the 44 patients who tested positive are listed in Table 2. Classifications of these variants are presented in Table 3. There were 27 patients with a mutation in GJB2, 15 in SLC26A4, 1 with a dual molecular diagnosis of both GJB2 and SLC26A4, and 1 with a mutation in MT-RNR1. The patient who was positive for a homoplasmic m.1555A>G in the MT-RNR1 gene had a history of aminoglycoside exposure. Homozygous NM_004004.6:c.235delC in the GJB2 gene was the most prevalent genotype, accounting for 11% (10/92) of the study cohort. NM_004004.6:c.109G>A in GJB2 was identified in 10 of the 44 patients with positive genotypes, including three patients who were homozygous and seven patients who were compound heterozygous. One patient was homozygous for both NM_000441.2:c.919-2A>G in SLC26A4 and NM_004004.6:c.109G>A in GJB2. This patient was clinically diagnosed with deafness and enlarged vestibular aqueducts, a phenotype which can be caused by deficiency of the two genes together.

Table 1. Characteristics of the study cohort (a). [...]
Post-lingual (> 3 years): 12 (13)
Laterality: Bilateral symmetric 69 (75); Bilateral asymmetric 17 (18); No record 6 (7)
Aminoglycoside exposure: Yes 4 (4); No 63 (68); Uncertain 25 (27)
(a) All members in this cohort have bilateral hearing loss; the precise type of hearing loss is not always recorded; most recorded cases are sensorineural. (b) Severity is determined by the hearing level of the better ear; the WHO grading rule is adopted (Mild: 26- [...]

Fifteen diagnoses/probable diagnoses by exome sequencing. Two groups of patients (n = 58) were referred for exome sequencing. Group 1 comprised the 48 patients who received inconclusive or negative genotypes from the multiplex PCR (Fig. 1). Group 2 consisted of 10 patients who were either homozygous or compound heterozygous for NM_004004.6:c.109G>A in GJB2. Due to the variable expressivity and incomplete penetrance of NM_004004.6:c.109G>A in GJB2 [13], these 10 patients were referred for exome sequencing in order to exclude other potential molecular etiologies.
In the first group, exome sequencing resulted in eleven diagnoses (23%, 11/48) and four probable diagnoses (8%, 4/48) (Table 4). No other causally associated variants related to hearing loss were identified in the second group. It is worth noting that patient P27 was first identified as homozygous for NM_001038603.3(MARVELD2):c.1208_1211delGACA by exome sequencing. Given that the family history did not indicate consanguinity, we performed CNV analysis of the exome data from this patient as well as the other fifty-seven patients in step 2. The exome data revealed a heterozygous deletion of exons 3 to 5 in the MARVELD2 gene, which was also verified by qPCR. Variant c.1208_1211delGACA is located in exon 4, which lies within this deletion. We thus conclude that c.1208_1211delGACA is hemizygous in this case.
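The reasoning applied to patient P27 can be illustrated with a minimal sketch. It is not the pipeline used in the study; the `Deletion` class and the `resolve_zygosity` function are hypothetical names introduced here, and exon-level coordinates are assumed to be available for both the variant and the CNV call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    """A heterozygous exon-level deletion, given as an inclusive exon range."""
    gene: str
    first_exon: int
    last_exon: int


def resolve_zygosity(gene, exon, called_zygosity, deletions):
    """Reinterpret an apparently homozygous call in light of CNV data.

    If the variant's exon lies inside a heterozygous deletion of the same
    gene, only one allele is present at that position, so a call that looks
    homozygous is in fact hemizygous.
    """
    if called_zygosity != "homozygous":
        return called_zygosity
    for deletion in deletions:
        if deletion.gene == gene and deletion.first_exon <= exon <= deletion.last_exon:
            return "hemizygous"
    return "homozygous"


# Values taken from the text: deletion of exons 3 to 5, variant in exon 4.
cnvs = [Deletion("MARVELD2", 3, 5)]
print(resolve_zygosity("MARVELD2", 4, "homozygous", cnvs))
```

A check of this kind is most informative when, as here, a homozygous call appears in a family with no reported consanguinity.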
Further analysis of the exome sequencing data was carried out to discover secondary findings. The cohort had two pathogenic variants from among the 59 secondary findings genes recommended by the ACMG [14] (Supplementary Table S2). A heterozygous variant, GLA(NM_000169.3):c.1067G>A (p.Arg356Gln), was identified in patient P21 (female). It is related to Fabry disease, an X-linked inborn error of glycosphingolipid catabolism resulting from deficient or absent activity of the lysosomal enzyme alpha-galactosidase A [15]. The second pathogenic variant was RYR1(NM_000540.3):c.6502G>A (p.Val2168Met), which is associated with malignant hyperthermia. This variant was identified in P77 in a heterozygous state. The rate of secondary findings in our cohort was 3% (2/58), comparable to previously published rates of 1.8% to 4.6% [16][17][18][19].
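The percentages reported in this section follow from the patient counts given in the text. The short tally below is only a worked check of that arithmetic; the variable names and the `pct` helper are introduced here for illustration.

```python
# Counts as reported in the text.
cohort = 92
step1_diagnosed = 44                          # multiplex PCR plus sequencing
step1_undiagnosed = cohort - step1_diagnosed  # 8 inconclusive + 40 negative
exome_diagnosed = 11
exome_probable = 4
gjb2_c109_referred = 10                       # diagnosed in step 1, still referred
exome_tested = step1_undiagnosed + gjb2_c109_referred
secondary_findings = 2


def pct(numerator, denominator):
    """Percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


total_diagnosed = step1_diagnosed + exome_diagnosed + exome_probable

print("step 1 yield:", pct(step1_diagnosed, cohort), "%")
print("exome diagnoses:", pct(exome_diagnosed, step1_undiagnosed), "%")
print("exome probable diagnoses:", pct(exome_probable, step1_undiagnosed), "%")
print("overall yield:", pct(total_diagnosed, cohort), "%")
print("secondary findings:", pct(secondary_findings, exome_tested), "%")
```

Note that the overall figure of 59 counts the four probable diagnoses together with the confirmed ones.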

Discussion
This study applied a stepwise genetic testing approach to explore molecular diagnoses in an NSHL cohort, achieving a diagnostic yield of 64% (59/92). Although diagnostic yield varies between patient cohorts and depends on the detection methods used, our diagnostic yield (64%) is comparable to that of a multi-ethnic cohort tested using exome sequencing (56%) [9].
This study uncovered the etiology of 44 patients in the first step using a commercial multiplex PCR kit, providing a rapid molecular diagnosis and saving the cost of exome sequencing. The multiplex PCR contains amplicons [20][21][22]. Compared to a single test of GJB2, which is frequently used as the first-tier test to exclude hotspot variants before exome sequencing [6,9], a multiplex PCR sequencing approach appears to be both efficient and cost-effective, as it is more flexible and can detect hotspot variants across multiple deafness-related genes.
It should be noted that the diagnostic rate of the multiplex PCR assay in the first step could vary dramatically between populations, depending on the prevalence of hotspot variants in the targeted patients. In this study, the high diagnostic rate was attributable to the enrichment of NM_004004.6(GJB2):c.109G>A, NM_004004.6(GJB2):c.235delC, NM_004004.6(GJB2):c.299_300delAT, NM_000441.2(SLC26A4):c.919-2A>G, and NM_000441.2(SLC26A4):c.1229C>T in the Chinese population [23,24]. More importantly, ethnic background is relatively uniform in China. The inherent ethnic bias of the multiplex PCR assay is a drawback, and the assay might be inappropriate for a racially and ethnically diverse population [25].
Allelic heterogeneity is common in hearing loss and is associated with clinical phenotype heterogeneity, with both syndromic hearing loss and NSHL being caused by mutations within the same gene [26]. In this study, although we only enrolled patients with nonsyndromic hearing loss, deafness-related variants were also identified in syndromic genes. P44 was heterozygous for a disease-causing nonsense variant in the MITF gene, and P111 was heterozygous for a disease-causing missense variant in the SOX10 gene. Both variants are associated with autosomal dominantly inherited Waardenburg syndrome, which is considered an NSHL mimic. The variability of phenotypes makes clinical diagnosis and variant interpretation in genetic hearing loss challenging [27].
The molecular diagnosis of NSHL is made yet more challenging by the variable expressivity and high prevalence of NM_004004.6:c.109G>A in the GJB2 gene [13]. In our cohort, one patient with enlarged vestibular aqueducts was found to be homozygous for both NM_000441.2:c.919-2A>G in SLC26A4 and NM_004004.6:c.109G>A in GJB2 in the first diagnosis step. The genotype-phenotype consistency led us to consider NM_000441.2:c.919-2A>G in SLC26A4 as the disease-causing variant. However, we cannot rule out the possibility of a blended phenotype arising from the two variants [28]. By contrast, 10 patients who were diagnosed with only NM_004004.6:c.109G>A in GJB2 in the first step were referred for exome sequencing, and no other potential molecular explanations were identified. These results indicate the importance of integrating phenotype and genotype in practice and of considering dual molecular diagnoses.
Copy number variations are common causes of nonsyndromic hearing loss [29]. Exome sequencing data can be analyzed for CNVs, although such analyses suffer from low sensitivity and uncertain specificity [30]. In this study, one patient in our cohort was found to have an SNV compounded with a CNV in MARVELD2. The homozygous c.1208_1211delGACA (p.Arg403Lysfs*11) in the MARVELD2 gene was initially thought to be the causal etiology. The lack of consanguineous history led us to reanalyze the coverage depth of the exons in the MARVELD2 gene, resulting in the identification of a deletion of exons 3 to 5 (EX3_EX5 Del). This finding highlights the importance of considering CNV deletions in a non-consanguineous family when a pathogenic variant is identified in a homozygous state.
It is worth noting that 17% (16/92) of the patients passed a newborn hearing screening at birth but developed hearing loss at a later stage. Seven of these patients received a positive molecular diagnosis, with variants in the GJB2 and SLC26A4 genes (Supplementary Table S3). These results are consistent with recent findings showing that newborns with positive genotypes can be missed by physiologic newborn hearing screens but identified by genetic screens, highlighting the necessity of concurrent hearing and genetic screening in newborns [21,31].
This study has several limitations which should be noted. First, the stepwise approach may miss a dual molecular diagnosis if a patient is diagnosed in the first step. A dual molecular diagnosis was reported in 4.9% of patients with multiple phenotypic traits [28]. For single-phenotype traits such as NSHL, dual molecular diagnosis might be rare, but this is worth considering when deciding on a diagnostic approach. Second, CNV analysis of the STRC gene is absent from our exome data pipelines due to the presence of a pseudogene. This might result in an underestimate of the contribution of disease-causing CNVs in this study.
In conclusion, this work demonstrates the benefits of a stepwise approach to diagnosing non-syndromic hearing loss patients. Instead of starting with exome sequencing, multiplex PCR targeting hotspot variants across multiple genes provided a molecular etiology for 48% (44/92) of the Eastern Asian patients in this cohort in a prompt and efficient manner. It seems likely that this will result in significant savings, but a future cost-effectiveness analysis will be needed to address that question.

Participants.
A total of 92 patients of Eastern Asian ethnicity with NSHL were recruited. No other visible phenotype was reported. We obtained informed consent from the patients. This study was approved by the Institutional Review Board of BGI. All methods were performed in accordance with the relevant guidelines and regulations.
Study design. The enrolled patients were first assayed with a commercial multiplex PCR to analyze common variants in the Asian population. Patients undiagnosed by the multiplex PCR assay were then referred for exome sequencing. Because of the variable expressivity and penetrance of NM_004004.6:c.109G>A in GJB2, patients diagnosed with this variant were also referred for exome sequencing to explore other potential molecular etiologies.
Multiplex PCR. Genetic variant detection was carried out in all patients by applying multiplex PCR combined with next-generation sequencing. The commercial multiplex PCR kit (BGI Biotech, Wuhan, China) was designed to cover certain pathogenic variants of 22 genes, including the complete coding region of GJB2 and most of the coding regions of SLC26A4. Genomic DNA was extracted from 2 ml of peripheral blood using a DNA Extraction Kit (BGI Biotech, Wuhan, China).
Library preparation, sequencing and bioinformatics. PCR products were pooled to prepare a library. Briefly, ~3.5 μg of purified products were sheared by ultrasonoscope and quality-controlled using an Agilent Bioanalyzer DNA 2100 kit (Agilent, Santa Clara, CA, USA). Subsequently, end-repair and A-tailing were performed before adapters were ligated to both ends of the fragments. Finally, the adapter-ligated products were amplified by 8-cycle PCR and purified using Agencourt AMPure XP beads (Beckman Coulter, Fullerton, CA, USA). The prepared libraries were subjected to single-strand circularized DNA and DNA nanoball preparation before being sequenced on a BGISEQ-500 sequencer (BGI, Shenzhen, China) with PE50 [32]. Raw sequence reads were mapped to the human reference genome (hg19) using Bowtie 2.3.3, with SAMtools 1.6 used to create BAM and index files. For variant calling, the Genome Analysis Toolkit (GATK 3.7) [33] was used to analyze the alignment data.
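The mapping and variant-calling steps named above can be sketched as a sequence of command lines. This is an illustration under stated assumptions, not the authors' pipeline: the text does not give the options used or name the GATK tool, so the choice of HaplotypeCaller, the read-group tags, and every file name, index prefix and jar path below are hypothetical. The helper `build_pipeline` only assembles the commands; it does not run them.

```python
import shlex


def build_pipeline(sample, reads_1, reads_2,
                   bt2_index="hg19",               # hypothetical Bowtie 2 index prefix
                   ref_fasta="hg19.fa",            # hypothetical reference FASTA
                   gatk_jar="GenomeAnalysisTK.jar"):  # hypothetical GATK 3.x jar path
    """Return the command lines for align -> sort -> index -> call."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf"
    return [
        # Paired-end alignment; read-group tags are added because GATK
        # expects them in the BAM header.
        ["bowtie2", "-x", bt2_index, "-1", reads_1, "-2", reads_2,
         "--rg-id", sample, "--rg", f"SM:{sample}", "-S", sam],
        # Coordinate-sort into BAM, then build the BAM index.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Assumption: HaplotypeCaller, using GATK 3.x command-line syntax.
        ["java", "-jar", gatk_jar, "-T", "HaplotypeCaller",
         "-R", ref_fasta, "-I", bam, "-o", vcf],
    ]


commands = build_pipeline("P01", "P01_1.fq.gz", "P01_2.fq.gz")
for command in commands:
    print(shlex.join(command))
```

Each inner list could be passed to `subprocess.run(command, check=True)`; GATK would additionally need the reference FASTA to have `.fai` and `.dict` companion files.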
Exome sequencing and data analysis. Exome sequencing was performed following standard manufacturer protocols on a BGISEQ-500 platform. An in-house bioinformatics pipeline was employed to process the variant call format (VCF) files and to retain variants of potential clinical usefulness, including (i) variants with minor allele frequency (MAF) < 1% and (ii) variants in genes with an OMIM disease entry. We interpreted variants in 130 genes curated by ClinGen Expert as having a limited-to-definitive relationship to hearing loss [34]. This interpretation was based on the ClinGen Expert Specification of the ACMG/AMP Variant Interpretation Guidelines for Genetic Hearing Loss [27].
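The two retention criteria can be expressed as a simple predicate. The in-house pipeline itself is not described, so this is only a sketch: it assumes that both criteria must hold together, that variants arrive as dictionaries with `gene` and `maf` keys, and that a variant with no recorded frequency counts as rare. The function name, the data layout and the MAF values in the example are illustrative placeholders, not data from the study.

```python
def keep_variant(variant, omim_genes, maf_cutoff=0.01):
    """Retain a variant if it is rare and lies in a gene with an OMIM disease entry.

    Assumption: a missing MAF (variant absent from the frequency database)
    is treated as rare.
    """
    maf = variant.get("maf")
    is_rare = maf is None or maf < maf_cutoff
    return is_rare and variant["gene"] in omim_genes


# Placeholder inputs for illustration only.
omim_genes = {"GJB2", "SLC26A4", "MARVELD2"}
variants = [
    {"gene": "MARVELD2", "maf": 0.0001},    # rare, OMIM gene -> kept
    {"gene": "GJB2", "maf": 0.05},          # too common -> dropped
    {"gene": "GENE_WITHOUT_OMIM", "maf": None},  # no OMIM entry -> dropped
    {"gene": "SLC26A4", "maf": None},       # unrecorded MAF, OMIM gene -> kept
]
retained = [v for v in variants if keep_variant(v, omim_genes)]
print([v["gene"] for v in retained])
```

If the two criteria were instead meant as alternatives, the `and` in the return statement would become `or`.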
Definition of molecular diagnosis. Patients were categorized as "positive" or "diagnosed" if they were homozygous or compound heterozygous for pathogenic/likely pathogenic variant(s) in a recessively inherited gene, or heterozygous for a pathogenic/likely pathogenic variant in a dominantly inherited gene. In addition, patients with a pathogenic/likely pathogenic variant plus a rare VUS in a recessively inherited gene were considered "probably diagnosed". Patients with only a single pathogenic/likely pathogenic variant in a recessively inherited gene were considered "inconclusive". Patients with a variant in a gene that was not inherited recessively or dominantly, for example in mitochondrial genes, were categorized as "diagnosed" if the phenotype was consistent with the genotype.
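These categories can be written as a small decision function. It is a sketch of the rules as stated, with several assumptions labeled: the function name and argument layout are hypothetical, a recessive diagnosis is modeled as two pathogenic/likely pathogenic (P/LP) alleles in the same gene, a patient with no reportable variant is labeled "negative", and a mitochondrial variant without a matching phenotype is labeled "inconclusive" because the text does not define that case.

```python
def classify(inheritance, plp_alleles, rare_vus_alleles=0, phenotype_matches=False):
    """Assign a diagnostic category for one gene in one patient.

    inheritance: "recessive", "dominant", or "mitochondrial".
    plp_alleles: number of pathogenic/likely pathogenic alleles in the gene.
    rare_vus_alleles: number of rare VUS alleles in the same gene.
    phenotype_matches: whether the phenotype fits the genotype
                       (only used for mitochondrial genes).
    """
    if plp_alleles == 0:
        return "negative"  # assumption: no reportable variant
    if inheritance == "recessive":
        if plp_alleles >= 2:
            return "diagnosed"           # homozygous or two different P/LP alleles
        if rare_vus_alleles >= 1:
            return "probably diagnosed"  # one P/LP allele plus a rare VUS
        return "inconclusive"            # single P/LP allele only
    if inheritance == "dominant":
        return "diagnosed"
    if inheritance == "mitochondrial":
        # Assumption: without a matching phenotype the case is left inconclusive.
        return "diagnosed" if phenotype_matches else "inconclusive"
    raise ValueError(f"unknown inheritance mode: {inheritance}")


print(classify("recessive", 2))
print(classify("recessive", 1, rare_vus_alleles=1))
print(classify("recessive", 1))
print(classify("mitochondrial", 1, phenotype_matches=True))
```

A hemizygous P/LP variant uncovered by a deletion on the other allele, as in patient P27, would be passed as two P/LP alleles under this model.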
Sanger validation and qPCR. Sanger sequencing was carried out to validate SNPs/Indels detected by either multiplex PCR or exome sequencing. All PCR products were sequenced on an ABI 3730XL DNA Analyzer. Mutations were confirmed by comparing our sequencing data with the UCSC human reference sequences [35]. Verification of exon-level deletions or duplications called by exome sequencing was carried out by qPCR. The qPCR methodology has been previously described [36]. The primer pair sequences are shown in Supplementary Table S1.