Introduction

More than 0.5 billion people are known to be affected by hearing loss (HL) worldwide [1], and the figure is expected to reach approximately 2.5 billion by 2050 (https://www.who.int). This medical condition is known to cause various adverse effects in the affected individuals [2,3,4]. The etiology of HL involves genetic causes, nongenetic causes, and a combination of these two factors [5, 6]. It has been estimated that about 60% of the patients with HL have hereditary hearing loss (HHL) [7], and the genetic causes vary dramatically across different ethnic populations globally [8, 9].

HHL is highly heterogeneous in both genotype and phenotype. To date, more than 140 HL genes have been identified, and their inheritance patterns include autosomal recessive (AR), autosomal dominant (AD), X-linked, and mitochondrial inheritance (http://hereditaryhearingloss.org). Consequently, the clinical manifestations of HHL are diverse. HL may be sensorineural, conductive, or mixed; its severity may be mild, moderate, severe, or profound; and it can occur at any age. Apart from simple HL, genetic causes can also lead to syndromic hearing loss [6]. In addition to monogenic inheritance, digenic inheritance has been reported in HL patients [10,11,12], although other reports did not support a digenic inheritance pattern in HL [13, 14].

Given the large number of HL genes, the advent and development of high-throughput sequencing technology has revolutionized the identification of the molecular etiology of HHL [15, 16]. Massively parallel sequencing has become an efficient routine method for diagnosis and research in this field [17]. As a result, more patients have received positive diagnoses involving HL genes other than the common ones (GJB2, SLC26A4, and MT-RNR1 in the Chinese population) [18, 19].

In the present study, we focused on monogenic inheritance in HL and aimed to assess the contribution of genetic factors to HL in a large cohort from China and to identify the gene spectrum in this cohort. We enrolled 1547 subjects from China, including 1027 patients with bilateral HL and 520 healthy volunteers with normal hearing, and tested them using targeted genome enrichment and massively parallel sequencing of 227 HL-related genes. These results should enhance our understanding of the molecular etiology of HL in the Chinese population, help guide medical care, and facilitate genetic counseling for patients and their family members [20].

Materials and methods

Subjects

The Ethics Committee of Chinese PLA General Hospital approved this study (No. S2016-120-01), which was performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants or, for minors, from their parents.

In addition to the 520 healthy volunteers with normal hearing, 1027 unrelated probands with bilateral hearing loss were enrolled. The probands had been referred to the genetic testing center for deafness during 2015–2017, and all were tested for the common HL genes, including GJB2, SLC26A4, and MT-RNR1 (m.A1555G, m.C1494T), by Sanger sequencing.

The audiological evaluation was performed by pure tone audiometry. For subjects who could not undergo pure tone audiometry, auditory steady-state response, behavioral auditory testing, or auditory brainstem response was measured. Hearing level was defined as the average threshold at 0.5, 1, 2, and 4 kHz in the better ear for pure tone audiometry, auditory steady-state response, and behavioral auditory testing, or as the response threshold for auditory brainstem response. Other audiometric tests, including otoacoustic emissions and 40 Hz auditory event-related potentials (40 Hz AERP), were recommended if required. The severity of HL was graded as follows: mild (26–40 dB), moderate (41–55 dB), moderately severe (56–70 dB), severe (71–90 dB), and profound (>90 dB) [21]. Asymmetric hearing was defined as a difference of more than 15 dB between the two ears in the mean level at 0.5, 1, 2, and 4 kHz or at three contiguous frequencies [22,23,24].
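The grading and asymmetry rules in this paragraph can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the dictionary layout of the thresholds, the label "normal" for averages of 25 dB or less, and the handling of non-integer averages that fall between two listed ranges are all assumptions made for illustration. Only the four-frequency variant of the asymmetry rule is shown.

```python
from statistics import mean

# Frequencies (kHz) averaged to obtain the hearing level, as stated in the text.
PTA_FREQS_KHZ = (0.5, 1, 2, 4)


def hearing_level(thresholds_db):
    """Average threshold (dB) over 0.5, 1, 2, and 4 kHz for one ear.

    `thresholds_db` is a hypothetical dict mapping frequency (kHz) to threshold (dB).
    """
    return mean(thresholds_db[f] for f in PTA_FREQS_KHZ)


def grade_severity(level_db):
    """Map an average threshold to the severity grades listed in the text.

    Assumptions: averages of 25 dB or less are labelled "normal", and a
    non-integer average between two listed ranges (e.g. 40.5 dB) is assigned
    to the higher grade.
    """
    if level_db <= 25:
        return "normal"
    if level_db <= 40:
        return "mild"
    if level_db <= 55:
        return "moderate"
    if level_db <= 70:
        return "moderately severe"
    if level_db <= 90:
        return "severe"
    return "profound"


def grade_patient(left_db, right_db):
    """Grade severity from the better (lower-threshold) ear."""
    better = min(hearing_level(left_db), hearing_level(right_db))
    return grade_severity(better)


def is_asymmetric(left_db, right_db, cutoff_db=15):
    """True if the four-frequency means of the two ears differ by more than 15 dB."""
    return abs(hearing_level(left_db) - hearing_level(right_db)) > cutoff_db
```

For a hypothetical patient with left-ear thresholds of 60, 65, 70, and 75 dB and right-ear thresholds of 40, 45, 50, and 55 dB, the better-ear average is 47.5 dB, so the sketch returns "moderate" and flags the hearing as asymmetric (20 dB difference).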

High-resolution computed tomography of the temporal bones was performed to evaluate malformations of the inner ear structure. For patients with syndromic HL, other physical examinations were recommended if required.

Healthy volunteers underwent a physical examination that included body temperature, height, body weight, pulse, blood pressure, electrocardiogram, transabdominal ultrasound, chest X-ray, and psychiatric, neurologic, otolaryngological, and optical examinations. The hearing level determined by pure tone audiometry was below 25 dB in both ears.

DNA was extracted from the peripheral leukocytes of each subject and their family members using a standard protocol.

Sequencing

The following 227 HL-related genes were included in this study: 60 genes related to AR non-syndromic HL, 27 genes related to AD non-syndromic HL, 5 genes related to X-linked HL, 34 genes related to syndromic HL, and 101 other genes related to genetic diseases with an HL phenotype recorded in Mendelian Inheritance in Man.

The Agilent SureDesign online tool (https://erray.chem.agilent.com/suredesign/) was used to design probes targeting all exons, flanking intronic sequences (±10 bp), and known pathogenic intronic variants of the 227 HL-related genes. In total, 4544 regions encompassing 1.101 Mbp of the genome were targeted. The Ion Plus Fragment Library Kit (Agilent Technologies, Santa Clara, CA) was used for library preparation, with DNA fragments of approximately 170 bp. SureDesign hybridization capture technology (Agilent Technologies, Santa Clara, CA) was applied following the manufacturer's instructions. The prepared DNA samples were sequenced on a JingXin BioelectronSeq 4000 System semiconductor sequencer (CFDA registration permit No. 20153400309).

Bioinformatics analysis

The Torrent Suite Software v5.4 (Thermo Fisher Scientific, Waltham, MA) analysis pipeline was used to produce high-quality read files. After quality control, the sequence reads were aligned to the human reference genome (hg19) with the Torrent Mapping Alignment Program (3.6.40). Picard (1.84) was used to remove duplicate reads. Torrent Variant Caller software v5.4–11 was used to detect single nucleotide variants (SNVs) and insertions and deletions (INDELs).

Variant interpretation

Detected variants with a read depth <5X were filtered out. ANNOVAR (20170601) was used to annotate the variants. Variants with a minor allele frequency >0.05 in population databases, including dbSNP (http://www.ncbi.nlm.nih.gov/snp) (20170929), the 1000 Genomes Project (http://www.browser.1000genomes.org) (20150824), and the Genome Aggregation Database (gnomAD; http://gnomad.broadinstitute.org) (20170311), were filtered out. We used SIFT (20170221), PolyPhen-2 (20170221), MCAP (20170221), REVEL (20161205), MutationTaster (20170221), and PROVEAN (20170221) to predict the deleteriousness of the variants.
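The two hard filters described above (read depth and population frequency) can be sketched as follows. This is an illustrative sketch, not the pipeline used in the study: the record layout, the function name, and the rule that a variant is dropped when any one database reports a frequency above the cutoff are assumptions, since the text does not state how the databases were combined.

```python
MIN_DEPTH = 5    # variants with read depth < 5X are filtered out
MAX_MAF = 0.05   # variants with minor allele frequency > 0.05 are filtered out


def passes_filters(variant):
    """Return True if a variant survives the depth and frequency filters.

    `variant` is a hypothetical dict with two keys:
      "depth": read depth at the variant site
      "maf":   dict mapping database name to minor allele frequency,
               or None when the variant is absent from that database
    """
    if variant["depth"] < MIN_DEPTH:
        return False
    reported = [f for f in variant["maf"].values() if f is not None]
    # Assumption: exceeding the cutoff in any single database removes the
    # variant; variants absent from every database are retained.
    return all(f <= MAX_MAF for f in reported)


def filter_variants(variants):
    """Keep only the variants that pass both filters."""
    return [v for v in variants if passes_filters(v)]
```

Variants that are absent from all three databases are kept by this sketch, which matches the intent of retaining rare and novel candidates for interpretation.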

The candidate variants were further interpreted by considering their allele frequency in the control group of 520 individuals with normal hearing and by referring to the Deafness Variation Database (2020-07-30) (http://deafnessvariationdatabase.org), ClinVar (2020-07-30) (http://www.ncbi.nlm.nih.gov/clinvar), and our internal database. Variants were considered novel if they had not been previously reported in databases including ClinVar and dbSNP. The novel variants identified in this study were submitted to the ClinVar database. Furthermore, the correlation between candidate variants and the phenotype of the affected individuals was considered on a patient-by-patient basis. Variants were confirmed by Sanger sequencing in the families. For de novo variants, paternity and maternity were verified by short tandem repeat genotyping. Finally, the variants were classified according to the ACMG guidelines and the specification of these guidelines for HHL [25, 26].

Splicing assay

For some of the detected splice variants, a minigene assay was performed to validate the impact on splicing [27, 28]. Paired minigene clones carrying either the wild-type sequence or the variant sequence of interest were transfected separately into HEK-293T cells.

Statistical analysis

Chi-squared tests were performed to compare groups using SPSS Statistics 25. Statistical significance was defined as P < 0.05.
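As an illustration of this kind of comparison, the sketch below computes a Pearson chi-squared test on a 2×2 table of diagnosed and undiagnosed counts in two subgroups. The study used SPSS Statistics 25; this standalone Python function is only a hedged stand-in, it applies no continuity correction (the text does not say whether one was used), and the function name and table layout are assumptions.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    Table layout:        diagnosed   undiagnosed
        subgroup 1           a            b
        subgroup 2           c            d
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

With hypothetical counts of 30 diagnosed and 10 undiagnosed in one subgroup versus 10 and 30 in the other, the statistic is 20.0 and the P value falls well below the 0.05 threshold.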

Results

Targeted capture sequencing

We tested 1547 subjects using targeted genome sequencing. On average, 99%, 98.7%, 98%, and 97% of the targeted bases of the 227 HL-related genes (Table S1) were covered at 1X, 5X, 10X, and 20X, respectively.
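Coverage figures of this kind are typically derived from per-base read depths over the targeted regions. The following is a minimal sketch for a single sample, assuming the depths are already available as a list of integers; the function name and input format are hypothetical, and the per-sample results would still have to be averaged across subjects to obtain cohort-level values.

```python
def coverage_breadth(depths, thresholds=(1, 5, 10, 20)):
    """Fraction of targeted bases covered at or above each depth threshold.

    `depths` is a hypothetical list holding the read depth of every targeted
    base in one sample.
    """
    if not depths:
        raise ValueError("depths must contain at least one targeted base")
    total = len(depths)
    return {t: sum(d >= t for d in depths) / total for t in thresholds}
```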

Genetic diagnosis

Table 1 presents the clinical information of the patients. Healthy volunteers included 352 male and 168 female subjects, aged 18 to 58 years, with an average of 30.79 ± 9.15 years.

Table 1 Clinical information of 1027 patients included in this study

Of the 1027 HL patients, a genetic cause was identified in 588 patients when pathogenic or likely pathogenic variants were considered. Another 48 patients were classified as uncertain: at least one variant of uncertain significance (VUS) was identified in one allele of an HL gene, even if a pathogenic/likely pathogenic variant was detected in the other allele of an AR gene. In one of these cases, a VUS in POU4F3 and a causative variant in GJB2 were detected simultaneously. The remaining 392 patients were categorized as undiagnosed (Fig. 1).

Fig. 1 The molecular diagnostic yields of the 1027 HL patients from China in this study

In total, 35 HL genes were implicated in the diagnosed patients, and the two leading genes were SLC26A4 (278/588) and GJB2 (207/588), as previously reported [29]. Causative variants in MT-RNR1 were detected in 19 patients (18 cases with m.A1555G and one case with m.C1494T). These three genes are considered the common HL genes in China [19, 29].

The 32 uncommon genes accounted for the remaining 86 diagnosed patients (causative variants in SLC26A4 and COL3A4 were simultaneously identified in one patient). Genes detected in at least three patients included MYO15A (17/86), MITF (7/86), OTOF (7/86), POU3F4 (5/86), PTPN11 (5/86), TMC1 (3/86), LARS2 (3/86), PAX3 (3/86), EYA1 (3/86), CHD7 (3/86), MYO7A (3/86), and USH2A (3/86). In this subgroup of 86 cases, 55 were diagnosed with non-syndromic HL and 27 with syndromic HL (Table 2). The other four patients carried variants in USH2A or CLRN1, genes responsible for Usher syndrome, but showed no ophthalmic phenotype and were classified as non-syndromic HL (NSHL) mimics [30].

Table 2 The syndromic HL detected in this study

Variant identification

We identified 20 pathogenic/likely pathogenic variants in GJB2 (Table S2) as the genetic cause in 207 patients. The two leading causative variants were NM_004004.6: c.235delC and c.299-300del, which were detected in 84.06% (174/207) and 32.37% (67/207) of this patient subgroup, respectively.

Next, 85 pathogenic/likely pathogenic variants in SLC26A4 (Table S3), of which 26 had not been previously reported, were identified as the underlying molecular etiology in 278 patients diagnosed with Pendred syndrome or simple HL with enlarged vestibular aqueduct. The two most common causative variants of SLC26A4 were NM_000441.2: c.919-2A>G and c.2168G>A, which were detected in 76.26% (212/278) and 21.22% (59/278) of the SLC26A4-related patients, respectively. Table S4 presents the phenotype information of the patients with HL caused by variants in GJB2 and SLC26A4.

We also identified 117 pathogenic/likely pathogenic variants in 32 uncommon HL genes as the molecular causes in 86 patients (Table S5), of which 19 were de novo variants in AD or X-linked HL genes.

We further identified 64 VUS in 24 HL genes in 48 patients, all of which were point variants. In addition, 20 pathogenic/likely pathogenic variants were identified in this patient subgroup (Table S6).

Validation of two splice variants

The minigene assay showed that NM_016239.4: c.6956+9C>G in MYO15A trapped 4 nucleotides (nt) of intron 33, whereas c.8340+5G>A trapped introns 45 and 46, indicating that these two splice variants altered the splicing pattern of this gene (Fig. 2).

Fig. 2 The impact of two splice variants in the MYO15A gene on the splicing pattern. Wild-type: the amplification samples from cells transfected with the plasmid carrying the wild-type sequence of interest; Variant: the samples from transfectants carrying the variant sequence of interest; I: the variant c.6956+9C>G trapped 4 nucleotides (nt) of intron 33 of the MYO15A gene; II: the variant c.8340+5G>A trapped intron 45 and intron 46 of the MYO15A gene

Phenotypes and diagnostic rate

The impact of clinical phenotypes on the diagnostic rate was analyzed, including gender, onset/awareness age, family history, the severity of HL, the symmetry of the two affected ears, geographical location, and nationality (Fig. 3, Table S7). The diagnostic rate of probands with a family history was significantly higher than that of probands without a family history (73.87% vs. 60.07%, P < 0.01). Similarly, the diagnostic rate of patients with an onset/awareness age below five years was 62.69% (P < 0.01), and that of patients with syndromic HL was 89.23% (P < 0.005). These results indicated that genetic causes played a significant role in the etiology of these three subgroups.

Fig. 3 The impact of phenotypes on the diagnostic rate. Diagnostic rate (%) = number of diagnosed patients in the subgroup / total number of diagnosed and undiagnosed patients in the subgroup × 100%. Statistical significance is indicated as **P < 0.01

The diagnostic rate of patients with mild HL (29.41%), moderate HL (51.39%), or profound HL (58.51%) was significantly lower than that of patients with severe HL (69.89%) (P < 0.005, P < 0.005, and P < 0.01, respectively). This difference might be attributable to the patients with HL caused by variants in SLC26A4: when the SLC26A4-related patients were excluded from the diagnosed group, the diagnostic rate did not differ significantly among the subgroups with different HL levels (P > 0.05).

Of the 577 patients with high-resolution computed tomography imaging available (excluding the patients with enlarged vestibular aqueduct or incomplete partition type III, 297 cases altogether, which are highly correlated with SLC26A4 and POU3F4, respectively), 31 were diagnosed with inner ear malformation (Fig. S1) [31]. Of these 31 patients, 28 did not obtain a molecular diagnosis (Table S8), and only three patients, with cochlear hypoplasia type IV, were found to be related to the EYA1 gene. This result indicated the need to further study the etiology of these molecularly undiagnosed inner ear malformations [32, 33].

Discussion

The case-control study

The allele frequency of variants in an ethnicity-matched healthy population is very useful for variant classification [25, 26]. Herein, to explore the molecular etiology of HL in a large cohort from China, a case-control study was performed. For the detected variants, in addition to the allele frequency in publicly available population databases, the frequency in this control group was also considered (Tables S2, S3, S5, S6). For example, for NM_206933.4:c.8559-2A>G in USH2A, the allele frequency was 4/1040 in the control group (520 individuals) and 4/2054 in the patient group (1027 individuals). Based on the data in this study, this variant was classified as a VUS, whereas it is classified as pathogenic in the Deafness Variation Database. However, the control group was smaller than the patient group, which highlights the need to establish an HL variant database for the ethnicity-matched healthy population to improve variant interpretation.

For the patient group, there were no exclusion criteria other than the requirement of bilateral HL, which is considered more likely to be related to hereditary etiology than unilateral HL [30]. Furthermore, all patients were pre-screened for the common HL genes, including GJB2, SLC26A4, and MT-RNR1 (m.A1555G, m.C1494T), by Sanger sequencing. Thus, the diagnoses of HL caused by MT-RNR1 were obtained from this first-generation sequencing.

The diagnosis of the patients

In this study, three common genes, GJB2, SLC26A4, and MT-RNR1, accounted for 85.54% (503/588; 35.20%, 47.28%, and 3.23%, respectively) of the diagnosed patient group, while 32 uncommon HL-related genes accounted for the remaining 14.46% (85/588) of the diagnostic yield. By comparison, in a report of 459 HL patients from the United States, 28% (128) had positive genetic testing, and the five leading genes were GJB2, TMPRSS3, SLC26A4, MYO7A, and MT-RNR1 (16%, 10%, 8%, 7%, and 5%, respectively) [34]. Another report found that HL was genetic in 56% of 2198 HL patients from 491 Palestinian families, and the top five genes implicated were GJB2, MYO15A, SLC26A4, MYO7A, and CDH23 (22%, 11%, 8.9%, 8.3%, and 5%, respectively), with the most common variants being c.35delG in GJB2, c.1001G>T in SLC26A4, and c.7207G>T in MYO15A [35]. Although the inclusion criteria differed among these published reports, we can still speculate that the spectrum and frequency of molecular etiologies in this Chinese patient cohort differ from those of other populations [36,37,38].

Here, we focused on sequence variants located mostly in the exons of 227 HL-related genes. Other causative variants were not covered by this panel, including (1) variants located in other non-coding regions of the targeted genes; (2) other variant types, for example, copy number variants, which have been shown to be involved in HL; and (3) unknown novel HL genes [39, 40]. If the variant types not covered by this panel and the 48 patients with uncertain diagnoses were taken into account, the proportion of HHL in this patient cohort would be greater than 57.25% (588/1027).

Impact of phenotypes on the diagnostic rate

In the analysis of the impact of clinical phenotypes on the molecular diagnostic rate, 588 diagnosed patients and 370 undiagnosed patients were included, with diagnostic rate (%) = number of diagnosed patients in the subgroup / total number of diagnosed and undiagnosed patients in the subgroup × 100% (Table S7). In calculating the diagnostic rate, 22 undiagnosed patients with characteristic phenotypes that are highly correlated with HHL were excluded from the undiagnosed patient group (392): 18 patients with enlarged vestibular aqueduct carrying one pathogenic variant in SLC26A4, three patients diagnosed with Waardenburg syndrome, and one patient with the inner ear malformation IP-III. Therefore, 370 undiagnosed patients were taken into account in the analysis of the diagnostic rate.
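The bookkeeping in this paragraph can be reproduced with a short Python sketch. The function and variable names are hypothetical; the counts are taken from the text. The overall rate computed at the end is simply a by-product of the arithmetic and is not a figure reported in the text.

```python
def diagnostic_rate(diagnosed, undiagnosed):
    """Diagnostic rate (%) = diagnosed / (diagnosed + undiagnosed) x 100."""
    total = diagnosed + undiagnosed
    if total == 0:
        raise ValueError("subgroup contains no patients")
    return 100.0 * diagnosed / total


# Counts stated in the text.
DIAGNOSED = 588
UNDIAGNOSED_ALL = 392
EXCLUDED = 18 + 3 + 1  # EVA with one SLC26A4 variant, Waardenburg syndrome, IP-III

# Undiagnosed patients retained for the analysis.
undiagnosed_analyzed = UNDIAGNOSED_ALL - EXCLUDED  # 370

# Rate over all patients entering the phenotype analysis (derived, not reported).
overall_rate = diagnostic_rate(DIAGNOSED, undiagnosed_analyzed)
```

The same function applies to any phenotype subgroup once its diagnosed and undiagnosed counts are taken from Table S7.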

In the current study, affected individuals from minority nationalities comprised only 5.16% of the patient group. Similarly, probands from Northeast, South, Northwest, and Southwest China together accounted for 14.51% of the patients, patients with an onset/awareness age >5 years made up 7.40% of the whole patient group, and patients with mild or moderate HL comprised 9.05% of the patients (Table 1). These figures imply that more subjects from these subgroups should be included in future investigations to enhance our knowledge of HHL in these populations.

De novo variants identified in AD and X-linked HL genes

In the present study, of the 36 diagnosed patients with HL caused by variants in AD or X-linked HL genes, 20 carried 19 de novo variants in 7 genes. For example, the five variants in PTPN11 and the three variants in CHD7 detected in this study were all de novo. Due to the lack of probands and carrier status, prenatal screening for de novo variants is not yet available clinically. However, in 2019, a study reported non-invasive prenatal screening for a panel of causative genes for frequent dominant monogenic diseases using circulating cell-free fetal DNA [41]. This approach provided sensitivity and specificity at levels sufficient for transfer to clinical practice for screening of this type of variant.

Conclusion

In this study, 57.25% of the patient group obtained positive molecular diagnoses, with 35 causative genes involved. Of the 224 variants identified in the diagnosed patients, 83.04% (186/224) were related to AR inheritance, 12.95% (29/224) to AD inheritance, 3.12% (7/224) to X-linked inheritance, and 0.89% (2/224) to mitochondrial inheritance. Another 4.67% (48/1027) of the patients were categorized as having uncertain diagnoses with at least one VUS, which indicates that more strategies are required to classify VUS.