Highly efficient capture approach for the identification of diverse inherited retinal disorders

Our study presents a 319-gene panel targeting inherited retinal dystrophy (IRD) genes. Through a multi-center retrospective cohort study, we validated the assay’s effectiveness and clinical utility and characterized the mutation spectrum of Taiwanese IRD patients. Between January 2018 and May 2022, 493 patients in 425 unrelated families, all initially suspected of having IRD without prior genetic diagnoses, underwent detailed ophthalmic and physical examinations (with extra-ocular features recorded) and genetic testing with our customized panel. Disease-causing variants were identified by segregation analysis and clinical interpretation, with validation via Sanger sequencing. We achieved a read depth of >200× for 94.2% of the targeted 1.2 Mb region. Of the probands, 68.5% (291/425) received molecular diagnoses, and 53.9% (229/425) were resolved cases. Retinitis pigmentosa (RP) was the most prevalent initial clinical impression (64.2%), and the five most prevalent phenotypes (RP, cone-rod syndrome, Usher’s syndrome, Leber’s congenital amaurosis, and Bietti crystalline dystrophy) accounted for 90.8% of the cohort. The most commonly mutated genes among probands who received a molecular diagnosis were USH2A (13.7% of the cohort), EYS (11.3%), CYP4V2 (4.8%), ABCA4 (4.5%), RPGR (3.4%), and RP1 (3.1%), collectively accounting for 40.8% of diagnoses. We identified 87 unique variants not previously reported in association with IRD and refined clinical diagnoses for 21 patients (7.22% of positive cases). We developed a customized gene panel and tested it on the largest Taiwanese cohort to date, showing that it provides excellent coverage for diverse IRD phenotypes.


INTRODUCTION
Molecular diagnosis of rare diseases is challenging, and few treatments for genetic disorders are currently available 1,2. Globally, 2.7 billion individuals (36% of the population) carry gene mutations responsible for autosomal recessive inherited retinal dystrophy (AR-IRD), and 5.5 million are afflicted with these mostly untreatable disorders 3. Recent advances in DNA sequencing technologies and gene therapies have improved diagnostic yield and increased treatment options 4. However, clinical translation still lags behind scientific discovery. Of the more than 300 genes associated with IRD identified to date, only fifteen (ABCA4, CEP290, CHM/REP1, CYP4V2, GUCY2D, MERTK, NR2E3, PDE6A, PDE6B, RHO, RPE65, RLBP1, RPGR, RS1, USH2A) are currently under investigation for therapy. Clinical trials targeting mutations in these genes have been conducted for seven IRDs, including Enhanced S-cone syndrome (ESCS), Leber's Congenital Amaurosis (LCA), Rod-Cone dystrophy (RCD), Retinitis Pigmentosa (RP), Stargardt's dystrophy (SD), and Usher's syndrome (US) (Supplementary Table 1). In 2017, Luxturna (voretigene neparvovec-rzyl) became the first FDA-approved gene therapy in the USA, targeting biallelic mutations in RPE65 for LCA2, and it remains the only approved IRD gene therapy available. As the vast majority of this group of disorders is untreatable, most patients progress to early blindness. Establishing the molecular diagnosis of IRDs is the first step in identifying possible therapeutic targets, and characterizing the mutational spectrum of IRD in the population can help prioritize efforts to develop treatments.
IRDs are genetically heterogeneous diseases that manifest a spectrum of phenotypes. IRDs exist in syndromic and nonsyndromic forms: the former is associated with extra-ocular features, and the latter is confined to the eye. It has been estimated that up to 30% of IRDs are syndromic, so ocular manifestations and a molecular diagnosis obtained before the development of extra-ocular features may aid in timely diagnosis and management [5][6][7]. This paper presents a highly efficient molecular diagnostic approach for IRDs based on next-generation sequencing of a panel of 319 IRD-associated genes. We tested this approach on 425 probands and critical family members, representing the largest Taiwanese IRD cohort to date; 68.5% of the probands received molecular diagnoses, and 53.9% of the probands were solved cases. Our results establish the Taiwanese IRD genetic landscape and demonstrate that gene panel sequencing can be a cost-effective and highly efficient diagnostic method for IRD in both research and clinical settings.

Diagnostic yield and genetic findings
We developed a high-throughput sequencing panel test and achieved over 200× coverage of 94.2% of the 1.20 Mb target region (Supplementary Data 2). We sequenced 782 subjects in total, including 425 probands, and made molecular diagnoses for 68.5% (291/425) of the probands (Supplementary Data 3: GenoData). The diagnostic yield of cases with a positive family history of IRD (88.9%, 112/129) is notably higher than that for sporadic cases (60.5%, 179/296). In addition, syndromic patients have a higher diagnostic rate (80.4%) than nonsyndromic cases (60.5%) (Table 1). Overall, our approach achieved a high diagnostic yield for all IRD subtypes (Table 2) in patients of all age groups (Supplementary Table 3). Furthermore, among the 291 probands with positive molecular diagnoses, the clinical diagnoses were confirmed in 92.8% of the cases, while the molecular diagnoses in 7.22% of the cases led to alternate clinical diagnoses (Table 1).
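The headline percentages above follow directly from the reported counts. As a quick arithmetic check, the minimal Python sketch below recomputes them. The counts of 229 resolved cases and 21 refined diagnoses are those given in the abstract; the count of confirmed diagnoses is assumed here to be 291 − 21 = 270, and the helper name `percent` is illustrative only.

```python
def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, digits)


# Counts reported in the text.
PROBANDS = 425    # unrelated probands tested
DIAGNOSED = 291   # probands with a molecular diagnosis
RESOLVED = 229    # resolved cases
REFINED = 21      # diagnosed probands whose clinical diagnosis was refined

diagnostic_yield = percent(DIAGNOSED, PROBANDS)
resolved_rate = percent(RESOLVED, PROBANDS)
refined_rate = percent(REFINED, DIAGNOSED, digits=2)
# Assumption: every diagnosed proband not refined counts as confirmed.
confirmed_rate = percent(DIAGNOSED - REFINED, DIAGNOSED)

print(f"Diagnostic yield:    {diagnostic_yield}%")
print(f"Resolved cases:      {resolved_rate}%")
print(f"Refined diagnoses:   {refined_rate}%")
print(f"Confirmed diagnoses: {confirmed_rate}%")
```

Note that the resolved-case rate uses all 425 probands as its denominator, whereas the confirmed and refined rates use the 291 diagnosed probands.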
Evaluating the mode of inheritance for all probands by pedigree (n = 425), we found that nearly two-thirds of the probands have no family history of IRD (sporadic cases, 61.6%) (Table 4). We next studied the genotype and mode of inheritance for the 291 positive cases. About half of the cases are sporadic (46.4%), 34.4% are of autosomal recessive inheritance, 11.3% are autosomal dominant, and 7.9% are X-linked recessive (Fig. 2a). Half of the positive cases (50.3%) have compound heterozygous genotypes (Fig. 2b). Evaluating the ACMG pathogenicity classification of the 568 variants identified in the cohort, we found that 57.9% are pathogenic, 27.8% are likely pathogenic, and 14.3% are variants of uncertain significance (Fig. 2c). Among the variants found, 111 have not been reported previously in databases associated with IRD (Table 3). Detailed genetic information on previously unreported variants is shown in Supplementary Data 3.

DISCUSSION
In this study, we custom-designed a high-quality target capture probe set for NGS of a panel of 319 IRD-associated genes and tested the IRD panel sequencing approach on the largest Taiwanese IRD cohort to date. The IRD gene panel design is optimized to cover as many IRD genes as possible, and the sequencing protocol ensures that very high read depth is achieved while samples are processed in batches of about 200 per experiment. The result is a highly efficient process with uniformly high depth of coverage (>200×) for the target region, minimal sample failures, and a much higher diagnostic yield for a wide range of IRDs than most rare genetic disease sequencing studies [8][9][10][11][12][13][14][15][16]. In the process, we also identified 111 previously unreported causal variants that could be used for therapeutic target development. In addition, the mutation spectrum and heterogeneity in our IRD cohort differ significantly from those in cohorts of other studies and ancestries. We found that our IRD landscape follows that of East Asia, with the top genes being USH2A, EYS, and CYP4V2, rather than ABCA4, which predominates in European cohorts (Supplementary Fig. 2, Supplementary Table 4) 9,13.
Visual impairment in children is challenging to diagnose in the early stages because IRD presents with complex, ambiguous phenotypes that hamper timely clinical diagnosis. Although IRD is considered a pediatric genetic disease, the mean age of onset in our patients is 25.0 years. Moreover, 61.6% reported a negative family history, implying a high carrier rate in Taiwan with asymptomatic parents or underdiagnosed family members (Table 2). Genetic testing provides an opportunity to confirm or refine the clinical diagnosis, guide disease management, inform prognosis, and assist in family planning 17. Increased access to testing may make a difference in how patients interpret, adapt to, and experience their condition, and in how well informed they are as gene therapies become available. Where a genetic disorder still has no cure, genetic results allow the family to prepare and plan for the future, support their child as required, and reduce the psychosocial burden. The IRD panel we designed provides a high diagnostic success rate for patients regardless of their family history, phenotype, disease status, gender, and age. With a high diagnostic yield for diverse IRD subtypes, the panel is useful for early genetic testing and routine clinical implementation for IRD patients.

Patient enrollment and DNA preparation
This study was approved by the Institutional Review Board (IRB) of Academia Sinica (AS-IRB01-21064(N)), Taipei Veterans General
The table shows the number of variants and proportions previously unreported of each variant type identified in probands with molecular diagnosis (n = 291) and those with molecular diagnosis based on strict ACMG criteria (n = 229). Homozygous variants are counted twice as allele occurrence.

IRD gene panel screening
We designed the custom gene panel for the primary screening of IRD, which includes 319 genes associated with IRD (collected from RetNet: https://sph.uth.edu/retnet/ and OMIM: http://www.ncbi.nlm.nih.gov/omim/) (Supplementary Table 2). In addition, the panel includes 81 noncoding sequences reported to be associated with IRD. The panel probes were synthesized by IDT (Integrated DNA Technologies, USA), and target capture experiments were conducted in 4 batches of ~200 samples.
For genomic library preparation from each sample, the Illumina Nextera Flex for Enrichment kit was applied using 500 ng gDNA, followed by nine cycles of PCR amplification. The individual libraries were quality control (QC) checked for proper profiles with the Qubit HS DNA assay (Thermo Fisher Scientific, USA) and the Fragment Analyzer DNA 6k kit (Agilent, USA). Then, the libraries were pooled in equal amounts and subjected to panel capture according to the Nextera enrichment protocol (Illumina, USA), followed by 12 cycles of amplification of the enriched DNA pools. After a QC check, the captured DNA pools were sequenced on an Illumina HiSeq 2500 sequencer (Illumina, USA) to obtain greater than 200-fold coverage per sample.
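The coverage target described above can be checked per sample by counting the fraction of target positions whose depth exceeds 200×. The following Python sketch illustrates the calculation under stated assumptions: per-base depths are supplied as tab-separated lines of chromosome, position, and depth (the three-column layout produced by tools such as `samtools depth`), every target position is present in the input, and the function name `fraction_above_depth` is hypothetical. It is not the pipeline used in this study.

```python
from typing import Iterable


def fraction_above_depth(depth_lines: Iterable[str], min_depth: int = 200) -> float:
    """Fraction of listed positions whose read depth is strictly above min_depth.

    Each line is expected to hold three tab-separated fields:
    chromosome, 1-based position, and depth at that position.
    """
    total = 0
    passing = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        _chrom, _pos, depth = line.split("\t")
        total += 1
        if int(depth) > min_depth:
            passing += 1
    if total == 0:
        raise ValueError("no target positions supplied")
    return passing / total


# Toy input with four positions; two of them exceed 200x.
example_lines = [
    "chr1\t100\t250",
    "chr1\t101\t199",
    "chr1\t102\t201",
    "chr1\t103\t200",
]
print(f"{fraction_above_depth(example_lines):.1%} of positions exceed 200x")
```

The comparison is strict (`>`) to mirror the ">200×" wording; a pipeline that counts depth of exactly 200 as passing would use `>=` instead.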

Bioinformatics analysis pipeline, variant filtering, in-house BioIT protocol
After short-read sequencing, the Illumina reads were aligned to the Genome Reference Consortium GRCh38 (hg38) reference sequence with BWA (bwa-mem). Pipeline output was limited to variants in the target region ±20 bp. First, SNVs and indels were identified by the joint variant calling pipeline of the Genome Analysis Toolkit (GATK) with HaplotypeCaller and GenotypeGVCFs. Then, variant annotation and variant effect prediction were performed with ANNOVAR. After filtering out synonymous SNVs, we removed common variants (>1%) based on their frequency in public databases, excluding those with a minor allele frequency (MAF) > 0.01 in 1000G all, 1000G EAS, ExAC all, ExAC EAS, gnomAD exome all, gnomAD exome EAS, or gnomAD genome all. Finally, variants were classified using a 5-class system consistent with the American College of Medical Genetics (ACMG) standards and guidelines for interpreting sequence variants (Supplementary Fig. 1).
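The filtering step described above (drop synonymous SNVs, then drop variants with MAF > 0.01 in any of the listed population databases) can be sketched in a few lines of Python. This is an illustration, not the in-house pipeline: the dictionary keys (`effect`, `1000g_all`, and so on) are hypothetical stand-ins for annotation columns, and treating a missing frequency as "not common" is an assumption.

```python
from typing import Dict, List, Optional, Union

Variant = Dict[str, Union[str, float, None]]

# Hypothetical column names for the seven population frequency sources.
MAF_FIELDS = [
    "1000g_all", "1000g_eas",
    "exac_all", "exac_eas",
    "gnomad_exome_all", "gnomad_exome_eas",
    "gnomad_genome_all",
]


def is_rare(variant: Variant, threshold: float = 0.01) -> bool:
    """True unless any available population frequency exceeds the threshold."""
    for field in MAF_FIELDS:
        maf: Optional[float] = variant.get(field)  # type: ignore[assignment]
        # Assumption: a missing frequency means the variant is absent
        # from that database, so it does not count as common.
        if maf is not None and maf > threshold:
            return False
    return True


def filter_variants(variants: List[Variant], threshold: float = 0.01) -> List[Variant]:
    """Remove synonymous SNVs, then remove common variants."""
    return [
        v for v in variants
        if v.get("effect") != "synonymous SNV" and is_rare(v, threshold)
    ]


# Toy records for illustration only; not data from this study.
example_variants: List[Variant] = [
    {"id": "v1", "effect": "nonsynonymous SNV", "gnomad_exome_all": 0.0002},
    {"id": "v2", "effect": "synonymous SNV"},
    {"id": "v3", "effect": "stopgain", "exac_eas": 0.03},
    {"id": "v4", "effect": "frameshift deletion", "1000g_all": None},
]
kept_ids = [v["id"] for v in filter_variants(example_variants)]
print(kept_ids)
```

A variant is discarded as soon as one database reports a frequency above the threshold, which matches the "any of the listed databases" reading of the filter; requiring all databases to exceed the threshold would be a much more permissive rule.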
The determination of disease-causative variants was accompanied by an evaluation of three possible modes of genetic inheritance (autosomal recessive, autosomal dominant, and X-linked recessive) based on pedigree information. This included examining the sequencing data from affected and unaffected family members to confirm the co-segregation of candidate mutations with the disease. After identifying the putative IRD-associated mutations, Sanger sequencing was performed for predicted class III-V variants to confirm their presence in the study subjects.

17. Zhao, P. Y., Branham, K., Schlegel, Fahim, A. T. & Jayasundera, K. T. Association of no-cost genetic testing program implementation and patient characteristics with access to genetic testing for inherited retinal degenerations. JAMA Ophthalmol. 139, 449-455 (2021).

Fig. 1 Genetic landscape. Pie chart showing the distribution of mutated genes in the 293 probands who received a molecular diagnosis after IRD panel genetic testing. Others denotes the accumulation of genes with <1.5% contribution.

Fig. 2 Genetic characteristics. Pie chart showing the distribution of mutated gene characteristics in the 293 probands. Genotype and Inheritance are based on probands (293). Mutation characteristics and ACMG pathogenicity are variant-based, such that they represent the total number of variants in the cohort of 293 probands (a total of 512 variants; homozygous variants are duplicated as alleles in the number of occurrences).
The table depicts the diagnostic yield of probands and the proportions of molecularly diagnosed patients receiving an alternative clinical diagnosis. The diagnostic yield is broken down into different groups, such as family history and IRD category. Proband phenotype uses initial diagnosis; for the unsolved proband, the final diagnosis is the same as the initial diagnosis.
a Diagnostic yield per phenotype category: the numerator denotes the number of probands with a molecular diagnosis, and the denominator denotes the number of probands of the phenotype category.

Table 2. Diagnostic yield per initial diagnosis of the proband. The table shows the clinical demographics and diagnostic yield of each IRD phenotype. Phenotype abbreviations are listed in the supplementary table. Uncertain denotes phenotypes for which ophthalmologists were unable to confirm the diagnosis.