Molecular Diagnosis of 34 Japanese Families with Leber Congenital Amaurosis Using Targeted Next Generation Sequencing

Leber congenital amaurosis (LCA) is a genetically and clinically heterogeneous disease, and represents the most severe form of inherited retinal dystrophy (IRD). The present study reports the mutation spectra and frequency of known LCA- and IRD-associated genes in 34 Japanese families with LCA (including three families that were previously reported). A total of 74 LCA- and IRD-associated genes were analysed via targeted next-generation sequencing (TS), while recently discovered LCA-associated genes, as well as known variants that could not be screened using this approach, were evaluated via additional Sanger sequencing, long-range polymerase chain reaction, and/or copy number variation analyses. These analyses revealed 30 potential pathogenic variants in 12 (nine LCA-associated and three other IRD-associated) genes among 19 of the 34 analysed families. The most frequently mutated genes were CRB1, NMNAT1, and RPGRIP1. The results also showed the mutation spectra and frequencies identified in the analysed Japanese population to be distinctly different from those previously identified for other ethnic backgrounds. Finally, the present study, which is the first to conduct an NGS-based molecular diagnosis of a large Japanese LCA cohort, achieved a detection rate of approximately 56%, indicating that TS is a valuable method for molecular diagnosis of LCA cases in the Japanese population.

Identification of potential pathogenic variants in 19 families. The target-capture panel used in the present study comprised 445,968 bp derived from 1182 target regions of the 74 genes 10. The targeted region of each analysed gene included all exons and flanking intronic sequences (i.e. the intronic sequence ±25 bp from each exon boundary), and the coverage rate was 98.98% (for details, see Supp. Table 1). The analysed target regions achieved an average of 223.9 ± 47-fold coverage across the 33 newly recruited samples, and an average of 91.4% and 86.1% of bases in the target regions exhibited at least 10-fold and 20-fold coverage, respectively. These results indicate that sufficient coverage was achieved for the identification of variants.
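The coverage figures above (mean depth and the share of target bases reaching 10-fold or 20-fold depth) can be illustrated with a minimal sketch. The function name `coverage_summary` and the toy depth values are hypothetical and are not taken from the study's pipeline; the sketch only shows how such summary statistics are typically derived from per-base depths.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(10, 20)):
    """Return the mean depth and the fraction of bases at or above each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    # A base counts towards a threshold when its depth is >= that threshold.
    fractions = {t: sum(d >= t for d in depths) / total for t in thresholds}
    return mean(depths), fractions


# Toy per-base depths (hypothetical values, not data from the study).
toy_depths = [0, 5, 12, 18, 25, 40, 150, 300, 310, 320]
mean_depth, frac = coverage_summary(toy_depths)
print(mean_depth, frac)
```

In practice the per-base depths would come from the aligned reads restricted to the target regions; that step is outside the scope of this sketch.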
The obtained sequence data were analysed using the filtering criteria described in the Methods section. Consequently, in 19 of the 34 analysed families, we identified 30 potential pathogenic variants, of which 16 were novel. All potential pathogenic variants identified in this study were confirmed via Sanger sequencing. These included 16 families that harboured variants in nine LCA-associated genes, and three families that exhibited variants in three other IRD-associated genes. The 30 identified variants consisted of 4 nonsense, 9 frameshift 11, 3 splice site 10,11 and 14 missense variants 10 (Tables 1 and 2). The pathogenicity of the novel missense variants was supported by in silico prediction analyses (Supp. Table 2).
Variants in LCA-associated genes. Of the identified potential pathogenic variants in LCA-associated genes, 25 (in 14 families) occurred in seven known ar LCA-associated genes, while two (in two families) occurred in two known ad LCA-associated genes (Tables 1 and 2). The most frequently mutated genes identified by the present study were CRB1, NMNAT1, and RPGRIP1, each of which was mutated in three of the 19 families (Fig. 1). The three identified NMNAT1 variants (carried by patients EYE42, EYE159, and S132) were missense variants that were previously shown to be pathogenic 12,13, while the five identified RPGRIP1 variants and five of the seven identified CRB1 variants included both previously reported and novel variants, all of which were shown to induce a loss of function (LOF). Of the remaining seven families, five carried CEP290, LRAT, PRPH2, CRX, and IMPDH1 variants, and two were shown to harbour four GUCY2D variants, three of which were missense variants 10,14. Segregation analyses of the 19 identified families confirmed that the pathogenic alleles were on different chromosomes 10,11 (Table 1 and Fig. 2).
Although the TS data suggested that the elder (JU1039) and younger (JU1040) affected sisters in family JIKEI-145 had an apparently homozygous missense variant c.163C > T [p.(R55W)] in exon 3 of LRAT, segregation analysis revealed that the sisters' unaffected father carried the variant in the heterozygous state and their unaffected mother did not carry it. This finding suggested that both sisters might have a heterozygous deletion including exon 3 of LRAT. Indeed, quantitative real-time polymerase chain reaction (qPCR) analysis revealed that both sisters and their mother harboured a heterozygous deletion that included LRAT exons 1-3 (Supp. Figure 1), indicating that the two pathogenic alleles were on different chromosomes (Table 1 and Fig. 2). The patients with LCA in families EYE70 and EYE125 carried a known heterozygous missense variant in CRX 15 and a novel heterozygous missense variant in IMPDH1, respectively, both of which occurred de novo (Table 1 and Fig. 2), consistent with an ad mode of inheritance.
Variants in other IRD-associated genes. Mutations in RPGR and RP2 have been previously shown to cause X-linked RP. We found a novel hemizygous missense variant c.977A > C [p.(K326T)] in RPGR, and a novel hemizygous splice site variant c.769-2A > G in RP2 in families EYE50 and EYE114, respectively, both of which were maternally transmitted (Table 1 and Fig. 2). BEST1 mutations have been previously shown to cause five clinically distinct retinopathies: Best vitelliform macular dystrophy, ar bestrophinopathy, adult-onset vitelliform macular dystrophy, ad vitreoretinochoroidopathy, and ad RP 16. The present study identified a novel heterozygous BEST1 missense variant c.682G > T [p.(D228Y)] that was carried by the patient with LCA in family EYE187. The patient's mother was normally sighted, whereas his father had previously been diagnosed with RP. The sequencing electropherogram suggested that his father may harbour the variant in a mosaic state (Supp. Figure 2). Evaluating the paternal mosaicism would require a segregation analysis using parental DNA extracted from several tissues; however, the parents did not consent to these more elaborate examinations. p.(D228Y) is a novel missense change at an amino acid residue where different missense changes [p.(D228N) and p.(D228H)] have been reported to be pathogenic 17.
Additional mutation screening. Unsolved families (i.e. those for which potentially pathogenic variant(s) were not detected using the TS approach) were analysed via additional mutation screening (as described in the Methods section). None of the patients in these families carried the c.2991 + 1655A > G intronic variant in CEP290 that is frequently identified in European, Australian, and Brazilian families with LCA 5,7,8.
Nor were any of the analysed patients shown to harbour rare variants in five recently discovered LCA-associated genes (CCT2, CLUAP1, DTHD1, GDF6, and IFT140) [20][21][22][23][24], or in exon 15 of RPGR, which is an alternative exon called ORF15 that contains highly repetitive purine-rich sequences (Supp. Tables 3 and 4). [...] Table 3). In addition, the known heterozygous RPGRIP1 exon-17 deletion (c.2710 + 374_2895 + 74del) 11,25 was identified in families EYE16 and JIKEI-122 via breakpoint-specific PCR (see the Methods section and Supp. Table 4). These eight families harbour a heterozygous variant in an LCA-associated gene; this rate is much higher than the carrier rate of ar variants expected by chance, suggesting that a second variant is likely located within these genes. Therefore, the generated TS data were re-examined for splice-site variants within 25 bp of the exon-intron boundaries, and for variants in non-coding exonic regions (see Methods section). These analyses did not reveal any second variants within the analysed genes for these families. Meanwhile, eight of these identified patients (patient EYE47 was excluded from further analysis due to a lack of sample DNA) were assessed via a multiplex ligation-dependent probe amplification (MLPA) assay. The results of this assay showed no CNV on the alternate allele in any of the analysed patients (Supp. Table 4).
[Pedigree figure caption, partially recovered] The results of the segregation analyses for families EYE20, EYE55, and LCA1H have been previously reported 10,11. Patient EYE68 harboured three variants, of which both p.(L223Ffs*4) and p.(A245Gfs*16) were maternally inherited and predicted to produce transcripts likely to be targeted for nonsense-mediated mRNA decay, and thus no final protein product. Both variants were thus considered to be likely pathogenic; however, this could not be confirmed by the present study. The parents in family EYE156 were normally sighted at the time of their examinations; however, it is possible that this reflects the delayed onset of a late-onset form of retinitis pigmentosa (RP) likely to be induced by their identified heterozygous variants in PRPH2, a known autosomal dominant RP-associated gene. The identification code and gene(s) relevant to each family are shown above the pedigree, and the genotype for each family member is shown below their symbol.

Patient clinical findings.
All enrolled patients met the three criteria (as described in the Methods section), and the 19 patients shown to carry potential pathogenic variants exhibited classical clinical features of LCA. Ophthalmoscopic examinations of the patients revealed various degrees of retinal degeneration, with or without vascular attenuation, macular degeneration, and optic disc pallor. The decimal best-corrected visual acuity (BCVA) in measurable cases ranged from light perception to 0.5. Among the families assessed by the present study (Fig. 2), the phenotype observed in family JIKEI-145 (including patients JU1039 and JU1040) was relatively mild (Fig. 3A,B). The proband, JU1039, was a 9-year-old female who had exhibited nyctalopia from early childhood. Her BCVA scores at the time of examination were 0.5 and 0.2 (with hyperopia) in the right and left eye, respectively. Her younger sister, JU1040, was a 6-year-old female who had also exhibited nyctalopia from early childhood. Her BCVA scores at the time of examination were 0.2 and 0.3 (with hyperopia) in the right and left eye, respectively. Fundus examinations revealed almost normal fundi, with only mild diffuse retinal pigment epithelium (RPE) atrophy and slight retinal vessel narrowing in both patients; however, full-field electroretinograms (ERGs) were severely decreased and non-recordable for JU1039 and JU1040, respectively. Goldmann perimetry testing showed preserved peripheral visual fields with V-4e isopters, but decreased central sensitivity in both patients. Together, these data represent the first report describing the clinical features of Japanese patients with LRAT-associated LCA. Patient EYE139 was found to harbour a GUCY2D mutation, and to exhibit almost normal fundi, with only mild retinal degeneration (Fig. 3C). He was a 9-year-old male with autism who had exhibited nystagmus from birth. His visual acuity (measured using a grating acuity card under binocular conditions) was 0.02 (with hyperopia).
Conversely, patient EYE121 was found to carry a CRB1 mutation, and to display a more severe phenotype (Fig. 3D). She was a 14-year-old female who had exhibited nystagmus from birth. Her BCVA scores at the time of examination were 0.3 and 0.06 (with hyperopia) in the right and left eye, respectively. Fundus examinations revealed marked retinal degeneration with maculopathy, slight retinal vessel narrowing, and pigmentation in the midperiphery. Patients EYE42 and S132 were both shown to carry NMNAT1 mutations, and to exhibit very severe phenotypes (Fig. 3E,F). EYE42 was a 16-year-old male with no light perception, while S132 was a 14-year-old female who exhibited only light-perception vision. Both patients exhibited nystagmus from birth, as well as bilateral coloboma-like macular atrophy, diffuse RPE atrophy, and retinal vessel narrowing on fundus examination.

Discussion
In this study, we described the mutation screening of a total of 34 Japanese families with LCA (including three previously described families 10,11), which we believe is the largest Japanese LCA cohort studied to date. The utilised TS and additional screening methods revealed 30 potential pathogenic variants in 19 of the analysed families. A number of potential pathogenic variants have already been identified in families of various ethnic populations, and several studies have described variants in Japanese families with LCA [25][26][27][28][29][30]. However, this study provides the mutation spectra and frequency of known and novel LCA-causative variants in a Japanese cohort, which had not previously been characterised.
Of the 12 (nine LCA-associated and three other IRD-associated) genes found to contain potential pathogenic variants, the most frequently mutated genes were CRB1 (8.8%, 3/34), NMNAT1 (8.8%, 3/34), and RPGRIP1 (8.8%, 3/34) (Fig. 1). All three of the patients (EYE42, EYE159, and S132) found to harbour a missense variant in NMNAT1 carried the heterozygous variant c.709C > T [p.(R237C)], and patients EYE159 and S132 also harboured the heterozygous variant c.196C > T [p.(R66W)] (Table 2). While p.(R237C) and p.(R66W) were originally identified as being pathogenic in French, Asian American, and Indian populations 12,13, Han et al. recently reported two unrelated patients from Korean families with LCA who were compound heterozygous for the two variants 31. These variants have since been detected in Japanese families, as well as in families of various other ethnicities [6][7][8]12,13,17,31. Conversely, nonsense, frameshift, or canonical splice-site variants in RPGRIP1 were identified in three of the analysed families (EYE20, EYE55, and EYE149), as was the heterozygous RPGRIP1 exon-17 deletion (in families EYE16, EYE55, and JIKEI-122) (Table 2). A homozygous RPGRIP1 exon-17 deletion variant was previously reported in a Japanese family with LCA 25, but has not been reported in any other ethnic population to date. This, together with the results of the present study, suggests that the RPGRIP1 exon-17 deletion variant may be a founder mutation in Japanese LCA.
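The reported frequencies follow directly from the family counts. As a minimal sketch, the per-gene counts below are read from the Results text (three families each for CRB1, NMNAT1, and RPGRIP1; two for GUCY2D; one each for the remaining eight genes), and the variable names are illustrative only. Percentages for genes other than CRB1, NMNAT1, RPGRIP1, and CEP290 are derived here by the same arithmetic rather than quoted from the text.

```python
FAMILIES_TOTAL = 34

# Number of solved families per gene, as described in the Results text.
SOLVED_FAMILIES_BY_GENE = {
    "CRB1": 3, "NMNAT1": 3, "RPGRIP1": 3, "GUCY2D": 2,
    "CEP290": 1, "LRAT": 1, "PRPH2": 1, "CRX": 1, "IMPDH1": 1,
    "RPGR": 1, "RP2": 1, "BEST1": 1,
}


def percent(count, total):
    """Percentage of `total` represented by `count`, rounded to one decimal."""
    return round(100 * count / total, 1)


per_gene = {gene: percent(n, FAMILIES_TOTAL)
            for gene, n in SOLVED_FAMILIES_BY_GENE.items()}
solved = sum(SOLVED_FAMILIES_BY_GENE.values())
detection_rate = percent(solved, FAMILIES_TOTAL)

print(per_gene["CRB1"], per_gene["CEP290"], solved, detection_rate)
```

The 19 solved families out of 34 give the detection rate of approximately 56% quoted in the abstract and conclusion.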
Other genes found to be mutated in the analysed cohort included GUCY2D [...] (Fig. 1). To the best of our knowledge, this is the first study to report on CEP290, IMPDH1, LRAT, PRPH2, RP2, RPGR, and BEST1 variants in Japanese families with LCA. RP2, RPGR, and BEST1 variants have been previously shown to cause IRDs other than LCA. Only a small number of patients with RPGR- or RP2-associated LCA have been reported in Chinese cohorts 6,32, and there is very limited information regarding patients with BEST1-associated LCA 17. Although the BEST1 variant p.(D228Y) identified by the present study was shown to be rare and likely pathogenic based on in silico analyses, we did not provide experimental evidence supporting the genotype-phenotype correlation. Thus, it remains possible that variant(s) in other genes not included in our designed panel are pathogenic in patient EYE187. Further analysis in a larger cohort is necessary to better elucidate the genotype-phenotype correlations for patients with LCA who harbour BEST1 variants.
The utilised TS approach was not able to detect mutated genes in 15 of the analysed families (Supp. Table 3). It is likely that these 15 families harbour variants in other IRD-associated or novel LCA-associated genes that were not targeted by the utilised TS panel; thus, our research group is planning to re-examine these families using a whole exome sequencing (WES) approach. Likewise, the mutation screening data produced by the present study indicated that eight of the 15 unsolved families carry a single rare heterozygous variant in an ar LCA-associated gene (Supp. Table 3), but no second variants were detected via a reanalysis of the generated TS data or the conducted CNV analyses, indicating that additional high-throughput sequencing is required. Therefore, we plan to use a whole genome sequencing approach to screen for these elusive second variants, since they may be located within gene regulatory regions, deep intronic regions, or unknown exons (e.g. alternative splicing regions).
SCIENTIFIC REPORTS | (2018) 8:8279 | DOI:10.1038/s41598-018-26524-z
[Figure caption fragment] Additionally, coloboma-like macular atrophy was observed in patients S132 and EYE42. The accompanying optical coherence tomography (OCT) images presented underneath (A-C) and (F) revealed an unremarkable ellipsoid zone, but a relatively preserved retinal structure and thickness in patients JU1039, JU1040, and EYE139, as well as both severe retinal thinning and a disrupted retinal structure in patient S132.
The present study profiled and compared the mutation spectra in Japanese families with LCA to those previously reported in the literature [5][6][7][8]. The fact that the study detected a large number of novel variants in the analysed cohort suggests that the genetic basis of LCA in Japanese families is unique.
The most frequently mutated genes in European, Chinese, Brazilian, and Australian families with LCA are as follows: European, CEP290 (15%), GUCY2D (11.7%), and CRB1 (10%) 5; Chinese, CRB1 (16.8%), GUCY2D (10.7%), and RPGRIP1 (7.6%) 6; Brazilian, CRB1 (10.7%), CEP290 (10.7%), and RPE65 (10.7%) 7; Australian, CEP290 (14.7%), GUCY2D (14.7%), and NMNAT1 (8.8%) 8. The present study showed that CRB1 was among the most frequently mutated genes in the analysed Japanese families, as it is in families of the other ethnic populations listed, with the exception of the previously reported Australian families with LCA. Both the present Japanese and the previously reported Australian families with LCA exhibited a higher proportion (8.8%) of NMNAT1 variants than the studied Chinese (2.3%) or Brazilian (3.6%) families with LCA. Conversely, the present Japanese cohort exhibited fewer CEP290 variants (2.9%) than were previously reported in European (15%), Australian (14.7%), and Brazilian (10.7%) families with LCA. Indeed, a large proportion of European, Australian, and Brazilian families with LCA have been shown to carry a founder intronic variant in CEP290 (c.2991 + 1655A > G) that was not detected by the present study (Supp. Table 4). Together, these data strongly suggest that the mutation spectra in Japanese families with LCA are distinctly different from those characteristic of families with LCA of various other ethnic backgrounds.
The present study employed a TS approach that has been demonstrated to be capable of accurately and efficiently identifying variants in patients with IRDs 10,33,34; however, deep intronic changes and novel genes cannot be identified using this technique. Conversely, WES is aimed at the discovery of novel disease-causing genes, and has successfully increased the identification of new causative genes in LCA [20][21][22][23][24]. Although WES will become a standard method for mutation screening in the near future, the utilised TS approach still has a number of advantages. Firstly, it can produce a higher coverage rate for targeted regions. Secondly, it allows more patients to be analysed in a single assay, which lowers the cost of examination per patient. Thirdly, the output data from TS are relatively small, which makes bioinformatics analyses faster and easier. In addition, TS reduces the risk of incidental findings. Thus, the TS approach is currently more practical than WES for screening highly heterogeneous diseases such as IRD.
In conclusion, we reported the results of the first NGS-based molecular diagnosis of what is, to our knowledge, the largest Japanese LCA cohort studied so far. We successfully identified potential pathogenic variants in 19 of the 34 analysed families. The mutation spectra and frequency of LCA-associated genes in the Japanese population appear to be distinctly different from those previously reported for other ethnic populations. Finally, the observed detection rate of approximately 56% indicates that the utilised TS approach is a valuable method for diagnosing LCA, and as such, may facilitate the future application of gene-specific treatments for patients with the disease.
Target capture and NGS. The integrity of the TS approach used in this study has been previously evaluated using three families that included six patients with LCA 10,11. These previous results were combined with those presented in the current study, which used a target-capture panel covering 74 IRD genes to analyse 33 new patients with LCA (from 31 families). The panel was designed, the library prepared, and the target-capture sequencing performed as previously described 10.

Ethics statements.
Bioinformatics analyses. The sequence reads were mapped to the human reference genome sequence (GRCh37/hg19) using Burrows-Wheeler Aligner software (v 0.7.15) after trimming the adapter sequences with cutadapt software (v 1.11), and mapped reads around insertion-deletion polymorphisms (INDELs) were realigned using the Genome Analysis Toolkit (GATK; v 3.6) 35. Base quality scores were recalibrated using GATK. Variants were called using the GATK HaplotypeCaller, and the called single nucleotide variants and INDELs were annotated using ANNOVAR software (v 2016Feb01) 36. We focused on nonsynonymous variants and splice site variants located within 5 bp of the exon-intron boundaries (±5 bp), and excluded synonymous and non-coding exonic variants from the analysis. Common genetic variants (allele frequency >0.005 for recessive variants or >0.001 for dominant variants in any of the ethnic subgroups of the following single nucleotide polymorphism (SNP) databases) and synonymous variants were treated as possible non-pathogenic sequence alterations in this study: 1000 Genomes database (http://www.1000genomes.org/), Exome Aggregation Consortium database (http://exac.broadinstitute.org/), Human Genetic Variation Database (HGVD; http://www.genome.med.kyoto-u.ac.jp/SnpDB/) and Tohoku Medical Megabank Organization database (ToMMo; https://ijgvd.megabank.tohoku.ac.jp/). HGVD and ToMMo were used as references for Japanese controls. The Human Gene Mutation Database (HGMD; https://portal.biobase-international.com/cgi-bin/portal/login.cgi) was used to identify previously reported variants. A bioinformatics analysis of the analysed region was conducted to exclude false positive variants; however, this may have also filtered out true pathogenic variants.
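The frequency and consequence filters described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, and the consequence labels are hypothetical, and in the study the annotations came from ANNOVAR output. Only the cut-offs (allele frequency >0.005 for recessive and >0.001 for dominant variants in any subgroup; splice-site window of ±5 bp, widened to ±25 bp on reanalysis) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Allele-frequency cut-offs stated in the text; variants above them are treated as common.
AF_CUTOFF = {"recessive": 0.005, "dominant": 0.001}
SPLICE_WINDOW_BP = 5  # the reanalysis in the text widens this to 25


@dataclass
class Variant:  # hypothetical record, for illustration only
    gene: str
    consequence: str  # e.g. "nonsynonymous", "synonymous", "splice", "noncoding_exonic"
    splice_distance: Optional[int]  # bp to the nearest exon-intron boundary, if known
    population_afs: Dict[str, float]  # allele frequency per database subgroup


def passes_filter(v, inheritance, splice_window=SPLICE_WINDOW_BP):
    """Keep rare nonsynonymous variants and rare splice variants near a boundary."""
    # A variant is excluded if it is common in ANY subgroup of any database.
    max_af = max(v.population_afs.values(), default=0.0)
    if max_af > AF_CUTOFF[inheritance]:
        return False
    if v.consequence == "nonsynonymous":
        return True
    if v.consequence == "splice":
        return v.splice_distance is not None and abs(v.splice_distance) <= splice_window
    return False  # synonymous and non-coding exonic variants are excluded


rare_missense = Variant("GENE_A", "nonsynonymous", None, {"ExAC_EAS": 0.004, "HGVD": 0.0})
far_splice = Variant("GENE_B", "splice", 12, {"ToMMo": 0.0})
print(passes_filter(rare_missense, "recessive"), passes_filter(far_splice, "recessive"))
```

Passing `splice_window=25` reproduces the expanded search region used for families with only one heterozygous variant; screening of non-coding exonic regions in that reanalysis is not modelled here.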
Therefore, for those families in which the utilised TS approach only identified a heterozygous rare variant in ar LCA-associated genes, we expanded the analysed region and re-examined our TS data to screen for variants located in intronic regions within 25 bp of the exon-intron boundaries (±25 bp) 10, and in non-coding exonic regions 30.
Prioritization and assessment of identified variants. Identified variants were filtered by applying the following prioritization criteria:
Additional mutation screening. Families for whom no potentially pathogenic variant(s) were identified via the utilised TS method underwent additional mutation screening, as below (Supp. Table 4):
1. The known CEP290 intronic variant c.2991 + 1655A > G 38 was not included in the design of the TS capture panel, so genomic CEP290 fragments encompassing c.2991 + 1655A > G were amplified and analysed via Sanger sequencing.
2. Large deletion and insertion variants were not detectable via the applied TS approach; thus, a long-range PCR assay was used to screen for the known RPGRIP1 exon 17 deletion 11,25.
3. The LCA-associated genes CCT2, CLUAP1, DTHD1, GDF6, and IFT140 20-24 were identified after the capture panel used in the present study was designed (as reported in RetNet at the time of system design; https://sph.uth.edu/retnet/; accessed 23rd January 2014). Thus, all coding exons in GDF6 and IFT140 were screened via Sanger sequencing. Given the rarity of reported LCA-associated variants in CCT2, CLUAP1, and DTHD1, only those exons within these genes that contained previously reported variants were analysed.
4. The RPGR exon ORF15 is a mutational hotspot for X-linked RP; however, it contains repetitive sequences that cannot be efficiently captured or enriched via conventional targeted capture NGS 39. Thus, genomic RPGR fragments encompassing the ORF15 region were amplified and analysed via Sanger sequencing for the unsolved male patients with LCA (i.e. EYE16, EYE103, EYE133, EYE178, EYE182, and JU0955).

CNV analyses.
Screening for CNV in identified ar LCA-associated genes was performed for families in which the utilised TS or additional mutation screening approaches revealed only a single heterozygous rare variant (Supp. Table 3). SALSA MLPA probemix reagents (P221 for CRX, and P222 for RPGRIP1, GUCY2D, and CEP290; MRC Holland, Amsterdam, the Netherlands) were used to conduct an MLPA, as previously described 40. A qPCR analysis was performed to confirm the family JIKEI-145 segregation analysis findings, using appropriate primers (available upon request), SYBR Premix EX Taq II (Takara, Japan) and a Thermal Cycler Dice TP800 (Takara, Japan), according to the manufacturer's instructions. Relative comparative threshold cycle (Ct) values were calculated on the basis of the second derivative maximum method using dedicated software (Thermal Cycler Dice Real Time System software, v 5.11B for TP800, Takara, Japan). The relative copy number (RCN) was determined on the basis of the comparative ΔΔCt (ddCt) method using the unaffected father's DNA as a control (where an RCN score of 1.0 represented no copy number change, a score of 0.6-1.4 was considered normal, and scores of <0.6 and >1.4 represented deletions and duplications, respectively). All reactions were performed in duplicate, and all experiments were repeated in triplicate.
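The RCN calculation can be sketched as below. This is a minimal illustration assuming the standard comparative ΔΔCt formula, RCN = 2^(-ΔΔCt), with roughly 100% amplification efficiency and normalisation against a reference amplicon; the text does not name the reference amplicon, so that part, along with the function names, is an assumption. The classification treats the 0.6-1.4 band around 1.0 as no copy number change, with values below 0.6 called deletions and values above 1.4 called duplications.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control):
    """RCN by the comparative ddCt method, assuming ~100% PCR efficiency.

    `ct_ref_*` are Ct values for a reference amplicon (an assumption here);
    the control is the unaffected father's DNA in the study design.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


def classify_rcn(rcn):
    """Apply the thresholds quoted in the text."""
    if rcn < 0.6:
        return "deletion"
    if rcn > 1.4:
        return "duplication"
    return "normal"


# Hypothetical Ct values: the target amplifies one cycle later than in the control.
rcn = relative_copy_number(26.0, 20.0, 25.0, 20.0)
print(rcn, classify_rcn(rcn))
```

A one-cycle delay relative to the control halves the estimated copy number, which is the pattern expected for a heterozygous deletion.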
Sanger sequencing and segregation analyses. Potential pathogenic variants detected using the TS approach were validated by performing Sanger sequencing as per the standard protocol 41 . All utilised primers are available upon request. Sanger sequencing segregation analyses were performed on DNA from family members to investigate the co-segregation of potentially pathogenic variants.