Genetic Spectrum of EYS-associated Retinal Disease in a Large Japanese Cohort: Identification of Disease-associated Variants with Relatively High Allele Frequency

Biallelic variants in the EYS gene are a major cause of autosomal recessive inherited retinal disease (IRD), with a high prevalence in the Asian population. The purpose of this study was to identify pathogenic EYS variants, to determine the clinical/genetic spectrum of EYS-associated retinal disease (EYS-RD), and to discover disease-associated variants with relatively high allele frequency (1%-10%) in a nationwide Japanese cohort. Sixty-six affected subjects from 61 families with biallelic or multiple pathogenic/disease-associated EYS variants were ascertained by whole-exome sequencing. Three phenotype groups were identified in EYS-RD: retinitis pigmentosa (RP; 85.94%), cone-rod dystrophy (CORD; 10.94%), and Leber congenital amaurosis (LCA; 3.12%). Twenty-six pathogenic/disease-associated EYS variants were identified, including seven novel variants. The two most prevalent variants, p.(Gly843Glu) and p.(Thr2465Ser), were found in 26 and 12 families (42.6% and 19.7%), respectively, and their allele frequencies (AF) in the Japanese population were 2.2% and 3.0%, respectively. These results expand the phenotypic and genotypic spectrum of EYS-RD, which accounts for a high proportion of both autosomal recessive RP (23.4%) and autosomal recessive CORD (9.9%) in the Japanese population. The presence of EYS variants with relatively high AF highlights the importance of considering the pathogenicity of non-rare variants in relatively prevalent Mendelian disorders.

The mean depth/coverage for the detected EYS variants in this study is listed in Table 1. No copy number variants associated with the disease were identified.
Twenty-six EYS variants were identified in total. Six variants have never been reported, and one variant (c.7394C>G, p.(Thr2465Ser)) has never been associated with the specific phenotype of RP/CORD/LCA. The detected variants are widely distributed in the EYS gene (Fig. 2). There were twelve missense variants, eight frameshift variants, five nonsense variants, and one variant with splice site alteration. The genetic results are summarized in Table 1 and Supplementary Table S1.
Detailed results of the in silico analyses of 26 variants are presented in Supplementary Table S3. Thirteen variants were classified as pathogenic, four were likely pathogenic, and nine were variants of uncertain significance (VUS). Eight VUS were found with likely pathogenic/pathogenic variants or variants previously reported as likely pathogenic elsewhere. One variant (c.8608A>T, p.(Asn2870Tyr)) was found with the recurrent relatively high AF variant (c.7394C>G, p.(Thr2465Ser)).

Discussion
The genetic spectrum of EYS-RD is illustrated in a well-characterized large Japanese cohort of 61 families. The identification of variants with relatively high AF, confirmed by co-segregation analysis in multiple families, helped to clarify the high proportion of EYS-RD among IRDs in the Japanese population: 23.4% of AR or sporadic RP (52/222) and 9.9% of AR or sporadic CORD (7/71).
Two EYS variants with allele frequencies higher than 1%. Two variants with relatively high AF (>1% in the general population) were confirmed in our cohort: p.(Gly843Glu) and p.(Thr2465Ser). All the subjects harbouring these variants in a homozygous or compound heterozygous status in the Japan Eye Genetics Consortium (JEGC; http://www.jegc.org/) study cohort of 1302 subjects from 729 families demonstrated retinal dystrophy, which supports the disease causation/association of these two variants.
The variant p.(Gly843Glu) was first described by Iwanami et al. in 2012. Five subjects carrying this variant together with other proven truncating variants, such as p.(Ser1653Lysfs*2) and p.(Ser2428*), were presented in that report within the spectrum of RP 25 . In the HGVD database, there is one subject homozygous for this variant out of 1207 subjects (1/1207, 0.08%) with no registered diseases on the records, for whom no further ophthalmic information is available 26 . Given the variable disease onset and phenotype associated with this variant, it is still uncertain whether this subject will develop visual defects in the future. The AF of this variant in our molecularly proven ARRP cohort of 112 families (32/224; 14.3%) was significantly higher than that in the general Japanese population (53/2361, 2.25%; HGVD) by Fisher's exact test (P < 0.001), as implied by the previous studies in the Japanese population 19,25 . Moreover, the AF in the general Japanese population was approximately 50 and 1000 times higher than those in the East Asian and total populations of gnomAD, respectively. The pathogenicity of this variant is not fully proven; however, a founder effect in the Japanese population should be considered to explain this most prevalent disease-associated allele.
The other variant with relatively high AF (p.(Thr2465Ser)) was first described by Hosono et al. in 2012 as a possible non-pathogenic variant, with allele frequencies of 4.0% (8/200) in affected subjects and 1.0% (2/192) in normal subjects 20 . In our cohort, twelve families harboured this variant and no other candidate variants in any other known retinal disease-associated gene. Three of these twelve families had proven biallelic EYS variants confirmed by the co-segregation analysis. It is of note that five alleles of this variant were associated with CORD. In addition, this variant was found in cis with the p.(Gly843Glu) variant in two families, with an additional family harbouring three candidate unsegregated variants. The AF of this variant in our molecularly confirmed ARRP cohort (15/224; 6.70%) was significantly higher than that of the general Japanese population (67/2203, 3.04%; HGVD) by Fisher's exact test (P = 0.01). The AF in the general Japanese population was approximately 200 and 2000 times higher than those in the East Asian and total populations, respectively. In the HGVD database, there are two subjects (2/1207, 0.17%) homozygous for this variant with no available ophthalmic information. The results of in silico analysis and the comparison between the AF of the affected cohort and that of the general population provide some supporting evidence for the disease causation, and a founder effect in the Japanese population could also be considered for this prevalent allele that is possibly associated with IRDs.
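The two allele-frequency comparisons above can be reproduced in outline from the reported allele counts. The following is a minimal sketch, assuming a two-sided Fisher's exact test on a 2×2 table of variant versus non-variant alleles in the ARRP cohort and in HGVD; the function name and the choice of two-sided convention are illustrative assumptions, not the authors' actual analysis code, so the exact P values may differ slightly from those reported.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: affected cohort, general population.
    Columns: variant alleles, non-variant alleles.
    The p-value sums the probabilities of all tables (with fixed margins)
    that are no more likely than the observed one.
    """
    row1 = a + b          # alleles in the affected cohort
    col1 = a + c          # variant alleles overall
    col2 = b + d          # non-variant alleles overall
    denom = comb(col1 + col2, row1)

    def prob(k):
        # Hypergeometric probability of k variant alleles in the cohort row.
        return comb(col1, k) * comb(col2, row1 - k) / denom

    k_min = max(0, row1 - col2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Allele counts taken from the text: cohort 32/224 vs HGVD 53/2361.
p_gly843glu = fisher_exact_two_sided(32, 224 - 32, 53, 2361 - 53)
# Allele counts taken from the text: cohort 15/224 vs HGVD 67/2203.
p_thr2465ser = fisher_exact_two_sided(15, 224 - 15, 67, 2203 - 67)

print(f"p.(Gly843Glu): P = {p_gly843glu:.3g}")
print(f"p.(Thr2465Ser): P = {p_thr2465ser:.3g}")
```

The sketch uses only exact integer binomial coefficients from the standard library, which avoids depending on a particular statistics package.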
the EYS gene and the high prevalence of IRDs in the Japanese population. Two prevalent truncating variants (p.(Ser1653Lysfs*2) and p.(Tyr2935*)) were also frequently found in our cohort. As previously described, these two variants have a higher AF in the Japanese population than in other populations 15,19,20,25 . Together with the other two variants with high AF (p.(Val1270Gly) and p.(Ala2537Thr); AF > 0.45%) in the Japanese general population, several frequent variants especially prevalent in the Japanese population were determined in this study.
The cumulative AF of all detected EYS variants was 6.75% in the general Japanese population. Given this number, the estimated prevalence of subjects at risk for EYS-RD in Japan should be higher than the current estimated value of 1 in 3500-4000 for RP. However, it is of note that the genetic risk does not perfectly correspond to the prevalence of the disease in the real world, as shown for the most prevalent ABCA4 variant (p.(Asn1868Ile)) in the European population (AF > 6.7%) 21 .
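To illustrate why a cumulative AF of this size implies a genetic risk well above the reported RP prevalence, the arithmetic can be sketched as follows. This is a rough illustration only, assuming Hardy-Weinberg equilibrium and treating every detected allele as independent and fully penetrant; neither assumption is made in the study, and both are known to overestimate risk (for example, p.(Thr2465Ser) was observed in cis with p.(Gly843Glu), and nine variants are VUS). The function name is hypothetical.

```python
def expected_biallelic_fraction(cumulative_af):
    """Expected fraction of individuals carrying two variant alleles
    (homozygous or compound heterozygous) under Hardy-Weinberg
    equilibrium, where cumulative_af is the summed allele frequency q.
    """
    return cumulative_af ** 2


q = 0.0675  # cumulative AF of detected EYS variants, from the text
fraction = expected_biallelic_fraction(q)
one_in = 1 / fraction

print(f"Expected biallelic fraction: {fraction:.5f} (about 1 in {one_in:.0f})")
```

Under these simplifying assumptions the expected biallelic fraction is roughly 1 in 219, far higher than 1 in 3500-4000, which is consistent with the caution that genetic risk does not translate directly into disease prevalence.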

Genotype-phenotype association of EYS-RD.
Three phenotype groups were identified in our cohort: RP (85.9%), CORD (10.9%), and EORP/LCA (3.1%). Only a few patients with EYS-CORD have been reported to date; however, our EYS-RD cohort provided the largest number of CORD patients associated with EYS variants 5,11-13 . Seven out of 45 molecularly proven cases of ARCORD in the JEGC cohort are caused by biallelic or putative biallelic EYS variants (7/45, 15.6%). This finding highlights that EYS should be the major IRD gene in the Japanese population, with a significantly higher prevalence than that in the European population 5,16,17,27 .
There was no clear genotype-phenotype association/correlation between RP/CORD/EORD due to the limited number of CORD and EORD/LCA cases. Although further detailed analysis is needed for accurate assessment, both of the two aforementioned variants with relatively high AF are associated with either RP or CORD; thus, predicting the predominant functional failure (rod or cone) from the genotype seems difficult. It is noteworthy that even patients with an identical genotype presented with contrasting clinical phenotypes (RP/CORD), which suggests the possible presence of modifiers outside of the EYS gene that contribute to the disease presentation.
There were seven families with multiple disorders (Supplementary Fig. S2) or non-AR inheritance in our cohort. Conclusive genetic diagnosis is still unavailable in four affected subjects with limited clinical information from three families (Families 5, 8 and 16). Whole-exome sequencing was not performed in three subjects from two families (Families 5 and 16). Given the presence of variants with a relatively high AF, such coincidence with the other EYS variants or other pathogenic variants in the non-EYS genes should be considered in the clinical/genetic diagnosis of IRD. For this reason, comprehensive gene screening is helpful to elucidate the cause of complicated phenotypes in such families.
Limitations of this study. There are limitations to this study. First, the molecular mechanisms of disease causation for most variants are not yet known, and the clinical effect of the variants (e.g., functional loss by a single variant, acting as a modifier, complexing with missing disease-causing variants, and others) is poorly understood. Further functional analysis is needed to conclude the disease causation of each variant. Second, the identification of copy number variants is technically difficult with whole-exome sequencing data; thus, the possible presence of copy number variants was not completely excluded in our study. As previously reported 28,29 , it is crucial to examine the structural variants in the EYS gene. Third, the AF data of general populations were not studied in detail due to the limited data resources of ophthalmic findings and natural history, which would be valuable to assess the clinical effect of each variant in subjects at risk in the real world, especially in relatively prevalent IRDs with diverse onset and phenotype. Last, the identification of background ethnicity for each variant was not available in this study. Extensive genomic analysis with detailed haplotype information could delineate the ethnic specificity of EYS-RD.
(2020) 10:5497 | https://doi.org/10.1038/s41598-020-62119-3 | www.nature.com/scientificreports
In conclusion, the phenotypic and genotypic characteristics of EYS-RD were determined in this largest cohort of the Japanese population. The presence of variants with a relatively high AF in a specific population requires the survey of non-rare variants in consideration of founder effects, especially in relatively prevalent Mendelian disorders.

Methods
The protocol of this study adhered to the tenets of the Declaration of Helsinki and was approved by the ethics committee of the participating institutions in Japan: National Institute of Sensory Organs, National Hospital Organization Tokyo Medical Center (Reference; R18-029). Signed informed consent was obtained from all subjects.
Participants and clinical investigation. Patients with a clinical diagnosis of IRD and available genetic data were studied between 2008 and 2018 as part of the JEGC (http://www.jegc.org/) 30 . A total of 1302 subjects from 729 families, including 222 families with autosomal recessive or sporadic RP and 71 families with autosomal recessive or sporadic CORD, registered in the JEGC cohort were surveyed. Clinical information is available via the NISO online database, including medical history, family history, ethnicity, chief complaints of visual symptoms, the onset of disease, the best-corrected decimal visual acuity converted to the LogMAR unit, fundoscopy, fundus photography, autofluorescence imaging, and phenotypic categorization.

EYS variant detection.
Genomic DNA was extracted from the peripheral blood of all affected subjects and the available unaffected family members with the Gentra Puregene Blood Kit (Qiagen, Tokyo, Japan). Whole-exome sequencing with targeted analysis of retinal disease-associated genes (RetNet; https://sph.uth.edu/retnet/home.htm; accessed on 1 January 2017) was performed on the affected subjects and unaffected family members according to previously published methods 30 . Briefly, paired-end sequence library construction and exome capturing were performed with the Agilent Bravo automated liquid-handling platform with SureSelect XT Human All Exon V3-5+UTRs kit (Agilent Technologies, Santa Clara, CA, USA). Enriched libraries were sequenced with the Illumina HiSeq 2000/2500 sequencer (San Diego, CA, USA; read length 2×101 bp). Reads were aligned to the University of California, Santa Cruz (UCSC; California, United States) human genome 19 reference sequence with Burrows-Wheeler Aligner software. Duplicated reads were removed by the Picard MarkDuplicates module, and mapped reads around insertion-deletion polymorphisms (INDELs) were realigned by the Genome Analysis Toolkit (GATK) Version 3.0. Base-quality scores were recalibrated by the GATK. Variant calling was performed by the GATK UnifiedGenotyper module. Called single-nucleotide variants (SNVs) and INDELs were annotated by the snpEff software (snpEff score; "high", "moderate" or "low"). The XHMM (eXome-Hidden Markov Model; https://atgu.mgh.harvard.edu/xhmm/) tool was applied for the detection of copy number variations. The read depth and coverage of the targeted regions were also confirmed with the Integrative Genomics Viewer (IGV; http://software.broadinstitute.org/software/igv/). All called SNVs and INDELs of the RetNet genes were selected for further analysis. Variants with read depths higher than 15× were selected for this study.
Identified variants were retained if their allele frequency was less than 1% in the HGVD (http://www.genome.med.kyoto-u.ac.jp/SnpDB/; accessed on 1 July 2017) and the Integrative Japanese Genome Variation Database (iJGVD 3.5k, https://ijgvd.megabank.tohoku.ac.jp/download_3.5kjpn/; accessed on 1 July 2017), which are two allele frequency databases specific for the Japanese population. Only for the two autosomal recessive genes with high prevalence (EYS and ABCA4), the threshold was relaxed to an AF of less than 10.0% in the HGVD to avoid missing pathogenic/disease-associated variants with a relatively high AF (1%-10%). Together with phenotypic features and inheritance, as well as co-segregation, disease-causing/disease-associated variants were determined from the called variants in the retinal disease-associated genes.
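The depth and allele-frequency filtering described above can be sketched as a simple predicate over annotated variants. This is a minimal illustration, not the pipeline used in the study: the record layout, field names, and the treatment of a missing database entry as AF = 0 are all assumptions, and the example records other than the HGVD frequency of p.(Gly843Glu) are hypothetical.

```python
# Genes for which the AF threshold is relaxed (from the text).
RELAXED_GENES = {"EYS", "ABCA4"}
DEFAULT_MAX_AF = 0.01   # AF < 1% in HGVD and iJGVD
RELAXED_MAX_AF = 0.10   # AF < 10% in HGVD for EYS and ABCA4
MIN_DEPTH = 15          # read depth must be higher than 15x


def passes_filters(variant):
    """Return True if a variant record survives depth and AF filtering.

    `variant` is a dict with hypothetical keys: gene, depth, hgvd_af,
    ijgvd_af. A missing or None AF is treated as 0 (assumption).
    """
    if variant["depth"] <= MIN_DEPTH:
        return False
    hgvd_af = variant.get("hgvd_af") or 0.0
    ijgvd_af = variant.get("ijgvd_af") or 0.0
    if variant["gene"] in RELAXED_GENES:
        # Relaxed threshold is applied to HGVD only, as stated in the text.
        return hgvd_af < RELAXED_MAX_AF
    return hgvd_af < DEFAULT_MAX_AF and ijgvd_af < DEFAULT_MAX_AF


variants = [
    # HGVD AF from the text (53/2361); depth is illustrative.
    {"id": "EYS p.(Gly843Glu)", "gene": "EYS", "depth": 40,
     "hgvd_af": 53 / 2361, "ijgvd_af": None},
    # Hypothetical records for contrast.
    {"id": "GENE_X common", "gene": "GENE_X", "depth": 60,
     "hgvd_af": 0.02, "ijgvd_af": 0.02},
    {"id": "GENE_X rare", "gene": "GENE_X", "depth": 60,
     "hgvd_af": 0.001, "ijgvd_af": 0.002},
    {"id": "EYS low depth", "gene": "EYS", "depth": 12,
     "hgvd_af": 0.001, "ijgvd_af": 0.001},
    {"id": "EYS very common", "gene": "EYS", "depth": 50,
     "hgvd_af": 0.15, "ijgvd_af": 0.15},
]

kept = [v["id"] for v in variants if passes_filters(v)]
print(kept)
```

With a uniform 1% cutoff, p.(Gly843Glu) would be discarded at this step; the gene-specific threshold is what allows it to reach the downstream segregation and in silico analyses.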
In silico molecular genetic analysis. Sequence