Introduction

Inherited retinal diseases (IRDs) constitute a large group of rare monogenic diseases that primarily affect the retina, resulting in vision impairment and, often, ultimately blindness. They collectively represent the leading cause of vision loss in the working-age population, with a combined prevalence of 1 in 3000 individuals1. The genetic etiology of IRDs is highly heterogeneous. In most cases, IRDs follow simple inheritance patterns (autosomal dominant, autosomal recessive, X-linked and mitochondrial) and are associated with mutations in 280 genes (RetNet, http://sph.uth.edu/retnet/; accessed in April 2022). These genes encode proteins with diverse functions in the context of the retina, which range from structural components of retinal cells to key elements of the phototransduction and retinoid cycle. The complex molecular basis of IRDs mirrors an equally heterogeneous range of clinical phenotypes, varying in terms of cell-type/tissue involvement, disease onset, severity, and progression. Despite their clinical variability, a common hallmark of IRDs is photoreceptor dysfunction and death that cause different degrees of vision loss. IRDs are therefore typically classified according to the primarily affected retinal cell type. On this basis, IRDs are grouped into rod-cone dystrophies, cone-rod dystrophies, or generalized photoreceptor diseases, in which rods and cones degenerate simultaneously with an involvement of the retinal pigment epithelium (RPE)2. A fourth group comprises the hereditary vitreoretinopathies (exudative and erosive), which are characterized by degenerative changes in the vitreous body and the retina2. Another group of rare retinal diseases is albinism, in which there is little or no production of melanin in conjunction with characteristic ocular and visual pathway anomalies. Optic neuropathies are a distinct disease entity in which vision loss is caused by dysfunction of the optic nerve without a direct impact on retinal integrity. Finally, additional classifications consider whether vision loss appears in the context of syndromic conditions with extraretinal involvement3 or in non-syndromic forms, in which only the retina is affected and which can be either progressive or stationary2.

The extensive phenotypic overlap of IRD subtypes hinders their accurate clinical diagnosis. Genetic testing is therefore critical because it can provide a differential diagnosis and improve patient management through correct prognosis, genetic counselling, and access to gene-specific therapeutic options. Nevertheless, molecular genetics alone is often not sufficient to firmly support a clinical hypothesis because numerous IRD genes are associated with different clinical forms4, underscoring the need for the combined expertise of clinicians and ophthalmic geneticists in IRD patient management. For example, establishing the clinical subtype can impact prioritization for treatment with voretigene neparvovec in patients with actionable genotypes5.

In the last two decades, the genetic diagnosis of IRDs has improved greatly thanks to unprecedented progress in DNA testing and human genomics. Publicly available databases of curated genomic variants and allelic frequencies, as well as pathogenicity prediction tools, have empowered variant interpretation, thereby improving diagnostic rates and accuracy. Not least, sequencing costs have steadily declined, making genetic screening accessible to a large fraction of patients. Consequently, there has been an exponential increase in the number of molecularly characterized patients, which has offered a deeper understanding of IRD etiology and unveiled useful genotype–phenotype associations.

Given their genetic heterogeneity, recent studies quantified the relative contribution of causative genes to IRD pathogenesis in different populations6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21. Some of the largest cohorts described were either collected through nationwide consortia8,16,20 or followed in a single diagnostic center14,15,17,19. The largest cohorts of molecularly characterized patients reported by European initiatives included a German (2158 solved cases)19, British (4236 solved cases)15 and Spanish (2100 solved families)14 cohort. To date, there are no reports on Italian IRD cohorts of comparable size and clinical heterogeneity. Currently, the most comprehensive study on Italian IRD patients described the molecular epidemiology in a cohort of 221 solved probands with non-syndromic retinitis pigmentosa and Usher syndrome22. Other studies reported even smaller cohorts with mutations in certain genes of interest23,24,25,26 or with specific clinical subtypes27,28,29,30,31,32,33,34,35,36,37,38,39,40. Here, we report on a much larger cohort of patients which comprises a broad spectrum of clinical subgroups of inherited retinal and vitreoretinal diseases followed at a single reference center for IRDs in Italy. We collected, reviewed, and analyzed the clinical and genetic data of 2790 patients who underwent genetic testing, and identified a total of 2036 cases (73%) with a potentially conclusive molecular diagnosis. The clinical subtypes of the cohort spanned the entire spectrum of rod-dominated, cone-dominated, generalized photoreceptor or vitreoretinal degenerations, in the context of both isolated and syndromic forms. Optic neuropathies and albinism were also included. We report on the contribution of 132 genes to IRD pathogenesis and on the prevalence of 1319 distinct pathogenic alleles in the analyzed cohort. We also describe 353 novel variants in 96 genes and discuss genetic associations with clinical forms. 
Understanding the prevalence of causal gene defects in the overall population not only reveals the complex genetic basis of IRDs but also drives the development of targeted therapies that could be of benefit for many patients.

Results

Clinical composition of the genetically solved cohort

The Eye Clinic of the University of Campania ‘Luigi Vanvitelli’ (Naples, Italy) is a Reference Center for rare eye diseases in Southern Italy, founded in 1991, and follows one of the largest IRD cohorts in the country, with 3514 patients enrolled in the Italian Registry of Rare Disease. Patients from all Italian regions (Fig. 1) and with clinical forms that cover the entire spectrum of IRD subtypes are referred to the clinic. Considering that Italy has 59.6 million inhabitants (census 2020) and IRDs have an overall prevalence of 1:3000, we would expect that there are 19,866 IRD patients in the Italian population. In this case, the clinic’s cohort would represent about 17.7% of the IRD patient population in Italy. By considering only the geographic area in which the Center is located (Southern Italy and Islands, resident population of 20.2 million), the clinic’s cohort (2769 patients from Southern Italy and Islands) comprises roughly 41% of the estimated IRD patients. Therefore, this cohort represents well the clinical and genetic variation of IRD patients in Italy.
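The coverage estimates in this paragraph follow from simple arithmetic. A minimal sketch in Python, using only the figures quoted in the text (the function and variable names are illustrative and not part of the study's methods):

```python
def expected_patients(population: int, one_in: int) -> float:
    """Expected number of affected individuals for a prevalence of 1:one_in."""
    return population / one_in

# Figures quoted in the text.
italy_population = 59_600_000   # census 2020
south_population = 20_200_000   # Southern Italy and Islands
prevalence_one_in = 3000        # overall IRD prevalence of 1:3000

clinic_cohort = 3514            # patients enrolled at the clinic
clinic_south = 2769             # of whom from Southern Italy and Islands

italy_expected = expected_patients(italy_population, prevalence_one_in)
south_expected = expected_patients(south_population, prevalence_one_in)

# Share of the expected IRD population followed at the clinic.
share_italy = 100 * clinic_cohort / italy_expected
share_south = 100 * clinic_south / south_expected

print(f"Expected IRD patients in Italy: {italy_expected:.0f}")
print(f"Clinic share, Italy: {share_italy:.1f}%")
print(f"Clinic share, Southern Italy and Islands: {share_south:.1f}%")
```

These shares depend entirely on the assumed 1:3000 prevalence; a different prevalence estimate would scale both percentages proportionally.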

Figure 1

Overview of the geographical representation of the clinic’s cohort. Map of Italy, divided by administrative districts (Regions; delimited by a grey line), depicting the origin of patients followed at the Reference Center for Rare Eye Diseases at the Eye Clinic of the University of Campania ‘Luigi Vanvitelli’ (Naples, Campania Region; red dot). Color intensity indicates the relative number of patients originating from each administrative region. The geographic area corresponding to the Southern Italy and Islands is delimited by a dotted line.

In this study, we focused our attention on all patients who underwent genetic testing (n = 2790), regardless of their clinical diagnosis and their region of origin, in order to obtain a representative overview of the genetic etiology of IRDs in Italy. We examined the clinical records and genetic data of the selected patients and identified 2036 subjects (from 1683 families) with a potentially conclusive or (very) likely conclusive molecular diagnosis.

The age of the genetically defined cohort ranged from 1 to 89 years. Males and females were almost equally represented among the solved cases (55.6% vs. 44.4%). The slightly higher number of male patients is attributed to recessive X-linked conditions and the early single-gene testing of forms with a well-identifiable phenotype (e.g. retinoschisis, advanced choroideremia). Indeed, by removing the cases due to variants in X-linked genes, the sex balance was 50.17% males vs 49.83% females. The genetically solved cases comprised 1718 patients with isolated and 318 patients with syndromic forms of IRDs (84.4% and 15.6%, respectively) (Table 1). The most frequent isolated retinopathies were retinitis pigmentosa (RP; n = 733 [36%]), Stargardt disease (STGD; n = 500 [24.6%]), Leber congenital amaurosis/early-onset retinitis pigmentosa (LCA/EORP; n = 138 [6.8%]), cone/cone-rod dystrophy (CD/CRD; n = 80 [3.9%]), choroideremia (CHM; n = 68 [3.3%]), retinoschisis (RS; n = 29 [1.4%]), Best-type macular dystrophy (BEST; n = 27 [1.3%]), and achromatopsia (ACHM; n = 18 [0.9%]) (Table 1). Among the syndromic phenotypes, Usher syndrome was the most common (n = 250, representing 78.6% of syndromic forms and 12.3% of the overall cohort) (Table 1). Finally, vitreoretinal diseases (i.e. enhanced S-cone syndrome, exudative vitreoretinopathy), optic neuropathies (i.e. optic atrophy, LHON) and albinism (i.e. ocular and oculocutaneous albinism) accounted, respectively, for 0.5%, 2.8%, and 1.2% of the genetically solved cohort (Table 1).

Table 1 Clinical composition of the molecularly diagnosed cohort.

Molecular diagnosis success rate

The overall diagnostic success rate was 69.5% when calculated on the number of probands/families, or 73% when based on individual cases (2036 solved out of 2790 molecularly analyzed cases). The diagnostic rate varied widely depending on the genotyping methodologies used and on the specificity of the clinical phenotype. Specifically, 58% of solved cases (n = 1181) received a potentially conclusive genetic diagnosis after NGS-based genotyping by custom retinopathy panels, clinical exome or whole exome sequencing (WES) (Table 2). More than a quarter of the solved cohort (n = 587 [28.8%]) was analyzed by single-gene testing and mainly comprised subjects with well-defined clinical phenotypes that are tightly associated with specific genes (e.g. STGD, CHM, RS). Finally, a sizeable proportion of the cohort (n = 201 [9.9%]) was genetically explained by segregation analysis of familial disease-causing variants. Arrayed Primer Extension (APEX)-microarrays resolved 55 cases (2.7%) (Table 2). Sixty patients harboring structural variants (i.e. extended deletions or duplications) in CHM, NMNAT1, NPHP1, PCDH15, PRPF31, RAX2, RP2, RPGR, USH1G, USH2A and WFS1 were solved by combining the above-mentioned approaches with multiplex ligation-dependent probe amplification (MLPA), array comparative genomic hybridization (aCGH)41, Sanger sequencing or in silico tools for CNV detection (e.g. CONTRA42, Vargenius43).
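The per-method percentages in this paragraph can be reproduced from the case counts. The sketch below is illustrative only: the dictionary labels are shorthand chosen here, and the counts are those quoted in the text.

```python
solved_total = 2036   # individual cases with a molecular diagnosis
tested_total = 2790   # individual cases that underwent genetic testing

# Solved cases per genotyping approach, as quoted in the text.
solved_by_method = {
    "NGS (panel, clinical exome or WES)": 1181,
    "Single-gene testing": 587,
    "Segregation analysis": 201,
    "APEX microarray": 55,
}

# Diagnostic yield on individual cases.
overall_rate = 100 * solved_total / tested_total

# Each method's share of the solved cohort.
method_share = {
    method: round(100 * n / solved_total, 1)
    for method, n in solved_by_method.items()
}

# The four quoted categories do not add up to the full solved cohort;
# the remainder is not itemized in this paragraph.
not_itemized = solved_total - sum(solved_by_method.values())

print(f"Overall yield: {overall_rate:.1f}%")
for method, share in method_share.items():
    print(f"{method}: {share}%")
print(f"Solved cases not itemized by method: {not_itemized}")
```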

Table 2 Overview of the genetically solved cohort.

Molecular genetic composition of the cohort and causative gene prevalence

We identified 1319 distinct causative variants (Supplementary Table S1) in 132 different genes (Table 3). The ten most commonly mutated genes were ABCA4 (n = 535 [26.3%]), USH2A (n = 228 [11.2%]), RPGR (n = 102 [5%]), CHM (n = 72 [3.5%]), RHO (n = 72 [3.5%]), MYO7A (n = 69 [3.4%]), CRB1 (n = 55 [2.7%]), RPE65 (n = 40 [2%]), RP1 (n = 37 [1.8%]), and GUCY2D (n = 34 [1.7%]) (Table 3, Fig. 2a). The other 122 genes each made a smaller contribution to IRDs. One hundred genes were mutated in 15 patients or fewer and were collectively responsible for disease pathogenesis in 18% of the solved cohort (Fig. 2a, Table 3). Thirty-two genes were mutated in only one patient (Table 3). Mutations in the mitochondrial DNA accounted for 2.1% of the cohort and were implicated almost exclusively in the pathogenesis of LHON.

Table 3 Contribution of causative genes.
Figure 2

Distribution of different types of alleles in the solved cohort. (a) Pie chart showing the relative contribution of each causal IRD gene in disease pathogenesis of the genetically defined cohort. Genes (n = 99) implicated in fewer than 15 cases are plotted as a single group. (b–d) Pie charts depicting the prevalence and relative contribution of causative genes implicated in autosomal recessive (b), autosomal dominant (c) and X-linked (d) retinal dystrophy forms. Genes that are responsible for IRD pathogenesis in fewer than 10 cases with autosomal recessive (or fewer than 4 cases with dominant) forms are depicted as a single group. (e) Contribution of the most recurrent IRD-associated genes involved in syndromic forms. Genes implicated in fewer than 3 cases are plotted as a single group.

ABCA4 (OMIM # 601691) was the most prevalent causative gene, implicated in 535 solved cases from 453 families (Table 3; Fig. 2a). The vast majority of ABCA4-positive cases (n = 501 [93.7%]) had a diagnosis of STGD1 (n = 470), CD/CRD (n = 28), pattern dystrophy (n = 2) or ACHM (n = 1), in line with the strong association of biallelic ABCA4 mutations with cone-dominated phenotypes that primarily affect the central retina. Only 6.3% of ABCA4-associated cases (n = 34) had a diagnosis of RP (Supplementary Figure S1). Most ABCA4-IRD subjects (n = 477 [89.2%]) were compound heterozygous for disease-causing variants, while only 58 cases (10.8%) were homozygous. We identified 255 distinct pathogenic alleles for ABCA4 (Supplementary Table S1). Missense variants constituted the largest fraction of the ABCA4 disease-causing alleles (58%), followed by protein truncating and splice site variants (17.5% and 17.7%, respectively), while complex alleles accounted for 5.3% (Fig. 3a). The contribution of the different allele types was consistent with their distribution in large collections of ABCA4 variants44. The frequent hypomorphic variant c.5882G>A (p.Gly1961Glu) was the most prevalent ABCA4 pathogenic allele (n = 183 [17.1%]) (Fig. 3b) and the most frequent variant overall in the analyzed cohort (Table 4). This is the major disease-causing variant in STGD1 and is typically implicated in mild clinical phenotypes39,44. The second most frequent ABCA4 variant (n = 49 alleles [4.6%]) was c.5714+5G>A (p.[= ,Glu1863Leufs*33]), classified as a moderately severe causal allele45,46, followed by the c.5018+2T>C (p.?) splice variant (n = 38 alleles [3.6%]) and the deleterious c.[1622T>C;3113C>T] (p.[Leu541Pro;Ala1038Val]) complex allele (n = 36 alleles [3.4%]). The c.247_250dup (p.Ser84Thrfs*16) and c.286A>G (p.Asn96Asp) variants constituted respectively 3.4% (n = 36) and 3.0% (n = 32) of the identified ABCA4 alleles (Fig. 3b).
Sixty STGD1 cases were solved using single molecule Molecular Inversion Probes (smMIPs)-based analysis of the entire ABCA4 gene47, and comprised 17 patients who were initially monoallelic for bona-fide pathogenic variants in ABCA4 after a first-level genetic testing (mostly carried out by Sanger sequencing or massive parallel sequencing of ABCA4 exons).
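The allele frequencies quoted for ABCA4 are consistent with a denominator of two disease alleles per solved case (2 × 535 = 1070 alleles). This denominator is an assumption made here because it reproduces the reported percentages; it is not stated explicitly in the text. A minimal Python sketch of the calculation:

```python
abca4_cases = 535
alleles_per_case = 2   # assumption: biallelic (recessive) disease, two alleles per case
total_alleles = abca4_cases * alleles_per_case

# Allele counts for the most frequent ABCA4 variants, as quoted in the text.
allele_counts = {
    "c.5882G>A": 183,
    "c.5714+5G>A": 49,
    "c.5018+2T>C": 38,
    "c.[1622T>C;3113C>T]": 36,
    "c.247_250dup": 36,
    "c.286A>G": 32,
}

# Frequency of each variant among all ABCA4 disease alleles, in percent.
allele_freq = {
    variant: round(100 * count / total_alleles, 1)
    for variant, count in allele_counts.items()
}

for variant, freq in allele_freq.items():
    print(f"{variant}: {freq}% of ABCA4 alleles")
```

Counting alleles rather than patients means that a homozygous case contributes twice to the count of its variant.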

Figure 3

Most frequent variants identified in ABCA4- and USH2A-associated cases. (a) Pie-chart showing the relative abundance of the different classes of variants across the ABCA4 pathogenic alleles. (b) Histogram showing the most frequent ABCA4 alleles. (c) Histogram of occurrence of the most recurrent USH2A variants. Blue bars show variants located in exon 13.

Table 4 Most frequent variants in the cohort (found on at least 12 alleles).

USH2A (OMIM # 608400) was the second most recurrent disease-causing gene, with causative variants identified in 228 subjects (from 200 families) with rod-dominated phenotypes, both syndromic and isolated (Table 3; Fig. 2a). Specifically, 67% of the patients carrying pathogenic variants in USH2A presented syndromic RP with hearing impairment (Usher syndrome type II), whereas 33% had isolated RP, frequently mild forms with benign prognosis such as the pericentral RP subgroup29 (Supplementary Figure S1). Overall, 94.4% of patients initially diagnosed with Usher type II were found to harbor mutations in USH2A, confirming that variants in this gene are the major cause of this condition48. The majority of USH2A genotypes were compound heterozygous (n = 172 [75.4%] vs n = 56 [24.6%] homozygotes). We identified 164 distinct variants in USH2A, including five pathogenic alleles harboring extended deletions (Supplementary Table S1). The largest fraction of USH2A pathogenic alleles (46.9%) were missense variants, while protein truncating and splicing variants constituted 43.7% and 9.6%, respectively. The most frequent variant (8.1%) was c.10712C>T (p.Thr3571Met) (Supplementary Table S1, Fig. 3c, Table 4). Nine different variants located in exon 13 were identified in 45 subjects; they represented 11.7% of all USH2A pathogenic alleles, with c.2276G>T (p.Cys759Phe) and c.2299del (p.Glu767Serfs*21) being the most prevalent mutations in exon 13 (Fig. 3c). Mutations within this exon are currently the target of antisense oligonucleotide-based therapies; therefore, these results constitute an important framework for the future application of such approaches49.

Variants in the retinitis pigmentosa GTPase regulator (RPGR) (OMIM # 312610) were the third most frequent genetic cause in the solved cohort (Table 3; Fig. 2a). Mutations in RPGR were implicated in disease pathogenesis in 102 subjects (from 74 families), mostly males with X-linked RP (n = 98 [96.1%]) and less frequently (n = 4 [3.9%]) X-linked cone-dominated forms (Supplementary Figure S1). Mutations in RPGR were the causative genetic defect in seven female carriers who presented milder or comparable retinal disease to that encountered in the affected males of the pedigree. Roughly 60% of the disease-causing mutations in RPGR (mainly small deletions or duplications) occurred in the terminal exon (open reading frame 15; ORF15) of the RPGRORF15 isoform, a mutational hotspot due to its repetitive sequence with a high GC content (Supplementary Table S1).

When considering syndromic forms with extraocular manifestations, 40 causative genes were identified across the 21 distinct syndromic phenotypes (Table 1). Mutations in USH2A and MYO7A were responsible for about two-thirds of solved syndromic cases (48% and 20%, respectively) (Fig. 2e), in line with the high prevalence of Usher syndrome patients (78.6% of syndromic cases) (Table 1). Mutations in BBS genes (BBS1, BBS10, BBS12, BBS2, BBS4, BBS9) identified in Bardet-Biedl patients were the third most common genetic cause, accounting for 9.1% of syndromic cases.

Inheritance patterns, allelic heterogeneity and novel variants

Prior to genotyping, most cases were annotated as sporadic because pedigree information and family history revealed no other affected family members. After genotyping, autosomal recessive (AR) inheritance turned out to be by far the most prevalent pattern among our patients, accounting for 71.5% (1457/2036) of solved cases. Specifically, almost half of the solved patients (n = 1001 [49.2%]) were potentially compound heterozygous for causal variants in genes associated with recessive clinical phenotypes, while 22.4% (n = 456) were homozygous (Table 2). The most frequently mutated genes in AR forms were ABCA4 and USH2A, accounting for more than half of AR cases, followed by MYO7A (n = 69), CRB1 (n = 55) and RPE65 (n = 40) (Fig. 2b).

Autosomal dominant (AD) forms represented 13.9% (n = 283) of solved cases and were associated with mutations in RHO (n = 72), BEST1 (n = 29), RP1 (n = 29), PRPF31 (n = 23), PRPH2 (n = 23), GUCY2D (n = 18), OPA1 (n = 14), PRPF8 (n = 11), PRPF3 (n = 11), SNRNP200 (n = 8), CRX (n = 7) and GUCA1A (n = 6). A further 20 genes were mutated in less than 6 cases, accounting for 32 patients with AD forms (Fig. 2c). Mutations in splicing factor genes (PRPF31, PRPF8, PRPF3, SNRNP200, PRPF6) accounted for 19.1% of causal alleles associated with AD phenotypes (n = 54). By segregation analysis in unaffected relatives (parents), we confirmed the incomplete penetrance of the autosomal dominant RP phenotype in family members of three apparently sporadic cases carrying bona-fide pathogenic variants in PRPF31 (c.73G>T, p.Glu25*; c.549delG, p.Glu183Aspfs*15; c.615C>A, p.Tyr205*).

Twelve percent of cases (n = 238) were hemizygous for variants in genes linked to recessive X-linked phenotypes. The most recurrently implicated genes were RPGR (n = 102 [40%]), CHM (n = 72 [28%]), RP2 (n = 33 [13%]) and RS1 (n = 32 [13%]) (Fig. 2d). Mutations were also identified in CACNA1F, GPR143, NDP and NYX in a smaller number of cases. Interestingly, some cases with apparently autosomal dominant forms were reclassified post-genotyping as X-linked following the identification of mutations in RPGR and RP2. In these instances, female carriers of RPGR-/RP2-associated RP had an almost equally severe phenotype to that of affected males, mimicking a dominant inheritance. Finally, maternal inheritance of mitochondrial DNA variants was observed in 2.1% of the cohort and was almost exclusively associated with a diagnosis of LHON. Two sibs with RP and sensorimotor peripheral neuropathy (NARP syndrome) carried the m.8993T>G (p.Leu156Arg) substitution in subunit 6 of the mitochondrial ATPase gene (MT-ATP6).

About half of the pathogenic alleles (50.7%) identified in our cohort harbored single-nucleotide substitutions resulting in disease-causing missense mutations (Table 2). Protein truncating variants represented approximately 45% of the pathogenic alleles, and included nonsense (17%), frameshift (15%) and canonical splicing variants (12%) (Table 2). Small in/del changes and larger structural variants were encountered in ~4% of mutated alleles, with 88 and 63 allele counts, respectively. We also identified 18 deep intronic variants in CEP290 (c.2991+1665A>G; p.[Cys998*, =]), USH2A (c.7595-2144A>G; p.Lys2532Thrfs*56) and ABCA4 (c.4539+2064C>T; p.[= , Arg1514Leufs*36]) as well as 68 complex alleles (23 distinct types), mostly associated with variants in ABCA4. Seven of these complex alleles were supported by segregation analysis (namely, four in ABCA4 (c.[1622T>C;3113C>T], c.[2813T>C;3602T>G], c.[3323G>A;4297G>A], c.[5584+5G>A;4594G>T]), two in USH2A (c.[14204C>G;7939C>T], c.[2330G>A;5858C>G]) and one in MYO7A (c.[569A>G;1969A>T]); Supplementary Table S1).

We identified nine de novo events in patients with mutations in GPR143, PRPF31, PRPF8, RHO, RPE65, RPGRIP1, SMARCA4 and TUBB4B (Table 5). In all these cases, the de novo pathogenic allele was not detected by segregation analysis in the parental samples. For three cases in which parents gave their consent to paternity testing, we could exclude non-paternity as a possible explanation. Moreover, we confirmed in a male patient with X-linked recessive ocular albinism that the causative variant was not present in the maternal sample.

Table 5 De novo variants identified in the cohort.

The genetic analysis of our cohort revealed 353 novel disease-associated variants in 96 genes (Supplementary Table S2). These variants were not previously reported in ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/) or in the Leiden Open Variation Database (LOVD; http://www.lovd.nl). To systematically assess their pathogenicity according to the ACMG guidelines, we retrieved the annotation reported in Varsome and interrogated the MutScore pathogenicity predictor50 for 122 novel missense alleles. The majority of DNA changes (n = 279 [80.6%]) were annotated as ‘likely pathogenic’ or ‘pathogenic’, while 19.4% were variants of uncertain significance (VUS). Moreover, the missense variants had an average MutScore of 0.8 (84% with a MutScore > 0.6), corroborating their causative role in IRD pathogenesis (Supplementary Table S2).

Discussion

Recent progress in genomic medicine has increased our ability to identify the molecular causes of retinopathies in affected individuals, leading to the systematic implementation of genetic analyses in clinical practice as part of standard diagnostic protocols. The large number of genetically diagnosed patients has empowered epidemiological studies aimed at quantifying IRD gene prevalence. This is particularly relevant nowadays as gene-targeted therapies for IRD subtypes become a tangible prospect and it is crucial to delineate the cohort of eligible cases. Such information not only impacts assessments of disease management, prognosis and treatment options, but can also stimulate the investment of resources in therapeutic approaches that can benefit large numbers of patients.

Several studies have previously described the genetic composition of extensive nationwide or monocentric IRD cohorts from different countries6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,51. These studies allow comparisons of causal variant prevalence among different populations and uncover population-specific genetic features of IRD pathogenesis. Here, we investigated the genetic features of IRDs in 2036 genetically defined subjects followed at a single referral center in Italy. As in other studies19, patients predominantly came from the district where the Reference Center is located (i.e. Campania region, approx. 52% of the cohort) and the surrounding area (78.8% of the cohort resides in Southern Italy, which has a population of 20.2 million). The remaining patients originated from across the Italian territory (Fig. 1). Therefore, the clinic’s cohort can provide a representative and useful insight into the prevalence of clinical subtypes and genetic etiology in the Italian IRD community.

The genetic structure of the modern Italian population has been shaped by a series of historic migration events that induced recent demographic reshuffles and gene flow. Despite its markedly heterogeneous genomic background, Italy still has some genetic isolates in geographically secluded areas52. In our cohort, we identified variants that were recurrent in small, restricted local communities, suggesting a higher degree of inbreeding. For instance, a 2.9 kb deletion in RAX2 (19:g.3771337_3774298del), which was initially identified in a female RP patient (homozygous for this CNV)53, was also detected in an unrelated male patient (compound heterozygous) originating from the same small village. Another example is a novel missense variant in RHO (NM_000539:c.473C>A; p.Ala158Asp), which was recurrent in our cohort with 17 cases from three apparently unrelated families from the Campania region (Southern Italy). Lastly, the nonsense variant c.2219C>G (p.Ser740*) in RP1 was recurrent in Sicily, with 13 cases from 9 apparently unrelated families. Further corroboration of the significant extent of inbreeding in our cohort comes from the overall prevalence of homozygous genotypes, which were identified in about one third (n = 455 [31.23%]) of patients with recessive forms.

This is the first report on an extensive Italian cohort that is comparable in size and scope to other population-based reports. In terms of number of genetically characterized cases and approach, our study is similar to the German cohort described by Weisschuh et al.19 which comprised 1528 individuals with a conclusive genetic analysis followed at a single diagnostic center. Our cohort is also comparable in size and spectrum of clinical phenotypes with reports from Israel (1369 solved families)16, Spain (2100 families)14 and North America (760 solved families)17. The genetic composition of our solved cohort was as complex as that of other reports, with a total of 132 genes implicated in the pathogenesis of 42 clinical subtypes. The relative contribution of causative genes was also largely in line. When considering all clinical subtypes, the three most frequently mutated genes were ABCA4, USH2A and RPGR, thereby confirming their high contribution to IRD pathogenesis reported in other populations8,14,15,16,17,19. The observed frequency of ABCA4- and USH2A-associated genotypes was consistent with the high prevalence of variants in these genes in at least five main world populations54. Variants in CHM were the fourth most common genetic IRD cause in our cohort, yet their prevalence could be in part skewed by the easily identifiable clinical phenotype of advanced forms and the single-gene etiology of CHM, which enabled early candidate gene analyses and high diagnostic rates for affected subjects (Table 1). The same possibly applies to RS1 and BEST1, which ranked at positions 13 and 14, respectively (Tables 1, 2).

Rod-dominated phenotypes had the highest genetic heterogeneity, with 75 causal genes explaining disease etiology in 733 patients with RP, which represented the most common subtype. Comparatively, a smaller number of genes was associated with cone-dominated phenotypes, with the extreme example of recessive STGD forms, which were almost exclusively caused by mutations in ABCA4. Overall, our findings suggest that the genetic causes underlying IRD pathogenesis in Italy are mostly in line with those reported for other cohorts. Differences in the relative causal gene frequencies with other populations could be attributed to local founder mutations or to the specialized clinical focus of each diagnostic center. For example, variants in FAM161A were detected only in 3 cases, while it was the third most common mutated gene in Israel due to founder mutations16. An uninvestigated founder effect may also explain the high prevalence of CYP4V2 mutations (c.802-8_810delinsGC) in Taiwan51 and China55. Other discrepancies in observed genetic variant frequencies could be due to small cohort sizes, e.g. comparatively high frequency of RLBP1 variants in Iceland18.

Compared to a study on an Italian cohort of 221 molecularly diagnosed cases22, we herein report a larger cohort of 2036 solved cases and include a spectrum of clinical entities that extends beyond non-syndromic RP and Usher syndrome. Specifically, our study offers an overview of the pathogenesis of macular/cone dominated conditions as roughly a third of the cohort (n = 625) had such phenotypes. To date, the largest Italian cohort of genetically defined patients with macular and cone/cone-rod dystrophy comprised 136 cases37. Moreover, we describe genetically solved cases with a diagnosis of CHM, RS, optic atrophy, LHON, vitreoretinopathies and albinism as well as a spectrum of syndromic forms which, besides Usher, comprise BBS, JBS, Alström and Knobloch phenotypes among others. Regarding the clinical phenotypes common to both studies (i.e. non-syndromic RP and Usher syndrome22, and macular, cone/cone-rod dystrophies37), we find extensive overlap of genetic causes, as expected.

When compiling our study cohort, we recruited patients with rare monogenic eye diseases that cause visual impairment, even when these did not strictly fit in the IRD classification proposed by Berger2. Specifically, we also recruited patients with optic neuropathies (including LHON) and albinism in order to understand disease and mutation prevalence, given that these patients are normally referred to the Rare Ocular Disease units and pharmacological treatment options (e.g. idebenone) are available. As expected, a diagnosis of LHON was almost exclusively associated with the three common mtDNA mutations m.11778G>A, m.3460G>A and m.14484T>C (relative frequency of 52%, 26% and 5%, respectively).

Mutations in RPGR were the third most common cause of IRDs, accounting for 5% of our genetically diagnosed patients (n = 102 cases of RPGR-associated IRD), as also observed in other populations14,19,56. The contribution of RPGR to IRD pathogenesis is likely underestimated since about 60% of the disease-causing mutations are located in the terminal exon (open reading frame 15; ORF15) of the RPGRORF15 isoform, which is refractory to NGS-based analyses due to its repetitive, purine-rich sequence. The insufficient coverage of ORF15 in NGS experiments warrants the implementation of complementary approaches to probe this mutational hotspot, especially in males with a negative NGS analysis, compatible clinical presentation (RP or CD) and/or evidence of X-linked transmission, considering both the high frequency of RPGR mutations and the concrete therapeutic perspectives for RPGR-associated forms (e.g. ClinicalTrials.gov Identifier: NCT03252847).

Segregation analysis of the identified variants was performed in 22.3% of the cases, in line with other studies of similar magnitude. Because samples from relatives were often unavailable, we could not systematically confirm biallelism, especially for genes with frequent complex alleles, such as ABCA4, in which certain variants (e.g. c.1622T>C, c.3113C>T) have been described as part of both complex and simple alleles44.

Herein, we give a detailed account of the allelic heterogeneity in genetically defined IRD patients from a large Italian cohort and report over 300 variants that, to the best of our knowledge, have not been described previously. To establish whether a genotype could explain the disease, we assessed the concordance with the inheritance mode, available segregation results and, most importantly, the clinical phenotype. A total of 15 cases (out of 2036) were clinically reconsidered and reclassified after genetic testing: 6 cases with a non-RP diagnosis and mutations in ABCA4, CRB1, NRL, RLBP1, PRPF6, RPGR were reclassified as RP, whereas 9 RP cases were revised as non-RP (3 cases with mutations in NR2E3 and CHM) or as syndromic IRD forms (6 cases with mutations in SMARCA4, AHI1, CLN3, PCYT1A, MFN2 and MKS157). Therefore, the integrated management of IRD patients should draw on the combined expertise of clinicians and ophthalmic geneticists.

Overall, we identified 866 patients (42.5% of the solved cohort) with potentially actionable genotypes for therapeutic approaches (both pharmacological and gene therapy-based) that are either already available [e.g. Luxturna for RPE65 (n = 40), idebenone for mtDNA mutations causing LHON (n = 41)] or are currently being tested in advanced clinical trials (Phase II/III) [e.g. ABCA4 (n = 535), CHM (n = 72), CEP290:c.2991+1655A>G (n = 7), CNGA3 (n = 9), MERTK (n = 7), PDE6A (n = 8), RPGR (n = 102), USH2A-exon13 (n = 45)]. We expect that the number of molecularly diagnosed patients will increase further as the availability of gene-specific treatments encourages patients to actively seek genetic diagnosis. Defining the molecular epidemiology of IRDs, besides providing insights into their molecular etiology, can inform policymakers and stakeholders about the healthcare burden of these rare diseases, and can help guide research efforts on the development of therapeutic options for a growing number of patients.
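
The figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a sketch: it assumes that each patient is counted under a single genotype and that the listed examples account for the whole actionable group. The dictionary keys are shorthand labels introduced here for illustration.

```python
# Per-genotype patient counts as quoted in the paragraph above.
# Keys are illustrative shorthand labels, not identifiers from the study.
actionable_counts = {
    "RPE65 (Luxturna)": 40,
    "LHON mtDNA (idebenone)": 41,
    "ABCA4": 535,
    "CHM": 72,
    "CEP290 deep-intronic variant": 7,
    "CNGA3": 9,
    "MERTK": 7,
    "PDE6A": 8,
    "RPGR": 102,
    "USH2A exon 13": 45,
}

SOLVED_COHORT = 2036  # genetically solved cases reported in the study

# Assumption: no patient appears under more than one genotype.
total_actionable = sum(actionable_counts.values())
share_of_solved = round(100 * total_actionable / SOLVED_COHORT, 1)

print(total_actionable, share_of_solved)
```

Under these assumptions the listed counts add up to the reported total and to the stated share of the solved cohort.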

Methods

Patient selection and ethical statement

We reviewed 2790 patients with IRD based on clinical records (i.e. clinical diagnosis, family history, clinical history, systemic findings, visual acuity tests, fundus changes, visual field assessment, optical coherence tomography imaging, fundus autofluorescence, electroretinography) available at the Center for Inherited Retinal Dystrophies of the Eye Clinic, University of Campania ‘Luigi Vanvitelli’. Only patients who were willing to undergo, or had already undergone, genetic screening were included in the study. Procedures adhered to the tenets of the Declaration of Helsinki and were approved by the Ethics Board of the University of Campania ‘Luigi Vanvitelli’. Informed consent to genetic testing and data sharing was obtained from the patients (or their parents/legal guardians for minors). Available reports from genetic analyses commissioned by the patient or by other medical practitioners were voluntarily provided by the patient for archiving in their medical record.

Genotyping methods

Different genotyping methods have been used over the years, following the technical evolution of sequencing methodologies. Earlier analyses were based on single gene testing (i.e. by PCR on genomic DNA followed by Sanger sequencing) whenever the clinical phenotype and inheritance pattern were strongly indicative of a candidate gene (e.g. ABCA4 for recessive STGD, CHM in choroideremia, RS1 in retinoschisis). Later, APEX-based genotyping microarrays (www.asperbio.com; Asper Biotech, Ltd.) were used to screen for known mutations implicated in LCA, RP or STGD. Starting from 2013, samples were screened by high-throughput targeted sequencing (including smMIPs-based analysis of a single or few candidate genes e.g. ABCA4 and PRPH2 for STGD patients)47. More recently, patient samples were analysed using custom panels of known retinopathy genes28, clinical exome sequencing or WES (Supplementary Table S3). Some cases with a well-defined clinical diagnosis which was commonly associated with a limited number of genes (e.g. patients with Usher syndrome48, albinism, cone dystrophy, LCA), underwent a first-tier analysis on restricted gene panels relevant to their condition.

For some cases that remained unsolved after an NGS-based analysis, we implemented complementary approaches to identify disease-causing mutations (e.g. Sanger-based analysis of RPGRORF1523, search for rare structural variants/larger copy number variations (CNVs) by experimental and in silico approaches41,42,43). Because whole-exome and clinical exome approaches do not efficiently detect all known deep-intronic variants associated with IRDs, we screened known intronic variants (e.g. CEP290:c.2991+1655A>G, USH2A:c.7595-2144A>G) by Sanger sequencing whenever these were not covered by the panel used and patients had a compatible clinical phenotype and/or monoallelic pathogenic variants in these genes.

For all genetic analyses, DNA was extracted from peripheral blood samples using standard protocols. The identified variants were always validated by Sanger sequencing. Segregation analysis was performed whenever parental DNA (or samples from other family members) was available. For NGS analyses, library preparation, sequencing and sequence data analysis were performed as previously described27. Filtered reads were visually inspected in the Integrative Genomics Viewer (IGV).

Pathogenicity assessment of sequence variants and criteria for genotype classification

The pathogenicity of sequence variants was assessed according to the guidelines of the American College of Medical Genetics and Genomics (ACMG)58, either by manual implementation of the criteria or using the Varsome automated variant classification59. For the annotation of the novel missense variants, we used the MutScore pathogenicity predictor50 to corroborate the Varsome annotation. For the purposes of this study, we considered as ‘genetically (likely) solved’:

(a) patients carrying monoallelic ‘pathogenic’ (P) or ‘likely pathogenic’ (LP) variants in genes associated with dominant phenotypes or with recessive X-linked IRD forms (only for males);

(b) patients with a homozygous or two heterozygous variants (P, LP or variants of uncertain significance [VUS]) in a gene associated with recessive phenotypes consistent with their clinical presentation. We confirmed biallelism of heterozygous variants whenever segregation analysis was possible. However, since segregation analysis could not be systematically applied due to sample unavailability, we did not set it as a prerequisite for inclusion among the ‘solved’ cases, in line with studies on similar sized cohorts19;

(c) patients with a bona fide mutation in the mitochondrial DNA (mtDNA).
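
The three criteria above can be read as a simple decision rule. The sketch below is a hypothetical illustration, not the pipeline used in the study: the `Variant` record, the `is_likely_solved` function, the inheritance codes ("AD", "AR", "XL", "MT") and the gene-to-inheritance map are all assumptions introduced here, and a "bona fide" mtDNA mutation is approximated as a P or LP call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    acmg_class: str  # "P", "LP", "VUS", ...
    zygosity: str    # "het", "hom", "hemi" or "mt" (hypothetical codes)


def is_likely_solved(variants, inheritance, sex, phenotype_consistent=True):
    """Return True if any gene satisfies criterion (a), (b) or (c)."""
    by_gene = defaultdict(list)
    for v in variants:
        by_gene[v.gene].append(v)

    for gene, calls in by_gene.items():
        mode = inheritance.get(gene)
        p_or_lp = [v for v in calls if v.acmg_class in {"P", "LP"}]

        # (c) mtDNA mutation (assumption: "bona fide" means P or LP).
        if mode == "MT" and p_or_lp:
            return True

        # (a) monoallelic P/LP variant in a dominant gene,
        #     or in an X-linked gene in a male patient.
        if p_or_lp and (mode == "AD" or (mode == "XL" and sex == "M")):
            return True

        # (b) one homozygous or two heterozygous P/LP/VUS variants in a
        #     recessive gene, with a consistent clinical presentation.
        #     Phase is not checked, mirroring the text: segregation
        #     analysis was not a prerequisite for inclusion.
        if mode == "AR" and phenotype_consistent:
            eligible = [v for v in calls if v.acmg_class in {"P", "LP", "VUS"}]
            if any(v.zygosity == "hom" for v in eligible):
                return True
            if sum(v.zygosity == "het" for v in eligible) >= 2:
                return True

    return False


# Illustrative inheritance map; the study does not define one in this form.
inheritance = {"ABCA4": "AR", "RPGR": "XL", "MT-ND4": "MT"}

two_hets = [Variant("ABCA4", "P", "het"), Variant("ABCA4", "VUS", "het")]
one_het = [Variant("ABCA4", "P", "het")]
rpgr_call = [Variant("RPGR", "LP", "hemi")]

print(is_likely_solved(two_hets, inheritance, sex="F"))
print(is_likely_solved(one_het, inheritance, sex="F"))
print(is_likely_solved(rpgr_call, inheritance, sex="M"))
```

Because the rule does not verify that two heterozygous variants lie on different alleles, a genotype accepted under (b) remains "likely" solved until segregation data confirm biallelism.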