Whole exome sequencing in 17 consanguineous Iranian pedigrees expands the mutational spectrum of inherited retinal dystrophies

Inherited retinal dystrophies (IRDs) constitute one of the most heterogeneous groups of Mendelian human disorders. Using autozygome-guided next-generation sequencing methods in 17 consanguineous pedigrees of Iranian descent with isolated or syndromic IRD, we identified 17 distinct genomic variants in 11 previously reported disease genes. Consistent with the recessive inheritance pattern suggested by the pedigrees, the variants discovered in our study were exclusively bi-allelic and mostly in a homozygous state (in 15 families out of 17, or 88%). Of the 17 variants identified, 5 (29%) had not been reported before. Interestingly, two mutations (GUCY2D:c.564dup, p.Ala189ArgfsTer130 and TULP1:c.1199G > A, p.Arg400Gln) were each identified in two separate pedigrees. In addition to expanding the mutational spectrum of IRDs, our findings confirm that the traditional practice of endogamy in the Iranian population is a prime cause of the appearance of IRDs.

Center of Fasa University of Medical Sciences, and the Ethics Commission of the Canton de Vaud) and adhered to the principles of the Declaration of Helsinki. All individuals participating in this study were Iranian residents who agreed to contribute by signing a written informed consent form. Patients were clinically evaluated by local ophthalmologists and their medical records were maintained at their respective hospitals. Approximately 5.0 ml of peripheral blood was either collected in an EDTA K2 golden vac disposable vacuum blood collection tube (Zhejiang Gondong Medical Technology, China) or mixed with EDTA anticoagulant (Merck KGaA, Darmstadt, Germany) after sample collection. DNA was extracted from peripheral blood leukocytes following standard protocols. DNA was quantified using a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, USA), and its integrity was evaluated by running the samples on a 1% agarose gel. Pedigrees were drawn with the help of HaploPainter 10.
Genetic analyses. Exome capture and library preparation were performed on one affected individual per family using the SureSelect Human All Exon v6 kit (Agilent, Santa Clara, CA, USA) and the HiSeq Rapid PE Cluster Kit v2 (Illumina, San Diego, CA, USA), from 2 μg of genomic DNA. Whole-exome sequencing (WES) was performed at the Institute of Genomics of the University of Tartu (Estonia) using an Illumina HiSeq (HSQ-700358) instrument. Bioinformatic analyses were performed as described previously 11. Briefly, raw reads were mapped to the human reference genome (hg19/GRCh37) using the Novoalign software (V3.08.00, Novocraft Technologies). Next, Picard (version 2.14.0-SNAPSHOT) was used to remove duplicate reads and the Genome Analysis Toolkit (GATK, version 3.8) was used to perform base quality score recalibration on both single-nucleotide variants and insertion-deletions. A VCF file with the variants was generated by HaplotypeCaller. Variants were annotated according to a specific in-house pipeline using mainly ANNOVAR software 12-20. Clinical significance of the variants was evaluated with the help of publicly available databases such as ClinVar 21, the Human Gene Mutation Database (HGMD) 22 and Varsome 23, as well as according to their frequency in available databases, e.g. the Genome Aggregation Database (gnomAD) 17. Seven online in-silico tools were used to predict the pathogenicity of all variants: MutationTaster 24, Mutation Assessor 24, Polymorphism Phenotyping v2 (PolyPhen-2) 25, Likelihood Ratio Test (LRT) 26, Sorting Intolerant from Tolerant (SIFT) 27, PROVEAN 28, and Combined Annotation Dependent Depletion (CADD) 29. Furthermore, all candidate variants were compared with data from Iranome, a database containing information from 800 exomes from individuals belonging to eight major ethnic groups in Iran (http://www.iranome.ir/; accessed on April 19, 2020).
Finally, Sanger sequencing was performed to validate all potentially pathogenic variants and to establish their causality through strict genotype-phenotype co-segregation within the available family members.
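The prioritization logic described above (rarity in population databases, predicted impact, and bi-allelic genotype) can be illustrated with a minimal Python sketch. It is not the in-house pipeline used in the study: the `Variant` fields, the frequency cut-off, the minimum number of concordant predictors, and the example records are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen for illustration only (not taken from the study).
MAX_POPULATION_AF = 0.01   # maximum allele frequency in gnomAD and Iranome
MIN_DAMAGING_CALLS = 4     # minimum number of the seven predictors calling "damaging"

LOF_CONSEQUENCES = {"nonsense", "frameshift", "canonical_splice"}


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    consequence: str        # e.g. "missense", "nonsense", "frameshift"
    zygosity: str           # "hom" or "het"
    gnomad_af: float
    iranome_af: float
    damaging_calls: int = 0  # how many in-silico tools predict a deleterious effect


def is_rare(v: Variant) -> bool:
    """Keep variants that are rare in both the global and the local database."""
    return v.gnomad_af < MAX_POPULATION_AF and v.iranome_af < MAX_POPULATION_AF


def is_candidate(v: Variant) -> bool:
    """Rare loss-of-function variants pass; rare missense variants need predictor support."""
    if not is_rare(v):
        return False
    if v.consequence in LOF_CONSEQUENCES:
        return True
    if v.consequence == "missense":
        return v.damaging_calls >= MIN_DAMAGING_CALLS
    return False


def recessive_candidates(variants):
    """Group candidates by gene and keep genes with a bi-allelic genotype:
    at least one homozygous candidate, or two or more heterozygous candidates."""
    by_gene = {}
    for v in variants:
        if is_candidate(v):
            by_gene.setdefault(v.gene, []).append(v)
    selected = {}
    for gene, hits in by_gene.items():
        has_hom = any(v.zygosity == "hom" for v in hits)
        n_het = sum(1 for v in hits if v.zygosity == "het")
        if has_hom or n_het >= 2:
            selected[gene] = hits
    return selected


# Invented example records (placeholder gene names, not data from the cohort).
example = [
    Variant("GENE_A", "c.100C>T", "nonsense", "hom", 0.0, 0.0),
    Variant("GENE_B", "c.200A>G", "missense", "het", 0.00001, 0.0, damaging_calls=6),
    Variant("GENE_B", "c.300del", "frameshift", "het", 0.0, 0.0),
    Variant("GENE_C", "c.400G>A", "missense", "hom", 0.05, 0.04, damaging_calls=7),
    Variant("GENE_D", "c.500T>C", "missense", "het", 0.0, 0.0, damaging_calls=7),
]
print(sorted(recessive_candidates(example)))
```

In this sketch, two heterozygous candidates in the same gene are only suggestive of compound heterozygosity; as in the study, segregation analysis in family members would be needed to confirm that they lie on different alleles. A synonymous variant with a predicted splicing effect would also require a separate rule, since it is neither missense nor a canonical loss-of-function change.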

Results
Following WES analysis in 17 probands of Iranian descent, 16 of whom were the direct offspring of consanguineous unions (Fig. 1), we identified 17 distinct genetic variants in 11 genes linked to inherited retinal diseases (Tables 1 and 2). Of the 17 pedigrees, two families each were linked to disease-causing variants in the CNGA3, GUCY2D, IQCB1, RDH12, RP1, and TULP1 genes, while one family each carried causative variants in USH1G, ABCA4, NMNAT1, CRB1, or BBS2. The mutational spectrum across these 11 genes comprised 7 missense variants, 4 nonsense variants, 4 small insertion-deletions (indels) or duplications leading to frameshifts, one canonical splice site variant and one synonymous variant with an effect on splicing. As expected, most of the variants in our study were found in a homozygous state (in 15 families out of 17, or 88%). Compound heterozygosity was detected in two families (F009 and IRN_070, Tables 1, 2). Except for one homozygous variant in the RP1 gene (NM_006269.1:c.788-1G > A) in family IRN_039, all homozygous pathogenic variants were found in genes located inside a so-called run of homozygosity (ROH), generally spanning more than one megabase (Mb).
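The notion of a variant lying inside a run of homozygosity can be illustrated with a minimal Python sketch. It is a simplification and not the method used in the study: it assumes genotype calls from a single chromosome, sorted by position, and it ends a run at the first heterozygous call, whereas dedicated ROH tools usually tolerate occasional heterozygous calls. The function names and the example positions are hypothetical; the 1 Mb default mirrors the size mentioned above.

```python
def find_roh(calls, min_length_bp=1_000_000):
    """Return (start, end) spans of consecutive homozygous calls.

    calls: list of (position_bp, is_homozygous) tuples for one chromosome,
           sorted by position. Only spans of at least min_length_bp are kept.
    """
    runs = []
    start = end = None
    for position, is_hom in calls:
        if is_hom:
            if start is None:
                start = position  # open a new run
            end = position        # extend the current run
        else:
            # A heterozygous call closes the current run (strict definition).
            if start is not None and end - start >= min_length_bp:
                runs.append((start, end))
            start = end = None
    # Close a run that extends to the last call.
    if start is not None and end - start >= min_length_bp:
        runs.append((start, end))
    return runs


def in_roh(position, runs):
    """Check whether a variant position falls inside any detected run."""
    return any(start <= position <= end for start, end in runs)


# Invented genotype calls (positions in base pairs, not data from the cohort).
example_calls = [
    (100_000, True), (600_000, True), (1_500_000, True),
    (1_600_000, False),
    (2_000_000, True), (2_300_000, True),
    (2_400_000, False),
]
example_runs = find_roh(example_calls)
print(example_runs)
print(in_roh(1_000_000, example_runs), in_roh(2_100_000, example_runs))
```

Here the first stretch spans 1.4 Mb and is retained, while the second spans only 0.3 Mb and is discarded by the length threshold.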

Discussion
Consanguinity is a major risk factor for the occurrence of rare recessive Mendelian disorders, yet it is a long-lived social practice in many Asian and African countries. In Iran, the second most populated country in the Middle East, 37.4% of all marriages are between consanguineous partners. Of these, 19.3% occur between first cousins and 18.1% involve second cousins 7 .
In this work, we used consanguinity as a means to facilitate identification of mutations in IRD cases, through an autozygome-guided NGS approach. Consistent with the high level of consanguinity displayed by the Iranian population, we observed a recessive inheritance pattern in all our cases, with the large majority of them indeed carrying homozygous pathogenic variants in known IRD genes. With only one exception, all genes carrying homozygous pathogenic variants resided inside runs of homozygosity, thus supporting earlier studies that highlighted the importance of homozygosity mapping in consanguineous families 41-46. Nevertheless, compound heterozygous patients were also identified, with mutations in CRB1 and NMNAT1. Interestingly, these patients (from families F009 and IRN_070, respectively) also had relatively lower values of overall genomic homogeneity (197 and 71 Mb, respectively, against an average of 280.0 Mb in the cohort as a whole). The appearance of compound heterozygosity in the Iranian population is not unprecedented, and an earlier study suggested that CRB1 is a commonly mutated gene in Iranian patients with non-syndromic IRDs 43,47. Interestingly, our cohort did not include any instance of variants in USH2A, although mutations in this gene are considered to be among the most frequent causes of Usher syndrome or non-syndromic retinitis pigmentosa (RP) 48. The mutational spectrum in our cohort comprised 1 synonymous (with predicted effect on splicing), 1 splice change, 7 missense, 4 nonsense, and 4 frameshift variants. To establish the pathogenicity of the novel missense variants, we relied heavily on data from the existing literature and on the ACMG guidelines. Lastly, we assessed the status of each variant by comparing it with the Iranome database, to filter out common variants specific to the Iranian population.
Unlike missense substitutions, the majority of nonsense and frameshift DNA changes can be considered bona fide deleterious mutations, since they mostly constitute putative loss-of-function (pLoF) alleles in genes where this pathogenicity mechanism is well known (criterion PVS1 of the ACMG guidelines). We therefore classified all of them as such, based on this feature and the fact that they were all either absent or present at an extremely low frequency in the gnomAD database.
We also found a synonymous change in the BBS2 gene (c.471G > A, p.Thr157=) co-segregating with Bardet-Biedl syndrome in one family (IRN_065) and reported in three previous studies 40,49,50. Due to the high nucleotide conservation and its localization at an exon-intron boundary, it is possible that the c.471G > A substitution impairs the correct splicing of BBS2 pre-mRNA. Indeed, all splicing predictors tested (AdaBoost and RandomForest from dbscSNV, MaxEntScan, and SpliceAI) indicated a high impact on splicing and disruption of the 5' splice site. Our findings thus provide additional support for the potential pathogenicity of this apparently neutral variant. It is worth mentioning that the majority of the previously reported patients with the p.Thr157= mutation originated from Middle Eastern countries, such as Lebanon and Iran 49,50. Although this variant perfectly co-segregates with disease in family IRN_065 and has been described in previous reports in association with Bardet-Biedl syndrome 40,49,50, there is still a chance that it could represent a benign DNA change, detected in homozygosity in our patients by virtue of their ethnic origin. Additional functional studies are needed to definitively confirm its pathogenic role in syndromic IRD.
Since geographic isolation and consanguinity-driven genomic homozygosity lead to the enrichment of rare founder mutations in specific societies or ethnic groups 7,9,51,52, the presence of such mutations in our cohort of patients from related families is not surprising. Similar to other reports 53-55, we identified two mutations that were present in more than one pedigree. The first, p.Ala189ArgfsTer130 in GUCY2D, was shared by two families originating from the Fars province in the Southwest of Iran and was found in a common ROH of 3.0 Mb with an identical haplotype.
The second was a homozygous missense variant (p.Arg400Gln) in the TULP1 gene, detected in two families from the Razavi Khorasan province, in Northeastern Iran, again in a common ROH of 21.5 Mb with an identical haplotype. This latter variant has been previously reported in an Indian family in a homozygous state 33 .
In summary, this work extends current knowledge about the genetic landscape of IRDs in Iran and, in line with previous studies, supports the evidence that homozygosity mapping is an effective tool for uncovering rare genomic variants in consanguineous pedigrees with rare recessive disorders. Most importantly, we hope that our data will contribute to better molecular diagnosis and access to future gene therapy trials in Iran.