Genetic testing of leukodystrophies unraveling extensive heterogeneity in a large cohort and report of five common diseases and 38 novel variants

This study evaluates the genetic spectrum of leukodystrophies and leukoencephalopathies in Iran. From 2016 to 2019, 152 children aged 1 day to 15 years were genetically tested for leukodystrophies and leukoencephalopathies on the basis of clinical and neuroradiological findings. Patients whose MRI patterns and clinical findings suggested a specific known leukodystrophy, e.g. metachromatic leukodystrophy, Canavan disease or Tay-Sachs disease, were tested for mutations in single genes (108 of 152; 71%), while patients with less suggestive findings were evaluated by NGS. In total, 114 (75%) affected individuals had (likely) pathogenic variants, which included 38 novel variants. Thirty-five different types of leukodystrophies and genetic leukoencephalopathies were identified. The more common disorders included metachromatic leukodystrophy (19 of 152; 13%), Canavan disease (12; 8%), Tay-Sachs disease (11; 7%), megalencephalic leukodystrophy with subcortical cysts (7; 5%), X-linked adrenoleukodystrophy (8; 5%), Pelizaeus–Merzbacher-like disease type 1 (8; 5%), Sandhoff disease (6; 4%), Krabbe disease (5; 3%), and vanishing white matter disease (4; 3%). Whole exome sequencing (WES) identified leukodystrophies and genetic leukoencephalopathies in 90% of the patients tested. The overall diagnostic rate was 75%. This study presents national genetic data on leukodystrophies and may provide clues to the genetic pool of neighboring countries. Patients with clinical and neuroradiological evidence of a genetic leukoencephalopathy should undergo genetic analysis to reach a definitive diagnosis. This will allow a diagnosis at earlier stages of the disease, reduce the burden of uncertainty and costs, and provide the basis for genetic counseling and family planning.


Background
Leukodystrophies and genetic leukoencephalopathies are a large, heterogeneous group of genetic diseases affecting the white matter of the central nervous system. The individual diseases are rare, but together they affected 1 in 7663 live births in a US study 1 ; the estimated prevalence of leukodystrophies in Germany is about 1-2/100,000 live births 2 . Most of these diseases are associated with severe progressive loss of motor and cognitive abilities, helplessness and early death. They are caused either by primary defects of myelin synthesis and myelin stability or by myelin damage secondary to disturbances outside this structure 3 . Some mitochondrial and lysosomal storage disorders, organic acidemias, other inborn errors of metabolism and vascular disorders are also categorized as genetic leukoencephalopathies 4 .

Results
Demographic, clinical and genetic evaluation of patients confirmed genetically. Thirty-five different leukodystrophies and genetic leukoencephalopathies were identified in this study (Table 1). The clinical characteristics of the patients with the most common genetically confirmed disorders are summarized in Table 1 and Fig. 1A. The main clinical manifestations were motor regression and neurological complaints including dystonia, hypotonia, developmental delay, ataxia, tremor, seizure, macrocephaly, nystagmus, and cognition and learning impairment (Table 1 and Supplementary Table 2). A total of 114 (75%) patients were confirmed by genetic testing. Males accounted for 73 of 114 (64%) patients. The mean age of onset was 5 years and 1 month ± 18 years and 11 months. Ninety-four of 114 (82.5%) patients were born into consanguineous families. The ethnicities of these patients are compared in Fig. 1B. Fars patients formed the largest group (32%), followed by Turk (27%), Arab (13%), Lur (8%), Kurd (7%), Mazani (4%) and Gilak (3%); the remainder were Balooch, Afghan, Lak and Turkeman (Fig. 1B).
By age of onset, 47 patients were infantile (I; 41%), 17 late infantile (LI; 15%), 29 early juvenile (EJ; 25%), 19 late juvenile (LJ; 17%) and 2 adult (A) (Supplementary Table 2). In 38 of 152 (25%) patients, genetic analysis did not confirm a diagnosis. Some patients with a negative single gene analysis were not tested by panel-based analysis because their parents did not agree to the test (Fig. 2). In addition, panel-negative patients did not undergo WES. In total, 74 of 108 (69%) patients were genetically diagnosed by single gene analysis (Table 1, Fig. 2).

Single gene analyses. Sixteen patients had mutations in the
Next generation sequencing: gene-panel and WES. Gene-panel analysis and WES identified leukodystrophies and leukoencephalopathies in 40 of 44 (90%) patients (Table 1, Supplementary Table 2). In four cases, multigene panel analysis of leukodystrophies did not show any variants (Fig. 2).
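As an illustration of how the stage-wise and overall diagnostic yields relate to one another, the following minimal sketch recomputes them from the counts given in the text and adds Wilson 95% confidence intervals. The intervals are not reported in the study; they are included here only to indicate the uncertainty attached to each proportion, and the grouping of the counts into a dictionary is an assumption made for the example. Because the script prints unrounded values, 40/44 appears as 90.9%.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Return (low, high) bounds of the Wilson score interval for a proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Counts taken from the text: (patients diagnosed, patients tested).
stages = {
    "single gene": (74, 108),
    "gene panel / WES": (40, 44),
}

# The overall yield is the sum over both diagnostic routes.
diagnosed = sum(d for d, _ in stages.values())
tested = sum(n for _, n in stages.values())
stages["overall"] = (diagnosed, tested)

for name, (d, n) in stages.items():
    low, high = wilson_interval(d, n)
    print(f"{name}: {d}/{n} = {d / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

The Wilson interval is used rather than the simpler normal approximation because it behaves better for small groups and for proportions close to 100%, as in the 44 patients evaluated by NGS.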
Frequency of lysosomal, peroxisomal and mitochondrial disorders and errors of intermediary metabolism. Fifty of 114 patients were diagnosed with lysosomal disorders (29 lysosomal LD and 21 lysosomal gLE) (Table 1, Fig. 2 and Supplementary Table 2).
Eleven patients were diagnosed with peroxisomal disorders, eight of whom had X-ALD (Table 1, Fig. 2 and Supplementary Table 2).
Forty patients were diagnosed with errors of intermediary metabolism, including 12 with CD, 8 with PMLD and 7 with MLC (Table 1). CD, the most common degenerative cerebral disease due to abnormal amino acid/organic acid metabolism, was the second most common disease in our population (Table 1, Fig. 2 and Supplementary Table 2).
Novel variants. Thirty-eight novel variants were identified in 40 patients (Table 2). ABCD1 and GJC2 each showed four novel variants. The following genes each had two novel variants: ASPA, FUCA, GALC, HEXA, L2HGDH and MLC1 (Table 2). The variants were classified according to the ACMG guideline; 11 variants met the criteria for pathogenic, 17 were likely pathogenic and 10 were variants of uncertain significance (VUS).

Discussion
Genetic diagnosis of childhood leukodystrophies has increased rapidly over the past years in Iran and worldwide; approximately 30 leukodystrophies and more than 60 disorders have been classified as genetic leukoencephalopathies 4 . This study provides a comprehensive spectrum of leukodystrophies and other genetic leukoencephalopathies in Iran as referred to a tertiary pediatric center. In total, 35 types of leukodystrophies were determined in the studied population. Guided by the brain MRI pattern, single gene analysis by direct Sanger sequencing confirmed the diagnosis in approximately 69% (74 of 108) of the referred patients. Clinical diagnosis reduced the number of genes to be evaluated. Panel-based analysis also confirmed leukodystrophies in 90% (40 of 44) of the cases. Our diagnostic rate for panel-based analysis was comparable to that of other studies 6 .
MLD was the most common cause of leukodystrophy in our population 9 . The next most common diseases were CD, TSD, PMLD, X-ALD and then MLC. MLC was the most common disease among Turk patients (6 of 23), while PMLD may be common among the Arab population in our study. Moreover, the ten most common diseases in this study comprise 70% of all diagnosed patients (80 of 114) (Table 1). A recent study found peroxisomal disorders to be common, although other disorders common in that study, including Aicardi Goutières Syndrome, TUBB4A-related leukodystrophy, POLR3-related Leukodystrophy and Pelizaeus-Merzbacher Disease, were not found at high frequency in ours 10 . ABCD1 had the highest relative frequency in their study, while ARSA was the most common in our population.
Clinically, we had unsolved cases with variable phenotypic features or overlapping neurological manifestations, which were candidates for gene-panel and/or WES analysis. Nevertheless, some patients remained without a genetic diagnosis even after panel-based analysis. This could be due to intronic variants, copy number variations, unknown gene defects or multigenic effects. Therefore, further genetic analysis should be performed for these cases; they could benefit from reanalysis of exome sequencing data, genome sequencing and transcriptomics. In the genetic analysis of rare diseases, NGS may unravel more genes related to leukodystrophies in patients with unsolved genetics 6,11 .
Lysosomal diseases accounted for 43% of our studied population; these could be managed if diagnosed at an earlier age. Individuals with known causal variants benefit in terms of anticipating unexpected clinical presentations, prognosis, palliative treatment and avoidance of unnecessary treatments. Hematopoietic stem cell transplantation (HSCT) has been used for lysosomal storage diseases 7 . Some of our patients might have benefitted from HSCT at early stages of the disease. However, follow-up of patients for HSCT is beyond the scope of this study.
Some of these diseases have an ethnic-specific distribution, e.g. TSD in the Ashkenazi Jewish population, GM1 gangliosidosis in the Rudari isolate and MLD in the Western Navajo Nation 12 . MLD patients were from the western part of Iran 9 . Four of our TSD patients were from northern parts of Iran.
The peroxisomal disorders, a heterogeneous group, are due to a defect in the function (e.g. X-ALD) or biogenesis (e.g. Zellweger spectrum) of peroxisomes. X-ALD, the most common peroxisomal disorder, is caused by mutations in the ABCD1 gene, which is co-expressed with the HSD17B4 gene. Patients with X-ALD could benefit from HSCT 13 or hematopoietic stem-cell gene therapy 8 .
CD was the second most frequent disease in our study. It is the most common disease during infancy and has been observed mainly in Ashkenazi Jews, whereas the patients in our study were from various ethnicities. Various experimental therapies for Canavan patients are under investigation 14 . Patients with a known genetic etiology may benefit from such experimental therapies.
PMLD is responsible for 8% of hypomyelinating leukodystrophy patients 15 . In this study, 7% of the patients had the disease. In addition to GJC2, mutations in other genes such as the myelin-associated glycoprotein (MAG) gene have been reported to cause PMLD 16 . GJC2 is co-expressed with PLP1 and interacts with the products of the FAM126A, POLR3A and EIF2B5 genes. Our results highlight that PMLD may have a higher frequency than PMD in our population, especially in the Arab and Fars ethnicities. Also, six of the MLC patients were of Turk ethnicity; MLC may be a common disorder limited to specific ethnicities, e.g. those from Turkey.
Eleven percent of patients were diagnosed with mitochondrial genetic leukoencephalopathies; Leigh syndrome and L-2-HGA accounted for 4 and 3 of them, respectively. Leigh spectrum was due to variants in SURF1 as well as in the NDUFS1, NDUFS7 and SDHAF1 genes. Variants in L2HGDH, which encodes mitochondrial L-2-hydroxyglutarate dehydrogenase, may be common in our ethnicities. The mechanism of leukodystrophy is very complicated, and there may be proteins involved in disease progression that show overlapping phenotypes but have no known interaction with each other.
Analysis of founder effect and hotspot mutations. An ancestral or founder effect, a genetic signature within an ethnicity, usually leads to a high frequency and homozygosity of a mutation in that cohort; in contrast, a specific mutation that is distributed uniformly among many ethnicities is known as a mutational hotspot. Haplotype analysis is used to determine whether a mutation is a hotspot or a founder mutation. The studied mutations of ABCD1 (c.1415_1416delAG), ASPA (c.634+1G>T and c.237_238insA) and HEXA (c.1528C>T) show a wide distribution around the world [17][18][19][20] ; in particular, c.634+1G>T in the ASPA gene was first reported from Turkey, and we found it in patients of Fars, Afghan, Lur and Arab ethnicities 18 . These mutations are (Table 1).
PMD and PMLD are disorders of myelin genes. Four patients had vWM, two had hypomyelination-hypogonadotropic-hypogonadism-hypodontia, and one each had hypomyelination and congenital cataract, PMD, AxD, infantile neuroaxonal dystrophy/atypical neuroaxonal dystrophy, hypomyelinating leukodystrophy 9 (HLD9, MIM 616140), Cockayne syndrome (CS, MIM 133540) and biotinidase deficiency. Thirteen patients were diagnosed with mitochondrial genetic leukoencephalopathies; Leigh syndrome and L-2-HGA accounted for 4 and 3 of them, respectively.
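The founder-versus-hotspot distinction drawn in this section can be sketched as a simple rule on carrier haplotypes: if nearly all chromosomes carrying a mutation share the same flanking-marker haplotype, a common ancestor is the likelier explanation, whereas the same mutation on several unrelated haplotype backgrounds points to independent recurrence. The sketch below is a toy illustration only. The function name, the 0.8 threshold and the example haplotypes are hypothetical and are not taken from the study; a real haplotype analysis would also have to deal with phasing, recombination and marker spacing.

```python
from collections import Counter


def classify_mutation(carrier_haplotypes, min_shared=0.8):
    """Label a recurrent mutation from the flanking-marker haplotypes of its carriers.

    carrier_haplotypes: one tuple of marker alleles per carrier chromosome.
    min_shared: hypothetical cut-off for the share of chromosomes that must sit
    on the most frequent haplotype to call a founder effect.
    """
    if len(carrier_haplotypes) < 2:
        return "undetermined"  # a single chromosome cannot show sharing
    counts = Counter(carrier_haplotypes)
    _, top_count = counts.most_common(1)[0]
    share = top_count / len(carrier_haplotypes)
    return "founder" if share >= min_shared else "hotspot"


# Invented example data: alleles at three markers flanking the mutation.
shared_background = [("A", "T", "12")] * 4
mixed_background = [("A", "T", "12"), ("G", "C", "9"), ("A", "C", "15"), ("G", "T", "12")]

print(classify_mutation(shared_background))
print(classify_mutation(mixed_background))
```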

Challenges and limitations.
Our registry does not include all affected patients; only the patients referred to our center for genetic testing were included in this study. In addition, Children's Hospital is a tertiary center in Tehran, and some patients around the country may not have been registered and/or may have died before registration. Therefore, a multicenter registry is needed. The incidence of the disease in this part of the world may be different because of consanguineous marriages. The incidence was higher among Fars and Turk patients; however, these ethnicities also make up a large share of the population of Iran.

Conclusion
In conclusion, five common disorders are responsible for more than fifty percent of leukodystrophies in this region. Because Iran, as the crossroads of the Middle East, is composed of more than 15 ethnicities 22 , these findings may reflect the distribution of leukodystrophies in the Middle East, especially in its neighboring populations. For instance, PMLD may be common in Arab countries, while MLC may have a high frequency in Turkish countries. Genetic analysis provides diagnostic confirmation of the disease and allows physicians to establish the prognosis and manage patients and affected families. Genetic testing following counseling reduces the family's worry about the diagnosis and avoids further costs. The mortality rate in affected families is very high, which underscores the necessity of genetic testing in the country. Moreover, this study provides information to support future therapeutic planning in the country. This will allow a diagnosis at earlier stages of the disease, reduce the burden of uncertainty and costs, and provide the basis for genetic counseling and family planning.

Methods
The coding regions and exon-intron boundaries of the genes were enriched using a NimbleGen kit (NimbleGen, Roche, Basel, Switzerland). Sequencing was performed on an Illumina HiSeq 2000 (Illumina, San Diego, California, USA). Reads were aligned to the reference genome (hg19) using Burrows-Wheeler Aligner (BWA); variants were called by SAMTools and annotated by GATK and ANNOVAR. Variants were selected for analysis based on the 1000 Genomes and dbSNP databases. Of the target region, 99% was covered at a depth of at least 30X. In addition, WES was performed for only three patients, with an average coverage depth of ≈100X. Sanger sequencing was done for the candidate variants in the affected families.
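The coverage figure reported for the sequencing (99% of the target region at a depth of at least 30X) is a breadth-of-coverage statistic: the fraction of targeted positions whose read depth reaches a threshold. A minimal sketch of that calculation follows. The per-position depths are invented for the example; in practice they would come from a depth report produced by the alignment tooling, and the function name is hypothetical.

```python
def fraction_covered(depths, min_depth=30):
    """Fraction of target positions whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("no target positions supplied")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


# Invented per-position read depths across a small target region.
depths = [120, 95, 30, 29, 64, 210, 31, 88, 45, 150]

breadth = fraction_covered(depths)
mean_depth = sum(depths) / len(depths)
print(f"{breadth:.0%} of positions at >=30X, mean depth {mean_depth:.1f}X")
```

Breadth at a threshold and mean depth answer different questions: a high mean depth can hide poorly covered exons, which is why both figures are usually quoted together.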
Variant categories. The sequence data were compared with public databases and filtered to identify candidate variants according to published pipelines. The candidate variants were categorized as previously reported pathogenic variants or novel variants. ACMG guideline criteria were used to interpret and classify novel variants 23 .
In silico analyses. Pathogenic effect. According to HGVS (http://varnomen.hgvs.org/), novel variants were named as missense, nonsense, splice site, intronic, regulatory and indel. The following software tools were applied to predict the pathogenic effects of novel variants: polymorphism phenotyping (PolyPhen-2 v2.1) 24 , combined annotation dependent depletion (CADD) 25

Data availability
There are no additional unpublished data.