An exonic insertion in the NAGLU gene causing Mucopolysaccharidosis IIIB in Schipperke dogs

Mucopolysaccharidosis (MPS) IIIB (Sanfilippo syndrome B; OMIM 252920) is a lysosomal storage disease with progressive neurological signs caused by deficient activity of alpha-N-acetylglucosaminidase (NAGLU, EC 3.2.1.50). Herein we report the causative variant in the NAGLU gene in Schipperke dogs and a genotyping survey in the breed. All six exons and adjacent regions of the NAGLU gene were sequenced from six healthy-appearing and three affected Schipperkes. DNA fragment length and TaqMan assays were used to genotype privately owned Schipperkes. A single variant was found in exon 6 of MPS IIIB affected Schipperkes: an insertion consisting of a 40–70 bp poly-A and an 11 bp duplication of the exonic region preceding the poly-A (XM_548088.6:c.2110_2111ins[A(40_70);2100_2110]). It is predicted to insert a stretch of 13 or more lysines followed by either an in-frame repeat of the four amino acids preceding the lysines or a frameshift. The clinically affected Schipperkes were homozygous for this insertion, and the sequenced healthy dogs were either heterozygous or homozygous for the wild-type allele. From 2003 to 2019, 3,219 Schipperkes were genotyped. Of these, 1.5% were homozygous for the insertion and found to be clinically affected, 23.6% were heterozygous for the insertion and clinically healthy, and the remaining 74.9% were homozygous for the wild-type allele and also clinically healthy. The number of dogs homozygous and heterozygous for the insertion declined rapidly after the initial years of genotyping, documenting the benefit of a DNA screening program in a breed with a small gene pool. In conclusion, a causative NAGLU variant in Schipperke dogs with MPS IIIB was identified and was found at high frequency in the breed. Through genotyping and informed breeding practices, the prevalence of canine MPS IIIB has been drastically reduced in the Schipperke population worldwide.

The mucopolysaccharidoses (MPS) are a group of hereditary lysosomal storage disorders in which specific glycosaminoglycans accumulate in lysosomes due to various enzyme deficiencies. There are up to 12 individual MPS types described in humans 1 and animals 2, with all but MPS II showing autosomal recessive inheritance. Genetic variants in dogs have been identified in the genes associated with MPS I 3, IIIA 4,5, VI 6,7, and VII 8,9. Clinical manifestations of MPS in dogs are most commonly ocular and musculoskeletal 2. In contrast, MPS III, also known as Sanfilippo syndrome, causes a progressive and primarily neurological disease 4,5,10. At the cellular level, it is characterized by primary lysosomal accumulation of heparan sulfate and secondary lysosomal storage of gangliosides 1.
Mucopolysaccharidosis IIIB (also known as Sanfilippo syndrome B) is caused by variants in the alpha-N-acetylglucosaminidase (NAGLU) gene and has been previously characterized at the clinical and molecular level in humans, emus 11, cattle 12, knockout mice 13, and transgenic swine 14,15. Furthermore, MPS IIIB has been clinicopathologically reported in Schipperke dogs (OMIA 001342-9615) 10. At approximately two years of age, affected Schipperkes develop a slowly progressive ataxia leading to humane euthanasia before six years of age 10. This study documents the causative NAGLU gene variant in Schipperke dogs with MPS IIIB, demonstrates the high initial frequency of this allele in the breed, and describes its reduction after genotype screening for the variant allele was introduced.

Methods
Ethylenediaminetetraacetic acid (EDTA) anticoagulated blood and cheek swab/brush samples from clinically affected and healthy-appearing Schipperke dogs were sent to PennGen Laboratories at the University of Pennsylvania, Philadelphia, Pennsylvania, USA, for the diagnosis of MPS IIIB or for genotyping. Affected status was based on severe and progressive ataxia with onset at two to three years of age, as reported by owners or referring veterinarians. The studies were approved by the Institutional Animal Care and Use Committee (IACUC) of the University of Pennsylvania.
Briefly, DNA was extracted from EDTA blood and cheek swab/brush samples using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany). The six exons of the NAGLU gene were sequenced from nine Schipperkes: six healthy-appearing dogs and three dogs clinically affected with MPS IIIB. Primers were designed based on the published canine reference sequences (CanFam3.1 and XM_548088.6) to cover all six exons and at least 90 bp of intronic sequence adjacent to the exons (Table 1). The PCR products (amplified using KOD Xtreme Hot Start DNA Polymerase, EMD Millipore Corp., Billerica, MA, USA) were submitted for direct Sanger sequencing at the University of Pennsylvania's Sequencing Core Facility. The DNA sequences were then aligned to the dog reference sequences (CanFam3.1 and XM_548088.6) to find genetic variants in the exons and splice sites of the NAGLU gene.
Sanger sequencing could not accurately determine the exact sequence of the long homopolymer poly-A insertion in the affected dogs (data not shown). Consequently, the insert sizes were estimated from either gel separation or Sanger sequencing. For genotyping, a specific primer pair (Table 1) was designed to amplify the region surrounding the exon 6 insertion, and the PCR products were subjected to fragment length analysis by electrophoresis on a 6% polyacrylamide gel. Genotypes were assigned from the fragment sizes, which were consistent with the mutant allele, the wild-type allele, or both. To control for potential allelic dropout, samples were tested twice in separate assays. A TaqMan genotyping assay was subsequently developed. Briefly, a VIC dye-labelled probe (TGACAAGAATGCCTTCCAGCT) was designed to anneal to the insertion site of the wild-type allele and a FAM dye-labelled probe (CAAGAATGCCTTCCAAAAA) to a unique sequence of the variant allele with the insertion; the amplification primers were CTGGGTGCCGAAGATAAAGGT and CCCTTCCAACAGCACCAGTT. Genotyping data for MPS IIIB in the Schipperke pet population, based on samples submitted to PennGen Laboratories from 2003 to 2019, were gathered and analyzed.

Results
When Sanger sequencing data of the exonic and surrounding intronic regions of the NAGLU gene from clinically healthy Schipperkes and from Schipperkes affected with MPS IIIB were aligned to the canine reference sequences (CanFam3.1 and XM_548088.6), only a single variant was discovered. The sixth and last exon of the NAGLU gene contains an insertion (XM_548088.6:c.2110_2111ins[A(40_70);2100_2110]) composed of a homopolymer of A residues (poly-A) and an 11 bp duplication of the sequence directly upstream of the poly-A. Three clinically affected dogs were homozygous for this insertion, four clinically healthy dogs were heterozygous, and two clinically healthy dogs were homozygous for the wild-type allele. The poly-A region was at least 40 bp in length; however, the exact length of the poly-A in the insert could not be determined from the Sanger sequencing data because the read quality decreased, presumably due to variation in insert size between the two alleles and/or "slippage" of the polymerases during amplification or sequencing (Fig. 1).
The primer pair designed for genotyping by gel electrophoresis amplified a wild-type fragment predicted to be 169 bp and a longer fragment for the variant allele with the insertion (Fig. 2). The fragments with the insertion were at least 50 bp longer than the wild-type fragment, but the insert lengths varied markedly between individual Schipperkes, ranging from 50 to 80 bp.
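The fragment-length logic described above can be illustrated with a minimal sketch. It is not the laboratory's actual procedure: the 169 bp wild-type size and the 50 to 80 bp insert range come from the text, while the sizing tolerance (`TOLERANCE_BP`) and the function names are hypothetical choices made only for illustration.

```python
WT_BP = 169              # predicted wild-type amplicon size (from the text)
INSERT_RANGE = (50, 80)  # observed insert sizes in bp (from the text)
TOLERANCE_BP = 5         # hypothetical gel sizing tolerance (assumption)


def classify_band(size_bp):
    """Label one band as 'wt', 'ins', or None if it fits neither allele."""
    if abs(size_bp - WT_BP) <= TOLERANCE_BP:
        return "wt"
    low = WT_BP + INSERT_RANGE[0] - TOLERANCE_BP
    high = WT_BP + INSERT_RANGE[1] + TOLERANCE_BP
    if low <= size_bp <= high:
        return "ins"
    return None


def call_genotype(band_sizes_bp):
    """Call a genotype from the band sizes seen in one gel lane."""
    labels = {classify_band(size) for size in band_sizes_bp}
    if not labels or None in labels:
        return "no call"  # empty lane or an unexpected band
    if labels == {"wt"}:
        return "homozygous wild-type"
    if labels == {"ins"}:
        return "homozygous insertion"
    return "heterozygous"


print(call_genotype([170, 230]))
```

A single lane cannot reveal allelic dropout: a heterozygote whose larger fragment failed to amplify would be called homozygous wild-type here, which is why the samples were tested twice.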
From 2003 to 2019, a total of 3,219 Schipperkes were genotyped at PennGen Laboratories. Of these, 2,411 (74.9%) were homozygous for the wild-type allele, 760 (23.6%) were heterozygous, and 48 (1.5%) were homozygous for the insertion (Table 2 and Fig. 3). All Schipperkes homozygous for the variant had or developed clinical signs of MPS IIIB unless lost to follow-up before reaching the age of onset of clinical signs (≥2 years). As this was a genotyping survey, there was no means to follow cases closely. Of the 48 animals that tested homozygous for the mutation, 54.2% (n = 26) were of an age at which they were definitively displaying clinical signs of MPS IIIB. The remaining animals were younger than the extreme limit for onset of signs (3 years of age). Of these dogs (n = 22), none was subsequently reported to have remained free of disease, although some may well have been euthanized before the expected age of disease onset.
The number of Schipperke samples submitted for genotyping declined rapidly and drastically after the first years of screening. Similarly, the number of samples from Schipperkes genotyped as homo- or heterozygous for the insertion declined. However, while the absolute number of mutant alleles decreased strikingly, the per-year frequency of the allele in submitted samples did not decrease, and carriers and (rarely) affected dogs have been identified even recently (Table 2 and Fig. 3). Screening of Schipperkes from North America, Europe, Australasia, and Russia revealed carrier dogs in all these regions, indicating the worldwide distribution of the mutant allele (data not shown).
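The distinction between absolute allele counts and per-year allele frequency can be examined with a short sketch that uses the yearly genotype counts reported in Table 2. The assignment of the 17 yearly columns to 2003 through 2019 is an assumption based on the stated survey period and on the first-year total of 1,097 dogs; the function name is illustrative.

```python
# Yearly genotype counts from Table 2. ASSUMPTION: the 17 yearly columns
# correspond to 2003..2019 in order.
YEARS = list(range(2003, 2020))
AFFECTED = [22, 11, 0, 0, 1, 1, 1, 2, 1, 1, 4, 1, 2, 0, 0, 1, 0]
CARRIER = [234, 86, 60, 72, 34, 49, 22, 39, 17, 45, 20, 26, 17, 17, 8, 7, 7]
NORMAL = [841, 200, 142, 228, 153, 111, 78, 77, 64, 93, 81, 67, 65, 42, 64, 53, 52]


def allele_frequency(affected, carrier, normal):
    """Mutant allele frequency: two mutant alleles per affected dog, one per carrier."""
    total_dogs = affected + carrier + normal
    return (2 * affected + carrier) / (2 * total_dogs)


per_year = {
    year: allele_frequency(a, c, n)
    for year, a, c, n in zip(YEARS, AFFECTED, CARRIER, NORMAL)
}
overall = allele_frequency(sum(AFFECTED), sum(CARRIER), sum(NORMAL))

for year, freq in per_year.items():
    print(year, f"{freq:.3f}")
print("overall", f"{overall:.3f}")
```

Because later years include far fewer dogs, the yearly estimates carry much more sampling noise than the first-year estimate and should be read with that in mind.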

Discussion
The canine NAGLU gene is on chromosome 9 and comprises six exons (XM_548088.6), with exon 6 being by far the longest (1450 bp). It codes for the lysosomal acid hydrolase alpha-N-acetylglucosaminidase (EC 3.2.1.50), which consists of 747 amino acids (XP_548088.2), including the signal sequence. The exonic NAGLU sequence from all nine dogs sequenced was identical to the reference genomic sequence except for the disease-associated variant. The protein sequence shows close homology to human (86% identity) and other mammalian (76–90%) sequences, which is expected for a housekeeping gene (https://www.ncbi.nlm.nih.gov/homologene/222).
Schipperkes with MPS IIIB have a NAGLU insertion near the end of exon 6 that consists of a poly-A sequence followed by a duplication of the preceding 11 bp of wild-type sequence. In the native context, this 11 bp sequence is flanked by AA (two adenines) at the 5′ end and A (one adenine) at the 3′ end. Because the insertion contains a poly-A sequence and the molecular mechanism of the sequence repetition is not known, the repeat could actually have been a 14 bp repeat of the native sequence. We have chosen the conservative assessment that the A residues at both ends of the insertion were part of an exogenous poly-A insert. The insertion is predicted to result in the addition of many lysines after the 704th amino acid (a glutamine that results from a synonymous variant caused by the insertion) in the NAGLU protein, with three potential consequences past the stretch of inserted lysines depending on the actual length of the poly-A insertion: (1) the sequence stays in-frame with an insertion of a repeat of the four native amino acids preceding the lysines (asparagine, alanine, phenylalanine, glutamine), (2) a frameshift with an early stop codon, or (3) a frameshift with the lack of a stop codon. In any case, the exonic insertion is predicted to disrupt the C-terminal end of the enzyme in affected dogs. We had previously shown a lack of NAGLU enzyme activity, as well as lysosomal storage, in affected dogs, but neither immunoblotting nor gene expression studies were performed to further confirm the disruptive nature of this genetic variant.
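How the poly-A length determines the reading-frame outcome can be sketched by translating a short stretch of sequence around the insertion site. This is an illustration under stated assumptions, not the authors' analysis: the local wild-type context is taken from the VIC probe sequence given in Methods, and it is assumed that c.2110 is the first base of codon 704, so the probe minus its first base is read in frame. Only four bases downstream of the insertion site are known from the probe, so the sketch cannot distinguish an early stop codon from the absence of a stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Local context from the VIC probe (TGACAAGAATGCCTTCCAGCT), first base dropped.
# ASSUMPTION: the final C of UPSTREAM is c.2110, the first base of codon 704.
UPSTREAM = "GACAAGAATGCCTTCC"  # ends at c.2110
DOWNSTREAM = "AGCT"            # starts at c.2111
DUPLICATION = UPSTREAM[-11:]   # the 11 bp repeated after the poly-A


def translate(dna):
    """Translate complete codons only; trailing bases are ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def mutant_peptide(poly_a_len):
    """Local peptide encoded by the variant allele for a given poly-A length."""
    return translate(UPSTREAM + "A" * poly_a_len + DUPLICATION + DOWNSTREAM)


def in_frame(poly_a_len):
    """True when the total inserted length is a multiple of three."""
    return (poly_a_len + len(DUPLICATION)) % 3 == 0


for n in (40, 41, 42):
    print(n, in_frame(n), mutant_peptide(n))
```

Under these assumptions, a 40 bp poly-A keeps the reading frame and yields 13 lysines followed by the repeated asparagine, alanine, phenylalanine, and glutamine, whereas poly-A lengths of 41 or 42 bp shift the frame after the lysine stretch.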
A review of human NAGLU variants in ClinVar (https://www.ncbi.nlm.nih.gov/clinvar; accessed October 14, 2019) identified 181 variants. Of the 77 that are labelled as pathogenic or likely pathogenic, 42 are in exon 6. However, only one (c.2116C>T, p.Gln706Ter) is near the location of the insertion seen in MPS IIIB Schipperkes. The c.2116C>T variant was reported in a 6-year-old female child with severe degenerative neuropathy due to MPS IIIB 16.
Interestingly, several disease-causing poly-A insertions are known in dogs that show the same pattern of a poly-A flanked at both ends by a duplicated/repeated native sequence 17–21. Such inserts with characteristic repeats may be the result of a target-primed reverse transcription mechanism 22. Some are also known to vary in the length of their poly-A, for example the FXI variant in Kerry Blue Terriers with Factor XI deficiency 17. In cattle and emus with MPS IIIB, the disease-causing NAGLU variants are a missense (c.1354G>A, p.Glu452Lys) and a frameshift deletion (c.1098_1099delGG), respectively, and both are also located in exon 6.
Occasionally, when heterozygotes were genotyped by fragment length, the amplification preferentially produced the smaller wild-type amplicon and failed to amplify the larger fragment, resulting in allelic dropout. This did not appear to be a factor in the homozygous affected dogs. In rare circumstances, this preferential amplification could have led to misidentification of heterozygotes as homozygotes for the wild-type allele. A cause was not identified, but analyzing the samples in separate assays eliminated the allelic dropout issue. The TaqMan genotyping assay clearly discriminated all three genotypes, and this technique was not affected by any apparent allelic dropout artifact.
Given the devastating progressive clinical course of canine MPS IIIB in Schipperkes, breeders, owners, and veterinary clinicians were eager to genotype their dogs and patients. While this represents a biased population within the breed, a striking number of dogs homozygous or heterozygous for the mutant allele were identified. The survey of NAGLU variant genotyping results in Schipperkes shows characteristic dynamics for a canine breed with a small gene pool. According to the UK Kennel Club, no more than 51 puppies were registered each year from 2009 to 2018 (https://www.thekennelclub.org.uk/media/129029/10yrstatsutility.pdf). Before the disease was discovered at the turn of the century and the molecular basis was established, no clinical screening test for MPS IIIB carrier dogs was available, and the prevalence of the mutant allele was not known. Following the initiation of genotyping in 2003, about one third of the 3,219 Schipperkes whose genotypes are presented herein were tested in the first year of screening and another third during the following five years. The last third were genotyped during the next 11 years, with <140 dogs tested per year. Strikingly, in the first year of genotyping, heterozygotes represented 21.3% of the 1,097 animals screened that year.
Overall, dogs homozygous and heterozygous for the mutant allele represented 1.5% and 23.6% of those tested, respectively, indicating a mutant allele frequency of 0.133, which is high for a canine breed and reflects an ancient popular sire/dam effect, a population bottleneck, and/or close inbreeding affecting the Hardy-Weinberg equilibrium. Indeed, a potential founder individual up to eight generations deep was found in the pedigree, with six separate lines of descent, which was further compounded by multiple lines of descent from two intermediate animals 10. While a specific popular sire/dam was not identified through testing, the mutant allele was widespread in the breeding population worldwide.

Table 2. Number of Schipperkes by genotype. [The headings of the three columns that followed the overall total were not recoverable; the yearly rows are assigned to the 2003 to 2019 survey period in order.]

Group            Affected  Carrier  Normal  Total
All dogs               48      760    2411   3219
(heading lost)         12      391    1290   1693
(heading lost)         28      343    1092   1463
(heading lost)          8       26      29     63
2003                   22      234     841   1097
2004                   11       86     200    297
2005                    0       60     142    202
2006                    0       72     228    300
2007                    1       34     153    188
2008                    1       49     111    161
2009                    1       22      78    101
2010                    2       39      77    118
2011                    1       17      64     82
2012                    1       45      93    139
2013                    4       20      81    105
2014                    1       26      67     94
2015                    2       17      65     84
2016                    0       17      42     59
2017                    0        8      64     72
2018                    1        7      53     61
2019                    0        7      52     59
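The allele frequency of 0.133 and the reference to Hardy-Weinberg equilibrium can be made concrete with a small calculation from the overall genotype counts. This is a sketch only and not an analysis reported in the study: the submitted samples are a biased, multi-year sample rather than a random draw from one breeding population, so the comparison is illustrative. The function name is hypothetical.

```python
# Overall genotype counts reported for 2003 to 2019.
OBSERVED = {"affected": 48, "carrier": 760, "normal": 2411}


def hardy_weinberg(observed):
    """Return mutant allele frequency, expected genotype counts, and chi-square."""
    n = sum(observed.values())
    q = (2 * observed["affected"] + observed["carrier"]) / (2 * n)
    p = 1 - q
    expected = {
        "affected": q * q * n,     # homozygous for the insertion
        "carrier": 2 * p * q * n,  # heterozygous
        "normal": p * p * n,       # homozygous wild-type
    }
    chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
    return q, expected, chi2


q, expected, chi2 = hardy_weinberg(OBSERVED)
print(f"mutant allele frequency: {q:.3f}")
for genotype, count in expected.items():
    print(f"expected {genotype}: {count:.1f} (observed {OBSERVED[genotype]})")
print(f"chi-square (1 degree of freedom): {chi2:.2f}")
```

With three genotype classes and one allele frequency estimated from the data, the chi-square statistic has one degree of freedom.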
Our genotyping was in complete concordance with phenotype, except for those dogs too young (<3 years) to show clinical signs and lost to follow-up. There is no evidence that heterozygous Schipperkes have any advantage over homozygous wild-type animals, but the lines of dogs with the mutant allele likely had other desirable traits that led to their use for breeding worldwide. During the first year of genotyping, many more affected and carrier dogs were found than in subsequent years, reflecting the impact of the genotyping program. As this is a storage disease with an adult onset of clinical signs, some affected dogs were noted to be in the breeding pool. Our initial recommendation for screening was to test all breeding Schipperkes and to perform only two types of matings: (1) matings between dogs free of the mutant allele or (2) matings of carrier dogs with proven clear dogs. While we recommended direct testing of all breeding dogs, breeders likely used the initial results for subsequent breeding selection and assumed that the offspring were clear by descent. The rare affected dogs were likely the result of untested parents or misattributed parentage. The marked reduction in Schipperkes homozygous or heterozygous for the variant indicates the positive impact the screening had on the breeding population. While PennGen was the only diagnostic laboratory offering genotyping, the survey is still biased by breeder and pet owner interest, the discovery of carrier and affected dogs in certain breeding lines and kennels, as well as the further use of carrier dogs for breeding. Once all breeding dogs are screened and parentage is assured, there may no longer be a need for screening. Similarly, beneficial effects of genotype screening were seen for various other serious hereditary disease traits in specific canine breeds, including copper toxicosis 23, leukocyte adhesion deficiency 24, and myotonia congenita 25.
In conclusion, a causative variant of MPS IIIB in Schipperke dogs was identified, found to be widely disseminated in the breed, and drastically reduced in the Schipperke population worldwide by effective genotyping and breeding practices.

Data availability
All data are available either in the manuscript or as Supplementary Information. Any data generated and/or analyzed during this study that are not already included in the manuscript or Supplementary Information will be made available by the corresponding author on reasonable request.