A large deletion on CFA28 omitting ACSL5 gene is associated with intestinal lipid malabsorption in the Australian Kelpie dog breed

Inborn errors of metabolism are genetic conditions that can disrupt intermediary metabolic pathways and cause defective absorption and metabolism of dietary nutrients. In an Australian Kelpie breeding population, 17 puppies presented with intestinal lipid malabsorption. Juvenile dogs exhibited stunted postnatal growth, steatorrhea, abdominal distension and a wiry coat. Using genome-wide association analysis, an associated locus on CFA28 (Praw = 2.87E−06) was discovered and validated in a closely related population (Praw = 1.75E−45). A 103.3 kb deletion NC_006610.3CFA28:g.23380074_23483377del, containing genes Acyl-CoA Synthetase Long Chain Family Member 5 (ACSL5) and Zinc Finger DHHC-Type Containing 6 (ZDHHC6), was characterised using whole transcriptomic data. Whole transcriptomic sequencing revealed no expression of ACSL5 and disrupted splicing of ZDHHC6 in jejunal tissue of affected Kelpies. The ACSL5 gene plays a key role in long chain fatty acid absorption, and a phenotype similar to that of our affected Kelpies has been observed in a knockout mouse model. A PCR-based diagnostic test was developed and confirmed a fully penetrant autosomal recessive mode of inheritance. We conclude that the structural variant causing a deletion of the ACSL5 gene is the most likely cause of intestinal lipid malabsorption in the Australian Kelpie.

Animal selection and phenotype selection. Related juvenile AK dogs were observed to exhibit stunted postnatal growth and intestinal lipid malabsorption. Affected individuals remain a third to one half the size of their littermates during development and mature so that adult dogs are smaller in stature and exhibit persistent intolerance to a fatty diet. Starting in 2011, 17 of 319 puppies, from 45 litters, were born at one Australian kennel presenting with identical clinical features (Fig. 1). As neonates, affected puppies are indistinguishable from their littermates, but rapidly show clinical signs of polyphagia, failure to thrive, stunted growth (around one-third to one-half of the size of their siblings; Fig. 2a), yellowish poorly digested loose and pulpy faeces (Fig. 2b), increased faecal volume, and frequent defecation. Once affected puppies are transferred to a solid diet with digestive enzyme supplementation, faecal consistency normalises. From around six months of age, most affected Kelpies appear to outgrow the characteristic clinical presentation. However, the dogs remain smaller in stature than their siblings, consistently produce more voluminous faeces than age-matched dogs, and their intolerance to high-fat foods persists throughout their lives.

Figure 1. Females and males are indicated by circles and squares, respectively. Filled symbols indicate affected samples, half-filled symbols represent carriers of the disease allele based on autosomal recessive inheritance. Offspring from a single litter are represented by a line descending from a horizontal connection between parent symbols. A triangle has been used to designate multiple samples (N) from a single litter that are not affected or suspected to be carriers based on recessive inheritance. Litters that included zero affected samples have not been included. Affected samples highlighted in blue were included in the study. Diagnostic testing found all samples to be homozygous for the disease-associated variant.
This study involved 265 Kelpies. Samples were made up of 35 AK (10 cases and 25 controls), 225 AWK (225 controls), and 5 international Kelpies (one case and four controls). Cases in this study represent dogs that adhered to the described clinical presentation. Cases were easily recognised through signs of ill thrift, faecal appearance (steatorrhea), and stunted growth when compared to littermates. Seven cases from the originally described kennel have been included in this study. Two samples from separate kennels were reported as dams of affected pups. They were included as control samples and treated as obligate carriers when observing results.
Biological samples were collected as whole blood in EDTA tubes or buccal cells using cheek swabs. Genomic DNA was extracted using the PureLink Genomic DNA Mini Kit (Invitrogen, Carlsbad, CA, USA) or submitted as EDTA blood to the genotyping service provider on Whatman Flinders Technology Associates (FTA) cards, supplied by the genotyping service. Genotyping array data for 255 samples was obtained from the CanineHD BeadChip (Illumina, San Diego, CA, USA) by Neogen (Lincoln, NE, USA).
A full post-mortem was conducted at the Veterinary Pathology Diagnostic Services (University of Sydney, Camperdown, NSW, Australia) on a 17-week-old affected AK pup that was euthanized with approval by the owner on welfare grounds. A thorough examination was conducted on tissue of the lung, spleen, liver, heart, major cardiac vessels, lymph nodes, thyroid gland, kidney, bone marrow, pancreas, small intestine (duodenum, jejunum and ileum), brain and spinal cord.

Genome-wide association study (GWAS).
To detect and validate signals associated with malabsorption in the Kelpie population, two case-control GWAS were performed using Plink 1.9 (--assoc) 12. Quality control of genotypic data was conducted on 25 AK, five internationally bred Kelpies, and 225 AWK. Single nucleotide variants (SNVs) were excluded if they exhibited a call rate of less than 90% (--geno) or a minor allele frequency below 10% (--maf). Pairwise identity by descent was calculated (--genome) to detect and remove duplicated or highly related individuals. Population stratification was visualised using a multidimensional scaling (MDS) plot with two dimensions (--mds). One sample from each pair with a pairwise identity by descent > 0.7 was excluded. This was done to control for inflation resulting from cryptic relatedness and population stratification. Population stratification in the preliminary GWAS was determined by the genomic inflation factor based on the median chi-squared statistic. The primary GWAS was conducted using 30 Kelpies, including 25 AK and five internationally bred Kelpies. Both groups show evidence of carrying the studied trait, as reflected in our dataset. To control for the testing of multiple hypotheses, genome-wide significant and suggestive thresholds were Bonferroni-corrected, 5 × 10⁻⁷ (Bonferroni cut-off of α = 0.05, n = 99,326) and 1 × 10⁻⁵ (Bonferroni cut-off of α = 1.0, n = 99,326), respectively. Reported P-values are chi-square allelic test P-values as calculated in Plink. The 200 most associated markers from the unstratified preliminary GWAS were taken forward to a second analysis that added 225 control dogs from the closely related population of AWK.
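The two thresholds quoted above follow directly from dividing α by the number of SNVs tested, and the allelic test is a 1-degree-of-freedom chi-square on a 2 × 2 table of allele counts. The sketch below illustrates both calculations in plain Python; it is not the Plink implementation, and the allele counts passed to `allelic_chi2` are hypothetical values chosen only to resemble a cohort of 3 cases and 27 controls.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value cut-off for a family-wise error rate of alpha."""
    return alpha / n_tests


def allelic_chi2(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Chi-square (1 df) allelic test on a 2 x 2 table of allele counts.

    Returns (chi2, p). For 1 df the upper-tail probability equals
    erfc(sqrt(chi2 / 2)), so no external statistics library is needed.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    denom = (case_a + case_b) * (ctrl_a + ctrl_b) * (case_a + ctrl_a) * (case_b + ctrl_b)
    if denom == 0:
        # A zero margin means the marker is uninformative.
        return 0.0, 1.0
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


N_SNVS = 99_326  # SNVs remaining after pruning, as stated in the text
significant = bonferroni_threshold(0.05, N_SNVS)
suggestive = bonferroni_threshold(1.0, N_SNVS)

# Hypothetical allele counts (assumption, not study data):
# 3 cases -> 6 alleles, 27 controls -> 54 alleles.
chi2, p = allelic_chi2(case_a=6, case_b=0, ctrl_a=5, ctrl_b=49)

print(f"significant threshold: {significant:.3g}")
print(f"suggestive threshold:  {suggestive:.3g}")
print(f"example marker: chi2 = {chi2:.2f}, p = {p:.3g}")
```

Rounding the two computed cut-offs gives the 5 × 10⁻⁷ and 1 × 10⁻⁵ values used in the text.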

Confirmation of deletion by polymerase chain reaction (PCR).
A large segment of consecutive uncalled array markers was observed only in cases suggesting the presence of a large deletion in these animals. Primers designed using primer3 [13][14][15] were used to detect the presence of the deletion. The novel deletion was confirmed through amplification of the last coding exons in impacted genes by PCR. Where no amplification was observed, to gauge the size of the deletion, further primers were designed to amplify the preceding exon. Alternatively, where amplification was witnessed, we designed primers in the gene's untranslated region (UTR). A total of seven primers were designed (Table S1). PCR was carried out in total volume of 20 μl using AmpliTaq

RNA sequencing, alignment and variant detection.
In order to gauge if mapped candidate genes were influencing the observed phenotype, whole transcriptomic sequencing was conducted on whole tissue of the jejunum collected during post-mortem. Using Invitrogen TRIzol Reagent, RNA was extracted according to the manufacturer's protocol. Total RNA sequencing (RNAseq) was performed on Illumina NovaSeq S1 using the TruSeq Stranded RNA RiboZero Gold (h/m/r) kit at the Ramaciotti Centre for Genomics (University of New South Wales, Kensington, NSW, Australia). A total of 142,658,896, 100 base pair (bp) paired end reads were generated. Tissue matched transcriptomic sequence data for the Labrador retriever JEJUNUM_LABR (Accession identifier: SRR3727723) were obtained from the Sequence Read Archive (SRA) in Genbank (https://www.ncbi.nlm.nih.gov/sra/). Quality control for the raw paired-end reads was performed with FastQC v0.11.8 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and visualised using MultiQC 16. Raw RNAseq reads were mapped to the canine reference genome (CanFam3.1) using STAR aligner v2.7.0e with basic options 17. Reads surrounding the candidate region were visualised and extracted for a genome-guided de novo transcriptome assembly using Trinity v2.8.3 18. The distribution of read data was calculated across known gene features using RSeQC v3.0.1 19.
STAR aligned bam files and Trinity constructed fasta sequences were visualised in Integrative Genomics Viewer v2.8.2 20 . Variants in the affected Kelpie transcript were compared with those of the tissue matched Labrador retriever. Alternate transcript splicing was visualised using a sashimi plot created using ggsashimi 21 . The minimum read coverage at splice junctions was set to 15 reads to reduce background hybridisation signals.
Multiplex PCR assay design for deletion. Custom primers were designed to capture a disease-associated variant identified in the sequenced AK using primer3. A three-primer multiplex PCR was designed to detect the structural variant (Table S1). The multiplex PCR includes a forward primer, 37 bp upstream of the disease-associated variant, and two reverse primers, one 140 bp downstream from the start of the variant and another ~ 103.7 kb downstream of the forward primer. PCR was carried out in total volume of 20 μl as previously described. A total of 19 Kelpie samples from affected populations were assessed using this method, including nine cases and ten controls. This encompassed two cases and eight controls also utilised in the GWAS analysis. Six of the controls were known to come from families that have produced affected puppies.

Equipment and settings. All images have been formatted for publishing using Adobe Photoshop 2020 (v21.1.1). Images that have been cropped have been done so to improve clarity and conciseness. As such, all images correctly represent the original data. If an electrophoretic gel image has been cropped, it is stated in the figure legend and an original image has been provided in "Supplementary materials".
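Returning to the three-primer multiplex design, the logic of the assay can be sketched as follows. The reverse primer inside the deleted segment can only pair with the forward primer on an intact allele, whereas the distant reverse primer only comes within amplifiable range of the forward primer when the intervening ~103.3 kb is missing. The amplicon sizes below are rough estimates derived only from the distances quoted in the text (primer lengths are ignored), and the band names and genotype labels are illustrative assumptions rather than the authors' scoring scheme; the actual primers are listed in Table S1.

```python
# Distances taken from the text; all values are approximate.
FWD_UPSTREAM_BP = 37            # forward primer, upstream of the variant start
REV_WT_DOWNSTREAM_BP = 140      # reverse primer 1, inside the deleted segment
REV_DEL_FROM_FWD_BP = 103_700   # reverse primer 2, "~103.7 kb" from the forward primer
DELETION_BP = 103_300           # reported deletion size, rounded to 103.3 kb


def expected_amplicons() -> dict:
    """Approximate product sizes (bp) for each allele, ignoring primer lengths."""
    return {
        # Intact allele: forward primer pairs with the internal reverse primer.
        "wild_type": FWD_UPSTREAM_BP + REV_WT_DOWNSTREAM_BP,
        # Deleted allele: the distant reverse primer is brought close to the forward primer.
        "deletion": REV_DEL_FROM_FWD_BP - DELETION_BP,
    }


def call_genotype(wt_band: bool, del_band: bool) -> str:
    """Translate presence or absence of the two bands into a genotype label."""
    if wt_band and del_band:
        return "heterozygous"
    if wt_band:
        return "homozygous wild type"
    if del_band:
        return "homozygous deletion"
    return "no call"  # neither product amplified: failed reaction or poor DNA


if __name__ == "__main__":
    print(expected_amplicons())
    for bands in [(True, False), (True, True), (False, True), (False, False)]:
        print(bands, "->", call_genotype(*bands))
```

Under a fully penetrant autosomal recessive model, only the "homozygous deletion" call would be expected in clinically affected dogs.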

Results
Post-mortem results. A post-mortem was conducted on a young, 17-week-old female Kelpie. The dog was received in excellent post-mortem condition immediately following euthanasia. The affected Kelpie pup showed signs of malnutrition including decreased body condition (body condition score 2/5), atrophied musculature and depleted subcutaneous adipose tissue stores. Histological examination of the small intestine showed evidence of mild non-specific chronic enteritis including focal ileal ulceration, rare crypt abscesses in the ileum and colon, and possible crypt fusion in the jejunum.

Genome-wide association study (GWAS).
After frequency and genotype pruning, 99,326 SNVs remained in the analysis. Three cases and 27 control Kelpies were available for the primary analysis. Of these, four controls and one case were bred outside Australia. By MDS the AK and International populations clustered closely and so were treated analytically as one population (Fig. S1a). When AWK were included in the MDS the principal Kelpie population and AWK clustered separately (Fig. S1b).
A preliminary GWAS was performed in the closely clustered Kelpie populations with affected samples (AK and internationally bred Kelpies). The quantile-quantile plot shows limited inflation and the genomic inflation factor was 1.23 (Fig. 3a). GWAS revealed a suggestive association with intestinal lipid malabsorption on canine chromosome 28 (CFA28) (best Praw = 2.87E−06) (Fig. 3b). Six SNVs within a three megabase (Mb) region (28:24,521,377-26,556,336; 2.03 Mb) passed the suggestive genome-wide significance threshold and were in strong linkage disequilibrium with the index SNV (r² > 0.93). In the validation analysis, 225 AWK controls were added to the leading dataset. When analysing the top 200 associated SNVs in the primary GWAS in the extended cohort, 52 SNVs passed genome-wide significance (Fig. 3c). Of these SNVs, 21 (40.3%) were located on CFA28: 20 that clustered within a four Mb region (28:24,030,090-27,194,500; 3.16 Mb) including 16 (30.7%) that matched the expected GT frequency for a recessively inherited trait (Table S2). The top SNV from the preliminary analysis remained the strongest in the validation set (best Praw = 1.75E−45).
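The genomic inflation factor reported above is conventionally the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null (approximately 0.4549). A minimal sketch of that ratio is given below; the input statistics are hypothetical placeholders, not values from the study.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats) -> float:
    """Genomic inflation factor: observed median chi-square over the null median."""
    values = list(chi2_stats)
    if not values:
        raise ValueError("at least one test statistic is required")
    return statistics.median(values) / CHI2_1DF_MEDIAN


# Hypothetical statistics (assumption): the median is chosen to give a factor of 1.23.
example_stats = [0.02, 0.31, 0.5595718, 1.9, 21.8]
lambda_gc = genomic_inflation(example_stats)
print(f"lambda = {lambda_gc:.2f}")
```

A value near 1 indicates little inflation; values above 1 suggest residual stratification or relatedness among samples.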

Confirmation of deletion by PCR.
Within the associated locus, we identified a region of nine consecutive SNVs spanning 122.5 kb, from marker CFA28:23,370,822 (BICF2P674000) to CFA28:23,493,334 (BICF2P338375), in which all but two SNVs were consistently uncalled in cases but not controls (Table S3). This region represented a strong regional candidate for disease. The genes within the region of uncalled markers were oriented so that the last coding exon of each gene aligned with the edges of the putative deletion (Fig. S2a). Using seven primer pairs, we confirmed the presence of a deletion of between 101.6 kb and 105.2 kb in the affected Kelpies (Fig. S2b). The PCR confirmed a complete loss of GUCY2GP and ACSL5 and partial loss of ZDHHC6. RNAseq data were used to validate this result.
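A run of array markers that fail to call in every case while calling normally in controls is the signature used above to suspect a homozygous deletion. The sketch below shows one simple way such runs could be flagged from per-marker call counts. The `Marker` record, the 90% control call-rate cut-off and the toy data are assumptions for illustration, and the sketch does not tolerate exceptions inside a run, unlike the reported region in which two SNVs were still called.

```python
from typing import List, NamedTuple, Tuple


class Marker(NamedTuple):
    name: str
    pos: int         # base-pair position on the chromosome
    case_calls: int  # number of cases with a genotype call
    n_cases: int
    ctrl_calls: int  # number of controls with a genotype call
    n_ctrls: int


def is_case_dropout(m: Marker, min_ctrl_rate: float = 0.9) -> bool:
    """True when no case is called but controls call at or above the cut-off."""
    return m.n_ctrls > 0 and m.case_calls == 0 and m.ctrl_calls / m.n_ctrls >= min_ctrl_rate


def dropout_runs(markers: List[Marker], min_len: int = 3) -> List[Tuple[int, int, int]]:
    """Return (first_pos, last_pos, n_markers) for each run of consecutive dropout markers."""
    runs, current = [], []
    for m in sorted(markers, key=lambda m: m.pos):
        if is_case_dropout(m):
            current.append(m)
            continue
        if len(current) >= min_len:
            runs.append((current[0].pos, current[-1].pos, len(current)))
        current = []
    if len(current) >= min_len:
        runs.append((current[0].pos, current[-1].pos, len(current)))
    return runs


def span_kb(first_pos: int, last_pos: int) -> float:
    """Distance between two marker positions in kilobases."""
    return (last_pos - first_pos) / 1000


# Toy data (hypothetical names and positions): 3 cases, 27 controls.
markers = [
    Marker("m1", 1_000, 3, 3, 27, 27),
    Marker("m2", 2_000, 0, 3, 27, 27),
    Marker("m3", 3_000, 0, 3, 26, 27),
    Marker("m4", 4_000, 0, 3, 27, 27),
    Marker("m5", 5_000, 3, 3, 27, 27),
]
runs = dropout_runs(markers)
reported_span = span_kb(23_370_822, 23_493_334)  # boundary markers quoted in the text
print(runs, f"{reported_span:.1f} kb")
```

The span between the two boundary markers quoted in the text works out to the stated 122.5 kb.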

Variant detection and RNA expression. RNAseq data were inspected in Integrative Genomics Viewer.
Read distributions indicated RNAseq data had underlying DNA contamination, with a similar proportion of reads aligning to introns or intergenic regions as to exons (34.9% and 38.1%, respectively). DNA from the AK with intestinal lipid malabsorption harboured a 103.3 kb deletion, NC_006610.3CFA28:g.23380074_23483377del (CanFam 3.1; Fig. S3), involving the complete loss of ACSL5, pseudogene GUCY2GP and omitting exons 7-10 of ZDHHC6 (Fig. 4). RNAseq data demonstrated no detectable expression of GUCY2GP, ACSL5 and ZDHHC6 exons beyond the breakpoint of the observed deletion. A further gene, TECTB, located outside the deleted region had no observable expression compared with that of the Labrador retriever jejunum. Gene expression of ZDHHC6 and ACSL5 in the control jejunum was consistent with the reference transcript. GUCY2GP and TECTB were not expressed in either case or control. In the AK, expression of novel exons as a result of cryptic splicing was observed 148.4 kb downstream from the ZDHHC6 gene. The alternate splicing event was captured by 152 reads. A consensus sequence was produced using a genome guided de-novo assembly with Trinity (Data S1).

(Fig. S4). In the controls, six samples were homozygous wild type and four were heterozygous for the variant. Dogs heterozygous for the variant were asymptomatic but came from families known to produce offspring with the disease phenotype.
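As a consistency check on the reported coordinates, the length of the deletion can be derived from the variant description itself, since both positions in an HGVS `del` range are deleted (inclusive coordinates). The following sketch parses the description as written in the text and compares the result with the PCR-derived bounds and the flanking array markers; the regular expression is a simplified assumption that handles only this `g.start_enddel` form.

```python
import re

# Simplified pattern for descriptions ending in "g.<start>_<end>del" (assumption).
HGVS_DEL = re.compile(r"g\.(\d+)_(\d+)del$")


def deletion_interval(hgvs: str):
    """Return the (start, end) positions of a genomic deletion, both inclusive."""
    match = HGVS_DEL.search(hgvs)
    if match is None:
        raise ValueError(f"not a g.start_enddel description: {hgvs!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError("end position precedes start position")
    return start, end


def deletion_length(hgvs: str) -> int:
    """Number of deleted bases; both range ends are part of the deletion."""
    start, end = deletion_interval(hgvs)
    return end - start + 1


variant = "NC_006610.3CFA28:g.23380074_23483377del"
start, end = deletion_interval(variant)
length_bp = deletion_length(variant)

# Bounds quoted in the text: PCR estimate and the flanking array markers.
within_pcr_bounds = 101_600 <= length_bp <= 105_200
within_marker_span = 23_370_822 <= start and end <= 23_493_334

print(f"{length_bp} bp ({length_bp / 1000:.1f} kb)")
print("within PCR bounds:", within_pcr_bounds)
print("within uncalled-marker span:", within_marker_span)
```

The derived length rounds to the 103.3 kb stated in the text and falls inside both the PCR estimate and the interval bounded by the two array markers.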

Discussion
Inborn errors of metabolism (IEM) are genetic disorders resulting from defects in biochemical pathways that can have a profound effect on an animal's overall health 22,23. IEM affecting intermediary metabolic pathways are often recognised through clinical signs such as failure to thrive, hypotonia and functional decompensation 22,23. Increased prevalence of IEM among specific breeds has previously been observed [22][23][24]. Frequently reported metabolic disorders clinically similar to intestinal lipid malabsorption are hereditary selective ileal cobalamin malabsorption and exocrine pancreatic insufficiency. Both are IEM that present with failure to thrive and persistent diarrhea 11,25,26; however, affected AK present earlier (before six weeks of age), show no signs of lethargy and have clear evidence of fat in faeces (steatorrhea). Here we present an IEM affecting lipid absorption in the AK resulting from the deletion of ACSL5 and partial loss of ZDHHC6. Characterisation of the genetic factors associated with IEM is of strong interest for improving canine welfare and improving our understanding of the genomic control of metabolism. Genes influencing the phenotype described in this study, ACSL5 and ZDHHC6, have not been previously implicated in naturally occurring disease models. Long chain acyl-CoA synthetases are major enzymes in fatty acid metabolism [27][28][29][30][31][32]. In human and rodent studies, variation in the ACSL gene family is often associated with diet induced metabolic and body composition phenotypes 27,31,[33][34][35][36][37][38]. ACSL5, essential for lipid metabolism and fat deposition in carnivores 39, is a principal candidate for the observed phenotype in the AK. ACSL genes have already been implicated in canine body composition phenotypes, with variation in ACSL4 associated with heavy weight dogs 40.
The clinical phenotype associated with absent expression of ACSL5 in the jejunal tissue of the affected AK puppy is consistent with a knockout (KO) mouse model, including delayed fat absorption and a reduced fat mass 27. KO mice exhibited additional increased lean mass and energy expenditure, as well as improved insulin sensitivity; traits not observed or tested in our cases. The results of the mouse KO study contradicted an earlier ACSL5 knockout study, which showed little effect on long-term dietary LCFA absorption and weight gain, likely compensated by residual ACSL activity 41. Long chain fatty acid absorption occurs largely through the jejunum where LCFA are absorbed across the brush border of jejunal enterocytes. ACSL5 is expressed in brown adipose tissue, small intestine and liver 27,28,[42][43][44] and is the primary activator of dietary LCFA in the jejunum 41. Expression, synthesis and activity of ACSL5 are connected to the state of villus architecture, epithelial homeostasis and enterocyte apoptosis [45][46][47]. The relatively improved health status of affected AK at maturity may imply an important role of ACSL5 during early development. The extreme effects identified in immature AK may be partially offset by other ACSL genes as they reach full size or may be linked with a transition to a solid diet. Following absorption into enterocytes, LCFA undergo re-esterification before transportation and storage are possible. Previous research in rodent studies has implicated ACSL5 in fat absorption during the re-esterification of dietary fats 3,27,28,48. In the present study it has been noted that once affected Kelpies are on a solid diet with enzyme supplementation, dogs continue to present with a low body condition score. While AK display ongoing sensitivity to dietary lipids into adulthood, it remains unconfirmed if their smaller size is a result of persistent intestinal lipid malabsorption or stunted early development.

Figure 4. Sashimi plot of RNAseq data for CanFam3 genomic coordinates CFA28:23320000-23500000. The coverage for each alignment track is plotted as a bar graph, the Y axis represents read counts. Arcs are supported exon junctions and reads split across the junction (junction depth). Below the plots are the gene annotations for corresponding genomic coordinates. The figure illustrates RNAseq data from the jejunum of two samples; a case sample (AK Australian Kelpie) and control (LR Labrador retriever). Underlying DNA contamination can be seen in the AK highlighted by the low read count across the genomic region. A 103.3 kb deletion is seen in the AK, illustrated with a transparent box. The gap in the AK includes GUCY2GP, ACSL5, ZDHHC6 and a long non-coding RNA (lncRNA). In the AK, expression of novel exons can be seen 148.4 kb downstream from the ZDHHC6 gene, the junction is supported by 152 reads. A consensus sequence produced using a genome guided de-novo assembly with Trinity is included in the gene track as ZDHHC6_AK_DEL.

Scientific Reports | (2020) 10:18223 | https://doi.org/10.1038/s41598-020-75243-x www.nature.com/scientificreports/
Further to the complete loss of ACSL5, the genomic deletion resulted in the partial deletion and cryptic splicing event downstream of the last translated exon of ZDHHC6. ZDHHC6 plays a role in posttranslational modification (palmitoylation) of proteins, which can contribute to protein function and regulation beyond underlying genomic architecture. Differences in the palmitoylation of proteins involved in fat and carbohydrate transport and signalling may compromise digestion. Articles reviewing the biological effects of protein palmitoylation have anticipated a functional role in lipid and glucose metabolism 49,50, though ZDHHC6 is not currently implicated. ZDHHC6 localises in the endoplasmic reticulum and is reported to be involved in the palmitoylation of five protein targets [51][52][53][54][55][56]. Within the context of existing research neither ZDHHC6 nor proteins palmitoylated by ZDHHC6 are expected to play a major role in lipid digestion. However, novel roles and targets of palmitoylation are frequently reported and the list of proteins that undergo palmitoylation is constantly growing 57 (https://swisspalm.epfl.ch/). It is possible that other key substrates influencing the observed phenotype in the AK are not yet reported and AK harbouring the disease-associated variant may be a unique tool in furthering our current understanding of post-translational modification.
TECTB and GUCY2GP were not expressed in either the case or control RNAseq samples. The genomic region containing the TECTB transcript falls outside the observed variant. It is unlikely that gene expression is altered in appropriate tissue samples. Mouse studies indicated that GUCY2G plays a role in jejunal integrity 58. However, GUCY2G is a known pseudogene in humans and was suggested to be under purifying selection in the dog 59. Conversely, Ensembl genebuild predicts the transcript is non-protein coding (Gene identifier: ENSCAFG00000010908), and a recent canine gene catalogue observing ten tissue types reported no expression across all samples and replicates 60. The gastrointestinal tract was not reported in the catalogue but a lack of expression in the Labrador retriever control supports the concept of GUCY2GP as a pseudogene, indicating no involvement in the observed phenotype.
Therapies to overcome a deficit in ACSL5 function are currently unknown and were not assessed in this research. In humans, therapies for disorders disrupting lipid digestion and absorption involve removing lipids from the diet or replacing them with those that bypass the genetic block 61,62. The disorder described in this study chiefly impacts the metabolism of LCFA. Some human studies have demonstrated positive effects of medium-chain triglyceride (MCT) formulation on individuals suffering from long chain fatty acid disorders [63][64][65]; however, the use of MCT in canine research is restricted [66][67][68][69]. Auxiliary research into therapeutic options, especially during early development, is necessary.
Results of the multiplex-PCR were consistent with a fully-penetrant autosomal recessive disorder. Results reported here are not indicative of breed-wide prevalence rates as dogs included in this study originated from a small group of Australian kennels. However, the presence of the deletion in international samples suggests that the variant allele is globally dispersed. To obtain comprehensive prevalence parameters, randomised and wide scale testing is required.
In conclusion, we present a novel deletion of ACSL5 causing hereditary intestinal lipid malabsorption in the Australian Kelpie dog breed. ACSL5 plays an important role in long chain fatty acid storage and metabolism. The improved health of affected individuals with age implies that genetic compensation of this gene beyond neonatal development is possible. This research identifies the first spontaneous animal model to validate key mouse knockout model findings previously reported. The AK model presents a unique opportunity to fill gaps in our understanding of ACSL5. A simple genetic test has been developed and validated to identify dogs harbouring the described variant. International testing of Australian Kelpies is warranted to obtain better estimates of global prevalence. At this time the disorder is presumed to be restricted to a single breed.