Molecular evidence for distinct modes of nutrient acquisition between visceral and neurotropic schistosomes of birds

Trichobilharzia species are parasitic flatworms (called schistosomes or flukes) that cause important diseases in birds and humans, but very little is known about their molecular biology. Here, using a transcriptomics-bioinformatics-based approach, we explored molecular aspects pertaining to the nutritional requirements of Trichobilharzia szidati (‘visceral fluke’) and T. regenti (‘neurotropic fluke’) in their avian host. We studied the larvae of each species before they enter (cercariae) and as they migrate (schistosomules) through distinct tissues in their avian (duck) host. Cercariae of both species were enriched for pathways or molecules associated predominantly with carbohydrate metabolism, oxidative phosphorylation and translation of proteins linked to ribosome biogenesis, exosome production and/or lipid biogenesis. Schistosomules of both species were enriched for pathways or molecules associated with processes including signal transduction, cell turnover and motility, DNA replication and repair, molecular transport and/or catabolism. Comparative informatic analyses identified molecular repertoires (within, e.g., peptidases and secretory proteins) in schistosomules that can broadly degrade macromolecules in both T. szidati and T. regenti, and others that are tailored to each species to selectively acquire nutrients from particular tissues through which it migrates. Thus, this study provides molecular evidence for distinct modes of nutrient acquisition between the visceral and neurotropic flukes of birds.


Results and Discussion
transcriptomes and homology-based comparisons. The transcriptomes of T. szidati (n = 13,007) and T. regenti (n = 12,705) contained similar numbers of transcripts and annotated transcripts (Table 1) and were 80.5% to 81.5% complete. In total, 728 of the BUSCOs in the transcriptomes of T. szidati (n = 787) and T. regenti (n = 797) were shared. Most (80%) transcripts in the transcriptomes of T. szidati (n = 10,498) and T. regenti (n = 11,114) shared amino acid sequence homology with proteins assigned one or more KEGG terms. Almost half of the transcripts (n = 5,663 and 5,935) associated with 3,065 and 3,275 KEGG terms with conserved protein family annotations (KEGG BRITE), respectively (Table 1), and 3,452 and 3,611 transcripts associated with 1,837 and 1,934 KEGG terms with KEGG biological pathway annotations, respectively (Fig. 1, panel a; Supplementary  Table S1). In total, more than 460 excretory/secretory proteins were predicted from the transcriptomes of T. szidati (n = 642) and T. regenti (n = 468) (Supplementary Table S2).
An homology-based analysis showed that 5,886 orthologous protein groups were shared by T. szidati, T. regenti and S. mansoni (Fig. 1, panel b). Neurotropic T. regenti shared twice as many orthologous groups (n = 757) with S. mansoni than did visceral T. szidati (n = 391). Analyses of orthologues shared by T. regenti and S. mansoni revealed 145 transcripts assigned to "chromosome, DNA repair and recombination proteins, ubiquitin system, DNA replication proteins, protein kinases, chaperons and folding catalysts"; 112 transcripts of these orthologues were not detected in T. szidati, indicating differential transcription between species (upon pairwise comparison) rather than phylogenetic divergence. Transcripts that were unique to T. szidati (n = 601) and T. regenti (n = 765) linked to 20 and 18 unique protein families in the KEGG BRITE database, respectively, being assigned to one or more specific KEGG terms including spliceosome (7 for T. szidati and 11 for T. regenti), exosome (6 and 9), chromosome (5 and 9), peptidase (4 and 4) and kinase (2 and 6) (Supplementary Table S3). Differential transcription. Differential transcription was investigated between the free-living, infective (cercaria) and the migratory, parasitic (schistosomule) stages of T. szidati and T. regenti. In T. szidati, 3,729 of all 13,007 known transcripts were differentially transcribed between these two larval stages (1,881 up-regulated in the cercaria, and 1,848 up-regulated in the schistosomule). In T. regenti, 3,396 of all 12,705 known transcripts were differentially transcribed between these two stages (1,301 up-regulated in cercaria, and 1,876 up-regulated in schistosomule). For each species, differential transcripts were then linked to enriched biological pathways and protein families using the KEGG database, and the numbers of species-and stage-specific enriched biological pathways and protein families established (Figs 2 and 3 (Figs 2 and 3) in the cercariae of both Trichobilharzia species. Compared with the schistosomules, the results showed that cercariae usually had greater enrichment for pathways or protein families associated with metabolism and translation. The former (i.e. metabolism) related predominantly to carbohydrate metabolism (linked to conserved pathways or processes, such as Krebs cycle, glycolysis, pentose phosphate pathway and pyruvate metabolism), and energy metabolism was reflected in an enrichment of proteins linked to oxidative phosphorylation. Enriched metabolic pathways linked specifically to Krebs cycle, glycolysis and oxidative phosphorylation reflect the aerobic metabolism of free-living cercariae, contrary to the microaerobic metabolism of schistosomules. The latter (i.e. translation) related predominantly The main biological pathways enriched in T. regenti and T. szidati schistosomules linked to signalling, cell growth and death, cell motility, transport and catabolism, membrane transport, and DNA replication and repair. A distinction between the two species was observed in an enrichment for signal transduction pathways (linked to Ras, MAPK, Rap1, Wnt and ErbB) that was exclusive to T. regenti. Enriched protein families were represented by chromosome-associated proteins, proteins linked with transcription, transport system, cell adhesion, cytoskeleton proteins, DNA replication, protein kinases and peptidases (Supplementary Table S4), and MEG-encoded secretory proteins and saposins were conspicuous.
Microexon gene (MEG)-encoded secretory proteins. As MEG-encoded secretory proteins are proposed to be involved in blood processing in schistosomes 19,20 , we focused our attention on transcripts linked to this particular gene family in T. szidati and T. regenti. We detected 10 and 4 transcripts encoding MEGs in T. szidati and T. regenti, respectively. An analysis showed that predicted secretory proteins encoded by MEG orthologs of T. szidati and T. regenti clustered into four different groups (Fig. 4). Trichobilharzia regenti transcripts were represented in all four groups, suggesting that MEG-encoded secretory proteins are not strictly linked to visceral schistosomiasis 19 . While S. mansoni MEGs were assigned previously to three groups (MEG-3, MEG-4 and MEG-8), some orthologs of T. szidati and T. regenti were not closely related to known MEGs of S. mansoni (Fig. 4), suggesting that they are specific to avian Trichobilharzia species. In T. szidati, MEG homologs were highly transcribed, with two of them (Tszi_001069 and Tszi_004986) being in the top-10 most upregulated transcripts in schistosomules. In T. regenti, MEG genes were transcribed at markedly lower levels than those in T. szidati (Fig. 4, panel b). Collectively, these results suggest that there is a differential involvement of MEG-encoded secretory proteins in feeding by the two Trichobilharzia species.
saposins. These lysosomal proteins can play a key role in the degradation of some sphingolipids 21 . Two transcripts (Treg_014592, Treg_015261) encoding saposin B were represented in the ten most upregulated transcripts in T. regenti schistosomules. Saposin B degrades sulfatides, a major component of the myelin sheaths of nerves 22 .
The specific detection of myelin in the guts of T. regenti schistosomules suggests that saposin B is central to lipid degradation and digestion during their migration through the spinal cord. Saposin-encoding transcripts were also identified in T. szidati schistosomules, but at very low levels. A role for saposins in blood digestion has been proposed for S. mansoni and for the liver flukes Clonorchis sinensis and Fasciola hepatica [23][24][25] . Taken together, this   Table S5) were principally the cathepsins (B, L, D, A, K and C), dipeptidyl-peptidase II, leucyl aminopeptidase, aspartyl peptidase, tissue plasminogen activator, legumain, calpain, tolkin, ADAMTS (a disintegrin and metalloproteinase with thrombospondin motifs) peptidases and carbamoyl-phosphate synthase. The transcription profiles representing these peptidases were similar between cercariae and schistosomules for both Trichobilharzia species (Fig. 5). Usually represented by relatively moderate to high numbers of transcripts (n > 7) were cathepsin B (11 for T. szidati and 12 for T. regenti), cathepsin D (11 and 8, respectively), cathepsin A (10 and 4, respectively) and calpain (9 and 7, respectively). Transcripts encoding calpain were abundant in both cercariae and schistosomules of both T. szidati and T. regenti, suggesting that this peptidase is multi-functional. Following these analyses, we explored the abundance of transcripts encoding peptidases that were upregulated in the schistosomules of each species. In T. szidati schistosomules, transcripts encoding legumain were most abundant, followed equally by those encoding cathepsins L and B, followed by calpain, cathepsins D, K, C and A, leucyl aminopeptidase and dipeptidyl peptidase II (in order of decreasing abundance) (Fig. 5). In T. regenti schistosomules, transcripts representing cathepsin B were most abundant and twice as high as those representing cathepsin L, followed (in decreasing order) by transcripts representing cathepsins K and C, tissue plasminogen activator, leucyl aminopeptidase, dipeptidyl-peptidase II, cathepsin A, calpain and legumain. Transcripts representing selected peptidases were unique to T. szidati (i.e. disintegrin and metalloproteinase domain-containing protein 9) and to T. regenti (i.e. cathepsin S, caspase 7, disintegrin and metalloproteinase domain-containing protein 1 and anhydrolase domain-containing protein 5). These findings suggest that peptidases are the key enzymes involved the digestion and acquisition of nutrients from different macro-molecular substrates in schistosomules of T. szidati and T. regenti.
Although there were qualitative differences between the two species, the differential levels of transcription of genes encoding particular peptidases (taken as a snap-shot) indicate distinctive requirements of the schistosomules between the two species, to digest tissue types with differing macromolecular (protein) compositions (i.e. blood versus neural tissues). In particular, the high transcription of genes encoding cathepsins B and L and legumain (=haemoglobinase) are likely to be specifically linked to the degradation of blood 26,27 , which accords with microscopic evidence showing the presence of blood metabolites specifically in the gut of schistosomules of T. szidati, but not T. regenti (Fig. 6). By contrast, genes encoding legumain (haemoglobinase) were transcribed at a substantially lower level in schistosomules of T. regenti compared with those of T. szidati. As peptidases are often multi-functional 28 , it is possible that limiting amounts of legumain might activate (other) peptidases, such as cathepsin B, which is known to degrade myelin 29 and is represented by the top transcribed peptidase gene (~3,300 CPM) in T. regenti (Fig. 5).

Conclusion
Using a bioinformatics workflow, we explored the transcriptomic landscape of the cercarial and schistosomule stages of T. szidati and T. regenti, which assume distinct migratory paths and predilection sites in the avian host. This study allowed us to explore the differences in biology between these flukes before they enter (cercariae) and as they migrate (schistosomules) through tissues in the host animal and aspects likely linked to fluke-host cross-talk. The exploration indicated that both T. szidati and T. regenti schistosomules have key molecular pathways and repertoires of enzymes critical for acquisition and degradation of macromolecules (including proteins, lipoproteins and lipids) from distinct tissue types during their distinct migratory paths. Such repertoires likely allow these migratory flukes to effectively use key components in blood (T. szidati), nerves (T. regenti) and/or other tissues (both species) to migrate, develop and live in quite different physiological conditions. Following skin penetration of the avian host, cercariae lose their tails and metamorphose to schistosomules which then migrate through tissues in the definitive host. In the penetration phase, cercariae of both species are enriched for pathways or molecules associated with carbohydrate metabolism (including glycolysis and Krebs cycle), oxidative phosphorylation and the translation of proteins including those required for ribosome biogenesis, exosome production and/or lipid biogenesis. In the migration phase, schistosomules of both species are enriched for pathways or molecules associated with processes including signal transduction, cell turnover and motility, DNA replication and repair, molecular transport and catabolism. The exclusive enrichment in signalling pathways linked to Ras, MAPK, Rap1, Wnt and ErbB in T. regenti appears to relate to processes including cell adhesion, proliferation, differentiation, migration, cell to cell communication and/or junction formation. These observations suggest the importance of cell adhesion molecules (CAMs) in this species 18 . CAMs are known to regulate host-parasite interactions, in addition to having crucial roles in the regulation of cellular integrity and interactions. The unique transcription profiles representing these molecules in schistosomules suggest an involvement in rapid growth and development of different organ structures in T. regenti within the avian host 30 and/or host-parasite interactions. The former aspect is supported by an abundant transcription linked to neuroligin, netrin receptor and semaphorin in T. regenti 18 , which might relate to neural development in the schistosomule stage 31,32 . In contrast to cercariae, which store and express proteolytic enzymes in their penetration glands for percutaneous invasion, schistosomules actively degrade various host tissue and cell types during migration 21 and inhibit or evade immune attack by the host 33 . The markedly higher levels of transcription of MEG genes in T.  szidati compared with T. regenti suggests variation in nutrient acquisition or parasite-host cross-talk between the two species of fluke. Although many peptidases in both species would have broad substrate specificity to degrade many proteins and lipoprotein complexes, selected peptidases likely have relative specificity to degrade blood (e.g., legumain) in T. szidati or myelin in nerves (e.g., cathepsin B) in T. regenti, reflecting an adaptation to distinct modes of nutrient acquisition between these visceral and neurotropic flukes of birds. The relatively high transcription of genes encoding a wide array of cathepsins and other proteolytic enzymes lends support for the involvement of this group of enzymes in numerous biological processes with key roles during the parasite's migration, feeding and survival in the definitive host.  8,18,34 . Life cycle stages of T. szidati were produced here, and those of T. regenti had been raised previously 18 . For each Trichobilharzia species, four independent biological replicates of cercariae were collected from distinct groups of infected snails (n = 20). For each replicate, 10,000 cercariae were washed extensively in H 2 0, pelleted by centrifugation at 2,500 g (4 °C), suspended in TRIzol (Thermo Fisher Scientific) and then frozen at −80 °C. For each Trichobilharzia species, four biological replicates of schistosomules were produced in (7-day old, helminth-free) ducks; each of four ducks was infected percutaneously with 2,500 cercariae 8,18,29 . Schistosomules were collected by microdissection from the lungs (T. szidati) and spinal cord (T. regenti) at 90 h and 162 h following infection of ducks, respectively; these time points were selected to ensure that the development of the schistosomules of the two species (collected from the distinct tissues) was the same and thus comparable 2,35,36 . For each replicate, 300 schistosomules were washed extensively in phosphate-buffered saline (PBS, pH 7.0), pelleted, suspended in TRIzol and frozen in the same manner as for cercariae.

RNA-sequencing.
RNAs of T. szidati were sequenced here, and those of T. regenti had been done previously 18 . For both T. szidati and T. regenti, total RNA was isolated from each of the four replicates of each cercariae and schistosomules and DNase I-treated 18 . RNA quality was assessed using a Bioanalyser 2100 (Agilent), and the quantity was estimated using a Qubit 4 fluorometer (Invitrogen For each Trichobilharzia species, RNA-seq datasets representing the cercariae and schistosomules were pooled and employed to assemble a non-redundant transcriptome using Oases 41 v.0.2.8. In order to achieve an optimum transcriptome for each species and each replicate, various k-mer (21 to 49) and coverage (5 to 21) threshold values were assessed during the assembly process. Based on number of contigs, length of contigs, and transcript redundancy, the best assemblies were achieved using k-mers of 47 to 49 and coverages of 9 to 14. For each species, the eight transcriptomes representing the individual replicates of each of the larval stages (cercaria and schistosomule) were pooled and any redundancy removed using CD-HIT-EST 42 employing a nucleotide identity threshold of 85%. Protein-coding regions were predicted from individual non-redundant transcripts using Transdecoder 43 v.3.0.0 employing a minimum length of 30 amino acids (aa). transcriptome annotation and curation. An established pipeline 44 was used for annotation. Predicted protein sequences were annotated using their closest homologues (BLASTp; E-value cut-off: ≤10 −5 ) employing the NCBI non-redundant protein, Kyoto Encyclopedia of Genes and Genomes (KEGG), excluding the "organismal system" and "human disease" categories 45 , and UniProt 46 . The proteomes predicted separately from the larval transcriptomes of T. szidati and T. regenti were compared with that of S. mansoni 47 . For each Trichobilharzia species, orthologous and unique protein groups were identified using OrthoMCL 48 (E-value cut-off: ≤10 −5 ; similarity cut-off: 0.5). The completeness of each transcriptome was assessed by benchmarking universal single-copy orthologs (BUSCO) 49 . Excretory/secretory proteins were predicted using the prokaryotic and eukaryotic classical analysis of secretomes (PECAS) pipeline employing default settings 50 .
Subsequently, each transcriptome was curated. Transcripts with high nucleotide identity (BLASTn; E-value: ≤10 −5 ) to avian, bacterial or viral sequences present in the NCBI non-redundant nucleotide database (NCBI) were removed. Any conceptually translated protein with a high amino acid sequence homology (BLASTp; E-value cut-off: ≤10 −5 ) to transposable elements in RepBase database 51 were also eliminated, as were translated sequences of ≤50 amino acids with no homology (at the nucleotide and protein levels) to sequences in current public databases. Additionally, transcripts with read counts of <10 (using RSEM) 52  Differential transcription and enrichment analyses. An established method 18 was used for this analysis. For each Trichobilharzia species and each replicate, the trimmed, corrected, paired reads were mapped to the respective non-redundant, merged transcriptome using RSEM 52 . Differential gene transcription between cercariae and schistosomules was calculated using edgeR 56 v.3.6.7 and R 57 3.2.3. Read counts (in counts per million, CPM) were normalised for G+C bias 58 and for the trimmed mean of M-values (TMM) 59 . The edgeR exactTest function was used to assess differential transcription. The multiplicity correction of P-values was then performed by applying the Benjamini Hochberg method to establish the false discovery rate. Transcripts were considered differentially transcribed if the latter rate was ≤0.01. For differentially transcribed molecules, enriched metabolic pathways and protein families were identified using the KEGG database 40 . Upregulated transcripts assigned to KEGG orthology (KO) terms were mapped to KEGG pathways and the KEGG BRITE database. A Fisher's exact test was used to calculate levels of statistical significance in differences between numbers of KO terms assigned to particular pathway/enzyme classes; a P-value of ≤0.05 was set as the threshold for significant enrichment in a particular developmental stage of each Trichobilharzia species studied.
Microscopic examinations. In the light microscopic examination, tissues containing schistosomules were collected from ducks infected for 4 days with T. szidati (lungs) or for 7 days with T. regenti (spinal cord) and immediately fixed in Bouin's solution (cat. no. HT10132, Sigma-Aldrich). These tissues were processed and stained with Masson's trichrome (lungs), haematoxylin-eosin or with Luxol fast blue (spinal cord) for histological examination using standard procedures. Sections were examined using a BX51 microscope (Olympus), and photographically documented using a DP72 camera (Olympus) and software Quick Photo Micro 3.1. (Promicra).
In the transmission electron microscopic examination, schistosomules of T. szidati (4 days) dissected from lung tissues and those of T. regenti (10 days) within spinal cord tissue blocks (4-6 mm 3 ) were immediately fixed in modified Karnovsky solution 60 , fixed in OsO 4 and then embedded in Spurr resin (SPI-supplies) using established protocols 61 . Schistosomules and/or tissues were subjected to transmission electron microscopy (JEOL JEM-1011) and photographed using a CCD camera (Veleta) employing acquisition software (Olympus Soft Imaging Solution GmbH).

Data Availability
Short sequence reads and transcriptome analysed are available via Bioproject Submission SUB4374991.