Introduction

Leishmaniasis encompasses a broad spectrum of diseases caused by obligate intramacrophage protozoa belonging to the genus Leishmania (Kinetoplastida: Trypanosomatidae), occurring predominantly in the tropical and subtropical regions of the world. Sand fly vectors (Diptera: Psychodidae: Phlebotominae) transmit promastigote forms of Leishmania into the host’s skin while acquiring a blood meal1,2. Additionally, phlebotomines deliver the parasite together with salivary proteins3, whose pharmacological activities assist blood feeding by preventing host hemostasis and modulating the host’s immune system4,5,6.

Leishmania (Viannia) braziliensis (Viana 1911) and Leishmania (Leishmania) amazonensis (Lainson & Shaw 1972) are the main species involved in both the cutaneous and mucocutaneous forms of Tegumentary Leishmaniasis (TL) in Brazil7. Autochthonous cases of TL have been reported in northeastern São Paulo state (NSPS), Southeastern Brazil. The sand flies Nyssomyia intermedia (Lutz & Neiva 1912) (syn = Lutzomyia (Nyssomyia) intermedia) and Nyssomyia neivai (Pinto 1926) are the main vectors of L. (V.) braziliensis in São Paulo state; however, only N. neivai has been recognized in systematic research collections in NSPS cities8.

Pemphigus foliaceus (PF) is an autoimmune bullous disease caused by autoantibodies against desmoglein-1 (Dsg-1). PF is subdivided into the classic sporadic form found worldwide (Cazenave’s pemphigus) and endemic pemphigus (known in Brazil as Fogo Selvagem). Although the pathogenesis of PF remains unclear, genetic and environmental factors have been implicated in susceptibility to this disease9,10. Interestingly, some salivary proteins of sand flies have been associated with pemphigus etiopathogenesis. Maxadilan is a highly immunogenic salivary protein described in Lutzomyia longipalpis (Lutz and Neiva 1912), the vector of visceral leishmaniasis (VL) in South America. Higher levels of serum IgG against maxadilan were observed in PF patients compared to healthy controls living in the same endemic region11,12. There is also evidence that antibodies raised against LJM11 and LJM17, immunogenic proteins from Lu. longipalpis, cross-react with antibodies against Dsg-1, and they have been proposed as the antigens that trigger PF diagnosed in Amerindians living in Mato Grosso state, Brazil13,14. Of note, Lu. longipalpis is not widely distributed in NSPS10,15.

The profile of salivary components has been defined in ten Old World Phlebotomus species16,17,18,19,20,21,22,23,24 and one Sergentomyia species (Sergentomyia schwetzi)25; in contrast, the sialotranscriptomes of New World sand flies have been documented in only four species: Bichromomyia olmeca (Vargas and Diaz-Najera 1959) (syn = Nyssomyia olmeca), Lutzomyia ayacuchensis (Caceres and Galati 1988), Lu. longipalpis, and N. intermedia26,27,28,29. Considering the developments in sand fly saliva-based vaccines for Leishmania sp. infection, and the search for candidate proteins that might trigger anti-Dsg1 autoantibodies in Brazilian endemic PF in NSPS, we here report the identity and abundance of the putative secreted proteins in the sialome of N. neivai, obtained by RNA-sequencing. We take this opportunity to compare the most abundant proteins in the N. neivai sialome with those of N. intermedia27 (the main vector of Leishmania (V.) braziliensis in coastal SP state) and Lu. longipalpis28 (the vector of VL in São Paulo state), as well as with other published salivary proteins from Old World sand flies.

Methods

Collection and maintenance of sand flies

A colony of N. neivai was established at São Paulo State University30 from sand flies collected in Santa Eudóxia, SP state, Brazil (along the edges of the Mogi Guaçu River) on the wall of a house, using a manual aspirator between 06:00 PM and 11:00 PM. In the laboratory, the sand flies were maintained in cages covered with voile (30 cm3) at 26 ± 1 °C, 80–90% humidity, and a 12:12 (L:D) photoperiod. Salivary glands were dissected as follows: Sand flies were transferred into a tube with mild soap solution. The contents were poured over a fine-mesh screen stretched over a beaker and rinsed with water. We transferred the flies from the mesh screen to a small Petri dish containing 1xPBS. We relocated a freshly rinsed sand fly to a drop of PBS on the microscope slide. The sand flies’ legs and wings were removed. We pierced the thorax and held it against the glass and removed the sand fly’s head. When the head was pulled from the body the salivary glands became visible at the back of the head. The glands were collected with the dissecting pins and transferred to a small, labeled Eppendorf tube for storage. Sand fly identification was based on morphological characteristics of the spermathecae in females and apical genital filaments on males as described by Andrade Filho et al.31.

Salivary gland transportation

Two hundred salivary gland pairs were dissected from starved, non-gravid, 2- to 3-day-old N. neivai female sand flies. Samples were submitted in 1 mL RNAlater each to the North Carolina State Genomic Sciences Laboratory (Raleigh, NC, USA) for RNA extraction, Illumina library preparation, and sequencing. Export of the salivary gland tissue samples was approved under a United States Department of Agriculture Veterinary Permit (ID number #130339).

Salivary gland RNA extraction

Prior to extraction, salivary glands were diluted with 1 mL PBS and pelleted in a benchtop centrifuge at 5000×g for 10 min to remove the RNAlater. Total RNA was extracted using the RNeasy Mini Kit (Qiagen, MD, USA) following the manufacturer’s protocol for purification of total RNA from animal tissue. Briefly, Qiagen RLT buffer with β-Mercaptoethanol (β-ME) was added to the tissue samples, and samples were homogenized using a Qiagen TissueLyser with 5 mm stainless steel beads (Qiagen). Samples were then purified with the provided RNeasy spin columns. Total RNA was then assessed for purity and size integrity using an Agilent 2100 Bioanalyzer with an RNA 6000 Nano Chip (Agilent Technologies, CA, USA). Purification of messenger RNA (mRNA) was performed using oligo-dT beads provided in the NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs, MA, USA). Complementary DNA (cDNA) libraries for Illumina sequencing were constructed using the NEBNext Ultra Directional RNA Library Prep Kit (NEB) and NEBNext Multiplex Oligos for Illumina (NEB) using the manufacturer-specified protocol.

Sialome library

Briefly, the mRNA was chemically fragmented and primed with random Oligos for first strand cDNA synthesis. Second strand cDNA synthesis was then carried out with dUTPs to preserve strand orientation information. The double-stranded cDNA was then purified, end repaired, and “a-tailed” for adaptor ligation. Following ligation, the samples were selected for a final library size (adapters included) of 400–550 bp using sequential AMPure XP bead isolation (Beckman Coulter, USA). Library enrichment was performed, and specific indexes for each sample were added during the protocol-specified PCR amplification. The amplified RNA-seq library fragments were purified and checked for quality and final concentration using an Agilent 2200 Tapestation (D1000 chip, Agilent Technologies, CA, USA) combined with a Qubit fluorometer (Thermo-Fisher, MA, USA). The final quantified libraries were pooled in equimolar amounts for clustering and sequencing on an Illumina HiSeq 2500 DNA sequencer, utilizing a 125 bp single-end cycle sequencing kit (Illumina, CA, USA). The software package Real Time Analysis (RTA) was used to generate raw bcl, or base call files, which were then de-multiplexed by sample into fastq files using bcl2fastq Conversion Software v2.17 (Illumina, CA, USA).
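Equimolar pooling, as described above, requires converting each library's mass concentration (from the fluorometer) and mean fragment size (from the electrophoresis trace) into molarity. The following minimal sketch shows one common way to do this, assuming the usual approximation of 660 g/mol per base pair of double-stranded DNA. The library names, concentrations, sizes, and target amount are hypothetical and are not values from this study.

```python
# Sketch: convert library concentration to molarity and derive equimolar pooling volumes.
# Assumption: double-stranded DNA averages 660 g/mol per base pair.

AVG_BP_MASS = 660.0  # g/mol per bp (common approximation)


def library_nanomolar(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Return library molarity in nM from ng/uL and mean fragment length in bp."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul / (AVG_BP_MASS * mean_size_bp) * 1e6


def equimolar_volumes(libraries: dict, fmol_each: float) -> dict:
    """Return the volume (uL) of each library that contains `fmol_each` femtomoles."""
    volumes = {}
    for name, (conc, size) in libraries.items():
        nanomolar = library_nanomolar(conc, size)  # 1 nM equals 1 fmol/uL
        volumes[name] = fmol_each / nanomolar
    return volumes


# Hypothetical libraries: name -> (ng/uL, mean size in bp)
example_libraries = {"lib_A": (13.2, 500), "lib_B": (6.6, 400)}
for lib_name, vol in equimolar_volumes(example_libraries, fmol_each=100.0).items():
    print(f"{lib_name}: {vol:.2f} uL")
```

Because 1 nM corresponds to 1 fmol/µL, dividing the target femtomole amount by the molarity gives the pipetting volume directly.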

Bioinformatics

Custom bioinformatic analyses were described elsewhere32. Briefly, low-quality reads were trimmed from the Fastq files (quality score < 20) and contaminating adapter primer sequences were removed. De novo assembly of the reads was performed with the Abyss33,34 (using k parameters from 21 to 91 in increments of 5) and SOAPdenovo-Trans35 assemblers. The resulting fasta files were combined and further assembled using an iterative blast and CAP336 pipeline as previously described37. Coding sequences (CDS) were extracted based on the existence of a signal peptide in the longer open reading frame (ORF) and on similarities to other proteins found in the Refseq invertebrate database from the National Center for Biotechnology Information (NCBI), to proteins from “Diptera [organism]” deposited at NCBI’s GenBank, and to proteins from Swiss-Prot. Contigs containing an open reading frame and any similarity to sequences in the chosen databases were selected for further analysis. Reads for each library were mapped on the deduced CDS using blastn38 with a word size of 25, 1 gap allowed, and 95% identity or better required. Up to five matches were allowed if and only if the scores were the same as the largest score. The read mapping was included in an Excel spreadsheet, together with the Reads Per Kilobase of transcript, per Million mapped reads (RPKM)39 value for each coding sequence. Automated annotation of proteins was based on a vocabulary of nearly 350 words found in matches to various databases: Swiss-Prot, Gene Ontology, KOG, Pfam, SMART, Refseq-invertebrates, and the GenBank Diptera subset. Raw reads were deposited in the Sequence Read Archive (SRA) of the National Center for Biotechnology Information (NCBI) under BioProject ID PRJNA359206 and read accession SRR5134059. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the accession GFDF00000000.
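For readers unfamiliar with the RPKM measure used above, the standard definition normalizes the read count of a coding sequence by its length in kilobases and by the total number of mapped reads in millions. The sketch below implements that definition only; it is not the authors' pipeline, and the function name and example numbers are hypothetical.

```python
# Sketch of the standard RPKM definition:
#   RPKM = reads mapped to the CDS * 1e9 / (CDS length in bp * total mapped reads)


def rpkm(mapped_reads: int, cds_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript, per Million mapped reads."""
    if cds_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("CDS length and total mapped reads must be positive")
    return mapped_reads * 1e9 / (cds_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 1,000-bp CDS in a library of 1,000,000 mapped reads.
print(rpkm(500, 1000, 1_000_000))
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings, which makes values comparable across coding sequences of different lengths.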

Phylogenetic analysis

For multiple sequence alignment and phylogenetic analysis, the predicted signal peptide (SignalP-5.0 server40) was removed from abundant N. neivai salivary proteins, and the resulting protein sequences were searched with the Basic Local Alignment Search Tool (BLAST38) against the NR and TSA databases. We selected the five most similar homologous sequences (based on the e-value) for each sand fly species. Homologues with an e-value above 1e−10 were excluded, except for the N. neivai yellow family of proteins, where a homologue from Drosophila was used to root the tree. Multiple sequence alignments and identity/similarity matrices were constructed in MacVector v15.5.3 with MUSCLE41 using the PAM 200 profile. We determined the best amino acid substitution model using the “Find best protein Models” feature of MEGA742. A score was given to each of 56 amino acid substitution models, including variants with Gamma-distributed rates and invariant sites. The option with the lowest Bayesian information criterion score was selected to build the tree. Through this feature, the best amino acid substitution models were determined as follows. WAG was selected for the ML domain and Maxadilan trees. For the Yellow proteins tree, the WAG model was used with a discrete Gamma distribution to model evolutionary rate differences among sites (5 categories (+ G, parameter = 1.6114)). For the SP15 family of proteins, the WAG model was used with a discrete Gamma distribution to model evolutionary rate differences among sites (5 categories (+ G, parameter = 4.0108)), and the rate variation model allowed for some sites to be evolutionarily invariable ([+ I], 2.61% sites). For the C-type lectin tree, the best model was LG + F, with a discrete Gamma distribution used to model evolutionary rate differences among sites (5 categories (+ G, parameter = 3.6037)). For gaps/missing data treatment, the partial deletion option was used. Finally, the reliability of the trees was tested by the bootstrap method (N = 1000).
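The homologue selection rule described above (up to five most similar sequences per species, ranked by e-value and subject to an e-value cut-off) can be expressed as a small filter over BLAST hits. The sketch below is illustrative only: the `Hit` record, its field names, and the example hits are hypothetical, and the cut-off is taken as 1e−10, the value stated in the SP13 section.

```python
# Sketch: keep, for each species, the best-scoring BLAST hits that pass an e-value cut-off.
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    accession: str  # hypothetical identifier of the subject sequence
    species: str
    evalue: float


def select_homologues(hits, max_evalue=1e-10, per_species=5):
    """Group hits by species, drop those above the cut-off, keep the lowest e-values."""
    by_species = defaultdict(list)
    for hit in hits:
        if hit.evalue <= max_evalue:
            by_species[hit.species].append(hit)
    return {
        species: sorted(group, key=lambda h: h.evalue)[:per_species]
        for species, group in by_species.items()
    }


# Hypothetical hits, not taken from the study.
example_hits = [
    Hit("seq1", "species_A", 1e-50),
    Hit("seq2", "species_A", 1e-5),   # fails the cut-off
    Hit("seq3", "species_B", 1e-20),
    Hit("seq4", "species_B", 1e-30),
]
for species, kept in select_homologues(example_hits).items():
    print(species, [h.accession for h in kept])
```

The exception mentioned in the text (a Drosophila homologue used to root the yellow family tree) would be added manually after this filter rather than encoded in it.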

Results and discussion

cDNA library of Nyssomyia neivai salivary gland

A cDNA library was constructed from salivary glands of N. neivai females dissected up to 3 days after emergence. From this cDNA library, 1,302,396 high-quality reads were assembled into 1200 contigs (Table 1). Contigs were classified into five categories: secreted, housekeeping, transposable elements, viral, and unknown. Remarkably, the secreted proteins category comprised 92.4% of the reads, distributed across 41.2% of the contigs. Most salivary transcriptomes from the Phlebotomus and Lutzomyia genera were based on low-throughput cDNA library sequencing; nevertheless, a high abundance of transcripts encoding secreted proteins was also reported17,18,19,20,23,27,43,44. Our data further support the specialization of the salivary gland machinery and the specificity of the material obtained by sand fly dissection. The housekeeping category had 33.3% of the clusters and only 3.2% of the total sequences. The “unknown” category comprised 24.6% of the clusters and 4.3% of the sequences. Finally, viral products and transposable elements each included less than 1% of the families (0.3% and 0.8%, respectively) and less than 0.1% of total sequences. Recently, RNA-seq analyses of salivary glands of the Old World species P. kandelakii24 and Sergentomyia schwetzi25 have been published, and the presence of representative salivary proteins has been confirmed.

Table 1 Classification of transcripts originating from the sialotranscriptome of Nyssomyia neivai.

Housekeeping and unknown proteins sequences

The 399 clusters (comprising 42,119 sequences) attributed to Housekeeping genes expressed in the salivary glands of N. neivai were further divided into 22 subgroups according to their function. Two sets were associated with (a) protein synthesis machinery (22 contigs), including translation, ribosomal structure and biogenesis, and (b) metabolism (94 contigs), a pattern also observed in other sialotranscriptomes. Proteins with unknown function (295 contigs comprising 4.3% of reads) were classified as “unknown”.
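The category shares quoted in this section follow directly from the counts given in the text. As a quick consistency check, the sketch below recomputes the housekeeping and unknown shares from the reported totals (1200 contigs and 1,302,396 reads); the helper name `percent_share` is introduced here for illustration.

```python
# Sketch: recompute category shares from the counts reported in the text.

TOTAL_CONTIGS = 1200
TOTAL_READS = 1_302_396


def percent_share(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


housekeeping_contigs = percent_share(399, TOTAL_CONTIGS)     # reported as 33.3%
housekeeping_reads = percent_share(42_119, TOTAL_READS)      # reported as 3.2%
unknown_contigs = percent_share(295, TOTAL_CONTIGS)          # reported as 24.6%

print(f"housekeeping: {housekeeping_contigs:.2f}% of contigs, {housekeeping_reads:.2f}% of reads")
print(f"unknown: {unknown_contigs:.2f}% of contigs")
```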

Secreted proteins sequences

The putative secreted salivary proteins of N. neivai were classified into 35 main protein families (Table 2). The most abundant transcripts were within the SP13-15 protein family (35.09%), followed by C-type lectins (15.9%), Maxadilan-like (15.6%), ML domain salivary proteins (5.8%), and the Yellow protein family (5.1%). The novel 8-kDa, 6-kDa and 5-kDa families, previously described only in the N. intermedia sialome27, are also present in the N. neivai sialotranscriptome and are grouped here as toxin-like peptides.

Table 2 Classification of secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.

The following paragraphs describe the most abundant families in detail, focusing on protein family characteristics, possible function, and biochemical, immunomodulatory, and antigenic properties. They also present phylogenetic analyses in the context of related proteins from other Brazilian sand flies and of desmoglein proteins.

SP15 family

The SP15 is the most abundant secreted protein family in N. neivai sialome with 23.13% of the total RPKM in the transcriptome (Table 2). This salivary family is present among all species of sand flies studied so far. In N. neivai, we have categorized eight full-length members of this family (Table 3). For further analysis we considered the four most abundant members JAV08233.1, JAV08232.1, JAV08238.1 and JAV08231.1 with 91.37% of the SP13-15 family abundance as highlighted in Table 3.

Table 3 SP15 secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.
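Throughout the following sections, the analysis is restricted to the most abundant members of each family, reported together with the share of the family abundance they cover (for example, four SP15 members covering 91.37%). A minimal sketch of that selection is shown below. The member names and RPKM values are hypothetical placeholders, not the values in Table 3.

```python
# Sketch: rank family members by abundance and report the share covered by the top n.


def top_members_by_coverage(rpkm_by_member: dict, n: int):
    """Return the n most abundant members and the percentage of family RPKM they cover."""
    total = sum(rpkm_by_member.values())
    if total <= 0:
        raise ValueError("family abundance must be positive")
    ranked = sorted(rpkm_by_member.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:n]
    coverage = 100.0 * sum(value for _, value in top) / total
    return [name for name, _ in top], coverage


# Hypothetical family with four members.
family = {"member_1": 50.0, "member_2": 30.0, "member_3": 15.0, "member_4": 5.0}
members, covered = top_members_by_coverage(family, n=2)
print(members, f"{covered:.1f}%")
```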

The SP15 family was described for the first time in the sialome of Lu. longipalpis as the SL1 family45. This family was then also reported in the Old World sand fly Phlebotomus perniciosus (Newstead 1911), and later named the SP15 family after the 15-kDa salivary protein from Phlebotomus papatasi (Scopoli 1786) (PpSP15: AF335487)19. Thus far, SP15-like proteins have only been reported in sand flies and not in any other Dipteran; it has been suggested that SP15-like proteins were derived from an ancestral odorant-binding protein and are closely related to the short D7 proteins of mosquitoes16,19. Alvarenga et al.46 demonstrated that SP15 from Phlebotomus duboscqi (Neveu-Lemaire 1906) inhibits anionic surface-mediated reactions, suggesting a role in anticoagulation, through inhibition of the activation of FXII and FXI, and in anti-inflammatory processes.

PpSP15-like proteins have been reported as promising anti-Leishmania vaccine candidates. Immunization of mice with P. papatasi SP15 protein conferred partial protection against Leishmania (Leishmania) major (Yakimoff and Schokhor 1914) infection43; furthermore, a DNA vaccine containing the PpSP15 cDNA provided the same protection43. ParSP03 (AAX56359), a PpSP15-like protein from Phlebotomus ariasi (Tonnoir 1921), elicited similar delayed-type hypersensitivity and humoral immune responses upon DNA vaccination20. Recently, in immunized BALB/c mice, PsSP19 (HM56964), a member of the SP15 family from Phlebotomus sergenti (Parrot 1917), acted as an adjuvant to accelerate the cell-mediated immune response to co-administered Leishmania antigens, providing protection against Leishmania (Leishmania) tropica (Wright 1903) infection47.

Phylogenetic analysis comparing selected sequences from sand fly salivary transcriptomes to the N. neivai SP15 family clustered these four abundant members closely (Fig. 1A, red asterisks) in a New World sand fly clade, next to members from N. intermedia, B. olmeca and Lu. ayacuchensis and to the only SP15 family protein described in Lu. longipalpis so far, SL128 (AAD32197.1) (Fig. 1A). The remaining clades represent Old World VL and TL vectors and the three members of the S. schwetzi SP15 family. We then aligned N. neivai SP15 proteins to SP15 salivary proteins from other sand fly vectors present in Brazil, namely N. intermedia and Lu. longipalpis (Fig. 1B). N. neivai SP15 family proteins shared a relatively high percentage of identity (41.7 to 98.3%) with N. intermedia proteins and, to a lesser extent, with Lu. longipalpis SL1 (43.1 to 57.1%) (Table 4).

Figure 1

Molecular phylogenetic analysis and sequence alignment of the Nyssomyia neivai SP15 protein family. (A) The evolutionary history was inferred by using the Maximum Likelihood method. The tree with the highest log likelihood is shown. The tree is drawn to scale, with branch lengths measured in number of substitutions per site. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+ G, parameter = 4.0108)]. The rate variation model allowed for some sites to be evolutionarily invariable ([+ I], 2.61% sites). All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA7. (B) Multiple alignments of SP15 from Nyssomyia neivai with Nyssomyia intermedia and Lutzomyia longipalpis SP15 proteins using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

Table 4 Pairwise comparison matrix of identity and similarity percentages.
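Pairwise identity percentages such as those in Table 4 are derived from the aligned sequences. The sketch below shows one simple way to compute them, assuming identity is counted over all alignment columns in which at least one of the two sequences has a residue. The alignment software used in the study may define the denominator differently, and similarity would additionally require a substitution-group table, which is omitted here. The sequences shown are hypothetical.

```python
# Sketch: percent identity between two rows of a multiple sequence alignment.
# Assumption: columns where both sequences have a gap are ignored; columns with a
# gap in only one sequence count as compared but not identical.
from itertools import combinations


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_matrix(alignment: dict) -> dict:
    """Return {(name_1, name_2): percent identity} for every pair of aligned sequences."""
    return {
        (name_1, name_2): percent_identity(alignment[name_1], alignment[name_2])
        for name_1, name_2 in combinations(alignment, 2)
    }


# Hypothetical aligned fragments.
example_alignment = {"seq_1": "MKT-LA", "seq_2": "MKSALA", "seq_3": "MKT-LA"}
for pair, value in identity_matrix(example_alignment).items():
    print(pair, f"{value:.1f}%")
```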

N. neivai SP15 members shared 76.4 to 98.3% identity (Table 4) with N. intermedia Linb-8 (AFP99232.1). Interestingly, immunization of BALB/c mice with DNA plasmids encoding Linb-8 induced the highest humoral immune response against N. intermedia salivary gland homogenate (SGH), even greater than that induced by Linb-7 (AFP99230.1), another SP15 protein family member also tested27. Linb-7-immunized mice developed a strong humoral response leading to a sustained local inflammatory process, which could exacerbate infection by L. (V.) braziliensis27. Because N. neivai and N. intermedia are the two vectors of L. (V.) braziliensis in Southeastern Brazil, the high similarity observed between N. neivai members and Linb-8 makes them possible targets as biomarkers of vector exposure and as components of a vector-based vaccine for TL in Brazil.

SP13 family

The N. neivai SP13 family (JAV08240.1, JAV08113.1 and JAV08193.1) represents 11.26% of the total RPKM in the transcriptome and includes the most abundant salivary transcript (JAV08240.1), which represents 72.86% of the reads belonging to this family (Table 5). Despite its abundance in N. neivai, searches for related members of the SP13 subfamily in the NCBI NR and TSA-NR databases (blast-P hits with an e-value lower than 1e−10) yielded few sand fly salivary proteins. We identified only two members from N. intermedia (AFP99227.1 and AFP99242.1), one member from Lu. ayacuchensis (BAM69127.1) and one from B. olmeca (ANW11435.1) (Fig. 2). The N. neivai member JAV08240.1 is identical to N. intermedia Linb-1 (AFP99227.1), which was also the most abundant contig in the N. intermedia salivary transcriptome27. Interestingly, all the members identified except JAV08113.1 (Fig. 2) share the RGD domain in the carboxy-terminal region that is common in members of the disintegrin family48,49. These RGD-containing proteins had been previously observed in other New World sand flies such as Lu. ayacuchensis (LuayaRGD; BAM69127.1) and Lu. longipalpis (LuloRGD; AAD32196). The function and relevance of this family during blood feeding remain to be tested.
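The presence of an RGD tripeptide near the carboxy terminus, as noted above for most SP13 members, can be screened for with a simple string search. In the sketch below, the size of the C-terminal window (30 residues) is an arbitrary assumption, since the text does not define the extent of the "carboxy region", and the sequences are hypothetical placeholders rather than the deposited proteins.

```python
# Sketch: locate RGD motifs and test whether one starts within a C-terminal window.


def motif_positions(sequence: str, motif: str = "RGD") -> list:
    """Return 0-based start positions of every occurrence of `motif` (overlaps allowed)."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def has_c_terminal_motif(sequence: str, motif: str = "RGD", window: int = 30) -> bool:
    """True if `motif` starts within the last `window` residues (window is an assumption)."""
    threshold = max(0, len(sequence) - window)
    return any(pos >= threshold for pos in motif_positions(sequence, motif))


# Hypothetical sequences.
with_c_terminal_rgd = "M" + "A" * 40 + "RGD" + "K" * 5
with_n_terminal_rgd = "MRGD" + "A" * 60
print(has_c_terminal_motif(with_c_terminal_rgd))
print(has_c_terminal_motif(with_n_terminal_rgd))
```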

Table 5 SP13 secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.
Figure 2

Molecular phylogenetic analysis and sequence alignment of the Nyssomyia neivai SP13 protein family. (A) Multiple alignments of SP13 from Nyssomyia neivai with other SP13 sand fly salivary proteins using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

C-type lectin family

We have categorized twelve novel full-length transcripts as C-type lectins in the N. neivai sialome, as shown in Table 6. The C-type lectins are the third most abundant salivary family in N. neivai. Of those twelve, six members (JAV08563.1, JAV08583.1, JAV08561.1, JAV08565.1, JAV08584.1, JAV08562.1) corresponded to 89.5% of this family's abundance and were considered for in-depth analysis (Table 6). In vertebrates, protein-carbohydrate interactions serve multiple functions in the immune system. C-type lectin family members are components of the innate immune response and work via pathogen neutralization through the activation of the complement pathway and the adaptive immune response50. The putative C-type lectin domain may function as a Ca2+-dependent carbohydrate-binding pocket involved in extracellular matrix organization, pathogen recognition, and cell-to-cell interactions50. Homologous salivary proteins with a molecular weight of 16.2–16.5 kDa have been identified in New World sand flies. Recently, homologues were also found in Old World sand flies by next-generation sequencing of salivary glands from P. kandelakii24 (as a partial protein) and Sergentomyia schwetzi25. The most abundant members of the N. neivai C-type lectin family seem to be closely related to N. intermedia homologues (Fig. 3A). Interestingly, the N. neivai protein JAV08583.1 segregated from the other N. neivai C-type lectin members into a subtree that also encompasses members from Lu. ayacuchensis and Lu. longipalpis. The protein sequence alignment comparing Brazilian sand flies suggests fast evolution of this family (Fig. 3B), as indicated by the wide range of amino acid identity scores (from 96.4 to 26.7%) across species in pairwise comparisons (Table 7). This may be associated with multiple gene duplication events and high immune pressure from hosts. The exact role of these proteins in sand flies remains elusive.

Table 6 C-type lectin secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.
Figure 3

Molecular phylogenetic analysis and sequence alignment of Nyssomyia neivai C-type lectin protein family. (A) The evolutionary history was inferred by using the Maximum Likelihood method. The tree with the highest log likelihood is shown. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+ G, parameter = 3.6037)]. The tree is drawn to scale, with branch lengths measured in number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA7. (B) Multiple alignments of C-type lectins from Brazilian sand flies using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

Table 7 Pairwise comparison matrix of identity and similarity percentages.

Maxadilan-like family

Maxadilan (AAA29288.1) is a 7-kDa peptide present in the salivary gland of the sand fly Lu. longipalpis. Maxadilan was the first molecule to be identified in sand fly saliva51, and it is recognized for its powerful vasodilator effect. The N. neivai maxadilan-like family consists of 11 full-length, abundantly expressed proteins (Table 8) representing 15.6% of the transcriptome. We further discuss six members of this family (JAV08475.1, JAV08473.1, JAV08474.1, JAV08472.1, JAV08471.1, JAV08462.1), which account for 90.14% of this family's abundance.

Table 8 Maxadilan secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.

Comparative analyses of abundant transcripts from the N. neivai maxadilan-like family identified three homologues in N. intermedia (Fig. 4A). The phylogenetic topology shows that a main clade clustered the most abundant N. neivai members (JAV08475.1, JAV08473.1, JAV08474.1) with Linb-9 (AFP99245.1), while the other N. neivai members clustered with Linb-25 from N. intermedia. Maxadilan has its own branch, and JAV08462.1 from N. neivai was the closest relative to maxadilan, with 37.5% identity and 56.2% similarity, suggesting that N. neivai JAV08462.1 (Fig. 4B) may have preserved its pharmacological properties52. N. neivai JAV08462.1 is the sixth most abundant member, with 6.44% of the maxadilan-like family abundance. Linb-147 (JK846521), a partial sequence from N. intermedia, showed a similar match to maxadilan, with only 34% identity and 70% similarity over a stretch of 50 amino acids. This sequence seems to be scarce in the N. intermedia sialome, with only one transcript identified in its cDNA library, compared to 30 transcripts of maxadilan present in the Lu. longipalpis sialome27,28.

Figure 4

Molecular phylogenetic analysis and sequence alignment of the Nyssomyia neivai Maxadilan-like protein family. (A) The evolutionary history was inferred by using the Maximum Likelihood method. The tree with the highest log likelihood is shown. The tree is drawn to scale, with branch lengths measured in number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA7. (B) Multiple alignments of Maxadilan and N. neivai (JAV08462.1) proteins using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

Maxadilan-like proteins have never been identified in Old World Phlebotomus species17,18,19,20,23,27,43. Phlebotomus sand flies, except for P. duboscqi, contain large amounts of vasodilatory adenosine and adenosine monophosphate (AMP) in their saliva18,19,20,23,25. Interestingly, they lack adenosine deaminase (ADA), an enzyme that hydrolyzes adenosine and AMP53,54. In contrast, Lu. longipalpis has ADA and lacks adenosine and AMP in the saliva28,45. Unexpectedly, neither ADA nor maxadilan was identified in Lu. ayacuchensis. Thus, with those exceptions, the salivary vasodilators of sand flies are generally adenosine and AMP in the Phlebotomus complex, and maxadilan in the Lu. longipalpis complex.

Inoculation of maxadilan in experimental animals exacerbates Leishmania infection to the same degree as the whole salivary gland homogenate55. This peptide can drive a Th1 response to Th2, up-regulate IL-10 and TGF-β production, and suppress IL-12p40, TNF-α, and NO production56,57.

In animal models, mice vaccinated with maxadilan become markedly protected against Leishmania infection, producing not only anti-maxadilan antibodies but also maxadilan-specific immune CD4+ T cells that generate IFN-γ and induce NO production55. Notably, immunization against maxadilan also inhibits blood meal acquisition by sand flies, making it a promising target to block the vector reproductive process58. Nonetheless, maxadilan is not free of polymorphisms; its amino acid substitution rate is around 23%, with most amino acid positions not being conserved among homologues59,60. Considering that Lu. longipalpis is the vector of VL and N. neivai is the vector of TL in the same endemic Brazilian regions8,61, maxadilan-like proteins could bring new insight toward a common vaccine targeting VL and TL.

ML domain peptide family

We have categorized 10 full-length abundant proteins as being part of the ML domain family, which mapped to 5.8% of the N. neivai transcriptome (Table 9). For the comparative analysis, we focused on the six most abundant members (JAV08576.1, JAV08588.1, JAV08582.1, JAV08586.1, JAV08585.1, JAV08572.1), which represent 88.01% of this family's abundance (Table 9). The MD-2-related lipid-recognition (ML) domain is implicated in lipid-mediated membrane binding mechanisms with a role in the execution and regulation of many cellular processes, including cell signaling and membrane trafficking62. The ML domain annotation from the SMART database hints that these proteins may be involved in innate immunity or act as antagonists of lipid mediators of hemostasis and inflammation63.

Table 9 ML domain secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.

This family is relatively common in tick sialomes63, but among sand flies it had only been described in the N. intermedia and B. olmeca sialomes so far27,29. Phylogenetic analysis indicates the presence of three subfamilies with the ML domain, with several likely gene duplication events occurring in N. neivai (Fig. 5A). ML domain salivary proteins present in sand flies seem to be a very divergent family with few conserved amino acids across species (Fig. 5B); however, when the clustered molecules within each of the clades are compared, a higher degree of conservation becomes apparent (Fig. 5C). For example, the N. intermedia ML family sequence AFP99241.1 and the B. olmeca sequence ANW11447, which share the same subtree with N. neivai JAV08582.1 in Fig. 5A, show 97.8% and 64.9% identity to it (Fig. 5C), respectively, hinting at possibly three independent protein families within the ML domain.

Figure 5

Molecular phylogenetic analysis and sequence alignment of ML-domain protein family of Nyssomyia neivai. (A) The evolutionary history was inferred by using the Maximum Likelihood method. The tree with the highest log likelihood is shown. The tree is drawn to scale, with branch lengths measured in number of substitutions per site. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA7. (B) Multiple alignments of ML-domain from Nyssomyia neivai with South American sand flies ML-domain proteins using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids. (C) Alignment of Nyssomyia neivai JAV08582.1, N. intermedia AFP99241.1 and B. olmeca ANW11447.1 from the ML-domain protein family using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

Yellow protein family

The N. neivai salivary transcriptome encompasses eight members of the yellow salivary protein family, corresponding to 5.1% of the total sialome. Of these eight members, three correspond to 99.72% of the yellow family abundance (Table 10).

Table 10 Yellow-related secreted proteins originating from the sialotranscriptome of Nyssomyia neivai.

Yellow-related proteins are abundantly expressed in the salivary glands of phlebotomines, mainly in Old World sand flies17,18,19,20,23,24,25,27,43,44. The Yellow family was the most abundant salivary protein family detected, also by next-generation sequencing, in Phlebotomus kandelakii salivary glands (accounting for 31.7% of the mRNA in salivary glands)24, contrasting with the limited presence of this family in N. neivai (5.1%). Phlebotomine yellow-related proteins are characterized by the major royal jelly protein (MRJP) domain. MRJP proteins were originally described from honeybee larval jelly, where they make up to 90% of the protein content64. Sequences related to MRJP proteins were described in Drosophila, where they are related to cuticle pigmentation and, when mutated, produce a yellow phenotype; they were thus named Yellow proteins65. It was later found that in Diptera they have a dopachrome oxidase function66,67.

In bloodsucking Diptera, salivary yellow-related proteins have only been described in sand flies (all sand fly species studied to date)17,18,19,20,23,24,25,27,43,44 and Glossina morsitans morsitans (Westwood 1851)68. The proteins of this family are immunogenic, and host antibody responses to them can be a potential marker of sand fly exposure in experimentally bitten mice and dogs, as well as in naturally exposed dogs, humans, and foxes3,69. The Lu. longipalpis proteins LJM11, LJM111, and LJM17 act as high-affinity binders of pro-inflammatory biogenic amines such as serotonin, catecholamines, and histamine, suggesting that these proteins help reduce inflammation during sand fly blood feeding3,70. This activity has also been confirmed in salivary yellow proteins from the Old World species P. orientalis and P. perniciosus71,72.

A combination of recombinant LJM17 and LJM11 successfully substituted Lu. longipalpis whole salivary gland homogenate in probing sera of individuals for vector exposure73. Yellow proteins are also under consideration for anti-Leishmania vector-based vaccines. LJM17 from Lu. longipalpis elicited leishmanicidal Th1 cytokines in immunized dogs74,75, and LJM11 protected laboratory animals against L. (L.) infantum (Nicolle 1908), L. (L.) major, and L. (V.) braziliensis70,76,77.

In contrast, immunization of mice with the P. papatasi yellow-related proteins PpSP42 or PpSP44 (AAL11052 and AAL11051, respectively) elicited Th2 cytokines and exacerbated L. (L.) major infection78. Another yellow-related protein from P. papatasi, PPTSP44 (AGE83095.1), induced a strong Th1 response, making it a potential vaccine candidate against leishmaniasis79. It remains to be elucidated whether the protection induced by yellow-related proteins depends on the immunogenicity of the particular protein, on the sand fly species, or on the vector-Leishmania-host combination, as all of these factors can contribute to vaccine efficacy. New vaccination approaches, consisting of a single dose of plasmid followed by two doses of recombinant Canarypoxvirus expressing Lu. longipalpis yellow-related salivary proteins, are a promising strategy to control Leishmania infection75.

Phylogenetic analysis segregated the New World sand fly yellow proteins into their own clade, separate from the yellow proteins of Old World sand fly vectors of VL and TL (Fig. 6A). The New World clade branched into two subclades. In the first, the Lu. longipalpis yellow proteins LJM11 (AAS05318.1) and LJM111 (ABB00904.1) clustered with the N. neivai yellow-related proteins JAV07960.1 and JAV07959.1 (Fig. 6A). Of note, these two N. neivai yellow proteins are closely related to the yellow protein from B. olmeca (ANW11468.1) (Fig. 6A). In the second subclade, the N. neivai yellow-related protein JAV07968.1 clustered closely with the N. intermedia member (AFP99235.1) and with Lu. longipalpis LJM17 (AAD32198.1) (Fig. 6A). Multiple alignment of the N. neivai, Lu. longipalpis, and N. intermedia sequences (Fig. 6B) shows a high level of conservation among these proteins. For example, N. neivai JAV07960.1 and Lu. longipalpis LJM11 (AAS05318.1) share 73.3% identity and 84.6% similarity. On the other hand, the N. neivai yellow-related protein JAV07968.1 is more closely related to N. intermedia Linb-21 (AFP99235.1), with 97.7% identity, than to Lu. longipalpis LJM17 (AAD32198.1), with 62.4% identity.
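Identity and similarity percentages such as those quoted above are derived from pairwise alignments. The text does not state which convention was used, so the following is only a minimal sketch under stated assumptions: gapped columns are excluded from the denominator, and "similar" residues are defined by a simple illustrative grouping rather than by the scoring scheme of any particular alignment tool. The function name, the residue groups, and the toy sequences are hypothetical and are not the sequences analyzed here.

```python
# Illustrative residue groups (an assumption; real tools usually derive
# similarity from a substitution matrix such as BLOSUM62).
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, so both
    percentages are relative to the number of ungapped columns.
    Similarity counts identical residues plus residues in the same group.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy example: 5 ungapped columns, 3 identical, 1 additional similar (K/R).
identity, similarity = identity_and_similarity("MKV-LA", "MRVTLG")
print(f"identity = {identity:.1f}%, similarity = {similarity:.1f}%")
```

Other conventions, for example dividing by the full alignment length or by the length of the shorter sequence, give different values for the same alignment, so reported percentages are only comparable when the same definition is used.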

Figure 6

Molecular phylogenetic analysis and sequence alignment of the Yellow-related protein family of Nyssomyia neivai. (A) The evolutionary history was inferred by using the Maximum Likelihood method. The tree with the highest log likelihood is shown. The tree is drawn to scale, with branch lengths measured in number of substitutions per site. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+ G, parameter = 1.6114)]. All positions with less than 95% site coverage were eliminated. Evolutionary analyses were conducted in MEGA7. (B) Multiple alignment of Yellow-related proteins from Nyssomyia neivai with Nyssomyia intermedia and Lutzomyia longipalpis Yellow-related proteins using Muscle. Black shading represents identical amino acids, light gray shading represents similar amino acids.

Sand flies’ salivary proteins and pemphigus foliaceus

Sand fly salivary proteins have been associated with endemic PF pathogenesis in Brazil and Tunisia10,11,12,52,80,81,82. Screening of sera from PF patients from NSPS revealed antibodies against Lu. longipalpis maxadilan11, while patients from Mato Grosso do Sul State reacted to Lu. longipalpis LJM1113 and LJM1714. Similarly, Tunisian PF patients reacted to several salivary proteins from P. papatasi80.

The mechanisms through which exposure to sand fly bites may induce autoimmunity, triggering the production of IgG autoantibodies against Dsg1 in genetically susceptible individuals, remain to be elucidated. Notably, IgG antibodies against salivary homogenates from N. neivai correlated positively with anti-Dsg1 IgG in PF patients10. Antigenic cross-reactivity between salivary proteins and Dsg1 is the most plausible hypothesis for the production of anti-Dsg1 autoantibodies following sand fly bites. Nevertheless, neither BLAST nor PSI-BLAST detects any highly significant homology between N. neivai, Lu. longipalpis, or P. papatasi salivary proteins and Dsg114. Other pathogenic mechanisms that could explain the loss of tolerance to the Dsg1 self-antigen induced by exposure to sand fly salivary proteins remain to be tested.

The presence of an abundant maxadilan-simile transcript (JAV08462.1) in the N. neivai sialome may explain the antibody reactivity to maxadilan in NSPS patients, where N. neivai but not Lu. longipalpis is present9. Further testing with a recombinant N. neivai maxadilan can help confirm this assumption. Moreover, testing the PF sera from NSPS against recombinant yellow proteins identified in N. neivai would also be desirable in our PF case series.

Considering that non-homologous salivary proteins from different sand fly species have been associated with PF pathogenesis10,11,12,13,14,81,82, we do not expect a single peptide to act as the sole antigen that triggers PF. More than one protein may be involved in PF pathogenesis, considering the shared pharmacological properties and conformational mimotopes of these proteins in distinct biting sand fly species.

Conclusion

Leishmaniasis is still a frequent and neglected disease in Brazil. Our results add valuable data on New World phlebotomine salivary proteins, expanding the findings reported for the Lu. longipalpis and N. intermedia sialomes. With the N. neivai sialome, the identity of the most abundant salivary proteins is now available for three main sand fly species widely distributed in Brazil. This will bring new insights into the host-vector-parasite relationship of L. (L.) infantum and L. (V.) braziliensis infections and may point to targets of interest for a vector-based vaccine. We hope that this compilation of N. neivai salivary proteins ranked by abundance can inform researchers in the selection of N. neivai candidates for future experiments. Distinct abundant N. neivai proteins can be produced in recombinant form to test individual candidates as triggers of anti-Dsg1 autoantibodies in the etiology of PF, and can also be used as biomarkers of vector exposure, translating into monitoring tools for vector intervention campaigns.