Introduction

The olfactory system in insects regulates their intersex communication, host-plant interactions, oviposition, foraging, escape from predators and reproduction1,2,3,4,5. Insects have a complex chemosensory system in which pheromones and plant odors are initially recognized by odorant-binding proteins (OBPs) expressed in the antennal sensilla lymph that transfer the odorants to membrane-bound olfactory receptors (ORs) to activate olfactory receptor neurons (ORNs) and stimulate behavioral responses6,7,8,9,10,11.

Understanding the molecular mechanisms of olfaction is essential for improving existing olfactory-based pest management strategies and developing novel ones. OBPs are comparatively accessible research targets because they are small, soluble, stable and easy to manipulate and modify. They are water-soluble proteins with six positionally conserved cysteines that form three interlocking disulphide bridges, which stabilize the protein’s three-dimensional structure12,13,14,15,16,17,18,19. OBPs were first discovered in the antennae of Antheraea polyphemus, where they distinguish and bind lipophilic odorant compounds20,21,22,23,24,25. However, emerging data suggest that OBPs are not restricted to the sensory organs of insects and are also expressed in non-sensory organs, including reproductive organs26,27. Li et al. showed that AaegOBP22 was highly expressed in the male reproductive organs of Aedes aegypti and is transferred to females during mating. This suggests an additional function for this protein as a pheromone carrier, analogous to the urinary and salivary proteins of vertebrates and to some insect chemosensory proteins26. Sun et al. also found that HarmOBP10 and HassOBP10 are highly abundant in the seminal fluid of Helicoverpa armigera and H. assulta and are transferred to females during mating. HarmOBP10 and HassOBP10 also bind 1-dodecene, a known insect repellent27.

Athetis dissimilis Hampson (Lepidoptera: Noctuidae) is an important agricultural pest distributed mainly in Asian countries, including China, Japan, the Philippines, Korea, Indonesia and India, where it causes serious damage to maize, wheat, peanut, soybean and sweet potato28,29,30. Because larvae of A. dissimilis live under plant residues, it is difficult to control the spread of the pest with chemical pesticides. Novel control strategies are therefore urgently needed to mitigate crop damage. We previously sequenced the antennal transcriptomes of A. dissimilis31 and characterized 5 OBPs that showed tissue-specific expression patterns32. Of note, AdisOBP6 was highly expressed in the testes of A. dissimilis32. We reasoned that the insect testis possesses a defined set of OBPs in a manner comparable to the antenna. In this study, we reanalyzed the previous antennal transcriptome data and identified 31 OBP genes. We also sequenced the transcriptomes of the A. dissimilis reproductive organs and studied the expression of the OBPs in the antennae, testes and ovaries. Our study provides a new reference for studying the function of OBP genes.

Results

Illumina sequencing and assembly

A total of 34,565,866, 32,154,799 and 26,952,526 clean reads, containing 10.35, 9.63 and 8.07 gigabases (Gb) of clean nucleotides, respectively, were obtained from the three replicates of the A. dissimilis ovaries. A total of 27,752,168, 28,900,040 and 30,838,686 clean reads, containing 8.29, 8.65 and 9.23 Gb of clean nucleotides, respectively, were obtained from the three replicates of the A. dissimilis testes. The quality of the transcriptome sequences was high, with Q30 percentages of 94.03%, 94.36%, 94.21%, 94.42%, 94.27% and 94.01% across the three ovary and three testis replicates and a GC content of ~50% (Table 1). Assembly with Trinity yielded 221,074 transcripts and 82,016 unigenes, with N50 lengths of 1350 and 1243, respectively (Table 2).
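
For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is commonly computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the summed length. The function name and the toy lengths are illustrative assumptions and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical toy lengths (bp), for illustration only.
toy_lengths = [2000, 1500, 1200, 800, 500]
print(n50(toy_lengths))
```

In this toy case the summed length is 6000 bp, so the second-longest sequence is the first to carry the running total past 3000 bp.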

Table 1 Summary of the sequence assemblies according to the RNA-seq data of A. dissimilis.
Table 2 Summary of de novo assembly of the A. dissimilis transcriptomes.

Functional annotation

Significant matches were observed for 33,587 unigenes (96.91%) in the NR database; 29,936 (86.38%) in eggNOG; 20,134 (58.09%) in Pfam; 15,174 (43.78%) in Swiss-Prot; 14,775 (42.63%) in KEGG; 7797 (22.50%) in GO; and 6712 (19.37%) in COG. As a result, up to 34,658 putative coding sequences were identified (Table 3). NR database queries revealed that a high percentage of A. dissimilis sequences most closely matched sequences of H. armigera (19,072, 56.87%), followed by Amyelois transitella (1936, 5.77%), Bombyx mori (1543, 4.60%), Papilio machaon (1155, 3.44%), Papilio xuthus (868, 2.59%), Plutella xylostella (844, 2.52%), Danaus plexippus (634, 1.89%), Branchiostoma belcheri (473, 1.41%) and Papilio polytes (368, 1.10%) (Fig. 1).
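
The per-database percentages above can be reproduced with simple arithmetic. The sketch below assumes that each percentage is the number of matched unigenes divided by the 34,658 total reported in the same paragraph; that denominator is inferred from the text rather than stated explicitly, and the variable names are illustrative.

```python
# Counts of unigenes with significant matches, as reported in the text.
hits = {
    "NR": 33_587,
    "eggNOG": 29_936,
    "Pfam": 20_134,
    "Swiss-Prot": 15_174,
    "KEGG": 14_775,
    "GO": 7_797,
    "COG": 6_712,
}

# Assumed denominator: the total of 34,658 given in the same paragraph.
annotated_total = 34_658


def percent(count, total):
    """Percentage of `count` out of `total`, rounded to two decimals."""
    return round(100 * count / total, 2)


percentages = {db: percent(n, annotated_total) for db, n in hits.items()}

for db, value in percentages.items():
    print(f"{db}: {value:.2f}%")
```

Under this assumption the computed values agree with the reported ones to two decimal places.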

Table 3 Functional annotation of the A. dissimilis transcriptomes.
Figure 1 The Blastx results of Athetis dissimilis reproductive organs unigenes in NR database.

For GO analysis, 7797 unigenes (22.50%) could be assigned to the three main GO categories: cellular component, molecular function and biological process (Fig. 2). Within the “molecular function” ontology, catalytic activity (4227, 42.19%) and binding (3972, 39.64%) were the most prevalent terms.

Figure 2 Gene Ontology (GO) classifications of Athetis dissimilis reproductive organs unigenes according to their involvement in biological processes, cellular component and molecular function.

Identification of putative odorant-binding proteins

In the A. dissimilis antennal and reproductive organ transcriptomes, we identified 54 candidate OBPs (GenBank accession numbers: KR780027–KR780030, MH900289–MH900338), 31 of which were from the antennae (through the analysis of previous A. dissimilis antennal transcriptomes) and 23 from the testis and ovary transcriptomes of A. dissimilis (Table 4). A total of 44 AdisOBP sequences had full-length ORFs. Their cDNAs encoded proteins of 131–293 amino acids with molecular weights of 11.6–33.2 kDa and isoelectric points of 4.44–9.74. Signal peptides were predicted at the N-terminus of all AdisOBPs except AdisOBP28, 30, 31, 35, 36, 41, 42, 52, 53 and 54. AdisOBPs shared 39–99% sequence identity with previously identified OBPs from other insect species. For example, AdisOBP13 had 95% identity with Spodoptera exigua OBP9 (Table 4). The lowest identity in pairwise comparisons among AdisOBPs was 11.87%.

Table 4 Characteristics of candidate OBP genes in the antennae and reproductive organs of A. dissimilis.

Multiple sequence alignments of the A. dissimilis OBPs revealed the presence of the expected conserved cysteines (Fig. 3). The phylogenetic tree of A. dissimilis and other lepidopteran OBPs, constructed using the neighbor-joining method, indicated five clades that contained four possible OBP subclasses (Fig. 4). In addition, the tree showed low levels of clustering, highlighting the diversity of the lepidopteran OBPs. Five AdisOBPs (AdisPBP1-3, GOBP1-2) belonged to the PBP/GOBP subclass. A total of 30 OBPs (AdisOBP2-3, 9, 11, 20–24, 26–32, 34–35, 37, 39, 42, 45–48, 50–54) were ‘Classic’ OBPs that contained six positionally conserved cysteine residues. Seven OBPs (AdisOBP14-16, 18, 33, 36 and 41) belonged to the ‘Plus-C’ subclass, with more cysteines in addition to those of the conserved motif. Nine OBPs belonged to the ‘Minus-C’ subclass, with only four cysteines. Interestingly, AdisOBP1, AdisOBP17 and AdisOBP40 did not belong to any of the four subclasses (Fig. 4). However, according to BLAST results, these three genes were homologous with OBP genes of Bombyx mori, Spodoptera exigua and Dendrolimus punctatus (Table 4). The transcript abundance of A. dissimilis OBPs in the antennae of females and males, the ovaries and the testes is profiled in Fig. 5.
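
The cysteine-based distinction between the Classic, Plus-C and Minus-C subclasses can be illustrated with a deliberately simplified sketch. It only counts cysteine residues in a protein sequence, whereas the assignment described above relies on the positions of the conserved cysteines in the alignment and on the phylogenetic tree. The function name and the toy sequences are hypothetical and are not AdisOBP sequences.

```python
def classify_by_cysteine_count(protein: str) -> str:
    """Rough OBP subclass guess from the number of cysteines alone.

    Simplifying assumption: cysteine positions and spacing are ignored,
    so this is only a first-pass heuristic, not a real classifier.
    """
    n_cys = protein.upper().count("C")
    if n_cys == 6:
        return "Classic"
    if n_cys > 6:
        return "Plus-C"
    if n_cys == 4:
        return "Minus-C"
    return "unclassified"


# Hypothetical toy sequences, for illustration only.
toy_sequences = {
    "toy_classic": "ACDCEFCGHCIKCLMC",    # six cysteines
    "toy_plus_c": "ACDCEFCGHCIKCLMCCC",   # eight cysteines
    "toy_minus_c": "ACDEFCGHIKCLMNC",     # four cysteines
    "toy_none": "ADEFGHIKLMN",            # no cysteines
}

for name, seq in toy_sequences.items():
    print(name, classify_by_cysteine_count(seq))
```

A sequence that falls outside all three counts is reported as unclassified, which loosely mirrors the situation of OBPs that could not be placed in any subclass.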

Figure 3 Sequence alignments of Athetis dissimilis OBPs. The six conserved cysteine residues are indicated by asterisks under the sequences.

Figure 4 Phylogenetic relationships of candidate OBP proteins (including 5 OBPs identified in a previous study) from Athetis dissimilis and 33 Lepidoptera species.

Figure 5 Heat map showing the abundance of unigenes encoding OBPs (including 5 OBPs identified in a previous study) in the transcriptomes of different Athetis dissimilis tissues, presented as normalized reads per kilobase per million mapped reads (RPKM). Each column represents one sample and each row represents one OBP gene. Color depth represents the number of reads mapped to each OBP; red indicates more and blue indicates fewer. FA female antennae, MA male antennae, Ov ovaries, Te testis.

Expression of the OBPs in the antennae, ovaries and testis of A. dissimilis

Next, we measured the relative expression levels of the identified OBPs in different tissues of A. dissimilis via fluorescence-based qRT-PCR (Fig. 6). A total of 23 OBPs (AdisGOBP1-2, PBP1-3, OBP1-2, 8–9, 11, 17, 20–22, 24, 26–31, 50 and 54) were highly expressed in the antennae compared to the reproductive organs. These included three OBPs (AdisPBP1, OBP17 and OBP26) that exhibited male-biased expression, 15 OBPs (AdisGOBP2, PBP2-3, OBP1-2, 11, 20–22, 27–28, 30–31, 50 and 54) that exhibited female-biased expression, and five OBPs (AdisGOBP1, OBP8-9, 24 and 29) that showed comparable expression in the male and female antennae of A. dissimilis.

Figure 6 Expression profiles of the candidate OBPs in different tissues of Athetis dissimilis. FA female antennae, MA male antennae, Ov ovaries, Te testis. The standard errors are represented by the error bars; different lowercase letters (a–c) above the bars denote significant differences at p < 0.05.

A total of 24 OBPs (AdisOBP3, 5, 15, 18–19, 23, 25, 33–41, 44–45, 47–49 and 51–53) were highly expressed in the testis of A. dissimilis compared to other tissues. The expression of the OBPs was low in the ovaries of A. dissimilis.

Discussion

In this study, we identified 31 novel OBPs through the analysis of A. dissimilis antennal transcriptomes, excluding the 5 AdisOBP genes identified in a previous study32. The number of OBPs in A. dissimilis antennae was similar to those in the antennal transcriptomes of S. litura (33)17 and S. littoralis (36)33 but greater than those in S. exigua (11)34, M. sexta (18)35 and H. armigera (26)36. We additionally sequenced the transcriptomes of A. dissimilis ovaries and testes. Alignments against the NR database showed that 56.87% of the A. dissimilis unigenes most closely matched H. armigera sequences. A total of 23 OBPs were identified in the transcriptomes of the A. dissimilis reproductive organs.

Based on sequence alignments and the cluster analysis of the phylogenetic trees, five PBP/GOBP genes, 35 Classic genes, 7 Plus-C genes and 9 Minus-C genes were obtained from the A. dissimilis antennal library. These results were similar to the classifications of most insect OBPs17,27,37. Interestingly, AdisOBP1, AdisOBP17 and AdisOBP40 could not be clustered into any subfamilies, and multiple sequence alignments of all AdisOBP genes revealed that the three OBPs contain no conserved cysteines. The phylogenetic tree supports a highly dynamic evolutionary process for the lepidopteran OBP family and a high degree of OBP sequence divergence. The diversification of OBPs might be the result of multiple and late independent gene duplications. In addition, they might be derived from a common ancestor and later diverged into different subfamilies by different selection pressures, which has been evidenced by evolutionary selection analysis in several insect species38,39,40.

OBPs are expressed specifically in the antennae and other parts associated with olfactory organs15,19,31,41,42,43. Our expression analysis revealed that 23 AdisOBPs were expressed predominantly in the antennae. Notably, only 3 AdisOBPs showed male-biased expression in the antennae, suggesting that females require more abundant OBPs for oviposition. Interestingly, 24 AdisOBPs showed significantly higher expression in the testis of A. dissimilis than in other tissues, whereas the expression of AdisOBPs in the ovaries was low. The expression of OBPs in reproductive organs has also been reported previously44,45,46. It was previously speculated that OBPs expressed in the testis deliver compounds to the females during mating26,27. Hence, it is reasonable to presume that such stable proteins could be used in the insect testis wherever hydrophobic molecules need to be transported in aqueous media, chemicals need to be protected from degradation, or semiochemicals need to be released gradually into the environment. These proteins have therefore been named ‘encapsulins’, to imply the common role of encapsulating small ligands47. qRT-PCR was conducted on 53 candidate genes, and the expression levels of most genes were consistent with the variation in RPKM values.

Like insect antennae, insect testes express a large number of OBP genes. The functions of these genes remain unclear and require further study. Our results provide a reference for the study of these genes.

Materials and methods

Insect rearing and sample preparation

The A. dissimilis strain was collected from Luoyang (province of Henan, China) corn fields (112° 26′ E, 34° 43′ N) in 2014 and maintained at the Henan Science and Technology University. Colonies were reared on an artificial diet at 25 ± 1 °C, 80 ± 5% relative humidity and a 16-h/8-h light/dark cycle.

Preliminary data showed that A. dissimilis sperm and eggs begin to mature 3 days after emergence. We therefore collected the ovaries and testes of 3-day-old virgin female and male adults, respectively (n = 40 per treatment), in three biological replicates. Tissues were dissected in sterile PBS-DEPC and immediately frozen in liquid nitrogen until RNA isolation.

cDNA library preparation and sequencing

Total RNA from the A. dissimilis ovary and testis tissues was extracted using the RNAiso Plus kit (TaKaRa, Dalian, China) and treated with DNase I (TaKaRa, Dalian, China) as per the manufacturer’s protocols. RNA quality and quantity were assessed by 1% agarose gel electrophoresis and by Nanodrop 2000 (Thermo Scientific, Waltham, MA, USA), Qubit 2.0 (Life Technologies, Carlsbad, CA, USA) and Agilent 2100 (Agilent, Santa Clara, CA, USA) analysis.

Following the TruSeq RNA Sample Preparation Guide v2 (Illumina, San Diego, CA, USA), mRNA was enriched using magnetic beads crosslinked with oligo(dT). Enriched mRNA was then fragmented using fragmentation buffer. First-strand cDNA was synthesized from the small mRNA fragments using random primers and reverse transcriptase, and second-strand cDNA was synthesized through the addition of dNTPs, DNA polymerase I and RNase H. Double-stranded cDNA was purified with AMPure XP beads (Beckman Coulter, Brea, CA, USA) and treated to repair ends, add A-tails and ligate sequencing adapters. Fragment sizes were selected using AMPure XP beads, and cDNA libraries were constructed through PCR amplification (Veriti 96-Well Thermal Cycler, Applied Biosystems, Foster City, USA). The concentration and insert size of the cDNA libraries were determined using Qubit 2.0 and Agilent 2100, and the libraries were quantified via q-PCR (CFX-96, Bio-Rad, Hercules, CA, USA).

Finally, sequencing was performed using the Illumina HiSeq 4000 platform to generate 150-bp paired-end reads. Sequencing analyses were performed by the Genomics Services of the Beijing Biomarker Technologies Co., Ltd. (Beijing, China). Raw data processing and base calling were performed using Illumina software.

Assembly and functional annotation

Raw data (raw reads) in FASTQ format were first filtered into clean data (clean reads) using Perl scripts. Reads containing adapter sequences, reads with > 10% unknown nucleotides and reads with quality values ≤ 20 were removed. The Q20, Q30 and GC content were then calculated from the clean data.
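
The filtering criteria above can be sketched as a per-read check. This is not the Perl pipeline used in the study; it is an illustrative Python sketch under several assumptions that the text does not specify: Phred+33 quality encoding, an example adapter string, and a rule that a read is discarded when more than half of its bases have quality ≤ 20. All function and parameter names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def is_clean(seq, qual,
             adapter="AGATCGGAAGAGC",     # assumed adapter prefix, not from the text
             max_n_fraction=0.10,         # "> 10% unknown nucleotides" criterion
             low_q=20,                    # "quality values <= 20" criterion
             max_low_q_fraction=0.5):     # assumed fraction, not stated in the text
    """Return True if a read passes all three filters."""
    if not seq or len(seq) != len(qual):
        return False
    if adapter in seq:
        return False
    if seq.upper().count("N") / len(seq) > max_n_fraction:
        return False
    low_quality_bases = sum(1 for q in phred_scores(qual) if q <= low_q)
    if low_quality_bases / len(seq) > max_low_q_fraction:
        return False
    return True


# Hypothetical reads, for illustration only.
print(is_clean("ACGTACGTAC", "IIIIIIIIII"))   # high quality, no adapter
print(is_clean("ACNNACGTAC", "IIIIIIIIII"))   # 20% unknown bases
print(is_clean("ACGTACGTAC", "######IIII"))   # 60% low-quality bases
```

Keeping the thresholds as keyword arguments makes it easy to substitute the actual values once the pipeline's settings are known.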

Transcriptomes were assembled using Trinity (version trinityrnaseq_r20131110) with default settings, except that min_kmer_cov was set to 2 (ref. 48). Unigene functions were annotated against the NCBI non-redundant protein sequences (NR, NCBI blast 2.2.28+, e-value = 1e−5), NCBI nucleotide sequences (NT, NCBI blast 2.2.28+, e-value = 1e−5), Protein family (Pfam, HMMER 3.0 package, hmmscan, e-value = 0.01), euKaryotic Ortholog Groups (KOG, NCBI blast 2.2.28+, e-value = 1e−3), Swiss-Prot (NCBI blast 2.2.28+, e-value = 1e−5), Kyoto Encyclopedia of Genes and Genomes (KEGG; KEGG Automatic Annotation Server [KAAS], e-value = 1e−10) and Gene Ontology (GO, Blast2GO v2.5, e-value = 1e−6) databases. Coding sequences (CDS) were predicted by aligning transcriptome sequences to the NR and Swiss-Prot databases or by using ESTScan 3.0.3 (ref. 49). Read counts for each gene were obtained by mapping clean reads to the assembled transcriptome using RSEM (bowtie2 parameters: mismatch 0). Expression levels were then calculated as Fragments Per Kilobase of transcript per Million mapped reads (FPKM)50.
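
As a brief illustration of the FPKM normalization mentioned above, the sketch below applies the standard definition: the fragment count for a transcript is divided by the transcript length in kilobases and by the total number of mapped fragments in millions. It uses the nominal transcript length for simplicity, whereas tools such as RSEM work with an effective length; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 1000-bp transcript
# in a library of 20 million mapped fragments.
print(fpkm(500, 1000, 20_000_000))
```

Because both the length and the library size appear in the denominator, the measure is comparable across transcripts of different lengths and across libraries of different depths.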

Sequence and phylogenetic analysis

Sequence similarities were assessed using the NCBI-Blast network server (http://blast.ncbi.nlm.nih.gov/). The signal peptides of OBPs were predicted using SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/)51. Multiple sequence alignments were assessed using DNAMAN 6.0. Sequence alignments of the candidate OBPs were performed using ClustalX 2.1 (ref. 52) and used to construct phylogenetic trees with PhyML in Seaview v.4 based on the Jones–Taylor–Thornton (JTT) model with nearest-neighbor interchanges. Trees were viewed and edited using FigTree v.1.3.1. Amino acid sequences of the OBPs in the phylogenetic tree are listed in Supplementary File 1.

Expression analysis through quantitative real-time polymerase chain reaction

Male antennae (100), female antennae (100), ovaries (80) and testes (150) from adults at 3 days post-eclosion were excised and frozen in liquid nitrogen. Total RNA was extracted using RNAiso Plus kits (TaKaRa, Dalian, China), and isolated RNA was transcribed to first-strand cDNA using PrimeScript RT reagent with gDNA Eraser (TaKaRa, Dalian, China) following the manufacturer’s protocols. Real-time quantitative PCR (RT-qPCR) was performed with SYBR® Premix Ex Taq II (TaKaRa). The A. dissimilis GAPDH gene was used as an endogenous control to correct for sample-to-sample variations. A 200 ng/μL cDNA sample was used for each tissue. Primers were designed using Primer Premier 5.0 software and are listed in Supplementary File 2. RT-qPCR reactions contained 10 μL of SYBR Premix Ex Taq II, 20 ng of cDNA template, 0.2 μM of each primer and nuclease-free water. The cycling conditions were 1 cycle of 95 °C for 5 min, followed by 40 cycles of 95 °C for 5 s and 55 °C for 30 s. Melt curve conditions were 95 °C for 10 s and 65 °C for 30 s. No-template controls (NTC) were included to detect possible contamination. Three biological replicates were analyzed, and the relative expression of the OBP genes was measured using the 2^(−ΔΔCT) method53. Expression was calculated relative to levels in the female antennae, which were arbitrarily set to 1. Differences in the expression of AdisOBP genes between the different tissues were compared using a one-way nested analysis of variance (ANOVA), followed by Tukey’s honestly significant difference (HSD) test using SPSS (SPSS 17.0, SPSS Inc., Chicago, IL, USA).
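
The relative quantification described above can be sketched as follows, with GAPDH as the reference gene and female antennae as the calibrator tissue. The calculation assumes roughly 100% amplification efficiency for both target and reference, as the standard ΔΔCT approach does. The function name and all CT values are hypothetical and do not come from the study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCT) approach.

    dCT  = CT(target) - CT(reference), computed per tissue
    ddCT = dCT(sample tissue) - dCT(calibrator tissue)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2 ** -(delta_sample - delta_calibrator)


# Hypothetical CT values: (target OBP, GAPDH) per tissue.
female_antennae = (24.0, 18.0)   # calibrator tissue
testis = (21.0, 18.0)

# The calibrator compared with itself gives 1 by construction.
print(relative_expression(*female_antennae, *female_antennae))
# A dCT that is 3 cycles lower than the calibrator's gives an 8-fold value.
print(relative_expression(*testis, *female_antennae))
```

In practice the CT values of the three biological replicates would be processed this way before the statistical comparison between tissues.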