Introduction

The yellow peach moth Conogethes punctiferalis (Guenée) is a multivoltine and polyphagous insect pest distributed in south-eastern Asia and Australia1,2. The adult female feeds, oviposits and develops primarily in buds and fruits of peach, plum, chestnut, maize and sunflowers2. After hatching, larvae remain within the reproductive structures of the host plant and use them as a food source and a protected habitat to complete their life cycle. This endophytic behavior of the larvae makes the insect difficult to control with conventional insecticides and cultural practices. Thus, new methods to monitor C. punctiferalis population outbreaks and to achieve pest control have been initiated3,4,5. For example, sex pheromone components of C. punctiferalis have been analyzed, synthesized and formulated into lures to attract male moths and disrupt their mating in the field1,5,6,7. At the same time, attention has been given to host plant volatiles that could synergize the response of the yellow peach moth to the sex pheromone4,8.

In insects, chemosensation serves to detect and react to environmental chemical cues in virtually every aspect of their life cycle9,10. Olfaction, as a kind of chemosensation, is critical to food source identification, predator avoidance, oviposition site selection, kin recognition, mate choice and toxic compound avoidance. It is thus an attractive target for pest control. For example, several olfaction-based strategies, including mass trapping and mating disruption, have been developed to control moth populations11. Better knowledge of the molecular mechanisms by which an odor generates a neuronal signal could lead to the identification of targets for the development of new control strategies.

Antennae are the primary olfactory sensor of insects and their cuticular surface is covered with several different types of small sensory structures, named sensilla, in which olfactory receptor neurons extend dendrites into the antennal lymph where peripheral olfactory signal transduction events occur. Previous studies reported diverse olfactory proteins, including odorant-binding proteins (OBPs), odorant receptors (ORs), chemosensory proteins (CSPs), sensory neuron membrane proteins (SNMPs), ionotropic receptors (IRs) and odorant degrading enzymes (ODEs), involved in different odor perception steps in the signal transduction pathway12,13,14. OBPs are widely engaged in the initial biochemical recognition steps in insect odorant perception and play a key role in transporting hydrophobic odorants across the sensillum lymph to the ORs15,16. Recently, OBPs have attracted the attention of many researchers17,18,19. The OBP family notably includes two sub-families: the pheromone-binding proteins (PBPs), which transport pheromone molecules, and the general odorant-binding proteins (GOBPs), which transport general odorants such as plant volatiles20,21. In the olfactory transduction pathway, volatile hydrophobic molecules are first bound by the sensilla-enriched binding proteins (OBPs and CSPs) to cross the aqueous sensillum lymph that surrounds the olfactory neuron dendrites, and thus interact with the membrane-bound chemosensory receptors (ORs and IRs) located in the dendritic membrane of receptor neurons22. The chemical signal is then transformed into an electric signal that is transmitted to the brain. SNMPs, located in the dendritic membrane of pheromone-sensitive neurons, are thought to trigger ligand delivery to the receptor. Subsequently, signal termination may be ensured by ODEs14,21,23,24,25.

Lepidopteran species have been widely used as models of insect olfaction because of their highly specific and sensitive olfactory senses and complex olfactory behaviors. The emergence of next-generation sequencing (high-throughput deep sequencing) technology has dramatically improved the efficiency and quantity of gene annotation26. Similarly, the application of high-throughput sequencing technology in entomological research has greatly promoted its progress27,28,29. Recently, studies on antennal transcriptomes have led to the identification of olfactory-related genes in several moth species18,19,28,30,31,32, which demonstrates the power of transcriptomic strategies for olfactory gene identification. However, in C. punctiferalis, only two olfactory-related genes (CpunOrco and CpunPBP1), together with their expression profiles, have been reported to date33,34. Hence, little is known about the function of olfactory genes in C. punctiferalis, owing to the lack of genomic data for this species.

In this study, we used next-generation sequencing (NGS) to gain insights into the complexity of the antennal transcriptome and to identify genes related to chemosensation in C. punctiferalis. We also report the results of gene ontology (GO) annotation as well as sets of putative OBPs, ORs and IRs in C. punctiferalis. Moreover, using real-time quantitative PCR (RT-qPCR), we screened all the annotated olfactory genes from the C. punctiferalis antennal transcriptomes. The results provide a basis for further studies of the olfactory mechanisms of C. punctiferalis and for selecting olfactory genes that may be used as targets in management programs for this destructive insect pest.

Results and Discussion

Sequence analysis and assembly

Two non-normalized cDNA libraries (SRR2976624 and SRR2976631) of the male and female C. punctiferalis antennae were constructed. After trimming of adaptor sequences and removal of contaminating or low-quality sequences, 70.3 and 74.2 million clean reads comprising 8.88 and 9.34 gigabases were generated from male and female antennae, respectively, and retained for the subsequent assembly.

All clean reads from male and female antennae were assembled and a total of 47,109 unigenes were generated. The transcript dataset was 41.82 megabases in size, with a mean length of 887.83 bp and an N50 of 1,808 bp. Among these unigenes, 19,765 (41.96%) were longer than 500 bp and 12,129 (25.75%) were longer than 1 kb (Fig. 1 and Table 1). Compared with published Lepidoptera antennal transcriptomes, especially those of the two Crambidae species Chilo suppressalis (66,560 unigenes, mean length 761 bp, N50 1,271 bp)35 and Ostrinia furnacalis (37,687 unigenes, mean length 818 bp, N50 1,022 bp)18, the assembly quality of our transcriptome was adequate and, judged by mean length and N50, better than that of both Crambidae assemblies. These results further demonstrate the effectiveness of Illumina sequencing technology in rapidly capturing a large portion of the transcriptome and provide a sequence basis for future studies, such as the characterization of genes of interest36. The assembled sequences have been deposited in the NCBI Transcriptome Shotgun Assembly (TSA) Database under BioProject PRJNA304355, with accession numbers GEDO01000001 to GEDO01000068.
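The assembly metrics quoted above (mean length, N50 and the share of unigenes above a length threshold) can be computed from a plain list of unigene lengths. The following is a minimal sketch; the function names are illustrative and are not part of the Trinity pipeline used in this study.

```python
from typing import Dict, Iterable


def assembly_stats(lengths: Iterable[int]) -> Dict[str, float]:
    """Return count, total bases, mean length and N50 for a set of unigene lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    total = sum(ordered)
    running = 0
    n50 = 0
    # N50: length of the unigene at which the cumulative sum of lengths,
    # taken from longest to shortest, first reaches half of the total.
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "count": len(ordered),
        "total_bases": total,
        "mean_length": total / len(ordered),
        "n50": n50,
    }


def fraction_longer_than(lengths: Iterable[int], threshold: int) -> float:
    """Fraction of unigenes strictly longer than the threshold (in bp)."""
    values = list(lengths)
    return sum(1 for n in values if n > threshold) / len(values)
```

As a consistency check on the reported figures, 47,109 unigenes × 887.83 bp ≈ 41.82 megabases, which matches the stated size of the transcript dataset.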

Table 1 An overview of the sequencing and assembly process.
Figure 1

The size distribution of the assembled unigenes from Conogethes punctiferalis male and female antennal transcriptome.

A total of 47,109 unigenes were generated, of which 19,765 (41.96%) were longer than 500 bp and 12,129 (25.75%) were longer than 1 kb. The x-axis represents the unigene length (bp) and the y-axis represents the number of unigenes.

Functional annotation of the C. punctiferalis antennal unigenes

The unigenes were annotated by aligning them with sequences deposited in diverse protein databases, including the National Center for Biotechnology Information (NCBI) non-redundant protein (nr) database, the Kyoto Encyclopedia of Genes and Genomes (KEGG), UniProt/Swiss-Prot, Gene Ontology (GO), Cluster of Orthologous Groups of proteins (COG) and UniProt/TrEMBL, using BLASTx with a cut-off E-value of 10⁻⁵ (Table 2). The analyses showed that a total of 18990 unigenes (40.31%) were successfully annotated across the above-mentioned databases. Of these, 18924 unigenes (40.17%) had significant matches in the nr database, followed by 11489 unigenes (24.39%) in the Swiss-Prot database. However, 28119 unigenes (59.69%) were unmapped in these databases. The high percentage of sequences without annotation could be attributable to the insufficient number of sequences in public databases for phylogenetically closely related species to date37. For example, in the two published Crambidae antennal transcriptomes, the proportion of unigenes annotated in the nr database was 45.4% in C. suppressalis35 and 41.2% in O. furnacalis18, similar to the results of the present study. On the other hand, short sequences would rarely yield matches to known sequences because the significance of a BLAST comparison depends in part on the length of the query sequence37. In the present study, more than one third (36.65%) of the unigenes were shorter than 300 bp, which might be too short to allow for statistically meaningful matches. For sequences longer than 1 kb, the annotation rate was 76.08%, whereas for sequences longer than 300 bp, the percentage decreased to 52.95% (Tables 1 and 2). In addition, the low annotation rate might be due to non-conserved regions of proteins in which homology is not detected38,39. For example, the 5′ ends of genes generally show less sequence conservation than the body40. Therefore, partial transcripts, especially unigenes representing the 5′ CDS, may not find matches in the various databases.

Table 2 Functional annotation of the Conogethes punctiferalis antennal unigenes.

For GO analysis, a total of 10411 unigenes (22.10%) could be assigned to three ontologies: the biological process ontology, the cellular component ontology and the molecular function ontology (Fig. 2). In the biological process ontology, “metabolic process” and “cellular process” were the most represented terms, with 5853 (22.64%) and 5651 (21.85%) unigenes, respectively. In the cellular component ontology, the terms were mainly distributed in cell (3337 unigenes, 19.59%) and cell part (3364 unigenes, 19.74%). In the molecular function ontology, the terms binding (5546 unigenes, 41.27%) and catalytic activity (5120 unigenes, 38.10%) were the most represented. These results were similar to those found in the antennal transcriptomes of Manduca sexta30, Spodoptera littoralis41 and Agrotis ipsilon42.

Figure 2

Functional annotation of assembled sequences based on gene ontology (GO) categorization.

GO analysis was performed at the level two for three main categories (cellular component, molecular function and biological process).

In addition, all unigenes were subjected to a search against the COG database for functional prediction and classification (Fig. 3). As a result, a total of 5076 unigenes with hits in the nr database could be assigned to a COG classification and divided into 25 specific categories. The category “general function prediction” was the largest group (1517 unigenes, 29.89%), as was also found in Dialeurodes citri36, followed by “replication, recombination and repair” (785 unigenes, 15.46%). The categories “cell motility” (11 unigenes, 0.22%) and “nuclear structure” (3 unigenes, 0.06%) were the smallest groups.

Figure 3

Cluster of orthologous groups (COG) classification.

In total, 5076 of the 47109 unigenes with non-redundant database hits were grouped into 25 COG classifications.

Metabolic pathway analysis of the unigenes was also conducted using the KEGG annotation system. This process assigned a total of 4931 unigenes to 197 predicted pathways.

Identification of olfactory genes and analysis of differentially expressed genes

A total of 68 olfactory genes, including 15 OBPs, 46 ORs and 7 IRs, were identified from the antennal transcriptome of C. punctiferalis. Analysis of differential gene expression at a single time point indicated that the antennal transcriptomes of male and female C. punctiferalis differed in the expression of 1308 genes. Using female antennae as the reference, we found 759 up-regulated genes and 549 down-regulated genes. Among the olfactory genes, 3 OBPs (OBP4, OBP8, PBP2) and 4 ORs (OR22, OR26, OR44, OR46) showed male antennae-specific expression, whereas 4 ORs (OR5, OR16, OR25, OR42) showed female antennae-enriched expression.

Candidate odorant binding proteins in the C. punctiferalis antennae

In the antennal transcriptome of C. punctiferalis, a total of 15 OBP genes, including four pheromone binding proteins (PBPs), two general odorant binding proteins (GOBPs) and one antennal binding protein (ABP), were identified (Table 3). The BLASTx results indicated that all 15 identified CpunOBPs shared a typical structural feature of OBPs (i.e. six conserved cysteines) with those of other insects43 and that twelve of them shared relatively high amino acid identities (62–91%) with Lepidoptera OBPs at NCBI. Thirteen of these genes had intact ORFs with lengths ranging from 384 bp to 837 bp, and the other two, CpunPBP1 and CpunABP, were represented by partial ORFs of 483 bp and 432 bp, respectively.

Table 3 Candidate OBP genes in Conogethes punctiferalis antennae.

Among the 15 putative OBP genes in the C. punctiferalis antennal transcriptome data, CpunPBP1 was reported in our previous study34, whereas the remaining 14 CpunOBPs are reported here for the first time. The number of C. punctiferalis OBPs was lower than the numbers identified from the antennal transcriptomes of Bombyx mori (44)17, Helicoverpa armigera (26)44, Dendrolimus houi (23)45, O. furnacalis (23)18 and Spodoptera litura (38)19, comparable with that identified in M. sexta (18)30 and higher than that identified in Spodoptera exigua (11)46. Since we used the same methods and technologies as the studies cited above, we hypothesize that the small number of OBPs identified in C. punctiferalis reflects either a genuinely smaller OBP repertoire than in other Lepidoptera, or the fact that some OBPs are larva-biased, species-specific or expressed at low levels in the antennae. For example, some of the genes might be expressed only in the larva47,48.

The RPKM value analysis revealed that 12 OBP genes (OBP2, OBP5, OBP6, OBP7, OBP8, PBP1, PBP2, PBP3, PBP4, GOBP1, GOBP2 and ABP) were highly expressed in both male and female antennal transcriptomes (RPKM value much higher than 100). The other 3 OBP genes (OBP1, OBP3 and OBP4), however, showed a relatively low expression level (RPKM ranging from 0 to 8). Six OBPs (OBP4, OBP7, OBP8, PBP2, PBP3 and PBP4) showed a higher RPKM in the male antennae than in the female antennae (about 1 to 20 times) (Table 3).

Furthermore, RT-qPCR analysis was performed to compare the expression levels of these OBP genes among different tissues and between sexes (Fig. 4). The results indicated that three OBPs (OBP4, OBP8 and PBP2) were significantly overexpressed in male antennae and showed male antennae-specific expression, which suggests that these OBPs may play essential roles in the detection of sex pheromones. In comparison, the expression of the two GOBPs (GOBP1 and GOBP2) in female antennae was about two to three times higher than in male antennae (Table 3, Fig. 4), which suggests that these OBPs may play important roles in the detection of general odorants such as host plant volatiles. Notably, three OBPs (OBP5, PBP1 and ABP) showed somewhat higher RPKM in the female antennae than in the male antennae (Table 3), which was not concordant with the RT-qPCR results (Fig. 4). This discrepancy may be because the sequencing depth of the HiSeq 2500 run was insufficient, and more replicates will be needed to test this in future studies.

Figure 4

Conogethes punctiferalis OBP transcript levels in different tissues as measured by RT-qPCR.

MA: male antennae; FA: female antennae; MB: male body with antennae cut off; FB: female body with antennae cut off. The internal control β-actin was used to normalize transcript levels in each sample. Error bars represent the standard error and different letters (a–c) above the bars denote significant differences (p < 0.05).

In addition, the RT-qPCR results showed that all 15 C. punctiferalis OBPs were significantly overexpressed in the antennae compared with the bodies (P < 0.05) (Fig. 4). This high expression in antennae was concordant not only with the RPKM values in the present study, but also with findings in Anopheles gambiae10, H. armigera44, Ips typographus and Dendroctonus ponderosae49, Ag. ipsilon42 and Sp. litura19. For the bodies with antennae cut off, no significant difference in OBP gene expression levels appeared between males and females, except that OBP7 and PBP1 were significantly overexpressed in the male body, whereas ABP was overexpressed in the female body. The up-regulation of these genes in antennae indicates their participation in moth olfaction during attraction to host plants and may offer targets for disrupting this activity.

A neighbor-joining tree of 126 OBP sequences was built from six different Lepidoptera species, including C. punctiferalis, O. furnacalis, B. mori, H. armigera, Ag. ipsilon and Sp. exigua (Fig. 5). The OBP tree indicated that the OBPs of the six Lepidoptera species were extremely divergent; however, the GOBPs (GOBP1 and GOBP2) were highly conserved among different species. All PBPs, GOBPs and OBPs from C. punctiferalis were grouped into the corresponding branches, except that CpunPBP3 clustered with the OBP group. No evident species-specific expansion of OBP lineages was found, except that CpunOBP5 and CpunOBP7 were grouped together.

Figure 5

Neighbor-joining dendrogram based on protein sequences of candidate odorant binding proteins (OBPs).

The protein names and sequences of OBPs used in this analysis are listed in Supplementary Table 3.

Candidate olfactory receptors in the C. punctiferalis antennae

In the process of recognizing smells, insect ORs are the most important players in sex pheromone and general odorant detection. In this research, the OR candidates from the C. punctiferalis antennal transcriptomes were carefully identified and a total of 46 ORs (including the full-length or almost full-length OR candidates) were submitted for further analysis. Of these, ten ORs (OR2, 10, 17, 19, 21, 22, 23, 25, 30 and 32) had intact ORFs, whereas the other 36 ORs were represented by partial open reading frames. In addition, 45 of these 46 ORs are reported here for the first time in C. punctiferalis and were identified as typical ORs, whereas one OR (OR23) has been reported previously and was identified as the atypical co-receptor33 (Table 4). The number of C. punctiferalis ORs identified in this study was comparable with the numbers identified in M. sexta (47)30, H. armigera (47)44 and Ag. ipsilon (42)42, higher than those identified in Sesamia inferens (39)32, Dendrolimus houi (33) and Dendrolimus kikuchii (33)45, but lower than those identified in B. mori (72)17 and O. furnacalis (56)18. Considering that those OR candidates with partial ORFs were discarded in the present study, we speculate that more ORs may be identified in the future.

Table 4 Candidate OR genes in Conogethes punctiferalis antennae.

The RPKM value analysis revealed that the ORco (OR23) had the highest expression level among the 46 ORs, with RPKM values of 320 and 531 in the male and female antennae, respectively. The other 45 typical ORs, however, showed relatively low expression levels (RPKM ranging from 0 to 233) compared with the ORco (OR23) and the OBP genes. In detail, five ORs (OR17, OR22, OR26, OR44 and OR46) showed a higher RPKM in the male antennae than in the female antennae (more than 10 times), whereas OR16 and OR42 showed the opposite pattern, with RPKM in the male antennae almost 20 times lower than in the female antennae (Table 4). The RT-qPCR results indicated that ORco (OR23) had a significantly higher expression level in the antennae than in the bodies of C. punctiferalis, which was concordant with previous results33. Moreover, 4 ORs (OR22, OR26, OR44 and OR46) showed male antennae-specific expression, whereas another 4 ORs (OR5, OR16, OR25 and OR42) showed female antennae-enriched expression (Fig. 6). This male-biased transcription also appears to be retained among the B. mori orthologs OR3, 4, 5 and 6 (ref. 50). Comparative genomic analyses suggested that male-biased expression and female pheromone receptor function are retained in an OR subfamily in B. mori, and female-biased transcription of OR gene family members is predicted among transcripts in both B. mori50,51 and O. furnacalis18.
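The male-to-female comparisons described above can be expressed as a simple RPKM ratio. The sketch below is illustrative only: the function name, the pseudocount (added so that genes with an RPKM of 0 in one sex do not cause division by zero) and the two-fold threshold are assumptions chosen for the example, not criteria used in this study.

```python
def sex_bias(male_rpkm: float, female_rpkm: float,
             pseudocount: float = 1.0, min_fold: float = 2.0) -> str:
    """Classify a gene as male-biased, female-biased or unbiased from two RPKM values.

    The pseudocount and fold threshold are assumed values for illustration.
    """
    if male_rpkm < 0 or female_rpkm < 0:
        raise ValueError("RPKM values must be non-negative")
    ratio = (male_rpkm + pseudocount) / (female_rpkm + pseudocount)
    if ratio >= min_fold:
        return "male-biased"
    if ratio <= 1.0 / min_fold:
        return "female-biased"
    return "unbiased"
```

With the ORco values reported above (320 in male and 531 in female antennae), the ratio is about 0.60, so the gene would be classified as unbiased under these assumed settings.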

Figure 6

Conogethes punctiferalis OR transcript levels in different tissues as measured by RT-qPCR.

MA: male antennae; FA: female antennae; MB: male body with antennae cut off; FB: female body with antennae cut off. The internal control β-actin was used to normalize transcript levels in each sample. Error bars represent the standard error and different letters (a–c) above the bars denote significant differences (p < 0.05).

A neighbor-joining tree of 130 OR sequences was built from three different Lepidoptera species, including C. punctiferalis, B. mori and O. furnacalis (Fig. 7). The ORco (OR23) clustered with other Lepidoptera ORco sequences (OfurOR2). Most ORs from C. punctiferalis and O. furnacalis appear in pairs on the dendrogram, in accordance with the fact that both species belong to the family Crambidae. Notably, four male-biased ORs (OR22, OR26, OR44 and OR46) clustered together with OfurOR4 and OfurOR6, which is suggestive of a functional role in the male pheromone response18. In contrast, the female-biased ORs (OR5, OR16, OR25 and OR42) were spread across different branches. Given that several B. mori female-biased ORs are capable of responding to host plant volatiles51,52, it is conceivable that the C. punctiferalis orthologs have retained similar functions, but further studies are required to investigate any potential evolutionary conservation of function. Nevertheless, based on the different expression profiles of these ORs in male and female antennae, we suggest that the male antennae-enriched ORs are involved in sex pheromone detection, whereas the female antennae-enriched ORs play important roles in locating suitable host plants and oviposition sites.

Figure 7

Neighbor-joining dendrogram based on protein sequences of candidate odorant receptor proteins (ORs).

The protein names and sequences of ORs used in this analysis are listed in Supplementary Table 4.

Candidate ionotropic receptors in the C. punctiferalis antennae

IRs were recently discovered as another class of receptors involved in chemoreception53. Since IRs have been identified throughout protostome lineages, they belong to an ancient chemosensory receptor family54. To date, 15 IRs in Cy. pomonella28, 24 IRs in Ag. ipsilon42 and 12 IRs in H. armigera44 have been identified. In the present study, 7 IR genes were identified for the first time from the C. punctiferalis antennal transcriptomes. Among these, two IRs (IR2 and IR6) had intact ORFs, whereas the other 5 candidate IRs were represented by partial ORFs. The BLASTx results indicated that all 7 identified CpunIRs shared relatively high amino acid identities (67–81%) with Lepidoptera IRs at NCBI (Table 5). Compared with the numbers of IRs in the three species mentioned above, the scarcity of divergent IRs in the C. punctiferalis antennal transcriptomes may be due to some IRs being expressed only in other tissues. For example, the expression of divergent IRs was detected only in gustatory organs in Drosophila melanogaster53,54. It is generally reported that in insects, the antennal IR subfamily constitutes only a portion of the total number of IRs49. In particular, 15 D. melanogaster IRs53, 10 H. armigera IRs44 and 7 S. littoralis IRs55 were expressed exclusively in the antennae.

Table 5 Candidate IR genes in Conogethes punctiferalis antennae.

The RPKM value analysis revealed almost no differences in IR expression between male and female antennae, which was validated by the RT-qPCR results (Table 5, Fig. 8). Therefore, we speculate that the IRs are relatively highly conserved. As with the ORs, the RPKM value analysis revealed that all 7 IRs showed relatively low expression levels (RPKM values ranging from 1 to 54) compared with the OBPs. Our RT-qPCR results also indicated that all 7 C. punctiferalis IRs were highly expressed in the antennae. The antennae-enriched IRs may play important roles in odorant detection. The IR tree from four lepidopteran insects was similar to that from the ORs, with most IRs from C. punctiferalis and O. furnacalis appearing in pairs on the dendrogram, in accordance with the fact that both species belong to the family Crambidae (Fig. 9).

Figure 8

Conogethes punctiferalis IR transcript levels in different tissues as measured by RT-qPCR.

MA: male antennae; FA: female antennae; MB: male body with antennae cut off; FB: female body with antennae cut off. The internal control β-actin was used to normalize transcript levels in each sample. Error bars represent the standard error and different letters (a–c) above the bars denote significant differences (p < 0.05).

Figure 9

Neighbor-joining dendrogram based on protein sequences of candidate ionotropic receptors (IRs).

The protein names and sequences of IRs used in this analysis are listed in Supplementary Table 5.

Conclusion

Olfaction is a primary sensory modality in insects. In the present study we performed a comprehensive analysis of the antennal transcriptome of C. punctiferalis. As a result, three major gene families (OBPs, ORs and IRs) that encode olfactory-related proteins were annotated for the first time and their expression levels were measured based on the transcriptomic data and validated by RT-qPCR. The expression profile analysis revealed that 15 OBPs, 46 ORs and 7 IRs are uniquely or primarily expressed in the male and female antennae. The results from the present study will be fundamental for future functional studies of olfactory-related genes in C. punctiferalis. Connection of the molecular information presented here and the available chemical and ecological knowledge will clarify the olfactory mechanisms of C. punctiferalis and provide new targets for pest management in the future.

Materials and Methods

Insect rearing and tissue collection

Mature larvae of C. punctiferalis were collected from cornfields of the Agricultural Experiment Station of Beijing University of Agriculture on October 9th, 2009, and the colony was maintained for about 25 generations on maize in climate incubators (RTOP-B, Zhejiang Top Instrument Co., Ltd.) at 23 ± 1 °C, RH 75 ± 2%, 16L/8D photoperiod and 3500 lux light intensity. Adult moths were provided with 5–8% honey solution after emergence2. Antennae were excised from 3-day-old male and female moths, frozen immediately and stored in liquid nitrogen until use.

RNA extraction

Two hundred antennae from each sex were pooled for total RNA extraction using the RNeasy Plus Mini Kit (Qiagen GmbH, Hilden, Germany) following the manufacturer’s instructions; genomic DNA was eliminated during this procedure. The quality and concentration of the RNA samples were determined using 1.2% agarose gel electrophoresis and a Qubit® RNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA), respectively. The integrity of the RNA samples was assessed using an RNA Nano 6000 Assay Kit on the Bioanalyzer 2100 system (Agilent Technologies, CA, USA).

cDNA library construction and sequencing

First, mRNA was purified from total RNA using Oligo (dT) magnetic beads and fragmented in fragmentation buffer into sections of 200–700 nucleotides. First-strand cDNA was synthesized using random hexamer primers with the fragmented mRNA as template. Second-strand cDNA was synthesized using DNA Polymerase I, dNTPs and RNaseH (Invitrogen, Carlsbad, CA, USA). Short fragments were purified using the QIAquick PCR Extraction Kit (Qiagen, Hilden, Germany) and eluted with EB (elution) buffer for end-repair and poly (A) addition, then ligated to sequencing adapters. Suitable fragments, as judged by agarose gel electrophoresis, were selected as templates for PCR amplification. The cDNA library of C. punctiferalis was sequenced on an Illumina HiSeq™ 2500 using PE125 technology in a single run by Beijing Biomake Company.

Sequence analysis and assembly

The raw reads were cleaned by removing adapter sequences, low-quality sequences (reads with ambiguous bases “N”) and reads in which more than 10% of bases had a quality score Q < 20. Cleaned reads shorter than 60 bases were removed because short reads might represent sequencing artifacts56. The quality-filtered reads were assembled into unigenes using the short-read assembly program Trinity (Trinityrnaseq_r2013-11-10)57.
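The three read-level filters described above can be sketched as a single predicate. This is a minimal illustration under stated assumptions: the function name and parameter names are hypothetical, quality scores are assumed to be already decoded to integer Phred values, and adapter trimming (which requires the adapter sequences) is not shown.

```python
from typing import Sequence


def keep_read(seq: str, quals: Sequence[int],
              min_len: int = 60, min_q: int = 20,
              max_low_frac: float = 0.10) -> bool:
    """Return True if a read passes the length, ambiguity and quality filters."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    # Reads shorter than 60 bases are discarded.
    if len(seq) < min_len:
        return False
    # Reads containing ambiguous bases ("N") are discarded.
    if "N" in seq.upper():
        return False
    # Reads with more than 10% of bases below Q20 are discarded.
    low_quality = sum(1 for q in quals if q < min_q)
    return low_quality / len(quals) <= max_low_frac
```

A read with exactly 10% low-quality bases is kept here, because the text specifies removal only when the fraction exceeds 10%.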

Functional annotation

The assembled sequences were annotated using the BLASTn (version 2.2.14; E-value < 10⁻⁵) and BLASTx (E-value < 10⁻⁵) programs against the NCBI nr database58,59. To annotate the assembled sequences with GO terms, the Swiss-Prot BLAST results were imported into BLAST2GO, a software package that retrieves GO terms, allowing gene functions to be determined and compared60. The COG database was also used to predict and classify the functions of the unigene sequences61. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were assigned to the assembled sequences using the online KEGG Automatic Annotation Server (KAAS)62. Finally, the best matches were used to identify coding regions and to determine the sequence direction.
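The E-value cut-off and best-match selection described above can be illustrated with a short parser. This sketch assumes that the BLAST results are available as tab-separated text with the standard 12 tabular columns (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score); the function name and the use of bit score to choose the best match are assumptions for illustration, not a description of the scripts used in this study.

```python
import csv
import io
from typing import Dict, Tuple


def best_hits(tabular_text: str,
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Keep, for each query, the hit with the highest bit score among hits below the E-value cut-off."""
    best: Dict[str, Tuple[str, float, float]] = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue  # the text requires E-value < 10^-5
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

The annotation rate for a database is then the number of queries with a retained hit divided by the total number of unigenes, for example 18924 / 47109 ≈ 40.17% for the nr database.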

Olfactory genes identification and phylogenetic analyses

All candidate OBPs, ORs and IRs were manually checked using the BLASTx program at the National Center for Biotechnology Information (NCBI). For contigs with hits against genes of interest, open reading frames (ORFs) were identified and the annotation was verified against OBP, OR and IR protein sequences and their orthologs in other Lepidoptera and in model insects, in order to analyze the characteristics of the olfactory genes in C. punctiferalis. All olfactory genes identified from the C. punctiferalis antennal transcriptomes were named according to sequence homology analysis. Of these, OBP1, OBP2, PBP1, PBP4, GOBP1, GOBP2 and ABP were numbered according to the BLAST results, whereas the other OBPs and all ORs and IRs were numbered arbitrarily. In addition, we use the prefix CpunOBP, CpunOR or CpunIR to indicate that the gene is a putative member of the yellow peach moth OBP, OR or IR-like family (Tables 3, 4 and 5).
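The ORF identification step can be illustrated with a simple six-frame scan for the longest ATG-to-stop reading frame. This is a generic sketch, not the procedure used in this study: the function names are hypothetical, only the standard stop codons are considered, and partial ORFs lacking a start or stop codon (which the Results report for several genes) would not be recovered by this approach.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_orf(seq: str) -> str:
    """Return the longest complete ORF (ATG through stop codon) over all six frames.

    Returns an empty string if no complete ORF is present.
    """
    best = ""
    for strand in (seq.upper(), reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for a further ORF downstream
    return best
```

A complete ORF found this way always has a length that is a multiple of three, as is the case for the intact OBP ORFs reported above (384 bp to 837 bp).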

Phylogenetic reconstruction for the analysis of OBPs, ORs and IRs was performed with MEGA5.0 software63, and consensus phylogenetic trees were constructed using the neighbor-joining (NJ) method. Bootstrap analysis with 1000 replicates was performed to evaluate the branch support of each tree.

Analysis of differentially expressed genes

To compare the differential expression of chemosensory genes in the C. punctiferalis male and female antennal transcriptomes, the read number for each chemosensory gene between male and female antennae was converted to RPKM (Reads Per Kilobase of exon model per Million mapped reads)64. The RPKM method eliminates the influence of gene length and sequencing depth on the calculation of gene expression and is currently the most commonly used method for estimating gene expression levels. Thus, the calculated gene expression can be directly used to compare gene expression between samples.
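The RPKM normalization described above divides the read count of a gene by its length in kilobases and by the total number of mapped reads in millions. A minimal sketch of this calculation follows; the function and parameter names are illustrative.

```python
def rpkm(read_count: float, gene_length_bp: float,
         total_mapped_reads: float) -> float:
    """Reads Per Kilobase of exon model per Million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, 500 reads mapped to a 1,000 bp unigene in a library of 10 million mapped reads give an RPKM of 50 (hypothetical values chosen for illustration).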

RT-qPCR and data analysis

To verify the quantification of gene expression levels obtained from transcriptome sequencing, RT-qPCR was performed on samples from different tissues and sexes. Two biological samples, each with 80 male antennae or 80 female antennae, and another two samples, each with one male or one female moth body with antennae cut off, were used for RNA extraction using the RNeasy Plus Mini Kit (Qiagen GmbH, Hilden, Germany) following the manufacturer’s instructions. cDNAs from the antennae and bodies of both sexes were synthesized using the SMART™ PCR cDNA synthesis kit (Clontech, Mountain View, CA, USA).

An equal amount of cDNA (100 ng) was used as the RT-qPCR template. For each sample, the β-actin gene (GenBank JX119014) of C. punctiferalis was used as an internal control gene. The primers were designed using the Primer Premier 5.0 program (Primer Biosoft International, Palo Alto, CA, USA) (Supplementary Table 2). RT-qPCR was performed in an iCycler iQ2 Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA), with detection of SYBR Green dye bound to double-stranded DNA at the end of each elongation cycle. Each RT-qPCR reaction was conducted in a 20.0 μl reaction mixture containing 10.0 μl of 2 × SYBR Green PCR Master Mix, 0.4 μl of each primer, 2.0 μl of cDNA sample (100 ng/μl) and 7.2 μl of sterilized ultrapure H2O. The cycling parameters were: 95 °C for 3 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s, after which dissociation curves were measured. Blank controls with sterilized ultrapure H2O instead of template were included in each experiment. To check reproducibility, each RT-qPCR reaction for each sample was carried out in three technical replicates and three biological replicates.

Relative quantification analyses among the four samples were performed using the comparative 2^(−ΔΔCt) method65. The comparative analyses of each target gene among different tissues were performed with one-way nested analysis of variance (ANOVA), followed by the least significant difference (LSD) test, using SPSS Statistics 19.0 (SPSS Inc., Chicago, IL, USA).
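The comparative Ct calculation can be sketched as follows. The function and parameter names are illustrative; the reference gene corresponds to β-actin as stated above, the choice of calibrator sample is left to the user, and the formula assumes an amplification efficiency close to 100% for both target and reference genes.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float,
                        ct_ref_calibrator: float) -> float:
    """Relative expression by the comparative 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(calibrator)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)
```

For example, with hypothetical Ct values of 22 (target) and 18 (reference) in the sample and 25 and 18 in the calibrator, ΔΔCt is −3 and the relative expression is 8-fold.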

Additional Information

How to cite this article: Jia, X.-J. et al. Antennal transcriptome and differential expression of olfactory genes in the yellow peach moth, Conogethes punctiferalis (Lepidoptera: Crambidae). Sci. Rep. 6, 29067; doi: 10.1038/srep29067 (2016).