Identification of Genes Involved in Chemoreception in Plutella xyllostella by Antennal Transcriptome Analysis

Perception of environmental and habitat cues is of significance for insect survival and reproduction. Odor detection in insects is mediated by a number of proteins in antennae such as odorant receptors (ORs), ionotropic receptors (IRs), odorant binding proteins (OBPs), chemosensory proteins (CSPs), sensory neuron membrane proteins (SNMPs) and odorant degrading enzymes. In this study, we sequenced and assembled the adult male and female antennal transcriptomes of a destructive agricultural pest, the diamondback moth Plutella xyllostella. In these transcriptomes, we identified transcripts belonging to 6 chemoreception gene families related to ordor detection, including 54 ORs, 16 IRs, 7 gustatory receptors (GRs), 15 CSPs, 24 OBPs and 2 SNMPs. Semi-quantitative reverse transcription PCR analysis of expression patterns indicated that some of these ORs and IRs have clear sex-biased and tissue-specific expression patterns. Our results lay the foundation for future characterization of the functions of these P. xyllostella chemosensory receptors at the molecular level and development of novel semiochemicals for integrated control of this agricultural pest.

of physical defense in Daphnia pulex 17 . IRs usually contain three transmembrane domains (TMDs), a bipartite ligand-binding domain with two lobes and one ion channel, and have been proposed to act as dimmers or trimers of subunits coexpressed in the same neuron 12 . However, they aren't expressed in chemosensory neurons that express ORs or Orco 14 .
OBPs are the liaisons between external cues and ORs 18 , and they selectively bind hydrophobic odorant chemicals and transport them to the surface of the dendrites of ORNs [19][20][21] . OBPs also function in the recognition of specific odors through activation of the ORx/Orco complex 20 . Another class of odorant binding proteins, CSPs, are small soluble proteins expressed predominantly in the sensilum lymph as well as in non-olfactory tissues. It is clear that CSPs bind odorant or pheromone compounds [22][23][24] , but their olfactory mechanisms areas yet poorly studied.
SNMPs are insect membrane proteins that are known to associate with pheromone sensitive ORNs in Lepidoptera and Diptera 25 . There are two types of SNMPs, SNMP1 and SNMP2 25 . In moth, the subtype SNMP1 is coexpressed with pheromone receptors (PRs) in pheromone-responsive neurons 25 , whereas the subtype SNMP2 is confined to sensilla support cells [25][26][27][28] .
The diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae), is a destructive insect pest distributed worldwide that can cause considerable damage in cruciferous crops. It is estimated that the total loss caused by P. xylostella is about US$4-5 billion annually 29 . Although a bioinformatics analysis of the whole-genome sequence has explained the evolutionary success of P. xylostella with regard to its expansion in gene families associated with the perception and detoxification of plant defense compounds/insecticides at the genetic and molecular levels 30 , the peripheral olfactory mechanisms that contribute to the fitness of this insect pest remain poorly understood. Identification of genes expressed in the antennae will supply baseline information to understand their likely function in odorant perception in P. xylostella and insects adaptation to various host plants.
In the present study, we sequenced and analyzed the antennal transcriptome of P. xylostella adults using second-generation high-throughput Illumina RNA sequencing (RNA-seq). The purpose of our study was to identify olfaction-related genes which might be targets as a part of pest control strategies of this insect pest species that devastates cruciferous vegetables. We identified 118 candidate chemosensory genes encoding 54 ORs, 16 IRs,7 GRs, 15 CSPs, 24 OBPs and 2 SNMPs. The sex-biased and tissue-specific expression patterns of 54 ORs and 16 IRs was also determined by semi-quantitative reverse transcription PCR. We reported the protein sequences of these chemosensory genes in Supplementary Dataset File.

Results
Sequencing and unigene assembly. By using Hiseq. 2000 sequencing approach, a total of 60,041,232 and 59,753,272 raw reads were obtained from the P. xylostella female and male antennae samples, respectively. After removing low quality and adaptor reads, female and male antennae yielded 54,430,716 and 54,059,300 clean reads and 4,898,764,440 nt and 4,865,337,000 nt clean nucleotides, respectively. After initial assembly, 124,488(mean length 278 nt) and 132,190 contigs (mean length 268 nt) were obtained from the female and male antennae libraries, respectively. Next, 62,278 female (mean length 555 nt) and 63,928 male unigenes (mean length 531 nt) were generated after contig connecting. These two unigene sets were then pooled together for further clustering, which yielded a final set of 59,844unigenesconsisting of 18,570 distinct clusters and 41,274 distinct singletons. The mean length of these unigenes was 660 nt, and N50 was 979 nt ( Table 1).

Identification of candidate chemosensory receptors: ORs and GRs.
All the unigenes were searched by blastx against nr database and further by tblastn using 63 ORs from B. mori as queries, 54 candidate OR genes were identified (Table 2). Of these, 23 were predicted to have full-length open reading frames (ORFs). The length of these 23 OR genes ranges from 376 to 473 amino acid residues, and the encoded proteins are estimated to have 5-7 TMDs, which is characteristic of typical insect ORs. The remaining 31 OR genes code for at least 163 amino acids and are predicted to have more than one TMD. A phylogenetic analysis was then performed using our candidate ORs and the ORs from other Lepidopteran insects including H. armigera, H. virescens and B. mori (Fig. 1).
The OR co-receptor gene was easily identified because of extremely high conservation among species compared to other chemosensory receptors. Similar to other insect ORs, most P. xylostella (Pxyl) ORs are highly divergent and share low similarity with other Lepidopteran insect ORs, including ORs from H. armigera, H. virescens and B. mori. However, nine PxylORs had 33%~100% identity to previously characterized PRs from P. xylostella and B. mori. They formed a single subgroup in a phylogenetic tree of Lepidopteran ORs (Fig. 1). Seven of these nine PxylORs (PxylOR1 andPxylOR3-8) were predicted to have full-length ORFs. Two short sequences (PxylOR41and PxylOR45) were also clustered in the PR branch. PxylOR41 has high similarity to PxylOR4, and PxylOR45 has relatively high similarity to BmorOR6. 12 of the remaining PxylORs were clustered with their  Contig  Female 124,488  34,667,373  278  403 ---Male  132,190  35,402,665  268  369 ---Unigene   Female 62,278  34,543,989  555  829 62,278  16,328  45,950   Male  63,928  33,941,348  531  761 63,928  15,969  47,959   All  59,844  39,492,885  660  979 59,844  18, Lepidopteran orthologous genes in the phylogenetic tree. But most PxylORs appeared to be distantly related to the known insect ORs (Fig. 1). We named the Orco unigene PxylOR2 and the 7full-length candidate PR unigenes PxylOR1 and PxylOR3-PxylOR8. The other 46 OR unigenes were ranked in order of decreasing ORF length and named PxylOR9-PxylOR54. We also identified 7 candidate GRs and named them as PxylGR1-PxylGR7. Identification of putative OBPs. We identified 24 unigenes encoding OBPs from the antennal transcriptome of P. xylostella, including 3pheromone binding proteins (PBPs) and 3 general odorant binding proteins (GOBPs) ( Table 4). Twenty-two of these 24 unigenes were predicted to have signal peptides, and 19 have full length ORFs. Signal peptide sequences were not detected in the remaining two putative OBPs due to incomplete N-terminal sequences. All 24 putative OBPs had high similarity to known Lepidopteran OBPs. The PBP and GOBP sequences were clustered in a separate clade in the OBP neighbor-joining tree (Fig. 3). Three candidate OBPs were classified into a PBP subgroup in the phylogenetic tree. They share 66%~100% similarity with previously characterized Lepidopteran PBPs and thus were named PBPs. We also found two GOBPs in the antennal transcriptome of P. xylostella and named them PxylGOBP1 and GOBP2. A new GOBP (PxylGOBP1.2) was identified that has 77% identity with PxylGOBP1. It was clustered in the GOBP clade and distinguished from other OBPs in the phylogenetic tree.  ORs, PxylOR8, was only expressed in female antennae. PxylOR45 was expressed in both male and female at a similar level. In other 44 general ORs PxylOR54 expression was much higher in female than in male antenna and the remaining 43 ORs were expressed in both male and female antennae at a similar level. In contrast to ORs, the expression of all IRs did not differ significantly between males and females. All of these 16 PxylIRs were expressed in the male and female antennae, but PxylIR7d. 3 and PxylIR25a were also expressed in legs.

Discussion
In the present study, we profiled the antennal transcriptome of P.  [35][36][37][38] and many other insect pests. Insects utilize three groups of chemosensory receptors, ORs, IRs and GRs, to perform a variety of essential behaviors such as foraging, mating and oviposition. ORs are the centerpiece of peripheral olfactory reception and determine the sensitivity and specificity of odorant reception 3 . Due to the availability of insect genome databases and progress in sequencing technology, increasing numbers of OR genes have been identified from many Lepidopteran species. To date, 68, 64, 70 ORs have been identified in the genome databases of B. mori 38 , Danaus plexippus 39 and Helioconius Melpomene 40 , respectively. Recently, by using next-generation sequencing technology the antennal transcriptome of M. sexta was profiled, and 48 OR genes were identified 34,41 . In this study, we identified 54 ORs in the antennal transcriptome of adult P. xylostella. The number of ORs identified in this paper is less than that identified by You et al. 30 in the genome database of P. xylostella. We might have missed some development-related OR genes because we only identified chemosensory genes in the adult antennae. Typical insect ORs are characterized by seven TMDs. We found less than seven TMDs in PxylORs, which is also observed in other Lepidopteran insects 33,42,43 . This is probably caused by the limited power of the software used for TMDs finding.
All of the PxylORs identified in the antennal transcriptome are highly divergent and share low similarity with other Lepidopteran insect ORs. A study showed that the common ancestor of Lepidopterans had fewer OR genes but that there were multiple gene gains and few gene losses during the evolution of Lepidoptera. This phenomenon of gene family expansion is suggested to be associated with the adaption of Lepidopteran species to host plants 44 . We also identified 9 (PxylOR1, PxylOR3-8, PxylOR41 and PxylOR45) candidate PRs based on their similarity to previously characterized PRs. The antennal expression pattern of PoxylPRs is consistent with that of PRs in H. armigera 42 and S. littoralis 45 . Among these 9 candidate PRs, 7 showed male-biased expression, and PxylOR5 was only expressed in male antennae. In contrast, PoxylOR8 was only expressed in female antennae. Sex   and tissue-specific expression of chemosensory genes is very common among Lepidoperan pests. It was found in H. assulta 33 and H. armigera 42 that some of their antennal OR genes showed sex-biased expression pattern. The male-specific expression of PxylOR5 probably plays a role in locating females, while the female-specific expression PxylOR8 likely also has ecological significance, i.e. optimization of pheromone production and spatial dispersion of females among host plants 46,47 and selection of oviposition sites. We identified one Orco unigene, named PxylOR2, which has high similarity to HarmOR2, BmorOR2 and HvirOR2. Orco is highly conserved among all insect species 3 and carries out similar functions in different insects 48 by forming a ligand-gated ion channel 49 . Orco probably functions as a chaperone and forms a dimer with the other ORs in P. xylostella.
GRs can respond to tastants such as sugars, bitter substances, CO 2 and some contact pheromones 50 . Thus, GRs play very important roles in food selection and feeding behaviors in insects. The first insect GRs were identified in the fruit fly, D. melanogaster 51 . The number of Lepidopteran GRs varies greatly; there is one GR in Cydia pomonella 52 and H. armigera 42 , 2 in M. sexta 34 , 3 in Heliothis virescens 53 and 5 in Spodoptera littoralis 45,54 . In the antennal transcriptome of adult P. xylostella we identified 7 GRs, which is more than those in the Lepidopteran insects mentioned above, but far less than the number found in the silkworm B. mori (65 GRs) 55 and the oriental tobacco budworm H. assulta (18 GRs) 56 . GRs are mainly expressed in gustatory organs such as the proboscis and maxillary palps, rather than in antennae 8 . This is a possible reason why we identified only 7GRs in P. xylostella. Two GR genes, GR21a and GR63a have been proved to be putative CO 2 receptors in the antennae of the fruit  57,58 . And in mosquitos, 3 putative CO 2 receptor genes (GR22, 23 and 24) have been identified in the maxillary palps of different species [59][60][61] . The PxylGR1 was closely related to the GR22 in mosquito and GR21a in the fruit fly and predicted to be a candidate CO 2 receptor.
IRs belong to an ancient chemosensory receptor family, and two subfamilies of IRs have been identified recently, i.e. the conserved 'antennal IRs' and the species-specific 'divergent IRs' 62 43 , and 12 IRs in S. litoralis 36 . All of these IRs are expressed in antennae, but PxylIR7d.3 and PxylIR25a are also expressed in legs, which is different from the expression patterns of these genes in H. assulta 33 . Coincidently, HarmIR25a, HarmIR75d, HarmIR75p and HarmIR76p are also expressed in the cotton bollworm legs 42 . The function of leg-expressed IRs remains unknown and deserves in-depth investigation.
OBPs are believed to be directly involved in the activation of the ORx/Orco complex in the recognition of specific odors 20 . A total of 24 OBPs were identified in the antennal transcriptome of P. xylostella, including three GOBPs and three PBPs. The number of OBPs identified in the present study was comparable to those identified in transcriptomic analyses of H. armigera (34) and H. assulta (29) 33 , S. litura (21) 64 , S. littoralis (26) 54 , but fewer than those identified in B. mori (44) 37 . OBPs showed lineage-specific expansion and diversification; therefore, it is not surprising that there are some differences, or even big differences, in the number of OBPs. Previous studies have also shown that some insect OBPs and CSPs are expressed exclusively in non-antennae tissues or in larvae 65 . Therefore, different sampling and sequencing strategies may lead to different results. In a previous study, two GOBPs, GOBP1 and 2, were identified in P. xylostella antennae 66 . GOBPs were also found in the antennae of C. pomonella 67 and S. litura 68 . The antennal P. xylostella GOBPs identified in this study have ecological significance, e.g. guiding P. xylstella to find better food 69 . The antennal S. litura GOBP1 can bind to plant odorants, while S. litura GOBP2 can bind to aldehyde-sex compounds and analogs 68 . CSPs are a class of small soluble proteins expressed highly in the chemosensilla lymph 70 and show high binding activity to odorants and pheromones 71 . We identified 15CSPs genes in the present study. The number of CSPs identified from P. xylostella was comparable to the number in B. mori (18) 72 , H. armigera (18) and H. assulta (17) 33 and S. litura (18) 64 , but fewer than the number in M. sexta (21) 34 , Sesamia inferens (24) 63 and S. littoralis (31) 45 . Because CSPs are also expressed in tissues other than antennae 73,74 and may participate in other physiological processes, it is possible that we have missed some CSPs in our antennal transcriptome analysis.
SNMPs are two-transmembrane domain proteins that share very high homology to members of the mammalian CD36 receptor family, which are thought to function in pheromone detection of Lepidopteran and Dipteran insects 31 . Two subtypes of SNMPs (SNMP1 and SNPM2) have been frequently identified in most insects, e.g. Helicoverpa armigera 33,42 , Cnaphalocrocis medinalis 27 , S. exigua 75 , S. litura 28 , C. suppressalis 43 , H. assulta 33 , and in this study, P. xylostella. The expression of antennal SNMPs in P. xylostella suggests their role in pheromone detection, similar to what has been reported in D. melanogaster 32,76 .

Conclusions
In summary, we identified 118 candidate olfactory genes that may function in odorant perception in the diamondback moth, P. xylostella by assembling and annotating transcriptomic sequence data. We carried out a comparative phylogenetic analysis to predict gene functions and examined the transcriptome patterns of the P. xylostella OR and IR genes. Genes with sex-biased and tissue-specific expression patterns, especially PxylOR5 and PxylOR8, are potential targets for environmentally-friendly management of this destructive insect pest. Our results lay the foundation for functional analysis of these receptors in both neurobiological and evolutionary studies.

Materials and Methods
Insect rearing. The laboratory-maintained P. xylostella was reared in the Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing, China. The larvae and adults were fed on Chinese cabbage and kept in cages at 27 ± 1 °C under a 16: 8 (L: D) photoperiod and 65 ± 5% relative humidity. Male and female larvae were distinguished at the last instar and placed in separate cages. Antennae of female or male adults were dissected at 1-3 days after adult emergence, immediately frozen in liquid nitrogen, and then stored at −70 °C until use.
Total RNA extraction. The frozen antennae were transferred to a liquid nitrogen-cooled mortar and ground with a pestle. One mL of TRIzol reagent was pipetted to the homogenate (Invitrogen, Carlsbad, CA, USA) and total RNA was extracted following the manufacturer's instructions. Total RNA was resuspended in RNAse-free   Unigene generation. Raw reads were pre-processed to remove low quality reads and reads containing adapter sequences and poly-A/T tails. The publicly available program Trinity was used to perform de novo assembly of clean reads to generate a set of transcripts 77 . The Trinity outputs were then clustered by TGICL (TGI Clustering tools) 78 . The final unigene dataset consists of uniformly clustered sequences and singletons.
Gene identification and functional annotation. Unigene sequences were first searched against protein databases like nr, Swiss-Prot, KEGG and COG, using blastx with an e-value cut-off of 1e −5 79 . To identify more OR genes, 63ORs from B. mori were used as queries in tblastn searches of P. xylostella antennal unigenes. Unigene ESTs were predicted using ESTScan 80 . Signal peptides in the protein sequences were predicted using SignaIP 4.0 81 . The TMDs of annotated genes were predicted using TMHMM Server Version2.0 (http://www.cbs.dtu.dk/ services/TMHMM).

Expression analysis of the candidate receptors by semi-quantitative reverse transcription PCR.
To illustrate and compare the expression patterns of candidate receptors in male and female antennae, semi-quantitative RT-PCR was performed using cDNA prepared from male antennae, female antennae and legs (male and female mixture). Legs were used as a control to confirm the antennae-enriched expression of candidate receptors. Total RNA was extracted as described above. Prior to cDNA synthesis, RNA was treated with DNase I (Fermentas, Vilnius, Lithuania) to remove trace amounts of genomic DNA. The cDNA was synthesized using the First Strand cDNA Synthesis Kit (Fermentas, Vilnius, Lithuania) and was used as a template in PCR reactions with gene-specific primers. The housekeeping gene RPS3 was used as a control 87 . Primers were designed using the Primer Premier 5 software (PREMIER Biosoft International), and the sequences are available in Supplementary  Table S1. PCR was performed with the Veriti Thermal Cycler (Applied Biosystems, Carlsbad, CA, USA) under the following conditions: 94 °C for 2 min, 33 cycles of 94 °C for 30 s, 55-60 °C for 30 s, and 72 °C for 30 s, and 72 °C for 10 min. The cycle number was reduced to 27 and 30 for Actin and OR2 amplification because of their high expression level. The experiment was repeated three times using three independently isolated RNA samples. PCR amplification products were run on a 2% agarose gel and verified by DNA sequencing.