Use of the proteomic tool MALDI-TOF MS in termite identification

Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has proved effective for the identification of many arthropods. A total of 432 termite specimens were collected in Mali, Côte d’Ivoire, Togo, Senegal, Switzerland and France. Morphologically, 22 species were identified, including Ancistrotermes cavithorax, Amitermes evuncifer, Cryptotermes brevis, Cubitermes orthognathus, Kalotermes flavicollis, Macrotermes bellicosus, Macrotermes herus, Macrotermes ivorensis, Macrotermes subhyalinus, Microcerotermes parvus, Microtermes sp., Odontotermes latericius, Procubitermes sjostedti, Promirotermes holmgreni, Reticulitermes grassei, Reticulitermes lucifugus, Reticulitermes santonensis, Trinervitermes geminatus, Trinervitermes occidentalis, Trinervitermes togoensis, Trinervitermes sp., Trinervitermes trinervoides and Trinervitermes trinervius. Analysis of the MALDI-TOF MS spectra profiles from termites revealed that all were of high quality, with intra-species reproducibility and inter-species specificity. Blind testing of the spectra of 389 termites against our updated database, which contained the spectra of 43 specimens of different termite species, revealed that all were correctly identified, with log score values (LSVs) ranging from 1.65 to 2.851 (mean 2.290 ± 0.225, median 2.299); 98.4% (383) had LSVs > 1.8. This study is the first to use MALDI-TOF MS for termite identification. It shows the value of the technique for arthropod taxonomy and reinforces the idea that MALDI-TOF MS is a promising tool in the field of entomology.

Termites, or white ants, are arthropods that have thrived on earth for over 300 million years 1,2. They belong to the class Insecta, infra-order Isoptera 3. Phylogenetic studies have indicated that their nearest relatives are cockroaches, which explains their classification in the order Blattodea 4. Their distribution depends on climatic conditions, especially temperature and precipitation 5. Termites live in colonies divided into two castes (reproductive or sexual, and sterile or asexual) 6,7. Humans and termites live in close proximity 8, and many termite species are recognized as harmful.
Termites attack the wooden parts of buildings and other constructions, causing damage that costs more than three billion dollars each year, and they harm agriculture by eating fast-growing plants 9,10. At the same time, they are known as ecosystem engineers because they influence the distribution of natural resources, such as water and nutrients, in the ecosystem 11. This reflects their ability to generate valuable biogenic structures that improve soil properties 12 and increase water infiltration rates 13,14. They are also of interest in traditional medicine, particularly for suturing wounds and treating angina, fever, burns and abscesses 15. Termite species such as Macrotermes bellicosus have been shown to have anti-inflammatory and analgesic effects 16. Termites can be used as bait to catch fish and birds 15 and as a natural human food resource with significant protein and vitamin value 17. They also contribute to beneficial chemical changes in the earth and its components 18.
Morphological identification of termites, based mainly on the observation of morphological characteristics, is limited by the need for entomological expertise, the difficulty of identifying specimens to the species level (owing to the ambiguity and similarity of their features and their crypto-biotic social structure), the limited availability of identification keys and the long time required for identification. Identification based on amplification and sequencing of molecular markers has proven to be an efficient alternative for species identification that overcomes these morphological limitations 19. The markers used include mitochondrial cytochrome oxidase subunits I and II (cox1 and cox2), the gene coding for NADH-ubiquinone oxidoreductase chain 1 (ND1), the internal transcribed spacer (ITS2), the large and small ribosomal RNA subunits (16S and 12S rRNA), and nuclear DNA such as 18S, microsatellites and the genes for endo-beta-1,4-glucanase (RsEG) and interactive domain-containing protein 1A (AT-rich DNA). However, the molecular approach is still limited by the high cost of reagents, the time required, the absence of universal primers allowing amplification of a given gene in all species, and the absence of sequences for all species in GenBank 20.
www.nature.com/scientificreports/
To overcome the difficulties of morphological and molecular identification of arthropods, MALDI-TOF MS has been proposed as an alternative identification technology. MALDI-TOF MS identifies an organism from its protein signals (protein fingerprint) with molecular weights between 2000 and 20,000 Da. This method has been used in many studies to identify different arthropods, including ticks, mosquitoes, biting midges, fleas, lice, bedbugs, triatomines and phlebotomine sand flies, as well as to determine the origin of their blood meals and to discriminate the infectious status of some arthropod vectors 21. In medical entomology, the use of MALDI-TOF MS requires the development of protocols, such as the choice of the body part to be used and the quantity of homogenization mix, that yield spectra with intra-species reproducibility and inter-species specificity. The body part that generates reproducible and specific spectra varies between arthropod groups and also with the developmental stage of the arthropod 21. For example, the legs are used for mosquitoes and ticks, and the cephalothorax for fleas, lice and bedbugs 21. MALDI-TOF MS is a fast and easy technique that does not require expertise in entomology. However, the high cost of the machine and its maintenance, the choice of the body part used for the analysis and the preservation methods are limiting factors of this technique 21. When the machines used in entomology research are those already available on a microbiology platform, they entail no extra cost, and the cost of each analysis is very low.
At present, for termites, MALDI-TOF MS has been used to identify methanotrophic bacteria in the gut 22, cuticular hydrocarbons 23, carboxy-methyl cellulose, crystalline celluloses or xylan from the gut of Reticulitermes santonensis 24, and the chemical profile and antimicrobial activity of Macrotermes bellicosus used in traditional medicine 25. However, no study has addressed the identification of termite species by MALDI-TOF MS. Hence, the aim of this study was to evaluate the ability of MALDI-TOF MS to identify different species of termites collected in four West African countries and two European countries.

Termite collection and morphological identification.
The largest number of species was identified in Senegal, with 12 different species (Fig. 1 and Table 1). The termites belonged to the sterile caste (workers and soldiers) and the reproductive caste (winged). We could not identify several specimens of Trinervitermes and Microtermes to the species level (Table 1). For all the species that we identified morphologically, pictures of the body and mandibles were taken and are shown in Supplementary Fig. 1.

Validation of morphological identification by molecular tools.
A total of 47 randomly selected termite specimens, morphologically identified for the creation of our database, were submitted to standard PCR and sequencing for molecular identification using both the 12S rRNA and COI genes. BLAST analysis of the sequences obtained from specimens identified as M. subhyalinus showed that they were 100% and 99.4% identical to the corresponding sequences of M. subhyalinus (GenBank: DQ441726 and AY127708) for the 12S rRNA and COI genes, respectively. The results of the molecular identification of the termites in our study are summarized in Table 1. The phylogenetic position of the termites in this study is shown in Fig. 2A and B. The sequences of the 12S rRNA and COI genes obtained in this study were deposited in GenBank (National Center for Biotechnology Information, NCBI) under accession numbers MW078935 to MW078965 and MZ029056 to MZ029087, respectively (Supplementary Table 1).

MS spectra analysis.
In total, the legs of 432 termite specimens were subjected to MALDI-TOF/MS analysis. Visualization of the MS spectra obtained from all samples showed that they were of high quality (no smoothing, correct baseline subtraction and peak intensity > 3000 arbitrary units) (Fig. 3). A cluster analysis (dendrogram) was performed with two to five MS spectra from the legs of each species to evaluate the reproducibility and specificity of the spectra according to species. The perfect clustering of specimens of the same species on the same branch reveals intra-species reproducibility and inter-species specificity for the different termite species (Fig. 4). Interestingly, the MS spectra of sterile and reproductive caste specimens of the same species were caste-specific, as shown in the PCA and dendrogram made with the two castes (reproductive and sterile) of K. flavicollis and R. lucifugus (Supplementary Fig. 2). All the reference MS spectra included in our laboratory database have been deposited in a public repository to be shared with the entire research community; they can be downloaded using the following DOI: https://doi.org/10.35088/q281-st29.
Blind test for validation of termite identification.
The accuracy of MALDI-TOF MS identification of 17 termite species was evaluated by querying 389 morphologically identified specimens against our updated MALDI-TOF MS database, which contained 43 spectra (one to ten per species) confirmed by molecular biology (Table 2). The termite identified as Microtermes sp. was not included in the blind test because there was only one specimen.
The result of the interrogation showed that 100% (389) of the specimens were correctly identified, i.e., they agreed with our morphological identification, with LSVs ranging from 1.65 to 2.851, a mean of 2.290 ± 0.225 and a median of 2.299; 98.4% (383) had LSVs > 1.8.

Discussion
According to our morphological identification, 22 species belonging to 12 genera were identified in this study. Most of the species identified in West Africa had already been reported in this area 26. The termites belonged to the reproductive or biological caste (with wings) and the sterile caste, composed of workers and soldiers (without wings). Each caste fulfils a specific role within the colony: soldiers are involved in defense, workers in elementary tasks (gathering food, taking care of the queen and king, and constructing or repairing the nest) and the reproductive caste in reproduction 27. The molecular biology of social insects has always been contentious 28; we used two genes, 12S rRNA and COI, in our study. Previous studies suggested that the 12S rRNA gene is more discriminating than the COI gene 29,30. However, molecular identification of Isoptera has most often used the COI gene 29,31,32. Additionally, 12S is characterized by a narrow spectrum of sequences, with each species represented by one or two sequences.
Based on the results of molecular biology, our morphological identification was confirmed for some species, such as K. flavicollis, whereas for others it did not match the molecular identification. The most probable explanation is morphological misidentification on our part, due to the close anatomical similarities among species of the genera Reticulitermes and Trinervitermes 33,34. Alternatively, this discrepancy between our morphological and molecular identification could be due to the updating of termite systematics, with changes in genus names as in the case of Cubitermes, changed to Nitiditermes 35,36, or to the lack of reference sequences for some species in the GenBank database, as in the case of M. ivorensis and P. holmgreni, which is one of the limitations of the molecular method 21.
In this study we developed the MALDI-TOF MS tool to identify different termite species. Although some studies have used MALDI-TOF MS to investigate the gut microbiota and chemical composition of termites 26,37,38, to our knowledge our study is the first to use this tool for the identification of these insects. MALDI-TOF MS is a technique that identifies the biomolecules (proteins) contained in a sample by soft ionization according to their mass-to-charge ratio (m/z) 21. The m/z ratio is determined from the time an ion takes to travel the flight path, which generates a spectral profile specific to the composition of the sample being analyzed. Over several years, this technique has been developed for the routine diagnosis of a large number of microorganisms (bacteria, archaea, yeasts, filamentous fungi, helminths and intestinal protozoa) of medical and/or veterinary importance 21. Over the last 15 years, MALDI-TOF MS has been widely used in entomology for the identification of a large number of arthropod vectors and non-vectors, as well as for the determination of blood meal origin and the discrimination of infected and non-infected arthropods. Although this technique is inexpensive, reliable and fast and does not require knowledge of entomology, its use in entomology requires the development of specific protocols to generate reproducible and species-specific spectra 21.
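The relation between flight time and m/z mentioned above can be illustrated with a short calculation. This is a minimal sketch of an idealized linear time-of-flight analyzer, not a model of the instrument used in this study: it assumes singly charged ions, ignores the extraction delay and initial ion velocities, and uses a hypothetical flight-tube length of 1.0 m (the actual length is not given in the text). The 20 kV acceleration voltage is the value reported in the Materials and methods.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, in coulombs
DALTON = 1.66053906660e-27   # one dalton, in kilograms


def flight_time(mass_da, charge=1, voltage=20e3, tube_length=1.0):
    """Flight time (s) of an ion in an idealized linear TOF analyzer.

    Energy balance: z*e*U = m*v**2 / 2, hence t = L * sqrt(m / (2*z*e*U)).
    tube_length is a hypothetical value chosen for illustration only.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return tube_length / velocity


# The two ends of the 2-20 kDa mass range analyzed in this study.
for mass in (2_000, 20_000):
    print(f"{mass} Da: {flight_time(mass) * 1e6:.1f} microseconds")
```

Because the flight time scales with the square root of m/z, a tenfold increase in mass lengthens the flight time by a factor of about 3.16 under these assumptions, which is what allows ions of different masses to be separated along the time axis.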
The price of the machine, the choice of the part of the arthropod used for MALDI-TOF MS analysis [...] identification, proving the performance of the MALDI-TOF MS tool in distinguishing the different arthropod species, has already been reported in several studies 21. It is interesting to note that very closely related species that are often difficult to distinguish morphologically (species of the genera Macrotermes and Trinervitermes) and specimens of different castes (reproductive and sterile) could be distinguished by MALDI-TOF MS. The ability of MALDI-TOF MS to distinguish closely related tick species has already been reported 39.

Figure caption (beginning missing): [...] (3-4), Cubitermes orthognathus (5-6), Cryptotermes brevis (7-8), Kalotermes flavicollis (9-10), Macrotermes bellicosus (11-12), Macrotermes herus (13-14), Macrotermes ivorensis (15-16), Macrotermes subhyalinus (17-18) and (B) the spectra from Microcerotermes parvus (19-20), Microtermes sp. (21-22), Odontotermes latericius (23-24), Procubitermes sjostedti (25-26), Promirotermes holmgreni (27-28), Reticulitermes lucifugus (29-30), Trinervitermes geminatus (31-32), Trinervitermes occidentalis (33-34), Trinervitermes trinervius (35-36). a.u.: arbitrary units; m/z: mass-to-charge ratio.

Table 2. Termite collection information (location, year of collection, morphological identification) and MALDI-TOF identification with log score value obtained by blind test.

To summarize, our study shows that MALDI-TOF/MS is a promising tool that may make termite studies much easier and more specific. It would be interesting to apply this innovative tool to termites from other sites around the world in order to enrich our MALDI-TOF MS database with as many species as possible.

Materials and methods
Study area and collection periods. Termites [...] The collection in Senegal was made at four sites: Niokolo-Koba National Park (sites of Simenti, Dar Salam and Niokolo Poste) and the Dindefello forest in southeastern Senegal (region of Kédougou). The termitaries were identified visually, and samples were collected using a shovel and pincer. From each termitary, the soil substrate, the fungus combs (if available) and adult termites (soldiers and workers) were collected in a ventilated plastic box, in which they were kept at ambient temperature during transport. On arrival, termites were separated from the substrate and fungus. The maps showing the collection sites were made with QGIS software version 3.20 (Fig. 1). All termites were stored in 70% ethanol, except for termites collected in Marseille in 2020, which were stored at −20 °C because they were collected close to the laboratory.
Morphological identification. Morphological identification of termites was carried out to the genus and/or species level using different keys based on the specific and discriminating morphological or biological characteristics of soldiers, workers and alates, such as the keys of Sands 40, Clay 41, Adlard et al. 42, Ifan 43 and Bouillon 44.
The collection site was also taken into account and helped with the taxonomy. For workers of some species, we dissected the mandible to identify termites to the species level. The criteria were observed with an optical microscope, a binocular loupe and a ZEISS Axio Zoom V16, and then photographed with a digital Canon EOS 7D camera fitted with a Canon MP-E 65 mm lens (French).

Molecular identification of termites.
To confirm the morphological identification of termites whose spectra were to be introduced into our MS spectra database (DB), molecular analyses were performed. DNA was extracted from a small part of the termite abdomen, which was subjected to enzymatic lysis by incubation at 56 °C overnight in 180 μL of lysis buffer G2 (QIAGEN, Hilden, Germany) and 20 μL of proteinase K (QIAGEN, Hilden, Germany). Total DNA was extracted into 100 μL of eluate using the EZ1 Tissue Kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions and stored at −20 °C before use.

Termite preparation for MALDI-TOF analysis.
After rinsing and drying on sterile filter paper, three legs from each termite were placed in an individual Eppendorf tube. Legs from specimens stored in alcohol were dried at 37 °C overnight; legs from specimens stored at −20 °C were not. The legs were then homogenized with a TissueLyser (Qiagen) using a pinch of glass beads and 20 µL of a mixture of 70% formic acid and 50% acetonitrile (Fluka, Buchs, Switzerland) in a three-minute cycle at a frequency of 30 Hz, as already described 20,46,47. The legs of Aedes albopictus reared in our laboratory were used as a positive control in all manipulations.

MALDI-TOF/MS parameters. Protein mass profiles were obtained using a Microflex LT MALDI-TOF Mass Spectrometer (Bruker Daltonics, Germany), with detection in the linear positive-ion mode at a laser frequency of 50 Hz within a mass range of 2-20 kDa. The setting parameters of the MALDI-TOF/MS apparatus were identical to those previously used 20,48,49. Briefly, the acceleration voltage was 20 kV, and the extraction delay time was 200 ns. Each spectrum corresponds to ions obtained from 240 laser shots performed in six regions of the same spot and automatically acquired using AutoXecute in the flexControl v.2.4 software (Bruker Daltonics).

Spectra analysis. The MS spectra were then exported to flexAnalysis v3.3, ClinProTools v2.2 and MALDI-Biotyper v3.0 (Bruker Daltonics) software for data processing (smoothing, baseline subtraction, peak picking). The quality of the MS spectra was evaluated by visualizing the spectra obtained from the four spots for each sample with flexAnalysis v3.3 (Bruker Daltonics). Cluster analyses (MSP dendrogram) and principal component analysis (PCA) were performed to verify intra-species reproducibility and inter-species specificity, as well as variability between different castes (soldiers and workers) of the same species. Cluster analyses were based on the comparison of the MSPs given by MALDI-Biotyper v3.0 and grouped according to the protein mass profiles (i.e., their mass signals and intensities), reflecting how the termite specimens are related to each other. The setting parameters were as follows: distance measure by correlation, linkage by average; the score threshold value for a single organism was 300 (arbitrary units) and for related organisms was 0 (arbitrary units).
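To make the clustering settings concrete (correlation distance, average linkage), the following is a minimal, generic sketch of agglomerative clustering in plain Python. It is not the MALDI-Biotyper implementation, whose internals are not described here. The spectra are assumed to be already reduced to equal-length vectors of binned intensities, and the toy vectors and their labels are hypothetical.

```python
from itertools import combinations
from math import sqrt


def correlation_distance(a, b):
    """Return 1 - Pearson correlation of two equal-length intensity vectors.

    Assumes neither vector is constant (non-zero variance).
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)


def average_linkage(spectra):
    """Average-linkage agglomerative clustering; returns merges in order.

    spectra maps a label to a binned intensity vector. Each merge is a tuple
    (labels_of_first_cluster, labels_of_second_cluster, mean_distance).
    """
    def mean_distance(pair):
        first, second = pair
        total = sum(correlation_distance(spectra[i], spectra[j])
                    for i in first for j in second)
        return total / (len(first) * len(second))

    clusters = [(label,) for label in spectra]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest mean pairwise distance.
        first, second = min(combinations(clusters, 2), key=mean_distance)
        merges.append((first, second, mean_distance((first, second))))
        clusters = [c for c in clusters if c not in (first, second)]
        clusters.append(first + second)
    return merges


# Hypothetical binned intensities: two spectra for each of two species.
toy_spectra = {
    "species_A_1": [0, 9, 1, 0, 5, 0],
    "species_A_2": [0, 8, 2, 0, 6, 0],
    "species_B_1": [7, 0, 0, 6, 0, 2],
    "species_B_2": [8, 1, 0, 5, 0, 3],
}

for first, second, dist in average_linkage(toy_spectra):
    print(first, "+", second, f"merged at distance {dist:.3f}")
```

With these toy vectors, spectra of the same hypothetical species merge first at small distances and the two species join last, which is the pattern a dendrogram reads as intra-species reproducibility and inter-species specificity.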

Database creation and blind test for validation of termite identification. Reference MS spectra were created from the spectra of each termite species, where available, using MALDI-Biotyper v3.0 (Bruker Daltonics). MS spectra of legs from 42 specimens of termites identified morphologically and molecularly were added to our MS spectra database (DB) 50. Microtermes sp. spectra were not added to the DB because we had only one specimen.
All remaining spectra were blind tested against our MS spectra database for termite identification. The significance of each identification was assessed using the log score values (LSVs) given by the MALDI-Biotyper v.3.3 software, which correspond to the degree of similarity in signal intensity between the query and reference mass spectra. An LSV, ranging from 0 to 3, was obtained for each spectrum of the tested samples. Identification results were considered reliable and relevant when the LSV was greater than or equal to 1.8, as previously established in many studies 20.
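The decision rule for the blind test can be sketched as follows. This is an illustrative summary routine, not part of the MALDI-Biotyper software: the function name, the input layout (expected species, identified species, LSV) and the toy values are all hypothetical. Only the 1.8 threshold comes from the text.

```python
from statistics import mean, median, stdev

LSV_THRESHOLD = 1.8  # reliability cut-off stated in the text


def summarize_blind_test(results):
    """Summarize blind-test results.

    results is a list of (expected_species, identified_species, lsv) tuples.
    """
    lsvs = [lsv for _, _, lsv in results]
    correct = sum(1 for expected, found, _ in results if expected == found)
    reliable = sum(1 for lsv in lsvs if lsv >= LSV_THRESHOLD)
    return {
        "n": len(results),
        "correct_pct": 100 * correct / len(results),
        "reliable_pct": 100 * reliable / len(results),
        "lsv_min": min(lsvs),
        "lsv_max": max(lsvs),
        "lsv_mean": mean(lsvs),
        "lsv_sd": stdev(lsvs),
        "lsv_median": median(lsvs),
    }


# Hypothetical results for four query spectra (not data from this study).
toy_results = [
    ("species_A", "species_A", 2.3),
    ("species_A", "species_A", 1.7),
    ("species_B", "species_B", 2.0),
    ("species_B", "species_B", 2.5),
]

summary = summarize_blind_test(toy_results)
print(summary)
```

Keeping the concordance rate and the share of LSVs at or above the threshold as separate figures mirrors how the results are reported: a specimen can be correctly identified while still scoring below 1.8.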
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.