Hologenomic adaptations underlying the evolution of sanguivory in the common vampire bat

Adaptation to specialized diets often requires modifications at both genomic and microbiome levels. We applied a hologenomic approach to the common vampire bat (Desmodus rotundus), one of the only three obligate blood-feeding (sanguivorous) mammals, to study the evolution of its complex dietary adaptation. Specifically, we assembled its high-quality reference genome (scaffold N50=26.9 Mb, contig N50=36.6 Kb) and gut metagenome, and compared them against those of insectivorous, frugivorous, and carnivorous bats. Our analyses showed i) a particular common vampire bat genomic landscape regarding integrated viral elements, ii) a dietary and phylogenetic influence on gut microbiome taxonomic and functional profiles, and iii) that both genetic elements harbor key traits related to the nutritional (e.g. vitamins and lipids shortage) and non-nutritional challenges (e.g. nitrogen waste and osmotic homeostasis) of sanguivory. These findings highlight the value of a holistic study of both host and microbiota when attempting to decipher adaptations underlying radical dietary lifestyles.

The order Chiroptera (bats) exhibits a wide variety of dietary specializations, and includes the only three obligate blood-feeding mammalian species, the vampire bats (family Phyllostomidae, subfamily Desmodontinae). Blood is a challenging dietary source because it consists of an ~78% liquid phase and a dry-matter phase consisting of ~93% proteins and only ~1% carbohydrates 1 , providing very low levels of vitamins 2 , and potentially containing blood-borne pathogens. Vampire bats have evolved numerous key physiological adaptations to this lifestyle, for which the associated genomic changes have not yet been fully characterized due to the lack of an available reference genome. These adaptations include morphological specializations (such as claw-thumbed wings and craniofacial changes including sharp incisors and canines), infrared sensing capacity 3 for the identification of easily accessible blood vessels in prey 4 , and renal adaptations to the high protein content in their diet 5 (such as a high glomerular filtration rate and effective urea excretion). Furthermore, given the high risk of exposure to blood-borne pathogens, another important trait in the vampire bat is its immune system 6 .
Besides genomic adaptations, host-associated microbiota may play an additional, possibly equally important, role in the evolution of vertebrate dietary specialization 7 . Although the functional role of the vampire bat gut microbiome has not been studied, analyses of obligate invertebrate sanguivorous organisms 8 have shown that the gut microbiota contributes to blood meal digestion 9 , provision of nutrients absent from blood 10 and to immunological protection 11 . Studies on mammals have shown that the gut microbiome is a key aspect of an organism's digestive capacities (energy harvest, nutrient acquisition and intestinal homeostasis) 12 , and that it also affects phenotypes related to the immune and neuroendocrine systems 13,14 . Furthermore, changes in the gut microbiome are associated with diseases such as diabetes, obesity, irritable bowel syndrome and inflammatory bowel disease [15][16][17] . In response to the growing awareness of the key roles that host-microbiome relationships can play across the spectra of life, various studies have advocated for the 'hologenome' concept [18][19][20] . In brief, this argues that natural selection acts on both the host and its microbiome (together forming the holobiont); thus, evolutionary studies should incorporate both. The extreme dietary adaptation of vampire bats provides a suitable model to investigate the effect of selection across the genome and microbiome, and thus allows exploration of the role of the host-associated microbiome in the evolution of specialized diets.

Results and discussion
Here, we explore the contributions of both the common vampire bat's nuclear genome and gut microbiome to its adaptation to obligate sanguivory. To this end, we generated both the common vampire bat genome and faecal metagenomic data sets as a proxy to study its gut microbiome, as well as faecal metagenomic data sets from non-sanguivorous bat species. We used these data sets for comparative genomic and metagenomic analyses. Specifically, we analysed the common vampire bat genomic landscape, the ratio of substitution rates at non-synonymous and synonymous sites (dN/dS), putative gene loss and gene family expansion/contraction, and computational predictions of the functional impact of amino-acid substitutions. We also performed microbial taxonomic and functional profiling, identified the microbial taxonomic and functional core of the common vampire bat, and identified enriched microbial taxa and functions. Following a hologenomic approach, we identified elements in both the host genome and microbiome that could have played relevant roles in adaptation to sanguivory.
Genomic landscape. We sequenced and de novo assembled the ~2 gigabase (Gb) common vampire bat genome using Illumina sequencing technology (Supplementary Information 1). The genome is smaller than that of other mammals, but similar to previously reported bat genomes 21 . The initial assembly (~100× mean coverage, scaffold N50 = 5.5 Mb and N90 = 933 kb; Supplementary Figs. 1 and 2) was subsequently improved using the in vitro proximity ligation-based technology for assembly contiguity refinement developed by Dovetail Genomics 22 . We obtained a final high-quality assembly with scaffold N50 = 26.9 Mb and N90 = 9.46 Mb, contig N50 = 36.6 kb and N90 = 8.8 kb (Supplementary Table 1, Supplementary Fig. 3 and Supplementary Information 1).
We used our annotated common vampire bat genome (see Methods-Protein-coding gene and functional annotation) for comparative genomic analyses with publicly available bat genomes and other mammalian genomes (Supplementary Table 2 and Fig. 1).
Repetitive elements can significantly contribute to genome evolution. Thus, for a genomic landscape characterization, we first compared transposable elements in the common vampire bat genome to those within the genomes of non-phyllostomid bats with other diets: the carnivorous Megaderma lyra (greater false vampire bat, Megadermatidae), the insectivorous Pteronotus parnellii (Parnell's moustached bat, Mormoopidae) and the frugivorous Pteropus vampyrus (large flying fox, Pteropodidae) (Fig. 1, Supplementary Table 3 and Supplementary Note 1). We identified a 1.6- to 2.26-fold higher copy number of the MULE-MuDR transposon in the common vampire bat genome relative to the other bat genomes. The high mutagenic capacity of MULE-MuDR has been demonstrated to have played a critical role in the evolution of some plants 23 . Furthermore, transposable elements in general may cause structural or functional changes within the genome and alter epigenetic regulation of the genes into which they are inserted 24 . Therefore, we explored whether these elements might have also played a role in the evolution of sanguivory by analysing their genomic location in the nuclear genome of the common vampire bat. We found that the identified common vampire bat transposable elements, MULE-MuDR elements in particular, were located within genomic regions enriched for gene ontology (GO) functions related to the challenges of sanguivory, such as antigen processing and presentation, defence response to viruses, lipid metabolism, and vitamin metabolism (Supplementary Information 2).
A sanguivorous diet facilitates exposure to blood-borne viruses that could lead to an increase in genomic invasion by retroviral and non-retroviral endogenous viral elements (EVEs). Thus, we next characterized their presence in the common vampire bat genome. Compared to previously published EVE studies on non-Chiropteran mammals, the common vampire bat exhibits a greater diversity of non-retroviral EVEs in terms of the number of integrations, including endogenized viral genes from avian Bornaviridae (Supplementary Information 3). Surprisingly, and in contrast to prior expectations given its sanguivorous diet, the diversity of endogenous retroviral elements (ERVs) in the common vampire bat is very low compared to other bat species 26 . The only proviral elements detected were DrERV 27 and DrgERV, both present in low copy numbers (Supplementary Note 3, Supplementary Fig. 5 and Supplementary Information 4). We hypothesize that genome colonization by ERVs could have been restricted by genomic adaptations in the common vampire bat against ERV insertion and proliferation. In support of this hypothesis, we identified an expansion of the anti-retroviral TRIM5 gene family (Viterbi P = 0.00088, Supplementary Information 5 and Supplementary Note 4).
Genomic adaptations to sanguivory. Feeding specialization often requires morphological and physiological adaptations in traits such as the sensory apparatus (for example, infrared sensing), locomotion, digestion, kidney function and immunity (Fig. 2a and Supplementary Information 6). For example, it has been shown that vampire bats have lost sweet taste genes and show a reduction of bitter taste genes 28 . In agreement, such genes were also identified in our putative gene loss analysis (Supplementary Information 7 and Supplementary Note 5). It is likely that the function of those genes is related to sanguivory, because sweet and bitter taste receptor genes influence glucose homeostasis in humans 29 . Interestingly, we found that the common vampire bat bitter taste receptor gene TAS2R3 has experienced episodic positive selection and shows two species-specific positively selected sites (PSSs) on topological domains, one of them having a potential impact on protein function (PROVEAN score = −4.4) (Supplementary Table 4 and Supplementary Fig. 6). Among the enriched GOs of the differentially evolving genes, we identified functions related to the regulation of RNA splicing, which could be relevant to sanguivory given that D. rotundus produces submandibular tissue-specific splicing isoforms to counteract the prey's response to injury 30 (Supplementary Information 8 and Supplementary Note 6). Regarding the recruitment of alternatively spliced forms, the ganglion-specific splicing of TRPV1 has been found to underlie the vampire bat's infrared sensing ability 31 . Interestingly, we found that PRKD1, which directly modulates the product of TRPV1 32 , is positively selected and exhibits species-specific PSSs (branch-site test P = 3.39 × 10⁻¹⁰, branch test Supplementary Information 8). These examples suggest an important role of alternative splicing as a form of regulatory evolution fundamental to sanguivory (Supplementary Note 7).
However, it is clear that despite the number of detected genomic features related to sanguivory adaptation, they alone cannot address all of the challenges posed by this diet (Fig. 2b).
Gut microbiome diet and phylogenetic influence. We generated Illumina shotgun metagenome data in order to compare the gut microbiomes represented by 13 faecal samples from common vampire bats with those of non-sanguivorous non-phyllostomid bats; specifically, eight frugivorous Rousettus aegyptiacus (Egyptian fruit bat, Pteropodidae), five insectivorous Rhinolophus ferrumequinum (greater horseshoe bat, Rhinolophidae) and five carnivorous Macroderma gigas (ghost bat, Megadermatidae) bats (Supplementary Information 9). We obtained a median of 15.8 Gb of sequencing data (~37.6 million 100-bp paired-end reads) for each dietary category. After filtering low-quality bases, adaptor sequences and bat-genome-derived reads, we obtained a median of 2.77 Gb of high-quality data for each species, totalling 86.73 Gb of data. We identified taxa and functions present only in the common vampire bat microbiome (gut microbiome core), as well as taxa and functions that exhibit statistically significant differences in abundance or contribution to variation between the different microbiomes (Supplementary Information 6 and 8, Supplementary Tables 5 and 6 and Supplementary Note 8).
It has been observed previously that similarity in the taxonomic composition of vertebrate gut microbiomes (including bats) can be influenced by the diet and the phylogenetic relationships of the respective host species 33 . Overall, the common vampire bat microbiome taxonomic composition is more similar to that of the insectivorous and carnivorous bats than to that of the frugivorous bat. This may reflect a phylogenetic influence on the microbiome taxonomic profile (Fig. 3a, Supplementary Fig. 7 and Supplementary Notes 9 and 10). In contrast, the vampire bat microbiome is strikingly different from that of the compared bats at the functional level, which was characterized by the KEGG annotations of the microbial non-redundant gene catalogues assembled from the metagenomic data sets. While there is little differentiation between the functional gut microbiomes of carnivorous, insectivorous and frugivorous bats, the common vampire bat functional gut microbiome is almost completely distinct, and exhibits the least intra-species variation between the samples (Fig. 3b, Supplementary Fig. 8).

Nature ecology & evolutioN
This suggests that the functional profile is less influenced by phylogeny than the taxonomic profile, and that the common vampire bat gut microbiome harbours a set of functions specialized to its extreme diet (Supplementary Note 11). Subsequently, we analysed the comparative genomic and metagenomic results in a hologenomic framework to demonstrate how both components contribute to adaptation to sanguivory.
The hologenomic framework of sanguivory. Viscosity and subsequent coagulation represent a challenge for ingestion and digestion of blood. Besides developing potent anticoagulants in its saliva 34 , the common vampire bat hologenome addresses this challenge in various ways. For instance, REG4, involved in metaplastic responses of the gastrointestinal epithelium, was found to be under ongoing positive selection (M8a/M8 test P = 0.047) with possible functional impact on its carbohydrate-binding capacity, including binding of the anticoagulant heparin (Supplementary Table 4 and Supplementary Information 8). Furthermore, we identified genes in the common vampire bat microbial functional core from pathways for degradation of heparan sulfate and dermatan sulfate, both being polysaccharides involved in blood coagulation (Supplementary Information 10). We also identified an enrichment in the common vampire bat of the microbial gene L-asparaginase (Fisher's P = 0.00027), which decreases protein synthesis of coagulation factors 35 (Supplementary Information 11).
Besides specialized digestion, sanguivory poses other challenges related to the poor nutritional value of blood itself, as well as to the side effects that may arise due to the blood components being the sole dietary source (Fig. 2b). We identified elements in both the genome and gut microbiome that might be involved in solving each of these challenges, as discussed next.
Hologenomic solutions to nutritional challenges. Low nutrient availability. Obligate sanguivory requires adaptation to very low levels of some nutrients, such as essential amino acids and the vitamin B complex 36,37 , and very high levels of others, such as salt 38 . Our data clearly demonstrate how both the host and its associated gut microbiome have dealt with these challenges. We found the gene LAMTOR5 to be positively selected in the common vampire bat genome (false discovery rate (FDR) = 2.02 × 10⁻⁷, Supplementary Information 8). This gene is involved in the response to nutrient starvation 39 , suggesting that the common vampire bat metabolism has adapted to the low nutrient content available in blood. Similarly, we identified KOs in the common vampire bat microbial core related to energy and carbohydrate metabolism (Fig. 4b and Supplementary Information 11). For example, when compared to the other bats, we identified an enrichment in the common vampire bat microbial genes involved in response to low nutrient availability (RelA/SpoT family protein, Fisher's P = 0.0064, and guanosine pentaphosphate, Fisher's P = 0.0018), and enzymes in the common vampire bat core involved in the reverse Krebs cycle (Supplementary Information 10), which is used by some bacteria to produce carbon compounds from CO2 and water (abundant blood components). We speculate that the presence of such metabolic pathways indicates the growth of specific microbes in the gut's environmental conditions resulting from blood consumption.
The holobiont has also provided solutions to the lack of important nutrients in blood (Fig. 4c and Supplementary Information 8). For example, the PDZD11 gene, involved in vitamin B5 metabolism, evolved faster in the common vampire bat genome relative to the other examined bats (branch test P = 1.97 × 10⁻¹⁰). We further postulate that the microbiome also contributes to tackling the challenge of low nutrient content by providing necessary nutrients. For example, compared to the other bat microbiomes, the common vampire bat gut microbiome had the highest number of enriched enzymes related to the biosynthesis of cofactors and vitamins, such as carotenoid (Supplementary Fig. 8B and Supplementary Information 10 and 11). Furthermore, we identified enzymes involved in the metabolism of butyrate, an important nutrient for cells lining the mammalian colon derived from bacterial fermentation 40 , enriched in the common vampire bat gut microbiome as well as in the vampire bat gut microbiome core (Supplementary Information 10 and 11).
Lipid and glucose assimilation. Besides vitamins and other nutrients, lipids and glucose may not be readily available in blood. Furthermore, vampire bats have a reduced capacity to store energy reserves. In agreement with this, we identified GO enrichment of lipid metabolism on genes with dN/dS values statistically higher or lower compared to other bats (Fig. 4a,c and Supplementary Information 8). For example, we identified the gene FFAR1, which plays an important role in glucose homeostasis, as evolving faster in the common vampire bat compared to the other bats (branch test P = 3.68 × 10⁻⁵) and containing amino-acid substitutions with a possible functional impact on its binding ability (Supplementary Table 4). This may enable the common vampire bat to better utilize the available glucose. The common vampire bat gut microbiome also exhibits unique solutions to the challenge (Fig. 4b,d and Supplementary Information 10 and 11). Differences in the carbohydrate and glycan metabolism functional profile were identified in the principal component analysis (PCA) comparing the different microbiomes (Supplementary Fig. 11), which places the vampire bat gut microbiome profile within a cluster separate from those of the non-sanguivorous bats. Importantly, we identified enrichment in the microbial gene glycerol kinase in the common vampire bat (Fisher's P = 0.0027), which plays a key role in the formation of triacylglycerol and in fat storage, and its deficiency causes symptoms such as hypoglycaemia and lethargy in a mouse model 41 .
Hologenomic solutions to non-nutritional challenges. Immunity. Due to its sanguivorous lifestyle, the common vampire bat risks direct contact with blood-borne pathogens from prey. Consistent with this, we observed > 280 bacterial species known to be pathogenic to some mammalian species present exclusively in the common vampire bat gut microbiome (Supplementary Information 12). For example, we identified enrichment of genes from Borrelia (Supplementary Information 11), and Bartonella was one of the most abundant genera in the common vampire bat compared to the other bats (Supplementary Information 13). These bacteria are known to be transmitted by sanguivorous invertebrates (ticks, fleas, mosquitoes and lice). This suggests that the abundance of these genera could be a shared pattern of sanguivorous species. While several studies have elucidated part of the expected genomic immunity-related adaptations 6 , analysis of the full genome enabled us to identify more elements related to immunity, such as defence response to virus and antigen processing and presentation (Fig. 4a,c and Supplementary Information 8). For example, we identified the antimicrobial gene RNASE7 to be positively selected (branch-site test P = 0.004) and containing amino-acid substitutions that may increase its bactericidal capacity (Supplementary Table 4). In addition, when compared to the gut microbiomes of non-sanguivorous bats, that of the common vampire bat contains a large abundance of potentially protective bacteria, such as Amycolatopsis mediterranei (P < 0.05), which has been shown to produce antiviral compounds against bacteriophages and poxviruses 42 (Supplementary Information 13).

Fig. 4 | Traits in both the genome and gut microbiome with direct roles in the adaptation for sanguivory. a, GOs of the single-copy orthologous genes with a dN/dS ratio that is statistically higher or lower in the common vampire bat in comparison to the other bats. b, KOs from the common vampire bat gut microbiome functional core. c, Genes with a dN/dS ratio that is statistically higher or lower in the common vampire bat in comparison to the other bats and that are directly associated with sanguivory challenges. d, Taxa and genes from the common vampire bat gut microbiome core directly associated with the adaptation to sanguivory.

Iron assimilation. Iron concentration represents a significant challenge to sanguivory. Although the concentration of free iron ions is not high in the blood, severe haemolysis (for example, during digestion of blood) could result in high levels of iron that, if absorbed in excess, could accumulate and disrupt normal function in organs such as the liver, heart and pancreas. Interestingly, we identified the light and heavy chains of the iron-storing protein ferritin (encoded by FTL and FTH1) under gene family expansion in the common vampire bat genome (Viterbi P = 0 and Viterbi P = 0.0012, respectively, Supplementary Information 5). In addition, we identified an enrichment of the iron-storing protein ferritin in the gut microbiome (Fisher's P = 0.0014), suggesting that the gut microbiome also contributes to solving this challenge (Fig. 4d and Supplementary Information 11).
Nitrogen waste and blood/osmotic pressure. The high abundance of protein in the blood and its rapid ingestion could lead to accumulation of nitrogenous waste products, primarily urea, which could lead to renal disease-like symptoms (for example, high blood pressure and fluid retention). This challenge is exacerbated by the abundance of salts in blood, which pose additional osmotic and blood pressure challenges. At the genome level, this appears to be addressed by a higher rate of evolution, compared to the other bats, in common vampire bat genes involved in the disposal of excess nitrogen (Fig. 4a,c and Supplementary Information 8), such as PSMA3 (branch test P = 2.08 × 10⁻⁷). This challenge seems to also be addressed by the gut microbiome. The PCA of the copy number of genes involved in amino-acid metabolism distinguishes the common vampire bat in a single cluster separated from the other bat species analysed (Supplementary Fig. 11), suggesting a specialized microbial amino-acid metabolism capacity. We also identified enrichment in the common vampire bat of the microbial gene urease subunit alpha (ureA, Fisher's P = 0.016), involved in urea degradation (Supplementary Information 11).

Conclusions
It is clear from our results that the common vampire bat has adapted to sanguivory through a close relationship between its genome and gut microbiome. We identified a phylogenetic and dietary impact on the common vampire bat gut microbiome and uncovered an unexpected viral and repetitive element genomic makeup. We showed that extreme dietary specializations, such as that of the common vampire bat, provide a comparative framework with which to tease apart the relative roles of genomes and microbiomes in adaptation. In conclusion, our study illustrates the benefits of studying the evolution of complex adaptations under a holobiome framework, and suggests that vertebrate adaptation studies that do not account for the action of the hologenome may fail to recover the full complexity of adaptation.

Methods
Genome sequencing and raw-read processing. We shotgun-sequenced the D. rotundus genome using a wing biopsy from a sample collected by the NIH through the Catoctin Wildlife and Zoo in Thurmont, Maryland, USA. The capture method and the post-mortem preservation procedure of the specimen are unknown, as are the age and sex of the individual. Sampling permits were given to BGI for the sequencing of the specimen, originally as part of the BGI 10K genome project. Genomic DNA was extracted at the Laboratory of Genomic Diversity and was fragmented to 2-10 kilobases (kb). Sequencing libraries were constructed with insert sizes of 170 bp to 10 kb, according to the Illumina protocol for sequencing on Illumina HiSeq2000 following the manufacturer's instructions. We sequenced reads of 49 bp for the long-insert-size libraries (2 kb, 5 kb and 10 kb) and 100 bp for the short-insert-size libraries (170 bp, 500 bp and 800 bp). Sequencing errors were corrected on the basis of the frequency of nucleotide strings of a given length and low-quality reads were filtered out using SOAPfilter 43 as follows.
(1) Reads with > 10% uncalled nucleotides (Ns) were removed.
(2) For short-insert-size libraries (< 2 kb), reads were removed if the quality score of > 60% bases was < 7. For large-insert-size libraries (≥ 2 kb), reads were removed if the quality score of > 80% bases was < 7.
(3) Adapter sequences, and duplicate or identical reads, were removed.
(4) Read pairs were removed if Read1 and Read2 were completely identical.
(5) For short-insert-size paired-end sequences, reads with overlapping length ≥ 10 bp between Read1 and Read2 were removed.
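The filtering rules above can be illustrated with a minimal Python sketch. It is not SOAPfilter and does not reproduce its behaviour; it only encodes rules (1), (2) and (4) as stated. The `ReadPair` type, the function names, and the choice to drop a whole pair when either mate fails are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReadPair:
    """Hypothetical container for one read pair and its per-base quality scores."""
    read1: str
    qual1: List[int]
    read2: str
    qual2: List[int]


def frac_uncalled(seq: str) -> float:
    """Fraction of bases in seq that are uncalled (N)."""
    return seq.upper().count("N") / len(seq)


def frac_low_quality(quals: List[int], threshold: int = 7) -> float:
    """Fraction of bases whose quality score is below the threshold."""
    return sum(q < threshold for q in quals) / len(quals)


def keep_pair(pair: ReadPair, insert_size_bp: int) -> bool:
    """Return True if the pair passes rules (1), (2) and (4) from the text.

    Assumption: the pair is dropped if either mate fails a per-read rule.
    """
    # Rule (2): the tolerated low-quality fraction depends on the library type.
    max_low_q = 0.60 if insert_size_bp < 2000 else 0.80
    for seq, quals in ((pair.read1, pair.qual1), (pair.read2, pair.qual2)):
        if not seq or not quals:
            return False  # empty reads carry no information
        if frac_uncalled(seq) > 0.10:  # rule (1)
            return False
        if frac_low_quality(quals) > max_low_q:  # rule (2)
            return False
    if pair.read1 == pair.read2:  # rule (4)
        return False
    return True
```

Adapter trimming, duplicate removal and the mate-overlap rule (3 and 5) need alignment or hashing steps and are left out of this sketch.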
Genome assembly. We estimated the genome size of D. rotundus using Kmerfreq 44 by dividing the total number of heptadecamers (17-mers) by the peak of the heptadecamer Poisson distribution. High-quality reads were assembled using SOAPdenovo 43 as follows.
(1) Short-insert library reads were assembled as initial contigs ignoring the sequence pair information.
(2) Reads were aligned to the previously generated contig sequences. Scaffolds were constructed from short-insert-size libraries to large-insert-size libraries by weighting the paired-end relationships between pairs of contigs, with at least three read pairs required to form a connection between any two contigs.
(3) Gaps in the scaffolds were closed using GapCloser 43 .
Genome quality assessment was performed by downloading a publicly available D. rotundus transcriptome 45 and aligning the transcripts to the genome using BLAT 46 .
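The genome-size estimate described at the start of this subsection reduces to one division: total k-mers over the depth at the peak of the k-mer frequency distribution. The sketch below shows that arithmetic on a toy histogram. It is not Kmerfreq; the function name, the histogram representation and the `min_depth` cut-off used to skip low-depth error k-mers when locating the peak are assumptions.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size as (total k-mers) / (peak depth).

    histogram maps a k-mer depth to the number of distinct k-mers seen at that depth.
    min_depth is an assumed cut-off so that error k-mers do not define the peak.
    """
    total_kmers = sum(depth * count for depth, count in histogram.items())
    candidates = {d: c for d, c in histogram.items() if d >= min_depth}
    if not candidates:
        raise ValueError("no k-mer depths at or above min_depth")
    peak_depth = max(candidates, key=candidates.get)
    return total_kmers / peak_depth


# Toy histogram: a main peak at depth 20 plus some depth-1 error k-mers.
toy = {1: 50, 19: 100, 20: 300, 21: 100}
size = estimate_genome_size(toy)  # 10050 total k-mers / peak depth 20
```

Whether error k-mers are excluded from the numerator as well is a tool-specific choice that this sketch does not attempt to settle.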
Genome contiguity improvement. We prepared two Chicago libraries 22 using 5 μg of high-molecular-weight DNA obtained from D. rotundus cultured cells from the San Diego Zoo collection, which were originally derived from a skin sample taken from between the shoulder blades of a D. rotundus individual. Permits for this were obtained from San Diego Zoo Global. The capture method and the post-mortem preservation procedure of the specimen are unknown, as are the age and sex of the individual. DNA was extracted with Qiagen Blood and Cell Midi kits according to the manufacturer's instructions. The steps required for building the Chicago libraries were performed as described in ref. 22 . The libraries were sequenced using an Illumina HiSeq 2500 2 × 100 bp rapid run. Our initial D. rotundus assembly, shotgun sequence data and Chicago library sequences were used by Dovetail Genomics as input data for HiRise, as described in ref. 22 . Genome assembly contiguity statistics were obtained using a minimum N track length of 1 to delimit the contig blocks within the scaffolds.
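As an illustration of the contiguity statistics mentioned above, the following sketch splits scaffolds into contig blocks at runs of N and computes N50 and N90 from the resulting lengths. The function names and the toy sequences are hypothetical; the `min_n_run=1` default mirrors the minimum N track length of 1 stated in the text.

```python
import re
from typing import List


def split_contigs(scaffold: str, min_n_run: int = 1) -> List[str]:
    """Split a scaffold into contig blocks at runs of at least min_n_run Ns."""
    pattern = "[Nn]{%d,}" % min_n_run
    return [block for block in re.split(pattern, scaffold) if block]


def nx(lengths: List[int], fraction: float = 0.5) -> int:
    """Length L such that sequences of length >= L cover the given fraction of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return 0


scaffolds = ["AAAANNCCC", "GGGGGG", "TTNT"]
scaffold_lengths = [len(s) for s in scaffolds]
contig_lengths = [len(c) for s in scaffolds for c in split_contigs(s)]

scaffold_n50 = nx(scaffold_lengths, 0.5)
contig_n50 = nx(contig_lengths, 0.5)
contig_n90 = nx(contig_lengths, 0.9)
```

Raising `min_n_run` would merge blocks separated by short gaps and therefore report longer contigs, which is why the threshold is stated alongside the statistics.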

Protein-coding gene and functional annotation. Homology-based gene prediction was performed using the Ensembl gene sets of Myotis lucifugus, Pteropus alecto, Myotis davidii, horse and human as references. We aligned the protein sequences of the reference gene sets to the D. rotundus assembly using tblastn 47 and linked the blast hits into candidate gene loci with genBlastA 48 . We filtered out those candidate loci with homologous block length < 90% of the query length. We extracted genomic sequences of candidate gene loci, including the intronic regions and 3 kb upstream/downstream sequences. The sequences were passed to GeneWise 49 to search for accurately spliced alignments. We filtered out pseudogenes containing more than one frame error for single-exon genes. Potentially pseudogenized single exons were removed if they were part of a multiexon gene. We then aligned protein sequences of these genes against UniProt using blastp and filtered out genes without matches. We also filtered out genes that had > 80% repeat regions. De novo gene prediction was performed with AUGUSTUS 50 using a published common vampire bat transcriptome 45 as a training data set and with masked transposable-element-related repeats. We filtered out partial predicted genes and those shorter than 150 bp. Genes that aligned over 50% of their length to annotated transposable elements were filtered out. Finally, we built a non-redundant gene set with the homology-based evidence prioritized over the de novo evidence. If de novo genes were chosen in the reference gene set, we retained only those with > 30% of their length aligning against UniProt 51 and that contained at least 3 exons. The integrated gene set was translated into amino-acid sequences, which were used to search the InterPro database with iprscan_4.8 52 . We used BLAST to search the metabolic pathway database in KEGG 53 .
Repeat annotation. Repeat annotation was performed on the genomes of D. rotundus, P. parnellii, M. lyra and P. vampyrus. Transposable elements were identified using RepeatMasker 55 and RepeatProteinMask against the Repbase transposable element library 56 . We used RepeatScout, PILER-DF and RepeatModeler-1.0.5 55,57 to construct a de novo transposable element library, which was then used by RepeatMasker to predict repeats. We predicted tandem repeats using TRF 58 . LTR_Finder 59 was used to detect long terminal repeats (LTRs). The Repbase-based annotations and the de novo annotations were merged.
Non-retroviral EVEs. We constructed a comprehensive library of all non-retroviral virus protein sequences available in GenBank and EMBL. We used DIAMOND 60 to search these sequences against the D. rotundus genome. We extracted the matching amino-acid sequences and performed reciprocal blastp-like searches 61 . The best-fit amino-acid substitution model for each alignment was identified using jModelTest2 73 and trees were inferred under maximum-likelihood criteria using RAxML 65 .
Orthologous gene families. We clustered orthologous genes between D. rotundus and a range of other mammalian species using two strategies.
(1) Identifying single-copy orthologues in the species by using the TreeFam method 74 .
(2) Identifying 1:1 orthologues by building pair-wise orthologues between D. rotundus and the other species using a reciprocal best hit (RBH) plus synteny approach as described in ref. 75 .
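The reciprocal-best-hit part of strategy (2) can be sketched as follows. This is a generic RBH illustration, not the procedure of ref. 75: the synteny check is not modelled, and the score dictionaries, gene identifiers and function names are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]  # (query gene, target gene) -> alignment score


def best_hits(scores: Scores) -> Dict[str, str]:
    """For each query gene, keep the target with the highest score."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores: dr* stand for D. rotundus genes, hs* for genes of another species.
a_vs_b = {("dr1", "hs1"): 900, ("dr1", "hs2"): 300, ("dr2", "hs2"): 500, ("dr3", "hs2"): 450}
b_vs_a = {("hs1", "dr1"): 880, ("hs2", "dr2"): 510, ("hs2", "dr3"): 440}
pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In the toy data, `dr3` is excluded because its best hit `hs2` prefers `dr2`; a synteny filter would then be applied to the surviving pairs.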
dN/dS analyses. We used PAML codeml 76 on the cleaned CDS alignments from the two sets of orthologous families and the corresponding phylogenetic tree as reported in ref. 77 . To identify genes with accelerated evolution in the D. rotundus lineage, we ran the two-ratio branch model. As a null model, we used the one-ratio model. Using these results, we performed likelihood ratio tests (LRTs) to identify genes with significant P values. To adjust for multiple testing, we used the FDR method. To identify genes with positively selected sites in D. rotundus, we also used the branch-site model in PAML codeml 76 . LRTs and FDR were computed as for the branch model tests.
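A minimal sketch of the testing step described above: a likelihood ratio test on the log-likelihoods of two nested models, followed by an FDR adjustment. Two assumptions are made here and are not stated in the text: the LRT statistic is compared against a chi-square distribution with one degree of freedom (the two-ratio model has one more parameter than the one-ratio model), and the FDR method is taken to be Benjamini-Hochberg. Function names and values are illustrative.

```python
import math
from typing import List


def lrt_pvalue(lnl_null: float, lnl_alt: float) -> float:
    """P value of 2 * (lnL_alt - lnL_null) under chi-square with 1 degree of freedom.

    For 1 d.f. the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical log-likelihoods for one gene under the one-ratio and two-ratio models.
p_gene = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-998.0792)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For branch-site tests, other null distributions (for example, a mixture) are sometimes used, so the one-degree-of-freedom choice should be read as a simplification.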
Gene family expansion/contraction. We used CAFE 78 with the results from the single-copy orthologous gene families. To obtain the dated tree required as input for CAFE, we obtained divergence times from the fossil dating records from TimeTree 79 . We concatenated the CDSs aligned with MAFFT and used PAML mcmctree to determine split times with the approximate likelihood calculation method. Genes with >200 copies in one of the species were filtered out.
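The copy-number filter can be sketched as follows. The sketch assumes that a family is removed when at least one species exceeds the threshold, which is one reading of the sentence above; the table layout, family identifiers and species labels are hypothetical.

```python
from typing import Dict

def filter_large_families(counts: Dict[str, Dict[str, int]],
                          max_copies: int = 200) -> Dict[str, Dict[str, int]]:
    """Keep only families in which no species exceeds max_copies."""
    return {
        family: per_species
        for family, per_species in counts.items()
        if all(n <= max_copies for n in per_species.values())
    }

# Toy table: family -> {species: copy number}; all values are made up.
counts = {
    "fam1": {"D_rotundus": 3, "species_B": 5},
    "fam2": {"D_rotundus": 250, "species_B": 12},
    "fam3": {"D_rotundus": 200, "species_B": 1},
}
kept = filter_large_families(counts)
```

A family with exactly 200 copies is retained here because the stated cutoff is strictly greater than 200.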
Putative gene loss. We identified genes putatively lost in D. rotundus as previously described in ref. 80  Taxonomic and functional identification. The reads were cleaned with Trimmomatic 91 and prinseq-lite 92 . The data sets were mapped against the closest available bat genome and only the non-mapping reads were kept. MGmapper 93 was used for the taxonomic identification of invertebrates, protozoa, fungi, viruses and bacteria. We kept the species identified with a coverage value higher than the first quartile of the coverage distribution of the corresponding database and filtered out those found in the corresponding extraction blanks. Rarefaction curves from each data set were obtained using an in-house script with the MGmapper results. We then performed de novo assembly using IDBA_UD 94 , predicted genes using Prodigal 95 and generated a non-redundant gene catalogue with usearch 96 . The non-redundant catalogue was searched with ublast 96 against the UniProt database 51 for functional and taxonomic annotation. We used DIAMOND 60 to search the unmapped reads against the UniProt database, keeping only the best hit for functional and taxonomic annotation. We annotated the UniProt protein identifications using KEGG orthology (KO) and eggNOG IDs.
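The coverage and extraction-blank filter can be illustrated with a short sketch for a single database. It is not the in-house procedure: the quartile is computed with the default method of Python's statistics.quantiles, which is an assumption, and the species labels and coverage values are made up.

```python
import statistics
from typing import Dict, Set

def filter_taxa(coverage: Dict[str, float],
                blank_taxa: Set[str]) -> Dict[str, float]:
    """Keep taxa above the first quartile of coverage and absent from blanks.

    Requires at least two taxa, because statistics.quantiles needs
    two or more data points.
    """
    # First of the three cut points returned for n=4 is the first quartile.
    q1 = statistics.quantiles(coverage.values(), n=4)[0]
    return {
        taxon: value
        for taxon, value in coverage.items()
        if value > q1 and taxon not in blank_taxa
    }

# Toy input for one database; names and values are hypothetical.
coverage = {"sp_A": 1.0, "sp_B": 2.0, "sp_C": 3.0, "sp_D": 4.0,
            "sp_E": 10.0, "sp_F": 12.0, "sp_G": 20.0, "sp_H": 30.0}
blank_taxa = {"sp_E"}
kept_taxa = filter_taxa(coverage, blank_taxa)
```

With these values the first quartile is 2.25, so sp_A and sp_B fall below the cutoff and sp_E is removed because it also appears in the blank.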
Taxonomic and functional metagenomics comparison. We filtered on the basis of the breadth of coverage and the identifications of the extraction blanks. We removed any non-microbial hit and any taxon for which the paired reads matched different genera or only one of the reads had a hit. The counts were normalized by percentage. We identified a microbial taxonomic and functional sanguivorous core by comparing the filtered sets of the bats and keeping as core those taxa and genes identified only in the D. rotundus samples. We manually examined the taxa from the filtered taxonomic identifications, and the KO and COG annotations from the filtered non-redundant gene set catalogue. We compared the normalized abundance of taxa and functions between D. rotundus and the non-sanguivorous bats as follows. (1) Using the distributions of the different functional categories from D. rotundus and each non-sanguivorous bat species with a Wilcoxon rank-sum test.
(2) Using the entire taxonomic and functional data sets, as well as down-sampling the normalized count values. Sampling values were the minimum, median and third-quartile values of the count distributions. With the resulting data sets, we calculated the Euclidean, Bray-Curtis and Jaccard distance metrics with the R package vegan 97 , performed hierarchical clustering using the UPGMA and Ward methods, and performed PCAs with prcomp and the GPA method.
(3) We identified taxa and functions significantly contributing to the variation between the D. rotundus and the non-sanguivorous bat species. We examined the rotation matrix from the PCA of the normalized counts, excluding the four deepest sequenced samples, of the species and genus microbial taxonomic levels and the KEGG functional pathways. We identified the most significantly abundant