DarkCideS 1.0, a global database for bats in karsts and caves

Understanding biodiversity patterns, as well as the drivers of population declines and range losses, provides crucial baselines for monitoring and conservation. However, the information needed to evaluate such trends remains unstandardised and sparsely available for many taxonomic groups and habitats, including cave-dwelling bats and cave ecosystems. We developed DarkCideS 1.0 (https://darkcides.org/), a global database of bat caves and species synthesised from publicly available information and datasets. DarkCideS 1.0 is by far the largest database for cave-dwelling bats and contains information on geographical location, ecological status, species traits, and parasites and hyperparasites for 679 bat species that are known to occur in caves or use caves during part of their life histories. The database currently contains 6746 georeferenced occurrences for 402 cave-dwelling bat species from 2002 cave sites in 46 countries and 12 terrestrial biomes. The database has been developed to be collaborative and open-access, allowing continuous data-sharing among the community of bat researchers and conservation biologists to advance bat research and comparative monitoring and prioritisation for conservation.


Background & Summary
Human civilization has left its footprint on every part of the planet, in the process driving what is frequently referred to as the sixth mass extinction 1,2. Conservation prioritisation requires a rigorous assessment of vulnerable species and their habitats. Biodiversity data integration and synthesis are significant empirical steps to identify priorities in strategically using the limited funds allocated to conservation 3. However, the data needed to develop such priorities with rigour are often lacking. The diversity and distribution of a subset of terrestrial vertebrates have become an umbrella for taxonomic and spatial conservation, despite the known biases present in popular open datasets 4,5. Efforts to mitigate extinction risks or protect key habitats often disproportionately focus on particular taxa, ecosystems, or regions 6,7. This approach neglects many other equally important species and their habitats and compromises the maintenance of ecosystem services provided by diverse functional groups 8,9. Cave ecosystems are critical for bats, with around half of all bat species reliant on caves and a high rate of endemism 10,11. Of the more than 1400 known extant bat species distributed across almost all terrestrial habitats around the globe, at least 679 species are known to be cave-dwelling 11–13. Many of these species occur in biodiversity hotspots that are threatened by varying anthropogenic and natural threats 13,14. Caves are important habitats for bats and other unique species but are nonetheless threatened and in need of urgent conservation 10. Despite hosting high endemism, cave ecosystems receive little attention in terms of fund allocation and appropriate priorities for scientific studies and conservation compared to their surface counterparts such as agricultural and forest ecosystems 10,13,15–18.
Cave taxa are adapted to light-limited underground environments and most of them are dependent on mobile species such as bats to transport organic nutrients into these environments 19–21. Bats are keystone species in karst ecosystems and ideal cave conservation surrogates, delivering vital energy sources into caves as they regularly forage from outside ecosystems 22. Nevertheless, conservation attention towards cave-dwelling bats remains limited compared to other mammalian taxa. Thus, there is an urgent need for better data to develop effective conservation strategies for bats 13.
Effective conservation decision-making relies on the accuracy and precision of the data used to design present and future management strategies 5,7. Identifying priority caves for conservation requires an understanding of species diversity, endemism patterns, interactions with other organisms, and threats within and outside these systems 17,23. Additionally, while numerous organisations and collaborative efforts aim to compile databases of bat distributions, comprehensive and specific datasets for cave-dwelling bats, including their distributions and ecological traits, are currently lacking. Large databases for species distributions such as the Global Biodiversity Information Facility (GBIF) exist and openly provide distribution data for bats. However, due to the enormous amount of information within these databases, it is challenging to selectively evaluate data for specific ecosystems such as caves, and thus more specialist datasets are needed to facilitate appropriate habitat-based prioritisation.
To address this knowledge gap, we created DarkCideS 1.0 (https://darkcides.org/), a global database for bats in karsts and caves, to advance global bat cave vulnerability and conservation mapping initiatives. The creation of the dataset primarily aims to map and digitise the distribution of cave-dwelling bats to facilitate the assessment of their vulnerability to landscape threats. DarkCideS 1.0 represents a publicly available database of cave-dwelling bats across time and space, including their estimated population (e.g., counts), geographical distribution (latitude and longitude), ecological traits, levels of endemism, conservation status, and threatening processes. The purpose of the DarkCideS 1.0 initiative is to centralise and develop an open-access platform for information exchange among bat researchers and conservation biologists to advance the development of targeted conservation measures and macroecological studies (Fig. 1). Potential applications of the database include assessing species conservation status and extinction risks; understanding drivers of extinction, cave conditions, and landscape threats; accurately developing species distribution models; and determining long-term cave conservation priorities at regional to global scales.

Methods
The DarkCideS database was initially conceptualised and developed by KCT, JAG, and ACH as part of the "Global Bat Cave Vulnerability and Conservation Mapping Initiative" in 2014, and later with the "Mapping Karst Biodiversity in Yunnan" and the "Southeast Asian Atlas of Biodiversity" projects. The initiative includes developing tools and methods (e.g., the Bat Cave Vulnerability Index 14) and synthesis (e.g., the global bat cave vulnerability assessment 11) to identify conservation priorities and important bat caves in the tropics. Since 2019, the initiative has expanded, and potential collaborators and contributors were invited through scientific conferences (Association for Tropical Biology and Conservation 2018, International Bat Research Conference 2019), social media platforms, and personal correspondence. At present, the database has 36 collaborators from twenty countries on six continents with expertise and research interests in bat conservation. Four main datasets for all known cave-dwelling bats were built for the DarkCideS database version 1.0.
Datasets and compilation for species checklist. The first dataset contains taxonomic checklists for all extant cave-dwelling bat species extracted from the expert-based International Union for Conservation of Nature (IUCN) Red List database version 2020-1 (Table 1). We screened and included all bat species that were reported to use, roost in, or aggregate in "Caves", "Underground", and "Karsts" habitats in any part of their life histories. We also scanned major publicly available bat cave databases from expeditions such as "Bats in China" (http://www.bio.bris.ac.uk/research/bats/China/) and UNEP-EUROBATS (https://www.eurobats.org/) for European bats 24 for additional information and datasets. In addition, the first dataset contains species ecological traits, distribution range, and threatening processes (Table 1).
Information per species was pooled from the IUCN Red List version 2020-1 25. Species taxonomy was then curated and updated (e.g., synonyms or merged species) using the nomenclature from Simmons and Cirranello 12. The "checklist for global cave-dwelling bats" derived from the IUCN Red List includes 679 species. Meanwhile, the DarkCideS 1.0 dataset contains occurrence data for 402 species from 16 families, representing 59% of all cave-dwelling species 11 (Fig. 2). We found a marginally significant relationship between the species richness and proportion of threatened species between the IUCN-based global cave-dwelling bat and DarkCideS datasets (Kendall's τb = 0.60, P = 0.07). The highest completeness of sampled species is in the Neotropics (67.38%) and Indomalayan region (66.08%), and the greatest gaps are in Austral-Oceania (40.28%). Highest endemism was recorded in Austral-Oceania (58.62%) (χ² = 227.32, df = 5, P < 0.001) (Fig. 2a). The proportion of threatened species is highest in the Indomalayan region (16%) (χ² = 281.18, df = 5, P < 0.01) (Fig. 2a). Most bat families have a coverage of 30 to 60% of species, but four families had all cave-dwelling species in the DarkCideS database, and three smaller families had no species included (Fig. 2b).
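As an illustration of the coverage figure reported above, the following minimal Python sketch computes completeness as the share of checklist species that have at least one occurrence record. The function name is hypothetical and is not part of the DarkCideS datasets; only the two counts (402 and 679) come from the text.

```python
def completeness(sampled: int, total: int) -> float:
    """Percentage of checklist species with at least one occurrence record."""
    if total <= 0:
        raise ValueError("total must be a positive number of species")
    if not 0 <= sampled <= total:
        raise ValueError("sampled must lie between 0 and total")
    return 100.0 * sampled / total


# Counts reported in the text: 402 species with occurrences, 679 on the checklist.
overall = completeness(402, 679)
print(f"Overall completeness: {overall:.1f}%")
```

The same function could be applied per biogeographical realm or per family once the corresponding counts are extracted from the checklist.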
Habitat preference, distribution, ecological status, and traits. We classified species distribution by biogeographical realm (Indomalaya, Austral-Oceania, Afrotropical, Neotropical, Palearctic, and Nearctic) and terrestrial biomes following Olson et al. 26. We described species' major habitat breadth based on the IUCN Level 1 classification (https://www.iucnredlist.org/resources/habitat-classification-scheme): Caves, Forests, Savanna, Desert, Urban, Artificial, and Wetlands. Species' current conservation status (Data Deficient, Least Concern, Near Threatened, Vulnerable, Endangered, and Critically Endangered) and population trends (e.g., Unknown, Decreasing, Stable, Increasing) were categorised using standard IUCN Red List assessments. Using the same criteria, we categorised species endemism as geopolitically endemic (e.g., country-endemic, and non-endemic) when a species occurs only in a single country or state territory 27, and island endemism was classified as island-restricted or predominantly mainland 28. Country endemism was highest in the Eastern Hemisphere, led by Austral-Oceania (40%), followed by the Afrotropical (21%) and Indomalayan (16%) regions. However, the highest proportion of threatened species was in the Indomalayan region (43%) and the Neotropics (22%) (Fig. 2a). Furthermore, current geographical ranges were assembled from the Phylacine 1.2 database 28 based on IUCN species ranges. Three species traits were included: adult body mass (in grams), derived from Phylacine 1.2 28; generation length, from Pacifici et al. 29; and trophic group, for which we derived diet information from EltonTraits 1.0 30. We grouped species as frugi-nectarivorous for all species that forage on plant-based resources (e.g., fruits, leaves, and nectars).
Because very few species forage on smaller vertebrates (i.e., fish, birds, and rodents) and larger invertebrates, we classified them as carnivores along with insectivorous bats. Species that forage on both plant-based and animal-based resources were grouped as omnivores (Table 1).
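The trophic grouping described above can be sketched as a simple rule. This is only an illustration: the function name, the two aggregated inputs, and the rule that any non-zero share counts as use of a resource are assumptions, since the text does not state a threshold.

```python
def trophic_group(plant_pct: float, animal_pct: float) -> str:
    """Assign a trophic group from aggregated diet shares (hypothetical inputs).

    plant_pct:  share of diet from plant-based resources (fruits, leaves, nectar)
    animal_pct: share of diet from invertebrates and small vertebrates
    Assumption: any non-zero share counts as foraging on that resource.
    """
    if plant_pct < 0 or animal_pct < 0:
        raise ValueError("diet shares must be non-negative")
    if plant_pct > 0 and animal_pct > 0:
        return "omnivore"
    if plant_pct > 0:
        return "frugi-nectarivore"
    if animal_pct > 0:
        return "carnivore"  # includes insectivorous bats, as in the text
    return "unknown"


for name, plant, animal in [("fruit bat", 100, 0), ("insect-eater", 0, 100), ("mixed", 60, 40)]:
    print(name, "->", trophic_group(plant, animal))
```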
Species threatening process. We identified potential threats for each bat species listed in the checklist using the information from the IUCN Red List assessments (version 2020-1) in addition to threats highlighted in the literature. The IUCN Red List standardised its classification based on Salafsky et al. 31, but we reclassified the threatening process into three key categories: Direct, Indirect, and Natural (Table 1) based on the drivers of threat 10,14,32. Direct threats (T_dir) refer to the threats or risks that act directly on or within cave systems, with immediate and perceivable impacts on populations or behaviour of species. This category includes direct human impacts (e.g., persecution, eviction, and cave closures) and the use of caves for harvesting bats, tourism, religious visits, and mining (minerals or guano). Indirect threats (T_ind) refer to the threats outside cave systems or within cave proximity, of which the impacts to populations are secondary or non-immediate but otherwise detrimental. Examples include deforestation, agriculture, and urbanisation. Lastly, Natural threats (T_nat) refer to threats that are natural in origin, though their frequency may be impacted by human activities, and that may directly or indirectly impact populations, such as diseases (e.g., White-nose syndrome) and climate-driven risks (e.g., drought, extreme cold) (Table 1).
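The three-way reclassification can be illustrated with a small lookup table. The sketch below uses only the example threats named in the paragraph above; the dictionary, the function name, and the "Unclassified" fallback are hypothetical and do not reproduce the full mapping used for the database.

```python
from collections import defaultdict

# Example threats taken from the text, mapped to the three categories.
THREAT_CATEGORY = {
    "persecution": "Direct",
    "eviction": "Direct",
    "cave closure": "Direct",
    "bat harvesting": "Direct",
    "tourism": "Direct",
    "religious visits": "Direct",
    "mining": "Direct",
    "deforestation": "Indirect",
    "agriculture": "Indirect",
    "urbanisation": "Indirect",
    "white-nose syndrome": "Natural",
    "drought": "Natural",
    "extreme cold": "Natural",
}


def group_threats(threats):
    """Group a species' listed threats by category; unknown labels are kept apart."""
    grouped = defaultdict(list)
    for threat in threats:
        category = THREAT_CATEGORY.get(threat.strip().lower(), "Unclassified")
        grouped[category].append(threat)
    return dict(grouped)


example = group_threats(["Tourism", "deforestation", "White-nose syndrome"])
print(example)
```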
Bat cave georeferencing. The second dataset contains the bat cave geographical location (latitude/longitude) and recorded species (Table 2, Fig. 3a). We used the Web of Science and Google Scholar to search online literature, databases, and repositories for published information on cave-dwelling bats from 1990 to 2021. We used the following combination of keywords: (Bat* OR Chiroptera OR Chiroptera fauna*) AND (Diversity OR "Species richness" OR abundance OR distribution OR conservation OR ecology) AND (Cave* OR Cave-dwelling OR Cave-roosting OR underground* OR subterranean OR karst* OR Limestone). We also set up a Google Scholar alert ("create alert") to be notified whenever new related papers were published. The data mining process for version 1.0 ended in June 2021. Our search returned 753 papers. We also searched using the Baidu Research engine for Chinese literature and self-archived literature on ResearchGate to maximise search results. To ensure the precision of the datasets included in DarkCideS 1.0, we filtered all published literature to only include those papers or reports with complete species names and geographical records. We contacted corresponding authors with requests to provide us with geographical data when these were missing from their papers or supplementary materials. When we were unable to find the data and the corresponding author did not respond to our request, that "cave site" was excluded from the database. We converted all species and cave latitude and longitude into WGS 84 decimal degrees with five significant figures. The second dataset of DarkCideS 1.0 contains 6746 georeferenced occurrences for 402 species 11 from 2002 cave sites (Fig. 3a). Cave sites occur in all continents except Antarctica, with most of the data originating from tropical and temperate biomes (Fig. 3b). We have cave records from 46 countries, of which China and Brazil have the highest number of caves recorded (Fig. 3c).
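The coordinate conversion step can be sketched as follows, assuming the source coordinates were reported in degrees, minutes, and seconds with a hemisphere letter (an assumption; the text does not describe the input formats encountered). The function name is hypothetical, and the sketch does not handle datum transformations.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees.

    The result is rounded to five significant figures, as stated in the text.
    South and West hemispheres yield negative values.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    if hemisphere in ("S", "W"):
        value = -value
    return float(f"{value:.5g}")  # five significant figures


print(dms_to_decimal(25, 30, 36, "N"))
print(dms_to_decimal(100, 15, 0, "W"))
```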
Cave landscape features and vulnerabilities. The condition of surface ecosystems and the extent of threats are significant determinants of cave-dwelling bat diversity 11. Yet, standardising the vulnerability of caves and underground ecosystems from threats on a global scale is challenging 11,14. To address this, the surface ecosystem was mapped as a proxy to assay cave vulnerability to threats using remotely sensed landscape features. The third dataset included in the database contains the measured land-use and landscape features of the cave surroundings using the georeferenced data from the second dataset (Table 3; Fig. 4). The landscape feature measurements for the 2002 cave sites were selected following Tanalgo et al. 11. We included the estimated distance and measures of twelve landscape variables in the database. Landscape features included canopy cover height 33, tree density 34, distance to freshwater bodies 35, bare ground cover change 36, short vegetation cover change 36, and tall tree cover change 36. Vulnerability measures included distance to urban areas 36, distance to roads 37, mine density 38, night light 39, relative pesticide exposure 40, and population density 41,42. For distance variables, the "distance to feature" tool was used in ArcMap 10.3 and distances were mapped at a 1-km resolution.
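For readers without access to ArcMap, a rough stand-in for a distance variable is the great-circle distance from a cave to the nearest feature point. The sketch below is a substitute technique (haversine distance on point features), not the raster-based "distance to feature" tool used for the database, and all names in it are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_km(cave, features):
    """Distance from a cave (lat, lon) to the closest feature in a list of (lat, lon)."""
    if not features:
        raise ValueError("at least one feature point is required")
    return min(haversine_km(cave[0], cave[1], f[0], f[1]) for f in features)


# Invented coordinates, for illustration only.
nearest = distance_to_nearest_km((0.0, 0.0), [(0.0, 1.0), (0.0, 2.0)])
print(f"{nearest:.1f} km")
```

Point-to-point distances of this kind will differ from values read off a 1-km distance raster, so they should be treated as approximations.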
Cave bat parasites and hyperparasites. Parasites, while being among the most diverse modes of life, are often disregarded in conservation strategies 43. It is well established that parasites affect the stability of food webs and ecosystem health, but hyperparasites have thus far been severely understudied. To support future studies on host associations across multiple trophic levels and on the effects of climatic conditions and land-use changes, we included parasites and hyperparasites in the DarkCideS 1.0 database. The fourth dataset lists the parasitic bat flies and their Laboulbeniales fungal hyperparasites associated with cave bats. Data were collected from several sources, including our fieldwork data 36, Haelewaters et al. 44, and de Groot et al. 45. Bat fly taxonomy followed Dick and Graciolli 46 and Graciolli and Dick 47, and fungal taxonomy followed Index Fungorum 48. In addition to the conspicuous bat flies, bats are host to several other lineages of parasites: mites and ticks, lice, fleas, bugs, and earwigs 49,50. Consequently, the fourth dataset will be expanded in future versions of DarkCideS with data on these parasitic organisms. A recent call for global collaboration among bat scientists to generate multitrophic data of bats, bat flies, and fungi 50, along with the current DarkCideS 1.0 initiative, will contribute to a general understanding of how ecological and life-history traits are correlated with bat parasitism and how host associations may change under changing conditions.

Data Records
The complete database for global cave-dwelling bats was organised in four main datasets stored in separate spreadsheet files (.csv file format). Each dataset contains unique sequential name IDs that correspond to metadata, variables, and references. All datasets included in the database are available and open-access from the Figshare online repository 51 and through a public website (https://darkcides.org/). The resolution of the publicly available cave and species occurrences was reduced for the protection of caves and to prevent hunting and harvesting. Database users can request high-resolution data of georeferenced species occurrence and cave sites from the corresponding authors. When substantial new data become available, the datasets on Figshare will be updated.
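One common way to reduce the resolution of sensitive locations is to round coordinates before release. The sketch below illustrates that idea on invented sample rows; the column names, the sample values, and the choice of one decimal place are assumptions, as the text does not specify how the public coordinates were coarsened.

```python
import csv
import io

# Invented sample rows; column names are hypothetical, not the DarkCideS schema.
SAMPLE_CSV = """cave_id,species,latitude,longitude
C0001,Species A,21.91234,101.25678
C0002,Species B,-19.47321,-44.01987
"""


def coarsen(records, decimals=1):
    """Return copies of the records with coordinates rounded to `decimals` places."""
    masked_records = []
    for row in records:
        masked = dict(row)
        masked["latitude"] = round(float(row["latitude"]), decimals)
        masked["longitude"] = round(float(row["longitude"]), decimals)
        masked_records.append(masked)
    return masked_records


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
public_rows = coarsen(rows)
for row in public_rows:
    print(row["cave_id"], row["latitude"], row["longitude"])
```

Rounding to one decimal place corresponds to roughly 11 km in latitude, which hides the exact cave entrance while keeping records usable for regional analyses.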

Technical Validation
The data included in this database are mainly derived from public, expert-based databases, published material, and bat researchers, which helps ensure the accuracy of the included data. We provided the corresponding reference (when applicable) for each cave record for cross-referencing and data validation purposes. When published "cave datasets" were unclear or lacked detailed information, we communicated with the corresponding authors. The DarkCideS database aims to be a long-term biodiversity data exchange platform by including new data from fieldwork and assessments. Authors can upload their dataset containing species records, geographical information, and landscape threats on the web page. The corresponding authors will receive new data entries for validation before the entries are merged into the database.
We encourage continued contributions to the DarkCideS database as we aim to regularly update the entries for species checklists, traits, geographical locations of caves, and species occurrence data. For species ecological status (e.g., current conservation status, population trends, geopolitical endemism), we will update entries after every IUCN Red List assessment cycle. The database will be updated when new data are contributed and corrected when an error in the data entry is reported to any of the corresponding authors. New entries will be quality screened based on the criteria listed above before adding to the database (Fig. 5). Once an update is made, a release note will be published on the database website. When updating new versions of DarkCideS, we will continue to make available previous releases. Contributors will be included as co-authors when the next version of the database is published. Furthermore, as each cave has a unique ID, additional surveys of other taxa at the same locality can be integrated into the database, to provide a backbone for enhancing our understanding of cave biodiversity through time.

Usage Notes
All datasets included in DarkCideS are publicly available under a Creative Commons Attribution 4.0 International Public License (https://creativecommons.org/licenses/by/4.0/), where users and authors may freely use our datasets, with the condition that the sources are credited and acknowledged, the original license is linked, and any modifications and treatments to our data are indicated in the final work or material.
Although we aim to maximise spatial coverage with datasets from across the globe, we acknowledge that geographical biases inevitably exist 52. For example, we have multiple datasets from the Palearctic, Indomalayan, and Neotropical realms, whereas very little data originated from the Afrotropical region (see Fig. 3). We also encountered similar coverage bias in country-level data richness. For example, Indonesia is one of the most diverse countries for estimated cave-dwelling bat species richness 11, but a very small number of species were included in the current version of the database. The database is intended as a long-term data-sharing platform, and we hope to fill these gaps in the next versions of the database. Further data and better coverage will provide a better index for regional prioritisation in addition to further research on bat diversity patterns and threats.

Consortia authorship. The DarkCideS database is a continuous project. To promote global collaboration and equitability, all present and future members of the DarkCideS initiative and consortia (https://darkcides.org/our-team/) will be considered bona fide authors of the current and future versions of the database.

Code availability
No code was used to generate the data presented in this data paper.