Introduction

Subterranean environments are notoriously underexplored. It has been estimated that more than 80% of Australia’s subterranean fauna have yet to be discovered1. Subterranean environments can be found above and below the water table; examples include caves, cavities, aquifers and anchialine systems. Previous research demonstrates that subterranean environments exhibit a high diversity of (largely invertebrate) taxa that are adapted to a lack of light, to variable temperature, nutrient, salinity and dissolved oxygen levels and, in some cases, to water stratification2. Short-range endemic species are common, as highly fragmented environments pose barriers to gene flow, fostering evolutionary drift over time3. Commonly surveyed aquatic ‘stygofauna’, found within aquifers and anchialine systems, include fresh and saltwater fish, eels, gastropods, salamanders, flatworms, beetles, water mites and crustaceans such as amphipods, decapods, isopods, ostracods, copepods and syncarids4.

The use of genetic techniques in conjunction with traditional biospeleological sampling and morphological assessment can provide in-depth information on patterns of stygofauna diversity, discrete lineages, colonisation and speciation histories. Biospeleological studies that have incorporated genetic methodology have largely focused on single-source “barcoding” or genome building, where individual specimens are collected and targeted for sequencing5,6,7,8,9,10. However, single-source sequencing remains reliant on capture-based sampling, which is intrinsically linked to the measurement of biodiversity. Whilst capture-based sampling is fundamental to biospeleological research, it can be hindered in subterranean environments with difficult-to-access underground voids and networks. Biospeleological research could therefore benefit from a non-invasive bioassessment tool that complements capture-based sampling in order to gauge stygofauna diversity and distribution.

Environmental DNA (eDNA) metabarcoding has been widely developed within the last few years for use in marine, freshwater and terrestrial environments and has demonstrated its power as an efficient, non-invasive and highly sensitive taxa detection tool11,12,13,14. However, its application in subterranean environments has not yet been thoroughly explored. Preliminary research has demonstrated the viability of eDNA metabarcoding in detecting multi-species compositions from underground water samples15,16,17,18,19,20, although these studies have largely focused on microbial communities. Further research is needed to assess the applicability of eDNA metabarcoding for eukaryotic stygofauna detections in subterranean aquatic ecosystems, where limited surveying, and therefore limited reference barcoding of biota, means that detections may be hindered by incomplete reference databases. There is great potential to expand the use of eDNA metabarcoding, not only for the detection and monitoring of both described and undescribed eukaryotes in subterranean environments, but also to investigate community assemblages across trophic levels and haloclines, the evolution and population diversity of stygofauna, and the interconnectivity of underground ecosystems.

Christmas Island (CI; 10.4475° S, 105.6904° E), one of Australia’s external Indian Ocean territories, forms the pinnacle of an isolated seamount that re-emerged above sea level approximately 5.66–4.49 Ma21. A highly developed karst landscape has formed out of imbedded carbonates and approximately 30 accessible caves have been documented, ranging from plateau and freshwater stream caves to fissure caves, collapsed caves, sea caves, and coastal caves with ocean access points22. Rainfall percolates through limestone fractures and solution holes and is largely discharged by coastal and offshore springs; however, there are also major inland springs at Waterfall Spring, Ross Hill Gardens and The Dales22. Cave fauna are considered to form a significant component of this island’s unique ecosystem23 and at present comprise at least 17 out of a total of 253 endemic species documented on CI24. However, the diversity and distribution of stygofauna, in particular anchialine fauna, across CI’s extensive underground networks requires further research25. This is of particular importance given the use of the karstic landscape for phosphate mining and as a water supply for local households and businesses. The subterranean interconnectivity of the majority of the CI caves is unknown, but it is suspected that caves in close proximity (such as Whip and Runaway Caves, which incidentally also share many species of macrofauna) are connected underground26. The need for a comprehensive biodiversity audit of CI’s extensive subterranean habitats makes it an ideal location to conduct a broad eDNA metabarcoding survey of eukaryotic macro-stygofauna. 
Our main objectives were to: (1) identify putative new occurrence records and extend distribution records for CI’s subterranean stygofauna using eDNA metabarcoding; (2) assess variation in community composition of cave and spring sites; and (3) investigate potential underground interconnectivity across CI by combining biotic and abiotic (environmental parameter) data. Overall, we seek to evaluate the applicability of eDNA metabarcoding as a non-invasive, highly sensitive tool for biospeleological assessment.

Methods

Field sampling

Six one-litre water replicates and one 50-ml sediment sample were collected from 23 cave and spring sites across CI (Figs. 1 and 2; Table 1) during the late dry season in October 2018, totalling 159 samples across a 110 km² area. Approximately two cave and spring sites were sampled per day over a two-week period. For public safety and to protect the integrity of the caves, the exact GPS coordinates for our sampled sites are not published here; however, they can be requested from Parks Australia (Christmas Island). Sediment could not be taken from the Jedda and Jane-up Cave sites because conduit streams through these plateau caves provide the town water supply and hence access is restricted. However, water samples were taken with permission from the WaterCorp testing taps at these sites. Sediment was sampled at all remaining sites by collecting a top layer of sediment (approximately 2 cm depth) from underwater sediment sources. Water samples were collected at the surface of each site using bleach-sterilised Nalgene bottles and then immediately stored on ice. Each sample was individually filtered across Pall 0.2 μm Supor polyethersulfone membranes using a Pall Sentino Microbiology pump (Pall Corporation, Port Washington, USA) within two hours of collection. A bleach solution (approximately 10% household bleach) was used to clean filtration equipment between samples, and a one-litre sample of this solution was filtered at the end of each sampling day to serve as a filtration control throughout laboratory processing. Filter membranes and sediment samples were immediately frozen and stored at −20 °C before and after transportation to a quarantine facility within the Trace and Environmental DNA (TrEnD) Laboratory in Perth, Western Australia.
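The sample total follows directly from the sampling design described above. As a quick arithmetic check (a minimal sketch; the variable names are chosen here for illustration and do not come from the study's own scripts):

```python
# Sampling design as described in the text.
n_sites = 23                 # cave and spring sites
water_per_site = 6           # one-litre water replicates per site
sediment_per_site = 1        # 50-ml sediment sample per site
sites_without_sediment = 2   # Jedda and Jane-up Caves (restricted access)

water_samples = n_sites * water_per_site
sediment_samples = (n_sites - sites_without_sediment) * sediment_per_site
total_samples = water_samples + sediment_samples

print(water_samples, sediment_samples, total_samples)  # 138 21 159
```

The 159 samples therefore comprise 138 water replicates and 21 sediment samples; filtration controls and extraction blanks are counted separately.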

Figure 1

Location of eDNA sampling sites on Christmas Island. Orange spheres give the approximate number and location of sampling sites; topographical symbols indicate whether samples from each respective site were taken from within a cave system or at a surface spring. Hillshade relief and national park boundary data was sourced from Geoscience Australia53. Map was produced in ArcGIS Desktop 10.654.

Figure 2

Imagery from eDNA sampling localities across Christmas Island. (a) Water and sediment sampling in The Grotto cave site. (b) The Dales wetland area where Hugh Dales Waterfall and CI-079 spring sites were sampled. The (c) ocean entrance and (d) a chamber passage through the extensive Lost Lake Cave system. Photos: Danny Wilkinson and Weidi Koh http://wasg.org.au/.

Table 1 eDNA site information on Christmas Island.

A range of environmental parameters were also taken at the time of sampling from each respective site (Table S1). Measurements of water acidity (pH), temperature (°C), conductivity (mS) and salinity (ppt) were collected using a Hanna HI98129 tester (Hanna Instruments; Victoria, Australia). Air saturation and dissolved oxygen (mg/L) measurements were collected using an OxyGuard Handy Polaris 2 DO Meter (OxyGuard; Farum, Denmark).

Laboratory processing

DNA was extracted from half of each filter membrane and 250 mg of each sediment sample using a DNeasy PowerLyzer PowerSoil Kit (Qiagen; Venlo, the Netherlands) following the manufacturer’s instructions. This was completed within two weeks post-collection. Remaining half filters and sediment have been stored as an extraction back-up. Filtration controls and extraction blanks, containing no sample, were extracted and processed alongside all samples in order to detect any cross-contamination introduced from the laboratory environment. DNA was amplified using three previously published PCR assays27,28,29 to largely target bony fish, molluscs and arthropods (such as crustaceans and insects) from our mixed environmental samples (see Table 2 for details). Quantitative PCR (qPCR) amplification was performed using fusion tagged primers that consist of an Illumina sequencing adaptor, a unique multiplexing index (8 bp in length) and a primer sequence from each respective assay. All qPCR reactions were prepared in an ultra-clean trace DNA facility and thermocycling was carried out in a physically separated laboratory (see Supplementary Information Sect. 1 for qPCR reagents and conditions).

Table 2 Metabarcoding assay information for subterranean eDNA surveys on Christmas Island.

Each eDNA sample was amplified in duplicate and pooled into larger amplicon libraries at equimolar ratios based on qPCR ΔRn values. Each library was size-selected (retaining amplicons between 160–450 bp for the 16S assays, and 200–600 bp for the 18S assay) using a Pippin Prep (Sage Science, Beverly, USA), and was then purified using the Qiaquick PCR Purification Kit (Qiagen, Venlo, the Netherlands) following manufacturer instructions. Final libraries were quantified using a Qubit 4.0 Fluorometer (Invitrogen, Carlsbad, USA) and if necessary were diluted to 2 nM prior to sequencing. Libraries were sequenced on either a 300 cycle (for unidirectional sequencing of the 16S amplicons) or 500 cycle (for paired-end sequencing of the 18S amplicons) MiSeq V2 Standard Flow Cell on an Illumina MiSeq platform (Illumina, San Diego, USA), housed in the TrEnD Laboratory at Curtin University, Western Australia.

Bioinformatics

Unidirectional and unmerged paired-end sequencing reads were demultiplexed using OBITools (v1.2.9)30 and the insect package31 in RStudio (v1.1.423)32, respectively. Demultiplexed data were then quality filtered using the DADA2 pipeline33 in RStudio (see Supplementary Information Sect. 2 for bioinformatic parameter details). The resulting amplicon sequence variants (ASVs) for each assay were then queried against NCBI’s GenBank nucleotide database34 (accessed in 2019) using BLASTn and also against a curated 16S rDNA Western Australian fish database27 via Zeus, an SGI cluster based at the Pawsey Supercomputing Centre in Kensington, Western Australia. Linnaean taxonomic assignments of ASVs were curated using a lowest common ancestor approach35 (https://github.com/mahsa-mousavi/eDNAFlow/tree/master/LCA_taxonomyAssignment_scripts; see Supplementary Information Sect. 2); consolidated taxa assignments were then additionally categorised based on associated environment and biogeographic distribution data obtained from CI subterranean biodiversity surveys25,36 and the World Register of Marine Species (WoRMS)37. Putative new occurrence records were additionally assessed for whether all congeneric taxa have been barcoded for the targeted gene region. Any ASVs that were detected in filtration and/or extraction blanks were entirely removed; remaining ASVs that shared the exact Linnaean taxonomy assignment were then merged using the phyloseq ‘tax_glom’ function38 in RStudio. This produced a taxonomy-based matrix. Read abundance was converted to presence/absence data in PRIMER v739 for subsequent statistical analyses.
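The last three steps of this workflow (removal of ASVs detected in controls, merging of ASVs by taxonomic assignment, and conversion to presence/absence) can be illustrated with a short sketch. This is not the pipeline used in the study, which relied on phyloseq and PRIMER v7; the function name, data layout and toy values below are hypothetical and serve only to show the logic.

```python
from collections import defaultdict


def build_presence_absence(asv_reads, asv_taxonomy, control_samples):
    """Return {taxon: {sample: 0 or 1}} after removing control-detected ASVs.

    asv_reads:       {asv_id: {sample_id: read_count}}
    asv_taxonomy:    {asv_id: taxon_label}
    control_samples: set of sample ids for filtration/extraction blanks
    """
    # 1. Flag any ASV with at least one read in a control sample.
    contaminated = {
        asv for asv, counts in asv_reads.items()
        if any(counts.get(c, 0) > 0 for c in control_samples)
    }

    # 2. Merge the remaining ASVs that share the same taxonomic label.
    merged = defaultdict(lambda: defaultdict(int))
    for asv, counts in asv_reads.items():
        if asv in contaminated:
            continue
        taxon = asv_taxonomy[asv]
        for sample, n in counts.items():
            if sample not in control_samples:
                merged[taxon][sample] += n

    # 3. Convert summed read abundance to presence/absence.
    return {
        taxon: {sample: int(n > 0) for sample, n in counts.items()}
        for taxon, counts in merged.items()
    }


# Toy data (hypothetical ASV ids, sample ids and read counts).
reads = {
    "asv1": {"site1_w1": 120, "site2_w1": 0},
    "asv2": {"site1_w1": 0, "site2_w1": 45},
    "asv3": {"site1_w1": 30, "blank1": 5},  # also present in a blank
}
taxonomy = {"asv1": "Eleotris", "asv2": "Eleotris", "asv3": "Homo sapiens"}

table = build_presence_absence(reads, taxonomy, {"blank1"})
print(table)
```

In this toy example the ASV seen in the blank is discarded from every sample, and the two remaining ASVs collapse into a single taxon that is scored as present at both sites.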

Statistical analyses

Variation in community composition was first tested between the water and sediment samples to determine whether the type and number of taxa detected differed between the two sample types. The presence-absence data of taxa at each site were converted to a Jaccard similarity matrix and tested for the effect of sample type and site using a two-way crossed PERMANOVA in the PERMANOVA+ add-on40 of PRIMER v7. Site variation by sample type was visualised by principal coordinates analysis (PCO) in PRIMER v7. Species accumulation per replicate (of the two sample types) and a comparison of the number of taxa per sample replicate were graphed using the vegan ‘specaccum’ function41 and ggplot42, respectively, in RStudio. Additionally, a similarity percentage analysis (SIMPER) was conducted in PRIMER v7 to identify the taxa contributing to pairwise dissimilarity between the two sample types.
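For readers unfamiliar with the Jaccard measure, the sketch below shows how a similarity matrix can be derived from presence/absence data. It is an illustration only: the analyses in this study were run in PRIMER v7, the sample names and taxon lists are hypothetical, and the value returned for two empty samples is a convention assumed here.

```python
def jaccard_similarity(taxa_a, taxa_b):
    """Shared taxa divided by the total number of taxa found in either sample."""
    a, b = set(taxa_a), set(taxa_b)
    union = a | b
    if not union:
        return 1.0  # assumed convention for two samples with no detections
    return len(a & b) / len(union)


def similarity_matrix(detections):
    """Pairwise Jaccard similarity for {sample_name: set_of_taxa}."""
    names = sorted(detections)
    return {
        (p, q): jaccard_similarity(detections[p], detections[q])
        for p in names
        for q in names
    }


# Hypothetical detections for one water and one sediment sample.
detections = {
    "water": {"Eleotris", "Caranx ignobilis", "Willowsia"},
    "sediment": {"Geograpsus crinipes", "Willowsia"},
}

matrix = similarity_matrix(detections)
print(matrix[("water", "sediment")])  # 0.25 (1 shared taxon out of 4)
```

Because the measure uses only presence/absence, it is insensitive to read abundance, which suits eDNA data where read counts are not a reliable proxy for organism abundance.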

Sampling replicates of both sample types were then merged per site and converted to a Jaccard similarity matrix to examine the overall variation in community composition between sites. Taxa accumulation per site was graphed using vegan in RStudio. Distance-based linear model (DistLM) analyses were conducted in the PERMANOVA+ add-on using normalised measures of acidity, temperature, salinity and dissolved oxygen, in addition to latitude/longitude, as environmental and spatial predictor variables. These analyses were initially conducted across all site types (cave and spring), and then within cave and spring sites separately. The environmental parameters of conductivity and air saturation were omitted due to collinearity with salinity and dissolved oxygen, respectively. Site variation was visualised by PCO using the stats function ‘cmdscale’ in RStudio; significant predictor variables corresponding to the DistLM analyses were overlaid using the vegan ‘ordisurf’ function41. A SIMPER analysis was conducted to identify the taxa contributing to pairwise dissimilarity between site types (cave/spring) and between salinity groups. Original salinity readings (ppt) were categorised into the following salinity groups: freshwater ≤ 0.49 ppt, oligohaline 0.5–4.9 ppt, mesohaline 5.0–17.9 ppt, polyhaline 18.0–29.9 ppt, and euhaline (seawater) ≥ 30.0 ppt43 (see Table 1 for site salinity groupings). Hierarchical clustering, using group-averaging, and SIMPROF analyses were applied to both the community composition (biotic) and environmental parameter (abiotic; including latitude/longitude) datasets in PRIMER v7, to identify groupings of sites that may potentially reflect underground interconnectivity.
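The salinity grouping described above amounts to a simple binning rule. A minimal sketch follows; the function name is hypothetical, and because the published bins are stated to one decimal place, the treatment of intermediate values (for example 0.495 ppt) is an assumption made here.

```python
def salinity_group(ppt):
    """Assign a salinity reading (ppt) to the groups used in this study."""
    if ppt < 0:
        raise ValueError("salinity cannot be negative")
    if ppt < 0.5:
        return "freshwater"    # <= 0.49 ppt
    if ppt < 5.0:
        return "oligohaline"   # 0.5-4.9 ppt
    if ppt < 18.0:
        return "mesohaline"    # 5.0-17.9 ppt
    if ppt < 30.0:
        return "polyhaline"    # 18.0-29.9 ppt
    return "euhaline"          # >= 30.0 ppt (seawater)


for reading in (0.2, 0.4, 5.0, 26.0, 30.0):
    print(reading, salinity_group(reading))
```

Using half-open intervals avoids gaps between groups if a meter reports more decimal places than the published bin edges.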

Results

Sampling and sequencing statistics

The three metabarcoding assays yielded a total of 35,698,221 sequencing reads across 159 samples. The mean number of filtered sequences (post-quality, denoising and chimera filtering) was 74,014 ± 73,211 per replicate sample for the 16S Fish (short) assay; 66,152 ± 115,417 per replicate sample for the 16S Crustacean assay; and 22,968 ± 16,704 per replicate sample for the 18S Universal assay (Tables S2, S3). The 16S Crustacean assay yielded unbalanced read numbers (post-quality filtering) between some replicates/sites, which, given equimolar pooling prior to sequencing, is presumed to reflect low-template crustacean eDNA at specific subterranean sites (Table S3).

ASVs that were detected in filtration and extraction blanks and/or are common laboratory contaminants were omitted from all samples and subsequent analyses; this included ASVs for minnows (genus: Phoxinus), a branching bryozoan (Fredericella sultana), human (Homo sapiens), junglefowl (Gallus gallus), cat (Felis catus), pig (Sus scrofa), cattle (Bos taurus) and turkey (Meleagris gallopavo). We also omitted all ASVs for taxa outside of our study scope of subterranean macrofauna; this included any ASV in the domain Bacteria, hairybacks (phylum: Gastrotricha), the kingdom Fungi, ciliates (phylum: Ciliophora), nematodes (phylum: Nematoda), microscopic flatworms (phylum: Platyhelminthes), plants (clade: Streptophyta) and algae (phylum: Cryptophyta).

Taxa accumulation curves based on the addition of each sampling replicate per site (Figure S1) indicated that six 1 L water replicates (chosen a priori) were not sufficient to maximise the observed taxonomic richness at each site. On fitting polynomial curves to the taxa accumulation curves, it was extrapolated that on average 18.6 ± 20.4 one-litre water replicates would be required to maximise observed taxa richness at each site. The addition of a sediment replicate also detected taxa beyond those detected with water. Therefore, more taxa are likely to be detected if further water and sediment replicates are examined per site. Variation in the composition of taxa detected between the water and sediment sample types was highly significant (P < 0.001, df = 1; Table S4, Figure S2). A single water sample detected on average a higher number of taxa than a single sediment sample (Figure S3); however, this difference was not statistically significant. A SIMPER analysis of pairwise dissimilarity between the sample types indicated that the water samples were able to detect a large proportion of the overall detected taxa (Table S5); however, sediment provided a greater detection rate for yellow nipper crab (Geograpsus crinipes) and whiteleg shrimp (Penaeus vannamei).
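The idea behind a taxa accumulation curve can be sketched as follows: replicates are added in random order, the cumulative number of distinct taxa is recorded after each addition, and the result is averaged over many random orderings. This is a simplified illustration rather than a reimplementation of the vegan routine used in the study; the replicate contents, the number of permutations and the random seed are all hypothetical.

```python
import random


def accumulation_curve(replicates, n_permutations=100, seed=0):
    """Mean cumulative taxa richness as replicates are added in random order."""
    rng = random.Random(seed)
    totals = [0] * len(replicates)
    for _ in range(n_permutations):
        order = list(replicates)
        rng.shuffle(order)
        seen = set()
        for i, taxa in enumerate(order):
            seen |= set(taxa)
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


# Six hypothetical water replicates from a single site.
replicates = [
    {"Eleotris", "Willowsia"},
    {"Willowsia"},
    {"Eleotris", "Naididae"},
    {"Caranx ignobilis"},
    {"Willowsia", "Naididae"},
    {"Darwinula stevensoni"},
]

curve = accumulation_curve(replicates)
print(curve)
```

A curve that is still rising at the final replicate, as in this toy example where the last additions continue to contribute new taxa, is the pattern interpreted in the text as insufficient replication.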

Overall diversity

A total of 25 taxa (1.5 ± 1.0 ASVs per taxon) were detected with the 16S Fish (short) assay, 15 taxa (5.1 ± 7.8 ASVs per taxon) with the 16S Crustacean assay, and 77 taxa (2.0 ± 2.5 ASVs per taxon) with the 18S Universal assay (Table S6). Overall, the three metabarcoding assays yielded 115 identifiable taxa, representing 71 families within 60 orders of the phyla Chordata, Cnidaria, Porifera, Arthropoda, Mollusca, Annelida and Bryozoa (Fig. 3, Table S6). The largest share of these taxonomic assignments was resolved to species level (37.4%), followed by order level (22.6%), genus level (20.8%), family level (17.4%) and class level only (1.7%). The detected taxa were found to be largely associated with marine environments (53.9%), followed by terrestrial (37.4%), freshwater (20.9%) and brackish (14.8%) environments (Table S6). Of these taxa, 64.3% are circumglobal, 6.1% are distributed across the Indo-West Pacific and 5.2% more broadly across the Indo-Pacific, with smaller distributions in Africa, Asia and the eastern Indian Ocean. Taxa accumulation based on the addition of cave/spring sites did not plateau, indicating that more taxa are likely to be detected if further sites are examined (Figure S4).

Figure 3

An ordinal-level dendrogram of all chordate, cnidarian, poriferan, arthropod, mollusc, annelid and bryozoan taxa detected by multi-assay metabarcoding of 159 eDNA samples collected across Christmas Island.

Thirteen bony fish taxa (class: Actinopterygii) were detected (three at family level only, three at genus level only, and seven at species level) from 11 families within eight orders (Table S6). The assemblage was composed predominantly of marine fish detected in the anchialine caves; this included snooks (family: Centropomidae), giant trevally (Caranx ignobilis), flying fish (family: Exocoetidae), oriental trumpeter whiting (Sillago aeolus), black triggerfish (Melichthys niger) and halfbeak (Oxyporhamphus micropterus). Three notable detections in the freshwater caves were Indonesian shortfin eel (Anguilla bicolor bicolor), gudgeon (genus: Eleotris) and carp/minnows (family: Cyprinidae); see Table 3 for more information on taxa of conservation, biodiversity and biosecurity importance. We also report three putative new fish occurrence records: Cyprinidae, Cottus and Gobio gobio. However, the last of these cannot be verified at species level, as not all congeneric taxa for Gobio are represented by a reference barcode in the database. Therefore, we cannot rule out that our assignment of Gobio gobio may represent a closely related taxon. See Table S6 for all putative new occurrence records, their percent identity and whether all congeneric taxa of the respective new occurrence records have been barcoded.

Table 3 Taxa of conservation, biodiversity and biosecurity importance.

Forty-seven arthropods (phylum: Arthropoda) were detected (10 at order level only, 11 at family level only, 10 at genus level only and 16 at species level; Table S6). The assemblage was composed largely of insects, in addition to arachnids, crustaceans, collembola and millipedes. The majority of the arthropods detected in this study are known to be terrestrial ground-dwelling taxa. We detected a range of land crabs that inhabit the forest floor on CI but were also observed during this study within the entrances of cave systems; these included the orange-legged crab (Tuerkayana magnum), the yellow nipper (Geograpsus crinipes), the little nipper (Geograpsus grayi) and the purple crab (Gecarcoidea lalandii). Of the aquatic taxa, we detected freshwater brine shrimp (Artemia franciscana), copepod (Nitokra), shrimp (family: Atyidae), ostracods (Darwinula stevensoni and Sclerochilus) and whiteleg shrimp (Penaeus vannamei; see Table 3).

Community composition and clustering

A distance-based linear model (DistLM) analysis of community composition across all sites indicated that site type (cave/spring) explained the highest proportion of fitted variance (9%), followed by salinity (6.1%; Table 4, Table S7). This is visualised in the PCO in Fig. 4a. Taxa richness per site did not differ significantly between cave and spring sites (P = 0.840, df = 1, Table S8, Figure S5), indicating that compositional variation between the site types is not driven by unequal taxa richness. Similarity percentage analysis (SIMPER) was used to identify which subterranean taxa contributed most to pairwise dissimilarity between the site types and between salinity groupings (Tables S9 and S10). In examining cave sites only, we found that compositional dissimilarity is driven by a longitudinal transition, in addition to dissolved oxygen. Cumulatively, these two significant predictor variables explain 21.7% of the total fitted variance between cave assemblages (Fig. 4b, Table 4, Table S7). Within spring sites only, the DistLM identified a latitudinal and a dissolved oxygen effect on compositional dissimilarity; however, these were not significant (P = 0.103 and P = 0.298 respectively, Table 4, Fig. 4c, Table S7). Hierarchical clustering based on the community composition (Jaccard similarity) of all sites revealed a number of discrete groupings; eight were significantly separated by SIMPROF (P < 0.05, Fig. 5). For the abiotic environmental data (Euclidean distance), SIMPROF identified three groupings that were significantly different from each other (P < 0.05, Fig. 6). A high possibility of local interconnection was attributed to three cave and spring groups that exhibited both biotic and abiotic clustering. These sites were Whip Cave and The Grotto, Jones Spring and Waterfall Spring, and Lost Lake Cave site 1 and site 2.

Table 4 Summary table of the distance-based linear model (DistLM) analyses for subterranean fauna.

Figure 4

Principal coordinates analysis (PCO) of community composition in (a) all sites, (b) cave sites only and (c) spring sites only. Latitudinal, longitudinal, salinity and dissolved oxygen gradients are overlaid if identified as a predictor variable in corresponding DistLM analyses. The proportion of variation explained by each axis is shown on the axis labels.

Figure 5

Cluster analysis of CI community composition similarity. Site composition comprises all assigned taxa resulting from the three metabarcoding assays. Solid lines indicate groups that the SIMPROF analysis identified as significantly different from each other (P < 0.05).

Figure 6

Cluster analysis of CI abiotic environmental similarity. Environmental data incorporated into this analysis included acidity, temperature, salinity, dissolved oxygen, latitude and longitude readings taken from each site at the time of sample collection. Solid lines indicate groups that the SIMPROF analysis identified were significantly different from each other (P < 0.05).

Discussion

Subterranean diversity

Our multi-assay eDNA metabarcoding approach successfully detected a wide range of chordates, cnidarians, poriferans, arthropods, molluscs, annelids and bryozoans from CI’s subterranean habitats. Although we targeted CI’s aquatic stygofauna, our water and sediment samples also produced detections of troglofauna (terrestrial cave fauna) that have likely shed DNA into the water below. This demonstrates that water and sediment can be used to detect troglofauna. However, other sample types (e.g. soil samples) are likely to provide a greater detection rate for subterranean terrestrial species.

It should also be noted that there was some bycatch of sub-surface terrestrial taxa, such as yellow crazy ant (Anoplolepis gracilipes), and ocean-dwelling marine taxa, such as giant trevally (Caranx ignobilis) and brown tubular sponge (Agelas schmidti). The detection of sub-surface terrestrial taxa was not surprising given that rainwater percolates through the karst landscape, potentially carrying organisms or traces of terrestrial DNA into the system. The detection of ocean-dwelling marine taxa may be explained by tidal influences on coastal caves. We nevertheless retained all taxa in our multivariate site analyses, as it can be difficult to determine whether non-stygobiotic or non-troglobiotic taxa were detected via shared water intrusion through karstic voids or are legitimately part of the subterranean community composition (through potential local adaptations) on CI.

Notable subterranean detections included a putative new occurrence record of Collembola (genus: Willowsia) across five sites, which potentially resolves the classification of unidentified Collembola specimens previously collected in Runaway Cave, Jane-up Cave, The Grotto, Grants Well, 19th Hole and Whip Cave44, although a specimen of the genus Cyphoderopsis has also been identified from an unreferenced cave on the island45. Likewise, the detection of prawn (genus: Litopenaeus) at three cave sites may also resolve unidentified specimens collected on CI of the encompassing family Penaeidae46. An extension in the distribution records of Indonesian shortfin eel (Anguilla bicolor bicolor) and gudgeon (genus: Eleotris) across CI’s subterranean habitats aids management purposes, particularly as the former is an elusive species that is rarely seen and only reported in two locations36, whilst the latter may represent a highly unique lineage exhibiting cave adaptations such as a pale body colour and enlarged pectoral and caudal fins36,44.

The overall assemblage also included a total of 21 putative new occurrence records; however, their verification requires additional reference barcodes of subterranean fauna in order to rule out incorrect assignments to a closely related (not yet barcoded) taxon. Subterranean environments, and by extension their fauna, are notoriously under-surveyed. As such, only a small number of vouchered specimens have been barcoded for commonly targeted gene regions (i.e. COI, 16S and 12S rRNA). Sexes and life stages of a single species that have been classified as different species can also complicate the picture when assembling reference barcode material. One further complication is the high likelihood that vouchered subterranean specimens have been preserved using formalin, such as the two specimens of the rare cave-dwelling flashlight fish (Photoblepharon palpebratum) collected from Thundercliff Cave on CI (J. DiBattista, personal communication, January 2019). The preservative unfortunately fragments DNA, modifies bases and creates crosslinks within the DNA47, making it challenging to extract and piece together reference sequences. In addition, subterranean environments are highly fragmented habitats, facilitating the evolution of short-range endemic species3. These unique lineages require representation in reference databases in order to make robust (species-level) assignments. However, despite an incomplete reference database, this study has demonstrated that eDNA metabarcoding can still reveal a wide diversity of subterranean taxon detections, including some at species level. Additionally, the detection of unknown species using eDNA (such as the 22.6% of assignments that could not be resolved beyond order level) can be used to direct traditional sampling efforts for specimen acquisition, taxonomic classification and barcoding.

Overall, this eDNA metabarcoding study produced a high detection rate (115 taxa from 60 orders) compared with previous CI subterranean surveys that were conducted within a similar sampling time frame25,36,48. For example, a submarine and anchialine cave survey using baited traps and visual (SCUBA) surveillance identified a total of 54 species across 11 coastal CI cave sites within a five-week period from 2010 to 201236. A three-week survey of CI subterranean environments in 1998 using visual searching, trapping, haul netting and fixed nets identified 13 aquatic and 17 terrestrial taxa48. Lastly, an extensive expedition in 2006 reported difficulties in obtaining specimens of the ostracod Humphreysella and the anchialine shrimp Procaris, despite a three-week survey of the original collection site25. Environmental DNA metabarcoding may therefore offer a complementary approach to capture-based sampling by detecting elusive subterranean species and providing comprehensive biospeleological assessments.

Composition and connectivity

The cave and spring systems on CI are distinct and host different subterranean community assemblages. Given that spring water naturally flows through underground and near-surface conduits, we expected spring sites to contain subterranean and potentially surface taxa. Therefore, we anticipated additional taxa in spring sites compared to those detected in the cave sites. However, variation in taxa richness between cave and spring sites was not significant. The spring sites typically produced more sub-surface and terrestrial taxa, whilst the cave sites produced subterranean, largely aquatic, taxa. For example, spring sites were typified by freshwater ostracods (family: Darwinulidae, including Darwinula stevensoni), land crabs (Discoplax magna, Gecarcoidea lalandii and Geograpsus crinipes), mites (order: Sarcoptiformes) and freshwater jellyfish (Craspedacusta sowerbii), whilst the cave sites were characterised by slender springtails (Willowsia), demosponges (order: Haplosclerida), whiteleg shrimp (Penaeus vannamei), and carp/minnows (family: Cyprinidae).

Variation in salinity between the freshwater springs (0.2–0.4 ppt) and the combination of freshwater, brackish and saltwater cave sites (0.3–26.0 ppt) also had a strong influence on the overall composition. Distinct fauna were found to be associated with each salinity level. For example, freshwater and oligohaline sites (0–4.9 ppt) were characterised by freshwater ostracods, land crabs, gudgeon (Eleotris) and clitellate oligochaete worms (family: Naididae); mesohaline sites (5.0–17.9 ppt) by brine shrimp (Artemia franciscana), sea sponge (Iophon) and giant trevally (Caranx ignobilis); and lastly the polyhaline site (18.0–29.9 ppt; Thundercliff Cave only) by demosponge (Callyspongia), black triggerfish (Melichthys niger) and brown tubular sponge (Agelas schmidti).

In examining the variation within cave and spring sites separately, the effect of salinity is no longer significant, as there is less variation in salinity within each site type. Cave sites were then found to vary along a longitudinal transition across the island (indicating that site dissimilarity increases with distance), in addition to a dissolved oxygen influence. Surface water at most of the cave sites was well oxygenated (ranging from 4.0 to 5.7 mg/L), although Freshwater Cave (4.0 mg/L) exhibited a lower concentration than previous reports on the island (Humphreys & Eberhard, 2001). Spring sites also varied along a distance-based (albeit latitudinal) and dissolved oxygen transition across the island, although this was not statistically significant. Notably, however, Freshwater Spring exhibited a low dissolved oxygen level of 2.5 mg/L, well below that of any other sampled site. Despite this low oxygen level, which can induce hypoxia in some freshwater fish species49, two bony fish taxa (genus: Eleotris and family: Leiognathidae) were detected.

Potential underground connectivity was assessed across biotic (community composition) and abiotic (environmental parameter) hierarchical cluster analyses. Sites that exhibit both biotic and abiotic clustering, and are therefore presumed to have a high possibility of connection, include Whip Cave and The Grotto, Jones Spring and Waterfall Spring, and Lost Lake Cave Site 1 and Site 2. Whip Cave, an anchialine cave, and The Grotto, a coastal marine cave with strong sea currents, are accessed 80 m apart. They exhibit highly similar environmental parameters and community composition, and were both noted onsite to be tidally influenced. A possible hydrological connection between Whip Cave and The Grotto has been previously reported22,44, in addition to a connection with Runaway Cave44. However, we were unable to sample Runaway Cave as part of this project. Our data support the premise that Whip Cave and The Grotto are connected.

There are no previous reports of connectivity between Jones Spring and Waterfall Spring; however, given that they occur in close proximity (1.07 km apart), it is possible that they are fed by shared flow systems. Freshwater Spring is situated approximately halfway between Jones Spring and Waterfall Spring and was therefore expected to cluster with these spring sites. Whilst Freshwater Spring has a similar community composition, it exhibited very different abiotic readings in relation to dissolved oxygen and air saturation. Therefore, based on this evidence, we can only suspect that Jones Spring and Waterfall Spring have a high possibility of interconnection. Lost Lake Cave Site 1 and Site 2 were sampled from a continuous stream passage; the former was taken approximately 150 m from the sea entrance and the latter far into the cave extent (more than 500 m from the sea entrance). It was therefore expected that these two sites would be classified as having a high possibility of interconnection.

Sites with a medium possibility of interconnection (i.e. sites which exhibit either biotic or abiotic clustering) include the plateau sites of Jane-up Cave, Jedda Cave, WiFi Cave and Grants Well, CI-079 and the Hugh Dale Waterfall, and Sepulchral Soil Sink and 19th Hole. Jedda Cave is the mainstay of CI’s water supply and is purported to access a subterranean flow between Grants Well and Jane-up Cave25,50. In order to trace the potential flow of water between Grants Well and Jane-up Cave (1.3 km apart), stream water was previously spiked with salt and measured downstream; the through-flow time between the two sites was three hours (approximately 400 m/h), confirming their interconnectivity51. Abiotic clustering in this study indicates that Jane-up Cave, Jedda Cave, WiFi Cave and Grants Well (located within 1.29 km of each other) are potentially all connected underground, with almost identical water quality readings across the four sites. The spring sites of CI-079 and the Hugh Dale Waterfall, located within The Dales wetland area, also exhibited abiotic clustering and are within 0.6 km of each other. Likewise, the cave sites of Sepulchral Soil Sink and 19th Hole, located east of Flying Fish Cove, exhibited abiotic clustering and are located approximately 1.9 km apart.
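The decision rule used in this section (high possibility where a pair of sites clusters both biotically and abiotically, medium where it clusters on only one of the two) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the site names and cluster identifiers are placeholders, the helper names `connection_possibility` and `classify_pairs` are hypothetical, and the "not flagged" label for pairs that share neither cluster is our own addition, since the text names only the high and medium categories.

```python
from itertools import combinations


def connection_possibility(site_a, site_b, biotic, abiotic):
    """Apply the high/medium rule to one pair of sites.

    `biotic` and `abiotic` map site name -> cluster identifier, as would
    be obtained by cutting each hierarchical clustering into groups.
    """
    same_biotic = biotic[site_a] == biotic[site_b]
    same_abiotic = abiotic[site_a] == abiotic[site_b]
    if same_biotic and same_abiotic:
        return "high"
    if same_biotic or same_abiotic:
        return "medium"
    return "not flagged"  # label assumed for this sketch only


def classify_pairs(biotic, abiotic):
    """Classify every unordered pair of sites present in both mappings."""
    sites = sorted(set(biotic) & set(abiotic))
    return {
        (a, b): connection_possibility(a, b, biotic, abiotic)
        for a, b in combinations(sites, 2)
    }


if __name__ == "__main__":
    # Placeholder cluster assignments, NOT results from the study.
    biotic = {"site_A": 1, "site_B": 1, "site_C": 2, "site_D": 3}
    abiotic = {"site_A": 1, "site_B": 1, "site_C": 1, "site_D": 2}
    for pair, label in classify_pairs(biotic, abiotic).items():
        print(pair, "->", label)
```

Treating the two clusterings as independent lines of evidence keeps the rule transparent: a pair is only rated high when community composition and water quality agree.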

The detection of marine taxa can also indicate sites with ocean connections and therefore potential tidal influence; these included Thundercliff Cave, Lost Lake Cave, The Grotto, 19th Hole, Whip Cave, Sepulchral Soil Sink, Ryan’s Ripper Rift, Hosnies Spring and Ross Hill Gardens Site 1 and 2. All of the cave sites, except for Sepulchral Soil Sink, were noted onsite to have hourly changes in water level, which were attributed to tidal influence. Sepulchral Soil Sink, however, produced salinity readings within the range of other tidally influenced sites on CI, indicating that it also has an ocean connection. The detection of marine taxa from the three spring sites of Hosnies Spring and Ross Hill Gardens Site 1 and 2 was surprising given that they were classified as freshwater based on the surface readings. However, these spring sites are all within 1 km of the coastline; it is therefore possible that they have an ocean connection but exhibit a halocline (salinity gradient) between the groundwater and surface spring.

Conclusion

The use of eDNA sampling as a bioassessment tool in caves, where population dynamics may be extremely fragile to external pressures, circumvents the impacts of traditional biospeleological approaches, in which specimens must be continually captured for verification beyond initial vouchering and barcoding. Such approaches require multiple visits to caves, compounding the impacts on conservation-sensitive areas and fauna. Repeated visits not only increase the disturbance to the cave, but also increase the risk to researchers and require additional permits, time and resources to obtain samples. Whilst sampling cave water or sediment for eDNA analyses is logistically easier, karst groundwater is subject to extreme fluctuations in water level between wet and dry seasons/years. This must be taken into consideration when eDNA sampling over successive periods52. The power of eDNA metabarcoding lies in its ability to widely amplify a target taxonomic group without the need for specific taxonomic expertise to morphologically classify taxa. This is particularly beneficial in subterranean assessments given the prominence of endemics and taxa that exhibit cave adaptations (e.g. pale body morphs, increased sensory organs, elongated appendages and reduced eyesight).

Subterranean ecosystems are notoriously under-surveyed, largely because of logistical difficulties in accessing underground voids and networks. We demonstrated that the application of eDNA metabarcoding assays to water and sediment collected from cave and spring conduits can be used to characterise multi-trophic eukaryotic subterranean diversity from voids that may be inaccessible to conventional biospeleological surveying. We detected a wide range of chordates, cnidarians, poriferans, arthropods, molluscs, annelids and bryozoans from freshwater and anchialine spring and cave sites across CI. Community composition was found to vary based on site type (i.e. cave or spring) and salinity; cave sites were additionally influenced by dissolved oxygen and longitudinal gradients. We update distribution information for taxa of biodiversity importance, such as the Indonesian shortfin eel and cave-adapted gudgeon, and potentially resolve unidentified specimen classifications for Collembola and prawns that were previously reported on CI. Lastly, we combined eDNA-derived eukaryotic community composition and environmental (water quality) data to investigate potential underground interconnectivity across CI; based on hierarchical clustering we identified three groups with a high possibility of interconnection. We strongly advocate for ongoing development of subterranean reference databases to facilitate the implementation of eDNA metabarcoding as a biospeleological survey tool, particularly for stygofauna. With this development, we expect that eDNA metabarcoding will be increasingly employed for subterranean multi-trophic surveying, which may reveal food webs and shed light on subterranean ecosystem functioning. This study demonstrates the potential of multi-marker eDNA metabarcoding approaches for subterranean stygofauna surveying and exploration.