Environmental variables and definitive host distribution: a habitat suitability modelling for endohelminth parasites in the marine realm

Marine nematodes of the genus Anisakis are common parasites of a wide range of aquatic organisms. Public interest is primarily based on their importance as zoonotic agents of the human Anisakiasis, a severe infection of the gastro-intestinal tract as result of consuming live larvae in insufficiently cooked fish dishes. The diverse nature of external impacts unequally influencing larval and adult stages of marine endohelminth parasites requires the consideration of both abiotic and biotic factors. Whereas abiotic factors are generally more relevant for early life stages and might also be linked to intermediate hosts, definitive hosts are indispensable for a parasite’s reproduction. In order to better understand the uneven occurrence of parasites in fish species, we here use the maximum entropy approach (Maxent) to model the habitat suitability for nine Anisakis species accounting for abiotic parameters as well as biotic data (definitive hosts). The modelled habitat suitability reflects the observed distribution quite well for all Anisakis species, however, in some cases, habitat suitability exceeded the known geographical distribution, suggesting a wider distribution than presently recorded. We suggest that integrative modelling combining abiotic and biotic parameters is a valid approach for habitat suitability assessments of Anisakis, and potentially other marine parasite species.


Results
Habitat suitability modelling. Based on the results of the Maxent permutation importance, as a measure for variables' contribution to the habitat suitability model, the following six variables were identified as most important abiotic factors for the potential distribution of Anisakis species: land distance, mean sea surface temperature, depth, salinity, sea surface temperature range as well as primary production. These six variables were used as abiotic factors together with the modelled habitat suitability for the definitive host species of the respective Anisakis species as biotic factors to model the habitat suitability for the Anisakis species in the final model ( Fig. 1, Table 2).
The modelling results for the nine Anisakis species are displayed in Figs 3-5. Warmer colours indicate a higher modelled habitat suitability and thus, a higher occurrence probability for the considered Anisakis species. Additionally, the occurrence data for the Anisakis species (Figs 3-5: dot maps in column 1) and the definitive host species diversity (Figs 3-5: column 2) for the respective Anisakis species are visualized. For the definitive host species diversity, we summed up the binary modelling results (according to the sensitivity equals specificity optimization criterion) 33 for all recorded definitive host species of the respective Anisakis species. A high definitive host diversity means that several of the recorded definitive host species are modelled to find suitable habitat conditions in a certain region.
"Area under the Receiver Operator Curve (AUC)" values for the Maxent models of the nine Anisakis species are summarized in Table 3. The AUC value is a commonly used measure of model performance independent of a threshold. The AUC values scores between 0 and 1. Assuming unbiased data, higher values indicate that areas with high modelled habitat suitability tend to be areas of known presence and areas with lower modelled habitat suitability tend to be areas where the species is not known to be present 34 . An SDM with an AUC value of about 0.5 is assumed to be as good as a random model. For all considered Anisakis species, the AUC values are high (above 0.95). Table 3 lists the number of occurrence records for each Anisakis species as well as the number of environmental variables (6 abiotic variables and a varying number of hosts (biotic variable) depending on the species) used for the modelling. The habitat suitability for five definitive host species (i.e. Feresa attenuata, Sotalia fluviatilis, S. guianensis, Mesoplodon bowdoni, M. europaeus) could not be modelled as for these species insufficient (less than 10) or no occurrence records were available from AquaMaps 35 . Thus, for some Anisakis species the number of considered definitive host species variables is less than the number of recorded definitive host species (Table 1).

Zoogeography of Anisakis spp.
A comparison of the occurrence data dot maps (Figs 3-5: dot maps) and the modelled habitat suitability (Figs 3-5: third column) shows that the potential area of distribution (= high modelled habitat suitability) extends in some cases the recorded geographical area as documented by the occurrence data points. According to sampling data currently available, Anisakis simplex sensu strictu, for example, was exclusively found in fish hosts of the northern hemisphere, and here mainly along the North/North-East Atlantic as well as in waters of the West and East Pacific Ocean, whereas the habitat suitability maps suggest it to be present also in more southern areas, e.g. between the Antarctic Peninsula and South America and the waters around New Zealand.
The modelled host diversity patterns (Figs 3-5: column 2) differ to some extent from the modelled habitat suitability for the Anisakis species (Figs 3-5: column 3), i.e. the distribution of the parasites do not always overlap with the diversity hotspots of the definitive hosts and vice versa. This is particularly striking in the above mentioned area of the Central East Pacific ( . Hotspots with both high data density and diversity (all nematode species) can be found in the North/North-East Atlantic, the Mediterranean and the West and East Pacific Ocean along coasts of Japan/China and North America, respectively. Hotspots of cetacean definitive diversity (Figs 3-5: column 2) are located in the area of the tropics and subtropics, especially along the west coast of Central America (e.g. Mexico, Guatemala, Ecuador, Peru), Indian Ocean (Bay of Bengal, Indonesia, Thailand) and China, A. berlandi) as well as A. typica and two sister-species A. nascettii and A. ziphidarum 23 . The second clade consists exclusively of the A. physeteris-complex (A. brevispiculata, A paggiae, A. physeteris) 6,23 . Both A. simplex-complex and A. physeteris-complex are considered cryptic species, distinguishable only by means of molecular analyses as well as slight morphological differences (e.g. tail length/total body length ratio; spicule length) 23 .
The distribution (dot maps) of Anisakis species generated in this study coincides in large parts with results of earlier studies 6,9 suggesting an uneven occurrence of the nine species among the different climate zones and oceans. Some new records were added, however, those are mostly located in already known endemic areas.
Hot spots of occurrence records were found in the Mediterranean region, in the region of Japan, along the North-American coasts as well as in the waters of the North-Atlantic; areas with extensive fisheries and economically (and consequently scientifically) important fish species. Hot spots of host diversity are not inevitably congruent with these areas, mostly because reliable occurrence data for the definitive hosts usually stem from regions with a good accessibility and a scientific and public interest (e.g. "whale watching", scientific research cruises, cruise itineraries), which do not necessarily overlap with the areas where intermediate fish hosts are caught.
The close phylogenetic relationship between some of the species is mirrored in similar modelling results using biotic and abiotic variables. Anisakis simplex s.s. has exclusively been recorded by means of molecular methods from hosts in the northern hemisphere, mainly in the Atlantic and Pacific Ocean as well as in the Western Mediterranean (Fig. 3). The distribution of data points indicates a main distribution area of this nematode species in the entire North Atlantic. Below 20°N latitude, no evidence of the species' occurrence has been found to date. The map of the habitat suitability shows some areas of the southern hemisphere (e.g. New Zealand), where occurrences would be expected due to suitable local environmental conditions. An accumulation of occurrences along Japanese coastal waters, which was also observed for A. pegreffii (see below), can most likely be explained by increased economic research interests in potential harmful organisms in commercially highly significant and often consumed raw fish species.
Anisakis pegreffi shows a disjunct distribution, with a lack of evidence along the North American West Coast and some additional findings between South America and the Antarctic Peninsula and South Africa and New Zealand.
The distribution of A. berlandi, along the Pacific Northwest Coast, the Southern Ocean, the Weddell Sea and the coast of South Africa is comparable to that of the two closely related sister species, A. simplex s.s. and A. pegreffii. The close phylogenetic relationship of these three species is reflected in a similar distribution pattern (Fig. 3). Small differences between the modelling results for these three sister species probably arise due to different species specific host preferences (see Tables 1 and Fig. 2).
Due to its characteristic distributional pattern along the tropics and subtropics, A. typica stands out from other members of the genus (Fig. 4). The model proposes a circumglobal distribution with temperature and land distance constituting decisive factors. The highest correlations between definitive host and parasite distribution arise for the warm-water adapted species of the genus Stenella as well as coast-associated Delphinids such as Tursiops truncatus and T. aduncus, suggesting their large influence on the distribution of A. typica.
Anisakis ziphidarum and the sister species A. nascettii were sporadically detected in the Central and South-East Atlantic as well as in New Zealand and the Mediterranean Sea (Fig. 4). A. ziphidarum was further detected in Japanese waters. Anisakis nascettii was described only very recently by Mattiucci, Paoletti et Webb (2009) 36 , which explains the low number of molecular evidence. For both species, potential habitat suitability was    Fig. 2) can represent a "lack-of-data" (i.e., occurrence of specific species has not been recorded so far), or point to a lack of host suitability for the parasite.
Drivers influencing Anisakis spp. distribution. Land distance, mean sea surface temperature, depth, salinity, sea surface temperature range as well as primary production, were identified as most important abiotic variables (variable important analysis) impacting the distribution of Anisakis. Klimpel and Rückert 38 demonstrated the influence of physical systems (mixed and stratified waters), and in particular hydrographic conditions like fronts, on the abundance and distribution of marine helminths of the raphidascarid nematode Hysterothylacium aduncum from the North Sea. Fueled by an increased primary and secondary production, permanent upwelling of nutrients resulted in an accumulation of predators and prey in the vicinity of halo/thermocline fronts, and thus, favouring the transmission of parasites among hosts. Højgaard 4 found that hatching time of Anisakis simplex eggs varied inversely with temperature and that survival time increased with salinity but decreased with temperature, supporting the hypothesis that Anisakis simplex is rather adapted to (off-shore) pelagic marine environments with high salinity. Both temperature and salinity were identified crucial in the variable importance analyses of this study. The influence of land distance and depths is probably an effect of the individual association of certain definitive host groups to either offshore or inshore marine habitats. Sperm whales, for example, usually inhabit deep sea habitats and tend to migrate in off-shore waters over large distances resulting in the dispersal of parasite propagules (e.g. A. physeteris) in offshore pelagic waters 39 . In contrast, the delphinid species Tursiops aduncus and T. truncatus, common hosts of A. typica (Table 1 and Fig. 2), are rather coast-associated favouring the distribution of this parasite along in-shore habitats.
Habitat suitability modelling approach. The modelled habitat suitability using abiotic environmental parameters and biotic host distribution data reflects the observed distribution pattern for all nine Anisakis species quite well. The generally high accordance between the distribution of occurrence data and modelling results is supported by high AUC values. As the AUC values depend on the number of considered occurrence records the AUC values may not be used in order to compare the performance of modelling results between species.
Sampling bias is one of the issues that has to be considered when modelling species' distribution using correlative approaches and may have affected our data here, i.e. Anisakis and definitive host species occurrences. Occurrence data are always affected by a sampling bias, e.g. species are commonly recorded in specific areas more often than in others due to a larger interest in that region compared to others, resulting in unequal probabilities of records. Hot spots of recorded occurrence and hotspots of actual occurrences may thus differ, which may yield a false representation of the species' niche reducing the reliability of the modelling results.
In some regions (e.g. South Atlantic for A. simplex s.s./A. pegreffii) the area with high modelled habitat suitability exceeds the area with recorded presences of the Anisakis species. This could have several causes: Despite suitable habitat conditions, Anisakis species do not occur in these regions which may be due to a potential dispersal limitation of the Anisakis species or the associated host species (e.g. migratory vs. more stationary cetacean definitive hosts). A mismatch could also be caused by limited sampling efforts, i.e., that the Anisakis species occur in these regions with high modelled habitat suitability but no (molecular, to cryptic species) occurrence has been recorded there to date (sampling bias). A third reason might be that a crucial factor relevant for Anisakis habitat requirements was not considered in the modelling. Overall, it is not clear which of these factors might best explain the "overprediction". The present study confirmed the postulated zoogeographical ranges of Anisakis spp in certain climate zones and oceans. The present approach using abiotic environmental parameters and biotic host distribution data reflects the observed distribution pattern for all nine Anisakis species quite well. Land distance, mean sea surface temperature, depth, salinity, sea surface temperature range and primary production were found to be the driving abiotic factors, which are in particular relevant for early marine parasitic life stages. However, final hosts are indispensable for a parasite's reproduction in a certain area. Thus, the integration of host species distribution is considered crucial when modelling habitat suitability for parasites and abiotic and biotic factors should therefore be used in an integrated approach and are equally valid for risk assessments of zoonotic diseases.  Table 2). The habitat suitability for the nine Anisakis species was finally modelled based on both: abiotic factors (land distance, depth, salinity, sea surface temperature (mean, range), primary production) and biotic variables (i.e. the modelled habitat suitability of the respective definitive host species). For visualization, maps were built using Esri ArcGIS 10.3 (www.esri.com/software/arcgis).
Scientific RepoRts | 6:30246 | DOI: 10.1038/srep30246 Methods Species distribution modelling approaches. Habitat suitability models, also called species distribution models (SDMs) or environmental niche models (ENMs), are correlative approaches that model the potential geographical range of a species subjected to various environmental variables 40 . Based on the information in which location a species is observed to be present or absent, and the environmental conditions prevailing there, the species-environment-relationship is estimated as a function of the relative habitat suitability in relation to the considered environmental conditions. The modelled species-environment-relationship (niche function) can then be projected onto maps showing the relative habitat suitability or occurrence probability for the considered species. Here, the maximum entropy niche modelling approach (implemented in the freeware MAXENT) 41,42 was used with the default settings but only using linear, quadratic and product features 43 . The MAXENT approach is one of the most commonly employed algorithms to model species potential ranges (e.g. 44 ) and scores well in comparative studies (e.g. 45 ). According to Baldwin 46 , the maximum entropy approach is relatively insensitive to spatial errors associated with location data, requires few locations to construct useful models, and performs better than other presence-only modelling algorithms. This may be in particular striking considering the fact that assessment of real absence data is not feasible for Anisakis species.
Occurrence data. Habitat suitability modelling was carried out using only Anisakis occurrence data from studies that have performed molecular methods to guarantee unambiguous identification to (cryptic) species level (e.g. PCR-RFLP, SSCP, direct sequencing; MAE: Multilocus-Allozyme-Electrophoresis). In addition to the presence data that have been included in a former approach described in Kuhn et al. 6 , the ISI Web of Knowledge online database was searched according to a set of search terms (Anisakis, Anisakid) to assess the novel research articles in the field that have been published since 2011 and extract specific locality records. In total, 101 publications were considered in the present model 9,[16][17][18]22,36, 37 cetacean species are currently recognized as definitive host for Anisakis (see Table 1). Occurrence records for these host species were derived from the online tool AquaMaps 35 with a reduced spatial resolution of 1 decimal degree.
Environmental data. Marine environmental data were obtained by the Global Marine Environment Datasets (GMED) 32 (Table 2). It has been suggested that interpolation errors might arise in the North Polar regions, thus, cropped data sets with an extent up to 70°N were used.
The environmental data were loaded at a spatial resolution of 5 arc minutes (= 0.083 decimal degree) and rescaled at a spatial resolution of 1 decimal degree computing the mean. This resolution is in accordance with the resolution of the occurrence records for the Anisakis species.
Of all marine environmental variables provided by GMED, only a subset of 14 environmental variables ( Table 2) was considered for the further analysis. This choice of 14 abiotic predictor variables among all available marine environmental variables provided by GMED was based on correlation as well as ecological relevance. These 14 variables were used in an intermediate step (IS1), to model the habitat suitability of the nine Anisakis species with the primary aim to identify those variables that on average contributed most to a Maxent model. In order to identify the most important abiotic factors the Maxent models were run for each of the nine Anisakis species. The Maxent permutation importance (i.e. a measure for the variables' contribution to the Maxent model) was calculated for each of the 14 abiotic factors and was transformed into ordinal rank scale for each of the nine Anisakis species. The six abiotic factors with the lowest median for all nine Anisakis species were taken as a final subset of abiotic factors in order to model the habitat suitability for the Anisakis species, together with the modelled habitat suitability of all known final host species as biotic factors (final model). This was done in order to reduce the number of predictor variables for the final modelling (integration of abiotic and biotic factors), thus, reducing the risk of overfitting. Based on the same subset of 14 abiotic factors (Table 2) as considered in IS1, the habitat suitability for each of the definitive host species (Table 1)  Here, definitive host distributions were included as anisakids are very specific to these, but less towards their intermediate hosts. "Finding" intermediate hosts refers to the mainly passive digestion by different crustaceans and fish, which are generally widely distributed in the oceans, thus, a much greater distribution could be expected and hence, area of habitat suitability leading to an overprediction of suited habitats for the different Anisakis species if based only on intermediate hosts.
In order to evaluate the modelling results, dot maps with the recorded occurrences of each of the nine Anisakis species were created. Additionally, host species diversity maps of the respective Anisakis species were built. For these maps, the continuous modelling results (from the intermediate step IS2) for the definitive host species were converted into binary data (1 -suitable habitat conditions, 0 -unsuitable habitat conditions) using the threshold that minimizes the difference between sensitivity and specificity 33 . These binary modelling results were then summed up for all known definitive host species of a certain Anisakis species resulting in definitive host diversity maps that show the number of definitive host species for the respective Anisakis species with modelled habitat suitability. For visualization, maps were built using Esri ArcGIS 10.3.

Correlation analyses.
Spearman correlation coefficients between the modelled habitat suitability for the different Anisakis species based on the 14 abiotic variables (IS1) and the modelled habitat suitability for the definitive host species based on the same 14 abiotic variables (IS2) were calculated to identify pairs of definitive host