Current and future climatic regions favourable for a globally introduced wild carnivore, the raccoon Procyon lotor

Invasive species are considered as one of the major threats to biodiversity and represent a major challenge in the conservation of natural ecosystems, in preventing damage to agricultural production, and human health risks. Environmental Niche Modelling has emerged as a powerful tool to predict the patterns of range expansion of non-native species and to direct effective strategies for managing biological invasions. The raccoon, Procyon lotor, is a wild mesocarnivore presenting a high adaptability and showing successful introduced populations worldwide. Here, we modelled the current and future climatically favourable areas for the raccoon using two protocols, based on data sets filtrated in geographic and environmental spaces. Projections from these models show extensive current favourable geographical areas covering extensive regions of temperate biomes. Moreover, predictions for 2050 reveals extensive new favourable areas north of the current favourable regions. However, the results of the two modeling approaches differ in the extent of predicted favourable spaces. Protocols using geographically filtered data present more conservative forecasts, while protocol using environmental filtration presents forecasts across greater areas. Given the biological characteristics and the ecological requirements of a generalist carnivore such as the raccoon, the latter forecasts appears more relevant and should be privileged in the development of conservation plans for ecosystems.

The comprehension of the relationships between organisms and their environment, and an understanding of the dynamics of species distributions is crucial to be able to predict species responses to the current environmental crisis. The recent development of computational and statistical tools allowed the emergence of new techniques to model ecological niches and potential species distributions (e.g. refs [1][2][3][4][5][6][7]. By associating geo-referenced occurrence data and environmental variables, Environmental Niche Modelling (hereafter called ENM) (also known as Species Distribution Modelling), makes it possible to evaluate the potential ecological niche of a species, and thus to identify the geographical areas that are favourable to its presence [8][9][10][11] . Associated with future climate change scenarios, ENMs also make it possible to predict the evolution of these favourable areas within a temporal framework [12][13][14][15][16][17][18][19] . In this way, ENMs open up a wide range of research, especially with regard to biodiversity conservation issues, notably through the study of rare or threatened species [20][21][22][23][24][25][26][27][28][29][30] , or conversely, of invasive alien species (e.g. refs [31][32][33]. Invasive alien species are now considered as one of the major threats to biodiversity and represent a prime challenge in the conservation of natural ecosystems. The present study focuses on a globally introduced mammal, the northern raccoon (Procyon lotor). The raccoon is a carnivore native to the North American continent. Yet, this species has been moved to different areas through the pet trade and for commercial purposes (exploitation of its fur). These activities have led to its introduction and spread in many regions worldwide. The species has been present in several Caribbean islands since the 17th century 34 , in Japan since the 1960s 35 , in Azerbaijan and Iran since 1991 36 , but also in several European countries where populations are currently expanding 37 . Since its www.nature.com/scientificreports www.nature.com/scientificreports/ Bioclimatic niche model. Modelled niches differed between the EF and GF approach, with sharper responses modelled for all variables except bio1 under the EF approach (Fig. 2). Most importantly, the EF approach identified a clear response to bio2 and bio12, whereas the GF approach identified very shallow responses. Conversely, EF niches showed a broader tolerance to bio1 (a plateau between 8 and 20°C) than GF niches (a single optimum at 8 °C).
As expected, both EF and GF approaches predicted similar favourable areas for the species over most of the temperate and sub-tropical regions (Figs 3, 4 and S1). However, the EF approach also predicted substantially larger favourable areas than GF, representing respectively 44.2% and 11.9% of all terrestrial areas (Figs 3, 4 and S1). Interestingly, the EF approach predicted that most tropical areas in Africa, and South-East Asia were favourable for the raccoon, whereas GF results were limited to temperate and dry areas of the world.
Both approaches present important differences in favourable areas at 2050. Favourable spaces are predicted to respectively increase by 18.5% (with RCP2.6; Supplementary Fig. S2) or 17.6% (with RCP8.5; Fig. 5) of its actual size with the GF approach, with broad new areas increasing from the northern border of the native range and

Discussion
This study presents the most complete analysis of the bioclimatic envelope favourable to the presence of the globally introduced carnivore Procyon lotor, and the first forecast of areas favourable to the species at a global scale using multiple modelling techniques and methods. Our results highlight the variability in environmental conditions between the native range and the regions where the raccoon was introduced. Consequently, niche models were computed with occurrences sampled from both native and non-native areas. These models created using two distinct methods demonstrate the high variability in results arising from presence selection and pseudo-absence generation techniques. Accordingly, projections for 2050 reveal major differences in favourable space from one model to another. However, both approaches highlight the ecological plasticity of the raccoon, showing large spaces favourable to the species at a global scale. Furthermore, projections for 2050 predict a high stability of current favourable areas, particularly in regions where the raccoon is already present. Niche conservatism tests performed for the three main regions colonized by the raccoon showed contrasting results regarding niche equivalence and niche similarity. Despite the fact that equivalency tests suggest that the European and the Japanese niches are relatively conserved, similarity tests reveals significant differences between the three regions of introduction and the native niche. Yet, a common factor appears to explain the differences between the native niche and the three colonized areas. As shown in several other invasive species 107,108 , these differences are explained more by unfilling than by expansion. Unfilling may be the result of different processes including ongoing colonization, slow dispersal, or the impossibility to reach new areas 98,105,109,110 . Unfilling values for the Caribbean and Japan niches may then result from the peculiar dispersion and distribution constraints involved by insular environments. On the other hand, unfilling values for the European region can be explained, at least in part, by the ongoing expansion of the populations.
The differences in environmental conditions between the different areas where the species is present, particularly in the Caribbean region, show the necessity of taking into account occurrence data from both native and non-native areas in order to model the bioclimatic niche of the raccoon as accurately as possible. Moreover, considering that the niche of a species is conserved from a region to another implies, in absolute terms, that the niches are at equilibrium, i.e. that the species is present in all the available niches in the introduced area, and absent in all regions not in accordance with its ecological requirements. However, regarding the raccoon, as the European populations are still expanding, the possibility of niche expansion in later stages of colonization cannot be excluded 111 . In that respect, the niche of the raccoon cannot be considered conserved in the different regions of introduction.
In this study, two modelling approaches were used. These two approaches differ by the selection of occurrence data and by the method of generation of pseudo-absences. The first approach, GF (Geographical Filtration), relies on using occurrence data that are spatially thinned and pseudo-absences that are randomly generated within all the available environmental space. The second approach, EF (Environmental Filtration), is based on an environmental filtration of occurrence data and the use of pseudo-absences randomly generated outside of the environmental space occupied by the species.
Consequently, model responses differ substantially. Overall, response curves obtained through the EF approach display a better normalization and a reduced variance. This results in projections diverging particularly with regard to the extent of regions identified as favourable. As expected, the GF approach generates smaller favourable areas than the EF approach. This difference may arise from the fact that model responses for bio1 present a substantially wider range of favourable temperatures with the EF than with the GF approach. In addition, the geographical filtration, particularly subject to sampling biases, maintains an artificial heterogeneity in the representation of the favourable environmental conditions that may also explain the differences between the predictions of the two approaches. Finally, the random generation of pseudo-absences within the entire available geographical space inevitably leads to the generation of pseudo-absences in sites presenting known favourable environmental conditions. For example, pseudo-absences can be selected within the current range of the species. This results in a confounding influence on the identification of favourable conditions by the models, which might be more important in the case of species presenting a tolerance to a broad range of bioclimatic conditions such as the raccoon. The EF approach, by selecting pseudo-absences within conditions where the species has never been reported, eliminates this bias and allows for a better characterization of the species niche.
Results obtained through the EF approach also appear to be better matched to the ecology of the species. The raccoon is present in a wide range of habitats, elevations and climatic conditions. Raccoon populations thrives in regions as diverse as the Caribbean mangroves, the European temperate forests or the Laurentian mixed forests of North America. Thus, the restricted temperature optimum as well as the limited favourable areas identified through the GF approach appear inconsistent regarding the ecological requirements of an ubiquitous mammal www.nature.com/scientificreports www.nature.com/scientificreports/ such as the raccoon. Moreover, the binary projections, resulting from the GF approach, and highlighting the highly favourable space, present a very incomplete coverage of the regions where the presence of the raccoon is proven, particularly in its native distribution range. Through the example of the raccoon, it appears that an approach based on a geographical filtration of the presence/pseudo-absence data can lead to an underestimation of the area favourable for the studied species; which may have important consequences in a context of biodiversity management, and particularly in regions of introduction of alien species.
Our results reveal the raccoon's tolerance to a very wide range of bioclimatic conditions, resulting in vast areas favourable to the species at a global scale. By highlighting the regions that are most likely to be colonized, these projections represent important tools in orienting and optimizing the monitoring efforts of the species. Since the first introductions in Europe in the early 20th century, raccoon populations have thrived and are currently present in all Western European countries. Some populations are also rapidly expanding from the center of Europe to the east, where our results identify highly favourable areas, extending in to Russia and the Middle East. Similarly, our results indicate highly favourable areas covering a large number of islands, such as different islands of the Caribbean and the Japanese archipelago, but also Madagascar, New Zealand, or Tasmania. Island environments are particularly vulnerable to the colonization of invasive alien species, notably carnivores, which are often poorly represented in native ecosystems 112 . The impact of introduced raccoons may thus be more damaging on islands where native fauna has been little exposed to predation, and where raccoons themselves are exposed to little predation and competition.
Invasive species management is a priority for the conservation of ecosystems and biodiversity. ENMs represent particularly effective tools to grasp the distribution of these species at different spatial and temporal scales as they can take into account the ongoing and future environmental change. Our analyses of favourable geographic space to the raccoon at the horizon 2050 predict, with both approaches, a global stability of current favourable areas (Fig. 5), with the exception of the Caribbean islands where a potential extirpation of the species may take place by 2050. Moreover, new favourable areas appear to extend broadly in northern regions in continuity with maintained favourable spaces. These results were expected given the fact that climate change scenarios predict a global warming of arctic regions, and that temperature was identified as the most influential bioclimatic factor for the raccoon (three out of the four environmental variables used in modelling correspond to temperature measures; Supplementary Fig. S3). However, while lost space is limited and fragmented in GF approach projections, it appears more pronounced with the EF approach. Thus, projections with the GF approach predict a clear expansion of favourable areas, whereas projections with the EF approach show a slight contraction and a northward www.nature.com/scientificreports www.nature.com/scientificreports/ shift of favourable spaces. Lost spaces are principally located at tropical latitudes, and newly favourable areas mainly cover regions to the north of current favourable areas. These observations are similar for both representative greenhouse gas concentration pathways scenarios (RCP2.6 and RCP8.5), but more pronounced in the case of RCP8.5. In accordance with the EF approach predictions, a northern shift has also been predicted for the species in its native range, as is the case for other North-American carnivores 113 and northern European species 114,115 . In northern Europe, the development of spaces that are favourable to the raccoon may represent an additional threat to boreal ecosystems, where temperature changes are predicted to be the highest 116 . Arctic species are exposed to fewer competitors, predators, parasites and diseases 117 . Potentially affected by climate change, these species can be even more vulnerable to the introduction of new competitors, predators and potential vectors of disease such as the raccoon. This risk appears all the more likely as carnivores might be better able to keep pace with climatic changes than other mammals 118 .
In conclusion, the raccoon presents a tolerance to a very wide range of bioclimatic conditions resulting in extensive regions currently favourable to the species. Moreover, predictions for 2050 reveals wide newly favourable areas north of the current favourable regions. However, differences in the extent of predictions between the two modelling approaches used in this study reveals the importance of the selection of presence and pseudo-absence data. This study demonstrates the importance of data selection and processing in ENM modelling given their crucial role in our understanding of species distribution dynamics and in the development of conservation plans for ecosystems.
Methods species occurrence data. Occurrences of P. lotor were compiled from the online databases VertNet and GBIF, in addition to the databases of the Guadeloupe National Park, Martinique Regional Park, Saint-Martin Natural Reserve, French national wildlife organizations, recent literature, and personal observations. Only recent observations (from 1950 to the present day) with complete and precise locations were selected, giving a total of 20425 occurrences (Supplementary Table S1). All occurrence records were aggregated into 0.08° cells corresponding to the resolution of environmental variables, resulting in a total number of 5922 records (Supplementary Fig. S4). Unfavourable: areas that are currently unfavourable remain unfavourable in the future; Lost: areas currently favourable that will lose their favourable nature in the future; Maintained: areas that are currently favourable and will still be favourable in the future; New: areas that are currently not favourable but would become favourable in the future.
www.nature.com/scientificreports www.nature.com/scientificreports/ environmental data. Current and future favourable environmental envelopes for P. lotor were calculated using 19 bioclimatic variables (derived from temperature and precipitation measures; Table 2 Table 2), for two Representative greenhouse gas concentration pathways scenarios (RCP): the most optimistic RCP2.6 (with a radiative forcing of +2.6 W/m² for the period 2000-2100), and the most pessimistic RCP8.5 (with a radiative forcing of +8.5 W/m²).
Collinearity and the influence of the 19 bioclimatic variables were tested, using a protocol adapted from Leroy et al. 25 and Bellard et al. 61 (Supplementary Method S1). This allowed for the identification of four non-collinear variables that significantly influenced the distribution of P. lotor ( Analysis of niche conservatism. Following the methodology proposed by Warren et al. 3 , and further developed by Broennimann et al. 103 , we calculated niche overlap, equivalence, and similarity between the native range (4956 occurrences) and the three major areas colonized by P. lotor: the Caribbean (39 occurrences), Europe (868 occurrences), and Japan (56 occurrences). Niche overlap was calculated using a PCA approach and calibrated with the four selected environmental variables for each area of interest. PCA scores of the species occurrences on the first two axes were projected onto a grid of cells delineated by the minimum and maximum PCA scores of the environmental variables. Next, a kernel density function was applied to estimate a density of occurrence for each cell of the grid (see Di Cola et al. 120 for methodological details). Thereafter, niche overlap was estimated using Schoener's D 121 . This index varies between 0 and 1; 0 meaning no overlap, while 1 meaning identical niches. This index was then used to assess niche equivalence and similarity. Occurrences in each compared range were pooled and randomly split into two datasets with equal size as the original dataset. Niche overlap was then estimated with the D index. This procedure was repeated 100 times to create a null distribution. The observed D was then compared to these simulated values, and the null hypothesis of equivalence between niches was rejected when observed values fell outside of the 95% confidence interval of the simulated distribution. Alternatively, niche similarity test investigates whether the niches in the native or invaded range predict one another better than expected by chance. Occurrences in one area were randomly reallocated within their respective available environmental space and simulated niches were compared to the niche of the other area with both indices. This procedure was repeated 100 times to create a null distribution. The observed D was then compared with the simulated values, and the null hypothesis of similarity of the tested niche to the other was rejected when observed values fell outside of the 95% confidence interval of the simulated distribution.
Along with niche overlap test, niche expansion and unfilling were also calculated. The expansion index corresponds to environmental conditions in the invaded area that are absent in the native area. Conversely, unfilling refers to environmental conditions in the native area that are absent in the invaded area. Both indices vary between 0 and 1.
All niche conservatism tests were performed using the package ecospat v2.1.1 122 implemented in the R software 123 Data preparation and pseudo-absence selection. The preparation of presence-only data for ENMs included two important steps. The first one consisted of filtering presence data to reduce autocorrelation and sampling bias (e.g., refs 124,125 ). The second one consisted of selecting pseudo-absence data to calibrate models. Here, we applied two different methods to filter presence data and select pseudoabsences: the first method focused on the geographical space (hereafter Geographical Filtration, GF) whereas the second one focused on the environmental space (hereafter Environmental Filtration, EF).
Geographical filtration (GF) is commonly applied in the ENM literature. We used the R package SPTHIN v.2.1-2 125 to remove duplicated occurrences in a radius of 100 km, and therefore reduced auto-correlation biases. Consequently, 682 occurrences were retained. Next, we randomly sampled an identical number of pseudo-absences across the available geographical space, with three repetitions. This approach does not filter out repeated occurrences in similar climatic conditions, and samples pseudo-absences in areas that can fall within areas favourable to the species. Therefore, we expect this method to produce a modelled distribution that fits  www.nature.com/scientificreports www.nature.com/scientificreports/ more closely the initial distribution of species occurrences, a "lower-bound" estimate of the species distribution as opposed to the Environmental Filtration method described below.
Environmental filtration (EF) focuses on the distribution of data in the space of environmental variables. Varela et al. 124 showed that removing duplicated records in the environmental space consistently improved the quality of model predictions, contrary to geographical filtering. Therefore, we created a gridded environmental space on the basis of the four selected environmental variables in which we projected all the conditions existing in the geographical space. Next, we projected within this environmental space all the presence points of the species and removed all duplicate points per cell, which resulted in a total of 1036 occurrences. To select pseudo-absences within the environmental space, we applied a procedure to avoid selecting pseudo-absences within environmental conditions that are favourable for the species. To do so, we calculated the restricted n-dimensional convex hull of presences, defined as the smallest convex hull encompassing all occurrence points. This restricted convex hull is considered as a proxy of the favourable environmental conditions outside which we randomly generated 1036 pseudo-absences (i.e., in equal number to the filtered occurrences) with three repetitions. Such a procedure is supported by the statistical theory of model-based designs, also known as "D-designs" which are assumed to minimize prediction variance (see Hengl et al. 76 ). Overall, the environmental filtering of presences will result in a decreased autocorrelation 124 and the convex hull will minimize the risk to sample pseudo-absences inside favourable conditions. Therefore, we expect this procedure to produce a larger estimate of the species niche than the GF procedure, hence a larger potential distribution. In addition, we consider this approach to produce a better description potential distribution of the species compared to the GF approach as it corrects for biases linked to heterogeneity in sampling intensity and reduces the risks of generating pseudoabsences falling within favourable areas. ensemble modelling process. Model calibration. Nine different modelling techniques were calibrated and evaluated [126][127][128][129] : Artificial Neural Networks -ANN 130 (Ripley, 1996); Classification Tree Analysis -CTA 131  Because our procedure is a presence/pseudo-absence procedure, we did not calculate discrimination capacity metrics (e.g. the area under the receiver operating characteristic curve or the true skill statistics) because (1) these metrics are designed to be calculated on real absences, and (2) they are dependent on prevalence, which can lead to spurious evaluations of ENMs 138,139 . Rather, we evaluated our models with the Boyce index, specifically developed for such data 109,140,141 . The Boyce index assess how much model predictions match the observed distribution of species occurrences. Values of the Boyce index vary between −1 and +1. Positive values indicate a model with predictions that are consistent with the distribution of occurrences in the evaluation dataset whereas negative values indicate a model with predictions that are not consistent with the distribution of occurrences. Boyce index values close to zero mean that the model is not different from a random model. Models with a mean Boyce index higher than 0.7 were selected ( Supplementary Fig. S6). Consequently, all models were kept for EF modelling, but MAXENT was removed for GF modelling. These analyses were performed using the R package ecospat v2.1.1 122 .
Forecasts of current and future favourable areas were obtained by ensemble forecasting method. Current and future ensemble forecasts represent consensual projections of the six modelling techniques, obtained through averaged distributions of favourability scores 4,142,143 . To discriminate current favourable and non-favourable areas, and visualize shifts by 2050, probability distributions were transformed into binary projections. These binary projections were obtained using a probability threshold that maximized the Sørensen value 139 . Sørensen indices were calculated using 70% of occurrences as training data, and the remaining 30% as testing data.

Data Availability
Occurrence data are available in Supplementary