Biotic threats for 23 major non-native tree species in Europe

For non-native tree species with an origin outside of Europe, a detailed compilation of enemy species including the severity of their attack has so far been lacking. We collected information on native and non-native species attacking non-native trees, i.e. type, extent and time of first observation of damage, for 23 important non-native trees in 27 European countries. Our database includes about 2300 synthesised attack records (synthesised per biotic threat, tree and country) from over 800 species. Insects (49%) and fungi (45%) are the main observed biotic threats, but arachnids, bacteria including phytoplasmas, mammals, nematodes, plants and viruses have also been recorded. This information will be valuable for identifying patterns and drivers of attacks, and trees with a lower current health risk to be considered for planting. In addition, our database will provide a baseline against which future impacts on non-native tree species can be compared, and will thus allow temporal trends of impacts to be analysed.

www.nature.com/scientificdata

spatial resolution (continental scale) or small study extent and the limited consideration of the extent of damage caused by insects and pathogens. With the data presented here it will be possible to investigate whether the drivers of, and mechanisms underlying, the variability in the level of damage differ at a finer spatial resolution. By providing important baseline information, the data will also serve as a reference for future data and thus allow temporal trends in the impacts of pest species on non-native trees to be analysed. Not least, this database allows the identification of trees with a lower current health risk to be considered for planting. We acknowledge, however, that the plant health situation is not static and that new or more severe attacks may occur for different reasons, e.g. through imported or naturally arriving non-native pests or pathogens, through climate change (e.g. better breeding conditions for pests) or through additional host shifts, which may occur with considerably prolonged cultivation time or enlarged area of cultivation 4,5.

Methods
We designed a Microsoft Excel spreadsheet that allowed for straightforward recording of the occurrence of the biotic threats and their overall impact on specific non-native trees in a country. The required information was NNT (non-native tree), COUNTRY (for which the information was provided), ORGANISM_GROUP, ORDER, FAMILY, GENUS, scientific NAME of the biotic threat, AUTHOR of the taxon name, ORIGIN (continent of origin of the biotic threat), main host species (later omitted), 1ST_OBSERVATION (year of first observation of the damage), PRIM_DAMAGE and SEC_DAMAGE (primary damage, which is the symptom most detrimental to tree health, and, if any, secondary type of damage, which is an additional symptom), LEVEL of impact on an individual tree, MAX_AREA (maximum continuous area impacted), AGE_CLASS (tree cohort where impact occurs), CONFIDENCE level for the provided information, REF (references), COMMENTS (later omitted) and DATA_PROVIDER (name and email address). We predefined selection options for the following fields: ORGANISM_GROUP, ORIGIN (multiple selections allowed), PRIM_DAMAGE and SEC_DAMAGE, LEVEL, MAX_AREA and AGE_CLASS (Table 1). Because we received multiple entries for PRIM_DAMAGE and mainly for SEC_DAMAGE, we transformed these columns in the final table into one column for each of the eight damage types, where '0' means not observed, '1' observed and '2' observed and originally filled as primary type of damage.
To describe the overall data quality, the data providers could choose among three CONFIDENCE levels on the type and severity of impact/damage (Table 2): 'high' - reliable/high-quality data sources on impact; the case was reviewed and verified by an expert. 'medium' - reliable/medium-quality data sources of impact; the impact was either reported by a reliable forester but could not be reviewed and verified by an expert, or it was published in a professional journal but it is not clear from the publication that it was checked by an expert. 'low' - low-quality data sources of impact; the observation was reported by a non-professional or in a non-professional journal, without being confirmed by an expert. Any missing information was indicated with NA. In the final table, one line/entry was allowed per biotic threat per non-native tree per country. This procedure was chosen as the middle ground between requiring a detailed description of every single attack incidence (not feasible) and simple occurrence recording of a biotic threat on a non-native tree (limited value).
The tree species to be investigated in this study were selected mainly based on their importance in silviculture and the area they occupy 2, but some less widely distributed species were also included. In total, we requested biotic threat information for 24 non-native trees. One of these, Acacia melanoxylon, had only one entry (from Spain) and was discarded due to limited data.
We approached forest damage experts in all 36 European member countries of NNEXT. Country representatives in NNEXT were either themselves forest protection experts or they contacted experts at universities or national forest research stations/institutes. Twenty-nine countries responded to our request to fill the biotic threats database, but only 27 countries eventually filled the database in the requested manner.
The information on the year of first observation turned out to be difficult to verify, and we thus received many NAs or ambiguous entries. Several improvement steps were taken. If "<" followed by a specified year had been entered, we changed it to the specified year (e.g. <1850 was changed to 1850). If a period was indicated, we changed it to the mean year (e.g. 1890's was changed to 1895). If any text had been entered that could not be interpreted to yield a specific year, we changed it to NA (e.g. 'several times' or 'since introduction of a non-native tree'). To fill NAs, we consulted the reference of the entry, and whenever the reference was a publication specific to this biotic threat in that country, we took the year of the publication as a substitute. To distinguish such data from the original information on the year of first observation received from the data providers, we added the column YEAR_ADDED, indicating with 1/0 whether or not the year was filled from the references.
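The cleaning rules for the year of first observation can be sketched as follows. This is a minimal illustration, not the procedure actually used by the database managers: the function names, the regular expressions and the `reference_is_specific` flag are assumptions, and only the entry forms mentioned in the text ("<year", decades such as "1890's", plain years, free text) are handled.

```python
import re
from typing import Optional, Tuple


def clean_first_observation(raw: Optional[str]) -> Optional[int]:
    """Normalise a free-text 1ST_OBSERVATION entry to a year; None stands for NA."""
    if raw is None:
        return None
    text = raw.strip()
    if text == "" or text.upper() == "NA":
        return None
    # "<1850" is changed to the specified year, 1850.
    match = re.fullmatch(r"<\s*(\d{4})", text)
    if match:
        return int(match.group(1))
    # A decade such as "1890's" (or "1890s") is changed to its mean year, 1895.
    match = re.fullmatch(r"(\d{3})0'?s", text)
    if match:
        return int(match.group(1)) * 10 + 5
    # A plain four-digit year is kept as it is.
    if re.fullmatch(r"\d{4}", text):
        return int(text)
    # Anything else (e.g. 'several times') cannot be interpreted as a year.
    return None


def fill_year(
    raw: Optional[str],
    reference_year: Optional[int],
    reference_is_specific: bool,
) -> Tuple[Optional[int], int]:
    """Return (year, YEAR_ADDED).

    YEAR_ADDED is 1 only when the year was taken from a reference that is
    specific to this biotic threat in that country (hypothetical flag that a
    database manager would set by hand), and 0 otherwise.
    """
    year = clean_first_observation(raw)
    if year is None and reference_is_specific and reference_year is not None:
        return reference_year, 1
    return year, 0
```

Keeping YEAR_ADDED as a separate return value mirrors the separate column in the database, so that substituted publication years can be excluded from analyses that need the original observation dates.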
The columns on the type of damage had been filled in very heterogeneously by our national experts and contained many gaps. We thus decided to homogenize this information across countries based on current knowledge on species autecology and considering the information provided by the country experts. The damage information was also restructured. Instead of primary and secondary type of damage, we introduced eight columns for the eight possible types of damage, where the observation of a type of damage was given as 0 (not observed), 1 (observed) or 2 (observed and filled as primary type of damage by data providers, see above). This was done by the three database managers (one forest entomologist, one forest pathologist and one general forest ecologist). Although one particular pest or pathogen species could cause damage to different parts of the same tree in different countries (e.g. an insect that attacks different tree parts during larval and adult stage), we decided to use a common classification of all eight types of damage across countries because of the above-mentioned inconsistent data provided by country experts, but still allowed for a weighting of the primary damage in each country. While this approach will not allow for testing fine-scale differences in damage types caused on a particular tree species across Europe, it will still allow coarse differences in the main impacted tissue among countries to be detected and traits of species that might increase the probability of attack to be identified. Furthermore, the LEVEL and MAX_AREA of damage provided by our database still vary among countries, allowing for tests on the main drivers of damage variability across Europe, e.g. depending on country-specific differences in the number of congeneric tree species, time since introduction of the non-native tree and area of the non-native tree, which have been shown to be main drivers at the European scale 4,5.

Table 1 (fragment): 1 - no effect: noticeable impact/damage but no effect on individual tree fitness.
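The 0/1/2 coding of the eight damage columns can be illustrated with a short sketch. The labels in `DAMAGE_TYPES` are placeholders, because the actual eight damage types are defined in Table 1 and are not reproduced in this text; the function name and input format are likewise assumptions.

```python
from typing import Dict, Iterable, Tuple

# Placeholder labels: the real eight damage types are listed in Table 1.
DAMAGE_TYPES: Tuple[str, ...] = tuple(f"damage_type_{i}" for i in range(1, 9))


def encode_damage(primary: Iterable[str], secondary: Iterable[str]) -> Dict[str, int]:
    """Turn PRIM_DAMAGE / SEC_DAMAGE selections into eight coded columns.

    0 = not observed, 1 = observed,
    2 = observed and originally filled as primary type of damage.
    """
    codes = {damage: 0 for damage in DAMAGE_TYPES}
    for damage in secondary:
        if damage not in codes:
            raise ValueError(f"unknown damage type: {damage}")
        codes[damage] = 1
    # Primary entries are applied last so that they override a secondary entry.
    for damage in primary:
        if damage not in codes:
            raise ValueError(f"unknown damage type: {damage}")
        codes[damage] = 2
    return codes
```

Applying the primary entries after the secondary ones is a design choice of this sketch: if a data provider listed the same damage type in both fields, the column keeps the higher weight.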
The database managers added two further columns. A column SPECIALISATION was introduced to differentiate the host plant niche breadth of species based on the main higher plant lineages 6, following the approach published by Gossner et al. 7. An organism was classified as 'monophagous' if it attacks species of one genus, 'oligophagous' if it attacks species of one higher plant lineage (i.e., bryophytes, ferns, gymnosperms, angiosperms: monocots, angiosperms: basal eudicots, angiosperms: eurosids, angiosperms: euasterids) and 'polyphagous' if it attacks species from more than one higher plant lineage. The column RELATIVE_ORIGIN categorises the biotic threats into species native to Europe but not to the region of origin of the non-native tree ('Europe'), species from the same origin as the tree species and not native to Europe ('origin'), species native to both Europe and the home range of the non-native tree ('both') and species from another region, neither Europe nor the origin of the non-native tree ('third').
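The two classification rules can be written down compactly. The sketch below is illustrative only: the function names and input structures are assumptions, and in the database these assignments were made by the database managers rather than computed from host lists.

```python
from typing import Dict, Set


def specialisation(hosts_by_lineage: Dict[str, Set[str]]) -> str:
    """Classify host plant niche breadth.

    hosts_by_lineage maps a higher plant lineage (e.g. 'gymnosperms') to the
    set of host genera attacked within that lineage (hypothetical input).
    """
    lineages = [lineage for lineage, genera in hosts_by_lineage.items() if genera]
    all_genera = set().union(*hosts_by_lineage.values())
    if not all_genera:
        raise ValueError("no host genera given")
    if len(lineages) > 1:
        return "polyphagous"   # hosts in more than one higher plant lineage
    if len(all_genera) == 1:
        return "monophagous"   # hosts restricted to one genus
    return "oligophagous"      # several genera within one lineage


def relative_origin(native_in_europe: bool, native_in_tree_origin: bool) -> str:
    """Categorise a biotic threat relative to Europe and the tree's home range."""
    if native_in_europe and native_in_tree_origin:
        return "both"
    if native_in_europe:
        return "Europe"
    if native_in_tree_origin:
        return "origin"
    return "third"
```

For example, an organism recorded only on two genera of gymnosperms would be 'oligophagous', and one that is native neither to Europe nor to the tree's home range would fall into 'third'.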

Data records
We provide a database on synthesised biotic threat information for 23 non-native tree species (Abies grandis, Abies

Table 2. CONFIDENCE levels and number of cases.
CONFIDENCE level | Number of cases
High - Reliable/high quality data sources on impact. The case was reviewed and verified by an expert. | 1786
Medium - Reliable/medium quality data sources of impact. The impact was either reported by a reliable forester but could not be reviewed and verified by an expert or it was published in a professional journal, but it is not clear from the publication that it was checked by an expert. | 189
Low - Low quality data sources of impact. Observation was reported by a non-professional or reported in a non-professional journal, without being confirmed by an expert. | 265
NA | 64

Technical validation
All country tables were checked by the database managers for formal correctness of the information provided. For example, entries for fungi occurring only on dead plant material or in association with ectomycorrhiza were removed, because they are not the topic of this data collection. The database managers checked all scientific names, families and orders and changed them where necessary to the currently accepted name. Fungal nomenclature is particularly complex because sexual (teleomorph) and asexual (anamorph) states have different names. The new code rules adopted at the International Botanical Congress in 2011 advocated abandoning the dual naming system for pleomorphic fungi ("one fungus, one name" convention) 9. Therefore, we took the currently valid name from the online databases for fungal nomenclature, Indexfungorum.org and MycoBank.org, matched with the most recent relevant taxonomic literature. For animals we used the Fauna Europaea 10 as baseline and adapted recent changes. ORIGIN was completed and corrected by the database managers.

Table 4. Data quality assessment by the data providers concerning a potential bias in insect pests and pathogens, and steps taken thereafter (contacting of new experts and completing the database). Every data provider agreed to check for new records and fill NAs in existing entries. The bias is evaluated according to the following scheme: 1 - 'The data well reflect the situation of the pest/pathogen impact. There is no bias due to prioritization of certain tree species and/or lack of experts'; 2 - 'The data on pest/pathogen impact have some bias. The bias due to prioritization of certain tree species and/or lack of experts is, however, minor'; 3 - 'The data on pest/pathogen impact have major bias. Due to prioritization of certain tree species and/or lack of experts the data does not reflect the complete situation in the country and thus should not be used in a cross-country analysis'; n.a. - not applicable.
Two rounds of data quality checks involving the data providers were included in the data acquisition procedure. We calculated simple descriptive statistics for every country to determine the number of entries per non-native tree, the number of cases where a tree occurs in a country (based on NNEXT information 2,11) but has no biotic threat entries in the database, and the number of entries per organism group and non-native tree (compare Table 3). In online-only Table 1 we provide the number of entries per tree species per country. This pre-analysis helped to reveal missing or biased information. A short, country-specific report on these findings (i.e. stating whether there was a considerable and unexpected imbalance between insect pests and pathogens, and for which non-native trees occurring in the country no records had been provided) was sent to the data providers. The report was accompanied by a formal request to (i) provide a personal assessment of the quality and completeness of the data (bias assessment), (ii) name options to improve the data quality, (iii) add new entries and (iv) fill in missing information in existing database entries. For the assessment of a potential bias in the number of entries and level of detail provided for different tree species, for example due to missing experts or a lack of economic value of a tree species resulting in limited recording and knowledge of attacks, we offered the following categories: 1 - 'The data well reflect the situation of the pest/pathogen impact. There is no bias due to prioritization of certain tree species and/or lack of experts'; 2 - 'The data on pest/pathogen impact have some bias. The bias due to prioritization of certain tree species and/or lack of experts is, however, minor'; 3 - 'The data on pest/pathogen impact have major bias. Due to prioritization of certain tree species and/or lack of experts the data does not reflect the complete situation in the country and thus should not be used in a cross-country analysis'. This call for data quality check and completion was successful and led to a large number of new and completed entries. In a second round of quality checks, the data providers were given a final chance to update the database and give a final assessment of the data quality/bias (Table 4).
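The per-country descriptive statistics used in the first quality check could be computed along the following lines. This is a sketch under stated assumptions: the record layout (dictionaries keyed by the column names NNT, COUNTRY and ORGANISM_GROUP), the function name and the `trees_present` input (standing in for the NNEXT occurrence information) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def country_summary(
    entries: Iterable[Dict[str, str]],
    country: str,
    trees_present: Set[str],
) -> Dict[str, object]:
    """Summarise the database entries of one country.

    entries       -- database rows with the keys COUNTRY, NNT, ORGANISM_GROUP
    trees_present -- non-native trees known to occur in the country
                     (hypothetical stand-in for the NNEXT information)
    """
    rows = [entry for entry in entries if entry["COUNTRY"] == country]
    per_tree = Counter(entry["NNT"] for entry in rows)
    per_group_and_tree = Counter(
        (entry["ORGANISM_GROUP"], entry["NNT"]) for entry in rows
    )
    # Trees that occur in the country but have no biotic threat entries.
    without_entries = sorted(trees_present - set(per_tree))
    return {
        "entries_per_tree": per_tree,
        "trees_without_entries": without_entries,
        "entries_per_group_and_tree": per_group_and_tree,
    }
```

A strong imbalance between organism groups in `entries_per_group_and_tree`, or a long `trees_without_entries` list, would be the kind of finding reported back to the data providers.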