Background & Summary

The taxonomy, distribution, and biology of European butterflies have been studied since the 18th century. Because changes in their distribution and abundance are precisely known, thanks to extensive citizen science contributions, and because of their trophic specialisation and immediate responses to environmental changes1, they are frequently used as indicators of environmental change2. Recently, a series of comprehensive resources have been published for European butterflies comprising a detailed taxonomic list3, a dataset of 15,609 sequences of the COI mitochondrial marker for all Western-Central European species4, a dated phylogenetic tree for all European species5, atlases describing their detailed distributions6, and climatic risk assessments7. In turn, species traits are fundamental descriptors of feeding ecology, life-history, morphology, resource use, behaviour and physiological constraints8. It has long been recognised9 that the availability of such data is largely limited and incomplete. Only recently has a geographically extensive series of traits describing climatic preferences based on temperature and precipitation been produced10, along with a series of six traits describing some features of feeding ecology, morphology, and life histories of butterflies from Western and Central Europe4. Species traits have been used to explain extinction risk and conservation status11,12,13, colonisation and distribution changes14,15, phenology and potential for range shifts in relation to climate change16,17,18 and phylogeographic patterns4 over large spatial scales. More detailed trait information at smaller spatial scales has been used to identify ecological groupings of species, for both butterflies in the British Isles19 and macromoths of Central Europe20.
However, most studies which have used trait information to identify ecological relationships, current extinction risks and distributions have tended to use either limited sources of information and/or limited numbers of butterfly species or have been concentrated at the regional scale21,22.

Because single sources of trait information may be limited (e.g. in geographical scope) or conflict with each other23, we present a new comprehensive open-access trait database24, with a maintained on-line version, of the European and Maghreb butterflies. We have aimed to pull together all the existing trait information available for each species. This has been done by synthesising the existing information from field guides, ecological atlases, reliable on-line sources, expert opinion and journal articles. Our database provides trait information for 542 taxa and covers 25 main traits (some subdivided, giving 217 trait states in total), including life history, resource use by all life-cycle stages, and behavioural information. Where specific traits are variable within species we also give data on this variability. We also process these data to provide multinomial and continuous variables and measures of their variability, resulting in a matrix of 542 species by 31 variables. We also list our information sources for the traits. Although some previous trait-based analyses have included vegetation associations, our trait database does not include these for two reasons. First, the habitat a species occurs in is determined by the occurrence and spatial distributions of species-specific resources25,26. Second, resource, life-history and behavioural traits can be used to predict the vegetation structures in which species occur19.

Our database provides an outstanding resource for improving our understanding of fundamental mechanisms and processes such as how traits define species occurrence and co-occurrences, their responses to environmental change, their spatial dynamics, and their associations with vegetation structures. Since traits vary within different taxonomic groups, understanding their evolution and variability within different branches of the tree of life can also provide insights into phylogenetic constraints on species resource requirements and ultimately on their local abundance and large-scale occurrence and vulnerability to environmental change.


Taxon and geographic coverage

Our dataset (542 taxa) represents the complete butterfly fauna of mainland Europe including the western parts of Russia, the European islands, Macaronesia and the Maghreb (North Africa). This includes all of the 496 species occurring in Europe according to the latest checklist of European butterflies3, but we have also included taxa (Table 1) that are confined to the Maghreb or have very divergent traits from the nominate species according to some sources within our study area (Fig. 1). The nomenclature is consistent with that used in the checklist3. Trait information is recorded for all the families of butterflies included in the geographic area (Papilionidae, Hesperiidae, Pieridae, Riodinidae, Lycaenidae and Nymphalidae). For species that also occur beyond the study area, trait information was taken from the main study area, if possible. For example, for those species that have a pan-Palearctic distribution, only information from the European range was included in our dataset.

Table 1 Species not in the European checklist3 but included in the trait database of European and Maghreb butterflies.
Fig. 1 The geographic area covered by the European and Maghreb butterfly database.

Trait information was gathered from sources including field guides, books and atlases27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62, scientific papers63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166, some selected online resources (including http://eurobutterflies.com), and direct observation in the field. Species-specific information sources are given in the database24 and on the website. In cases with multiple sources of trait information, data from peer-reviewed papers were preferentially used; in practice these made up a small proportion of the total trait data. In cases where differences in trait information between sources could be identified as representing trait diversity, all sets of information were included in the trait database. Where sources clearly conflicted, we used the information that we deemed the most reliable. When published information was lacking, we inferred traits using photographs from two reliable sources if the traits could be unequivocally determined. This included hostplants and hostplant types, egg-laying location, larval location, adult feeding, adult basking type and basking sites. Trait information based on photographs was independently assessed by the authors in order to check the validity of the inferences. For some taxa certain trait information was not available in any source, and thus it is missing in the database.

The first version of the trait database24 was finalised after three steps had been completed for all taxa. These were 1) mining all the standard references (e.g. guides and atlases) for trait information, 2) filling gaps in trait information through a thorough literature search via Google Scholar and PubMed, and 3) emailing experts on particular taxa to ask for additional trait information.

Trait types

Our database covers the traits of all stages of the butterfly life cycle. Many trait types included in the database, along with their subdivisions into individual states, were derived from an earlier treatment of the butterflies of the British Isles19, but the trait types have been extended for this database. Individual traits were defined prior to the beginning of data collation to allow for unambiguous coding. Comprehensive trait definitions are in the file traitdefinitions.pdf on the Dryad repository24 and in the curated on-line version. Most of the trait types in our raw data trait database (state table) are coded as binary variables, but a minority are continuous (Online-only Table 1). Most traits included in this dataset are divided into multiple sub-traits. For example, the trait ‘overwintering stage’ comprises four binary sub-traits, each of which indicates one stage of a species’ life cycle: egg, larva, pupa or adult. A species can have any combination of 0 and 1 for each of these sub-traits. This allows for the coding of trait plasticity across a species’ range. Likewise, voltinism is coded as binary values for different states. This coding of the basic data (state table) has been transformed (traits table) into a series of multinomial traits, derived from binary states in the state table, and into continuous data, where the data in the state table are also continuous (Online-only Table 2). Additional variables are added in the traits table to describe variability within traits, resulting in a matrix of 542 species × 31 trait variables. Presenting the basic data within the database (state table) facilitates adding data in the future, which could include novel combinations of sub-trait states, whilst providing both the processed data (traits table) and the original raw data (state table) aids different analytical procedures.
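The relationship between the binary state table and the multinomial traits table can be illustrated with a minimal Python sketch. It is not the authors' transformation: the column names, the "+"-joined label and the simple count used as a variability measure are all assumptions made for illustration, and the species names are placeholders.

```python
# Hypothetical sub-trait names for 'overwintering stage'; the real state table
# may label its columns differently.
OVERWINTERING_STAGES = ("egg", "larva", "pupa", "adult")


def collapse_binary_states(row, states=OVERWINTERING_STAGES):
    """Collapse binary sub-trait states into one multinomial label.

    `row` maps each sub-trait name to 0, 1 or None (missing).
    Returns (label, n_states): the label joins every state coded 1, and
    n_states counts them as a crude, assumed measure of within-species
    variability. Both are None when every state is missing.
    """
    values = [row.get(state) for state in states]
    if all(value is None for value in values):
        return None, None
    present = [state for state, value in zip(states, values) if value == 1]
    label = "+".join(present) if present else "none"
    return label, len(present)


# Toy state table with placeholder species, not values from the database.
state_table = {
    "Species A": {"egg": 0, "larva": 1, "pupa": 0, "adult": 0},
    "Species B": {"egg": 0, "larva": 1, "pupa": 1, "adult": 0},
    "Species C": {"egg": None, "larva": None, "pupa": None, "adult": None},
}

traits_table = {
    species: collapse_binary_states(row) for species, row in state_table.items()
}

for species, (label, n_states) in traits_table.items():
    print(species, label, n_states)
```

Keeping the binary states as the primary record means that a new combination of states (for example a species found to overwinter as both egg and adult) only requires changing a 0 to a 1; the multinomial label is then regenerated rather than edited by hand.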

Traits are divided into four main types: ‘life history’, ‘morphological’, ‘resource-based’, and ‘behavioural’. Life history traits describe a species’ life cycle related to reproduction and also to growth and survival, including the number of generations per year (voltinism), egg laying strategy (egg laying type) and overwintering stage. We use wing size (both forewing length and precisely defined wingspan) as key morphological traits because they have been used in previous trait-based analyses, being correlated with mobility167,168,169, development time170, and reproductive output171, as size correlates with many aspects of life history172,173. Wingspan is included as it also includes an approximate measure of thoracic size, and thus flight muscle mass, which may influence flight capacity and dynamics. ‘Resource-based traits’ describe species’ relationships with environmental resources. Resources include consumables that can be depleted over time when used and utilities that are not depleted. For example, ‘adult feeding’ describes the range of resources consumed by adults, which may be temporarily or permanently depleted. Likewise, ‘adult roosting’ describes structures (utilities) used for roosting behaviour. These structures are resources: although they are not directly consumed, there may be a finite number of suitable features of this type within a location, which may limit local populations and may become the subject of both interspecific and intraspecific competition25,26.

Some traits in the database are primarily behavioural such as ‘mate locating type’, but these traits are also closely linked with traits that relate more directly to resource-usage (in this case with ‘mate locating location’); thus, behavioural traits can also be linked to resources. Larval hostplants are examined in detail in several traits because of their importance for the life cycle and population structure of butterflies. Some authors of previous work using butterfly traits have included ‘habitat breadth’ as a trait14,174, although the physical structures/vegetation types occupied by species are not traits themselves, but the result of species occurring in those locations where their essential resources co-occur in spatial patterns and densities that they can use and these can change substantially across the geographic area our database covers. Essentially, species habitats are defined by their resources25,26 and the resource requirements that species have are fundamental traits. Biotope or habitat associations are therefore not included in this dataset as they can be derived from the traits described in our database24. Additionally, biotope traits have been shown to have poor reproducibility among different trait sources23 and have been found to be less useful than other types of traits for understanding the responses of butterflies to environmental change over time at a large scale19,23. At a smaller scale, biotope associations may be useful characteristics for aiding in butterfly conservation and habitat classification, but any attempt to synthesise information at a large geographic scale describing habitat preferences from multiple sources would likely be both error-prone and probably too coarse for most analyses. We also did not include measures of climatic requirements and geographic ranges in our dataset since they are already publicly available in the CLIMBER dataset10.

Data Records

The database24 is deposited on the Dryad Digital Repository. Both the deposited database and the live version include species-specific information sources and a PDF file describing each of the variables in the raw state table and traits table (ButterflyTraitDefinitions.pdf). The live version includes a mechanism for feedback and adding new information. For some taxa there are missing data and some traits currently have more missing values than others. Life history and hostplant related traits are extensively covered with few missing values, whereas behavioural traits have the most missing values because they usually require direct observation in the field. However, the types of traits with missing data (Table 2) indicate where targeted fieldwork is required. Likewise, species with poor overall data also warrant targeted future effort.

Table 2 The percentage of each family within the European and Maghreb butterflies trait database with incomplete trait data, described by 31 multinomial and continuous variables in the traits table.
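A per-family summary of this kind could be computed along the following lines. This is a minimal Python sketch under one assumed reading of Table 2, namely the percentage of taxa in each family with at least one missing value among the trait variables; the function name, the record layout and the toy values are hypothetical and do not come from the database.

```python
from collections import defaultdict


def percent_incomplete_by_family(records):
    """Return {family: percentage of taxa with at least one missing value}.

    `records` is an iterable of (family, trait_values) pairs, where
    trait_values is a list and None marks a missing value.
    """
    totals = defaultdict(int)
    incomplete = defaultdict(int)
    for family, values in records:
        totals[family] += 1
        if any(value is None for value in values):
            incomplete[family] += 1
    return {
        family: 100.0 * incomplete[family] / totals[family] for family in totals
    }


# Toy records (made-up values): three trait variables per taxon.
records = [
    ("Pieridae", [1, "larva", 22.5]),
    ("Pieridae", [1, None, 20.0]),
    ("Riodinidae", [2, "pupa", 15.0]),
]

summary = percent_incomplete_by_family(records)
for family, percent in sorted(summary.items()):
    print(f"{family}: {percent:.1f}% of taxa with incomplete trait data")
```

The same loop can be transposed to count missing values per trait rather than per family, which is the view needed to decide where targeted fieldwork would fill the most gaps.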

Technical Validation

The records included in the database are based on previously published information from field guides, ecological atlases and peer reviewed journal articles, supplemented with the authors’ personal observations. We are therefore confident as to their accuracy. When sources highlighted that records for a particular trait were doubtful, this information was not included in the dataset. The author team comprises experts on butterfly ecology coming from seven countries across Europe thus ensuring the highest level of repeated quality control while providing best knowledge across the biomes in Europe. The authors have examined the dataset to check for errors and to assess the accuracy of the trait information included. All data included in the dataset is fully referenced which allows anyone to go back to the original records for any piece of trait information. The dataset currently contains some missing values, especially for highly localised species and we intend to keep the database ‘live’ and to manage updates with new information. Certain traits such as voltinism and phenology (flight months) are known to vary across the latitudinal gradient as these traits may in part be responses to accumulated growing degree days175. We are confident that we have captured variability of these traits for the majority of species by consulting trait sources that encompass both the full European range as well as smaller areas. We will accept data into our live version of the database from existing resources, unpublished information and new published information. Each species has its own reference list so existing data can be checked and new information correctly integrated into the database. Data submission methods are described in the live database.

Usage Notes

We have provided the first extensive database of butterfly traits in Europe and North Africa. Of particular value is the species and geographic coverage and the extensive sets of traits that we have included. This provides an outstanding resource for improving our understanding of fundamental processes such as how traits define species co-occurrences and their responses to environmental change, their spatial dynamics, and their associations with vegetation structures. Since traits vary within different taxonomic groups, understanding their evolution and variability among different branches of the tree of life can also provide insights into phylogenetic constraints on species resource requirements and ultimately on their local abundance and large-scale occurrence and vulnerability to environmental change. As our trait database includes a large component of resource requirements for all life-history stages it can also be used to aid conservation efforts by focusing on resources that may be limited for vulnerable species at small to large spatial scales. Additionally, the inclusion of behavioural traits within the database can contribute to increasing our understanding of the roles of behavioural characteristics in determining species occurrences and resource use.

We have minimised processing of the data within the state table of the database. Individual variables in this state table may have poor linear relationships or spurious negative correlations due to their statistical distributions and outlier effects, which can constrain both phylogenetic and ecological analyses. Although fuzzy methods of multivariate analysis may accommodate these issues176, the processed multinomial and continuous variables with measures of variability provided in the traits table facilitate more conventional approaches to multivariate analysis. For some species there are missing data and some traits currently have more missing values than others. Whilst updating the database will supply some missing data, there are imputation methods177,178 that can be used to predict these values, and we are confident that, in the absence of verified data, imputed data can be used to retain both species and traits with missing values within analyses.
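As a simple illustration of how imputation lets a species with missing values stay in an analysis, the Python sketch below fills a missing continuous trait with the median of the same family and falls back to the overall median when the family has no observed values. This is a deliberately basic stand-in and not the imputation methods cited above177,178, which may also use phylogeny or other traits; the function name and all numbers are made up for illustration.

```python
from statistics import median


def impute_by_group_median(values, groups):
    """Fill None entries with the median of their group.

    Falls back to the overall median when a group has no observed values.
    `values` and `groups` are parallel lists of equal length.
    """
    observed = [value for value in values if value is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    overall = median(observed)

    # Collect the observed values of each group.
    by_group = {}
    for value, group in zip(values, groups):
        if value is not None:
            by_group.setdefault(group, []).append(value)
    group_median = {group: median(vals) for group, vals in by_group.items()}

    return [
        value if value is not None else group_median.get(group, overall)
        for value, group in zip(values, groups)
    ]


# Toy forewing lengths in mm (made-up values); None marks a missing value.
forewing = [20.0, None, 24.0, 15.0, None]
families = ["Pieridae", "Pieridae", "Pieridae", "Lycaenidae", "Riodinidae"]

imputed = impute_by_group_median(forewing, families)
print(imputed)
```

Imputed values should be flagged in any downstream table so that they can be replaced when verified data are added to the live database.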