Spread of a model invasive alien species, the harlequin ladybird Harmonia axyridis in Britain and Ireland

Invasive alien species are widely recognized as one of the main threats to global biodiversity. Rapid flow of information on the occurrence of invasive alien species is critical to underpin effective action. Citizen science, i.e. the involvement of volunteers in science, provides an opportunity to improve the information available on invasive alien species. Here we describe the dataset created via a citizen science approach to track the spread of a well-studied invasive alien species, the harlequin ladybird Harmonia axyridis (Coleoptera: Coccinellidae) in Britain and Ireland. This dataset comprises 48 510 verified and validated spatio-temporal records of the occurrence of H. axyridis in Britain and Ireland, from first arrival in 2003, to the end of 2016. A clear and rapid spread of the species within Britain and Ireland is evident. A major reuse value of the dataset is in modelling the spread of an invasive species and applying this to other potential invasive alien species in order to predict and prevent their further spread.


Background & Summary
The invasion process for an alien species involves various stages, notably introduction, establishment, increase in abundance and geographic spread 1. An alien species that spreads and has negative effects (which may be ecological, economic or social) is termed invasive 2,3. Invasive alien species are widely recognized as one of the main threats to global biodiversity [4][5][6]. There are a number of international agreements which recognize the threat posed by invasive alien species, which are designated as a priority within the Convention on Biological Diversity Aichi biodiversity target 9 (https://www.cbd.int/sp/targets/rationale/target-9/) and are relevant to many of the Sustainable Development Goals (http://www.un.org/sustainabledevelopment/sustainable-development-goals/). An EU Regulation on invasive alien species came into force on 1 January 2015 (http://ec.europa.eu/environment/nature/invasivealien/index_en.htm) and subsequently a list of invasive alien species of EU concern was adopted for which member states are required to take action to eradicate, manage or prevent entry. Rapid flow of information on the occurrence of invasive alien species is critical to underpin effective action. There have been few attempts to monitor the spread of invasive alien species systematically from the onset of the invasion process. Citizen science, i.e. the involvement of volunteers in science, provides an opportunity to improve the information available on invasive alien species 7. Here we describe the dataset created via a citizen science approach to track the spread of a well-studied invasive alien species, the harlequin ladybird Harmonia axyridis (Coleoptera: Coccinellidae) in Britain and Ireland. This species was detected very early in the invasion process and a citizen science project was initiated and widely promoted to maximize the opportunity to gather data from the public across Britain and Ireland.
Harmonia axyridis was introduced between approximately 1982 and 2003 to at least 13 European countries 8 as a biological control agent. It was mainly introduced to control aphids that are pests to a range of field and glasshouse crops. From the early 2000s it subsequently spread to many other European countries, including Britain and Ireland. It is native to Asia (including China, Japan, Mongolia and Russia) 9 and was also introduced in North and South America and Africa 10 . Harmonia axyridis was introduced unintentionally to Britain from mainland Europe by a number of pathways: some were transported with produce such as cut flowers, fruit and vegetables; others arrived through natural dispersal (flight) from other invaded regions 11 . To a lesser extent H. axyridis also arrived from North America 12 . The major pathways of spread to Ireland were probably natural dispersal (from Britain) and arrival with produce. Harmonia axyridis is a eurytopic (generalist) species and may be found on deciduous or coniferous trees, arable and horticultural crops and herbaceous vegetation in a wide range of habitats. It is particularly prevalent in urban and suburban localities (e.g. parks, gardens, and in or on buildings) 13 .
Citizen science approaches to collecting species data are becoming increasingly popular and respected 14 . Advances in communication and digital technologies (e.g. online recording via websites and smartphone applications; digital photography) have increasingly enabled scientists to collect and verify large datasets of species information 15 . For a few species groups, including ladybirds, verification to species is possible if a reasonably good photograph of the animal is available. In late 2004, shortly after the first H. axyridis ladybird record was reported, funding was acquired from Defra and the National Biodiversity Network (NBN) to set up and trial an online recording scheme for ladybirds, and H. axyridis

Methods
This dataset (Data Citation 1) comprises 48 510 spatio-temporal records of the occurrence of H. axyridis in Britain and Ireland, from first arrival in 2003, to the end of 2016. For its type it is thus an unusually substantial dataset. Whilst the records were collated and verified by the survey organizers, the records themselves were provided by members of the public in Britain and Ireland. Uptake of the Harlequin Ladybird Survey was undoubtedly assisted by the pre-existence of the Coccinellidae Recording Scheme (now the UK Ladybird Survey), supported by the Biological Records Centre (within NERC Centre for Ecology & Hydrology) 16.
Reflecting the general diversification of citizen science through innovative use of technology 17  records from 2003 and no earlier records have been received, supporting the case that the earliest records in the dataset represent the onset of the invasion process for this species. Indeed H. axyridis has a relatively high detectability (e.g. 19 ) and rapid reproductive rate, so is unlikely to have arrived unnoticed.
Each record represents a verified sighting of H. axyridis on a given date (or range of dates) and comprises one or more individual ladybirds observed from one or more life stages (larva, pupa, adult). Records are from Britain (England, Wales and Scotland, including offshore islands), Ireland (both Northern Ireland and the Republic of Ireland), the Isle of Man and the Channel Islands (primarily Guernsey and Jersey) and are mainly from the period 2004 to 2016. The earliest record of H. axyridis in Britain was initially thought to be from 3 July 2004, but three earlier records (from 2003) were received retrospectively. The data records represent species presence and there are no absence data available.
The majority of the records were received from members of the public via online recording forms (at www.harlequin-survey.org or www.ladybird-survey.org) (Supplementary Figure 1) or via smartphone apps (iRecord Ladybirds or iRecord, www.brc.ac.uk/irecord) (Supplementary Figure 2), with some records (especially in earlier years) received by post. Other records, particularly from amateur expert 16 coleopterists and other naturalists, were received in spreadsheets.
The spatial resolution of the records is variable. Many include an Ordnance Survey grid reference (converted to latitude and longitude), enabling resolution to 100 metres or less, but many others were derived at 1 km resolution from a UK postal code (UK Government Schemas and Standards, http://webarchive.nationalarchives.gov.uk/20101126012154/http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx). The option on the online recording form to enter the location via a UK postal code was provided to make the entry of records easier for members of the public unfamiliar with grid referencing systems. Whilst the resolution is thus reduced for these records, the reduction in user error (e.g. the problem of grid reference eastings and northings being transposed) is an advantage 20. The postal code method was applicable for sightings of H. axyridis made within 200 metres of a specified postal code, so could not be used for a minority of records where the ladybird was seen in a remote semi-natural habitat. The spatial resolution of the records tended to increase over time, as the number of records received via the smartphone apps increased, and these records generally have GPS-generated latitudes and longitudes.

LONGITUDE
Longitude (WGS 84) for the centre of the grid reference supplied in the GRIDREF field.

GR_1KM
The grid reference of the 1 km grid square in which the occurrence record lies (only populated if the original grid reference is at 1000 m precision or finer).

LATITUDE_1KM
Latitude (WGS 84) for the centre of the 1 km grid square in which the record occurs.

LONGITUDE_1KM
Longitude (WGS 84) for the centre of the 1 km grid square in which the record occurs.

GR_10KM
The grid reference of the 10 km grid square in which the occurrence record lies.

LATITUDE_10KM
Latitude (WGS 84) for the centre of the 10 km grid square in which the record occurs.

LONGITUDE_10KM
Longitude (WGS 84) for the centre of the 10 km grid square in which the record occurs.

ABUNDANCE
A text field containing any abundance information that was supplied with the record.

RECORDER
ID(s) for recorder name(s) that submitted the record. In the BRC database recorder information is standardized to Surname followed by initials (e.g. Smith, J.). For this reason, the recorder ID relates to a unique standardized name and not an individual. A single ID can refer to multiple individuals and a single individual can also be associated with multiple recorder IDs.
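The relationship between a full grid reference and the GR_1KM and GR_10KM fields can be illustrated with a short sketch. It is not the routine used to build the dataset: the function name coarsen_gridref is hypothetical, and it assumes a British National Grid reference made of two letters followed by an even number of digits. Irish Grid references, Channel Islands references and tetrad codes would need separate handling.

```python
import re

# Two grid letters followed by digits (assumed British National Grid format).
_GRIDREF = re.compile(r"^([A-Z]{2})(\d*)$")


def coarsen_gridref(gridref, square_m):
    """Truncate a grid reference to the 1 km or 10 km square containing it.

    square_m is the target square size in metres (1000 or 10000).
    Returns None when the supplied reference is coarser than the target,
    mirroring a field that is left unpopulated in that case.
    """
    ref = gridref.replace(" ", "").upper()
    match = _GRIDREF.match(ref)
    if match is None or len(match.group(2)) % 2 != 0:
        raise ValueError(f"unrecognized grid reference: {gridref!r}")
    letters, digits = match.groups()
    half = len(digits) // 2
    # Digits kept per axis: one for a 10 km square, two for a 1 km square.
    keep = {10000: 1, 1000: 2}[square_m]
    if half < keep:
        return None
    eastings, northings = digits[:half], digits[half:]
    return letters + eastings[:keep] + northings[:keep]
```

For example, coarsen_gridref("TL123456", 1000) gives "TL1245" and coarsen_gridref("TL123456", 10000) gives "TL14", whereas a reference supplied only at 10 km precision yields None for the 1 km square.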

Data Records

Repository
The dataset is freely available for download from the Environmental Information Data Centre (EIDC) catalogue (Data Citation 1). The dataset is provided as a single tab-delimited text file, with each line representing a single record.
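As a minimal sketch of reading a file in this layout, the following uses only the Python standard library. It assumes that the first line of the file is a header row of field names; the function name read_records and the two sample rows (including their coordinates) are invented for illustration and are not taken from the dataset.

```python
import csv
import io


def read_records(handle):
    """Yield one dict per record from a tab-delimited file with a header row."""
    for row in csv.DictReader(handle, delimiter="\t"):
        # Convert coordinates to numbers where they are present.
        for key in ("LATITUDE", "LONGITUDE"):
            if row.get(key):
                row[key] = float(row[key])
        yield row


# Hypothetical two-record sample standing in for the downloaded file.
sample = io.StringIO(
    "GRIDREF\tLATITUDE\tLONGITUDE\tGR_10KM\tABUNDANCE\n"
    "TL123456\t52.09\t-0.36\tTL14\t3 adults\n"
    "TQ3080\t51.50\t-0.13\tTQ38\t\n"
)
records = list(read_records(sample))
```

With the real file, the same function could be called on a handle opened with open(path, newline="", encoding=...), where the path and the encoding depend on the downloaded copy.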

Constituents of Species Records
Each species record includes 19 fields (Table 1).

Figures and Tables
The figures and tables here show a summary of the dataset, notably the number of verified H. axyridis records received by year (Fig. 1), by month (Fig. 2), by vice county (Fig. 3 and Table 2 (available online only)) and the spread of H. axyridis in Britain, the Channel Islands and Ireland from 2003 to 2016 (Fig. 4).
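A summary of spread of the kind described above could be derived from the records by counting, for each year, how many 10 km squares had been recorded as occupied so far. The sketch below is an illustration under stated assumptions, not the method used to produce the figures: the function name is hypothetical, and each record is assumed to have been reduced to a (year, GR_10KM) pair beforehand. Because the data are presence-only, the result reflects recording effort as well as the distribution of the species.

```python
from collections import Counter
from itertools import accumulate


def cumulative_squares_by_year(records):
    """Return (year, cumulative count of occupied 10 km squares) pairs.

    records is an iterable of (year, gr_10km) pairs. Years in which no
    new square was recorded are omitted from the result.
    """
    # Earliest year in which each square has a record.
    first_year = {}
    for year, square in records:
        if square not in first_year or year < first_year[square]:
            first_year[square] = year
    # Number of squares first recorded in each year, then a running total.
    new_per_year = Counter(first_year.values())
    years = sorted(new_per_year)
    totals = accumulate(new_per_year[y] for y in years)
    return list(zip(years, totals))
```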

Technical Validation

Record Verification
Records were verified by the survey organizers (led by HER and PMJB but also including others) on receipt of either a photograph or a ladybird specimen. The records received from amateur expert coleopterists and other naturalists are regarded as accurate (i.e. without the survey organizers seeing a photograph or specimen) and have been included in the dataset. Many further online records were received that remain unverified (i.e. no photograph or specimen was sent, or the photograph was of insufficient quality to enable identification) or were verified as another species. All such unverified or inaccurate records are excluded from this dataset. For discussion of these issues (partly relating to our dataset) see 21. Verified records were regularly uploaded to the NBN Gateway (now the NBN Atlas, https://nbnatlas.org/). There the records could be viewed via online maps, which helped to encourage further recording.

Recording Intensity
Recording intensity by the public was not consistent over time and was influenced by media coverage, publicity events by the survey organizers, and other factors. The number of records in a period is also influenced by weather conditions and seasonality: the main peak in record numbers each year tended to be from late October to early November, the period in which H. axyridis generally moves to indoor overwintering sites (hence this is when many people first notice the species in their homes).
There is also spatial variability in recording intensity: more records come from areas with high densities of people (Fig. 3). Across Britain and Ireland there were a number of particularly active local groups or individuals which contributed hotspots of recorder activity, e.g. London. To many recorders, juvenile stages (especially pupae and early instar larvae) were less noticeable and more difficult to identify than the adult stage, thus limiting their recording. The possibility of a reporting bias towards sightings early in the season also exists (i.e. some recorders may have reported their first sighting of H. axyridis, but not subsequent sightings). In order to minimize this effect, the importance of recording multiple sightings was stressed to recorders. The peaks in record numbers observed late in each year also suggest that any effect of this potential bias was minor. There is probably a further minor temporal bias towards recording on some days of the week (e.g. weekend days) more than others.

Technical Validation
In addition to the expert verification detailed above, each record has also undergone a series of validation checks that are designed to highlight other potential issues with the data. Checks were performed on the date information supplied with the record to ensure that the start and/or end dates supplied are in recognized formats, are valid dates, are in the past or present (i.e. no future dates), and, where both were supplied, that the start date is prior or equal to the end date. The location information was also checked to ensure that the supplied grid reference is in a recognized format, is a valid grid reference, and is from a 10 km and/or 1 km square that contains land. If other location fields were supplied with the grid reference (such as 10 km grid reference, vice county, tetrad or quadrant codes, etc.) they were cross-checked to ensure consistency.
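The date checks described above can be sketched as follows. This is a minimal illustration rather than the validation code used for the dataset: the function names are hypothetical, and a single ISO-style date format (YYYY-MM-DD) is assumed, whereas the actual checks accepted more than one recognized format.

```python
from datetime import date, datetime


def parse_date(text, fmt="%Y-%m-%d"):
    """Parse a date string; raises ValueError for bad formats or invalid dates."""
    return datetime.strptime(text, fmt).date()


def check_dates(start, end, today=None):
    """Return a list of problems found in a record's start and end dates.

    Either date may be empty or None, in which case it is skipped.
    """
    today = today or date.today()
    problems = []
    parsed = {}
    for label, text in (("start", start), ("end", end)):
        if not text:
            continue
        try:
            parsed[label] = parse_date(text)
        except ValueError:
            problems.append(f"{label} date is not a valid date: {text!r}")
            continue
        if parsed[label] > today:
            problems.append(f"{label} date is in the future")
    # The ordering check only applies when both dates parsed successfully.
    if "start" in parsed and "end" in parsed and parsed["start"] > parsed["end"]:
        problems.append("start date is after end date")
    return problems
```

An empty list means the record passed every check. Returning a list of messages, rather than stopping at the first failure, lets a reviewer see all the date problems in a record at once.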