Background & Summary

The incidental catch of non-target species (bycatch) during fishing operations is one of the major global threats to marine megafauna, including cetaceans, sea turtles and elasmobranchs1,2,3,4,5. Owing to their particular life-history traits and distribution, populations of these species are considered particularly vulnerable to direct mortality caused by fishing operations, and fisheries bycatch can contribute to their decline6,7,8. However, the ability of scientists to advise on management measures for vulnerable species is still limited by data availability9.

Within the EU Data Collection Framework, Member States are obliged to collect and deliver a wide range of fisheries data needed for scientific advice10. Data are usually recorded and stored at national or regional level by different bodies (e.g., public institutions and non-governmental organizations, NGOs) in different databases. Hence, datasets are fragmented and neither readily available nor easily accessible for scientific, management and conservation purposes.

Appropriate management strategies that can mitigate the impact of fisheries on vulnerable species are urgently needed, but they require easily accessible data from systematic and independent monitoring programmes11. In the Mediterranean Sea, many authors have documented that dolphins12, sea turtles13, sharks and rays14,15 interact with and are incidentally taken by different types of fisheries, including trawlers, longlines and gillnets16. Nevertheless, few quantitative historical bycatch data are available for this marine megafauna, and only recently have a few authors started to share time series of Mediterranean fishery data in public repositories17,18,19,20.

To our knowledge, this is one of the first initiatives to make public the historical bycatch data of marine megafauna recorded in the most heavily impacted basin of the Mediterranean Sea21, the northern central Adriatic Sea. This area supports a rich and valuable marine biodiversity, including marine megafauna, and is subject to a variety of anthropogenic pressures, mainly intense fishing activities, eutrophication, large urban development along coastal areas, and environmental pollution22,23,24,25. Since the early eighties, the northern central Adriatic Sea has been intensively exploited by many fisheries, including the Italian midwater pair trawl fishery, which is one of the largest in the Mediterranean26. Between 2006 and 2019, an extensive monitoring programme of accidental catches of marine megafauna was conducted on this fishery under a permit issued by the Italian Ministry of Agriculture, Food and Forestry (Fishery and Aquaculture directorate), in compliance with the Italian obligations under Council Regulation (EC) 812/2004 and the EU Data Collection Framework. The primary goal of the programme was to identify and assess the impact of fisheries bycatch on cetaceans in the Italian midwater pair trawl fishery. Subsequently, other species of conservation concern, such as sea turtles and elasmobranchs, were also included in the monitoring activity. In this framework, long-term fishery-dependent data collected by trained observers provided the most reliable information on the interaction of different vulnerable species with a specific fishing gear in the northern central Adriatic Sea.

From the data collection, a database was built, and three datasets were extracted and are described in the present work. Some of the information included in these datasets has already proved useful for evaluating the impact27 and predicting the incidental catches of elasmobranchs28,29 and sea turtles30 in the Italian midwater pair trawl fishery. The information gathered in the datasets can help to assess the extent of fisheries bycatch of different vulnerable species in the Mediterranean Sea. Hence, these datasets can be particularly helpful for understanding the ecology of these species and identifying appropriate fishery management measures.


Data collection

Between 2006 and 2019, an average of 13 trained observers per year monitored all fishing operations of 4392 fishing trips and collected bycatch data on cetaceans, sea turtles and elasmobranchs on board 68 pelagic trawlers with a length overall (LOA) >15 m in the northern-central Adriatic Sea. For each haul, they recorded operational parameters (e.g., haul duration, time of net setting and hauling, trawling speed) and environmental variables (e.g., geographical coordinates, water depth). Bycatch specimens were measured to the nearest cm using a measuring board and weighed to the nearest gram using an electronic scale, or a dynamometer for the largest specimens. For each individual, physical status was assessed by examining body condition, including the presence of any injuries or bleeding, the response to external stimuli, and general activity and locomotion (Table 1).

Table 1 Details of variables included in the shapefiles available in the Marine Data Archive.

The monitoring activity was designed according to fleet dynamics, which in the midwater pair trawl fishery are highly variable in space, depending on the distribution of the target species (small pelagics), and in time, depending on the national regulation of fishing effort (i.e., trawlers must respect temporal closures during weekends and spawning periods of the target species, and should operate within Italian waters). Based on these considerations, observations covered between 3 and 7% of the total annual fishing effort of midwater pair trawlers operating in the northern-central Adriatic Sea.

Database framework

All information collected on board by fishery observers was reported in a dedicated spreadsheet. Each file was read and checked for potentially erroneous entries using a series of Python routines. After validation, the data were uploaded to the “BYCATCH” database hosted at the Italian National Research Council (CNR) Institute of Marine Biological Resources and Biotechnologies (IRBIM) of Ancona, Italy. BYCATCH was built in MySQL and was managed and maintained using Python, R and different database management tools (e.g., phpMyAdmin and MySQL Workbench). BYCATCH consists of a collection of tables that store interrelated data. The main database tables are illustrated in Fig. 1. Each record is associated with a unique ID, which makes it possible to create relationships between tables and to generate different datasets. BYCATCH contains the fishing operations of the Italian midwater pair trawl fishery monitored between 2006 and 2019, the geographic distribution and biological information of incidental catches of cetaceans, sea turtles and elasmobranchs, and dolphin sightings. An overview of all data stored in BYCATCH is provided in Online-only Table 1. The geographic distribution of all bycatch specimens is shown in Fig. 2; for each species, the number of specimens recorded every year and their bycatch rate are shown in Fig. 3. While BYCATCH was built specifically to house bycatch data of the species noted above, the database structure was designed to be easily applied to other species and to include different types of information (e.g., catches of target species and environmental variables). From BYCATCH, three datasets were extracted via queries run directly in MySQL or through the free software environment R31 using the RMySQL package, a database interface and ‘MySQL’ driver for R32.
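As an illustration of how a dataset could be generated from related tables through a shared ID, the following minimal sketch joins a haul table to a bycatch table and aggregates by year and grid cell. It is not the actual BYCATCH schema or extraction query: the table names (`haul`, `bycatch`), the column names, the cell labels and the sample rows are all hypothetical, and an in-memory SQLite database stands in for the MySQL server so that the example runs without external dependencies.

```python
import sqlite3

# Stand-in for the MySQL connection: an in-memory SQLite database.
# Table names, column names and rows are hypothetical, not the BYCATCH schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE haul (
    haul_id INTEGER PRIMARY KEY,
    year    INTEGER NOT NULL,
    cell_id TEXT    NOT NULL
);
CREATE TABLE bycatch (
    specimen_id INTEGER PRIMARY KEY,
    haul_id     INTEGER NOT NULL REFERENCES haul(haul_id),
    species     TEXT    NOT NULL
);
""")
conn.executemany("INSERT INTO haul VALUES (?, ?, ?)", [
    (1, 2006, "A1"), (2, 2006, "A1"), (3, 2006, "B2"), (4, 2007, "A1"),
])
conn.executemany("INSERT INTO bycatch VALUES (?, ?, ?)", [
    (1, 1, "Caretta caretta"), (2, 1, "Caretta caretta"),
    (3, 3, "Myliobatis aquila"), (4, 4, "Caretta caretta"),
])

# LEFT JOIN keeps hauls without bycatch, so zero-catch cells are retained.
QUERY = """
SELECT h.year, h.cell_id,
       COUNT(DISTINCT h.haul_id) AS n_hauls,
       COUNT(b.specimen_id)      AS n_specimens
FROM haul AS h
LEFT JOIN bycatch AS b
       ON b.haul_id = h.haul_id AND b.species = ?
GROUP BY h.year, h.cell_id
ORDER BY h.year, h.cell_id
"""


def effort_and_bycatch(connection, species):
    """Return monitored hauls, specimens and bycatch rate per year and cell."""
    rows = connection.execute(QUERY, (species,)).fetchall()
    return [
        {"year": year, "cell_id": cell, "n_hauls": hauls,
         "n_specimens": specimens, "rate": specimens / hauls}
        for (year, cell, hauls, specimens) in rows
    ]


result = effort_and_bycatch(conn, "Caretta caretta")
for row in result:
    print(row)
```

The rate computed here follows the definition used in Fig. 3 (individuals per haul). Joining from the haul table outward is a deliberate choice in this sketch, because it preserves monitored cells in which the species was never caught.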

Fig. 1

Diagram showing the core database tables. Colours group the information by type: (a) anagraphic table and fishing vessels; (b) geographical data; (c) fishing trips and hauls; (d) target species captures and length frequency distributions. The purple box collects biological data of bycatch species: (e) data about the species; (f) specimens’ sightings; (g) the core bycatch tables for elasmobranchs, cetaceans and turtles; (h) additional data on species and sex shared by multiple tables. The codes visible in the Marine turtles table have the following meanings: fr.distr, frequency distribution; CCLmax, maximum curved carapace length (cm); CCWmax, maximum curved carapace width (cm); SCL, straight carapace length; SCW, straight carapace width.

Fig. 2

Distribution of total bycatch of (a) dolphins; (b) sea turtles; (c) sharks and (d) skates and rays recorded between 2006 and 2019 in the northern central Adriatic Sea. The number of individuals per taxonomic group is represented by a colour scale applied to a grid with 5 nm cells.

Fig. 3

Number of specimens of (a) dolphins; (b) sea turtles; (c) sharks and (d) skates and rays recorded between 2006 and 2019 and the corresponding average estimated bycatch rates (individuals per haul).

Data Records

The three datasets are available on the Marine Data Archive (MDA)33. All datasets include data aggregated per year and per cell, using a grid of 5 nm (nautical mile) cells that covers the northern central Adriatic Sea. The datasets consist of a collection of shapefiles arranged in GeoPackage format34. In all shapefiles, the attribute table reports the cell ID and the mean geographical coordinates of each cell. In addition, each shapefile contains the following specific information:

(I) Monitored fishing effort of midwater pair trawlers in the northern central Adriatic Sea. This dataset contains 17654 monitored fishing operations arranged in 14 layers (the time series 2006–2019). Each layer (year) describes the monitored effort in terms of the number of hauls per cell (n_Hauls), the number of fishing hours per cell (fsh.hrs), the average haul duration in minutes per cell (avH.Len), the average towing speed in knots (avHspeed) and the average sea depth in metres (avDepth).

(II) Incidental catches and morphometric data of marine megafauna. This data collection includes 3529 bycatch events arranged in 23 shapefiles (one per species). Three taxonomic groups are represented within the dataset: bottlenose dolphins, loggerhead turtles and elasmobranchs (sharks and rays). Within each shapefile’s attribute table, species abundance is recorded as the number of individuals per cell and per year. The dataset also contains biological and morphological information on 7496 bycatch specimens aggregated by cell. Depending on the taxonomic group, the additional information reported for each cell includes morphometric data, namely the average body length (cm), body weight (kg), disc width (cm, only for batoids) and carapace measurements for marine turtles, as well as sex classification (females, males, unknown) and the physical condition of the individuals released at sea. A detailed description of all the variables considered is provided in Table 1.

(III) Sightings of bottlenose dolphin (Tursiops truncatus). This file contains 6953 observed individuals from 3011 fishing operations aggregated by year and cell.
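Because a GeoPackage is an SQLite-based container, the layers it holds can be listed with the Python standard library alone by reading the `gpkg_contents` table defined by the GeoPackage standard. The sketch below is only an illustration under stated assumptions: the layer names (`effort_2006`, `effort_2007`) and the demo file are hypothetical, and the stand-in file contains a minimal `gpkg_contents` table rather than a complete, valid GeoPackage.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_feature_layers(gpkg_path):
    """Return the names of the vector layers registered in a GeoPackage."""
    with closing(sqlite3.connect(gpkg_path)) as conn:
        rows = conn.execute(
            "SELECT table_name FROM gpkg_contents "
            "WHERE data_type = 'features' ORDER BY table_name"
        ).fetchall()
    return [name for (name,) in rows]


def make_demo_gpkg(path, layer_names):
    """Write a tiny stand-in file holding only a minimal gpkg_contents table."""
    with closing(sqlite3.connect(path)) as conn:
        conn.execute(
            "CREATE TABLE gpkg_contents ("
            "table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL, identifier TEXT)"
        )
        conn.executemany(
            "INSERT INTO gpkg_contents VALUES (?, 'features', ?)",
            [(name, name) for name in layer_names],
        )
        conn.commit()


# Hypothetical layer names; the real names in the archive may differ.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.gpkg")
    make_demo_gpkg(demo_path, ["effort_2007", "effort_2006"])
    layers = list_feature_layers(demo_path)

print(layers)
```

With a file downloaded from the archive, only `list_feature_layers` would be needed. Reading geometries and attribute fields such as n_Hauls would normally be left to a GIS library or desktop GIS software.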

Technical Validation

The collection of tables stored in BYCATCH is the result of an intense compilation and validation process of a 14-year time series of marine megafauna bycatch. All information collected on board by fishery observers was reported in a dedicated spreadsheet developed in Microsoft Excel. To preserve the quality of the data and avoid data entry and typing errors (e.g., wrong entry format, misspelling, missing information), a series of conditional formatting rules was created with Excel Visual Basic for Applications (VBA) macros. While the user was filling in the spreadsheet, any value that did not match the expected entry format or range was highlighted by an error message. The user could then immediately resolve the issue and proceed with the data entry process. Specifically, the rules were set up to highlight potential:

- Mismatches between fishery observers and their corresponding monitored fishing vessel and harbour of origin: each fishery observer monitored a specific fishing vessel from a specific harbour.

- Inconsistencies between temporal variables: all fishing operations recorded at specific dates and times should fall within the time frame of the corresponding fishing trip. In addition, the start time of a haul should precede its end time.

- Inconsistencies in recorded fishing operations: the duration of each fishing operation recorded on board was compared with an estimated duration, calculated as the ratio between the haul length (distance between the starting and ending points of the fishing operation) and the average trawling speed (kn) of the vessel. A maximum discrepancy of 30% between observed and calculated durations was allowed.

- Wrong vessel speed: considering that the usual speed of a midwater pelagic trawler should fall within the range of 3–5 kn, only values inside this range were allowed.
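The temporal, duration and speed rules listed above can be sketched in Python as follows. This is not the VBA or Python code used for BYCATCH: the record layout, field names, sample positions and dates are hypothetical, the haul length is approximated by the great-circle distance between start and end positions, and the 30% discrepancy is assumed to be measured relative to the observed duration, which the text does not specify.

```python
import math
from datetime import datetime

EARTH_RADIUS_NM = 3440.065  # mean Earth radius expressed in nautical miles


def haversine_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles between two positions in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def check_haul(haul, trip_start, trip_end,
               max_discrepancy=0.30, speed_range=(3.0, 5.0)):
    """Return a list of rule violations for one haul record (empty if none)."""
    issues = []
    if haul["start"] < trip_start or haul["end"] > trip_end:
        issues.append("haul outside trip time frame")
    if haul["start"] >= haul["end"]:
        issues.append("haul start not before haul end")
    low, high = speed_range
    if not low <= haul["speed_kn"] <= high:
        issues.append("speed outside allowed range")
    if not issues:
        # Duration is compared only when times and speed are themselves plausible.
        observed_h = (haul["end"] - haul["start"]).total_seconds() / 3600
        distance_nm = haversine_nm(*haul["start_pos"], *haul["end_pos"])
        estimated_h = distance_nm / haul["speed_kn"]
        # Assumption: discrepancy is expressed relative to the observed duration.
        if abs(observed_h - estimated_h) / observed_h > max_discrepancy:
            issues.append("duration discrepancy above threshold")
    return issues


# Hypothetical trip and hauls: about 4 nm towed northwards at 4 kn in one hour.
trip_start = datetime(2010, 5, 3, 4, 0)
trip_end = datetime(2010, 5, 3, 18, 0)
good = {
    "start": datetime(2010, 5, 3, 6, 0),
    "end": datetime(2010, 5, 3, 7, 0),
    "speed_kn": 4.0,
    "start_pos": (44.0, 13.0),
    "end_pos": (44.0 + 4 / 60, 13.0),
}
too_long = dict(good, end=datetime(2010, 5, 3, 8, 0))
too_fast = dict(good, speed_kn=7.0)

good_issues = check_haul(good, trip_start, trip_end)
long_issues = check_haul(too_long, trip_start, trip_end)
fast_issues = check_haul(too_fast, trip_start, trip_end)
print(good_issues, long_issues, fast_issues)
```

Returning a list of messages rather than raising an exception mirrors the workflow described in the text, where inconsistent records are highlighted for the user to review and correct.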

After data validation, all spreadsheets were uploaded to BYCATCH through Python, and the tables were updated to hold the data in a consistent format. During the upload, the programme repeated some checks on the data (e.g., temporal inconsistency, length and speed of fishing operations) and screened the geographical location of fishing operations (e.g., positions on land). If an error was found, the record was flagged for further investigation and correction.

Usage Notes

The datasets are freely available in the Marine Data Archive (MDA) and should be appropriately referenced by citing the present paper. The datasets can be used by various end users. For instance, fishery scientists can examine where the Italian midwater pair trawl fishery operated over 14 years in the northern central Adriatic Sea. Knowing where fishing occurs is crucial for assessing the potential impact on the different vulnerable species taken in the basin. The monitored fishing effort in the datasets can be coupled with Automatic Identification System (AIS) data to quantify unknown vessel tracks, providing a more realistic picture of the overall fishing activity. Ecologists can also couple the present historical bycatch data of different species with other existing time series from different sources (e.g., monitoring activities in other Mediterranean regions, aerial surveys, commercial landings, interviews with fishers). In this context, meta-analyses, Bayesian statistics and other approaches can be used to combine and analyse data from a variety of sources to evaluate long-term trends of potentially threatened marine megafauna and how their populations have changed over time14,35,36,37,38. The datasets described in this work can also help conservation biologists and managers to fill gaps in marine megafauna knowledge and to develop more accurate management strategies.

However, users should take into account some limitations of the datasets. Overall, historical bycatch data of different marine megafauna species exhibit a large number of zero observations (no catch). This is common in fisheries data on non-target species such as dolphins, sea turtles and elasmobranchs, which are caught less frequently than target species (small pelagic fish in the case of the midwater pair trawl fishery). Thus, when evaluating the relative abundance of such species over time, the modelling approach should be able to deal with the resulting interpretation problems35,36. Previous studies showed that catch per unit effort (CPUE) data can be modelled to account for the operational and environmental factors that can affect catch rate. For instance, historical CPUE of two myliobatids included in the dataset, the common eagle ray (Myliobatis aquila) and the bull ray (Aetomylaeus bovinus), was modelled in ref. 28 using Generalized Additive Models (GAMs). This procedure was applied in a delta modelling approach, which made it possible to model the probability of species occurrence and the magnitude of catch events separately (see details in ref. 28). The results indicated that the predictive accuracy of the delta-modelling strategy was rather good, and a similar approach was used in ref. 30 to evaluate the seasonal distribution of bycatch events of loggerhead turtles (Caretta caretta). Furthermore, zero-inflated Generalized Linear Models (GLMs) were used in ref. 29 to examine the relative abundance of four elasmobranch species included in the dataset: common smooth-hound (Mustelus mustelus), common eagle ray (Myliobatis aquila), spiny dogfish (Squalus acanthias) and pelagic stingray (Pteroplatytrygon violacea). That analysis aimed to standardize annual CPUE trends, considering the best set of covariates among those tested (see details in ref. 29).
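The two-component logic of a delta approach can be illustrated with a deliberately simplified sketch. It does not reproduce the GAM-based analyses of the cited studies: plain yearly proportions and means replace the fitted models, no covariates are included, and the haul records are invented for illustration. Under these simplifications the product of the two components equals the ordinary mean catch per haul; the benefit of the delta approach arises only when each component is modelled with its own covariates.

```python
from collections import defaultdict


def delta_index(records):
    """Compute a covariate-free delta index per year.

    records: iterable of (year, catch) pairs, one pair per monitored haul.
    Returns {year: {"p_occurrence", "mean_positive", "index"}}.
    """
    by_year = defaultdict(list)
    for year, catch in records:
        by_year[year].append(catch)

    summary = {}
    for year, catches in sorted(by_year.items()):
        positives = [c for c in catches if c > 0]
        # Component 1: probability that a haul contains at least one specimen.
        p_occurrence = len(positives) / len(catches)
        # Component 2: mean catch, conditional on a positive haul.
        mean_positive = sum(positives) / len(positives) if positives else 0.0
        summary[year] = {
            "p_occurrence": p_occurrence,
            "mean_positive": mean_positive,
            "index": p_occurrence * mean_positive,
        }
    return summary


# Invented haul-level catches with many zeros, as is typical of bycatch data.
hauls = [
    (2006, 0), (2006, 0), (2006, 0), (2006, 2),
    (2007, 0), (2007, 1), (2007, 3), (2007, 0),
]
index_by_year = delta_index(hauls)
for year, values in index_by_year.items():
    print(year, values)
```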
In addition, users should consider that the datasets include fishery-dependent data, which can suffer from intrinsic bias related to the stochasticity of the distribution of marine megafauna and to the lack of a well-defined sampling design of the monitoring activity in space and time. Indeed, as discussed in refs. 27,29,30, the uneven distribution of the monitoring activity was conditioned by fleet dynamics (e.g., fishing closures, fish market preferences and prices) and by bureaucratic delays of the project, which affected both the observed pattern and the estimation of bycatch events.