Data on the diets of Salish Sea harbour seals from DNA metabarcoding

Marine trophic ecology data are in high demand as natural resource agencies increasingly adopt ecosystem-based management strategies that account for complex species interactions. Harbour seal (Phoca vitulina) diet data are of particular interest because the species is an abundant predator in the northeast Pacific Ocean and Salish Sea ecosystem that consumes Pacific salmon (Oncorhynchus spp.). A multi-agency effort was therefore undertaken to produce harbour seal diet data on an ecosystem scale using 1) a standardized set of scat collection and analysis methods, and 2) a newly developed DNA metabarcoding diet analysis technique designed to identify prey species and quantify their relative proportions in seal diets. The DNA-based dataset described herein contains records from 4,625 harbour seal scats representing 52 haulout sites, 7 years, 12 calendar months, and a total of 11,641 prey identifications. Prey morphological hard parts analyses were conducted alongside, resulting in corresponding hard parts data for 92% of the scat DNA samples. A custom-built prey DNA sequence database containing 201 species (192 fishes, 9 cephalopods) is also provided.


Introduction
Accurate information about what pinnipeds eat is challenging to obtain, yet vital for assessing the impacts of pinnipeds on prey populations and pinniped interactions with fisheries and other predators. Biomass reconstruction using prey hard parts in scats can theoretically provide useful quantitative estimates of diet for pinnipeds (Bowen 2000, Tollit et al. 2007), but certain key concerns have proved hard to solve (Boyle 1991, Tollit et al. 2006), particularly the possibility of not detecting (or severely underestimating) important prey contributions. This may occur if soft-bodied prey are not represented by hard parts (Olesiuk et al. 1990), if only the fleshy parts of large or spiny prey are consumed (e.g., the bellies of salmon), or if a prey's hard parts are preferentially regurgitated (e.g., cephalopod beaks; see Bigg and Fawcett 1985). Furthermore, prey with robust skeletal elements may be overrepresented compared with prey with fragile skeletons that survive the digestive process poorly (Jobling and Breiby 1986, Murie and Lavigne 1986). In addition, a number of commercially and trophically important prey taxa (notably Salmonidae, Scorpaenidae, and Elasmobranchii) can typically only be identified from hard parts to the family or genus level, rather than the species level. This makes assessing predation levels on specific salmon species highly challenging. Importantly, hard parts can provide estimates of prey size if some account of the level of digestion is made.
Recent advances in molecular technologies have already proven useful in a number of marine mammal dietary studies (e.g., Reed et al. 1997, Jarman et al. 2002, Casper et al. 2007, Tollit et al. 2009), notably by increasing taxon-level detection rates and improving species resolution. Importantly, captive feeding studies have reliably (>95%) detected different prey species fed in varied quantities by extracting prey DNA from the scat soft-part matrix (prey flesh remains) and have shown that detection of prey is limited to a 48-h period after feeding (Deagle et al. 2005). In contrast, passage times of hard parts are far more variable, especially for cephalopod beaks, due to long-term retention in the digestive tract (Fawcett 1985, Tollit et al. 2003), complicating accurate diet composition estimation. Overall, molecular approaches have the potential to evaluate and alleviate some of the potential biases and limitations associated with reconstructing diets using hard-part identification (e.g., Casper et al. 2007, Tollit et al. 2009, Tollit et al. 2017). These include the ability to differentiate among salmon species, as well as to detect species whose hard parts have not been consumed or are highly digestible.

Project Description
Version 1 2018-07-27
As part of Study #2 (Validation of diet reconstruction) within the Predation by Harbour Seals on Salmon Smolts Project, SMRU Consulting Canada undertook the following three tasks: 1) Completed a scat prey biomass reconstruction using approximately 210 scats collected in 2016 during Study 1 of this Project (Non-estuary seal diet). This required size estimation using prey species-specific mass-length regressions applied to recovered hard parts and the application of available numerical correction factors (NCFs) to account for complete digestion or loss of prey hard parts. 2) Developed bespoke computer code to estimate diet using both biomass reconstruction and frequency of occurrence methods applied to prey hard parts. 3) Ran Monte Carlo bootstrap simulations on all three diet estimation methods (prey occurrence from prey DNA and prey hard parts identifications, Biomass Reconstruction from prey hard parts, and diet reconstruction quantified using 'relative read abundance' of prey DNA sequence data) using scat data from task 1 to understand the 95% confidence intervals around diet estimates, as well as to explore the biases and limitations of the different methods of reconstructing diets. Work was undertaken in collaboration with Chad Nordstrom, Sheena Majewski, as well as Drs Thomas and Trites. The goal of the study was to address key questions that have been raised about the methods used to estimate consumption of salmon by seals, specifically comparing biomass reconstruction versus diet quantified using DNA sequencing.

Scat prey biomass reconstruction and dietary index comparisons
Harbour seal scats were collected in the Strait of Georgia by DFO in 2016. Individual scats were rinsed through mesh strainers and all retained prey hard parts were identified to the lowest possible taxonomic group based on diagnostic morphological criteria developed by Pacific IDentifications Inc. using comprehensive comparative reference skeletons. Scat samples with no prey remains or only the remains of polychaetes or crustaceans were excluded from further dietary analysis. A subsample of 212 scats collected in spring and summer was provided by DFO for prey biomass reconstruction (BR) diet estimation. A total of 96% contained prey (n=204 scats), excluding 5 samples that contained Nereisis sp., which is considered likely to be secondary prey.
Prey hard part Biomass Reconstruction (BR) requires a number of iterative steps, which fall into two main types: estimating the number and size of prey, followed by diet index modeling. The methods used in this study generally follow those outlined in Tollit et al. (2015; 2017), with supplementary information provided in Table 1. Prey length information was based on size class estimates provided (where possible) for each species by Pacific IDentifications Inc. using reference material. Median prey total lengths were derived within each size class category, noting that size class categories were typically 5-12 cm broad and increased as prey size increased. Prey mass was subsequently calculated by applying Prey Length-Prey Mass allometric regressions (Tollit et al. 2015), with modification to standard length or fork length where necessary using published data or reported values taken from www.fishbase.org. Seven new project-specific regressions based on 2015 and/or 2016 DFO fishery data were developed for key prey species, including using survey data specific to the Strait of Georgia for the five most prevalent prey (Pacific hake, Pacific herring, Walleye pollock, Shiner perch and Plainfin midshipman). DFO data were also used to estimate the size of juvenile salmon based on trawl survey data collected in 2016 across the Salish Sea. Prey size for low-prevalence prey without species-specific size estimates, including skate spp., lamprey spp. and cephalopod spp., was derived from various sources (see Table 1 for details).
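The size-to-mass step described above can be illustrated with a minimal sketch. It assumes that the allometric regressions take the usual power-law form M = a·L^b and that the midpoint of a size class stands in for its median length; the coefficients and size-class bounds below are hypothetical placeholders, not the values listed in Table 1.

```python
def size_class_midpoint(lower_cm, upper_cm):
    """Midpoint of a size class, used here as a stand-in for the median length (cm)."""
    return (lower_cm + upper_cm) / 2.0


def mass_from_length(length_cm, a, b):
    """Allometric length-mass regression, ASSUMED to have the form M = a * L**b (grams)."""
    return a * length_cm ** b


# HYPOTHETICAL regression coefficients (a, b) for illustration only.
HYPOTHETICAL_COEFFS = {"Pacific hake": (0.005, 3.0)}

length_cm = size_class_midpoint(20, 34)          # hypothetical 20-34 cm size class
a, b = HYPOTHETICAL_COEFFS["Pacific hake"]
mass_g = mass_from_length(length_cm, a, b)       # estimated mass of one prey item (g)
print(f"length = {length_cm} cm, estimated mass = {mass_g:.1f} g")
```

In practice each species (and, where needed, each length type: total, standard or fork length) would carry its own coefficient pair.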
Numerical correction factors (NCFs; Tollit et al. 2015 and 2017) were applied to account for total digestion of prey hard parts and, where data were available, prey size-specific NCFs were applied (see Tollit et al. 1997; 2015; 2017; Philips 2005; see Table 2 for details). Due to the lack of studies that derive all-structure NCFs for key British Columbia prey species, this study relied on NCFs derived from active sea lion captive feeding studies that collected comparable (i.e., standardized) data across multiple key BC species. Proxy species NCFs were used when no other data existed and were selected based on prey structure robustness (Pers. Comm., Susan Crockford, Pacific IDentifications Inc.). As described in multiple captive pinniped feeding studies (e.g., Bowen 2000), NCFs applied in this study reduced the importance of robust prey species like large gadids and cephalopods, compared to prey species with fragile hard parts like herring, sandlance, anchovy, sardine (forage fish) and salmon. Given the high uncertainty around some NCFs (Table 2), the sensitivity of biomass-based diet estimates was assessed by comparing dietary contributions with and without NCFs applied. For the purpose of comparing prey hard part and DNA methodologies, dietary contributions of prey species were pooled into 8 key prey groupings (following Trites et al. 2007). These were Gadids, Forage fish, Cephalopods, Flatfish, Hexagrammids (greenlings spp.), Rockfish, Salmon and all 'Other' species (see Table 2 for information on groupings).
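The with/without-NCF sensitivity comparison can be sketched as follows. The sketch assumes that an NCF acts as a multiplier on the number of prey items recovered (so a value above 1 compensates for items lost to digestion); all counts, masses and NCF values are hypothetical and are not taken from Table 2.

```python
def reconstruct_biomass(counts, mean_mass_g, ncf=None):
    """Biomass (g) per prey = recovered count * NCF * mean mass.

    When `ncf` is None, no correction is applied (NCF = 1 for every prey).
    """
    ncf = ncf or {}
    return {prey: n * ncf.get(prey, 1.0) * mean_mass_g[prey]
            for prey, n in counts.items()}


def proportions(biomass):
    """Convert biomass per prey into proportions of the total."""
    total = sum(biomass.values())
    return {prey: b / total for prey, b in biomass.items()}


# HYPOTHETICAL inputs: a robust prey (hake) and a fragile prey (herring).
counts = {"Pacific hake": 10, "Pacific herring": 10}
mean_mass_g = {"Pacific hake": 100.0, "Pacific herring": 50.0}
hypothetical_ncf = {"Pacific hake": 1.0, "Pacific herring": 2.0}

without_ncf = proportions(reconstruct_biomass(counts, mean_mass_g))
with_ncf = proportions(reconstruct_biomass(counts, mean_mass_g, hypothetical_ncf))
print("without NCF:", without_ncf)
print("with NCF:   ", with_ncf)
```

With these placeholder values the fragile prey gains share once the correction is applied, which is the direction of change reported for Forage fish relative to Gadids.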
Custom R code was developed to provide two reconstructed composite diet model variants (termed 'Variable' and 'Fixed'; sensu Laake et al. 2002). Variable diet models allow for variability in foraging success (meal size) between animals as a result of variable biomass contribution across scats (i.e., the contribution of a prey species to the total depends on the total biomass of prey predicted). Fixed diet models assign the proportion of each prey species within each scat, and each scat contributes equally to the total. The proportion of prey biomass ($\hat{\pi}_i$) was first determined using the variable biomass (BR-Variable) and fixed biomass (BR-Fixed) models (Laake et al. 2002):

$$\hat{\pi}_i^{\text{Variable}} = \frac{b_i}{\sum_{j=1}^{w} b_j}, \qquad b_i = \sum_{k=1}^{s} b_{ik}$$

$$\hat{\pi}_i^{\text{Fixed}} = \frac{1}{s} \sum_{k=1}^{s} \frac{b_{ik}}{\sum_{j=1}^{w} b_{jk}}$$

where
$b_i$ = the total biomass of prey type $i$, where there are $w$ possible prey types
$b_{ik}$ = the total biomass of prey type $i$ in the $k$th scat, where there are $w$ possible prey types
$s$ = total number of fecal samples containing prey

Overall dietary contribution was also calculated using the percent modified frequency of occurrence (MFO) and the test index percent Split-Sample Frequency of Occurrence (SSFO) methods (Olesiuk et al. 1990). Scat resampling error was assessed using bootstrapping techniques in which median (500th sorted point estimate) diet estimates and nonparametric 95% confidence intervals (25th and 975th sorted point estimates) were derived by randomly resampling scats with replacement 1000 times.
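The two model variants and the percentile bootstrap described above can be sketched in a few lines. This is an illustrative Python sketch, not the custom R code used in the study; the scat biomass values are hypothetical, and each scat is represented as a dictionary mapping prey type to reconstructed biomass (g).

```python
import random


def br_variable(scats, prey_types):
    """Variable model: pooled biomass of each prey / pooled biomass of all prey."""
    totals = {p: sum(s.get(p, 0.0) for s in scats) for p in prey_types}
    grand_total = sum(totals.values())
    return {p: totals[p] / grand_total for p in prey_types}


def br_fixed(scats, prey_types):
    """Fixed model: mean of within-scat proportions (each scat weighted equally)."""
    out = {p: 0.0 for p in prey_types}
    for scat in scats:
        scat_total = sum(scat.values())
        for p in prey_types:
            out[p] += scat.get(p, 0.0) / scat_total / len(scats)
    return out


def bootstrap_ci(scats, prey_types, estimator, n_boot=1000, seed=0):
    """Resample scats with replacement; return (lower, median, upper) per prey.

    For n_boot = 1000 these are the 25th, 500th and 975th sorted estimates.
    """
    rng = random.Random(seed)
    draws = {p: [] for p in prey_types}
    for _ in range(n_boot):
        resample = [rng.choice(scats) for _ in scats]
        estimate = estimator(resample, prey_types)
        for p in prey_types:
            draws[p].append(estimate[p])
    summary = {}
    for p in prey_types:
        ordered = sorted(draws[p])
        summary[p] = (ordered[n_boot * 25 // 1000 - 1],
                      ordered[n_boot // 2 - 1],
                      ordered[n_boot * 975 // 1000 - 1])
    return summary


# HYPOTHETICAL scats: one large hake meal and two small meals.
prey_types = ["hake", "herring"]
scats = [{"hake": 900.0, "herring": 100.0}, {"herring": 50.0},
         {"hake": 50.0, "herring": 50.0}]
print("BR-Variable:", br_variable(scats, prey_types))
print("BR-Fixed:   ", br_fixed(scats, prey_types))
print("BR-Fixed bootstrap:", bootstrap_ci(scats, prey_types, br_fixed))
```

In this toy example the single large meal dominates the Variable estimate but counts as only one of three scats in the Fixed estimate, which is the behaviour that distinguishes the two indices.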
In order to assess the sensitivity of different dietary indices to scat field sampling protocols, we used bootstrap resampling with replacement to sample 20, 30, 40, 50, 70, 100 and 150 scats randomly from the total number of scats available and repeated the process 1000 times. Scat sample size error was described using BR-Fixed and DNA-Fixed median (500th sorted point estimate, or 50th percentile) diet estimates and nonparametric 95% confidence intervals (25th and 975th sorted point estimates).
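The sub-sampling simulation can be sketched as below. The data set is synthetic (200 single-prey scats, 5 of which contain a rare prey) and serves only to show the mechanics; the function names are illustrative and the Fixed estimator is a simplified stand-in for the study's code.

```python
import random


def fixed_proportion(scats, prey):
    """Mean within-scat proportion of `prey`, each scat weighted equally."""
    return sum(s.get(prey, 0.0) / sum(s.values()) for s in scats) / len(scats)


def subsample_summary(scats, prey, sizes, n_rep=1000, seed=0):
    """For each sample size, draw scats with replacement n_rep times and
    return the (2.5th percentile, median, 97.5th percentile) of the estimates."""
    rng = random.Random(seed)
    out = {}
    for n in sizes:
        estimates = sorted(
            fixed_proportion([rng.choice(scats) for _ in range(n)], prey)
            for _ in range(n_rep)
        )
        out[n] = (estimates[n_rep * 25 // 1000 - 1],
                  estimates[n_rep // 2 - 1],
                  estimates[n_rep * 975 // 1000 - 1])
    return out


# SYNTHETIC scats: a rare prey ("lingcod") occurs in 5 of 200 scats.
synthetic_scats = [{"lingcod": 1.0} if i % 40 == 0 else {"hake": 1.0}
                   for i in range(200)]
summary = subsample_summary(synthetic_scats, "lingcod",
                            sizes=[20, 30, 40, 50, 70, 100, 150])
for n, (lower, median, upper) in summary.items():
    print(f"n = {n:3d}: median = {median:.3f}, 95% CI = {lower:.3f}-{upper:.3f}")
```

Plotting the three values against sample size gives the kind of curve used to judge how many scats are needed before the interval stabilizes.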
DNA-based prey detections and relative proportions were produced via custom bioinformatics routines and provided by Chad Nordstrom (Coastal Ocean Research Institute) following high-throughput sequencing performed at the Molecular Genetics Laboratory, Pacific Biological Station, Nanaimo. DNA sequence data were used to determine the contribution of prey species within each scat using "relative read abundance" as described in Deagle et al. (2018). DNA sequence data can provide an estimate of prey proportion within each scat, which allows one to calculate a DNA-Fixed diet estimate, but not a DNA-Variable estimate. Hard part information was used to determine salmon size class information for those scats containing salmon.
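A DNA-Fixed estimate can be sketched as the mean of within-scat relative read abundances. The read counts below are hypothetical, and the sketch omits any bioinformatic filtering or correction factors. It also suggests why a Variable analogue is not available: within-scat read proportions are meaningful, but the total number of reads in a scat is presumably not a measure of meal mass.

```python
def relative_read_abundance(read_counts):
    """Proportion of prey sequence reads assigned to each prey within one scat."""
    total = sum(read_counts.values())
    return {prey: n / total for prey, n in read_counts.items()}


def dna_fixed(scat_read_counts):
    """Average the per-scat read proportions, weighting every scat equally."""
    prey_types = sorted({p for scat in scat_read_counts for p in scat})
    rra = [relative_read_abundance(scat) for scat in scat_read_counts]
    return {p: sum(r.get(p, 0.0) for r in rra) / len(rra) for p in prey_types}


# HYPOTHETICAL read counts for two scats with very different sequencing depth.
scat_read_counts = [{"hake": 800, "herring": 200}, {"herring": 10}]
dna_fixed_estimate = dna_fixed(scat_read_counts)
print(dna_fixed_estimate)
```

The second scat contributes as much as the first despite having far fewer reads, because only the within-scat proportions enter the average.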

Dietary reconstruction based on prey hard parts
A total of 204 of 212 scats processed contained 365 prey species hard part occurrences from more than 30 prey species. A total of 1366 individual prey items were detected, numerically dominated by Pacific hake (49.3%) and Pacific herring (26.5%), followed by Walleye pollock (4.9%) and Shiner perch (3.6%) (Table 2). Most of the Pacific hake (96%) were in the size range 20-34 cm. Salmon was detected in 12 scats, 2 of which contained both juvenile and adult sizes. Breaking this down, a total of seven scats contained hard parts from 15 Salmon classified as juveniles (median size 23 cm) and another 8 Salmon classified as adults (median size 34-47 cm), also distributed across seven scats in total. After the application of NCFs, all 204 scats represented ~229 kilograms of reconstructed prey biomass (scat mean prey biomass = 1,122 g, standard deviation = 1,799 g, range 29-17,752 g).
The percent diet contribution (and 95% bootstrap resampling confidence intervals) by species group for each major dietary index based on prey reconstruction using hard parts methodology is provided in Tables 3-5 and diet contribution by common species names in Table 7. All diet indices are dominated by Gadids (mainly Pacific hake and to a lesser extent Pollock) and Forage fish (mainly Herring and to a lesser extent Shiner perch). The BR-Fixed (with numerical correction factors, NCFs) contribution of Salmon is 3.6% (95% CI = 2.0-5.4%), similar to BR-Fixed without NCFs (Tables 4 and 5, Figures 1-3). Adult Salmon make up ~55% of the total. The BR-Variable (with NCFs) diet contribution of Salmon is higher at 6.3% (95% CI = 2.5-11.4%) (Figure 1), reflecting the contribution of larger Salmon to the total mass across all prey (Table 6). For this index, adult Salmon contribute ~75% of the reconstructed salmon biomass total (Table 8).
As expected, the application of NCFs reduces the importance of Gadids and increases the importance of Forage fish (Table 8), especially when using the BR-Variable indices. The sensitivity of BR-Variable (both with and without NCFs) to the influence of large prey is well illustrated by the results for the Hexagrammid Ling cod. Four large (length 64-87.5 cm) Ling cod (the only Hexagrammids identified) were detected in four scats. This results in 1.4-2% occurrence and 1.8-2.0% for BR-Fixed (with and without NCFs), but 6.9-13.1% for BR-Variable (with and without NCFs).

Dietary reconstruction based on DNA
A total of 195 of the same 212 scats contained 373 DNA-based prey species occurrences from 42 prey species, including all 5 species of Pacific salmon (as well as Atlantic salmon). Occurrences were dominated by Pacific hake (n=109), followed by Pacific herring (n=42), then by Shiner perch and salmonids. Walleye pollock and Armhook squid were also notable contributors. Note that salmon hard part data were used to classify salmon detections into juvenile and adult occurrences (as per Thomas et al. 2017). Juvenile salmon were detected across 21 scats (with 26 species occurrences overall, which results from different species of Salmon being found in the same scat). Adult Salmon were detected across five scats, with 13 species occurrences overall within these five scats.
The percent diet contribution (and 95% bootstrap resampling confidence intervals) by species group for each major dietary index based on DNA methodology is provided in Table 6 and for common species names in Table 8. We provide information for the two general approaches to summarizing DNA sequence count data: first as 'occurrence' (i.e., presence/absence of taxa; MFO and SSFO) and second as 'relative read abundance' (i.e., proportional summaries of counts; DNA-Fixed). Similar to hard parts, DNA-Fixed diet contributions are dominated by Gadids (mainly Hake and to a lesser extent Pollock) and Forage fish (mainly Herring and to a lesser extent Shiner perch) (Tables 6 and 8). The overall DNA-Fixed contribution of Salmon was also similar to BR-Fixed results at 3.8% (95% CI = 2.0-5.4%), with adult salmon estimated to contribute around one third of this total contribution (Table 8).

Dietary reconstruction comparisons
Both methods identified a similar number of prey species occurrences and both methods identified similar contributions from the eight key prey groupings using the Fixed model variant (Figures 1 and 2). Variable BR estimates provided lower contributions of Gadids (both Hake and Pollock) and a higher contribution of Forage fish (e.g., Herring) (Figures 1 and 3), both largely reflecting the application of NCFs. In addition, Variable BR estimates provide higher contributions of Salmon and Hexagrammids, reflecting the large mass of a small number of individual prey items. Overall, prey group rankings between methods were significantly correlated across both occurrence and reconstruction methods (Table 9), i.e., both methods provided statistically similar prey group rankings.
Occurrence indices were compared across prey species groups based on prey hard part and DNA identifications (Figure 2). The MFO contribution of Salmon exhibited the biggest difference across methods, with two-fold higher occurrence rates for DNA (MFO 7.5%) than hard parts (MFO 3.6%), reflecting both the ability of DNA to achieve salmon species resolution, as well as greater detection rates across 2016 scats. Differences across salmon SSFO were smaller, partly reflecting multiple DNA detections of Salmon species within the same scat. DNA also detected more Cephalopod occurrences.
The contribution of Salmon predicted by DNA-Fixed was 3.8%, very similar to that predicted by BR-Fixed with NCFs at 3.7% or without NCFs at 3.6%. The BR-Variable model predicted 6.3% with NCFs and 5.6% without NCFs, noting that 95% confidence intervals overlap across all estimates and are widest for BR-Variable estimates (reflecting that BR-Variable estimates are strongly influenced by a small number of scats).
Confidence intervals were smallest for the more abundant prey groups (Gadids and Forage fish), ranging between ~8-18% of the median for both Fixed indices (and 25-35% for BR-Variable) (Tables 5 and 6). For Salmon, confidence intervals were larger, at 42-47% for Fixed models and 60-81% for BR-Variable. Less abundant prey groups had even wider intervals (Tables 5 and 6).

Effects of scat sub-sampling protocols on dietary reconstruction
A major potential bias in estimating diet from scats is making that assessment with too few scats (termed sampling error). To assess this bias, percent diet contribution by species group for a range of scat sample sizes was compared across comparable diet indices using prey hard parts and prey DNA sequences isolated from scats (median and 2.5%ile and 97.5%ile confidence intervals based on bootstrap resampling with replacement) for DNA-Fixed and BR-Fixed (with NCF) (Figure 4).
This sampling error simulation highlighted a number of clear biases in diet estimation. Firstly, diet estimates diverged from the all-scat estimates when sample sizes were small (<50 scats), especially for prey species with low occurrences (e.g., Hexagrammids, Flatfish and Rockfish). With the exception of BR-Fixed forage fish estimates, divergence was found to be an underestimate. Secondly, 95% confidence intervals were, as might be expected, far wider when a small number of scats was sampled. Typically, 95% confidence intervals approached those based on all scats after 100 scats were subsampled. Thirdly, 95% confidence intervals were similar for Gadids, Forage fish and Salmon across methods, but varied across other species groups (e.g., Flatfish).
Table 8. Percent diet contribution by common species name using DNA sequence data isolated from scats (the 50%ile or median is shaded, with 2.5%ile and 97.5%ile confidence intervals based on bootstrap resampling with replacement) for prey occurrence (MFO and SSFO) and proportions based on prey sequences (DNA-Fixed) (Fixed=uses fixed proportions within each scat based on prey biomass reconstruction). Salmon diet contributions are highlighted in italics and order is based on DNA-Fixed.

Statistical Analyses
Spearman rank correlations between species group indices showed that the associations for all DNA versus BR comparisons were statistically significant.
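A Spearman rank correlation between two sets of prey group contributions can be sketched with the standard library alone: rank each set (averaging ranks for ties), then compute the Pearson correlation of the ranks. The percentages below are hypothetical placeholders for eight prey groups, not the values in Table 9, and the sketch does not compute a significance test.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# HYPOTHETICAL percent contributions for eight prey groups from two methods.
hard_parts = [48.0, 30.0, 5.0, 2.0, 1.9, 1.0, 3.7, 8.4]
dna = [50.0, 28.0, 7.0, 1.5, 0.5, 1.2, 3.8, 8.0]
rho = spearman_rho(hard_parts, dna)
print(f"Spearman rho = {rho:.3f}")
```

With only eight groups, a single swap between two adjacent ranks is enough to move rho noticeably below 1, so small differences in rho between index comparisons should be interpreted cautiously.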

Discussion
This study compared harbour seal prey dietary contributions between diet indices based on methods using prey hard parts recovered in scats and methods using prey identified by DNA sequencing, with a focus on diet estimates derived for salmon. The 'occurrence' diet indices are based solely on presence/absence of a prey species or grouping (MFO and SSFO), while the 'quantitative' indices (Fixed or Variable) determine the proportional contribution based on the size of prey and then biomass reconstruction (when using prey hard parts) or based on 'relative read abundance' of prey sequence data (when using DNA, see Deagle et al. 2018 for details).
Across both 'occurrence' and 'quantitative' diet indices there was a significant correlation in the contribution of key prey species groups (including Salmon) determined using prey hard parts versus DNA (all Rho > 0.8) (Table 9). Both methods identified a similar number of prey species occurrences and estimated a diet dominated (40-59%) by Gadids (mainly Hake and to a lesser extent Pollock) followed (22-36%) by Forage fish (mainly Herring and to a lesser extent Shiner perch). Salmon contributed between 2.8% and 7.5% depending on the method and index used. Cephalopods, Hexagrammids and Plainfin Midshipman ("Other" prey group) were additional and typically smaller contributors to the overall diet. The dominance of hake and herring in the diet of harbour seals in the Strait of Georgia was previously reported by Olesiuk et al. (1990), but the diet in 2016 clearly exhibited a higher occurrence of both Pollock and Anchovy.
Of the method comparisons, that for the occurrence index MFO showed the strongest correlation (Rho = 0.976). However, the MFO contribution of Salmon exhibited the biggest difference across methods, with two-fold higher occurrence rates for DNA (MFO 7.5%) than hard parts (MFO 3.6%), reflecting both the ability of DNA to achieve salmon species resolution, as well as greater detection rates across 2016 scats. Differences across Salmon SSFO were smaller, partly reflecting multiple DNA detections of salmon species within the same scat. DNA methods also detected more cephalopod occurrences. MFO considers the overall contribution of a species to be the same irrespective of the number of species within a scat, whereas SSFO makes an equal contribution assumption, whereby the more prey species in a scat the less important each one is overall. SSFO seems to be the more reasonable assumption of the two, but the extent of index differences will depend mainly on diet diversity and feeding patterns.
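The difference between the two occurrence indices can be made concrete with a small sketch. It assumes the commonly used definitions (MFO: occurrences of a prey divided by the total occurrences of all prey; SSFO: each scat split equally among the prey it contains, then averaged over scats); the three scats are hypothetical.

```python
def mfo(scats):
    """Modified frequency of occurrence: prey occurrences / all prey occurrences."""
    counts = {}
    for prey_set in scats:
        for prey in prey_set:
            counts[prey] = counts.get(prey, 0) + 1
    total = sum(counts.values())
    return {prey: c / total for prey, c in counts.items()}


def ssfo(scats):
    """Split-sample frequency of occurrence: each scat is divided equally
    among the prey it contains, and the shares are averaged over scats."""
    shares = {}
    for prey_set in scats:
        for prey in prey_set:
            shares[prey] = shares.get(prey, 0.0) + 1.0 / len(prey_set)
    return {prey: s / len(scats) for prey, s in shares.items()}


# HYPOTHETICAL scats: two hake-only scats and one scat with two salmon species.
scats = [{"hake"}, {"hake"}, {"herring", "chinook", "coho"}]
mfo_est, ssfo_est = mfo(scats), ssfo(scats)
salmon_mfo = mfo_est["chinook"] + mfo_est["coho"]
salmon_ssfo = ssfo_est["chinook"] + ssfo_est["coho"]
print(f"Salmon MFO = {salmon_mfo:.3f}, Salmon SSFO = {salmon_ssfo:.3f}")
```

Because several salmon species detected in one scat each count as a full occurrence under MFO but share a single scat under SSFO, species-level DNA detections raise the salmon MFO more than the salmon SSFO.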
While both methods identified similar contributions from the eight key prey groupings using the Fixed model variant (Figures 1 and 2), Variable BR estimates provided lower contributions of Gadids (both hake and pollock) and a higher contribution of Forage fish (e.g., Herring) (Figures 1 and 3), largely reflecting the application of NCFs. In addition, Variable BR estimates provide higher contributions of Salmon and Hexagrammids, reflecting the large mass of a small number of individual prey items. For example, the contribution of salmon predicted by DNA-Fixed was 3.8%, very similar to that predicted by BR-Fixed with NCFs at 3.7% or without NCFs at 3.6%. Confidence intervals are ~42-47% around this median value. The BR-Variable model predicted 6.3% with NCFs and 5.6% without NCFs (similar to the DNA-SSFO estimate of 6.1%), noting that 95% confidence intervals overlap across all estimates but are widest for BR-Variable estimates (reflecting that BR-Variable estimates are strongly influenced by a small number of scats). Thus, one of the limitations of the BR-Variable index is that it can inflate the contribution of particularly large (or abundant) prey found in only a few scats (e.g., Hexagrammids and Salmon). The application of NCFs has a lesser effect on resulting estimates of salmon importance, but has a larger effect on Gadids and Forage fish estimates.
Captive studies to date document that biomass models perform better than occurrence models across a range of diet scenarios (Casper et al. 2006; Tollit et al. 2006; Philips and Harvey 2009), with variable models providing marginally better predictions of actual mass fed than fixed models, noting that these studies did not vary meal size appreciably. The variable models attempt to capture the unpredictability of foraging and the apparent pulsed nature of hard part recovery in scats. Thus, animals presumably foraging most successfully are given a higher weighting. This assumption may also have a tendency to bias towards larger animals and, as observed in our study, can sometimes inflate the contribution of particularly large or abundant prey found in only a few scats. It is therefore particularly important to generate and consider confidence intervals (using re-sampling methods, e.g., Stenson and Hammill 2004) for biomass models (especially variable-based ones) to provide a measure of error and to allow the impact of any outliers to be critically assessed.
Given the differences observed between fixed and variable methods, further fine-scale foraging studies are needed to determine if meal size actually varies systematically with prey type and availability (justifying variable models). Overall, we recommend deriving diet composition estimates using both variable and fixed methods due to the plausible but largely untested assumption that reconstructed biomass of scats reflects variability in foraging success and meal size (consumption), as well as due to differences in digestion and subsequent deposition, and the probable inclusion of incomplete scat samples in most scat studies.
As observed in previous methodological comparisons (Tollit et al. 2009), prey hard parts found in scats can provide information on prey size and numbers, but identifications of Salmon, Rockfish, and often Flatfish and Cephalopods are not possible to species. In contrast, DNA methods can provide far higher species resolution, with all 5 species of Pacific Salmon as well as Atlantic salmon detected in processed 2016 scats, as well as multiple species of Flounder, Sole and Rockfish detected. Both reconstruction methods ('occurrence' based or 'relative read abundance') detected prey species undetected by the other technique (Tables 7 and 8). This likely reflects longer passage rates of certain hard parts compared to the soft scat material used in DNA analysis, coupled with differences in detection sensitivity. King et al. (2008) reviewed the pros and cons of different DNA-based approaches to molecular analysis of predation. Major areas of difficulty as well as sensitivity issues include short post-ingestion detection periods and cross-amplification, though good primer design and assay optimization can prevent these problems arising. Thomas et al. (2013) also highlighted from captive feeding studies that diet proportions using DNA sequence data were improved by tissue correction factors and prey-specific corrections based on lipid content. No account of this source of bias has been included within our assessment of confidence intervals.
Numerous captive feeding studies have shown that differential erosion and passage rate of prey items in relation to their size and robustness is one of the major sources of error in the analysis of prey remains from scats (Harvey 1989;Tollit et al. 1997;Bowen 2000;Grellier and Hammond 2006). Our study has attempted to take into account these sources of error (e.g., through the use of NCFs and Pacific IDentifications size estimates). However, we note that the pinniped NCFs used were not available for all species and are highly variable for some species (Table 2), and that it was not always possible to apply robust species-specific size NCFs. In addition, size estimates used in this study were median values. Nevertheless, future BR-based diet studies must continue to address digestion-related biases (Cottrell and Trites 2002;Tollit et al. 2003;Grellier and Hammond 2006).
Notably, we have no way to compare the diet indices, regardless of which reconstruction method was employed, for these 212 wild-collected scats against the 'true diet' of harbour seals in this region at this time of year. In this study, 212 scats were collected to reconstruct harbour seal diet, ensuring that minimum sample sizes for reliable reconstruction of the primary prey species were met (Trites and Joy 2005). With larger numbers of samples (i.e., >> 94 scats; Trites and Joy 2005), the study is also protected against sampling error associated with such factors as differences in scat volume (Arim and Naya 2003) and differences in dietary preferences of individual animals. On the other hand, pooling all animals into a single spatial assessment for seals in the Strait of Georgia combines a diversity of collection locations, which may hide more regional and temporal consumption trends (Arim and Naya 2003). It is considered critical to collect scat samples that reflect the population in time and space if diet estimates are then used to assess annual consumption of certain prey. In reality, it is not clear that a scat represents a sample of the diet at the fine-scale location at which it was collected. Prey passage rate variability may cause bias, with foraging by animals that are more distant or spend long periods at sea likely to be underestimated. Harbour seal diet studies are likely less affected by these issues, as harbour seals typically forage inshore for relatively short durations.
Diet reconstruction estimates the percentage of the original consumption that each of the components identified in the scats constituted. Reconstructions are therefore restricted to values between 0 and 100, with all diet components obligatorily summing to 100%. Thus, for rarer species or species groups, there is a higher probability of missing their occurrence when fewer scats are collected (Figure 4; Hexagrammids, Rockfish, Flatfish), and this results in an automatic compensation that increases the percentage of other consumed species. This source of bias in BR-Fixed reconstructions decreases with sample size, as shown in Figure 4, in which the dashed line represents the all-sample reconstruction relative to reconstructions with smaller numbers of scats (X-axis). Many dietary studies tend to pool prey remains into species groups, as we have done. Reducing the number of individual species to 8 species groups can reduce the probability of small sample bias. Overall, the sub-sampling error simulation highlighted a number of clear biases in diet estimation. Firstly, diet estimates diverged from the all-scat estimates when sample sizes were small (<50 scats), especially for prey species with low occurrences (e.g., Hexagrammids, Flatfish and Rockfish). With the exception of BR-Fixed forage fish estimates, divergence was found to be an underestimate. Secondly, 95% confidence intervals were, as might be expected, far wider when a small number of scats was sampled. Typically, 95% confidence intervals approached those based on all scats after 100 scats were sub-sampled. Thirdly, 95% confidence intervals were similar for Gadids, Forage fish and Salmon across methods, but varied across other species groups (e.g., Flatfish) (Figure 4).
One of the key goals of this project was to determine the consumption of different Pacific salmon species and age classes. Using hard parts analysis, it is notoriously difficult to distinguish between Pacific salmon species. This limitation is minimized using DNA methodology. On the other hand, with hard part remains it is possible to distinguish between juvenile and adult salmon, and in most cases to reconstruct the original size of the salmon consumed. Using DNA methodology, there is no way to determine the size of the original fish that was consumed. For this reason, many researchers are suggesting a combination of both reconstruction techniques would give the best reconstruction of harbour seal diet, at least where consumption of Salmon is of key importance (Tollit et al. 2009).
Overall, this study provides a number of important conclusions relevant to assessing the biases and limitations in estimating the diet of harbour seals in the Strait of Georgia using various methods.
1) Prey group proportions (including Salmon) based on DNA sequence data (relative read abundances) were similar to those derived using prey hard part Biomass Reconstruction.
2) DNA detected higher rates of Salmon (and Cephalopods) than hard parts, but Fixed quantification methods provided near-identical contributions (3.8% vs 3.7%). BR-Variable quantification provided higher estimates of salmon (6.3%), similar to those estimated using DNA-SSFO (6.1%). DNA methodology estimated the contribution of adult salmon to be ~33% of the total, while the BR-Variable method estimated this to be ~75%.
3) BR-Variable allows for an unequal contribution across scats (an approach that aims to allow for differences in meal size and foraging success). As a consequence, the importance of large or abundant prey is often inflated compared to using a BR-Fixed method. Our study highlights that this index has comparatively wide confidence intervals, caused by a small number of scats having a large influence on diet estimates. Further studies are required to fully understand which index might be more realistic, but the two approaches might be considered upper and lower estimates of prey importance.
4) The confidence intervals generated reflect re-sampling of scats and do not account for additional sources of error, such as those involved in scat field sampling success, prey size and number estimation, and DNA prey identification and sequence quantification. Confidence intervals are smaller for more abundant prey groups (Gadids and Forage fish), ranging between ~8-18% of the median for both BR- and DNA-Fixed indices. For salmon, confidence intervals were larger, at 42-47% for both Fixed models and 60-81% for BR-Variable.
5) The application of NCFs to account for complete digestion or loss of hard parts had a relatively small effect on BR estimates of salmon, with effects more apparent on alternate prey groups (Gadids and Forage fish).
6) Our assessment of sub-sampling biases highlights that diet estimates diverged from (typically underestimated) the all-scat estimates when sample sizes were small (<50 scats), especially for prey species with low occurrences (e.g., Hexagrammids, Flatfish and Rockfish). Confidence intervals were typically wide if fewer than 100 scats were sub-sampled.
7) While this study has shown that the use of DNA sequence data to quantify diet provides estimates similar to those obtained from hard parts, a composite diet estimate using information from both methods is considered the best case, largely because it provides increased detectability rates, enhanced species identification resolution, and prey size information.