Introduction

Of all the human impacts on global biodiversity, biological invasions are perhaps the most damaging, worst controlled, and most difficult to mitigate (Strayer 2010; Hirsch et al. 2016b). Although species have been colonizing new areas for much of geological time (Brown and Sax 2004), the current rate of species introductions and their seemingly idiosyncratic biogeographic nature appear to be unprecedented in global history (Ricciardi 2007). Existing evidence suggests that freshwater ecosystems are heavily invaded and that hundreds or thousands of species have been introduced to freshwater ecosystems worldwide, including such damaging invasives as the floating water hyacinth (Eichhornia crassipes), Asian clam (Corbicula fluminea), and spiny water flea (Bythotrephes longimanus) (Mills et al. 1993; Gherardi et al. 2009; U.S. Geological Survey 2017). Organisms are transported and released via a number of major vectors, including intentional stocking, escapes or releases from aquaria, gardens, or bait buckets, escapes from aquaculture and horticulture, transport in ballast water, and dispersal via man-made canals (Ricciardi 2006; Strayer 2010). Most of these vectors are selective in terms of what taxa they transport, and invasions in general are taxonomically nonrandom (Mills et al. 1993). Because of their economic, food, and sport value, fishes such as Common Carp (Cyprinus carpio), various salmonids (family Salmonidae), and European Perch (Perca fluviatilis), among others, have been intentionally transported and stocked worldwide (Strayer 2010). Other species, such as the Western mosquitofish (Gambusia affinis), were introduced for biocontrol purposes, or introduced unintentionally, as in the case of Red Lionfish (Pterois volitans) in the Caribbean Sea. As a result, fishes are among the most introduced organisms globally (i.e. 430 out of 744 and 153 out of 432 established alien species in fresh waters of North America and Europe, respectively) (Gherardi et al. 2009; Strayer 2010).

The Round Goby (Neogobius melanostomus Pallas) is a relatively small (i.e. <25 cm maximum length) benthic fish species native to the Ponto-Caspian region of Eastern Europe. Now listed as one of the 100 worst invasive species globally (DAISIE 2017), the Round Goby was first found in the Laurentian Great Lakes in 1990 (Jude et al. 1992) and in the Baltic Sea in 1991 (Skora and Stolarski 1993). It subsequently spread to all five Great Lakes (Jude et al. 1995), and is continuing to expand its non-native range in both North America and Europe (Corkum et al. 2004; Azour et al. 2015; Kotta et al. 2016). Specific traits that provide the Round Goby an invasive advantage include its broad tolerance of environmental conditions, diverse diet, high fecundity coupled with paternal offspring care, and aggressive behavior in acquiring food and spawning sites (Marsden et al. 1996; Corkum et al. 1998; Macinnis and Corkum 2000). The species has become the dominant benthic fish in many areas of the Great Lakes and is of great concern owing to intense predation of native fish species’ eggs (Jude 1997). Development of canals and waterways and the increase in commercial shipping and recreational boat traffic across both Europe and North America have provided new dispersal pathways and opportunities, greatly contributing to the species’ rapid spread (Britton and Gozlan 2013; Roche et al. 2013; Hirsch et al. 2016b).

Previous research has demonstrated that Great Lakes populations of Round Goby are characterized by significant population structure (Brown and Stepien 2009) and that this population structure has remained relatively consistent over time (Snyder and Stepien 2017). This structure has been suggested to result from a combination of factors, including variation in the population subsamples originally introduced to different areas (Snyder and Stepien 2017), natural dispersal of fishes (Bronnenhuber et al. 2011), and human-mediated dispersal via ballast water (LaRue et al. 2011) and bait bucket or other small-scale transport (Jude 2000). However, none of these mechanisms has been tested in a rigorous framework at the large scale. Each of these introduction or transport mechanisms should leave a characteristic genetic signal in the population, and is discussed in detail below.

Natural dispersal of the Round Goby was previously thought to be limited. Adults display high site fidelity (Wolfe and Marsden 1998), with home ranges conservatively estimated at 5 ± 1.2 m² (Ray and Corkum 2001), but individuals occasionally migrate much greater distances (i.e. up to 2 km; Wolfe and Marsden 1998). In addition, while larvae exhibit diel vertical migration, the consequence for natural dispersal is likely small since the larvae are negatively buoyant and remain near the surface for only very limited periods (Hensler and Jude 2007). Paradoxically, the supposed tendency for limited movement in Round Goby contrasts with its rapid spread to new sites. For example, in tributaries of the Great Lakes, upstream spread has been estimated to range from 0.5 to 4 km yr⁻¹ (Kornis and Vander Zanden 2010; Bronnenhuber et al. 2011), while in estuarine systems it can be an order of magnitude higher (i.e. 30–50 km yr⁻¹; Grosholz 1996; Azour et al. 2015). Spread may be facilitated, in part, by intraspecific competition or aggression, which could promote migration of subdominant individuals out of areas of high density (Kornis et al. 2012). Gradual natural dispersal of small numbers of individuals, as would be expected of Round Gobies, should lead to an identifiable pattern of isolation by distance, where populations become less and less genetically similar with increasing geographic distance. While natural dispersal has been suggested to play some role in the population genetic structure of Round Gobies (LaRue et al. 2011; Snyder and Stepien 2017), by itself it is clearly insufficient to explain the population genetic patterns previously identified (Brown and Stepien 2009; LaRue et al. 2011).

In addition to natural dispersal, freshwater ballast water transfer moves Round Gobies within the Great Lakes ecosystem (LaRue et al. 2011). In this case, nocturnal ballasting operations that correspond with the nocturnal pelagic feeding of larval Round Gobies could potentially take up thousands of individuals and subsequently transport them to seed new areas or add novel genotypes to established populations (Hensler and Jude 2007; Hayden and Miner 2009; Kornis et al. 2012). This scenario is consistent with the rapid and substantial range expansions observed throughout the Great Lakes (Wolfe and Marsden 1998), and shipping has been demonstrated to be negatively related to pairwise genetic differentiation in goby populations in Lake Michigan (LaRue et al. 2011). Transport via shipping within the Great Lakes system could lead to a pattern of anomalous similarity between geographically disjunct (and potentially distant) populations, or may function to diminish any isolation by distance signal in the genetic data. Transport in ballast within the Great Lakes is also consistent with previous genetic research that suggested that some outlying (relative to the original invasion site) sites show high genetic diversity (Brown and Stepien 2009).

The once common practice of using Round Gobies as bait may also have played a role in their dispersal, as they have been found in areas where no ballast water is discharged (Art Timmerman, OMNRF, pers. comm., Guelph, ON, Canada). While bait-bucket transfers between large, established populations are possible, and would serve as a source of migrants (possibly with little overall impact on the population genetic signal), populations established via bait-bucket transfer should show evidence of founder effects: they would be characterized by low genetic diversity and could be highly distinct, owing to the random sampling process.

Earlier studies have demonstrated that introduction, dispersal, and consequent range expansion of the species in the Great Lakes is likely a complex process characterized by multiple modes involving both natural dispersal and human assistance. However, those mechanisms should result in characteristic population genetic signatures, and hence some understanding of the Round Goby colonization of the Great Lakes should be possible through detailed genetic analyses. The goals of this project were twofold: first, we identified large-scale genetic structure in Round Goby populations in and around Ontario, Canada and second, we tested specific hypotheses about the role of initial colonization vs. secondary transport and/or natural dispersal in building this genetic structure. We predict that the cluster around the original founding population in Lake St. Clair will have the highest genetic diversity, and that diversity will decline with distance from Lake St. Clair. Because shipping within the Great Lakes moves both huge quantities of ballast water annually, and huge numbers of individuals simultaneously, we expect that ballast water transport may play a dominant role in the population structure of Round Goby in the Great Lakes. Thus, we predict that distant populations may be anomalously similar if located near major ports or along shipping routes and should show little or no decrease in diversity with distance while populations away from shipping routes will show clear patterns of founder effects. In addition, we predict that diffusive dispersal will be the principal mechanism that fills in areas away from deballasting locations, so that evidence of isolation by distance should be apparent with increasing distance from such points. We test each of these hypotheses using a suite of population genetic tools at both the population (sampling location) and individual level.

Materials and methods

Sample collection and geographical information

Round gobies (total N = 1958) were sampled from 32 sites in Lakes Huron, Erie and Ontario in summer and autumn of 2005 and 2006 (Table 1). Sampling sites were selected based on known presence of the species, accessibility, and ease of capturing sufficient sample sizes. The number of fish collected per site ranged from 21 to 128, with an average of 61 individuals per site. Fish were collected using a variety of sampling techniques including seine nets, hook and line, and benthic trawl or hand spearing, depending on sampling site. Fish were humanely euthanized in accordance with Ontario law and University of Windsor Animal Care protocol. A small portion of the caudal fin was collected and stored in 95% ethanol.

Table 1 Population genetics summary statistics by sampled population of Neogobius melanostomus

Microsatellite genotyping

DNA was extracted using a Wizard® Genomic DNA Purification Kit (Promega, Madison, WI, USA) or the 96-well, glass-fiber plate protocol described by Elphinstone et al. (2003). All samples were genotyped at nine polymorphic microsatellite loci following Dufour et al. (2007). PCRs were prepared in 7 µL total volumes containing 50–100 ng of template DNA, 1× PCR buffer (Sigma-Aldrich, Oakville, Canada; 100 mM Tris-HCl, pH 8.3; 500 mM KCl), locus-specific concentrations of MgCl2 (Table 2), 0.2 mM of each dNTP, 0.025 µM IRDye® infrared dye labeled forward primer (IR700, IR800, MWG Biotech, High Point, NC, USA), and 0.05 U of Taq DNA polymerase (Sigma-Aldrich, Oakville, Ontario, Canada). Thermocycler profiles consisted of a 2 min initial denaturation at 94 °C, followed by 35 cycles of 15 s at 94 °C, 15 s at the locus-specific annealing temperature (Table 2), and 30 s extension at 72 °C, with a final extension step at 72 °C for 2 min. Amplified PCR products were visualized on a LI-COR 4300 DNA Analysis System (Lincoln, Nebraska, USA). Allele sizes were determined using Gene ImagIR 4.05 (Scanalytics, Inc., Rockville, Maryland, USA) and the manufacturer’s size standard (50–350 bp). Individuals were randomized on all gels to avoid allele-size scoring bias.

Table 2 Nine microsatellite loci for the Round Goby, Neogobius melanostomus (following Dufour et al. 2007) with primer sequences, annealing temperature (Ta), and MgCl2 concentration

Genetic diversity

Mean observed (HO) and expected heterozygosities (HE) for all populations across all loci were calculated using ARLEQUIN VER. 3.5 (Excoffier and Lischer 2010). The inbreeding coefficient FIS was estimated using ARLEQUIN VER. 3.5 and tested for deviations from zero using a permutation test (1000 permutations) with significance values adjusted using the Bonferroni correction for multiple tests. Inbreeding coefficients were significant for 8 of 32 populations (Table 1). Therefore, we tested for the presence of null alleles using MICRO-CHECKER VER. 2.2.3 (Van Oosterhout et al. 2004). Null alleles were identified in five of the eight populations with significant FIS, so we used FREENA (Chapuis and Estoup 2007) to estimate global and pairwise standard FST and unbiased corrected FST using the “excluding null alleles” (ENA) method. ENA-corrected and -uncorrected pairwise FST values were significantly different in a paired two-sample t-test (t = 5.3187, df = 495, P < 0.001). However, both measures were very highly correlated (Pearson correlation = 0.998) and the absolute mean of the differences in FST (0.0009) is unlikely to be biologically meaningful. We therefore used uncorrected FST values as the response variable in our isolation by distance models, described below. We also used BOTTLENECK VER. 1.2.02 (Cornuet and Luikart 1996) to test for recent bottlenecks or founder effects in the sampled populations, as bottlenecks could also lead to changes in allele frequencies and heterozygosity. Mean allelic richness (AR) for all populations at all loci was estimated using the standArich package in R (Alberto et al. 2006). We tested for an effect of colonization history on genetic diversity by regressing mean population allelic richness (AR) against the shortest water distance from the putative site of first introduction (St. Clair River; Fig. 1; Crossman et al. 1992; Jude et al. 1992) using a simple linear regression in R (R Core Team 2017). The expectation is that sites close to the original introduction site have not undergone secondary bottlenecks, have had time to recover from founder effects, and may have received supplementary introduced individuals from other areas and thus should exhibit the highest genetic diversity. We also tested for significant differences in allelic richness among clusters identified in STRUCTURE and NJ trees using an analysis of variance in R.
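For readers less familiar with the diversity statistics named above, the following Python sketch illustrates the textbook definitions of HO, HE, and FIS for a single locus in a single population. It is an illustration only: the function name and input layout are hypothetical, HE is computed with Nei's unbiased estimator, and FIS is taken as 1 − HO/HE, which is not necessarily the estimator implemented in ARLEQUIN.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (HO, HE, FIS) for one locus in one population.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    HE is Nei's unbiased gene diversity: 2n/(2n-1) * (1 - sum(p_i^2)).
    FIS is the simple ratio form 1 - HO/HE (an illustrative assumption).
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two genotyped individuals")
    # Observed heterozygosity: fraction of individuals carrying two different alleles
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from the 2n sampled gene copies
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    sum_p2 = sum((c / total) ** 2 for c in counts.values())
    he = (total / (total - 1)) * (1.0 - sum_p2)
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis
```

A positive FIS under this definition indicates a heterozygote deficit, which can arise from inbreeding but also from null alleles, which is why a significant FIS prompts a null-allele check.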

Fig. 1

Relationship between distance (km) from the original Round Goby introduction site (St. Clair River, COUR) and standardized allelic richness (AR) for all populations. Populations with anomalous AR relative to the regression are labeled and shown with solid points. Site codes are defined in Table 1

Population genetic structure

Due to the inherently non-equilibrium nature of population structure in recently invaded species (Fitzpatrick et al. 2012) and our desire to characterize mechanisms of range expansion, we first characterized large-scale population genetic structure using individual-based clustering. We used the program STRUCTURE 2.3.4 (Pritchard et al. 2000) to perform genetic clustering of individuals with an admixture model, with location prior information included. All runs were performed for K = 1–11 with ten replicates for each K and 250,000 burn-in iterations, followed by 750,000 MCMC iterations. The Evanno et al. (2005) ΔK method, calculated using a custom R script (R Core Team 2017), was used to select the most likely K. Bar plots based on the most likely replicates were constructed using a custom script in R (R Core Team 2017). We also tested for genetic structure among sampled populations using a population-based method by calculating genetic distances based on the Cavalli-Sforza chord distance (Dc) using the “genet.dist” function in R package “hierfstat” version 0.04–22 (Goudet and Jombart 2015) and then constructing a neighbor-joining (NJ) tree based on these distances using the “nj” function in R package “ape” version 3.5 (Paradis et al. 2004). Both STRUCTURE and the NJ tree identified three populations (Hastings [HAS], Midland [MID], and Port Severn [PSE]) as highly distinctive, so we also performed both analyses with these three populations excluded, to assess their influence on results.
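The ΔK criterion used to choose K can be sketched as follows. This is not the custom R script used in the study; it is a minimal Python illustration of the commonly used form of the Evanno et al. (2005) statistic, computed from the mean and standard deviation of ln P(D) across replicate runs. The function name and input layout are hypothetical.

```python
import statistics


def evanno_delta_k(lnp_by_k):
    """Compute Delta K from replicate STRUCTURE log-likelihoods.

    lnp_by_k: dict mapping K -> list of ln P(D) values, one per replicate run
              (at least two replicates per K).
    Returns a dict K -> Delta K for every K that has both neighbours K-1 and K+1.
    """
    mean_l = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    sd_l = {k: statistics.stdev(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        # Delta K is undefined at the ends of the tested range or when the
        # replicates for this K are identical (zero standard deviation)
        if k - 1 in mean_l and k + 1 in mean_l and sd_l[k] > 0:
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            delta[k] = second_diff / sd_l[k]
    return delta
```

Because the statistic is a second-order difference, it cannot be evaluated at the smallest or largest K tested; with runs for K = 1–11, ΔK is available for K = 2–10.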

Isolation by distance

We tested for an effect of geographic distance on genetic similarity between populations using the “mantel.test” function in R package “ape” version 3.5 (Paradis et al. 2004). Population (site) genetic divergence was assessed using FST calculated in FREENA (described above), while geographic distance was measured as the shortest water distance between sites. Additionally, pairwise genetic distances were linearized using the FST/(1−FST) (Rousset 1997) conversion and were regressed against the shortest water distance between pairs of sites using a simple linear regression in R. Because samples were collected at a fine spatial scale in Lake Erie, we were able to test the hypothesis that populations in eastern Lake Erie were founded via ballast water transport from the original founding site near Lake St. Clair. We tested the relationship of genetic distance and geographic distance between populations in the St. Clair River-Detroit River corridor (COUR, BR, RO, MK, LA, DU) and populations in eastern Lake Erie (all populations from PST east). In this case, we suggest that eastern Lake Erie populations were likely founded as the result of ships deballasting prior to entering the Welland Canal, near Port Colborne (PC), and subsequently spreading naturally westward. Thus, the predicted pattern should be one of increasing genetic distance with decreasing geographic distance to Lake St. Clair.
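The two steps described above, linearizing FST and testing its association with water distance by permutation, can be illustrated with a short standard-library Python sketch. It is an assumption-laden illustration rather than the analysis actually run: it uses the Pearson correlation of the upper triangles as the test statistic (not the Mantel Z reported in the Results), and all function names are hypothetical.

```python
import random


def linearize_fst(fst):
    """Rousset (1997) linearization of a pairwise FST value."""
    return fst / (1.0 - fst)


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def _upper(matrix, order):
    """Upper-triangle values of a symmetric matrix after reordering sites."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(gen, geo, n_perm=999, seed=0):
    """One-sided permutation Mantel test.

    gen, geo: full symmetric distance matrices (lists of lists), same site order.
    Returns (observed correlation, permutation p-value).
    """
    n = len(gen)
    identity = list(range(n))
    geo_vec = _upper(geo, identity)
    r_obs = _pearson(_upper(gen, identity), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        # Rows and columns of the genetic matrix are permuted together,
        # which preserves the dependence among pairwise values
        if _pearson(_upper(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)
```

Permuting whole sites, rather than individual pairwise values, is the reason a Mantel test is used here instead of relying on an ordinary regression p-value alone: the pairwise distances are not independent observations.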

Assignment and dispersal

To investigate individual dispersal among sites we used a Bayesian genotype assignment approach outlined by Rannala and Mountain (1997) in the computer program GENECLASS 2.0 (Piry et al. 2004). To identify successful assignment, we estimated a likelihood ratio as the highest rank probability for assignment divided by the second highest rank probability. If the likelihood ratio obtained was greater than four, meaning it was four times more likely to originate from the higher ranked population than the next highest ranked population, the individual was deemed successfully assigned. The likelihood ratio threshold of four is arbitrary, but sensitivity analyses showed that our results did not vary qualitatively with threshold values ranging from two to nine, though the numbers of successfully assigned individuals did change. Individuals were categorized into one of three categories: self-assignment, migrant of known source (i.e. likelihood ratio > 4), or unidentified. We performed individual assignment initially with all 32 populations, and subsequently with populations grouped into the six clusters identified in the NJ and STRUCTURE analyses.
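The likelihood-ratio rule described above can be written as a small decision function. The Python sketch below is illustrative only: the function name and the dictionary input are hypothetical, and it assumes the per-population assignment scores for one individual have already been obtained from the assignment software.

```python
def classify_assignment(scores, sampled_pop, threshold=4.0):
    """Classify one individual from its per-population assignment scores.

    scores: dict mapping population code -> assignment probability (or likelihood).
    sampled_pop: code of the population in which the individual was caught.
    Returns (category, assigned_population), where category is one of
    'self-assignment', 'migrant of known source', or 'unidentified'.
    """
    if len(scores) < 2:
        raise ValueError("need scores for at least two candidate populations")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_pop, best), (_, second) = ranked[0], ranked[1]
    if best <= 0:
        return "unidentified", None
    # Ratio of the top-ranked score to the runner-up
    ratio = float("inf") if second == 0 else best / second
    if ratio <= threshold:
        return "unidentified", None
    if best_pop == sampled_pop:
        return "self-assignment", best_pop
    return "migrant of known source", best_pop
```

Rerunning such a function with thresholds from two to nine corresponds to the sensitivity analysis mentioned above: a lower threshold assigns more individuals, and a higher one assigns fewer but with greater confidence.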

Results

Genetic diversity

Observed heterozygosity values ranged from 0.42 to 0.65, while expected heterozygosity ranged from 0.51 to 0.66 (Table 1). Estimated global FST was 0.067 (95% CI: 0.054–0.080). Standardized allelic richness ranged from 3.5 to 6.8 at n = 24, with the highest values recorded for St. Rose Beach (RO) and LaSalle (LA) near the putative original site of introduction in the St. Clair River, and the lowest value recorded at Port Severn (PSE), in eastern Lake Huron. Mean standardized allelic richness also differed significantly (p < 0.001) between the three major population clusters identified in our STRUCTURE analysis and neighbor-joining trees (see below), with the highest mean (±SD) recorded for the western Lake Erie cluster (including McFarland and Trenton in eastern Lake Ontario; 6.17 ± 0.24) and similar means found for eastern Lake Erie (5.07 ± 0.33) and Lake Huron (5.32 ± 0.32). Allelic richness was low overall for the three highly distinctive populations: Port Severn (3.5), Hastings (4.0), and Midland (5.1). Standardized allelic richness was negatively correlated with distance from the putative site of first introduction in Lake St. Clair (Fig. 1, p = 0.003, Adj. R2 = 0.23). Three populations fell significantly further than expected from the regression line. Two of these populations, Trenton (TR) and McFarland (MCF), had anomalously high allelic richness for their distance from the St. Clair River site (COUR), while Port Severn (PSE) had anomalously low allelic richness. Excluding these three populations and rerunning the linear regression improved both the significance (p < 0.001) and the fit to the regression line (Adj. R2 = 0.76), but the overall pattern remained similar. BOTTLENECK results identified two populations (HAS and PSE) as potentially having undergone a recent reduction of their effective population size.
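Because allelic richness depends on sample size, the values above are standardized to a common sample size. One standard way to do this is analytical rarefaction, sketched below in Python. This is an assumption for illustration: the standArich package used in the study may standardize differently (for example by resampling), and the function name and the choice to express the subsample size in gene copies are hypothetical.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, g):
    """Expected number of distinct alleles in a random subsample of g gene copies.

    allele_counts: observed number of copies of each allele at one locus.
    g: standardized subsample size, in gene copies (illustrative assumption).
    """
    n = sum(allele_counts)
    if g > n:
        raise ValueError("subsample size exceeds the number of sampled gene copies")
    denom = comb(n, g)
    # For each allele: 1 minus the probability that the subsample misses it entirely
    return sum(1.0 - comb(n - c, g) / denom for c in allele_counts)
```

Rare alleles contribute little to the rarefied value, so a population that has lost rare alleles through a founder event shows reduced standardized richness even when its heterozygosity is only modestly affected.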

Population genetic structure

The patterns of population divergence identified were generally consistent between Dc NJ trees and our STRUCTURE analysis. The NJ tree (Fig. 2a) identified three major clusters, approximately corresponding to (1) the western half of Lake Erie, (2) the eastern half of Lake Erie, and (3) Lake Huron. The best-fitting (highest likelihood) K = 6 STRUCTURE result (Fig. 3) showed a similar pattern, with three major clusters corresponding with the same groupings as in the NJ tree, and three additional clusters corresponding with highly distinct individual populations. These three populations (HAS, MID, and PSE) were highly divergent, characterized by unusually long neighbor-joining branch lengths in the phylogenetic tree, and each assigning to a unique cluster in our highest-likelihood (K = 6) STRUCTURE result. However, dropping these three populations and recalculating Dc distances resulted in a tree that was essentially structurally unchanged (Fig. 2b), suggesting that these three populations did not disproportionately affect the overall pattern.

Fig. 2

Neighbor-joining trees based on Cavalli-Sforza chord distance for (a) all sampling sites and (b) all sites excluding Midland (MID), Hastings (HAS), and Port Severn (PSE)

Fig. 3

(a) Distribution of K = 6 STRUCTURE clusters for Neogobius melanostomus in the Eastern Great Lakes. Data are the mean individual membership coefficient (i.e. proportion of each individual’s genome inherited from ancestors in a given cluster) for each sampling location. K = 6 cluster memberships for populations are indicated by the pie chart colors: western Lake Erie (blue), eastern Lake Erie (yellow), Lake Huron (green), Port Severn (PSE, orchid), Midland (MID, dark red), Hastings (HAS, orange). (b) STRUCTURE bar plots for K = 2 to K = 6. Population order from left to right corresponds approximately with geographic population order from northwestern Lake Huron through Lake Erie and Lake Ontario. Full population names and locations are listed in Table 1. Vertical bar colors represent cluster memberships in each row. Gray bars indicate lake or waterway locations for sampling locations: A, Lake Huron; B, Lake Erie; C, Lake Ontario; D, Trent-Severn Waterway

A few populations were assigned to genetic groups in both analyses that did not correspond to their geographic locations. Three populations, Burlington (BUR) in western Lake Ontario, Colchester (COL) in western Lake Erie, and Port Colborne (PC) in eastern Lake Erie, were genetically similar to populations from Lake Huron. One population, Jay Gould (JG), sampled in western Lake Erie, was genetically similar to populations in the eastern end of the lake. Finally, two populations from eastern Lake Ontario, McFarland (MCF) and Trenton (TR), were closely related to populations from western Lake Erie and the St. Clair River/Detroit River corridor.

Isolation by distance

Overall, Round Goby population structure reflected a complex pattern of subdivision and isolation by distance in the eastern Great Lakes. There was strong support for the correlation of genetic and geographic distance when all populations were considered (Mantel Z = 15839, p = 0.001, Adj. R2 = 0.29; Fig. 4, upper panel). However, closer inspection of the distribution of genetic distances shows that all of the largest pairwise genetic distances involve two of the three highly divergent populations identified in the NJ tree and STRUCTURE analysis, PSE and HAS. Mean pairwise FST/(1−FST) (±SD) for PSE was 0.17 (±0.04), while for HAS it was 0.21 (±0.04). Mean pairwise FST/(1−FST) (±SD) for all other comparisons was only 0.05 (±0.03).

Fig. 4

Relationships between geographic distance (km) and genetic distance (FST) for all populations (upper panel), excluding putative outlier populations (middle panel), and for all pairwise comparisons between populations in the St. Clair River-Detroit River corridor (COUR, BR, RO, MK, LA, and DU) and populations in eastern Lake Erie (PST and all populations east; bottom panel). Pairwise comparisons indicated with triangles in the upper panel and excluded from the middle panel include all those involving Hastings (HAS), Port Severn (PSE) (both indicated with Δ), McFarland (MCF), and Trenton (TR) (both indicated with )

Pairwise comparisons involving the two populations in eastern Lake Ontario (McFarland [MCF] and Trenton [TR]) also had larger-than-expected residuals from the regression line. In this case, pairwise FST/(1−FST) values between these two populations and populations in western Lake Erie fell well below those expected on the basis of geographic distance. Therefore, we also performed the linear regression analysis excluding the four putative outlier populations (HAS, PSE, MCF, and TR), as their genetic structure may be more consistent with long-distance anthropogenic transport than with natural (or semi-natural) stepping-stone dispersal. With the four outlier populations excluded, there was still strong statistical support for the correlation of genetic and geographic distance (p < 0.001, Adj. R2 = 0.42; Fig. 4, middle panel).

Pairwise comparisons between populations near the original introduction (St. Clair River, Lake St. Clair, Detroit River) and populations in eastern Lake Erie showed a significant pattern of decreasing genetic distance with increasing geographic distance (p = 0.004, Adj. R2 = 0.10; Fig. 4, lower panel), consistent with an initial founding event near Port Colborne and subsequent diffusive spread westward.

Assignment and dispersal

Bayesian assignment analysis using all 32 populations (Fig. 5) demonstrated that we were able to successfully assign 662 of 1958 individuals based on our conservative likelihood ratio (>4) threshold. Four hundred and forty-eight individuals were assigned to their population of origin, while 214 were identified as migrants of known origin. Of the migrants of known origin, 127 were assigned within the same STRUCTURE cluster (Fig. 6), while 87 were assigned to a population in another STRUCTURE cluster (Fig. 7). Per population, successful assignments ranged from 8.0% (DU) to 96.8% (PSE). The average percentage of successful assignments was 30.9% (±20.8 [SD]) across all populations. On average, more individuals were assigned to their population of origin (19.5 ± 24.1%) than were identified as migrants of known origin (11.4 ± 6.0%). The percentage of self-assigned fish ranged from 0% (MCF, NI, NY3, NY6) to 95.2% (PSE), with the highest percentages seen in the three highly divergent populations (PSE, HAS, MID). The percentage of fish identified as migrants ranged from 0% (HAS, BUR) to 25.8% (MR). Among the 214 migrants of known origin that were identified, the largest fraction of individuals (~40%) moved less than 100 km, though distances ranged from 5 to 867 km (Fig. 8). With populations grouped into the six clusters identified by NJ and STRUCTURE analysis, we were able to assign 1261 of 1958 individuals. Of the individuals confidently assigned to one of the six clusters, 1075 were assigned to their cluster of origin while 186 were assigned to another cluster.

Fig. 5

Map of sampling locations with individual assignments from Geneclass 2. Black pie slices indicate proportion of individuals assigned to the sampled population (self-assigned). Gray slices indicate individuals assigned to a population other than the sampled population (migrants). White slices indicate individuals that could not be assigned according to the likelihood ratio > 4 criterion

Fig. 6

Source and destination populations for individuals assigned to a non-source population within the same STRUCTURE cluster as their sampling population. Arrows indicate putative direction of migration and are scaled to the number of individuals moving between populations. Colors within population circles correspond to STRUCTURE clusters in Fig. 3: western Lake Erie (blue), eastern Lake Erie (yellow), Lake Huron (green), Port Severn (PSE, orchid), Midland (MID, dark red), Hastings (HAS, orange)

Fig. 7

Source and destination populations for individuals assigned to a non-source population in a different STRUCTURE cluster than their sampling population. Arrows indicate putative direction of migration and are scaled to the number of individuals moving between populations. Colors within population circles correspond to STRUCTURE clusters in Fig. 3

Fig. 8

Frequency distribution of distances traveled by individual Round Goby migrants in the eastern Great Lakes. Migrants were identified by genotype assignment to a population other than their sampled population with a likelihood ratio >4. Distance was measured as the shortest over-water distance between sites

Discussion

Our results are consistent with the hypothesis that ballast water transport, both in the initial colonization and in secondary spread, has played a dominant role in the population genetic structure of Round Gobies in the Great Lakes. The current population structure, which has been suggested to have been relatively stable for more than a decade (Snyder and Stepien 2017), primarily resulted from the movement of large numbers of individuals via ballast water. Initially, these individuals were moved from the native range, and secondarily, within the Great Lakes. Further movement of individuals, involving both natural migration and human-mediated transport of small numbers of individuals, has served to establish some outlying populations and spread individuals out of the initial points of establishment. The dominance of ballast water transport in the Round Goby example highlights one major weakness of current ballast water management: although requirements exist to prevent the introduction of non-indigenous species to the Great Lakes (Canadian Coast Guard 1989, United States Coast Guard 1993), once a species is in the Great Lakes it can be quickly transported throughout the system, rendering any attempt at management too little, too late.

Within the invaded range, we identified three main clusters in our neighbor-joining trees, approximately corresponding with Lake Huron, Eastern Lake Erie, and Western Lake Erie/Lake St. Clair, plus three highly divergent populations associated with these lake clusters. The same pattern was identified in our STRUCTURE analysis, with six clusters identified, and closely matching our neighbor-joining results. Previous authors have suggested that some of these areas (e.g. the Bay of Quinte in Lake Ontario) are likely the recipients of independent introductions from outside the system (Brown and Stepien 2009). We argue that there is no need to invoke multiple introductions to account for the observed genetic patterns. Instead, the pattern of population divergence is consistent with natural and human-mediated dispersal within the Great Lakes following an initial introduction in and around Lake St. Clair. Although multiple introductions from genetically divergent sources may have contributed to the current distribution of Round Gobies in the Great Lakes, we would argue that it is a relatively small part of the whole story. Secondary transportation events and range expansions within the Great Lakes likely played a more significant role. The anomalous similarity of geographically disjunct populations, including TR and MCF in the Bay of Quinte with the St. Clair River populations, Port Colborne (PC) and the St. Clair group, and Burlington (BUR) and the populations in Lake Huron, gives evidence for a dominant role of ballast water transport, followed by secondary diffusive spread. Supporting our hypothesis is the fact that many of these anomalous sites are near ports, or rivers/canals, where ballasting/deballasting is likely to occur. Deballasting before entering narrow, shallow channels allows ships to increase trim and improves maneuverability and has previously been shown to be associated with sites of first discovery of non-indigenous species (Colautti et al. 2003; Grigorovich et al. 2003).

Several studies on the dispersal of invasive species have identified human transport as critical to mediating dispersal and range expansion (e.g. Buchan and Padilla 1999; Suarez et al. 2001; Matthews et al. 2014; Horvitz et al. 2017), including in Round Goby in the Great Lakes and Europe (Bronnenhuber et al. 2011; LaRue et al. 2011; Kotta et al. 2016). In this study, genotype assignment analysis characterized Round Goby dispersal distances in the Great Lakes as highly variable, ranging from no dispersal (individuals assigned to their source population) to over 800 km, a distance that, if accurate, clearly indicates human-mediated dispersal. Here, long-distance dispersal mainly reflects ballast water dispersal (LaRue et al. 2011; Kotta et al. 2016), plus the transport of small numbers of individuals via bait buckets or recreational boats (Bronnenhuber et al. 2011; Hirsch et al. 2016a). Dispersal events of approximately 600–800 km primarily represent migrants moving via ballast water between the lower Detroit River/western basin of Lake Erie corridor and the Bay of Quinte (MCF and TR), or from Burlington (BUR) to Lake Huron. The similarity of the Burlington sample and Lake Huron populations, along with assignment results that indicate movement of a number of individuals from Burlington to Lake Huron, suggests that Lake Huron populations may have been founded by individuals brought from the port at Burlington via ballast water. An alternative (or additional) route may have been overland transport, given that the distance between sites is relatively short. However, given the relatively high allelic richness in Lake Huron, our results are more consistent with ballast water than bait bucket transport. Unsampled areas in the south of Lake Huron would be expected to show a secondary contact zone between this Burlington-Lake Huron cluster and the Lake St. Clair cluster, as natural dispersal filled in the gap between the founding sites of the two clusters. Other putative long-distance transport events are between the eastern Lake Erie cluster and Lake Huron, potentially consistent with bait bucket transport. Dispersal in the range of 100–400 km primarily reflects migration between the western and eastern basins of Lake Erie. Supporting our hypothesis that the region of Port Colborne is a secondary introduction site, several long-distance individual assignments are between this site and populations in the western Lake Erie cluster. Although incorrect assignments could occur, our identified dispersal pathways are consistent with FST values that reveal low genetic divergence between sites in the St. Clair River-western basin of Lake Erie corridor and the Bay of Quinte sites, as well as between sites in the St. Clair-Detroit River corridor and the extreme eastern basin of Lake Erie.
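As an illustration of how assignment-based dispersal distances can be summarized into the distance classes discussed above, the following is a minimal sketch in Python. The class boundaries, function names, and example distances are hypothetical and chosen only for illustration; they are not the study's data or the authors' procedure, and distances are assumed to have already been measured between assigned source and capture sites.

```python
from collections import Counter


def distance_class(distance_km):
    """Assign one dispersal distance (km) to an illustrative class."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km == 0:
        return "resident"        # assigned to its own source population
    if distance_km < 100:
        return "<100 km"
    if distance_km <= 400:
        return "100-400 km"
    if distance_km <= 800:
        return "400-800 km"
    return ">800 km"


def migrant_class_proportions(distances_km):
    """Proportion of migrants (distance > 0) falling in each class."""
    migrants = [d for d in distances_km if d > 0]
    if not migrants:
        return {}
    counts = Counter(distance_class(d) for d in migrants)
    return {label: n / len(migrants) for label, n in counts.items()}


# Hypothetical distances (km), NOT data from the study.
example = [0, 0, 12.5, 80, 250, 390, 650, 820]
proportions = migrant_class_proportions(example)
```

Residents are excluded from the denominator so that each proportion is expressed relative to identified migrants only, which is how the short-distance fraction is reported in the text.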

The prevalence of significant pairwise FST values, coupled with local clustering within the STRUCTURE analysis and NJ tree and with significant IBD relationships, suggests that natural dispersal within the Great Lakes was also a significant factor in the overall colonization of the system. The significant correlation between distance from the St. Clair River (the first identified invasion site) and genetic diversity also supports this hypothesis, although it assumes that distance from the river is negatively correlated with time since introduction. Individual assignments also provided evidence for substantial natural dispersal; 39.7% of the identified migrants traveled less than 100 km. Such limited dispersal distances are unlikely to be the result of ballast water movement. Although adult Round Gobies display high site fidelity, interactions between conspecifics may tend to push subdominant or juvenile individuals from areas of high population density into less populated regions (Kornis et al. 2012). In flowing systems, such as the rivers connecting the Great Lakes, diel vertical movement of larvae (Hensler and Jude 2007) may also play a role in transporting individuals between populations. While the vertical migration of Round Goby likely evolved as a predator avoidance or prey pursuit behavior, it would also serve to disperse larvae farther than expected for benthic, negatively buoyant organisms (Hensler and Jude 2007).
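Isolation by distance (IBD) is commonly assessed with a Mantel-type permutation test that correlates a pairwise genetic distance matrix with a pairwise geographic distance matrix. The sketch below shows the general idea in Python; it is an assumption that a test of this kind applies here, since the procedure behind the IBD results is not described in this section. The function names and the four-site matrices are hypothetical and are not data from the study.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel sites in the genetic matrix
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    # Count the observed arrangement itself among the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sites along a line (km) with genetic distance
# exactly proportional to geographic distance.
positions = [0, 10, 30, 60]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.001 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
```

Rows and columns of one matrix are permuted together because pairwise distances that share a site are not independent; shuffling individual matrix entries would overstate significance.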

The three populations that clustered separately in our STRUCTURE analysis, HAS, MID, and PSE, located in the Trent-Severn Waterway (HAS) and far eastern Lake Huron (MID and PSE), are strongly divergent from other populations in the data set. Two of the populations (HAS and PSE) showed evidence of bottleneck effects, and all three showed significantly lower allelic richness than other populations. This suggests that they were founded by very small numbers of individuals from elsewhere in the invaded range, and given the lack of large-scale shipping activity in the area, the presumptive mechanism of dispersal is bait-bucket transfer (Bronnenhuber et al. 2011; Kornis et al. 2012).

The spread of the Round Goby in the Great Lakes has been remarkably rapid; after its initial appearance in 1990, the species colonized all five lakes within 5 years (Marsden et al. 1996). This is surprising if we assume a dominant role for natural dispersal because, based on mark-recapture studies, Round Gobies were characterized as highly philopatric with limited dispersal tendencies (Wolfe and Marsden 1998; Ray and Corkum 2001; Kornis et al. 2012). Thus, human-mediated dispersal clearly contributed to the rapid invasion of the Great Lakes by the Round Goby (Hensler and Jude 2007; LaRue et al. 2011), as has also been shown in the Baltic Sea (Kotta et al. 2016). We argue that ballast water transport has been dominant in structuring Round Goby populations in the Great Lakes. Moreover, we reiterate the problem of current ballast water management in the system: although invasive species may be prevented from entering by current regulations (Canadian Coast Guard 1989; United States Coast Guard 1993), if they do manage to establish, they are likely to be spread rapidly throughout a system with few barriers.

Data archiving

Microsatellite genotype data are available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.56v1c.