Introduction

The northern Pacific system is controlled by multiple interdecadal patterns of climate variability that stem from different physical sources1. Groundfish populations in the northeastern Pacific Ocean followed the six- to 12-year warming and cooling periods of the El Niño-Southern Oscillation (ENSO)2. Sea surface temperature and pressure changes in the North Pacific are captured by the Pacific Decadal Oscillation (PDO), which is separate from ENSO behavior in the region3. A third climate cycle described more recently and termed the Northeast Pacific Gyre Oscillation (NPGO) follows variations in ocean nutrient cycling and phytoplankton abundance and plays a role in the larger system of climate variability with ENSO and PDO4,5.

Dramatic, permanent changes in marine species compositions in response to shifts in climate modes, commonly referred to as regime shifts, such as the strong one observed in 1976-1977 in the northern Pacific Ocean may be the convergence of several climate patterns switching phases within the same time period6,7. This switching of regimes makes it difficult to identify which specific patterns are culprits in affecting distinct marine populations8,9. Many studies on fisheries systems continue to focus on these interdecadal climate modes as primary sources of population variability of marine fishes. However, climate modes alone are insufficient to accurately describe variability found in many commercially valuable marine populations on multi-year scales10. Other sources of oceanic variability not captured by climate modes exist and should be considered when attempting to create accurate models to describe and predict changes in marine populations, even in areas that appear to be dominated by shifts in climate regimes. The complicated interplay of ocean-climate systems in the North Pacific region makes it difficult to identify which and how specific indices are culprits in affecting distinct marine populations.

Many of the marine fishes in the northern Pacific Ocean are commercially important species that contribute significantly to the economy of the United States and are important sources of food both domestically and internationally1113. Several of these populations are managed by international or regional fishing commissions to control commercial harvests and monitor population health11,1416. These organizations are becoming more concerned about the role climate plays in maintaining healthy fish populations, especially as marine fishes do not recover from population collapses as quickly as previously believed17. Fishing activities are increasingly concentrated on deeper-dwelling species18,19. While focus on the effect of various climate modes has dominated ecological research on fishes in the North Pacific region, relationships between the marine environment and atmospheric trends are nuanced and may involve complex lagged effects, particularly for deepwater populations20. Organisms that inhabit the deep ocean are also problematic to study, as they are not adapted to surface-level conditions and prove difficult to sample and keep alive, making experiments in laboratory conditions impossible or prohibitively expensive. Determining which specific environmental variables contribute to fluctuations in the populations of these species from observational data would represent major progress in discerning the impact of climate variability on marine ecosystem health and how those changes affect the economy and food security. A model structure able to accommodate a suite of environmental variables that vary spatiotemporally would be necessary to examine effects of many environmental covariates on deepwater marine populations simultaneously.

Consider the single index varying coefficient model (SIVCM) of the form

$$\begin{aligned} y_{i} = \{G({\varvec{\theta }}_{0}^{T}Z_{i})\}^{T}X_{i}+\varepsilon _{i}, \qquad i=1,\dots ,n \;, \end{aligned}$$
(1)

where \(y_{i}\) is the response, \(Z=(z_{1i}, z_{2i}, \ldots , z_{ti})^T\) and \(X=(x_{0i}, x_{1i}, \ldots , x_{pi})^T\) with \(x_{0i} = 1\) are predictor variables, \({\varvec{\theta }}_{0}\) is a vector of unknown coefficients representing the single-index direction, \(G(\cdot ) = (g(\cdot )_0, \ldots , g(\cdot )_p)^T\) are nonparametric coefficient functions, and \(\varepsilon\) are the random errors21. By setting Z to longitude, latitude, and time triplets, the SIVCM is a convenient structure for incorporating spatiotemporal effects for multiple environmental predictors in X. Spatiotemporal variation among values is incorporated through three-dimensional functions based on spline smoothing2227, accounting for spatial and temporal autocorrelation and improving prediction and inference28,29. Similar structures have been widely used in the related spatially-varying coefficient models30,31 and temporally-varying coefficient models32, including applications in forestry33, ecology34, and economics35,36. Spline-based methods, which are often used to estimate \(G(\cdot )\) in SIVCMs37,38, are more robust for spatially correlated data and do not require the spatial variation to be specified by a functional form39,40.

For such models, selection of important predictor variables in X is typically of interest. Forward selection, backwards elimination, and stepwise selection methods are unstable for models with many predictors and even with advancements to the algorithms, these methods are considered sub-optimal for variable selection, particularly for high-dimensional models41. Penalty-based regression procedures, such as ridge regression and least absolute shrinkage and selection operator (LASSO) estimation, penalize large regression coefficients to reduce overfitting. LASSO additionally performs variable selection by penalizing small regression coefficients to zero, effectively removing these coefficients from the model42. LASSO works particularly well for models with many predictors because it shrinks large coefficients to zero rather than minimizing them, and it is computationally efficient43. Group LASSO incorporates information about groupings of variables into the penalty function, which is particularly important for categorical predictor variables44. While selection for varying coefficient models (VCMs), a lower-order relative of the SIVCM, have built on both the smoothly clipped absolute deviation (SCAD) and LASSO approaches4548, selection procedures of the single-index direction coefficients or the functions in SIVCMs have primarily used SCAD penalties4951. SCAD procedures are unbiased, but they are sensitive to initial estimation and parameter tuning48. LASSO procedures are typically simpler to implement than SCAD, and group LASSO correctly selects important variables for VCMs where the number of dimensions far exceeds the number of observations52. In this analysis, I used a combination of least squared-based spline estimation and group LASSO (LSSGLASSO) proposed by Sun et al.53 to select coefficient functions and estimate the index parameters in a SIVCM of spatiotemporally-varying environmental covariates potentially contributing to changes in groundfish populations in the North Pacific Ocean. With this application, I aimed to establish relevant environmental factors that influence populations of focal groundfish species in this region.

Methods

Annual surveys of several groundfish species are taken at established locations in the waters along the coast of Alaska by the Alaska Fisheries Science Center (AFSC), a division of the National Oceanic and Atmospheric Administration (NOAA). Catch per unit effort (CPUE), also referred to as catch rate, and mean weight in kilograms of six groundfish species determined at each location for each survey year were obtained for years 1979–201354,55. The six groundfish species of focus in this analysis were Pacific cod (Gadus macrocephalus), Pacific halibut (Hippoglossus stenolepis), sablefish (Anoplopoma fimbria), rougheye rockfish (Sebastes aleutianus), shortraker rockfish (Sebastes borealis), and shortspine thornyhead (Sebastolobus alascanus). Air temperature in degrees Celsius (ATMP), sea level pressure in hPa (PRES), wind speed in meters per second averaged over eight-minute periods (WSPD), sea surface temperature in degrees Celsius (WTMP), and the average height in meters of the highest one-third of all waves in 20-minute sampling periods (WVHT) measured daily from buoys in the Gulf of Alaska were obtained from the National Data Buoy Center and summarized by monthly means56. Temperature in degrees Celsius measured at the sea floor (hereafter bottom temperature) was obtained from the AFSC Resource Assessment and Conservation Engineering (RACE) Division’s bottom trawl surveys57. Zooplankton biomass volume given in number per cubic meter (hereafter plankton) were obtained from the NOAA’s Coastal and Oceanic Plankton Ecology, Production, and Observation Database58. Alkalinity (Alk), chlorophyll (Chl), nitrate (NO3), dissolved oxygen (Oxy), phosphate (Phos), and silicate (Sil) concentrations at depths of 75, 400, and 900 meters were obtained from the NOAA’s World Ocean Database59.

Since environmental data can include hundreds of predictor variables with potential multi-collinearity, a practical method for reducing variables before performing group LASSO is necessary for the LSSGLASSO procedure to be computationally feasible. Therefore, backward variable elimination of explanatory environmental variables using computed variance inflation factors (VIF) was performed. The VIF for explanatory variable i was calculated as \(1/(1-R_i^2)\) where \(R_i^2\) was the correlation coefficient of the linear model with variable i regressed against all other explanatory variables. VIFs for all explanatory variables were computed, and the variable with largest VIF that exceeded a value of 5 was removed60,61. The process was repeated after each removed variable, until no variables had a \(\text {VIF}>5\). None of the environmental variables considered for inclusion in the SIVCM had a \(\text {VIF}>5\). The max VIF for any explanatory variables included in the final model was 2.64.

Measurements of environmental variables were not recorded at the exact same locations across all years for which groundfish surveys were performed, thus spatiotemporal interpolation via inverse distance weighting was used to obtain monthly environmental measures for 1979–2013 at the precise locations where MESA surveys were conducted62,63. For all environmental variables, a seasonal amplitude was then calculated for each survey year for all locations. For physical variables ATMP, PRES, WSPD, WTMP, WVHT and bottom temperature, seasonal amplitude was defined as the mean of June, July, and August averages minus the mean of December, January, and February averages. This was owing to the fact that temperatures maximize in the summer and minimize in winter and winter storms enable strong mixing of ocean nutrients64,65. Seasonal amplitudes for chemical and biological variables including plankton, Chl, Alk, NO3, Oxy, Phos, Sal, and Sil were calculated as the mean of August, September, and October averages minus the mean of March, April, and May averages. Zooplankton biomass increases in May after winter vertical mixing of deepwater nutrients and benefits from spring phytoplankton blooms but has not yet been depleted by grazing from summer-migrating pelagic fishes and cephalopods66,67. Therefore nutrients are generally maximal in the spring and minimal in fall after depletion by phytoplankton, whereas zooplankton and chlorophyll would be maximal in fall after nutrient consumption and growth over the summer in the North Pacific region6870.

The AFSC groundfish surveys sample only adults55, and groundfish recruitment is related to the effects of physical and biological variables on early life stages71. Therefore, all environmental predictors were lagged based on the sexual maturity of the focal groundfish. Sablefish and Pacific cod female maturity is reached around 5 years old72,73. Average age of Pacific halibut females that have reached sexual maturity is 12 years, while for males reaching maturity the average age is 8 years74. Rockfish sexual maturity is typically attained from 3 to 7 years of age75. Therefore, lags of 5 years were used for environmental variables in models with sablefish, rockfish, or Pacific cod CPUEs and weights as responses. Lags of 10 years were used for environmental variables in models with responses of Pacific halibut CPUE and weight. These lags were also supported by the findings of Sun et al.53.

A SIVCM of the form given in (1) was fit, where \(y_{i}\) was the CPUE or mean weight of a groundfish at each location for each year, \(X=(\text{ x}_{0i},\ldots ,\text{ x}_{25i})^{T}\) with \(\text{ x}_{0i}=1\) and \(\text{ x}_{1i},\ldots ,\text{ x}_{25i}\) were the lagged seasonal amplitudes of environmental variables described previously, and \(Z=(z_{1i},z_{2i},z_{3i})^{T}\) were the longitude, latitude, and year for each observation. Selection of important variables was performed per year and over all years using the LSSGLASSO procedure53. The choice of knots for the B-spline approximation of the coefficient functions and the optimization of the tuning parameter in the LSSGLASSO procesure are covered in detail in Sun et al.53 The optimal tuning parameter was chosen using a BIC-type selection criterion, while the number of knots was fixed at eight. The number of knots can be chosen using a selection criterion such as GCV or BIC, but since basis expansion was only used for selection of the coefficient functions and every function was re-estimated using local linear regression, the choice of the number of knots does not affect the results53. The performance of the group LASSO procedure was tested against procedures where the group LASSO penalty was replaced by either a group LASSO with ridge penalty or a group SCAD with ridge penalty, and there were no appreciable differences between the three selection procedures. Heat maps were used to visualize selection of variables for each groundfish species. After selection, the SIVCM for all years of sablefish CPUE was refit with scaled \(y, X, \text {and}\; Z\) using only variables selected by LSSGLASSO. The selected coefficient functions for important predictors of sablefish CPUE were plotted to provide a detailed example of interpretation for the SIVCM functions.

To further explore if regional climate conditions influenced the selection of variables each year, I considered potential relationships between the selection of a variable over time and three climate indices that influence environmental systems in the North Pacific region. The Pacific Decadal Oscillation (PDO) monthly index, multivariate El Niño/Southern Oscillation bi-monthly index (MEI), and North Pacific Gyre Oscillation (NPGO) monthly index were obtained for 1979–20137678. Seasonal amplitude for climate indices was defined for each year as the mean of June, July, and August minus the mean of December, January, and February, since climate indices typically describe physical environmental conditions that contribute to temperature and pressure changes affecting ocean nutrient mixing79,80. Logistic regression was then used to fit the model \(g({\mathbb {E}}(y_i)) = \beta _0 + \beta _{k} x_{ki}\) for years \(i=1985, \ldots , 2013\) for each variable within a groundfish response, where \(g(\cdot )\) was a log-link function; \(\beta _0\) was the intercept term; \(\beta _k\) were linear coefficients, \(k \in \{ \text {PDO, MEI, NPGO} \}\); \(x_{ki}\) were seasonal amplitudes of index k lagged by 5 years when considering the responses of all groundfish except those of Pacific halibut, which were lagged 10 years; and \(y_i\) were binary indicators of whether a variable was or was not selected for each year. A Chi-square test was used to compare the models to an intercept-only model. P-values from the Chi-square test were adjusted within groundfish response to control for the false discovery rate (FDR) of multiple testing using the Benjamini-Hochberg procedure81.

Finally, it was necessary to determine whether PDO, MEI, or NPGO were good predictors of groundfish catches and weights and if the effect of a climate index on the selection of variables was related to that index’s direct relationship to the groundfish populations. I fit generalized additive models (GAMs) of the form

$$\begin{aligned} {\mathbb {E}}(y_i) = f_0 + f_1(\text {PDO}_i-\text {lag}) + f_2(\text {MEI}_i) + f_3(\text {NPGO}_i), \qquad i = 1985, \ldots , 2013 \;, \end{aligned}$$

with lagged seasonal amplitudes of PDO, MEI, and NPGO as three additive smooth predictors and groundfish CPUEs or weights as responses \(y_i\)26. Relationships of climate to groundfish responses cannot be assumed to be linear8284. Therefore, the GAM structure was preferred over a generalized linear model, because it permits nonlinear relationships between predictors and the expected response. P-values were estimated for each smooth predictor in each model and plotted in a heatmap for efficient visualization.

Results

Variables consistently selected by LSSGLASSO as important to Pacific cod CPUE were WVHT, Alk 900m, Chl 400m, Oxy 900m, Sil 900m, PRES, plankton, Alk 400m, NO3 75m, and Oxy 75m, which usually were in years 1986, 1990, 1992, 1995, 2006, and 2010 (Fig. 1, green). Variables consistently selected in 1992, 1996, and 2002 for Pacific cod weight were Sil 900m, WVHT, plankton, bottom temperature, Alk 75m, Alk 400m, Alk 900m, Chl 75m, Chl 400m, NO3 75m, Oxy 75m, and Oxy 900m (Fig. 1, orange). Shared variables selected for both CPUE and weight of Pacific cod were numerous in 1982 (Fig. 1, purple). WVHT, WTMP, ATMP, plankton, Alk 75m, and NO3 900m in years 1990–1991, 1997–1999, and 2013 were selected variables for Pacific halibut CPUE, whereas variables important for Pacific halibut weight included bottom temperature, Alk 75m, Chl 400m, Chl 900m, NO3 900m, Oxy 75m, Oxy 400m, and Sil 900m for years 1990 and 1992–1994 (Fig. 2). ATMP, WTMP, WVHT, Chl 75m, Oxy 75m, and Oxy 900m were important variables for Pacific halibut CPUE when all years of data were considered together. WTMP, WVHT, Alk 900m, and Sil 900m were most often selected as important for sablefish CPUE, while WSPD and ATMP were the only variables selected as important to sablefish weight more than once over the 29-year period of record and were the only variables important to sablefish weight when all years of data were considered together (Fig. 3).

A wide array of environmental variables were selected for rougheye rouckfish CPUE consistently and almost exclusively in years 1991, 2001, 2011, and 2013, whereas 1990 was the only year when more than one variable was selected for rougheye rockfish weight (Fig. 4). The most often selected variables related to rougheye rockfish CPUE were WSPD, Oxy 900m, Sil 900m, plankton, Chl 400m, and Oxy 400m, while Chl 400m, Sil 900m, and Oxy 900m were important when all years were considered together. In the case of shortraker rockfish, most environmental variables were selected as important for both CPUE and weight in 2000 and 2007 (Fig. 5, purple). WSPD, Chl 400m, Oxy 400m, and Oxy 900m were most often selected over time for shortraker rockfish CPUE, while 900m depths of NO3, Sil, Phos, and Oxy, 75m depths of Alk, Chl, NO3, Oxy, and Sal, and bottom temperature were most often selected as important to shortraker rockfish weight over all 29 years (Fig. 5). ATMP, WVHT, and Phos 75m were shared as selected variables for shortspine thornyhead CPUE and weight, and many environmental variables contributed to shortspine thornyhead weight in 1986–1987, 1998, 2000, and 2004 (Fig. 6). ATMP was selected most often for shortspine thornyhead CPUE, while WVHT, Chl 400m, and Phos 900m were consistently selected as important to shortspine thornyhead weight over the 29-year period.

Only Pacific cod CPUE, Pacific halibut CPUE, sablefish CPUE and weight, and rougheye rockfish CPUE had variables selected as important predictors when performing selection on data including all available years (Figs. 1, 2, 3, 4). Alk 900m was a common selected variable for Pacific cod and sablefish CPUEs for all years included in the selection procedure, while WTMP was selected for both sablefish and Pacific halibut CPUEs for all years.

Figure 1
figure 1

Variable selection analyzed for each available year and all years together for the SIVCMs with Pacific cod CPUE or WT as the response. X-axis represents years for which selection was performed, and all available years of data used in selection labeled as “allyrs”. Y-axis contains all variables in SIVCM from which selection was performed. Colors indicate selection of variable(s) important to CPUE, WT, neither response, or both responses, along with years missing either CPUE or WT response values. Line plot at top shows lagged seasonal amplitude values for the Pacific Decadal Oscillation (PDO), multivariate El Niño/Southern Oscillation (MEI), and North Pacific Gyre Oscillation (NPGO) for corresponding years.

Figure 2
figure 2

Variable selection analyzed for each available year and all years together for the SIVCMs with Pacific halibut CPUE or WT as the response. Axes, colors, and line plot are as described in Fig. 1.

Figure 3
figure 3

Variable selection analyzed for each available year and all years together for the SIVCMs with sablefish CPUE or WT as the response. Axes, colors, and line plot are as described in Fig. 1.

Figure 4
figure 4

Variable selection analyzed for each available year and all years together for the SIVCMs with rougheye rockfish CPUE or WT as the response. Axes, colors, and line plot are as described in Fig. 1.

Figure 5
figure 5

Variable selection analyzed for each available year and all years together for the SIVCMs with shortraker rockfish CPUE or WT as the response. Axes, colors, and line plot are as described in Fig. 1.

Figure 6
figure 6

Variable selection analyzed for each available year and all years together for the SIVCMs with shortspine thornyhead CPUE or WT as the response. Axes, colors, and line plot are as described in Fig. 1.

I plotted the selected coefficient functions of WTMP, Alk 400m, Alk 900m, Sil 75m, and Sil 900m for sablefish CPUE using all years of data to provide a detailed example of interpretation of the SIVCM coefficient functions (Fig. 7). Parameters representing the single-index direction estimated by the LSSGLASSO procedure were \({\widehat{{\varvec{\theta }}}}^T=(0.9346, -0.3380, -0.1109)\). The relationship between sablefish CPUE and space-time is highly nonlinear, as seen in the intercept function (\(g_0\)). For a given location and year, we can express the single-index direction as \({\varvec{\theta }}^T Z = 0.9346 z_1 - 0.3380 z_2 - 0.1109 z_3\), where \(z_1\) is longitude, \(z_2\) is latitude, and \(z_3\) is the year. For \({\varvec{\theta }}^T Z\) to increase, either longitude increases (moving eastward) or latitude decreases (moving southward). As time moves forward, i.e. year increases, \(\theta ^T Z\) decreases. Fixing two variables in Z while moving the third allows us to consider how each important variable in X affects the response as \({\varvec{\theta }}^T Z\) changes. For example, for a fixed latitude and year, increasing longitude leads to increasing \({\varvec{\theta }}^T Z\), which was associated with increasing, decreasing, and increasing effect of WTMP on sablefish CPUE (\(g_1\) in Fig. 7). So for low and high (east and west) longitude values within the range of the data, increasing WTMP was associated with increasing sablefish CPUE for fixed latitude and year. For medium longitude values, increasing WTMP was accompanied by decreasing sablefish CPUE, ceteris paribus. This same nonlinear relationship was also observed for Sil 900m on sablefish CPUE (Fig. 7, \(g_5\)). Alternatively for a fixed longitude and year, decreasing latitude (moving southward) precipitates an increase in \({\varvec{\theta }}^T Z\), which was associated with an increasing and then decreasing effect of Alk 400m on sablefish CPUE (\(g_2\) in Fig. 7). Hence for southern locations with fixed longitude and year, increasing Alk 400m was associated with decreasing sablefish CPUE. As another option for interpretation, increasing year while fixing longitude and latitude leads to a decrease in \({\varvec{\theta }}^T Z\). Therefore for a fixed longitude-latitude pair, increasing years was associated with increasing, decreasing, then increasing effects of Alk 900m on sablefish CPUE (Fig. 7, \(g_3\)). Thus in early and late years (e.g. 1985–1988 and 2010–2013) for a fixed location, increasing Alk 900m was related to increasing sablefish CPUE.

Figure 7
figure 7

Coefficient functions selected and estimated by LSSGLASSO as important predictors of sablefish CPUE.

Lagged seasonal amplitude of NPGO was related to the selection or non-selection of PRES, WSPD, BOT TEMP, Alk 75m, Alk 400m, Chl 75m, Chl 900m, NO3 75m, NO3 900m, Oxy 75m, Oxy 900m, Phos 900m, Sal 75m, Sal 400m, Sil 75m, and Sil 900m for shortraker rockfish weight (Fig. 8). Seasonal amplitudes of PDO, MEI, and NPGO climate indices in the North Pacific were not related to the selection of any environmental variables for any other groundfish responses.

Figure 8
figure 8

FDR-corrected P-values for relationship between selection of each environmental variable contributing to groundfish CPUEs or WTs each year and lagged seasonal amplitudes of climate indices PDO (a), MEI (b), and NPGO (c) fitted using logistic regression. Abbr: cod = Pacific cod; halibut = Pacific halibut; sable = sablefish; rrock = rougheye rockfish; srock = shortraker rockfish; thorny = shortspine thornyhead; CPUE = catch per unit effort; WT = mean weight in kg. P < 0.05 are indicated with red.

Lagged seasonal amplitudes of PDO, MEI, and NPGO were significant predictors of Pacific cod CPUE, while PDO and MEI were not significant predictors for any of the other groundfish responses (Fig. 9). Lagged seasonal amplitudes of PDO, MEI, and NPGO provided a moderate fit for Pacific cod CPUE (deviance explained\(=0.581\)). Lagged NPGO was also a good predictor of rougheye rockfish CPUE and provided a good fit to the data (deviance explained\(=0.641\)).

Figure 9
figure 9

Heatmap of P-values for nonlinear relationship between groundfish CPUEs or WTs each year and lagged yearly seasonal amplitudes of climate indices PDO, MEI, and NPGO fitted using generalized additive models (GAMs). Colors and abbr. are as described in Fig. 8.

Discussion

Deepwater species such as the groundfish examined in this study are difficult to monitor regularly and study extensively, so determining environmental variables that most accurately predict population health of these species is critical. In this study, I illustrated LASSO-type variable selection on the SIVCM capable of including spatiotemporal covariates for modeling large-scale population data, identifying the most significant factors affecting catch and mean weights of groundfish, and determining the functional relationships of these important predictors to the response for inference. Though the environmental variables considered in this application were collected in situ, the SIVCM with group LASSO-type selection would be ideal for spatially or spatiotemporally measured remotely sensed data, such as environmental satellite data.

CPUE and mean weights of groundfish contribute to calculations of relative population numbers and weights which are used to inform management and policy decisions85. The use of relevant environmental covariates influencing groundfish catch and weight can inform spatiotemporal modeling of groundfish distributions and aid prediction of distribution shifts due to climate change86. This knowledge is particularly urgent for predicting the effects of climate change and devising adaptive strategies for vulnerable populations such as the three rockfish species included in this study, stocks of which are highly sensitive to climate change in the eastern Bering Sea87.

Increased growth rates in juvenile sablefish with warmer water temperatures have been observed in laboratory settings88,89 and off the coast of Oregon, where early growth was related to recruitment90. Increased sablefish recruitment may be reliant on greater stratification developed by warmer surface water temperatures91, while warmer deep waters during the sablefish egg stage combined with colder surface waters during the pelagic larvae stage have been related to higher recruitment92. The importance of sea surface temperature to sablefish CPUE across all years established in this analysis is therefore in agreement with previous experimental and observational studies and stresses the importance of surface water temperatures on the success of juvenile sablefish90. Sogard90 presumed faster sablefish growth occurred due to high overall productivity reflected in the PDO index, however in this analysis there was no direct relationship between PDO and sablefish CPUE or mean weight, nor was PDO found to be related to to the selection of any environmental variables for sablefish CPUE or weight. However, the selection of silicate as important for sablefish CPUE may point to a relationship of primary production on sablefish, as silicate is an important regulator of phytoplankton growth in Alaskan marine waters93. The selection of air temperature and wind speed as important across all years for sablefish weight is consistent with the relationships of transport and of July winds during juvenile years to recruitment92,94 and suggests the importance of air and surface water movements on early-stage sablefish.

Chlorophyll and sea surface temperature were important across all years for Pacific halibut CPUE, along with air temperature, dissolved oxygen, and wave height. These findings are corroborated by previous observations of Pacific halibut. Temperature influences growth on Pacific halibut juveniles, which exhibit compensatory growth in response to periods of temperature-limited growth95,96. Bioenergetics model simulations and diet data suggest that adult growth may also be limited by temperature and prey quality or availability97. Pacific halibut weight data were no longer collected after 1994 by the AFSC surveys, so a consistent, long term signal in the data to determine important variables related to weight of Pacific halibut was unlikely. However, growth in early years contributes to year-class abundance for Pacific halibut95. The findings of chlorophyll, air temperature, and wave height as important to Pacific halibut CPUE point to sensitivity of Pacific halibut to bottom-up forces in the North Pacific Ocean98. Chlorophyll is an important indicator of changes in the carbon transport system and plankton biomass vital for energy transfer to higher tropic levels69,99, while air temperature and wave height influence the duration of wind-mixing events and change the depth of the mixed-layer, affecting primary and secondary production7,100102. A relationship between dissolved oxygen and Pacific halibut catch has also been established previously, with Pacific halibut likely adjusting to changes in oxygen environments through lateral rather than vertical movements103.

Variables selected as important to specific groundfish in this work that have not previously been explored encourage further investigation. For example, only deep water alkalinity contributed to Pacific cod CPUE across all years of available data, but Pacific cod CPUE was also related to PDO, MEI, and NPGO indices. While water temperatures have been related to interannual depth changes, range shifts, and population declines of adult Pacific cod104106, juvenile Pacific cod may respond to drivers other than temperature106. Alkalinity, as part of the ocean’s carbonate system, moderates seasonal pH variability107. Further, increased variability in the interannual carbonate system in Pacific Ocean is also attributed to variability in the PDO and ENSO108. Therefore the selection of alkalinity as important to Pacific cod catch may correspond to several climate modes strongly dictated by the carbonate system and closely correlated to variability in alkalinity. Then again, the importance of alkalinity may instead represent a sensitivity of early stage Pacific cod to ocean acidification, as increased atmospheric carbon dioxide changes alkalinity levels109,110.

Another example of previously unexplored environmental variables found in this study to be related to groundfish are the important variables for rougheye rockfish CPUE across all years, which were exclusively from deeper waters (>300 m). Since rockfish are viviparous, exposure to near-surface environment during development is eliminated. Rougheye rockfish CPUE was best predicted by NPGO seasonal amplitudes, while chlorophyll, dissolved oxygen, and silicate were all important for rougheye rockfish CPUE over all years. NPGO describes variations in salinity, nitrate, phosphate, silicate, oxygen, and chlorophyll, capturing important interannual and decadal biological patterns4. This suggests that the selection of variables affecting rougheye rockfish catch was the result of an existing, strong effect of nutrient cycling patterns captured by the NPGO index on rougheye rockfish populations.

Selection of important variables related to groundfish catch and weight using LSSGLASSO for SIVCMs provided important confirmation of previous findings and new avenues of research for large populations of groundfish in the North Pacific over an extended time period during which several large-scale regime shifts have been reported1,6,111,112. While large-scale climate indices are often favored when studying populations affected by regime shifts of the northern Pacific marine system, the specific pathways of their effects on a population or community are often difficult to disentangle. Most groundfish responses had a large array of environmental variables selected for specific years, indicating that the strength of environmental effects on groundfish may vary over time. Selection of environmental variables appeared to be unrelated to seasonal amplitudes of the PDO, MEI, and NPGO climate indices for most groundfish. Large-scale climate patterns are associated with groundfish responses1,2, however my results highlight the insufficiency of climate modes alone to accurately describe variability found in many commercially valuable marine populations on decadal scales10. These results indicate the existence of several pathways of climate and other environmental effects on groundfish responses. Ecological variability in these marine systems are therefore likely driven by gradual spatial and temporal climate variability that reduces resilience, with abrupt changes in climate modes precipitating permanent changes in vulnerable systems8.