Introduction

Ocean acidification (OA) is a central theme of the Intergovernmental Panel on Climate Change (IPCC) Special Report on the Ocean and Cryosphere in a Changing Climate1. While climate variability delays the emergence of climate change signals for other properties of the Earth system, the high signal-to-noise ratio of OA2 is a clear indicator of the effects of anthropogenic carbon emissions on the ocean surface. For >95% of the world’s ocean, the OA signal has already emerged from the background natural variability3. The effects of acidification are far-reaching, extending from high-latitude ice-covered regions to tropical warm water ecosystems, as climate models project a ubiquitous decline in surface pH in response to increasing atmospheric carbon dioxide concentrations4.

Global distributions of projected OA are well-documented in the literature4,5,6. Many of the studies that examine basin-wide changes in OA highlight the vulnerability of the Arctic to acidification7,8,9,10,11,12,13. Melting sea ice in response to climate warming creates a larger outcrop region for waters that are depleted in carbon dioxide, which is measured by its partial pressure in seawater (pCO2). The freshwater from ice melt contributes to reductions in sea surface salinity in the Central Arctic basin, which accelerate the decline in both pH and the carbonate ion saturation states (Ω)12, making the water more corrosive. Increases in stratification associated with fresher and warmer waters in the Central Arctic limit nutrient supply from below, which further reduces values of Ω7. In the Greenland Sea, deeper mixed layers also contribute to lower values of Ω while also transporting the OA signal to the deep ocean. Import of relatively fresh North Pacific surface water through the Bering Strait combines with increased riverine input, leading to a freshening of the Beaufort Gyre and a reduction in surface total alkalinity10, which is a measure of seawater buffer capacity. Rivers draining into the Arctic are often low in alkalinity and high in pCO2, making them another major source of acidification14,15.

Arctic OA can be thought of as the end result of many processes changing on different spatial and temporal scales. Given model biases in upper ocean stratification, salinity, mixing, and the resolution-dependent penetration of carbon-rich Atlantic waters16, regions of different OA are not guaranteed to be consistent across models or even with observations. Coupled models represent these processes with varying degrees of skill. For example, performance in simulating Arctic sea ice remains highly variable across the CMIP6 multi-model ensemble17. The dilution effect of melting sea ice contributions to Arctic OA is on the same order of magnitude as the air-sea disequilibrium of carbon dioxide13. Studies have also started to consider OA from a water mass perspective and find a positive correlation between maximum sea surface density and depth-integrated anthropogenic carbon uptake (Cant) among CMIP5 models18. Higher surface densities are linked with more transport from the ocean surface to the interior, and this relationship provides an emergent constraint on model projections of future carbon uptake18.

OA is expected to have negative impacts on marine ecosystems, partly through a reduction in the favorability of calcium carbonate formation. Marine organisms derive their exoskeletons by precipitation of aragonite and calcite. The carbonate ion saturation states with respect to aragonite (ΩA) and calcite (ΩC) are a measure of the dissolution potential of these mineral exoskeletons and thus the stress on marine organisms. Values of Ω > 1 indicate an oversaturation, or excess of the carbonate ion species. Values of Ω < 1 indicate undersaturation and increased competition for carbonate substrates between biological processes and the additional hydrogen ion concentrations [H+] resulting from acidification. The exact threshold of Ω at which organisms become stressed is species-dependent19; however, the dissolution of calcium carbonate shells becomes increasingly likely20 at Ω < 1.
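For reference, the saturation state is conventionally written as the ratio of the calcium and carbonate ion product to the stoichiometric solubility product of the mineral. The expression below is the standard textbook definition rather than a formula quoted from this study, and the symbol K*sp,X is introduced here only for illustration.

```latex
\Omega_X = \frac{[\mathrm{Ca}^{2+}]\,[\mathrm{CO}_3^{2-}]}{K^{*}_{sp,X}},
\qquad X \in \{\mathrm{A}\ (\text{aragonite}),\ \mathrm{C}\ (\text{calcite})\}
```

Under this definition, Ω > 1 corresponds to supersaturation and Ω < 1 to undersaturation with respect to mineral X.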

Within the Arctic basin, these different physical and biogeochemical processes driving acidification vary in space and time, underscoring that the Arctic cannot be treated as a single entity. Studies have sought to divide the Arctic into distinct geographical sub-regions based on their surface properties and dominant processes that are driving changes in OA7,12. These multivariate approaches, while successful, rely on the selection of physical and biological variables to define these regions and vary from study to study. Machine learning (ML) approaches have the potential to analyze sub-basin scale OA responses in an objective way and have already been successfully applied to observed and simulated ocean biogeochemical datasets21,22. A water mass-based approach, coupled with the objectivity of machine learning techniques, has the potential to transcend model biases and the limitations of geographical boundaries. Such an approach provides a framework that can facilitate the evaluation of projections of OA across different models and under different climate change scenarios.

An adapted version of the Systematic AGgregated Eco-province (SAGE) methodology22 is used here to objectively identify Arctic regions of surface OA in response to future climate forcing. The SAGE method uses unsupervised machine learning to objectively classify ecosystem biomes that are otherwise derived subjectively and is targeted at identifying clusters in highly non-linear data ubiquitous in the geosciences and beyond. We use simulation data from two state-of-the-art coupled models developed at NOAA’s Geophysical Fluid Dynamics Laboratory (NOAA-GFDL), described in more detail in the methods section. The two models, GFDL-ESM4.123 and GFDL-CM424, differ in their horizontal resolution and biogeochemical model complexity and offer a unique opportunity to examine similarities and differences with respect to OA. Results from historical simulations, a lower carbon emission future climate scenario (SSP2-4.5, ssp245), and a higher emission scenario25 (SSP5-8.5, ssp585) are analyzed with the clustering workflow using predictors based on 20-year averaged (2041−2060) sea surface temperature, salinity, and pH anomalies relative to a centennial average of years 1850–1949. The ESM4 model was run in both a “concentration-driven” mode and an “emission-driven” mode where the latter uses a fully prognostic CO2 tracer shared by all coupled model components. Table 1 describes the model configurations and experiments used in this study. Model results from their respective native tri-polar grids were remapped to a 1° × 1° spherical grid and the clustering framework was performed independently for the ESM4 and CM4 models (see methods).
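The anomaly predictors described above can be illustrated with a minimal sketch. It assumes annual-mean values stored per year for a single grid point; the function names and the dictionary layout are hypothetical and are not taken from the SAGE workflow.

```python
def period_mean(series, start, end):
    """Mean of the annual values for years start..end (inclusive).

    series maps year -> annual-mean value at one grid point.
    """
    values = [v for year, v in series.items() if start <= year <= end]
    if not values:
        raise ValueError(f"no data between {start} and {end}")
    return sum(values) / len(values)


def anomaly_predictor(series, target=(2041, 2060), baseline=(1850, 1949)):
    """20-year target mean minus the centennial baseline mean."""
    return period_mean(series, *target) - period_mean(series, *baseline)


# Toy surface pH record: 8.1 in the baseline period, 7.8 at mid-century.
ph = {year: 8.1 for year in range(1850, 1950)}
ph.update({year: 7.8 for year in range(2041, 2061)})
print(round(anomaly_predictor(ph), 3))
```

The same function would be applied to sea surface temperature and salinity to build the three predictors for each grid point.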

Table 1 Models and experiments used in this study.

Results

Anthropogenic carbon uptake and accumulation in the Arctic

Anthropogenic carbon uptake by the ocean is the primary driver of OA. Both ESM4 and CM4 produce qualitatively similar patterns of Cant compared to the gridded 1° × 1° Global Ocean Data Analysis Project version 2 (GLODAPv2) climatology26 (Fig. 1, see methods). Based on a three-member ensemble of simulations driven by historical climate forcing, the globally integrated inventory of Cant between the years 1850 and 2011 is 153.1 ± 1.6 GtC for CM4. The Cant inventory is 143.8 ± 0.2 GtC in the concentration-driven version of ESM4 (historical) and 178.8 ± 2.8 GtC in the emission-driven version (esmHistorical) of ESM4. When accounting for additional anthropogenic carbon emissions between 1791 and 185027 (~10–20 GtC, see methods), both the CM4 and ESM4 esmHistorical inventories of Cant are high compared to the observed estimates (134.9 ± 24.0 GtC) and the suite of CMIP5 models27 (136.6 ± 14.0 GtC). The concentration-driven ESM4 historical simulation (historical) has lower Cant uptake28 which agrees more closely with observed estimates and the CMIP5 ensemble.

Fig. 1: Maps of climatological and model-simulated ensemble mean (n = 3) anthropogenic carbon (Cant) uptake for the year 2002.

GLODAPv2 observationally-based gridded Cant climatology at 1° resolution (a) is compared with Cant for GFDL-CM4 (b), the concentration-driven version of GFDL-ESM4 (c), and the emission-driven version of GFDL-ESM4 (d). Stippling denotes grid points where the model-simulated Cant range between the minimum and maximum values from the historical ensemble overlaps the observational remapping error from the GLODAPv2 gridded product. Anthropogenic carbon uptake in the models is defined as the difference between the historical simulation and the models’ preindustrial control simulation. Negative values of Cant in the models are possible at grid points where the local effects of carbon export by changing circulation exceed uptake.

Patterns of Cant differ in the Arctic between CM4 and both versions of ESM4. The ensemble-mean pattern correlation (r2) with the GLODAPv2 gridded climatology for the region poleward of 65°N is 0.78 and 0.79 for the ESM4 historical and esm-hist simulations, respectively, and 0.87 for the CM4 historical simulation. This difference is related to the models’ representation of relatively warm and salty water from the North Atlantic Current flowing into and across the Arctic basin. The warm, carbon-rich waters flow into the Arctic, where they cool and flow underneath the relatively fresher surface transpolar current. This effect is present in CM4, which shows better agreement with the GLODAPv2 climatology. In both versions of ESM4, water flowing in from the North Atlantic terminates prematurely and there is increased accumulation of Cant in the Eurasian Basin. Despite the relative simplicity of the ocean biogeochemistry modeling component in CM4 (see methods), the high-resolution ocean contributes to the better representation of carbon uptake and transport within the core of the Arctic basin and is consistent with similar findings that explore the impact of model resolution on Cant accumulation and transport16.
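A pattern correlation of this kind could be computed along the following lines. This is a sketch under stated assumptions: the text does not say whether the correlation is area-weighted, so the cosine-of-latitude weighting and the function name are illustrative choices only.

```python
import math


def weighted_r2(model, obs, lats, min_lat=65.0):
    """Squared pattern correlation poleward of min_lat.

    model, obs, lats are flat lists with one entry per grid point.
    Weights are cos(latitude), an assumed stand-in for grid-cell area
    on a regular 1-degree grid.
    """
    pts = [(m, o, math.cos(math.radians(lat)))
           for m, o, lat in zip(model, obs, lats) if lat >= min_lat]
    wsum = sum(w for _, _, w in pts)
    m_bar = sum(w * m for m, _, w in pts) / wsum
    o_bar = sum(w * o for _, o, w in pts) / wsum
    cov = sum(w * (m - m_bar) * (o - o_bar) for m, o, w in pts)
    var_m = sum(w * (m - m_bar) ** 2 for m, _, w in pts)
    var_o = sum(w * (o - o_bar) ** 2 for _, o, w in pts)
    return cov ** 2 / (var_m * var_o)


# The point at 60N lies outside the region and is ignored.
r2 = weighted_r2([3.0, 5.0, 7.0, -50.0], [1.0, 2.0, 3.0, 100.0],
                 [70.0, 75.0, 80.0, 60.0])
```

In the toy call the three Arctic points are perfectly linearly related, so the squared correlation is 1 to within rounding.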

To demonstrate the relative vulnerability of the Arctic to OA compared to the rest of the world, we define a metric of the acidification potential of CO2 uptake for a given location (Fig. 2). The metric is defined as the change in hydrogen ion concentration [H+] between the periods 2081–2100 and 1850–1949 normalized by the cumulative carbon uptake (Cant) over this entire time (1850−2100). The metric defines how much OA occurs at each model grid point per unit of Cant uptake and can be interpreted as an “efficiency” that quantifies how successful global anthropogenic carbon emissions are at leading to OA at a given location. While the change in [H+] is positive for all locations, cumulative carbon uptake is negative in some locations—particularly in the tropics where persistent upwelling causes the ocean to outgas CO2 from the deep ocean to the atmosphere. The metric is only defined for locations where there is both an increase in [H+] and positive cumulative carbon uptake.
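The masking rule in this definition can be made concrete with a short sketch. It assumes the change in [H+] and the cumulative uptake have already been computed for each grid point; the function names are hypothetical.

```python
def acidification_potential(delta_h, cumulative_uptake):
    """Change in [H+] per unit of cumulative carbon uptake at one grid point.

    delta_h: mean [H+] over 2081-2100 minus mean [H+] over 1850-1949.
    cumulative_uptake: air-sea CO2 flux integrated over 1850-2100,
        positive into the ocean.
    Returns None where the metric is undefined (either input not positive).
    """
    if delta_h > 0 and cumulative_uptake > 0:
        return delta_h / cumulative_uptake
    return None


def potential_map(delta_h_grid, uptake_grid):
    """Apply the metric point by point over flattened grids."""
    return [acidification_potential(dh, cu)
            for dh, cu in zip(delta_h_grid, uptake_grid)]


# The second point outgasses over the period, so it is masked out.
print(potential_map([1.0, 3.0], [2.0, -5.0]))
```

Returning None rather than a negative ratio keeps outgassing regions, such as tropical upwelling zones, out of the map entirely.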

Fig. 2: Acidification potential of cumulative carbon uptake.

Results are for GFDL-CM4 (a, d), the concentration-driven version of GFDL-ESM4 (b, e), and the emission-driven version of GFDL-ESM4 (c). Results are shown for the higher emissions ssp585 scenario (a–c) and lower emissions ssp245 scenario (d, e). The acidification potential metric reflects how efficient global anthropogenic carbon emissions are at driving surface ocean acidification at each grid point. The metric is defined as the change in surface hydrogen ion concentration (mol [H+] m−2) averaged over years 2081–2100 normalized by the time-integrated air-sea CO2 flux (μmol CO2 m−2). Although the increase in [H+] is positive at all locations, the cumulative carbon uptake at a grid point may be negative, especially in the tropics. Only positive values of this acidification potential metric are shown.

The largest values of the acidification potential (>100 mol [H+] μmol−1 CO2) occur globally along the edges of the tropical water masses, in the main ACC region of the Southern Ocean, and in the Arctic. Acidification potentials are an order of magnitude larger in these regions compared to the rest of the world ocean. Within the Arctic, there are regional increases in the acidification potential along the North American coastline, with the largest values bordering the Beaufort Sea and the Queen Elizabeth Islands. As demonstrated in the next section, the fate of Arctic sea ice plays a role in making this sector of the Arctic particularly vulnerable to acidification.

Probabilistic projection and density-based clustering of Arctic acidification

An ensemble-based application of the SAGE method to the CM4 and ESM4 projected changes in SST, SSS, and surface pH results in four dominant clusters in the Arctic that are broadly consistent across both models and scenarios (Fig. 3). The three predictors were chosen as a minimal set of variables that define surface water mass properties and acidification and that are readily available through coordinated modeling experiments such as CMIP. The objective of this analysis is to characterize large-scale water mass responses in the Arctic. The number of grid points in each cluster declines as the clusters become smaller in area, making the signal-to-noise ratio smaller. Thus, only clusters where there is >50% areal agreement between CM4 and ESM4 are considered robust, leading to the presence of unshaded regions in this figure. The full clustering for each model and scenario is shown in Supplemental Fig. 1.

Fig. 3: Objectively identified clusters of Arctic Ocean acidification for GFDL-CM4 and GFDL-ESM4 simulations using the adapted SAGE methodology (Sonnewald et al. 2020; see methods).

Clusters are based on the mid-century (2041–2060) projected responses in sea surface temperature, sea surface salinity, and surface pH compared to the early historical period (1850–1949). Results from the higher emissions ssp585 scenario (a, c, e) and lower emissions ssp245 scenario (b, d) are shown. Four distinct regions (shading) are present in the Arctic: the Central Arctic Surface Waters (CASW, blue), more general Arctic Surface Waters (ASW, red), North Atlantic Surface Waters (NASW, purple), and Sub-Arctic Surface Waters (SASW, green). Only clusters where there is >50% areal agreement between CM4 (a, b) and ESM4 (c–e) are shown. Hatching denotes grid points where fewer than 32 of the 40 iterations (80%) of the clustering workflow agree on the cluster assignment for that point. The solid line indicates the model-simulated boundary of September sea ice extent averaged over the years 1979–2014. The dotted region indicates the model-projected September sea ice extent for the years 2041–2060. Grid points identified as noise (e.g., statistical outliers, inliers) are shaded as light gray.
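The 32-of-40 iteration criterion in this caption can be read as a consensus check on each grid point. The sketch below shows one such reading; it assumes that cluster labels have already been matched across iterations, which the caption does not describe, and the function name is hypothetical.

```python
from collections import Counter


def consensus_label(labels, min_agree=32):
    """Most frequent cluster label for one grid point across iterations.

    labels: the label assigned to the point in each clustering iteration.
    Returns (label, is_robust), where is_robust is True when at least
    min_agree iterations share the most frequent label.
    """
    label, count = Counter(labels).most_common(1)[0]
    return label, count >= min_agree


print(consensus_label(["CASW"] * 33 + ["ASW"] * 7))
print(consensus_label(["CASW"] * 31 + ["ASW"] * 9))
```

Points for which the second element is False would be the ones drawn with hatching under this reading.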

The core of the Arctic basin is divided into two regions, a Central Arctic Surface Water (CASW, blue) region surrounded by a more general Arctic Surface Water (ASW, red) region. Both of these clusters exhibit declines in surface pH that are larger (0.6–0.8) than the pan-Arctic average (Fig. 4a). The remaining two clusters include Sub-Arctic Surface Waters (SASW, green) and North Atlantic Surface Waters (NASW, purple), both of which exhibit declines in surface pH that are more modest (0.5–0.6) compared to the pan-Arctic average. The SASW cluster is the smallest region in terms of area. The SASW cluster fails to be identified in the lower emissions scenario for ESM4 and incorporates many of the grid points that are defined as part of the NASW cluster in the other models and scenarios. Since the average changes in the surface water mass properties in the SASW cluster closely track the pan-Arctic average, the rest of the analysis will focus on the remaining three clusters.

Fig. 4: Simulated time series of cluster-average surface properties.

Results are shown for surface pH (a), surface total alkalinity (b, mol m−2), sea surface temperature (c, °C), sea surface salinity (d, p.s.u.), air-sea CO2 flux (e, kg m−2 yr−1), and surface calcium carbonate saturation with respect to aragonite (f, ΩA, mol m−3/mol m−3) from the GFDL-CM4 (I, IV) and GFDL-ESM4 (II, III, V) models. Anomalies are calculated relative to the 1850–1949 average for all variables except for ΩA where the raw time series is presented. Results are shown for the higher emissions ssp585 scenario (red shaded group) and lower emissions ssp245 scenario (unshaded group). Time series for the pan-Arctic average (black), Central Arctic Surface Waters (CASW, blue), more general Arctic Surface Waters (ASW, red), North Atlantic Surface Waters (NASW, purple), and Sub-Arctic Surface Waters (SASW, green) are shown. Shading denotes the Monte Carlo-based uncertainty ranges for the single realization of each time series (see methods).

The mechanisms of high-latitude warming-induced sea ice loss are known to be drivers of enhanced OA in the Arctic. This is primarily through increasing the surface area of the Arctic Ocean that is in contact with the atmosphere and the freshwater dilution effects of sea ice melt on the carbonate system balance. Summer sea ice extent is shown in Fig. 3 for the period from the start of the satellite era to the beginning of the future scenario simulations (1979–2014, contours) and for the mid-21st century projections (2041–2060, dotted region). September was chosen for this analysis as it represents both the climatological minimum sea ice extent and a proxy for multi-year ice. A threshold of 15% grid cell coverage by sea ice was used to differentiate ice-covered versus ice-free grid points.
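The 15% threshold translates into a simple mask on sea ice concentration. The sketch below is illustrative only: it assumes concentration is expressed as a fraction between 0 and 1, treats a cell at exactly 15% as ice-covered, and uses a hypothetical function name.

```python
def sea_ice_extent(concentration, cell_area, threshold=0.15):
    """Total area of grid cells counted as ice-covered.

    concentration: sea ice fraction (0-1) for each grid cell.
    cell_area: area of each grid cell, in any consistent unit.
    A cell counts as ice-covered when its concentration is at least
    the threshold (the >= convention is an assumption).
    """
    return sum(area for conc, area in zip(concentration, cell_area)
               if conc >= threshold)


# Cells at 90% and 15% coverage are counted; those at 10% and 0% are not.
print(sea_ice_extent([0.90, 0.15, 0.10, 0.0], [1.0, 2.0, 3.0, 4.0]))
```

Note that extent counts the full area of every qualifying cell, which differs from ice area, where each cell would be scaled by its concentration.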

The ASW cluster boundary bears rough correspondence to the present-day September sea ice extent in all the models and scenarios (Fig. 3). Some deviations from this relationship are evident, especially near the Labrador Sea, the Bering Strait, and in the Barents Sea. The location of the minimum September sea ice extent is related to the limits of penetration of warmer, saltier Atlantic surface waters. North Atlantic waters flow into the Arctic basin at subsurface to intermediate depths (>150 m depth), eroding the halocline and warming the surface ocean from below29. The approximate correspondence of this region to present-day sea ice concentrations is an important observable benchmark to identify regions where enhanced OA is likely to occur under a future high-emissions scenario.

The CASW boundary is mostly aligned with the mid-century September sea ice extent projections. The correspondence between the CASW cluster boundary and the presence of multi-year ice suggests that the CASW region of the Arctic is the last to become ice-free in the summer and highlights the importance of the year-round presence of sea ice in providing a physical barrier between the ocean and increased CO2 concentrations in the atmosphere. The approximate correspondence between the sea ice and CASW boundaries varies across models and scenarios, reflecting both the large internal variability of sea ice as well as a general uncertainty in climate model projections of future Arctic sea ice17. It is noteworthy, however, that the information content of the sea ice change is captured by the predictors used in the clustering algorithm (T, S, pH), whereas previous studies7,12 relied on multiple predictors to classify this region.

The primary water mass difference between the CASW and ASW regions is a more rapid decline in CASW sea surface salinity (~−2 to −2.5 p.s.u. compared with ~−1 to −2 p.s.u.) and an accompanying reduction in surface total alkalinity (~−0.1 to −0.2 mol m−3 compared with ~−0.05 to −0.1 mol m−3) during the first half of the 21st century. Freshwater contributions from sea ice melt and river runoff reduce the buffer capacity of this region15, and the increased exposure of surface waters depleted in pCO2 enhances carbon uptake that leads to acidification. Surface pH declines more rapidly in the ASW cluster in the first few decades of the 21st century, but the rapid decline in the buffering capacity of the CASW cluster allows it to catch up to the ASW cluster during the middle and latter half of the century. Both clusters end up with similar pH changes by the year 2100 but have very different trajectories.

The NASW cluster includes the Greenland and Barents Seas in all the models and scenarios. The exceptions to this pattern are the inclusion of some North Pacific waters in the CM4 ssp585 and ESM4 esm-ssp585 simulations, and more general sub-Arctic waters in the ESM4 ssp245 simulation. This cluster exhibits a modest decline in surface pH compared to the pan-Arctic average, and the defining surface water mass feature is strong SST warming averaging between 4 and 6 °C in the ssp585 scenarios. The warmer water has reduced CO2 solubility, and since this water originates at lower latitudes, it is more carbon-rich than the existing water masses in the Arctic. Both factors inhibit carbon uptake in this cluster relative to the central Arctic clusters. The NASW cluster is also the only cluster to maintain or even increase surface total alkalinity (by ~0.05 mol m−2) while sea surface salinity remains nearly constant. This response enhances the buffering capacity in the cluster as well.

The air-sea CO2 flux increases during the simulations for each of the clusters. The increase in the air-sea CO2 flux reflects the increasing atmospheric concentration of carbon dioxide and the disequilibrium with the dissolved CO2 in the surface ocean. There is an inverse relationship in the Arctic between the clusters that exhibit the strongest decline in pH and the largest increase in the air-sea CO2 flux. The flux increases the most in the NASW cluster, peaking at values between 1 and 2 kg m−2 yr−1 in the middle of the 21st century before rapidly declining as the surface water warms. The other clusters show a more gradual increase and plateau at the end of the 21st century. The NASW cluster is removed from the influences of coastal runoff, and combined with warmer SSTs that help maintain the buffer capacity, this cluster is less vulnerable to acidification than the other clusters. Since the physical ocean contributions in these sub-Arctic waters act to reduce carbon uptake through changes in solubility, biological mediation of carbon fluxes may be important in this region given the large increase shown in the air-sea CO2 flux.

Pathways toward corrosive Arctic surface waters

The impact of OA on ecosystems can be quantified through changes in the calcium carbonate ion saturation states, which represent carbonate levels present in seawater that are available to marine organisms. Here we analyze simulated changes in the carbonate ion saturation state with respect to aragonite (ΩA). All the clusters across the models and scenarios considered in this study exhibit declines in ΩA from an initial baseline of 1.5–2.5 over the historical period, and these declines accelerate during the 21st century (Fig. 4f). The declines are strongest under the higher emission rate ssp585 and esm-ssp585 scenarios, where most of the clusters cross the threshold of corrosiveness (ΩA < 1) around mid-century. The NASW cluster in both models is an exception, as ΩA remains above the corrosive threshold until the late 21st century. The delay of 2 to 3 decades compared with the pan-Arctic average is related to stable or increasing trends in sea surface salinity that maintain the surface water buffer capacity. The same effect is also seen for the SASW cluster in the esm-ssp585 scenario performed with GFDL-ESM4.

The transition to corrosive surface waters, expressed as the fraction of cluster area, is remarkably abrupt in the CASW and ASW clusters (Supplemental Fig. 2). The transition from baseline historical conditions (~20% corrosive) to completely corrosive conditions occurs over the course of 1–2 decades under the high carbon emissions scenario. Internal climate variability will play a pivotal role in determining the exact timing of when this abrupt shift to corrosive surface water occurs in the middle of this century. As such, the dates shown in this study are not exact projections, as they are based on a single model family and on the lower and higher emissions scenarios considered. An analysis that includes additional CMIP-class models and ensemble members and considers additional climate change scenarios will likely result in a broader range of timing for this abrupt change. This uncertainty poses a challenge for adaptation and resilience efforts towards corrosive Arctic waters, as the projected timescale of the transition is likely to be shorter than the uncertainty range of when such a transition occurs.
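The corrosive fraction of a cluster's area can be sketched as an area-weighted count of grid cells with ΩA < 1. The function name and the flat-list inputs are assumptions made for illustration, not the authors' implementation.

```python
def corrosive_area_fraction(omega_a, cell_area, threshold=1.0):
    """Fraction of total area where aragonite saturation is below threshold.

    omega_a: annual-mean surface aragonite saturation state for each
        grid cell in a cluster.
    cell_area: area of each grid cell, in any consistent unit.
    """
    total = sum(cell_area)
    if total <= 0:
        raise ValueError("cluster has no area")
    corrosive = sum(area for omega, area in zip(omega_a, cell_area)
                    if omega < threshold)
    return corrosive / total


# Two of four equal-area cells are undersaturated.
print(corrosive_area_fraction([0.8, 1.2, 0.95, 1.5], [1.0, 1.0, 1.0, 1.0]))
```

Evaluating this fraction for each year would give a time series that rises from the ~20% historical baseline toward 1 as the cluster becomes fully corrosive.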

Internal variability limits confidence in projections of the exact timing of the earlier transition to corrosive surface waters in the CASW and ASW clusters relative to the pan-Arctic average when using annual average values of ΩA. The relative vulnerability of these two clusters to changing ΩA compared with the rest of the Arctic basin is more apparent when considering changes occurring during the annual cycle. Trends in ΩA are shown for each month in the annual cycle for the years 2015–2050 at the beginning of each SSP scenario (Fig. 5). Trends are shown for surface ΩA, surface pH, sea surface temperature, and sea surface salinity. All trends are expressed in units per decade.
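A per-month trend and its standard error could be estimated with an ordinary least-squares fit, as sketched below. The choice of ordinary least squares, the absence of any autocorrelation adjustment, and the function name are all assumptions; the fitting procedure is not specified in the text.

```python
import math


def trend_per_decade(years, values):
    """Least-squares trend and its standard error, both per decade.

    years: calendar years for one month of the annual cycle.
    values: the quantity (for example, surface Omega_A) in that month.
    """
    n = len(years)
    if n < 3:
        raise ValueError("need at least three points")
    y_bar = sum(years) / n
    v_bar = sum(values) / n
    sxx = sum((y - y_bar) ** 2 for y in years)
    sxy = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    slope = sxy / sxx
    intercept = v_bar - slope * y_bar
    resid_ss = sum((v - (intercept + slope * y)) ** 2
                   for y, v in zip(years, values))
    stderr = math.sqrt(resid_ss / (n - 2) / sxx)
    return 10.0 * slope, 10.0 * stderr


# Synthetic September series declining by 0.02 per year from 2015 to 2050.
years = list(range(2015, 2051))
omega = [2.0 - 0.02 * (y - 2015) for y in years]
trend, err = trend_per_decade(years, omega)
```

Repeating the fit for each of the twelve months gives one trend and one error bar per month for each cluster.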

Fig. 5: Projected monthly trends in surface carbonate and water mass properties.

Trends are shown for each month for the years 2015 to 2050 and are expressed in units per decade for surface calcium carbonate saturation with respect to aragonite (a, ΩA, mol m−3/mol m−3), surface pH (b), sea surface temperature (c, °C), and sea surface salinity (d, p.s.u.). Panels I and II correspond to GFDL-CM4 and panels III–V correspond to GFDL-ESM4. Trends are for the pan-Arctic average (black), Central Arctic Surface Waters (CASW, blue), more general Arctic Surface Waters (ASW, red), North Atlantic Surface Waters (NASW, purple), and Sub-Arctic Surface Waters (SASW, green) clusters. Black vertical bars denote the standard error of each monthly trend.

The strongest trends in surface ΩA are seen in the CASW cluster for all models and all scenarios. Trends in ΩA between −0.2 and −0.25 decade−1 are found in the CM4 ssp585 scenario. Trends of −0.1 to −0.2 decade−1 occur in each of the other models and scenarios. Monthly trends in ΩA for ASW and CASW are equal to or stronger than trends in the pan-Arctic average, while trends in the NASW and SASW clusters are generally weaker than the pan-Arctic average. The strongest trends in ΩA for the CASW cluster occur in the late summer and early fall, while trends in ΩA for the remaining clusters are strongest in the late spring and early summer. Trends in surface pH (−0.04 to −0.07 decade−1) are consistent with trends in ΩA, with stronger summertime trends present for the CASW, ASW, and pan-Arctic averages. Overall, these trends in ΩA and pH highlight the vulnerability of the CASW cluster to the most rapid transition to corrosive conditions, especially in the late summer months.

The differences in trends in surface ΩA among the clusters exhibit relationships with the monthly trends in the surface water mass properties of temperature and salinity. The strong late summer declines of surface ΩA in the CASW cluster are consistent with strong trends toward surface freshening (−0.3 to −0.4 p.s.u. decade−1) and relatively weak trends in surface warming (<0.2 °C decade−1) during these same months. The SST trends in this cluster are very small compared to trends in salinity and indicate that surface freshening is the dominant driver of both the acidification and the decline in the buffering capacity of these waters. These late summer trends are also consistent with the changes in Arctic sea ice discussed earlier. Since this cluster is projected to retain first-year ice the longest into the 21st century, ice-free conditions in the Arctic in the summer months have the strongest impact on trends in acidification in this cluster compared to other months throughout the year.

The ASW and pan-Arctic averages are projected to experience stronger trends in SST (0.3–0.6 °C decade−1) and weaker changes in SSS (<0.02 p.s.u. decade−1) throughout the annual cycle. This suggests that this region has already experienced changes in surface carbonate and water mass properties in response to ice-free summer conditions during the historical period. Since these regions are now projected to experience their strongest declines in ΩA in the early summer months, the shortening of the wintertime Arctic sea ice season is most responsible for the declines early in the summer.

Discussion

The Arctic Ocean remains especially vulnerable to acidification compared with the rest of the world’s oceans. The application of unsupervised machine learning techniques has the ability to objectively elucidate patterns and relationships among physical ocean variables and components of the carbonate system that are driving acidification. It is remarkable that the geographic boundaries of the clusters are broadly consistent across the two models tested here and across both lower and higher emissions scenarios. CM4 and ESM4 have differences in their formulation (see methods), including nominal horizontal ocean resolution, ocean mixing parameterizations, biogeochemical model formulation and complexity, and their transient climate sensitivity. The surface OA response is primarily driven by changes in physical ocean properties. Overall, our results suggest that within the Arctic basin, changes in the surface freshwater balance and alkalinity are the dominant controls on the regional variability in OA projected among the clusters. Sustained efforts toward reducing model biases in Arctic mixing and stratification, for example, will play an important role in refining future projections of acidification.

Arctic sea ice is a major differentiator between the two central Arctic clusters identified in this study. The broad region in the central Arctic where acidification occurs more rapidly than the pan-Arctic average is defined by the observable present-day sea ice extent. This provides a powerful metric that could be used to constrain models and their future projections of OA. Arctic sea ice, in addition to being critical for other climate processes such as transient climate sensitivity and the sea ice-albedo feedback, plays an important role in determining the trajectory of acidification in the central Arctic. Continued efforts to reduce uncertainty in future projections of sea ice will directly relate to improvements in projections of Arctic surface acidification.

The abrupt nature of the transition from supersaturated waters with respect to aragonite to undersaturated waters that are corrosive poses a risk to some marine organisms where the decadal timescale of this change is short compared to the longer (i.e., evolutionary) timescales required for adaptation. The uncertainty in the timing of the abrupt transition to corrosive surface waters in the central Arctic is on the order of several decades across models and across forcing scenarios, which is comparable to the timescale of the change itself. Thus, narrowing the uncertainty in the timing of this abrupt change is important from a human perspective to properly assess and interpret risks to marine ecosystem environments. This timing difference is potentially related to several factors that warrant further exploration, including model formulation, resolution, sea ice response, and transient climate sensitivity. However, there are clear indications based on the monthly trends in ΩA that regions presently covered by multi-year ice will experience the most rapid change in OA during the first half of the 21st century.

While much of this surface analysis has direct impacts on the epipelagic ocean, there is overlap with deep water formation regions in the North Atlantic. More exploration is needed to connect these surface changes to deep OA and the results of this cluster analysis suggest starting locations for tracing the origins of this signal. Additional work is also needed to effectively scale this specific clustering methodology across the CMIP ensemble of models. Factors such as model resolution, distributions of responses, treatment of outliers, and biases in the mean state (for example, sea ice) may limit the applicability of this specific method to other model outputs. However, these results suggest that with more exploration, robust generalized clustering methods could be developed to facilitate cross-model evaluation. The development of sea-ice-based emergent constraints on future Arctic OA would complement a growing body of existing constraints18,30 and ultimately aid in reducing uncertainty in projections.

These results highlight differences in the responses of just three fields (sea surface temperature, salinity, and surface pH) that can isolate the dominant geographic regimes of OA when analyzed with unsupervised machine learning techniques. Moreover, these regimes have very different timings of response and driving mechanisms that are less apparent when traditional oceanographic boundaries, such as all waters poleward of 65 °N, are used for analysis. Physical climate changes through the melting of Arctic sea ice, high-latitude warming associated with polar amplification, and high-latitude changes in freshening associated with an enhanced hydrologic cycle all contribute to the projected changes in pH across the clusters. More water mass-based analyses are needed to fully understand the implications of OA and quantify uncertainty in projections throughout the world’s oceans, particularly in sub-regions of the Arctic, where such analyses can be guided by the identified clusters. More specifically, a comparison of the different clusters of Arctic OA with long time series measurements31 and large-ensemble studies32 would help further characterize the role of natural variability in the timing of adverse impacts and narrow uncertainty in future projections.

Methods

Results from two coupled numerical climate models developed at the National Oceanic and Atmospheric Administration’s Geophysical Fluid Dynamics Laboratory (NOAA-GFDL) were used in this study. The models differ principally in their ocean model33 resolution, with GFDL-ESM4 using a 0.5° tri-polar horizontal grid with mesoscale eddy parameterization and GFDL-CM4 using a higher-resolution, eddy-permitting 0.25° horizontal grid. Both models contain variants of a 1° atmosphere component34,35 (AM4), with CM4 employing a simplified atmospheric chemistry scheme while ESM4 uses an interactive chemistry scheme. Both models use GFDL’s Sea Ice Simulator version 2 (SIS2)33.

The biogeochemical components used in the two models differ in their complexity. ESM4 uses version 2 of the Carbon, Ocean Biogeochemistry and Lower Trophics (COBALTv2) model28, comprising 33 tracers capable of representing nutrient cycling and food web dynamics. CM4 uses version 2 of the Biogeochemistry with Light, Iron, Nutrients, and Gas (BLINGv2) model36, which has reduced complexity compared to COBALTv2: its six tracers make it more computationally efficient in the higher-resolution ocean model. While BLINGv2 resolves fewer explicit ecosystem interactions than COBALTv2, both models perform well at representing the carbonate cycle and the transport of dissolved inorganic carbon and total alkalinity throughout the ocean36. Both models use the Model of the Ocean Carbonate System (Mocsy) routines for simulating components of the carbonate system37. Details on the initialization and performance of COBALTv2 can be found in Stock et al. 2020 and BLINGv2 in Dunne et al. 2020.

All simulations analyzed in this study38,39,40,41,42 were part of NOAA-GFDL’s contributions to the Sixth Coupled Model Intercomparison Project43 (CMIP6) and are available through the Earth System Grid Federation (ESGF). The carbonate ion concentration in equilibrium with aragonite (“co3satarag”) was not saved directly for CM4 and was calculated offline with the Mocsy routines using monthly mean fields of surface temperature, salinity, dissolved inorganic carbon, total alkalinity, and phosphate. Time-evolving climate forcing from greenhouse gases, atmospheric aerosols and their precursors, volcanic emissions, and solar variations is used in the historical simulations. Future climate projections are based on the higher emissions (ssp585) and lower emissions (ssp245) scenarios25. CM4 uses prescribed concentrations of historical and future atmospheric CO2. ESM4 is run with both prescribed atmospheric concentrations of CO2 and fully interactive CO2 emissions that are consistent with the protocol of the Coupled Climate Carbon Cycle Model Intercomparison Project44 (C4MIP). Uncertainty estimates for the time series analysis were calculated with a Monte Carlo approach in which 100 synthetic time series were generated by randomly selecting values from a normal distribution whose μ and σ were obtained by filtering the detrended time series with a 20-year boxcar window. Analysis code is available upon request.
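
The Monte Carlo uncertainty estimate described above could be sketched as follows. This is a minimal illustration and not the analysis code used in this study: it assumes an annual-mean time series, a linear detrending step, a centered boxcar window that is truncated at the ends of the record, and a 95% envelope as the summary statistic. The function names `boxcar_stats` and `synthetic_ensemble` and the synthetic ΩA series are hypothetical.

```python
import numpy as np


def boxcar_stats(x, window=20):
    """Running mean and standard deviation in a centered boxcar window.

    The window is truncated at both ends of the record (an assumption).
    """
    x = np.asarray(x, dtype=float)
    half = window // 2
    mu = np.empty_like(x)
    sigma = np.empty_like(x)
    for i in range(x.size):
        seg = x[max(0, i - half): min(x.size, i + half)]
        mu[i] = seg.mean()
        sigma[i] = seg.std(ddof=1)
    return mu, sigma


def synthetic_ensemble(series, n_members=100, window=20, seed=0):
    """Generate synthetic series by sampling N(mu, sigma) around the trend."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    # Assumed detrending: remove a least-squares linear fit.
    slope, intercept = np.polyfit(t, series, 1)
    trend = slope * t + intercept
    mu, sigma = boxcar_stats(series - trend, window)
    rng = np.random.default_rng(seed)
    # One normal draw per member and time step, using the local mu and sigma.
    draws = rng.normal(loc=mu, scale=sigma, size=(n_members, series.size))
    return trend + draws


# Hypothetical annual-mean aragonite saturation state, 1850-2100.
years = np.arange(1850, 2101)
noise = np.random.default_rng(1).normal(0.0, 0.03, years.size)
omega = 1.6 - 0.004 * (years - 1850) + noise

ensemble = synthetic_ensemble(omega)
lower, upper = np.percentile(ensemble, [2.5, 97.5], axis=0)
```

The spread between `lower` and `upper` then gives one possible uncertainty envelope for quantities such as the year in which ΩA first falls below 1.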

Anthropogenic carbon inventory (Cant) was computed by integrating from the start of the historical simulation at the year 1850 through the end of the year 2011. An adjustment of 16.6 GtC was applied to both models to account for carbon accumulated between the years 1791 and 1850, following estimates from impulse response functions documented in Bronselaer et al. 2017. Patterns of depth-integrated Cant are compared to the 1° × 1° GLODAPv2 gridded climatology26, which is normalized to the year 2002. This product derives Cant based on the transit time distribution of CFC-12 measurements in the GLODAPv2 database.

Differences in time-average surface temperature, salinity, and pH poleward of 45 °N for the years 2041–2060 compared to 1851–1950 were used as predictors for the unsupervised machine learning clustering routines. Linear trends for SST, SSS, and surface pH were computed at each grid point for the 500-year preindustrial control simulations performed with both models and were used to detrend the historical and future simulations. Native output from both models was regridded to a common 1° × 1° grid and the land-sea mask from World Ocean Atlas was applied prior to analysis. An isolation forest algorithm45 was used to remove both outliers and inliers in order to maximize broad regional signals and reduce the likelihood that the clustering algorithm would become overly focused on noisy regions. The t-distributed Stochastic Neighbor Embedding (t-SNE) dimensionality reduction technique46 was applied to reduce the three parameters to a two-dimensional space and to probabilistically collate data points that are close together in the SST, SSS, and pH space while separating those groupings from each other. This probabilistic projection was achieved by minimizing the Kullback-Leibler divergence. Unlike the original SAGE methodology22 where t-SNE was applied only once to the data per iteration, successive applications of t-SNE (n = 4) were found to successfully simplify the projection of the data (Supplemental Fig. 3). This repeated projection was seen to robustly increase the separation among the clusters of data while producing a reasonable trade-off between cluster separation and keeping the number of clusters manageable for analysis. The repeated application of t-SNE creates a highly complex objective function but is fundamentally the same as using only a single application (pers. comm. van der Maaten). The perplexity parameter for t-SNE was set to 500 as larger values capture more of the large-scale structure of the data compared to smaller values.
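
The screening and repeated projection steps described above could be sketched with scikit-learn as shown below. This is an illustrative sketch rather than the workflow used in this study. Several choices are assumptions: the predictors are standardized before use, "outliers and inliers" is interpreted as trimming both tails of the isolation forest score at hypothetical 5% and 95% quantiles, and the perplexity is lowered from 500 to 30 because the synthetic data set is small. The function names `screen_points` and `repeated_tsne` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def screen_points(X, low_q=0.05, high_q=0.95, seed=0):
    """Boolean mask of points whose isolation forest score lies between
    two quantiles, dropping the most anomalous and the most typical points
    (one possible reading of removing outliers and inliers)."""
    forest = IsolationForest(random_state=seed).fit(X)
    score = forest.score_samples(X)  # lower score = more anomalous
    lo, hi = np.quantile(score, [low_q, high_q])
    return (score >= lo) & (score <= hi)


def repeated_tsne(X, n_passes=4, perplexity=30.0, seed=0):
    """Apply t-SNE successively, feeding each 2-D embedding to the next pass."""
    emb = X
    for i in range(n_passes):
        tsne = TSNE(n_components=2, perplexity=perplexity,
                    init="random", random_state=seed + i)
        emb = tsne.fit_transform(emb)
    return emb


# Synthetic stand-in for the (dSST, dSSS, dpH) predictors at 300 grid points.
rng = np.random.default_rng(0)
centers = np.array([[1.0, -0.5, -0.30],
                    [3.0, -2.0, -0.40],
                    [5.0, 0.2, -0.25]])
X = np.vstack([c + rng.normal(0.0, [0.3, 0.2, 0.02], size=(100, 3))
               for c in centers])

X_std = StandardScaler().fit_transform(X)  # assumed preprocessing step
keep = screen_points(X_std)
embedding = repeated_tsne(X_std[keep])
```

Because t-SNE is stochastic, a different `seed` gives a different embedding, which is why an ensemble of repeated runs is needed before cluster assignments are interpreted.
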
The density-based spatial clustering of applications with noise (DBSCAN) algorithm47 was applied to the output of t-SNE with a minimum points parameter equal to 500. The epsilon parameter was adjusted to achieve a target number of clusters that were assessed visually. Given the stochastic nature of the t-SNE algorithm, the clustering workflow was repeated 40 times for each model. The dominant cluster assignment (i.e., the mode across the ensemble) is shown in Fig. 3 and the hatching denotes where fewer than 80% of the ensemble members assigned the same cluster designation to the grid point. Only clusters that had at least 50% areal overlap between CM4 and ESM4 were analyzed. All the cluster assignments for each model are shown in Supplemental Fig. 1.
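
The ensemble summary described above (the dominant cluster at each grid point and the hatching where agreement is low) could be computed as in the following sketch. It assumes that cluster labels have already been matched across the repeated runs so that a given integer refers to the same cluster in every run; how that matching is done is not specified here. The function name `ensemble_mode` and the toy label array are hypothetical.

```python
import numpy as np


def ensemble_mode(labels, threshold=0.8):
    """Dominant label per grid point and a low-agreement mask.

    labels: integer array of shape (n_runs, n_points), with labels
    assumed to be consistent across runs.
    """
    labels = np.asarray(labels)
    n_runs = labels.shape[0]
    classes = np.unique(labels)
    # counts[k, j] = number of runs that assign class k to point j
    counts = np.stack([(labels == c).sum(axis=0) for c in classes])
    mode = classes[counts.argmax(axis=0)]
    agreement = counts.max(axis=0) / n_runs
    hatch = agreement < threshold  # True where fewer than 80% of runs agree
    return mode, agreement, hatch


# Toy example: 5 runs, 4 grid points.
labels = np.array([[0, 1, 2, 1],
                   [0, 1, 2, 2],
                   [0, 1, 1, 0],
                   [0, 1, 2, 1],
                   [0, 2, 2, 1]])
mode, agreement, hatch = ensemble_mode(labels)
```

In this toy case only the last grid point is hatched, since just three of the five runs agree on its label.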

The distributions of Arctic water masses in temperature, salinity, and pH space are not uniform and a clustering methodology that performs well on data that have complex underlying data density structures is needed. Initial work involving commonly used methods, such as K-means and DBSCAN, yielded results that over-emphasized localized density differences in water mass space. The methodology described here strikes an appropriate balance of performing density-based clustering while still retaining the broader scale water mass relationships in the Arctic.