Introduction

Groundwater plays an important role in the global water and carbon cycles1,2,3. It is estimated that almost 44 million km3 of groundwater could be stored in the upper 10 km of the continental crust4. This quantity exceeds the amount of water stored in ice sheets in Antarctica, Greenland, and glaciers combined (30.16 million km3)5,6,7, and far exceeds the combined fresh water available in lakes, swamps, and rivers (0.19 million km3)8. Substantial groundwater contributions occur in approximately 40% of non-dam stream sites in the United States9. In coastal waters, the flux of groundwater to oceans via subterranean groundwater discharge (SGD) is estimated at 2.2–2.4 × 1012 m3 year−110. This influx of groundwater to surface waters can impact ecosystem processes and biota11, increase bulk radiocarbon (14C) dissolved organic carbon (DOC) age in streams12,13 and lead to high apparent bulk 14CDOC ages of aquatic organisms14. The rate and extent of climate change, which is expected to have a significant impact on ecosystems, water and food availability, and human health, is influenced by the magnitude of greenhouse gas accumulation in the atmosphere15. Recent research suggests that groundwater is a significant source of inorganic carbon to surface environments, with studies showing that bicarbonate fluxes from groundwater are an important yet overlooked source of CO2 to the atmosphere16. Currently, the flux of groundwater DOC to the oceans is unknown17,18. However, it could be significant given that groundwater DOC concentrations typically exceed coastal DOC concentrations19. Furthermore, the potential for groundwater DOC conversion to atmospheric CO2 through degradation processes is poorly understood.
This is partly due to the current lack of studies investigating the molecular transformation of DOM in groundwater systems, and a previous conceptual framework suggesting that highly aged DOM such as that found in deep groundwater should be recalcitrant (stable and unreactive) because more labile DOM is processed first.

DOM consists of tens of thousands of molecules primarily containing carbon (C), hydrogen (H), oxygen (O), nitrogen (N), and sulphur (S). Interactions with the environment such as mineralization by microbes, photodegradation through exposure to sunlight, additional input of local DOM sources, and removal of some formulae result in changes to the number and arrangement of atoms, thereby changing DOM reactivity in the environment. The rate and extent of DOM processing can be indirectly influenced by general water chemistry processes, such as high levels of water-rock interaction (thereby resulting in changes in pH and dissolved minerals20,21), or input of waters affected by agricultural and urban pollution22,23. For example, variations in water chemistry modulate bacterial community compositions, which subsequently affect biodegradation pathways24, the production and photolysis of Fe3+-DOM complexes25, or the oxidation of DOM by hydroxyl radicals which are produced at lower pH26 and in the presence of nitrate and nitrite27. Also relevant to climate change are sea-level rise and over-extraction of groundwater in coastal areas, which can result in seawater intrusion into aquifers and alter groundwater ionic strength28. Changes to ionic strength can affect the ability of DOM to adsorb to mineral surfaces29,30,31. Furthermore, seawater intrusion can influence terminal electron acceptor availability and microbial composition of groundwater32,33, which may result in changes to the metabolic pathways used to mineralise DOM and altered mineralisation rates34.

To date, most of our understanding of DOM cycling is based on non-groundwater oxic aquatic environments (e.g., lakes, rivers, and oceans)35,36 where carboxylic-rich alicyclic molecules (CRAM) containing relatively intermediate H/C and O/C ratios tend to be stable over time37,38,39,40. DOM formulae are classified as CRAM if they contain double bond equivalent (DBE)/C ratios of 0.30–0.68, DBE/H ratios of 0.20–0.95 and DBE/O ratios of 0.77–1.7539. Flerus et al.38 investigated the degradation of DOM in marine environments using Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR MS) and established a marine DOM degradation index (IDEG) using the ratio of the intensity of ten molecular formulae found to be highly significantly correlated with Δ14CDOC (‰) in the ocean. Subsequently, a larger group of CRAM molecules, termed the Island of Stability (IOS), was shown to increase in relative abundance in aging marine DOM37. Increasing relative abundance of molecules lying within the H/C and O/C bounds of the IOS has since been shown to correlate with increasing DOC age in other non-groundwater aquatic environments36,41, thereby raising the question of whether the degradation trajectory of natural DOM towards intermediate H/C and O/C ratios may be consistent, irrespective of environmental conditions. This was further supported by observations of a cascade of degradation processes driving the accumulation of a stable background DOM of similar molecular formulae in marine and lake samples (i.e., molecular-level convergence)35.

Redox conditions play an important role in the degradation of DOM42, which in part is determined by groundwater turnover (i.e., inflow and outflow). In oxic environments, DOM degradation is kinetically controlled due to the high amounts of energy released from oxygen reduction43. Theoretically all DOM molecules are biodegradable in oxic environments44 with preferential biodegradation of high H/C (>1.5) aliphatic, protein-like, lipid-like and heteroatom-containing formulae in surface waters such as rivers, lakes, and oceans45,46,47,48,49. Groundwater is often more reduced than other aquatic environments due to biotic factors such as the progressive microbial consumption of molecular oxygen and other terminal electron acceptors, and abiotic factors limiting oxidation, including the exclusion of sunlight which produces reactive oxygen species that can oxidize DOM50. Research has shown that hydroxyl radicals can also be produced non-photochemically in dark environments51 due to the transfer of electrons from redox-active hydroquinone moieties to O252, or due to Fe-mediated reduction of O253,54. In both cases molecular oxygen is required, thus dark and continuously reduced conditions would prevent these reaction pathways from occurring. Deoxygenation or removal of oxygenated DOM formulae from solution may also result from abiotic processes, such as the preferential adsorption of higher O/C ratio formulae to iron oxyhydroxides55, or the hydrolysis of diesters or beta-ketoesters by aqueous acid and the decarboxylation of the resulting diacid or beta-keto acid with heat56. In anaerobic environments, biotic degradation of DOM is thermodynamically controlled, favoring DOM preservation as a result of the high amount of energy required to consume organic carbon44.
Under these conditions, preserved DOM often consists of formulae with low nominal oxidation state of carbon (NOSC < 0), including some amino acids, sugars and lipids, complex organics and membrane-type compounds44. Importantly, some low NOSC formulae can be aromatic and susceptible to photodegradation57. Hydrogenation of DOM can also occur in the absence of inorganic terminal electron acceptors, where unsaturated compounds may act as organic electron acceptors and serve as a H2 sink58. In geological settings, such as deep groundwaters, H2 may be made available through dissociation of water resulting from the radioactive decay of 238U, 232Th and 40K59. The hydrogenation of organic matter allows for high CO2:CH4 production ratios to persist in anaerobic environments58, thereby likely resulting in increases in H/C and decreases in O/C ratios of DOM. In groundwater, residence times can be millions of years, photo-oxidative processes are absent, and anoxic conditions often prevail. These anoxic conditions result in the slow biodegradation of DOM, which may additionally result in reduced degradation of microbial biomass and hydrogenated DOM formulae including biolabile high H/C aliphatics, peptides and sugars60.
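As an illustration of the NOSC threshold referred to above, the following minimal sketch computes NOSC from elemental stoichiometry. It assumes neutral molecules containing only C, H, N, O and S, and uses the commonly cited expression NOSC = 4 − (4c + h − 3n − 2o − 2s)/c; the function name and the example compounds are illustrative and are not taken from this study.

```python
def nosc(c, h, n=0, o=0, s=0):
    """Nominal oxidation state of carbon for a neutral CcHhNnOoSs formula.

    Assumes no net charge and no phosphorus (illustrative simplification).
    """
    if c <= 0:
        raise ValueError("formula must contain at least one carbon atom")
    return 4.0 - (4 * c + h - 3 * n - 2 * o - 2 * s) / c


# Illustrative compounds (not formulae from the dataset).
examples = {
    "methane CH4": nosc(1, 4),
    "palmitic acid C16H32O2": nosc(16, 32, o=2),
    "glucose C6H12O6": nosc(6, 12, o=6),
    "carbon dioxide CO2": nosc(1, 0, o=2),
}

for name, value in examples.items():
    label = "low NOSC (< 0)" if value < 0 else "NOSC >= 0"
    print(f"{name}: NOSC = {value:+.2f} ({label})")
```

Reduced, hydrogen-rich formulae such as lipids fall below zero, which is the range the text associates with thermodynamic inhibition of degradation under reduced conditions.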

Here, we hypothesize that reduced and dark groundwater conditions allow for the build-up and potentially long-term storage of both photolabile low O/C and biolabile high H/C formulae, in contrast to sunlight-exposed and oxic aquatic environments. We examine stream water, shallow groundwater, and deep groundwater samples, representing a theoretical trajectory from freshly generated surface DOM to highly processed deep aquifer DOM. This allows conclusions to be drawn about DOM transformation as it enters and persists in confined deep aquifers. Radiocarbon (14C) and high-resolution DOM characterisation techniques are used to resolve the groundwater DOM degradation trajectory and reveal, for the first time, the molecular fingerprint of ancient DOM (up to 25,310 ± 600 years before present, BP) from a confined, methanogenic aquifer system.

Results and Discussion

Redox conditions, dissolved organic matter age and compositions

The DOM molecular formulae of streams, shallow groundwater from unconfined aquifers (<41 m below ground surface (m bgs)), and deep groundwater from a confined aquifer (>500 m bgs) were highly divergent, as shown by the van Krevelen Diagram (VKD) in Fig. 1. Stream water samples from this study contained dissolved oxygen (DO) > 7 mg L−1 (Supplementary Table 1), DOC concentrations between 1.34–8.26 mg L−1 (Supplementary Table 2), and modern bulk 14CDOC (i.e., younger than the radiocarbon reference year of 1950 CE). Local and regional shallow groundwater samples (n = 9) either contained detectable levels of dissolved oxygen (DO) and nitrate (NO3) or manganese (Mn2+), ferrous iron (Fe2+) or sulfate (SO42−)61 (Supplementary Table 1). These shallow groundwater samples contained low DOC concentrations (0.48–1.58 mg L−1, Supplementary Table 2), intermediate to recent dissolved inorganic carbon (DIC) ages of 4000 years BP to modern, and DOC ages of 2540 years BP to modern (Supplementary Table  3). The deep confined groundwater samples were low in DO, NO3, Mn2+, SO42− and Fe2+ (Supplementary Table 1), and experienced methanogenic conditions62 with low DOC concentrations (0.75–1.10 mg L−1). These deep groundwater samples had DIC ages of >50,000 years BP and DOC ages of 19,080–25,310 years BP (Supplementary Table 3).

Fig. 1: van Krevelen Diagram (VKD) with marginal density plots of molecular formulae.

VKD shows formulae unique to only deep groundwater (yellow), shallow groundwater (blue) or stream water (green), and molecular formulae common to all samples (black) as presented in Supplementary Table 4. Black solid lines represent the direction of change in H/C and O/C for chemical reactions including hydrogenation, methylation (or alkyl chain elongation), reduction, and condensation112. Marginal density plots are area-normalized and use a Gaussian kernel function for kernel density estimation. Individual VKDs for all samples are shown in Supplementary Figure 1. The approximate region where carboxylic-rich alicyclic molecules (CRAM) lie is represented by the red dashed polygon57. It is noted that CRAM formulae may represent a highly complex mixture of isomers40. As FT-ICR MS does not allow for discrimination between isomers, it is not possible to determine the chemical structure of these individual CRAM formulae.

Dissolved organic matter processing in groundwater

In shallow groundwater, DOM undergoes transformation through the production of low O/C CRAM formulae and removal of low H/C aromatic formulae (Fig. 2, Supplementary Fig. 2) whilst deep methanogenic groundwaters show an increase in aliphatics and aromatic low O/C (<0.5) formulae (Fig. 2). An initial removal of aromatic groups from the stream water samples during processing in shallow groundwater, and the accumulation of these formulae in deep methanogenic groundwater is evidenced through double bond equivalent (DBE)/O slopes of 0.71 (R2 = 0.98, p = 2.7 × 10−14) and 1.07 (R2 = 0.98, p = 2.7 × 10−14) respectively (Fig. 2a, d). As carboxyl groups consist of one DBE and two O atoms, a linear regression slope for DBE/O of approximately 0.5 is indicative of DOM containing carboxyl groups63. Slopes greater than 0.5, as observed particularly in the stream water and deep groundwater (Fig. 2a, d), suggest the presence of more cyclic bonds63. Adsorption to mineral surfaces is a likely removal mechanism for these formulae in shallow groundwater55. A subsequent production or retention of low O/C CRAM formulae in shallow groundwater is revealed by the overall high percent relative abundance (% RA) of CRAM (91% RA, Fig. 2b), DBE/O slope of 0.54 (Fig. 2b), and the increase in CRAM % RA with Δ14CDOC (‰) observed in the shallow groundwater samples (Supplementary Figure 2a). These CRAM formulae contain a lower average weighted O/C ratio (0.47) compared to CRAM that are higher in intensity in stream water samples (0.53).
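The reasoning behind the DBE/O slope can be checked with a small synthetic example. The sketch below is not the regression performed in this study: it builds two hypothetical series of formulae, one in which every pair of oxygen atoms is a carboxyl group (one DBE per two O) and one in which each pair of oxygen atoms is accompanied by an additional ring or double bond, and fits an ordinary least-squares slope of DBE against oxygen number. The helper `ols_slope` and both series are assumptions made for illustration.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


groups = range(1, 8)  # number of added functional groups (hypothetical)

# Series 1: each carboxyl group contributes 2 O atoms and 1 DBE.
oxygen_carboxyl = [2 * k for k in groups]
dbe_carboxyl = [1 * k for k in groups]

# Series 2: each 2 O atoms come with 2 DBE (e.g. a carboxyl plus a ring).
oxygen_cyclic = [2 * k for k in groups]
dbe_cyclic = [2 * k for k in groups]

slope_carboxyl = ols_slope(oxygen_carboxyl, dbe_carboxyl)
slope_cyclic = ols_slope(oxygen_cyclic, dbe_cyclic)

print(f"carboxyl-only series slope: {slope_carboxyl:.2f}")
print(f"carboxyl-plus-ring series slope: {slope_cyclic:.2f}")
```

A slope near 0.5 is what a purely carboxyl-driven increase in oxygen would give, while steeper slopes require unsaturation or rings beyond the carboxyl groups, consistent with the interpretation given for the stream water and deep groundwater slopes.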

Fig. 2: Comparison of dissolved organic matter (DOM) formulae in stream water, shallow groundwater and deep confined groundwater.

van Krevelen Diagrams (VKDs; H/C vs O/C ratios) are shown on the left of each panel, whilst regression lines (solid black lines) show the correlation between intensity weighted average double bond equivalent (DBE) and number of oxygen atoms in each molecule for formulae: a higher in median relative intensity (intensity difference <0) in stream water compared to shallow groundwater, b higher in median relative intensity (intensity difference <0) in shallow groundwater compared to stream water, c higher in median relative intensity (intensity difference <0) in shallow groundwater compared to deep groundwater, and d higher in median relative intensity (intensity difference <0) in deep groundwater compared to shallow groundwater (see also Supplementary Table 5). Green and blue dashed lines represent the DBE/O regression lines for offshore coastal DOM and inland riverine DOM63, respectively. Point colours in the VKDs correspond to CHO (grey), CHON (black) and CHOS (red) formulae. “HUPs” refers to highly unsaturated and phenolic formulae. Note: the median differences in molecule intensities have been used in the VKDs to reduce the influence of outliers with very high or low compound intensity. Percentages of carboxylic-rich alicyclic molecules (CRAM) are weighted based on the total sum intensity of the CRAM molecules divided by the sum intensity of all molecules shown in the VKD. A comparison of intensity differences based on molecular mass is provided in Supplementary Figure 3. Separate VKDs are shown for each compound class (CHO, CHON and CHOS) in Supplementary Figure 4.

Marine and soil bacterial metabolites and by-products display high H/C and low O/C60,64,65, and can result in an input of N and S containing DOM66,67 which are rapidly aerobically biodegraded68. Deep groundwater samples from this study are similarly characterised by aerobically biolabile high H/C (1.5 ≤ H/C ≤ 2.0) aliphatic and peptide-like formulae, low O/C (<0.50) formulae, and heteroatom (N or S) containing DOM. These formulae account for 30.3%, 79.9% and 22.0% of the formulae that are higher in intensity in deep groundwater compared to shallow groundwater, respectively (Fig. 2d). Some of the low O/C formulae in Fig. 2d also contain low H/C and are classified as condensed aromatic or polyphenolics (Supplementary Table 5). Aromatic and polyphenolic formulae are sometimes associated with terrestrial inputs from vascular plants in rivers and streams69,70, sedimentary inputs in the subsurface71 (see Supplementary Note 1), or condensation reactions72. Fluorescence-derived parameters such as Biological Index (BIX), Fluorescence Index (FI) and Peak T, which can be used as indicators of recent microbial activity73,74, suggest a predominantly microbial source of these formulae in the deep groundwater samples. A negative correlation is observed between BIX and Δ14CDOC (‰) (p = 2.22 × 10−2, Supplementary Figure 5), indicating greater bacterial DOM contribution as groundwater DOM ages. This is supported by a high FI and tryptophan-like DOM (Peak T) in deep groundwater samples (see Supplementary Figure 6 for fluorescence excitation emission matrices, and Supplementary Table 2). As previously noted, high H/C aliphatics, peptide-like and heteroatom containing formulae have a high biolability in aerobic environments.
Their high abundance in deep confined groundwaters therefore suggests a slow microbial recycling of DOM into increasingly high H/C and N or S containing formulae rather than the preservation of aliphatic or heteroatom containing molecules from an original terrestrial source. These formulae are preserved in the methanogenic groundwaters along with photodegradable low O/C aromatic formulae57 which are protected from sunlight in aquifers. Furthermore, LaRowe and Van Cappellen44 demonstrated that the degradation of organic molecules with low nominal oxidation state of carbon (NOSC, approximately <0) is thermodynamically inhibited under reduced conditions. Biodegradation of the low O/C formulae in the deep groundwater samples would therefore be limited in anoxic waters due to their low weighted average NOSC (−0.12).

Degradation in groundwater vs. surface aquatic environments

Significantly higher median H/C and lower median O/C ratios (both p = 2.2 × 10−16, Fig. 3a, b) are observed in groundwater compared to a range of non-groundwater aquatic environments including lakes, oceans, rivers, and streams. A progressive shift towards low O/C and higher H/C from stream water to shallow groundwater and deep groundwater is confirmed by significant positive and negative relationships between Δ14CDOC (‰) and weighted average O/C (p = 2.2 × 10−16) and H/C (p = 2.4 × 10−4), respectively (Supplementary Figure 5). This corresponds to a significant decrease in the % RA of IOS formulae in groundwater over time (p = 3.4 × 10−4). These findings contrast with those from marine waters, where decreases in H/C and increases in NOSC have been reported with increasing DOM residence times37,38, and are associated with an overall increase in IOS formulae over time in non-groundwater environments36,37,41 (p = 5.0 × 10−7, Fig. 3c).

Fig. 3: Comparison between groundwater and non-groundwater (river, ocean and lake) dissolved organic matter (DOM) composition.

a, b show histograms of the O/C and H/C of formulae identified in a range of non-groundwater environments (orange) including Swedish105 and German35 lakes, Mediterranean Sea35,74, North Sea35,113, Antarctic bottom water35, North Atlantic deep water35, Suwannee River114, and streams from this study (Macquarie River, Bell River and Elfin Crossing samples). Groundwater DOM formulae identified in shallow and deep groundwater from this study are shown in purple. Comparisons of the numbers of H, C and O in each environment are shown in Supplementary Fig. 7. Comparisons of percent relative abundance (% RA) of Island of Stability (IOS) formulae, % RA highly unsaturated and phenolic compounds (HUPs), biological index (BIX), % RA polyphenolic, % RA aliphatic and % RA condensed aromatics vs Δ14CDOC (‰) in non-groundwater samples from Kellerman, et al.36 are shown in c–h, with Spearman rho and p-values represented by rs and p, respectively. Dashed lines in a, b indicate median values in non-groundwater (orange) and groundwater (purple) environments. An F test to compare variances reveals that the variances are significantly different (p < 2.2 × 10−16), with Wilcoxon tests showing significantly lower median O/C and significantly higher median H/C in groundwater compared to non-groundwater samples (both p = 2.2 × 10−16).

Increases in IOS formulae in non-groundwater aquatic environments likely result from kinetic controls on DOM degradation under oxic conditions. In methanogenic portions of the ocean, such as the oxygen minimum zone or in anoxic marine sediments, there may be a build-up of low O/C and high H/C formulae similar to what is observed in deep groundwaters, due to thermodynamic constraints on DOM degradation. For example, Gan, et al.75 demonstrated an increase in low O/C and high H/C formulae in marine sediments from the Western Mediterranean Sea when exposed to methanogenic conditions for approximately 38 days. Slower respiration rates thus appear to drive the preferential biodegradation of high NOSC formulae and preservation of low NOSC DOM and aerobically biolabile high H/C microbial metabolites and biomass in groundwater76 (Fig. 4). In contrast, DOM in rivers, oceans and lakes typically represents a continual mixing of fresh DOM sources from primary production with pre-existing DOM produced from biodegradation and photodegradation in the primarily oxic conditions. This mixing governs the average DOM composition and results in an overall accumulation of higher NOSC and IOS formulae. Our findings therefore indicate that the stability of the IOS may be context-dependent, relevant in surface environments exposed to photo-irradiation and/or oxic or fluctuating redox conditions, but not relevant to dark and reduced groundwater environments.

Fig. 4: Contrasting dissolved organic matter (DOM) processing pathways in groundwater and non-groundwater (river, lake and ocean) environments.

a DOM in shallow groundwater is removed following exposure of hetero-atom containing DOM and DOM with high nominal oxidation state of carbon (NOSC) to biodegradation in suboxic conditions, and the adsorption of aromatic and high NOSC formulae. Accumulation of low O/C, non-Island of Stability (IOS) carboxylic-rich alicyclic molecules (CRAM) occurs in this region. In deep groundwater where conditions are strongly anoxic, respiration rates are low and aerobically biolabile aliphatic and hetero-atom containing DOM are accumulated from microbial biomass in conjunction with a preferential removal of high O/C DOM. Hetero-atom containing DOM may be further accumulated through the hydrogenation of unsaturated formulae and inorganic sulfurization of lipids115. Aromatic DOM also accumulates from microbial biomass with possible contribution from DOM desorbed under high pH116. These processes result in an overall decrease in O/C and an increase in H/C with DOC age. Dark blue dashed arrows denote flow direction. Groundwater DOM processing contrasts with processing of DOM in rivers, lakes and oceans (b) where primary production and subsequent photodegradation of aromatic formulae results in either low or high O/C formulae, and high H/C formulae such as aliphatics which are then rapidly biodegraded. Primary producers may also release biolabile hetero-atom containing DOM68. Any build up of DOM in an anoxic pocket of seawater which is biolabile to aerobic microbes would be circulated and biodegraded upon return to oxic conditions. Exposure to sunlight and the primarily oxic conditions of marine environments result in an overall decrease in DOM H/C, and increase in O/C and IOS formulae with DOC age. Processing pathways for both environments are shown in van Krevelen Diagrams in upper right corners of each panel.

Implications

Our results show that the current paradigm of aged, highly processed, apparently stable DOM occurring in the centre of H/C versus O/C space may actually be constrained to well-mixed, oxic, open waters, and that the oldest DOM appears instead to occur in dark anoxic aquifers where molecular oxygen and attendant reactive oxygen species are unavailable. In these environments, the most persistent formulae are biolabile with high H/C, or contain low O/C including aromatic formulae. Notably, many low O/C aromatic formulae are readily converted into oxidised or aerobically biolabile aliphatic formulae with exposure to photoirradiation57. Natural processes and anthropogenic activities can transport large quantities of groundwater to surface environments where it is exposed to sunlight and oxygen. For example, subterranean groundwater discharge results in the movement of 2397 km3 year−1 of groundwater into coastal marine environments10. Additionally, global groundwater extraction for industrial, domestic and agricultural usage is estimated at 982 km3 groundwater year−177. A large proportion of this is expected to be ancient DOM with up to 85% of total groundwater in the upper 1 km of continental crust shown to be recharged by precipitation >12,000 years ago78. Using the global median and mean groundwater DOC concentrations of 1.2 mg C L−1 and 3.8 mg C L−179 respectively, these values represent approximately 2.9–9.1 Tg of groundwater DOC exported to oceans from subterranean groundwater discharge annually, and approximately 1.2–3.7 Tg of groundwater DOC removed during groundwater extraction annually. This equates to total DOC fluxes of between 4.1–12.8 Tg year−1 associated with these processes, with the lower estimate approximately equivalent to twice the amount of DOC exported from the Mississippi River each year (2.10 Tg), and the upper estimate approximately equivalent to the amount of DOC exported annually from the Congo River (12.40 Tg)80.
Rates of groundwater extraction are expected to rise throughout the 21st century, with annual global extraction projected at 1621 km3 by 209981, 61% higher than the value used here for estimating DOC from groundwater extraction. If exported from ancient groundwaters, this DOM would be highly biolabile and photolabile upon transport to surface environments. Our results therefore confirm that groundwater may be an important source of labile DOM upon transfer to surface environments, and highlight the importance of its inclusion in global carbon budget estimates.
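The DOC flux estimates above follow from a unit conversion that can be reproduced directly. The sketch below uses only the water volumes and DOC concentrations quoted in the text, together with 1 km3 = 10^12 L and 1 Tg = 10^15 mg; the function and variable names are illustrative.

```python
LITRES_PER_KM3 = 1e12  # 1 km3 = 1e9 m3 = 1e12 L
MG_PER_TG = 1e15       # 1 Tg = 1e12 g = 1e15 mg


def doc_flux_tg(volume_km3_per_year, doc_mg_c_per_litre):
    """Annual DOC flux (Tg C) for a water flux and a DOC concentration."""
    return volume_km3_per_year * LITRES_PER_KM3 * doc_mg_c_per_litre / MG_PER_TG


SGD_KM3 = 2397.0        # groundwater discharge to coastal waters (from the text)
EXTRACTION_KM3 = 982.0  # global groundwater extraction (from the text)
DOC_MEDIAN = 1.2        # mg C per litre, global median (from the text)
DOC_MEAN = 3.8          # mg C per litre, global mean (from the text)

sgd_low = doc_flux_tg(SGD_KM3, DOC_MEDIAN)
sgd_high = doc_flux_tg(SGD_KM3, DOC_MEAN)
ext_low = doc_flux_tg(EXTRACTION_KM3, DOC_MEDIAN)
ext_high = doc_flux_tg(EXTRACTION_KM3, DOC_MEAN)

print(f"Discharge to oceans: {sgd_low:.1f}-{sgd_high:.1f} Tg per year")
print(f"Extraction:          {ext_low:.1f}-{ext_high:.1f} Tg per year")
print(f"Total:               {sgd_low + ext_low:.1f}-{sgd_high + ext_high:.1f} Tg per year")
```

The lower bound pairs both water fluxes with the median concentration and the upper bound pairs both with the mean, which is how the 4.1–12.8 Tg range is obtained.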

Methods

Sampling

Sampling for shallow groundwater was performed at Maules Creek and Wellington, Australia (see Fig. 5 and, for further context, the Supplementary Information in McDonough, et al.82). These shallow groundwater samples were collected from the hyporheic zone or regional unconfined aquifers, and are all known to be recharged by nearby streams during floods or dam releases82,83,84. Stream samples included Bell River and Macquarie River at Wellington, and Elfin Crossing on Maules Creek. Further information regarding the connection between surface and groundwater at Wellington and Maules Creek and the climatic conditions at the time of sampling for Bell River, Macquarie River, Elfin Crossing, WRS03, WRS05, BH12-4, BH18-2, EC31, EC3 and EC6 is also provided in McDonough, et al.82 and the associated Supplementary Information. BH17-2 and BH17-4 were sampled at Middle Creek, a tributary to Maules Creek. Deep groundwater was sampled from Walgett, Pilliga and Burren Junction within a confined thermogenic aquifer which forms part of the Great Artesian Basin, underlying 22% of the continent of Australia (Fig. 5). In total, three stream water samples, nine shallow groundwater samples (<41 m below ground surface (m bgs)) and three deep groundwater samples (>500 m bgs) were collected.

Fig. 5: Sampling sites. Sampling locations are shown as red diamonds, with the Great Artesian Basin boundary shown in yellow.

Major cities are shown as white crosses.

Further information regarding sample locations and depths is provided in Supplementary Table 1. Low DOC concentrations are noted in two of the stream water samples (Bell River (1.87 mg C L−1) and Elfin Crossing (1.34 mg C L−1)) compared to the Macquarie River (8.26 mg C L−1). Elfin Crossing stream water originates from an upstream spring85. This water has therefore been processed in the subsurface for a period of time before briefly re-emerging as surface water. Keshavarzi, et al.84 similarly show that Bell River water has been transported through the karstic limestone of the Wellington region.

All sites were sampled for bulk DOC concentration, liquid chromatography-organic carbon detection (LC-OCD), Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), cations, anions, alkalinity, stable water isotopes, radiocarbon in DOC (14CDOC) and dissolved inorganic carbon (14CDIC), stable carbon isotopes in DOC (13CDOC) and DIC (13CDIC) using the sampling equipment, bottles and methods outlined in McDonough et al.82.

Field parameters including dissolved oxygen (DO), pH, and electrical conductivity (EC) were measured before and during sampling using two HACH HQ40D multimeters attached to an in-line Sheffield flow-cell. Deep groundwater samples at Pilliga, Walgett, and Burren Junction were collected from artesian free-flowing bores after field parameters stabilised. Samples were collected directly from a 0.45 µm in-line fast flow filter (Waterra) connected to tubing which was attached to the bore sample taps. Stream water and shallow groundwater samples (<40 m bgs) were collected using the methods and equipment outlined in McDonough et al.82. Alkalinity was measured through Gran titrations using sulfuric acid (H2SO4) within 12 hours of sample collection. Cation samples were acidified with 65% Suprapur® nitric acid within 12 hours of sample collection.

Pre-treatment, analysis, and post-processing

Pre-treatment methods for 14CDOC and 14CDIC are outlined in McDonough et al.82. CO2 extraction and graphitisation procedures are outlined in Hua et al.86. 14CDOC (pMC) results were converted to Δ14CDOC (‰) using the methods described in International Atomic Energy Agency87. Δ14CDOC (‰) values represent a per mille depletion or enrichment in 14C relative to the standard, normalised for 13C isotopic fractionation. This unit has been used to allow for comparison to other non-groundwater aquatic DOM degradation studies which predominantly report DOC age as Δ14CDOC (‰). Solid phase extraction (SPE) was performed prior to analysis by FT-ICR MS using reversed phase BondElut PPL sorbent (100 mg cartridge, Agilent Technologies) using the methods in Dittmar et al.88. A detailed comparison of potential differences resulting from comparing bulk DOM LC-OCD and bulk DOM 14CDOC results with SPE DOM FT-ICR MS results is provided in the Supplementary Information of McDonough et al.82.
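For readers unfamiliar with the units, the sketch below shows one widely used convention (that of Stuiver and Polach) for converting percent modern carbon to Δ14C and to a conventional radiocarbon age. Whether it matches the cited procedure in every detail is an assumption, as is the measurement year used in the example, which is not reported here. It assumes pMC values already normalised for 13C fractionation.

```python
import math

MEAN_LIFE_TRUE = 8267.0   # years, from the 5730-year half-life (decay correction)
MEAN_LIFE_LIBBY = 8033.0  # years, from the Libby half-life (conventional ages)


def delta14c_permil(pmc, year_measured):
    """Age-corrected Delta14C (per mille) from percent modern carbon.

    year_measured is the calendar year of measurement (hypothetical input).
    """
    fraction_modern = pmc / 100.0
    decay_correction = math.exp((1950.0 - year_measured) / MEAN_LIFE_TRUE)
    return (fraction_modern * decay_correction - 1.0) * 1000.0


def conventional_age_bp(pmc):
    """Conventional radiocarbon age (years BP) from percent modern carbon."""
    return -MEAN_LIFE_LIBBY * math.log(pmc / 100.0)


# Example: the pMC that corresponds to the oldest DOC age quoted in the text.
oldest_age_bp = 25310.0
pmc_oldest = 100.0 * math.exp(-oldest_age_bp / MEAN_LIFE_LIBBY)
assumed_year = 2018  # assumption for illustration only

print(f"pMC for {oldest_age_bp:.0f} years BP: {pmc_oldest:.2f}")
print(f"Delta14C: {delta14c_permil(pmc_oldest, assumed_year):.0f} per mille")
```

Modern samples give values near or above 0‰, and old DOC approaches −1000‰, so more negative Δ14CDOC corresponds to greater DOC age.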

A summary of the analytical instruments and laboratories where analyses were performed for each analysis type are provided in Table 1.

Table 1 Analytical instruments and analysis locations.

For FT-ICR MS analysis, the sample solution was infused via a microelectrospray source89 (50 µm i.d. fused silica emitter) at 500 nL/min by a syringe pump. Typical conditions for negative ion formation were: emitter voltage, −2.4–2.9 kV; S-lens RF level: 45%; and heated metal capillary temperature, 350 °C. DOM extracts were analyzed with a custom-built hybrid linear ion trap FT-ICR mass spectrometer equipped with a 21 T superconducting solenoid magnet90,91. Ions were initially accumulated in an external multipole ion guide (1–5 ms) and released m/z-dependently by decrease of an auxiliary radio frequency potential between the multipole rods and the end-cap electrode92. Ions were excited to m/z-dependent radius to maximize the dynamic range and number of observed mass spectral peaks (32–64%)93, and excitation and detection were performed on the same pair of electrodes94. The dynamically harmonized ICR cell in the 21T FT-ICR is operated with 6 V trapping potential95,96. Time-domain transients of 3.1 s were acquired with the Predator data station that handled excitation and detection only, initiated by a TTL trigger from the commercial Thermo data station, with 100 time-domain acquisitions averaged for all experiments97. Mass spectra were phase-corrected and internally calibrated with 3–5 highly abundant homologous series that span the entire molecular weight distribution based on the “walking” calibration method98. Experimentally measured masses were converted from the International Union of Pure and Applied Chemistry (IUPAC) mass scale to the Kendrick mass scale99 for rapid identification of homologous series for each heteroatom class (i.e., species with the same CcHhNnOoSs content, differing only by degree of alkylation)100. 
For each elemental composition, CcHhNnOoSs, the heteroatom class, type (double bond equivalents, DBE = number of rings plus double bonds to carbon, DBE = c − h/2 + n/2 + 1)101, and carbon number, c, were tabulated for subsequent generation of heteroatom class relative abundance distributions and graphical relative-abundance weighted images and VKDs. Peaks with signal magnitude greater than 6 times the baseline root-mean-square (rms) noise at m/z 500 were exported to peak lists, and molecular formula assignments and data visualization were performed with PetroOrg© software102. Molecular formula assignments with an error >0.5 parts-per-million were discarded.
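The Kendrick conversion, DBE formula, and mass-error filter described above can be illustrated with a minimal Python sketch. It is not the PetroOrg implementation: the function names, the example formulae, and the sign convention for the Kendrick mass defect (nominal Kendrick mass minus Kendrick mass) are assumptions made for illustration, and neutral monoisotopic masses are used rather than the measured negative-ion masses.

```python
# Monoisotopic atomic masses (u) on the IUPAC scale (12C = 12 exactly).
ATOMIC_MASS = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "S": 31.97207100,
}

CH2_IUPAC = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["H"]  # about 14.01565 u
CH2_KENDRICK = 14.0  # CH2 is defined as exactly 14 on the Kendrick scale


def monoisotopic_mass(formula):
    """Neutral monoisotopic mass of a formula given as {element: count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def kendrick_mass(iupac_mass):
    """Rescale an IUPAC mass so that each CH2 unit adds exactly 14."""
    return iupac_mass * CH2_KENDRICK / CH2_IUPAC


def kendrick_mass_defect(iupac_mass):
    """Nominal Kendrick mass minus Kendrick mass (assumed sign convention)."""
    km = kendrick_mass(iupac_mass)
    return round(km) - km


def dbe(c, h, n):
    """Double bond equivalents: DBE = c - h/2 + n/2 + 1."""
    return c - h / 2 + n / 2 + 1


def ppm_error(measured, theoretical):
    """Signed mass error in parts-per-million."""
    return (measured - theoretical) / theoretical * 1e6


def passes_mass_filter(measured, theoretical, max_ppm=0.5):
    """Keep assignments whose absolute error does not exceed max_ppm."""
    return abs(ppm_error(measured, theoretical)) <= max_ppm


# Two hypothetical members of one homologous series (differ by one CH2).
series_a = {"C": 10, "H": 12, "O": 5}
series_b = {"C": 11, "H": 14, "O": 5}

if __name__ == "__main__":
    for f in (series_a, series_b):
        m = monoisotopic_mass(f)
        print(f, round(m, 5), round(kendrick_mass_defect(m), 5))
```

Because members of a homologous series differ only by CH2 units, they share the same Kendrick mass defect, which is what allows them to be grouped quickly.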

Molecules were classified as CRAM where DBE/C was between 0.30 and 0.68, DBE/H was between 0.20 and 0.95, and DBE/O was between 0.77 and 1.75, as per Hertkorn, et al.39. A modified aromaticity index (AImod) was calculated for individual molecular formulae per the methods of Koch and Dittmar103 and Koch, et al.104. Formulae with AImod ≤ 0.5, 0.5 < AImod ≤ 0.66, and AImod > 0.66 are defined as highly unsaturated and phenolics, polyphenolic, and condensed aromatic, respectively. Formulae with 1.5 ≤ H/C ≤ 2.0, O/C ≤ 0.9 and N = 0 are defined as aliphatic. Formulae with O/C > 0.9 are defined as sugar-like, whilst peptide-like formulae105 are defined as those with 1.5 ≤ H/C ≤ 2.0 and N > 0. IOS compounds listed in the Appendix of Lechtenfeld, et al.37 were matched to compounds within the current dataset and weighted by summing the relative intensity of IOS molecules within each sample and dividing by the sum of the relative intensities of all molecules within the sample. Degradation index (IDEG) values were calculated as per Flerus, et al.38. Figure 2 was prepared using ggplot2106 after calculating the difference in the median intensity value of each molecule between stream water (Macquarie River, Bell River, and Elfin Crossing), shallow groundwater (BH17-2, BH17-4, BH12-4, BH18-2, EC31, EC6, EC3, WRS03, and WRS05) and deep confined groundwater (Pilliga, Burren Junction and Walgett). Average weighted DBE vs number of oxygen (nO) atoms was calculated by multiplying the DBE value for each molecular formula by the difference in intensity observed between stream water, shallow groundwater, and deep groundwater for the same molecular formula. The sum of the values within each category of nO (i.e., 2–21) was then divided by the sum of intensity differences within each category of nO to obtain the weighted average DBE for each nO. Spearman correlations presented in Fig. 3 were prepared in R v.1.1.456 using the ggplot2 library.
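The classification rules above can be expressed as a short Python sketch. Several points are assumptions rather than details stated in the text: the AImod expression is the commonly used form attributed to Koch and Dittmar and should be checked against the cited references; the order in which the rules are tested (sugar-like, then peptide-like, then aliphatic, then the AImod classes) is chosen here only to make the classes mutually exclusive; and the function names and example formulae are illustrative.

```python
def dbe(c, h, n):
    """Double bond equivalents: DBE = c - h/2 + n/2 + 1."""
    return c - h / 2 + n / 2 + 1


def ai_mod(c, h, n, o, s, p=0):
    """Modified aromaticity index (assumed form; verify against refs 103, 104)."""
    numerator = 1 + c - 0.5 * o - s - 0.5 * (n + p + h)
    denominator = c - 0.5 * o - s - n - p
    if denominator <= 0 or numerator <= 0:
        return 0.0  # negative or undefined values are set to zero
    return numerator / denominator


def is_cram(c, h, n, o):
    """CRAM criteria: DBE/C 0.30-0.68, DBE/H 0.20-0.95, DBE/O 0.77-1.75."""
    if c <= 0 or h <= 0 or o <= 0:
        return False
    d = dbe(c, h, n)
    return (0.30 <= d / c <= 0.68
            and 0.20 <= d / h <= 0.95
            and 0.77 <= d / o <= 1.75)


def compound_class(c, h, n, o, s=0):
    """Assign one class per formula; rule precedence is an assumption."""
    hc, oc = h / c, o / c
    if oc > 0.9:
        return "sugar-like"
    if 1.5 <= hc <= 2.0 and n > 0:
        return "peptide-like"
    if 1.5 <= hc <= 2.0 and n == 0:  # O/C <= 0.9 already guaranteed here
        return "aliphatic"
    ai = ai_mod(c, h, n, o, s)
    if ai > 0.66:
        return "condensed aromatic"
    if ai > 0.5:
        return "polyphenolic"
    return "highly unsaturated and phenolics"


if __name__ == "__main__":
    # Hypothetical formulae given as (c, h, n, o) tuples.
    for f in [(10, 12, 0, 5), (6, 12, 0, 6), (5, 9, 1, 4), (15, 12, 0, 5)]:
        print(f, compound_class(*f), "CRAM" if is_cram(*f) else "not CRAM")
```

CRAM membership is evaluated separately from the compound class because the two schemes overlap: a formula can be both CRAM and, for example, highly unsaturated.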

Post-processing for LC-OCD involved the use of ChromCALC (DOC-LABOR, Karlsruhe, Germany), which allows for chromatogram visualisation and peak fitting for DOC fractions. Hydrophilic fractions are assigned based on their molecular weights, which are determined by retention times. Hydrophilic DOC is calculated as the concentration of DOC that passes through the column, whilst hydrophobic DOC (HOC) is defined as the portion of DOM that is retained on the column. HOC (% RA) is calculated as the DOC concentration of the sample minus the concentration of hydrophilic DOC. Further details can be found in Huber, et al.107

Fluorescence data were corrected for inner filter effects and 1st and 2nd order Rayleigh scattering, and normalised to Raman Units using the area of the Raman peak (measured each day) in the Aqualog software after sample measurement. Post-processing to calculate peak T, biological index (BIX), and fluorescence index (FI) was performed in R v.1.1.456 using the packages staRdom and eemR. Peak T represents tryptophan-like fluorescence (ex: 275 nm, em: 340 nm)108. BIX provides information regarding recent autochthonous DOM contribution and is calculated as the fluorescence at an excitation of 310 nm and emission at 380 nm, divided by the fluorescence at excitation 310 nm and emission at 430 nm109. FI is calculated as the ratio of fluorescence at emission 450 nm to that at emission 500 nm, obtained at an excitation of 370 nm. Higher FI values (~1.9) indicate microbially derived fulvic acids (FA), whilst lower values (~1.4–1.5) indicate terrestrially derived FA110. Higher BIX values (>1) represent higher amounts of microbially-derived proteins.
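The two index definitions reduce to simple intensity ratios, as the following Python sketch shows. It does not reproduce the staRdom or eemR routines used in the study: the dictionary layout keyed by (excitation, emission) wavelength in nm and the intensity values are hypothetical, and real excitation-emission matrices would usually need interpolation to these exact wavelengths.

```python
def biological_index(eem):
    """BIX: fluorescence at ex 310 / em 380 divided by that at ex 310 / em 430."""
    return eem[(310, 380)] / eem[(310, 430)]


def fluorescence_index(eem):
    """FI: fluorescence at em 450 divided by that at em 500, both at ex 370."""
    return eem[(370, 450)] / eem[(370, 500)]


# Hypothetical corrected intensities in Raman Units, keyed by (ex, em) in nm.
example_eem = {
    (275, 340): 0.42,  # peak T (tryptophan-like)
    (310, 380): 0.66,
    (310, 430): 0.60,
    (370, 450): 0.57,
    (370, 500): 0.30,
}

if __name__ == "__main__":
    print("peak T:", example_eem[(275, 340)])
    print("BIX:", round(biological_index(example_eem), 2))
    print("FI:", round(fluorescence_index(example_eem), 2))
```

With these illustrative values the BIX is above 1 and the FI is close to 1.9, which under the interpretation given above would point to a microbially derived signal.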

DOC release estimations and uncertainties

DOC release estimations from SGD and global groundwater extraction were obtained from Zektser, et al.10 and Margat and van der Gun77, respectively. Zektser, et al.10 use a hydrologic-hydrogeodynamic method to estimate SGD. The calculation uses the average total thickness (200 m) of all aquifer systems from which groundwater is discharged to seas, the corresponding active porosity of rocks (0.1) in this zone, the average hydraulic gradient (0.005), and a 10 m day−1 coefficient of filtration. They estimate a shoreline length of 600,000 km. Four rock types are defined for the purposes of the calculations: sedimentogenic-porous, sedimentogenic-fissured, karstic and magmatogenic-metamorphic. Notably, two different methods were used to compare the values of the total groundwater discharge to oceans: hydrogeodynamic (2200 km3 year−1) and hydrologic-hydrogeological (2400 km3 year−1). Both show similar results; however, hydrogeodynamic methods require initial filtration parameters, which are often not available due to poor knowledge of hydrogeological conditions in many areas. For this reason, the 2400 km3 year−1 value is taken to be more accurate. Full details of assumptions and uncertainties are provided in Zektser, et al.10. Calculations of DOC extracted each year globally from groundwater were based on the sum of the continental groundwater extraction values for irrigation, domestic and industrial uses provided in Margat and van der Gun77. The global annual groundwater extraction value (982 km3 year−1) was summed with the global annual SGD value (2400 km3 year−1) and multiplied by the estimated global mean and median groundwater DOC concentrations presented in McDonough, et al.79 (3.8 mg C L−1 and 1.2 mg C L−1). Global groundwater DOC concentrations are based on 9404 samples from 32 countries in 6 continents.
The data show differences in DOC concentrations with changes in climate, land use, inorganic water chemistry (redox conditions), and groundwater residence times. The character of groundwater DOM is also likely affected by these parameters; thus, groundwater DOM lability will differ depending on factors which include, but are not limited to, terminal electron acceptor concentrations, water residence time (and duration of exposure to anoxic conditions), infiltration rates, water temperature, and land use. McDonough, et al.79 identified significantly lower DOC concentrations in deep aquifers compared to shallow aquifers in a groundwater DOC dataset from the United States, suggesting that the median and mean global groundwater DOC concentrations may not be applicable when assessing the likely DOC flux from deep aquifers. We also note that whilst SGD occurs along the coast, samples in the study are taken from both inland and coastal regions; therefore, the average climate types and land uses inherent in the 9404 groundwater sample locations may not represent the dominant land use and climate types in coastal areas. Finally, not all SGD will be from ancient or anoxic groundwaters, which show the most photodegradable and biolabile character.
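The flux arithmetic described in this section can be checked with a short Python sketch. The volumes and concentrations are those quoted above; the function and variable names are illustrative. The second function is only a rough plausibility check: it assumes that the quoted hydrogeodynamic parameters combine as a simple Darcy-type product (filtration coefficient × gradient × thickness × shoreline length, without the porosity term), which is an assumption made here and not a description of the procedure in Zektser, et al.10.

```python
LITRES_PER_KM3 = 1e12   # 1 km3 = 1e9 m3 = 1e12 L
TG_PER_MG = 1e-15       # 1 Tg = 1e12 g = 1e15 mg

SGD_KM3_PER_YEAR = 2400.0         # hydrologic-hydrogeological SGD estimate
EXTRACTION_KM3_PER_YEAR = 982.0   # global groundwater extraction
DOC_MEAN_MG_PER_L = 3.8           # global mean groundwater DOC
DOC_MEDIAN_MG_PER_L = 1.2         # global median groundwater DOC


def doc_flux_tg_per_year(volume_km3_per_year, doc_mg_per_l):
    """DOC mass flux (Tg C per year) = water volume x DOC concentration."""
    litres = volume_km3_per_year * LITRES_PER_KM3
    return litres * doc_mg_per_l * TG_PER_MG


def darcy_discharge_km3_per_year(k_m_per_day, gradient, thickness_m,
                                 shoreline_km):
    """Assumed Darcy-type product: Q = K * i * thickness * shoreline length."""
    q_m3_per_day = k_m_per_day * gradient * thickness_m * shoreline_km * 1000.0
    return q_m3_per_day * 365.0 / 1e9  # m3 per day -> km3 per year


total_volume = SGD_KM3_PER_YEAR + EXTRACTION_KM3_PER_YEAR  # 3382 km3 per year
flux_mean = doc_flux_tg_per_year(total_volume, DOC_MEAN_MG_PER_L)
flux_median = doc_flux_tg_per_year(total_volume, DOC_MEDIAN_MG_PER_L)

if __name__ == "__main__":
    print("DOC flux using mean concentration (Tg C/yr):", round(flux_mean, 2))
    print("DOC flux using median concentration (Tg C/yr):", round(flux_median, 2))
    print("Darcy-type SGD check (km3/yr):",
          round(darcy_discharge_km3_per_year(10.0, 0.005, 200.0, 600_000.0)))
```

Under these inputs the combined volume of 3382 km3 year−1 corresponds to roughly 12.9 Tg C year−1 with the mean concentration and roughly 4.1 Tg C year−1 with the median, and the Darcy-type product gives about 2190 km3 year−1, close to the quoted hydrogeodynamic value. These figures are upper-level arithmetic only and inherit all of the caveats listed above.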