Quantifying and predicting antimicrobials and antimicrobial resistance genes in waterbodies through a holistic approach: a study in Minnesota, United States

The environment plays a key role in the spread and persistence of antimicrobial resistance (AMR). Antimicrobials and antimicrobial resistance genes (ARG) are released into the environment from sources such as wastewater treatment plants and animal farms. This study describes an approach guided by spatial mapping to quantify and predict antimicrobials and ARG in the water and sediment of Minnesota’s waterbodies at two spatial scales: macro, throughout the state, and micro, in specific waterbodies. At the macroscale, the highest concentrations across all antimicrobial classes were found near populated areas. Kernel interpolation provided an approximation of antimicrobial concentrations and ARG abundance at unsampled locations. However, there was high uncertainty in these predictions, due in part to low study power and large distances between sites. At the microscale, wastewater treatment plants had an effect on ARG abundance (sul1 and sul2 in water; blaSHV, intI1, mexB, and sul2 in sediment), but not on antimicrobial concentrations. Results from sediment reflected a long-term history, while water reflected a more transient record of antimicrobials and ARG. This study highlights the value of using spatial analyses, different spatial scales, and sampling matrices to design an environmental monitoring approach to advance our understanding of AMR persistence and dissemination.


Results
The environmental dimension of antimicrobial residues and ARG can be described at two different spatial scales: a macroscale encompassing input sources of antimicrobials and AMR throughout the state of Minnesota, and a microscale, comprising the distribution of antimicrobials and ARG in specific waterbodies in relation to their proximate input sources.
Macroscale analysis. Potential input sources of antimicrobials and ARG were located heterogeneously throughout the state. Human population density was highest in the eastern-central part of Minnesota, which includes the Twin Cities metropolitan area. Wastewater treatment plants (n = 715) and hospitals (n = 128) were located throughout the state, and ethanol plants (n = 20) were largely located in the southern area. Regional livestock density varied by species. Swine farm density was highest in the southern part of the state, especially the south-western and south-central area. Cattle farms were located throughout the state, but the highest density of farms was found in the south-west (beef cattle) and central and south-east regions (dairy). Most turkey and chicken farms were in the eastern (turkey) and central (chicken) regions of Minnesota (Figs. 1,2).
The Global Moran's I test showed a statistically significant positive spatial autocorrelation for tetracycline concentrations in water (Index = 0.50, z-score = 2.30, p = 0.02), indicating a general tendency for locations near each other (distance between these sites was on average 15 km) to have similar tetracycline concentration values. The concentrations of ciprofloxacin and sulfadimethoxine (measured in water samples), tetracycline and sulfadimethoxine (measured in sediment samples), and ARG abundance measured in both water and sediment did not show significant spatial autocorrelation. For tetracycline in water, the Anselin's Local Moran's I (LISA) test 20 showed one significant high-high cluster and one significant low-low cluster, identifying locations that were closer together in distance and had similar tetracycline concentrations (Fig. 6). Contour maps of predicted antimicrobial concentrations and ARG abundance across Minnesota, generated using the kernel interpolation method, are shown in Figs. 7 and 8 for each antimicrobial and ARG in water and sediment samples. Areas of the state without sampling locations appear empty in the maps, and darker colors are associated with higher predicted antimicrobial concentrations and ARG abundance. Kernel interpolation results provided an approximation of the range of antimicrobial concentrations and ARG abundance likely to be detected at unsampled locations. Interpolation was not limited to waterbody sites but provided predictions for areas surrounding the sampling sites within a specific bandwidth, which included areas of land. For antimicrobial concentrations, the highest predicted values for ciprofloxacin in water were in one area expanding from the south-west towards the south-central area of the state, and in a smaller area towards the north-eastern side. Sulfadimethoxine predicted values were highest in the northern areas and in the south-west in water samples, while in sediment, the highest predicted values were in a small area in the south-east.
The highest predicted values for tetracycline in water samples were in the south-west towards the center and the central area expanding towards the northern and southern areas, while in sediment samples, the highest predicted values were in the south-east/central regions. Antimicrobial concentrations at each sampling site are found in Supplementary Table S11. For ARG abundance, the areas with highest predicted values for blaSHV in water were in the central-western and south-central areas, while in sediment they were in the south-west; for sul1 abundance in water, the highest predicted values were in the north-west and in two areas in the central region, while in sediment they were in the central and south-west areas. Finally, for tet(A), the highest predicted abundance in water and sediment was in the south-east, and in the south-western area as well for sediment samples only (Figs. 7, 8).

Microscale analysis.
Associations between environmental and spatial factors and the antimicrobial concentrations in water samples (ciprofloxacin, tetracycline, and sulfadimethoxine) and sediment samples (tetracycline and sulfadimethoxine) were assessed for 2019 samples (n = 19 water, n = 17 sediment). A moderate positive correlation was identified between water pH and sulfadimethoxine concentrations (r = 0.51, p = 0.04) in 2019 sediment samples (Supplementary Tables S12-S13).
There were moderate negative correlations between water temperature and sul1 (r = −0.43, p = 0.04), sul2 …
There was no statistically significant effect of wastewater treatment discharge on antimicrobial concentrations in water or sediment samples in 2019 after adjusting for waterbody as a random effect in the Linear Mixed Models (LMMs). Antimicrobial concentrations, however, were highest at wastewater discharge sites compared to upstream and downstream sites at the same location. Upstream sites had higher concentrations of tetracycline in both water and sediment, and of sulfadimethoxine in water samples, compared to downstream sites (Supplementary Tables S18-S19). In the case of ARG abundance, sul1 (p = 0.0009) and sul2 (p = 0.007) abundance in water samples, and blaSHV (p = 0.0001), intI1 (p = 0.02), mexB (p = 2.2 × 10^-16), and sul2 (p = 0.03) abundance in sediment samples were significantly higher at the discharge site (WT) compared to upstream (U) and downstream (D) sites. The estimated marginal mean (EMM) plots are shown in Supplementary Figures S5-S10. ARG abundance did not differ significantly between the two sampling events (T1: end of June-early July 2018, and T2: end of July to mid-August 2018) when adjusting for the random effect of waterbody site for water or sediment samples during 2018, except for blaSHV (p = 1.84 × 10^-5) and intI1 (p = 2.48 × 10^-5) in water, where there was a higher abundance at T2. The difference in gene abundance between T1 and T2 was expressed as the difference in EMM, based on the LMMs. The EMM measures represent gene abundance at each site adjusted for waterbody site and they are plotted in Supplementary Figures S11-S12.
Table 1. Summary statistics (arithmetic mean ± standard error, standard deviation (SD), median, and range: minimum (min) and maximum (max)) for antimicrobial concentrations (ng/L) in water samples across sites analyzed in 2018 (n = 17).

Discussion
Environmental monitoring is needed to advance our understanding of AMR persistence and dissemination. In this paper we describe an approach guided by spatial mapping to quantify and predict antimicrobials and ARG at two spatial scales (macro and micro) considering inputs from different sources and including two different sampling matrices (water and sediment) from waterbodies in Minnesota. Two different approaches were used for field sampling: convenience (in 2018) and spatially-guided (in 2019). Convenience sampling involves a subjective selection of field sites based on the "ease of obtaining a sample" 21 . It has the advantages of selecting sites with easier access, more availability, and it is a useful approach when there are limited resources. However, it is more likely to introduce biases into the results, which tend to be mostly descriptive 22 . Spatially-guided sampling involves underlying knowledge about the population of interest. In our study, spatial mapping using Geographic Information Systems (GIS) was used to determine the spatial distribution of human and livestock densities, wastewater treatment plant and other point source locations, as well as to provide information about waterbody locations throughout the state. This diverse information informed the second field sampling season (in 2019), as had been suggested in an exploratory study conducted in Ireland 14 .
Spatial analysis findings were mostly descriptive given the relatively small sample size throughout the entire state, which did not allow for a more sophisticated interpolation method such as kriging. Instead, we used kernel interpolation, which is an indicated method for small datasets 23 . Kernel interpolation results provided an approximation of the range of antimicrobial concentrations and ARG abundance likely to be detected at unsampled sites. Interpolation was not limited to waterbody sites but provided predictions for areas surrounding the sampling sites within a specific bandwidth. Spatial analyses were also used to identify clusters of antimicrobial concentrations (e.g., tetracycline), meaning locations that were closer together in distance and had similar tetracycline concentrations. Further, by mapping the sampling sites, we were able to determine the locations with the highest antimicrobial concentrations and ARG abundance.
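As a rough illustration of how a kernel-based predictor behaves inside and outside its bandwidth, the following Python sketch computes a kernel-weighted average of nearby site values. It is not the ArcGIS kernel interpolation tool used in this study. The quartic kernel, the function names, the coordinates, and the values are all assumptions chosen for illustration.

```python
import math


def quartic_kernel(distance, bandwidth):
    """Quartic (biweight) kernel: positive inside the bandwidth, zero outside."""
    if distance >= bandwidth:
        return 0.0
    u = distance / bandwidth
    return (1.0 - u * u) ** 2


def kernel_predict(x, y, sites, bandwidth):
    """Kernel-weighted average of site values around the point (x, y).

    sites: list of (x, y, value) tuples in projected coordinates (metres).
    Returns None when no site lies within the bandwidth, which mirrors the
    empty map areas far from any sampling location.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for site_x, site_y, value in sites:
        w = quartic_kernel(math.hypot(x - site_x, y - site_y), bandwidth)
        weighted_sum += w * value
        weight_total += w
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical log10 concentrations at two sites 10 km apart (not study data).
sites = [(0.0, 0.0, 1.0), (10_000.0, 0.0, 3.0)]
bandwidth = 20_000.0  # assumed bandwidth in metres

midpoint = kernel_predict(5_000.0, 0.0, sites, bandwidth)
far_away = kernel_predict(100_000.0, 0.0, sites, bandwidth)
print(midpoint, far_away)
```

Because the prediction depends only on distance to the sampled sites, a sketch like this returns values over land as readily as over water, which is the same caveat noted for the interpolation used here.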
The two spatial scales provided different information. The macroscale analysis was useful to identify large areas throughout the state with higher human and/or animal density that could influence antimicrobial concentrations and ARG abundance, but it involved higher uncertainty and less precision. Antimicrobials with highest concentrations included oxytetracycline, norfloxacin, clarithromycin, and erythromycin. The first three antimicrobials were detected in higher concentrations in lakes near human populated areas. Finding higher concentrations of oxytetracycline in these areas could be explained by its use in companion animals (dogs and cats), because this antimicrobial is not used in human medicine [24][25][26] . Norfloxacin and clarithromycin belong to the quinolone and macrolide antibiotic classes, respectively, which are among the most prescribed antimicrobials in Minnesota outpatient settings 27 . Similar findings have been reported elsewhere. Norfloxacin is one of the most frequent quinolones detected in lakes around the world 7 . Clarithromycin has been detected in human waste in a river in South Korea 28 and was widely detected in urban lakes in Vietnam 29 . Most of the research to date on antimicrobial concentrations in surface water has been conducted in Asia 7 . However, at the local level, the Minnesota Pollution Control Agency (MPCA) has conducted state-wide surveys during the past decade on emerging contaminants in the Minnesota environment. None of the three antimicrobials mentioned above (oxytetracycline, norfloxacin, and clarithromycin) were detected in quantifiable concentrations in a study focused on lakes and rivers 30 , and only clarithromycin was detected in a Minnesota site (Cedar River) in a study focused on rivers and streams 31 .
In our sediment samples, erythromycin was the compound with the highest concentration in a lake located in a state park, with minimal direct human or animal influence. This macrolide has been reported in high concentrations in lake water and sediment samples globally 7 . Erythromycin is used in humans and some animal species, and is also naturally occurring 32 , making it challenging to determine its specific origin. It is important to note, though, that differences in laboratory methodology can lead to erroneous comparisons when findings are compared across studies. While the highest antimicrobial concentrations were found near human populated areas, sites with the highest ARG abundance were more widespread across waterbodies in the state, found both in areas near higher livestock density and in areas without much human or animal influence. Even though the highest antimicrobial concentrations were found near human populated areas, there was no association between human density and antimicrobial concentrations. There was also no association with any of the environmental parameters (water temperature, water pH, conductivity) in the 2019 water samples. This could be a result of the low power of our study, the variable nature of water samples, or that antimicrobial concentrations might not be influenced by these parameters in the locations sampled. However, among sediment samples, there was a moderate correlation between water pH and sulfadimethoxine concentration. Sulfonamide persistence and transport (including sulfadimethoxine) are reported to be influenced by pH in soil, with decreased pH values leading to increases in sorption potential of sulfonamides in soil 33 .
Table 3. Summary statistics (arithmetic mean ± standard error, standard deviation (SD), median, and range: minimum (min) and maximum (max)) for antimicrobial concentrations analyzed in sediment samples (ng/g) across sites in 2019 (n = 17).
The microscale analysis was useful to assess the influence of a specific point source on the measured outcomes (antimicrobial concentration and ARG abundance), but the information gained from this analysis was limited to the specific sites where samples were collected. There was no significant effect of wastewater on antimicrobial concentrations, but the concentrations were highest at the wastewater discharge site compared to upstream and downstream sites at the same location. This could indicate a relatively short spatial distance affected by the discharge. In the case of ARG, the abundance of several genes was significantly higher at the wastewater discharge point compared to upstream and downstream sites. Genes encoding resistance to sulfonamides have been reported to be more abundant than other ARG in studies of different types of wastewater treatment plants 34 and to be common in aquatic systems 11 . These genes (sul1 and sul2) are very mobile, and along with tetracycline resistance genes are the most studied ARG in lakes and rivers 7 . In our study, sul1 and blaSHV were the most abundant ARG across the waterbodies investigated throughout Minnesota. The gene blaSHV, which confers resistance to beta-lactam antimicrobials and is thus of public health concern, has been increasingly reported in environmental studies around the world, mostly near or at wastewater treatment plants 18,35 . Other commonly detected genes in our study included tet(A), which is frequently found in wastewater environments (including in previous studies within Minnesota waterbodies 36 ), fish farm ponds, and swine lagoons 11 . Each of the spatial scales (macro and micro) underscores different aspects of environmental AMR, thus informing different opportunities for management and mitigation.
Spatial scale has been previously discussed in the context of AMR in the environment, highlighting how causal factors important in the emergence, dissemination, and persistence of AMR can have spatial relevance at different scales 4,37,38 . Both sediment and water samples were collected because they provide different information about antimicrobial and ARG presence in waterbodies. Because sediment can retain antimicrobial compounds and ARG longer than water, sediment samples reflect the long-term history of antimicrobials and ARG in that location. In fact, a higher concentration of antimicrobials in sediment can indicate accumulation with the potential for subsequent release into the water matrix 39,40 . Water samples reflect a more transient record of antimicrobials and ARG 32,39,40 . Laboratory quantification of antimicrobials in sediment proved more challenging than in water samples due to the more complicated matrix, leading to difficulties with chromatography and higher background noise in the detector. Even though we did not evaluate the relationship between water and sediment for the same sites in this study, past studies have shown that pharmaceutical and pesticide concentrations in water and sediment are decoupled due to the differences in water and sediment residence time and transport 41 . For ARG, sul1, blaSHV, intI1, and mexB had the highest abundance in both water and sediment samples, potentially reflecting the ubiquity of some ARG across environmental matrices. This could also reflect the ability of these genes to be transferred within aquatic environments 42 . There were specific genes, however, detected only in one of the two sampling matrices. Specifically, blaOXA, impl3, intI3, and qacF were only detected in water, while ermB, mefE and tet(W) were only detected in sediment. This type of finding has been previously attributed to the unique environmental conditions in each environmental matrix and to the existence of specific microorganisms in each of the matrices 43 . Even though blaOXA was only found in water samples in 2018 at a low frequency across samples (21%), this gene is of importance for public health because it encodes resistance to carbapenems, antimicrobials of last resort 44 .
One limitation to the interpretation of data collected in this study is that, in addition to lakes, samples were collected from rivers and creeks, because in some cases, these were the compartments up or downstream of a point source. Hydrologically, these waterbody types differ, including water flow dynamics, which may affect the retention times for antimicrobials and ARG. We were not able to compare the effect of waterbody type given the sample size limitations, but it is important to take this factor into account when assessing the results and planning future field studies. In addition, the spatial approach we used to make predictions at the macroscale was not as powerful as other statistical approaches such as kriging, which could have been conducted with a higher number of sampling sites. Further, the interpolation method we used did not discriminate between waterbodies and areas of land, which needs to be considered when interpreting the predictions.
Work is ongoing to improve these models through incorporation of additional sampling locations, and with an estimation of the spatial distribution of antimicrobial use for both animals and humans. Application of known prescribing and/or administration rates by drug class to mapped human and animal population densities will allow for inclusion of proportional antimicrobial inputs into the model. This approach will support spatial predictions of antimicrobial and ARG loads in the environment. In this study, we only used density of animals and humans as a proxy predictor of overall antimicrobial release, but this simplistic approach can be improved with antimicrobial-use data. Targeted field sampling and analysis efforts will also be essential to improve the resolution at the microscale. This detail is necessary to understand the influence of specific point sources of antimicrobials and ARG and to model potential mitigation strategies. Increased sampling on a smaller geographic scale will also allow for incorporation of time as a measure of impact. Assessment of temporal factors is critical to correlate environmental findings with fluctuations in antimicrobial prescribing and the natural environment throughout the year.
In conclusion, this study highlights the value of using spatial analyses, different spatial scales, and sampling matrices to design an environmental monitoring approach to advance our understanding of AMR persistence and dissemination. It also outlines the multiple professional disciplines and technical approaches needed to develop a comprehensive monitoring program for AMR and antimicrobials in the natural environment. A multidisciplinary approach to AMR ensures that all sectors (e.g., human health, clinic-based veterinary medicine, animal agriculture, crop management) that use antimicrobials are included in both the delineation of the problem and the identification of feasible solutions to mitigate release of antimicrobials and ARG into the natural environment. A collaborative, multisectoral approach at all levels (science, policy, and public engagement) is critical to protect human, animal, and ecosystem health.

Methods
Study area, study design, and sample collection. Water and sediment samples were collected from waterbodies (lakes, rivers, streams, creeks) throughout the state of Minnesota in the summers (June through August) of 2018 and 2019 to quantify antimicrobials and ARG. Summer was selected for sample collection as it is the only time of the year when this is feasible in Minnesota. Sample sites were selected by convenience in 2018. In 2019, spatial data for point sources such as wastewater treatment plants and animal agriculture locations, as well as other landscape features, were used to inform the sampling design. Sites in 2019 were selected at both a macro and micro scale. For the macroscale, study areas were identified throughout the state based on human and/or livestock densities. Within each of these study areas, points were identified where there was direct discharge of a wastewater treatment plant into a waterbody. The microscale sampling locations were then selected upstream, at the wastewater discharge, and downstream from the effluent with a maximum distance of 2 km spanning these microscale sites (Fig. 4). Additional details regarding sample collection procedures and locations are found in the Supplementary Information (Supplementary Methods and Supplementary Table S20).
Laboratory work. Antimicrobials. Chemical sources and purities relevant to antimicrobial extraction and measurement are reported in the Supplementary Information. Water samples were concentrated by solid phase extraction (SPE) with a method adapted from Kerrigan et al. (2018) and Meyer et al. (2007) 32,45 . Freeze-dried sediment samples were solvent extracted with the assistance of ultrasound and then cleaned up using SPE. Both water and sediment extracts were analyzed using liquid chromatography tandem mass spectrometry. De- …
Antimicrobial resistance genes. Filters obtained with the REXEED 25S ultrafiltration system in the field were back-flushed and concentrated as previously described 46 . The final concentrated solution (average volume = 9.6 ± 5.3 mL) was stored at −20 °C until DNA extraction. To extract DNA, a small aliquot of the concentrated solution (0.3 mL) or sediment (average sample mass = 504 ± 29.7 mg) was mixed with 0.7 mL of lysis buffer (5% sodium dodecyl sulfate, 120 mM sodium phosphate buffer, pH 8). Mixed samples then underwent three freeze/thaw cycles followed by 90 min of incubation at 70 °C to lyse cells and release metagenomic DNA. DNA was then purified using the FastDNA™ kit (MP Biomedicals, Santa Ana, California, USA) as per manufacturer's instructions. A total of 20 ARG representing different molecular mechanisms of resistance and different antimicrobial classes were selected for this study. In addition, the 16S ribosomal RNA (rRNA) gene and integrase genes (intI1, intI2, and intI3) were included. Microfluidic qPCR (MF-qPCR) was performed as described previously 47 . Prior to MF-qPCR, every target gene except for 16S rRNA was amplified using the specific target amplification (STA) 47 . Primers and standard gBlocks® sequences used for this study are presented in Supplementary Table S24.
The limit of quantification (LOQ) for the assay was 1000 copies/µL of DNA for the target gene mexB, and 10 copies/µL of DNA for the rest of the target genes. Non-detects, defined as reactions that failed to produce a minimum amount of signal 48 , were replaced with ½ LOQ for those ARG that had non-detects in < 80% of the samples 18,19 . Those target genes that had ≥ 80% of non-detects across samples were removed from further analyses.
Data analyses. Data sources. Livestock density at the state level was obtained from the Minnesota Pollution Control Agency (MPCA) 49 . These datasets included the geocoded location of the farms as well as the animal species and permitted number of total animal units on each farm 50 . Human population density at the census tract level for Minnesota was obtained from the National Historical Geographic Information System (NHGIS) 51 , and wastewater treatment facility locations were obtained from the MPCA 52 . Human density, livestock density (broken out into swine, beef cattle, dairy, turkey, and chicken farms), wastewater treatment plant, ethanol plant, and hospital locations were mapped using ArcGIS Pro version 2.6.0 (ESRI®). Livestock density and human density data were extracted from a 5 km buffer around each sampling point. A small buffer size such as this was chosen to avoid the influence of other potential factors and sources on the outcome (antimicrobial concentrations and ARG abundance).
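The 5 km buffer extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ArcGIS workflow used in this study: it uses great-circle (haversine) distances on latitude/longitude rather than a projected buffer, and the function names, record fields, coordinates, and animal-unit values are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def animal_units_in_buffer(site, farms, radius_m=5_000.0):
    """Sum permitted animal units per species within radius_m of a sampling site.

    site: (lat, lon) of the sampling point.
    farms: list of dicts with hypothetical keys lat, lon, species, animal_units.
    """
    totals = {}
    for farm in farms:
        distance = haversine_m(site[0], site[1], farm["lat"], farm["lon"])
        if distance <= radius_m:
            species = farm["species"]
            totals[species] = totals.get(species, 0.0) + farm["animal_units"]
    return totals


# Hypothetical sampling site and farms (not data from this study).
site = (45.00, -93.00)
farms = [
    {"lat": 45.01, "lon": -93.00, "species": "swine", "animal_units": 300},
    {"lat": 45.02, "lon": -93.00, "species": "swine", "animal_units": 200},
    {"lat": 45.10, "lon": -93.00, "species": "dairy", "animal_units": 500},
]
print(animal_units_in_buffer(site, farms))
```

In this made-up example the two swine farms lie roughly 1.1 km and 2.2 km from the site and are counted, while the dairy farm about 11 km away falls outside the buffer.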
Descriptive statistics were calculated for both antimicrobials and ARG for each sample type (water and sediment) and year (2018 and 2019) separately. Antimicrobial concentrations (log10-transformed prior to analyses to meet normality assumptions) and ARG abundance (expressed as absolute abundance as log10 gene copies/L water or as log10 gene copies/g sediment) from the 2019 data were evaluated individually for their association with environmental parameters (water temperature, pH, and conductivity) and with spatial data (human density, livestock density, and wastewater discharge) using Pearson correlation with the cor.test function from the stats package in R 53 . Alpha was set at 5% for statistical significance. From all antimicrobials analyzed in the laboratory, ciprofloxacin, tetracycline, and sulfadimethoxine were evaluated for water sample data analyses, while tetracycline and sulfadimethoxine were evaluated for sediment sample analyses. These antimicrobials were chosen as they represent different antimicrobial classes, have diverse chemistry, and have different use patterns (the fluoroquinolone class is often used in humans, and, at times, in companion animals and some livestock; tetracycline is broadly used across all animal species, including livestock; and sulfadimethoxine is used in veterinary medicine) 24,54-58 . Missing environmental parameters at sampling sites (water temperature, pH, conductivity) were imputed using the mean value of each parameter 59 . ARG abundance for 2019 was further evaluated for potential correlations with antimicrobial concentrations for each sample type. For these analyses, the function rcorr from the Hmisc package 60 and the package corrplot 61 were used, and alpha was set at 5% for statistical significance.
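The correlation workflow described above (log10 transformation, mean imputation of missing environmental parameters, Pearson correlation) can be illustrated with a minimal Python sketch. The analyses in this study were run in R with cor.test and rcorr; the code below is only an assumed, simplified equivalent with hypothetical function names and made-up values. It returns the correlation coefficient and its t statistic but not a p value, which would additionally require the t distribution.

```python
import math


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical values for four sites (not data from this study).
concentration_ng_per_l = [10.0, 100.0, 1000.0, 10000.0]
water_ph = [7.0, None, 8.0, 8.5]  # one missing pH reading

log_conc = [math.log10(c) for c in concentration_ng_per_l]
ph = impute_mean(water_ph)
r = pearson_r(ph, log_conc)
print(round(r, 3), round(t_statistic(r, len(ph)), 2))
```

Mean imputation keeps every site in the analysis but shrinks the variance of the imputed parameter, which is worth keeping in mind when sample sizes are as small as those reported here.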
For those waterbodies in 2019 with direct wastewater effluent discharge, the effect of the effluent on antimicrobial concentrations as well as on ARG abundance in water and sediment samples was evaluated by fitting linear mixed regression models (LMMs) to the log10-transformed data using the lme4 62 and lmerTest packages 63 in R with the function lmer. In these models, waterbody was included as a random effect, given the repeated sampling from the same locations in both 2018 and 2019, and site (upstream or U, downstream or D, wastewater discharge or WT) as a fixed effect, with upstream (U) as the reference level. Statistical significance was defined with an alpha level of 5%, and Satterthwaite's approximation was used to obtain the p values for the F test for each model 64 . Model assumptions were checked through the inspection of residual plots 65 . ARG abundance was also compared between the two sampling time points for 2018 by fitting LMMs as described previously. In these models, waterbody was included as a random effect, and time (T1: end of June-early July 2018, T2: end of July to mid-August 2018) as a fixed effect, with T1 as the reference level.
To assess spatial dependency, the Global Moran's I clustering test was performed for each of the three antimicrobials stated above (ciprofloxacin, tetracycline, and sulfadimethoxine) and for the three ARG with the highest abundance. For the Moran's I global spatial measure, the null hypothesis tested was that there was no spatial autocorrelation across the study area 66 . An average fixed distance band of 30,765 m for water sample sites, and of 35,081 m for sediment sample sites, and Euclidean distance were used for Moran's I. The distance band was estimated using the tool 'Calculate Distance Band from Neighbor Count' from the Spatial Statistics toolbox of ArcGIS Pro (ESRI®). This tool determines the average distance between sampling locations (each location needs to have a minimum of one neighbor for Moran's I). If there was global clustering, the Anselin's Local Moran's I (LISA) test was used to indicate the physical location of the clustering, given that Global Moran's I does not provide that level of detail 20 . For LISA, the same fixed distance band was used. The kernel interpolation (a.k.a. kernel smoothing) tool was used to visualize the antimicrobial concentration outcomes and ARG abundance from 2019 and to predict the antimicrobial concentrations and ARG abundance respectively beyond the sampling sites. Kernel interpolation was conducted using default parameters which can be found in Supplementary Tables S25-S28. All geospatial analyses and mapping were conducted using ArcGIS Pro version 2.6.0 (ESRI®) and statistical analyses were conducted using R version 3.6.0 53 .
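For readers unfamiliar with the statistic, the following Python sketch shows how a Global Moran's I index can be computed with a fixed distance band and Euclidean distance. It is an illustration under assumptions, not the ArcGIS implementation used in this study: binary weights (1 for sites within the band, 0 otherwise) are assumed, no z-score or p value is derived, and the function name, coordinates, and values are hypothetical.

```python
import math


def morans_i(points, values, band):
    """Global Moran's I with binary fixed-distance-band weights.

    points: list of (x, y) projected coordinates (e.g., metres).
    values: one observation per point (e.g., log10 concentration).
    band: fixed distance band; pairs of distinct sites no farther apart
          than this are treated as neighbours with weight 1.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross_products = 0.0
    weight_total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= band:
                cross_products += dev[i] * dev[j]
                weight_total += 1.0
    if weight_total == 0.0:
        raise ValueError("no neighbours within the distance band")
    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (cross_products / variance_sum)


# Hypothetical sites along a line: two close pairs separated by a large gap.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]

clustered = morans_i(points, [1.0, 1.0, 5.0, 5.0], band=2.0)
dispersed = morans_i(points, [1.0, 5.0, 1.0, 5.0], band=2.0)
print(clustered)  # 1.0: neighbours share similar values
print(dispersed)  # -1.0: neighbours have dissimilar values
```

Under the null hypothesis of no spatial autocorrelation the expected index is −1/(n − 1), so values well above that indicate that neighbouring sites tend to have similar values.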

Data availability
All measurements of antimicrobial concentrations in water and sediment are available in the Data Repository for the University of Minnesota at https://doi.org/10.13020/xcbx-t731. The other data used to support the findings of this study are available from the corresponding author upon request.