Boundary condition and oceanic impacts on the atmospheric water balance in limited area climate model ensembles

Regional climate models (RCMs) are indispensable in climate research, albeit often characterized by biased terrestrial precipitation and water budgets. This study identifies excess oceanic evaporation, in conjunction with the RCMs’ boundary conditions, as drivers contributing to these biases in RCMs with forced sea surface temperatures in a CORDEX RCM ensemble over Europe. The RCMs are relaxed to the prescribed lateral boundary conditions originating from a global model, effectively matching the driving model's overall atmospheric moisture flux divergence. As a consequence, excess oceanic evaporation results in positive precipitation biases over land due to forced internal recycling of moisture to maintain the overall flux divergence prescribed by the boundary conditions. This systematic behaviour is shown through an analysis of long-term atmospheric water budgets and atmospheric moisture exchange between oceanic and continental areas in a multi-model ensemble.

the atmospheric water budgets and indirectly precipitation through land-atmosphere feedback processes 40,41 , precipitation recycling 42 , and moisture transport 43 . Furthermore, Refs. [44][45][46] analysed the influence of sea surface temperatures (SSTs) on precipitation also over land areas. In Ref. 46 , corrected SST biases from the driving GCM led to circulation and atmospheric moisture transport changes in the RCM's water cycle and eventually to a reduced wet bias over continental areas of southern Africa. In Ref. 45 , increased evaporation from the Mediterranean Sea surface due to increased SSTs can lead to more extreme precipitation events during summer in Central Europe in association with cyclonic activity; a strong link also exists between Mediterranean SST and precipitation over the Anatolian Peninsula 47 . Reference 44 shows, for some RCMs in a climate change ensemble over the Baltic, a positive correlation between SST and widespread precipitation.
In this study, we re-assess the atmospheric compartment of the hydrological cycle of RCMs [48][49][50] , as a driver of the terrestrial water budget. We follow theoretical water budget considerations applied previously 31,43,51,52 , where the simplified time-averaged general balance equation for total water in an atmospheric column is defined as 53

$$\frac{\partial W}{\partial t} + \mathrm{div}\,Q = E - P \qquad (1)$$

where W is the total column water content (in its liquid phase as precipitable water, as water vapor, and as ice), P is total precipitation (including snow fall), and E is evaporation over the oceans and evapotranspiration over land. The atmospheric moisture divergence div Q is the divergence of the vertically integrated horizontal moisture transport, defined as

$$\mathrm{div}\,Q = -\nabla \cdot \frac{1}{g} \int_{p_s}^{0} q\,\mathbf{V}_h\,dp \qquad (2)$$

with the gravitational acceleration g, the specific humidity q, the horizontal wind vector V h , pressure p and surface pressure p s . Figure 2 gives a schematic overview of the water cycle components and fluxes, from the groundwater to the atmosphere. Because the atmospheric water storage change, i.e., the tendency term in Eq. (1), is negligible over long time scales (here: dt is 1 year), the primary balance is between the atmospheric moisture divergence div Q and E and P. Thus, the atmospheric moisture divergence div Q is driven by E as the source and P as the sink. Hence a simplified atmospheric water budget or balance for a control volume of the RCM model domain, i.e., all land-ocean grid points, the land, and the ocean areas can be expressed as div Q = E − P. A net export of atmospheric water is expressed by a positive divergence, and vice versa. Because an RCM as a limited area model (LAM) numerically constitutes a boundary value problem, the atmospheric moisture divergence div Q of the complete model domain is constrained by lateral advection of the boundary forcing fields q in and q out from the driving model, such as a GCM.
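As a rough illustration of Eqs. (1) and (2), the following Python sketch integrates the moisture flux q V h over pressure for a single atmospheric column and evaluates the long-term balance div Q = E − P. The function names, the trapezoidal discretisation, and all numerical values are illustrative assumptions and are not taken from the study.

```python
import numpy as np

G = 9.81  # gravitational acceleration [m s-2]


def column_moisture_flux(q, u, v, p):
    """Vertically integrated horizontal moisture flux, (1/g) * integral of q*V_h dp.

    q: specific humidity [kg kg-1]; u, v: horizontal wind components [m s-1];
    p: pressure levels [Pa], ordered from the model top to the surface.
    Returns the zonal and meridional components in kg m-1 s-1.
    """
    q, u, v, p = (np.asarray(a, dtype=float) for a in (q, u, v, p))
    dp = np.diff(p)
    qu, qv = q * u, q * v
    # Trapezoidal rule over the pressure levels (assumed discretisation).
    flux_x = np.sum(0.5 * (qu[1:] + qu[:-1]) * dp) / G
    flux_y = np.sum(0.5 * (qv[1:] + qv[:-1]) * dp) / G
    return flux_x, flux_y


def long_term_divergence(evaporation, precipitation):
    """Long-term balance div Q = E - P, with the storage tendency neglected."""
    return evaporation - precipitation


# Hypothetical column: constant humidity and a purely zonal wind.
levels = np.linspace(0.0, 100000.0, 11)
qx, qy = column_moisture_flux(np.full(11, 0.01), np.full(11, 10.0),
                              np.zeros(11), levels)

# Hypothetical annual sums in mm per year: P exceeds E, so the area is a net sink.
div_q = long_term_divergence(700.0, 736.0)
```

A negative result of `long_term_divergence` corresponds to a net import of atmospheric water into the control volume, and a positive result to a net export.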
After initialisation, the RCM is driven by regularly updated, temporally and spatially interpolated meteorological boundary conditions for most of the prognostic variables. This is the fundamental property of a LAM. The procedure to ingest the forcing data into the RCM is still an issue that requires critical attention 4,5 . Based commonly on the theory described in Ref. 54 , a relaxation term is added within the lateral boundary relaxation zone to the prognostic equations, constraining the model to the boundary conditions 55 . Alternatively, less constrained boundary condition schemes than the Davies relaxation also exist, such as the Mesinger scheme 56 . In the reanalysis-driven coordinated RCM ensemble used in this study, all RCMs apply the same lateral boundary conditions based however on varying implementations of the relaxation in the different models of the ensemble. Through P and E, the atmospheric water budget is directly coupled with the terrestrial water budget, including groundwater where S is the total terrestrial water storage, R is runoff, and div Q g is groundwater divergence 52 .
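The idea of a lateral boundary relaxation can be sketched in a few lines of Python. This is a minimal one-dimensional illustration only: the linear weight profile, the zone width, and the function names are assumptions made for the example, and the RCMs in the ensemble use their own, differing implementations.

```python
import numpy as np


def relaxation_weights(n_points, zone_width):
    """Weights along one grid row: 1 at both lateral boundaries, 0 in the interior.

    A linear ramp is assumed here purely for illustration; operational models
    typically use other profiles within the relaxation zone.
    """
    idx = np.arange(n_points)
    distance = np.minimum(idx, n_points - 1 - idx)  # grid points to nearest boundary
    return np.clip(1.0 - distance / zone_width, 0.0, 1.0)


def relax(model_field, driving_field, weights):
    """Nudge the model solution towards the driving field inside the zone."""
    model_field = np.asarray(model_field, dtype=float)
    driving_field = np.asarray(driving_field, dtype=float)
    return (1.0 - weights) * model_field + weights * driving_field


# Hypothetical row of 10 grid points with a 4-point relaxation zone per side.
w = relaxation_weights(10, 4)
blended = relax(np.zeros(10), np.ones(10), w)
```

At the outermost grid points the blended field equals the driving field, while in the interior the model solution is left unchanged.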
The evapotranspiration over land, E L , and evaporation over the ocean, E O , are calculated based on bulk flux algorithms, using implementations of the Monin-Obukhov similarity theory framework 57 . Over water surfaces the turbulent vertical fluxes are calculated by the model's surface layer scheme, primarily controlled by the SSTs, friction velocities, and moisture, heat and momentum transfer coefficients. Over land, the friction velocities and transfer coefficients may be passed from the surface layer scheme to the land surface model to calculate the vertical fluxes. The surface scheme parameterizations differ between models and are very sensitive to the transfer coefficients, which in turn depend on the Monin-Obukhov universal functions and flux-profile relationships. It has been shown in model comparisons with forced SSTs that there are differences in E O 58 and that RCMs 58,59 tend to overestimate the latent heat flux over the oceans for various reasons 59,60 .
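To make the sensitivity to the transfer coefficient concrete, the sketch below evaluates a generic bulk formula for oceanic evaporation, E = ρ C E U (q s (SST) − q a ). It is a simplified illustration and not the scheme of any RCM in the ensemble: the transfer coefficient is held constant instead of being derived from Monin-Obukhov stability functions, the saturation vapour pressure uses the Bolton (1980) approximation, and all default values are assumed.

```python
import math


def saturation_vapour_pressure(t_celsius):
    """Saturation vapour pressure over water [Pa] (Bolton 1980 approximation)."""
    return 611.2 * math.exp(17.67 * t_celsius / (t_celsius + 243.5))


def saturation_specific_humidity(t_celsius, pressure=101325.0):
    """Saturation specific humidity [kg kg-1] at the given temperature and pressure."""
    e_s = saturation_vapour_pressure(t_celsius)
    return 0.622 * e_s / (pressure - 0.378 * e_s)


def bulk_evaporation(sst_celsius, q_air, wind_speed,
                     c_e=1.2e-3, rho_air=1.2, pressure=101325.0):
    """Bulk evaporation flux [kg m-2 s-1].

    c_e is an assumed constant moisture transfer coefficient; in the models it
    depends on stability, roughness length, and the surface layer scheme.
    """
    q_s = saturation_specific_humidity(sst_celsius, pressure)
    return rho_air * c_e * wind_speed * (q_s - q_air)


# Hypothetical conditions: SST of 20 degrees C, 8 g/kg near-surface humidity, 8 m/s wind.
e_reference = bulk_evaporation(20.0, 0.008, 8.0)
e_doubled = bulk_evaporation(20.0, 0.008, 8.0, c_e=2.4e-3)
```

With identical prescribed SSTs, the flux scales linearly with the transfer coefficient, which is one way differing surface layer schemes can produce differing E O .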
Using a simplified atmospheric water budget definition as given in Eq. (4), the goal of this study is to characterise the connection between the atmospheric water budget components over land and ocean in the constrained setup of an RCM ensemble, taken here from the CORDEX project, with prescribed div Q through the same reanalysis ERA-Interim boundary conditions. This constrained setup affords an assessment of the RCMs in maintaining the imposed global model's water budget and redistributing moisture within the model domain. While individual RCM results are not relevant in the characterisation, a model run identification is used to support the interpretation of results in comparison with evaluation studies such as Refs. 28,29,61 without the intention of pursuing a validation study or a detailed assessment of the individual behaviours of the RCMs. Through a conceptual budget study based on annual data, the goal is rather to re-assess fundamental atmospheric water source-sink relationships in a LAM in relation to the superimposed moisture flux divergence of the driving model along the boundaries.
The analyses show that in a state-of-the-art RCM ensemble over Europe using prescribed SSTs, the combination of the RCMs' numerical property of being a boundary value problem with excess oceanic evaporation contributes to the often-seen positive precipitation bias over land. The RCMs' model solution is relaxed to the superimposed driving model's atmospheric moisture flux divergence, as prescribed through the lateral boundary conditions. As a result, excess atmospheric moisture is recycled through precipitation over land.

Results
Annual atmospheric water budgets. Figure (Figs. 1, 4), the atmospheric water budgets correspond to the overall moisture flux divergence div Q LO of the limited area models. As the atmospheric water storage change can be considered negligible over longer time spans (Eq. 3), div Q LO is expected to match the sink and source terms that are P LO  The boundary forcings may vary between individual RCMs due to differences in the configuration of the retrieval of the ERA-Interim data by different RCM modelling groups, differing RCM model grids, spatial interpolation settings, and boundary relaxation procedures. Thus, each RCM is constrained to potentially different boundary conditions, while resembling the same large-scale features of the driving REF forcing fields, which In a year such as 1996 with div Q LO ≈ 5 mm year −1 , i.e., P LO ≈ E LO , the lateral import is close to the export of atmospheric water for the pan-European EUR-44 control volume under consideration. The fact that reanalyses such as REF are based on atmospheric water mass conservation constraints due to the analysis increments 43 is not relevant here, because over the model domain all MME members are constrained laterally by the same div Q LO of REF, which has to be matched in the simulations. However, in the case of the pan-European MME under consideration, negative atmospheric water budgets (i.e., P LO > E LO ) dominate (9 out of 10 RCMs), with an average of div Q LO = − 36 mm year −1 over a 20-year period, despite the fact that all RCMs are supposedly driven by more or less the same atmospheric moisture and SST boundary conditions from REF. The obvious question is: where does the excess water originate in the pan-European MME simulations?
In stark contrast, for the Mediterranean domain of Med-CORDEX (
Role of the ocean areas. The spatial patterns of E and P in Fig. 4 provide insight into the RCMs' systematic behaviour shown in Fig. 3; for brevity, they are shown only for the pan-European model domain based on the EURO-CORDEX MME. The REF dataset in Fig. 4a,b shows typical long-term mean spatial patterns with regional P maxima associated with the Icelandic Low, orographic precipitation along the Norwegian coast, the Alps and Scotland, and declining precipitation in Northern Africa. The E distribution is characterized by a sharp contrast between ocean and land areas, and areas of high E associated with the North Atlantic Current, the Azores High and the Eastern Mediterranean. The spatial difference plot between the MME mean P and the REF data in Fig. 4c is characterized by generally higher MME precipitation in the Northern part of the model domain. The largest differences appear along mountain ranges, e.g., Norway, German mid mountain ranges, and the Pyrenees. In the 0.44° resolution RCMs, with stronger relief compared to the 0.75° resolution REF, more orographic rainfall and less rainfall in the adjacent lowlands are induced due to smoothing effects. Assuming a mean zonal atmospheric water transport with the prevailing westerly large-scale flow, the MME members have a spinup (relaxation) zone along their westernmost boundary with less average precipitation than the global REF data (Fig. 4c), because the lateral forcing in the RCMs has to be picked up, e.g., by the microphysics and convection schemes before precipitation can be generated. Along the easternmost RCM boundary relaxation zone, where the main outflow from the RCMs takes place, an area with a maximum P difference indicates that the RCMs' atmospheric advection is relaxed against the lateral boundary conditions of the REF, which causes some of the excess precipitation in the RCMs. 
This is the area where eight out of nine RCMs in the EURO-CORDEX evaluation study of Ref. 28 (their Fig. 4) show strong precipitation biases during summer. The inter-model spread in P in the MME shown in Fig. 4e is also highest along the western and eastern relaxation zones. Orographic features and coastlines are also characterized by a larger spread, most likely due to differences in RCM model configurations, such as slight changes in the land-ocean masks or differences in the underlying orography used by the RCM.
The difference of E in Fig. 4d is indicative of larger values in the MME than in REF over nearly all ocean areas with a pronounced maximum difference in the Mediterranean, which is in line with an evaluation study of Mediterranean Sea water and heat budgets 58 and the E and P relationships in the eastern Mediterranean 63 . Differences between MME and REF over land are relatively small and might be attributed to differences in the landmask (close to the coast), landuse datasets (see, e.g., Southern Spain), or the RCMs' land surface model. Also, over land, the evapotranspiration is energy- or water-limited and controlled by vegetation 64 , indicated by the lower standard deviation of E of the MME in Fig. 4f over land as opposed to precipitation in Fig. 4e. However, over the ocean areas, especially the Mediterranean, the standard deviation σ(E O ) is very high, indicating a large MME spread and highly differing long-term mean latent heat fluxes, despite the fact that all MME members use the same prescribed SSTs from REF. This behaviour has been attributed to the choice of the surface layer scheme in combination with roughness length, exchange coefficients, and the stability of the boundary layer, model resolution, as well as forced SST versus coupled ocean-atmosphere modelling approaches 47,58-60 . Figure 4g gives insight into the spatial distribution of source and sink areas of the atmospheric water balance. Independent of lateral in- or outflow of atmospheric moisture, the Mediterranean and southern parts of the Eastern Atlantic with the Azores High are net source areas of atmospheric moisture while continental areas act as net sinks, which is well known 53,58,63,65 . In the case of ocean areas, the difference of the E O -P O atmospheric water balances of MME and REF (Fig. 4h) reveals the largest differences over the Mediterranean, where the MME E O is on average between 200 and 300 mm year −1 higher than in REF (Fig. 4d). 
The differences between the REF and MME atmospheric water balances over the Atlantic are smaller (Fig. 4h). Over land areas in Fig. 4h the continental atmospheric moisture sink is stronger in the MME E L -P L budget than in the REF budget, due to a higher P L in the RCMs (Fig. 4c).
Boundary condition effects and the connection of atmospheric water budgets over land and ocean-the origin of excess water. As the ocean areas play an important role in the RCMs' atmospheric water budgets, Fig. 5 separates the atmospheric water budget contributions of individual RCMs and REF on an annual and long-term basis for the EURO-CORDEX and Med-CORDEX models; the behaviour of the latter is described at the end.
In Fig. 5a, E LO versus P LO values are plotted based on all grid points. Apart from interannual variability, REF for the pan-European model domain has an almost divergence-free atmospheric water budget consistent with Fig. 3, where div Q LO is calculated based on the same value pairs. As expected, for ocean areas, E O > P O in Fig. 5b results in excess water in the atmosphere over the oceans. This leads to a characteristic E L < P L relationship over land (Fig. 5c), where the excess P L may generate runoff, and changes in the sub-surface storage and the div Q g term (Eq. 6), based on the parameterizations inherent in the different land surface schemes 52,53,65 . Considering the MME, the hydrological cycle is more intense for most RCMs compared to REF, with a higher E LO and P LO than REF (Fig. 5a), always constrained by the moisture inflow and outflow along the RCM boundaries as well as the SST. When comparing Fig. 5b,c, the relationship is basically inverted. With an infinite water source over the oceans, E O is practically unconstrained and close to the potential evaporation, E pot . Over land, E L is limited 64 ; here P L shows a large spread, whose range approximately resembles the spread of E O . Interestingly, the 20-year means of the individual RCMs show a clear tendency, with nine out of ten falling below the 1:1 line, suggesting a negative divergence over the model domain enforced by the boundary condition, resulting in a surplus of precipitation over land areas (see Fig. 1). Thus, it appears that there are major differences between RCMs in the implementation of the same boundary condition of REF, changing the model domain from being divergence-free to a net sink. In addition, means are shifted parallel to the 1:1 line due to an E O increase in comparison to REF (see Fig. 5b), leading to an additional contribution to the positive P L bias (see Fig. 1). 
This is the oceanic impact and is especially apparent in the results of individual RCMs that fall close to the 1:1 line. Thus, we identify differences in the implementation of the prescribed boundary conditions and increased oceanic evaporation as important factors in an intensified ocean-land recycling rate that contributes to the positive precipitation bias in RCMs.
In comparison, the Med-CORDEX models are characterized by an atmospheric water budget surplus (see Fig. 3) and thus a positive divergence over the Mediterranean model domain. In Fig. 5d those RCMs follow REF in that all results fall above the 1:1 line, making the Mediterranean domain act as a net exporter of moisture, which is in contrast to the results for the pan-European model domain. The net export is however less pronounced in case of the RCMs illustrating again the differences in the implementation of the boundary conditions in the different models (a shift away from REF perpendicular to the 1:1 line). In addition, the RCM results are shifted parallel to the 1:1 line in the direction of an increased oceanic evaporation; in one of the coupled AORCMs this effect is clearly weaker than in the uncoupled counterpart.

The impact of the land on the RCMs' deviations from the REF budget.
To further assess how the MME water budgets deviate from REF as shown in Fig. 5d, we relate P and E from MME to REF. Following our rationale, each MME member should match the driving atmospheric water budget of REF where M represents all atmospheric moisture increments related to potential mass conservation continuity violations in the numerical implementation of the individual models; and div Q LO are the divergences that are Because this simplified mass budget framework neither allows nor intends to determine the relative contributions of ΔM and Δdiv Q, these are summed in the increment ε = ΔM + Δdiv Q. Figure 6 shows the estimates of the different increments for REF and the RCMs for the pan-European and Mediterranean regions.
All but one of the pan-European RCMs show negative ε values (Fig. 6), i.e., they are net sinks and, thus, plot below the 1:1 line in Fig. 5d. In addition, the estimates of an increased E O and P L confirm the enhanced recycling rate that is identified as one of the factors for a positive precipitation bias in the RCMs; eight out of ten pan-European RCMs show this behaviour. The elevated E O also leads to an elevated P O , i.e., an enhanced recycling over the ocean, but there is a net transport onto the land areas. The individual models show a diverse behaviour with respect to the magnitude of their deviation from REF. What is generally apparent, however, is the low E L of many models in comparison to REF, which is another factor in the negative ε as described in Fig. 5d.
In Fig. 5d, pan-European models left of REF deviate either very little over the ocean from REF (CNRM-ALADIN53) or show a very strong recycling (HMS-ALADIN52). The Fig. 5d ocean-land atmospheric water budget relationship far off the 1:1 line of DMI-HIRHAM for example, is according to Fig. 6 mainly due to a combination of enhanced P L with low E L . The one model which is on average a moisture source and close to the 1:1 line in Fig. 5d (IPSL-INERIS-WRF331F) exhibits a higher E O than REF with enhanced recycling over water, but also a smaller than average E L deviation. At the same time, six out of ten pan-European RCMs, which show an above-average ΔP L as compared to REF in Fig. 6, also show high positive precipitation biases in Fig. 1; in three cases this is associated also with above-average E O and low E L deviations (HMS-ALADIN52, IPSL-INERIS-WRF331F, UCAN-WRF341I).   Fig. 6). In general, the RCMs' and AORCMs' atmospheric divergence is smaller than REF for the Mediterranean.

Discussion
Based on results from a EURO-CORDEX MME for a pan-European model domain, supplemented by a smaller Med-CORDEX MME for a Mediterranean model domain, our analysis indicates a clear deviation of individual RCMs of the integrated annual and long-term mean atmospheric water budgets from the driving REF ERA-Interim reanalysis. In other words, in the RCMs, differences in the implementation of the boundary conditions lead to differences in div Q LO between the RCMs and REF. Assuming that the RCMs are mass conserving, the boundary conditions will force the models' sink and source terms to balance the REF divergence via coupling of the atmosphere with the ocean and land surfaces. Thus, the prescribed divergence along the RCMs' boundary effectively determines the overall fluxes in the domain. Understanding the technical details of the implementation of the boundary forcing in the different models is clearly beyond the scope of this study. One potential reason may be that in the boundary relaxation, specific humidity and the velocities are treated individually. In addition, most RCMs show an intensified ocean-land moisture recycling that is systematic, overestimating E O compared to E O,REF and an excess P L .
From these RCM behaviours, we conclude that the differences in div Q LO in conjunction with excess E O contribute significantly to the positive P L biases seen in evaluation studies of RCMs 24,28,30,66 , because eventually the imposed atmospheric moisture flux divergence from the lateral boundary conditions of the driving global ERA-Interim reanalysis must be met. Hence, the combination of boundary-imposed atmospheric moisture transport, oceanic surface layer fluxes, and recycling leads to biases in reproducing the observed terrestrial precipitation and the terrestrial water cycle. This behaviour is more pronounced for some models than for others.
The RCM and AORCM simulations of Med-CORDEX over the Mediterranean are all, in terms of the atmospheric water budget, net sources with a strong positive divergence over the model domain. In general, a similar behaviour can be observed as with the pan-European model runs.
With respect to the E O , model comparison studies with prescribed SSTs indicated differences in E O 58 and tendencies of RCMs to overestimate the latent heat flux over the oceans 58,59 . Reasons for the deviation from observations are overestimated surface exchange coefficients associated with high surface roughness in surface schemes; atmospheric stability associated with radiation schemes, that affect the surface energy balance and thereby coupling and fluxes 59 ; and atmospheric model horizontal resolution leading to different wind speed regimes and resulting turbulent exchange fluxes. Also, interactively coupled ocean-atmosphere models may lead to more realistic SSTs than RCM simulations with prescribed forced ocean SSTs 60 , which also affects precipitation over land areas [44][45][46] . The AORCMs with a coupled dynamic ocean used in this study also show a tendency, with two out of three model pairs, towards a reduction of the excessive E O seen in RCMs with prescribed SSTs.
The rationale of this study is founded on the basic atmospheric water mass conservation consideration, that is div Q = E-P must hold over long time scales. Because div Q is prescribed in a LAM, any changes in the source term must be balanced by changes in the sink term and vice versa, given that the numerical implementation follows mass conservation. This is independent of model resolution [34][35][36][37]67 , convection 36,37 and microphysics schemes 37,38 , aerosol treatment 39 , land-atmosphere feedback processes 40,41 or precipitation recycling 42 . These factors determine internal model behaviour and variability and come into play in studies of spatiotemporal patterns of fluxes and states of water cycle components, and local water budgets.
Our study expands these cause-and-effect relationships that are required in the explanation of the precipitation biases in RCMs. In summary, the driving model's superimposed moisture flux divergence, excess oceanic evaporation in RCMs with prescribed SSTs, and the recycling of moisture have a combined impact on terrestrial precipitation biases and water budgets. The results suggest a need for careful re-inspection of the RCMs' implementation of the boundary conditions and surface schemes, in conjunction with forced SSTs. In this context, Big Brother-type experiments could be used to assess the impacts, e.g., of the lateral boundary condition implementations 68 in the experiment preparation. These measures will help in reducing differences in the atmospheric water budgets and likely reduce excess E O in the RCMs and P L biases. The two major ongoing developments in regional climate modelling, towards convection-permitting spatial resolutions and coupled (in particular ocean-atmosphere) regional Earth system models 4,5 , already address some of the sources of precipitation biases.

Methods
Model data. The RCM simulation results are from the World Climate Research Programme's (WCRP) Coordinated Regional Downscaling Experiment (CORDEX) project, a diagnostic model intercomparison project for CMIP6 9 . Data from two initiatives are used: EURO-CORDEX, for a pan-European model domain, is the main dataset under investigation; a supplementary smaller dataset is used from the Med-CORDEX initiative for a Mediterranean domain.
The EURO-CORDEX data are available through the Earth System Grid Federation (ESGF) data dissemination system 69 . Based on data availability on ESGF, ten RCM ensemble members from the EURO-CORDEX initiative, the European branch of the CORDEX project, are used. Important for this study, RCM data are from the EURO-CORDEX evaluation experiment 27,28 , i.e., the 3- to 6-hourly lateral boundary forcing and the prescribed daily SST are from the ECMWF ERA-Interim reanalysis 70 and therefore similar for each ensemble member (driving experiment meta data: "ECMWF-ERAINT, evaluation, r1i1p1"). The following RCMs are used, identified by their unambiguous CORDEX experiment protocol institute ID, and the RCM model and  28,29,61 . RCM data are used on a CORDEX-defined rotated latitude-longitude EUR-44 grid at 0.44° horizontal resolution, with a 106 × 103 grid points focus domain. This EUR-44 grid defines the spatial reference throughout the study (Fig. 3). The boundary relaxation zone, which differs per RCM, depending on the individual dynamical downscaling setup, is removed before simulation results are checked for compliance (CMOR standard) and ingested into the ESGF. Because the ALADIN RCMs use a different model grid, a 1st order conservative remapping to the EUR-44 grid with the Climate Data Operators (cdo) (v1.9.1) 71 is applied. Variables used are precipitation [kg m −2 s −1 ] ("pr"), defined to include both liquid and solid phases from large-scale and convective clouds, and surface evaporation [kg m −2 s −1 ] ("evspsbl"), defined as the flux of water into the atmosphere due to conversion of both liquid and solid phases to vapor (from underlying surface and vegetation), at a daily resolution, available for the official EURO-CORDEX evaluation time span from 1989 to 2008, stored in netCDF files at 5-year intervals. For each RCM the respective landmask [0-100%] ("sftlf") is available. 
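The conversion of the daily mean fluxes in kg m −2 s −1 to annual sums in mm year −1 can be sketched as follows. The sketch relies on the fact that 1 kg of water per square metre corresponds to a 1 mm water column; the function name and the example values are hypothetical, and the actual processing chain of the study is not reproduced here.

```python
import numpy as np

SECONDS_PER_DAY = 86400.0


def annual_sum_mm(daily_mean_flux):
    """Convert daily mean fluxes [kg m-2 s-1] of one year to an annual sum [mm].

    Each daily mean flux multiplied by the number of seconds per day gives the
    daily amount in kg m-2, which equals mm of water equivalent.
    """
    daily_mean_flux = np.asarray(daily_mean_flux, dtype=float)
    return float(np.sum(daily_mean_flux * SECONDS_PER_DAY))


# Hypothetical year with a constant precipitation flux of 1e-5 kg m-2 s-1.
example_total = annual_sum_mm(np.full(365, 1.0e-5))
```

The same conversion applies to "pr" and "evspsbl", since both are stored as mass fluxes per unit area.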
The Med-CORDEX initiative 62 data are from the phase 1 core and tier 1 simulations, available through the Med-CORDEX data dissemination infrastructure (https://www.medcordex.eu, THREDDS server). The same variables as for EURO-CORDEX are used, albeit the base data retrieved are at a monthly temporal resolution. For compatibility, Med-CORDEX simulations at 0.44° horizontal grid resolution, on a 98 × 63 grid point focus domain (MED-44) are selected. This grid specification nearly exactly overlaps with the EUR-44 grid. Specific to the Med-CORDEX initiative experiment is the availability of coupled Regional Climate System Models (RCSM) or Atmosphere-Ocean Regional Climate Models (AORCM), that feature, e.g., fully interactive ocean model components, which cover the whole Mediterranean, usually at a higher resolution than the atmospheric model components. The subset of models used here has been selected as there is a coupled AORCM counterpart to each uncoupled RCM with prescribed SSTs. The following subset of three model pairs of Med-CORDEX models was available that matched the aforementioned criteria: CMCC/CCLM4-8-19/v2 (RCM, SST update daily from ERA-Interim) with CMCC/CCLM4-21-NEMOMFS/v1 (AORCM); CNRM/ALADIN52/v1 (RCM, SST update monthly from ERA-Interim) with CNRM/RCSM4 v1 (AORCM); GUF/CCLM4-21/v1 (RCM, SST update daily from ERA-Interim) with GUF/CCLM-NEMO/v1 (AORCM). Due to the complex model setups of the coupled AORCMs, model configurations are more heterogeneous than with EURO-CORDEX; for example, the computational grid in case of the AORCMs is often larger than the focus domain.
The ERA-Interim reanalysis 70 constitutes the RCM forcing and the reference dataset, "REF", in the analysis. The REF data are retrieved from the ECMWF's Meteorological Archival and Retrieval System (MARS) on a global grid with a 0.75° resolution. From the "Synoptic Monthly Means" data stream ("mnth"), monthly means of the 12 h accumulated sums for the 00UTC and 12UTC forecast start times are retrieved as grib files, covering 1989 to 2008. Variables used are total precipitation [m of water equivalent] ("tp", parameter ID 228), defined as "the accumulated liquid and frozen water, including rain and snow, that falls to the Earth's surface. It is the sum of large-scale precipitation … and convective precipitation …" (https://apps.ecmwf.int/codes/grib/param-db?id=228), and evaporation [m of water equivalent] ("e", parameter ID 182), defined as "the accumulated amount of water that has evaporated from the Earth's surface, including a simplified representation of transpiration (from vegetation), into vapor in the air above" (https://apps.ecmwf.int/codes/grib/param-db/?id=182). As with the ALADIN RCM data, REF is remapped to the EUR-44 grid using cdo 1st order conservative remapping after a format conversion from GRIB to netCDF, including scaling and offsetting for unit conversion to mm year −1 . The 0.75° resolution binary land-sea mask [0, 1] ("lsm", parameter ID 172) is resampled to the EUR-44 grid using a nearest neighbour resampling. Because there is nearly an exact match of the MED-44 and EUR-44 grids, the ERA-Interim reanalysis as prepared for the EUR-44 model domain is also used for MED-44, thereby reducing the Med-CORDEX model domain from 98 × 63 to 95 × 63 grid points. Given the larger deviations in model domain configuration and thereby differing spatial spinup zones with the Med-CORDEX MME, this simplification seems warranted.
Observations. Precipitation observations are used as an independent dataset to illustrate the actual precipitation bias of the RCMs. The E-OBS dataset 72 provides gridded daily precipitation data on a regular latitude-longitude grid at 0.1° and 0.25° resolution for Europe, based on the ECA&D (European Climate Assessment and Dataset) station data set, available through the Copernicus Climate Change Service. For this study, daily precipitation data [mm d −1 ] from the dataset version 21.0e of May 2020 are extracted using the cdo tools for the time span from 1989 to 2008, and temporally averaged to create long-term mean annual sums. A conservative remapping is applied to transfer the E-OBS data to the EUR-44 grid.
Scientific Reports | (2021) 11:6228 | https://doi.org/10.1038/s41598-021-85744-y www.nature.com/scientificreports/
Data processing and analysis methods. The RCM and REF simulation base data for the extraction of P and E timeseries over the complete model domain, ocean, and land areas are in [mm year −1 ] water equivalent data arrays. Before the timeseries extraction, all data are on the original equal area 106 × 103 EUR-44 grid (10,918 grid elements total), or 95 × 63 MED-44 grid (5985 grid elements total, truncated from 98 × 63 grid dimension). Only REF is regridded from the regular 0.75 degree grid to the EUR-44 grid, see above. Based on a land-sea mask threshold of ≥ 0.5 for land, time series of the spatial sums over the ocean and land areas, as well as the complete domain, are derived per RCM and for the REF dataset, applying to each model its own land-sea mask. The land-sea masks of the individual models differ slightly from each other due to the individual preprocessing of the RCMs' static fields; for the EUR-44 grid, the binary land-sea masks with the fewest and the most land grid points differ by 67 grid points (0.6% of all grid points); the REF land-sea mask (5817 land grid points) and the RCMs' land-sea mask average (5858 land grid points) differ by about 0.4%, due to the different spatial resolutions of the base data. The differences in the land-sea masks are considered small enough to justify neither a resampling of the data to a common reference grid with a common land-sea mask nor a weighting of spatial sums or means of precipitation and evapotranspiration based on the different numbers of land-sea mask grid points. For comparisons that involve the complete model domain or land or ocean areas exclusively, the spatial means of the annual sums are used (e.g., Figs. 3, 5a-c); in case ocean and land areas are compared, the means over the oceans and land areas are normalised by multiplying with the proportion of ocean and land points to the overall number of grid points, so that adding the weighted averages for the ocean and land areas yields the domain spatial average (Fig. 5d).
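The masking and normalisation described above can be illustrated with a short Python sketch. It assumes an equal-area grid, so that unweighted grid-point means are adequate, and a land fraction given in the range 0 to 1; the function name and the small example field are hypothetical and do not reproduce the authors' processing code.

```python
import numpy as np


def split_means(field, land_fraction, threshold=0.5):
    """Spatial means over the domain, land, and ocean, plus normalised means.

    field: 2D array of annual sums [mm year-1]; land_fraction: 2D array in
    [0, 1], where values >= threshold count as land. The normalised means are
    weighted by the share of land or ocean grid points, so that their sum
    equals the domain mean.
    """
    field = np.asarray(field, dtype=float)
    land = np.asarray(land_fraction, dtype=float) >= threshold
    land_share = land.sum() / field.size
    mean_land = field[land].mean()
    mean_ocean = field[~land].mean()
    return {
        "domain": field.mean(),
        "land": mean_land,
        "ocean": mean_ocean,
        "land_normalised": mean_land * land_share,
        "ocean_normalised": mean_ocean * (1.0 - land_share),
    }


# Hypothetical 2 x 2 field with three land points and one ocean point.
result = split_means([[1.0, 2.0], [3.0, 6.0]], [[1.0, 1.0], [0.0, 1.0]])
```

A land fraction stored in percent, such as "sftlf", would need to be divided by 100 before applying the 0.5 threshold.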

Data availability
The base data for the study (3D monthly and yearly data per RCM and REF on EUR-44 and MED-44 grid, extracted time series, land-sea masks, and base data for the plots, including the resampling weights for the REF data grid transformation) are available from the "Data Publication Server Forschungszentrum Jülich" under https://www.re3data.org/repository/r3d100012923.