Historical total ozone radiative forcing derived from CMIP6 simulations

Radiative forcing (RF) time series for total ozone from 1850 up to the present day are calculated based on historical simulations of ozone from 10 climate models contributing to the Coupled Model Intercomparison Project Phase 6 (CMIP6). In addition, RF is calculated for ozone fields prepared as an input for CMIP6 models without chemistry schemes and from a chemical transport model simulation. A radiative kernel for ozone is constructed and used to derive the RF. The ozone RF in 2010 (2005–2014) relative to 1850 is 0.35 W m−2 [0.08–0.61] (5–95% uncertainty range) based on models with both tropospheric and stratospheric chemistry. One of these models has a negative present-day total ozone RF. Excluding this model, the present-day ozone RF increases to 0.39 W m−2 [0.27–0.51] (5–95% uncertainty range). The rest of the models have RF close to or stronger than the RF time series assessed by the Intergovernmental Panel on Climate Change in the fifth assessment report with the primary driver likely being the new precursor emissions used in CMIP6. The rapid adjustments beyond stratospheric temperature are estimated to be weak and thus the RF is a good measure of effective radiative forcing.


INTRODUCTION
Tropospheric ozone has increased since pre-industrial times primarily due to photochemical reactions involving components emitted to the atmosphere by anthropogenic activities 1 .Such emissions include methane (CH 4 ), nitrogen oxides (NOx), carbon monoxide (CO), and non-methane volatile organic components (NMVOCs), collectively referred to as ozone precursors.In the stratosphere, the amount of ozone has decreased primarily due to anthropogenic emissions of ozone-depleting substances (ODSs) 2 .
Ozone is a greenhouse gas and considered as a "near-term climate forcer" 3 due to its short global mean lifetime of ~23 days in the troposphere for the present day 4 .Hence, it is not well-mixed in the atmosphere and that challenges the estimation of the historical development of ozone in the atmosphere.There are limited long-term historical observations of ozone, in spatial coverage and in the vertical [5][6][7][8] .For climate impact calculations, the vertical distribution of ozone is crucial, as its effect on trapping thermal infrared radiation is largest in the upper troposphere [9][10][11] .Due to insufficient long-term observations, the time series of the historical development of ozone and its radiative forcing (RF) since pre-industrial times must be estimated using models.
Ozone is the third strongest anthropogenic greenhouse gas forcer assessed by the Intergovernmental Panel on Climate Change (IPCC) in the fifth assessment report (AR5) with a total RF of 0.35 (0.15-0.55)W m −2 in 2011 relative to pre-industrial times (1750), where ozone changes in the troposphere are estimated to exert a RF of 0.40 (0.20-0.60)W m −2 and RF due to changes in stratospheric ozone of −0.05 (−0.15 to +0.05) W m −2 (5-95% uncertainty ranges) 3 .One of the main uncertainties in the estimates of tropospheric ozone RF is the incomplete knowledge of pre-industrial ozone concentrations 12 .The chemistry models have not been able to reproduce the low levels of ozone observed at the end of the nineteenth century 12 ; however, those observations are highly uncertain 7,8,13 .A recent study puts forward a constraint on historical ozone levels using oxygen isotopes in ice cores and they concluded that historical ozone levels were in line with modeled ozone levels, and that early observations were biased low 14 .
Understanding past drivers of climate change is crucial for assessing future climate change.The climate models contributing to the Coupled Model Intercomparison Project Phase 6 (CMIP6) 15 do not calculate the RF of the drivers by default.Separate initiatives under CMIP6, the Radiative Forcing Model Intercomparison Project (RFMIP) 16 , and the Aerosol and Chemistry Model Intercomparison Project (AerChemMIP) 17 aim to give an accurate characterization of the effective RF (ERF) in the climate models and assess how anthropogenic emissions have contributed to global forcing over the historical period, respectively.The ERF includes rapid adjustments due to changes in atmospheric temperature, water vapor, and clouds without any response from surface temperature changes 3,18,19 and goes beyond RF, which only includes the adjustment from stratospheric temperature 3 .ERF is the best measure to compare the global mean surface temperature change between different forcing mechanisms for perturbations to the Earth's radiation budget, and in particular when the effect of the small land surface warming is removed in climate simulations keeping sea surface temperature constant 20 .
In this study, we calculate the historical time development of total ozone RF since 1850 from the Earth system models contributing to CMIP6 (Table 1).A radiative kernel is developed for total ozone RF by detailed radiative transfer code calculations on idealized ozone perturbations (see "Methods" section and Supplementary Fig. 1).This kernel is used to calculate the RF time series based on the modeled changes in ozone mixing ratios since the 1850s in the historical CMIP6 experiment.For those CMIP6 models without ozone chemistry schemes, ozone fields are provided through the input data sets for Model Intercomparison Projects (input4MIPs).In addition, we use the kernel to calculate the RF on these fields and compare the results with that of Checa-Garcia et al. 21, who calculated RF on the input4MIPs ozone fields and the fields provided for the previous phase of CMIP.We focus on forcing due to total ozone changes, including both the effect of ozone increases due to ozone precursors and ozone decreases due to ODSs.Although these two effects mainly affect ozone in the troposphere and stratosphere, respectively, the two processes can have a substantial effect in the other region as well 22,23 .Ozone RF is often split into a tropospheric and stratospheric component 3 , but to split the RF based on the drivers (i.e., ozone precursors vs. ODSs) instead of the region in the atmosphere where the changes happen is more relevant for mitigation 23 .The RF calculated based on CMIP6 ozone fields is complemented with RF estimates based on simulations from a chemical transport model (CTM).Using the CTM results, RF is calculated due to changes in tropospheric ozone precursors only (ODSs changes excluded).In addition, RF is calculated in 1850 relative to 1750, to account for a different baseline, and the RF is extended up to 2020 (see "Methods" section).

RESULTS
The vertical distribution of ozone is crucial for the RF calculations.In Fig. 1, the relative difference in annual zonal mean O 3 between 1850 and 2010 is shown.The largest relative change in ozone over this period is found in the Northern Hemisphere (NH) troposphere, for the models that include tropospheric chemistry (all models, except the CNRM models).The increase is largest in the lower troposphere at mid-latitudes, with a tongue reaching up to the upper tropical troposphere where the relative increase for the models ranges from ~40 to ~80%.These are the regions with the largest ozone precursor emission sources 24 and highest insolation, respectively.All models have an ozone reduction in the upper stratosphere and also in the lower stratosphere at the southernmost latitudes over Antarctica.The weakest maximum reduction in this region of 13% is found for the MRI-ESM2-0 model.The strongest ozone depletion is modeled by the UKESM1-0-LL model with a maximum reduction of 58%.This model is also the model with the weakest increase in tropospheric ozone, with an increase of 15-20% in the upper tropical troposphere in the Southern Hemisphere (SH), and it is likely to be that the ozone depletion has affected the troposphere to a larger extent in this model than for the other models.In other regions of the stratosphere, the sign of the change in ozone differs among the models.In particular, ozone depletion in the lower stratosphere at high northern latitudes is strong in some models, but less evident in other models for the annual means.
For the period before stratospheric ozone depletion, there is a large spread in the modeled total ozone column with ~60 Dobson Units (DU) difference in latitude bands between 60°S and 60°N for the annual mean and >100 DU difference in both polar regions during spring (Supplementary Fig. 2).At the end of the period, 2010, the simulated total ozone still has a large spread, but the values are lower.The temporal development also differs among the models.The largest decrease in ozone is seen for the UKESM1-0-LL model with the most rapid reduction between 1980 and 1990.In the GISS-E2-1-H model, total ozone increases from 1950 to 1980, with a rapid decrease thereafter.Ground-based measurements and satellite data are available from around 1980 for observations of total ozone 25 .Weber et al. 25 estimated the trends in the observed total ozone prior to 1997 and from 1997 to 2016.In Fig. 2, these two trends are compared with the trends in the models from 1980 to 2000, and from 2000 to 2010.This is not an exact comparison, as the modeled data are averaged over 10 years.Most models have a negative trend for all regions over the period 1980-2000, whereas the models simulate a recovery post 2000, with trends closer to zero or above.This is the same feature as seen in the trend for the observations as estimated by Weber et al. 25 .
Changes in ozone in the upper free troposphere produce stronger RF per DU change than ozone changes in the stratosphere and changes closer to the surface (Supplementary Fig. 1).However, there are limited observations available for evaluating the long-term trend of ozone in the troposphere, e.g., Oltmans et al. 6 and Tarasick et al. 7 .The modeled ozone distribution for the year 2000 is compared with ozone sonde data in Supplementary troposphere, sonde data have been combined [26][27][28] to generate the Trajectory-mapped Ozonesonde dataset for the Stratosphere and Troposphere (TOST) product.Gaudel et al. 5 calculated trends over the period 1998 to 2012 using the TOST product for different latitude bands (shown in Fig. 3).The trend in TOST is close to zero between 30°S to 60°S similar to the trend in the models.In the northern extra tropics, a small positive trend is seen for the models that include tropospheric chemistry and for the TOST product.In the tropics, the TOST increase is larger than the modeled trends (30°S to 0 and 0 to 30°N).The largest increase in ozone in TOST was found after 2008 in the tropics, which we might not be able to capture due to the 10-year averages used in the figure.Ozone distribution and trends are influenced by internal natural variability including changes to the stratosphere-troposphere exchange [29][30][31][32] .The models simulate one possible climate realization and will not be able to reproduce observed inter-annual variability due to dynamical processes.A long time period is needed for tropospheric ozone trends assessment 8 .Also, Gaudel et al. 5 compared the trend in TOST with trends in satellite-based product of tropospheric ozone.The products differ in the quantification of tropospheric ozone trends, both in sign and magnitude, and to determine why the trend in satellite products differ is a remaining issue 5 .The long-term trend in the modeled tropospheric ozone column is also shown in Fig. 3.The most rapid increase occurred between 1950 and 1980, prior to the period of good coverage in spatial and temporal scale of observations.The increase in ozone since 1850 is larger in the NH than in the SH.The increase in global tropospheric ozone since 1850 has a range of 40-60% for the models with tropospheric chemistry, except for the UKESM1-0-LL.
The increase in the UKESM1-0-LL is ~24% since 1850, based on the tropopause diagnosed from the OsloCTM3 model (defined as potential vorticity of 2.5 potential vorticity unit, with upper limit 380 K potential temperature and lower limit of 5 km); this will differ from the tropopause definition used in other studies and in the individual models.However, Sellar et al. 33 found a similar relative increase in tropospheric ozone of ~23% from a preindustrial baseline for a different tropopause definition for UKESM1-0-LL.For comparison, in the Atmospheric Chemistry and Climate Model Intercomparison Project the multi-model mean for tropospheric ozone increased by 41% in 2000 relative to 1850 4 .With the exception of UKESM1-0-LL, the increase in tropospheric ozone in 2000 relative to 1850 is larger than 40% for all the models (Fig. 3).

Radiative forcing
Figure 4 shows the kernel-derived total ozone forcing time evolution from the modeled changes in ozone since 1850 (base period 1850-59).The majority of the models show an increase in the forcing over the entire time period and the calculated RF time series are close to or, more frequently, above the best estimate of the RF time series in IPCC AR5 3 (Fig. 4).At the end of the time period (2010), the forcing is in the upper part of the range assessed by IPCC for total ozone RF in 2011 (Fig. 4).For the two CNRM models without tropospheric chemistry, the estimated RF in 2010 is −0.09 and −0.12 W m −2 , within the range of ozone RF due to ODS assessed by IPCC AR5 3 .The outlier of the models, which includes both tropospheric and stratospheric chemistry, is UKESM1-0-LL, with a negative total ozone forcing in 2010, only slightly weaker than the CNRM models without tropospheric Also shown in Fig. 4 are the forcing estimates of Checa-Garcia et al. 21using the input4MIPs data.It is noteworthy that the averaging periods in Checa-Garcia et al. 21are not the same as in this study, but the estimated RF for the input4MIPs data are similar to our values.In addition, if the RF calculation is split between long-wave and short-wave forcing, the trend is similar for the explicit RF calculations of Checa-Garcia et al. 21and for the kernel calculations in this study (Supplementary Figs 4 and 5).The shortwave RF is slightly stronger using the kernel compared with Checa-Garcia et al. 21.
The temporal development of the RF since 1970 differs among the models (Fig. 4).For some models, the RF increases over the whole period (OsloCTM3, CESM-models, and BCC-ESM1).Some models have a flat or decreasing trend between 1980 and 1990 followed by an increase (GFDL-ESM4, input4MIPs, and E3SM-1-0, which used input4MIPs data in the troposphere).The GISS-E2-1-H and MRI-ESM2-0 models show an increase up to 1990.For the MRI-ESM2-0, the RF is constant thereafter, whereas GISS-E2-1-H has a renewed growth after 2000.
For OsloCTM3, the time period considered is extended both forward to 2020 (using IAMC-MESSAGE-GLOBIOM-ssp245 scenario emissions) and backwards to 1750.The RF in 1850 relative to 1750 is 0.03 W m −2 , less than the 0.04 W m −2 from Skeie et al. 34 adopted by Stevenson et al. 12 .For the period post 2010, the RF is 0.55 W m −2 in 2014, 0.54 W m −2 in 2017, and 0.56 W m −2 in 2020 relative to 1850.It is noteworthy that the OsloCTM3 model has the strongest total ozone RF and the value in 2010 (0.52 W m −2 ) is slightly above the upper range provided by IPCC AR5 (Fig. 4).
The RF between 2000 and 1850 from an additional simulation using OsloCTM3 with pre-industrial ODS in 2000 was 0.50 W m −2 .The RF of 0.50 W m −2 due to changes in tropospheric ozone precursors is in the middle of the range assessed by the IPCC AR5 of 0.50 (0.30-0.70)W m −2 (5-95% uncertainty range) for ozone precursors 3 and similar to the results of Søvde et al. 22 (0.52 W m −2 ) and of Shindell et al. 23 (~0.5 W m −2 ).Subtracting the RF from this simulation from the total ozone RF in OsloCTM3 gives an ozone RF of −0.06 W m −2 due to ODS, which is in the weaker part of the range assessed by the IPCC AR5 of −0.15 (−0.30 to 0.0) W m −2 (5-95% uncertainty range) 3 .For the two models that do not include tropospheric chemistry, ozone RF from ODSs is −0.13 W m −2 (CNRM-ESM2-1) and −0.08 W m −2 (CNRM-CM6-1) in 2000.The change is small in these two models for RF between 2000 and 2010 (Fig. 4).The OsloCTM3 RF and CNRM results for RF due to ODSs are in the weaker range of the IPCC AR5.Although not explicitly calculated, the ozone RF due to ODSs in UKESM1-0-LL is likely significantly stronger than in the OsloCTM3 and in the CNRM models.

Effective radiative forcing
The rapid adjustments other than stratospheric temperature adjustment are weaker than 0.1 W m −2 in a mean of simulations by three models (see "Methods") using different sets of ozone fields adjusted to the present day (2014) relative to pre-industrial (1850).The adjustments from surface temperature, tropospheric temperature, surface albedo, water vapor, and clouds nearly cancel (each of the rapid adjustments are of ~0.1 W m −2 in magnitude or weaker).See Supplementary Fig. 6 for the rapid adjustment terms as a mean of the three models and different approaches for their quantification.The rapid adjustment calculations are within the uncertainties in these approaches and thus our RF estimates can be treated as similar to ERF.

DISCUSSION
As CMIP6 model runs do not provide direct means of calculating time-varying RF due to anthropogenic changes in ozone, we calculate ozone forcing in this study based on the CMIP6 simulations over the historical period for those models that include tropospheric and/or stratospheric chemistry, using a radiative kernel.In addition, this study includes RF estimates using the OsloCTM3, a CTM driven by the same emission inventory as used in CMIP6, and RF estimates based on the ozone fields from input4MIPs prepared for CMIP6.
The full range of total ozone RF in 2010 (2005-2014) for models considering ozone changes in both troposphere and stratosphere is −0.064W m −2 to 0.56 W m −2 relative to the 1850s.UKESM1-0-LL is the only model outside the range of AR5 for ozone and a clear outlier (see discussion below).Two models do not include tropospheric chemistry; hence, the ozone RF calculated is due to ODSs alone.These, as well as estimates from an additional simulation with the OsloCTM3, result in RF in the weaker part of the range assessed by IPCC AR5 for ozone changes due to ODSs.Similar to earlier findings from ODS-only simulations (which keep ozone precursor emissions constant), ozone decreases due to ODSs affect both the stratospheric and the tropospheric ozone burden 22,23 .
The UKESM1-0-LL model appears to have a much stronger ERF due to ODSs than many of the other models (Morgenstern et al., in review), although not explicitly calculated here.O'Connor et al. 35 (revised version), e.g., report a present-day ERF from ODS of −0.18 W m -2 (including the direct radiative effect of the ODS themselves, the direct radiative effect of ozone, and other rapid adjustments) relative to the pre-industrial period.Given that previous estimates of the direct RF of ODS and ozone depletion are 0.36 W m -2 and −0.15 (−0.30 to 0.0) W m −2 (5-95% uncertainty range), respectively 3 , the offset by ozone depletion in UKESM1-0-LL appears to be too strongly negative.The study by Fig. 4 The time development of total ozone RF relative to 1850.The total ozone RF are calculated using a radiative kernel on ozone fields for the historical experiment in CMIP6, the input4MIPs data and OsloCTM3 results.Also shown in the figure is the forcing time development calculated by Checa-Garcia et al. 21on the input4MIPs data, the total ozone forcing time series from IPCC AR5 relative to 1850, the total ozone RF range, and the range of ODS-caused ozone RF assessed by IPCC AR5 in 2011 3 adjusted to 1850 as the base period.Models without tropospheric chemistry are indicated with a triangle.The RF based on time-slice simulations using the OsloCTM3 is indicated by a square.The underlying numbers for this figure are presented in Supplementary Table 2.

Morgenstern et al. (in review
) uses observed ozone trends from 1979 to 2000, to constrain the ERF due to ODSs from an ensemble of models and shows stronger ozone depletion in UKESM1-0-LL than observed, particularly in the SH.Using the tropospheric ozone kernel by Rap et al. 36 on the UKESM1-0-LL model resulted in an RF in the lower range of IPCC AR5, similar to tropospheric RF calculated here if the stratosphere is masked (Supplementary Figs.7a and 8a for two different definitions of the tropopause).This low estimate relative to IPCC AR5 may be due to the offset of −0.1 W m −2 from ODS on tropospheric ozone RF at the present day calculated by O'Connor et al. 35 , which is much stronger an effect from stratospheric ozone depletion than previous estimates (e.g., Søvde et al. 22 ; Shindell et al. 23 ).For the stratospheric ozone RF (Supplementary Figs.7b and 8b), the UKESM1-0-LL is outside the range from IPCC AR5 3 and is consistent with the Morgenstern et al. study.There are indications that the ozone hole is too deep in UKESM1-0-LL, does not capture the observed inter-annual variability in ozone, and the negative trends in polar latitudes are too strong 33 .The model has larger ozone burden in Antarctic spring compared with the other CMIP6 models in the preindustrial (Supplementary Fig. 2), whereas the OsloCTM3 with stratospheric ozone RF at and above the upper range of IPCC AR5 (Supplementary Figs.7b and 8b) is the model with the lowest preindustrial total ozone in the Antarctic spring (Supplementary Fig. 2).For the present day, the two models have modeled total ozone that is more similar to the rest of the models (Supplementary Fig. 2), but the magnitude in forcing differs.The main difference in total ozone is prior to the satellite era and it is hence difficult to exclude models, but we note UKESM1-0-LL as an outlier.UKESM1-0-LL is not able to reproduce the 1960-2000 surface temperature development 33 , although fixing ODS at 1950 levels, carried out as part of AerChemMIP 17 , does not have a large impact on the modeled global mean surface temperature behavior.
The multi-model mean for total ozone RF is 0.35 W m −2 with an SD of 0.16 W m −2 in 2010 (2005-2014) relative to 1850 (1850-1859) for the CMIP6 models (excluding the two models that do not include tropospheric chemistry), OsloCTM3 and input4MIPs (see Supplementary Table 2).This is 0.03 W m −2 stronger than what is calculated using input4MIPs data.Excluding, in addition, the outlier model, the multi-model mean estimate increases to 0.39 W m −2 , 0.08 W m −2 stronger RF than calculated using the input4MIPs data only.The use of the kernel on input4MIPs ozone changes provides RF within 4% of the presentday total ozone RF (0.30 W m −2 for 2010-2014 relative to 1850-1860) in Checa-Garcia et al. 21using a different radiative transfer code on the same ozone fields.It is noteworthy that the time periods considered in Checa-Garcia et al. 21and this study differ.
Although the total RF calculated for the input4MIPs ozone field (0.31 W m −2 ) lies at the lower end of the CMIP6 range, it is similar to the AR5 mean (0.31 W m −2 relative to 1850).This can be explained by the input4MIPs ozone being a merger of two model simulations, which, for their historical simulations, included emissions following the Chemistry-Climate Model Initiative's data protocol 37 .These emissions represent a merger of CMIP5 38 and RCP8.5 emissions, and not the official CMIP6 precursor emissions (which were not yet available at the time the input4MIPs database had to be prepared for CMIP6).On a global scale, the emission of NOx showed a rapid increase from 1990 to 2000, with ~15% higher anthropogenic emissions for the year 2000 in the CMIP6 emissions inventory compared with CMIP5, in which emissions remained stable from 1990 to 2000 24 .As a consequence, CMIP6 precursor emissions result in global ozone changes that are fairly similar to those in input4MIPs up to the 1980s; however, they show a much stronger increase thereafter.The fact that the difference is mainly emission-driven can also be seen when comparing the input4MIPs ozone changes in Fig. 2 with those from CESM2(WACCM6), a slightly earlier version of which was used to derive the input4MIPs ozone.This also provides a possible explanation why most models driven by emissions of ozone precursors in CMIP6 have a higher RF compared with the IPCC AR5 best estimate 3 , which was heavily based on model simulations using the emission inventory of Lamarque et al. 38 , and a different time evolution post 1990.Post 1990, the RF based on the CESM2 models increases, whereas the RF based on the input4MIPs data are more stable (Fig. 4).Uncertainties in the historical emission inventories of tropospheric ozone precursors are not assessed in Hoesly et al. 24 , but assumed to be larger than the difference between the CMIP5 and CMIP6 emission inventories at the global scale, larger on the regional scale, and to increase back in time 24 .Uncertainties in the precursor emissions beyond the difference between CMIP5 and CMIP6 emissions are not considered in this study.
Finally, it is important to keep in mind that the spread in the total ozone RF as presented here are not based on results from 12 independent models.There are relationships among the models and the input4MIPs data (see Supplementary Table 1).CESM2 (CAM6) uses prescribed ozone fields from a CESM2(WACCM6) simulation, BCC-ESM1 uses an earlier version of the chemistry scheme in CESM in the troposphere and input4MIPs data in the stratosphere, whereas E3SM-1-0 uses input4MIPs data in the troposphere.Also, the input4MIPs data are constructed using the chemistry schemes in an earlier version of the CESM(WACCM) model.If related models are excluded, the uncertainty range slightly increases (Supplementary Table 2).

METHODS Data
The climate models in CMIP6 include ozone in different ways for the historical simulations from 1850 up to the present day.Some models use prescribed ozone concentrations from the input4MIPs 39 , whereas other models include chemistry schemes, but they vary in complexity.Mainly, chemistry schemes in the models are either developed for the whole atmosphere, only for the troposphere or only for the stratosphere.In this study, we use a radiative kernel (see below) to calculate RF since 1850.We include the models that have uploaded monthly ozone fields to the CMIP6 database for the historical experiment, except those that use the ozone fields provided by input4MIPs and NorESM, which uses the same ozone input fields as CESM2(CAM6) and GISS-E2-1-G, which is identical to GISS-E2-1-H, except a different ocean model.Kernel ozone RF results for the models using input4MIPs, the NorESM compared with CESM2(CAM6), and GISS-E2-1-G compared with GISS-E2-1-H are shown in Supplementary Fig. 9.An overview of the models included in this study is presented in Table 1.When several ensemble members are available, one member is chosen (Table 1).Total ozone RF is also calculated based on the ozone fields provided by input4MIPs and results from a CTM 40 using the historical emission inventories prepared for CMIP6 24,41 .

Chemical transport model
The CMIP6 simulations are supplemented by time-slice simulations using the Oslo Chemistry transport model (OsloCTM3) 40,42 .It is a threedimensional chemistry transport model driven by three-hourly meteorological forecast data by the Open Integrated Forecast System (Open IFS, cycle 38 revision 1) at the European Centre for Medium-Range Weather Forecasts.The horizontal resolution is ~2.25°×2.25°with 60 vertical layers ranging from the surface up to 0.1 hPa.
Time-slice simulations are performed for the years 1750, 1850, 1950,  1970, 1980, 1990, 2000, 2010, 2014, 2017, and 2020, but with fixed year 2010 meteorological data for all simulations.For the year simulated, historical anthropogenic emissions are from Hoesly et al. 24 and biomass burning emissions from van Marle et al. 41 , while natural emissions are kept constant.For the year 2017 and 2020, gridded emissions using the IAMC-MESSAGE-GLOBIOM-ssp245 scenario 43 are used.A linear interpolation between 2015 and 2020 is assumed for 2017 emissions.The upper boundary conditions are taken from simulations done with the Oslo 2D model and the surface concentration of methane is scaled to observed concentration.Spin up is 1 year and the concentrations of long-lived components from the restart files used for initialization are scaled to historical values.Year 1850 is used as baseline in the forcing calculations.Forcing calculations for OsloCTM3 are also performed for the period between 1750 and 1850.It is noteworthy that the OsloCTM3 results are snapshots rather than the 10-year averages used for the CMIP6 results.An additional simulation with year 2000 emissions and pre-industrial stratospheric ODS concentrations is performed to isolate the forcing due to tropospheric ozone precursors.

The radiative kernel
A radiative kernel is constructed using a radiative transfer model, perturbing ozone in one model layer at a time, giving gridded monthly fields of RF per DU change (Supplementary Fig. 1).The radiative transfer code 44 consists of a broadband code for long-wave radiation 45 and the DIScrete Ordinate Radiative Transfer code for short-wave radiation 46 .The RF (all-sky) of total ozone changes is quantified at the tropopause, with stratospheric temperature adjustment.When stratospheric temperature adjustment is included, the RF at the tropopause will be similar to the forcing at the top of the atmosphere.The stratospheric temperature adjustment is calculated using the fixed dynamical heating approach, during an iterative process, taking short-wave and long-wave radiation into account 47 .The temperature changes from this iterative process will differ with altitude at the various model layers in the stratosphere.Meteorological fields of temperature, water vapor, and clouds are taken from the Integrated Forecast System of the European Centre for Medium-Range Weather Forecasts for year 2003 and applied as monthly mean fields.RF of well-mixed greenhouse gases using monthly data is within 1% of daily simulations 45 .The difference using the radiative kernel and the radiative transfer model applied for generating the kernel applying the OsloCTM3 ozone change is 0.001 W m −2 for both short-wave and long-wave radiation.

RF calculations
A radiative kernel is used for the RF calculation.The CMIP6 model results and input4MIPs ozone fields are bilinearly interpolated onto a T21 grid (~5.6°× 5.6°) and linearly interpolated in pressure to 60 vertical levels (ranging from the surface and up to 0.1 hPa), which is the resolution of the radiative kernel.Values in levels above the model top (Supplementary Table 1) are set to zero.To eliminate natural variability, 10-year averages are calculated.The time periods considered are 1850 (1850-1859) as the reference period for the forcing calculations, 1920 (1916-1925), 1930  (1926-1935), 1940 (1936-1945), 1950 (1946-1955), 1960 (1956-1965), 1970  (1966-1975), 1980 (1976-1985), 1990 (1986-1995), 2000 (1996-2005), 2007  (2003-2012), and 2010 (2005 to 2014).These 10-year averages of the mixing ratios are used in radiative kernel.To convert to DUs, the atmospheric fields from the OsloCTM3 are used.All results presented in this manuscript are for these interpolated and averaged model results, but with a horizontal distribution of T42 (~2.8°× 2.8°) for ozone field figures.As for the CMIP6 model results, the CTM results are interpolated to the T21 grid for the kernel calculations and T42 resolutions for the ozone field figures.

Effective radiative forcing
The rapid adjustment calculations other than stratospheric temperature adjustment are taken from two sources; offline calculations from the GFDL model as an additional simulation using setup as in RFMIP 16 and simulations from the Precipitation Driver Response Model Intercomparison Project (PDRMIP) 48 using the two models NorESM1 and MIROC-SPRINTARS.The estimates of rapid adjustments in the two PDRMIP models follow the approach in Smith et al. 49 and Myhre et al. 50, whereby the radiative kernel technique 51 is applied to fixed sea surface temperature simulations.In this method, individual rapid adjustments are diagnosed by multiplying temperature, water vapor, and surface albedo radiative kernels by the model's simulated change in those variables.Cloud rapid adjustments are diagnosed from change in cloud RF corrected for cloud masking using the non-cloud radiative kernels.This accounts for the fact that change in cloud RF includes not only cloud rapid adjustments but the radiative differences in non-cloud rapid adjustments under all-sky vs. clear-sky conditions.For added robustness, estimates are conducted with five different sets of radiative kernels 49,[51][52][53] , including one set where water vapor and cloud adjustments are alternatively estimated using a partial radiative perturbation approach.This is a more direct but more computationally intensive method than radiative kernels where offline radiative transfer calculations are conducted on the base state of the model, while rapid adjustment variables from the perturbed forcing simulation are substituted into the calculation one at a time.The RFMIP calculations include ozone changes from pre-industrial (1850) to present-day and PDRMIP calculations apply a large perturbation of ten times anthropogenic changes to ozone into the lower part of the atmosphere from MacIntosh, et al. 54 for NorESM1 and a five times anthropogenic emissions in MIROC-SPRINTARS.

Fig. 3 .Fig. 1
Fig. 3.The models that include tropospheric chemistry are mostly within the range of the observations, indicating that seasonal cycle, latitudinal distribution, and vertical distribution of tropospheric ozone are well simulated.For ozone trends in the free

Fig. 2
Fig. 2 Total ozone trend in different latitude bands for two time periods.The period 1980-2000 to the left and the period 2000-2010 to the right, compared with the trend prior and post 1997 from Weber et al. 25 using different merged data sets from satellite and ground-based observations for the period 1979-2016.The error bar indicates the 2σ trend uncertainty and does not include uncertainties due to possible drifts between the data sets and from the merging procedure.

Fig. 3
Fig. 3 Relative change in tropospheric ozone (since 1850) and recent trend in tropospheric ozone.The relative change in tropospheric ozone (since 1850) in different latitude bands (a-e) and trend in tropospheric ozone 2000 to 2010 compared with the trend 1998 to 2012 in the TOST product 5 (f-i).The error bars indicate the 95% confidence intervals for the trend.The tropopause used for the models is the OsloCTM3 tropopause that is based on potential vorticity from the meteorological data generated by the Open IFS model at ECMWF.This definition differs from the one used in TOST, which is based on the WMO definition (2 K km −1 lapse rate) and individual models' own definition.Models without tropospheric chemistry are indicated with a triangle.

Table 1 .
Overview of CMIP6 models and other ozone data used in this study.
40Indicated in the table is if the model includes a chemistry scheme in the troposphere and/or stratosphere (Y: yes, N: no, P: prescribed), ensemble member, table ID, and model references.A brief description of the models including the vertical resolution of model output are presented in Supplementary Table1.